mirror of https://github.com/grpc/grpc.git
parent
18f5b66b94
commit
14ea772bfd
1 changed files with 158 additions and 0 deletions
@ -0,0 +1,158 @@ |
||||
# Combiner Explanation |
||||
## Talk by ctiller, notes by vjpai |
||||
|
||||
Typical way of doing critical section |
||||
|
||||
``` |
||||
mu.lock() |
||||
do_stuff() |
||||
mu.unlock() |
||||
``` |
||||
|
||||
An alternative way of doing it is |
||||
|
||||
``` |
||||
class combiner { |
||||
run(f) { |
||||
mu.lock() |
||||
f() |
||||
mu.unlock() |
||||
} |
||||
mutex mu; |
||||
} |
||||
|
||||
combiner.run(do_stuff) |
||||
``` |
||||
|
||||
If you have two threads calling combiner, there will be some kind of |
||||
queuing in place. It's called `combiner` because you can pass in more |
||||
than one do_stuff at once and they will run under a common `mu`. |
||||
|
||||
The implementation described above has the issue that you're blocking a thread |
||||
for a period of time, and this is considered harmful because it's an application thread that you're blocking. |
||||
|
||||
Instead, get a new property: |
||||
* Keep things running in serial execution |
||||
* Don't ever sleep the thread |
||||
* But maybe allow things to end up running on a different thread from where they were started |
||||
* This means that `do_stuff` doesn't necessarily run to completion when `combiner.run` is invoked |
||||
|
||||
``` |
||||
class combiner { |
||||
mpscq q; // multi-producer single-consumer queue can be made non-blocking |
||||
state s; // is it empty or executing |
||||
|
||||
run(f) { |
||||
if (q.push(f)) { |
||||
// q.push returns true if it's the first thing |
||||
while (q.pop(&f)) { // modulo some extra work to avoid races |
||||
f(); |
||||
} |
||||
} |
||||
} |
||||
} |
||||
``` |
||||
|
||||
The basic idea is that the first one to push onto the combiner |
||||
executes the work and then keeps executing functions from the queue |
||||
until the combiner is drained. |
||||
|
||||
Our combiner does some additional work, with the motivation of write-batching. |
||||
|
||||
We have a second tier of `run` called `run_finally`. Anything queued |
||||
onto `run_finally` runs after we have drained the queue. That means |
||||
that there is essentially a finally-queue. This is not guaranteed to |
||||
be final, but it's best-effort. In the process of running the finally |
||||
item, we might put something onto the main combiner queue and so we'll |
||||
need to re-enter. |
||||
|
||||
`chttp2` runs all ops in the run state except if it sees a write it puts that into a finally. That way anything else that gets put into the combiner can add to that write. |
||||
|
||||
``` |
||||
class combiner { |
||||
mpscq q; // multi-producer single-consumer queue can be made non-blocking |
||||
state s; // is it empty or executing |
||||
queue finally; // you can only do run_finally when you are already running something from the combiner |
||||
|
||||
run(f) { |
||||
if (q.push(f)) { |
||||
// q.push returns true if it's the first thing |
||||
loop: |
||||
while (q.pop(&f)) { // modulo some extra work to avoid races |
||||
f(); |
||||
} |
||||
while (finally.pop(&f)) { |
||||
f(); |
||||
} |
||||
goto loop; |
||||
} |
||||
} |
||||
} |
||||
``` |
||||
|
||||
So that explains how combiners work in general. In gRPC, there is |
||||
`start_batch(..., tag)` and then work only gets activated by somebody |
||||
calling `cq::next` which returns a tag. This gives and API-level |
||||
guarantee that there will be a thread doing polling to actually make |
||||
work happen. However, some operations are not covered by a poller |
||||
thread, such as cancellation that doesn't have a completion. Other |
||||
callbacks that don't have a completion are the internal work that gets |
||||
done before the batch gets completed. We need a condition called |
||||
`covered_by_poller` that means that the item will definitely need some |
||||
thread at some point to call `cq::next` . This includes those |
||||
callbacks that directly cause a completion but also those that are |
||||
indirectly required before getting a completion. If we can't tell for |
||||
sure for a specific path, we have to assumed it is not covered by |
||||
poller. |
||||
|
||||
The above combiner has the problem that it keeps draining for a |
||||
potentially infinite amount of time and that can lead to a huge tail |
||||
latency for some operations. So we can tweak it by returning to the application |
||||
if we know that it is valid to do so: |
||||
|
||||
``` |
||||
while (q.pop(&f)) { |
||||
f(); |
||||
if (control_can_be_returned && some_still_queued_thing_is_covered_by_poller) { |
||||
offload_combiner_work_to_some_other_thread(); |
||||
} |
||||
} |
||||
``` |
||||
|
||||
`offload` is more than `break`; it does `break` but also causes some |
||||
other thread that is currently waiting on a poll to break out of its |
||||
poll. This is done by setting up a per-polling-island work-queue |
||||
(distributor) wakeup FD. The work-queue is the converse of the combiner; it |
||||
tries to spray events onto as many threads as possible to get as much concurrency as possible. |
||||
|
||||
So `offload` really does: |
||||
|
||||
``` |
||||
distributor.run(continue_from_while_loop); |
||||
break; |
||||
``` |
||||
|
||||
This needs us to add another class variable for a `distributor` |
||||
(called a `workqueue` in the code). |
||||
|
||||
``` |
||||
workqueue::run(f) { |
||||
q.push(f) |
||||
eventfd.wakeup() |
||||
} |
||||
|
||||
workqueue::readable() { |
||||
eventfd.consume(); |
||||
q.pop(&f); |
||||
f(); |
||||
if (!q.empty()) { |
||||
eventfd.wakeup(); // spray across as many threads as are waiting on this workqueue |
||||
} |
||||
} |
||||
``` |
||||
|
||||
In principle, `run_finally` could get starved, but this hasn't |
||||
happened in practice. If we were concerned about this, we could put a |
||||
limit on how many things come off the regular `q` before the `finally` |
||||
queue gets processed. |
||||
|
Loading…
Reference in new issue