mirror of https://github.com/grpc/grpc.git
commit
59f347e125
260 changed files with 8707 additions and 4575 deletions
@ -0,0 +1,121 @@ |
||||
# `epoll`-based pollset implementation in gRPC |
||||
|
||||
Sree Kuchibhotla (sreek@) [May - 2016] |
||||
(Design input from Craig Tiller and David Klempner) |
||||
|
||||
> Status: As of June 2016, this change is implemented and merged. |
||||
|
||||
> * The bulk of the functionality is in: [ev_epoll_linux.c](https://github.com/grpc/grpc/blob/master/src/core/lib/iomgr/ev_epoll_linux.c)
||||
> * Pull request: https://github.com/grpc/grpc/pull/6803 |
||||
|
||||
## 1. Introduction |
||||
This document describes the proposed changes to the `epoll`-based implementation of pollsets in gRPC. Section 2 gives an overview of the current implementation, Section 3 describes the problems with it, and Section 4 presents the proposed changes.
||||
|
||||
## 2. Current `epoll`-based implementation in gRPC |
||||
|
||||
![image](images/old_epoll_impl.png) |
||||
|
||||
**Figure 1: Current implementation** |
||||
|
||||
A gRPC client or a server can have more than one completion queue. Each completion queue creates a pollset. |
||||
|
||||
The gRPC core library does not create any threads[^1] on its own and relies on the application using the gRPC core library to provide the threads. A thread starts to poll for events by calling the gRPC core surface APIs `grpc_completion_queue_next()` or `grpc_completion_queue_pluck()`. More than one thread can call `grpc_completion_queue_next()` on the same completion queue[^2].
||||
|
||||
A file descriptor can be in more than one completion queue. There are examples in the next section that show how this can happen. |
||||
|
||||
When an event of interest happens in a pollset, multiple threads are woken up and there are no guarantees on which thread actually ends up performing the work, i.e. executing the callbacks associated with that event. The thread that performs the work finally queues a completion event `grpc_cq_completion` on the appropriate completion queue and "kicks" (i.e. wakes up) the thread that is actually interested in that event (which can be itself, in which case there is no thread hop).
||||
|
||||
For example, in **Figure 1**, if `fd1` becomes readable, any one of the threads, i.e. *Thread 1* to *Thread K* or *Thread P*, might be woken up. Let's say *Thread P* was calling `grpc_completion_queue_pluck()` and was actually interested in the event on `fd1`, but *Thread 1* woke up. In this case, *Thread 1* executes the callbacks and finally kicks *Thread P* by signalling `event_fd_P`. *Thread P* wakes up, realizes that there is a new completion event for it and returns from `grpc_completion_queue_pluck()` to its caller.
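
The "kick" in this example can be illustrated with a plain Linux `eventfd`. The following is a minimal sketch, not the gRPC `grpc_wakeup_fd` code: the helper names `wakeup_fd_create()`, `wakeup_fd_kick()` and `wakeup_fd_consume()` are hypothetical and are introduced here only for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical helper: creates a per-thread wakeup fd.
 * Returns the fd, or -1 on error. */
static int wakeup_fd_create(void) {
  return eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
}

/* Kicks the owning thread: adds 1 to the eventfd counter, which makes the fd
 * readable for anyone polling it. Returns 0 on success, -1 on error. */
static int wakeup_fd_kick(int efd) {
  uint64_t one = 1;
  ssize_t n;
  assert(efd >= 0);
  do {
    n = write(efd, &one, sizeof(one));
  } while (n < 0 && errno == EINTR);
  return n == (ssize_t)sizeof(one) ? 0 : -1;
}

/* Consumes pending kicks: a successful read resets the counter to zero.
 * Returns 1 if a kick was pending, 0 if none was, -1 on any other error. */
static int wakeup_fd_consume(int efd) {
  uint64_t count;
  ssize_t n;
  assert(efd >= 0);
  do {
    n = read(efd, &count, sizeof(count));
  } while (n < 0 && errno == EINTR);
  if (n == (ssize_t)sizeof(count)) return 1;
  return (n < 0 && errno == EAGAIN) ? 0 : -1;
}
```

In terms of the example, *Thread 1* would call something like `wakeup_fd_kick(event_fd_P)`, and *Thread P* would call `wakeup_fd_consume()` after waking up so that the fd stops being readable.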
||||
|
||||
## 3. Issues in the current architecture |
||||
|
||||
### _Thundering Herds_ |
||||
|
||||
If multiple threads concurrently call `epoll_wait()`, we are guaranteed that only one thread is woken up if one of the `fds` in the set becomes readable/writable. However, in our current implementation, the threads do not directly call a blocking `epoll_wait()`[^3]. Instead, they call `poll()` on the set containing `[event_fd`[^4]`, epoll_fd]`. **(see Figure 1)** |
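
The two-step wait described here (and in footnote 3) can be sketched as follows. This is an illustration under assumptions, not the code in gRPC: the function name `poll_wakeup_and_epoll()`, the `WOKEN_BY_*` flags and the demo function are hypothetical. It relies on the fact that an epoll fd is itself pollable and becomes readable when any fd in its set is ready.

```c
#include <assert.h>
#include <poll.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>

enum { WOKEN_BY_KICK = 1, WOKEN_BY_EPOLL = 2 };

/* Blocks in poll() on the thread's own wakeup fd and the shared epoll fd.
 * Returns a bitmask of WOKEN_BY_* flags, 0 on timeout, or -1 on error
 * (including EINTR). If the epoll set has ready fds, they are fetched with a
 * zero-timeout epoll_wait() and *num_events is set to how many were returned. */
static int poll_wakeup_and_epoll(int wakeup_fd, int epoll_fd, int timeout_ms,
                                 struct epoll_event *events, int max_events,
                                 int *num_events) {
  struct pollfd pfds[2];
  int result = 0;
  int r;
  assert(max_events > 0);
  *num_events = 0;
  pfds[0].fd = wakeup_fd;
  pfds[0].events = POLLIN;
  pfds[0].revents = 0;
  pfds[1].fd = epoll_fd;
  pfds[1].events = POLLIN;
  pfds[1].revents = 0;
  r = poll(pfds, 2, timeout_ms);
  if (r < 0) return -1;
  if (r == 0) return 0;
  if (pfds[0].revents & POLLIN) result |= WOKEN_BY_KICK;
  if (pfds[1].revents & POLLIN) {
    int n = epoll_wait(epoll_fd, events, max_events, 0); /* non-blocking */
    if (n < 0) return -1;
    *num_events = n;
    result |= WOKEN_BY_EPOLL;
  }
  return result;
}

/* Usage sketch: a pipe's read end sits in the epoll set and becomes readable
 * while no kick is pending. Returns the WOKEN_BY_* mask, or -1 on error. */
static int poll_demo(void) {
  struct epoll_event ev, out[4];
  int pipe_fds[2];
  int num_events = 0;
  int mask;
  int wakeup_fd = eventfd(0, EFD_NONBLOCK);
  int epoll_fd = epoll_create1(0);
  if (wakeup_fd < 0 || epoll_fd < 0 || pipe(pipe_fds) != 0) return -1;
  ev.events = EPOLLIN;
  ev.data.fd = pipe_fds[0];
  if (epoll_ctl(epoll_fd, EPOLL_CTL_ADD, pipe_fds[0], &ev) != 0) return -1;
  if (write(pipe_fds[1], "x", 1) != 1) return -1;
  mask = poll_wakeup_and_epoll(wakeup_fd, epoll_fd, 1000, out, 4, &num_events);
  close(pipe_fds[0]);
  close(pipe_fds[1]);
  close(epoll_fd);
  close(wakeup_fd);
  return num_events == 1 ? mask : -1;
}
```

Because every poller runs its own `poll()` over the same `epoll_fd`, each of them sees `POLLIN` on it when one fd in the set becomes ready, which is where the herd comes from.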
||||
|
||||
Considering the fact that an `fd` can be in multiple `pollsets` and that each `pollset` might have multiple poller threads, it means that whenever an `fd` becomes readable/writable, all the threads in all the `pollsets` (in which that `fd` is present) are woken up. |
||||
|
||||
The performance impact of this would be more conspicuous on the server side. Here are two examples of thundering herds on the server side.
||||
|
||||
Example 1: Listening fds on server |
||||
|
||||
* A gRPC server can have multiple server completion queues (i.e completion queues which are used to listen for incoming channels). |
||||
* A gRPC server can also listen on more than one TCP-port. |
||||
* A listening socket is created for each port the gRPC server would be listening on. |
||||
* Every listening socket's fd is added to all the server completion queues' pollsets. (Currently we do not do any sharding of the listening fds across these pollsets). |
||||
|
||||
This means that for every incoming new channel, all the threads waiting on all the pollsets are woken up. |
||||
|
||||
Example 2: New Incoming-channel fds on server |
||||
|
||||
* Currently, every new incoming channel's `fd` (i.e. the socket `fd` that is returned by doing an `accept()` on the new incoming channel) is added to all the server completion queues' pollsets[^5].
||||
* Clearly, this would also cause a thundering herd problem for every read on that fd.
||||
|
||||
There are other scenarios, especially on the client side, where an fd can end up in multiple pollsets, which would cause thundering herds on the clients.
||||
|
||||
|
||||
## 4. Proposed changes to the current `epoll`-based polling implementation: |
||||
|
||||
The main idea in this proposal is to group 'related' `fds` into a single epoll-based set. This would ensure that only one thread wakes up in case of an event on one of the `fds` in the epoll set. |
||||
|
||||
To accomplish this, we introduce a new abstraction called `polling_island` which will have an epoll set underneath (See **Figure 2** below). A `polling_island` contains the following: |
||||
|
||||
* `epoll_fd`: The file descriptor of the underlying epoll set |
||||
* `fd_set`: The set of `fds` in the polling island, i.e. in the epoll set (the polling island merging operation described later requires the list of fds in the polling island, and currently there is no API available to enumerate all the fds in an epoll set)
||||
* `event_fd`: A level-triggered _event fd_ that is used to wake up all the threads waiting on this epoll set (Note: this `event_fd` is added to the underlying epoll set during polling island creation. This is useful in the polling island merging operation described later)
||||
* `merged_to`: The polling island into which this one merged. See section 4.2 (case 2) for more details on this. Also note that if `merged_to` is set, all the other fields in this polling island are not used anymore |
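
The fields listed above could be laid out roughly as in the sketch below. This is an assumption-laden illustration, not the structure in `ev_epoll_linux.c`: the function names, the growable `int` array used for `fd_set`, and the choice to register fds for `EPOLLIN` only are all hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>

typedef struct polling_island {
  int epoll_fd;                     /* fd of the underlying epoll set */
  int *fds;                         /* fd_set: fds added to the epoll set */
  size_t fd_count;
  size_t fd_capacity;
  int event_fd;                     /* wakes up all waiters on this island */
  struct polling_island *merged_to; /* non-NULL once merged into another */
} polling_island;

/* Creates an island and registers its event_fd in the epoll set
 * (level-triggered, which is the epoll default). Returns NULL on error. */
static polling_island *polling_island_create(void) {
  struct epoll_event ev;
  polling_island *pi = calloc(1, sizeof(*pi));
  if (pi == NULL) return NULL;
  pi->epoll_fd = epoll_create1(EPOLL_CLOEXEC);
  pi->event_fd = eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
  if (pi->epoll_fd >= 0 && pi->event_fd >= 0) {
    ev.events = EPOLLIN;
    ev.data.fd = pi->event_fd;
    if (epoll_ctl(pi->epoll_fd, EPOLL_CTL_ADD, pi->event_fd, &ev) == 0) {
      return pi;
    }
  }
  if (pi->epoll_fd >= 0) close(pi->epoll_fd);
  if (pi->event_fd >= 0) close(pi->event_fd);
  free(pi);
  return NULL;
}

/* Adds fd to the epoll set and records it in fd_set so that a later merge
 * can enumerate it. Returns 0 on success, -1 on error. */
static int polling_island_add_fd(polling_island *pi, int fd) {
  struct epoll_event ev;
  assert(pi != NULL && pi->merged_to == NULL);
  if (pi->fd_count == pi->fd_capacity) {
    size_t new_cap = pi->fd_capacity ? pi->fd_capacity * 2 : 8;
    int *grown = realloc(pi->fds, new_cap * sizeof(*grown));
    if (grown == NULL) return -1;
    pi->fds = grown;
    pi->fd_capacity = new_cap;
  }
  ev.events = EPOLLIN; /* assumption: read interest only, for brevity */
  ev.data.fd = fd;
  if (epoll_ctl(pi->epoll_fd, EPOLL_CTL_ADD, fd, &ev) != 0) return -1;
  pi->fds[pi->fd_count++] = fd;
  return 0;
}

/* Closes the island's own fds (not the fds that were added to it). */
static void polling_island_destroy(polling_island *pi) {
  close(pi->epoll_fd);
  close(pi->event_fd);
  free(pi->fds);
  free(pi);
}
```

Keeping a separate `fds` array mirrors the reason given for `fd_set`: the kernel offers no way to list the members of an epoll set, so the island has to remember them itself.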
||||
|
||||
In this new model, only one thread wakes up whenever an event of interest happens in an epoll set. |
||||
|
||||
![drawing](images/new_epoll_impl.png) |
||||
|
||||
**Figure 2: Proposed changes** |
||||
|
||||
### 4.1 Relation between `fd`, `pollset` and `polling_island`:
||||
|
||||
* An `fd` may belong to multiple `pollsets` but belongs to exactly one `polling_island` |
||||
* A `pollset` belongs to exactly one `polling_island` |
||||
* An `fd` and the `pollset`(s) it belongs to have the same `polling_island`
||||
|
||||
### 4.2 Algorithm to add an `fd` to a `pollset` |
||||
|
||||
There are two cases to check here: |
||||
|
||||
* **Case 1:** Both `fd` and `pollset` already belong to the same `polling_island` |
||||
* This is straightforward and nothing really needs to be done here |
||||
* **Case 2:** The `fd` and `pollset` point to different `polling_islands`: In this case we _merge_ both the polling islands, i.e.:
||||
* Add all the `fds` from the smaller `polling_island` to the larger `polling_island` and update the `merged_to` pointer on the smaller island to point to the larger island.
||||
* Wake up all the threads waiting on the smaller `polling_island`'s `epoll_fd` (by signalling the `event_fd` on that island) and make them now wait on the larger `polling_island`'s `epoll_fd` |
||||
* Update `fd` and `pollset` to now point to the larger `polling_island` |
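
The bookkeeping part of the two cases above can be sketched as follows. This is a simplified model under assumptions: the `island` type and the function names are hypothetical, and the `epoll_ctl()` calls, locking and the wakeup of threads waiting on the smaller island are deliberately left out.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Simplified island: only the fd list and the merged_to pointer. */
typedef struct island {
  int *fds;
  size_t count;
  struct island *merged_to;
} island;

/* Creates an island holding a copy of the n given fds. NULL on error. */
static island *island_create(const int *fds, size_t n) {
  island *pi = calloc(1, sizeof(*pi));
  if (pi == NULL) return NULL;
  if (n > 0) {
    pi->fds = malloc(n * sizeof(int));
    if (pi->fds == NULL) {
      free(pi);
      return NULL;
    }
    memcpy(pi->fds, fds, n * sizeof(int));
    pi->count = n;
  }
  return pi;
}

/* Follows merged_to pointers to the island that is currently live. */
static island *island_latest(island *pi) {
  assert(pi != NULL);
  while (pi->merged_to != NULL) pi = pi->merged_to;
  return pi;
}

/* Merges the live islands behind a and b. Returns the surviving island,
 * or NULL if memory allocation fails (in which case nothing is changed). */
static island *island_merge(island *a, island *b) {
  island *small_pi, *large_pi;
  a = island_latest(a);
  b = island_latest(b);
  if (a == b) return a; /* Case 1: already the same island */
  /* Case 2: move the fds of the smaller island into the larger one. */
  if (a->count < b->count) {
    small_pi = a;
    large_pi = b;
  } else {
    small_pi = b;
    large_pi = a;
  }
  if (small_pi->count > 0) {
    int *grown = realloc(large_pi->fds,
                         (large_pi->count + small_pi->count) * sizeof(int));
    if (grown == NULL) return NULL;
    memcpy(grown + large_pi->count, small_pi->fds,
           small_pi->count * sizeof(int));
    large_pi->fds = grown;
    large_pi->count += small_pi->count;
  }
  free(small_pi->fds);
  small_pi->fds = NULL;
  small_pi->count = 0;
  small_pi->merged_to = large_pi; /* stale references can still find it */
  return large_pi;
}
```

An `fd` or `pollset` that still holds a pointer to the smaller island can call `island_latest()` to reach the island it now belongs to, which is the purpose of the `merged_to` field.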
||||
|
||||
### 4.3 Directed wakeups: |
||||
|
||||
The new implementation, just like the current implementation, does not provide us any guarantees that the thread that is woken up is the thread that is actually interested in the event. So the thread that woke up executes the callbacks and finally has to 'kick' the appropriate polling thread interested in the event. |
||||
|
||||
In the current implementation, every polling thread also has an `event_fd` on which it listens, and hence waking it up is as simple as signalling that `event_fd`. However, using an `event_fd` also means that every thread has to use `poll()` (on `event_fd` and `epoll_fd`) instead of `epoll_wait()`, and this results in the thundering herd problems described above.
||||
|
||||
The proposal here is to use signals: kicking a thread would just mean sending a signal to that thread. Unfortunately, there are only a few signals available on POSIX systems and most of them have pre-determined behavior, leaving only `SIGUSR1`, `SIGUSR2` and the real-time signals (`SIGRTMIN` to `SIGRTMAX`) for custom use.
||||
|
||||
The calling application might have registered other signal handlers for these signals. We will provide a new API where the applications can "give a signal number" to the gRPC library to use for this purpose.
||||
|
||||
``` |
||||
void grpc_use_signal(int signal_num);
||||
``` |
||||
|
||||
If the calling application does not provide a signal number, then the gRPC library will fall back to a model similar to the current implementation (where every thread does a blocking `poll()` on its `wakeup_fd` and the `epoll_fd`). The function `psi_wait()` in Figure 2 implements this logic.
||||
|
||||
**NOTE**: Alternatively, in case of not getting a signal number from the application, we can implement turnstile polling (i.e. having only one thread call `epoll_wait()` on the epoll set at any time, while all other threads call `poll()` on their `wakeup_fds`).
||||
|
||||
|
||||
## Notes |
||||
|
||||
[^1]: Only exception is in case of name-resolution |
||||
|
||||
[^2]: However, a `grpc_completion_queue_next()` and `grpc_completion_queue_pluck()` must not be called in parallel on the same completion queue |
||||
|
||||
[^3]: The threads first do a blocking `poll()` with `[wakeup_fd, epoll_fd]`. If the `poll()` returns due to an event of interest in the epoll set, they then call a non-blocking, i.e. zero-timeout, `epoll_wait()` on the `epoll_fd`
||||
|
||||
[^4]: `event_fd` is the Linux platform-specific implementation of `grpc_wakeup_fd`. A `wakeup_fd` is used to wake up polling threads, typically when the event for which the polling thread is waiting is already completed by some other thread. It is also used to wake up the polling threads in case of shutdowns or to re-evaluate the poller's interest in the fds to poll (the last scenario is only in case of the `poll`-based (not `epoll`-based) implementation of `pollsets`).
||||
|
||||
[^5]: See more details about the issue here https://github.com/grpc/grpc/issues/5470 and for a proposed fix here: https://github.com/grpc/grpc/pull/6149 |
@ -1,42 +0,0 @@ |
||||
# Copyright 2015, Google Inc. |
||||
# All rights reserved. |
||||
# |
||||
# Redistribution and use in source and binary forms, with or without |
||||
# modification, are permitted provided that the following conditions are |
||||
# met: |
||||
# |
||||
# * Redistributions of source code must retain the above copyright |
||||
# notice, this list of conditions and the following disclaimer. |
||||
# * Redistributions in binary form must reproduce the above |
||||
# copyright notice, this list of conditions and the following disclaimer |
||||
# in the documentation and/or other materials provided with the |
||||
# distribution. |
||||
# * Neither the name of Google Inc. nor the names of its |
||||
# contributors may be used to endorse or promote products derived from |
||||
# this software without specific prior written permission. |
||||
# |
||||
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS |
||||
# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT |
||||
# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR |
||||
# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT |
||||
# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, |
||||
# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT |
||||
# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, |
||||
# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY |
||||
# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT |
||||
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE |
||||
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. |
||||
|
||||
"""Runs protoc with the gRPC plugin to generate messages and gRPC stubs.""" |
||||
|
||||
from grpc.tools import protoc |
||||
|
||||
protoc.main( |
||||
( |
||||
'', |
||||
'-I../../protos', |
||||
'--python_out=.', |
||||
'--grpc_python_out=.', |
||||
'../../protos/helloworld.proto', |
||||
) |
||||
) |
@ -0,0 +1,67 @@ |
||||
/*
|
||||
* |
||||
* Copyright 2016, Google Inc. |
||||
* All rights reserved. |
||||
* |
||||
* Redistribution and use in source and binary forms, with or without |
||||
* modification, are permitted provided that the following conditions are |
||||
* met: |
||||
* |
||||
* * Redistributions of source code must retain the above copyright |
||||
* notice, this list of conditions and the following disclaimer. |
||||
* * Redistributions in binary form must reproduce the above |
||||
* copyright notice, this list of conditions and the following disclaimer |
||||
* in the documentation and/or other materials provided with the |
||||
* distribution. |
||||
* * Neither the name of Google Inc. nor the names of its |
||||
* contributors may be used to endorse or promote products derived from |
||||
* this software without specific prior written permission. |
||||
* |
||||
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS |
||||
* "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT |
||||
* LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR |
||||
* A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT |
||||
* OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, |
||||
* SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT |
||||
* LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, |
||||
* DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY |
||||
* THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT |
||||
* (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE |
||||
* OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. |
||||
* |
||||
*/ |
||||
|
||||
#ifndef GRPCXX_TEST_SERVER_CONTEXT_TEST_SPOUSE_H |
||||
#define GRPCXX_TEST_SERVER_CONTEXT_TEST_SPOUSE_H |
||||
|
||||
#include <map> |
||||
|
||||
#include <grpc++/server_context.h> |
||||
|
||||
namespace grpc { |
||||
namespace testing { |
||||
|
||||
// A test-only class to access private members and methods of ServerContext.
|
||||
class ServerContextTestSpouse { |
||||
public: |
||||
explicit ServerContextTestSpouse(ServerContext* ctx) : ctx_(ctx) {} |
||||
|
||||
// Inject client metadata to the ServerContext for the test. The test spouse
|
||||
// must be alive when ServerContext::client_metadata is called.
|
||||
void AddClientMetadata(const grpc::string& key, const grpc::string& value); |
||||
std::multimap<grpc::string, grpc::string> GetInitialMetadata() const { |
||||
return ctx_->initial_metadata_; |
||||
} |
||||
std::multimap<grpc::string, grpc::string> GetTrailingMetadata() const { |
||||
return ctx_->trailing_metadata_; |
||||
} |
||||
|
||||
private: |
||||
ServerContext* ctx_; // not owned
|
||||
std::multimap<grpc::string, grpc::string> client_metadata_storage_; |
||||
}; |
||||
|
||||
} // namespace testing
|
||||
} // namespace grpc
|
||||
|
||||
#endif // GRPCXX_TEST_SERVER_CONTEXT_TEST_SPOUSE_H
|
@ -0,0 +1,118 @@ |
||||
/*
|
||||
* |
||||
* Copyright 2016, Google Inc. |
||||
* All rights reserved. |
||||
* |
||||
* Redistribution and use in source and binary forms, with or without |
||||
* modification, are permitted provided that the following conditions are |
||||
* met: |
||||
* |
||||
* * Redistributions of source code must retain the above copyright |
||||
* notice, this list of conditions and the following disclaimer. |
||||
* * Redistributions in binary form must reproduce the above |
||||
* copyright notice, this list of conditions and the following disclaimer |
||||
* in the documentation and/or other materials provided with the |
||||
* distribution. |
||||
* * Neither the name of Google Inc. nor the names of its |
||||
* contributors may be used to endorse or promote products derived from |
||||
* this software without specific prior written permission. |
||||
* |
||||
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS |
||||
* "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT |
||||
* LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR |
||||
* A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT |
||||
* OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, |
||||
* SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT |
||||
* LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, |
||||
* DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY |
||||
* THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT |
||||
* (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE |
||||
* OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. |
||||
* |
||||
*/ |
||||
|
||||
#include "src/core/lib/iomgr/port.h" |
||||
|
||||
#ifdef GRPC_POSIX_WAKEUP_FD |
||||
|
||||
#include "src/core/lib/iomgr/wakeup_fd_cv.h" |
||||
|
||||
#include <errno.h> |
||||
#include <string.h> |
||||
|
||||
#include <grpc/support/alloc.h> |
||||
#include <grpc/support/log.h> |
||||
#include <grpc/support/sync.h> |
||||
#include <grpc/support/thd.h> |
||||
#include <grpc/support/time.h> |
||||
#include <grpc/support/useful.h> |
||||
|
||||
#define MAX_TABLE_RESIZE 256 |
||||
|
||||
extern cv_fd_table g_cvfds; |
||||
|
||||
static grpc_error* cv_fd_init(grpc_wakeup_fd* fd_info) { |
||||
unsigned int i, newsize; |
||||
int idx; |
||||
gpr_mu_lock(&g_cvfds.mu); |
||||
if (!g_cvfds.free_fds) { |
||||
newsize = GPR_MIN(g_cvfds.size * 2, g_cvfds.size + MAX_TABLE_RESIZE); |
||||
g_cvfds.cvfds = gpr_realloc(g_cvfds.cvfds, sizeof(fd_node) * newsize); |
||||
for (i = g_cvfds.size; i < newsize; i++) { |
||||
g_cvfds.cvfds[i].is_set = 0; |
||||
g_cvfds.cvfds[i].cvs = NULL; |
||||
g_cvfds.cvfds[i].next_free = g_cvfds.free_fds; |
||||
g_cvfds.free_fds = &g_cvfds.cvfds[i]; |
||||
} |
||||
g_cvfds.size = newsize; |
||||
} |
||||
|
||||
idx = (int)(g_cvfds.free_fds - g_cvfds.cvfds); |
||||
g_cvfds.free_fds = g_cvfds.free_fds->next_free; |
||||
g_cvfds.cvfds[idx].cvs = NULL; |
||||
g_cvfds.cvfds[idx].is_set = 0; |
||||
fd_info->read_fd = IDX_TO_FD(idx); |
||||
fd_info->write_fd = -1; |
||||
gpr_mu_unlock(&g_cvfds.mu); |
||||
return GRPC_ERROR_NONE; |
||||
} |
||||
|
||||
static grpc_error* cv_fd_wakeup(grpc_wakeup_fd* fd_info) { |
||||
cv_node* cvn; |
||||
gpr_mu_lock(&g_cvfds.mu); |
||||
g_cvfds.cvfds[FD_TO_IDX(fd_info->read_fd)].is_set = 1; |
||||
cvn = g_cvfds.cvfds[FD_TO_IDX(fd_info->read_fd)].cvs; |
||||
while (cvn) { |
||||
gpr_cv_signal(cvn->cv); |
||||
cvn = cvn->next; |
||||
} |
||||
gpr_mu_unlock(&g_cvfds.mu); |
||||
return GRPC_ERROR_NONE; |
||||
} |
||||
|
||||
static grpc_error* cv_fd_consume(grpc_wakeup_fd* fd_info) { |
||||
gpr_mu_lock(&g_cvfds.mu); |
||||
g_cvfds.cvfds[FD_TO_IDX(fd_info->read_fd)].is_set = 0; |
||||
gpr_mu_unlock(&g_cvfds.mu); |
||||
return GRPC_ERROR_NONE; |
||||
} |
||||
|
||||
static void cv_fd_destroy(grpc_wakeup_fd* fd_info) { |
||||
if (fd_info->read_fd == 0) { |
||||
return; |
||||
} |
||||
gpr_mu_lock(&g_cvfds.mu); |
||||
// Assert that there are no active pollers
|
||||
GPR_ASSERT(!g_cvfds.cvfds[FD_TO_IDX(fd_info->read_fd)].cvs); |
||||
g_cvfds.cvfds[FD_TO_IDX(fd_info->read_fd)].next_free = g_cvfds.free_fds; |
||||
g_cvfds.free_fds = &g_cvfds.cvfds[FD_TO_IDX(fd_info->read_fd)]; |
||||
gpr_mu_unlock(&g_cvfds.mu); |
||||
} |
||||
|
||||
static int cv_check_availability(void) { return 1; } |
||||
|
||||
const grpc_wakeup_fd_vtable grpc_cv_wakeup_fd_vtable = { |
||||
cv_fd_init, cv_fd_consume, cv_fd_wakeup, cv_fd_destroy, |
||||
cv_check_availability}; |
||||
|
||||
#endif /* GRPC_POSIX_WAKEUP_FD */
@ -0,0 +1,80 @@ |
||||
/*
|
||||
* |
||||
* Copyright 2016, Google Inc. |
||||
* All rights reserved. |
||||
* |
||||
* Redistribution and use in source and binary forms, with or without |
||||
* modification, are permitted provided that the following conditions are |
||||
* met: |
||||
* |
||||
* * Redistributions of source code must retain the above copyright |
||||
* notice, this list of conditions and the following disclaimer. |
||||
* * Redistributions in binary form must reproduce the above |
||||
* copyright notice, this list of conditions and the following disclaimer |
||||
* in the documentation and/or other materials provided with the |
||||
* distribution. |
||||
* * Neither the name of Google Inc. nor the names of its |
||||
* contributors may be used to endorse or promote products derived from |
||||
* this software without specific prior written permission. |
||||
* |
||||
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS |
||||
* "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT |
||||
* LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR |
||||
* A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT |
||||
* OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, |
||||
* SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT |
||||
* LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, |
||||
* DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY |
||||
* THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT |
||||
* (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE |
||||
* OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. |
||||
* |
||||
*/ |
||||
|
||||
/*
|
||||
* wakeup_fd_cv uses condition variables to implement wakeup fds. |
||||
* |
||||
* It is intended for use only in cases when eventfd() and pipe() are not |
||||
* available. It can only be used with the "poll" engine. |
||||
* |
||||
* Implementation: |
||||
* A global table of cv wakeup fds is maintained. A cv wakeup fd is a negative
||||
* file descriptor. poll() is then run in a background thread with only the |
||||
* real socket fds while we wait on a condition variable triggered by either the
||||
* poll() completion or a wakeup_fd() call. |
||||
* |
||||
*/ |
||||
|
||||
#ifndef GRPC_CORE_LIB_IOMGR_WAKEUP_FD_CV_H |
||||
#define GRPC_CORE_LIB_IOMGR_WAKEUP_FD_CV_H |
||||
|
||||
#include <grpc/support/sync.h> |
||||
|
||||
#include "src/core/lib/iomgr/ev_posix.h" |
||||
|
||||
#define FD_TO_IDX(fd) (-(fd)-1) |
||||
#define IDX_TO_FD(idx) (-(idx)-1) |
||||
|
||||
typedef struct cv_node { |
||||
gpr_cv* cv; |
||||
struct cv_node* next; |
||||
} cv_node; |
||||
|
||||
typedef struct fd_node { |
||||
int is_set; |
||||
cv_node* cvs; |
||||
struct fd_node* next_free; |
||||
} fd_node; |
||||
|
||||
typedef struct cv_fd_table { |
||||
gpr_mu mu; |
||||
int pollcount; |
||||
int shutdown; |
||||
gpr_cv shutdown_complete; |
||||
fd_node* cvfds; |
||||
fd_node* free_fds; |
||||
unsigned int size; |
||||
grpc_poll_function_type poll; |
||||
} cv_fd_table; |
||||
|
||||
#endif /* GRPC_CORE_LIB_IOMGR_WAKEUP_FD_CV_H */ |
@ -1,196 +0,0 @@ |
||||
/*
|
||||
* |
||||
* Copyright 2015, Google Inc. |
||||
* All rights reserved. |
||||
* |
||||
* Redistribution and use in source and binary forms, with or without |
||||
* modification, are permitted provided that the following conditions are |
||||
* met: |
||||
* |
||||
* * Redistributions of source code must retain the above copyright |
||||
* notice, this list of conditions and the following disclaimer. |
||||
* * Redistributions in binary form must reproduce the above |
||||
* copyright notice, this list of conditions and the following disclaimer |
||||
* in the documentation and/or other materials provided with the |
||||
* distribution. |
||||
* * Neither the name of Google Inc. nor the names of its |
||||
* contributors may be used to endorse or promote products derived from |
||||
* this software without specific prior written permission. |
||||
* |
||||
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS |
||||
* "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT |
||||
* LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR |
||||
* A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT |
||||
* OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, |
||||
* SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT |
||||
* LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, |
||||
* DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY |
||||
* THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT |
||||
* (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE |
||||
* OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. |
||||
* |
||||
*/ |
||||
|
||||
#include "src/core/lib/iomgr/port.h" |
||||
|
||||
#ifdef GRPC_POSIX_SOCKET |
||||
|
||||
#include "src/core/lib/iomgr/workqueue.h" |
||||
|
||||
#include <stdio.h> |
||||
|
||||
#include <grpc/support/alloc.h> |
||||
#include <grpc/support/log.h> |
||||
#include <grpc/support/useful.h> |
||||
|
||||
#include "src/core/lib/iomgr/ev_posix.h" |
||||
#include "src/core/lib/profiling/timers.h" |
||||
|
||||
static void on_readable(grpc_exec_ctx *exec_ctx, void *arg, grpc_error *error); |
||||
|
||||
grpc_error *grpc_workqueue_create(grpc_exec_ctx *exec_ctx, |
||||
grpc_workqueue **workqueue) { |
||||
char name[32]; |
||||
*workqueue = gpr_malloc(sizeof(grpc_workqueue)); |
||||
gpr_ref_init(&(*workqueue)->refs, 1); |
||||
gpr_atm_no_barrier_store(&(*workqueue)->state, 1); |
||||
grpc_error *err = grpc_wakeup_fd_init(&(*workqueue)->wakeup_fd); |
||||
if (err != GRPC_ERROR_NONE) { |
||||
gpr_free(*workqueue); |
||||
return err; |
||||
} |
||||
sprintf(name, "workqueue:%p", (void *)(*workqueue)); |
||||
(*workqueue)->wakeup_read_fd = grpc_fd_create( |
||||
GRPC_WAKEUP_FD_GET_READ_FD(&(*workqueue)->wakeup_fd), name); |
||||
gpr_mpscq_init(&(*workqueue)->queue); |
||||
grpc_closure_init(&(*workqueue)->read_closure, on_readable, *workqueue); |
||||
grpc_fd_notify_on_read(exec_ctx, (*workqueue)->wakeup_read_fd, |
||||
&(*workqueue)->read_closure); |
||||
return GRPC_ERROR_NONE; |
||||
} |
||||
|
||||
static void workqueue_destroy(grpc_exec_ctx *exec_ctx, |
||||
grpc_workqueue *workqueue) { |
||||
grpc_fd_shutdown(exec_ctx, workqueue->wakeup_read_fd); |
||||
} |
||||
|
||||
static void workqueue_orphan(grpc_exec_ctx *exec_ctx, |
||||
grpc_workqueue *workqueue) { |
||||
if (gpr_atm_full_fetch_add(&workqueue->state, -1) == 1) { |
||||
workqueue_destroy(exec_ctx, workqueue); |
||||
} |
||||
} |
||||
|
||||
#ifdef GRPC_WORKQUEUE_REFCOUNT_DEBUG |
||||
void grpc_workqueue_ref(grpc_workqueue *workqueue, const char *file, int line, |
||||
const char *reason) { |
||||
if (workqueue == NULL) return; |
||||
gpr_log(file, line, GPR_LOG_SEVERITY_DEBUG, "WORKQUEUE:%p ref %d -> %d %s", |
||||
workqueue, (int)workqueue->refs.count, (int)workqueue->refs.count + 1, |
||||
reason); |
||||
gpr_ref(&workqueue->refs); |
||||
} |
||||
#else |
||||
void grpc_workqueue_ref(grpc_workqueue *workqueue) { |
||||
if (workqueue == NULL) return; |
||||
gpr_ref(&workqueue->refs); |
||||
} |
||||
#endif |
||||
|
||||
#ifdef GRPC_WORKQUEUE_REFCOUNT_DEBUG |
||||
void grpc_workqueue_unref(grpc_exec_ctx *exec_ctx, grpc_workqueue *workqueue, |
||||
const char *file, int line, const char *reason) { |
||||
if (workqueue == NULL) return; |
||||
gpr_log(file, line, GPR_LOG_SEVERITY_DEBUG, "WORKQUEUE:%p unref %d -> %d %s", |
||||
workqueue, (int)workqueue->refs.count, (int)workqueue->refs.count - 1, |
||||
reason); |
||||
if (gpr_unref(&workqueue->refs)) { |
||||
workqueue_orphan(exec_ctx, workqueue); |
||||
} |
||||
} |
||||
#else |
||||
void grpc_workqueue_unref(grpc_exec_ctx *exec_ctx, grpc_workqueue *workqueue) { |
||||
if (workqueue == NULL) return; |
||||
if (gpr_unref(&workqueue->refs)) { |
||||
workqueue_orphan(exec_ctx, workqueue); |
||||
} |
||||
} |
||||
#endif |
||||
|
||||
static void drain(grpc_exec_ctx *exec_ctx, grpc_workqueue *workqueue) { |
||||
abort(); |
||||
} |
||||
|
||||
static void wakeup(grpc_exec_ctx *exec_ctx, grpc_workqueue *workqueue) { |
||||
GPR_TIMER_MARK("workqueue.wakeup", 0); |
||||
grpc_error *err = grpc_wakeup_fd_wakeup(&workqueue->wakeup_fd); |
||||
if (!GRPC_LOG_IF_ERROR("wakeupfd_wakeup", err)) { |
||||
drain(exec_ctx, workqueue); |
||||
} |
||||
} |
||||
|
||||
static void on_readable(grpc_exec_ctx *exec_ctx, void *arg, grpc_error *error) { |
||||
GPR_TIMER_BEGIN("workqueue.on_readable", 0); |
||||
|
||||
grpc_workqueue *workqueue = arg; |
||||
|
||||
if (error != GRPC_ERROR_NONE) { |
||||
/* HACK: let wakeup_fd code know that we stole the fd */ |
||||
workqueue->wakeup_fd.read_fd = 0; |
||||
grpc_wakeup_fd_destroy(&workqueue->wakeup_fd); |
||||
grpc_fd_orphan(exec_ctx, workqueue->wakeup_read_fd, NULL, NULL, "destroy"); |
||||
GPR_ASSERT(gpr_atm_no_barrier_load(&workqueue->state) == 0); |
||||
gpr_free(workqueue); |
||||
} else { |
||||
error = grpc_wakeup_fd_consume_wakeup(&workqueue->wakeup_fd); |
||||
gpr_mpscq_node *n = gpr_mpscq_pop(&workqueue->queue); |
||||
if (error == GRPC_ERROR_NONE) { |
||||
grpc_fd_notify_on_read(exec_ctx, workqueue->wakeup_read_fd, |
||||
&workqueue->read_closure); |
||||
} else { |
||||
/* recurse to get error handling */ |
||||
on_readable(exec_ctx, arg, error); |
||||
} |
||||
if (n == NULL) { |
||||
/* try again - queue in an inconsistent state */
||||
wakeup(exec_ctx, workqueue); |
||||
} else { |
||||
switch (gpr_atm_full_fetch_add(&workqueue->state, -2)) { |
||||
case 3: // had one count, one unorphaned --> done, unorphaned
|
||||
break; |
||||
case 2: // had one count, one orphaned --> done, orphaned
|
||||
workqueue_destroy(exec_ctx, workqueue); |
||||
break; |
||||
case 1: |
||||
case 0: |
||||
// these values are illegal - representing an already done or
|
||||
// deleted workqueue
|
||||
GPR_UNREACHABLE_CODE(break); |
||||
default: |
||||
// schedule a wakeup since there's more to do
|
||||
wakeup(exec_ctx, workqueue); |
||||
} |
||||
grpc_closure *cl = (grpc_closure *)n; |
||||
grpc_error *clerr = cl->error; |
||||
cl->cb(exec_ctx, cl->cb_arg, clerr); |
||||
GRPC_ERROR_UNREF(clerr); |
||||
} |
||||
} |
||||
|
||||
GPR_TIMER_END("workqueue.on_readable", 0); |
||||
} |
||||
|
||||
void grpc_workqueue_enqueue(grpc_exec_ctx *exec_ctx, grpc_workqueue *workqueue, |
||||
grpc_closure *closure, grpc_error *error) { |
||||
GPR_TIMER_BEGIN("workqueue.enqueue", 0); |
||||
gpr_atm last = gpr_atm_full_fetch_add(&workqueue->state, 2); |
||||
GPR_ASSERT(last & 1); |
||||
closure->error = error; |
||||
gpr_mpscq_push(&workqueue->queue, &closure->next_data.atm_next); |
||||
if (last == 1) { |
||||
wakeup(exec_ctx, workqueue); |
||||
} |
||||
GPR_TIMER_END("workqueue.enqueue", 0); |
||||
} |
||||
|
||||
#endif /* GRPC_POSIX_SOCKET */ |