Currently I'm building some microservices for AWS. My team chose Drogon for the server-side functionality and gRPC to consume other microservices.
The containers have 4 logical cores (2 physical), so my plan was to use one thread to listen for and send HTTP requests, two worker threads for heavy processing tasks, and one thread for gRPC.
Currently an orchestrator can consume 5-10 microservices living on different hosts, like localhost:444, localhost:445, ... so I think the orchestrator would need N channels and N stubs (I can't share channels because the hosts differ).
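Something like this, as a minimal sketch (Worker is a placeholder for whatever my .proto generates):

auto make_stubs(const std::vector<std::string>& targets)
    -> std::vector<std::unique_ptr<Worker::Stub>>
{
    // One channel and one stub per host, e.g. {"localhost:444", "localhost:445"}.
    std::vector<std::unique_ptr<Worker::Stub>> stubs;
    for (const auto& target : targets)
    {
        auto channel = grpc::CreateChannel(target, grpc::InsecureChannelCredentials());
        stubs.push_back(Worker::NewStub(channel));  // the stub keeps its channel alive
    }
    return stubs;
}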
Using the Async API, I have created one completion queue for all the clients, along with a thread that constantly waits for the next result:
auto AsyncRpcEngine::start_completion_queue_loop_thread() -> void
{
    // Detached: completion_queue_.Shutdown() must be called (and the queue
    // drained) before this object is destroyed.
    std::thread{&AsyncRpcEngine::completion_queue_loop, this}.detach();
}

auto AsyncRpcEngine::retrieve_completion_queue() -> grpc::CompletionQueue&
{
    return completion_queue_;
}

auto AsyncRpcEngine::completion_queue_loop() -> void
{
    void* function_tag = nullptr;
    bool ok = false;
    // Next() blocks until an event is ready; it returns false only after
    // Shutdown() has been called and the queue has been drained.
    while (completion_queue_.Next(&function_tag, &ok))
    {
        auto function_call = static_cast<std::function<void ()>*>(function_tag);
        if (ok)  // the operation completed successfully
        {
            (*function_call)();
        }
        delete function_call;  // tags are heap-allocated by the callers
    }
}
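For context, a caller feeds that loop roughly like this (Worker, Process, and the message types are again hypothetical stand-ins for generated code); the key point is that the tag is a heap-allocated std::function, matching what the loop above casts to and deletes:

struct ProcessCall  // keeps per-RPC state alive until the continuation runs
{
    grpc::ClientContext context;
    ProcessReply reply;
    grpc::Status status;
    std::unique_ptr<grpc::ClientAsyncResponseReader<ProcessReply>> reader;
};

auto start_process(Worker::Stub& stub, grpc::CompletionQueue& cq,
                   const ProcessRequest& request) -> void
{
    auto call = std::make_shared<ProcessCall>();
    call->reader = stub.AsyncProcess(&call->context, request, &cq);
    auto* tag = new std::function<void ()>([call]
    {
        if (call->status.ok())
        {
            // consume call->reply here
        }
    });
    // When the result is ready, the queue loop pops this tag and runs it.
    call->reader->Finish(&call->reply, &call->status, tag);
}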
So I have a few questions:
In case you haven't yet, give this a read to begin with: https://grpc.io/docs/guides/performance/
We recommend using the callback API for all new development. We are quickly approaching the point where it will become the most performant option. That said, you will not have control of the number of threads that gRPC utilizes unless you create a custom EventEngine. gRPC will create its own threads and scale a thread pool to optimize gRPC performance - and it will automatically scale back the number of threads if performance decreases.
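For comparison, a unary call with the callback API looks roughly like this (reusing the hypothetical Worker service from above; note there is no completion queue or polling thread for you to drive):

struct CallbackCall  // request/reply must stay valid until the callback fires
{
    grpc::ClientContext context;
    ProcessRequest request;
    ProcessReply reply;
};

auto start_process_callback(Worker::Stub& stub, ProcessRequest request) -> void
{
    auto call = std::make_shared<CallbackCall>();
    call->request = std::move(request);
    // gRPC invokes the lambda from its own internal threads when the RPC ends.
    stub.async()->Process(&call->context, &call->request, &call->reply,
        [call](grpc::Status status)
        {
            if (status.ok())
            {
                // consume call->reply here
            }
        });
}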
There is some overhead with multiple channels, but there is also a possibility that TCP connections are shared between channels (known as "subchannel connection sharing"). Performance also depends on the unique workload running in your polling threads. I'd recommend you start with the performance guidelines above, and if you stick with managing your own threads in the async API, you will likely want to experiment with these parameters (number of channels, completion queues, and polling threads) and see what performs best for your application. Note that performance for the Async API will change in future versions, as we make the Callback API the most performant option.
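If you do want each channel to keep its own connections rather than sharing them, one option is to give the channel its own subchannel pool via the GRPC_ARG_USE_LOCAL_SUBCHANNEL_POOL channel argument; a sketch:

auto make_isolated_channel(const std::string& target)
    -> std::shared_ptr<grpc::Channel>
{
    grpc::ChannelArguments args;
    // Opt this channel out of the global subchannel pool so its TCP
    // connections are not shared with other channels.
    args.SetInt(GRPC_ARG_USE_LOCAL_SUBCHANNEL_POOL, 1);
    return grpc::CreateCustomChannel(
        target, grpc::InsecureChannelCredentials(), args);
}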