Microservice Architecture: Can you eliminate the synchronous calls between services completely in a system?

Anywhere you read about microservices, the advice is that microservices should communicate asynchronously. It is understandable why asynchronous communication is preferred: it removes dependencies and provides low coupling, availability, and so on.
Suppose there is a common authorization service that is invoked every time a user calls an API. In this scenario you cannot proceed until you have the response from the authorization service. You can call the authorization service asynchronously using async I/O, but it is still a request/reply pattern.
Questions I have:
Is it possible to get rid of synchronous communication or, more appropriately, the request/reply pattern in a microservices-based system design?
It is possible to implement a request/reply pattern asynchronously through messaging and callbacks, but that adds significant overhead and latency. Is it worth converting every request/reply interaction to an asynchronous one?
If synchronous calls cannot be eliminated completely, in which scenarios is it OK to have synchronous calls among microservices?

I think the short answer to your question is: the request-reply pattern doesn't imply synchronous communication. It can also be asynchronous, which you already mentioned.
Long answer:
Request-reply is just a principle. For example, you send an email to a friend. The message contains data relevant to you, and you are expecting a response but didn't say so explicitly. Your friend will see the email when he gets back from work, and then he may or may not reply to you. Only you know that you need an answer from him.
Now there are a few options while waiting for the response. Either block your entire life until your friend responds (synchronous communication), or do something else until the response arrives in your inbox (asynchronous communication).
Now, to the point:
Yes; you already answered that in your second point. Even though it is possible, I think it should be used only where it is required.
For the right scenario, yes. Messaging systems perform very well, so latency should not be an issue. When a latency problem does occur in a messaging system, there are other options for improving it.
There is one more thing that needs to be added. Synchronous doesn't always mean blocking. In a reactive world, if you make an HTTP call to another service, the caller sends the request and then awaits the response in a non-blocking manner. When the response arrives, the caller is notified and processing continues. While "awaiting", the CPU can do other work.
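The non-blocking wait described above can be sketched in Python with asyncio. This is only an illustration: `authorize` stands in for an HTTP call to the authorization service from the question (the latency and the "alice" rule are made up), and `heartbeat` is a hypothetical piece of unrelated work.

```python
import asyncio
from typing import List, Tuple


async def authorize(user: str) -> bool:
    # Stand-in for an HTTP call to a hypothetical authorization service.
    await asyncio.sleep(0.05)  # simulated network latency
    return user == "alice"


async def heartbeat(ticks: List[int]) -> None:
    # Unrelated work that the event loop can run while authorize() is pending.
    for i in range(3):
        ticks.append(i)
        await asyncio.sleep(0.01)


async def handle_request(user: str) -> Tuple[bool, List[int]]:
    ticks: List[int] = []
    # Still request/reply: we need the answer before continuing, but the
    # thread is not blocked, so heartbeat() makes progress in the meantime.
    allowed, _ = await asyncio.gather(authorize(user), heartbeat(ticks))
    return allowed, ticks
```

Calling `asyncio.run(handle_request("alice"))` returns the authorization result together with the ticks recorded while the call was in flight.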


How does a microservice return data to the caller when using a message broker or a message queue?

I am pretty new to microservices, and I am trying to figure out how to set up a microservice architecture in which my publisher, which emits an event, can receive a response with data from the consumer.
From what I have read about message brokers and message queues, it seems like it's one-way communication. The producer emits an event (or rather, sends a message) which is handled by the message broker, and then the consumer consumes that event and performs some action.
This allows for decoupled code, which is part of what I'm looking for, but I don't understand whether the consumer is able to return any data to the caller.
Say, for example, I have a microservice that communicates with an external API to fetch data. I want to be able to send a message or emit an event from my front-facing server, which then calls the service that fetches the data, parses it, and returns it to my front-facing server.
Is there a way to make message brokers or queues bidirectional? Or are they only usable in one direction? I keep reading that message brokers allow services to communicate with each other, but I only find examples in which data flows one way.
Even reading the RabbitMQ documentation hasn't made it very clear to me how I could do this.
In general, when talking about messaging, it's one-way.
When you send a letter to someone you're not opening up a mind-meld so that they telepathically communicate their response to you.
Instead, you include a return address (or some other means of contacting you).
So to map a request-response interaction onto explicit messaging (e.g. via a message queue), the solution is the same: you include some directions which the recipient can/will interpret as "send a response here". That could, for instance, be "publish a message on this queue with this correlation ID".
Your publisher then, after sending this message, subscribes to the queue it's designated and waits for a message with the expected correlation ID.
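A minimal in-process sketch of that return-address idea, in Python. Two `asyncio.Queue` objects stand in for broker queues, and the message field names (`correlation_id`, `reply_to`, `body`) are assumptions for illustration, not any particular broker's API.

```python
import asyncio
import uuid


async def worker(requests: asyncio.Queue) -> None:
    # Consumer: does the work, then publishes the result to the reply queue
    # named in the message, echoing the correlation ID back.
    while True:
        msg = await requests.get()
        if msg is None:  # shutdown signal used only by this sketch
            break
        result = msg["body"].upper()  # stand-in for fetching and parsing data
        await msg["reply_to"].put(
            {"correlation_id": msg["correlation_id"], "body": result}
        )


async def request(requests: asyncio.Queue, replies: asyncio.Queue,
                  body: str, timeout: float = 1.0) -> str:
    correlation_id = str(uuid.uuid4())
    await requests.put(
        {"correlation_id": correlation_id, "reply_to": replies, "body": body}
    )

    async def wait_for_match() -> str:
        # Skip replies that are not ours; a real client would route those
        # to whichever caller is waiting on them instead of dropping them.
        while True:
            reply = await replies.get()
            if reply["correlation_id"] == correlation_id:
                return reply["body"]

    return await asyncio.wait_for(wait_for_match(), timeout)


async def demo() -> str:
    requests: asyncio.Queue = asyncio.Queue()
    replies: asyncio.Queue = asyncio.Queue()
    task = asyncio.create_task(worker(requests))
    answer = await request(requests, replies, "fetch data")
    await requests.put(None)
    await task
    return answer
```

The timeout matters: with one-way messaging nothing guarantees that a reply ever arrives, so the requester has to decide how long it is willing to wait.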
Needless to say, this is fairly elaborate: you are, in some sense, reimplementing a decent portion of a session protocol like TCP on top of a datagram protocol like IP (albeit in this case, we may have some stronger reliability guarantees than we'd get from IP). It's worth noting that this sort of request-response interaction intrinsically couples the two parties (we can't really say "sender and receiver": each is the other's audience), so we're basically putting in some effort to decouple the two sides and then some more effort to recouple them.
With that in mind, if the actual business use case calls for a request-response interaction like this, consider implementing it with an actual request-response protocol (e.g. REST over HTTP or gRPC...) and accept that you have this coupling.
Alternatively, if you really want to pursue loose coupling, go for broke and embrace the asynchronicity at the heart of the universe (maybe that way lies true enlightenment?). Have your publisher return success with that correlation ID as soon as it has sent its message. Meanwhile, have a different service track the state of those correlation IDs and expose a query interface (CQRS, hooray!). Your client can then check at any time whether the thing it wanted succeeded, even if its connection to your publisher gets interrupted.
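A rough Python sketch of that shape, with everything in memory. `StatusStore`, `submit`, and `process_outbox` are hypothetical names; in a real system the outbox would be a broker and the store would be a separate service with its own query endpoint.

```python
import uuid
from typing import Dict, List, Optional, Tuple


class StatusStore:
    """Hypothetical read model tracking the state of each correlation ID."""

    def __init__(self) -> None:
        self._status: Dict[str, dict] = {}

    def mark_pending(self, correlation_id: str) -> None:
        self._status[correlation_id] = {"state": "pending", "result": None}

    def mark_done(self, correlation_id: str, result: str) -> None:
        self._status[correlation_id] = {"state": "done", "result": result}

    def query(self, correlation_id: str) -> Optional[dict]:
        return self._status.get(correlation_id)


def submit(store: StatusStore, outbox: List[dict], body: str) -> str:
    # Command side: record the request, publish it, return immediately.
    correlation_id = str(uuid.uuid4())
    store.mark_pending(correlation_id)
    outbox.append({"correlation_id": correlation_id, "body": body})
    return correlation_id


def process_outbox(store: StatusStore, outbox: List[dict]) -> None:
    # Consumer side: runs whenever it runs; the client is not waiting on it.
    while outbox:
        msg = outbox.pop(0)
        store.mark_done(msg["correlation_id"], msg["body"][::-1])  # fake work


def demo() -> Tuple[dict, dict]:
    store, outbox = StatusStore(), []
    cid = submit(store, outbox, "fetch data")
    before = store.query(cid)  # client polls: still pending
    process_outbox(store, outbox)
    after = store.query(cid)   # client polls again: done
    return before, after
```

The client only ever holds the correlation ID, so it can reconnect and query again after any interruption.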
Queues are the wrong level of abstraction for request-reply. You can build an application out of them, but it would be nontrivial to support and operate.
The solution is to use an orchestration system like temporal.io or AWS Step Functions. These services out of the box provide state management, asynchronous communication, and automatic recovery in case of various types of failures.

What is the difference between a circuit breaker and a bulkhead pattern?

Can we use both together in Spring Boot during the development of microservice?
These are fundamentally different patterns.
A circuit breaker pattern is implemented on the caller, to avoid overwhelming a service which may be struggling to handle calls. A sample implementation in Spring can be found here.
A bulkhead pattern is implemented on the service, to prevent a failure during the handling of a single incoming call impacting the handling of other incoming calls. A sample implementation in Spring can be found here.
The only thing these patterns have in common is that they are both designed to increase the resilience of a distributed system.
While you can certainly use them together in the same service, you must understand that they are not related to each other, as one is concerned with making calls and the other is concerned with handling calls.
Yes, they can be used together, but it's not always necessary.
As Tom Redfern said, a circuit breaker is implemented on the caller side. So, if you are sending requests to another service, you should wrap those requests in a circuit breaker specific to that service. Keep in mind that every third-party system or service should have its own circuit breaker. Otherwise, the unavailability of one system will open the shared circuit breaker and impact the requests that you are sending to the others.
More information about circuit breakers can be found here: https://learn.microsoft.com/en-us/azure/architecture/patterns/circuit-breaker
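To make the caller-side idea concrete, here is a minimal circuit breaker sketch in Python rather than Spring. The class, its thresholds, and the `breakers` dictionary with its service names are all illustrative assumptions, not a library API.

```python
import time
from typing import Callable, Optional


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold: int = 3, reset_timeout: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    def call(self, func, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_timeout:
                # Open: fail fast instead of hitting a struggling service.
                raise CircuitOpenError("circuit is open; call rejected")
            # Timeout elapsed: half-open, let one trial call through.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            half_open_failed = self._opened_at is not None
            if half_open_failed or self._failures >= self.failure_threshold:
                self._opened_at = self._clock()  # (re)open the circuit
            raise
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result


# One breaker per downstream dependency (hypothetical service names), so an
# outage in one does not reject calls to the other.
breakers = {"payments": CircuitBreaker(), "inventory": CircuitBreaker()}
```

A call would then look like `breakers["payments"].call(fetch_invoice, invoice_id)`, where `fetch_invoice` is whatever function performs the remote request.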
Also, Tom Redfern is right again in the case of bulkheading: this is a pattern which is implemented in the service that is called. So, if you are reacting to external requests by spawning multiple other requests or workloads, you should avoid running all those workloads in a single unit (thread). Instead, separate the workloads into pieces (thread pools), one for each kind of request that you have spawned.
More information about bulkheading can be found here: https://learn.microsoft.com/en-us/azure/architecture/patterns/bulkhead
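As a small illustration of separate thread pools per workload, here is a Python sketch. The `Bulkheads` class and the "reports" and "search" workload names are assumptions made up for the example.

```python
import threading
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Dict, Tuple


class Bulkheads:
    """One bounded thread pool per workload, so that a stuck workload can
    only exhaust its own threads."""

    def __init__(self, limits: Dict[str, int]) -> None:
        self._pools = {
            name: ThreadPoolExecutor(max_workers=n, thread_name_prefix=name)
            for name, n in limits.items()
        }

    def submit(self, name: str, func, *args) -> Future:
        return self._pools[name].submit(func, *args)

    def shutdown(self) -> None:
        for pool in self._pools.values():
            pool.shutdown(wait=True)


def demo() -> Tuple[str, bool]:
    bulkheads = Bulkheads({"reports": 1, "search": 1})
    gate = threading.Event()
    # This task occupies the entire "reports" pool until the gate opens.
    stuck = bulkheads.submit("reports", gate.wait)
    # "search" has its own pool, so it is served regardless.
    quick = bulkheads.submit("search", lambda: "results")
    answer = quick.result(timeout=2)
    still_stuck = not stuck.done()
    gate.set()
    bulkheads.shutdown()
    return answer, still_stuck
```

With a single shared pool of size one, the search task would have queued behind the stuck report; with separate pools it completes while the report is still blocked.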
Your question was whether it's possible to use both of these patterns in the same microservice. The answer is: yes, you can, and very often the situation calls for it.

How do I close the loop on batched writes in AWS?

I have an endpoint in my api that supports writes. The resource in question is collaborative, so it is reasonable to expect that there will be parallel write requests arriving concurrently.
If the number of writes is small, then this is relatively straightforward to do with a simple lambda: read the current state, compute the new state, compare and swap, and spin until the swap succeeds or until we give up. In either case, we compute the appropriate HTTP response and return it to the caller.
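The read, compute, compare-and-swap loop can be sketched as follows in Python. `VersionedStore` is an in-memory stand-in for whatever store offers conditional writes, and the 200/409 status codes are just one plausible mapping to HTTP responses.

```python
import threading
from typing import Any, Callable, Tuple


class VersionedStore:
    """In-memory stand-in for a store that supports conditional writes."""

    def __init__(self, value: Any) -> None:
        self._lock = threading.Lock()
        self._value = value
        self._version = 0

    def read(self) -> Tuple[Any, int]:
        with self._lock:
            return self._value, self._version

    def compare_and_swap(self, expected_version: int, new_value: Any) -> bool:
        with self._lock:
            if self._version != expected_version:
                return False  # somebody else wrote first
            self._value = new_value
            self._version += 1
            return True


def write_with_retry(store: VersionedStore,
                     update: Callable[[Any], Any],
                     max_attempts: int = 5) -> int:
    for _ in range(max_attempts):
        current, version = store.read()
        if store.compare_and_swap(version, update(current)):
            return 200  # swap succeeded
    return 409  # gave up after repeated conflicts
```

Every failed swap repeats the read and the computation, which is the waste that grows with the number of concurrent writers.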
If the API is successful, then eventually the waste of conflicting writes becomes expensive enough to address.
It looks as though the natural response is to copy the requests into a queue, with a function that consumes batches; within each batch, we process the requests in sequence, storing the new write, and computing the appropriate response to the request.
What are the options for getting those computed responses copied into the HTTP responses, and what are the trade-offs to be considered?
My sense is that in handling the HTTP request, after (synchronously) enqueuing the message, I need to block/poll on something that will eventually be populated with the response to the request.
I'm not sure if this will count as an answer, but I do not agree that the natural response is to copy/queue/block; that feels like you're just trading optimistic concurrency control for a kind of pessimistic one (and you'd probably have an easier time just implementing a lock using e.g. Redis, not to mention that there are other issues with Lambda itself that would make the approach you describe even more difficult).
Users probably do not want an API like this as it would have high latency.
In my opinion, an API that is well designed for collaborative modification of some shared state has higher-order constructs that make the API successful. Taking a conversation as an example, you would decompose the chat into individual messages, where each message is in reply to some other message. The concurrent modification to the conversation is then append-only for the most part (you might allow a user to edit an individual message, but that's not a point of resource contention), and you might do things like count the number of messages within the conversation asynchronously, so that the count is eventually consistent.
You can look at the domain of your API and see if there's a way to expose modification to it in such a way that reduces contention by making modifications target sub-entities (even if the API represents this as a single resource, the storage engine does not have to).
Another option is looking into a model like event sourcing, where the changes themselves are literally appended and you derive the state from some snapshot plus recent changes.
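A toy Python sketch of the "snapshot plus recent changes" idea, reusing the conversation example. The `Conversation` class and its event shape are assumptions for illustration only.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Conversation:
    snapshot: Tuple[str, ...] = ()
    snapshot_version: int = 0  # number of events folded into the snapshot
    events: List[dict] = field(default_factory=list)

    def append(self, author: str, text: str) -> None:
        # Writes never modify existing state; they only append an event.
        self.events.append({"author": author, "text": text})

    def state(self) -> Tuple[str, ...]:
        # Current state = snapshot + events recorded since the snapshot.
        recent = self.events[self.snapshot_version:]
        return self.snapshot + tuple(
            f'{e["author"]}: {e["text"]}' for e in recent
        )

    def take_snapshot(self) -> None:
        self.snapshot = self.state()
        self.snapshot_version = len(self.events)
```

Because concurrent writers only append, they do not compete to overwrite the same value; ordering the appended events is left to the storage layer.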

C++ IRC Client design

I'm attempting to write an RFC 2812 compliant C++ IRC library.
I am having some trouble with the design of the client itself.
From what I have read IRC communication tends to be asynchronous.
I am using boost::asio::async_read and boost::asio::async_write.
From reading the documentation I have gathered that you cannot perform multiple async_write requests before one is completed. You therefore end up with rather nested callbacks. Doesn't this defeat the purpose of doing async calls? Wouldn't it just be better to use synchronous calls to prevent the nesting? If not, why?
Secondly, if I am not mistaken, each boost::asio::async_write should be followed up by a boost::asio::async_read to receive the server's response to the commands sent. My client's functions, therefore, would need to take a callback parameter so a user of the class may do something after the client receives a response (ex. send another message...).
If I were to continue implementing this with async, should I keep a std::deque<std::tuple<message, callback>> and each time a boost::asio::async_write is finished, and there is a tuple in the queue, dequeue and send the message then raise the callback? Would this be the optimal way to implement this system?
I'm thinking since messages are sent all the time I'm going to have to implement some kind of listener loop that queues up responses, but how would you associate these responses with the specific command that triggered them? Or in the case that the response is just a message to the channel from another user?
The IRC protocol is a full-duplex protocol. As such, one should always be listening to the server connection expecting commands to process. It could be argued that one should primarily use the messages received from the server to update state, rather than correlating request and responses, as the server may not respond to a command or may respond much later than expected. For example, one may issue a WHOIS command, but receive multiple PRIVMSG commands before receiving a response to WHOIS. For a chat client, a user would likely expect being able to receive chat messages while waiting for a response to WHOIS. Hence, having a async_write() to async_read() call chain may not be ideal in handling the protocol.
For a given socket, the Asio documentation does recommend not initiating additional read operations if there is an outstanding composed read operation, and not initiating additional write operations if there is an outstanding composed write operation. Queuing up messages and having an asynchronous call chain process the queue is a great way to fulfill this recommendation. Consider reading this answer for a nice solution using a queue and an asynchronous call chain.
Also, be aware that the server may send a PING command even on an active connection. When the client is responding with a PONG command, it may be necessary to insert the PONG command near the front of the outbound queue so that it gets sent out as soon as possible.
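The queue-plus-call-chain idea, including the PONG shortcut, can be sketched independently of Asio. The sketch below is Python, not Boost.Asio: `start_write` is a hypothetical callable that begins one asynchronous write, and `on_write_complete` plays the role of the write's completion handler.

```python
from collections import deque
from typing import Callable, Deque, List


class OutboundQueue:
    """Keeps at most one write in flight; further lines wait in a queue."""

    def __init__(self, start_write: Callable[[str], None]) -> None:
        self._start_write = start_write
        self._pending: Deque[str] = deque()
        self._writing = False

    def send(self, line: str, urgent: bool = False) -> None:
        if urgent:
            self._pending.appendleft(line)  # e.g. PONG jumps the queue
        else:
            self._pending.append(line)
        if not self._writing:
            self._write_next()

    def on_write_complete(self) -> None:
        # Called when the in-flight write finishes; continue the chain.
        self._writing = False
        if self._pending:
            self._write_next()

    def _write_next(self) -> None:
        self._writing = True
        self._start_write(self._pending.popleft())


def demo() -> List[str]:
    sent: List[str] = []
    out = OutboundQueue(sent.append)
    out.send("PRIVMSG #chan :one")   # starts immediately
    out.send("PRIVMSG #chan :two")   # queued behind the in-flight write
    out.send("PONG :server", urgent=True)
    for _ in range(3):
        out.on_write_complete()
    return sent
```

The urgent line goes to the front of the queue but never interrupts the write already in flight, which keeps the one-outstanding-write rule intact.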
Doesn't this defeat the purpose of doing async calls?
The usual solution is to use strands:
Why do I need strand per connection when using boost::asio?
You are free to queue multiple asynchronous operations on the same io objects using an (implicit) strand.
Using a strand ensures that the completion handlers are invoked on that same logical thread.
On the Protocol
You could indeed keep a queue of commands and await responses for each command before sending the next.
You might be a little bit smarter about this if you can spot the correlation from the different types of reply, but then you'd need to keep a queue per type of command. I'd consider that premature optimization.

What is the difference between event driven model and reactor pattern?

From the wikipedia Reactor Pattern article:
The reactor design pattern is an event handling pattern for handling service requests delivered concurrently to a service handler by one or more inputs.
It names a few examples, e.g. Node.js, Twisted, and EventMachine.
But as I understand it, those are popular event-driven frameworks, so does that make them reactor pattern frameworks as well?
How do I differentiate between the two? Or are they the same?
The reactor pattern is more specific than "event driven programming". It is a specific implementation technique used when doing event driven programming. However, the term is not used with much accuracy in typical conversation, so you should be careful about using it and expecting your audience to understand you, and you should be careful in how you interpret the term when you encounter its use.
One way to look at the reactor pattern is to consider it closely related to the idea of "non-blocking" operations. The reactor sends out notifications when certain operations can be completed without blocking. For example, select(2) can be used to implement the reactor pattern for reading from and writing to sockets using the standard BSD socket APIs (recv(2), send(2), etc). select will tell you when you can receive bytes from a socket instantly, for example because the bytes are already present in the kernel receive buffer for that socket.
Another pattern you might want to consider while thinking about these ideas is the proactor pattern. In contrast to the reactor pattern, the proactor pattern has operations start regardless of whether they can finish immediately or not, has them performed asynchronously, and then arranges to deliver notification about their completion.
The Windows I/O Completion Ports (IOCP) API is one example where the proactor pattern can be seen. When performing a send on a socket with IOCP, the send operation is started regardless of whether there is any room in the kernel send buffer for that socket. The send operation continues (in another thread, perhaps a thread in the kernel) while the WSASend call completes immediately. When the send actually completes (meaning only that the bytes being sent have been copied into the kernel send buffer for that socket), a callback function supplied to the WSASend call is invoked (in a new thread in the application).
This approach of starting operations and then being notified when they are complete is central to the idea of asynchronous operations. Compare it to non-blocking operations where you wait until an operation can complete immediately before attempting to perform it.
Either approach can be used for event driven programming. Using the reactor pattern, a program waits for the event of (for example) a socket being readable and then reads from it. Using the proactor pattern, the program instead waits for the event of a socket read completing.
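The difference in what the application waits for can be shown side by side in Python. This is a sketch under assumptions: `selectors` provides the readiness notification, and asyncio's `loop.sock_recv` is used only because it offers a completion-style API (on Unix, asyncio is itself built on readiness notification underneath).

```python
import asyncio
import selectors
import socket


def reactor_style_read() -> bytes:
    a, b = socket.socketpair()
    sel = selectors.DefaultSelector()
    try:
        b.setblocking(False)
        sel.register(b, selectors.EVENT_READ)
        a.sendall(b"ping")
        # Event: "the socket is readable" ...
        if not sel.select(timeout=1.0):
            raise TimeoutError("socket never became readable")
        # ... and only now does the application perform the read itself.
        return b.recv(1024)
    finally:
        sel.close()
        a.close()
        b.close()


async def proactor_style_read() -> bytes:
    a, b = socket.socketpair()
    try:
        a.setblocking(False)
        b.setblocking(False)
        loop = asyncio.get_running_loop()
        # Start the read up front, whether or not data is available yet.
        pending = asyncio.ensure_future(loop.sock_recv(b, 1024))
        await loop.sock_sendall(a, b"ping")
        # Event: "the read has completed"; we are handed the bytes.
        return await pending
    finally:
        a.close()
        b.close()
```

Both functions return the same bytes; what differs is whether the program is notified before the read (readiness) or after it (completion).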
Strictly speaking, Twisted misuses the term reactor. The Twisted reactor which is based on select(2) (twisted.internet.selectreactor) is implemented using non-blocking I/O, which is very reactor-like. However, the interface it exposes to application code is asynchronous, making it more proactor-like. Twisted also has a reactor based on IOCP. This reactor exposes the same asynchronous application-facing API and uses the proactor-like IOCP APIs. This hybrid approach, varying from platform to platform in its details, makes neither the term "reactor" nor "proactor" particularly accurate, but since the API exposed by twisted.internet.reactor is basically entirely asynchronous instead of non-blocking, proactor would probably have been a better choice of name.
I think that this separation of "non-blocking" and "asynchronous" is wrong, as the main implication of "asynchronous" is "non-blocking". The reactor pattern is about asynchronous (so non-blocking) calls, but synchronous (blocking) processing of those calls. The proactor pattern is about asynchronous (non-blocking) calls and asynchronous (non-blocking) processing of those calls.
To handle TCP connections, there are two competing web architectures, namely thread-based architecture and event-driven architecture.
Thread-Based Architecture
The oldest way of implementing a multi-threaded server is following the “thread per connection” approach. In order to control and limit the number of running threads, a single dispatcher thread can be used along with a bounded blocking queue and a thread pool.
The dispatcher blocks on a TCP socket for new connections and offers them to the bounded blocking queue. TCP connections exceeding the bound of the queue will be dropped allowing the accepted connections to operate with a desirable and predictable latency.
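A compact Python sketch of the dispatcher, bounded queue, and thread pool arrangement. To stay self-contained it takes "connections" from an iterable instead of calling accept() on a real socket; `serve` and its parameters are hypothetical names.

```python
import queue
import threading
from typing import Any, Callable, Iterable, List, Tuple


def serve(connections: Iterable[Any], handle: Callable[[Any], Any],
          workers: int = 2, backlog: int = 4) -> Tuple[List[Any], List[Any]]:
    pending: queue.Queue = queue.Queue(maxsize=backlog)  # bounded queue
    handled: List[Any] = []
    dropped: List[Any] = []
    lock = threading.Lock()

    def worker() -> None:
        while True:
            conn = pending.get()
            if conn is None:  # shutdown signal used only by this sketch
                return
            result = handle(conn)
            with lock:
                handled.append(result)

    pool = [threading.Thread(target=worker) for _ in range(workers)]
    for t in pool:
        t.start()

    # Dispatcher: a real server would block on accept() here.
    for conn in connections:
        try:
            pending.put_nowait(conn)  # offer to the bounded queue
        except queue.Full:
            dropped.append(conn)      # over the bound: drop the connection

    for _ in pool:
        pending.put(None)
    for t in pool:
        t.join()
    return handled, dropped
```

How many connections are dropped depends on timing: the bound trades rejected connections for predictable latency on the accepted ones.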
Event-Driven Architecture
Separating threads from connections, event-driven architecture only allows threads to be used for events on specific handlers.
This is where the reactor pattern comes into its own. A system built on this architecture consists of event creators and event consumers.
The Reactor Pattern
The reactor pattern is the most popular implementation technique of event-driven architecture for TCP connection handling. In simple terms, it uses a single-threaded event loop that blocks on events and dispatches them to the corresponding handlers.
There is no need for other threads to block on I/O, as long as handlers for events are registered to take care of them. For a TCP connection, the events of interest are: connected, input-ready, output-ready, timeout, and disconnected.
Reactor pattern decouples the modular application-level code from reusable reactor implementation. To achieve that, the architecture of the reactor pattern consists of two important participants — Reactor and Handlers.
A Reactor runs in a separate thread and reacts to I/O events such as connected, input-ready, output-ready, timeout, and disconnected by dispatching the work to the appropriate registered handler.
A Handler performs the actual work or the response that needs to be done with an I/O event. A Reactor responds to I/O events by dispatching the appropriate handler.
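The two participants can be sketched in a few lines of Python on top of the standard `selectors` module. The `Reactor` class below is a hypothetical minimal version that handles only the input-ready and disconnected events, and it stops when idle so that the example terminates.

```python
import selectors
import socket


class Reactor:
    """Single-threaded event loop that dispatches events to handlers."""

    def __init__(self) -> None:
        self._selector = selectors.DefaultSelector()
        self._running = False

    def register(self, sock: socket.socket, handler) -> None:
        self._selector.register(sock, selectors.EVENT_READ, handler)

    def unregister(self, sock: socket.socket) -> None:
        self._selector.unregister(sock)

    def stop(self) -> None:
        self._running = False

    def run(self, timeout: float = 1.0) -> None:
        self._running = True
        while self._running:
            events = self._selector.select(timeout)  # block on events
            if not events:
                break  # idle: stop so this sketch always terminates
            for key, _ in events:
                key.data(key.fileobj)  # dispatch to the registered handler
        self._selector.close()


def demo() -> bytes:
    client, server = socket.socketpair()
    reactor = Reactor()

    def on_input_ready(sock: socket.socket) -> None:
        # Handler: the application-level work for one event.
        data = sock.recv(1024)
        if data:
            sock.sendall(data.upper())
        else:  # empty read means the peer disconnected
            reactor.unregister(sock)
            reactor.stop()

    server.setblocking(False)
    reactor.register(server, on_input_ready)
    client.sendall(b"hello")
    client.shutdown(socket.SHUT_WR)  # signal end of input
    reactor.run()
    reply = client.recv(1024)
    client.close()
    server.close()
    return reply
```

The reactor knows nothing about uppercasing or sockets' meaning; all application logic lives in the handler, which is the decoupling the pattern is after.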
“Pattern Languages of Program Design” by Jim Coplien and Douglas C. Schmidt, which was published way back in 1995, is one of the books that explains the Reactor Pattern in detail.
