How to handle a failed or significantly delayed webhook?

We have a middleware that depends on a third-party system to execute payment requests. When we submit a payment request, the third-party system processes it and then sends us a webhook once the payment has completed successfully on their side. Sometimes the webhook never arrives or is significantly delayed, and they have no retry mechanism. They do, however, provide a status query API that returns the current status of a payment request.
We update our payment status based on this webhook, so it is vital for our system. We have found two ways to handle a failed webhook:
1. Run a scheduler that picks up payment requests whose webhook has not arrived and checks them against the status query API.
2. Implement a queue: add an entry when the original payment request is made, and call the status query API when a time-out event fires (e.g. using SQS).
Each approach has its own pros and cons. Is there any other way to handle this use case? If not, which of the two would be the better choice?

One option is to use an orchestrator like temporal.io to implement your business logic. The code to act on the webhook as well as poll the status API in parallel would be pretty simple.
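To make the "act on the webhook and poll in parallel" idea concrete, here is a minimal sketch in plain Python asyncio. It is not Temporal code and does not give you an orchestrator's durability; it only illustrates the control flow. The names `await_payment_result`, `poll_status`, `query_status`, and the status strings are all hypothetical; `webhook_future` is assumed to be an `asyncio.Future` that your webhook handler resolves.

```python
import asyncio

TERMINAL = {"SUCCEEDED", "FAILED"}  # assumed terminal statuses


async def poll_status(query_status, payment_id, interval):
    """Call the provider's status query API until a terminal status is seen."""
    while True:
        status = await query_status(payment_id)
        if status in TERMINAL:
            return status
        await asyncio.sleep(interval)


async def await_payment_result(webhook_future, query_status, payment_id,
                               interval=5.0, deadline=600.0):
    """Return the status from whichever source reports first: webhook or poller."""
    poller = asyncio.create_task(poll_status(query_status, payment_id, interval))
    done, _pending = await asyncio.wait(
        {webhook_future, poller},
        timeout=deadline,
        return_when=asyncio.FIRST_COMPLETED,
    )
    poller.cancel()  # no-op if the poller already finished
    if not done:
        return "UNKNOWN"  # neither source answered before the deadline
    return next(iter(done)).result()
```

The caller would update the payment record with the returned status; the update itself should be idempotent, because a late webhook can still arrive after the poller has already reported the result.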

Related

go concurrency architecture surrounding multi step process

I need to create a system that consumes a message from GCP Pub/Sub, which kicks off the following steps:
1. POST client API to create (synchronous)
2. POST client API to register (async), requires polling until status changes
3. POST client API to create another entity (leveraging the id returned from step 2)
4. POST client API to register (async), requires polling until status changes
I was thinking about setting up a job queue with multiple workers and passing the results from each step into a results channel. My question is about best practices for the polling piece.
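One common shape for the polling piece is a helper that re-checks the status with capped exponential backoff and an overall deadline. The sketch below is written in Python for illustration, not Go; `poll_until` and all of its parameters are hypothetical names, and `sleep` and `clock` are injectable only so the helper can be exercised without real waiting.

```python
import time


def poll_until(check, is_done, *, initial_delay=1.0, max_delay=30.0,
               timeout=300.0, sleep=time.sleep, clock=time.monotonic):
    """Call check() until is_done(result) is true, backing off between attempts."""
    deadline = clock() + timeout
    delay = initial_delay
    while True:
        result = check()
        if is_done(result):
            return result
        # Give up if the next wait would run past the overall deadline.
        if clock() + delay > deadline:
            raise TimeoutError("status did not change before the deadline")
        sleep(delay)
        delay = min(delay * 2, max_delay)  # capped exponential backoff
```

Steps 2 and 4 could each wrap their status call in `poll_until`, so a worker handles one message end to end and the backoff keeps load on the client API bounded.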

REST API uses asynchronous (events) internally

I am implementing a REST API that internally places a message on a message queue and receives a message as a response on a different topic.
How could the API implementation handle publishing and consuming different messages and respond to the client?
What if it never receives a message?
How does the service handle this time-out scenario?
Example
I am implementing a REST API to process an order. The implementation internally publishes a series of messages to verify the payment, update inventory, and prepare shipping info. Finally, it sends the response back to the client.

Queues are too low-level an abstraction to implement your requirements directly. Look at an orchestration solution like temporal.io, which makes programming such async systems much simpler.
Disclaimer: I'm one of the founders of the Temporal open source project.

How could the API implementation handle publishing and consuming different messages and respond to the client?
Messaging systems can be used in an RPC-like fashion:
there is a request topic/queue and a reply topic/queue
with a request identifier in the messages' header/metadata
However, this type of communication kills the promise of the messaging system: decoupling components in time and space.
Back to your example. If ServiceA receives the request, it publishes a message to topicA and returns a 202 Accepted status code to indicate that the request has been received but not yet completely processed. In the response you can include a URL at which the consumer of ServiceA's API can retrieve the latest status of its previously issued request.
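The 202 Accepted flow can be sketched without any web framework. In this illustration the handlers return `(status_code, headers, body)` tuples; `OrderApi`, the `/orders/requests/...` path, the message shape, and the injected `publish` callable are all assumptions standing in for your real HTTP layer and broker client.

```python
import uuid


class OrderApi:
    """Framework-free sketch of ServiceA: accept, publish, expose status."""

    def __init__(self, publish):
        self._publish = publish  # hypothetical callable(topic, message)
        self._requests = {}      # request_id -> status; a database in practice

    def submit(self, order):
        """Handle the POST: record the request, publish, answer 202."""
        request_id = str(uuid.uuid4())
        self._requests[request_id] = "ACCEPTED"
        self._publish("topicA", {"request_id": request_id, "order": order})
        headers = {"Location": f"/orders/requests/{request_id}"}
        return 202, headers, {"status": "ACCEPTED"}

    def status(self, request_id):
        """Handle the GET on the status URL returned by submit()."""
        state = self._requests.get(request_id)
        if state is None:
            return 404, {}, {"error": "unknown request"}
        return 200, {}, {"status": state}

    def on_result(self, message):
        """Consumer callback for the reply topic: record the outcome."""
        if message["request_id"] in self._requests:
            self._requests[message["request_id"]] = message["status"]
```

If no reply message ever arrives, `status()` simply keeps returning the state recorded at publish time.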
What if it never receives a message?
In that case the request related data remains in the same state as it was at the time of the message publishing.
How does the service handle this time-out scenario?
You can create scheduled jobs to clean up requests that never finished or got stuck. Depending on your business requirements, you can simply delete them or hand them over to customer service for manual processing.
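A clean-up job of this kind could look like the sketch below, which a scheduler would call periodically. The record layout (`status`, `updated_at`), the status names, and the 30-minute threshold are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta, timezone

# Statuses that need no further attention (assumed names).
TERMINAL = {"COMPLETED", "FAILED", "MANUAL_REVIEW"}


def sweep_stuck_requests(requests, now, max_age=timedelta(minutes=30)):
    """Flag non-terminal requests untouched for longer than max_age.

    `requests` maps request_id -> {"status": str, "updated_at": datetime}.
    Returns the ids that were moved to manual review.
    """
    flagged = []
    for request_id, record in requests.items():
        if record["status"] in TERMINAL:
            continue
        if now - record["updated_at"] > max_age:
            record["status"] = "MANUAL_REVIEW"
            flagged.append(request_id)
    return flagged
```

Passing `now` in explicitly, for example `datetime.now(timezone.utc)`, keeps the function deterministic and easy to test.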
Order placement use case
Rather than creating a customer-facing service which waits for all the processing to be done you can define several statuses/stages of the process:
Order requested
Payment verified
Items locked in inventory
...
Order placed
You can inform your customers about these status/stage changes via WebSocket, push notification, e-mail, etc. The orchestration of this order placement flow can be achieved, for example, via the Saga pattern.
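As a rough illustration of the Saga idea applied to these stages, the sketch below runs each step in order and, if one fails, runs the compensations of the already completed steps in reverse. It is an in-process simplification: a real saga would persist its progress and talk to separate services. `run_saga`, the step tuples, and the `context` dictionary are hypothetical.

```python
def run_saga(steps, context):
    """Run (stage_name, action, compensation) steps in order.

    On failure, undo completed steps in reverse order and return False.
    context["status"] always holds the latest stage reached.
    """
    context["status"] = "Order requested"
    completed = []
    for stage, action, compensate in steps:
        try:
            action(context)
        except Exception:
            # Roll back everything that succeeded so far, newest first.
            for undo in reversed(completed):
                undo(context)
            context["status"] = f"Failed at: {stage}"
            return False
        completed.append(compensate)
        context["status"] = stage  # a good place to notify the customer
    context["status"] = "Order placed"
    return True
```

Each status assignment is the point where a notification to the customer could be emitted.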

PayPal sandbox subscription webhooks very slow

I'm using PayPal webhooks to get subscription information automatically.
However, we have to wait about 20 seconds between the payment and the subscription activation.
Is it because of the sandbox environment? Is the production environment faster?
This is important because the customers have to wait and if waiting time could be avoided, it would be better.

The sandbox is slower in general, but you will need to test this yourself in live, and the speed of asynchronous notifications varies under different conditions.
If you need a faster notification, you can have the client-side onApprove event call your server (with a JS fetch similar to this demo, plus a body payload if desired). The server route that handles that fetch then uses the Subscriptions API to get the status of the subscription and checks whether it is in fact active in that API response, direct from PayPal.
Such a client-side trigger of a server route runs in parallel with waiting for the webhook notification, so whichever completes first marks the subscription as active in your records. This way you rely on neither the client-side trigger alone nor the webhook alone, but on whichever happens first.
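For "whichever happens first" to be safe, both paths should end in the same idempotent update. The sketch below shows one way to do that on the server side. `SubscriptionStore`, the two handler functions, the injected `fetch_status` callable (standing in for your call to the Subscriptions API), and the webhook event shape are all assumptions; check the event and status names against PayPal's documentation.

```python
import threading


class SubscriptionStore:
    """In-memory stand-in for your records; activation is idempotent."""

    def __init__(self):
        self._lock = threading.Lock()
        self._active = set()

    def mark_active(self, subscription_id):
        """Return True only for the first caller that activates this id."""
        with self._lock:
            if subscription_id in self._active:
                return False
            self._active.add(subscription_id)
            return True


def handle_client_approval(store, subscription_id, fetch_status):
    """Server route hit by the client-side onApprove fetch."""
    # fetch_status is hypothetical: it asks PayPal for the current status.
    if fetch_status(subscription_id) == "ACTIVE":
        return store.mark_active(subscription_id)
    return False


def handle_webhook(store, event):
    """Webhook route; the event shape below is an assumption."""
    if event.get("event_type") == "BILLING.SUBSCRIPTION.ACTIVATED":
        return store.mark_active(event["resource"]["id"])
    return False
```

Only the first path to arrive returns `True`, so side effects such as a welcome e-mail can be tied to that return value and will not fire twice.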

Should I do Solana transactions synchronously in my apis?

I know that Solana is fast compared to most of the more decentralized blockchains.
Still, my question is whether I should process my transactions asynchronously off a queue, so that I can retry if something fails.
This gets more complicated if, for example, I am using Metaplex to create a token with associated metadata, as that involves two transactions: one to create the token and another to create the token metadata.

You're absolutely encouraged to retry transactions as appropriate. Here is an excerpt from the Solana Cookbook which gives the TLDR for retrying:
RPC nodes will attempt to rebroadcast transactions using a generic algorithm
Application developers can implement their own custom rebroadcasting logic
Developers should take advantage of the maxRetries parameter on the sendTransaction JSON-RPC method
Developers should enable preflight checks to raise errors before transactions are submitted
Before re-signing any transaction, it is very important to ensure that the initial transaction’s blockhash has expired
You can read the full source at https://solanacookbook.com/guides/retrying-transactions.html
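To show where `maxRetries` and the preflight setting go, here is a small sketch that only builds the JSON-RPC request body for `sendTransaction`; it does not sign or send anything. The helper name `build_send_transaction_request` is hypothetical, and the option names reflect my understanding of the Solana JSON-RPC API, so verify them against the current RPC documentation.

```python
import json


def build_send_transaction_request(signed_tx_base64, max_retries, request_id=1):
    """Build the JSON-RPC body for sendTransaction with explicit retry settings."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "sendTransaction",
        "params": [
            signed_tx_base64,  # fully signed, base64-encoded transaction
            {
                "encoding": "base64",
                "skipPreflight": False,     # keep preflight checks enabled
                "maxRetries": max_retries,  # RPC node's own rebroadcast attempts
            },
        ],
    }


# Serialize for an HTTP POST to your RPC endpoint of choice.
body = json.dumps(build_send_transaction_request("<signed-tx-base64>", max_retries=3))
```

A queue worker would send this body, track the returned signature, and only re-sign once the original blockhash has expired, as the excerpt above warns.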

Opayo - How to handle timeouts?

I currently manage an integration with Opayo Direct (v4.00). We send requests to Opayo, which work fine most of the time, but occasionally they time out (our timeout limit is currently set to 20s, which is ages for a consumer to wait).
Does anyone know of a way to either:
1. Retry the payment request without double charging the consumer? or,
2. Send a follow-up request to get the status of the submitted payment?
For 2, it looks like you need a valid transaction identifier from Opayo, which of course we won't have because the request has timed out.
I can't see any mention of idempotency or guidance on what to do in this situation, even in their most recent API specification (PI integration).
Has anyone come up with a workable solution for this problem, other than changing provider?
