Handle a large dataset in a RESTful API without pagination - Oracle

I need to send more than 2 million records from a REST API that connects to an Oracle database. The response size could be more than 3 GB. I know this will hurt the API's performance and may cause out-of-memory errors.
I'm researching HTTP chunking and WebFlux to avoid performance problems on both the API side and the consumer side, but I'm not sure they will solve my issue.
Are there any approaches for handling large datasets like this, on the order of 2 or 3 million records?
I'd like to understand the best approach for my problem.
I tried StreamingResponseBody, but the backend connection got closed because the request didn't complete within 5 minutes.
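One option (a minimal sketch, not the asker's code; the endpoint, table, and column names are made up) is to stream rows to the client as newline-delimited JSON instead of materializing a 3 GB response in memory: set a large JDBC fetch size so the Oracle driver streams the result set, and write each row to the response as it arrives.

// Sketch: stream query results row by row as NDJSON with StreamingResponseBody.
import com.fasterxml.jackson.databind.ObjectMapper;
import org.springframework.http.ResponseEntity;
import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.jdbc.core.RowCallbackHandler;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.servlet.mvc.method.annotation.StreamingResponseBody;

import javax.sql.DataSource;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Map;

@RestController
public class ExportController {

    private final DataSource dataSource;
    private final ObjectMapper objectMapper = new ObjectMapper();

    public ExportController(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    @GetMapping(value = "/export", produces = "application/x-ndjson")
    public ResponseEntity<StreamingResponseBody> export() {
        StreamingResponseBody body = out -> {
            JdbcTemplate jdbc = new JdbcTemplate(dataSource);
            jdbc.setFetchSize(5_000); // hint the Oracle driver to stream instead of buffering all rows

            RowCallbackHandler writeRow = rs -> {
                try {
                    // one JSON object per row, newline-delimited
                    out.write(objectMapper.writeValueAsBytes(
                            Map.of("id", rs.getLong("id"), "name", rs.getString("name"))));
                    out.write('\n');
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            };
            jdbc.query("SELECT id, name FROM big_table", writeRow);
            out.flush();
        };
        return ResponseEntity.ok(body);
    }
}

Because StreamingResponseBody runs asynchronously, a disconnect at exactly 5 minutes may come from the MVC async timeout (spring.mvc.async.request-timeout) or from an idle timeout on a gateway or load balancer in front of the service; for a long-running stream those usually need to be raised, or the response must keep flushing data before the idle window expires.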

Related

Snowflake's Asynchronous External function not respecting HttpStatus 429

I have implemented an API that adheres to Snowflake's asynchronous external function contract.
Our system uses AWS API Gateway, a Lambda function, and a third-party API (TPA).
We store certain information in a Snowflake table and try to enrich that table using a Snowflake external user-defined function.
We are able to enrich the table when the number of records is small. When we try to enrich 3 million records, then after some time our TPA starts sending HTTP 429. This is an indicator telling our Lambda function to slow down the rate of Snowflake's requests.
We understand this, and the moment the Lambda function gets an HTTP 429, it sends HTTP 429 back to Snowflake in any polling/POST requests. The expectation is that Snowflake will slow down its requests rather than throwing an error and stopping further processing.
Below is the response sent to Snowflake:
{
"statusCode" : 429
}
But the situation doesn't change, and it looks like Snowflake is not respecting HTTP 429 in the request-reply pattern.
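For reference, here is a minimal sketch (not the asker's Lambda) of the pattern described above, assuming a Java handler behind an API Gateway proxy integration. The TPA endpoint is made up, and passing the TPA body straight through assumes it is already in Snowflake's {"data": [[rowNumber, value], ...]} shape.

import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;
import com.amazonaws.services.lambda.runtime.events.APIGatewayProxyRequestEvent;
import com.amazonaws.services.lambda.runtime.events.APIGatewayProxyResponseEvent;

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class EnrichmentHandler
        implements RequestHandler<APIGatewayProxyRequestEvent, APIGatewayProxyResponseEvent> {

    private static final String TPA_URL = "https://tpa.example.com/enrich"; // hypothetical endpoint
    private final HttpClient http = HttpClient.newHttpClient();

    @Override
    public APIGatewayProxyResponseEvent handleRequest(APIGatewayProxyRequestEvent request, Context context) {
        try {
            HttpRequest call = HttpRequest.newBuilder(URI.create(TPA_URL))
                    .POST(HttpRequest.BodyPublishers.ofString(request.getBody()))
                    .build();
            HttpResponse<String> tpaResponse = http.send(call, HttpResponse.BodyHandlers.ofString());

            if (tpaResponse.statusCode() == 429) {
                // TPA is throttling: propagate 429 so Snowflake scales back and retries the batch.
                return new APIGatewayProxyResponseEvent().withStatusCode(429);
            }
            // Assumes the TPA already returns Snowflake's {"data": [[rowNumber, value], ...]} body.
            return new APIGatewayProxyResponseEvent().withStatusCode(200).withBody(tpaResponse.body());
        } catch (Exception e) {
            return new APIGatewayProxyResponseEvent().withStatusCode(500);
        }
    }
}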
Snowflake does handle HTTP 4xx responses when working with external functions.
Have you engaged support? I have worked with customers who had this issue, and the Snowflake team is able to review it.
AWS API Gateway has a default limit of 10,000 requests per second.
Please review Designing High Performance External Functions:
Remote services should return HTTP response code 429 when overloaded. If Snowflake sees HTTP 429, Snowflake scales back the rate at which it sends rows, and retries sending batches of rows that were not processed successfully.
Your options for resolution are:
Work with AWS to increase your API Gateway rate limit.
However, some proxy services, including Amazon API Gateway and Azure API Management, have default usage limits. When the request rate exceeds the limit, these proxy services throttle requests. If necessary, you might need to ask AWS or Azure to increase your quota on your proxy service.
or
Try using a smaller warehouse, so that Snowflake sends less volume to API Gateway per second. This has the obvious drawback of running more slowly.

Fetching a large webhook in an optimal way

SendGrid sends a webhook to my endpoint comprising 10,000 records. At my endpoint, I use a RabbitMQ queue to process the webhook further. The issue I am facing is that the MongoDB update process is experiencing too much load. I've searched around and didn't find any results for this issue. What I was thinking of doing is to send the data to MongoDB in chunks after consuming the message. Or is there a way I can receive the webhook in chunks before pushing it to my queue?
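One way to take load off MongoDB (a minimal sketch, assuming the RabbitMQ consumer hands the 10,000 parsed records to a writer like this; the database, collection, and batch-size values are made up) is to group the records into small unordered bulk writes instead of updating them one by one.

import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.model.BulkWriteOptions;
import com.mongodb.client.model.Filters;
import com.mongodb.client.model.ReplaceOneModel;
import com.mongodb.client.model.ReplaceOptions;
import com.mongodb.client.model.WriteModel;
import org.bson.Document;

import java.util.ArrayList;
import java.util.List;

public class WebhookBatchWriter {

    private static final int BATCH_SIZE = 500; // tune to what your MongoDB deployment tolerates

    private final MongoCollection<Document> events;

    public WebhookBatchWriter(MongoClient client) {
        this.events = client.getDatabase("mail").getCollection("events");
    }

    // Writes the consumed webhook records in small unordered bulk batches,
    // upserting on SendGrid's sg_event_id so redelivered webhooks don't create duplicates.
    public void save(List<Document> records) {
        List<WriteModel<Document>> batch = new ArrayList<>(BATCH_SIZE);
        for (Document record : records) {
            batch.add(new ReplaceOneModel<>(
                    Filters.eq("sg_event_id", record.getString("sg_event_id")),
                    record,
                    new ReplaceOptions().upsert(true)));
            if (batch.size() == BATCH_SIZE) {
                events.bulkWrite(batch, new BulkWriteOptions().ordered(false));
                batch.clear();
            }
        }
        if (!batch.isEmpty()) {
            events.bulkWrite(batch, new BulkWriteOptions().ordered(false));
        }
    }
}

Unordered bulk writes let MongoDB apply a whole batch without stopping at the first failure, which tends to cut round trips compared with per-record updates.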

Performance improvement idea for saving a large number of objects to the database

My web application has a feature that allows the user to send messages to all of their friends. The number of friends can be 100K to 200K. The application uses Spring and Hibernate.
It entails fetching the friends' info, building the message objects, and saving them to the database. After all the messages are sent (actually saved to the DB), a notification pops up showing how many messages were sent successfully, such as 99/100 sent or 100/100 sent.
However, when I was load testing this feature, it took an extremely long time to finish. I am trying to improve the performance. One approach I tried was to divide the friends into small batches and fetch/save each batch concurrently, waiting on all of them to finish. But that still didn't bring much improvement.
I was wondering if there are any other ways I can try. Another approach I can think of is to use WebSockets to send each batch, update the notification after each batch, and then start the next batch until all the batches are sent. But how can the user still get the notification after navigating away from the message page? The WebSocket logic on the client side has to be somewhere global, correct?
Thank you in advance.
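One more thing worth trying (a minimal sketch of the batching idea; Message here is a stand-in for the real entity) is Hibernate JDBC batching combined with periodically flushing and clearing the session, so the persistence context never holds 100K-200K managed objects at once.

import org.hibernate.Session;
import org.hibernate.SessionFactory;
import org.hibernate.Transaction;

import java.util.List;

public class MessageBatchWriter {

    private static final int BATCH_SIZE = 50; // should match hibernate.jdbc.batch_size

    private final SessionFactory sessionFactory;

    public MessageBatchWriter(SessionFactory sessionFactory) {
        this.sessionFactory = sessionFactory;
    }

    // Saves one message per friend in JDBC batches; returns how many were saved
    // so the caller can build the "99/100 sent" notification.
    public int saveAll(List<Long> friendIds, String text) {
        int saved = 0;
        try (Session session = sessionFactory.openSession()) {
            Transaction tx = session.beginTransaction();
            for (Long friendId : friendIds) {
                session.persist(new Message(friendId, text)); // Message is a stand-in entity
                saved++;
                if (saved % BATCH_SIZE == 0) {
                    session.flush();  // push the current JDBC batch to the database
                    session.clear();  // detach the saved entities so the session stays small
                }
            }
            tx.commit();
        }
        return saved;
    }
}

This assumes hibernate.jdbc.batch_size is set (for example, to 50) in the Hibernate configuration; note that Hibernate silently skips insert batching for entities that use IDENTITY id generation, so a sequence-based generator works better here.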

Configure Apollo GraphQL hanging requests

I'd like to set an arbitrary timeout on my GraphQL requests. Say I make a request and it takes longer than 10 seconds; I'd like Apollo to send back an error.
Thoughts?
Would I need to do this with both the Apollo client and the Apollo server (say, for additional service requests such as databases, etc.)?
There are three different places where timeouts might make sense:
1. For the connection to the server
To have a timeout for requests sent to the server, you could build a wrapper around the network interface, which would reject query promises after x seconds.
2. For the query resolution on the GraphQL server
To implement a per-query timeout on the server, you could put the query start time on the context at the start of the query, and wrap each resolve function with a function that either returns the promise from the resolver, or rejects when the timeout has elapsed.
3. For the connection between your GraphQL server and the backends
To implement timeouts for requests to the backend, you can simply make the fetch-requests to the backends time out after a certain amount of time.
PS: It's worth noting that the solutions above will cause queries or requests to time out, but they won't cancel them, which means that your server or backends will most likely continue doing work that is now wasted. However, cancelling is an entirely different problem, and it's also harder to address.
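As an illustration of point 3 only (shown in Java for consistency with the other sketches on this page; the backend URL and durations are made up), a backend call can be given both a connect timeout and an overall request timeout. In an Apollo/Node stack, the same idea applies to the fetch calls made by the resolvers.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.net.http.HttpTimeoutException;
import java.time.Duration;

public class BackendTimeoutExample {

    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(2))   // time allowed to establish the connection
                .build();

        HttpRequest request = HttpRequest.newBuilder(URI.create("https://backend.example.com/data"))
                .timeout(Duration.ofSeconds(10))         // time allowed for the whole exchange
                .build();

        try {
            HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println("backend answered with status " + response.statusCode());
        } catch (HttpTimeoutException e) {
            // Surface this as an error to the caller/resolver; as noted above,
            // the backend may still be doing the now-wasted work.
            System.err.println("backend timed out: " + e.getMessage());
        }
    }
}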

What's the rate limit on the Square Connect API?

Currently the documentation just says:
If Connect API endpoints receive too many requests associated with the same application or access token in a short time window, they might respond with a 429 Too Many Requests error. If this occurs, try your request again at a later time.
Much appreciated!
Currently, Connect API rate limits are on the order of 10 QPS. This limit might change in the future and should not be relied on.
