Linux RHEL 5.4, ext3 file system
At time T1, when I read or write a file, an I/O request is sent to the OS (disk?) working queue. Suppose the disk takes 10 ms to serve this request, so the time is now T2 = T1 + 10 ms.
The question is: when is the request removed from the OS (disk?) working queue, at T1 or T2?
Thank you.
Up until about 2.6.31, it was up to each individual device driver to decide when to remove requests from the kernel request queue.
In more recent kernels, requests are always removed from the queue as the driver retrieves them to send to the device.
If the disk has a queue (e.g. ATA NCQ), a request will remain in the disk's queue from the time the driver places it there until the driver overwrites it with a later request (after the disk has signalled completion of the first request). Disk queues aren't strictly queues, as there is no head or tail; they are just a pool of command slots that can be started independently.
We are working on a POC to send messages to clients/browsers over WebSockets. We are using AWS API Gateway WebSockets for it: after a client requests a connection, the connection is created and its ID is stored in DynamoDB. Whenever there is an update, an AWS Lambda fetches all the connection IDs from DynamoDB, iterates over them, and sends the message to the clients over their WebSocket connections.
This solution works fine with a small number of clients but fails at scale, because the Lambda has to iterate through a large number of connections. Is there support from API Gateway to broadcast messages to all clients about the updates? If not, what approach can we take to support a large number of clients using WebSockets?
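For reference, here is a minimal sketch of the broadcast pattern described above, assuming the AWS SDK for Java v2; the table name "Connections", the attribute name "connectionId", and the endpoint URL are placeholders:

    import java.net.URI;
    import java.util.Map;
    import software.amazon.awssdk.core.SdkBytes;
    import software.amazon.awssdk.services.apigatewaymanagementapi.ApiGatewayManagementApiClient;
    import software.amazon.awssdk.services.apigatewaymanagementapi.model.PostToConnectionRequest;
    import software.amazon.awssdk.services.dynamodb.DynamoDbClient;
    import software.amazon.awssdk.services.dynamodb.model.AttributeValue;
    import software.amazon.awssdk.services.dynamodb.model.ScanRequest;

    public class Broadcaster {
        public static void broadcast(String payload) {
            DynamoDbClient ddb = DynamoDbClient.create();
            // The WebSocket API's management endpoint (the stage URL) must be set explicitly.
            ApiGatewayManagementApiClient api = ApiGatewayManagementApiClient.builder()
                    .endpointOverride(URI.create("https://example.execute-api.us-east-1.amazonaws.com/prod"))
                    .build();

            // Fetch every stored connection ID, then post the update to each one in turn.
            for (Map<String, AttributeValue> item :
                    ddb.scan(ScanRequest.builder().tableName("Connections").build()).items()) {
                api.postToConnection(PostToConnectionRequest.builder()
                        .connectionId(item.get("connectionId").s())
                        .data(SdkBytes.fromUtf8String(payload))
                        .build());
            }
        }
    }

The scan plus the per-connection posts is exactly the part that stops scaling as the number of connections grows.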
Is there support from API Gateway to broadcast messages to all clients about the updates?
There is no way via the API Gateway API (at least the v3 JavaScript SDK) to send to client connections without explicitly knowing the connection ID.
What approach can we take to support a large number of clients using WebSockets?
Scanning DynamoDB is not ideal in terms of cost or performance. I've learned this the hard way.
I would consider either creating your own WebSocket server and hosting it on EC2, or switching your data storage to something outside the traditional offerings of AWS, assuming your requirements are minimal (i.e., only needing to store connection IDs).
I am working on a similar project (WebSocket API Gateway + DynamoDB + Lambda triggered by a FIFO SQS queue to publish messages to the connected users), and I realized that what was slowing everything down when broadcasting the messages was the postToConnection method.
At first, I tried multithreading in Python to make multiple calls in parallel, but I soon realized it didn't change anything.
At some point, I realized that the memory setting for my Lambda was still the default 128 MB. I was not hitting the memory limit at all, but on the Lambda's configuration page, I noticed this sentence:
Your function is allocated CPU proportional to the memory configured.
The Memory (MB) setting determines the amount of memory available for your Lambda function during invocation. Lambda allocates CPU power linearly in proportion to the amount of memory configured. At 1,769 MB, a function has the equivalent of one vCPU (one vCPU-second of credits per second). To increase or decrease the memory and CPU power allocated to your function, set a value between 128 MB and 10,240 MB.
Upon increasing the memory setting (and with it the CPU), I immediately noticed a huge boost in performance. I can't say what the "ideal" setting is for a given number of connections, but just increasing it to 512 MB made all the difference in our case.
Hope this helps!
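As a rough illustration of why the extra CPU matters, here is a hedged sketch of fanning the postToConnection calls out over a thread pool; the pool size and the api client are assumptions to tune for your own workload:

    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;
    import software.amazon.awssdk.core.SdkBytes;
    import software.amazon.awssdk.services.apigatewaymanagementapi.ApiGatewayManagementApiClient;
    import software.amazon.awssdk.services.apigatewaymanagementapi.model.PostToConnectionRequest;

    public class ParallelBroadcaster {
        public static void broadcast(ApiGatewayManagementApiClient api,
                                     List<String> connectionIds,
                                     String payload) throws Exception {
            ExecutorService pool = Executors.newFixedThreadPool(32); // pool size is a guess to tune
            List<Future<?>> pending = new ArrayList<>();
            for (String connectionId : connectionIds) {
                // Submit each post as its own task; with more vCPU, more of these run truly in parallel.
                pending.add(pool.submit(() -> api.postToConnection(PostToConnectionRequest.builder()
                        .connectionId(connectionId)
                        .data(SdkBytes.fromUtf8String(payload))
                        .build())));
            }
            for (Future<?> f : pending) {
                f.get(); // wait for every send before the Lambda invocation returns
            }
            pool.shutdown();
        }
    }

With the default 128 MB (a fraction of one vCPU), these threads mostly take turns; at higher memory settings the same code gets real parallelism.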
I'm using Chronicle Queue v5.17.0 to process messages, and my understanding is that the queue does not lose messages even if the Java process dies (because the queue uses a memory-mapped file which is flushed by the OS).
Will some messages be lost if the VM dies or OS crashes before it flushes memory content to file?
Is there a way to control messages flush to disk?
Thank you!
Yes. If the data has not been flushed to disk, it will be lost. There is also no guarantee that the disk you are writing to has not become corrupted.
Even a forced flush to disk cannot be relied on. As such, if you wish to guarantee that no messages are lost, we recommend using Chronicle Queue Enterprise to replicate your queue's data to another host. Once the acknowledgement has been received for each message, you have a safe copy of that message. For more information on Chronicle Queue Enterprise, please contact sales@chronicle.software.
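For context, a minimal sketch of the write/read path in question, assuming Chronicle Queue 5.x (the directory name and payload are placeholders). The append returns as soon as the bytes are in the memory-mapped pages, so an OS crash before those pages are written back can lose recent messages, as described above:

    import net.openhft.chronicle.queue.ChronicleQueue;
    import net.openhft.chronicle.queue.ExcerptAppender;
    import net.openhft.chronicle.queue.ExcerptTailer;
    import net.openhft.chronicle.queue.impl.single.SingleChronicleQueueBuilder;

    public class QueueExample {
        public static void main(String[] args) {
            try (ChronicleQueue queue = SingleChronicleQueueBuilder.binary("queue-dir").build()) {
                ExcerptAppender appender = queue.acquireAppender();
                appender.writeText("hello");        // lands in the memory-mapped file; the OS flushes it lazily

                ExcerptTailer tailer = queue.createTailer();
                String message = tailer.readText(); // "hello", or null if nothing is available
                System.out.println(message);
            }
        }
    }

A process crash after writeText() returns is survivable (the pages already belong to the OS); a machine or OS crash before writeback is not, which is why replication is the recommended safeguard.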
What are the best practices regarding sessions in an application that is designed to fetch messages from an MQ server every 5 seconds?
Should I keep one session open for the whole time (could be weeks or longer), or better open a session, fetch the messages, and then close the session again?
I am using the IBM XMS v8 .NET client library.
Adding to Attila Repasi's response: I would go for a consumer with a message listener attached. The message listener gets called whenever a message needs to be delivered to the application. This avoids the application explicitly calling receive() to retrieve messages from the queue and wasting CPU cycles when there are no messages on the queue.
Check the XMS .NET best practices.
Keep the connection and session open for a longer period if your application sends or receives messages continuously. Creating a connection or session is a time-consuming operation that consumes a lot of resources and involves network flows (for client connections).
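The question is about XMS .NET, but XMS mirrors the JMS programming model, so here is a hedged Java/JMS sketch of the listener pattern described above; the connection-factory setup and the queue name are placeholders:

    import javax.jms.Connection;
    import javax.jms.ConnectionFactory;
    import javax.jms.MessageConsumer;
    import javax.jms.Session;

    public class ListenerExample {
        public static void run(ConnectionFactory factory) throws Exception {
            Connection connection = factory.createConnection();
            Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
            MessageConsumer consumer = session.createConsumer(session.createQueue("DEV.QUEUE.1"));

            // The provider calls this whenever a message arrives; no receive() polling loop is needed.
            consumer.setMessageListener(message -> System.out.println("Received: " + message));

            connection.start(); // deliveries begin only after start()
            // Keep the connection and session open for the lifetime of the application.
        }
    }

The same shape applies in XMS .NET: create the connection and session once at startup, attach the listener, and leave everything open rather than reconnecting every 5 seconds.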
I'm not sure what you are calling a session, but typically applications connect to the queue manager serving them once at start, and keep that connection up while running.
I don't see a reason to disconnect just to reconnect 5 seconds later.
As for keeping the queues open, it depends on your environment.
If there are no special circumstances, I would keep the queue open.
I think the thing most worth thinking about is how you issue the GETs to read the messages.
I've recently been attempting to send a workload of read operations to a 2-node Cassandra cluster (version 2.0.9, with RF=2). My intention was to send reads at a rate higher than the capacity of my backend servers, thereby overwhelming them and causing server-side queuing. To do so, I'm using the DataStax Java driver (CQL version 2) to run my operations asynchronously (in other words, the calling thread doesn't block waiting for a response).
The problem is that I'm unable to reach a high enough sending rate to overload my backend servers. The number of requests I'm sending is somehow being throttled by Cassandra. To confirm this, I ran clients from two different machines simultaneously, and the total number of requests sent per unit time still peaks at the same value. I'm wondering if there is a mechanism employed by Cassandra to throttle the number of requests being received? Otherwise, what else might be causing this behavior?
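For reference, a minimal sketch of the async read pattern described above, assuming the DataStax Java driver 2.x; the contact point, keyspace, table, and request count are placeholders:

    import java.util.ArrayList;
    import java.util.List;
    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.ResultSetFuture;
    import com.datastax.driver.core.Session;

    public class AsyncReads {
        public static void main(String[] args) {
            Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
            Session session = cluster.connect("my_keyspace");

            // Fire off the reads without blocking; the calling thread does not wait per request.
            List<ResultSetFuture> futures = new ArrayList<>();
            for (int i = 0; i < 100000; i++) {
                futures.add(session.executeAsync("SELECT * FROM my_table WHERE id = " + i));
            }

            // Only block at the end, once everything has been submitted.
            for (ResultSetFuture future : futures) {
                future.getUninterruptibly();
            }
            cluster.close();
        }
    }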
Each request received by Cassandra will be handled by multiple thread pools implementing a staged event-driven architecture, where requests are queued for each stage. You can use nodetool tpstats to inspect the current status of each queue. When too many requests threaten to overwhelm the server, Cassandra will shed load by dropping requests as queues approach their capacity. You'll notice this in the numbers shown in the dropped section of tpstats. In case no requests are dropped, all of them will eventually complete, but you may see higher latencies using nodetool cfhistograms, or WriteTimeoutExceptions on the client.
The network bandwidth on the Cassandra side is throttling the number of requests being received.
As far as I know, there is no other mechanism employed by Cassandra to prevent itself from receiving too many requests. Timeout exceptions are the main mechanism Cassandra uses to avoid crashing when it is overloaded.
Yes, Cassandra has multiple ways to throttle incoming requests. The first action on your part would be to find out which mechanism is the culprit. Then you can tune this mechanism to fit your needs.
The first step to finding out where the blocking occurs would be to connect to JMX with jconsole (or similar) and look at the queue and blocked values.
If I were to hazard a guess, check MessagingService for timeouts and dropped messages between nodes. Then check the native transport requests for blocked tasks before the requests even get to the stages.
Is the number of message queues a Windows app can have equal to the number of threads in that app that create windows?
A message queue is created only when a thread creates a window; before that, a thread does not have a message queue. Whether creation of the queue succeeds depends on the amount of resources on the computer: the more RAM and free hard drive space you have, the more threads and message queues can be created. There is no hard-coded limit on this.
In my experience, starting more than a thousand threads will most likely create a problem even on a big box. But this still depends on the types of loaded drivers, the number of network connections, etc.