I'm trying to test a web application using JMeter. In a typical web application system there are packet drops, especially when the queues are full [1]. In the case of such a packet-drop event, does JMeter retransmit the request, or is it considered a failed request (marked as an error)?
[1] Wang, Q., Lai, C. A., Kanemasa, Y., Zhang, S., & Pu, C. (2017). A Study of Long-Tail Latency in n-Tier Systems: RPC vs. Asynchronous Invocations. Proceedings - International Conference on Distributed Computing Systems, (1), 207–217. https://doi.org/10.1109/ICDCS.2017.32
By default it does not retry, so that such issues, which can be due to configuration problems, are surfaced.
This behaviour is adjustable using two properties:
httpclient4.retrycount
Number of retries to attempt. Retry will be done on Idempotent Http Methods by default. If you want to retry for all methods, see property httpclient4.request_sent_retry_enabled
Defaults to: 0
httpclient4.request_sent_retry_enabled
Set this property to true if it's OK to retry requests that have already been sent. This means that both idempotent and non-idempotent requests will be retried. This should usually be false, but it can be useful when testing against some load balancers like Amazon ELB.
Defaults to: false
See:
https://jmeter.apache.org/usermanual/properties_reference.html#httpclient4
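For example, to retry each sampler once and also retry non-idempotent requests that were already sent, something like the following could go into user.properties (illustrative values only, not a recommendation):

    # Retry a failed request once instead of marking it as an error immediately (default: 0)
    httpclient4.retrycount=1
    # Also retry non-idempotent requests that have already been sent (default: false)
    httpclient4.request_sent_retry_enabled=true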
I am trying to build a ZeroMQ pattern where,
There can be many clients connecting to a single server endpoint
Server will distribute incoming client tasks to available workers (will be mapped to the number of cores on the server)
These tasks are long running (in hours) and need to perform a lot of local I/O
During each task execution (iteration) there will be data/messages (potentially in order of [GB]s) sent back and forth between the client and the server worker
Client and server workers need to know if there are failures/errors on the peer side, so that they can recover (retry) or shutdown gracefully and try later
Based on the above, I presume that the ROUTER/DEALER pattern would be useful. PUB/SUB is discarded as I need to know if the peer fails.
I tried various combinations of the ROUTER/DEALER pattern, but I am unable to ensure that multiple messages from a client reach the same worker within an iteration. I understand that I need to implement a broker/forwarder/device that routes the incoming messages to the right recipient/handler/worker, but I am unable to map the frontend and backend sockets in the broker. I am looking at the Majordomo pattern, but I guess there has to be a simpler broker model that could just route the messages to the assigned worker (without really getting into services).
I am looking for some examples, if there are any, or guidance on what I may be missing. I am trying to build this in Golang.
Q : "What would be the right ZMQ Pattern?"
Based on the complex composition of all the requirements posted under items 1 - 5, I dare say the right choice would be NOT to use any single one of the standard, built-in, trivial ZeroMQ primitive communication-archetype patterns, but rather to create a multi-layered, application-specific composition of an (M + N + 1, hot-standby, robust-enough? self-resilient?) signalling-messaging infrastructure that covers all your current (and possibly any future) application-level requirements, like the one depicted here for a far simpler distributed-computing use case, where just a trivial remote SigKILL was implemented.
Yes, the best approach is to create (and maintain) your own formalised signalling that the application level can handle and interact across -- such as heart-beating for detecting dead worker(s), plus a way to re-instate failed jobs right upon detected failures (most probably re-located and/or re-scheduled, with resources not statically pre-mapped but chosen wherever physically most feasible at the moment of re-instating -- so additional telemetry signalling will help you decide how to re-instate such failed micro-jobs).
ZeroMQ is a fabulous framework right for such complex signalling and messaging hierarchies, so your System Architect's imagination is the only ceiling in this concept.
ZeroMQ will take the rest and do all the hard work nice and easily.
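If it helps, here is a minimal, stripped-down Go sketch of the kind of broker the question asks about: a ROUTER frontend for clients and a ROUTER backend for workers, with a plain map pinning each client to one worker for the duration of an iteration. It uses the pebbe/zmq4 binding; the endpoints, the "READY" handshake and the assumed frame layout are all invented for illustration, and the heart-beating / job re-instating discussed above is omitted.

    package main

    import (
        zmq "github.com/pebbe/zmq4"
    )

    func main() {
        // error handling omitted for brevity
        frontend, _ := zmq.NewSocket(zmq.ROUTER) // clients connect here
        defer frontend.Close()
        frontend.Bind("tcp://*:5555")

        backend, _ := zmq.NewSocket(zmq.ROUTER) // workers connect here
        defer backend.Close()
        backend.Bind("tcp://*:5556")

        var idleWorkers []string        // workers that announced themselves with "READY"
        assigned := map[string]string{} // client identity -> worker identity, sticky per iteration

        poller := zmq.NewPoller()
        poller.Add(frontend, zmq.POLLIN)
        poller.Add(backend, zmq.POLLIN)

        for {
            polled, _ := poller.Poll(-1)
            for _, p := range polled {
                switch p.Socket {
                case backend:
                    // assumed frames: [workerID, "", body...] -- either "READY" or a reply for a client
                    msg, _ := backend.RecvMessage(0)
                    workerID := msg[0]
                    if msg[2] == "READY" {
                        idleWorkers = append(idleWorkers, workerID)
                        continue
                    }
                    // the first body frame names the client the reply belongs to
                    frontend.SendMessage(msg[2], "", msg[3:])
                case frontend:
                    // assumed frames: [clientID, "", body...]
                    msg, _ := frontend.RecvMessage(0)
                    clientID := msg[0]
                    workerID, ok := assigned[clientID]
                    if !ok {
                        if len(idleWorkers) == 0 {
                            continue // no worker free -- a real broker would queue or reject here
                        }
                        workerID, idleWorkers = idleWorkers[0], idleWorkers[1:]
                        assigned[clientID] = workerID
                    }
                    // forward to the pinned worker, prefixing the client identity
                    backend.SendMessage(workerID, "", clientID, msg[2:])
                }
            }
        }
    }

The only point of the sketch is the assigned map: once a client is mapped to a worker, every further message in that iteration is routed to the same worker, which is exactly the guarantee the plain built-in patterns do not give you on their own.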
I would like to understand how to detect the failed service (in a fast and reliable manner), i.e. the service that is the root cause of all the 5xx responses.
Let me try to elaborate. Let's assume we have 300+ microservices and they have only synchronous HTTP interaction via GET requests without any data modification (assumed for simplicity). Each customer request may translate into calls to ~10 different microservices; moreover, it can be a 'calling chain' of requests, i.e. the API Gateway calls 3 different microservices, each of them calls 1-5 more, each of these 1-5 calls 1-5 more, etc.
We closely monitor 5xx errors on each microservice and react to them.
Now one of the microservices fails. It happens to be somewhere near the end of a 'calling chain', which means that the other microservices that depend on it start to return 5xx as well.
Yes, there are circuit breakers; yes, they become triggered/opened and, instead of calling the downstream service, they immediately return an error as well (in most cases we cannot return a good fallback such as an empty response).
So we see a relatively large number of microservices returning 5xx: say 30-40 microservices return 5xx, and we see 30-40 triggered/opened circuit breakers.
How do we detect the failed microservice, the root of all evil, quickly?
Did anybody encounter this issue?
Regards
You will need to implement a distributed tracing solution that tracks the originating transaction with a global ID. This global identifier is typically called a Correlation ID; it is generated by the very first service that creates the request and propagated to all the other microservices that work together to fulfill the request.
Take a look at OpenTracing for your implementation needs. It provides libraries for you to add the instrumentation required for identifying faulty microservices in a distributed environment.
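Instrumentation libraries aside, the propagation mechanics themselves are simple. Here is a rough Go sketch of the Correlation ID idea only (the X-Correlation-ID header name, the port and the handlers are assumptions for illustration, not the OpenTracing API):

    package main

    import (
        "crypto/rand"
        "encoding/hex"
        "net/http"
    )

    const correlationHeader = "X-Correlation-ID" // assumed header name, not a formal standard

    // newID mints a random ID at the edge service, the very first one in the chain.
    func newID() string {
        b := make([]byte, 16)
        _, _ = rand.Read(b)
        return hex.EncodeToString(b)
    }

    // withCorrelationID reuses the caller's ID or creates one, so every log line
    // and every downstream call can carry the same identifier.
    func withCorrelationID(next http.Handler) http.Handler {
        return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
            id := r.Header.Get(correlationHeader)
            if id == "" {
                id = newID()
            }
            w.Header().Set(correlationHeader, id)
            next.ServeHTTP(w, r)
        })
    }

    // callDownstream shows the other half: copy the ID onto every outgoing request.
    func callDownstream(id, url string) (*http.Response, error) {
        req, err := http.NewRequest(http.MethodGet, url, nil)
        if err != nil {
            return nil, err
        }
        req.Header.Set(correlationHeader, id)
        return http.DefaultClient.Do(req)
    }

    func main() {
        handler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
            // log with the correlation ID here, then fan out to downstream services
            _, _ = w.Write([]byte(w.Header().Get(correlationHeader)))
        })
        _ = http.ListenAndServe(":8080", withCorrelationID(handler))
    }

With the same ID present in every service's error logs, finding the first service in the chain that logged a 5xx for a given request becomes a query rather than a guessing game.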
However, if you really do have 300 microservices all using synchronous calls...maybe it is time to consider using asynchronous communications to eliminate the temporal coupling inherent in synchronous communications.
I am writing an SNMP agent and plan to process SNMP requests one by one. That means that when a request arrives at port 161, the agent will not accept any further requests until the response is sent or the request times out.
I am not sure about the many SNMP clients out there, but are SNMP requests synchronous and sequential, or is there any way they can arrive in bulk at the same time?
I think SNMP queries can easily come in bursts due to multiple independent managers polling your agent and/or a single anxious manager retrying the same command if your agent is not quick enough to respond.
When it comes to writing SNMP agents, the other consideration is to estimate the maximum possible time for the agent to gather the data required to respond. I believe it should not be the OID-average but the OID-maximum. In other words, should your agent serve 100 OIDs, out of which querying one "slow" OID would lead the entire (synchronous) agent to block and stop serving the others, this situation might undermine the credibility of your agent on the network...
On top of that, if you happen to hit the same slow OID multiple times in a row (e.g. manager retries), the delay may accumulate, effectively blocking out other queries.
To summarize: I think a high-performance SNMP agent should have the following traits:
Support massively concurrent SNMP commands processing
Have non-blocking data source access for gathering managed objects data
Have some form of caching or rate limiting to protect computationally expensive data sources from cocky SNMP managers
On the other hand, if your SNMP agent is serving a small piece of static data on a low-power hardware and you do not expect too many managers ever talking to you, perhaps you could get away with a simplistic synchronous SNMP agent...
BTW, BSD sockets interface would hold a queue of unprocessed UDP packets so your agent would have a chance to catch up.
The premise of your question is flawed, as there is no concept of "coming in bulk at a single time" — no matter in which order the UDP datagrams making up an SNMP packet are received, and no matter how long a duration lies between the receipt of each packet by your network interface, your operating system will present the SNMP packets to you in receipt order, in sequence. You have one listen port, and one read buffer. So this synchronicity is already how network data processing works and you shouldn't worry about it.
I would say though, that if you are waiting for some resource to become available while processing an SNMP request (as suggested by your use of the word "timeout"), you probably ought to get on and start processing your other pending SNMP requests in the meantime, or you risk your whole stack grinding to a halt. It's not fair to make a manager wait some unknown duration for a response to request B just because some other manager made a request A that is experiencing a delay in being serviced. That being said, you probably do want some upper limit on how many requests can be serviced at any one time, to prevent potential DDoSing — choosing this value can only be done by you, with your knowledge of the use case and the ecosystem.
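To make that "upper limit on concurrent servicing" concrete, here is a hedged Go sketch of the receive loop: one loop reads datagrams off port 161 and hands each to a worker goroutine, with a buffered channel acting as the cap. The handleRequest body and the limit of 64 are placeholders, not part of any SNMP library.

    package main

    import (
        "log"
        "net"
    )

    // handleRequest is a placeholder for real SNMP decoding and OID lookup.
    func handleRequest(conn net.PacketConn, addr net.Addr, pkt []byte) {
        // ... decode the PDU, gather the managed-object data, build the response ...
        conn.WriteTo(pkt, addr) // echo back as a stand-in for the real response
    }

    func main() {
        conn, err := net.ListenPacket("udp", ":161")
        if err != nil {
            log.Fatal(err)
        }
        defer conn.Close()

        const maxInFlight = 64 // assumed cap on concurrently serviced requests
        slots := make(chan struct{}, maxInFlight)

        buf := make([]byte, 65535)
        for {
            n, addr, err := conn.ReadFrom(buf)
            if err != nil {
                log.Println("read:", err)
                continue
            }
            pkt := make([]byte, n)
            copy(pkt, buf[:n]) // copy, because buf is reused by the next ReadFrom

            slots <- struct{}{} // blocks once maxInFlight requests are already in flight
            go func() {
                defer func() { <-slots }()
                handleRequest(conn, addr, pkt)
            }()
        }
    }

With this shape, one slow OID delays only its own goroutine; the other pending requests keep being serviced, and the channel keeps a burst from turning into an unbounded pile of goroutines.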
Get requests ask for one OID per request, while GetBulk requests can ask for several OIDs in one request. Also, an SNMP client can use async mode, sending multiple requests at minimal intervals and then waiting for the replies.
Packets can also arrive out of order due to network delays and equal-cost routes. You can experiment by sending requests with snmpget, snmpbulkget and snmpbulkwalk, and use tcpdump to see what is on the wire.
So, in general, your agent has to be ready to accept bursts of requests.
For simplicity, if the request rate is low and your agent can reply fast enough, you can use one-by-one processing. Some requests may fail in this case, but clients can retry and eventually get a reply from the agent.
I am developing an SMPP platform that has to be capable of delivering a specific number of SMS messages per second.
This has been easily implemented using AMQP with Spring Integration.
But:
I need to run the project as an active-active service on 2 nodes, and each node has a connection to 2 SMSCs.
For this configuration I have an allowed traffic of 100 msg/s, and ideally I need to spread my traffic across all the available connections.
A simple poller can easily be configured for 25 msg/s per connection (4 * 25 = 100), but if one of my connections goes down, I want to redistribute the lost capacity to the other nodes/connections live.
For this I would like to create a dynamic poller that gets information about connection status from Redis and adapts the number of messages allowed per poll at runtime (0 for the broken connection and 33% for each of the 3 others, for example, or 50% each if only 2 of the 4 connections are available), as sketched below.
Is it possible to implement this behavior with a custom PollerMetadata or should I look for some other solution?
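For concreteness, the redistribution arithmetic described above boils down to something like this (a plain Go sketch, independent of Spring Integration and Redis; the function name and the boolean status slice are made up for the example):

    package main

    import "fmt"

    // perConnectionRate splits a total allowance of msg/s evenly across the
    // connections currently reported as up, giving 0 to the broken ones.
    func perConnectionRate(total int, up []bool) []int {
        live := 0
        for _, ok := range up {
            if ok {
                live++
            }
        }
        rates := make([]int, len(up))
        if live == 0 {
            return rates
        }
        for i, ok := range up {
            if ok {
                rates[i] = total / live // e.g. 100/4=25, 100/3=33, 100/2=50
            }
        }
        return rates
    }

    func main() {
        fmt.Println(perConnectionRate(100, []bool{true, true, true, true}))   // [25 25 25 25]
        fmt.Println(perConnectionRate(100, []bool{false, true, true, true}))  // [0 33 33 33]
        fmt.Println(perConnectionRate(100, []bool{false, true, false, true})) // [0 50 0 50]
    }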
Polling is quite heavy and may be considered "old-fashioned" these days.
I highly recommend trying SSE (server-sent events) or WebSocket.
Many technologies support both of the above solutions (Spring included).
You can find more detail in this article:
https://codeburst.io/polling-vs-sse-vs-websocket-how-to-choose-the-right-one-1859e4e13bd9
When building a pub-sub service using ZeroMQ on a Linux system, is there any way to enforce concurrent subscriber limits?
For example, I might want to create a ZeroMQ publisher service on a resource-limited system, and want to prevent overloading the system by setting a limit of, say, 100 concurrent connections to the tcp publisher endpoint. After that limit is reached, all subsequent connection attempts from ZeroMQ subscribers would fail.
I understand ZeroMQ doesn't provide notifications about connect/disconnect, but I've been looking for socket options that might allow such limits -- so far, no luck.
Or is this something that should be handled at some other level, perhaps within the protocol?
Yes, ZeroMQ is a Can-Do messaging framework:
Besides the trivial formal communication pattern framework elements (the library primitives), the strongest power behind ZeroMQ is the ability to develop one's own messaging system(s).
In your case, it is enough to enrich the scene with a few additional things: a SUB-process -> PUB-process message-flow channel, so that the PUB-side process can count the number of SUB-process instances concurrently connected and force a disconnect (a step delegated rather "back" to the SUB-process side as a suicide move, since the classical PUB-process intentionally has no instrumentation for managing subscriptions) once the limit is reached.
Plus, add some dynamics for the inter-node signalling to trigger re-counting, and/or equip the SUB-process side(s) with a self-advertising mechanism that pushes keep-alive signals to the PUB side. Expect this signalling to be a weak, informative-only indication: there are many real-world collisions where a decentralised node simply fails to deliver a "guaranteed-delivery" message, and a well-designed, distributed, low-latency, high-performance system has to cope with this reality and have self-healing, state-recovery policies designed and built into its own behaviour.
( Fig. courtesy imatix/ZeroMQ )
The ZeroMQ library can be thought of as a very powerful LEGO toolbox for designing cool distributed systems, rather than a ready-made / batteries-included, stiff, quasi-solution for just a few academic cases (well, it might be considered such, but only for a no-brainer's life, while our lives are much more colourful & teasing, aren't they?)
So, "How to?"
It is worth, definitely worth, a few days to read both of Pieter Hintjens' books, and a few weeks to shift one's mind towards designing with the full power of ZeroMQ on one's side,
with just a few Python add-on habits (an early zmq.Context() setup, and not forgetting a finally: aContext.term()).
There's no way that I'm aware of to configure ZMQ to limit connections automatically... however, you have other options to accomplish what you're looking for. Perhaps the "traditional" way to accomplish this is with a second set of "network communication" sockets... perhaps REQ/REP from subscriber to publisher, asking for permission to connect.
You also have the option, depending on your version of ZMQ (and I've never used it and I can't find it in 5 minutes of searching, so I don't know how recent your version must be) to use XPUB/XSUB sockets, which can accomplish bi-directional communication. You can connect with XSUB, send a subscribe request, then receive a positive or negative response (you might have to play with your subscriber topics to communicate directly with just the single subscriber, I'm not sure), and react accordingly.
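To illustrate the XPUB half of that: an XPUB socket delivers each (un)subscription to the publisher as a message whose first byte is 1 for subscribe and 0 for unsubscribe, so the publisher can at least keep a live count of subscriptions (not strictly of connections). A rough Go sketch with the pebbe/zmq4 binding; the data-publishing side and any rejection logic are left out:

    package main

    import (
        "fmt"

        zmq "github.com/pebbe/zmq4"
    )

    func main() {
        pub, _ := zmq.NewSocket(zmq.XPUB)
        defer pub.Close()
        pub.Bind("tcp://*:5557")

        active := 0
        for {
            // Each incoming frame is a subscription event: 0x01 + topic on subscribe,
            // 0x00 + topic on unsubscribe (also generated when a subscriber disconnects).
            frame, err := pub.RecvBytes(0)
            if err != nil || len(frame) == 0 {
                continue
            }
            if frame[0] == 1 {
                active++
            } else {
                active--
            }
            fmt.Printf("topic %q, active subscriptions: %d\n", frame[1:], active)
        }
    }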
Either way, you'll be allowing a connection of some sort between the two systems and then either allowing it or terminating it depending on the situation. This could be less than completely ideal since you'll have to carve out a little overhead to handle connections that you'll be refusing... let's say you're saturated at 100 clients and all of a sudden get 100 new subscribe requests... you may or may not be able to cope with that sort of burst traffic.
You can test out the overhead of alternative communication mediums... for example, you could publish a web service that indicates subscriber status, which a client could check first, but having clients connect that way may not be any better.
If you're absolutely at the limit of your resources, you'll have to set up a second server to handle subscriber status:
Server 1 is your publisher. You could set it up with a PUB socket and a REP socket.
Server 2 is your status server. It has a REQ socket. Have it subscribe to something like "system-status" or some such thing as that. It will also have your mechanism for communicating with new subscribers, be that a ZMQ socket or a web service or whatever else.
A client will request status from your status server. The status server will send a request to your publisher, which will increment its subscriber count and reply with success, or keep its subscriber count and reply with failure. This success or failure will be communicated back to the subscriber, which will use that information to connect or not.
Disconnections will have to be communicated in a similar way... and you'll have to use some sort of heartbeating round-robin to confirm clients weren't a victim of catastrophic failure.
This will allow your publisher to make intelligent choices about whether it has resources or not. If you just want to set a static number, you don't even need the connection between the status server and the publisher, you can just keep count on the status server... but just to ensure the overall health of the network then it's probably best not to go that simplistic route.
Anyway, those are just some ideas to accomplish what you're looking for. ZMQ gives you options with which to craft your solutions, more so than actual solutions.
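As a rough Go sketch of the admission-check idea described above, publisher side only, using the pebbe/zmq4 binding: the ports, the 100-subscriber cap and the JOIN/LEAVE/OK/FULL wording are all invented for the example, and the actual publishing loop plus the heartbeating for crashed clients are omitted.

    package main

    import (
        zmq "github.com/pebbe/zmq4"
    )

    const maxSubscribers = 100 // assumed limit

    func main() {
        pub, _ := zmq.NewSocket(zmq.PUB) // the actual data feed
        defer pub.Close()
        pub.Bind("tcp://*:5556")

        gate, _ := zmq.NewSocket(zmq.REP) // admission / status channel
        defer gate.Close()
        gate.Bind("tcp://*:5557")

        subscribers := 0
        for {
            // protocol invented for the sketch: "JOIN" asks for a slot, "LEAVE" gives one back
            msg, err := gate.Recv(0)
            if err != nil {
                continue
            }
            switch msg {
            case "JOIN":
                if subscribers < maxSubscribers {
                    subscribers++
                    gate.Send("OK", 0) // client may now connect its SUB socket to :5556
                } else {
                    gate.Send("FULL", 0) // client should back off and retry later
                }
            case "LEAVE":
                if subscribers > 0 {
                    subscribers--
                }
                gate.Send("OK", 0)
            default:
                gate.Send("ERR", 0)
            }
        }
    }

A well-behaved subscriber would ask the gate first and only connect its SUB socket after an "OK"; nothing here physically stops a rogue subscriber from connecting anyway, which is exactly why the answer above treats this as a cooperative protocol rather than a hard limit.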