Netflix Ribbon MaxAutoRetries and MaxAutoRetriesNextServer - microservices

Suppose we have MaxAutoRetries set to 1 and MaxAutoRetriesNextServer set to 2, and there are 3 dependency servers (A, B, C).
My question is: does MaxAutoRetries also apply to each server tried via MaxAutoRetriesNextServer?
What will be the total number of requests sent to the three servers, given that every request times out and the first server called is server A? If you could also explain a bit of the flow and the order of the Ribbon retries, it would be appreciated. Will it be:
Server A: 2
Server B: 2
Server C: 2
Or maybe just
Server A: 2
Server B: 1 (because MaxAutoRetries is ignored when moving on to another server)
Server C: 1
Also, what happens when MaxAutoRetriesNextServer is greater than the actual number of dependency servers?
Will Ribbon retry the same servers multiple times, or will it just stop once every server has been called at least once?
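For what it's worth, here is a small worked count under the assumption, which matches how Ribbon's retry handler is commonly described, that every server tried gets its own MaxAutoRetries retries, i.e. the two settings multiply (treat that as an assumption, not a confirmed answer):

// Worked count (sketch): assumes retries on the same server and retries on
// subsequent servers compose multiplicatively.
var maxAutoRetries = 1;           // retries against the same server
var maxAutoRetriesNextServer = 2; // additional servers to try after the first
var attemptsPerServer = maxAutoRetries + 1;           // 2 (initial attempt + 1 retry)
var serversTried = maxAutoRetriesNextServer + 1;      // 3 (A, then B, then C)
var totalAttempts = attemptsPerServer * serversTried; // 6, i.e. 2 per server

Under that assumption the first breakdown (A: 2, B: 2, C: 2) would be the right one; if MaxAutoRetries is not re-applied on the next servers, the second breakdown applies instead.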

Related

How many parallel tasks can run with FUNCTIONS_WORKER_PROCESS_COUNT set to 10?

Azure Function architecture:
We are setting up a pipeline in Azure Data Factory; it contains 4 triggers that fire Function1 at the same time with 4 different parameters.
Pipeline -> Function1-param1 Function1-param2 Function1-param3 Function1-param4
Yesterday I triggered that pipeline twice within 5 minutes, e.g. at 10:30 and 10:31. That means Function1 was triggered 8 times in 5 minutes.
Pipeline ->
time 1 Function1-param1 Function1-param2 Function1-param3 Function1-param4 10:30
time 2 Function1-param1 Function1-param2 Function1-param3 Function1-param4 10:31
The strange thing is that we expected 8 calls to run in parallel, because FUNCTIONS_WORKER_PROCESS_COUNT is set to 10, but only 6 calls ran in parallel; the other 2 ran after that.
So the question is: what is the relationship between FUNCTIONS_WORKER_PROCESS_COUNT and the number of tasks that can run in parallel?
The function is written in PowerShell 7.
AFAIK, the maximum number of worker processes per Functions host instance is limited by the FUNCTIONS_WORKER_PROCESS_COUNT variable. Host instances are treated as independent VMs, with the FUNCTIONS_WORKER_PROCESS_COUNT limit applied to each one.
If FUNCTIONS_WORKER_PROCESS_COUNT is set to 10, each host instance can run 10 function invocations at the same time.
Multiple workers means multiple process IDs of the same Function App, which is a logical collection of functions.
One worker process can host all the functions of one Function App; FUNCTIONS_WORKER_PROCESS_COUNT defaults to 1 per host, and the function host is the physical/virtual host on which the Function App runs as a Windows/Linux process.
Refer here for more information on the mechanism of FUNCTIONS_WORKER_PROCESS_COUNT.
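As a rough, back-of-the-envelope sketch of the relationship described above (the numbers come from the question; per-worker concurrency settings are not accounted for, so treat this as an upper-bound illustration only):

// Sketch: parallel capacity if each worker process handles one invocation at a time.
var functionsWorkerProcessCount = 10; // FUNCTIONS_WORKER_PROCESS_COUNT per host instance
var hostInstances = 1;                // scaling out adds more instances
var maxParallelInvocations = functionsWorkerProcessCount * hostInstances; // 10
// The 8 near-simultaneous triggers from the question fit within that budget on a
// single instance, so observing only 6 in parallel suggests some other limit is involved.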
Scaling out also allows you to adapt to changes in demand more quickly: instances can usually be added or removed swiftly to meet resource requirements. This flexibility saves spending by only using (and paying for) the resources required at the moment.
Refer to this article for more information on the benefits of scale up and scale out.
References for more information on FUNCTIONS_WORKER_PROCESS_COUNT:
Azure Functions - Functions App Settings - functions_worker_process_count
Azure Functions - Best Practices - FUNCTIONS_WORKER_PROCESS_COUNT

GOMAXPROCS for Go service in Kubernetes

I'm trying to stress test our Go service in Kubernetes.
The service is just an HTTP server that accepts requests, sends requests to another service, performs some string manipulation, and returns a response to the original request.
We started with
cpu.requests = 1
cpu.limit = 2
Note: host VM has 6 CPUs
With the following test scenario:
Repeat 20 times:
1. Send 40 parallel requests
2. Sleep for 200ms
What we observed is that GOMAXPROCS is by default set to 6 (following the host specs),
and we get network I/O timeouts after some iterations of the test.
In addition, CPU consumption falls to 0 after some time (any idea what might be happening here? Does the Go runtime scheduler get stuck?).
The issue is resolved by setting GOMAXPROCS explicitly to 1.
Some basic googling led me to articles like https://github.com/uber-go/automaxprocs/issues/12,
but not many other articles or docs warn about this GOMAXPROCS behavior on Kubernetes.
Help appreciated:
Are there other articles that elaborate on how a misconfigured GOMAXPROCS affects a Go service in Kubernetes?
What should we do if cpu.requests is set to 500m CPU? Is GOMAXPROCS=1 still adequate, or does it simply mean cpu.requests must be at least 1?
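Not an authoritative answer, but this is roughly the arithmetic that quota-aware tooling such as the automaxprocs project linked above applies; the exact rounding rule here is an assumption, sketched with the numbers from this question:

// Sketch: derive GOMAXPROCS from the container CPU limit instead of the host CPU count.
var hostCPUs = 6;     // what the Go runtime sees by default, hence GOMAXPROCS = 6
var cpuLimit = 2;     // cpu.limit for the container
var cpuRequest = 0.5; // the 500m CPU case from the question

function gomaxprocsFor(cpus) {
    return Math.max(1, Math.floor(cpus)); // floor the quota, but never go below 1
}

gomaxprocsFor(cpuLimit);   // 2 -> sized to the limit rather than the 6 host CPUs
gomaxprocsFor(cpuRequest); // 1 -> so for 500m CPU, GOMAXPROCS=1 is still the sensible floor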

JMeter - Wrong number of users in results from remote load testing

I was using non-GUI mode to perform remote load testing with JMeter from a master server (Linux) to 5 slave servers (Linux). 5 x "n" users were run, "n" users on each server.
The results have been written to the master server.
There are samples from all servers in the results file, but they reflect the number of active users on each individual server ("n") and not across all servers (5 x "n").
There is no information in the results file about the real number of active users across all the servers.
As a result, the maximum number of active users on the generated graphs is "n", which does not reflect the real load (5 x "n" users).
Has anyone got a similar problem?
Is there anything I can do to correct the results already gathered?
Should I change any JMeter parameter to get the correct results in the next run?
Short Answer:
This is normal and no, there's nothing in JMeter you can do to fix it.
Long Answer:
Each Load Generator creates a number of threads, n; the threads will be numbered 1 to n. When the Controller collects all of the information, it sees 5 results for Thread 1, 5 results for Thread 2, ... Thread n. The Controller has no way of knowing that these are 5 separate concurrent threads and not just the same thread run 5 sequential times.
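A small illustration of why the merged file is ambiguous (a sketch; the thread label format follows the usual JMeter-style "Thread Group 1-i" naming):

// Every generator labels its threads identically, so the merged results
// cannot be told apart by thread name.
var generators = 5;          // slave servers
var threadsPerGenerator = 3; // "n", kept small for the example
var labels = [];
for (var g = 0; g < generators; g++) {
    for (var i = 1; i <= threadsPerGenerator; i++) {
        labels.push('Thread Group 1-' + i); // same label produced on every slave
    }
}
// 'Thread Group 1-1' now appears 5 times, 'Thread Group 1-2' 5 times, and so on,
// so any per-thread count derived from the file tops out at n, not 5 x n.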
Fixing it:
It depends on what you mean by "a maximum number of active users is n on the generated graphs". If this is something inside JMeter, then no, you can't fix it.
If it's a report-generator that you have created yourself, then yes, you can fix it by passing in the number of load generators.
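If you do post-process the merged results yourself, the correction is a single multiplication; a minimal sketch (variable names are made up for illustration):

// Recover the real concurrency from the per-generator numbers.
var usersPerGenerator = 100; // the "n" each slave reports
var loadGenerators = 5;      // number of slave servers
var realActiveUsers = usersPerGenerator * loadGenerators; // 500, the actual load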

Multiple backend non-blocking calls from NodeJS give a slow response

I have a specific use case that I'm trying to solve using Node. The response time that I get from NodeJS is not what I expect.
The application is an express.js web application. The flow is as below
a. Request reaches the server.
b. Based on the parameters, a backend REST service is invoked.
c. The response of the REST service has links to multiple other objects.
d. Navigate each of the links and aggregate the data.
e. This data is formatted (not much) and sent to the client.
The actual test data:
The response from step (c) has 100 links, and hence I make 100 parallel calls (I'm using async.map). Each of the backend services responds in less than 30 ms, but the overall response time for the 100 requests is 4 seconds. This is considerably high.
What I have observed is:
The time difference between the first backend request and the last backend request is around 3 seconds. I believe this is due to the fact that Node is single threaded and it takes 3 seconds to place all 100 HTTP requests.
The code that I use to make parallel calls is given below
var async = require('async'); // async.map drives the parallel calls

var getIndividualRecord = function (entity, callback1) {
    // fetch one linked object through the backend REST service
    httpExecutor.executeRequest(entity.link.url, callback1);
};

var aggregateData = function (err, results) {
    // pass any error along instead of swallowing it, then hand back the aggregated responses
    callback(err, results);
};

async.map(childObjects, getIndividualRecord, aggregateData);
childObjects is an array with 100 records. httpExecutor makes a REST invocation using the request module.
Is there something wrong with what I'm doing, or is this the wrong use case for Node?
Your assumption is correct: Node is single threaded, so while your HTTP requests happen in a non-blocking manner (each request is made right after the other, without waiting for the server's response), they don't truly happen simultaneously.
So, yes, it probably takes Node 3 seconds to get through all these requests and process the responses.
There are a few ways "around" this, which might work depending on your situation:
Could you use Node's cluster module to spawn multiple Node apps and have each do a portion of the work (see the sketch after this list)? Then you would be doing things simultaneously (since you have N Node processes going on).
Use a background queue mechanism (e.g. Resque, Beanstalk) and have a background worker (or a process spawned with cluster) distribute the work to Node worker processes waiting to pick things off the queue.
Refactor your web app a little to deal with the fact that parts will take a while. Perhaps render most of the page, then on load make an AJAX request that fires off the 3-second route and puts the results into a DOM element when the AJAX request comes back.
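To make the first suggestion concrete, here is a minimal cluster sketch (the route and port are placeholders; httpExecutor, childObjects, getIndividualRecord and aggregateData are assumed from the question's code):

var cluster = require('cluster');
var os = require('os');

if (cluster.isMaster) {
    // fork one worker per CPU so the 100 outbound calls are spread across
    // several Node processes instead of a single event loop
    os.cpus().forEach(function () {
        cluster.fork();
    });
} else {
    var express = require('express');
    var app = express();

    app.get('/aggregate', function (req, res) {
        // the same async.map(childObjects, getIndividualRecord, aggregateData)
        // flow from the question would run here, inside this worker process
        res.send('aggregated data');
    });

    // the cluster module shares the listening socket between the workers
    app.listen(3000);
}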
I have a similar scenario and a similar observation.
In my case I run the Node app using pm2. The app contains 2 sub-servers (let's call them A and B), and pm2 spawns 2 processes per server. From a client I call server A; it computes something simple and calls server B asynchronously. When server B responds, server A sends the data back to the client.
A very simple scenario, but when I used JMeter to create 1000 threads (where each thread makes 50 calls) to call server A, I got an average response time of around 4 seconds (for 50,000 calls).
Server B responds after 50 ms, and I think this is the problem: during the first 50 ms Node.js processes lots of incoming requests, and then it cannot quickly process both the responses from server B and the incoming calls.
I would expect the application code to be executed in a single thread, but with background threads to deal with all the rest. It seems this is not the case.

Redis mget vs get

Setup:
We have a Redis setup with a master and 4 slaves running on the same machine. The reasons for using multiple instances were:
To avoid hot keys
Memory was not a constraint, as the number of keys was small, ~10k (we have an extra-large EC2 machine)
Requests:
We make approximately 60 GET requests to Redis per client request. We consolidate the 60 GETs into 4 MGETs. We make a single connection for all the requests (to one of the slaves, picked randomly).
Questions
Does it make sense to run multiple instances of Redis with the data replicated to the slaves?
Does making MGETs instead of GETs help in our case, where all the instances are on the same machine?
Running multiple Redis instances on the same machine can be useful. Redis is single threaded, so if your machine has multiple cores, you can get more CPU power by using multiple instances. Craigslist runs in this configuration, as documented here: http://blog.zawodny.com/2011/02/26/redis-sharding-at-craigslist/.
MGET versus GET should help, since you are only making 4 round trips to the Redis server as opposed to 60, increasing throughput; running multiple instances on the same machine shouldn't change that.
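To illustrate the round-trip difference, a minimal sketch with the Node redis client (purely illustrative; the question doesn't say which client is used, and the key names are made up):

var redis = require('redis');
var client = redis.createClient(6379, '127.0.0.1'); // one of the slave instances

// 60 individual GETs would mean 60 network round trips per client request:
// client.get('user:1:name', callback); ... repeated 60 times

// 4 MGETs of ~15 keys each need only 4 round trips for the same data:
var batch = ['user:1:name', 'user:1:email', 'user:1:plan']; // ...up to ~15 keys per batch
client.mget(batch, function (err, values) {
    // values comes back as an array in the same order as the requested keys
});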
