How many concurrent connections can MarkLogic Server process?

Is there an upper limit on the number of concurrent connections that MarkLogic can process? For example, ASP.NET is restricted to processing 10 requests concurrently, regardless of infrastructure and hardware. Is there a similar restriction in MarkLogic Server? If not, are there any benchmarks that indicate how many connections a typical instance can handle?

Given a large enough budget there is no practical limit on the number of concurrent connections.
The basic limit is the application server thread count, although excess requests will also pile up in the backlog queue. According to groups.xsd, each application server is limited to at most 256 threads. The backlog seems to have no maximum, but most operating systems will silently limit it to something between 256 and 4096. So depending on whether or not you count the backlog, a single app server on a single host could have anywhere from 256 to 4352 concurrent connections.
After that you can use multiple app servers, and add hosts to the cluster. Use a load balancer if necessary. Most operating systems will impose a limit of around 32,000 - 64,000 open sockets per host, but there is no hard limit on the number of hosts or app servers. Eventually request ids might be a problem, but those are 64-bit numbers so there is a lot of headroom.
Of course none of this guarantees that your CPU, memory, disk, and network can keep up with the demand. That is a separate problem, and highly application-specific.
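The arithmetic in the answer above can be sketched as follows. This is only a back-of-the-envelope model, not anything MarkLogic provides: the function names are made up for illustration, and the per-host socket cap uses the low end of the 32,000 - 64,000 range quoted above as an assumed default.

```python
def host_capacity(app_servers, threads=256, backlog=0, socket_limit=32_000):
    """Rough ceiling on connections one host can hold (in flight plus queued).

    threads      -- per-app-server thread limit (256 per groups.xsd, as cited above)
    backlog      -- queued requests the OS will hold per app server (assumed value)
    socket_limit -- assumed OS cap on open sockets per host
    """
    return min(app_servers * (threads + backlog), socket_limit)


def cluster_capacity(hosts, app_servers, threads=256, backlog=0, socket_limit=32_000):
    """Same estimate summed over identical hosts behind a load balancer."""
    return hosts * host_capacity(app_servers, threads, backlog, socket_limit)


print(host_capacity(1))                # 256: threads only
print(host_capacity(1, backlog=4096))  # 4352: threads plus a large backlog
print(cluster_capacity(3, 10, backlog=4096))
```

As the answer says, this only bounds how many connections can be open or queued. It says nothing about whether CPU, memory, disk, and network can keep up.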


Azure Redis Cache GET throughput per client connection

The document mentions the Throughput numbers for GETS. But there are multiple client connections possible and there is also a limit based on the Pricing tier.
Question: Is the given number of "GET Requests per second" per client connection, or is it the total after opening the maximum possible number of connections to the Redis cache and running GET operations from each client?
That's the total GETs/second regardless of the number of connections. I believe we tested with 50 connections. With fewer connections, you may hit bottlenecks in the throughput of client instances or network connections before you hit the limits of the server.
We always recommend benchmarking throughput with your application's actual architecture and workload to find the actual cache capabilities for your use case.

Fewer records than expected are inserted in the database when we increase the thread group count from 100 to 200 in JMeter

Initially I ran a load test with 100 users for 10 minutes, and 1000 records were inserted in the database for the scenarios below.
Employee Creation -- Test script design took 1 minute
Employee Update -- Test script design took 2 minutes
Then I ran the same load test with 200 users for 10 minutes, and 1100 records were inserted without any error logs or deadlocks.
My question is: when we double the thread group count from 100 to 200, the number of inserted records should also double, or approximately double. Why is that not happening? The same is true of the number of requests/samples.
You reached a maximum in your test throughput at about 110 records per minute. In other words, you have a bottleneck on the client or the server that doesn't allow 200 users to process requests concurrently and/or within the same amount of time (either some users wait until they can start processing a request, or each request takes longer, so the total number of requests is lower).
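To make the numbers concrete, here is a small sketch that computes the throughput of both runs and how far the second run falls short of linear scaling. The function names are just illustrative; the inputs are the figures from the question.

```python
def throughput_per_minute(records, minutes):
    """Average records inserted per minute over the whole run."""
    return records / minutes


def scaling_efficiency(base_users, base_rate, new_users, new_rate):
    """Ratio of observed throughput to what linear scaling would predict.

    1.0 means throughput grew in proportion to the user count;
    values well below 1.0 point to a bottleneck somewhere.
    """
    expected_rate = base_rate * new_users / base_users
    return new_rate / expected_rate


rate_100 = throughput_per_minute(1000, 10)  # run with 100 users
rate_200 = throughput_per_minute(1100, 10)  # run with 200 users
print(rate_100, rate_200)
print(scaling_efficiency(100, rate_100, 200, rate_200))
```

An efficiency close to 0.5 after doubling the users means the extra 100 threads added almost nothing, which is what a saturated client or server looks like.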
Some bottlenecks can be resolved by you (if they are related to script, JMeter configuration or JMeter machine), others have to be resolved on server side (by whoever has access to it), and some cannot be resolved at all (they are true bottlenecks of your app).
Without knowing your application, it's hard to suggest anything beyond general "checklist" items:
1) Verify the JMeter script and check whether it has any places where it may wait, take a long time, and so on. For example, if your ramp-up period is too long, the "first" user may finish execution before the "last" user has even started. Scriptable samplers and pre- and post-processors may cause delays as well.
2) Make sure JMeter is configured properly to handle 200 concurrent threads. For example, if the JMeter heap is set too low, JMeter may be very slow because it constantly needs to run GC. See this question for how to look at and configure memory (it discusses the out of memory error, but even without that error, inadequate memory can cause slowness).
3) Make sure the JMeter machine is configured correctly to allow creation of 200+ concurrent HTTP connections. A common issue on both Windows and Linux machines is that people assume they can have 65535 connections (the maximum number of ports), but in reality both Windows and Linux limit the number of ports available by default. Also, after use a port may remain in the TIME_WAIT or CLOSE_WAIT state for several minutes, which makes it unusable. As a result, running out of ports is quite common. Here's how to monitor and resolve this issue on Windows and Linux.
4) Check the JMeter machine's performance as a whole: does it have enough CPU and memory, is it swapping memory, etc.
5) If none of the above is a problem, you need to look at how requests arrive at the server. If the client is capable of sending 200 concurrent requests (which you should have established in the previous steps), but the server receives them at a slower rate, then maybe something in the network slows things down. For example, slow DNS resolution or slow routing between JMeter and the server can cause issues. Item 3 on the client side also applies to the server.
6) If requests do arrive at the server at the same speed as they are sent from the client, then their processing by the server probably slows down as the number of parallel requests goes up. This is developer and DevOps territory, and you probably need to work with them to identify bottlenecks on the server side. It could be the configuration of the web or application server, the application itself, or pretty much anything along the request path.
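For the port-exhaustion item in the checklist above, a rough estimate shows why it bites sooner than people expect. This is a sketch under assumptions: the port range and TIME_WAIT duration below are illustrative values (commonly seen Linux defaults), so check the actual settings on your own machine, and note that connection reuse via keep-alive changes the picture entirely.

```python
def usable_ports(low, high):
    """Number of ephemeral ports in an inclusive range."""
    return high - low + 1


def max_new_connections_per_second(port_count, time_wait_seconds):
    """Sustainable rate of new outbound connections to one destination.

    Each closed connection keeps its port busy for time_wait_seconds,
    so at steady state the rate is bounded by ports / hold time.
    """
    return port_count / time_wait_seconds


# Assumed values for illustration only; verify them on your system.
ports = usable_ports(32768, 60999)
print(ports)
print(max_new_connections_per_second(ports, 60))
```

If every sampler opens a fresh connection, a load generator can hit a ceiling of a few hundred new connections per second long before CPU or memory become a problem.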
Performance testing is 10% execution, and 90% analysis and identification of bottlenecks, so here you go.

How to reduce emit delay with many concurrent connections?

I'm running a 4-core Amazon EC2 instance (m3.xlarge) with 200,000 concurrent connections and no resource problems (each core at 10-20%, memory at 2/14 GB). However, when I emit a message to all users, the user who connected first on a CPU core gets it within milliseconds, but the user who connected last gets it with a delay of 1-3 seconds, and each CPU core goes up to 100% for 1-2 seconds. I noticed this problem even at "only" 50k concurrent users (12.5k per core).
How to reduce the delay?
I tried changing redis-adapter to mongo-adapter with no difference.
I'm using this code to get sticky sessions on multiple CPU cores:
The test was very simple: the clients just connect and do nothing more. The server only listens for a message and emits it to all clients.
EDIT: I tested single-core without any cluster/adapter logic with 50k clients and the same result.
I published the server, single-core-server, benchmark and html-client in one package:
OK, let's break this down a bit. 200,000 users on four cores. If perfectly distributed, that's 50,000 users per core. So, if sending a message to a given user takes 0.1 ms of CPU time, it would take 50,000 * 0.1 ms = 5 seconds to send them all.
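The same estimate as a small sketch, which also runs it backwards to see what per-send CPU cost the observed 1-3 second delay would imply. The function names are made up for illustration, and the model assumes users are spread evenly across cores and that sends on a core happen one after another.

```python
def broadcast_seconds(total_users, cores, cpu_ms_per_send):
    """Time for each core to send one message to all of its users."""
    users_per_core = total_users / cores
    return users_per_core * cpu_ms_per_send / 1000.0


def implied_ms_per_send(observed_delay_seconds, users_per_core):
    """Per-send CPU cost that would explain an observed end-to-end delay."""
    return observed_delay_seconds * 1000.0 / users_per_core


# The estimate from the answer: 200,000 users, 4 cores, 0.1 ms per send.
print(broadcast_seconds(200_000, 4, 0.1))

# Working backwards from the reported 1-3 s delay at 50,000 users per core.
print(implied_ms_per_send(1.0, 50_000), implied_ms_per_send(3.0, 50_000))
```

Under this model the reported delay corresponds to a few hundredths of a millisecond per send, so the delay is roughly what you would expect from serial sends on a saturated core.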
If you see CPU utilization go to 100% during this, then the bottleneck is probably CPU and you may need more cores on the problem. But there may be other bottlenecks too, such as network bandwidth, network adapters or the redis process. So, one thing to determine immediately is whether your end-to-end time is directly proportional to the number of clusters/CPUs you have. If you drop to 2 cores, does the end-to-end time double? If you go to 8, does it drop in half? If yes for both, that's good news, because it means you are probably only running into a CPU bottleneck at the moment, not other bottlenecks. If that's the case, then you need to figure out how to make 200,000 emits across multiple clusters more efficient by examining the code and finding ways to optimize your specific situation.
The best case would be for every CPU to do all its housekeeping to gather exactly what it needs to send to all 50,000 users, and then for each CPU to run a tight loop sending 50,000 network packets one right after the other. I can't really tell from the redis adapter code whether this is what happens or not.
A much worse case would be where some process gets all 200,000 socket IDs and then loops over them, and for each socket ID has to look up in redis which server holds that connection and then send a message to that server telling it to send to that socket. That would be far less efficient than instructing each server to just send a message to all its own connected users.
It would be worth trying to figure out (by studying the code) where on this spectrum the socket.io + redis combination is.
Oh, and if you're using an SSL connection for each socket, you are also devoting some CPU to crypto on every send operation. There are ways to offload the SSL processing from your regular CPU (using additional hardware).

Max connection pool size and autoscaling group

In Sequelize.js you should configure the max connection pool size (default 5). I don't know how to deal with this configuration as I work on an autoscaling platform in AWS.
The Aurora DB cluster on r3.2xlarge allows 2000 max connections per read replica (you can get that by running SELECT @@MAX_CONNECTIONS;).
The problem is that I don't know what the right configuration is for each server hosted on our EC2 instances. What should the max connection pool size be, given that I don't know how many servers will be launched by the autoscaling group? Normally, the DB MAX_CONNECTIONS value should be divided by the number of connection pools (one per server), but I don't know how many servers will be instantiated in the end.
Our concurrent users count is estimated to be between 50000 and 75000 concurrent users at our release date.
Does anyone have previous experience with this kind of situation?
It has been 6 weeks since you asked, but since I got involved in this recently I thought I would share my experience.
The answer varies based on how the application works and performs, and on the characteristics of the application under load for the instance type.
1) You want your pool size to be greater than the expected number of simultaneous queries running on your host.
2) You never want a situation where the number of clients * pool size approaches your max connection limit.
Remember though that simultaneous queries is generally less than simultaneous web requests since most code uses a connection to do a query and then releases it.
So you would need to model your application to understand the actual queries (and how many of them) would happen for your 75K users. This is likely a lot LESS than 75K db queries a second.
You can then construct a script (we used JMeter) and run a test to simulate performance. One of the things we did during our test was to increase the pool size and see the difference in performance. We actually used a large number (100) after doing a baseline and found that the number made a difference. We then lowered it until performance started to change. In our case that was at 15, so I set it to 20.
This was against t2.micro as our app server. If I change the servers to something bigger, this value likely will go up.
Please note that you pay a cost on application startup when you set a higher number, and you also incur some overhead on your server to keep those idle connections open, so making the pool larger than you need isn't good.
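Rule 2 above can be turned into a quick sanity check. This is only a sketch under assumptions: it presumes the autoscaling group has a configured maximum instance count (the 50 below is a made-up figure), and the 20% headroom is an arbitrary safety margin, not a recommendation from Sequelize or Aurora.

```python
def max_pool_size(db_max_connections, max_instances, reserved=0, headroom_percent=20):
    """Largest per-server pool that stays under the DB connection limit.

    db_max_connections -- value of @@MAX_CONNECTIONS on the database
    max_instances      -- assumed upper bound of the autoscaling group
    reserved           -- connections kept for admin tools, migrations, etc.
    headroom_percent   -- share of the remainder deliberately left unused
    """
    usable = (db_max_connections - reserved) * (100 - headroom_percent) // 100
    return usable // max_instances


def total_connections(instances, pool_size):
    """Worst case: every server has its whole pool open at once."""
    return instances * pool_size


ceiling = max_pool_size(2000, 50)    # 2000 is the limit from the question
print(ceiling)                       # upper bound per server, not a target
print(total_connections(50, 20))     # the load-tested pool size of 20 at full scale-out
```

The point is that the ceiling comes from the autoscaling maximum, which is known up front, while the actual setting should come from load testing as described above and simply be checked against that ceiling.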
Hope this helps.

How many concurrent users can Node.js support?

I need to scale my system to handle at least 500k users. I came across Node.js and it's quite intriguing.
Does anyone have any idea how many concurrent users it can support? Has anyone really tested it?
Do you expect all these users to have persistent TCP connections to your server concurrently?
The bottleneck is probably memory, given V8's 1 GB heap limit (1.7 GB on 64-bit).
You can try to load test with several hundred to a few thousand connections, log heap usage, and extrapolate to find the connection limit of one node instance.
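The extrapolation step could look something like this. It is a sketch only: the sample measurements are invented placeholders (replace them with heap figures you logged yourself), the 1 GB limit is the figure quoted above, and a straight-line fit assumes memory per connection stays roughly constant as the count grows.

```python
def fit_line(samples):
    """Least-squares fit of heap_bytes = intercept + slope * connections."""
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimated_connection_limit(samples, heap_limit_bytes):
    """Connection count at which the fitted line reaches the heap limit."""
    slope, intercept = fit_line(samples)
    return (heap_limit_bytes - intercept) / slope


# Hypothetical (connections, heap bytes) pairs from a load test.
measurements = [(500, 60_000_000), (1000, 70_000_000), (2000, 90_000_000)]
bytes_per_connection, baseline = fit_line(measurements)
print(round(bytes_per_connection), round(baseline))
print(round(estimated_connection_limit(measurements, 1_000_000_000)))
```

Treat the result as an optimistic upper bound, since GC pressure usually degrades performance well before the heap is actually full.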
Good question, but hard to answer. I think the number of concurrent users depends on the amount of processing done for each request and on the hardware you are using, e.g. the amount of memory and the processor speed. If you want to use multiple cores, you could use multi-node. Multi-node will start multiple node instances. I have never used it, but it looks promising.
You could do a quick test using ab (ApacheBench), which ships with Apache.
500k concurrent users is quite a lot, and would make me consider using multiple servers and a load-balancer.
Just my 2ct. Hope this helps.
