Performance issue with batch insertion into MarkLogic

I have a requirement to insert 10,000 documents into MarkLogic in less than 10 seconds.
I tested on a single-node MarkLogic server in the following way:
use xdmp:spawn to pass each document-insertion task to the Task Server;
use xdmp:document-insert without specifying a forest explicitly;
the Task Server has 8 threads to process tasks;
CPF is enabled.
The performance is very bad: it took 2 minutes to create the 10,000 documents.
I'm sure performance would be better in a cluster environment, but I'm not sure whether it could finish in less than 10 seconds.
Please advise on how to improve the performance.

I would start by gathering more information. What version of MarkLogic is this? What OS is it running on? What's the CPU? RAM? What's the storage subsystem? How many forests are attached to the database?
Then gather OS-level metrics, to see if one of the subsystems is an obvious bottleneck. For now I won't speculate beyond that.

If you need a fast load, I wouldn't use xdmp:spawn for each individual document, nor CPF. That said, 2 minutes for 10k docs doesn't necessarily sound slow. For comparison, I have reached up to 3k docs/sec, but that was without range indexes, transforms, or anything else, and with a very fast disk (e.g. SSD).
HTH!

Assuming a 2-socket server, 128-256 GB of RAM, and fast I/O (400-800 MB/sec sustained):
An appropriate number of forests (12 primary, or 6 primary / 6 secondary)
More than 8 threads, assuming enough cores
CPF off
Turn on performance history, look at the metrics, and you will see where the bottleneck is.
SSD is not required; just I/O throughput, which multiple spinning disks can provide without issue.
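One concrete way to apply the advice above (no per-document xdmp:spawn) is to batch documents and submit one task per batch, so each spawned transaction inserts many documents. A minimal, language-neutral sketch of just the batching step in Python; the batch size of 500 is an assumption to tune for your hardware:

```python
def chunked(items, size):
    """Yield successive batches of at most `size` items."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

# 10,000 document URIs split into tasks of 500 docs each
# (URIs here are hypothetical placeholders):
docs = [f"/docs/{n}.xml" for n in range(10_000)]
batches = list(chunked(docs, 500))
print(len(batches))     # 20 tasks instead of 10,000
print(len(batches[0]))  # 500 docs per task
```

Each batch would then be handed to one task that performs all 500 inserts in a single transaction, amortizing the per-task and per-transaction overhead.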


What's a sensible basic OLTP configuration for Postgres?

We're just starting to investigate using Postgres as the backend for our system which will be used with an OLTP-type workload: > 95% (possibly >99%) of the transactions will be inserting 1 row into 4 separate tables, or updating 1 row. Our test machine is running 9.5.6 (using out-of-the-box config options) on a modest cloud-hosted Windows VM with a 4-core i7 processor, with a conventional 7200 RPM disk. This is much, much slower than our targeted production hardware, but useful right now for finding bottlenecks in our basic design.
Our initial tests have been pretty discouraging. Although the insert statements themselves run fairly quickly (combined execution time is around 2 ms), the overall transaction time is around 40 ms, due to the commit statement taking 38 ms. Furthermore, during a simple 3-minute load test (5,000 transactions), we're only seeing about 30 transactions per second, with pgbadger reporting 3 minutes spent in "commit" (38 ms avg.), and the next-highest statements being the inserts at 10 s total (2 ms avg.) and 3 s total (0.6 ms avg.) respectively. During this test, the CPU on the Postgres instance is pegged at 100%.
The fact that the time spent in commit is equal to the elapsed time of the test tells me that not only is commit serialized (unsurprising, given the relatively slow disk on this system), but that it is consuming a CPU for that duration, which surprises me. I would have assumed that if we were I/O bound, we would be seeing very low CPU usage, not high usage.
In doing a bit of reading, it would appear that using Asynchronous Commits would solve a lot of these issues, but with the caveat of data loss on crashes/immediate shutdown. Similarly, grouping transactions together into a single begin/commit block, or using multi-row insert syntax improves throughput as well.
All of these options are possible for us to employ, but in a traditional OLTP application, none of them would be (you need fast, atomic, synchronous transactions). 35 transactions per second on a 4-core box would have been unacceptable 20 years ago on other RDBMSs running on much slower hardware than this test machine, which makes me think we're doing something wrong, as I'm sure Postgres is capable of handling much higher workloads.
I've looked around but can't find common-sense config options that would serve as starting points for tuning a Postgres instance. Any suggestions?
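The commit-grouping effect described above is easy to demonstrate even without Postgres; a minimal sketch using Python's stdlib sqlite3 (a different engine, but the one-commit-per-row vs. one-commit-per-batch distinction is the same):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")

# Per-row commits: each commit pays the full durability cost.
for i in range(100):
    conn.execute("INSERT INTO orders (amount) VALUES (?)", (i * 1.5,))
    conn.commit()

# Batched: many inserts inside one transaction, one commit for the lot.
for i in range(100):
    conn.execute("INSERT INTO orders (amount) VALUES (?)", (i * 1.5,))
conn.commit()

# Multi-row insertion (also mentioned above) via executemany:
conn.executemany("INSERT INTO orders (amount) VALUES (?)",
                 [(i * 1.5,) for i in range(100)])
conn.commit()

print(conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0])  # 300
```

On a real disk with synchronous commits, the first loop pays one fsync per row while the other two pay one per batch, which is where the throughput difference comes from.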
If COMMIT is your time hog, that probably means:
Your system honors the FlushFileBuffers system call, which is as it should be.
Your I/O is miserably slow.
You can test this by setting fsync = off in postgresql.conf – but don't ever do this on a production system. If that improves performance a lot, you know that your I/O system is very slow when it actually has to write data to disk.
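For the diagnostic described above, the relevant postgresql.conf lines would be the following (both are standard PostgreSQL settings; the second is the asynchronous-commit alternative mentioned in the question):

```
# Diagnostic only; NEVER on a production system: skips flushing WAL to disk.
fsync = off

# Safer alternative: COMMIT returns before the WAL flush. A crash can lose
# the last few transactions, but cannot corrupt the database.
synchronous_commit = off
```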
There is nothing that PostgreSQL (or any other reliable database) can improve here without sacrificing data durability.
Although it would be interesting to see some good starting configs for OLTP workloads, we've solved our mystery of the unreasonably high CPU during commits. It turns out it wasn't Postgres at all: it was Windows Defender constantly scanning the Postgres data files. The team that set up the VM hosting our test server didn't understand that we needed a backend configuration as opposed to a user configuration.

PostgreSQL Scalability: Towards Millions TPS

I read this article: http://akorotkov.github.io/blog/2016/05/09/scalability-towards-millions-tps/
I tried this benchmark on a stronger machine, but I achieved 20K transactions per second at most. How is it possible to achieve 400K-1.8M transactions per second? Is it fake?
I'm not sure if it's worth commenting 4 years later :)
That's not fake. The use of the term TPS for read-only pgbench transactions might be confusing; it would probably be better to use the term QPS. However, the post explicitly says it's a read-only pgbench benchmark, and even the graph title says it's "pgbench -S".
Nowadays, it's quite easy to reproduce. For instance, you can get a c5d.18xlarge AWS machine and get even higher numbers on modern PostgreSQL.
Nevertheless, I believe it's possible that you get just 40k read-only TPS on some "stronger machine". For instance, the "stronger machine" might turn out to be a NUMA box with a very slow interconnect; such a machine would be bad for PostgreSQL, and it would get stuck there. The first steps toward understanding what's going on would be collecting a perf profile and PostgreSQL wait-event statistics while running pgbench.
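For anyone trying to reproduce this, the read-only benchmark the post refers to is pgbench's built-in SELECT-only script; an invocation sketch (the scale factor, client count, and duration are assumptions to adjust for your machine, and this of course needs a running PostgreSQL server):

```shell
pgbench -i -s 100 bench               # initialize test tables at scale factor 100
pgbench -S -c 64 -j 64 -T 60 bench    # -S = SELECT-only; 64 clients/threads, 60 s
```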

How many concurrent users can Node.js support?

I need to scale my system to handle at least 500k users. I came across Node.js, and it's quite intriguing.
Does anyone have any idea how many concurrent users it can support? Has anyone really tested it?
Do you expect all these users to have persistent TCP connections to your server concurrently?
The bottleneck is probably memory, given V8's heap limit of about 1 GB (1.7 GB on 64-bit).
You can load test with several hundred to a few thousand connections, log heap usage, and extrapolate to find the connection limit of one Node instance.
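The extrapolation step suggested above is straightforward arithmetic; a sketch with invented numbers (the heap figures are assumptions you would replace with your own logged measurements):

```python
# Hypothetical measurements from a load test: heap usage at two points.
baseline_heap_mb = 120       # heap with ~0 connections (assumed)
heap_at_2000_conns_mb = 220  # heap with 2,000 open connections (assumed)

# Marginal memory cost of one connection, in KB.
per_conn_kb = (heap_at_2000_conns_mb - baseline_heap_mb) * 1024 / 2000

# Extrapolate to the V8 heap limit mentioned above (~1.7 GB on 64-bit).
heap_limit_mb = 1700
max_conns = round((heap_limit_mb - baseline_heap_mb) * 1024 / per_conn_kb)
print(per_conn_kb, max_conns)
```

With these made-up numbers, each connection costs about 51 KB of heap, suggesting a ceiling of roughly 31,600 connections per instance, which is why the answer above recommends measuring rather than guessing.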
Good question, but hard to answer. I think the number of concurrent users depends on the amount of processing done per request and on the hardware you are using, e.g. the amount of memory and the processor speed. If you want to use multiple cores, you could use multi-node, which starts multiple Node instances. I've never used it, but it looks promising.
You could do a quick test using ab (ApacheBench), part of Apache.
500k concurrent users is quite a lot, and would make me consider using multiple servers and a load balancer.
Just my 2 cents. Hope this helps.

Cassandra massive write performance problem

I have a server with 4 GB of RAM and 2x 4-core CPUs. When I start performing massive writes in Cassandra, everything works fine initially, but after a couple of hours at 10K inserts per second the database grows to 25+ GB, and performance drops to 500 inserts per second!
I've found that this is because compaction operations are very slow, but I don't understand why. I set 8 concurrent compaction threads, but Cassandra doesn't use 8 threads; only 2 cores are loaded.
I'd appreciate any help.
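For reference, the compaction knobs mentioned in the question live in cassandra.yaml; a sketch of the relevant lines (setting names as in 1.x-era Cassandra; verify against your version's sample config). Note that each individual compaction task is single-threaded, so a few large compactions can leave most cores idle even with a high thread limit:

```yaml
# cassandra.yaml (names per ~1.x; check your version)
concurrent_compactors: 8                  # upper bound on parallel compactions
compaction_throughput_mb_per_sec: 16      # default throttle; 0 disables throttling
```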
We've seen similar problems with Cassandra out-the-box, see:
http://www.acunu.com/blogs/richard-low/cassandra-under-heavy-write-load-part-ii/
One solution to this sort of performance degradation (but by no means the only one) is to consider a different storage engine, like Castle, used in the blog post above. It's open source (GPL v2), has much better performance, and degrades much more gracefully. The code is here (I've just pushed up a branch for Cassandra 0.8 support):
https://bitbucket.org/acunu/fs.hg
And instructions on how to get started are here:
http://support.acunu.com/entries/20216797-castle-build-instructions
(Full disclosure: I work for Acunu, so may be a little biased ;-)

Are Amazon's micro instances (Linux, 64bit) good for MongoDB servers?

Do you think using an EC2 instance (Micro, 64bit) would be good for MongoDB replica sets?
It seems like, if that is all they did, and with 600+ MB of RAM, one could use them for a nice replica set.
Also, would they make good primary (write) servers too?
My database is only 1-2 gigs now but I see it growing to 20-40 gigs this year (hopefully).
Thanks
They COULD be good, depending on your data set, but they likely will not be very good.
For starters, you don't get much RAM with those instances. Consider that you will be running an entire operating system and all related services; 613 MB of RAM can fill up very quickly.
MongoDB tries to keep as much data in RAM as possible, and that won't be possible if your data set is 1-2 GB, and it becomes even more of a problem if your data set grows to 20-40 GB.
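To put the RAM argument above into numbers (the 613 MB figure is from the micro instance spec cited above; the OS/services overhead is an assumption):

```python
instance_ram_mb = 613   # micro instance RAM, per the answer above
os_overhead_mb = 300    # assumption: OS + related services footprint
available_mb = instance_ram_mb - os_overhead_mb

data_set_gb = 20        # low end of the projected 20-40 GB
fraction_in_ram = available_mb / (data_set_gb * 1024)
print(f"{fraction_in_ram:.1%} of the data set fits in RAM")
```

Even at the low end of the projection, only a tiny fraction of the data can be resident, so nearly every read becomes a disk read, which is exactly where the "Low I/O performance" label below starts to hurt.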
Secondly, they are labeled as "Low I/O performance", so when your data set spills to disk (and it will, given its size), reads are going to suffer from the low I/O throughput.
Be aware that micro instances are designed for spiky CPU usage, and you will be throttled to the "low background level" if you exceed the allotment.
The AWS Micro Documentation has good information of what they are intended for.
Between the CPU throttling and the not-very-good I/O performance, my experience using micros for development/testing has not been very good (larger instance types have been fine, though), but a micro may work for your use case.
However, config servers and arbiter nodes are exceptions; I believe a micro should be good enough for those roles.
There is also some mongodb documentation specific to EC2 which might help.
