MongoDB sharded cluster 25x slower than standalone node - performance

I'm confused by this situation and have been trying to fix it for a couple of days now. I'm running 3 shards on top of three 3-member replica sets (rs0, rs1 and rs2). Everything works so far: data is distributed over the 3 shards and replicated within each replica set.
BUT: importing data directly into one of the replica sets works fine at a constant 40k docs/s, while enabling sharding slows the entire process down to just 1.5k docs/s.
I've populated the data via different methods:
generated some random data in the mongo shell (running in my mongos)
JSON data import via mongoimport
MongoDB dump restore from another server via mongorestore
All of them result in just 1.5k docs/s, which is disappointing. The mongods are physical Xeon boxes with 32GB each, the 3 config servers are virtual servers (40 GB HDD, 2 GB RAM, if that matters), and the mongos is running on my app server. By the way, the value of 1.5k inserts/s doesn't depend on the shard key: I see the same behaviour with a dedicated shard key (single-field as well as compound) and with a hashed shard key on the _id field.
I tried a lot, even reinstalled the entire cluster twice. The question is: what is the bottleneck in this setup:
config servers running on virtual server? -> shouldn't be problematic due to the low resource consumption of config servers
mongos? -> running multiple Mongos on a dedicated box behind HAproxy might be an alternative, haven't tested that yet

Let's do the math first: how big are your documents? Keep in mind that they have to be transferred over the net multiple times depending on your write concern.
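A rough back-of-the-envelope sketch of that math, in Python. The 1 KiB document size and the number of network hops per document are assumptions chosen for illustration, not measurements from the setup in the question.

```python
def required_bandwidth_mbit(docs_per_sec, avg_doc_bytes, network_hops):
    """Estimate the network throughput needed, in Mbit/s.

    network_hops counts how often each document crosses the network,
    e.g. client -> mongos, mongos -> primary, primary -> each secondary.
    """
    bytes_per_sec = docs_per_sec * avg_doc_bytes * network_hops
    return bytes_per_sec * 8 / 1_000_000


# Assumed: 1 KiB documents; 4 hops = mongos hop, primary hop, two secondaries.
one_hop = required_bandwidth_mbit(40_000, 1024, 1)
four_hops = required_bandwidth_mbit(40_000, 1024, 4)
print(f"{one_hop:.0f} Mbit/s vs {four_hops:.0f} Mbit/s")
```

With these assumed numbers, 40k docs/s needs roughly 330 Mbit/s for a single hop and about 1.3 Gbit/s for four hops, which would already exceed a 1 Gbit/s link. Plug in your real average document size to see whether the network can be ruled out.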
Maybe you are experiencing this because of the indices which have to be built.
Please try this:
Drop all indices except the one on _id (which cannot be dropped anyway, iirc)
Load your data
Recreate the indices.
Enable sharding and balancing if not done already
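The load-first, index-later order of the steps above can be sketched in Python. The function is written against any object that exposes `insert_many` and `create_index` the way a PyMongo collection does; the helper name `load_then_index`, the batch size, and the index list are illustrative assumptions, and the final step (enabling sharding and balancing) is not covered here.

```python
from itertools import islice


def load_then_index(collection, documents, index_keys, batch_size=1000):
    """Insert all documents first, then build the secondary indexes.

    `collection` is assumed to behave like a PyMongo collection
    (insert_many and create_index). Returns the number of documents inserted.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(documents)
    inserted = 0
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            break
        # ordered=False asks the server to attempt every insert in the batch.
        collection.insert_many(batch, ordered=False)
        inserted += len(batch)
    # Indexes are created only after the data is loaded.
    for keys in index_keys:
        collection.create_index(keys)
    return inserted
```

Batching keeps memory use flat on the client, and building the indexes once at the end avoids updating them on every single insert.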
This is the suggested way of importing data into a sharded cluster anyway and should speed up your import considerably. Some (cautious!) fiddling with storage.syncPeriodSecs and storage.journal.commitIntervalMs might help, too.
The delay can occur even when storing the data on the primary shard. Depending on the size of your indices, they may slow down bulk operations considerably. You might also want to have a look at the replication.secondaryIndexPrefetch config option.
Another thing might be that your oplog simply gets filled faster than replication can take place. Problem here: once it is created, you cannot increase its size. I am not sure whether it is safe to delete and recreate it in standalone mode and then rejoin the replica set, but I doubt it. So the safe option would be to have the instance actually leave the replica set, reinstall it with a more appropriate oplog size and add the instance to the replica set as if it were the first time. If you don't care about the data, simply shut the replica set down, adjust the oplog size in the config file, delete the data dir, then restart and reinitialize the replica set. Thinking about your problem again, this sounds like the best bet to me, since the oplog isn't involved in standalone mode, iirc.
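To get a feeling for how quickly an oplog can fill up, here is a small estimate in Python. The 1 GiB oplog size and the 1 KiB average oplog entry size are assumptions for illustration only; substitute the values from your own replica set.

```python
def oplog_window_seconds(oplog_bytes, ops_per_sec, avg_entry_bytes):
    """Return how many seconds of writes fit in the oplog before it wraps."""
    if ops_per_sec <= 0 or avg_entry_bytes <= 0:
        raise ValueError("rates and sizes must be positive")
    return oplog_bytes / (ops_per_sec * avg_entry_bytes)


# Assumed: 1 GiB oplog, 1 KiB per oplog entry.
for rate in (1_500, 40_000):
    window = oplog_window_seconds(1024**3, rate, 1024)
    print(f"{rate} docs/s -> {window / 60:.1f} minutes of oplog")
```

Under these assumptions the oplog holds about 26 seconds of writes at 40k docs/s, so a secondary that falls behind by more than that can no longer catch up from the oplog alone.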
If you still have the same performance issues, my bet is on problems with disk or network IO.
You have a fairly standard setup: your mongos instance is running on a different machine than your mongod (be it a standalone or the primary of a replica set). You might want to check a few things:
Name resolution latency for resolving the names of your primary and secondary shards from the machine running your mongos instance. I cannot count the times installing nscd improved performance for various operations.
Network latency from your mongos instance to your primary shard. Assuming you have a firewall between your app server and your cluster, you might want to talk to the respective administrator.
In case you are using external authentication, try to measure how long it takes.
When using some sort of tunneling (e.g. stunnel or encryption like SSL/TLS), make sure you disable name resolution. Please keep in mind that encrypting and decrypting may take a relatively long time.
Measure random disk IO on the mongod instances
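The name resolution check from the list above can be measured with a short Python script run on the mongos machine. This is only a sketch: the helper name is made up for illustration, and "localhost" is a placeholder for the real shard host names.

```python
import socket
import time


def time_name_resolution(hostname, attempts=3):
    """Time repeated lookups of hostname; returns seconds per attempt.

    A failed lookup is recorded as None instead of raising.
    """
    timings = []
    for _ in range(attempts):
        start = time.perf_counter()
        try:
            socket.getaddrinfo(hostname, None)
            timings.append(time.perf_counter() - start)
        except socket.gaierror:
            timings.append(None)
    return timings


if __name__ == "__main__":
    # "localhost" is a placeholder; substitute the shard host names.
    print(time_name_resolution("localhost"))
```

If the first lookup is much slower than the following ones, a local resolver cache such as nscd is likely to help.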

I was facing a similar performance issue. What solved it was making the mongod instance that runs on the same host as the mongos the primary shard,
using the following command:
mongos> use admin
mongos> db.runCommand( { movePrimary: "mydb", to: "shard0003" } )
After making this change (without touching the load balancer or tweaking anything else), I was able to load a relatively large dataset (25 million rows) using a loader I had written, and the entire procedure took about 15 minutes instead of hours/days.

Related

Is there some way to make a distributed table still work for queries when one of the shard servers is down?

A common case for us is updating the ClickHouse config in a way that requires a ClickHouse restart to take effect. During the restart, the query services that depend on ClickHouse's distributed table return an exception because they are disconnected from the restarting server.
So, as the title says, what I want is a way to make the distributed table still work for queries when one of the shard servers is down. Thanks.
I see two ways:
Since this server failure is transient, you can refactor your server-side code by adding a retry policy to your requests (for C# I would recommend using Polly)
Use a proxy (load balancer) in front of CH (for example chproxy).
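The retry-policy idea from the first option can be sketched in a few lines of Python. This is a generic illustration, not Polly's or any ClickHouse client's API: `run_query` stands for whatever function issues the request, and the choice of `ConnectionError` as the transient error is an assumption.

```python
import time


def query_with_retry(run_query, attempts=3, base_delay=0.5,
                     retry_on=(ConnectionError,), sleep=time.sleep):
    """Call run_query(), retrying on transient connection errors.

    Waits base_delay, then 2x, 4x, ... between attempts and re-raises
    the last error once all attempts are used up.
    """
    for attempt in range(attempts):
        try:
            return run_query()
        except retry_on:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

The `sleep` parameter exists only so the waiting can be replaced in tests; in normal use the default is fine. The total waiting time should be chosen to cover a typical server restart.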
UPDATE
When one node in a cluster is restarting, a distributed table created over replicated tables should remain accessible (of course, requests shouldn't be sent to the restarting node).
Availability of data is achieved through replication; therefore, you need to create Replicated* tables over the materialized view and then create Distributed tables over the Replicated* tables.
Please look at the articles CH Data Distribution and Distributed vs Shard vs Replicated,
and, as a working example (not your exact case), CH Circular cluster topology.

Redis memory usage issue

I have a cluster with two Redis docker instances (v3.2.5) I use for caching responses from Spring boot microservices.
I've disabled all persistence and the number of keys is stable over time, all of them expiring between 5 minutes and 1 day.
Despite this, I can see the memory usage creeping up. It looks like once a day (around midnight) it uses a lot of memory and then releases some of it.
Does anyone have any idea what this process may be, if there's any way to configure Redis to avoid using that much memory?
The number of keys I have doesn't justify this amount of memory.
UPDATE
After taking a snapshot of the database and loading the data into a fresh Redis instance (same version, same config), used_memory_human is 10 times lower than on the original one.
Is it possible that key expiration doesn't really delete keys from memory?

Maximum number of databases in redis

In Redis, select "number" gives access to the database at that index. In my Redis config it is set to 16 (why?). We require high scalability for our application, so what is the maximum for that setting?
The default number of Redis databases is 16, but can be configured to more. You probably have 16 in your config because of that default (see Storing Data with Redis).
Databases (in Redis) are a way to partition data logically (think "namespace", "key-space" or, in RDBMS terms, a schema). Redis databases have nothing to do with scalability, so your "max limit" question is out of context.
To scale you would want to do as Sergio suggests in his comment: create separate Redis instances/clusters for separate applications.
Answer: 1 Million (Maybe) or 100K on Linux (Maybe)
Official Info
So the official documentation indicates that the default setting is 16. This may be changed in redis.conf. The official documentation does not indicate the range that is allowed here.
Original Research
Through experimentation on my local Windows 10 WSL Debian install I found that I could set the conf value to anything and the server would start up fine.
However when I then attempted to select a database via command line my computer would freeze. I tried several values and the system worked perfectly and quickly at 1,000,000 (one million) and froze out at 10,000,000 (ten million). This number seems rather arbitrary in the computer world so either it is a memory limitation (seems unlikely but I don't know WSL's memory handling) or an arbitrary limitation set by the developers.
I ran some similar tests on my CentOS 7 Box and redis refused to start at 1 Million. But started fine at 100 thousand. No idea why it's different from my windows system or why it just refused to start instead of starting and then failing when a database was selected like my WSL version.
Disclaimer
As already stated by #kit in his previous answer, databases are not designed for "scaling" but rather for "namespaces". For example, a SaaS may run one code base for hundreds of clients, each with their own "namespace" or Redis database. This allows you to flush one client without affecting the others and minimizes the administrative overhead. But running dozens of unique WordPress instances would be better served by a separate Redis install for each.
The default number of databases in Redis is 16, indexed 0 to 15. You can edit your redis.conf file to adjust this number:
Steps:
1) Edit the config file
vi /etc/redis.conf
The default config path is /etc/redis.conf on CentOS when installed with yum.
2) Find the keyword "databases"
# Set the number of databases. The default database is DB 0, you can select
# a different one on a per-connection basis using SELECT <dbid> where
# dbid is a number between 0 and 'databases'-1
databases 16
"databases 16": 16 is the default on new installations
3) Update the number of databases:
databases 30
4) Save and quit
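If you prefer to script the edit instead of using vi, the steps above could look roughly like this in Python. The helper works on the text of the config file; the function name is made up for illustration, and reading from and writing back to /etc/redis.conf (plus restarting Redis) is left to you.

```python
import re


def set_databases(conf_text, count):
    """Return conf_text with the `databases` directive set to count.

    Commented lines are left alone; the directive is appended if missing.
    """
    if count < 1:
        raise ValueError("count must be at least 1")
    pattern = re.compile(r"^[ \t]*databases[ \t]+\d+[ \t]*$", re.MULTILINE)
    new_line = f"databases {count}"
    if pattern.search(conf_text):
        return pattern.sub(new_line, conf_text)
    return conf_text.rstrip("\n") + "\n" + new_line + "\n"
```

For example, `set_databases(open("/etc/redis.conf").read(), 30)` returns the updated text, which you can then write back with the appropriate permissions.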
The default number of databases is 16. You can change it by editing the /etc/redis/redis.conf file.
sudo vim /etc/redis/redis.conf
Change databases 16 to
databases 150
(you can use any number instead of 150).
Also, change supervised no to
supervised systemd in the redis.conf file.
Restart redis:
sudo systemctl restart redis.service
Now try selecting any database with an index higher than 15.
redis-cli
select 34
Answer is "unlimited"...
Here is the question and its answer from redis faqs page:
How many Redis databases can I create and manage?
The number of Redis databases is unlimited. The limiting factor is the
available memory in the cluster, and the number of shards in the
subscription.
Note the impact of the specific database configuration on the number
of shards it consumes. For example:
- Enabling database replication, without enabling database clustering, creates two shards: a master shard and a replica shard.
- Enabling database clustering creates as many database shards as you configure.
- Enabling both database replication and database clustering creates double the number of database shards you configure.
For more details:
https://redis.com/faqs/#:~:text=The%20number%20of%20Redis%20databases,of%20shards%20in%20the%20subscription.

Mongo - quick exports of all documents without performance hit on other queries?

I need Mongo cluster doing 2 operations:
get/update a single document - Mongo is great for realtime changes, excellent speed.
export all documents into JSON files (one file per category, there are about 15 categories) - this is very slow when I use a regular query. Maybe I do not know what command or options to use ... or I would need to fit it all into RAM, which is expensive. Even replication to a new mongo instance is much faster (takes hours) than a query and writing the data to disk (takes days).
I have about 10m documents. Mongo data on disk takes 250GB. There are about 15 categories for which I need separate files (at the moment all documents are in 1 collection regardless of category).
Which command should I use to export all data into files in a couple of hours?
How large should the AWS instances be to speed it up without paying too much for RAM? Would it help? Operation 2) must not cause a performance hit for operation 1) -- I cannot stop Mongo and use mongoexport.
I am not sure what kind of servers you are using but this may provide some further insights regarding the export/file creation performance and not shutting off mongo. One presumes you are working with a sharded and replicated cluster.
In my case I am on Azure VMs running Windows Server in a replicated and sharded cluster. So I would take a copy of the Azure blobs associated with the data disks on a secondary in each RS. You should stop your balancer and lock the db on the secondary to do this. This should take a couple of minutes at most to copy only 250GB. Then I would restore the blobs to disks on a new VM.
Then you could query data out of this VM without affecting your cluster's performance. You may additionally add indexes for this export process since you are on a separate instance now.
Personally I use PowerShell to do this in Azure. Golang may be a better choice to write your queries in due to its parallel capabilities if JavaScript via the mongo shell fails you. I've had JS work faster than python code but it also depends on what you know.
This is just one way but it does address some of the criteria you posted.
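As a rough illustration of the export step on the separate instance, here is a Python sketch that streams documents into one file per category. It accepts any iterable of dicts (for example a database cursor). The field name `category`, the JSON Lines output format, and the assumption that category values are safe to use as file names are all illustrative choices, not something stated in the question.

```python
import json
import os


def export_by_category(documents, out_dir, category_field="category"):
    """Stream documents into one JSON Lines file per category.

    `documents` can be any iterable of dicts, so the full result set
    never has to fit in RAM. Returns a dict mapping each category to
    the number of documents written.
    """
    os.makedirs(out_dir, exist_ok=True)
    files, counts = {}, {}
    try:
        for doc in documents:
            category = str(doc.get(category_field, "uncategorized"))
            if category not in files:
                path = os.path.join(out_dir, f"{category}.jsonl")
                files[category] = open(path, "w", encoding="utf-8")
                counts[category] = 0
            # default=str covers values json cannot encode natively.
            files[category].write(json.dumps(doc, default=str) + "\n")
            counts[category] += 1
    finally:
        for handle in files.values():
            handle.close()
    return counts
```

Because all categories are written in a single pass, the collection is read only once instead of once per category.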

MongoDB preload documents into RAM for better performance

I want MongoDB to hold query results in RAM for a longer period of time (say 30 minutes if memory is available). Is it possible? Or is there any way I can make sure that the data is pre-loaded into RAM before subsequent queries on it?
In fact I am wondering about MongoDB's performance for simple queries. I have a dedicated server with 10GB RAM and my db.stats() output is as follows:
db.stats();
{
"db": "test",
"collections":16,
"objects":625690,
"avgObjSize":68.90,
"dataSize":43061996,
"storageSize":1121402888,
"numExtents":74,
"indexes":25,
"indexSize":28207200,
"fileSize":469762048,
"nsSizeMB":16,
"ok":1
}
Now when I query a single document (as mentioned here) from a web service, it loads in 1.3 seconds. Subsequent calls of the same query respond in 400ms, and then after a few seconds it again starts taking 1.3 seconds. It looks like MongoDB has dropped the previously queried document from memory, even though no other queries are asking for data mapped to RAM.
Please explain this and let me know of any way to make subsequent queries respond faster.
Your observed performance problem on an initial query is likely one of the following issues (in rough order of likelihood):
1) Your application / web service has some overhead to initialize on first request (i.e. allocating memory, setting up connection pools, resolving DNS, ...).
2) Indexes or data you have requested are not yet in memory, so need to be loaded.
3) The Query Optimizer may take a bit longer to run on the first request, as it is comparing the plan execution for your query pattern.
It would be very helpful to test the query via the mongo shell, and isolate whether the overhead is related to MongoDB or your web service (rather than timing both, as you have done).
Following are some notes related to MongoDB.
Caching
MongoDB doesn't have a "caching" time for documents in memory. It uses memory-mapped files for disk I/O and the documents in memory are based on your active queries (documents/indexes you've recently loaded) as well as the available memory. The operating system's virtual memory manager is in charge of caching, and typically will follow a Least-Recently Used (LRU) algorithm to decide which pages to swap out of memory.
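To make the LRU behaviour concrete, here is a toy model in Python. It is only an illustration of the eviction policy described above, not the code used by MongoDB or by the operating system, and the class name and capacity are made up.

```python
from collections import OrderedDict


class LruPageCache:
    """Toy model of LRU page caching; not MongoDB's or the kernel's real code."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.pages = OrderedDict()

    def access(self, page_id):
        """Touch a page; return True on a cache hit, False on a miss."""
        if page_id in self.pages:
            self.pages.move_to_end(page_id)  # mark as most recently used
            return True
        if len(self.pages) >= self.capacity:
            self.pages.popitem(last=False)  # evict the least recently used
        self.pages[page_id] = True
        return False
```

A page stays cached only as long as other pages do not push it out, which is why there is no fixed "caching time": what matters is how much memory is available and what else is being accessed.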
Memory Usage
The expected behaviour is that over time MongoDB will grow to use all free memory to store your active working data set.
Looking at the db.stats() numbers you provided (and assuming that is your only database), it looks like your database size is currently about 1GB, so you should be able to keep everything within your 10GB of total RAM unless:
there are other processes competing for memory
you have restarted your mongod server and those documents/indexes haven't been requested yet
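The size estimate can be checked with a few lines of Python using the values from the db.stats() output in the question. Treating dataSize plus indexSize as the amount of data worth keeping in memory is a simplification made for this sketch.

```python
# Values copied from the db.stats() output in the question (bytes).
stats = {
    "dataSize": 43061996,
    "indexSize": 28207200,
    "storageSize": 1121402888,
}


def to_mib(num_bytes):
    """Convert bytes to mebibytes."""
    return num_bytes / 2**20


working_set = stats["dataSize"] + stats["indexSize"]
print(f"data + indexes: {to_mib(working_set):.1f} MiB")
print(f"storage on disk: {to_mib(stats['storageSize']):.1f} MiB")
```

That works out to roughly 68 MiB of documents and indexes against about 1.04 GiB of allocated storage, both far below the 10GB of RAM on the server.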
In MongoDB 2.2, there is a new touch command you can use to load indexes or documents into memory after a server restart. This should only be used on initial startup to "warm up" the server, as otherwise you could be unhelpfully forcing actual "active" data out of memory.
On a Linux system, for example, you can use the top command and should see that:
virtual bytes/VSIZE will tend to be the size of the entire database
if the server doesn't have other processes running, resident bytes/RSIZE will be the total memory of the machine (this includes file system cache contents)
mongod should not use swap (since the files are memory-mapped)
You can use the mongostat tool to get a quick view of your mongod activity .. or more usefully, use a service like MMS to monitor metrics over time.
Query Optimizer
The MongoDB Query Optimizer compares plan execution for a query pattern every ~1,000 write operations, and then caches the "winning" query plan until the next time the optimizer runs .. or you explicitly call an explain() on that query.
This should be a straightforward one to test: run your query in the mongo shell with .explain() and look at the ms timings, and also the number of index entries and documents scanned. The timing for an explain() isn't the actual time the queries will take to run, as it includes the cost of comparing the plans. The typical execution will be much faster .. and you can look for slow queries in your mongod log.
By default MongoDB will log all queries slower than 100ms, so this provides a good starting point to look for queries to optimize. You can adjust the slow ms value with the --slowms config option, or using the Database Profiler commands.
Further reading in the MongoDB documentation:
Caching
Checking Server Memory Usage
Database Profiler
Explain
Monitoring & Diagnostics
