Two days ago, I noticed that an EC2 Auto Scaling group was rebalancing nodes across its configured Availability Zones.
On that same day, I suspended the AZRebalance process.
Today, for no apparent reason, the ASG started rebalancing again...
I did check that the AZRebalance suspension hadn't been removed...
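For what it's worth, this is roughly how I verified that the suspension is still in place (a minimal sketch with the AWS SDK for Java; the group name is a placeholder):

import com.amazonaws.services.autoscaling.AmazonAutoScaling;
import com.amazonaws.services.autoscaling.AmazonAutoScalingClientBuilder;
import com.amazonaws.services.autoscaling.model.DescribeAutoScalingGroupsRequest;

AmazonAutoScaling asg = AmazonAutoScalingClientBuilder.defaultClient();
asg.describeAutoScalingGroups(new DescribeAutoScalingGroupsRequest()
        .withAutoScalingGroupNames("my-eks-node-group-asg"))    // placeholder name
   .getAutoScalingGroups()
   .forEach(g -> g.getSuspendedProcesses()
        .forEach(p -> System.out.println(p.getProcessName()))); // "AZRebalance" should be listed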
I can also add that this ASG is an EKS node group... Not sure if that makes a difference.
I have 5 different machines, each running 5 Spring Boot instances of a Kafka Streams application. I am using a compacted topic with 50 partitions, plus 2-3 other topics, and each instance has a concurrency of 10. I am using Docker Swarm and Docker volumes. From these topics, my Kafka Streams app builds KTables and KStreams and performs flatMap, map and join operations.
props.put(StreamsConfig.STATE_DIR_CONFIG, "/tmp/kafka-streams");
props.put(StreamsConfig.REPLICATION_FACTOR_CONFIG, 3);
props.put(StreamsConfig.NUM_STANDBY_REPLICAS_CONFIG, 2);
props.put(StreamsConfig.COMMIT_INTERVAL_MS_CONFIG, 100);
props.put(StreamsConfig.PROCESSING_GUARANTEE_CONFIG, StreamsConfig.EXACTLY_ONCE);
props.put(StreamsConfig.NUM_STREAM_THREADS_CONFIG, 10);
props.put(StreamsConfig.APPLICATION_ID_CONFIG, applicationId);
When everything runs normally, there is no problem and no data loss in my application's .join() operations, but when one of my instances goes down, my join operations are no longer actually able to perform the join.
My question is: when the app is restarted or redeployed (and given that it runs inside a non-persistent container), its state is gone, right? Then my join operations don't work. Only when I redeploy my instance and repopulate my compacted topic from Elasticsearch with the latest entities do my join operations work again. So I think that when my application starts on a new machine, its local state store is gone? But the Kafka documentation says:
If tasks run on a machine that fails and are restarted on another machine, Kafka Streams guarantees to restore their associated state stores to the content before the failure by replaying the corresponding changelog topics prior to resuming the processing on the newly started tasks. As a result, failure handling is completely transparent to the end user.
Note that the cost of task (re)initialization typically depends primarily on the time for restoring the state by replaying the state stores' associated changelog topics. To minimize this restoration time, users can configure their applications to have standby replicas of local states (i.e. fully replicated copies of the state). When a task migration happens, Kafka Streams then attempts to assign a task to an application instance where such a standby replica already exists in order to minimize the task (re)initialization cost. See num.standby.replicas at the Kafka Streams Configs Section.
(https://kafka.apache.org/0102/documentation/streams/architecture)
Does my downed instance restore its Kafka state store when it comes back up? If so, why am I losing data? I have no idea :/ Or is it unable to reload the state store because of the committed offsets, since all my instances use the same applicationId?
Thanks!
The changelog topics are always read from the earliest offset, and they're compacted, so they don't lose data.
If you're joining non-compacted topics, then sure, you can lose data, but that's not limited to Kafka Streams or your specific use case... You'll need to configure the topic to retain data for at least as long as you think it will take to resolve any downtime issues. While the data is retained, you can always seek your consumer back to it.
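For example, the retention window can be set per topic with the AdminClient (a sketch; the broker address, topic name and 7-day window are assumptions, not values from your setup):

import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.Config;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.common.config.ConfigResource;

Properties p = new Properties();
p.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092");
try (AdminClient admin = AdminClient.create(p)) {
    ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC, "events");
    // retain 7 days of data so a consumer can recover after downtime
    ConfigEntry retention = new ConfigEntry("retention.ms", String.valueOf(7L * 24 * 60 * 60 * 1000));
    admin.alterConfigs(Collections.singletonMap(topic, new Config(Collections.singletonList(retention))))
         .all().get(); // .get() throws InterruptedException/ExecutionException
}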
If you want persistent storage, use a volume mount into your container (via Kubernetes, for example), or plug in a state store that lives outside the container, like Redis: https://github.com/andreas-schroeder/redisks
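For instance, if a persistent volume is mounted at /mnt/streams-state inside the container (the path is illustrative), point the state directory at it:

// keep the RocksDB state stores on a persistent mount so they survive container restarts
props.put(StreamsConfig.STATE_DIR_CONFIG, "/mnt/streams-state");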
In an Auto Scaling group, if there is an equal number of instances in multiple Availability Zones, which Availability Zone will be selected for terminating instances under the AWS default termination policy? Is it selected randomly?
According to the documentation, if you did not assign a specific termination policy to the group, it uses the default termination policy.
In the scenario where there is an equal number of instances in multiple Availability Zones, the Auto Scaling group selects the Availability Zone with the instances that use the oldest launch configuration.
If the instances were launched from the same launch configuration, then the Auto Scaling group selects the instance that is closest to the next billing hour and terminates it.
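If you'd rather not rely on the default behaviour, you can pin explicit termination policies on the group; a rough sketch with the AWS SDK for Java (the group name and policy order are just an example):

import com.amazonaws.services.autoscaling.AmazonAutoScaling;
import com.amazonaws.services.autoscaling.AmazonAutoScalingClientBuilder;
import com.amazonaws.services.autoscaling.model.UpdateAutoScalingGroupRequest;

AmazonAutoScaling asg = AmazonAutoScalingClientBuilder.defaultClient();
// policies are evaluated in order: oldest launch configuration first,
// then closest to the next billing hour as the tie-breaker
asg.updateAutoScalingGroup(new UpdateAutoScalingGroupRequest()
        .withAutoScalingGroupName("my-asg")                     // placeholder name
        .withTerminationPolicies("OldestLaunchConfiguration", "ClosestToNextInstanceHour"));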
We moved Algolia search from our local development environment to our staging environment. On staging we have 144,000 sample orders and 100,000 products. Both of these numbers are smaller than our production environment.
We inserted our app ID and other credentials and saved. We're using the AOE scheduler to execute our crons. algoliasearch_run_queue has been running for 5 hours now, and it seems to be making the same queries:
SELECT SUM(order_items.qty_ordered) AS ordered_qty, order_items.name AS order_items_name, `o....
I believe this is related to ranking = ordered_qty. This cron is holding up all processing of subsequent crons, meaning other Magento tasks (order emails, indexing, etc.) will not run while this one is running.
What is the fix for this?
An improvement was made in 1.4.3, but it will probably not resolve the issue for such a big store. Computing ordered_qty can indeed take a long time, but it is used to achieve good relevance.
For one of my Hadoop deployments I'm using previous-generation instances (m1.xlarge, m1.large, etc.). An m1.xlarge instance comes with 4 x 420 GiB of instance store. Is the instance store safe enough for storing data, or do I need to go with EBS?
Thanks
It really depends on how much persistence you want for your data (or how you would define "safer"). The instance store is lost if the instance is terminated or stopped. You also run the risk of AWS "losing" your instance if you plan to run it for a long time (I've seen instances fail after approximately one year, though we've also had instances run flawlessly for over 3 years).
So if you need persistence, go with EBS and, if needed, compensate for performance differences by using EBS with provisioned IOPS or a RAID array of EBS volumes. If your use case is just importing data, crunching it in Hadoop and exporting it somewhere else, you can safely choose the instance store (we're doing this using EMR, for example).
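For reference, a provisioned-IOPS volume can be created like this (a minimal sketch with the AWS SDK for Java; the AZ, size and IOPS values are placeholders):

import com.amazonaws.services.ec2.AmazonEC2;
import com.amazonaws.services.ec2.AmazonEC2ClientBuilder;
import com.amazonaws.services.ec2.model.CreateVolumeRequest;
import com.amazonaws.services.ec2.model.VolumeType;

AmazonEC2 ec2 = AmazonEC2ClientBuilder.defaultClient();
ec2.createVolume(new CreateVolumeRequest()
        .withAvailabilityZone("us-east-1a") // placeholder AZ
        .withSize(500)                      // GiB, placeholder
        .withVolumeType(VolumeType.Io1)
        .withIops(4000));                   // provisioned IOPS, placeholder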
I'm confused by the situation and have been trying to fix this for a couple of days now. I'm running 3 shards on top of three 3-member replica sets (rs0, rs1 and rs2). Everything works so far: data is distributed across the 3 shards as well as replicated within the replica sets.
BUT: importing data into one of the replica sets works fine at a constant 40k docs/s, while enabling sharding slows the entire process down to just 1.5k docs/s.
I've populated the data via different methods:
generated some random data in the mongo shell (running in my mongos)
JSON data import via mongoimport
MongoDB dump restore from another server via mongorestore
All of them result in just 1.5k docs/s, which is disappointing. The mongods are physical Xeon boxes with 32 GB RAM each, the 3 config servers are virtual servers (40 GB HDD, 2 GB RAM, if that matters), and the mongos runs on my app server. By the way, the 1.5k inserts/s figure doesn't depend on the shard key; the behaviour is the same for a dedicated shard key (a single-field key as well as a compound key) and for a hashed shard key on the _id field.
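For completeness, sharding was enabled roughly like this (a sketch with the MongoDB Java driver; host, database and collection names are placeholders):

import org.bson.Document;
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoDatabase;

MongoClient mongos = MongoClients.create("mongodb://mongos-host:27017");
MongoDatabase admin = mongos.getDatabase("admin");
admin.runCommand(new Document("enableSharding", "mydb"));
admin.runCommand(new Document("shardCollection", "mydb.mycoll")
        .append("key", new Document("_id", "hashed")));  // hashed shard key on _id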
I tried a lot, and even reinstalled the entire cluster twice. The question is: what is the bottleneck in this setup?
The config servers running on virtual servers? -> Shouldn't be problematic given the low resource consumption of config servers.
The mongos? -> Running multiple mongos instances on a dedicated box behind HAProxy might be an alternative; I haven't tested that yet.
Let's do the math first: how big are your documents? Keep in mind that they may have to be transferred over the network multiple times, depending on your write concern.
Maybe you are experiencing this because of the indices that have to be built.
Please try this:
Disable all indices except the one on _id (which can't be dropped anyway, iirc)
Load your data
Re-enable the indices.
Enable sharding and balancing if not done already
This is the suggested way of importing data into a sharded cluster anyway, and should speed up your import considerably. Some (cautious!) fiddling with storage.syncPeriodSecs and storage.journal.commitIntervalMs might help, too.
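A minimal sketch of the index part of that flow with the MongoDB Java driver (connection string, collection and field names are placeholders):

import org.bson.Document;
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.model.Indexes;

MongoClient client = MongoClients.create("mongodb://mongos-host:27017");
MongoCollection<Document> coll = client.getDatabase("mydb").getCollection("mycoll");

coll.dropIndexes();                               // drops everything except the _id index
// ... bulk-load the data here, e.g. with insertMany() in batches ...
coll.createIndex(Indexes.ascending("someField")); // recreate the secondary indices afterwards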
The delay can occur even when storing the data on the primary shard. Depending on the size of your indices, they may slow down bulk operations considerably. You might also want to have a look at the replication.secondaryIndexPrefetch config option.
Another thing might be that your oplog simply gets filled faster than the replication can take place. Problem here: once it is created, you cannot increase its size. I am not sure whether it is safe to delete and recreate it in standalone mode and then rejoin the replica set, but I doubt it. So the safe option would be to have the instance actually leave the replica set, reinstall it with a more appropriate oplog size, and add the instance to the replica set as if it were the first time. If you don't care about the data, simply shut the replica set down, adjust the oplog size in the config file, delete the data dir, and restart and reinitialize the replica set. Thinking about your problem twice, this sounds like the best bet to me, since the oplog isn't involved in standalone mode, iirc.
If you still have the same performance issues, my bet is on problems with disk or network IO.
You have a fairly standard setup: your mongos instance runs on a different machine than your mongod (be it a standalone instance or the primary of a replica set). You might want to check a few things:
Name-resolution latency when resolving the names of your primary and secondary shards from the machine running your mongos instance. I can't count the number of times installing nscd has improved performance for various operations.
Network latency from your mongos instance to your primary shard. Assuming you have a firewall between your app server and your cluster, you might want to talk to the respective administrator.
In case you are using external authentication, try to measure how long it takes.
When using some sort of tunneling (e.g. stunnel) or encryption (e.g. SSL/TLS), make sure you disable name resolution. Please keep in mind that encrypting and decrypting may take a relatively long time.
Measure random disk IO on the mongod instances
I was facing a similar performance issue. What helped solve it was setting the mongod instance that was running on the same host as the mongos as the primary shard, using the following command:
mongos> use admin
mongos> db.runCommand( { movePrimary: "mydb", to: "shard0003" } )
After making this change (without touching the load balancer or tweaking anything else), I was able to load a relatively large dataset (25 million rows) using a loader I had written, and the entire procedure took about 15 minutes instead of hours/days.