RabbitMQ AWS autoscaling - amazon-ec2

I am new to RabbitMQ and am evaluating it for my next project. Is it possible to use AWS autoscaling with RabbitMQ? How would the multiple instances coordinate messages across their queues? I see that RabbitMQ has clustering capabilities, but it does not appear to fit an autoscaling model. I did find this post,
How to set up autoscaling RabbitMQ Cluster AWS
It fixed the scale-up issues but did not address what to do when the instances scale down. The issue with scaling down is the potential for messages to still be in the queues when an instance is removed. Clustering is OK, but I would like to leverage autoscaling whenever possible.

Related

How can I subscribe to amazon SNS topic with all pods in cluster?

I have a Kubernetes cluster with 5 replicas running a Spring Boot server, and I would like to subscribe each pod to an Amazon SNS topic individually. Is it possible? How can I do that?
Thanks!
If you are running a Spring server, you can create an endpoint on each pod and register those endpoints as subscribers to your SNS topic. You would need one subscription per pod (you should be able to get pod address information by running kubectl describe svc ${service_name}) in order for SNS to properly fan out messages to all of your pods. AWS outlines this process here.
Edit: worth noting that the above process is not very robust, since that pod address list may not be static. It may be better to subscribe your service to the SNS topic, since this abstracts away the changing pod IP list, and implement pod-to-pod communication, similar to what is outlined in this article.
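A rough boto3 sketch of the per-pod subscription step. The region, topic ARN, pod IPs, port, and the /sns-handler path are all placeholders; you would pull the real pod addresses from kubectl as described above.

```python
# Sketch: subscribe one HTTP endpoint per pod to an SNS topic via boto3.
import boto3

sns = boto3.client("sns", region_name="us-east-1")           # assumed region

TOPIC_ARN = "arn:aws:sns:us-east-1:123456789012:my-topic"    # placeholder ARN
pod_ips = ["10.0.1.12", "10.0.1.13"]                         # e.g. from `kubectl describe svc`

for ip in pod_ips:
    resp = sns.subscribe(
        TopicArn=TOPIC_ARN,
        Protocol="http",                                     # "https" if the pods terminate TLS
        Endpoint=f"http://{ip}:8080/sns-handler",            # placeholder path on each pod
        ReturnSubscriptionArn=True,
    )
    print("subscribed", ip, resp["SubscriptionArn"])
```

Keep in mind that SNS first POSTs a confirmation request to each endpoint, so the Spring handler has to confirm the subscription before notifications start flowing.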

How to set up clustering of queue servers for beanstalkd?

We have two queue servers, both attached to the application. So far, Server 1 receives all the queued jobs and processes them. I would like to set up a cluster so that the load is spread across the two servers. Can anyone suggest how to set up a cluster?
Thanks.
Beanstalkd doesn't offer this feature.
Alternatives are:
you set up soft sharding to route requests to queue A or B (see the sketch below)
you use an alternative like Redis Queue, or Cloud Pub/Sub from Google Cloud Platform
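A minimal sketch of what that soft sharding could look like: hash a stable job key and route to server A or B. The send_job() helper is hypothetical; plug in whatever beanstalkd client you actually use to put the job on the chosen host.

```python
# Sketch: "soft sharding" across two beanstalkd servers by hashing a job key.
import hashlib

SERVERS = [("queue-a.internal", 11300), ("queue-b.internal", 11300)]  # placeholder hosts

def pick_server(job_key: str):
    """Deterministically map a job key (e.g. a customer id) to one server."""
    digest = hashlib.sha256(job_key.encode("utf-8")).digest()
    return SERVERS[digest[0] % len(SERVERS)]

def send_job(job_key: str, body: bytes) -> None:
    host, port = pick_server(job_key)
    # Hypothetical: open a beanstalkd connection to (host, port) and `put` the body here.
    print(f"routing job {job_key!r} to {host}:{port}")

send_job("customer-42", b'{"action": "resize-image"}')
```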
Beanstalkd in a single-instance setup can support multiple TCP connections, and it generally outperforms Redis. Below are a few benchmarks, though benchmarks are always somewhat subjective.
Benchmark references:
https://ph4r05.deadcode.me/blog/2017/12/16/laravel-queueing-benchmark.html
https://adam.herokuapp.com/past/2010/4/24/beanstalk_a_simple_and_fast_queueing_backend/
So vertical scaling is usually sufficient.
The problem, however, is availability when the single beanstalkd instance goes away.
You can check out coolbeans; this project is in alpha. It provides a replicated beanstalkd: https://github.com/1xyz/coolbeans

Scaling a microservice with frontend and backend instances

I am developing a series of microservices using Spring Boot and plan to deploy them on Kubernetes.
Some of the microservices are composed of an API, which writes messages to a Kafka queue, and a listener, which listens to the queue and performs the relevant actions (e.g. write to a DB, construct messages for onward processing).
These services work fine locally but I am planning to run multiple instances of the microservice on Kubernetes. I'm thinking of the following options:
Run multiple instances as is (i.e. each microservice serves as an API and a listener).
Introduce FRONTEND and BACKEND environment variables. If the FRONTEND variable is true, do not configure the listener process. If the BACKEND variable is true, configure the listener process.
This way I can scale how many frontend/backend services I need, and I also have the benefit of being able to shut down the backend services without losing requests.
Any pointers, best practice or any other options would be much appreciated.
You can do as you describe, with environment variables, or you may also be interested in building your app with different profiles/bean configurations and making two different images.
In both cases, you should use two different Kubernetes Deployments so you can scale and configure them independently.
You may also be interested in a Leader Election pattern, where you keep only one active replica; this applies if it only makes sense for a single replica to process the events from the queue. Depending on your availability requirements, this can also be solved by simply running a single replica.
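To make the environment-variable option concrete, here is a small sketch of the FRONTEND/BACKEND toggle as a plain Python entry point; start_api() and start_listener() are hypothetical placeholders for your real HTTP server and Kafka consumer loop. In Spring Boot the same gate is usually expressed with profiles or @ConditionalOnProperty rather than hand-rolled env checks.

```python
# Sketch: gate the API and the queue listener behind FRONTEND/BACKEND env vars.
import os

def start_api() -> None:
    print("serving HTTP API ...")        # placeholder for the real web server

def start_listener() -> None:
    print("consuming from Kafka ...")    # placeholder for the real consumer loop

if __name__ == "__main__":
    frontend = os.environ.get("FRONTEND", "false").lower() == "true"
    backend = os.environ.get("BACKEND", "false").lower() == "true"

    if backend:
        start_listener()                 # backend Deployment: listener only
    if frontend:
        start_api()                      # frontend Deployment: API only
```

With two Deployments you would set FRONTEND=true on one and BACKEND=true on the other, and scale each independently.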

Create new EC2 instance on increase in number of messages in queue

Is there a way to create new EC2 instances when the number of messages in a RabbitMQ queue increases?
Taking for granted that you know how to set up an Auto Scaling group, you can configure your group to adjust its capacity according to demand, in response to Amazon CloudWatch metrics.
The thing is, you can store your own metrics in CloudWatch using the PutMetricData API.
So you should:
somehow send to CloudWatch the number of messages RabbitMQ is managing, maybe with a cron script (see the sketch after this list);
check that CloudWatch is receiving your data;
create a Launch Template for your scaling EC2 instances;
create an Auto Scaling group with a scaling trigger tied to your new CloudWatch metric.
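As a sketch of the first two steps, here is a small cron-able script that reads the queue depth from the RabbitMQ management API and publishes it as a custom CloudWatch metric. It assumes the management plugin is enabled on localhost:15672 with guest/guest credentials, a queue named "work" on the default "/" vhost, and boto3/requests installed; the namespace and metric name are arbitrary choices.

```python
# Sketch: publish RabbitMQ queue depth to CloudWatch as a custom metric.
import boto3
import requests

QUEUE = "work"
MGMT_URL = f"http://localhost:15672/api/queues/%2F/{QUEUE}"   # %2F = default "/" vhost

def main() -> None:
    queue_info = requests.get(MGMT_URL, auth=("guest", "guest"), timeout=5).json()
    depth = queue_info["messages"]                            # ready + unacknowledged

    cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")  # assumed region
    cloudwatch.put_metric_data(
        Namespace="Custom/RabbitMQ",                          # arbitrary namespace
        MetricData=[{
            "MetricName": "QueueDepth",
            "Dimensions": [{"Name": "QueueName", "Value": QUEUE}],
            "Value": depth,
            "Unit": "Count",
        }],
    )

if __name__ == "__main__":
    main()
```

The Auto Scaling group's trigger would then be a CloudWatch alarm on this metric (for example, QueueDepth greater than some threshold for a few consecutive periods).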

Lost messages when migrating RabbitMQ from one EC2 instance to another

I have RabbitMQ installed and working well on an EC2 CentOS 6 instance, with an assortment of queues and topics. I decided to migrate this working instance to another, new EC2 server instance with the same OS and initial setup, just smaller.
I created an AMI (Amazon Machine Image) from the existing installation, and then used this AMI to create a new server instance. RabbitMQ came up just fine, as did all the topics, users, virtual hosts, queues, etc.
However, the queues all came back with 0 messages in them, although messages did exist in the queues before creating the server image.
Questions:
Did I miss something in my migration?
Where are messages explicitly 'stored' while they're sitting in RabbitMQ queues?
I believe the messages were sent as 'Persistent', but I'm not 100% sure about that. I am aware of replication of RabbitMQ instances, but figured this method of server recreation would be simpler/quicker?
#robthewolf's comments got me searching some more, but with a slightly different slant (around whether one could explicitly save off queue messages to a backing database/key-value store).
That led me to this old, but seemingly still-relevant, blog post that clearly describes RabbitMQ's current 'persistence' methods for all cases (persistent publishing, durable queues, etc.):
http://www.rabbitmq.com/blog/2011/01/20/rabbitmq-backing-stores-databases-and-disks/
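For reference, a minimal pika sketch of what 'persistent' publishing has to look like on the producer side: the queue must be declared durable and each message flagged with delivery_mode=2. The host and queue name are placeholders.

```python
# Sketch: durable queue + persistent messages with the pika client.
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
channel = connection.channel()

# A durable queue survives a broker restart (this covers the queue definition only).
channel.queue_declare(queue="orders", durable=True)

# delivery_mode=2 marks the message itself as persistent so it is written to disk;
# without it, messages live only in memory and vanish when the broker stops.
channel.basic_publish(
    exchange="",
    routing_key="orders",
    body=b'{"order_id": 123}',
    properties=pika.BasicProperties(delivery_mode=2),
)
connection.close()
```

If the messages were not published this way, they only ever lived in memory, which by itself would explain why they did not survive the image/restore.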
If the messages were persistent, you can check this SO question - RabbitMQ uses Mnesia storage, which is tied to the IP address of the machine it is running on, so a few tweaks from the answers there can resolve the issue.
