We are struggling with poor performance in Apache Superset.
We are using a docker-compose deployment with the default Redis cache, one Celery worker, and one Celery Beat scheduler.
The stack is running on our own hardware, so we can allocate practically "unlimited" resources to it.
At the moment the UI becomes very slow whenever all of our users are on Superset at once (not that many, around 40 concurrent users).
Can you help me figure out how to get better performance?
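One common first step (a minimal sketch, assuming the standard superset_config.py and the Redis service name from the docker-compose file) is to make sure the chart cache and query results are actually backed by Redis rather than per-process defaults:

```python
# superset_config.py -- a sketch; "redis" assumes the docker-compose service name,
# and "RedisCache" is the Flask-Caching backend name (older versions used "redis").
from cachelib.redis import RedisCache

CACHE_CONFIG = {
    "CACHE_TYPE": "RedisCache",
    "CACHE_DEFAULT_TIMEOUT": 300,
    "CACHE_KEY_PREFIX": "superset_",
    "CACHE_REDIS_URL": "redis://redis:6379/0",
}

# Store SQL Lab / async chart results in Redis as well.
RESULTS_BACKEND = RedisCache(host="redis", port=6379, key_prefix="superset_results")
```

Beyond caching, raising the number of Celery workers and web-server (gunicorn) workers usually helps at this level of concurrency, since a single worker process is easily saturated by ~40 simultaneous users.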
Related
I am going to install a Redis cluster for my applications.
I was planning to install it using a prepared Helm chart,
but there is a saying that goes:
"Redis installed in Kubernetes will have lower performance than a standalone installation, because of shared hardware resources (CPU, memory, ...)."
Is that true?
As Burak already mentioned in the comments, you can choose to have a dedicated node (or nodes) only for the Redis pods in order to avoid resource sharing with other services.
It is also worth mentioning that Redis performance is tied to the underlying VM's specifications. Redis is single-threaded, so a fast CPU with a large cache performs better; multiple cores do not directly affect performance. If your workload is relatively small (objects under 10 KB), memory is not as critical for optimizing performance.
Finally, you can use redis-benchmark to test the performance yourself; there are plenty of examples to check out. You can also use other tools such as memtier_benchmark or Redis Memory Analyzer.
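If you would rather script the measurement than use redis-benchmark directly, a rough throughput probe with redis-py could look like the following (host, port, and key names are placeholders; run it from the same network position as your application):

```python
import time
import redis

r = redis.Redis(host="localhost", port=6379)  # point at the pod/service under test

N = 10000
payload = b"x" * 1024  # ~1 KB objects, i.e. the "small workload" case above

start = time.time()
for i in range(N):
    r.set("bench:%d" % i, payload)
elapsed = time.time() - start
print("SET: %.0f ops/sec" % (N / elapsed))

start = time.time()
for i in range(N):
    r.get("bench:%d" % i)
elapsed = time.time() - start
print("GET: %.0f ops/sec" % (N / elapsed))
```

Comparing the numbers from inside the cluster against a standalone installation on the same hardware is the most direct way to answer the "is it true?" question for your workload.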
I have a large website that is struggling a bit: uptime is not great, and neither is speed, and there is a lot of load on it. I am thinking of moving to Google Cloud, but I don't have time to manage the server and act as the host myself.
So my idea is to serve just the database from Google Cloud (so I can benefit from auto-scaling) and leave the website files where they are now.
My question is: will that put less load on the CPU, and will it eventually improve the website's uptime?
Thanks
To your question: yes, I think it will help improve website performance, and you should see a noticeable jump, because the database consumes a lot of CPU and RAM on the server, so giving it a separate machine will increase performance. If you also want to decrease page load time, I suggest services like [Cloudflare](https://www.cloudflare.com/) or any other CDN, combined with standard web server optimization techniques.
You can use the Google Cloud SQL service if you are using a MySQL or Postgres database. Otherwise, you can use a Google Compute Engine VM, which you have to manage yourself. If you want an auto-scaling option for the complete website, I would suggest Google App Engine, with which you can auto-scale easily, as many well-funded startups do.
https://cloud.google.com/sql/docs
https://cloud.google.com/compute/pricing
If you want to move your data to GCP, I highly recommend using Cloud SQL. If budget is not an issue, auto-scaling would be advantageous.
> Will that put less load on the CPU?
Most likely it will, since you're taking the database processes off your server.
You may also want to look into Google App Engine connected to Cloud SQL; keeping both in the same region will give you lower latency.
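As a rough illustration of "serving just the database from Google Cloud": from the application's point of view, the move is mostly a new connection target. A minimal sketch with pymysql, assuming a MySQL database and a hypothetical Cloud SQL public IP and credentials:

```python
import pymysql

# Hypothetical Cloud SQL public IP and credentials; whitelist your web
# server's IP on the instance (or use the Cloud SQL proxy) beforehand.
conn = pymysql.connect(
    host="203.0.113.10",
    user="app_user",
    password="change-me",
    database="website_db",
)

with conn.cursor() as cur:
    cur.execute("SELECT VERSION()")
    print(cur.fetchone())
conn.close()
```

The database queries then run on Cloud SQL's CPU instead of your web server's, which is exactly where the load reduction comes from.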
Having run through configuration of both the Hadoop (BigInsights) and Apache Spark services on Bluemix, I noticed that Hadoop is very configurable: I have a choice of how many nodes will be in the cluster, the RAM and CPU cores of those nodes, as well as hard disk space.
But the Spark service seems less configurable: the only choice I have is between 2 and 30 Spark executors.
I am working with Bluemix as part of an IBM IC4 project to evaluate these services, so I have a few questions about this.
1. Is it possible to configure the Spark service in a similar way to the Hadoop service, i.e. choose the number of nodes, RAM per node, CPU cores, etc.?
2. What are Spark executors in this context? Are they nodes? If so, what are their specifications?
3. Is there a plan to improve the options for Spark's configuration in the future?
Apologies for the questions but I need to know these specifications in order to carry out my work.
The BigInsights service is what some would call a hosted service; that is, when you provision an instance of it you get your own cluster with nodes configured as specified in the chosen plan. Consequently, you'll want to know exactly what each node you're paying for gives you. The Apache Spark service, on the other hand, is a shared compute service, wherein you pay for the compute used to run your Spark programs. Running Spark is about in-memory compute and creating RDDs over data hosted by other data services. So in this context, what matters is how many concurrent jobs you can run, how many parallel tasks you can run, with how much memory, and so on. In the Spark service plan, executors seem to be an abstraction over this compute horsepower; unfortunately, it is hard to map that abstraction to physical hardware if you care about such things. The plan description needs more elaboration on how to translate executors into workload capacity.
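For intuition about what an "executor" is in stock Apache Spark (outside of the Bluemix plan's abstraction): executors are worker processes whose count, cores, and memory you can tune explicitly. A minimal PySpark sketch, assuming a cluster manager that honors these properties (e.g. YARN); the sizing values are hypothetical:

```python
from pyspark import SparkConf, SparkContext

# Hypothetical sizing: 4 executors, each with 2 cores and 4 GB of memory.
conf = (
    SparkConf()
    .setAppName("executor-sizing-demo")
    .set("spark.executor.instances", "4")
    .set("spark.executor.cores", "2")
    .set("spark.executor.memory", "4g")
)

sc = SparkContext(conf=conf)

# The executors share the tasks of this parallel job among themselves.
print(sc.parallelize(range(1000)).sum())
```

On the Bluemix Spark service, the plan's executor count is effectively the only one of these dials exposed to you.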
However, I understand that this should improve considerably in the near future. There have been rumors about moving to a single Spark service plan where you can dial in, whenever you want, how much compute you need, with the change taking effect when you click "go" for all Spark jobs from that point forward. It seems you could twiddle the dials until you get what you want, see what it would cost, then lock it in until the next time you need a change. I can imagine something even more dynamic than that, on a per-job basis. In any case, that seems to be the direction this compute service is heading.
I came across this (and various others) while researching Redis:
http://redis-cloud.com/
I am using Redis with MongoDB. I will mostly be using Redis as a cache with a very high number of reads. Does it make practical sense to use a cloud-based solution? Personally, I believe network latency would play spoilsport here if the Redis server is on a different network (as in a cloud-based solution). The lag in fetching data from the cloud Redis server (on a different network) for each request would subvert or at least diminish the benefits of the caching layer (wouldn't it be better to wait a bit longer and fetch the records from Mongo instead?). Will I be able to reap the maximum benefit if the Redis server is in the same subnet?
Additionally, how difficult (in terms of administrative overhead) is it to run a Redis server? Pardon me if I sound ignorant; I do more programming and less system administration work, so I prefer to ask the experts. Thank you for your help.
If you're concerned about the network overhead of running just Redis in the cloud, why not run your entire stack in the cloud, or at least both MongoDB and Redis in the cloud?
I checked out redis-cloud and, while it is interesting, I would not use it. The administrative overhead of Redis is very low: you can deploy a Redis master with multiple read slaves with little effort.
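To make the latency trade-off in the question concrete, here is a minimal cache-aside sketch with redis-py and pymongo (the collection, key names, and TTL are hypothetical). The cache only pays off when the Redis round trip is meaningfully cheaper than the Mongo query it replaces, which is exactly why having Redis in the same subnet matters:

```python
import json

import redis
from pymongo import MongoClient

r = redis.Redis(host="localhost", port=6379)  # ideally on the same subnet as the app
mongo = MongoClient("localhost", 27017)
users = mongo.mydb.users                      # hypothetical collection

def get_user(user_id):
    key = "user:%s" % user_id
    cached = r.get(key)
    if cached is not None:
        return json.loads(cached)             # cache hit: one Redis round trip
    # Cache miss: fall back to Mongo, then populate the cache with a 5-minute TTL.
    doc = users.find_one({"_id": user_id}, {"_id": 0})
    if doc is not None:
        r.setex(key, 300, json.dumps(doc))
    return doc
```

If the Redis round trip over the cloud link takes longer than the Mongo query on the local network, every `get_user` call on a hit is still slower than skipping the cache entirely, which is the scenario the question worries about.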
Does anyone have experience running clustered Tigase XMPP servers on Amazon EC2? Primarily, I wish to know about anything non-obvious that might trip me up. (For example, apparently running ejabberd on EC2 can cause issues due to Mnesia.)
Any general advice on installing and running Tigase on Ubuntu would also be welcome.
Extra information:
The system I’m developing uses XMPP just to communicate (in near real-time) between a mobile app and the server(s).
The number of users will initially be small but will hopefully grow; this is why the system needs to be scalable. Presumably for just a few thousand users you wouldn't need a cc1.4xlarge EC2 instance? (Otherwise this is going to be very expensive to run!)
I plan on using a MySQL database hosted in Amazon RDS for the XMPP server database.
I also plan on creating an external XMPP component written in Python, using SleekXMPP. It will be this external component that does all the ‘work’ of the server, as the application I’m making is quite different from instant messaging. For this part, I have not worked out how to connect an external XMPP component written in Python to a Tigase server. The documentation seems to suggest that components are written specifically for Tigase, rather than for a generic XMPP server using XEP-0114: Jabber Component Protocol, as I expected.
With this extra information in mind, if you can think of anything else I should know about, I’d be glad to hear it.
Thank you :)
I have lots of experience, and I think there are plenty of non-obvious problems. For example, the only reliable instance type for running an application like Tigase is cc1.4xlarge. Other types have problems with CPU availability, and it is simply a lottery whether you are lucky enough to run your service on a host that is not busy with other people's work.
You also need an instance with the highest possible I/O to make sure it can cope with the network traffic. The high I/O requirement applies especially to the database instance.
Not sure whether this is obvious or not, but there is a problem with hostnames on EC2: every time you start an instance, the hostname and IP address change. A Tigase cluster is quite sensitive to hostnames. There is a way to force or change the hostname for an instance, so this might be a way around the problem.
Of course, I am talking about a cluster for millions of online users and really high traffic (100k XMPP packets per second or more). Generally, for a large installation it is far cheaper and more efficient to have dedicated servers.
Generally, Tigase runs very well on Amazon EC2, but you really need the latest SVN code, as it has many optimizations added especially after tests on the cloud. If you provide some more details about your service, I may have some more suggestions.
More comments:
When it comes to cost, a dedicated server is always the cheaper option for a constantly running service. Unless you plan to switch servers on and off on an hourly basis, I would recommend going with a dedicated service: costs are lower and performance is far more predictable.
However, if you really want or need to stick with Amazon EC2, let me give you some concrete numbers. Below is a list of instance configurations and how many online users each cluster was able to reliably handle:
- 5*cc1.4xlarge - 1.7 million online users
- 1*c1.xlarge - 118k online users
- 2*c1.xlarge - 127k online users
- 2*m2.4xlarge (with 5 GB RAM for Tigase) - 236k online users
- 2*m2.4xlarge (with 20 GB RAM for Tigase) - 315k online users
- 5*m2.4xlarge (with 60 GB RAM for Tigase) - 400k online users
- 5*m2.4xlarge (with 60 GB RAM for Tigase) - 312k online users
- 5*m2.4xlarge (with 60 GB RAM for Tigase) - 327k online users
- 5*m2.4xlarge (with 60 GB RAM for Tigase) - 280k online users
A few more comments:
Why does the amount of memory matter so much? Because CPU power is very unreliable and inconsistent on all but cc1.4xlarge instances. You have 8 virtual CPUs, but if you look at top you often see one CPU working and the rest idle. This insufficient CPU power causes internal queues in Tigase to grow; when CPU power comes back, Tigase can process the waiting packets. The more memory Tigase has, the more packets it can queue, and the better it handles CPU deficiencies.
Why does 5*m2.4xlarge appear four times? Because I repeated the tests many times, on different days and at different times of day. As you can see, depending on the time and date, the system could handle a different load. I suspect this is because the Tigase instances shared CPU power with other services; when those were busy, Tigase suffered from a lack of CPU.
That said, I think an installation with up to 10k online users should be fine. However, other factors such as roster size matter greatly, as they affect traffic and load. Also, if you have other elements that generate significant traffic, they will put additional load on your system.
In any case, without some tests it is impossible to tell how your system really behaves or whether it can handle the load.
And on the last question, regarding the component:
Of course Tigase does support XEP-0114 and XEP-0225 for connecting external components, so a component written in a different language, such as your Python/SleekXMPP one, should not be a problem; see the sketch below.
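For the Python side, a minimal XEP-0114 component skeleton with SleekXMPP might look like the following (the component JID, shared secret, and Tigase host/port are placeholders that must match the external component configured in Tigase):

```python
from sleekxmpp.componentxmpp import ComponentXMPP

class WorkerComponent(ComponentXMPP):
    """External component that does the 'work' outside the XMPP server."""

    def __init__(self, jid, secret, server, port):
        super(WorkerComponent, self).__init__(jid, secret, server, port)
        self.add_event_handler("message", self.on_message)

    def on_message(self, msg):
        # Hypothetical "work": echo each chat message back to the sender.
        if msg["type"] in ("chat", "normal"):
            msg.reply("Received: %s" % msg["body"]).send()

if __name__ == "__main__":
    # Placeholder values; these must match Tigase's external component settings.
    xmpp = WorkerComponent("worker.example.com", "secret", "tigase.example.com", 5275)
    xmpp.connect()
    xmpp.process(block=True)
```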
That said, I recommend using Tigase's API for writing components. They can be deployed either as internal Tigase components or as external components, and this is transparent to the developer; you do not have to worry about it at development time, as it is part of the API and framework.
Also, you can then use all the goodies from the Tigase framework: scripting capabilities, monitoring, statistics, and much easier development, since you can easily deploy your code as an internal component for testing.
You really do not have to worry about any XMPP-specific stuff; you just fill in the body of the processPacket(...) method and that's it.
There should be enough online documentation for all of this on the Tigase website.
Also, I would suggest reading about Python's support for multi-threading (the Global Interpreter Lock in particular) and how it behaves under very high load. It used to be not so great.