Cog vs Triton Inference Server - production

I'm considering Cog and Triton Inference Server for inference in production.
Does anyone know how the two differ in capabilities and in run times, especially on AWS?

Related

AWS Lambda vs EC2 REST API

I am not an expert in AWS but have some experience. We have a situation where an Angular UI (hosted on EC2) has to talk to an RDS DB instance. Everything in the stack is set so far except the API (middleware). We are thinking of using Lambda, as our traffic is unknown at this time. Here again we have a lot of choices to make on the programming side, such as C#, Python, or Node. (We are leaning towards C# or Python based on some research and on our skills; Python is said to have good cold-start times, and C# on .NET Core is stable in terms of performance.)
Since we are going with Lambda, of course we should go the API Gateway route. All set, but now: can all the business logic of our application reside in Lambda? If so, wouldn't the Lambda become huge and take a performance hit (more memory and more computational resources, and thus higher costs)? So we thought: let Lambda handle the lightweight processing, and move the heavy lifting to a .NET API hosted on EC2.
Are there any issues with this approach? I should also mention that Lambda has to call RDS for CRUD operations, so should I think much about concurrency issues, since it might fall into the stateful category?
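A minimal sketch of the split described above, assuming a Python Lambda behind an API Gateway proxy integration. The route names, the `HEAVY_API_URL` value, and the `handle_light` helper are hypothetical and only illustrate the idea; the handler signature and the response shape (`statusCode`, `headers`, `body`) follow the Lambda proxy format.

```python
import json

# Hypothetical base URL of the .NET API hosted on EC2 (assumption for illustration).
HEAVY_API_URL = "https://api.example.com"

# Hypothetical set of routes considered too heavy to run inside Lambda.
HEAVY_ROUTES = {"/reports", "/exports"}


def handle_light(path):
    """Placeholder for lightweight business logic that stays in Lambda."""
    return {"path": path, "handled_by": "lambda"}


def lambda_handler(event, context):
    """Entry point for an API Gateway proxy integration."""
    path = event.get("path", "/")
    if path in HEAVY_ROUTES:
        # Point the client at the EC2-hosted API instead of doing the work here.
        return {
            "statusCode": 307,
            "headers": {"Location": HEAVY_API_URL + path},
            "body": "",
        }
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(handle_light(path)),
    }
```

Redirecting is only one way to hand off heavy requests; the Lambda could instead call the EC2 API itself, at the cost of paying for Lambda time while it waits.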
The advantage of AWS Lambda here is scaling. As you already know, Lambda is fully managed, so we can take advantage of that in this case.
If you host the API on EC2, you don't have the "scaling" part in place. Of course, you can start using ECS, an Auto Scaling Group, and so on, but that takes you down another route.
Are you building this application for learning or for production?

Client rendering time

I know that in order to measure end-to-end response time for any application scenario, we need to compute: server time + network time + client time.
While I know for sure that server and network time are impacted by load, I want to know whether client time is impacted by load too.
If client rendering time isn't impacted by load, would it be appropriate to run a test with 100 users and measure server time with a performance testing tool (such as HP LoadRunner or JMeter), then measure client rendering time with a single user, and finally present the end-to-end time by adding the client time to the server time?
Any views on this will be appreciated.
Regards,
What you are describing is a very old concept, termed a GUI Virtual User. LoadRunner, and other classical tools such as SilkPerformer, QALoad and Rational Performance Tester, have always had the ability to run one or two graphical virtual users created with the functional automation test tools from the vendor in question to address the question of user "weight" of the GUI.
This capability went out of vogue for a while with the advent of the thin client web, but now that web clients are growing in thickness with complex client side code this question is being asked more often.
Don't worry about actual "rendering time," the time taken to draw the screen elements, since you cannot control that anyway. It will vary from workstation to workstation depending upon what is running on the host and most development shops don't have a reconciliation path to Microsoft, Mozilla, Opera, Google or Apple to ask them to tune up the rendering on their browsers if someone finds a problem in the actual rendering engine of the browser.
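The calculation proposed in the question (server and network time measured under load, plus client time measured with a single user) can be sketched as below. The function name and the sample numbers are hypothetical, and the sketch simply assumes client time does not change with load, which is the very assumption the question is asking about.

```python
from statistics import mean


def end_to_end_estimate(server_network_times, client_times):
    """Add the mean single-user client time to each load-test sample."""
    client = mean(client_times)
    return [t + client for t in server_network_times]


# Hypothetical measurements, in seconds.
load_test_samples = [1.2, 1.5, 2.0]   # server + network time under 100 users
single_user_client = [0.4, 0.6]       # client time measured with one user

estimates = end_to_end_estimate(load_test_samples, single_user_client)
```

Running one or two GUI virtual users during the load test, as described above, replaces the added constant with client times actually observed under load.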

Scaling Tigase XMPP server on Amazon EC2

Does anyone have experience running clustered Tigase XMPP servers on Amazon's EC2? Primarily I wish to know about anything non-obvious that might trip me up. (For example, apparently running Ejabberd on EC2 can cause issues due to Mnesia.)
I'd also welcome any general advice on installing and running Tigase on Ubuntu.
Extra information:
The system I’m developing uses XMPP just to communicate (in near real-time) between a mobile app and the server(s).
The number of users will initially be small, but hopefully will grow. This is why the system needs to be scalable. Presumably for just a few thousand users you wouldn’t need a cc1.4xlarge EC2 instance? (Otherwise this is going to be very expensive to run!)
I plan on using a MySQL database hosted in Amazon RDS for the XMPP server database.
I also plan on creating an external XMPP component written in Python, using SleekXMPP. It will be this external component that does all the ‘work’ of the server, as the application I’m making is quite different from instant messaging. For this part I have not worked out how to connect an external XMPP component written in Python to a Tigase server. The documentation seems to suggest that components are written specifically for Tigase - and not for a general XMPP server, using XEP-0114: Jabber Component Protocol, as I expected.
With this extra information, if you can think of anything else I should know about I’d be glad to know.
Thank you :)
I have lots of experience with this, and I think there are a lot of non-obvious problems. For example, the only reliable instance type for running an application like Tigase is cc1.4xlarge. Others cause problems with CPU availability, and it is just a lottery whether you are lucky enough to run your service on a server that is not busy with other people's work.
Also you need an instance with the highest possible I/O to make sure it can cope with the network traffic. The high I/O applies especially to the database instance.
Not sure if this is obvious or not, but there is a problem with hostnames on EC2: every time you start an instance, the hostname and IP address change. A Tigase cluster is quite sensitive to hostnames. There is a way to force/change the hostname for the instance, so this might be a way around the problem.
Of course, I am talking about a cluster for millions of online users and really high traffic, 100k XMPP packets per second or more. Generally, for a large installation it is way cheaper and more efficient to have dedicated servers.
Generally Tigase runs very well on Amazon EC2 but you really need the latest SVN code as it has lots of optimizations added especially after tests on the cloud. If you provide some more details about your service I may have some more suggestions.
More comments:
When it comes to costs, a dedicated server is always the cheaper option for a constantly running service. Unless you plan to switch servers on and off on an hourly basis, I would recommend going for a dedicated service. Costs are lower and performance is way more predictable.
However, if you really want/need to stick to Amazon EC2 let me give you some concrete numbers, below is a list of instances and how many online users the cluster was able to reliably handle:
5*cc1.4xlarge - 1.7 million online users
1*c1.xlarge - 118k online users
2*c1.xlarge - 127k online users
2*m2.4xlarge (with 5GB RAM for Tigase) - 236k online users
2*m2.4xlarge (with 20GB RAM for Tigase) - 315k online users
5*m2.4xlarge (with 60GB RAM for Tigase) - 400k online users
5*m2.4xlarge (with 60GB RAM for Tigase) - 312k online users
5*m2.4xlarge (with 60GB RAM for Tigase) - 327k online users
5*m2.4xlarge (with 60GB RAM for Tigase) - 280k online users
A few more comments:
Why does the amount of memory matter that much? Because CPU power is very unreliable and inconsistent on all but cc1.4xlarge instances. You have 8 virtual CPUs, but if you look at the top command you often see that one CPU is working and the rest are not. This insufficient CPU power causes Tigase's internal queues to grow. When the CPU power is back, Tigase can process the waiting packets. The more memory Tigase has, the more packets can be queued and the better it handles CPU deficiencies.
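The queueing effect described above can be illustrated with a toy model. This is only a sketch under made-up numbers: the arrival rate and the capacity trace are hypothetical, and it says nothing about how Tigase implements its queues internally.

```python
def peak_backlog(arrivals_per_tick, capacity_per_tick):
    """Return the largest queue length seen while replaying a capacity trace."""
    backlog = 0
    peak = 0
    for capacity in capacity_per_tick:
        # Packets that arrive but cannot be processed in this tick stay queued.
        backlog = max(0, backlog + arrivals_per_tick - capacity)
        peak = max(peak, backlog)
    return peak


# Hypothetical trace of packets the CPU can process per tick;
# the zeros model ticks in which the instance gets no CPU time.
capacity_trace = [1500, 1500, 0, 0, 0, 3000, 3000, 1500]
peak = peak_backlog(1000, capacity_trace)
```

The peak backlog multiplied by the memory held per queued packet gives a rough idea of the headroom needed to ride out a CPU stall without dropping packets.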
Why is 5*m2.4xlarge listed 4 times? Because I repeated the tests many times on different days and at different times of day. As you can see, depending on the time and date the system could handle a different load. I guess this is because the Tigase instance shared CPU power with some other services. If they were busy, Tigase suffered from a lack of CPU power.
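To make the figures easier to compare, here is a small calculation over some of the numbers listed above: online users per instance, and the variation across the four repeated 5*m2.4xlarge runs. The variable names are just for illustration.

```python
from statistics import mean

# (number of instances, instance type, online users), from the list above.
results = [
    (5, "cc1.4xlarge", 1_700_000),
    (1, "c1.xlarge", 118_000),
    (2, "c1.xlarge", 127_000),
]
per_instance = {(n, kind): users / n for n, kind, users in results}

# The four repeated runs of 5*m2.4xlarge with 60GB RAM for Tigase.
repeated_runs = [400_000, 312_000, 327_000, 280_000]
run_mean = mean(repeated_runs)
run_spread = max(repeated_runs) - min(repeated_runs)
```

The spread between the best and worst run on identical hardware is a sizeable fraction of the mean, which is the inconsistency the answer attributes to shared CPU.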
That said, I think with an installation of up to 10k online users you should be fine. However, other factors like roster size matter greatly, as they affect traffic and load. Also, if you have other elements which generate significant traffic, this will put load on your system.
In any case, without some tests it is impossible to tell how your system really behaves or whether it can handle the load.
And the last question regarding component:
Of course Tigase supports XEP-0114 and XEP-0225 for connecting external components, so using components written in different languages should not be a problem. On the other hand, I recommend using Tigase's API for writing components. They can be deployed either as internal Tigase components or as external components, and this is transparent to the developer; you do not have to worry about it at development time. This is part of the API and framework.
Also, you can use all the goods from Tigase framework, scripting capabilities, monitoring, statistics, much easier development as you can easily deploy your code as internal component for tests.
You really do not have to worry about any XMPP specific stuff, you just fill body of processPacket(...) method and that's it.
There should be enough online documentation for all of this on the Tigase website.
Also, I would suggest reading about Python support for multi-threading and how it behaves under a very high load. It used to be not so great.
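For the external-component route mentioned in the question, a minimal sketch using SleekXMPP's `ComponentXMPP` class (its XEP-0114 client) might look like the following. It assumes SleekXMPP is installed; the `build_reply` helper and the class name are hypothetical, and the JID, secret, host, and port passed in must match whatever is configured for the component on the server side. The library is imported inside the factory so that the application logic can be exercised without it.

```python
def build_reply(body):
    """Placeholder for the application logic: build a reply to an incoming body."""
    return "received: %s" % body


def make_component(jid, secret, host, port):
    """Create an external component that connects over XEP-0114.

    SleekXMPP is imported lazily so this module loads without the library.
    """
    from sleekxmpp.componentxmpp import ComponentXMPP

    class WorkerComponent(ComponentXMPP):
        def __init__(self):
            ComponentXMPP.__init__(self, jid, secret, host, port)
            self.add_event_handler("message", self.on_message)

        def on_message(self, msg):
            # Reply to the sender with whatever the application logic returns.
            msg.reply(build_reply(msg["body"])).send()

    return WorkerComponent()
```

Usage would be along the lines of `component = make_component(...)`, then `component.connect()` followed by `component.process(block=True)`.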

Amazon EC2 Capacity & Workflow Questions

I’m hoping some of you with experience using Amazon EC2 could offer some advice… of course it’ll be subjective, which is fine; I’m pretty sure your guesstimate would be better than mine.
I am planning on moving all my clients’ websites from shared hosting environments to Amazon EC2. They’re all pretty low-traffic sites (the busiest site receives around 50 unique visitors a day). There are about 8 sites, but I may expand this as I take on more projects and host more sites… current capacity planning is for, say, 12 sites.
Each site runs on ASP.Net (Umbraco CMS), and requires a SQL Server database.
My thoughts are one of the following:
Set up a Small Instance (1.7 GB RAM, 1 EC2 Compute Unit), and run IIS and SQL Server Express on that server.
Set up 2 Micro Instances (613 MB RAM each, Up to 2 EC2 Compute Units) – one for IIS, the other for SQL Server.
Which arrangement do you think would work best for my requirements? I’ve started setting up a Micro instance with Server 2008, SQL Server Express, etc… and am finding it does not cope with the memory requirements, hence considering expanding. I could always configure on a Small instance, then export the AMI and fire it up in a Micro instance afterwards, and do the same every time any serious changes to the server are required. I guess I could even do all updates etc on a spare Small Spot instance, then load that AMI up in a Micro and transfer the IP address across, so I don’t need to do too much work on the production servers. I figure if I store all my website data files on EBS Volumes, then it should be fairly easy to move hosting between servers with minimal downtime, while never working on a production server.
I’m interested to know what you all think, and what strategies you employ for such activities as upgrades, Windows updates, software installations, etc.
And what capacity do you think I’d need for my requirements?
Cheers
Greg
Well, first-up, Server 2008 doesn't play well in the 613MB RAM the Micro instance gives you. It runs, but it's a dog, and it barks louder the more services (IIS, SSE, etc) you layer on top. We use nothing smaller than a Small for Server 2008, and in fact typically do the environment config in a Medium and scale down to Small once the heavy lifting is complete and the OS is ready to use. Server 2003, however, seems to breathe easier on a Micro - but we still do the config on a larger instance and scale down.
We're running low-traffic websites on Server 2003/IIS6 in a Micro, with a Server 2008/SS install on a shared, separate, Small instance. We do also have one Server 2008/IIS7 Micro build running, but only to remind ourselves why we don't use it more widely. ;)
Larger websites run Server 2008/IIS7 in either Small or Medium instances, but almost always still using that shared separate SS instance for database services. We try not to deploy multiple SS installations, since it makes maintenance and backups more complex.
Stashing content and config on EBS Volumes is of course good practice, unless you like rebuilding the entire system whenever an Instance disappears. Snapshotting your Instances periodically is also good practice, since you can spin-up a new Instance from a baseline AMI and swap the snapshot in as a boot Volume for fast recovery in the event of disaster.

Any thoughts on RightScale and Scalr for dynamic Ec2 instance management [closed]

Closed 10 years ago.
I'm looking for a cost-effective tool for managing a web app on EC2. Rightscale seems to be the big dog and charges for it. Scalr looks like a more cost-effective solution, but it's hard to find any real customer experiences.
The key aspects I'm looking for are a load balancer (http and https) and a way to automatically bring additional web server capacity online as load increases, as well as terminate the instances when load falls off.
From what I can tell, lots of people are rolling their own stuff here. We're trying to release an app and don't really want to have to fight too many heavy sys admin battles. Given the importance of performance etc I'd be grateful to hear advice and experiences from the field on this.
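The scale-up and scale-down behaviour asked for above boils down to a decision rule like the one sketched here. This is a generic illustration, not how RightScale or Scalr implement it; the function name, the thresholds, and the instance limits are all assumptions.

```python
def desired_instances(current, avg_load, scale_up_at=0.75, scale_down_at=0.25,
                      minimum=1, maximum=10):
    """Return how many web servers should run, given average load in [0, 1]."""
    if avg_load > scale_up_at:
        target = current + 1      # load is high: bring one more server online
    elif avg_load < scale_down_at:
        target = current - 1      # load fell off: terminate one server
    else:
        target = current          # within the comfortable band: do nothing
    # Never go below the minimum or above the maximum farm size.
    return max(minimum, min(maximum, target))
```

Keeping a gap between the two thresholds avoids launching and terminating instances repeatedly when load hovers around a single cut-off.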
I am a Scalr user, a Scalr.net subscriber, and have become a Scalr enthusiast. I cannot possibly afford Rightscale.
Scalr can do what you ask.
Scalr has three images (each with 32/64 bit versions), plus a base (generic) image:
1) A load balancer image, running nginx. A highly available setup requires two of these. Scalr will manage your nameservice, and round robin between them. If one goes down, Scalr will remove it from DNS and bring up another instance. It is possible to run other load balancers, but nginx is the default.
2) Several application server images are available, running Apache/Tomcat/Rails. You set up your application here, be it PHP/Perl/Python/Java/Ruby/whatever. nginx routes requests between these instances grouped by unique user (based on IP + browser). Scalr monitors these for upness too, and replaces broken instances.
3) A MySQL database image, with automatic master/slave replication. Just deploy your schema, and Scalr handles replication and replaces defunct servers. It will also back up your data periodically. Scalr's DNS provides master and slave hostnames, so you can have your app read from the slaves and write to the master.
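The per-user grouping mentioned in item 2 (requests from the same IP + browser going to the same application server) can be illustrated with a hash-based pick. This is a sketch of the general idea only, not Scalr's actual nginx configuration; the function name and backend hostnames are hypothetical.

```python
import hashlib


def pick_backend(client_ip, user_agent, backends):
    """Map a client (IP + browser string) to one backend, deterministically."""
    key = (client_ip + "|" + user_agent).encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Use the first 8 bytes of the digest as an integer index into the list.
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


# Hypothetical application server hostnames.
app_servers = ["app1.example.com", "app2.example.com", "app3.example.com"]
chosen = pick_backend("203.0.113.7", "Mozilla/5.0", app_servers)
```

With a plain modulo, adding or removing a backend remaps many clients, so sessions that must survive scaling events are better kept in a shared store.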
All of these instance types will auto-scale based on load. You start with the base image closest to what you're doing, and then you customize them for your application. For instance, we deploy our Perl/Catalyst app on the apache server instances but we serve static content from the nginx front-end servers. We had to modify our application slightly to use read/write database handles.
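The read/write database handle change mentioned above might look roughly like this: writes go to the master hostname and reads rotate across the slave hostnames. The class, the hostnames, and the list of write verbs are assumptions for illustration; the original application was Perl/Catalyst, and this is not its code.

```python
import itertools

# SQL verbs treated as writes in this sketch (not an exhaustive list).
WRITE_VERBS = {"INSERT", "UPDATE", "DELETE", "REPLACE", "CREATE", "ALTER", "DROP"}


class HostRouter:
    """Choose a database host for each statement: master for writes, slaves for reads."""

    def __init__(self, master, slaves):
        self.master = master
        # Fall back to the master if no slaves are configured.
        self._slaves = itertools.cycle(slaves or [master])

    def host_for(self, sql):
        parts = sql.split()
        verb = parts[0].upper() if parts else ""
        if verb in WRITE_VERBS:
            return self.master
        return next(self._slaves)


# Hypothetical hostnames standing in for the master and slave DNS names.
router = HostRouter("master.db.example.com",
                    ["slave1.db.example.com", "slave2.db.example.com"])
```

Because replication is asynchronous, a read issued right after a write may not see it on a slave; reads that must be current can be sent to the master explicitly.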
All in all, it took about three weeks of working through bugs in Scalr to get our application to a reliable state where I am confident that it IS highly available with Scalr. Their support was phenomenal, so the bugs didn't bother me too much, and the system is really coming along. It is approaching serious reliability.
As a side note, the best feature of Scalr is the 'Synchronize to All' feature, which auto-bundles your AMI and re-deploys it on a new instance - all without a service interruption. This saves you the time of going through the lengthy EC2 image/AMI creation process, which can otherwise make very simple admin tasks take 20 minutes. You can use this whether you are scaling your server farm or not - it would be very handy even on a single instance.
I pay Scalr.net $50 a month to host the service for me because I think it saves me time and money. The bottom line so far is this: at my last gig, we had a systems guy working on our highly available Linux DB + app server setup for a year... and he failed to achieve the kind of reliability that I achieved in three weeks. The savings by using Scalr as compared to rolling my own are extreme.
All that being said, if I could afford Rightscale, I would be using Rightscale. But the up-front fee and $500 a month make that impossible. There has been talk of waiving the up-front fee in exchange for waiving the consulting that it includes, but the monthly service fee isn't going anywhere.
I should mention that at the moment, scalr.net's website is down, so if I wanted to manage any of my server farms (I don't have them up atm), I simply couldn't right now. It is not clear whether scaling is working for scalr.net subscribers right now or not. Which is to say... this is perhaps not a mature solution yet. This doesn't happen often; before tonight the only downtime I've experienced was in periods of a few minutes at a time. But yeah... it's down RIGHT NOW, so I must mention it :)
I would suggest a thorough reading of the support group at http://groups.google.com/group/scalr-discuss before making your decision. If you pick Scalr, be prepared to test your setup and work through any issues you have on the google group.
I will comment on your question, since giving a concrete answer is a little ambitious.
First, I see that you have haproxy on your tags. That is definitely the best load balancing software proven in EC2. There is documentation and experiences in the AWS forums on the use of haproxy.
I am unable to give you an opinion on Scalr, but RightScale is going in the right direction. One of the most interesting features on RightScale's roadmap is that it is a management system for any cloud, not just Amazon's EC2. That makes them very promising if you need load balancing and scaling up on demand.
Also, you can sign up for a free developer account on RightScale and test some of their AMIs and free scripts; they are pretty impressive.
Well, this might sound like I am working there or something, but I am just a cloud user with no connection to them, in case that crosses your mind.
I hope this helps, at least adds to the discussion.
Geo
Been on Scalr for about two months now and have slowly transitioned several production applications to the platform with good results. I strongly recommend them for quick turn around/support and value. I would like to see them improve availability of their platform.
All in all, a good fit for the original poster based on the simple use case presented.
Every service has a bad day. AWS services see down time. However, there are still users running their apps on AWS.
I have a few farms on Scalr.net, and compared to Rightscale, I don't have to pay an arm and a leg.
Overall, the service is very reliable. And now, with the scripting engine, I can set up my own scripts to govern my instances.
With Regards
Hareem Haque
Both services (RightScale and Scalr) are great. The offerings are not the same, and neither are the prices, but both are what I was looking for. Given our budget, Scalr fits my needs. I found support through a Google group very strange at the beginning, but it is very fast and efficient.
Their solution is also open source (not bad) and they also have a V2 in their roadmap with support for other providers.
Wait and see, but until now I'm very happy with it.
Deciding on the right choice may not be as cut and dried as everyone expects. I have met with and heard talks from Scalr about their platform and have also listened to RightScale discuss their platform. If you have a simple SOA (App Server - Database Server - File Server), then either choice will be right for your company.
Ultimately, if you have created some custom middleware and you rely on known sockets or specific points for handshakes, you will need to consider load-balancing and auto-scaling what you can and fall back to your own solutions for what can't be managed with either of these services.
Some people say that automatic scaling won't solve the problem
I am looking into Scalr right now and although it all looks good, I decided to continue with my own scripting for the purpose of cloud management / scaling. I have 8 servers right now and am paying only the AWS fees. I use chef (self-hosted), nagios, and a lot of other tools. My databases are mysql and mongodb, load balancer is haproxy, app layer is rails. Until I need 100s of servers, I think I will just keep scriptin' ;-)
