Our requirement from our next web app is that we will be able to deploy a new version of the web app without a downtime.
how is it possible to achieve such task?
does it mean we need to run 2 different servers (tomcats) ? and redirect users to each one when needed?
are there tools that are doing this specific task? in what category these tools in?
Thanks
Just use Tomcat's parallel deployment feature. It is available from Tomcat 7 onwards.
Don't forget, 100% availability is impossible - it may happen for a certain period, but no one can guarantee it, no matter what setup you have.
But since you're looking for a smooth change from one version to another, then the best you can do is update one node and switch nodes then. Of course, since you likely have sessions which shouldn't disconnected, you'll need to make sure that an instance (e.g. load balancer) directs all new requests to the new node, whereas old session requests stay on the old node until no one uses it anymore, after which you can upgrade the second node and finally, balance load again to both nodes.
Related
We are building a reporting app on Laravel that need to fetch users data from a third-party server that allow 1 request per seconds.
We need to fetch 100K to 1000K rows based on user and we can fetch max 250 rows per request.
So the restriction is:
1. We can send 1 request per seconds
2. 250 rows per request
So, it requires 400-4000 request/jobs to fetch a user data, So, loading data for multiple users is very time-consuming and the server gets slow.
So, now, we are planning to load the data using multiple servers, like 4-10 servers to fetch users data, so we can send 10 requests per second from 10 servers.
How can we design the system and process jobs from multiple servers?
Is it possible to use a dedicated server for hosting Redis and connect to that Redis server from multiple servers and execute jobs? Can any conflict/race-condition happen?
Any hint or prior experience related to this would be really helpful.
The short answer is yes, this is absolutely possible and is something I've implemented in production apps many times before.
Redis is just like any other service and can run anywhere, with clients from anywhere, connecting to it. It's all up to your configuration of the server to dictate how exactly that happens (and adding passwords, configuring spiped, limiting access via the firewall, etc.). I'd reccommend reading up on the documentation they have in the Administration section here: https://redis.io/documentation
Also, when you do make the move to a dedicated Redis host, with multiple clients accessing it, you'll likely want to look into having more than just one Redis server running for reliability, high availability, etc. Redis has efficient and easy replication available with a few simple configuration commands, which you can read more about here: https://redis.io/topics/replication
Last thing on Redis, if you do end up implementing a master-slave set up, you may want to look into high availability and auto-failover if your Master instance were to go down. Redis has a really great utility built into the application that can monitor your Master and Slaves, detect when the Master is down, and automatically re-configure your servers to promote one of the slaves to the new master. The utility is called Redis Sentinel, and you can read about that here: https://redis.io/topics/sentinel
For your question about race conditions, it depends on how exactly you write your jobs that are pushed onto the queue. For your use case though, it doesn't sound like this would be too much of an issue, but it really depends on the constraints of the third-party system. Either way, if you are subject to a race condition, you can still implement a solution for it, but would likely need to use something like a Redis Lock (https://redis.io/topics/distlock). Taylor recently added a new feature to the upcoming Laravel version 5.6 that I believe implements a version of the Redis Lock in the scheduler (https://medium.com/#taylorotwell/laravel-5-6-preview-single-server-scheduling-54df8e0e139b). You can look into how that was implemented, and adapt for your use case if you end up needing it.
If a worker role or for that matter web roles are continuously serving both long/short running requests. How does continuous delivery work in this case? Obviously pushing a new release in the cloud will abort current active sessions on the servers. What should be the strategy to handle this situation?
Cloud Services have production and staging slots, so you can change it whenever you want. Continuous D or I can be implemented by using Visual Studio Team Services, and i would recommend it - we use that. As you say, it demands to decide when you should switch production and staging slots (for example, we did that when the user load was very low, in our case it was a night, but it can be different in your case). Slots swapping is very fast process and it is (as far as i know) the process of changing settings behind load balancers not physical deployment.
https://azure.microsoft.com/en-us/documentation/articles/cloud-services-continuous-delivery-use-vso/#step6
UPD - i remember testing that, and my experience was that incoming connections were stable (for example, RDP) and outgoing are not. So, i can not guarantee that existing connections will be ended gracefully, but from my experience there were no issues.
I am wondering what would be the best practice for deploying updates to a (MVC) Go web application. Imagine the following scenario :
1) Code and test some changes for my Go Web Application
2) Deploy update without anyone currently using the previous version getting interrupted.
I don't know how to make sure point 2) can be covered - when somebody is sending a request to the server and I rebuild/restart it just in this moment, he gets an error - even if the request just uses a part of the code I did not touch or that is backwards-compatible, or if I just added a new Request-handler.
Maybe I'm missing something trivial or a well-known pattern as I am just in the process of learning go and my previous web applications were ASP.NET- or php-applications where this was no issue as I did not need to restart the webserver on code changes.
It's not just an issue with Go, but in general we can divide the problem into two separate ones:
Making sure current requests do not get terminated and affect user experience.
Making sure there is no down-time in which new requests cannot be handled.
The first one is easier to tackle: You just don't violently kill your server, but tell it to exit, causing a "Drain phase", in which it does not accept new requests and only finishes the currently running requests, and exits. This can be done by listening on signals for example, and entering the app into a special state.
It's not trivial with Go as the default http server doesn't support shutting it down, but you can start a server with a net.Listener, and then keep a reference to it an close it when the time is due.
Now, doing only approach one and then starting the service again will cause new requests not to be accepted while this is going on, and we all know this can take a number of seconds in extreme cases.
So what we need is another instance of the server already running with the new code, the instant the old one is not responding to new requests, right? That can be done in several ways:
Having more than one server, and a load-balancer on top of them, allowing one (or more) server to take the load while we restart another. That's the simplest way, and the way most people do it. If you need N servers to take the load of your users, just keep N+1 and restart one at a time.
Using socket sharing tricks. In Newer Linux kernels, Many processes can listen and accept on the same port. What you do is simply start the new instance and then tell the old one to finish and exit. This way there is no pause. This is done by setting SO_REUSEPORT on the listening socket.
The above can be automated with ready to ship solutions, like Einhorn, that deals with all the details for you, see https://github.com/stripe/einhorn
Another approach is documented in this blog post: http://blog.nella.org/?p=879
I am trying to automate the startup of my Selenium Grid.
I have the Hub registered as a service, so that starts when the machine starts, but
the literature tells me I can't do the same with the node, because it won't be in a User context, and so I would not be able to get screenshots etc.
I've seen vague hints that you can add something to the registry to start a program, but I'm not really convinced thats what I want.
IT pulls down the servers for upgrades at intervals, and sessions are set to time out after X amount of inactivity, so its a tedious and silly process to open remote desktops to all 6 nodes, in order to log in, then start the process every time.
How do you best manage this?
- Configure the machines to auto-login, and place startSeleniumNode.bat in that users start-folder?
- Add some kind of commandline entry in the jenkins build script that launches the test, to call each of the 6 nodes in turn to start the selenium node (and how would you do that?)
Take a look at AlwaysUp - it allows you to run almost any application as a Windows service including Selenium Grid hubs and nodes.
I've previously created a fairly large Grid infrastructure using AlwaysUp for node management. It's very useful for starting up Grid on boot and lets you specify a user account to run as, schedule restarts at regular intervals and a lot more.
Using the WebSphere Integrated Solutions Console, a large (18,400 file) web application is updated by specifying a war file name and going through the update screens and finally saving the configuration. The Solutions Console web UI spins a while, then it returns, at which point the user is able to start the web application.
If the application is started after the "successful update", it fails because the files that are the web application have not been exploded out to the deployment directory yet.
Experimentation indicates that it takes on the order of 12 minutes for the files to appear!
One more bit of background that may be significant: There are 19 application servers on this one WebSphere instance. WebSphere insists that there be a lot of chatter between them, even though they don't need anything from each other. I wondered if this might be slowing things down when it comes to deployment. Or if there's some timer in the bowels of WebSphere that is just set wrong (usual disclaimers apply...I'm just showing up and finding this situation...I didn't configure this installation).
Additional Information:
This is a Network Deployment configuration, and it's all on one physical host.
* ND 6.1.0.23
Is this a standalone or a ND set up? I am guessing it is ND set up considering you have stated that they are 19 app servers. The nodes should be synchronized with the deployment manager so that the updated files are available to the respective nodes.
After you update and save the changes, try and synchronize the nodes with the dmgr (or alternatively as part of the update process, click on review and the check the box which says synchronize nodes) and this would distribute the changes to the various nodes.
The default interval, i believe is 1 minute.
12 minutes certainly sounds a lot. Is there any possibility of network being an issue here?
HTH
Manglu