I want to instantiate different containers from the same image to serve different requests and process different data.
When the first request is received, Docker has to instantiate a container (C1) from image (I) to work on the related dataset file D1.
For the second request, Docker has to instantiate a container (C2) from image (I) as well, to work on dataset file D2.
And so on ...
Does Docker have a built-in facility to orchestrate this kind of workflow, or do I have to write my own service to receive requests and start the corresponding containers to serve them?
Any guidance on the best way to do this would be appreciated.
I think what you're looking for is a serverless / function-as-a-service framework. AFAIK Docker doesn't have anything like this built in. Take a look at OpenFaaS or Kubeless; both are frameworks that should help you get started with implementing your use case.
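If you end up writing your own service instead, the dispatch step itself is small. Here is a minimal Java sketch (the image name, mount path, and dataset locations are all hypothetical) that shells out to the Docker CLI once per request:

    import java.io.IOException;

    public class ContainerDispatcher {

        // Start a fresh container from the same image for each request and hand it
        // the request's dataset via a bind mount. "my-processing-image" stands in
        // for image I; adjust the paths to your layout.
        public static int runContainerFor(String datasetPath) throws IOException, InterruptedException {
            Process p = new ProcessBuilder(
                    "docker", "run", "--rm",
                    "-v", datasetPath + ":/data/input",
                    "my-processing-image")
                    .inheritIO()
                    .start();
            return p.waitFor(); // block until the container finishes processing
        }

        public static void main(String[] args) throws Exception {
            runContainerFor("/datasets/D1"); // e.g. the first request's dataset
        }
    }

A FaaS framework essentially wraps this loop with routing, scaling, and lifecycle management for you.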
Question:
Is there an option within Spring or its embedded servlet container to open its ports only when Spring is ready to handle traffic?
Situation:
In the current setup I use a Spring Boot application running in Google Cloud Run.
Circumstances:
Cloud Run does not support liveness/readiness probes; it considers an open port as "application ready".
Cloud Run sends requests to the container although Spring is not yet ready to handle them.
Spring starts its servlet container and opens its ports while still spinning up its beans.
Problem:
Traffic to an unready application results in a lot of HTTP 429 status codes.
This affects:
new deployments
the scaling capabilities of Cloud Run
My desire:
Configure Spring / the servlet container to delay opening its ports until the application is actually ready.
Delaying the opening of the ports until the application is ready would ease much of the pain without interfering too much with the existing code base.
Any alternatives not causing too much pain?
Things I found and considered not viable:
Using native-image is not an option, as it is considered experimental and consumes more RAM at compile time than our deployment pipeline agents are allowed to allocate (max 8 GB vs. the needed 13 GB).
Another answer I found: readiness check for google cloud run - how?
I don't see how it could satisfy my needs, since Spring Boot startup time is still slow; that's why my initial idea was to delay opening the ports.
I did not have time to test the following, but one thing I stumbled upon is a blog post about using multiple processes within a container. Though it goes against recommended container principles, it seems viable until Cloud Run supports probes of any type.
As you are well aware, "Cloud Run currently does not have a readiness/liveness check to avoid sending requests to unready applications", so there is not much that can be done on Cloud Run's side except:
Optimise the Spring Boot app as per the docs.
Make a heavier entrypoint in the Cloud Run service that takes care of more setup tasks. This Stack Overflow thread mentions how "A 'heavier' entrypoint will help post-deploy responsiveness, at the cost of slower cold-starts" (this is the most relevant solution from a Cloud Run perspective and outlines the issue correctly; see the sketch after this list).
Run multiple processes in a container in Cloud Run, as you mentioned.
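For the "heavier entrypoint" suggestion, here is a rough Spring Boot sketch (warmUp() is a hypothetical placeholder for your setup tasks). Because the work runs before SpringApplication.run(), the port only opens once it has finished, at the cost of a slower cold start:

    import org.springframework.boot.SpringApplication;
    import org.springframework.boot.autoconfigure.SpringBootApplication;

    @SpringBootApplication
    public class App {

        public static void main(String[] args) {
            // Perform expensive one-time setup before Spring starts, so the
            // container's port is not opened until the work below has finished.
            warmUp(); // hypothetical: prime caches, preload config, open pools
            SpringApplication.run(App.class, args);
        }

        private static void warmUp() {
            // ... setup tasks ...
        }
    }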
This question seems more directed at Spring Boot specifically, and I found an article with a similar requirement.
However, if you absolutely need the app to be ready to serve when requests come in, there is an alternative to Cloud Run: Google Kubernetes Engine (GKE), which supports readiness/liveness probes.
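And if you still want to experiment with delaying the port binding itself, one untested avenue (assuming the default embedded Tomcat; behaviour may differ across Tomcat versions) is Tomcat's bindOnInit connector property, which defers binding the socket from connector initialisation to connector start:

    import org.springframework.boot.web.embedded.tomcat.TomcatServletWebServerFactory;
    import org.springframework.boot.web.server.WebServerFactoryCustomizer;
    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;

    @Configuration
    public class DelayedBindConfig {

        // Untested sketch: with bindOnInit=false Tomcat binds the port when the
        // connector starts rather than when it is initialised, which happens
        // later in Spring Boot's startup sequence.
        @Bean
        public WebServerFactoryCustomizer<TomcatServletWebServerFactory> delayPortBinding() {
            return factory -> factory.addConnectorCustomizers(
                    connector -> connector.setProperty("bindOnInit", "false"));
        }
    }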
Currently we are running a NodeJS web app using Serverless. The API Gateway uses a single API endpoint for the entire application, and routing is handled internally. So basically it is a single HTTP {Any+} endpoint for the entire application.
My questions are:
1. What are the disadvantages of this method? (I know Lambda is built for FaaS, but right now we are handling it as a monolithic function.)
2. How many instances can Lambda run at a time if we follow this method? Can it handle a million+ requests at a single time?
Any help would be appreciated. Thanks!
The disadvantage would be, as you say, that it's monolithic, so you haven't modularised your code at all. The idea is that adjusting one function shouldn't affect the rest, but in this case it can.
You can run as many as you like concurrently; you can set limits, though (and there are some initial limits in place for safety, which can be raised on request).
If you run the function regularly it should also "warm start", i.e. have a shorter startup time after the first invocation.
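On the limits: AWS applies a per-region, account-level concurrency quota by default, and you can reserve or cap concurrency per function. A small sketch with the AWS SDK for Java v2 (the function name and the cap of 500 are hypothetical):

    import software.amazon.awssdk.services.lambda.LambdaClient;
    import software.amazon.awssdk.services.lambda.model.PutFunctionConcurrencyRequest;

    public class ConcurrencyLimit {

        public static void main(String[] args) {
            try (LambdaClient lambda = LambdaClient.create()) {
                // Cap the monolithic handler at 500 concurrent executions so a
                // traffic spike cannot exhaust the account-wide concurrency pool.
                lambda.putFunctionConcurrency(PutFunctionConcurrencyRequest.builder()
                        .functionName("my-monolith-handler") // hypothetical name
                        .reservedConcurrentExecutions(500)
                        .build());
            }
        }
    }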
I'm planning to run a .sh script, once some containers are up, to make some REST requests.
I created a Job to wait for these initial containers and filled in the initContainers tag in the .yaml.
At first I thought of creating a container using a Linux distro as a base image, but that didn't seem quite right.
Isn't that a waste of resources? What is the best practice for this situation?
Thanks in advance.
You can use any kind of image as a base image. The only resource you're consuming (not wasting) is disk space.
Kubernetes will pull the image, and that will consume disk space on the node. As soon as the init container has done its work, it will be stopped and won't use any more RAM or CPU, which are the truly precious resources in the cloud or on bare metal.
If you're worried about the image size, you can also use the same image as the main container and simply start a different command from it (see the sketch below).
This way the node pulls just one image, and no additional space apart from your script is consumed.
Another option could be to use a very small distro like alpine.
If you can write your initialization routine in Go, you can also use your Go binary as the image, as described here.
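For the "same image, different command" approach, a minimal sketch of what the Pod spec could look like (the image name, script path, and container names are placeholders):

    apiVersion: v1
    kind: Pod
    metadata:
      name: app-with-init
    spec:
      initContainers:
        - name: init-rest-calls
          image: my-app:latest                  # same image as the main container
          command: ["/bin/sh", "-c"]
          args: ["/scripts/init-requests.sh"]   # hypothetical path to your .sh script
      containers:
        - name: app
          image: my-app:latest                  # only one image is pulled onto the node
          # the application's normal entrypoint runs here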
I plan to use the Elastic Beanstalk Docker runtime.
We plan to construct a multi-container environment that launches a Laravel application container and a Laravel queue worker container.
There is no problem deploying them.
However, I have a doubt about how they scale.
For example, if only the load on the Laravel application container increases, does only that container scale out while the queue worker container stays unchanged?
Or do the two containers always scale together?
I would like it to be the former.
If you have any knowledge about this, please share it.
Thanks for reading.
When trying to use the prediction service for a model deployed by Steam, this is what I see:
Notice that when I click the "Predict" button, I get a prediction label response from the model, but there are no input fields displayed. Why is this happening?
I start my Steam session like this:
I launch H2O Flow:
java -Xmx4g -jar h2o.jar
I start the Steam Jetty server for the prediction service (as instructed here):
java -Xmx6g -jar var/master/assets/jetty-runner.jar var/master/assets/ROOT.war
I use -Xmx6g because I was getting a java.lang.OutOfMemoryError from the prediction service earlier.
I launch the Steam server:
./steam serve master --prediction-service-host=localhost --prediction-service-port-range=12345:22345
I use a custom port range for the prediction service since I was having problems deploying models from Steam when it could not access port 8080 (if anyone knows a better way around this, please let me know). From here, I import the model from the localhost H2O Flow server into Steam and deploy it to get the screen shown at the top of this post.
I was having problems before where the prediction service builder server (launched with GRADLE_OPTS=-Xmx6g ./gradlew jettyRunWar following the instructions here) was not showing input fields for .war files built from MOJOs (see here), but in this case I am using a model imported directly from H2O Flow into Steam. If anyone knows what is going on here, it would be a big help. Thanks :)
UPDATE
I used a smaller, similar model (POJO size of ~200 MB) and can now see input fields (after waiting on the prediction service screen for ~10 sec). I can't tell what kind of file the model is currently being transferred as under the hood, though; I assume it is a POJO now. One weird thing is that the input fields also include the model's binomial response labels (as if the user could just choose the response as input).
As I explained in this other question, Using MOJOs in H2O Steam Prediction Service Builder, this is because the UI has not been updated to handle MOJOs; it currently only handles POJOs.
You can use the command line (or other tools) to send data to, and get predictions from, the prediction service. How to do this is explained here: https://github.com/h2oai/steam/tree/master/prediction-service-builder
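As a starting point, here is a hedged sketch of sending one input row to a deployed prediction service with Java's built-in HTTP client; the port, path, and JSON field names below are assumptions, so check the README above for the actual interface:

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    public class PredictClient {

        public static void main(String[] args) throws Exception {
            HttpClient client = HttpClient.newHttpClient();
            // Hypothetical endpoint: a service from the configured port range,
            // accepting one input row as JSON and returning the prediction.
            HttpRequest request = HttpRequest.newBuilder()
                    .uri(URI.create("http://localhost:12345/predict"))
                    .header("Content-Type", "application/json")
                    .POST(HttpRequest.BodyPublishers.ofString(
                            "{\"feature_1\": 5.1, \"feature_2\": 3.5}"))
                    .build();
            HttpResponse<String> response =
                    client.send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println(response.body());
        }
    }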