I'm running a benchmark to compare IO performance between Docker containers and their host, and I noticed something strange. I performed random writes and reads.
The storage driver for the container is aufs.
If the file to be written/read is smaller than or equal to 1 GB, Docker is faster than the host; if the file is bigger, Docker is slower.
Why do I get those results for small files?
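For reference, a random-write test of this kind can be driven with something like fio, running the same command once on the host and once inside the container (the sizes and flags below are just an illustration of the setup, not the exact commands from my benchmark):

# Run identically on the host and inside the container, then compare results.
fio --name=randwrite --rw=randwrite --bs=4k --size=1G \
    --ioengine=libaio --direct=1 --numjobs=1 --group_reporting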
Related
I have read that there is a significant performance hit when mounting shared volumes on Windows. How does this compare to keeping, say, the Postgres DB inside a Docker volume (not shared with the host OS), or to the rate of reading/writing flat files?
Has anyone found any concrete numbers around this? Even a 4x slowdown would be acceptable for my use case if it only affects disk IO performance. I get the impression that mounted, shared volumes are significantly slower on Windows, so I want to know whether forgoing the sharing component would improve matters to an acceptable range.
Also, if I left Postgres on bare metal, could all of my Docker apps still access it that way? (That's probably preferable, I would imagine; I have seen reports of 4x faster reads/writes staying on bare metal.) But I still need to know, because my apps also do a lot of copying, reading, and moving of flat files, so I need to know what is best for that.
For example, if shared volumes are really bad compared to keeping data only in the container, I have the option of pushing files over the network to avoid a shared mounted volume as a bottleneck.
Thanks for any insights
You only pay this performance cost for bind-mounted host directories. Named Docker volumes or the Docker container filesystem will be much faster. The standard Docker Hub database images are configured to always use a volume for storage, so you should use a named volume for this case.
docker volume create pgdata
docker run -v pgdata:/var/lib/postgresql/data -p 5432:5432 postgres:12
You can also run PostgreSQL directly on the host. On systems using the Docker Desktop application you can access it via the special hostname host.docker.internal. This is discussed at length in From inside of a Docker container, how do I connect to the localhost of the machine?.
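For example, on Docker Desktop a one-off client container can reach the host's PostgreSQL like this (the user and database names here are placeholders):

docker run --rm -it postgres:12 \
  psql -h host.docker.internal -p 5432 -U myuser -d mydb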
If you're using the Docker Desktop application, the following guidelines apply depending on what you're using the volumes for (a combined example follows this list):
Opaque database storage, like the PostgreSQL data: use a named volume; it will be faster, and you couldn't usefully access the data directly even if it were on the host
Injecting individual config files: use a bind mount; these are usually only read once at startup so there's not much of a performance cost
Exporting log files: use a bind mount; if there is enough log I/O to be a performance problem you're probably actively debugging
Your application source code: don't use a volume at all, run the code that's in the image, or use a native host development environment
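Putting the first two cases together, a minimal sketch could look like this (the host config path and the idea of overriding the config file are illustrative, not something the image requires):

# Named volume for the opaque database storage; bind mount only for a
# single config file that is read once at startup.
docker volume create pgdata
docker run -d \
  -v pgdata:/var/lib/postgresql/data \
  -v "$PWD/postgresql.conf:/etc/postgresql/postgresql.conf:ro" \
  -p 5432:5432 \
  postgres:12 -c config_file=/etc/postgresql/postgresql.conf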
I'm trying to deploy a containerized Spring Boot app using Docker.
Here's my Dockerfile:
FROM openjdk:8
ADD app-1.0.jar app.jar
EXPOSE 8080
ENTRYPOINT ["java", "-Xmx64m", "-Xss256k", "-jar", "ecalio-landing.jar"]
And ran a container like this:
sudo docker run -d -m256m --restart=always server.tomcat.max-threads=5 --name=ecalio-landing
Once deployed, I used Apache Bench (ab) to test the backend and how many requests it can handle, using this command:
ab -n 20000 -c 10 http://www.ecalio.com/
Basically, I'm trying to find out whether the backend can handle 20000 requests, 10 at a time, because I've limited the container's memory consumption to 256 MB.
The container starts at 220 MB, reaches a level around 245 MB, and doesn't go further even if I rerun the same ab command.
However, when I try to reach the backend using a browser, 0.1 MB is consumed each time I refresh the page, and obviously the container crashes once it reaches 256 MB of memory consumption.
Why does this happen?
I don't want my container to consume much memory. It's basically a simple app that uses JPA with only 1 entity model, performs a single fetch query each time the / URL is requested through a simple controller (@Controller), and returns a single HTML page rendered with Thymeleaf.
I've used Java VisualVM with my app (launched locally on my machine without a Docker container) and I can clearly see that my app doesn't have any memory leak: the heap never grows beyond 68 MB and the used heap is always being cleared by the GC...
After much struggle I found the solution, and it's almost ridiculous...
When these 2 options aren't passed to the JVM, it assumes that the container has the same amount of resources as the host machine, even if the -m parameter was passed to the container's creation command:
-XX:+UnlockExperimentalVMOptions
-XX:+UseCGroupMemoryLimitForHeap
It means that if I create a container with -m 300m to specify that it shouldn't allocate more than 300 MB, the JVM will still think the container is entitled to 2 GB of memory (where 2 GB is my machine's physical memory).
Using these options, I was able to get my app working in a 256 MB container... amazing, considering that at one point my container consumed up to 800 MB...
Sources:
The official openjdk Docker image
Interesting article
My new Dockerfile:
FROM openjdk:8
ADD app.jar app.jar
EXPOSE 8080
ENTRYPOINT ["java", "-server", "-XX:+UnlockExperimentalVMOptions", "-XX:+UseCGroupMemoryLimitForHeap", "-jar", "app.jar"]
The -server flag tells the JVM that the app is to be executed in a server environment, which leads to some changes, including a GC algorithm dedicated to server environments and some other behaviors that can be found in the official documentation.
Note that no -Xmx, -Xss, or other additional options are needed for memory limitation, as the JVM will work everything out by itself (more details in the article linked above).
Another thing to know is that this configuration is done automatically in OpenJDK 11.
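For reference, on newer base images the experimental flags are typically unnecessary; a minimal sketch under that assumption (the image tag and heap percentage are just examples):

FROM openjdk:11
ADD app.jar app.jar
EXPOSE 8080
# Container awareness is on by default here; -XX:MaxRAMPercentage caps the
# heap as a fraction of the container's memory limit.
ENTRYPOINT ["java", "-XX:MaxRAMPercentage=75.0", "-jar", "app.jar"]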
I pulled a standard docker ubuntu image and ran it like this:
docker run -i -t ubuntu bash -l
When I do an ls inside the container I see a proper filesystem and I can create files, etc. How is this different from a VM? Also, what are the limits on how big a file I can create on this container filesystem? And is there a way to create a file inside the container filesystem that persists in the host filesystem after the container is stopped or killed?
How is this different from a VM?
A VM will lock and allocate resources (disk, CPU, memory) for its full stack, even if it does nothing.
A container isolates resources from the host (disk, CPU, memory), but won't actually use them unless it does something. You can launch many containers; if they are doing nothing, they won't use memory, CPU, or disk.
Regarding disk, containers launched from the same image share the same filesystem and, through a copy-on-write (COW) mechanism and UnionFS, add a layer when you write inside the container.
That layer will be lost when the container exits and is removed.
To persist data written in a container, see "Manage data in a container".
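To make the writable-layer behaviour concrete, here is a small sketch (container and path names are arbitrary):

# Write a file into the container's writable layer and inspect it.
docker run --name demo ubuntu bash -c 'echo hello > /data.txt'
docker diff demo     # shows "A /data.txt" added in the container's layer
docker rm demo       # removing the container discards that layer, and the file with it

# To keep the file on the host instead, bind-mount a host directory.
docker run --rm -v "$PWD/out:/out" ubuntu bash -c 'echo hello > /out/data.txt'
cat out/data.txt     # the file survives on the host after the container exits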
For more, read the insightful article from Jessie Frazelle "Setting the Record Straight: containers vs. Zones vs. Jails vs. VMs"
I am new to lxc and docker. Does docker max client count depend solely on CPU and RAM or are there some other factors associated with running multiple containers simultaneously?
As mentioned in the comments to your question, it will largely depend on the requirements of the applications inside the containers.
What follows is anecdotal data I collected for this answer (this is on a MacBook Pro with 8 cores and 16 GB RAM, with Docker running in VirtualBox via boot2docker given 2 GB of RAM and 2 of the MBP's cores):
I was able to launch 242 (idle) redis containers before getting:
2014/06/30 08:07:58 Error: Cannot start container c4b49372111c45ae30bb4e7edb322dbffad8b47c5fa6eafad890e8df4b347ffa: pipe2: too many open files
After that, top inside the VM reports CPU use around 30%-55% user and 10%-12% system (each redis process seems to use 0.2%). I also get timeouts while trying to connect to a redis server.
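One practical way to estimate capacity is to measure what a single container of your actual workload uses, then cap each container explicitly so the arithmetic against the host's RAM and CPUs becomes predictable; a rough sketch (image name and limits are placeholders):

# Observe the real footprint of one running container.
docker stats --no-stream some-redis

# Cap each container so that N of them fit into the host's RAM and CPUs.
docker run -d --name some-redis-1 --memory 128m --cpus 0.25 redis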
I just started with Docker because I'd like to use it to run parallel tasks.
My problem is that I don't understand how Docker handles the resources on the host (CPU, RAM, etc.): i.e. how can I evaluate the maximum number of containers to run at the same time?
Thanks for your suggestions.