Utilizing all cores on single-machine Hadoop

Currently I'm working on loading data into a Titan graph with Hadoop (Titan version 0.5.4, Hadoop version 2.6.0). I'm using a single-server (pseudo-distributed) Hadoop cluster, with the intention of later extending it to a full cluster of machines with the same hardware. I'm trying to set up Hadoop in such a way that I get full core utilization. Until now, I thought I had a decent setup with good configuration parameters, but when Hadoop executes and loads data into the Titan graph, I don't see full utilization of all cores on my machine.
The situation is as follows. The machine I'm using has the following hardware specifications:
CPU: 32 cores
RAM: 256GB
Swap memory: 32GB
Drives: 8x128GB SSD, 4x2TB HDD
The data I'm loading into a Titan graph with Hadoop has the following specifications:
Total size: 848MB
Split into four files (487MB, 142MB, 219MB and 1.6MB), each containing vertices of a single type, together with all the vertex properties and outgoing edges.
While setting up the Hadoop cluster, I tried to use some logical reasoning to set the configuration parameters of Hadoop to what I think are their optimal values. See this reasoning below.
My machine has 32 cores, so in theory I could split my input into chunks sized so that I end up with around 32 of them. So, for 848MB of input, I could set dfs.block.size to 32MB, which would lead to around (848MB / 32MB ≈) 27 chunks.
In order to ensure that each map task receives one chunk, I set the value of mapred.min.split.size to a bit less than the block size, and mapred.max.split.size to a bit more than the block size (for example 30MB and 34MB, respectively).
The amount of memory needed per task is still a bit vague to me. For example, I could set mapred.child.java.opts to a value of -Xmx1024m to give each task (i.e. each mapper/reducer) 1GB of memory. Given that my machine has 256GB of memory in total - subtracting some of it to reserve for other purposes leaves me around 200GB - I could end up with a total of (200GB / 1GB = ) 200 mappers and reducers. Or, if I give each task 2GB of memory, I would end up with a total of 100 mappers and reducers. The amount of memory given to each task also depends on the input size, I guess. Anyway, this leads to values for mapred.tasktracker.map/reduce.tasks.maximum of around 100, which might already be too much given that I have only 32 cores. Therefore, maybe setting this parameter to 32 for both map and reduce would be better? What do you think?
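As a sanity check on this block-size reasoning, the actual block size and block layout of the input files can be inspected directly once they are in HDFS. A minimal sketch (the /user/hadoop/input path is just a placeholder for wherever the four input files live):

# Print block size (%o), file size in bytes (%b) and name (%n) for each input file
hdfs dfs -stat "%o %b %n" /user/hadoop/input/*

# Show how many blocks each file actually occupies
hdfs fsck /user/hadoop/input -files -blocks

The number of input splits (and hence map tasks) actually created is also reported by the MapReduce job client when the job is submitted (the "number of splits" log line).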
After these assumptions, I end up with the following configuration.
hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.block.size</name>
<value>33554432</value>
<description>Specifies the size of the data blocks into which the input dataset is split.</description>
</property>
</configuration>
mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
<description>The runtime framework for executing MapReduce jobs. Can be one of local, classic or yarn.</description>
</property>
<property>
<name>mapred.child.java.opts</name>
<value>-Xmx2048m</value>
<description>Java opts for the task tracker child processes.</description>
</property>
<property>
<name>mapred.tasktracker.map.tasks.maximum</name>
<value>32</value>
<description>The maximum number of map tasks that will be run simultaneously by a tasktracker.</description>
</property>
<property>
<name>mapred.tasktracker.reduce.tasks.maximum</name>
<value>32</value>
<description>The maximum number of reduce tasks that will be run simultaneously by a tasktracker.</description>
</property>
<property>
<name>mapred.min.split.size</name>
<value>31457280</value>
<description>The minimum size chunk that map input should be split into.</description>
</property>
<property>
<name>mapred.max.split.size</name>
<value>35651584</value>
<description>The maximum size chunk that map input should be split into.</description>
</property>
<property>
<name>mapreduce.job.reduces</name>
<value>32</value>
<description>The default number of reducers to use.</description>
</property>
<property>
<name>mapreduce.job.maps</name>
<value>32</value>
<description>The default number of maps to use.</description>
</property>
</configuration>
yarn-site.xml
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.scheduler.minimum-allocation-mb</name>
<value>2048</value>
<description>The minimum allocation for every container request at the RM, in MBs.</description>
</property>
</configuration>
Executing Hadoop with these settings does not give me full core utilization on my single machine. Not all cores are busy throughout all MapReduce phases. During the Hadoop execution, I also took a look at the IO throughput using the iostat command (iostat -d -x 5 3, giving me three reports at 5-second intervals). A sample of such a report is shown below.
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sda 0.00 0.07 0.02 0.41 0.29 2.37 12.55 0.01 16.92 5.18 17.43 2.47 0.10
sdb 0.07 2.86 4.90 10.17 585.19 1375.03 260.18 0.04 2.96 23.45 8.55 1.76 2.65
sdc 0.08 2.83 4.89 10.12 585.48 1374.71 261.17 0.07 4.89 30.35 8.12 2.08 3.13
sdd 0.07 2.83 4.89 10.10 584.79 1374.46 261.34 0.04 2.78 26.83 6.71 1.94 2.91
sde 0.00 0.00 0.00 0.00 0.05 0.80 278.61 0.00 10.74 2.55 32.93 0.73 0.00
sdf 0.00 0.00 0.00 0.00 0.05 0.80 283.72 0.00 10.30 1.94 33.09 0.68 0.00
sdg 0.00 0.00 0.00 0.00 0.05 0.80 283.83 0.00 10.24 1.99 32.75 0.68 0.00
sdh 0.00 0.00 0.00 0.00 0.05 0.80 284.13 0.00 10.29 1.96 32.99 0.69 0.00
sdi 0.00 0.00 0.00 0.00 0.05 0.80 284.87 0.00 17.89 2.35 60.33 0.74 0.00
sdj 0.00 0.00 0.00 0.00 0.05 0.80 284.05 0.00 10.30 2.01 32.96 0.68 0.00
sdk 0.00 0.00 0.00 0.00 0.05 0.80 284.44 0.00 10.20 1.99 32.62 0.68 0.00
sdl 0.00 0.00 0.00 0.00 0.05 0.80 284.21 0.00 10.50 2.00 33.71 0.69 0.00
md127 0.00 0.00 0.04 0.01 0.36 6.38 279.84 0.00 0.00 0.00 0.00 0.00 0.00
md0 0.00 0.00 14.92 36.53 1755.46 4124.20 228.57 0.00 0.00 0.00 0.00 0.00 0.00
I'm no expert in disk utilization, but could these values mean that I'm IO-bound somewhere, for example on disks sdb, sdc or sdd?
Edit: maybe a better indication of CPU utilization and IO throughput can be given by using the sar command. Here are the results for 5 reports, 5 seconds apart (sar -u 5 5):
11:07:45 AM CPU %user %nice %system %iowait %steal %idle
11:07:50 AM all 12.77 0.01 0.91 0.31 0.00 86.00
11:07:55 AM all 15.99 0.00 1.39 0.56 0.00 82.05
11:08:00 AM all 11.43 0.00 0.58 0.04 0.00 87.95
11:08:05 AM all 8.03 0.00 0.69 0.48 0.00 90.80
11:08:10 AM all 8.58 0.00 0.59 0.03 0.00 90.80
Average: all 11.36 0.00 0.83 0.28 0.00 87.53
Thanks in advance for any reply!

Set this parameter in yarn-site.xml to the number of cores your machine has:
<property>
<name>yarn.scheduler.maximum-allocation-vcores</name>
<value>32</value>
</property>
Then run the pi example from the hadoop-examples jar and observe on the ResourceManager's web page how many mappers are being executed at the same time.
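For example (a sketch; the jar path and version number are assumptions that need adjusting to your installation):

# Launch the pi example with 32 map tasks
hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar pi 32 1000

# While it runs, list applications and node containers on the command line,
# or open the ResourceManager web UI (http://<resourcemanager-host>:8088 by default)
yarn application -list
yarn node -list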

Related

Why Ceph turns status to Err when there is still available storage space

I built a 3 node Ceph cluster recently. Each node has seven 1TB HDDs for OSDs, so in total I have 21 TB of storage space for Ceph.
However, when I ran a workload that keeps writing data to Ceph, it turned to Err status and no data can be written to it any more.
The output of ceph -s is:
cluster:
id: 06ed9d57-c68e-4899-91a6-d72125614a94
health: HEALTH_ERR
1 full osd(s)
4 nearfull osd(s)
7 pool(s) full
services:
mon: 1 daemons, quorum host3
mgr: admin(active), standbys: 06ed9d57-c68e-4899-91a6-d72125614a94
osd: 21 osds: 21 up, 21 in
rgw: 4 daemons active
data:
pools: 7 pools, 1748 pgs
objects: 2.03M objects, 7.34TiB
usage: 14.7TiB used, 4.37TiB / 19.1TiB avail
pgs: 1748 active+clean
Based on my understanding, since there is still 4.37 TiB of space left, Ceph itself should take care of balancing the workload and keep each OSD from reaching full or nearfull status. But the result doesn't match my expectation: 1 full osd and 4 nearfull osds show up, and the health is HEALTH_ERR.
I can't access Ceph with hdfs or s3cmd anymore, so here come the questions:
1. Is there any explanation for the current issue?
2. How can I recover from it? Should I delete the data on the Ceph nodes directly with ceph-admin and relaunch Ceph?
I didn't get an answer for 3 days, but I made some progress; let me share my findings here.
1. It's normal for different OSDs to have a usage gap. If you list the OSDs with ceph osd df, you will find that different OSDs have different usage ratios.
2. To recover from this issue (here, the cluster being stuck because an OSD is full), follow the steps below; they are mostly from Red Hat's documentation.
Get the Ceph cluster health info with ceph health detail. This is not strictly necessary, but it gives you the ID of the failed OSD.
Use ceph osd dump | grep full_ratio to get the current full_ratio. Do not use the statement listed at the link above; it is obsolete. The output looks like:
full_ratio 0.95
backfillfull_ratio 0.9
nearfull_ratio 0.85
Set the OSD full ratio a little higher with ceph osd set-full-ratio <ratio>. Generally, we set the ratio to 0.97.
Now the cluster status will change from HEALTH_ERR to HEALTH_WARN or HEALTH_OK. Remove some data that can be released.
Change the OSD full ratio back to the previous value; it shouldn't stay at 0.97 because that is a little risky. The full command sequence is sketched below.
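Put together, the recovery looks roughly like this (a sketch assuming a release that supports ceph osd set-full-ratio, and that the previous full_ratio was 0.95 as in the dump above):

# 1. Inspect cluster health and identify the full OSD(s)
ceph health detail

# 2. Check the current ratios
ceph osd dump | grep full_ratio

# 3. Temporarily raise the full ratio to get out of HEALTH_ERR
ceph osd set-full-ratio 0.97

# 4. Delete or migrate data that can be released, then restore the previous ratio
ceph osd set-full-ratio 0.95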
Hope this thread is helpful to someone who runs into the same issue. For the details of OSD configuration, please refer to the Ceph documentation.
Ceph requires free disk space to move storage chunks, called pgs, between different disks. As this free space is so critical to the underlying functionality, Ceph will go into HEALTH_WARN once any OSD reaches the near_full ratio (generally 85% full), and will stop write operations on the cluster by entering HEALTH_ERR state once an OSD reaches the full_ratio.
However, unless your cluster is perfectly balanced across all OSDs there is likely much more capacity available, as OSDs are typically unevenly utilized. To check overall utilization and available capacity you can run ceph osd df.
Example output:
ID CLASS WEIGHT REWEIGHT SIZE RAW USE DATA OMAP META AVAIL %USE VAR PGS STATUS
2 hdd 2.72849 1.00000 2.7 TiB 2.0 TiB 2.0 TiB 72 MiB 3.6 GiB 742 GiB 73.44 1.06 406 up
5 hdd 2.72849 1.00000 2.7 TiB 2.0 TiB 2.0 TiB 119 MiB 3.3 GiB 726 GiB 74.00 1.06 414 up
12 hdd 2.72849 1.00000 2.7 TiB 2.2 TiB 2.2 TiB 72 MiB 3.7 GiB 579 GiB 79.26 1.14 407 up
14 hdd 2.72849 1.00000 2.7 TiB 2.3 TiB 2.3 TiB 80 MiB 3.6 GiB 477 GiB 82.92 1.19 367 up
8 ssd 0.10840 0 0 B 0 B 0 B 0 B 0 B 0 B 0 0 0 up
1 hdd 2.72849 1.00000 2.7 TiB 1.7 TiB 1.7 TiB 27 MiB 2.9 GiB 1006 GiB 64.01 0.92 253 up
4 hdd 2.72849 1.00000 2.7 TiB 1.7 TiB 1.7 TiB 79 MiB 2.9 GiB 1018 GiB 63.55 0.91 259 up
10 hdd 2.72849 1.00000 2.7 TiB 1.9 TiB 1.9 TiB 70 MiB 3.0 GiB 887 GiB 68.24 0.98 256 up
13 hdd 2.72849 1.00000 2.7 TiB 1.8 TiB 1.8 TiB 80 MiB 3.0 GiB 971 GiB 65.24 0.94 277 up
15 hdd 2.72849 1.00000 2.7 TiB 2.0 TiB 2.0 TiB 58 MiB 3.1 GiB 793 GiB 71.63 1.03 283 up
17 hdd 2.72849 1.00000 2.7 TiB 1.6 TiB 1.6 TiB 113 MiB 2.8 GiB 1.1 TiB 59.78 0.86 259 up
19 hdd 2.72849 1.00000 2.7 TiB 1.6 TiB 1.6 TiB 100 MiB 2.7 GiB 1.2 TiB 56.98 0.82 265 up
7 ssd 0.10840 0 0 B 0 B 0 B 0 B 0 B 0 B 0 0 0 up
0 hdd 2.72849 1.00000 2.7 TiB 2.0 TiB 2.0 TiB 105 MiB 3.0 GiB 734 GiB 73.72 1.06 337 up
3 hdd 2.72849 1.00000 2.7 TiB 2.0 TiB 2.0 TiB 98 MiB 3.0 GiB 781 GiB 72.04 1.04 354 up
9 hdd 2.72849 0 0 B 0 B 0 B 0 B 0 B 0 B 0 0 0 up
11 hdd 2.72849 1.00000 2.7 TiB 1.9 TiB 1.9 TiB 76 MiB 3.0 GiB 817 GiB 70.74 1.02 342 up
16 hdd 2.72849 1.00000 2.7 TiB 1.8 TiB 1.8 TiB 98 MiB 2.7 GiB 984 GiB 64.80 0.93 317 up
18 hdd 2.72849 1.00000 2.7 TiB 2.0 TiB 2.0 TiB 79 MiB 3.0 GiB 792 GiB 71.65 1.03 324 up
6 ssd 0.10840 0 0 B 0 B 0 B 0 B 0 B 0 B 0 0 0 up
TOTAL 47 TiB 30 TiB 30 TiB 1.3 GiB 53 GiB 16 TiB 69.50
MIN/MAX VAR: 0.82/1.19 STDDEV: 6.64
As you can see in the above output, OSD utilization varies from 56.98% (OSD 19) to 82.92% (OSD 14), which is a significant variance.
As only a single OSD is full, and only 4 of your 21 OSDs are nearfull, you likely have a significant amount of storage still available in your cluster, which means that it is time to perform a rebalance operation. This can be done manually by reweighting OSDs, or you can have Ceph do a best-effort rebalance by running ceph osd reweight-by-utilization. Once the rebalance is complete (i.e. you have no misplaced objects in ceph status), you can check the variation again (using ceph osd df) and trigger another rebalance if required.
If you are on Luminous or newer, you can enable the balancer plugin to handle OSD reweighting automatically.
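For example (a sketch; the balancer commands assume Luminous or newer, where the module may already be enabled by default):

# Best-effort rebalance by reweighting the most utilized OSDs
ceph osd reweight-by-utilization

# Check progress and the remaining variance
ceph status
ceph osd df

# On Luminous or newer, let the balancer module handle reweighting automatically
ceph mgr module enable balancer
ceph balancer mode upmap
ceph balancer on
ceph balancer status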

building software using bazel is slower than make?

My team has a project which is not too big; built with make -js it takes 40 seconds, but with bazel the time increased to 70 seconds. And here is the profile of the bazel build process. I noticed that SKYFUNCTION takes 47% of the time; is that reasonable?
PROFILES
the last section of it:
Type Total Count Average
ACTION 0.03% 77 0.70 ms
ACTION_CHECK 0.00% 4 0.90 ms
ACTION_EXECUTE 40.40% 77 912 ms
ACTION_UPDATE 0.00% 74 0.02 ms
ACTION_COMPLETE 0.19% 77 4.28 ms
INFO 0.00% 1 0.05 ms
VFS_STAT 1.07% 117519 0.02 ms
VFS_DIR 0.27% 4613 0.10 ms
VFS_MD5 0.22% 151 2.56 ms
VFS_DELETE 4.43% 53830 0.14 ms
VFS_OPEN 0.01% 232 0.11 ms
VFS_READ 0.06% 3523 0.03 ms
VFS_WRITE 0.00% 4 0.97 ms
WAIT 0.05% 156 0.56 ms
SKYFRAME_EVAL 6.23% 1 10.830 s
SKYFUNCTION 47.01% 687 119 ms
@ittai, @Jin, @Ondrej K: I have tried switching off sandboxing in bazel, and it seems much faster than with it switched on. Here is the comparison:
SWITCHED ON: 70s
SWITCHED OFF: 33s±2
SKYFUNCTION still takes 47% of the total execution time, but the average time per call dropped from 119 ms to 21 ms.
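For reference, roughly how sandboxing can be switched off and the profile collected (a sketch; flag spellings are assumptions that may differ between Bazel versions, and //... is a placeholder target pattern):

# Build without sandboxing (spawn actions directly on the host)
bazel build //... --spawn_strategy=local   # "standalone" on older Bazel versions

# Capture a profile of the build and inspect it
bazel build //... --profile=/tmp/bazel.profile
bazel analyze-profile /tmp/bazel.profile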

Why does Spark only use one executor on my 2 worker node cluster if I increase the executor memory past 5 GB?

I am using a 3 node cluster: 1 master node and 2 worker nodes, using T2.large EC2 instances.
The "free -m" command gives me the following info:
Master:
total used free shared buffers cached
Mem: 7733 6324 1409 0 221 4555
-/+ buffers/cache: 1547 6186
Swap: 1023 0 1023
Worker Node 1:
total used free shared buffers cached
Mem: 7733 3203 4530 0 185 2166
-/+ buffers/cache: 851 6881
Swap: 1023 0 1023
Worker Node 2:
total used free shared buffers cached
Mem: 7733 3402 4331 0 185 2399
-/+ buffers/cache: 817 6915
Swap: 1023 0 1023
In the yarn-site.xml file, I have the following properties set:
<property>
<name>yarn.scheduler.maximum-allocation-mb</name>
<value>7733</value>
</property>
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>7733</value>
</property>
In $SPARK_HOME/conf/spark-defaults.conf I am setting spark.executor.cores to 2 and spark.executor.instances to 2.
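For reference, the same settings can equivalently be passed on the command line (a sketch; the application class and jar names are placeholders):

# Equivalent explicit flags (placeholder app class/jar)
spark-submit \
  --master yarn \
  --num-executors 2 \
  --executor-cores 2 \
  --executor-memory 5g \
  --class com.example.MyApp myapp.jar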
When looking at the spark-history UI after running my spark application, both executors (1 and 2) show up in the "Executors" tab along with the driver. In the cores column on that same page, it says 2 for both executors.
When I set the executor memory to 5G or lower, my spark application runs fine with executors on both worker nodes. When I set the executor memory to 6G or more, only one worker node runs an executor. Why does this happen? Note: I have tried increasing yarn.nodemanager.resource.memory-mb and it doesn't change this behavior.

Hadoop TeraSort not using all cluster nodes

Question
Regarding the TeraSort demo in Hadoop, please advise whether the symptom is as expected or whether the workload should be distributed.
Background
Started Hadoop (3 nodes in a cluster) and ran the TeraSort benchmark as shown below under Executions.
I expected all 3 nodes to get busy and all CPUs to be fully utilized (400% in top). However, only the node on which the job started got busy, and even there the CPU was not fully utilized. For example, if the job is started on sydspark02, top shows the output below.
I wonder whether this is as expected or whether there is a configuration issue because of which the workload is not distributed among the nodes.
sydspark02
top - 13:37:12 up 5 days, 2:58, 2 users, load average: 0.22, 0.06, 0.12
Tasks: 134 total, 1 running, 133 sleeping, 0 stopped, 0 zombie
%Cpu(s): 27.5 us, 2.7 sy, 0.0 ni, 69.8 id, 0.0 wa, 0.0 hi, 0.1 si, 0.0 st
KiB Mem: 8175980 total, 1781888 used, 6394092 free, 68 buffers
KiB Swap: 0 total, 0 used, 0 free. 532116 cached Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
2602 hadoop 20 0 2191288 601352 22988 S 120.7 7.4 0:15.52 java
1197 hadoop 20 0 105644 976 0 S 0.3 0.0 0:00.16 sshd
2359 hadoop 20 0 2756336 270332 23280 S 0.3 3.3 0:08.87 java
sydspark01
top - 13:38:32 up 2 days, 19:28, 2 users, load average: 0.15, 0.07, 0.11
Tasks: 141 total, 1 running, 140 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.2 us, 1.2 sy, 0.0 ni, 96.6 id, 2.1 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem: 10240364 total, 10092352 used, 148012 free, 648 buffers
KiB Swap: 0 total, 0 used, 0 free. 8527904 cached Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
10635 hadoop 20 0 2766264 238540 23160 S 3.0 2.3 0:11.15 java
11353 hadoop 20 0 2770680 287504 22956 S 1.0 2.8 0:08.97 java
11057 hadoop 20 0 2888396 327260 23068 S 0.7 3.2 0:12.42 java
sydspark03
top - 13:44:21 up 5 days, 1:01, 1 user, load average: 0.00, 0.01, 0.05
Tasks: 124 total, 1 running, 123 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.7 us, 0.0 sy, 0.0 ni, 99.3 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem: 8175980 total, 4552876 used, 3623104 free, 1156 buffers
KiB Swap: 0 total, 0 used, 0 free. 3818884 cached Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
29374 hadoop 20 0 2729012 204180 22952 S 3.0 2.5 0:07.47 java
Executions
> sbin/start-dfs.sh
Starting namenodes on [sydspark01]
sydspark01: starting namenode, logging to /usr/local/hadoop-2.7.3/logs/hadoop-hadoop-namenode-sydspark01.out
sydspark03: starting datanode, logging to /usr/local/hadoop-2.7.3/logs/hadoop-hadoop-datanode-sydspark03.out
sydspark02: starting datanode, logging to /usr/local/hadoop-2.7.3/logs/hadoop-hadoop-datanode-sydspark02.out
sydspark01: starting datanode, logging to /usr/local/hadoop-2.7.3/logs/hadoop-hadoop-datanode-sydspark01.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /usr/local/hadoop-2.7.3/logs/hadoop-hadoop-secondarynamenode-sydspark01.out
> sbin/start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /usr/local/hadoop-2.7.3/logs/yarn-hadoop-resourcemanager-sydspark01.out
sydspark01: starting nodemanager, logging to /usr/local/hadoop-2.7.3/logs/yarn-hadoop-nodemanager-sydspark01.out
sydspark03: starting nodemanager, logging to /usr/local/hadoop-2.7.3/logs/yarn-hadoop-nodemanager-sydspark03.out
sydspark02: starting nodemanager, logging to /usr/local/hadoop-2.7.3/logs/yarn-hadoop-nodemanager-sydspark02.out
> hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar teragen 100000000 /user/hadoop/terasort-input
> hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar terasort /user/hadoop/terasort-input /user/hadoop/terasort-output
Configuration files
slaves
sydspark01
sydspark02
sydspark03
core-site.xml
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/hadoop/tmp</value>
</property>
<property>
<name>fs.defaultFS</name>
<value>hdfs://sydspark01:9000</value>
</property>
</configuration>
Environment
Ubuntu 14.04.4 LTS 4CPU on VMWare
Hadoop 2.7.3
Java 8
Monitoring with JMC
The following is from running JMC on the DataNode process on the node where the job is executed.
CPU
Only about 25% of the CPU resources (1 CPU out of 4) are used.
Memory
Yarn
$ yarn node -list
16/10/03 15:36:03 INFO client.RMProxy: Connecting to ResourceManager at sydspark01/143.96.102.161:8032
Total Nodes:3
Node-Id Node-State Node-Http-Address Number-of-Running-Containers
sydspark03:57249 RUNNING sydspark03:8042 0
sydspark02:42220 RUNNING sydspark02:8042 0
sydspark01:50445 RUNNING sydspark01:8042 0
yarn-site.xml
<configuration>
<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>sydspark01:8030</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>sydspark01:8032</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>sydspark01:8088</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>sydspark01:8031</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>sydspark01:8033</value>
</property>
</configuration>

Not enough ram to run whole docker-compose stack

Our microservice stack has now crept up to 15 small services for business logic like Auth, messaging, billing, etc. It's now getting to the point where a docker-compose up uses more ram than our devs have on their laptops.
It's not a crazy amount, about 4GB, but I regularly feel the pinch on my 8GB machine (thanks, Chrome).
There are app-level optimisations that we can do, and are doing, sure, but eventually we are going to need an alternative strategy.
I see two obvious options:
Use a big cloudy dev machine, perhaps provisioned with docker-machine and AWS.
Spin up some of the services, like Postgres and Redis, in a shared dev cloud.
These aren't very satisfactory: in (1), local files aren't synced, making local dev a nightmare, and in (2) we can break each other's environments.
Help!
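For what it's worth, option (1) would look roughly like this (a sketch; the instance type, region and machine name are placeholders):

# Provision a remote Docker host on AWS and point the local client at it
docker-machine create --driver amazonec2 \
  --amazonec2-instance-type m4.xlarge \
  --amazonec2-region eu-west-1 \
  dev-box
eval $(docker-machine env dev-box)
docker-compose up -d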
Appendix I: output from docker stats
CONTAINER CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O
0ea1779dbb66 32.53% 137.9 MB / 8.186 GB 1.68% 46 kB / 29.4 kB 42 MB / 0 B
12e93d81027c 0.70% 376.1 MB / 8.186 GB 4.59% 297.7 kB / 243 kB 0 B / 1.921 MB
25f7be321716 34.40% 131.1 MB / 8.186 GB 1.60% 38.42 kB / 23.91 kB 39.64 MB / 0 B
26220cab1ded 0.00% 7.274 MB / 8.186 GB 0.09% 19.82 kB / 648 B 6.645 MB / 0 B
2db7ba96dc16 1.22% 51.29 MB / 8.186 GB 0.63% 10.41 kB / 578 B 28.79 MB / 0 B
3296e274be54 0.00% 4.854 MB / 8.186 GB 0.06% 20.07 kB / 1.862 kB 4.069 MB / 0 B
35911ee375fa 0.27% 12.87 MB / 8.186 GB 0.16% 29.16 kB / 6.861 kB 7.137 MB / 0 B
49eccc517040 37.31% 65.76 MB / 8.186 GB 0.80% 31.53 kB / 18.49 kB 36.27 MB / 0 B
6f23f114c44e 31.08% 86.5 MB / 8.186 GB 1.06% 37.25 kB / 29.28 kB 34.66 MB / 0 B
7a0731639e31 30.64% 66.21 MB / 8.186 GB 0.81% 31.1 kB / 19.39 kB 35.6 MB / 0 B
7ec2d73d3d97 0.00% 10.63 MB / 8.186 GB 0.13% 8.685 kB / 834 B 10.4 MB / 12.29 kB
855fd2c80bea 1.10% 46.88 MB / 8.186 GB 0.57% 23.39 kB / 2.423 kB 29.64 MB / 0 B
9993de237b9c 40.37% 170 MB / 8.186 GB 2.08% 19.75 kB / 1.461 kB 52.71 MB / 12.29 kB
a162fbf77c29 24.84% 128.6 MB / 8.186 GB 1.57% 59.82 kB / 54.46 kB 37.81 MB / 0 B
a7bf8b64d516 43.91% 106.1 MB / 8.186 GB 1.30% 46.33 kB / 31.36 kB 35 MB / 0 B
aae18e01b8bb 0.99% 44.16 MB / 8.186 GB 0.54% 7.066 kB / 578 B 28.12 MB / 0 B
bff9c9ee646d 35.43% 71.65 MB / 8.186 GB 0.88% 63.3 kB / 68.06 kB 45.53 MB / 0 B
ca86faedbd59 38.09% 104.9 MB / 8.186 GB 1.28% 31.84 kB / 18.71 kB 36.66 MB / 0 B
d666a1f3be5c 0.00% 9.286 MB / 8.186 GB 0.11% 19.51 kB / 648 B 6.621 MB / 0 B
ef2fa1bc6452 0.00% 7.254 MB / 8.186 GB 0.09% 19.88 kB / 648 B 6.645 MB / 0 B
f20529b47684 0.88% 41.66 MB / 8.186 GB 0.51% 12.45 kB / 648 B 23.96 MB / 0 B
We have been struggling with this issue as well, and still don't really have an ideal solution. However, we have two ideas that we are currently debating.
Run a "Dev" environment in the cloud, which is constantly updated with the master/latest version of every image as it is built. Then each individual project can proxy to that environment in their docker-compose.yml file... so they are running THEIR service locally, but all the dependencies are remote. An important part of this (from your question) is that you have shared dependencies like databases. This should never be the case... never integrate across the database. Each service should store its own data.
Each service is responsible for building a "mock" version of its app that can be used for local dev and medium-level integration tests. The mock versions shouldn't have dependencies, and should enable someone to only need a single layer from their service (the 3 or 4 mocks, instead of the 3 or 4 real services, each with 3 or 4 of their own, and so on).
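One way to wire up idea (1) in practice is a compose override file that points dependency URLs at the remote dev environment, so only your own service runs locally (a sketch; the override file name, service name and environment URL are placeholders):

# docker-compose.remote.yml would override env vars such as AUTH_URL to point at
# the shared dev environment (e.g. https://auth.dev.example.com); then run only
# your own service locally with the merged configuration:
docker-compose -f docker-compose.yml -f docker-compose.remote.yml up my-service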
