I'm trying to set up a new cookbook for Cassandra. The cassandra.yaml file has the following comments about optimal settings:
# For workloads with more data than can fit in memory, Cassandra's
# bottleneck will be reads that need to fetch data from
# disk. "concurrent_reads" should be set to (16 * number_of_drives) in
# order to allow the operations to enqueue low enough in the stack
# that the OS and drives can reorder them.
#
# On the other hand, since writes are almost never IO bound, the ideal
# number of "concurrent_writes" is dependent on the number of cores in
# your system; (8 * number_of_cores) is a good rule of thumb.
However, I can't predefine the number of cores or the number of disk drives in the attributes, because the deployed servers may have different hardware.
Is it possible to dynamically override the attributes based on the deployed hardware? I read the Opscode docs and I don't think there's a way to capture the output of
cat /proc/cpuinfo | grep processor | wc -l
I was thinking about something like this:
cookbook-cassandra/recipes/default.rb
cores = command "cat /proc/cpuinfo | grep processor | wc -l"
node.default["cassandra"]["concurrent_reads"] = cores*8
node.default["cassandra"]["concurrent_writes"] = cores*8
cookbook-cassandra/attributes/default.rb
default[:cassandra] = {
  ...
  # determined by 8 * number of cores
  :concurrent_reads => 16,
  :concurrent_writes => 16,
  ...
}
You can capture stdout in Chef with mixlib-shellout (documentation here: https://github.com/opscode/mixlib-shellout).
In your example, you could do something like:
require 'mixlib/shellout'

cc = Mixlib::ShellOut.new("cat /proc/cpuinfo | grep processor | wc -l")
cores = cc.run_command.stdout.to_i # runs the command, captures stdout, converts it to an integer
I have found a way to do this in a recipe, though I haven't deployed it to any box to verify it yet.
num_cores = Integer(`cat /proc/cpuinfo | grep processor | wc -l`)
if num_cores > 8 # sanity check: only override the defaults on larger boxes
  node.default["cassandra"]["concurrent_reads"] = (8 * num_cores)
  node.default["cassandra"]["concurrent_writes"] = (8 * num_cores)
end
I am using Chef 11, so this may not be available in previous versions, but there is a node['cpu'] attribute with information about the CPUs, cores, etc.
chef > x = nodes.show 'nodename.domain'; true
=> true
chef > x['cpu']['total']
=> 16
And you can use it in your recipes. That's how the Nginx cookbook does it.
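That data is collected by Ohai at the start of every Chef run. If you want to sanity-check what a given box will report before relying on it in a recipe, a minimal check from the node's shell (assuming the ohai binary that ships with Chef is on the PATH):
# Print the CPU data Ohai gathers on this host; node['cpu'] in a recipe
# is populated from exactly this structure
ohai cpu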
I'm trying to use the kernel's cpusets to isolate my process. To do this, I followed the instructions (2.1 Basic Usage) from the kernel cpusets doc; however, it didn't work in my environment.
I tried on both my CentOS 7 server and my Ubuntu 16.04 work PC, but it didn't work on either.
CentOS kernel version:
[root@node ~]# uname -r
3.10.0-327.el7.x86_64
Ubuntu kernel version:
4.15.0-46-generic
What I have tried is as follows.
root@Latitude:/sys/fs/cgroup/cpuset# pwd
/sys/fs/cgroup/cpuset
root@Latitude:/sys/fs/cgroup/cpuset# cat cpuset.cpus
0-3
root@Latitude:/sys/fs/cgroup/cpuset# cat cpuset.mems
0
root@Latitude:/sys/fs/cgroup/cpuset# cat cpuset.cpu_exclusive
1
root@Latitude:/sys/fs/cgroup/cpuset# cat cpuset.mem_exclusive
1
root@Latitude:/sys/fs/cgroup/cpuset# find . -name cpuset.cpu_exclusive | xargs cat
0
0
0
0
0
1
root@Latitude:/sys/fs/cgroup/cpuset# mkdir my_cpuset
root@Latitude:/sys/fs/cgroup/cpuset# echo 1 > my_cpuset/cpuset.cpus
root@Latitude:/sys/fs/cgroup/cpuset# echo 0 > my_cpuset/cpuset.mems
root@Latitude:/sys/fs/cgroup/cpuset# echo 1 > my_cpuset/cpuset.cpu_exclusive
bash: echo: write error: Invalid argument
root@Latitude:/sys/fs/cgroup/cpuset#
It just printed the error bash: echo: write error: Invalid argument.
Googling it didn't turn up a correct answer either.
As I pasted above, before my operation I confirmed that the cpuset root path has cpu_exclusive enabled and that none of the cpus are exclusively claimed by another sub-cpuset.
By using ps -o pid,psr,comm -p $PID, I can confirm that the cpus can be assigned to a process if I don't care about cpu_exclusive. But I have also observed that if cpu_exclusive is not set, the same cpus can be assigned to other processes as well.
I don't know whether some pre-setting is missing.
What I expect is to use cpusets to obtain exclusive use of cpus. Can anybody give any clues?
Thanks very much.
I believe this is a misunderstanding of the cpu_exclusive flag, as it was for me. Here is the doc https://www.kernel.org/doc/Documentation/cgroup-v1/cpusets.txt, quoting:
If a cpuset is cpu or mem exclusive, no other cpuset, other than
a direct ancestor or descendant, may share any of the same CPUs or
Memory Nodes.
So one possible reason you got bash: echo: write error: Invalid argument is that some other cgroup cpuset is enabled and conflicts with your echo 1 > my_cpuset/cpuset.cpu_exclusive.
Please run find . -name cpuset.cpus | xargs cat to list the target cpus of all your cgroups.
Assume you have 12 cpus: if you want to set cpu_exclusive on my_cpuset, you need to carefully modify all the other cgroups to use, e.g., cpus 0-7, and then set the cpus of my_cpuset to 8-11. After all these cpu configurations, you can set cpu_exclusive to 1, as in the sketch below.
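A minimal sketch of that sequence on a hypothetical 12-cpu host, where docker is assumed to be the only conflicting sibling cgroup (your cgroup names and cpu ranges will differ):
cd /sys/fs/cgroup/cpuset
# Shrink the conflicting sibling so it no longer overlaps cpus 8-11
# (this write fails if any of its child cpusets still use those cpus)
echo 0-7 > docker/cpuset.cpus
mkdir -p my_cpuset
echo 8-11 > my_cpuset/cpuset.cpus         # now unshared by any sibling
echo 0 > my_cpuset/cpuset.mems
echo 1 > my_cpuset/cpuset.cpu_exclusive   # succeeds once nothing overlaps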
But still, other processes can use cpus 8-11; it is only tasks belonging to the other cgroups that will not use them.
In my case, I had some Docker containers running, which prevented me from setting cpu_exclusive on my cpuset.
Going by the kernel doc, I do not think a cgroup by itself can get truly exclusive use of cpus. One approach (which I know runs in production) is to isolate the cpus at boot and manage cpu affinity/cpusets ourselves, as sketched below.
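A hedged sketch of that approach (the cpu numbers and the workload name are illustrative):
# 1. Reserve cpus 2-3 at boot by adding isolcpus=2,3 to the kernel
#    command line (e.g. GRUB_CMDLINE_LINUX in /etc/default/grub), so
#    the scheduler keeps ordinary tasks off them.
# 2. After a reboot, pin the workload onto the reserved cpus:
taskset -c 2,3 ./my_workload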
I just want to share a small script that I made to enhance the docker stats command.
I am not sure about the accuracy of this method.
Can I assume that the total amount of memory consumed by the complete Docker deployment is the sum of each container consumed memory ?
Please share your modifications and/or corrections. This command is documented here: https://docs.docker.com/engine/reference/commandline/stats/
When running docker stats, the output looks like this:
$ docker stats --all --format "table {{.MemPerc}}\t{{.CPUPerc}}\t{{.MemUsage}}\t{{.Name}}"
MEM % CPU % MEM USAGE / LIMIT NAME
0.50% 1.00% 77.85MiB / 15.57GiB ecstatic_noether
1.50% 3.50% 233.55MiB / 15.57GiB stoic_goodall
0.25% 0.50% 38.92MiB / 15.57GiB drunk_visvesvaraya
My script will add the following line at the end:
2.25% 5.00% 350.32MiB / 15.57GiB TOTAL
docker_stats.sh
#!/bin/bash
# This script is used to complete the output of the docker stats command.
# The docker stats command does not compute the total amount of resources (RAM or CPU)
# Get the total amount of RAM, assumes there are at least 1024*1024 KiB, therefore > 1 GiB
HOST_MEM_TOTAL=$(grep MemTotal /proc/meminfo | awk '{print $2/1024/1024}')
# Get the output of the docker stat command. Will be displayed at the end
# Without modifying the special variable IFS, the output of the docker stats command won't keep
# its newlines, resulting in a failure when using awk to process each line
IFS=;
DOCKER_STATS_CMD=`docker stats --no-stream --format "table {{.MemPerc}}\t{{.CPUPerc}}\t{{.MemUsage}}\t{{.Name}}"`
SUM_RAM=`echo $DOCKER_STATS_CMD | tail -n +2 | sed "s/%//g" | awk '{s+=$1} END {print s}'`
SUM_CPU=`echo $DOCKER_STATS_CMD | tail -n +2 | sed "s/%//g" | awk '{s+=$2} END {print s}'`
SUM_RAM_QUANTITY=`LC_NUMERIC=C printf %.2f $(echo "$SUM_RAM*$HOST_MEM_TOTAL*0.01" | bc)`
# Output the result
echo $DOCKER_STATS_CMD
echo -e "${SUM_RAM}%\t\t\t${SUM_CPU}%\t\t${SUM_RAM_QUANTITY}GiB / ${HOST_MEM_TOTAL}GiB\tTOTAL"
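To run it (assuming the script was saved as docker_stats.sh):
# Make the script executable, then run it
chmod +x docker_stats.sh
./docker_stats.sh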
From the documentation that you have linked above,
The docker stats command returns a live data stream for running containers.
To limit data to one or more specific containers, specify a list of container names or ids separated by a space.
You can specify a stopped container but stopped containers do not return any data.
and then furthermore,
Note: On Linux, the Docker CLI reports memory usage by subtracting page cache usage from the total memory usage.
The API does not perform such a calculation but rather provides the total memory usage and the amount from the page cache so that clients can use the data as needed.
According to the docs, it looks like you can assume so, but do not forget that with --all the listing also includes containers that exist but are not running (they report no usage).
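If you want to see the page-cache detail that the CLI folds away, you can query the per-container stats endpoint directly; a hedged sketch, assuming jq is installed and <id> stands in for a real container ID:
# One-shot stats for a single container straight from the Docker API;
# memory_stats carries both the total usage and the page cache figure
curl -s --unix-socket /var/run/docker.sock \
  "http://localhost/containers/<id>/stats?stream=false" |
  jq '.memory_stats | {usage: .usage, cache: .stats.cache}'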
Your docker_stats.sh does the job for me, thanks!
I had to add unset LC_ALL somewhere before LC_NUMERIC is used, though, as the former overrides the latter; otherwise I get this error:
"Zeile 19: printf: 1.7989: Ungültige Zahl." ("line 19: printf: 1.7989: invalid number"). This is probably due to my using a German locale.
There is also a discussion to add this feature to the "docker stats" command itself.
Thanks for sharing the script! I've updated it so it depends on DOCKER_MEM_TOTAL instead of HOST_MEM_TOTAL, as Docker has its own memory limit, which can differ from the host's total memory.
I'm new to Ansible and thus this question may seem silly to more advanced users. I'm not sure if it's possible to do what I'm asking, since Ansible is quite limited when it comes to loops and conditionals.
I'm performing tasks on a Virtual Connect switch, so I'm limited to using the raw module.
I have the following STDOUT:
=========================================================================
Profile Port Network PXE/IP MAC Address Allocated Status
Name Boot Order Speed
(min-max)
=========================================================================
CLO01ES 1 CLO_355 UseBIOS/Au 00-17-A4-77-58-0 -- -- OK
X02 _1 to 0
-------------------------------------------------------------------------
CLO01ES 2 CLO_355 UseBIOS/Au 00-17-A4-77-58-0 -- -- OK
X02 _2 to 2
-------------------------------------------------------------------------
CLO01ES 3 Multipl UseBIOS/Au 00-17-A4-77-58-0 -- -- OK
X02 e to 4
Network
-------------------------------------------------------------------------
CLO01ES 4 Multipl UseBIOS/Au 00-17-A4-77-58-0 -- -- OK
X02 e to 6
Network
-------------------------------------------------------------------------
<omitted>
The issue is that STDOUT can contain multiple lines with different profiles, i.e. I don't know the line numbers or MAC addresses beforehand.
What I want to achieve is a status check: if profile CLO01ESX02 has the Network Name "Multiple Network" twice, I want to skip the task.
Whenever I googled parsing variables or STDOUT, I only got basic answers.
Is this possible with Ansible, or am I forced to write a custom script?
It can't be done directly by any native Ansible module, but the shell module should do the trick (the command module won't work here, because the pipes need a shell):
- name: Check output
  shell: <your_command_here> | grep CLO01ESX02 | grep "Multiple Network" | wc -l
  register: wc
  failed_when: wc.stdout|int > 1
If you would rather skip the follow-up task than fail, you can gate it on the registered count instead, e.g. with when: wc.stdout|int < 2.
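You can also dry-run the counting logic against a captured copy of the switch output first; note that in the sample above the name "Multiple Network" is wrapped across lines, so the pattern may need to match the wrapped fragment (vc_output.txt is a hypothetical capture):
# Count the profile rows that carry the wrapped "Multipl" fragment
grep "CLO01ES" vc_output.txt | grep -c "Multipl"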
I have a script that I'm using to build a config for Icinga 2. The network is large, multiple /13's large. When I run the script, I keep getting the "RTTVAR has grown to over 2.3 seconds, decreasing to 2.0" error. I've tried raising my gc_thresh values and breaking up the subnets. I've dug through the little info Google turns up and can't seem to find a fix. If anyone has any ideas, I'd really appreciate it. I'm on Ubuntu 16.04.
My script:
# Find devices and create IP list
i=72
while [ $i -lt 255 ]
do
echo "$(date) - Scanning xx.$i.0.0/16" >> files/scan.log
nmap -sn --host-timeout 5 xx.$i.0.0/16 -oG - | awk '/Up$/{print $2}' >> files/ip-list
let i=i+1
done
My /etc/sysctl.conf
# Force gc to clean-up quickly
net.ipv4.neigh.default.gc_interval = 3600
# Set ARP cache entry timeout
net.ipv4.neigh.default.gc_stale_time = 3600
# Set ARP cache garbage collection thresholds
net.ipv4.neigh.default.gc_thresh3 = 8192
net.ipv4.neigh.default.gc_thresh2 = 4096
net.ipv4.neigh.default.gc_thresh1 = 2048
Edit: added --host-timeout 5, removed -n
I can suggest using a ping scan. If you want an overall sight of your network, you can use
nmap -sP -n
Note that -sP is the older name for -sn, so any speedup over plain nmap -sn comes from -n, which skips reverse-DNS lookups; you can check it with small examples.
As I said in a comment, use --host-timeout and --max-retries and that will improve your performance.
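For example, a hedged sweep of one /16 (the timeout and retry values are illustrative; tune them for your network):
# Ping-scan with a per-host time cap, a single retry, and no reverse
# DNS lookups; print the addresses of hosts that are up
nmap -sn -n --host-timeout 5s --max-retries 1 xx.72.0.0/16 -oG - | awk '/Up$/{print $2}'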
I have a job running in production which processes XML files.
There are around 4k XML files, 8 to 9 GB in size all together.
After processing we get CSV files as output. I have a cat (backtick) command which merges all the CSV files into a single file, and on it I'm getting:
Errno::ENOMEM: Cannot allocate memory
Below are a few details:
System Memory - 4 GB
Swap - 2 GB
Ruby : 1.9.3p286
Files are processed using nokogiri and saxbuilder-0.0.8.
There is a block of code which processes the 4,000 XML files and saves the output in CSV (1 per XML); sorry, I'm not supposed to share it because of company policy.
Below is the code which merges the output files into a single file:
Dir["#{processing_directory}/*.csv"].sort_by {|file| [file.count("/"), file]}.each {|file|
`cat #{file} >> #{final_output_file}`
}
I've taken memory consumption snapshots during processing. It consumes almost all of the memory, but it won't fail there.
It always fails on the cat command.
I guess that on the backtick call it tries to fork a new process, which doesn't get enough memory, so it fails.
Please let me know your opinion and any alternatives.
So it seems that your system is running pretty low on memory, and spawning a shell plus calling cat is too much for the little memory left.
If you don't mind losing some speed, you can merge the files in Ruby with small buffers.
This avoids spawning a shell, and you can control the buffer size.
This is untested, but you get the idea:
buffer_size = 4096
# Block form closes the files automatically, even if something raises
File.open(final_output_file, 'w') do |output_file|
  Dir["#{processing_directory}/*.csv"].sort_by { |file| [file.count("/"), file] }.each do |file|
    File.open(file) do |f|
      # Copy in fixed-size chunks so memory use stays bounded
      while buffer = f.read(buffer_size)
        output_file.write(buffer)
      end
    end
  end
end
You are probably out of physical memory, so double-check that and verify your swap (free -m). In case you don't have swap space, create some, for example as sketched below.
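A minimal sketch for adding a 2 GiB swap file (path and size are illustrative):
# Create, protect, format and enable a 2 GiB swap file
sudo fallocate -l 2G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile
free -m   # verify the new swap shows up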
Otherwise, if your memory is fine, the error is most likely caused by shell resource limits. You can check them with ulimit -a and change them with ulimit (see: help ulimit), e.g.
ulimit -Sn unlimited && ulimit -Sl unlimited
To make these limits persistent, you can configure them by creating a limits file with the following shell command:
cat | sudo tee /etc/security/limits.d/01-${USER}.conf <<EOF
${USER} soft core unlimited
${USER} soft fsize unlimited
${USER} soft nofile 4096
${USER} soft nproc 30654
EOF
Or use /etc/sysctl.conf to change the limits globally (man sysctl.conf); note the kern.* keys below are the BSD/macOS names, the Linux equivalents live under keys like fs.file-max, e.g.
kern.maxprocperuid=1000
kern.maxproc=2000
kern.maxfilesperproc=20000
kern.maxfiles=50000
I had the same problem, but instead of cat it was sendmail (the mail gem).
I found the problem and the solution, which is to install the posix-spawn gem:
gem install posix-spawn
and here is the example:
require 'posix/spawn'

# Fill most of the available memory so that a plain fork-based
# backtick call would fail with ENOMEM
a = (1..500_000_000).to_a

# posix_spawn creates the child without duplicating the parent's
# address space, so this should succeed even under memory pressure
POSIX::Spawn::spawn('ls')
This time, creating the child process should succeed.
See also: Minimizing Memory Usage for Creating Application Subprocesses at Oracle.