out of memory kernel crash - linux-kernel

I am facing an out-of-memory (OOM) issue on my system. Under this condition, the Linux OOM killer kills a process (the "bad process"), selected by a specific algorithm, to free up memory.
I want to print memory and process stats just before this condition happens.
mm/oom_kill.c contains the function out_of_memory(). I want to print my stats just before this function proceeds with killing the "bad process". For this I wrote the following bash script:
#!/bin/bash
# Script to print process related info
echo "Vmstat " > OOM_memresults
vmstat >> OOM_memresults
echo >> OOM_memresults
echo "SLABINFO" >> OOM_memresults
cat /proc/slabinfo >> OOM_memresults
echo >> OOM_memresults
echo "Status of process getting killed" >> OOM_memresults
cat /proc/$1/status >> OOM_memresults
Now the problem I am facing is finding a way to call this script from the kernel code.
I cannot use system("scriptname"), as the system() function is not available in the Linux kernel, nor can I use exec() and its variants.
Any ideas how I can call this script from the kernel code, or any other way I can print process and memory related info at any instant from kernel code?
The current macro gives the currently running process and its task_struct, but it is very difficult to pull any useful info out of it.

Related

Bash Script slows down executed program

I have a program where I test different data sets and configurations, and I have a script to execute all of them.
Imagine my code as:
start = omp_get_wtime()
function()
end = omp_get_wtime()
print(end-start)
and the bash script as
for a in "${first_option[#]}"
do
for b in "${second_option[#]}"
do
for c in "${third_option[#]}"
do
printf("$a $b $c \n")
./exe $a $b $c >> logs.out
done
done
done
Now when I execute the exact same configurations by hand, I get results varying from 10 seconds down to 0.05 seconds, but when I execute the script the slow configurations match while, for some reason, I can't get any timings lower than 1 second. All the configurations that take less than a second when run manually get written to the file as 1.001; 1.102; 0.999; etc.
Any ideas of what is going wrong?
Thanks
My suggestion would be to remove the ">> logs.out" to see what happens with the speed.
From there you can try several options:
Replace ">> log.out" with "| tee -a log.out"
Investigate stdbuf and if your code is python, look at "PYTHONUNBUFFERED=1" shell variable. See also: How to disable stdout buffer when running shell
Redirect bash print command with ">&2" (write to stderr) and move ">> log.out" or "| tee -a log.out" behind the last "done"
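A minimal sketch of that last option, assuming the same ./exe and option arrays as in the question: progress messages go to stderr, and the single append to logs.out sits behind the last done so the file is opened only once.
for a in "${first_option[@]}"; do
    for b in "${second_option[@]}"; do
        for c in "${third_option[@]}"; do
            printf '%s %s %s\n' "$a" "$b" "$c" >&2    # progress to stderr, not the log
            ./exe "$a" "$b" "$c"                      # stdout falls through to the loop redirection
        done
    done
done >> logs.out    # one redirection for all runs, behind the last done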
You can probably see what is causing the delay by using:
strace -f -t bash -c "<your bash script>" | tee /tmp/strace.log
With a little luck you will see which system call is causing the delay at the bottom of the screen, but it is a lot of information to process. Alternatively, look for the name of your "./exe" in "/tmp/strace.log" after tracing is done, and then look at the system calls after its invocation (the process start of ./exe) that eat the most time. It could be just many calls ... Don't spend too much time on this if you don't have the stomach for it.

Running into a race-condition, even with a 'wait'

I'm facing a strange race condition in my bash program. I tried duplicating it via a simple enough demo program but, as is true of most attempts to demonstrate timing-related races, I couldn't.
Here's an abstracted version of the program that DOES NOT duplicate the issue, but let me still explain:
# Abstracted version of the original program
# that is NOT able to demo the race.
#
function foo() {
    local instance=$1
    # [A lot of logic here -
    # all foreground commands, nothing in the background.]
    echo "$instance: test" > /tmp/foo.$instance.log
    echo "Instance $instance ended"
}
# Launch the process in background...
#
echo "Launching instance 1"
foo 1 &
# ... and wait for it to complete.
#
echo "Waiting..."
wait
echo "Waiting... done. (wait exited with: $?)"
# This ls command ALWAYS fails in the real
# program in the 1st while-iteration, complaining about
# missing files, but works in the 2nd iteration!
#
# It always works in the very 1st while-iteration of the
# abstracted version.
#
while ! ls -l /tmp/foo.*; do
    :
done
In my original program (and NOT in the above abstracted version), I do see Waiting... done. (wait exited with: 0) on stdout, just as I see in the above version. Yet, the ls -l always fails in the original, but always works in the above abstracted version in the very first while loop iteration.
Also, the ls command fails despite seeing the Instance 1 ended message on stdout. The output is:
$ ./myProgram
Launching instance 1
Waiting...
Waiting... done. (wait exited with: 0)
Instance 1 ended
ls: cannot access '/tmp/foo.*': No such file or directory
/tmp/foo.1
$
I noticed that the while loop can be safely done away with if I put a sleep 1 right before ls in my original program, like so:
# This too works in the original program:
sleep 1
ls -l /tmp/foo.*
Question: Why isn't wait working as expected in my original program? Any suggestions to at least help troubleshoot the problem?
I'm using bash 4.4.19 on Ubuntu 18.04.
EDIT: I just also verified that the call to wait in the original, failing program is exiting with a status code of 0.
EDIT 2: Shouldn't the Instance 1 ended message appear BEFORE Waiting... done. (wait exited with: 0)? Could this be a 'flushing problem' with OS' disk-buffer/cache when dealing with background processes in bash?
EDIT 3: If, instead of the while loop or sleep 1 hacks, I issue a sync command, then, voila, it works! But why should I have to do a sync in one program but not the other?
I noticed that each of the following three hacks works, but I'm not quite sure why:
Hack 1
while ! ls -l /tmp/foo.*; do
    :
done
Hack 2
sleep 1
ls -l /tmp/foo.*
Hack 3
sync
ls -l /tmp/foo.*
Could this be a 'flushing problem' with the OS disk buffer/cache, especially when dealing with background processes, especially in bash? In other words, the call to wait seems to be returning BEFORE the disk cache is flushed (or BEFORE the OS, on its own, realizes it has to and finishes flushing the disk cache).
EDIT: Thanks to @Jon; his was a very close guess and got me thinking in the right direction, along with the age-old, bit-wise tweaking advice from @chepner.
The Real Problem: I was starting foo not directly/plainly, as shown in my inaccurate abstracted version in the original question, but via another launchThread function that, after doing some bookkeeping, would also run foo 1 & in its body. And the call to launchThread was itself suffixed with an &! So my wait was really waiting on launchThread and not on foo! The sleep, sync, and while hacks were merely buying more time for foo to complete, which is why introducing them worked. The following is a more accurate demonstration of the problem, even though you may or may not be able to duplicate it on your own system (due to scheduling/timing variance across systems):
#!/bin/bash -u

function now() {
    date +'%Y-%m-%d %H:%M:%S'
}

function log() {
    echo "$(now) - $@" >> $logDir/log    # Line 1
}

function foo() {
    local msg=$1
    log "$msg"
    echo "  foo ended"
}

function launchThread() {
    local f=$1
    shift
    "$f" "$@" &    # Line 2
}
logDir=/tmp/log
/bin/rm -rf "$logDir"
mkdir -p "$logDir"
echo "Launching foo..."
launchThread foo 'message abc' & # Line 3
echo "Waiting for foo to finish..."
wait
echo "Waiting for foo to finish... done. (wait exited with: $?)"
ls "$logDir"/log*
Output of the above buggy program:
Launching foo...
Waiting for foo to finish...
Waiting for foo to finish... done. (wait exited with: 0)
foo ended
ls: cannot access '/tmp/log/log*': No such file or directory
If I remove the & from EITHER Line 2 OR from Line 3, the program works correctly, with the following as output:
Launching foo...
Waiting for foo to finish...
foo ended
Waiting for foo to finish... done. (wait exited with: 0)
/tmp/log/log
The program also works correctly if I remove the $(now) part from Line 1.
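For the record, a minimal sketch of the corrected tail of the demonstration script, based on the observation above: with the trailing & removed from the launchThread call, foo (which launchThread still starts with & on Line 2) remains a background child of the main shell, so the bare wait really does wait for it.
echo "Launching foo..."
launchThread foo 'message abc'    # Line 3 without the trailing '&'
echo "Waiting for foo to finish..."
wait                              # now actually waits for the backgrounded foo
echo "Waiting for foo to finish... done. (wait exited with: $?)"
ls "$logDir"/log*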

Debugging a file descriptor leak (in the kernel?)

I am working in a relatively large code base where I am seeing a file descriptor leak: processes start complaining that they are not able to open files after I run certain programs.
Though this normally takes about 6 days to happen, I am able to reproduce the problem in 3-4 hours by reducing the value in /proc/sys/fs/file-max to 9000.
There are many processes running at any moment. I have been able to pinpoint a couple of processes that could be causing the leak. However, I don't see any file descriptor leak either through lsof or through /proc/<pid>/fd.
If I kill the processes that I suspect of leaking (they communicate with each other), the leak goes away and the FDs are released.
cat /proc/sys/fs/file-nr in a while(1) loop shows the leak. However, I don't see any leak in any process.
Here is a script I wrote to detect that the leak is happening:
#!/bin/bash
if [ "$#" != "2" ]; then
    name=`basename $0`
    echo "Usage : $name <threshold for number of pids> <check_interval>"
    exit 1
fi

fd_threshold=$1
check_interval=$2
total_num_desc=0

touch pid_monitor.txt
nowdate=`date`
echo "=================================================================================================================================" >> pid_monitor.txt
echo "****************************************MONITORING STARTS AT $nowdate***************************************************" >> pid_monitor.txt

while [ 1 ]
do
    for x in `ps -ef | awk '{ print $2 }'`
    do
        if [ "$x" != "PID" ]; then
            num_fd=`ls -l /proc/$x/fd 2>/dev/null | wc -l`
            pname=`cat /proc/$x/cmdline 2> /dev/null`
            total_num_desc=`expr $total_num_desc + $num_fd`
            if [ $num_fd -gt $fd_threshold ]; then
                echo "Process name $pname($x) and number of open descriptors = $num_fd" >> pid_monitor.txt
            fi
        fi
    done
    total_nr_desc=`cat /proc/sys/fs/file-nr`
    lsof_desc=`lsof | wc -l`
    nowdate=`date`
    echo "$nowdate : Total number of open file descriptors = $total_num_desc lsof desc: = $lsof_desc file-nr descriptor = $total_nr_desc" >> pid_monitor.txt
    total_num_desc=0
    sleep $2
done
./monitor.fd.sh 500 2 &
tail -f pid_monitor.txt
As I mentioned earlier, I don't see any leak in /proc/<pid>/fd for any process, but the leak is happening for sure and the system is running out of file descriptors.
I suspect something in the kernel is leaking. Linux kernel version 2.6.23.
My questions are as follows:
Will 'ls /proc/<pid>/fd' list descriptors opened by any library linked to the process with that pid? If not, how do I determine when there is a leak in a library I am linking to?
How do I confirm whether the leak is in userspace or in the kernel?
If the leak is in the kernel, what tools can I use to debug it?
Any other tips you can give me would be appreciated.
Thanks for going through the question patiently.
Would really appreciate any help.
Found the solution to the problem.
There was a shared memory attach happening in some function, and that function was getting called every 30 seconds. The shared memory segment was never getting detached, hence the descriptor leak. I guess /proc/<pid>/fd doesn't show a shared memory attach as a descriptor, which is why my script was not able to catch the leak.
Which processes start complaining? And what is the error you see? What is the output of your monitoring script?
To open a file you need two things: a file descriptor and a struct file (the open file description). The file descriptor is what userspace uses; inside the kernel it is used to look up the struct file. It's not clear to me which one you are leaking.
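Given the eventual solution (a shared memory attach that was never detached), here is a rough sketch of a complementary check, assuming SysV shared memory and an illustrative sampling interval; it is not part of the original monitoring script.
#!/bin/bash
# Sketch: sample the system-wide file table together with SysV shared memory
# segments, since an shmat() without a matching shmdt() will not show up
# under /proc/<pid>/fd but still pins kernel resources.
interval=${1:-30}    # seconds between samples (illustrative default)
while true; do
    echo "==== $(date) ===="
    echo "file-nr: $(cat /proc/sys/fs/file-nr)"
    # The nattch column counts current attaches per segment; a segment whose
    # nattch keeps climbing points at an attach that is never detached.
    ipcs -m
    sleep "$interval"
done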

Concurrent logging in bash scripts

I am currently trying to figure out why a shell script fails at concurrent logging every once in a while.
I have a shell function like the following:
log()
{
    local l_text=$1
    local l_file="/path/to/logs/$(date +%Y%m%d)_script.log"
    local l_line="$(date +'%Y-%m-%d %H:%M:%S') $(hostname -s) ${l_text}"
    echo ${l_line} >> ${l_file}
}
Now every once in a while this fails with a syntax error:
/path/to/script.sh: command substitution: line 163: syntax error near unexpected token `)'
/path/to/script.sh: command substitution: line 163: `hostname -s) ${l_text}'
The problem is that I have multiple sub-processes, each of which wants to log as well as send traps (during which logging is performed as well). I have debugged the problem and found that it happens when the function is entered three times simultaneously. First the main process enters, then the child. After the date part of l_line has been produced, main gets interrupted by a trap caused by the child, and in this trap it tries to log something. The child and the trap finish their logging nicely, but when main is resumed after the trap it tries to execute the hostname part (presumably) and fails with this error.
So it seems like main does not like being put to sleep while it is producing the $(date +'%Y-%m-%d %H:%M:%S') $(hostname -s) ${l_text} part of the log statement, and it cannot resume nicely. I assumed this would work fine, because I am only using local variables and thread-safe output methods.
Is this a general concurrency problem I am running into here, or is it very specific to the trap mechanism in bash scripts? I know about the caveats of SIGNAL handling in C, so I am aware that only certain operations are allowed in SIGNAL handlers. However, I am not aware whether the same precautions also apply when handling SIGNALs in a bash script. I tried to find documentation on this, but none of the documents I could find gave any indication of problems with SIGNAL handling in scripts.
EDIT:
Here is an actual, simple script that can be used to replicate the problem:
#!/bin/bash
log() {
    local text="$(date +'%Y-%m-%d %H:%M:%S') $(hostname -s) $1"
    echo $text >> /dev/null
}
sub_process() {
    while true; do
        log "Thread is running"
        kill -ALRM $$
        sleep 1
    done
}
trap "log 'received ALRM'" ALRM
sub_process &
sub_process_pid=$!
trap "kill ${sub_process_pid}; exit 0" INT TERM
while true; do
    log "Main is running"
    sleep 1
done
Every once in a while this script will get killed because of a syntax error in line 5. Line 5 is echo $text >> /dev/null, but since the syntax error also mentions the hostname command, similar to the one I posted above, I am assuming there is an off-by-one error as well and the actual error is in line 4, which is local text="$(date +'%Y-%m-%d %H:%M:%S') $(hostname -s) $1".
Does anybody know what to do to the above script to correct it? I already tried moving the construction of the string out into temporary variables:
log() {
    local thedate=$(date +'%Y-%m-%d %H:%M:%S')
    local thehostname=$(hostname -s)
    local text="${thedate} ${thehostname} $1"
    echo $text >> /dev/null
}
This way the error appears less frequently, but it still is present, so this is not a real fix.
I would say that this is definitely a bug in bash and I would encourage you to report it to the bash developers. At the very least, you should never get a syntax error for what is syntactically correct code.
For the record, I get the same results as you with GNU bash, version 4.2.10(1)-release (x86_64-pc-linux-gnu).
I found that you can work around the problem by not calling a function in your trap handler, e.g. replacing
trap "log 'received ALRM'" ALRM
with
trap "echo $(date +'%Y-%m-%d %H:%M:%S') $(hostname -s) received ALRM" ALRM
makes the script stable for me.
I know about the caveats of SIGNAL handling in C, so I am aware that only certain operations are allowed in SIGNAL handlers. However, I am not aware whether the same precautions also apply when handling SIGNALs in a bash script.
I guess you shouldn't have to take special precautions, but apparently in practice you do. Given that the problem seems to go away without the function call, I'm guessing that something in bash either isn't re-entrant where it should be or fails to prevent re-entry in the first place.
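Another workaround in the same spirit, shown as a minimal sketch rather than a tested fix for the original script (the sub_process that sends ALRM is omitted): keep the ALRM handler trivial by only setting a flag, and run the logging, with its command substitutions, in normal control flow.
#!/bin/bash
# Sketch: the handler only records that a signal arrived; the command
# substitutions run outside the handler, in the main loop.
alrm_seen=0
trap 'alrm_seen=1' ALRM

log() {
    local text="$(date +'%Y-%m-%d %H:%M:%S') $(hostname -s) $1"
    echo "$text" >> /dev/null
}

while true; do
    if (( alrm_seen )); then
        alrm_seen=0
        log 'received ALRM'
    fi
    log "Main is running"
    sleep 1
done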

How can I have output from one named pipe fed back into another named pipe?

I'm adding some custom logging functionality to a bash script, and can't figure out why it won't take the output from one named pipe and feed it back into another named pipe.
Here is a basic version of the script (http://pastebin.com/RMt1FYPc):
#!/bin/bash
PROGNAME=$(basename $(readlink -f $0))
LOG="$PROGNAME.log"
PIPE_LOG="$PROGNAME-$$-log"
PIPE_ECHO="$PROGNAME-$$-echo"
# program output to log file and optionally echo to screen (if $1 is "-e")
log () {
    if [ "$1" = '-e' ]; then
        shift
        $@ > $PIPE_ECHO 2>&1
    else
        $@ > $PIPE_LOG 2>&1
    fi
}
# create named pipes if not exist
if [[ ! -p $PIPE_LOG ]]; then
    mkfifo -m 600 $PIPE_LOG
fi
if [[ ! -p $PIPE_ECHO ]]; then
    mkfifo -m 600 $PIPE_ECHO
fi
# cat pipe data to log file
while read data; do
    echo -e "$PROGNAME: $data" >> $LOG
done < $PIPE_LOG &
# cat pipe data to log file & echo output to screen
while read data; do
    echo -e "$PROGNAME: $data"
    log echo $data                        # this doesn't work
    echo -e $data > $PIPE_LOG 2>&1        # and neither does this
    echo -e "$PROGNAME: $data" >> $LOG    # so I have to do this
done < $PIPE_ECHO &
# clean up temp files & pipes
clean_up () {
    # remove named pipes
    rm -f $PIPE_LOG
    rm -f $PIPE_ECHO
}
#execute "clean_up" on exit
trap "clean_up" EXIT
log echo "Log File Only"
log -e echo "Echo & Log File"
I thought the commands on lines 34 & 35 would take the $data from $PIPE_ECHO and output it to $PIPE_LOG, but they don't work. Instead I have to send that output directly to the log file, without going through $PIPE_LOG.
Why is this not working as I expect?
EDIT: I changed the shebang to "bash". The problem is the same, though.
SOLUTION: A.H.'s answer helped me understand that I wasn't using named pipes correctly. I have since solved my problem by not even using named pipes. That solution is here: http://pastebin.com/VFLjZpC3
It seems to me you do not understand what a named pipe really is. A named pipe is not one stream like a normal pipe. It is a series of normal pipes, because a named pipe can be closed, and a close on the producer side might be shown as a close on the consumer side.
The might be part is this: the consumer will read data until there is no more data. No more data means that, at the time of the read call, no producer has the named pipe open. This means that multiple producers can feed one consumer only when there is no point in time without at least one producer. Think of it as a door which closes automatically: if there is a steady stream of people keeping the door open, either by handing the doorknob to the next person or by squeezing through at the same time, the door stays open. But once the door is closed, it stays closed.
A little demonstration should make the difference a little clearer:
Open three shells. First shell:
1> mkfifo xxx
1> cat xxx
no output is shown because cat has opened the named pipe and is waiting for data.
Second shell:
2> cat > xxx
no output, because this cat is a producer which keeps the named pipe open until we tell it to close explicitly.
Third shell:
3> echo Hello > xxx
3>
This producer immediately returns.
First shell:
Hello
The consumer received data, wrote it and - since one more producer keeps the door open - continues to wait.
Third shell
3> echo World > xxx
3>
First shell:
World
The consumer received data, wrote it and - since one more producer keeps the door open - continues to wait.
Second Shell: write into the cat > xxx window:
And good bye!
(control-d key)
2>
First shell
And good bye!
1>
The ^D key closed the last producer, the cat > xxx, and hence the consumer exits also.
In your case this means:
Your log function will try to open and close the pipes multiple times. Not a good idea.
Both your while loops exit earlier than you think. (Check this with ( while ... done < $PIPE_X; echo FINISHED; ) &.)
Depending on the scheduling of your various producers and consumers, the door might slam shut sometimes and sometimes not - you have a race condition built in. (For testing you can add a sleep 1 at the end of the log function.)
You "testcases" only tries each possibility once - try to use them multiple times (you will block, especially with the sleeps ), because your producer might not find any consumer.
So I can explain the problems in your code but I cannot tell you a solution because it is unclear what the edges of your requirements are.
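To illustrate the door metaphor, here is a small sketch (with illustrative names and paths, not a drop-in fix for the posted script) of a common way to keep the door open: the script holds its own write descriptor on the fifo for the whole run, so the reader never sees end-of-file between short-lived producers.
PIPE_LOG=/tmp/demo-log.pipe    # illustrative path
LOG=/tmp/demo.log
mkfifo -m 600 "$PIPE_LOG"

# Start the consumer first; its open-for-read blocks until a writer appears.
while read -r data; do
    echo "logger: $data" >> "$LOG"
done < "$PIPE_LOG" &

# Hold a write descriptor open for the whole run, so the fifo always has at
# least one producer and the reader never sees end-of-file between writers.
exec 3> "$PIPE_LOG"

echo "Log File Only"   > "$PIPE_LOG"
echo "Echo & Log File" > "$PIPE_LOG"

exec 3>&-    # drop the held descriptor: now the reader sees EOF and exits
wait         # let the background reader drain and finish
rm -f "$PIPE_LOG"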
It seems the problem is in the "cat pipe data to log file" part.
Let's see: you use a "&" to put the loop in the background, I guess you mean it must run in parallel with the second loop.
But the problem is you don't even need the "&", because as soon as no more data is available in the fifo, the while..read stops (still, you've got to have some data at first for the first read to work). The next read doesn't hang if no more data is available (which would pose another problem: how does your program stop?).
I guess the while read checks if more data is available in the file before doing the read and stops if it's not the case.
You can check with this sample:
mkfifo foo
while read data; do echo $data; done < foo
This script will hang, until you write anything from another shell (or bg the first one). But it ends as soon as a read works.
Edit:
I've tested on RHEL 6.2 and it works as you say (i.e.: badly!).
The problem is that, after running the script (let's say script "a"), you've got an "a" process remaining. So, yes, in some way the script hangs as I wrote before (not as stupid an answer as I thought then :) ). Except if you write only one log (be it log file only or echo; in that case it works).
(It's the read loop from PIPE_ECHO that hangs when writing to PIPE_LOG and leaves a process running each time).
I've added a few debug messages, and here is what I see:
only one line is read from PIPE_LOG and after that, the loop ends
then a second message is sent to PIPE_LOG (after being received from PIPE_ECHO), but the process no longer reads from PIPE_LOG => the write hangs.
When you ls -l /proc/[pid]/fd, you can see that the fifo is still open (but deleted).
In fact, the script exits and removes the fifos, but there is still one process using them.
If you don't remove the log fifo at the cleanup and cat it, it will free the hanging process.
Hope it will help...
