O_TRUNC ignored when writing to the /proc filesystem - linux-kernel

Trying to get rid of Ubuntu's apport by clearing /proc/sys/kernel/core_pattern using
sh -c ': > /proc/sys/kernel/core_pattern' does not work.
It looks like the O_TRUNC flag is ignored when writing to the /proc filesystem:
echo nonsense >| /proc/sys/kernel/core_pattern
strace sh -c ': > /proc/sys/kernel/core_pattern # do not call apport'
...
openat(AT_FDCWD, "/proc/sys/kernel/core_pattern", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3
...
close(3) = 0
cat /proc/sys/kernel/core_pattern
nonsense
Doing the same thing on a regular filesystem instead of the /proc filesystem gives me an empty file.
Is that a kernel bug or a feature, perhaps even a documented one?
Edit: Clearing this setting through sysctl does not work:
sysctl kernel.core_pattern=""
sysctl: malformed setting "kernel.core_pattern="
It seems the sysctl program is unable to clear any kernel parameter, even though man core explicitly states that an empty value disables the mechanism.
Yes, echo >| /proc/sys/kernel/core_pattern works instead, but the object of this question is to find out whether this is a kernel bug, not to find a workaround.

This is because the O_TRUNC flag of the open* family of syscalls merely updates the size of the inode associated with the opened file. This change is performed right after the inode is found and before finalizing the struct file that the kernel then uses for any actual operation on the opened file. Truncation therefore happens before the call to any ->open() file_operations handler implemented by whichever kernel module/driver/subsystem is behind the file (for example, the sysctl subsystem) and is transparent to that handler.
In other words, the file_operations handlers of the virtual sysctl files (e.g. /proc/sys/kernel/*) merely see a file with a size of 0 (the ->i_size field of struct inode); they do not know whether this was the result of a truncation or of a "normal" open, nor should they need such information.
Since sysctl files (like nearly all procfs files) do not bother tracking sizes, for understandable reasons, their functionality is implemented purely in terms of the read and write system calls (which do not update the size in any way either).
Indeed, : > PATH merely does open + close, while a simple echo > PATH writes a newline character after opening, so you observe two different outcomes. You would observe the same behavior as : > PATH using truncate -s 0 PATH, though in that case the truncation is done explicitly after opening, through ftruncate (at least on my system).
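If you want to see the difference yourself, a quick check with strace (a sketch; it assumes coreutils truncate is available and uses /tmp/somefile as a scratch file):
# truncate -s 0 should show an openat() without O_TRUNC followed by an
# explicit ftruncate(fd, 0); ': >' instead folds the truncation into the
# O_TRUNC open flag
strace -e trace=openat,ftruncate truncate -s 0 /tmp/somefile
strace -e trace=openat,ftruncate sh -c ': > /tmp/somefile'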
man core explicitly describes that an empty value is used to disable the mechanism
[...]
Is that a kernel bug or a feature, perhaps even a documented one?
Human-readable/writable files under procfs are usually designed to work in a line-oriented fashion, so I would assume that the term "empty" here simply means that the value of the option becomes empty as a result of writing an empty line to the file. If anything, I would call this an undocumented feature rather than a bug.
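For example, a zero-byte write never reaches the handler at all, while a single newline does (a sketch; needs root, and the second command really does clear the setting):
printf '' > /proc/sys/kernel/core_pattern     # nothing written: value unchanged
printf '\n' > /proc/sys/kernel/core_pattern   # one empty line: value cleared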
Here's some example traces on my system:
root@xxx:~# cat /proc/sys/kernel/core_pattern
|/usr/share/apport/apport %p %s %c %d %P %E
root@xxx:~# strace -f -e openat,write,close,dup2 sh -c ': > /proc/sys/kernel/core_pattern'
...
openat(AT_FDCWD, "/proc/sys/kernel/core_pattern", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3
close(1) = 0
dup2(3, 1) = 1
close(3) = 0
...
+++ exited with 0 +++
root@xxx:~# cat /proc/sys/kernel/core_pattern
|/usr/share/apport/apport %p %s %c %d %P %E
root@xxx:~#
root@xxx:~# cat /proc/sys/kernel/core_pattern
|/usr/share/apport/apport %p %s %c %d %P %E
root@xxx:~# strace -f -e openat,write,close,dup2 sh -c 'echo > /proc/sys/kernel/core_pattern'
...
openat(AT_FDCWD, "/proc/sys/kernel/core_pattern", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3
close(1) = 0
dup2(3, 1) = 1
close(3) = 0
write(1, "\n", 1) = 1
...
+++ exited with 0 +++
root@xxx:~# cat /proc/sys/kernel/core_pattern
root@xxx:~#
If you want to take a look at the actual implementation of open/read/write/close for sysctl files, you can check kernel/sysctl.c in the kernel source tree. There are different tables for the different sysctl facilities, e.g. kernel, vm, etc.
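If you have a kernel source tree at hand, a quick way to locate the core_pattern entry and its handler (a sketch; the exact location of the tables varies between kernel versions):
# Show the core_pattern entry in the sysctl tables together with its
# proc_handler; run from the top of a kernel source tree
grep -n -A 6 'core_pattern' kernel/sysctl.c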

Related

Assigning a macro value in Make from a shell command

The I/O blocksize is going to figure prominently in a Makefile I need to write, so I need a way to calculate it. This script does what I want:
> cat blksz.bash
#!/bin/bash
bsl=$(du --block-size=1 testfile)
bsl=($bsl)
echo ${bsl[0]}
(Someone might have a better way to do it, but if you can bear with me, that's not really the most general point of the question.)
I can call this from my Makefile, and it works fine:
> cat Makefile
BLOCKSIZE := $(shell ./blksz.bash)
blocksize:
echo $(BLOCKSIZE)
> make blocksize
echo 4096
4096
Then I thought: this is such a small script, wouldn't it be better to just put it in the Makefile? But then it no longer works.
> cat Makefile
BLOCKSIZE := $(shell bsl=$(du --block-size=1 testfile) ; \
bsl=($bsl) ; \
echo ${bsl[0]})
blocksize:
echo $(BLOCKSIZE)
> make blocksize
echo
i.e. BLOCKSIZE is never defined. Clearly, I have defined the shell command incorrectly in the Makefile. Can anyone tell me the correct way to do this? Also, there might be better ways to get the block size, but the broader issue of how to get a return value out of a shell command so the Makefile can see it will probably come up again for me at some point, and that is fundamentally what I am trying to figure out.
One last thing, regarding duplicate questions, there are a few similar questions around, but nothing that gets quite at what I am asking, AFAICT. I think what makes this different is that I am using variables within the shell command, and somehow their contents are being lost.
To answer your question directly: $ is a special character to make (it introduces make variables and functions), so if you want to pass it to the shell you have to escape it as $$. On top of that there is the issue of array variables not being available in /bin/sh, as mentioned by @tripleee.
So your command must be:
BLOCKSIZE := $(shell /bin/bash -c 'bsl=$$(du --block-size=1 testfile) ; bsl=($$bsl) ; echo $${bsl[0]}')
Personally I don't like using bash-specific features or awk, but YMMV. A better way to do this, IMO, is to use the stat program:
BLOCKSIZE := $(shell stat -c %s testfile)
If you don't want to stat, then another way would be:
BLOCKSIZE := ${shell set -- $$(du --block-size=1 testfile); echo $$1}
Unless you hack SHELL=/bin/bash your shell commands will be executed by /bin/sh. But you don't need Bash for this.
BLOCKSIZE := $(shell du --block-size=1 testfile | awk '{ print $$1 }')
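If you do want to keep the Bash array version inside the Makefile, another option is to select bash explicitly (a sketch; GNU Make specific, where both $(shell ...) and recipes honor the SHELL variable):
# Run $(shell ...) and recipes under bash instead of /bin/sh
SHELL := /bin/bash

# $ still has to be doubled so make passes a literal $ to the shell
BLOCKSIZE := $(shell bsl=$$(du --block-size=1 testfile); bsl=($$bsl); echo $${bsl[0]})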

Loop over different directories and change of file name

I have the following subject folder structure:
./sub-CC0006/func
..
./sub-CC0199/func
Within each func folder I have a file such as sub-CC0006_ses-core2p2_task-loi3_run-01_events.tsv. When I tried to put the code below in a loop, it did not work. (I tried to first loop over each subject directory and then change the .tsv file names based on the subject number.)
awk -F"\t" -v OFS="\t" '{
for (i=1;i<=NF;i++) {
if ($i == "NaN") $i="n/a"
}
print $0
}' sub-CC0006_ses-core2p2_task-loi3_run-01_events.tsv > sub-CC0006_ses-core2p2_task-loi3_run-01_events_new.tsv &&
mv sub-CC0006_ses-core2p2_task-loi3_run-01_events_new.tsv sub-CC0006_ses-core2p2_task-loi3_run-01_events.tsv
Here is an extract from one of the files I am trying to manipulate:
onset	response_time
9	NaN
12	1.4
The basic technique for overwriting a file with an edited version of the file uses a generic temporary file name as the intermediary file.
I'm assuming that in the sub-CC0199 directory, the func subdirectory will contain sub-CC0199_ses-core2p2_task-loi3_run-01_events.tsv, that any other files in the directory are to be ignored, and similarly for each of the other directories. The script becomes simpler if you simply want to process all the files (or all the *.tsv files, or some other pattern match) in each of the func subdirectories for each of the subjects.
tmpfile=$(mktemp "map.XXXXXX")
trap "rm -f $tmpfile; exit 1" 0 1 2 3 13 15
suffix="_ses-core2p2_task-loi3_run-01_events.tsv"
for directory in sub-CC0???
do
    file="$directory/func/$directory$suffix"
    if [ -f "$file" ]
    then
        awk '…' "$file" > "$tmpfile" &&
        mv "$tmpfile" "$file"
    fi
done
rm -f "$tmpfile" # Remove the temporary
trap 0 # Cancel the 'exit' trap; the script exits with status 0
If you're worried about preserving links (or ownership, or permissions) on the original file, or that the original file might be a symlink you want to preserve, you can use cp "$tmpfile" "$file"; rm -f "$tmpfile" instead of mv. It's slightly slower, though unless the files are big, probably not measurably so.
You could generate the temporary file name within the loop; it might be marginally safer to do so if you're worried about malicious actors. The file is new (it did not exist before) when created by mktemp, but after you've moved it, a malicious person could create their own symlink under that name pointing somewhere sensitive, so the script could damage other files unexpectedly. (You could also copy the temporary file over the original without removing the temporary, so the same file is used for each .tsv file; the options are legion.) You're probably not working in an environment that hostile, though.
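A sketch of that more defensive variant (same loop as above, with mktemp moved inside the loop and cp used instead of mv; the trap handling from the original script is omitted for brevity):
suffix="_ses-core2p2_task-loi3_run-01_events.tsv"
for directory in sub-CC0???
do
    file="$directory/func/$directory$suffix"
    if [ -f "$file" ]
    then
        # Fresh temporary for every file, so a stale name can't be hijacked
        tmpfile=$(mktemp "map.XXXXXX") || exit 1
        awk '…' "$file" > "$tmpfile" &&
        cp "$tmpfile" "$file"    # cp keeps $file's inode, links and permissions
        rm -f "$tmpfile"
    fi
done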
The trap list is for "EXIT" (0) and signals 1 (SIGHUP), 2 (SIGINT), 3 (SIGQUIT), 13 (SIGPIPE) and 15 (SIGTERM). I learned to script when only the numbers worked — and they're compact. If you want to be slightly more modern, you could list the short names of the signals and conditions:
trap "rm -f $tmpfile; exit 1" EXIT HUP INT QUIT PIPE TERM
…
trap EXIT
or (to cancel multiple traps, though it's unnecessary when the script is about to exit):
trap - EXIT HUP INT QUIT PIPE TERM

Bash: How to assign output of command that ends with segmentation fault to variable

I am using a small program, written by someone else in bash, that runs via cron on my Synology NAS; basically it searches for subtitles for my movie collection and converts their encoding to UTF-8 if needed.
In general the main bash script calls several subscripts, and unfortunately it doesn't work 100% as it should. During my investigation I have narrowed the problem down to this specific function in one of the subscripts:
subs_getCharset_SO() {
    local file="$1"
    local charset=
    local et=
    tools_isDetected "file" || return $G_RETFAIL
    et=$(file \
        --brief \
        --mime-encoding \
        --exclude apptype \
        --exclude tokens \
        --exclude cdf \
        --exclude compress \
        --exclude elf \
        --exclude soft \
        --exclude tar \
        "$file" | wrappers_lcase_SO) || {
        return $G_RETFAIL
    }
    case "$et" in
        *utf*) charset="UTF8";;
        *iso*) charset="ISO-8859-2";;
        us-ascii) charset="US-ASCII";;
        csascii) charset="CSASCII";;
        *ascii*) charset="ASCII";;
        *) charset="WINDOWS-1250";;
    esac
    echo "$charset"
}
It turns out that running the file command on any movie file always causes a segmentation fault. I reproduced it by running this command manually in a terminal:
admin@Synek:/volume1/video/Filmy/Ghostland.2018$ file --brief --mime-encoding Ghostland.2018.txt
The output is:
utf-8
Segmentation fault
So my main problem, as I see it, is that the output of the file command is not assigned to the et variable. Ideally I would like to capture the first line of the output and assign it to the et variable, or at least redirect the output to a file. So far I have tried some solutions that I found on the web:
admin@Synek:/volume1/video/Filmy/Ghostland.2018$ { file --brief --mime-encoding ./Ghostland.2018.txt; } 2> log
which prints in the terminal just the line that I need and omits the Segmentation fault message:
utf8
Running:
admin@Synek:/volume1/video/Filmy/Ghostland.2018$ cat log
Gives:
Segmentation fault
But I just can't find a way to get the first line, the one printed before Segmentation fault, written to the log file.
Any help appreciated!
When stdout is a TTY, GNU libc (like most implementations) configures line buffering by default, so output written through the standard C library is flushed whenever a full line is complete (it's assumed that a human is watching and wants to see results as soon as they're available, even if that makes overall execution take longer). By contrast, when stdout is a FIFO or a file, a larger output buffer is used for better efficiency.
Because a SIGSEGV doesn't give a program a chance to flush its buffers, any data still in the buffer at the time of the failure is lost.
On a system with GNU coreutils, you can configure unbuffered or line-buffered stdout using the tool stdbuf (it only changes the default; programs can still override it):
result=$(stdbuf -o0 file --brief --mime-encoding ./Ghostland.2018.txt)
...or, on systems without GNU coreutils but with expect installed, you can use the tool unbuffer:
result=$(unbuffer file --brief --mime-encoding ./Ghostland.2018.txt)
See BashFAQ #9 for more background on buffering and its control from the shell.
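Applied to the function from the question, the fix is essentially one word at the front of the pipeline (a sketch; it assumes stdbuf is actually present on the NAS, which may not be true of a stock Synology system):
# As in subs_getCharset_SO(), but with stdout forced unbuffered so the
# detected charset survives the crash (--exclude options elided here)
et=$(stdbuf -o0 file --brief --mime-encoding "$file" | wrappers_lcase_SO) ||
    return $G_RETFAIL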

Redirecting Multiple stdouts

I have three files in a directory, named One, Two, and Three.
One contains:
asdfg
Two contains:
qwerty
Three contains:
zxcvb
When I give the following command:
$ ls > One > Two > Three
Then I give the following command:
$ cat One
Output:
Nothing
Then I give the following command:
$ cat Two
Output:
Nothing
Then I give the following command:
$ cat Three
Output:
One
Three
Two
Can anyone please shed light on what exactly is happening here? Why do the files One and Two become empty, and why does the output of the ls command get written only to the file Three?
As mentioned here, bash doesn't support demultiplexing, and it doesn't support multiplexing either. The effect you're seeing is that each file is opened with O_TRUNC, which destroys its contents. When multiple redirections are specified, each one in turn is installed as stdout and then replaced by the next; only the last redirection is still in effect when the command runs, so only it receives any data.
Here's an excerpt of strace running the command:
strace -f -e open,dup2,close bash -c 'ls > one > two > three'
...
open("one", O_WRONLY|O_CREAT|O_TRUNC|O_LARGEFILE, 0666) = 3
dup2(3, 1) = 1
close(3) = 0
open("two", O_WRONLY|O_CREAT|O_TRUNC|O_LARGEFILE, 0666) = 3
dup2(3, 1) = 1
close(3) = 0
open("three", O_WRONLY|O_CREAT|O_TRUNC|O_LARGEFILE, 0666) = 3
dup2(3, 1) = 1
close(3) = 0
...
As you can see, bash keeps overwriting the same file descriptor, so only the last file will receive any data. One workaround is to use tee, which writes what it receives on stdin to all of its file arguments as well as to stdout:
ls | tee One Two Three > /dev/null
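A quick check that all three files now contain the same listing (a usage sketch; cmp is silent when the files match):
$ ls | tee One Two Three > /dev/null
$ cmp One Two && cmp Two Three && echo 'all three identical'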
On a side note, zsh does support this:
zsh$ setopt multios
zsh$ ls > One > Two > Three
You can only redirect stdout to one place at a time. Each redirect opens its file for writing (removing any previous contents) and is then closed immediately when the next redirect overrides it.
You seem to be looking for the tee command.

": > file" VS "> file"

Are there any differences between ": > file" and "> file"?
$ : > file.out
$ ls -l file.out
-rw-rw---- 1 user user 0 Mar 18 21:08 file.out
$ > file.out
$ ls -l file.out
-rw-rw---- 1 user user 0 Mar 18 21:08 file.out
: is the shell built-in no-op (null operation). So yeah, redirecting it to a file ends up with an empty file, as does redirecting nothing to a file. There's a sense, I suppose, in which your source is a different kind of nothing, but the result is the same. According to the Advanced Bash-Scripting Guide, the "> file.out" formulation won't work on some systems.
Note that in both cases (unlike "touch") the file contents will be replaced with nothing if the file already exists.
Using : > file.out is more portable to non-bash shells. For instance, zsh defines the default null command as cat, not : (unless it is in an emulation mode). If you ever end up needing to make the script work with a /bin/sh which is not bash (e.g. a *BSD system, any closed-source OS, or even some GNU/Linux distributions now, where dash is used), you'll make your life easier if you use : > file.out.
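For example, in an interactive zsh with default options the bare redirection runs the null command, which is cat, so it sits there copying stdin into the file (a sketch; the exact behavior depends on zsh's NULLCMD parameter and emulation mode):
zsh$ > file.out      (runs the null command: cat > file.out, so it waits for stdin)
zsh$ : > file.out    (truncates immediately, as in bash or sh)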
According to POSIX, both work, but the version with : aborts a non-interactive shell if the redirection fails (because : is a special built-in), while the version with just the redirection merely returns a non-zero exit status. If you want the latter, non-aborting behavior, it is more portable to use true.
Bash only does this right in POSIX mode.
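The difference is easy to provoke with a redirection that must fail, e.g. one pointing into a nonexistent directory (a sketch; dash follows POSIX here, whereas bash does only in POSIX mode, as noted above):
# Special built-in: the failed redirection aborts the non-interactive shell,
# so 'after' is never printed
dash -c ': > /no-such-dir/file.out; echo after'
# Bare redirection: the shell reports the error, sets a non-zero exit status,
# and carries on, so 'after' is printed
dash -c '> /no-such-dir/file.out; echo after'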
Aliases or functions named : violate a POSIX constraint on the application and are not portable.
The only difference I can think of is that you can redefine : via alias or function definitions. For example, you may want to truncate files most of the time (using the default do-nothing behavior of :), but force files to have a standard header in some cases. The form > file cannot be redefined.
For example:
#! /bin/bash
test -n "$ADD_COPYRIGHT" &&
:() { echo "# Copyright (c) 2010 MyName"; echo; }
# Truncate the file to zero size, unless ADD_COPYRIGHT is set, in which case
# the file is truncated to contain only a copyright notice.
: > file
# Add content to the file
some_command >> file
