Made a fifo file without using mkfifo or mknod

I was trying to stream mp3 music through gnuradio using vlc and mpg123 player. Following this site's example
http://www.opendigitalradio.org/Simple_FM_transmitter_using_gnuradio
The commands are:
$ mkfifo stream_32k.fifo
$ mpg123 -r32000 -m -s http://maxxima.mine.nu:8000 >stream_32k.fifo
Using my own mp3 stream, I followed the example; however, there was one time I FORGOT to enter
$ mkfifo stream_32k.fifo
in the terminal and instead only typed
$ mpg123 -r32000 -m -s http://localhost:8080/mp3 >stream_32k.fifo
directly to the terminal. The result was a .fifo file that is not highlighted (like the one created with mkfifo)
When using it with gnuradio, the fifo file made with mkfifo could only be played once and its size would always return back to 0 bytes.
While the one I accidentally created without using mkfifo kept the bytes for a long time and i could access it anytime i wanted which proved more beneficial to me.
Is there a disadvantage in making fifos this way? Also can somebody please tell me what I actually did?
Thank you so much!

You just created a regular file. As such it kept the bytes on disk, whereas a real FIFO has nothing to do with permanent disk storage: it is essentially a buffer in memory which you give a name in the filesystem so that file-oriented commands can work with it. The disadvantage is that while a regular file is still being written you cannot reliably read from it at the same time (generally speaking; it depends on how the writing program actually writes, but you cannot rely on it).
Having .fifo in the file name does not make a file a FIFO; the mkfifo utility is what attaches a filename to a FIFO.
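For example, you can check what kind of file a name actually refers to with ls -l or test -p:
ls -l stream_32k.fifo     # a FIFO is listed with 'p' as the first character of the mode; a regular file with '-'
test -p stream_32k.fifo && echo "named pipe" || echo "not a named pipe"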
If you want to keep the file and play the stream at the same time you can use a utility like tee:
mkfifo stream.fifo
mpg123 ...... | tee saved_stream.mp3 > stream.fifo
And then play from stream.fifo like you always do. Tee will 'capture' the bytes passing through it and save them to disk.

Related

How to stream one file to multiple pipelines efficiently

I have a script that wants to run several programs / pipelines over a very large file. Example:
grep "ABC" file > file.filt
md5sum file > file.md5
The kernel will try to cache the file in RAM, so if it is read again soon it may be a copy from RAM. However the files are large, and the programs run at wildly different speeds, so this is unlikely to be effective. To minimise IO usage I want to read the file once.
I know of 2 ways to duplicate the data using tee and moreutils' pee:
<file tee >(md5sum > file.md5) | grep "ABC" > file.filt
<file pee 'md5sum > file.md5' 'grep "ABC" > file.filt'
Is there another 'best' way? Which method will make the fewest copies? Does it make a difference which program is >() or |-ed to? Will any of these approaches attempt to buffer data in RAM if one program is too slow? How do they scale to many reader programs?
tee (command) opens each file using fopen, but sets _IONBF (unbuffered) on each. It reads from stdin, and fwrites to each FILE*.
pee (command) popens each command, sets each to unbuffered, reads from stdin, and fwrites to each FILE*.
popen uses pipe(2), which has a capacity of 65536 bytes. Writes to a full buffer will block (the sketch at the end of this answer demonstrates this). pee also uses /bin/sh to interpret the command, but I think that will not add any buffering/copying.
mkfifo (command) uses mkfifo (libc), which uses pipes underneath; opening the file/pipe blocks until the other end is opened.
bash's >() (and <()) syntax (subst.c:5712) uses either pipe or mkfifo: pipe if /dev/fd is supported. It does not use the C fopen calls, so it does not set any buffering.
So all three variants (pee, tee >(), mkfifo ...) should end up with identical behaviour, reading from stdin and writing to pipes without buffering. The data is duplicated at each read (from kernel to user), and then again at each write (user back to kernel); I think tee's fwrites will not cause an extra layer of copying (as there is no buffer). Memory usage could increase to a maximum of 65536 * num_readers + 1 * read_size (if no one is reading). tee writes to stdout first, then to each file/pipe in order.
Given this, pee just works around other shells' (fish!) lack of a >() equivalent, so there seems to be no need for it with bash. I prefer tee when you have bash, but pee is nice when you don't. The bash <() syntax is not replaced by pee, of course.
pee could probably be changed to use the tee(2) system call (instead of fwrite). I think this would cause the input to be read at the speed of the fastest reader, and potentially fill up the kernel buffers.
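The 65536-byte pipe capacity and the blocking on a full pipe can be seen directly from the shell; a rough sketch (the sizes and sleep durations are arbitrary):
mkfifo p
sleep 5 < p &                                # hold the read end open without reading
dd if=/dev/zero of=p bs=1k count=128 2>/dev/null &
sleep 1; jobs                                # dd is still running: it filled the ~64 KiB pipe buffer and blocked
cat p > /dev/null                            # draining the pipe lets dd write the rest and exit
rm p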
AFAIK, there is no "best" way to achieve this. But I can give you another approach: more verbose, not a one-liner, but maybe clearer because each command is written on its own line. Use named pipes:
mkfifo tmp1 tmp2
tee tmp1 > tmp2 < file &
cat tmp1 | md5sum > file.md5 &
cat tmp2 | grep "ABC" > file.filt &
wait
rm tmp1 tmp2
Create as many named pipes as there are commands to be run.
tee the input file to the named pipes (tee writes its input to standard output, so the last named pipe must be a redirection), and let it run in the background.
Use the different named pipes as input to the different commands to run, and let them run in the background too.
Finally, wait for the jobs to finish and remove the temporary named pipes.
The drawback of this approach, when the programs have great variability in their speeds, is that all of them will read the file at the same pace (the limit being the pipe buffer size: once it is full for one of the pipes, the others have to wait too), so if one of them is resource-hungry (memory-hungry, say), its resources will be tied up for the whole lifespan of all the processes.

Understanding tty + bash

I see that I can use one bash session to print text in another as follows
echo './myscript' > /dev/pts/0 # assuming session 2 is using this tty
# or
echo './myscript' > /proc/1500/fd/0 # assuming session 2's pid is 1500
But why does the text ./myscript only print and not execute? Is there anything that I can do to execute my script this way?
(I know that this will attract a lot of criticism which will perhaps fill any answers that follow with "DON'T DO THAT!" but the real reason I wish to do this is to automatically supply a password to sshfs. I'm working with a local WDMyCloud system, and it deletes my .authorized_keys file every night when I turn off the power.)
why does the text ./myscript only print and not execute?
Input and output are two different things.
Writing to a terminal puts data on the screen. Reading from a terminal reads input from the keyboard. In no way does writing to the terminal simulate keyboard input.
There's no inherent coupling between input and output, and the fact that keys you press show up on screen at all is a conscious design decision: the shell simply reads a key, and then both appends it to its internal command buffer, and writes a copy to the screen.
This is purely for your benefit so you can see what you're typing, and not because the shell in any way cares what's on the screen. Since it doesn't, writing more stuff to screen has no effect on what the shell executes.
Is there anything that I can do to execute my script this way?
Not by writing to a terminal, no.
Here is an example using a FIFO:
#!/usr/bin/bash
FIFO="$(mktemp)"
rm -fv "$FIFO"
mkfifo "$FIFO"
( echo testing123 > "$FIFO" ) &
cat "$FIFO" | sshfs -o password_stdin testing#localhost:/tmp $HOME/tmp
How you store the password and send it to the FIFO is up to you
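For example, one possibility (the path ~/.sshfs_pass is purely illustrative) is to keep the password in a file that only you can read and cat it into the FIFO:
chmod 600 ~/.sshfs_pass                      # the file contains nothing but the password
( cat ~/.sshfs_pass > "$FIFO" ) &
cat "$FIFO" | sshfs -o password_stdin testing@localhost:/tmp "$HOME/tmp"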
You can accomplish what you want by using an ioctl system call:
The ioctl() system call manipulates the underlying device parameters of special files. In particular, many operating characteristics of character special files (e.g., terminals) may be controlled with ioctl() requests.
For the 'request' argument of this system call, you'll want TIOCSTI, which is defined as 0x5412 in my header files. (grep -r TIOCSTI /usr/include to verify for your environment.)
I accomplish this as follows in ruby:
fd = IO.sysopen("/proc/#{$$}/fd/0", 'wb')
io = IO.new(fd, 'wb')
"puts 9 * 16\n".chars.each { |c| io.ioctl 0x5412, c };

shell - keep writing to the same file[name] even if externally changed

There's a process, which should not be interrupted, and I need to capture the stdout from it.
prg > debug.log
I can't quite modify the way the process outputs its data, though I'm free to wrap its launch as I see fit. Background, piping the output to other commands, etc, that's all a fair game. The process, once started, must run till the end of times (and can't be blocked, say, waiting for a fifo to be emptied). The writing isn't very fast, and the file can be cut at arbitrary place, if it exceeds predefined size.
Now, the problem is the log would grow to fill all available space, and so it must be rotated, oldest instances deleted/overwritten. And now there's the problem...
If I do
mv debug.log debug.log.1
the file debug.log vanishes forever while debug.log.1 keeps growing.
If I do
cp debug.log debug.log.1
rm debug.log
the file debug.log.1 doesn't grow, but debug.log vanishes forever, and all consecutive output from the program is lost.
Is there some way to make the stdout redirect behave like typical log writing - if the file vanished, got renamed or such, create it again?
(this is all working under busybox, so lightweight solutions are preferred.)
If the application in question holds the log file open all the time and cannot be told to close and re-open the log file (as many applications can) then the only option I can think of is to truncate the file in place.
Something like this:
cp debug.log debug.log.1
: > debug.log
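Note that this only works cleanly if the program writes in append mode (e.g. started as prg >> debug.log); with a plain > redirection the writer keeps its old file offset and the truncated file comes back sparse. A rough sketch of doing the rotation periodically (the interval, size limit and names are only placeholders):
while sleep 60; do
    if [ "$(wc -c < debug.log)" -gt 1048576 ]; then
        cp debug.log debug.log.1             # keep one old copy
        : > debug.log                        # truncate in place; the writer's open fd stays valid
    fi
done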
You may use the split command, or any other command which opens a new output file as it goes:
prg | split --numeric-suffixes --lines=100 - debug.log.
The reason is that the redirected output is attached to a file handle, not to the file name.
So the process has to close the file handle and open a new one.
You may use rotatelogs to do it in Apache HTTPD style, if you like:
http://httpd.apache.org/docs/2.2/programs/rotatelogs.html
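For example, something along these lines rotates whenever the log reaches 5 megabytes (assuming rotatelogs from Apache httpd is installed; the size is arbitrary):
prg | rotatelogs ./debug.log 5M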
In bash style, you can use a script:
fn='debug.log'
i=0
prg | while IFS='' read -r in; do
    if ...; then          # time or number of lines read or file size or ...
        i=$((i+1))
        mv "$fn" "$fn.$i" # rename the file
        : > "$fn"         # make it empty
    fi
    echo "$in" >> "$fn"   # fill the file again
done
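For instance, a concrete, runnable variant of the above that rotates every 10000 lines (the numbers and names are arbitrary):
fn='debug.log'
i=0
count=0
prg | while IFS='' read -r in; do
    if [ "$count" -ge 10000 ]; then          # rotate after 10000 lines
        i=$((i+1))
        mv "$fn" "$fn.$i"
        count=0
    fi
    echo "$in" >> "$fn"
    count=$((count+1))
done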

Displaying stdout on screen and a file simultaneously

I'd like to log standard output from a script of mine to a file, but also have it display to me on screen for realtime monitoring. The script outputs something about 10 times every second.
I tried to redirect stdout to a file and then tail -f that file from another terminal, but for some reason tail is updating the screen significantly slower than the script is writing to the file.
What's causing this lag? Is there an alternate method of getting one standard output stream both on my terminal and into a file for later examination?
I can't say why tail lags, but you can use tee:
Redirect output to multiple files: tee copies standard input to standard output and also to any files given as arguments. This is useful when you want not only to send some data down a pipe, but also to save a copy.
Example: <command> | tee <outputFile>
How much of a lag do you see? A few hundred characters? A few seconds? Minutes? Hours?
What you are seeing is buffering. Almost all file reads and writes are buffered. This includes input and output, and there is also some buffering taking place within pipes: it is simply more efficient to pass a block of data around than a byte at a time. (As an aside on defaults, I believe file names on HFS+ file systems are stored in UTF-16 while Mac OS X normally uses UTF-8 by default; NTFS also stores file names in UTF-16 while Windows uses code pages for character data by default.)
So, if you run tail -f from another terminal, you may be seeing buffering from tail, but when you use a pipe and then tee, you may have a buffer in the pipe and in the tee command, which may be why you see the lag.
By the way, how do you know there's a lag? How do you know how quickly your program is writing to the disk? Do you print out something in your program to help track the writes to the file?
In that case, you might not be lagging as much as you think. File writes are also buffered. So, it is very possible that the lag isn't from the tail -f, but from your script writing to the file.
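If the lag really is stdio buffering in the producer, one mitigation (assuming GNU coreutils' stdbuf is available and the program writes through stdio; the script and file names here are placeholders) is to force line-buffered output and tee it:
stdbuf -oL ./myscript | tee output.log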
Use the tee command:
tail -f /path/logFile | tee outfile

How to read data and read user response to each line of data both from stdin

Using bash I want to read over a list of lines and ask the user if the script should process each line as it is read. Since both the lines and the user's response come from stdin how does one coordinate the file handles? After much searching and trial & error I came up with the example
exec 4<&0
seq 1 10 | while read number
do
read -u 4 -p "$number?" confirmation
echo "$number $confirmation"
done
Here we are using exec to duplicate the original standard input (the terminal) onto file descriptor 4, reading the sequence of numbers from the piped standard input, and getting the user's response on file descriptor 4. This seems like too much work. Is this the correct way of solving this problem? If not, what is the better way? Thanks.
You could just force read to take its input from the terminal, instead of the more abstract standard input:
while read number
do
< /dev/tty read -p "$number?" confirmation
echo "$number $confirmation"
done
The drawback is that you can't automate acceptance (by reading from a pipe connected to yes, for example).
Yes, using an additional file descriptor is a right way to solve this problem. Pipes can only connect one command's standard output (file descriptor 1) to another command's standard input (file descriptor 0). So when you're parsing the output of a command, if you need to obtain input from some other source, that other source has to be given by a file name or a file descriptor.
I would write this a little differently, making the redirection local to the compound command containing the loop, but it isn't a big deal. Note that the redirection has to be applied outside the pipeline, while file descriptor 0 still refers to the terminal:
{
    seq 1 10 | while read number
    do
        read -u 4 -p "$number?" confirmation
        echo "$number $confirmation"
    done
} 4<&0
With a shell other than bash, in the absence of a -u option to read, you can use a redirection:
printf "%s? " "$number"; read confirmation <&4
You may be interested in other examples of using file descriptor reassignment.
Another method, as pointed out by chepner, is to read from a named file, namely /dev/tty, which is the terminal that the program is running in. This makes for a simpler script but has the drawback that you can't easily feed confirmation data to the script manually.
For your application, killmatching, two passes is totally the right way to go.
In the first pass you can read all the matching processes into an array. The number will be small (dozens typically, tens of thousands at most) so there are no efficiency issues. The code will look something like this (the loop reads from a process substitution rather than a pipe so that the array survives the loop):
candidates=()
while read -r thing; do candidates+=("$thing"); done < <(ps | grep ...)
(Syntactic details may be wrong; my bash is rusty.)
The second pass will loop through the candidates array and do the interaction.
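A sketch of what the second pass might look like (the prompt and the echoed action are placeholders; since the candidates are already in the array, a plain read talks to the terminal here):
for thing in "${candidates[@]}"; do
    read -r -p "Kill '$thing'? [y/N] " answer
    [ "$answer" = y ] && echo "would kill: $thing"   # replace the echo with the real action
done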
Also, if it's available on your platform, you might want to look into pgrep. It's not ideal, but it may save you a few forks, which cost more than all the array lookups in the world.
