Indent output of a program waiting for required remote interaction in ksh - ksh

Using ksh93, I have a local binary program which prints a few lines of output containing a PIN code that should be entered on a remote system. When the correct PIN is entered on the remote system, this is automatically detected and the local program completes. I don't have to provide any local input to the program.
I'd just like to indent each of the program's initial output lines with a few spaces.
Since the following waits for command to complete, I can't see the output I need, which contains the PIN:
command | nawk '{ printf "%2s%s\n", "", $0 }'
I've looked into ksh co-processes, but I can't seem to find a solution. I have the feeling I'm making this more difficult than it probably is...
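One possible approach, sketched under the assumption that the delay comes from buffering in the filter stage rather than in the program itself: read the output line by line in the shell and print each line as soon as it arrives. The function name indent_lines is hypothetical. If the program block-buffers its own output when writing to a pipe, this alone will not help; on systems with GNU coreutils, running the program under stdbuf -oL may, but that is an assumption about the environment.

```shell
# indent_lines: hypothetical helper; prefixes each input line with two spaces
# and writes it immediately, without waiting for end of input.
indent_lines() {
    while IFS= read -r line || [ -n "$line" ]; do
        printf '  %s\n' "$line"
    done
}

# Usage (command stands for the local program from the question):
#   command | indent_lines
# The sample text below is made up for the demonstration.
printf 'PIN: 1234\nWaiting for remote entry\n' | indent_lines
```

The loop uses only shell built-ins, so no separate filter process gets a chance to hold lines back.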

Related

Output time to a file with the Unix "time" command, but leave the output of the command to the console

I time a command that has some output. I want to output the real time from the time command to a file, but leave the output of the command to the console.
For example, if I do time my_command I get this printed in the console:
several lines of output from my_command ...
real 1m25.970s
user 0m0.427s
sys 0m0.518s
In this case, I want to store only 1m25.970s to a file, but still print the output of the command to the console.
The time command is tricky. The POSIX specification of time doesn't define the default output format, but does define a format for the -p (presumably for 'POSIX') option. Note the (not easily understood) discussion of command sequences in pipelines.
The Bash specification says time prefixes a 'pipeline', which means that time cmd1 | cmd2 times both cmd1 and cmd2. It writes its results to standard error. The Korn shell is similar.
The POSIX format requires a single space between the tags such as real and the time; the default format often uses a tab instead of a space. Note that the /usr/bin/time command may have yet another output format. It does on macOS, for example, listing 3 times on a single line, by default, with the label after the time value; it supports -p to print in an approximation to the POSIX format (but it has multiple spaces between label and time).
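To illustrate why the -p layout is easier to post-process, here is a minimal sketch that pulls the real value out of text in that format. The function name real_seconds is hypothetical, and the sample input is made up for the demonstration, not measured output.

```shell
# real_seconds: hypothetical helper; reads `time -p` style lines on standard
# input and prints only the value found on the "real" line.
real_seconds() {
    awk '$1 == "real" { print $2 }'
}

# Sample text in the -p layout: label, blank(s), seconds.
printf 'real 1.25\nuser 0.40\nsys 0.10\n' | real_seconds
```

Because awk splits fields on any run of blanks or tabs, the same filter copes with the single-space POSIX layout and with variants that pad the label with several spaces.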
You can easily get all the information written to standard error into a file:
(time my_command) 2> log.file
If my_command or any programs it invokes reports any errors to standard error, those will go to the log file too. And you will get all three lines of the output from time written to the file.
If your shell is Bash, you may be able to use process substitution to filter some of the output.
I wouldn't try it with a single command line; the hieroglyphs needed to make it work are ghastly and best encapsulated in shell scripts.
For example, a shell script time.filter to capture the output from time and write only the real time to a log file (default log.file, configurable by providing an alternative log file name as the first argument):
#!/bin/sh
output="${1:-log.file}"
shift
sed -E '/^real[[:space:]]+(([0-9]+m)?[0-9]+[.][0-9]+s?)/{ s//\1/; w '"$output"'
d;}
/^(user|sys)[[:space:]]+(([0-9]+m)?[0-9]+[.][0-9]+s?)/d' "$@"
This assumes your sed uses -E to enable extended regular expressions.
The first line of the script finds the line containing the real label and the time after it (in a number of possible formats, but not all). It accepts an optional minutes value such as 60m05.003s, or just a seconds value 5.00s, or just 5.0 (POSIX formats; at least one digit after the decimal point is required). It captures the time part and prints it to the chosen file (by default, log.file; you can specify an alternative name as the first argument on the command line). Note that even GNU sed treats everything after the w command as a file name; you have to continue the d (delete) command and the close brace } on a new line. GNU sed does not require the semicolon after d; BSD (macOS) sed does. The second line recognizes and deletes the lines reporting the user and sys times. Everything else is passed through unaltered.
The script processes any files you give it after the log file name, or standard input if you give it none. A better command line notation would use an explicit option (-l logfile) and getopts to specify the log file.
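The getopts variant suggested above might look roughly like the following sketch. The function name parse_logfile and the usage text are hypothetical; the function only shows the option parsing, and prints what it parsed instead of running sed.

```shell
# parse_logfile: hypothetical helper showing the -l logfile convention.
# It prints the chosen log file name followed by the remaining arguments.
parse_logfile() {
    output="log.file"            # same default as the script above
    OPTIND=1                     # reset so the function can be called repeatedly
    while getopts "l:" opt; do
        case "$opt" in
            l) output="$OPTARG" ;;
            *) echo "usage: time.filter [-l logfile] [file ...]" >&2; return 2 ;;
        esac
    done
    shift $((OPTIND - 1))        # drop the options, keep the file names
    printf 'log=%s files=%s\n' "$output" "$*"
}

parse_logfile -l real.log a.txt b.txt
parse_logfile a.txt
```

With an explicit option, the first positional argument no longer has to double as the log file name, so a plain list of input files is never misread.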
With that in place, we can devise a program that reports to standard error and standard output — my_command:
echo "nonsense: error: positive numbers are required for argument 1" >&2
dribbler -s 0.4 -r 0.1 -i data -t
echo "apoplexy: unforeseen problems induced temporary amnesia" >&2
You could use cat data instead of the dribbler command. The dribbler command as shown reads lines from data, writes them to standard output, with a random delay with a gaussian distribution between lines. The mean delay is 0.4 seconds; the standard deviation is 0.1 seconds. The other two lines are pretending to be commands that report errors to standard error.
My data file contained a nonsense 'poem' called 'The Great Panjandrum'.
With this background in place, we can run the command and capture the real time in log.file, delete (ignore) the user and system time values, while sending the rest of standard error to standard error by using:
$ (time my_command) 2> >(tee raw.stderr | time.filter >&2)
nonsense: error: positive numbers are required for argument 1
So she went into the garden
to cut a cabbage-leaf
to make an apple-pie
and at the same time
a great she-bear coming down the street
pops its head into the shop
What no soap
So he died
and she very imprudently married the Barber
and there were present
the Picninnies
and the Joblillies
and the Garyulies
and the great Panjandrum himself
with the little round button at top
and they all fell to playing the game of catch-as-catch-can
till the gunpowder ran out at the heels of their boots
apoplexy: unforeseen problems induced temporary amnesia
$ cat log.file
0m7.278s
(The time taken is normally between 6 and 8 seconds. There are 17 lines, so you'd expect it to take around 6.8 seconds at 0.4 seconds per line.) The blank line is from time; it is pretty hard to remove that blank line, and only that blank line, especially as POSIX says it is optional. It isn't worth it.

When data is piped from one program via | is there a way to detect what that program was from the second program?

Say you have a shell command like
cat file1 | ./my_script
Is there any way from inside the 'my_script' command to detect the command run first as the pipe input (in the above example cat file1)?
I've been digging into it and so far I've not found any possibilities.
I've been unable to find any environment variable set in the process space of the second command that records the full command line. The command data that my_script sees (via /proc etc.) is just _./my_script_ and doesn't include any information about it being run as part of a pipe. Even checking the process list from inside the second command doesn't seem to provide any data, since the first process seems to exit before the second starts.
The best information I've been able to find suggests that in bash you can, in some cases, get the exit codes of processes in the pipe via PIPESTATUS; unfortunately nothing similar seems to be present for the names of commands/files in the pipe. My research seems to say it's impossible to do in a generic manner (I can't control how people decide to run my_script, so I can't force third-party pipe replacement tools to be used over built-in shell pipes), but at the same time it doesn't seem like it should be impossible, since the shell has the full command line present as the command is run.
(update adding in later information following on from comments below)
I am on Linux.
I've investigated the /proc/$$/fd data and it almost does the job. If the first command doesn't exit for several seconds while piping data to the second command, you can read /proc/$$/fd/0 to see the value pipe:[PIPEID] that it symlinks to. That can then be used to search through the rest of the /proc//fd/ data for other running processes to find another process with a pipe open using the same PIPEID, which gives you the first process's pid.
However, in most real-world tests I've done of piping, you can't trust that the first command will stay running long enough for the second one to have time to locate its pipe fd in /proc before it exits (which removes the proc data, preventing it being read). So I can't rely on this method returning any information.
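The lookup described above could be sketched roughly as follows. This is Linux-only and assumes /proc is mounted; the function names stdin_target and pipe_writers are hypothetical. It uses /proc/self inside a child readlink process rather than $$, because $$ keeps the parent shell's PID inside a pipeline subshell.

```shell
# stdin_target: prints what standard input points at. readlink inherits
# fd 0 from the caller, so /proc/self/fd/0 is the caller's own input.
stdin_target() {
    readlink /proc/self/fd/0
}

# pipe_writers: hypothetical helper; lists PIDs whose standard output is
# the same pipe that feeds our standard input.
pipe_writers() {
    target=$(stdin_target)
    case "$target" in
        pipe:*) ;;
        *) return 1 ;;           # stdin is not a pipe
    esac
    for dir in /proc/[0-9]*; do
        # Processes we may not inspect simply yield an empty string here.
        if [ "$(readlink "$dir/fd/1" 2>/dev/null)" = "$target" ]; then
            echo "${dir#/proc/}"
        fi
    done
}

# The writer must still be alive for the lookup to succeed.
sleep 1 | pipe_writers
```

The sketch has the same weakness the question describes: once the writing process has exited, its /proc entry is gone and the function prints nothing.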

Bash, cygwin, run command with user input (disable process switch)

I want to run a command in the console and supply all the user data it needs.
#!/bin/bash
program < data &
My code works, but after less than a second the program disappears (it only blinks).
How can I run the program, pass data from a file and stay in that program? (I have no need to continue the bash script after the app launches.)
Inasmuch as the program you are launching reads data from its standard input, it is reasonable to suppose that when you say that you want to "stay in that program" you mean that you want to be able to give it further input interactively. Moreover, I suppose that the program disappears / blinks either because it is disconnected from the terminal (by operation of the & operator) or because it terminates when it detects end-of-file on its standard input.
If the objective is simply to prepend some canned input before the interactive input, then you should be able to achieve that by piping input from cat:
cat data - | program
The - argument to cat designates the standard input. cat first reads file data and writes it to standard out, then it forwards data from its standard input to its standard output. All of that output is fed to program's standard input. There is no need to exec, and do not put either command into the background, as that disconnects it from the terminal (from which cat is obtaining input and to which program is, presumably, writing output).
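A small demonstration of the same idea with stand-ins, so it can be run non-interactively: a temporary file plays the role of data, printf plays the role of the keyboard, and tr plays the role of program. The wrapper name feed_with_preamble is hypothetical.

```shell
# feed_with_preamble: hypothetical wrapper; sends the file named by $1,
# then whatever arrives on standard input, to the remaining command.
feed_with_preamble() {
    preamble=$1
    shift
    cat "$preamble" - | "$@"
}

data=$(mktemp)
printf 'canned line\n' > "$data"

# The canned line arrives first, followed by the "typed" line.
printf 'typed line\n' | feed_with_preamble "$data" tr 'a-z' 'A-Z'

rm -f "$data"
```

Run from a terminal without the leading printf, the same wrapper keeps reading from the keyboard until end-of-file is signalled.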

What is a simple explanation for how pipes work in Bash?

I often use pipes in Bash, e.g.:
dmesg | less
Although I know what this outputs, it takes dmesg and lets me scroll through it with less, I do not understand what the | is doing. Is it simply the opposite of >?
Is there a simple, or metaphorical explanation for what | does?
What goes on when several pipes are used in a single line?
Is the behavior of pipes consistent everywhere it appears in a Bash script?
A Unix pipe connects the STDOUT (standard output) file descriptor of the first process to the STDIN (standard input) of the second. What happens then is that when the first process writes to its STDOUT, that output can be immediately read (from STDIN) by the second process.
Using multiple pipes is no different than using a single pipe. Each pipe is independent, and simply links the STDOUT and STDIN of the adjacent processes.
Your third question is a little bit ambiguous. Yes, pipes, as such, are consistent everywhere in a bash script. However, the pipe character | can represent different things. Double pipe (||), represents the "or" operator, for example.
In Linux (and Unix in general) each process has three default file descriptors:
fd #0 Represents the standard input of the process
fd #1 Represents the standard output of the process
fd #2 Represents the standard error output of the process
Normally, when you run a simple program, these file descriptors are configured by default as follows:
Standard input is read from the keyboard
Standard output is configured to be the monitor
Standard error is configured to be the monitor also
Bash provides several operators to change this behavior (take a look at the >, >> and < operators, for example). Thus, you can redirect the output to something other than the standard output or read your input from a stream other than the keyboard. Especially interesting is the case when two programs collaborate in such a way that one uses the output of the other as its input. To make this collaboration easy, Bash provides the pipe operator |. Please note the use of collaboration instead of chaining: I avoided the latter term since a pipe is in fact not sequential. A normal command line with pipes looks like this:
> program_1 | program_2 | ... | program_n
The above command line is a little bit misleading: a user could think that program_2 gets its input once program_1 has finished its execution, which is not correct. In fact, what bash does is launch ALL the programs in parallel and configure the inputs and outputs accordingly, so every program gets its input from the previous one and delivers its output to the next one (in the order established on the command line).
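A quick way to observe the parallel start, as a rough sketch: time a pipeline of two sleeps. It assumes date supports the %s format (true for GNU and BSD date, though not required by POSIX).

```shell
# Two two-second sleeps joined by a pipe finish in about two seconds,
# not four, because the shell starts both sides at once.
start=$(date +%s)
sleep 2 | sleep 2
end=$(date +%s)
elapsed=$((end - start))
echo "elapsed: ${elapsed}s"
```

If the commands ran one after the other, the elapsed time would be at least four seconds.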
Following is a simple example from Creating pipe in C of creating a pipe between a parent and child process. The important part is the call to pipe() and how the parent closes fd[1] (writing side) and how the child closes fd[0] (reading side). Please note that a pipe is a unidirectional communication channel. Thus, data can only flow in one direction: from fd[1] towards fd[0]. For more information, take a look at the manual page of pipe().
#include <stdio.h>
#include <stdlib.h>    /* exit() */
#include <string.h>    /* strlen() */
#include <unistd.h>
#include <sys/types.h>
int main(void)
{
    int fd[2], nbytes;
    pid_t childpid;
    char string[] = "Hello, world!\n";
    char readbuffer[80];
    /* fd[0] is the reading side, fd[1] is the writing side */
    if (pipe(fd) == -1)
    {
        perror("pipe");
        exit(1);
    }
    if ((childpid = fork()) == -1)
    {
        perror("fork");
        exit(1);
    }
    if (childpid == 0)
    {
        /* Child process closes up input side of pipe */
        close(fd[0]);
        /* Send "string" (including its terminating NUL) through the output side of pipe */
        write(fd[1], string, (strlen(string)+1));
        exit(0);
    }
    else
    {
        /* Parent process closes up output side of pipe */
        close(fd[1]);
        /* Read in a string from the pipe */
        nbytes = read(fd[0], readbuffer, sizeof(readbuffer));
        if (nbytes > 0)
            printf("Received string: %s", readbuffer);
    }
    return(0);
}
Last but not least, when you have a command line in the form:
> program_1 | program_2 | program_3
The return code of the whole line is that of the last command, in this case program_3. If you would like to get an intermediate return code, you have to set the pipefail option or read it from the PIPESTATUS array.
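Both mechanisms can be seen in a short sketch. The examples invoke bash explicitly, since PIPESTATUS is a bash array and is not available in every shell; the function names show_pipestatus and show_pipefail are hypothetical.

```shell
# Default behaviour: the pipeline's status is that of the last command,
# while PIPESTATUS keeps one exit code per command in the pipeline.
show_pipestatus() {
    bash -c 'false | true; echo "status=$? per-command=${PIPESTATUS[*]}"'
}

# With pipefail: a failure anywhere in the pipeline makes the status non-zero.
show_pipefail() {
    bash -c 'set -o pipefail; false | true; echo "status=$?"'
}

show_pipestatus   # prints: status=0 per-command=1 0
show_pipefail     # prints: status=1
```

PIPESTATUS is overwritten by the next command that runs, so it has to be read immediately after the pipeline, as it is here inside the same echo.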
Every standard process in Unix has at least three file descriptors, which are sort of like interfaces:
Standard output, which is the place where the process prints its data (most of the time the console, that is, your screen or terminal).
Standard input, which is the place it gets its data from (most of the time it may be something akin to your keyboard).
Standard error, which is the place where errors and sometimes other out-of-band data goes. It's not interesting right now because pipes don't normally deal with it.
The pipe connects the standard output of the process to the left to the standard input of the process of the right. You can think of it as a dedicated program that takes care of copying everything that one program prints, and feeding it to the next program (the one after the pipe symbol). It's not exactly that, but it's an adequate enough analogy.
Each pipe operates on exactly two things: the standard output coming from its left and the input stream expected at its right. Each of those could be attached to a single process or another bit of the pipeline, which is the case in a multi-pipe command line. But that's not relevant to the actual operation of the pipe; each pipe does its own.
The redirection operator (>) does something related, but simpler: by default it sends the standard output of a process directly to a file. As you can see it's not the opposite of a pipe, but actually complementary. The opposite of > is unsurprisingly <, which takes the content of a file and sends it to the standard input of a process (think of it as a program that reads a file byte by byte and types it in a process for you).
In short, as described, there are three key 'special' file descriptors to be aware of. By default the shell connects the keyboard to stdin and sends stdout and stderr to the screen.
A pipeline is just a shell convenience which attaches the stdout of one process directly to the stdin of the next.
There are a lot of subtleties to how this works; for example, the stderr stream might not be piped as you would expect.
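The stderr subtlety can be shown with a minimal sketch. The function emit is a hypothetical stand-in for any program that writes to both streams.

```shell
# emit: hypothetical stand-in for a program that writes to both streams.
emit() {
    echo "to stdout"
    echo "to stderr" >&2
}

# Only standard output travels through the pipe; stderr bypasses tr
# and reaches the terminal unchanged.
emit | tr 'a-z' 'A-Z'

# Merging stderr into stdout first sends both lines through the pipe.
emit 2>&1 | tr 'a-z' 'A-Z'
```

In the first pipeline only one line is upper-cased; in the second both are.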
I have spent quite some time trying to write a detailed but beginner friendly explanation of pipelines in Bash. The full content is at:
https://effective-shell.com/docs/part-2-core-skills/7-thinking-in-pipelines/
A pipe takes the output of a process (by output I mean the standard output, stdout on UNIX) and passes it to the standard input (stdin) of another process. It is not the opposite of the simple right redirection >, whose purpose is to redirect an output to another output.
For example, take the echo command on Linux, which simply prints a string passed as a parameter on the standard output. If you use a simple redirect like:
echo "Hello world" > helloworld.txt
the shell will redirect the normal output initially intended to be on stdout and print it directly into the file helloworld.txt.
Now, take this example which involves the pipe :
ls -l | grep helloworld.txt
The standard output of the ls command will be fed into the input of grep, so how does this work?
Programs such as grep, when used without a file argument, simply read and wait for something to be passed on their standard input (stdin). When they receive something, like the output of the ls command, grep acts normally by finding an occurrence of what you're searching for.
Pipes are very simple like this.
You have the output of one command. You can provide this output as the input into another command using pipe. You can pipe as many commands as you want.
ex:
ls | grep my | grep files
This first lists the files in the working directory. This output is checked by the grep command for the word "my". The output of this is then passed into the second grep command, which finally searches for the word "files". That's it.
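The same pipeline can be tried safely in a scratch directory. This is a sketch with made-up file names, and the function name list_my_files is hypothetical.

```shell
# list_my_files: hypothetical wrapper; runs the pipeline inside directory $1.
list_my_files() {
    (cd "$1" && ls | grep my | grep files)
}

# Build a scratch directory with made-up file names.
dir=$(mktemp -d)
touch "$dir/my_files.txt" "$dir/my_notes.txt" "$dir/other_files.txt"

# Only names containing both "my" and "files" survive both filters.
list_my_files "$dir"

rm -rf "$dir"
```

Each grep sees only what the previous stage let through, so the second filter never examines my_notes.txt's sibling other_files.txt.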
The pipe operator takes the output of the first command, and 'pipes' it to the second one by connecting stdin and stdout.
In your example, instead of the output of dmesg command going to stdout (and throwing it out on the console), it is going right into your next command.
| connects the STDOUT of the command on the left side to the STDIN of the command on the right side.
If you use multiple pipes, it's just a chain of pipes. The first command's output becomes the second command's input. The second command's output becomes the next command's input. And so on.
It's available in all Linux/Windows-based command interpreters.
All of these answers are great. Something that I would just like to mention is that a pipe in bash (which has the same concept as a unix/linux or windows named pipe) is just like a pipe in real life.
If you think of the program before the pipe as a source of water, the pipe as a water pipe, and the program after the pipe as something that uses the water (with the program output as water), then you pretty much understand how pipes work.
And remember that all apps in a pipeline run in parallel.
Regarding the efficiency issue of pipe:
A command can access and process the data at its input before the previous command in the pipeline has completed, which means better use of computing power when resources are available.
A pipe does not require the output of a command to be saved to a file before the next command can read it as input (there is no file I/O between the two commands), which means fewer costly I/O operations and less disk space used.
If you treat each unix command as a standalone module,
but you need them to talk to each other using text as a consistent interface,
how can it be done?
cmd                       input                 output
echo "foobar"             string                "foobar"
cat "somefile.txt"        file                  *string inside the file*
grep "pattern" "a.txt"    pattern, input file   *matched string*
You can say | is a metaphor for passing the baton in a relay marathon.
It's even shaped like one!
cat -> echo -> less -> awk -> perl is analogous to cat | echo | less | awk | perl.
cat "somefile.txt" | less
cat passes its output for less to use. (echo, by contrast, ignores its standard input, so it cannot actually receive data this way.)
What happens when there is more than one input?
cat "somefile.txt" | grep "pattern"
There is an implicit rule that says "pass it as input file rather than pattern" for grep.
You will slowly develop the eye for knowing which parameter is which by experience.

How do I jump to the first line of shell output? (shell equivalent of emacs comint-show-output)

I recently discovered 'comint-show-output' in emacs shell mode, which jumps to the first line of shell output, which I find incredibly handy when looking at shell output that exceeds a screen length. The advantages of this command over scrolling with 'page up' are A) you don't have to scan with your eyes for the first line of the output B) you only have to hit the key combo once (instead of 'page up' a number of times which probably is not known beforehand).
I thought about ending all my commands with '| more' but actually this is not what I want since most of the time, I want to retain all output in the terminal buffer, and I usually want to see the end of the shell output first.
I use OSX. Is there a terminal app (on os x) and shell (on remote linux) combination equivalent (so I can do something similar without using emacs all the time - I know, crazy talk)? I normally use bash, but would be fine with switching shells just for this feature.
The way I do this sort of thing is by sending my output to a file and then watching the file as it is written. You still get the results of the command dumped to terminal history in real time and can still inspect the output's actual contents further after the fact (or in another terminal, etc...)
command > output &
tail -f output
head output
You could always do something in bash like this:
alias foo='!! | more'
which would make foo run the previous command with more. I'm not sure of any way to do exactly what you are suggesting.
If you're expecting a lot of output and don't want to run your command twice, you can use tee(1) to fork the output:
my-command | tee /tmp/my-command.log | less
This will pipe the output to a paginator (less), while simultaneously logging the output to a file (in this case, a file named /tmp/my-command.log). If you need to review the output after you've quit from less, you can just cat the log file instead of re-running the command.
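A non-interactive sketch of the same tee arrangement, with tail standing in for the pager and a temporary file standing in for /tmp/my-command.log; the sample input is made up.

```shell
log=$(mktemp)

# tee writes a full copy to the log while passing the stream on;
# tail -n 1 plays the role of the pager and keeps only the last line.
last=$(printf '%s\n' one two three | tee "$log" | tail -n 1)
logged=$(cat "$log")
rm -f "$log"

echo "seen downstream: $last"
echo "kept in the log:"
echo "$logged"
```

The downstream stage can discard as much as it likes; the log still holds every line the command produced.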
