Pipe input into a script - shell

I have written a shell script in ksh to convert a CSV file into a Spreadsheet XML file. It takes an existing CSV file (the path to which is a variable in the script) and creates a new .xls output file. The script has no positional parameters; the CSV file name is currently hardcoded into the script.
I would like to amend the script so it can take the input CSV data from a pipe, and so that the .xls output data can also be piped or redirected (>) to a file on the command line.
How is this achieved?
I am struggling to find documentation on how to write a shell script that takes its input from a pipe. It appears that 'read' is only used for standard input from the keyboard.
Thanks.
Edit: script below for info (now amended to take input from a pipe via cat, as per the accepted answer).
#!/bin/ksh
#Script to convert .csv data to "Spreadsheet ML" XML format - the XML schema for Excel 2003
#
# Takes CSV data as standard input
# Outputs XLS data as standard output
#
DATE=`date +%Y%m%d`
#define tmp files
INPUT=tmp.csv
IN_FILE=in_file.csv
#take standard input and save as $INPUT (tmp.csv)
cat > $INPUT
#clean input data and save as $IN_FILE (in_file.csv)
grep '.' $INPUT | sed 's/ *,/,/g' | sed 's/, */,/g' > $IN_FILE
#delete original $INPUT file (tmp.csv)
rm $INPUT
#detect the number of columns and rows in the input file
ROWS=`wc -l < $IN_FILE | sed 's/ //g' `
COLS=`awk -F',' '{print NF; exit}' $IN_FILE`
#echo "Total columns is $COLS"
#echo "Total rows is $ROWS"
#create start of Excel File
echo "<?xml version=\"1.0\"?>
<?mso-application progid=\"Excel.Sheet\"?>
<Workbook xmlns=\"urn:schemas-microsoft-com:office:spreadsheet\"
xmlns:o=\"urn:schemas-microsoft-com:office:office\"
xmlns:x=\"urn:schemas-microsoft-com:office:excel\"
xmlns:ss=\"urn:schemas-microsoft-com:office:spreadsheet\"
xmlns:html=\"http://www.w3.org/TR/REC-html40\">
<DocumentProperties xmlns=\"urn:schemas-microsoft-com:office:office\">
<Author>Ben Hamilton</Author>
<LastAuthor>Ben Hamilton</LastAuthor>
<Created>${DATE}</Created>
<Company>MCC</Company>
<Version>10.2625</Version>
</DocumentProperties>
<ExcelWorkbook xmlns=\"urn:schemas-microsoft-com:office:excel\">
<WindowHeight>6135</WindowHeight>
<WindowWidth>8445</WindowWidth>
<WindowTopX>240</WindowTopX>
<WindowTopY>120</WindowTopY>
<ProtectStructure>False</ProtectStructure>
<ProtectWindows>False</ProtectWindows>
</ExcelWorkbook>
<Styles>
<Style ss:ID=\"Default\" ss:Name=\"Normal\">
<Alignment ss:Vertical=\"Bottom\" />
<Borders />
<Font />
<Interior />
<NumberFormat />
<Protection />
</Style>
<Style ss:ID=\"AcadDate\">
<NumberFormat ss:Format=\"Short Date\"/>
</Style>
</Styles>
<Worksheet ss:Name=\"Sheet 1\">
<Table>
<Column ss:AutoFitWidth=\"1\" />"
#for each row in turn, create the XML elements for row/column
r=1
while (( r <= $ROWS ))
do
    echo "<Row>\n"
    c=1
    while (( c <= $COLS ))
    do
        DATA=`sed -n "${r}p" $IN_FILE | cut -d "," -f $c `
        if [[ "${DATA}" == [0-9][0-9]\.[0-9][0-9]\.[0-9][0-9][0-9][0-9] ]]; then
            DD=`echo $DATA | cut -d "." -f 1`
            MM=`echo $DATA | cut -d "." -f 2`
            YYYY=`echo $DATA | cut -d "." -f 3`
            echo "<Cell ss:StyleID=\"AcadDate\"><Data ss:Type=\"DateTime\">${YYYY}-${MM}-${DD}T00:00:00.000</Data></Cell>"
        else
            echo "<Cell><Data ss:Type=\"String\">${DATA}</Data></Cell>"
        fi
        (( c+=1 ))
    done
    echo "</Row>"
    (( r+=1 ))
done
echo "</Table>\n</Worksheet>\n</Workbook>"
rm $IN_FILE > /dev/null
exit 0
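With stdin and stdout handled this way, the script can sit in a pipeline. A minimal usage sketch, assuming the script above is saved as csv2xls.ksh and made executable (the name is an assumption; substitute your actual script name):

# pipe CSV data in and redirect the XML output to a file
cat data.csv | ./csv2xls.ksh > data.xls

# or, equivalently, use input redirection instead of a pipe
./csv2xls.ksh < data.csv > data.xls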

Commands inherit their standard input from the process that starts them. In your case, your script provides its standard input for each command that it runs. A simple example script:
#!/bin/bash
cat > foo.txt
Piping data into your shell script causes cat to read that data, since cat inherits its standard input from your script.
$ echo "Hello world" | myscript.sh
$ cat foo.txt
Hello world
If you don't have another command to read or process your script's standard input, the shell provides the read command for reading a line of text from standard input into a shell variable.
#!/bin/bash
read foo
echo "You entered '$foo'"
$ echo bob | myscript.sh
You entered 'bob'

There is one problem here: if you run the script without anything on stdin, it will hang until something is typed. To get around this, you can first check whether data is being piped to stdin, and if not, fall back to a command-line argument if one was given.
Create a script called "testPipe.sh"
#!/bin/bash
# Check to see if a pipe exists on stdin.
if [ -p /dev/stdin ]; then
    echo "Data was piped to this script!"
    # If we want to read the input line by line
    while IFS= read -r line; do
        echo "Line: ${line}"
    done
    # Or if we want to simply grab all the data, we can simply use cat instead
    # cat
else
    echo "No input was found on stdin, skipping!"
    # Checking to ensure a filename was specified and that it exists
    if [ -f "$1" ]; then
        echo "Filename specified: ${1}"
        echo "Doing things now.."
    else
        echo "No input given!"
    fi
fi
Then to test:
Let's add some stuff to a test.txt file and then pipe the output to our script.
printf "stuff\nmore stuff\n" > test.txt
cat test.txt | ./testPipe.sh
Output:
Data was piped to this script!
Line: stuff
Line: more stuff
Now let's test if not providing any input:
./testPipe.sh
Output:
No input was found on stdin, skipping!
No input given!
Now let's test if providing a valid filename:
./testPipe.sh test.txt
Output:
No input was found on stdin, skipping!
Filename specified: test.txt
Doing things now..
And finally, let's test using an invalid filename:
./testPipe.sh invalidFile.txt
Output:
No input was found on stdin, skipping!
No input given!
Explanation:
Programs like read and cat will consume stdin if data is available; otherwise they will block waiting for input.
Credit goes to Mike from this page in his answer showing how to check for stdin input: https://unix.stackexchange.com/questions/33049/check-if-pipe-is-empty-and-run-a-command-on-the-data-if-it-isnt?newreg=fb5b291531dd4100837b12bc1836456f
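An alternative check worth knowing is test -t 0, which tests whether stdin is a terminal rather than specifically a pipe, so it also catches input redirected from a file. A minimal sketch:

#!/bin/bash
# -t 0 is true when fd 0 (stdin) is a terminal, i.e. nothing is piped or redirected in
if [ -t 0 ]; then
    echo "No piped or redirected input; falling back to the filename argument: $1"
else
    echo "Reading data from stdin..."
    cat
fi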

If the external program (that you are scripting) already takes input from stdin, your script does not need to do anything. For example, awk reads from stdin, so a short script to count words per line:
#!/bin/sh
awk '{print NF}'
Then
./myscript.sh <<END
one
one two
one two three
END
outputs
1
2
3
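For what it's worth, the same script reads equally well from a pipe or from a redirected file (words.txt is a hypothetical file here):

$ printf 'one\none two\n' | ./myscript.sh
1
2
$ ./myscript.sh < words.txt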

Related

In a bash pipe, take the output of the previous command as a variable to the next command (Eg. if statement)

I wanted to write a command to compare the hash of a file. I wrote the single-line command below. I want to understand how I can take the output of the previous command as a variable for the current command, in a pipe.
E.g. in the command below I want to compare the output of the 1st command (the calculated hash) to the original hash. In the last command, I want to refer to the output of the previous command. How do I do that in the if statement? (Instead of $0)
sha256sum abc.txt | awk '{print $1}' | if [ "$0" = "8237491082roieuwr0r9812734iur" ]; then
echo "match"
fi
Doing literally what you asked for looks like this:
sha256sum abc.txt |
awk '{print $1}' |
if [ "$(cat)" = "8237491082roieuwr0r9812734iur" ]; then echo "match"; fi
...as cat with no arguments reads the command's stdin, and in a pipeline, content generated by prior stages is streamed into later ones.
Alternately:
sha256sum abc.txt |
awk '{print $1}' |
if read -r line && [ "$line" = "8237491082roieuwr0r9812734iur" ]; then echo "match"; fi
...wherein we read only a single line from stdin instead of using cat. (To instead loop over all lines given on stdin, see BashFAQ #1).
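For reference, a minimal sketch of that loop form applied to this pipeline (the echo is just a placeholder for whatever per-line processing you need):

sha256sum abc.txt |
awk '{print $1}' |
while IFS= read -r line; do
    # only one line arrives here: the hash
    echo "got hash: $line"
done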
However, I would strongly suggest writing this instead as:
if [ "$(sha256sum abc.txt | awk '{print $1}')" = "8237491082roieuwr0r9812734iur" ]; then
echo "match"
fi
...which, among other things, keeps your logic outside the pipeline, so your if statement can set variables that remain set after the pipeline exits. See BashFAQ #24 for more details on the problems inherent in running code in pipelines.
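To see the problem the FAQ describes, compare a variable assigned inside a pipeline stage with one assigned outside it (a minimal demonstration of bash's default behavior, without shopt -s lastpipe):

# a variable set in a pipeline stage is set in a subshell and lost afterwards
echo "8237491082roieuwr0r9812734iur" | read -r hash
echo "inside pipeline: '$hash'"     # prints '' in default bash

# a variable set via command substitution survives
hash=$(echo "8237491082roieuwr0r9812734iur")
echo "outside pipeline: '$hash'"    # prints the hash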
Consider using sha256sum's check mode. If you save the output of sha256sum to a file, you can check it with sha256sum -c.
$ echo foo > file
$ sha256sum file > hash.txt
$ cat hash.txt
b5bb9d8014a0f9b1d61e21e796d78dccdf1352f23cd32812f4850b878ae4944c file
$ sha256sum -c hash.txt
file: OK
$ if sha256sum -c --quiet hash.txt; then echo "match"; fi
If you don't want to save the hashes to a file you could pass them in via a here-string:
if sha256sum -c --quiet <<< 'b5bb9d8014a0f9b1d61e21e796d78dccdf1352f23cd32812f4850b878ae4944c file'; then
echo "match"
fi

Read multiple variables from file

I need to read a file that has lines like
user=username1
pass=password1
How can I read multiple lines like this into separate variables like username and password?
Would I use awk or grep? I have found ways to read lines into variables with grep but would I need to read the file for each individual item?
The end result is to use these variables to access a database via the command line. So I need to be able to read, store and use these values in other commands.
If the process which generates the file is trusted and the file uses shell syntax, just source the file.
. ./file
Otherwise the file can be processed first to add quotes:
perl -ne 'if (/^([A-Za-z_]\w*)=(.*)/) {$k=$1;$v=$2;$v=~s/\x27/\x27\\\x27\x27/g;print "$k=\x27$v\x27\n";}' <file >file2
. ./file2
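Either way, once the file has been sourced the values are available as ordinary shell variables; with the two lines shown in the question:

$ cat file
user=username1
pass=password1
$ . ./file
$ echo "$user $pass"
username1 password1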
If you want to use awk then
Input
$ cat file
user=username1
pass=password1
Reading
$ user=$(awk -F= '$1=="user"{print $2;exit}' file)
$ pass=$(awk -F= '$1=="pass"{print $2;exit}' file)
Output
$ echo $user
username1
$ echo $pass
password1
You could use a loop for your file perhaps, but this is probably the functionality you're looking for.
$ echo 'user=username1' | awk -F= '{print $2}'
username1
The -F flag sets the field delimiter to =, and we select the 2nd field from the row.
file.txt:
user=username1
pass=password1
user=username2
pass=password2
user=username3
pass=password3
To avoid reading the file file.txt several times:
#!/usr/bin/env bash
func () {
    echo "user:$1 pass:$2"
}
i=0
while IFS='' read -r line; do
    if [ $i -eq 0 ]; then
        i=1
        user=$(echo ${line} | cut -f2 -d'=')
    else
        i=0
        pass=$(echo ${line} | cut -f2 -d'=')
        func "$user" "$pass"
    fi
done < file.txt
Output:
user:username1 pass:password1
user:username2 pass:password2
user:username3 pass:password3
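A variant of the same idea that avoids the cut subprocesses is to let read itself split each line on =; a sketch, assuming the same user-line-then-pass-line layout of file.txt:

#!/usr/bin/env bash
# read splits each line into the key (before the first =) and the value (the rest)
while IFS='=' read -r key value; do
    case $key in
        user) user=$value ;;
        pass) pass=$value
              echo "user:$user pass:$pass" ;;
    esac
done < file.txt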

bash: what is the difference between "done < foo", "done << foo" and "done <<< foo" when closing a loop?

In a bash script, I see several while statements with those redirect signs when closing the loop.
I know that if I end it with "done < file", I am redirecting the file to the stdin of the command in the while statement. But what do the others mean?
I would appreciate if someone could give an explanation with examples.
With the file text.txt
1aa
2bb
3cc
Redirection:
$ cat < text.txt
1aa
2bb
3cc
Here document:
$ cat << EOF
> 1AA
> 2BB
> EOF
1AA
2BB
Here string:
$ cat <<< 1aaa
1aaa
The first form, <, is an input redirection. It is somewhat different from << and <<<, which are two variants of a here document.
With input redirection, the shell opens the named file and connects it to the command's standard input; the command itself is never given a filename:
cmd < file
The difference between cmd file and cmd < file is that in the first case cmd receives the name file as an argument and must open it itself, while in the second case cmd simply reads whatever arrives on its standard input.
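A quick way to see that difference is a command like wc, which reports a filename only when it is given one as an argument (using the text.txt file from above):

$ wc -l text.txt      # wc opens the file itself and reports its name
3 text.txt
$ wc -l < text.txt    # the shell opens the file; wc only sees stdin
3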
You can also do process substitution:
cmd <(process)
An example use would be comparing two directories:
diff <(ls dir1) <(ls dir2)
In this case, the commands ls dir1 and ls dir2 have their output redirected to file-like streams that diff then reads as if they were two files.
You can see the name of the file device by passing a process substitution to echo:
$ echo <(ls)
/dev/fd/63
Since echo does not open its arguments as files, it just prints the name of the FIFO.
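A command that does open its file arguments will read straight through that path; for example:

$ cat <(seq 3)
1
2
3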
Here documents are easier to demonstrate. The << form has a 'limit string' that is not included in the output:
$ cat <<HERE
> line 1
> line 2
> line 3
> HERE
line 1
line 2
line 3
HERE is an arbitrary delimiter string that must appear by itself on its own line.
The 'here string' or <<< form does not require the delimiting string of the << form and is on a single line:
$ cat <<< 'line 1'
line 1
You can also expand parameters:
$ v="some text"
$ cat <<< "$v"
some text
But not other forms of shell expansions:
Brace expansion:
$ echo a{b,c,d}e
abe ace ade
$ cat <<< a{b,c,d}e
a{b,c,d}e
Given a 'generic' Bash while loop that reads input line by line:
while IFS= read -r line || [[ -n $line ]]; do printf "'%s'\n" "$line"; done
There are several ways that you can feed input into that loop.
First example, you can redirect a file. For demo, create a 6 line file:
$ seq 6 > /tmp/6.txt
Redirect the input of the file into the loop:
while IFS= read -r line || [[ -n $line ]]; do printf "'%s'\n" "$line"; done </tmp/6.txt
'1'
'2'
'3'
'4'
'5'
'6'
Or, second example, you can directly read from the output of seq using redirection:
$ while IFS= read -r line || [[ -n $line ]]; do printf "'%s'\n" "$line"; done < <(seq 3)
'1'
'2'
'3'
(Please note the extra < with a space for this form)
Or, third example, you can use a here document delimited by a 'HERE' string:
while IFS= read -r line || [[ -n $line ]]; do printf "'%s'\n" "$line"; done <<HERE
1
2 3
4
HERE
'1'
'2 3'
'4'
Going back to diff which will only work on files, you can use process substitution and a HERE doc or process substitution and redirection to use diff on free text or the output of a program.
Given:
$ cat /tmp/f1.txt
line 1
line 2
line 3
Normally you would need to have a second file to compare free text with that file. You can use a HERE doc and process substitution to skip creating a separate file:
$ diff /tmp/f1.txt <(cat <<HERE
line 1
line 2
line 5
HERE
)
3c3
< line 3
---
> line 5
command < foo
Redirect the file foo to the standard input of command.
command << foo
blah 1
blah 2
foo
Here document: send the following lines up to foo to the standard input of command.
command <<< foo
Here-string. The string foo is sent to the standard input of command.
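As a compact comparison, the same command can be fed all three ways; wc -c counts the bytes it reads on stdin (note that <<< appends a trailing newline):

$ printf 'foo\n' > input.txt
$ wc -c < input.txt
4
$ wc -c <<EOF
foo
EOF
4
$ wc -c <<< foo
4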

What does the "done < $var" at the end of a loop do?

Just a simple question - I'm wondering what the following code is doing:
nlwd="$PWD/NLWD.txt"
cat /dev/null > $nlwd
echo "Enter filename to process:"
read name
while read line
do
uid="$(echo $line | cut -d, -f1)"
echo "$uid" | grep [0-9] >> $nlwd
done < $name
In particular, I'm wondering what the done < $name is doing.
It's taking a file name, reading that file line-by-line, and doing stuff with each line.
< is an input redirect, which means that the loop is taking its input from $name.
For example:
while read LINE
do
echo $LINE
done < $name
...is essentially the same as:
cat $name
In response to your comment, the cat /dev/null > $nlwd just empties out the file's contents. It uses the > output redirection to take the contents of /dev/null (the null device, which is always empty) and write that emptiness into the file named by the $nlwd variable, truncating it. Here's a simpler example:
$> echo "something" > something.txt
$> cat something.txt
something
$> cat /dev/null > something.txt
$> cat something.txt
$>
Further reading: http://en.wikipedia.org/wiki//dev/null
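As a side note, the same truncation can be done without cat; redirecting nothing into the file is enough:

$> echo "something" > something.txt
$> : > something.txt    # or simply:  > something.txt
$> cat something.txt
$>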
It's an input redirection. The while loop (and thus each command in the while loop, specifically read) will take its standard input from the file named by $name.

Read words in a specific line in a text file using shell script

In my Bash shell script, I would like to read a specific line from a file that is delimited by :, and assign each field to a variable for processing later.
For example I want to read the words found on line 2. The text file:
abc:01APR91:1:50
Jim:02DEC99:2:3
banana:today:three:0
Once I have "read" line 2, I should be able to echo the values as something like this:
echo "$name";
echo "$date";
echo "$number";
echo "$age";
The output would be:
Jim
02DEC99
2
3
For echoing a single line of a file, I quite like sed:
$ IFS=: read name date number age < <(sed -n 2p data)
$ echo $name
Jim
$ echo $date
02DEC99
$ echo $number
2
$ echo $age
3
$
This uses process substitution to feed the output of sed to the read command. The sed command uses the -n option so it does not print each line (as it does by default); 2p means 'when it is line 2, print the line'; data is simply the name of the file.
You can use this:
read name date number age <<< $(awk -F: 'NR==2{printf("%s %s %s %s\n", $1, $2, $3, $4)}' inFile)
echo "$name"
echo "$date"
echo "$number"
echo "$age"
