Read rest of while loop output - bash

I want to run a while loop from output I get from MySQL, but my output is being cut off.
Example output I get from MySQL is:
123 nfs://192.168.1.100/full/Some.file.1.txt
124 nfs://192.168.1.100/full/A second file 2.txt
My loop looks like so:
mysql -uuser -ppass queue -ss -e 'select id,path from queue where status = 0' | while read a b c
do
echo $a
echo $b
done
The result for $b cuts off after nfs://192.168.1.100/full/A.
How can I have it output the whole sentence?

Your second filename contains spaces, so that is where the field is cut off.
Since it is the last field of the output, you can just skip field c:
mysql -uuser -ppass queue -ss -e 'select id,path from queue where status = 0' | while read a b
do
echo $a
echo $b
done
The last field in read will have all remaining fields.

The problem is that you are reading each line into three variables using:
read a b c
Your input line also contains whitespace inside the path, e.g.
124 nfs://192.168.1.100/full/A second file 2.txt
with the default IFS it is setting 3 variables as:
a=124
b=nfs://192.168.1.100/full/A
c=second file 2.txt
Since c is the last variable given to read, it receives the rest of the line.
To fix your script you can just do:
read a b
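A minimal sketch of that behaviour, with the two sample rows piped in where the mysql command would be. The function name show_rows and the variable names id and path are illustrative, not from the original script:

```shell
# The last variable given to read (path) receives the rest of the line,
# including any embedded spaces.
show_rows() {
  while read -r id path; do
    printf 'id=%s path=%s\n' "$id" "$path"
  done
}

# Stand-in for the mysql output from the question.
rows=$(printf '%s\n' \
  '123 nfs://192.168.1.100/full/Some.file.1.txt' \
  '124 nfs://192.168.1.100/full/A second file 2.txt' | show_rows)
printf '%s\n' "$rows"
```

Quoting "$path" when it is used matters as well; an unquoted expansion would be split on the spaces again.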

Related

Extracting file content using a for loop [duplicate]

I'm working on a long Bash script. I want to read cells from a CSV file into Bash variables. I can parse lines and the first column, but not any other column. Here's my code so far:
cat myfile.csv|while read line
do
read -d, col1 col2 < <(echo $line)
echo "I got:$col1|$col2"
done
It's only printing the first column. As an additional test, I tried the following:
read -d, x y < <(echo a,b,)
And $y is empty. So I tried:
read x y < <(echo a b)
And $y is b. Why?
You need to use IFS instead of -d:
while IFS=, read -r col1 col2
do
echo "I got:$col1|$col2"
done < myfile.csv
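As a small illustration of the IFS approach, here is a sketch that feeds two made-up lines through the same loop instead of myfile.csv. The function name split_two and the sample data are assumptions for the example:

```shell
# Split each line on commas only; IFS is changed just for the read command.
split_two() {
  while IFS=, read -r col1 col2; do
    printf 'I got:%s|%s\n' "$col1" "$col2"
  done
}

# The second line has three fields: col2 keeps everything after the first comma.
csv_out=$(printf 'a,b\n1,2,3\n' | split_two)
printf '%s\n' "$csv_out"
```

If a file may have more columns than variables, adding a final catch-all variable (for example `read -r col1 col2 rest`) keeps col2 clean.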
To skip a given number of header lines:
skip_headers=3
while IFS=, read -r col1 col2
do
if ((skip_headers))
then
((skip_headers--))
else
echo "I got:$col1|$col2"
fi
done < myfile.csv
Note that for general-purpose CSV parsing you should use a specialized tool that can handle quoted fields with internal commas, among other issues that Bash can't handle by itself. Examples of such tools are csvtool and csvkit.
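To see why quoted fields are a problem for the plain IFS approach, consider this sketch. The sample row and the variable names f1, f2 and f3 are made up for the illustration:

```shell
# A row whose second field is quoted and contains a comma.
# Splitting on IFS=, knows nothing about quotes, so the field is cut in two.
naive_out=$(printf '%s\n' '12,"Hello, man",42' | {
  IFS=, read -r f1 f2 f3
  printf '[%s] [%s] [%s]\n' "$f1" "$f2" "$f3"
})
printf '%s\n' "$naive_out"
```

The quoted field ends up spread over f2 and f3 with its quote characters still attached, which is the case a real CSV parser handles.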
How to parse a CSV file in Bash?
Coming late to this question: bash now offers new features, this question is specifically about bash, and none of the answers posted so far shows this powerful and standards-compliant way of doing precisely this.
Parsing CSV files under bash, using loadable module
Conforming to RFC 4180, a string like this sample CSV row:
12,22.45,"Hello, ""man"".","A, b.",42
should be split as
1 12
2 22.45
3 Hello, "man".
4 A, b.
5 42
Bash loadable C-compiled modules.
Under bash, you can create, edit, and use loadable C-compiled modules. Once loaded, they work like any other builtin! (You may find more information in the source tree.)
The current source tree (Oct 15 2021, bash V5.1-rc3) contains a bunch of samples:
accept listen for and accept a remote network connection on a given port
asort Sort arrays in-place
basename Return non-directory portion of pathname.
cat cat(1) replacement with no options - the way cat was intended.
csv process one line of csv data and populate an indexed array.
dirname Return directory portion of pathname.
fdflags Change the flag associated with one of bash's open file descriptors.
finfo Print file info.
head Copy first part of files.
hello Obligatory "Hello World" / sample loadable.
...
tee Duplicate standard input.
template Example template for loadable builtin.
truefalse True and false builtins.
tty Return terminal name.
uname Print system information.
unlink Remove a directory entry.
whoami Print out username of current user.
There is a fully working CSV parser ready to use in the examples/loadables directory: csv.c!
On a Debian GNU/Linux based system, you may have to install the bash-builtins package with:
apt install bash-builtins
Using loadable bash-builtins:
Then:
enable -f /usr/lib/bash/csv csv
From there, you could use csv as a bash builtin.
With my sample: 12,22.45,"Hello, ""man"".","A, b.",42
csv -a myArray '12,22.45,"Hello, ""man"".","A, b.",42'
printf "%s\n" "${myArray[@]}" | cat -n
1 12
2 22.45
3 Hello, "man".
4 A, b.
5 42
Then in a loop, processing a file.
while IFS= read -r line;do
csv -a aVar "$line"
printf "First two columns are: [ '%s' - '%s' ]\n" "${aVar[0]}" "${aVar[1]}"
done <myfile.csv
This way is clearly quicker and more robust than any other combination of bash builtins or forking to an external binary.
Unfortunately, depending on your system, if your version of bash was compiled without support for loadable builtins, this may not work.
Complete sample with multiline CSV fields.
Conforming to RFC 4180, a string like this single CSV row:
12,22.45,"Hello ""man"",
This is a good day, today!","A, b.",42
should be split as
1 12
2 22.45
3 Hello "man",
This is a good day, today!
4 A, b.
5 42
Full sample script for parsing CSV containing multilines fields
Here is a small sample file with 1 header line, 4 columns and 3 rows. Because two fields contain a newline, the file is 6 lines long.
Id,Name,Desc,Value
1234,Cpt1023,"Energy counter",34213
2343,Sns2123,"Temperatur sensor
to trigg for alarm",48.4
42,Eye1412,"Solar sensor ""Day /
Night""",12199.21
And a small script able to parse this file correctly:
#!/bin/bash
enable -f /usr/lib/bash/csv csv
file="sample.csv"
exec {FD}<"$file"
read -ru $FD line
csv -a headline "$line"
printf -v fieldfmt '%-8s: "%%q"\\n' "${headline[@]}"
numcols=${#headline[@]}
while read -ru $FD line;do
while csv -a row "$line" ; (( ${#row[@]} < numcols )) ;do
read -ru $FD sline || break
line+=$'\n'"$sline"
done
printf "$fieldfmt\\n" "${row[@]}"
done
This may render as follows (I've used printf "%q" to represent non-printable characters such as newlines as $'\n'):
Id : "1234"
Name : "Cpt1023"
Desc : "Energy\ counter"
Value : "34213"
Id : "2343"
Name : "Sns2123"
Desc : "$'Temperatur sensor\nto trigg for alarm'"
Value : "48.4"
Id : "42"
Name : "Eye1412"
Desc : "$'Solar sensor "Day /\nNight"'"
Value : "12199.21"
You can find a full working sample here: csvsample.sh.txt or csvsample.sh.
Note:
In this sample, I use the header line to determine the row width (number of columns). If your header line could contain newlines (or if your CSV uses more than one header line), you will have to pass the number of columns as an argument to your script (and the number of header lines).
Warning:
Of course, parsing CSV this way is not perfect! It works for many simple CSV files, but take care with encoding and security! For example, this module won't be able to handle binary fields.
Read carefully csv.c source code comments and RFC 4180!
From the man page:
-d delim
The first character of delim is used to terminate the input line,
rather than newline.
You are using -d, which will terminate the input line on the comma. It will not read the rest of the line. That's why $y is empty.
We can parse CSV files with quoted strings, delimited by, say, |, with the following code:
while read -r line
do
field1=$(echo "$line" | awk -F'|' '{printf "%s", $1}' | tr -d '"')
field2=$(echo "$line" | awk -F'|' '{printf "%s", $2}' | tr -d '"')
echo "$field1 $field2"
done < "$csvFile"
awk parses the string fields to variables and tr removes the quote.
Slightly slower as awk is executed for each field.
In addition to the answer from @Dennis Williamson, it may be helpful to skip the first line when it contains the header of the CSV:
{
read
while IFS=, read -r col1 col2
do
echo "I got:$col1|$col2"
done
} < myfile.csv
If you want to read a CSV file and skip its first (header) line, this is one solution:
i=1
while IFS=, read -ra line
do
# Skip the header: i is 1 only on the first iteration
test "$i" -eq 1 && ((i=i+1)) && continue
for col_val in "${line[@]}"
do
echo -n "$col_val|"
done
echo
done < "$csvFile"

Stopping paste after any input is exhausted

I have two programs that produce data on stdout, and I'd like to paste their output together. I can successfully do this like so:
paste <(./prog1) <(./prog2)
But I find that this method will print all lines from both inputs,
and what I really want is to stop paste after either input program is finished.
So if ./prog1 produces the output:
a
b
c
But ./prog2 produces:
Hello
World
I would expect the output:
a Hello
b World
Also note that one of the input programs may actually produce infinite output, and I want to be able to handle that case as well. For example, if my inputs are yes and ./prog2, I should get:
y Hello
y World
Use join instead, with a variation on the Schwartzian transform:
numbered () {
nl -s- -ba -nrz
}
join -j 1 <(prog1 | numbered) <(prog2 | numbered) | sed 's/^[^-]*-//'
Piping to nl numbers each line, and join -j 1 will join corresponding lines with the same number. The extra lines in the longer file will have no join partner and be omitted. Once the join is complete, pipe through sed to remove the line numbers.
Here's one solution:
while IFS= read -r -u7 a && IFS= read -r -u8 b; do echo "$a $b"; done 7<"$file1" 8<"$file2"
This has the slightly annoying effect of ignoring the last line of an input file if it is not terminated with a newline (but such a file is not a valid text file).
You can wrap this in a function, of course:
paste_short() {
(
while IFS= read -r -u7 a && IFS= read -r -u8 b; do
echo "$a $b"
done
) 7<"$1" 8<"$2"
}
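A usage sketch of the same idea, under a few assumptions: it reads with `<&7` and `<&8` redirections instead of `read -u` (a swapped-in, more portable spelling of the same technique), and it uses temporary files holding the sample output from the question in place of ./prog1 and ./prog2:

```shell
# paste_short FILE1 FILE2: pair up lines until either input runs out.
paste_short() {
  while IFS= read -r a <&7 && IFS= read -r b <&8; do
    printf '%s %s\n' "$a" "$b"
  done 7<"$1" 8<"$2"
}

# Temporary files stand in for the output of the two programs.
tmpdir=$(mktemp -d)
printf 'a\nb\nc\n' > "$tmpdir/one"
printf 'Hello\nWorld\n' > "$tmpdir/two"
paste_result=$(paste_short "$tmpdir/one" "$tmpdir/two")
rm -r "$tmpdir"
printf '%s\n' "$paste_result"
```

In bash, the two arguments could also be process substitutions, as in `paste_short <(./prog1) <(./prog2)`, since the function opens each path only once.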
Consider using awk:
awk 'FNR==NR{a[++i]=$0;next} FNR>i{exit}
{print a[FNR], $0}' <(printf "hello\nworld\n") <(printf "a\nb\nc\n")
hello a
world b
Keep the longer output producing program as your 2nd input.

Compare Lines of file to every other line of same file

I am trying to write a program that will print out every line from a file with another line of that file added at the end, basically creating pairs from a portion of each line. If the line is the same, it will do nothing. Also, it must avoid repeating the same pairs. A B is the same as B A
In short
FileInput:
otherstuff A
otherstuff B
otherstuff C
otherstuff D
Output:
A B
A C
A D
B C
B D
C D
I was trying to do this with a BASH script, but was having trouble because I could not get my nested while loops to work. It would read the first line, compare it to each other line, and then stop (Basically only outputting the first 3 lines in the example output above, the outer while loop only ran once).
I also suspect I might be able to do this using MATLAB, so suggestions using that are also welcome.
Here is the bash script that I have thus far. As I said, it is not printing out correctly for me, as the outer loop only runs once.
#READS IN file from terminal
FILE1=$1
#START count at 0
count=0
exec 3<&0
exec 0< $FILE1
while read LINEa; do
while read LINEb; do
eventIDa=$(echo $LINEa | cut -c20-23)
eventIDb=$(echo $LINEb | cut -c20-23)
echo $eventIDa $eventIDb
done
done
Using bash:
#!/bin/bash
[ -f "$1" ] || { echo >&2 "File not found"; exit 1; }
mapfile -t lines < <(cut -c20-23 <"$1" | sort | uniq)
for i in "${!lines[@]}"; do
elem1=${lines[$i]}
unset "lines[$i]"
for elem2 in "${lines[@]}"; do
echo "$elem1" "$elem2"
done
done
This will read a file given as a parameter on the command line, sort and filter out duplicates, and output all combinations. You can modify the parameter to cut to adjust to your particular input file.
Due to the particular way you seem to intend to use cut, your input example above won't work. Instead, use something with the correct line length, such as:
123456789012345678 A
123456789012345678 B
123456789012345678 C
123456789012345678 D
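The pairing step on its own can be sketched without arrays, assuming the IDs have already been extracted and de-duplicated. The function name pairs is hypothetical; it pairs each argument with every argument that follows it, so A B is produced but B A is not:

```shell
# pairs ID...: print every unordered pair of the given IDs exactly once.
pairs() {
  while [ "$#" -gt 1 ]; do
    first=$1
    shift                      # the remaining arguments are the later IDs
    for second in "$@"; do
      printf '%s %s\n' "$first" "$second"
    done
  done
}

pairs_out=$(pairs A B C D)
printf '%s\n' "$pairs_out"
```

With a real file this could be called as `pairs $(cut -c20-23 "$1" | sort -u)`, relying on word splitting, which is only safe when the extracted IDs contain no whitespace or glob characters.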
Assuming the otherstuff is not relevant (otherwise you can of course add it later) this should do the trick in Matlab:
combnk({'A' 'B' 'C' 'D'},2)

Bash add to end of file (>>) if not duplicate line

Normally I use something like this for processes I run on my servers
./runEvilProcess.sh >> ./evilProcess.log
However I'm currently using Doxygen and it produces lots of duplicate output
Example output:
QGDict::hashAsciiKey: Invalid null key
QGDict::hashAsciiKey: Invalid null key
QGDict::hashAsciiKey: Invalid null key
So you end up with a very messy log
Is there a way I can add a line to the log file only if it differs from the last line added?
A poor example (but not sure how to do in bash)
$previousLine = ""
$outputLine = getNextLine()
if($previousLine != $outputLine) {
$outputLine >> logfile.log
$previousLine = $outputLine
}
If the process returns duplicate lines in a row, pipe the output of your process through uniq:
$ ./t.sh
one
one
two
two
two
one
one
$ ./t.sh | uniq
one
two
one
If the logs are sent to the standard error stream, you'll need to redirect that too:
$ ./yourprog 2>&1 | uniq >> logfile
(This won't help if the duplicates come from multiple runs of the program - but then you can pipe your log file through uniq when reviewing it.)
Create a filter script (filter.sh):
while IFS= read -r line; do
if [ "$last" != "$line" ]; then
printf '%s\n' "$line"
last=$line
fi
done
and use it:
./runEvilProcess.sh | sh filter.sh >> evillog

Setting a BASH environment variable directly in AWK (in an AWK one-liner)

I have a file that has two columns of floating point values. I also have a C program that takes a floating point value as input and returns another floating point value as output.
What I'd like to do is the following: for each row in the original, execute the C program with the value in the first column as input, and then print out the first column (unchanged) followed by the second column minus the result of the C program.
As an example, suppose c_program returns the square of the input and behaves like this:
$ c_program 4
16
$
and suppose data_file looks like this:
1 10
2 11
3 12
4 13
What I'd like to return as output, in this case, is
1 9
2 7
3 3
4 -3
To write this in really sketchy pseudocode, I want to do something like this:
awk '{print $1, $2 - `c_program $1`}' data_file
But of course, I can't just pass $1, the awk variable, into a call to c_program. What's the right way to do this, and preferably, how could I do it while still maintaining the "awk one-liner"? (I don't want to pull out a sledgehammer and write a full-fledged C program to do this.)
You can do everything in awk:
awk '{cmd="c_program "$1; cmd|getline l;print $1,$2-l}' file
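A self-contained sketch of the getline approach, under two assumptions: `expr N \* N` stands in for the c_program from the question (which is described as squaring its input), and the data is piped in instead of read from data_file. It also calls close(cmd), which the one-liner above omits; without it, a repeated value in the first column would reuse an already exhausted pipe, and long files can run out of open file descriptors:

```shell
# For each row: run an external command on column 1, then print
# column 1 followed by column 2 minus the command's output.
awk_out=$(printf '1 10\n2 11\n3 12\n4 13\n' |
  awk '{
    cmd = "expr " $1 " \\* " $1   # stand-in for: "c_program " $1
    cmd | getline squared         # read the first line of the command output
    close(cmd)                    # release the pipe before the next row
    print $1, $2 - squared
  }')
printf '%s\n' "$awk_out"
```

Because awk does the subtraction, this form also works with floating point values, provided the external command prints them.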
This shows how to execute a command in awk:
ls | awk '/^a/ {system("ls -ld " $1)}'
You could use a bash script instead:
while read line
do
FIRST=`echo $line | cut -d' ' -f1`
SECOND=`echo $line | cut -d' ' -f2`
OUT=`expr $SECOND \* 4`
echo $FIRST $OUT `expr $OUT - $SECOND`
done
The shell is a better tool for this, using a little-used feature. There is a shell variable IFS, the Internal Field Separator, that sh uses to split command lines when parsing; it defaults to <Space><Tab><Newline>, which is why ls foo is interpreted as two words.
When set is given arguments not beginning with - it sets the positional parameters of the shell to the contents of the arguments as split via IFS, thus:
#!/bin/sh
while read line ; do
set $line
subtrahend=`c_program $1`
echo $1 `expr $2 - $subtrahend`
done < data_file
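Pure Bash, without using any external executables other than your program (note that Bash arithmetic is integer-only):
#!/bin/bash
while read -r num1 num2
do
# Second column minus the program's result for the first column
(( result = num2 - $(c_program "$num1") ))
echo "$num1 $result"
done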
As others have pointed out, awk is not well equipped for this job. Here is a suggestion in bash:
#!/bin/bash
data_file=$1
while read -r column_1 column_2 the_rest
do
# Second column minus the program's result for the first column
((result=column_2-$(c_program "$column_1")))
echo "$column_1" "$result" "$the_rest"
done < "$data_file"
Save this to a file, say myscript.sh, then invoke it as:
bash myscript.sh data_file
The read command reads each line from the data file (which is redirected to standard input) and assigns the first two columns to the column_1 and column_2 variables. The rest of the line, if there is any, is stored in the_rest.
Next, I calculate the result and print the line according to your requirements. Note that I surround $the_rest with quotes to preserve spacing. Failure to do so would cause multiple spaces in the input file to be squeezed into one.
