while read loop ignoring last line in file - bash

I'm reading in a file, members_08_14.csv, which just contains a list of numbers, with a while loop that reads each line. Each number is matched against a regex to ensure that it consists only of digits and is exactly 11 characters long.
while read card; do
    if [[ $card =~ ^[0-9]{11}$ ]]; then
        echo "some sql statement with $card" >> temp.sql
    else
        echo "Invalid card number in file: $card"
    fi
done <registered/members_08_14.csv
The interesting thing is, the else is not being executed if the regex does not match. I would expect that either the line would be written to temp.sql, or a line would be printed to stdout saying the card number is invalid.
The behaviour, however, is more along the lines of either only the true condition or only the false condition gets activated for the entire file. Why would this be?
Here's the contents of registered/members_08_14.csv:
The first lines are valid, the 5th line is invalid.
Output of cat -vte registered/members_08_14.csv

If the last line of your file has no newline on the end, read will put its content into card -- but will then exit with a nonzero value. Because read has exited with a nonzero value in this case, the while loop will exit without going on to the statement that runs the regex at all.
The easiest fix is to correct the file.
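To see the behaviour described above, and one way to correct the file, here is a minimal sketch. The temporary file and the sample card number are assumptions made for illustration; the `tail -c1` test appends a newline only when the file's last byte is not already one.

```shell
# Hypothetical sample: one card number with no trailing newline.
tmp=$(mktemp)
printf '12345678901' > "$tmp"

# read fills $card but reports failure because it hit EOF before a newline.
status_before=0
read -r card < "$tmp" || status_before=$?
echo "before: status=$status_before card=$card"

# Append a newline only if the last byte is not a newline. Command
# substitution strips a trailing newline, so the result is empty when
# the file already ends with one.
if [ -n "$(tail -c1 "$tmp")" ]; then
    echo >> "$tmp"
fi

status_after=0
read -r card < "$tmp" || status_after=$?
echo "after: status=$status_after card=$card"
rm -f "$tmp"
```

After the fix, the same read succeeds, so a plain while read loop would process the last line as well.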
Another approach you can take is to ignore the exit status of read when it actually populates its destination (and, while at it, to put $'\r' into IFS, such that read will ignore the extra characters in DOS newlines):
while card=; IFS=$' \t\r\n' read -r card || [[ $card ]]; do
    if [[ $card =~ ^[0-9]{11}$ ]]; then
        echo "some sql statement with $card" >> temp.sql
    else
        echo "Invalid card number in file: $card"
    fi
done <registered/members_08_14.csv

Perhaps your file is in DOS format, so that a carriage return (\r) ends up at the end of the variable. Try running dos2unix file or sed -i 's|\r$||' file. Another way is to trim that character from every input line like this:
while IFS=$' \t\r\n' read -r card
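As a rough illustration of the effect of putting \r into IFS, the sketch below reads the same DOS-style line twice. The temporary file and sample value are assumptions; the behaviour shown is that of recent bash versions, where read drops a single trailing IFS delimiter.

```shell
# Hypothetical DOS-formatted input: the line ends in \r\n.
tmp=$(mktemp)
printf '12345678901\r\n' > "$tmp"

# Default IFS: the carriage return stays attached to the value.
read -r raw < "$tmp"

# With \r in IFS, read treats the trailing carriage return as a
# delimiter and drops it.
IFS=$' \t\r\n' read -r card < "$tmp"

# Alternative: strip one trailing \r with parameter expansion.
stripped=${raw%$'\r'}

echo "raw length:      ${#raw}"
echo "card length:     ${#card}"
echo "stripped length: ${#stripped}"
rm -f "$tmp"
```

With the carriage return still attached, the value is 12 characters long, which is why the 11-digit regex rejects every line of a DOS-formatted file.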

To read all the lines, regardless of whether the last one ends with a newline or not:
{ cat "somefile"; echo; } | while IFS= read -r line; do echo "$line"; done
Note that this yields one extra empty line when the file already ends with a newline.

Copy number of line composed by special character in bash

I have an exercise where I have a file, and at the beginning of it I have something like
# tototata
Hello world
Test test
#this is it
I have to take each line starting with a # from the top of the file, up to the first line that doesn't start with one, and store them in a variable. If there is a shebang, it has to be skipped, and blank lines between those lines have to be skipped too. We just want the comment between the shebang and the next non-comment line.
I'm new to bash and I would like to know if there's a way to do it, please?
Expected output:
# tototata
Try it this easy way, to better understand:
sed 1d your_input_file | while read -r line; do
    check=$(echo "$line" | grep '^[#;]')
    if [ -n "$check" ] || [ -z "$line" ]; then
        echo "$line"
    else
        exit 1
    fi
done
This may be more correct, although your question was unclear about whether the input file had a script shebang, whether the shebang had to be skipped to match your sample output, or whether the input file's shebang was just bogus.
It is also unclear what to do if the first lines of the input file do not start with #.
You should really post your assignment's text as a reference.
Anyway, here is a script that collects the first set of consecutive lines starting with a sharp # into the arr array variable.
It may not be an exact solution to your assignment (which you should be able to solve with what your previous lessons taught you), but it will give you some clues and keys for iterating over the lines of a file and testing whether a line starts with a #.
#!/usr/bin/env bash
# Our variable to store parsed lines:
# an array of strings with one entry per line
declare -a arr=()
# Iterate reading lines from the file
# while they match the regex ^[#],
# meaning while lines start with a sharp #
while IFS=$'\n' read -r line && [[ "$line" =~ ^[#] ]]; do
    # Add line to the arr array variable
    arr+=("$line")
done <a.txt
# Print each array entry followed by a newline
printf '%s\n' "${arr[@]}"
How about this (not tested, so you may have to debug it a bit, but my comments in the code should explain what is going on):
while IFS= read -r line; do
    # initial is 1 on the first line, and 0 after this. When the script starts,
    # the variable is undefined.
    : "${initial:=1}"
    # Test for lines starting with #. Need to quote the hash
    # so that it is not taken as comment.
    if [[ $line == '#'* ]]; then
        # Test for initial #!
        if (( initial == 1 )) && [[ $line == '#!'* ]]; then
            : # ignore it
        else
            echo "$line" # or do whatever you want to do with it
        fi
    elif [[ $line == *[^\ ]* ]]; then
        # stop on non-blank, non-comment line
        break
    fi
    initial=0 # Next line won't be an initial line
done < your_file
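Pulling the requirements from the question together, here is a minimal sketch that stores the leading comment block in a variable. The function name leading_comments and the sample file are hypothetical; it assumes that a shebang can only appear on line 1 and that whitespace-only lines count as blank.

```shell
# Collect the first block of comment lines from a file,
# skipping a shebang on line 1 and any blank lines.
leading_comments() {
    local file=$1 line lineno=0 out=
    while IFS= read -r line || [[ -n $line ]]; do
        lineno=$((lineno + 1))
        # Skip a shebang, but only on the very first line.
        if (( lineno == 1 )) && [[ $line == '#!'* ]]; then
            continue
        fi
        # Skip blank (empty or whitespace-only) lines.
        if [[ $line != *[![:space:]]* ]]; then
            continue
        fi
        # Stop at the first line that is not a comment.
        [[ $line == '#'* ]] || break
        out+=$line$'\n'
    done < "$file"
    # Print the collected lines without the final newline separator.
    printf '%s' "${out%$'\n'}"
}

# Hypothetical input mirroring the question's example, plus a shebang.
sample=$(mktemp)
printf '#!/bin/bash\n\n# tototata\nHello world\nTest test\n#this is it\n' > "$sample"
comments=$(leading_comments "$sample")
echo "$comments"
rm -f "$sample"
```

For this sample the variable ends up holding only the line # tototata, since the later #this is it comes after a non-comment line.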

How to check if the matched EMPTY line is the LAST line of a file in a while IFS read

I have a while IFS read loop to check for different matches in the lines.
I check for an empty/blank line like this:
while IFS= read -r line; do
    [[ -z $line ]] && printf some stuff
done
I also want to check whether the matched empty/blank line is also the last line of the file. I am going to run this script on a lot of files. They all:
- end with an empty line
- have a DIFFERENT LENGTH, so I cannot assume anything
- have other empty lines, but not necessarily at the very end (this is why I have to differentiate)
Thanks in advance.
As chepner has noted, in a shell line-reading loop the only way to know whether a given line is the last one is to try to read the next one.
You can emulate "peeking" at the next line using the code below, which allows you to detect the desired condition while still processing the lines uniformly.
This solution may not be for everyone, because the logic is nontrivial and therefore requires quite a bit of extra, non-obvious code, and processing is slowed down as well.
Note that the code assumes that the last line has a trailing \n (as all well-formed multiline text input should have).
#!/usr/bin/env bash
eof=0 peekedChar= hadEmptyLine=0 lastLine=0
while IFS= read -r line || { eof=1; (( hadEmptyLine )); }; do
    # Construct the 1-2 element array of lines to process in this iteration:
    # - an empty line detected in the previous iteration by peeking, if applicable
    (( hadEmptyLine )) && aLines=( '' ) || aLines=()
    # - the line read in this iteration, with the peeked char. prepended
    if (( eof )); then
        # No further line could be read in this iteration; we're here only because
        # $hadEmptyLine was set, which implies that the empty line we're processing
        # is by definition the last one.
        lastLine=1 hadEmptyLine=0
    else
        # Add the just-read line, with the peeked char. prepended.
        aLines+=( "${peekedChar}${line}" )
        # "Peek" at the next line by reading 1 character, which
        # we'll have to prepend to the *next* iteration's line, if applicable.
        # Being unable to read tells us that this is the last line.
        IFS= read -n 1 peekedChar || lastLine=1
        # If the next line happens to be empty, our "peeking" has fully consumed it,
        # so we must tell the next iteration to insert processing of this empty line.
        hadEmptyLine=$(( ! lastLine && ${#peekedChar} == 0 ))
    fi
    # Process the 1-2 lines.
    ndx=0 maxNdx=$(( ${#aLines[@]} - 1 ))
    for line in "${aLines[@]}"; do
        if [[ -z $line ]]; then # an empty line
            # Determine if this empty line is the last one overall.
            thisEmptyLineIsLastLine=$(( lastLine && ndx == maxNdx ))
            echo "empty line; last? $thisEmptyLineIsLastLine"
        else
            echo "nonempty line: [$line]"
        fi
        (( ++ndx ))
    done
done < file
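A simpler alternative to peeking, which is not the approach of the answer above, is to delay processing by one line: each line is held back until the next read either succeeds (so it was not the last line) or fails (so it was). The sketch below uses the hypothetical names handle_line and process_file and, like the peeking version, assumes the last line ends with \n.

```shell
# Called once per line; the second argument is 1 for the final line.
# (handle_line is a hypothetical name; replace its body with real work.)
handle_line() {
    local line=$1 is_last=$2
    if [[ -z $line ]]; then
        echo "empty line; last? $is_last"
    else
        echo "nonempty line: [$line]"
    fi
}

# Hold each line back until the next read succeeds or fails.
process_file() {
    local prev line have_prev=0
    while IFS= read -r line; do
        if (( have_prev )); then
            handle_line "$prev" 0
        fi
        prev=$line
        have_prev=1
    done < "$1"
    # Whatever is still held back when read fails is the last line.
    if (( have_prev )); then
        handle_line "$prev" 1
    fi
}

# Hypothetical sample: an empty line in the middle and one at the end.
sample=$(mktemp)
printf 'a\n\nb\n\n' > "$sample"
result=$(process_file "$sample")
echo "$result"
rm -f "$sample"
```

Only the empty line at the very end is reported with last? 1; the empty line in the middle is reported with last? 0.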

Reformatting a csv file, script is confused by ' %." '

I'm using bash on cygwin.
I have to take a .csv file that is a subset of a much larger set of settings and shuffle the new csv settings (same keys, different values) into the 1000-plus-line original, making a new .json file.
I have put together a script to automate this. The first step in the process is to "clean up" the csv file by extracting lines that start with "mme " and "sms ". Everything else is to pass through cleanly to the "clean" .csv file.
This routine is as follows:
# clean up the settings, throwing out mme and sms entries
cat extract.csv | while read -r LINE; do
    if [[ $LINE == "mme "* ]]; then
        printf "$LINE\n" >> mme_settings.csv
    elif [[ $LINE == "sms "* ]]; then
        printf "$LINE\n" >> sms_settings.csv
    else
        printf "$LINE\n" >> extract_clean.csv
    fi
done
My problem is that this thing stubs its toe on the following string at the end of one entry: 100%." When it's done with the line, it simply elides the %." and the new-line marker following it, and smears the two lines together:
... 100next.entry.keyname...
I would love to reach in and simply manually delimit the % sign, but it's not a realistic option for my use case. Clearly I'm missing something. My suspicion is that I am in some wise abusing cat or read in the first line.
If there is some place I should have looked to find the answer before bugging you all, by all means point me in that direction and I'll sod off.
The syntax for printf is:
printf format [argument]...
In a printf format string, anything following a % is treated as a format specifier. What you want to do is:
while read -r line; do # Replaced LINE with line; all-uppercase variable names are reserved for the system
    if [[ "$line" = "mme "* ]]; then # Here * globs for anything that comes next
        printf "%s\n" "$line" >> mme_settings.csv
    elif [[ "$line" = "sms "* ]]; then
        printf "%s\n" "$line" >> sms_settings.csv
    else
        printf "%s\n" "$line" >> extract_clean.csv
    fi
done<extract.csv # Avoided the useless use of cat
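The difference between the two ways of calling printf can be seen in isolation with a small sketch. The sample line is an assumption modelled on the 100%." string from the question; the error message from the unsafe call is silenced for the demo.

```shell
# Hypothetical line ending in the troublesome characters from the question.
line='some.key,"limit is 100%."'

# Unsafe: the data is used as the format string, so %." is parsed as a
# (broken) format specifier.
unsafe=$(printf "$line\n" 2>/dev/null) || true

# Safe: a fixed format string; the data is only ever an argument.
safe=$(printf '%s\n' "$line")

echo "unsafe: $unsafe"
echo "safe:   $safe"
```

The safe form reproduces the line exactly, whereas the unsafe form loses everything from the % onwards.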
As pointed out, your problem is expanding a parameter containing a formatting instruction in the formatting argument of printf, which can be solved by using echo instead or moving the parameter to be expanded out of the formatting string, as demonstrated in other answers.
I recommend not looping over your whole file with Bash in the first place, as it's notoriously slow; you're extracting lines starting with certain patterns, which is a job at which grep excels:
grep '^mme ' extract.csv > mme_settings.csv
grep '^sms ' extract.csv > sms_settings.csv
grep -v '^mme \|^sms ' extract.csv > extract_clean.csv
The third command uses the -v option (extract lines that don't match) and alternation to exclude lines that start with either mme or sms.
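The \| alternation used above is a GNU extension to basic regular expressions. Where portability matters, a variant with extended regular expressions (-E) may be preferable. The sketch below runs it on made-up sample data in a temporary directory; the file contents are assumptions.

```shell
# Hypothetical sample data in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' 'mme a=1' 'foo b=2' 'sms c=3' 'bar d=100%."' > "$workdir/extract.csv"

grep '^mme ' "$workdir/extract.csv" > "$workdir/mme_settings.csv"
grep '^sms ' "$workdir/extract.csv" > "$workdir/sms_settings.csv"
# -E enables extended regular expressions, so ( | ) needs no backslashes.
grep -Ev '^(mme|sms) ' "$workdir/extract.csv" > "$workdir/extract_clean.csv"

mme=$(cat "$workdir/mme_settings.csv")
sms=$(cat "$workdir/sms_settings.csv")
clean=$(cat "$workdir/extract_clean.csv")
printf 'clean:\n%s\n' "$clean"
rm -rf "$workdir"
```

Because grep never interprets the data as a format string, the line ending in 100%." passes through unchanged.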

Bash read function returns error code when using new line delimiter

I have a script that I am returning multiple values from, each on a new line. To capture those values as bash variables I am using the read builtin (as recommended here).
The problem is that when I use the new line character as the delimiter for read, I seem to always get a non-zero exit code. This is playing havoc with the rest of my scripts, which check the result of the operation.
Here is a cut-down version of what I am doing:
$ read -d '\n' a b c < <(echo -e "1\n2\n3"); echo $?; echo $a $b $c
1
1 2 3
Notice the exit status of 1.
I don't want to rewrite my script (the echo command above) to use a different delimiter (as it makes sense to use new lines in other places of the code).
How do I get read to play nice and return a zero exit status when it successfully reads 3 values?
Hmmm, it seems that I may be using the "delimiter" wrongly. From the man page:
-d delim
The first character of delim is used to terminate the input line,
rather than newline.
Therefore, one way I could achieve the desired result is to do this:
read -d '#' a b c < <(echo -e "1\n2\n3\n## END ##"); echo $?; echo $a $b $c
Perhaps there's a nicer way though?
The "problem" here is that read returns non-zero when it reaches EOF, which happens when the delimiter isn't at the end of the input.
So adding a newline to the end of your input will make it work the way you expect (and fix the argument to -d as indicated in gniourf_gniourf's comment).
What's happening in your example is that read is scanning for \ and hitting EOF before finding it. Then the input line is being split on \n (because of IFS) and assigned to $a, $b and $c. Then read is returning non-zero.
Using -d for this is fine, but \n is the default delimiter, so you aren't changing anything by specifying it. If you had gotten the delimiter correct (-d $'\n') in the first place, you would have seen that your example does not work at all, although read would have returned 0: only the first line is read, so only $a is set. (See http://ideone.com/MWvgu7)
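The two cases can be compared side by side. This is only an illustrative sketch; the variable names x, y, z, status1 and status2 are made up for the comparison.

```shell
# -d '\n': the delimiter is the backslash, so read runs to EOF,
# splits everything on IFS, and reports failure.
status1=0
read -d '\n' a b c < <(printf '1\n2\n3\n') || status1=$?
echo "status=$status1 a=$a b=$b c=$c"

# -d $'\n': a real newline delimiter, so read stops after the first line.
status2=0
read -d $'\n' x y z < <(printf '1\n2\n3\n') || status2=$?
echo "status=$status2 x=$x y=$y z=$z"
```

The first call sets all three variables but returns 1; the second returns 0 but sets only x.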
A common idiom when using read (mostly with non-standard values for -d) is to test both read's return value and whether the variable assigned to has a value: read -d '' line || [ "$line" ], for example. This works even when read fails on the last "line" of input because of a missing terminator at the end.
So to get your example working you want to either use multiple read calls the way chepner indicated or (if you really want a single call) then you want (See http://ideone.com/xTL8Yn):
IFS=$'\n' read -d '' a b c < <(printf '1 1\n2 2\n3 3')
echo $?
printf '[%s]\n' "$a" "$b" "$c"
And adding \0 to the end of the input stream (e.g. printf '1 1\n2 2\n3 3\0') or putting || [ "$a" ] at the end will avoid the failure return from the read call.
Setting IFS for read prevents the shell from word-splitting on spaces and breaking up the input incorrectly. -d '' makes read stop at \0 (NUL).
-d is the wrong thing to use here. What you really want is three separate calls to read:
{ read a; read b; read c; } < <(echo $'1\n2\n3\n')
Be sure that the input ends with a newline so that the final read has an exit status of 0.
If you don't know how many lines are in the input ahead of time, you need to read the values into an array. In bash 4, that takes just a single call to readarray:
readarray -t arr < <(echo $'1\n2\n3\n')
Prior to bash 4, you need to use a loop:
while read value; do
    arr+=("$value")
done < <(echo $'1\n2\n3\n')
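As a rough comparison of the two approaches on input whose last line lacks a newline, the sketch below may help. The array name arr2 is hypothetical, and the || [[ -n $value ]] guard is an addition that keeps the final value in the loop version.

```shell
# readarray (bash 4+) keeps a final line even without a trailing newline.
readarray -t arr < <(printf '1\n2\n3')
echo "readarray: ${#arr[@]} values: ${arr[*]}"

# Loop equivalent for older bash; the extra test keeps that final line too.
arr2=()
while IFS= read -r value || [[ -n $value ]]; do
    arr2+=("$value")
done < <(printf '1\n2\n3')
echo "loop:      ${#arr2[@]} values: ${arr2[*]}"
```

Both arrays end up with three elements, even though the input does not end with a newline.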
read always reads a single line of input; the -d option changes read's idea of what terminates a line. An example:
$ while read -d'#' value; do
> echo "$value"
> done << EOF
> a#b#c#
> EOF
a
b
c

Parsing .csv file in bash, not reading final line

I'm trying to parse a csv file I made with Google Spreadsheet. It's very simple for testing purposes, and is basically:
1,2
3,4
5,6
The problem is that the csv doesn't end in a newline character, so when I cat the file in bash, I get
MacBook-Pro:Desktop kkSlider$ cat test.csv
1,2
3,4
5,6MacBook-Pro:Desktop kkSlider$
I just want to read line by line in a BASH script using a while loop that every guide suggests, and my script looks like this:
while IFS=',' read -r last first
do
    echo "$last $first"
done < test.csv
The output is:
MacBook-Pro:Desktop kkSlider$ ./test.sh
1 2
3 4
Any ideas on how I could have it read that last line and echo it?
Thanks in advance.
You can force the input to your loop to end with a newline thus:
(cat test.csv ; echo) | while IFS=',' read -r last first
do
    echo "$last $first"
done
Unfortunately, this may result in an empty line at the end of your output if the input already has a newline at the end. You can fix that with a little addition:
(cat test.csv ; echo) | while IFS=',' read -r last first
do
    if [[ $last != "" ]] ; then
        echo "$last $first"
    fi
done
Another method relies on the fact that the values are being placed into the variables by the read but they're just not being output because of the while statement:
while IFS=',' read -r last first
do
    echo "$last $first"
done <test.csv
if [[ $last != "" ]] ; then
    echo "$last $first"
fi
That one works without creating another subshell to modify the input to the while statement.
Of course, I'm assuming here that you want to do more inside the loop than just output the values with a space rather than a comma. If that's all you wanted to do, there are other tools better suited than a bash read loop, such as:
tr "," " " <test.csv
cat file |sed -e '${/^$/!s/$/\n/;}'| while IFS=',' read -r last first; do echo "$last $first"; done
If the last (unterminated) line needs to be processed differently from the rest, @paxdiablo's version with the extra if statement is the way to go; but if it's going to be handled like all the others, it's cleaner to process it in the main loop.
You can roll the "if there was an unterminated last line" into the main loop condition like this:
while IFS=',' read -r last first || [ -n "$last" ]
do
    echo "$last $first"
done < test.csv
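To check that this loop condition really picks up an unterminated last line, here is a small self-contained sketch. The temporary file stands in for test.csv and its contents are assumed from the output shown in the question.

```shell
# Hypothetical stand-in for test.csv: no newline after the last record.
sample=$(mktemp)
printf '1,2\n3,4\n5,6' > "$sample"

output=$(
    while IFS=',' read -r last first || [ -n "$last" ]; do
        echo "$last $first"
    done < "$sample"
)
echo "$output"
rm -f "$sample"
```

On the final pass read fails but still fills $last, so the -n test lets the body run once more. The read after that leaves $last empty and the loop ends.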
