How to check if the matched EMPTY line is the LAST line of a file in a while IFS read - bash

I have a while IFS read loop to check for different matches in the lines.
I check for an empty/blank line like this:
while IFS= read -r line; do
[[ -z $line ]] && printf some stuff
I also want to check whether the matched empty/blank line is also the last line of the file. I am going to run this script on a lot of files; they all:
-end with an empty line
-are a DIFFERENT LENGTH, so I cannot assume anything
-have other empty lines, but not necessarily at the very end (this is why I have to differentiate)
Thanks in advance.

As chepner has noted, in a shell line-reading loop the only way to know whether a given line is the last one is to try to read the next one.
You can emulate "peeking" at the next line using the code below, which allows you to detect the desired condition while still processing the lines uniformly.
This solution may not be for everyone, because the logic is nontrivial and therefore requires quite a bit of extra, non-obvious code, and processing is slowed down as well.
Note that the code assumes that the last line has a trailing \n (as all well-formed multiline text input should have).
#!/usr/bin/env bash
eof=0 peekedChar= hadEmptyLine=0 lastLine=0
while IFS= read -r line || { eof=1; (( hadEmptyLine )); }; do
# Construct the 1-2 element array of lines to process in this iteration:
# - an empty line detected in the previous iteration by peeking, if applicable
(( hadEmptyLine )) && aLines=( '' ) || aLines=()
# - the line read in this iteration, with the peeked char. prepended
if (( eof )); then
# No further line could be read in this iteration; we're here only because
# $hadEmptyLine was set, which implies that the empty line we're processing
# is by definition the last one.
lastLine=1 hadEmptyLine=0
else
# Add the just-read line, with the peeked char. prepended.
aLines+=( "${peekedChar}${line}" )
# "Peek" at the next line by reading 1 character, which
# we'll have to prepend to the *next* iteration's line, if applicable.
# Being unable to read tells us that this is the last line.
IFS= read -r -n 1 peekedChar || lastLine=1
# If the next line happens to be empty, our "peeking" has fully consumed it,
# so we must tell the next iteration to insert processing of this empty line.
hadEmptyLine=$(( ! lastLine && ${#peekedChar} == 0 ))
fi
# Process the 1-2 lines.
ndx=0 maxNdx=$(( ${#aLines[@]} - 1 ))
for line in "${aLines[@]}"; do
if [[ -z $line ]]; then # an empty line
# Determine if this empty line is the last one overall.
thisEmptyLineIsLastLine=$(( lastLine && ndx == maxNdx ))
echo "empty line; last? $thisEmptyLineIsLastLine"
else
echo "nonempty line: [$line]"
fi
((++ndx))
done
done < file
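If the files are small enough to hold in memory, a much simpler alternative (my suggestion, not part of the answer above) is to read all lines into an array with mapfile first, so the index of the last line is known up front:

```shell
# Demo file (hypothetical): two nonempty lines, an interior empty line,
# and a trailing empty line (the file ends in "\n\n").
printf 'a\n\nb\n\n' > /tmp/demo.txt

# mapfile -t stores each line (newline stripped) as one array element.
mapfile -t lines < /tmp/demo.txt
last=$(( ${#lines[@]} - 1 ))
for i in "${!lines[@]}"; do
  if [[ -z ${lines[i]} ]]; then
    echo "empty line; last? $(( i == last ))"
  else
    echo "nonempty line: [${lines[i]}]"
  fi
done
```

This trades memory for simplicity; for very large files the streaming approach above is still preferable.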

Related

Copy number of line composed by special character in bash

I have an exercise where I have a file, and at the beginning of it I have something like
#!usr/bin/bash
# tototata
#tititutu
#ttta
Hello world
Hi
Test test
#zabdazj
#this is it
And I have to take each first line starting with a # until the line where I don't have one, and store it in a variable. In case of a shebang, it has to skip it, and if there are blank lines between lines, it has to skip them too. We just want the comment between the shebang and the next character.
I'm new to bash and I would like to know if there's a way to do it, please?
Expected output:
# tototata
#tititutu
#ttta
Try this easy way, to better understand.
#!/bin/bash
sed 1d your_input_file | while read -r line
do
    check=$(echo "$line" | grep '^[#;]')
    if [ -n "$check" ] || [ -z "$line" ]
    then
        echo "$line"
    else
        exit 1
    fi
done
This may be more correct, although your question was unclear about whether the input file had a script shebang, whether the shebang had to be skipped to match your sample output, or whether the input file's shebang was just bogus.
It is also unclear for what to do, if the first lines of the input file are not starting with #.
You should really post your assignment's text as a reference.
Anyway, here is a script that collects the first set of consecutive lines starting with a sharp # into the arr array variable.
It may not be an exact solution to your assignment (which you should be able to solve with what your previous lessons taught you), but it will give you some clues and keys to iterating over lines read from a file and testing whether a line starts with a #.
#!/usr/bin/env bash
# Our variable to store parsed lines
# Is an array of strings with an entry per line
declare -a arr=()
# Iterate reading lines from the file
# while it matches the regex ^[#],
# meaning while lines start with a sharp #
while IFS=$'\n' read -r line && [[ "$line" =~ ^[#] ]]; do
# Add line to the arr array variable
arr+=("$line")
done <a.txt
# Print each array entries with a newline
printf '%s\n' "${arr[@]}"
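For comparison, the same extraction can be sketched as an awk one-liner (my addition, not from the answers; unlike the loop above, it also skips a line-1 shebang and blank lines inside the block, as the question asks):

```shell
# Hypothetical input file reproducing the question's sample.
printf '#!usr/bin/bash\n# tototata\n#tititutu\n#ttta\nHello world\nHi\n' > /tmp/a.txt

awk 'NR == 1 && /^#!/ { next }   # skip a shebang on line 1
     /^[[:space:]]*$/ { next }   # skip blank lines inside the block
     /^#/             { print; next }
                      { exit }   # stop at the first non-comment line
' /tmp/a.txt
```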
How about this (not tested, so you may have to debug it a bit, but my comments in the code should explain what is going on):
while read -r line
do
# initial is 1 on the first line, and 0 after that. When the script starts,
# the variable is undefined.
: ${initial:=1}
# Test for lines starting with #. Need to quote the hash
# so that it is not taken as comment.
if [[ $line == '#'* ]]
then
# Test for initial #!
if (( initial == 1 )) && [[ $line == '#!'* ]]
then
: # ignore it
else
echo $line # or do whatever you want to do with it
fi
fi
# stop on non-blank, non-comment line
if [[ $line != '#'* && $line == *[^\ ]* ]]
then
break
fi
initial=0 # Next line won't be an initial line
done < your_file

Bash/Shell -- Inserting line breaks between dates

I've got data that comes in a file with multiple dates/times, etc...
example:
12/15/19,23:30,80.2
12/15/19,23:45,80.6
12/16/19,00:00,80.5
12/16/19,00:15,80.2
I would like to use some command that automatically goes through the whole file and, any time the date changes, inserts 2 blank lines so that I'm able to see more clearly when the date changes.
example of what I'm looking for the file to look like after said command:
12/15/19,23:30,80.2
12/15/19,23:45,80.6


12/16/19,00:00,80.5
12/16/19,00:15,80.2
What is the best way to do this through bash/shell command line commands?
Using awk:
awk -F',' 'NR>1 && prev!=$1{ print ORS }
{ prev=$1; print }' file
- Use , as field separator
- If this is not the first line and prev is different from field1, print two newlines (print prints one newline and the output record separator ORS another one)
- For each line, save the value of field1 in variable prev and print the line
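A quick run on the question's sample data (the file name /tmp/dates.csv is my own) shows the two inserted blank lines:

```shell
printf '%s\n' '12/15/19,23:30,80.2' '12/15/19,23:45,80.6' \
              '12/16/19,00:00,80.5' '12/16/19,00:15,80.2' > /tmp/dates.csv

awk -F',' 'NR>1 && prev!=$1{ print ORS }
           { prev=$1; print }' /tmp/dates.csv
```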
Since you're detecting patterns over multiple lines, you'll want to use bash builtins instead of programs like grep or sed.
# initialize variable
last_date=''
# loop over file lines (IFS='' to loop by line instead of word)
while IFS='' read -r line; do
# extract date (up to first comma)
this_date="${line%%,*}"
# print blank line unless dates are equal
[[ "$this_date" = "$last_date" ]] || echo
# remember date for next line
last_date="$this_date"
# print
printf '%s\n' "$line"
# feed loop with file
done < my_file.txt
Here's the shorter copy/paste version:
b='';while IFS='' read -r l;do a="${l%%,*}";[[ "$a" = "$b" ]]||echo;b="$a";printf '%s\n' "$l";done < my_file.txt
And you can also make it a function:
function add_spaces {
# initialize variable
last_date=''
# loop over file lines (IFS='' to loop by line instead of word)
while IFS='' read -r line; do
# extract date (up to first comma)
this_date="${line%%,*}"
# print blank line unless dates are equal
[[ "$this_date" = "$last_date" ]] || echo
# remember date for next line
last_date="$this_date"
# print
printf '%s\n' "$line"
# feed loop with file
done < "$1" # $1 is the first argument to the function
}
So that you can call it whenever you want:
add_spaces my_file.txt

Bash read function returns error code when using new line delimiter

I have a script that I am returning multiple values from, each on a new line. To capture those values as bash variables I am using the read builtin (as recommended here).
The problem is that when I use the new line character as the delimiter for read, I seem to always get a non-zero exit code. This is playing havoc with the rest of my scripts, which check the result of the operation.
Here is a cut-down version of what I am doing:
$ read -d '\n' a b c < <(echo -e "1\n2\n3"); echo $?; echo $a $b $c
1
1 2 3
Notice the exit status of 1.
I don't want to rewrite my script (the echo command above) to use a different delimiter (as it makes sense to use new lines in other places of the code).
How do I get read to play nice and return a zero exit status when it successfully reads 3 values?
Update
Hmmm, it seems that I may be using the "delimiter" wrongly. From the man page:
-d *delim*
The first character of delim is used to terminate the input line,
rather than newline.
Therefore, one way I could achieve the desired result is to do this:
read -d '#' a b c < <(echo -e "1\n2\n3\n## END ##"); echo $?; echo $a $b $c
Perhaps there's a nicer way though?
The "problem" here is that read returns non-zero when it reaches EOF which happens when the delimiter isn't at the end of the input.
So adding a newline to the end of your input will make it work the way you expect (and fix the argument to -d as indicated in gniourf_gniourf's comment).
What's happening in your example is that read is scanning for \ and hitting EOF before finding it. Then the input line is being split on \n (because of IFS) and assigned to $a, $b and $c. Then read is returning non-zero.
Using -d for this is fine, but \n is the default delimiter, so you aren't changing anything if you do that; and if you had gotten the delimiter correct (-d $'\n') in the first place, you would have seen that your example doesn't work at all (though read would have returned 0). (See http://ideone.com/MWvgu7)
A common idiom when using read (mostly with non-standard values for -d) is to test both read's return value and whether the variable assigned to has a value: read -d '' line || [ "$line" ], for example. This works even when read fails on the last "line" of input because of a missing terminator at the end.
So to get your example working you want to either use multiple read calls the way chepner indicated or (if you really want a single call) then you want (See http://ideone.com/xTL8Yn):
IFS=$'\n' read -d '' a b c < <(printf '1 1\n2 2\n3 3')
echo $?
printf '[%s]\n' "$a" "$b" "$c"
And adding \0 to the end of the input stream (e.g. printf '1 1\n2 2\n3 3\0') or putting || [ "$a" ] at the end will avoid the failure return from the read call.
The setting of IFS for read is to prevent the shell from word-splitting on spaces and breaking up the input incorrectly. -d '' makes read use \0 as the delimiter.
-d is the wrong thing to use here. What you really want is three separate calls to read:
{ read a; read b; read c; } < <(echo $'1\n2\n3\n')
Be sure that the input ends with a newline so that the final read has an exit status of 0.
If you don't know how many lines are in the input ahead of time, you need to read the values into an array. In bash 4, that takes just a single call to readarray:
readarray -t arr < <(echo $'1\n2\n3\n')
Prior to bash 4, you need to use a loop:
while read value; do
arr+=("$value")
done < <(echo $'1\n2\n3\n')
read always reads a single line of input; the -d option changes read's idea of what terminates a line. An example:
$ while read -d'#' value; do
> echo "$value"
> done << EOF
> a#b#c#
> EOF
a
b
c
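The read … || [ "$var" ] idiom mentioned above can be seen in a short sketch (my example): the final read fails at EOF because the input lacks a trailing newline, but it has already filled the variable, so the extra test keeps the loop body running one last time.

```shell
# Input deliberately lacks a trailing newline.
printf 'a\nb' | while IFS= read -r line || [ -n "$line" ]; do
  echo "got: $line"
done
```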

while read loop ignoring last line in file

Reading in a file, members_08_14.csv, which just contains a list of numbers, the while loop reads each line. For each line, the number is matched against a regex to ensure that it's only numbers and exactly 11 characters long.
while read card
do
if [[ $card =~ ^[0-9]{11}$ ]]
then
echo "some sql statement with $card" >> temp.sql;
else
echo "Invalid card number in file: $card";
fi
done <registered/members_08_14.csv
The interesting thing is, the else is not being executed if the regex does not match. I would expect that either the line would be written to temp.sql, or a line would be printed to stdout saying the card number is invalid.
The behaviour, however, is more along the lines of either only the true condition or only the false condition gets activated for the entire file. Why would this be?
Here's the contents of registered/members_08_14.csv:
47678009583
47678009585
47678009587
47678009590
476780095905
The first four lines are valid; the 5th line is invalid.
Output of cat -vte registered/members_08_14.csv
47678009583$
47678009585$
47678009587$
47678009590$
476780095905$
If the last line of your file has no newline on the end, read will put its content into card -- but will then exit with a nonzero value. Because read has exited with a nonzero value in this case, the while loop will exit without going on to the statement that runs the regex at all.
The easiest fix is to correct the file.
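The behaviour described above is easy to reproduce (a sketch; the card number is taken from the question's file): read fills the variable from the unterminated final line but still returns 1.

```shell
# No trailing newline, so read hits EOF: the variable is populated,
# but the exit status is nonzero.
printf '47678009583' | { IFS= read -r card; echo "status=$? card=$card"; }
```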
Another approach you can take is to ignore the exit status of read when it actually populates its destination (and, while at it, to put $'\r' into IFS, such that read will ignore the extra characters in DOS newlines):
while card=; IFS=$' \t\r\n' read -r card || [[ $card ]]; do
if [[ $card =~ ^[0-9]{11}$ ]]
then
echo "some sql statement with $card" >> temp.sql;
else
echo "Invalid card number in file: $card";
fi
done <registered/members_08_14.csv
Perhaps your file is in DOS format, so you also read carriage returns (\r) into the end of the variable. Try running dos2unix file or sed -i 's|\r||' file. Another way is to trim that character out of every input line like this:
while IFS=$' \t\r\n' read -r card
To read all the lines, regardless of whether they end with a newline or not:
cat "somefile" | { cat ; echo ; } | while read line; do echo $line; done
Source : My open source project https://sourceforge.net/projects/command-output-to-html-table/

Read user given file character by character in bash

I have a file which is kind of unformatted. I want to place a newline after every 100th character and remove any other newlines in it, so that the file has a consistent width and is readable.
This code snippet helps read all the lines
while read LINE
do
len=${#LINE}
echo "Line length is : $len"
done < $file
but how do i do same for characters
Idea is to have something like this : (just an example, it may have syntax errors, not implemented yet)
while read ch #read character
do
chcount++ # increment character count
if [ "$chcount" -eq "100" && "$ch"!="\n" ] #if 100th character and is not a new line
then
echo -e "\n" #echo new line
elif [ "$ch"=="\n" ] #if character is not 100th but new line
then
ch=" " # replace it with space
fi
done < $file
I am learning bash, so please go easy!!
I want to place a new-line after every 100th character and remove any
other new lines in it so that file may look with consistent width and
readable
Unless you have a good reason to write a script, you don't need one.
Remove the newlines from the input and fold it. Saying:
tr -d '\n' < inputfile | fold -w 100
should achieve the desired result.
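For a quick check, the same pipeline with a width of 10 (my example) makes the effect visible on short input:

```shell
# Joins "abcdefg", "hij" and "klmnop" into one stream of 16 characters,
# then wraps it at 10 characters per line.
printf 'abcdefg\nhij\nklmnop\n' | tr -d '\n' | fold -w 10
```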
bash adds a -n flag to the standard read command to specify a number of characters to read, rather than a full line:
while read -n1 c; do
echo "$c"
done < $file
You can call the function below in any of the following ways:
line_length=100
wrap $line_length <<< "$string"
wrap $line_length < file_name
wrap $line_length < <(command)
command | wrap $line_length
The function reads the input line by line (more efficiently than by character) which essentially eliminates the existing newlines (which are replaced by spaces). The remainder of the previous line is prefixed to the current one and the result is split at the desired line length. The remainder after the split is kept for the next iteration. If the output buffer is full, it is output and cleared otherwise it's kept for the next iteration so more can be added. Once the input has been consumed, there may be additional text in the remainder. The function is called recursively until that is also consumed and output.
wrap () {
    local remainder rest part out_buffer line len=$1
    while IFS= read -r line
    do
        line="$remainder$line "
        (( part = len - ${#out_buffer} ))
        out_buffer+=${line::part}
        remainder=${line:part}
        if (( ${#out_buffer} >= len ))
        then
            printf '%s\n' "$out_buffer"
            out_buffer=
        fi
    done
    rest=$remainder
    # Recurse once on any leftover text (a while loop here would never
    # terminate, since $rest is not changed by the recursive call).
    if [[ $rest ]]
    then
        wrap "$len" <<< "$rest"
    fi
    if [[ $out_buffer ]]
    then
        printf '%s\n' "$out_buffer"
        out_buffer=
    fi
}
#!/bin/bash
w=~/testFile.txt
chcount=0
while read -r word ; do
    len=${#word}
    for (( i = 0 ; i <= len - 1 ; ++i )) ; do
        let chcount+=1
        if [ $chcount -eq 100 ] ; then
            printf '\n%s' "${word:$i:1}"
            let chcount=0
        else
            printf '%s' "${word:$i:1}"
        fi
    done
done < "$w"
Are you looking for something like this?
