Counting newline characters in bash shell script

I cannot get this script to work at all. I am just trying to count the number of lines in a file WITHOUT using wc. here is what I have so far
FILE=file.txt
lines=0
while IFS= read -n1 char
do
if [ "$char" == "\n" ]
then
lines=$((lines+1))
fi
done < $FILE
this is just a small part of a bigger script that should count total words, characters and lines in a file. I cannot figure any of it out though. Please help
The problem is that the if-statement conditional is never true. It's as if the program cannot detect what a '\n' is.

declare -i lines=0 words=0 chars=0
while IFS= read -r line; do
((lines++))
array=($line) # don't quote the var to enable word splitting
((words += ${#array[@]}))
((chars += ${#line} + 1)) # add 1 for the newline
done < "$filename"
echo "$lines $words $chars $filename"

You have two problems there. They are fixed in the following:
#!/bin/bash
file=file.txt
lines=0
while IFS= read -rN1 char; do
if [[ "$char" == $'\n' ]]; then
((++lines))
fi
done < "$file"
One problem was the $'\n' in the test, the other one, more subtle, was that you need to use the -N switch, not the -n one in read (help read for more information). Oh, and you also want to use the -r option (check with and without, when you have backslashes in your file).
Minor things I changed: Use more robust [[...]], used lower case variable names (it's considered bad practice to have upper case variable names). Used arithmetic ((++lines)) instead of the silly lines=$((lines+1)).
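To see why the original test never matched, it can help to compare the two spellings directly. The following is only an illustrative sketch: the function name count_newlines is made up for this example, and read -N assumes bash 4.1 or newer.

```bash
#!/bin/bash
# "\n" inside double quotes is two characters (a backslash and an n);
# $'\n' is a single newline character.
two_chars="\n"
one_char=$'\n'
echo "${#two_chars} ${#one_char}"   # prints: 2 1

# Hypothetical helper: count newline characters on standard input.
count_newlines() {
    local char count=0
    # -N1 reads exactly one character and does not treat newline as a delimiter
    while IFS= read -r -N1 char; do
        if [[ $char == $'\n' ]]; then
            ((++count))
        fi
    done
    printf '%s\n' "$count"
}

printf 'one\ntwo\nthree\n' | count_newlines   # prints: 3
```

A final line without a trailing newline is not counted, which matches what wc -l reports.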

Related

Why will it not echo "$d" answer

I need the script to output the result, but echo "$d" does not output anything. I made the ciphertext.gz earlier in the script and the $fil is ciphertext.gz. Bash script:
echo "Fil: ciphertext.gz"
a="ABCDEFGHIJKLMNOPQRSTUVXYZ"
[[ "${*/-d/}" != "" ]] &&
echo "Usage: $0 [-d]" && exit 1
m=${1:+-}
m=-
t=$fil
printf "Nøgle 'eks. ABCDE': "
read -r k
k=$(echo "$k" | tr [a-vx-z] [A-VX-Z] )
printf "\n"
for ((i=0;i<${#t};i++)); do
p1=${a%%${t:$i:1}*}
p2=${a%%${k:$((i%${#k})):1}*}
d="${d}${a:$(((${#p1}${m:-+}${#p2})%${#a})):1}"
done
echo "$d"
Extended comments, (not really an answer, since it's unclear what the code should do):
This is wrong:
m=${1:+-}
m=-
...since it has the same effect as:
m=-
This reads one line from standard input:
read -r k
...which, unless ciphertext is only one line long, probably
defeats the purpose of the next eight lines of code. Even if
standard input was unzip < ciphertext.gz |, it would only decode
the first line of ciphertext.
Wrap the for in an appropriate while read k loop.
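The first point about m can be checked in isolation. The sketch below assumes the intent was for -d to select subtraction and for addition to be the default, which the question does not confirm; pick_operator is a hypothetical name used only here.

```bash
#!/bin/bash
# Hypothetical helper: print the operator the cipher loop would use.
pick_operator() {
    local m=${1:+-}          # "-" if a non-empty first argument was given, empty otherwise
    printf '%s\n' "${m:-+}"  # fall back to "+" when m is empty
}

pick_operator -d   # prints -
pick_operator      # prints +
```

With the unconditional m=- removed, the ${m:-+} expansion already used inside the loop can actually produce a + when no option is passed.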

Reformatting a csv file, script is confused by ' %." '

I'm using bash on cygwin.
I have to take a .csv file that is a subset of a much larger set of settings and shuffle the new csv settings (same keys, different values) into the 1000-plus-line original, making a new .json file.
I have put together a script to automate this. The first step in the process is to "clean up" the csv file by extracting lines that start with "mme " and "sms ". Everything else is to pass through cleanly to the "clean" .csv file.
This routine is as follows:
# clean up the settings, throwing out mme and sms entries
cat extract.csv | while read -r LINE; do
if [[ $LINE == "mme "* ]]
then
printf "$LINE\n" >> mme_settings.csv
elif [[ $LINE == "sms "* ]]
then
printf "$LINE\n" >> sms_settings.csv
else
printf "$LINE\n" >> extract_clean.csv
fi
done
My problem is that this thing stubs its toe on the following string at the end of one entry: 100%." When it's done with the line, it simply elides the %." and the new-line marker following it, and smears the two lines together:
... 100next.entry.keyname...
I would love to reach in and simply manually delimit the % sign, but it's not a realistic option for my use case. Clearly I'm missing something. My suspicion is that I am in some wise abusing cat or read in the first line.
If there is some place I should have looked to find the answer before bugging you all, by all means point me in that direction and I'll sod off.
Syntax for printf is:
printf format [argument]...
In the printf format string, anything that follows a % is treated as a format specifier, so data that may contain a % must not be placed in the format. What you would like to do is:
while IFS= read -r line; do # Replaced LINE with line; all-uppercase variable names are conventionally reserved for the system
if [[ "$line" = "mme "* ]] # Here * matches anything that comes next
then
printf "%s\n" "$line" >> mme_settings.csv # quote $line so it stays a single argument
elif [[ "$line" = "sms "* ]]
then
printf "%s\n" "$line" >> sms_settings.csv
else
printf "%s\n" "$line" >> extract_clean.csv
fi
done < extract.csv # Avoided the useless use of cat
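As a small illustration of why the data belongs in an argument rather than in the format, here is a minimal sketch. The helper name emit_line and the sample text are invented for this example.

```bash
#!/bin/bash
# Hypothetical helper: print one line of data verbatim.
emit_line() {
    # The format string is fixed; the data is passed as an argument,
    # so any % or backslash in it is printed literally.
    printf '%s\n' "$1"
}

emit_line 'progress.label "Loading 100%."'
```

Because the format is always '%s\n', nothing in the data can be mistaken for a format specifier or an escape sequence.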
As pointed out, your problem is expanding a parameter containing a formatting instruction in the formatting argument of printf, which can be solved by using echo instead or moving the parameter to be expanded out of the formatting string, as demonstrated in other answers.
I recommend not looping over your whole file with Bash in the first place, as it's notoriously slow; you're extracting lines starting with certain patterns, which is a job at which grep excels:
grep '^mme ' extract.csv > mme_settings.csv
grep '^sms ' extract.csv > sms_settings.csv
grep -v '^mme \|^sms ' extract.csv > extract_clean.csv
The third command uses the -v option (print only lines that don't match) and alternation to exclude lines starting with either mme or sms.
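The \| alternation is a GNU extension to basic regular expressions. A possible variant using extended regular expressions (grep -E) is sketched below; split_settings is a hypothetical wrapper, and the output file names are simply the ones from the question.

```bash
#!/bin/bash
# Hypothetical helper: split a settings file into the three output files
# named in the question.
split_settings() {
    grep '^mme ' "$1" > mme_settings.csv
    grep '^sms ' "$1" > sms_settings.csv
    grep -Ev '^(mme|sms) ' "$1" > extract_clean.csv
    return 0   # grep exits non-zero when nothing matches; that is not an error here
}
```

Calling split_settings extract.csv overwrites the three files on each run, unlike the >> in the original loop, which appends.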

printing line numbers that are multiple of 5

Hi I am trying to print/echo line numbers that are multiple of 5. I am doing this in shell script. I am getting errors and unable to proceed. below is the script
#!/bin/bash
x=0
y=$wc -l $1
while [ $x -le $y ]
do
sed -n `$x`p $1
x=$(( $x + 5 ))
done
When executing above script i get below errors
#./echo5.sh sample.h
./echo5.sh: line 3: -l: command not found
./echo5.sh: line 4: [: 0: unary operator expected
Please help me with this issue.
For efficiency, you don't want to be invoking sed multiple times on your file just to select a particular line. You want to read through the file once, filtering out the lines you don't want.
#!/bin/bash
i=0
while IFS= read -r line; do
(( ++i % 5 == 0 )) && echo "$line"
done < "$1"
Demo:
$ i=0; while read line; do (( ++i % 5 == 0 )) && echo "$line"; done < <(seq 42)
5
10
15
20
25
30
35
40
A funny pure Bash possibility:
#!/bin/bash
mapfile ary < "$1"
printf "%.0s%.0s%.0s%.0s%s" "${ary[@]}"
This slurps the file into an array ary, with each line of the file in a field of the array. Then printf takes care of printing one line out of every 5: each %.0s consumes a field but prints nothing, and %s prints the field. Since mapfile is used without the -t option, the newlines are included in the array. Of course this really slurps the file into memory, so it might not be good for huge files. For large files you can use a callback with mapfile:
#!/bin/bash
callback() {
printf '%s' "$2"
ary=()
}
mapfile -c 5 -C callback ary < "$1"
We're removing all the elements of the array during the callback, so that the array doesn't grow too large, and the printing is done on the fly, as the file is read.
Another funny possibility, in the spirit of glenn jackmann's solution, yet without a counter (and still pure Bash):
#!/bin/bash
while read && read && read && read && IFS= read -r line; do
printf '%s\n' "$line"
done < "$1"
Use sed.
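sed -n '0~5p' "$1"
This prints every fifth line of the file (lines 5, 10, 15, ...). The first~step address form is a GNU sed extension.
Also,
y=$wc -l $1
won't work. Use
y=$(wc -l < "$1")
You need command substitution, $(...), to capture the output of wc; without it Bash sees the space as the end of the assignment and tries to run -l as a command. Also, if you just want the number, it's best to redirect the file into wc.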
The increment
x=$(( $x + 5 ))
works as written, but inside an arithmetic context you don't need the $ and can write it more simply as
(( x = x + 5 ))
Hope this helps
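To make the point about capturing the line count concrete, here is a minimal sketch. The function name count_lines is an assumption made for this example, not something from the question.

```bash
#!/bin/bash
# Hypothetical helper: print only the number of lines in a file.
count_lines() {
    # Reading from standard input keeps the file name out of wc's output;
    # the arithmetic expansion trims padding that some wc versions add.
    echo "$(( $(wc -l < "$1") ))"
}
```

In the original script this could be used as y=$(count_lines "$1"), after which $y holds a plain integer suitable for [ "$x" -le "$y" ].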
There are cleaner ways to do it, but what you're looking for is this.
#!/bin/bash
x=5
y=`wc -l $1`
y=`echo $y | cut -f1 -d\ `
while [ "$y" -ge "$x" ] # -ge so the last line is printed when the line count is a multiple of 5
do
sed -n "${x}p" "$1"
x=$(( $x + 5 ))
done
Initialize x to 5, since there is no "line zero" in your file $1.
Also, wc -l $1 prints the line count followed by the name of the file. Use cut to strip the file name out and keep just the first word.
In Bash conditionals, an exit status of zero means "true" (success).
Do not put backticks around $x in your sed command, since backticks run their contents as a command. You can put $x and p right next to each other using curly braces.
You can do this quite succinctly using awk:
awk 'NR % 5 == 0' "$1"
NR is the record number (line number in this case). Whenever it is a multiple of 5, the expression is true, so the line is printed.
You might also like the even shorter but slightly less readable:
awk '!(NR%5)' "$1"
which does the same thing.
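If the step should not be hard-coded, the same awk test can take it as a variable. This is only a sketch; every_nth is a hypothetical name, and it assumes the step is a positive integer.

```bash
#!/bin/bash
# Hypothetical helper: print every n-th line of a file.
# Usage: every_nth N FILE
every_nth() {
    # -v passes the shell value into awk as the variable n
    awk -v n="$1" 'NR % n == 0' "$2"
}
```

For example, every_nth 5 "$1" behaves like the first awk command above.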

Read user given file character by character in bash

I have a file which is kind of unformatted, I want to place a new-line after every 100th character and remove any other new lines in it so that file may look with consistent width and readable
This code snippet helps read all the lines
while read LINE
do
len=${#LINE}
echo "Line length is : $len"
done < $file
but how do i do same for characters
Idea is to have something like this : (just an example, it may have syntax errors, not implemented yet)
while read ch #read character
do
chcount++ # increment character count
if [ "$chcount" -eq "100" && "$ch"!="\n" ] #if 100th character and is not a new line
then
echo -e "\n" #echo new line
elif [ "$ch"=="\n" ] #if character is not 100th but new line
then
ch=" " $replace it with space
fi
done < $file
I am learning bash, so please go easy!!
I want to place a new-line after every 100th character and remove any
other new lines in it so that file may look with consistent width and
readable
If you have a good reason to write a script, go ahead, but you don't need one.
Remove the newline from the input and fold it. Saying:
tr -d '\n' < inputfile | fold -w 100
should achieve the desired result.
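A small worked example of that pipeline, wrapped in a function for reuse. The name rewrap is made up here, and the width of 4 is chosen only to keep the sample short.

```bash
#!/bin/bash
# Hypothetical helper: join all input lines and re-break every $1 characters.
rewrap() {
    tr -d '\n' | fold -w "$1"
    echo   # fold's last line has no trailing newline once tr has removed them all
}

printf 'abc\ndef\ngh\n' | rewrap 4
# prints:
# abcd
# efgh
```

To replace the old newlines with spaces instead of deleting them, as the question's pseudocode suggests, tr '\n' ' ' could be used in place of tr -d '\n'.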
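bash adds a -n flag to the standard read command to specify a number of characters to read, rather than a full line. Set IFS to empty and use -r so that whitespace and backslashes are kept; a newline still acts as the delimiter, so it is read as an empty string:
while IFS= read -r -n1 c; do
echo "$c"
done < "$file"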
You can call the function below in any of the following ways:
line_length=100
wrap $line_length <<< "$string"
wrap $line_length < file_name
wrap $line_length < <(command)
command | wrap $line_length
The function reads the input line by line (more efficiently than by character) which essentially eliminates the existing newlines (which are replaced by spaces). The remainder of the previous line is prefixed to the current one and the result is split at the desired line length. The remainder after the split is kept for the next iteration. If the output buffer is full, it is output and cleared otherwise it's kept for the next iteration so more can be added. Once the input has been consumed, there may be additional text in the remainder. The function is called recursively until that is also consumed and output.
wrap () {
local remainder rest part out_buffer line len=$1
while IFS= read -r line
do
line="$remainder$line "
(( part = $len - ${#out_buffer} ))
out_buffer+=${line::$part}
remainder=${line:$part}
if (( ${#out_buffer} >= $len ))
then
printf '%s\n' "$out_buffer"
out_buffer=
fi
done
rest=$remainder
if [[ $rest ]] # recurse once; the nested call handles any further remainder
then
wrap $len <<< "$rest"
fi
if [[ $out_buffer ]]
then
printf '%s\n' "$out_buffer"
out_buffer=
fi
}
#!/bin/bash
w=~/testFile.txt
chcount=0
while read -r word ; do
len=${#word}
for (( i = 0 ; i <= $len - 1 ; ++i )) ; do
let chcount+=1
if [ $chcount -eq 100 ] ; then
printf '\n%s' "${word:$i:1}" # keep the data out of the format string
let chcount=0
else
printf '%s' "${word:$i:1}"
fi
done
done < $w
Are you looking for something like this?

Hanging bash loop script?

varrr=0
while read line
do
if [ $line -gt 500 -a $line -le 600 ]; then # for lines 501-600
echo $line >> 'file_out_${varrr}.ubi'
fi
done << 'file_in_${varrr}.ubi'
file_in_${varrr}.ubi is a text file with around 1000 lines. I want to print lines 501-600 to new file.
Running this code leaves my Ubuntu terminal with a > symbol on a new line, as if I need to type another command to finish the loop. I can' figure out what is wrong with this loop though. Seems like it's complete. See any mistakes I've made? Thanks.
I'm only going to answer your specific question: it's because you used a heredoc << symbol, instead of a redirection <. Your last line should read:
done < 'file_in_${varrr}.ubi'
(observe the single <).
But then you'll realize that you have some quoting problems. So, your last line should read:
done < "file_in_${varrr}.ubi"
(observe the double quotes ").
Similarly, watch your quoting on the echo line. You should have this instead:
echo "$line" >> "file_out_${varrr}.ubi"
(double quotes " for file_out_${varrr}.ubi).
But then, this will not behave as you expect... Maybe this will do:
varrr=0
linenb=0
while IFS= read -r line; do
((++linenb))
if ((linenb>500 && linenb<=600)); then # for lines 501-600
echo "$line" >> "file_out_${varrr}.ubi"
fi
done < "file_in_${varrr}.ubi"
Hope this helps!
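The quoting point can be verified on its own with a short sketch. The variable names single and double are introduced only for this illustration.

```bash
#!/bin/bash
varrr=0
single='file_in_${varrr}.ubi'   # single quotes: no expansion takes place
double="file_in_${varrr}.ubi"   # double quotes: ${varrr} is expanded

printf '%s\n' "$single" "$double"
# prints:
# file_in_${varrr}.ubi
# file_in_0.ubi
```

With the original << the single-quoted string was not a file name at all but a here-document delimiter, which is why the shell kept prompting with > for more input.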
If you just want to print lines from 501 to 600, why don't you use the following?
awk 'NR>=501 && NR<=600' file_in > file_out
awk 'NR==n' myfile prints line n of the file myfile. You can then use ranges, as I wrote above.
You can simply use sed. It's the simplest tool for it and is cleaner and faster than a while loop with tests.
varrr=0
sed -n 501,600p "file_in_${varrr}.ubi" >> "file_out_${varrr}.ubi"
Or
varrr=0
sed -n 501,600p "file_in_${varrr}.ubi" > "file_out_${varrr}.ubi"
If you want to overwrite existing data.
The mistake in your loop, by the way, is that you are not using a counter: you compare the content of each line, rather than its line number, against 500 and 600.
varrr=0
counter=0
while read line; do
(( ++counter ))
[[ counter -gt 500 && counter -le 600 ]] && echo "$line"
done < "file_in_${varrr}.ubi" > "file_out_${varrr}.ubi"
Notably, you need to use < for input, not <<, and put your variables inside double quotes, not single quotes.
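For reuse with other ranges, the sed approach can be parameterised. The following is a sketch under the assumption that the first and last line numbers are positive integers; extract_range is a hypothetical name.

```bash
#!/bin/bash
# Hypothetical helper: print lines FIRST..LAST of a file.
# Usage: extract_range FIRST LAST FILE
extract_range() {
    # "${2}q" makes sed stop reading once the last wanted line has been printed
    sed -n "${1},${2}p;${2}q" "$3"
}
```

For the question's case this would be called as extract_range 501 600 "file_in_${varrr}.ubi" > "file_out_${varrr}.ubi".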

Resources