Why awk '{ print }' doesn't start a new line but loops on space char - shell

I have this shell script
#!/bin/bash
LINES=$(awk '{ print }' filename.txt)
for LINE in $LINES; do
echo "$LINE"
done
And filename.txt has this content
Loreum ipsum dolores
Loreum perche non se imortale
The shell script iterates over every whitespace-separated word of the lines in filename.txt, while it is supposed to loop over only those two lines.
But when I run awk '{ print }' filename.txt directly in the terminal, the output looks fine.
Any explanations?
Thanks in advance!

The $(...) construct absorbs all the output from awk as one large string, and then for LINE in $LINES splits on whitespace. You want this construct instead:
#! /bin/sh
while read LINE; do
printf '%s\n' "$LINE"
done < filename.txt
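If the lines might begin with whitespace or contain backslashes, a slightly more defensive variant of the same loop (just a refinement, not required for the sample file) is:
#! /bin/sh
while IFS= read -r LINE; do
printf '%s\n' "$LINE"
done < filename.txt
IFS= keeps leading and trailing whitespace intact, and -r stops read from treating backslashes as escape characters.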

The other answers are good; another thing you can do is temporarily change your IFS (Internal Field Separator) variable. If you update your shell script to look like this:
#!/bin/bash
IFS="
"
LINES=$(awk '{ print }' filename.txt)
for LINE in $LINES; do
echo "$LINE"
done
This changes the IFS to just a newline instead of the default space, tab, and newline, which should also do what you want.
Just another suggestion.
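As a side note, the same idea can be written a bit more compactly with Bash's $'\n' quoting, saving and restoring the old IFS in case the rest of the script needs it (this is only a variant of the snippet above):
#!/bin/bash
OLDIFS=$IFS
IFS=$'\n'
for LINE in $(awk '{ print }' filename.txt); do
echo "$LINE"
done
IFS=$OLDIFS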

Another option is to read the lines into an array and loop over its elements, since a plain $LINES variable is just one big string.
Here's an example of how to loop over the lines:
http://tldp.org/LDP/abs/html/arrays.html#SCRIPTARRAY
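Concretely, in Bash 4+ a sketch along these lines reads each line of the file into one array element (readarray, also spelled mapfile, is assumed to be available):
#!/bin/bash
readarray -t lines < filename.txt   # -t drops the trailing newline from each element
for line in "${lines[@]}"; do
echo "$line"
done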

Related

Removing newlines in a txt file

I have a txt file in a format like this:
test1
test2
test3
How can I bring it into a format like this using bash?
test1,test2,test3
Assuming that “using Bash” means “without any external processes”:
if IFS= read -r line; then
printf '%s' "$line"
while IFS= read -r line; do
printf ',%s' "$line"
done
echo
fi
Old answer here
TL;DR:
cat "export.txt" | paste -sd ","
Another pure bash implementation that avoids explicit loops:
#!/usr/bin/env bash
file2csv() {
local -a lines
readarray -t lines <"$1"
local IFS=,
printf "%s\n" "${lines[*]}"
}
file2csv input.txt
You can use awk. If the file name is test.txt then
awk '{print $1}' ORS=',' test.txt | awk '{print substr($1, 1, length($1)-1)}'
The first awk command joins the three lines with commas (test1,test2,test3,).
The second awk command just deletes the last comma from the string.
Use the 'tr' (translate) tool to join the lines, and sed to remove the last comma:
tr '\n' , < "$source_file" | sed 's/,$//'
If you want to save the output into a variable:
var="$( tr '\n' , < "$source_file" | sed 's/,$//' )"
Using sed:
$ sed ':a;N;$!ba;s/\n/,/g' file
Output:
test1,test2,test3
I think this is where I originally picked it up.
If you don't want a terminating newline:
$ awk '{printf "%s%s", sep, $0; sep=","}' file
test1,test2,test3
or if you do:
awk '{printf "%s%s", sep, $0; sep=","} END{print ""}' file
test1,test2,test3
Another loopless pure Bash solution:
contents=$(< input.txt)
printf '%s\n' "${contents//$'\n'/,}"
contents=$(< input.txt) is equivalent to contents=$(cat input.txt). It puts the contents of the input.txt file (with trailing newlines automatically removed) into the variable contents.
"${contents//$'\n'/,}" replaces all occurrences of the newline character ($'\n') in contents with the comma character. See Parameter expansion [Bash Hackers Wiki].
See the accepted, and excellent, answer to Why is printf better than echo? for an explanation of why printf '%s\n' is used instead of echo.
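As a quick illustration of that point (a toy example, not from the original answer), echo can silently swallow data that looks like one of its options, while printf does not:
line='-n'
echo "$line"            # bash's echo treats -n as an option and prints nothing (not even a newline)
printf '%s\n' "$line"   # always prints -n followed by a newline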

Convert text file into a comma delimited string

I don't seem to locate an SO question that matches this exact problem.
I have a text file that has one text token per line, without any commas, tabs, or quotes. I want to create a comma delimited string based on the file content.
Input:
one
two
three
Output:
one,two,three
I am using this command:
csv_string=$(tr '\n' ',' < file | sed 's/,$//')
Is there a more efficient way to do this?
The usual command to do this is paste
csv_string=$(paste -sd, file.txt)
You can do it entirely with bash parameter expansion operators instead of using tr and sed.
csv_string=$(<file) # read file into variable
csv_string=${csv_string//$'\n'/,} # replace \n with ,
csv_string=${csv_string%,} # remove trailing comma
One way with Awk is to reset RS so records are separated by blank lines. This also copes with lines that contain spaces and still joins them into a single CSV record as expected.
awk '{$1=$1}1' FS='\n' OFS=',' RS= file
The {$1=$1} assignment forces awk to rebuild the record ($0) using the modified Field and Record separators (FS/OFS and RS/ORS). The trailing 1 is an always-true condition, so every rebuilt record is printed.
With Perl one-liner:
$ cat csv_2_text
one
two
three
$ perl -ne '{ chomp; push(@lines,$_) } END { $x=join(",",@lines); print "$x" }' csv_2_text
one,two,three
$ perl -ne ' { chomp; $_="$_," if not eof ;printf("%s",$_) } ' csv_2_text
one,two,three
$
From @codeforester:
$ perl -ne 'BEGIN { my $delim = "" } { chomp; printf("%s%s", $delim, $_); $delim="," } END { printf("\n") }' csv_2_text
one,two,three
$
Tested the approaches on a Linux box - Bash only, paste, Awk, and Perl, as well as the tr | sed approach shown in the question:
#!/bin/bash
# generate test data
seq 1 10000 > test.file
times=${1:-50}
printf '%s\n' "Testing paste solution"
time {
for ((i=0; i < times; i++)); do
csv_string=$(paste -sd, test.file)
done
}
printf -- '----\n%s\n' "Testing pure Bash solution"
time {
for ((i=0; i < times; i++)); do
csv_string=$(<test.file) # read file into variable
csv_string=${csv_string//$'\n'/,} # replace \n with ,
csv_string=${csv_string%,} # remove trailing comma
done
}
printf -- '----\n%s\n' "Testing Awk solution"
time {
for ((i=0; i < times; i++)); do
csv_string=$(awk '{$1=$1}1' FS='\n' OFS=',' RS= test.file)
done
}
printf -- '----\n%s\n' "Testing Perl solution"
time {
for ((i=0; i < times; i++)); do
csv_string=$(perl -ne '{ chomp; $_="$_," if not eof; printf("%s",$_) }' test.file)
done
}
printf -- '----\n%s\n' "Testing tr | sed solution"
time {
for ((i=0; i < times; i++)); do
csv_string=$(tr '\n' ',' < test.file | sed 's/,$//')
done
}
Surprisingly, the Bash-only solution does quite poorly. paste comes out on top, followed by tr | sed, Awk, and Perl:
Testing paste solution
real 0m0.109s
user 0m0.052s
sys 0m0.075s
----
Testing pure Bash solution
real 1m57.777s
user 1m57.113s
sys 0m0.341s
----
Testing Awk solution
real 0m0.221s
user 0m0.152s
sys 0m0.077s
----
Testing Perl solution
real 0m0.424s
user 0m0.388s
sys 0m0.080s
----
Testing tr | sed solution
real 0m0.162s
user 0m0.092s
sys 0m0.141s
For some reason, csv_string=${csv_string//$'\n'/,} hangs on macOS Mojave running Bash 4.4.23.
Related posts:
How to join multiple lines of file names into one with custom delimiter?
Concise and portable “join” on the Unix command-line
Turning multi-line string into single comma-separated

For loop in a awk command

I have a file which has rows; now I want to read its values with an awk command in Unix. I am able to read the file, but I added a for loop to traverse all the data in the file, and the for loop never ends; it goes into an infinite loop.
Below is the code I am using to read the file and get the data at positions $1, $2 and $3:
file=$1;
nbrClients=`wc -l $file | cut -d' ' -f1`;
echo $nbrClients;
awk '{
for(i=0; i<=$nbrClients; ++i)
{print $1 $2 $3}
}' $file
File which i am reading has below format :
abc 12 test.txt
abc 12 test.txt
abc 12 test.txt
abc 12 test.txt
abc 12 test.txt
abc 12 test.txt
So for this file nbrClients will be 6 and the loop should run 6 times, but it is not doing so. Please suggest what I am doing wrong here.
Here is the full code I am trying:
file=$1;
nbrClients=`wc -l $file | cut -d' ' -f1`;
echo $nbrClients;
file=$1;
cat | awk '{
fileName=$1
tnxCount=$2
for i in `seq 1 $tnxCount`
do
echo "Starting thread number $i"
nohup perl /home/user/abc.pl -i $fileName >>/home/user/test_load_${today}.out 2>&1 &
done
}' $file;
I think the problem here is that you're under the impression that the for loop is what will cause awk to step through your input file, whereas it's awk's nature to do that already.
Awk works by taking a set of condition { statement } pairs, and then FOR EACH LINE OF INPUT, evaluating the condition, and if it rings true, executing the statement. Note that conditions can be statements (since functions and other commands have a return value) and statements can include if constructs, so there's a lot of flexibility here.
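For instance, a small sketch like the following (the patterns are made up purely for illustration) already visits every line of the input without any explicit loop:
awk '
NR == 1 { print "first line:", $0 }      # condition: this is record number 1
/abc/   { matches++ }                    # condition: the line matches /abc/
        { print $1, $2, $3 }             # no condition: runs for every line
END     { print matches+0, "lines matched" }
' "$file"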
Note that awk can also reduce or simplify stuff you'd do in a shell script. Consider the following:
#!/bin/sh
file="$1"
awk '
NR==FNR {
ClientCount++
next
}
FNR==1 {
printf "%s: %d\n", FILENAME, ClientCount
}
{
print $1, $2, $3
}
' "$file" "$file"
This script reads your input file twice -- once to count the lines (so that the line count can be placed at the top of the output), and once to process the lines, printing the first three fields. The script is composed of three condition { statement } groupings:
The first one is the counter. It only operates on the first instance of the file, and the next command ensures that no other rules are run during that pass.
The second one operates on the first line of the file. But since the first condition captured all of the first file, this statement will only be executed once, when the first line of the second file is in play.
The third one is what prints the bulk of your output. With awk, when no condition is included, the condition is assumed to be "true", so this statement runs for each line of the second file.
The awk script could of course be compressed onto a single line; I've spaced it out for easier reading.
Note also that this method of keeping or showing a line count might be a little heavy handed. If you know that you're just showing a line count, you can use the internal awk variable NR. At the point in your script where the second condition is evaluated, NR-1 is the line count of the previous file, so you could use:
#!/bin/sh
file="$1"
awk '
NR==FNR {
next
}
FNR==1 {
printf "%s: %d\n", FILENAME, NR-1
}
{
print $1, $2, $3
}
' "$file" "$file"
Updating the answer based on the comment and the latest version of the question:
file=$1;
nbrClients=`wc -l $file | cut -d' ' -f1`;
echo $nbrClients;
# pass the shell variable ${today} from the original script into awk
awk -v today="$today" '{
fileName = $1   # first field of the current line
tnxCount = $2   # second field: how many threads to start
for (i = 1; i <= tnxCount; i++) {
system("echo \"Starting thread number " i "\"")
system("nohup perl /home/user/abc.pl -i " fileName " >> /home/user/test_load_" today ".out 2>&1 &")
}
}' $file

Printing number of lines within shell with echo

I know that the simplest way to print the number of lines/bytes/words is to use wc -l < filename.sh, but when I try to use it in conjunction with the echo command, it prints the command itself and not its output.
My intended output is "this file has x lines", with x being the number of lines, but when I try things like echo "this line has" wc -l < filename.sh "lines", it prints the command text itself. I've also tried this without breaking the quotation, among several other things.
Is it just that this command can't be combined with echo, or am I missing something extremely obvious here?
echo "this line has $(wc -l < filename.sh) lines"
printf is versatile:
printf 'this file has %s lines\n' $(wc -l < filename.sh)
$(command) converts the output of command into an argument.
Try this one:
echo "this file has `wc -l < filename.sh | awk '{print $1}'` lines"
Explanation:
wc -l < filename.sh retrieves the number of lines in the file
awk '{print $1}' prints that number without any surrounding blanks
the backticks (`...`) execute the command first so its output can be substituted into the string
Without any subshell or pipe: awk has a built-in variable NR which holds the number of records (lines) read from the input file. The print is written inside the END block so the result is printed once at the end; otherwise it would print the running line number for every line.
awk 'END{print "This line has " NR " lines" }' file

Embedding awk in a shell script

I've been using a bash script (script.sh) which uses various awk scripts (script1.awk, script2.awk) that are tailored at "runtime", for instance by replacing values.
I've been looking for ways to embed them completely within the first bash script.
Ideally, I would like to have a file looking like this :
################################
# AWK scripts #
################################
read -d '' scriptVariable <<'EOF'
'
{my block commands;}
'
EOF
################################
# End of AWK Scripts #
################################
awk $scriptVariable ${inputfile} # This line obviously doesn't work
instead of the traditional:
awk '{
my script commands
}' ${inputfile}
Of course, I could write them to a file but the whole point is not to. Any suggestions ?
EDIT: Although dogbane's answer works fine, the next problem is that with the <<'HERE' tags, newline characters are not read. I can't unquote the tag, since otherwise the shell tries to interpret the awk script and chokes on the $ signs within it (and there are some). And with no newlines, I can't comment anything within the awk script (without commenting out half the script once the newline characters are removed...). Anyone?
<< 'EOF'
BEGIN{#Hello
print $1
}
EOF # With the newlines lost, awk reads this as BEGIN{#Hello print $1} on one line, i.e. effectively just BEGIN{
<< EOF
BEGIN{#Hello
print $1
}
EOF # Is read correctly by awk, but Bash tries to expand $1 and fails
Remove the single quotes around the awk script and enclose the script variable in double-quotes. This works for me:
################################
# AWK scripts #
################################
read -d '' scriptVariable << 'EOF'
BEGIN {
print "start"
}
{
print $0
}
END{
print "hello"
}
EOF
################################
# End of AWK Scripts #
################################
awk "$scriptVariable" ${inputfile}
The single quotes around awk scripts can be cantankerous in a shell.
echo dummy code | awk '/dummy/{print '$1' $2}' - another_file | while read LINE; do
case $LINE in
"$1 code")echo success;;
*)echo $LINE;;
esac
done
This will read from stdin and then from another_file, print lines containing dummy, and replace the first field with the first argument to your bash script. Notice that the bash variable $1 had to be left unquoted (outside the single quotes).
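If you would rather skip the quote juggling entirely, a common alternative (a sketch of the same idea, not the original poster's code) is to hand the shell value to awk with -v:
echo dummy code | awk -v arg="$1" '/dummy/{print arg, $2}' - another_file
Here arg is an ordinary awk variable, so the whole program can stay inside one pair of single quotes.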
You can include your awk inside your bash script using a Here-Document.
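For example, something along these lines should work (a sketch; it assumes your awk can read its program from /dev/stdin, which GNU awk and BSD awk on Linux/macOS can):
awk -f /dev/stdin "$inputfile" <<'EOF'
BEGIN { print "start" }
{ print $0 }
END { print "done" }
EOF
Because the 'EOF' delimiter is quoted, the shell never touches $0 or $1 inside the program, and since the script never passes through a shell variable, its newlines (and therefore # comments) survive intact.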
