In bash how to echo on the next line into a file? - bash

I try to retrieve the 1st column of the file results and put this column on the file named results2, but the problem is my script only write the last line of results:
for line in $(cat results | cut -d" " -f1);
do echo -e "$line">results2;
done

You can do what you want in the shell like this:
while read -r col1 rest
do
printf '%s\n' "$col1"
done < results > results2
...but it'd make a lot more sense to just do this:
cut -d' ' -f1 results > results2
The problem with your script is that you're using > inside the loop, which truncates the file and reopens it for writing every iteration. You can get around this by redirecting the whole loop to the file but as I've shown, the best method is just to redirect the output of cut.
To read lines in the shell, use a while read loop; don't read lines with for. However, bear in mind that the shell isn't designed to do this task efficiently, whereas that's exactly what the standard tools such as cut are there for.

Related

Why does "cut" command skip first line in this "while read line" loop?

I'm writing a bash script, and I need to take the second field of every line in a file, and save them in another file. I know there are many possible ways to do this, BUT, I tried first using while read line; do, and I got stuck. Now, I really want to know what is happening.
For example, input file would be:
line1 11111
line2 222222
line3 333
line4 4444
(The field separtor is "\t").
This is what I was doing:
inputfile=$1
cat $"inputfile" | while read -r line
do
cut -f2 >> results_file
done
The problem is, the output would be:
222222
333
4444
(skipping the first line)
I´ve alredy tested hundreds of modifications, and tried to used other commands instead of cut(like, sed, grep...). I would appreciate some help, or someone pointing me in the right direction.
Thank you very much!
You are not using the variable $line set by read. Try instead
inputfile=$1
cat "$inputfile" | while read -r line
do
echo "$line" | cut -f2 >> results_file
done
In your original code, the while loop is actually run only once, not four times; try putting echo 'Hello!' in the loop to your original code. You would see the message only once, not four times. I guess, without echo "$line" | part, cut -f2 ... part consumes the pipe away.
That is, your while loop first consumes the first line of the stdin and puts this line in the variable $line, leaving the next three lines for later use. But $line is never used. Instead, the remaining three lines are consumed by the command cut.
All commands within a command group are within the scope of any redirections applied to a command group (or any compound command):
— https://mywiki.wooledge.org/BashGuide/CompoundCommands
The pipe operator creates a subshell environment for each command.
— https://mywiki.wooledge.org/BashGuide/InputAndOutput
We can interpret the quotes as "the stdin to your while loop (i.e., the output of cat "$inputfile") is accessed by cut, unless you sever its access by creating a new subshell e.g., by another pipe echo "$line" | ...."
By the way, you can just use cut -f2 "$inputfile" >> results_file without the while loop.
With respect to your comment Does it mean to use "\t at the end" as a separator - no. You're confusing what was suggested, $'\t' with '\t$'. $'\t' means "the literal tab character generated from the escape sequence \t".
You also said in your comment your real 2nd fields are URLs to be curled. You shouldn't be using a UUOC and cut anyway, here's how to really do this:
while IFS=$'\t' read -r key url; do
val=$(curl "$url" | whatever)
printf '%s\t%s\n' "$key" "$val"
done < "$inputfile" > results_file
Replace whatever with whatever command you use to produce the output you want from the curl output.

parse and echo string in a bash while loop

I have a file with this structure:
picture1_123.txt
picture2_456.txt
picture3_789.txt
picture4_012.txt
I wanted to get only the first segment of the file name, that is, picture1 to picture4.
I first used the following code:
cat picture | while read -r line; do cut -f1 -d "_"; echo $line; done
This returns the following output:
picture2
picture3
picture4
picture1_123.txt
This error got corrected when I changed the code to the following:
cat picture | while read line; do s=$(echo $line | cut -f1 -d "_"); echo $s; done
picture1
picture2
picture3
picture4
Why in the first:
The lines are printed in a different order than the original file?
no operation is done on picture1_123.txt and picture1 is not printed?
Thank you!
What Was Wrong
Here's what your old code did:
On the first (and only) iteration of the loop, read line read the first line into line.
The cut command read the entire rest of the file, and wrote the results of extracting only the desired field to stdout. It did not inspect, read, or modify the line variable.
Finally, your echo $line wrote the first line in entirety, with nothing being cut.
Because all input had been consumed by cut, nothing remained for the next read line to consume, so the loop never ran a second time.
How To Do It Right
The simple way to do this is to let read separate out your prefix:
while IFS=_ read -r prefix suffix; do
echo "$prefix"
done <picture
...or to just run nothing but cut, and not use any while read loop at all:
cut -f1 -d_ <picture

CSV file parsing in Bash

I have a CSV file with sample entries given below. What I want is to write a Bash script to read the CSV file line by line and put the first entry e.g 005 in one variable and the IP 192.168.10.1 in another variable, that I need to pass to some other script.
005,192.168.10.1
006,192.168.10.109
007,192.168.10.12
008,192.168.10.121
009,192.168.10.123
A more efficient approach, without the need to fork cut each time:
#!/usr/bin/env bash
while IFS=, read -r field1 field2; do
# do something with $field1 and $field2
done < file.csv
The gains can be quite substantial for large files.
Here's how I would do it with GNU tools :
while read line; do
echo $line | cut -d, -f1-2 --output-delimiter=' ' | xargs your_command
done < your_input.csv
while read line; do [...]; done < your_input.csv will read your file line by line.
For each line, we will cut it to its first two fields (separated by commas since it's a CSV) and pass them separated by spaces to xargs which will in turn pass as parameters to your_command.
If this is a very simple csv file with no string literals, etc. you can simply use head and cut:
#!/bin/bash
while read line
do
id_field=$(cut -d',' -f 1 <<<"$line") #here 005 for the first line
ip_field=$(cut -d',' -f 2 <<<"$line") #here 192.168.0.1 for the first line
#do something with $id_field and $ip_field
done < file.csv
The program works as follows: we use cut -d',' to obtain the first and second field of that line. We wrap this around a while read line and use I/O redirection to feed the file to the while loop.
Of course you substitute file.csv with the name of the file you want to process, and you can use other variable names than the ones in this sample.

Printing file content on one line

I'm completely lost trying to do something which I thought would be very straightforward : read a file line by line and output everything on one line.
I'm using bash on RHEL.
Consider a simple test case with a file (test.in) with following content:
one
two
three
four
I want to have a script which reads this files and outputs:
one two three four
Done
I tried this (test.sh):
cat test.in | while read in; do
printf "%s " "$in"
done
echo "Done"
The result looks like:
# ./test.sh
foure
Done
#
It seems that the printf causes the cursor to jump to the first position on the same line immediately after the %s. The issues holds when doing echo -e "$in \c".
Any ideas?
another answer:
tr '[:space:]' ' ' < file
echo
This must be safest and most efficient as well. Use \n if you want to only convert new lines instead of any white spaces.
You can use:
echo -- $(<test.in); echo 'Done'
one two three four
Done
echo -- `cat file` | tail -c +4
the -- is to protect you from command line options. But in my shell the -- is printed out. I think that might be a bug. Will have to check.
So you need to check if you have to include | tail -c +4 in your implementation.

Passing input to sed, and sed info to a string

I have a list of files (~1000) and there is 1 file per line in my text file named: 'files.txt'
I have a macro that looks something like the following:
#!/bin/sh
b=$(sed '${1}q;d' files.txt)
cat > MyMacro_${1}.C << +EOF
myFile = new TFile("/MYPATHNAME/$b");
+EOF
and I use this input script by doing
./MakeMacro.sh 1
and later I want to do
./MakeMacro.sh 2
./MakeMacro.sh 3
...etc
So that it reads the n'th line of my files.txt and feeds that string to my created .C macro.
So that it reads the n'th line of my files.txt and feeds that string to my created .C macro.
Given this statement and your tags, I'm going to answer using shell tools and not really address the issue of the .c macro.
The first line of your script contains a sed script. There are numerous ways to get the Nth line from a text file. The simplest might be to use head and tail.
$ head -n "${i}" files.txt | tail -n 1
This takes the first $i lines of files.txt, and shows you the last 1 lines of that set.
$ sed -ne "${i}p" files.txt
This use of sed uses -n to avoid printing by default, then prints the $ith line. For better performance, try:
$ sed -ne "${i}{p;q;}" files.txt
This does the same, but quits after printing the line, so that sed doesn't bother traversing the rest of the file.
$ awk -v i="$i" 'NR==i' files.txt
This passes the shell variable $i into awk, then evaluates an expression that tests whether the number of records processed is the same as that variable. If the expression evaluates true, awk prints the line. For better performance, try:
$ awk -v i="$i" 'NR==i{print;exit}' files.txt
Like the second sed script above, this will quit after printing the line, so as to avoid traversing the rest of the file.
Plenty of ways you could do this by loading the file into an array as well, but those ways would take more memory and perform less well. I'd use one-liners if you can. :)
To take any of these one-liners and put it into your script, you already have the notation:
if expr "$i" : '[0-9][0-9]*$' >/dev/null; then
b=$(sed -ne "${i}{p;q;}" files.txt)
else
echo "ERROR: invalid line number" >&2; exit 1
fi
If I am understanding you correctly, you can do a for loop in bash to call the script multiple times with different arguments.
for i in `seq 1 n`; do ./MakeMacro.sh $i; done
Based on the OP's comment, it seems that he wants to submit the generated files to Condor. You can modify the loop above to include the condor submission.
for i in `seq 1 n`; do ./MakeMacro.sh $i; condor_submit <OutputFile> ; done
i=0
while read file
do
((i++))
cat > MyMacro_${i}.C <<-'EOF'
myFile = new TFile("$file");
EOF
done < files.txt
Beware: you need tab indents on the EOF line.
I'm puzzled about why this is the way you want to do the job. You could have your C++ code read files.txt at runtime and it would likely be more efficient in most ways.
If you want to get the Nth line of files.txt into MyMacro_N.C, then:
{
echo
sed -n -e "${1}{s/.*/myFile = new TFILE(\"&\");/p;q;}" files.txt
echo
} > MyMacro_${1}.C
Good grief. The entire script should just be (untested):
awk -v nr="$1" 'NR==nr{printf "\nmyFile = new TFile(\"/MYPATHNAME/%s\");\n\n",$0 > ("MyMacro_"nr".C")}' files.txt
You can throw in a ;exit before the } if performance is an issue but I doubt if it will be.

Resources