parse and echo string in a bash while loop - bash

I have a file with this structure:
picture1_123.txt
picture2_456.txt
picture3_789.txt
picture4_012.txt
I wanted to get only the first segment of the file name, that is, picture1 to picture4.
I first used the following code:
cat picture | while read -r line; do cut -f1 -d "_"; echo $line; done
This returns the following output:
picture2
picture3
picture4
picture1_123.txt
This error got corrected when I changed the code to the following:
cat picture | while read line; do s=$(echo $line | cut -f1 -d "_"); echo $s; done
picture1
picture2
picture3
picture4
Why in the first:
The lines are printed in a different order than the original file?
no operation is done on picture1_123.txt and picture1 is not printed?
Thank you!

What Was Wrong
Here's what your old code did:
On the first (and only) iteration of the loop, read line read the first line into line.
The cut command read the entire rest of the file, and wrote the results of extracting only the desired field to stdout. It did not inspect, read, or modify the line variable.
Finally, your echo $line wrote the first line in entirety, with nothing being cut.
Because all input had been consumed by cut, nothing remained for the next read line to consume, so the loop never ran a second time.
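The sequence described above can be reproduced with a small experiment. This is a minimal sketch, assuming a temporary file that mirrors the sample data from the question. The iterations counter and the temporary files are additions for illustration, and the loop reads from a redirection instead of cat | so that the counter is still visible after the loop ends.

```bash
#!/usr/bin/env bash
# Hypothetical sample data mirroring the file from the question.
input=$(mktemp)
captured=$(mktemp)
printf '%s\n' picture1_123.txt picture2_456.txt picture3_789.txt picture4_012.txt > "$input"

iterations=0
while read -r line; do
  iterations=$((iterations + 1))
  cut -f1 -d "_"     # never looks at $line; reads the rest of the loop's stdin
  echo "$line"       # prints the one line that read consumed, unmodified
done < "$input" > "$captured"

output=$(cat "$captured")
printf '%s\n' "$output"
echo "iterations: $iterations"
rm -f "$input" "$captured"
```

The counter ends at 1, and the captured output has the three cut lines first and the untouched first line last, which matches the output shown in the question.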
How To Do It Right
The simple way to do this is to let read separate out your prefix:
while IFS=_ read -r prefix suffix; do
echo "$prefix"
done <picture
...or to just run nothing but cut, and not use any while read loop at all:
cut -f1 -d_ <picture
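To check that the two approaches agree, here is a minimal sketch that runs both against the same data. The temporary file and the variable names with_read and with_cut are assumptions made for the example, not part of the original answer.

```bash
#!/usr/bin/env bash
# Hypothetical sample data mirroring the file from the question.
input=$(mktemp)
printf '%s\n' picture1_123.txt picture2_456.txt picture3_789.txt picture4_012.txt > "$input"

# Approach 1: read splits each line on "_"; suffix absorbs the remainder.
with_read=$(
  while IFS=_ read -r prefix suffix; do
    echo "$prefix"
  done < "$input"
)

# Approach 2: a single cut process handles the whole file.
with_cut=$(cut -f1 -d_ < "$input")

printf '%s\n' "$with_read"
rm -f "$input"
```

Both variables should hold the same four lines. The cut version starts one process for the whole file, so it is usually the better choice unless the loop body needs to do more work per line.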

Related

Why does "cut" command skip first line in this "while read line" loop?

I'm writing a bash script, and I need to take the second field of every line in a file, and save them in another file. I know there are many possible ways to do this, BUT, I tried first using while read line; do, and I got stuck. Now, I really want to know what is happening.
For example, input file would be:
line1 11111
line2 222222
line3 333
line4 4444
(The field separator is "\t").
This is what I was doing:
inputfile=$1
cat $"inputfile" | while read -r line
do
cut -f2 >> results_file
done
The problem is, the output would be:
222222
333
4444
(skipping the first line)
I've already tested hundreds of modifications and tried other commands instead of cut (such as sed and grep). I would appreciate some help, or someone pointing me in the right direction.
Thank you very much!
You are not using the variable $line set by read. Try instead
inputfile=$1
cat "$inputfile" | while read -r line
do
echo "$line" | cut -f2 >> results_file
done
In your original code, the while loop actually runs only once, not four times. Try putting echo 'Hello!' inside the loop of your original code: you will see the message once, not four times. Without the echo "$line" | part, cut -f2 reads directly from the loop's standard input and consumes everything left in the pipe.
That is, your while loop first consumes the first line of the stdin and puts this line in the variable $line, leaving the next three lines for later use. But $line is never used. Instead, the remaining three lines are consumed by the command cut.
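The 'Hello!' experiment suggested above might look like the following. This is only a sketch: the temporary files and the hellos counter are assumptions added for illustration, and the input is redirected into the loop rather than piped so the counter can be inspected afterwards.

```bash
#!/usr/bin/env bash
# Hypothetical tab-separated input mirroring the question's example.
input=$(mktemp)
results=$(mktemp)
printf 'line1\t11111\nline2\t222222\nline3\t333\nline4\t4444\n' > "$input"

hellos=0
while read -r line; do
  echo 'Hello!'              # printed once per iteration
  hellos=$((hellos + 1))
  cut -f2 >> "$results"      # swallows every line that read has not taken yet
done < "$input"

result_lines=$(cat "$results")
echo "loop ran $hellos time(s)"
printf '%s\n' "$result_lines"
rm -f "$input" "$results"
```

The loop body runs a single time, and the results file contains the second field of lines 2 to 4 only.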
All commands within a command group are within the scope of any redirections applied to a command group (or any compound command):
— https://mywiki.wooledge.org/BashGuide/CompoundCommands
The pipe operator creates a subshell environment for each command.
— https://mywiki.wooledge.org/BashGuide/InputAndOutput
We can interpret the quotes as "the stdin of your while loop (i.e., the output of cat "$inputfile") is inherited by cut, unless you give cut a stdin of its own, e.g., through another pipe such as echo "$line" | ...."
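One way to test this reading is to give cut some other standard input and see whether the loop then visits every line. The sketch below assumes a temporary input file; the names isolated and fields are made up for the example.

```bash
#!/usr/bin/env bash
# Hypothetical tab-separated input mirroring the question's example.
input=$(mktemp)
printf 'line1\t11111\nline2\t222222\nline3\t333\nline4\t4444\n' > "$input"

# cut reads from /dev/null here, so it cannot touch the loop's input.
isolated=0
while read -r line; do
  isolated=$((isolated + 1))
  cut -f2 < /dev/null
done < "$input"

# cut reads from its own pipe, which carries only the current line.
fields=$(
  while read -r line; do
    printf '%s\n' "$line" | cut -f2
  done < "$input"
)

echo "iterations with isolated cut: $isolated"
printf '%s\n' "$fields"
rm -f "$input"
```

In both loops read sees all four lines, because cut no longer shares the file descriptor that the loop is reading from.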
By the way, you can just use cut -f2 "$inputfile" >> results_file without the while loop.
With respect to your comment Does it mean to use "\t at the end" as a separator - no. You're confusing what was suggested, $'\t' with '\t$'. $'\t' means "the literal tab character generated from the escape sequence \t".
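The difference between the two spellings can be checked directly in bash. This is a small sketch that relies on bash's ANSI-C quoting and here-strings; the record being split and its URL are invented for the example.

```bash
#!/usr/bin/env bash
tab=$'\t'          # ANSI-C quoting: one literal tab character
not_tab='\t$'      # ordinary single quotes: backslash, t, dollar sign

tab_length=${#tab}
not_tab_length=${#not_tab}
echo "lengths: $tab_length and $not_tab_length"

# Splitting a hypothetical tab-separated record on the tab only,
# so the space inside the second field is preserved.
IFS=$'\t' read -r key url <<< $'example\thttps://example.com/a b'
printf 'key=%s url=%s\n' "$key" "$url"
```

The first string is one character long and the second is three, which is why only $'\t' works as a field separator for tab-delimited data.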
You also said in your comment your real 2nd fields are URLs to be curled. You shouldn't be using a UUOC and cut anyway, here's how to really do this:
while IFS=$'\t' read -r key url; do
val=$(curl "$url" | whatever)
printf '%s\t%s\n' "$key" "$val"
done < "$inputfile" > results_file
Replace whatever with whatever command you use to produce the output you want from the curl output.

Read a file in a Bash script

I have a file in my file system that I want to read in a Bash script. I only want to read selected values from it, not the whole file, because the file is very large. The file format is:
Name=TEST
Add=TEST
LOC=TEST
The file contains data like the above. From it, I want to get only the Add date into a variable. Could you please suggest how I can do this?
This is how I am currently reading the file:
file="data.txt"
while IFS= read line
do
# display $line or do something with $line
echo "$line"
done < "$file"
Use the right tool meant for the job, Awk in this case to speed things up!
dateValue="$(awk -F"=" '$1=="Add"{print $2; exit}' file)"
printf "%s\n" "$dateValue"
TEST
The idea is to split input lines on = as the delimiter. The awk program checks whether the first field, $1, equals Add and, if so, prints the value associated with it.
The exit after print is optional. It stops processing as soon as the first Add line is found, which speeds things up if the file is huge, as you have indicated.
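The effect of exit is easiest to see when the key appears more than once. Here is a minimal sketch, assuming a temporary file with made-up values (the date and the duplicate Add line are not from the question).

```bash
#!/usr/bin/env bash
# Hypothetical data: the second Add line exists only to show what exit does.
data=$(mktemp)
printf '%s\n' 'Name=TEST' 'Add=2021-03-04' 'LOC=TEST' 'Add=later-duplicate' > "$data"

# With exit: awk stops reading at the first match.
dateValue=$(awk -F= '$1 == "Add" { print $2; exit }' "$data")

# Without exit: awk scans the whole file and prints every match.
all_matches=$(awk -F= '$1 == "Add" { print $2 }' "$data")

printf '%s\n' "$dateValue"
rm -f "$data"
```

With exit, only the first value is captured and awk does not have to read the rest of the file.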
You could rewrite your loop this way, notice the break after you got your line:
while IFS='=' read -r key value; do
if [[ $key == "Add" ]]; then
# your logic
break
fi
done < "$file"
If your intention is to just get the very first occurrence of "Add=", then you could use grep this way:
value=$(grep -m 1 '^Add=' "$file" | cut -f2 -d=)
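The loop and the grep pipeline behave differently when the value itself contains an = sign. The sketch below illustrates this with invented data; it uses the portable [ ... ] test instead of [[ ... ]], and the names from_read, from_cut and from_cut_rest are assumptions for the example.

```bash
#!/usr/bin/env bash
# Hypothetical data: the Add value deliberately contains an "=".
data=$(mktemp)
printf '%s\n' 'Name=TEST' 'Add=a=b' 'LOC=TEST' > "$data"

# read assigns everything after the first "=" to the last variable.
from_read=
while IFS='=' read -r key value; do
  if [ "$key" = "Add" ]; then
    from_read=$value
    break                      # stop at the first match
  fi
done < "$data"

# cut -f2 returns only the second field; -f2- keeps the rest of the line.
from_cut=$(grep -m 1 '^Add=' "$data" | cut -f2 -d=)
from_cut_rest=$(grep -m 1 '^Add=' "$data" | cut -f2- -d=)

printf 'read: %s, cut -f2: %s, cut -f2-: %s\n' "$from_read" "$from_cut" "$from_cut_rest"
rm -f "$data"
```

If values can contain the delimiter, cut -f2- (or the read loop) keeps them intact, while cut -f2 truncates them.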

CSV file parsing in Bash

I have a CSV file with sample entries given below. What I want is to write a Bash script to read the CSV file line by line and put the first entry e.g 005 in one variable and the IP 192.168.10.1 in another variable, that I need to pass to some other script.
005,192.168.10.1
006,192.168.10.109
007,192.168.10.12
008,192.168.10.121
009,192.168.10.123
A more efficient approach, without the need to fork cut each time:
#!/usr/bin/env bash
while IFS=, read -r field1 field2; do
# do something with $field1 and $field2 (a loop body must contain at least one command)
printf '%s %s\n' "$field1" "$field2"
done < file.csv
The gains can be quite substantial for large files.
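As an illustration of passing both fields on to something else, here is a minimal sketch. The function handle_entry is a hypothetical stand-in for the "other script" from the question, and the temporary file holds the first two sample rows.

```bash
#!/usr/bin/env bash
# Hypothetical copy of the first two rows of the sample CSV.
csv=$(mktemp)
printf '%s\n' '005,192.168.10.1' '006,192.168.10.109' > "$csv"

# handle_entry is a made-up placeholder for the script that receives the values.
handle_entry() {
  printf 'id=%s ip=%s\n' "$1" "$2"
}

report=$(
  while IFS=, read -r id ip; do
    handle_entry "$id" "$ip"
  done < "$csv"
)

printf '%s\n' "$report"
rm -f "$csv"
```

If the real command reads standard input itself, redirect its stdin (for example from /dev/null) so that it does not consume the remaining CSV lines.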
Here's how I would do it with GNU tools:
while read line; do
echo "$line" | cut -d, -f1-2 --output-delimiter=' ' | xargs your_command
done < your_input.csv
while read line; do [...]; done < your_input.csv will read your file line by line.
For each line, we will cut it to its first two fields (separated by commas since it's a CSV) and pass them separated by spaces to xargs which will in turn pass as parameters to your_command.
If this is a very simple CSV file with no quoted string literals, etc., you can simply use read and cut:
#!/bin/bash
while read line
do
id_field=$(cut -d',' -f 1 <<<"$line") #here 005 for the first line
ip_field=$(cut -d',' -f 2 <<<"$line") #here 192.168.10.1 for the first line
#do something with $id_field and $ip_field
done < file.csv
The program works as follows: we use cut -d',' to obtain the first and second field of that line. We wrap this around a while read line and use I/O redirection to feed the file to the while loop.
Of course you substitute file.csv with the name of the file you want to process, and you can use other variable names than the ones in this sample.

In bash how to echo on the next line into a file?

I am trying to retrieve the 1st column of the file results and put it in a file named results2, but my script only writes the last line of results:
for line in $(cat results | cut -d" " -f1);
do echo -e "$line">results2;
done
You can do what you want in the shell like this:
while read -r col1 rest
do
printf '%s\n' "$col1"
done < results > results2
...but it'd make a lot more sense to just do this:
cut -d' ' -f1 results > results2
The problem with your script is that you're using > inside the loop, which truncates the file and reopens it for writing every iteration. You can get around this by redirecting the whole loop to the file but as I've shown, the best method is just to redirect the output of cut.
To read lines in the shell, use a while read loop; don't read lines with for. However, bear in mind that the shell isn't designed to do this task efficiently, whereas that's exactly what the standard tools such as cut are there for.
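The truncation described above is easy to reproduce. The following is a minimal sketch using a temporary directory and three made-up input lines; the file names last_only and all_lines are assumptions for the example.

```bash
#!/usr/bin/env bash
# Hypothetical input: three space-separated lines.
workdir=$(mktemp -d)
printf '%s\n' 'a 1' 'b 2' 'c 3' > "$workdir/results"

# ">" inside the loop: the file is truncated on every iteration.
while read -r col1 rest; do
  printf '%s\n' "$col1" > "$workdir/last_only"
done < "$workdir/results"

# ">" on the loop as a whole: the file is opened once.
while read -r col1 rest; do
  printf '%s\n' "$col1"
done < "$workdir/results" > "$workdir/all_lines"

last_only=$(cat "$workdir/last_only")
all_lines=$(cat "$workdir/all_lines")
printf 'inside the loop: %s\n' "$last_only"
printf 'around the loop:\n%s\n' "$all_lines"
rm -rf "$workdir"
```

The first file ends up with only the final column value, while the second contains all three.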

Echo changes my tabs to spaces

I'm taking the following structure from around the net as a basic example of how to read from a file in BASH:
cat inputfile.txt | while read line; do echo $line; done
My inputfile.txt is tab-delimited, though, and the lines that come out of the above command are space-delimited.
This is causing me problems in my actual application, which is of course more complex than the above: I want to take the line, generate some new stuff based on it, and then output the original line plus the new stuff as extra fields. And the pipeline is going to be complicated enough without a bunch of cut -d ' ' and sed -e 's/ /\t/g' (which wouldn't be safe for tab-delimited data containing spaces anyway).
I've looked at IFS solutions, but they don't seem to help in this case. What I want is an OFS...except that I'm in echo, not awk! I think that if I could just get echo to spit out what I gave it, verbatim, I'd be in good shape. Any thoughts? Thanks!
Try:
cat inputfile.txt | while read line; do echo "$line"; done
instead.
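In other words, it's not read replacing the tabs; it's the unquoted $line. The shell splits the unquoted expansion on whitespace, and echo then joins the resulting words with single spaces.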
See the following transcript (using <<tab>> where the tabs are):
pax$ echo 'hello<<tab>>there' | while read line ; do echo $line ; done
hello there
pax$ echo 'hello<<tab>>there' | while read line ; do echo "$line" ; done
hello<<tab>>there

Resources