How to read specific lines in a file in Bash?

while read -r line will run through each line in a file. How can I have it run through specific lines in a file, for example, lines "1-20", then "30-100"?

One option would be to use sed to get the desired lines:
while read -r line; do
echo "$line"
done < <(sed -n '1,20p; 30,100p' inputfile)
This feeds lines 1-20 and 30-100 of inputfile to the read loop.
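If the input file is large, you can also tell sed to quit after the last line you need, so it stops reading the rest of the file. A small sketch (the 200-line sample file is made up for illustration):

```shell
# Create sample data: 200 numbered lines.
seq 1 200 > /tmp/inputfile

# Print lines 1-20 and 30-100, then quit at line 100
# instead of scanning the remaining lines.
sed -n '1,20p; 30,100p; 100q' /tmp/inputfile
```

This prints 91 lines in total (20 from the first range, 71 from the second) and never reads past line 100.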

@devnull's sed command does the job. An alternative is awk, which avoids the read loop entirely because you can do the processing within awk itself:
awk '(NR>=1 && NR<=20) || (NR>=30 && NR<=100) {print "processing", $0}' file

Related

How to copy specific columns from one csv file to another csv file?

File1.csv:
File2.csv:
I want to replace the contents of configSku,selectedSku,config_id in File1.csv with the contents of configSku,selectedSku,config_id from File2.csv. The end result should look like this:
Here are the links to download the files so you can try it yourself:
File1.csv: https://www.dropbox.com/s/2o12qjzqlcgotxr/file1.csv?dl=0
File2.csv: https://www.dropbox.com/s/331lpqlvaaoljil/file2.csv?dl=0
Here's what I have tried but still failed:
#!/bin/bash
INPUT=/tmp/file2.csv
OLDIFS=$IFS
IFS=,
[ ! -f $INPUT ] && { echo "$INPUT file not found"; exit 99; }
echo "no,my_account,form_token,fingerprint,configSku,selectedSku,config_id,address1,item_title" > /tmp/temp.csv
while read item_title configSku selectedSku config_id
do
cat /tmp/file1.csv |
awk -F ',' -v item_title="$item_title" \
-v configSku="$configSku" \
-v selectedSku="$selectedSku" \
-v config_id="$config_id" \
-v OFS=',' 'NR>1{$5=configSku; $6=selectedSku; $7=config_id; $9=item_title; print}' >> /tmp/temp.csv
done < <(tail -n +2 "$INPUT")
IFS=$OLDIFS
How do I do this?
If I understood the question correctly, how about using:
paste -d, file1.csv file2.csv | awk -F, -v OFS=',' '{print $1,$2,$3,$4,$11,$12,$13,$8,$10}'
This is not nearly as robust as the other answer; it assumes that file1.csv and file2.csv have the same number of lines and that each line in one file corresponds to the same line in the other. The output would look like this:
no,my_account,form_token,fingerprint,configSku,selectedSku,config_id,address1,item_title
1,account1,asdf234safd,sd4d5s6sa,NEWconfigSku1,NEWselectedSku1,NEWconfig_id1,myaddr1,Samsung Handsfree
2,account2,asdf234safd,sd4d5s6sa,NEWconfigSku2,NEWselectedSku2,NEWconfig_id2,myaddr2,Xiaomi Mi headset
3,account3,asdf234safd,sd4d5s6sa,NEWconfigSku3,NEWselectedSku3,NEWconfig_id3,myaddr3,Ear Headphones with Mic
4,account4,asdf234safd,sd4d5s6sa,NEWconfigSku4,NEWselectedSku4,NEWconfig_id4,myaddr4,Handsfree/Headset
The first part uses paste to put the files side by side, separated by a comma (hence the -d option). You end up with a combined file of 13 columns. The awk part sets the input and output field separators to a comma (-F, and -v OFS=',', respectively) and then prints the desired columns (columns 1-4 from the first file, then columns 2-4 of the second file, which correspond to columns 11-13 in the merged file).
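As a quick sanity check, here is the same paste/awk pipeline run on two minimal files with the same shapes as file1.csv (9 columns) and file2.csv (4 columns); the file names and field values below are made up:

```shell
# file1: 9 columns; file2: 4 columns (item_title,configSku,selectedSku,config_id)
printf '%s\n' 'no,acct,tok,fp,oldC,oldS,oldI,addr,title' \
              '1,account1,t1,f1,c1,s1,i1,a1,x1' > /tmp/f1.csv
printf '%s\n' 'item_title,configSku,selectedSku,config_id' \
              'Samsung Handsfree,NEWc1,NEWs1,NEWi1' > /tmp/f2.csv

# Merge side by side (13 columns), then pick the wanted fields:
paste -d, /tmp/f1.csv /tmp/f2.csv |
awk -F, -v OFS=',' '{print $1,$2,$3,$4,$11,$12,$13,$8,$10}'
```

The data row comes out as 1,account1,t1,f1,NEWc1,NEWs1,NEWi1,a1,Samsung Handsfree, i.e. columns 5-7 and 9 replaced by the values from the second file.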
The main issue in your original script is that you read one file (/tmp/file2.csv) one line at a time, and for each line, you parse and print the whole other file (/tmp/file1.csv).
Here is an example of how to merge two CSV files in bash:
#!/bin/bash
# Open both files in "reading mode"
exec 3<"$1"
exec 4<"$2"
# Read(/discard) the header line in both csv files
read -r -u 3
read -r -u 4
# Print the new header line
printf "your,own,header,line\n"
# Read both files one line at a time and print the merged result
while true; do
IFS="," read -r -u 3 your own || break
IFS="," read -r -u 4 header line
printf "%s,%s,%s,%s\n" "$your" "$own" "$header" "$line"
done
exec 3<&-
exec 4<&-
Assuming you saved the script above in "merge_csv.sh", you can use it like this:
$ bash merge_csv.sh /tmp/file1.csv /tmp/file2.csv > /tmp/temp.csv
Be sure to modify the script to suit your needs (I did not use the headers you provided in your question).
If you are not familiar with the exec command, the tldp documentation and the bash hackers wiki both have an entry about it. The man page for read documents the -u option well enough. Finally, VAR="something" command arg1 arg2 (used in the script as IFS="," read -r -u 3) is a common construct in shell scripting: the variable assignment applies only to that single command. If you are not familiar with it, I believe this answer should provide enough information on what it does.
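A quick way to convince yourself that the one-shot assignment only affects the command it precedes (the GREETING variable below is made up, and the example assumes it isn't already exported in your shell):

```shell
# The assignment exists only for the duration of the child command:
GREETING=hi bash -c 'echo "$GREETING"'   # the child sees "hi"

# Back in the parent shell, GREETING was never set:
echo "${GREETING:-unset}"
```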
Note: if you want to do more complex processing of csv files I recommend using python and its csv package.

Naming awk output in loop

I'm relatively new to the world of shell scripts so hopefully this won't be too difficult. I have a file (dirlist) with a list of directories. I want to
cat 'dirlist' with the path to each file
use a program called samtools to modify the file from dirlist
use awk to subset the samtools output on a variable chr17
write the output to a file that uses the 8th field of the directory, from 'dirlist' for naming
do this for all the files listed in dirlist
I think I have all the pieces here. Items 1-3 are working fine but the loop is simply naming the file "echo".
for i in `cat dirlist`; do samtools depth $i | awk '$1 == "chr17" {print $0}' echo $i | awk -F'[/]' '{print $8}'; done
Any help would be greatly appreciated
A native bash implementation (just one process, rather than starting an awk for every file) follows:
while IFS= read -r filename; do
while IFS= read -r line; do
if [[ $line = "chr17"[[:space:]]* ]]; then
IFS=/ read -r -a pieces <<<"$filename"
printf '%s\n' "${pieces[7]}"
fi
done < <(samtools depth "$filename")
done <dirlist
I think this is what you want to do:
... | awk -v f="$i" 'BEGIN{split(f,fs,"/")} $1=="chr17" {print > fs[8]}'
The final file name is generated from the original file name, split on "/", keeping only the 8th segment. Kind of unusual; it perhaps needs some error handling.
not tested, caveat emptor...
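To illustrate the idea without samtools, here is a self-contained sketch; the path and the chromosome data are made up, and the 8th "/"-separated segment of the path (note that the leading "/" makes segment 1 empty) becomes the output file name:

```shell
cd "$(mktemp -d)"

# fs[1] is empty (before the leading /), so fs[8] is "sampleA.depth".
f=/proj/data/2024/batch/lab/runs/sampleA.depth

# Only chr17 lines are written, to a file named after segment 8:
printf '%s\n' 'chr17 100 5' 'chr1 200 3' |
awk -v f="$f" 'BEGIN{split(f,fs,"/")} $1=="chr17" {print > fs[8]}'

cat sampleA.depth   # contains only the chr17 line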

Remove everything in a pipe delimited file after second-to-last pipe

How can I remove everything in a pipe-delimited file after the second-to-last pipe? For example, given the line
David|3456|ACCOUNT|MALFUNCTION|CANON|456
the result should be
David|3456|ACCOUNT|MALFUNCTION
Replace |(string without pipe)|(string without pipe) at the end of each line:
sed 's/|[^|]*|[^|]*$//' inputfile
Using awk, something like
awk -F'|' 'BEGIN{OFS="|"}{NF=NF-2; print}' inputfile
David|3456|ACCOUNT|MALFUNCTION
(or) use cut if you know the total number of columns, i.e. 6 -> 4:
cut -d'|' -f -4 inputfile
David|3456|ACCOUNT|MALFUNCTION
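All three approaches agree on the sample line. A quick check (note that decrementing NF and printing the rebuilt record works in gawk and mawk, though strictly speaking POSIX leaves it unspecified):

```shell
line='David|3456|ACCOUNT|MALFUNCTION|CANON|456'

echo "$line" | sed 's/|[^|]*|[^|]*$//'                     # strip last two fields
echo "$line" | awk -F'|' 'BEGIN{OFS="|"}{NF=NF-2; print}'  # drop last two fields
echo "$line" | cut -d'|' -f -4                             # keep first four fields
```

Each command prints David|3456|ACCOUNT|MALFUNCTION.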
The command I would use is
sed -r 's/(.*)\|.*\|.*/\1/' input.txt > output.txt
A pure Bash solution:
while IFS= read -r line || [[ -n $line ]] ; do
printf '%s\n' "${line%|*|*}"
done <inputfile
See Reading input files by line using read command in shell scripting skips last line (particularly the answer by Jahid) for details of how the while loop works.
See pattern matching in Bash for information about ${line%|*|*}.

Extract first word in colon separated text file

How do I iterate through a file and print only the first word? The lines are colon-separated, for example:
root:01:02:toor
The file contains several lines. This is what I've done so far, but it doesn't work:
FILE=$1
k=1
while read line; do
echo $1 | awk -F ':'
((k++))
done < $FILE
I'm not good with bash scripting at all, so this is probably very trivial for one of you.
Edit: the variable k is there to count the lines.
Use cut:
cut -d: -f1 filename
-d specifies the delimiter
-f specifies the field(s) to keep
If you need to count the lines, just
count=$( wc -l < filename )
-l tells wc to count lines
awk -F: '{print $1}' FILENAME
That will print the first word of each colon-separated line. Is this what you are looking for?
To use a loop, you can do something like this:
$ cat test.txt
root:hello:1
user:bye:2
test.sh
#!/bin/bash
while IFS=':' read -r line || [[ -n $line ]]; do
echo $line | awk -F: '{print $1}'
done < test.txt
Example of reading line by line in bash: Read a file line by line assigning the value to a variable
Result:
$ ./test.sh
root
user
A solution using perl
%> perl -F: -ane 'print "$F[0]\n";' [file(s)]
change the "\n" to " " if you don't want a new line printed.
You can get the first word without any external commands in bash like so:
printf '%s' "${line%%:*}"
which expands the variable named line and deletes the longest suffix matching the glob :*, i.e. everything from the first colon onward (that's what %% does; a single % would delete the shortest match instead).
Though with this solution you do need to write the loop yourself. If this is the only thing you want to do, the cut solution is better, since you don't have to iterate over the file yourself.
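Putting the pieces together, a loop that prints the first field of each line using only bash built-ins, including the line counter from the question (the sample file contents are made up):

```shell
printf '%s\n' 'root:01:02:toor' 'user:03:04:resu' > /tmp/passwd_like

k=0
while IFS= read -r line; do
    printf '%s\n' "${line%%:*}"   # strip from the first colon to the end
    k=$((k+1))                    # count lines, as in the question
done < /tmp/passwd_like
echo "lines: $k"
```

This prints root, then user, then lines: 2.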

Cutting certain lengths of a string in a CSV file and adding it in quotes followed by a comma

I have a csv file like
PC1234
PC4567
PC7890
one below another
I am trying, using a shell script, to cut out just the first 6 characters, as they are followed by several gaps (spaces). I have tried the below:
cat Upd_Master_Payloads.csv | while read line
do
interface=`expr $line|cut -c 1-6`
echo "$interface" > INPUT.csv
done
chmod 777 INPUT.csv
sed "s/.*/'&'/" INPUT.csv | tr '\n' ',' > SQL_INPUT.csv
chmod 777 SQL_INPUT.csv
I am able to do so only for the first line and not the other lines.
Please help, as I am new to shell scripting.
That's because you keep overwriting the entire file with > INPUT.csv. Try using >> INPUT.csv to append to the file instead.
Alternatively, a better way to write the loop is:
while IFS= read -r line
do
interface=${line:0:6}
echo "$interface"
done < Upd_Master_Payloads.csv > INPUT.csv
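For what it's worth, the whole task can also be done without a loop. A sketch using the same file names as the question (the sample PC values below are made up); paste -s joins the lines with commas and, unlike tr '\n' ',', leaves no trailing comma:

```shell
cd "$(mktemp -d)"
printf '%s\n' 'PC1234   ' 'PC4567   ' 'PC7890   ' > Upd_Master_Payloads.csv

cut -c 1-6 Upd_Master_Payloads.csv > INPUT.csv     # first 6 characters of every line
sed "s/.*/'&'/" INPUT.csv | paste -sd, - > SQL_INPUT.csv

cat SQL_INPUT.csv   # 'PC1234','PC4567','PC7890'
```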
