choosing column name in .csv file - bash

I'm really new to bash programming. I want to write the results of two variables into a .csv file. I use this command:
while IFS= read -r line; do
ip=$(dig +short $line)
echo "${line}, ${ip}" >> file.csv
done < domains
It works fine. It creates two columns in file.csv and writes the result of $line in the first column and the result of $ip in the second column.
I wanted to know if there is a way to choose a name for these columns. For example
column1 : $line & column2:$ip

In CSV files column names are the contents of the first row, so (before your loop) you can write:
echo "Line,Ip" > file.csv.tmp # Add columns in new temporary file
cat file.csv.tmp >> file.csv # Append all the data of the original file
rm file.csv # Remove the original file
mv file.cvs.tmp file.csv # Rename the temporary file
Or you can also simply use this other method:
echo "Line,Ip
$(cat file.cs)" > file.csv
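This works because the command substitution $(cat file.csv) is expanded before the > file.csv redirection truncates the file, so the original data is still read.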
I hope it helps.
As helen pointed out in the comments, if the file should be overwritten with every run then you can simply add echo "Line,Ip" > file.csv before the loop.
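For instance, a minimal sketch of the whole script with the header written up front (assuming, as in the question, that the input list is a file named domains and that file.csv is regenerated on every run):
#!/bin/bash
# Write the header row first, truncating any previous file.csv
echo "Line,Ip" > file.csv
# Then append one domain,ip row per input line
while IFS= read -r line; do
    ip=$(dig +short "$line")
    echo "${line},${ip}" >> file.csv
done < domains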

Related

Assign content of a file to an array in bash

I have a file that contains parts of file names, separated by newlines (or by spaces). Let's take the following example:
cat file.txt
1
4
500
The actual file names are file_1.dat, file_2.dat, file_3.dat, file_4.dat, file_500.dat, and so on.
I want to combine only those files whose names (or part of the names) are stored in file.txt.
To do so I am doing the following:
## read the file and assign to an array
array=()
while IFS= read -r line; do
array+=($line)
done < file.txt
## combine the contents of the files
for file in "${array[@]}"
do
cat "file_$file.dat"
done > output.dat
Now, what I don't like about this solution is the assignment of the array: I have to run a whole do loop just for that.
I tried to use
mapfile -t array < <(cat file.txt)
I also tried,
array=( $(cat file2.txt) )
The array that is needed finally is
array=(1 4 500)
In some of the answers on this platform, I have seen that doing it the above way (the last option) might be harmful. I wanted to have some clarification on what to do for such assignments.
My question is: in this situation what is the best (safe and fast) way to assign the content of a file into an array?
array=( $(cat file2.txt) )
does not necessarily put each line in the array. It puts each word resulting from word-splitting and globbing into the array.
Consider this file
1
2 3
*
mapfile -t array < file.txt will create an array with the elements 1, 2 3, and *.
array=( $(cat file.txt) ) will create an array with the elements 1, 2, and 3, along with an element for each file name in the current directory.
Using mapfile is both safer and makes your intent of storing one line per element clearer.
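A quick way to see the difference for yourself (a throwaway sketch; declare -p just prints each array's contents):
printf '1\n2 3\n*\n' > file.txt
mapfile -t a < file.txt
declare -p a    # a=([0]="1" [1]="2 3" [2]="*")
b=( $(cat file.txt) )
declare -p b    # "2 3" is split into two elements and * becomes the file names in the current directory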
However, there is no need for an array at all. You can process each file as you pull a line from your input file.
while IFS= read -r line; do
cat "file_$line.dat"
done < file.txt > output.dat
If you don’t want to deduplicate the file name fragments:
readarray -t lines < file.txt
declare -a fragments
for line in "${lines[#]}"; do
fragments+=($line)
done
names=("${fragments[#]/#/file_}")
names=("${names[#]/%/.dat}")
cat "${names[#]}"
If you do want to deduplicate the file name fragments:
readarray -t lines < file.txt
declare -Ai set_of_fragments
for line in "${lines[#]}"; do
for fragment in $line; do
((++set_of_fragments["${fragment}"]))
done
done
fragments=("${!set_of_fragments[#]}")
names=("${fragments[#]/#/file_}")
names=("${names[#]/%/.dat}")
cat "${names[#]}"

parsing many variables to another file containing many rows

I have an issue parsing variables, extracted with the cut command, into another file that contains many rows; I need to append the variables to the end of each row, in sequence.
For example: file 100.txt contains 1000 rows with 3 fields A,B,C.
Another file called pin.txt contains 1000 rows with a single field, e.g. 2222.
I need to take the values one by one and insert each at the end of the corresponding row in 100.txt.
while IFS= read -r line; do
sed -i "/:[0-9]*$/ ! s%$%,$line%" "100.txt"
done < pin.txt
What I have got:
1,2,3,2222,3333
1,2,3,2222,3333
What I expected:
1,2,3,2222
1,2,3,3333
If both files have the same number of lines, paste is your friend:
paste -d, 100.txt pin.txt > tmp.txt
mv -f tmp.txt 100.txt
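For example, with two small made-up input files, paste joins line N of the first file with line N of the second:
printf '1,2,3\n4,5,6\n' > 100.txt
printf '2222\n3333\n' > pin.txt
paste -d, 100.txt pin.txt
# 1,2,3,2222
# 4,5,6,3333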
Here is how I would do it using a while read loop, without sed:
while IFS= read -r file1 <&3; do
IFS= read -r file2
printf '%s,%s\n' "$file1" "$file2"
done 3<100.txt < pin.txt
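(The 3<100.txt at the end opens 100.txt on file descriptor 3, so read -r file1 <&3 pulls a line from 100.txt while the plain read -r file2 still reads from pin.txt on standard input.)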
Using mapfile (bash 4+ only):
mapfile -t file1 < 100.txt
mapfile -t file2 < pin.txt
for i in "${!file1[#]}"; do
printf '%s,%s\n' "${file1[$i]}" "${file2[$i]}"
done
Of course, those shell loops will be very slow on large files.

Remove first row from file with shell script using if statement

There are some data files being imported with header names on the first row, and others don't have headers. The files that do have headers always have "company" as the first field of the first row. To load them into the DB I need to get rid of that first row, so I need to write a .sh script that deletes the first row only of those files whose first field on the first row is "company". I guess I need to combine awk with an if statement, but I don't know exactly how.
if head -n 1 input.csv | cut -f 1 -d ',' | grep company
then tail -n +2 input.csv > output.csv
else
cp input.csv output.csv
fi
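Since the question mentions several incoming files, a hedged sketch that applies the same check to every .csv file in the current directory (the clean_ output naming is just an illustration):
#!/bin/bash
for f in *.csv; do
    if head -n 1 "$f" | cut -f 1 -d ',' | grep -q company; then
        tail -n +2 "$f" > "clean_$f"   # drop the header row
    else
        cp "$f" "clean_$f"             # no header, keep the file as-is
    fi
done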
If you're sure the string "company" appears only as the 1st field of header lines, you can go this way,
sed -e /^company,/d oldfile > newfile
supposing the separator is a comma.
Another solution :
if head -1 oldfile | grep -q "^company," ; then
sed -e 1d oldfile > newfile
else
cp oldfile newfile
fi
No if needed. Just do it straightforwardly, as you stated your requirements:
Print the first line unless it starts with company:
strip_header_if_present() {
    IFS='' read -r first_line
    echo "$first_line" | grep -v '^company,'
    # Now print the remaining lines:
    cat
}
To use this shell function:
strip_header_if_present < input.csv > output.csv

How to read a single column CSV file in bash?

I am relatively new to bash/programming in general.
I have a single column CSV that looks like this:
domain1.com
domain2.com
domain3.com
domain4.com
I want to run through each entry and do something with it. Here is my code:
foo(){
i=0
while read -a line;
do
echo ${line[i]}
((i++))
done < myfile.csv
}
And nothing happens. I have figured out that if I change the file I'm pointing at to:
done< <(grep '' myfile.csv)
it will work, but only spit out the very last line of the CSV, like this:
domain4.com
Again, I am a beginner and teaching myself this stuff, so any explanations you want to give with your answers would be GREATLY appreciated!
EDIT: So it appears that my new problem is removing the ^M character from my CSV file. Once I figure out how to do this, I will mark the answer here that works for me.
If you want to store your lines on an array you'd simply do:
readarray -t lines < file
And, if you want to try processing those lines you can have something like
for line in "${lines[#]}"; do
echo "$line"
done
Or by index (note the !):
for i in "${!lines[@]}"; do
echo "${lines[i]}"
done
Indices start with 0.
while read p; do
echo $p
done < myfile.csv
Looks like you have 2 issues:
Your lines all end with \r
There is no newline or \r at the end of the last line
To fix these issues, use this script:
echo >> file.csv
while read -r line; do echo "$line"; done < <(tr '\r' '\n' < file.csv)
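Since the edit in the question is about the ^M (carriage-return) characters, another option is to strip them once up front instead of translating them on every read; a minimal sketch (the myfile_clean.csv name is just an illustration):
# Strip the DOS carriage returns once, writing a clean copy
tr -d '\r' < myfile.csv > myfile_clean.csv
while IFS= read -r line; do
    echo "$line"
done < myfile_clean.csv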
You can also simply read the file into the array with:
array=( $(<file) )
If you need to use numerical indexes, then you can access the elements with:
for ((i=0;i<${#array[@]};i++)); do
printf " array [%2d]: %s\n" "$i" "${array[$i]}"
done

How do I add to a column instead of a row using Bash Script and csv?

#!/bin/bash
# This file will gather who is information
while IFS=, read url
do
whois $url > output.txt
echo "$url," >> Registrants.csv
grep "Registrant Email:" output.txt >> Registrants.csv
done < $1
How do I get the grep output to go into a new column instead of a new row? I Want column 1 to have the echo, column 2 to have the grep, then go down to a new row.
You can disable the trailing newline on echo with the -n flag.
#!/bin/bash
# This file will gather who is information
while IFS=, read url
do
whois $url > output.txt
echo -n "$url," >> Registrants.csv
grep "Registrant Email:" output.txt >> Registrants.csv
done < $1
Use printf, then you don't have to worry if the "echo" you are using accepts options.
printf "%s" "$url,"
printf is much more portable than "echo -n".
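Putting the pieces together with printf (a sketch only; the Registrants.csv name, the "Registrant Email:" pattern and the $1 input file all come from the question; the email variable is just an illustration):
#!/bin/bash
# Build one CSV row per domain: the url in column 1, the registrant email line in column 2
while IFS=, read -r url; do
    email=$(whois "$url" | grep "Registrant Email:")
    printf '%s,%s\n' "$url" "$email" >> Registrants.csv
done < "$1"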
