How to split a text file on a delimiter into multiple files in unix? - bash

I have a text file that looks like this:
input_file
1|abc
2|def
3|ghi
n|etc...
I need to split this up into two files on the pipe delimiter. So this is the expected output:
File_1:
1
2
3
n
File_2:
abc
def
ghi
etc
I do not know how many lines the input file will have. How do you achieve this in ksh or bash?
Thank you.

awk would be suitable for this task:
awk -F\| '{print $1 > "File_1"; print $2 > "File_2"}' input_file
This splits your text on the "|" and prints each column to the respective file.
If there were more than two fields, you may prefer to use a loop instead:
awk -F\| '{for(i=1;i<=NF;++i) print $i > ("File_" i)}' input_file
(The parentheses around "File_" i matter for portability: without them, some awks parse the redirection target differently.)
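For instance, with a hypothetical three-field variant of the input (sample data assumed, not from the question), the loop writes a third file as well:
$ printf '1|abc|xyz\n2|def|uvw\n' > input_file
$ awk -F\| '{for(i=1;i<=NF;++i) print $i > ("File_" i)}' input_file
$ cat File_3
xyz
uvw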

cut -d '|' -f 1 input_file > File_1
cut -d '|' -f 2 input_file > File_2
Only with bash:
while IFS='|' read -r A B; do echo "$A" >>File_1; echo "$B" >>File_2; done <input_file

Here is another solution using cut:
cat input_file | cut -d '|' -f1 > File_1
cat input_file | cut -d '|' -f2 > File_2
Or you can combine them into one line with tee and process substitution:
cat input_file | tee >(cut -d '|' -f1 > File_1) | cut -d '|' -f2 > File_2
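With the sample input, the two files should match the expected output:
$ cat File_1
1
2
3
n
$ cat File_2
abc
def
ghi
etc...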

Related

How to grab fields in inverted commas

I have a text file which contains the following lines:
"user","password_last_changed","expires_in"
"jeffrey","2021-09-21 12:54:26","90 days"
"root","2021-09-21 11:06:57","0 days"
How can I grab the two fields jeffrey and 90 days from between the inverted commas and save them in variables?
If awk is an option, you could save an array and then save the elements as individual variables.
$ IFS="\"" read -ra var <<< $(awk -F, '/jeffrey/{ print $1, $NF }' input_file)
$ var2="${var[3]}"
$ echo "$var2"
90 days
$ var1="${var[1]}"
$ echo "$var1"
jeffrey
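Alternatively, here is a sketch (assuming bash, for the process substitution) that splits on the double quotes themselves, so that jeffrey is field 2 and 90 days is field 6, and reads both into named variables:
IFS=$'\t' read -r user expires < <(awk -F'"' '/jeffrey/{ print $2 "\t" $6 }' input_file)
echo "$user"     # jeffrey
echo "$expires"  # 90 days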
while read -r line; do # read the file line by line
name=$(echo "$line" | awk -F, '{ print $1 }' | sed 's/"//g') # grab the first column and strip the quotes
expire=$(echo "$line" | awk -F, '{ print $3 }' | sed 's/"//g') # grab the third column and strip the quotes
echo "$name" "$expire" # do your business
done < yourfile.txt
IFS=","
arr=( $(cat txt | head -2 | tail -1 | cut -d, -f 1,3 | tr -d '"') )
echo "${arr[0]}"
echo "${arr[1]}"
The result is stored in an array; you can access the elements by index.
Maybe the method below, using sed and awk, will help you:
#!/bin/sh
username=$(sed -n '/jeffrey/p' demo.txt | awk -F',' '{print $1}')
echo "$username"
expires_in=$(sed -n '/jeffrey/p' demo.txt | awk -F',' '{print $3}')
echo "$expires_in"
Output :
jeffrey
90 days
Note: the method above works only if each username is distinct; as far as I know, usernames are not duplicated.
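If a username could also occur as a substring elsewhere in the file, matching the first field exactly is safer than the /jeffrey/ regex; a rough sketch, passing the quoted name in with -v:
username=$(awk -F',' -v u='"jeffrey"' '$1 == u { gsub(/"/, "", $1); print $1 }' demo.txt)
expires_in=$(awk -F',' -v u='"jeffrey"' '$1 == u { gsub(/"/, "", $3); print $3 }' demo.txt)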

Print line based on 2nd field value, without using a loop

I am trying to retrieve a line from a file without using a loop.
myFile.txt
val1;a;b;c
val2;b;d;e
val3;c;r;f
I would like to get the line where the second column is b.
If I do grep "b" myFile.txt then both the first and the second line will be output.
If I do cat myFile.txt | cut -d ';' -f2 | grep "b" then the output will just be b, whereas I'd like to get the full line val2;b;d;e.
Is there a way of getting the desired result without a loop like the one below? My file is huge, so looping through it again and again would be too slow.
while read line; do
if [ `echo $line | cut -d ';' -f2` = "b" ]; then
echo $line
fi
done < myFile.txt
Given your input file, the one-liner below should work:
awk -F";" '$2 == "b" {print}' myFile.txt
Explanation:
awk -F";" ##Field Separator as ";"
'$2 == "b" ##Searches for "b" in the second column($2)
{print}' ##prints the searched line
Using:
grep:
grep '^[^;]*;b;' myFile.txt
sed:
sed '/^[^;]*;b;/!d' myFile.txt
Output is the same for both:
val2;b;d;e
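For instance, selecting on a different key from the same file:
$ grep '^[^;]*;c;' myFile.txt
val3;c;r;f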

Shortening headers using awk

I have headers like
>XX|6226515|new|xx_000000.1| XXXXXXX
in a text file, which I am trying to shorten to
>XX6226515
using awk. I tried
awk -F"|" '/>/{$0=">"$1}1' input.txt > output.txt
but it yields the following instead:
>XX|6226515|new|
awk -F"|" '{print $1$2}' input.txt > output.txt
Output:
>XX6226515
sed solution:
sed -e 's/|//' -e 's/|.*//'
The first substitution removes the first vertical bar, the second one removes the second one and anything after it.
$ awk -F'|' '$0=$1$2' <<< ">XX|6226515|new|xx_000000.1| XXXXXXX"
>XX6226515
This cut command can also do it:
cut -d"|" --output-delimiter="" -f-2
See output:
$ echo ">XX|6226515|new|xx_000000.1| XXXXXXX" | cut -d"|" --output-delimiter="" -f-2
>XX6226515
-d"|" sets | as field delimiter.
--output-delimiter="" indicates that the output delimiter has to be empty.
-f-2 indicates that it has to print all fields up to the 2nd (inclusive).
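Note that --output-delimiter is a GNU cut extension; where it is unavailable (e.g. BSD/macOS cut), piping through tr should give the same result here, since the only remaining | after -f-2 is the one between the two kept fields:
$ echo ">XX|6226515|new|xx_000000.1| XXXXXXX" | cut -d"|" -f-2 | tr -d "|"
>XX6226515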
Also with just bash:
while IFS="|" read a b _
do
echo "$a$b"
done <<< ">XX|6226515|new|xx_000000.1| XXXXXXX"
See output:
$ while IFS="|" read a b _; do echo "$a$b"; done <<< ">XX|6226515|new|xx_000000.1| XXXXXXX"
>XX6226515

Splitting CSVs into files named for one of the columns

I have CSVs like this:
apple,file1.txt
banana,file1.txt
carrot,file2.txt
How can I get it to place all of the items from the left column into files named with the items in the right column? E.g. file1.txt would contain this list:
apple
banana
So far, I have this:
while read -r line
do
firstcolumn=$(echo "$line" | awk -F "," '{print $1}')
secondcolumn=$(echo "$line" | awk -F "," '{print $2}')
done < Text/selection.csv
One way using awk:
awk 'BEGIN { FS = "," } { print $1 >> $2 }' infile
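If the CSV maps rows to many distinct output files, some awk implementations run into a limit on simultaneously open files; closing each file after every write avoids that, at the cost of reopening (which is why the append >> matters). A sketch:
awk 'BEGIN { FS = "," } { print $1 >> $2; close($2) }' infile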
This should work -
awk -F, '{a[$1]=$2} END{for (i in a) print i > a[i]}' file
Test:
[jaypal:~/Temp] cat file
apple,file1.txt
banana,file1.txt
carrot,file2.txt
[jaypal:~/Temp] awk -F, '{a[$1]=$2} END{for (i in a) print i > a[i]}' file
[jaypal:~/Temp] ls file*
file file1.txt file2.txt
[jaypal:~/Temp] cat file1.txt
apple
banana
[jaypal:~/Temp] cat file2.txt
carrot
Update:
You can also do something like this -
awk -F, '{print $1 > $2}' INPUT_FILE
Pure Bash and under the assumption that all target files are empty or non-existing:
while IFS=',' read item file ; do
echo "$item" >> "$file"
done < "$infile"
sed loves this stuff...
sed "s%\(.*\),\(.*\)%echo \1 >> \2 %" inputfile.txt | sh

Bash: "xargs cat", adding newlines after each file

I'm using a few commands to cat a few files, like this:
cat somefile | grep example | awk -F '"' '{ print $2 }' | xargs cat
It nearly works, but my issue is that I'd like to add a newline after each file.
Can this be done in a one liner?
(surely I can create a new script or a function that does cat and then echo -n but I was wondering if this could be solved in another way)
cat somefile | grep example | awk -F '"' '{ print $2 }' | while read -r file; do cat "$file"; echo ""; done
Using GNU Parallel http://www.gnu.org/software/parallel/ it may be even faster (depending on your system):
cat somefile | grep example | awk -F '"' '{ print $2 }' | parallel "cat {}; echo"
awk -F '"' '/example/{ system("cat " $2 };printf "\n"}' somefile
