How to join every three lines of a file with commas in shell? - shell

I have a question. Given this file:
154891
145690
165211
190189
135901
290134
I want output like this (every three UIDs joined by commas):
154891,145690,165211
190189,135901,290134
How can I do it?

You can use pr:
pr -3 -s, -l 1 file
Print in 3 columns, with a comma as the separator, and a 'page length' of 1 (with a page length this short, GNU pr omits the usual page headers, as if -t were given).
154891,145690,165211
190189,135901,290134

With GNU sed (the 0~3 step address is a GNU extension):
sed ':1;N;s/\n/,/;0~3b;t1' file
or with awk, emitting a comma as the output record separator except after every third line:
awk 'ORS=NR%3?",":"\n"' file

There could be many ways to do that; pick whichever you like, with or without the comma separator ",":
$ awk '{printf "%s%s",$0,(NR%3?",":RS)}' file
154891,145690,165211
190189,135901,290134
$ xargs -n3 -a file
154891 145690 165211
190189 135901 290134
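Another standard tool that handles this directly is paste; each "-" below reads one line from standard input per output row:
$ paste -d, - - - < file
154891,145690,165211
190189,135901,290134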

Related

How to add a header to a text file in bash?

I have a text file and want to convert it to a csv file. Before converting it, I want to add a header to the text file so that the csv file has the same header. The text file has one thousand columns, so I want one thousand column names. As a side note, the content of the text file is just rows of numbers separated by commas ",". Is there any way to add the header line in bash?
I tried the way below and it didn't work. First I ran this in python:
for i in range(1001):
    print "col" + "_" + "i"
I saved the output of this to a text file (python header.py >> header.txt) and prepended it to my original text file like this:
cat header.txt filename.txt > newfilename.txt
Then I converted the txt file to a csv file with "mv newfilename.txt newfilename.csv". But unfortunately this doesn't work: the header line ends up with double the number of fields of the other rows, for some reason. I would appreciate any help in solving this problem.
Based on the description, your file is already comma-separated, so it is already a csv file. You just want to add a column-number header line.
$ awk -F, 'NR==1{for(i=1;i<=NF;i++) printf "col_%d%s", i, (i==NF?ORS:FS)}1' file
This will add as many column headers as there are fields in the first row of the file. For example:
$ seq 5 | paste -sd, | # create 1,2,3,4,5 as a test input
awk -F, 'NR==1{for(i=1;i<=NF;i++) printf "col_%d%s", i, (i==NF?ORS:FS)}1'
col_1,col_2,col_3,col_4,col_5
1,2,3,4,5
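Applied to your files (using the filename.txt and newfilename.csv names from your question), redirect the output to the new file; the trailing 1 in the program also prints every data line after the header:
$ awk -F, 'NR==1{for(i=1;i<=NF;i++) printf "col_%d%s", i, (i==NF?ORS:FS)}1' filename.txt > newfilename.csv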
You can generate the column names in bash using one of the options below. Each example generates a header.txt file. You already have code to add this to the beginning of your file as a header.
Using bash loops
Bash loops for this many iterations will be inefficient, but will work.
for i in {1..1000}; do
echo -n "col_$i "
done > header.txt
echo >> header.txt
or using seq
for i in $(seq 1 1000); do
echo -n "col_$i "
done > header.txt
echo >> header.txt
Using seq only
Using seq alone will be more efficient.
seq -f "col_%g" -s" " 1 1000 > header.txt
Use seq and sed
You can use the seq utility to construct your CSV header, with a little help from Bash expansions. You can then insert the new header row into your existing CSV file, or concatenate the header with your data.
For example:
# construct a quoted CSV header
columns=$(seq -f '"col_%g"' -s', ' 1 1001)
# strip the trailing separator, if your seq version appends one
columns="${columns%, }"
# insert headers as first line of foo.csv with GNU sed
sed -i -e "1 i\\${columns}" /tmp/foo.csv
Caveats
If you don't have GNU sed, you can also use cat, sponge, or other tools to concatenate your header and data, although most of your concatenation options will require redirection to a new combined file to avoid clobbering your existing data.
For example, given /tmp/data.csv as your original data file:
seq -f '"col_%g"' -s', ' 1 1001 > /tmp/header.csv
sed -i -e 's/,[[:space:]]*$//' /tmp/header.csv
cat /tmp/header.csv /tmp/data.csv > /tmp/new_file.csv
Also, note that while Bash solutions that avoid calling standard utilities are possible, doing it in pure Bash might be too slow or memory intensive for large data sets.
Your mileage may vary.
printf "col%s," {1..100} |
sed 's/,$//' |
cat - filename.txt >newfilename.txt
I believe sed should supply the missing final newline as a side effect. If not, maybe try 's/,$/\n/' though this isn't entirely portable, either. You could probably replace the cat with sed as well, something like
... | sed 's/,$//;r filename.txt'
but again, I'm not entirely sure how portable this is.

How to enumerate a one-column csv file using bash?

I have a list like this:
- 6.53143.S
- 6.47643.S
- 6.53161.S
(the dashes are just for presentation) and, with some bash scripting, I want an enumerated version:
1 6.53143.S
2 6.47643.S
3 6.53161.S
Try this:
awk '{print NR, $0}' file
If your data actually looks like this:
- 6.53143.S
- 6.47643.S
- 6.53161.S
use:
$ awk '$1=NR' file
1 6.53143.S
2 6.47643.S
3 6.53161.S
If you only want to print line numbers along with the lines, simple cat can do the same:
cat -n Input_file
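For reference, the nl utility does much the same job; by default it numbers only non-empty lines, so pass -ba to number them all:
nl -ba Input_file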

Replacing newlines with commas at every third occurrence using AWK?

For example: a given file has the following lines:
1
alpha
beta
2
charlie
delta
10
text
test
I'm trying to get the following output using awk:
1,alpha,beta
2,charlie,delta
10,text,test
Fairly simple. Use the output record separator as follows. Specify the comma delimiter when the line number is not divisible by 3 and the newline otherwise:
awk 'ORS=NR%3?",":"\n"' file
awk can handle this easily by manipulating ORS:
awk '{ORS=","} !(NR%3){ORS="\n"} 1' file
1,alpha,beta
2,charlie,delta
10,text,test
There is a tool for exactly this kind of text processing: pr. Here -3 prints three columns, -a fills them across rather than down, -t omits page headers, and -s, uses a comma as the column separator:
$ pr -3ats, file
1,alpha,beta
2,charlie,delta
10,text,test
You can also use xargs with sed to coalesce multiple lines into single lines, which is useful to know:
cat file | xargs -n3 | sed 's/ /,/g'

Removes values in a file that match patterns from another file [duplicate]

I have a list of values in one file:
item2
item3
item4
and I want to remove the entire line from another file when its rows look like this:
item1|XXXX|ABCD
item2|XXXX|ABCD
item3|XXXX|ABCD
item4|XXXX|ABCD
item5|XXXX|ABCD
So that I'm left with:
item1|XXXX|ABCD
item5|XXXX|ABCD
Is there a bash sequence to do this?
grep -vFf can do the job:
grep -vFf file1 file2
item1|XXXX|ABCD
item5|XXXX|ABCD
awk to the rescue!
$ awk -F"|" 'NR==FNR{a[$1];next} !($1 in a)' remove items
item1|XXXX|ABCD
item5|XXXX|ABCD
where the item list to be removed is in the file "remove" and the data is in the file "items".
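Spelled out with comments, the same one-liner reads:
awk -F'|' '
  NR==FNR { a[$1]; next }  # first file (remove): record each item as an array key
  !($1 in a)               # second file (items): print lines whose first field was not recorded
' remove items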
If your distinctive marker is that |XXXX|ABCD string, you can just grep it out:
$ grep -vF '|XXXX|ABCD' input > output
It's safer to use option -F (fixed strings) because your pattern is dangerously close to containing regex metacharacters (namely, in your case, the |: it is not active in the default grep regex syntax, but it means alternation under -E, and you don't want to worry about that when you're working with simple patterns).
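A quick illustration of why -F matters, using grep -c to count matching lines:
$ echo 'ABCD' | grep -cE 'item2|XXXX|ABCD'
1
$ echo 'ABCD' | grep -cF 'item2|XXXX|ABCD'
0
With -E the | means "or", so a line containing just ABCD matches; with -F the whole pattern is one literal string, so it does not.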
If your distinctive pattern is the rest of the line, you can use a whole file as a pattern list with grep's -f option:
$ grep -vFf item_list < input > output

Remove lines from a csv file using bash, sed, or awk

I'm looking for a way to remove lines from multiple csv files, using bash with sed, awk, or anything appropriate, where the line ends in 0.
So there are multiple csv files, their format is:
EXAMPLEfoo,60,6
EXAMPLEbar,30,10
EXAMPLElong,60,0
EXAMPLEcon,120,6
EXAMPLEdev,60,0
EXAMPLErandom,30,6
So the file will be amended to:
EXAMPLEfoo,60,6
EXAMPLEbar,30,10
EXAMPLEcon,120,6
EXAMPLErandom,30,6
A problem I can see arising is distinguishing between multi-digit values that merely end in zero (like 60 or 10) and a value that is exactly 0.
So any ideas?
Using your file, something like this?
$ sed '/,0$/d' test.txt
EXAMPLEfoo,60,6
EXAMPLEbar,30,10
EXAMPLEcon,120,6
EXAMPLErandom,30,6
For this particular problem, sed is perfect, as the others have pointed out. However, awk is more flexible, i.e. you can filter on an arbitrary column:
awk -F, '$3!=0' test.csv
This will print the entire line if column 3 is not 0.
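For instance, to filter on the second column instead (keeping only rows whose second field is not 60, purely as an illustration):
$ awk -F, '$2!=60' test.csv
EXAMPLEbar,30,10
EXAMPLEcon,120,6
EXAMPLErandom,30,6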
Use sed to remove only the lines ending with ",0":
sed '/,0$/d'
You can also use awk:
$ awk -F"," '$NF!=0' file
EXAMPLEfoo,60,6
EXAMPLEbar,30,10
EXAMPLEcon,120,6
EXAMPLErandom,30,6
This just checks the last field for 0 and doesn't print the line if it's found.
If there may be whitespace between the comma and the final 0, this sed variant tolerates it:
sed '/,[ \t]*0$/d' file
I would tend to sed, but there is an egrep (or grep -E) solution too:
egrep -v ",0$" example.csv
