How I can delete the first column in even rows? [duplicate] - bash

I have a csv file with data presented as follows
87540221|1356438283301|1356438284971|1356438292151697
87540258|1356438283301|1356438284971|1356438292151697
87549647|1356438283301|1356438284971|1356438292151697
I'm trying to save the first column to a new file (without the field separator), and then delete the first column from the main csv file along with the first field separator.
Any ideas?
This is what I have tried so far
awk 'BEGIN{FS=OFS="|"}{$1="";sub("|,"")}1'
but it doesn't work

This is simple with cut:
$ cut -d'|' -f1 infile
87540221
87540258
87549647
$ cut -d'|' -f2- infile
1356438283301|1356438284971|1356438292151697
1356438283301|1356438284971|1356438292151697
1356438283301|1356438284971|1356438292151697
Just redirect into the file you want:
$ cut -d'|' -f1 infile > outfile1
$ cut -d'|' -f2- infile > outfile2 && mv outfile2 infile

Assuming your original CSV file is named "orig.csv":
awk -F'|' '{print $1 > "newfile"; sub(/^[^|]+\|/,"")}1' orig.csv > tmp && mv tmp orig.csv

GNU awk
awk '{$1="";$0=$0;$1=$1}1' FPAT='[^|]+' OFS='|'
Output
1356438283301|1356438284971|1356438292151697
1356438283301|1356438284971|1356438292151697
1356438283301|1356438284971|1356438292151697

The pipe is a special regex symbol, and the sub function expects a regex as its first argument, so a literal pipe must be escaped. The corrected awk command is:
awk 'BEGIN {FS=OFS="|"} {$1=""; sub(/\|/, "")} 1' file
OUTPUT:
1356438283301|1356438284971|1356438292151697
1356438283301|1356438284971|1356438292151697
1356438283301|1356438284971|1356438292151697
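As a minimal sketch of the same idea, the pipe can also be matched literally with a bracket expression, which avoids backslash escaping altogether. The helper name drop_first_field is hypothetical and not part of the original answers:

```shell
# Hypothetical helper: remove the first |-separated field (and its separator)
# from every input line. Reads the named files, or stdin if none are given.
drop_first_field() {
    # [|] is a bracket expression, so the pipe is taken literally;
    # ^[^|]* anchors the match to the first field, even when it is empty.
    awk '{ sub(/^[^|]*[|]/, "") } 1' "$@"
}

printf '%s\n' '87540221|1356438283301|1356438284971|1356438292151697' | drop_first_field
# prints: 1356438283301|1356438284971|1356438292151697
```

Because the substitution works on the whole line and no field is assigned, the record is never rebuilt and FS/OFS do not need to be set.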

With sed:
sed 's/[^|]*|//' file.txt

Related

how to add zero to single digit date in a date field in shell

I have data like the below
1213231421312131|USER|21121|1231412|XEM|NAME|NAME|||5072020|2313||||NY|2131|||99|E|||ver.01
6454242352352352|USER|13131|7342422|XEM|NAME|NAME|||13032001|1231||||TX|7312|||11|E|||ver.01
5131242515111233|USER|21212|2314413|XEM|NAME|NAME|||2101979|1231||||TX|7312|||11|E|||ver.01
2341313412341123|USER|62422|1124242|XEM|NAME|NAME|||23111979|1231||||TX|7312|||11|E|||ver.01
I need data as below
1213231421312131|USER|21121|1231412|XEM|NAME|NAME|||05072020|2313||||NY|2131|||99|E|||ver.01
6454242352352352|USER|13131|7342422|XEM|NAME|NAME|||13032001|1231||||TX|7312|||11|E|||ver.01
5131242515111233|USER|21212|2314413|XEM|NAME|NAME|||02101979|1231||||TX|7312|||11|E|||ver.01
2341313412341123|USER|62422|1124242|XEM|NAME|NAME|||23111979|1231||||TX|7312|||11|E|||ver.01
So field 10 is a date column, and I would like to zero-pad the date from 7 digits to 8 digits. I used the command below, but it replaces the pipe symbols with spaces.
awk -F "|" '{$10 = sprintf("%08d", $10); print}' <fileName>
Please help me with this request
thank you
Yum
awk has an input field separator and an output field separator. The print command uses the latter. So to keep the | symbols, set the output field separator OFS too:
awk -F\| -v OFS=\| '{$10 = sprintf("%08d", $10); print}' yourFile
Or with numfmt from GNU coreutils (preinstalled on most Linux systems)
numfmt -d\| --field 10 --format %08f < yourFile
You can set the input (FS) and output (OFS) separator within a BEGIN clause:
awk 'BEGIN{FS=OFS="|"} {$10 = sprintf("%08d", $10); print}' <filename>
IMHO, it is the most convenient way to define identical input and output field separators other than space.
See the online demo:
s='1213231421312131|USER|21121|1231412|XEM|NAME|NAME|||5072020|2313||||NY|2131|||99|E|||ver.01
6454242352352352|USER|13131|7342422|XEM|NAME|NAME|||13032001|1231||||TX|7312|||11|E|||ver.01
5131242515111233|USER|21212|2314413|XEM|NAME|NAME|||2101979|1231||||TX|7312|||11|E|||ver.01
2341313412341123|USER|62422|1124242|XEM|NAME|NAME|||23111979|1231||||TX|7312|||11|E|||ver.01'
awk 'BEGIN{FS=OFS="|"} {$10 = sprintf("%08d", $10); print}' <<< "$s"
Output:
1213231421312131|USER|21121|1231412|XEM|NAME|NAME|||05072020|2313||||NY|2131|||99|E|||ver.01
6454242352352352|USER|13131|7342422|XEM|NAME|NAME|||13032001|1231||||TX|7312|||11|E|||ver.01
5131242515111233|USER|21212|2314413|XEM|NAME|NAME|||02101979|1231||||TX|7312|||11|E|||ver.01
2341313412341123|USER|62422|1124242|XEM|NAME|NAME|||23111979|1231||||TX|7312|||11|E|||ver.01
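A slightly more defensive variant of the awk approach is sketched below. It assumes that a blank or non-numeric field 10 should be left untouched rather than turned into 00000000. The function name pad_date_field is hypothetical:

```shell
# Hypothetical helper: zero-pad field 10 of |-separated input to 8 digits.
# Reads the named files, or stdin if none are given.
pad_date_field() {
    awk 'BEGIN { FS = OFS = "|" }
         # Pad only non-empty, all-digit values. Assigning to $10 makes awk
         # rebuild the record with OFS, which is why OFS must also be "|".
         $10 ~ /^[0-9]+$/ { $10 = sprintf("%08d", $10) }
         { print }' "$@"
}

printf '%s\n' 'a|b|c|d|e|f|g|h||5072020|2313' | pad_date_field
# prints: a|b|c|d|e|f|g|h||05072020|2313
```

Lines whose field 10 does not match the pattern are printed exactly as they were read, since no field assignment takes place for them.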
You should accept the awk solution before you get answers like
sed -r 's/(([^|]*\|){9})([^|]{7}\|)/\10\3/' filename
or worse
paste -d "|" <(cut -d"|" -f1-9 filename) \
<(cut -d"|" -f10 filename |sed -r 's/^.{7}$/0&/') \
<(cut -d"|" -f11- filename)

How can I send the last column of the first line to standard output?

For example
The file TEMPFILE.TXT contains this:
PROC-|STUFF_THINGS|MORE STUFF|PING|AUTOSYS
PROC-|ASTUFF_THINGS_XX_2|Print-Wire|AUTONON
I only want to print AUTOSYS to standard output.
Use awk:
awk -F'|' 'NR==1 {print $NF; exit}' file
If you don't mind hardcoding the number of columns, then:
head -1 file | cut -d'|' -f5
Column-count agnostic approach, but more round-about and expensive:
head -1 file | rev | cut -f1 -d'|' | rev
In all these, we are only reading the first line of the file.
You can try:
while IFS= read -r line; do echo "${line##*|}"; break; done < file
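The pure-shell approach can be sketched without a loop, since a single read already stops after the first line. The function name last_field_of_first_line is hypothetical, and the sketch assumes the input arrives on stdin:

```shell
# Hypothetical helper: print the last |-separated field of the first line
# read from stdin.
last_field_of_first_line() {
    # read consumes exactly one line; the second test still accepts a final
    # line that has no trailing newline.
    IFS= read -r line || [ -n "$line" ] || return 1
    # ${line##*|} strips the longest prefix ending in a pipe.
    printf '%s\n' "${line##*|}"
}

printf '%s\n' 'PROC-|STUFF_THINGS|MORE STUFF|PING|AUTOSYS' \
              'PROC-|ASTUFF_THINGS_XX_2|Print-Wire|AUTONON' | last_field_of_first_line
# prints: AUTOSYS
```

For a file, redirect it into the function: last_field_of_first_line < TEMPFILE.TXT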

Bash: concatenate lines in csv file (1+2, 3+4 etc)

I have a csv file with increasing integers in the first column, followed by some text.
1,text1a,text1b
2,text2a,text2b
3,text3a,text3b
4,text4a,text4b
...
I would like to join lines 1+2, 3+4, etc. and write the result to a new csv file.
The desired output would be
1,text1a,text1b,2,text2a,text2b
3,text3a,text3b,4,text4a,text4b
...
A second option without the numbers would be great as well. The actual input would be
1,text,text,,,text#text.com,2,text.text,text
2,text,text,,,text#text.com,3,text.text,text
3,text,text,,,text#text.com,2,text.text,text
4,text,text,,,text#text.com,3,text.text,text
Desired outcome
text,text,,,text#text.com,2,text.text,text,text,text,,,text#text.com,3,text.text,text
text,text,,,text#text.com,2,text.text,text,text,text,,,text#text.com,3,text.text,text
$ pr -2ats, file
gives you
1,text1a,text1b,2,text2a,text2b
3,text3a,text3b,4,text4a,text4b
UPDATE
for the second part
$ cut -d, -f2- file | pr -2ats,
will give you
text,text,,,text#text.com,2,text.text,text,text,text,,,text#text.com,3,text.text,text
text,text,,,text#text.com,2,text.text,text,text,text,,,text#text.com,3,text.text,text
awk solution:
awk '{ printf "%s%s",$0,(!(NR%2)? ORS:",") }' input.csv > output.csv
The output.csv content:
1,text1a,text1b,2,text2a,text2b
3,text3a,text3b,4,text4a,text4b
----------
Additional approach (to skip numbers):
awk -F',' '{ printf "%s%s",$2 FS $3,(!(NR%2)? ORS:FS) }' input.csv > output.csv
The output.csv content:
text1a,text1b,text2a,text2b
text3a,text3b,text4a,text4b
3rd approach (for your extended input):
awk -F',' '{ sub(/^[0-9]+,/,"",$0); printf "%s%s",$0,(!(NR%2)? ORS:FS) }' input.csv > output.csv
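Note that these one-liners assume an even number of input lines. With an odd count, the last output line ends in a comma and has no trailing newline. A minimal sketch that avoids this by printing the separator before each line instead of after it is shown below. The function name join_lines and its group-size argument are hypothetical:

```shell
# Hypothetical helper: join every n input lines into one comma-separated line.
# Usage: join_lines N [file...]   (reads stdin if no file is given)
join_lines() {
    n=$1
    shift
    awk -v n="$n" '
        # Before every line except the first, emit "," inside a group
        # and a newline (ORS) at a group boundary.
        NR > 1 { printf "%s", ((NR - 1) % n ? "," : ORS) }
        { printf "%s", $0 }
        END { if (NR) printf "%s", ORS }
    ' "$@"
}

printf '%s\n' 1,text1a,text1b 2,text2a,text2b 3,text3a,text3b | join_lines 2
# prints:
# 1,text1a,text1b,2,text2a,text2b
# 3,text3a,text3b
```

To drop the leading number as well, the input can first be piped through cut -d, -f2- before it reaches join_lines.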
With bash, cut, sed and paste:
paste -d, <(cut -d, -f 2- file | sed '2~2d') <(cut -d, -f 2- file | sed '1~2d')
Output:
text1a,text1b,text2a,text2b
text3a,text3b,text4a,text4b
I hoped to get started with something as simple as
printf '%s,%s\n' $(<inputfile)
This turns out wrong when you have spaces inside your text fields.
The improvement is rather a mess:
source <(echo "printf '%s,%s\n' $(sed 's/.*/"&"/' inputfile|tr '\n' ' ')")
Skipping the first field can be done in the same sed command:
source <(echo "printf '%s,%s\n' $(sed -r 's/([^,]*),(.*)/"\2"/' inputfile|tr '\n' ' ')")
EDIT:
This solution will fail when the input has special characters, so you should use a simple solution such as
cut -d, -f2- file | paste -d, - -

How to print the csv file excluding first column till end using awk

I have a csv file with dynamic columns.
I've tried to use awk -F , 'NF>1' resul1.txt but it still prints all columns.
Since the number of columns is dynamic, it is quite difficult to list the fields up to the end in a print statement.
Try this awk command:
awk -F, '{$1=""}1' input.txt | awk -vOFS=, '{$1=$1}1' > output.txt
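The first awk makes the 1st field empty and prints the entire line again, now space-separated.
The second awk re-splits the line on whitespace and rejoins the fields with commas. Note that this breaks if any field contains spaces or is empty.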
Try the substr function:
substr(string, start [, length ])
Return a length-character-long substring of string, starting at character number start. The first character of a string is character number one. For example, substr("washington", 5, 3) returns "ing".
awk -F, '{print substr($0,length($1)+1+length(FS))}' file
You can use cut:
cut -d',' -f2- yourfile.csv > output.csv
Explanation:
-d - setting delimiter to ,
-f - fields to print
2- - from the 2nd field to the end of the line
With awk:
awk -F, '{sub(/^[^,]*,/,"",$0);}1' OFS=, yourfile.csv > output.csv
With sed:
sed -i.bak 's/^[^,]*,//' yourfile.csv
-i.bak - in-place edit, keeping a backup of the original as yourfile.csv.bak
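Since sed -i is not available in every sed implementation, the in-place edit can also be sketched with cut and a temporary file. The function name strip_first_column_inplace is hypothetical, and the sketch assumes a plain comma-separated file without quoted fields:

```shell
# Hypothetical helper: remove the first comma-separated column of a file,
# rewriting the file in place via a temporary copy.
strip_first_column_inplace() {
    tmp=$(mktemp) || return 1
    # The original is only overwritten if cut succeeded. Using cat instead
    # of mv keeps the permissions of the original file.
    cut -d, -f2- "$1" > "$tmp" && cat "$tmp" > "$1"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Small demonstration on a throwaway file.
demo=$(mktemp)
printf '%s\n' '1,text,text' '2,more,text' > "$demo"
strip_first_column_inplace "$demo"
cat "$demo"
# prints:
# text,text
# more,text
rm -f "$demo"
```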

Remove space between 2 columns and insert commas - bash

I am using:
cut -f1-2 input.txt|sed 1d
The data is outputting like this:
/mnt/Hector/Data/benign/binary/benign-pete/ fd0977d5855d1295bd57383b17981a09
/mnt/Hector/Data/benign/binary/benign-pete/ fd34c32786aadab513f506c30c2cba33
/mnt/Hector/Data/benign/binary/benign-pete/ fe7d03512e0731e40be628524efbf317
I am trying to get it to output without the space and with a comma between the file path and the md5 checksum, so Excel can separate the columns properly:
/mnt/Hector/Data/benign/binary/benign-pete/,fd0977d5855d1295bd57383b17981a09
/mnt/Hector/Data/benign/binary/benign-pete/,fd34c32786aadab513f506c30c2cba33
/mnt/Hector/Data/benign/binary/benign-pete/,fe7d03512e0731e40be628524efbf317
I haven't seen your input.txt, but try this line; it does the job in one shot:
awk -v OFS="," 'NR>1{print $1,$2}' input.txt
This can make it:
$ tr -s " " < your_file | sed 's/ /,/g'
/mnt/Hector/Data/benign/binary/benign-pete/,fd0977d5855d1295bd57383b17981a09
/mnt/Hector/Data/benign/binary/benign-pete/,fd34c32786aadab513f506c30c2cba33
/mnt/Hector/Data/benign/binary/benign-pete/,fe7d03512e0731e40be628524efbf317
tr -s " " < your_file removes extra spaces. sed 's/ /,/g' replaces spaces with commas.
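Replacing every space with a comma would also split a path that itself contains spaces. A minimal sketch that avoids this is shown below. It assumes the input is tab-separated with one header line, as the original cut -f1-2 and sed 1d suggest. The function name tab_pair_to_csv and the header names in the demonstration are hypothetical:

```shell
# Hypothetical helper: print the first two tab-separated columns as
# path,checksum pairs, skipping the header line.
tab_pair_to_csv() {
    # Splitting on tabs only keeps spaces inside a path intact;
    # sub() trims any trailing spaces from the path column.
    awk -F'\t' -v OFS=',' 'NR > 1 { sub(/ +$/, "", $1); print $1, $2 }' "$@"
}

printf 'dir\tmd5\n/mnt/Hector/Data/benign/binary/benign-pete/\tfd0977d5855d1295bd57383b17981a09\n' | tab_pair_to_csv
# prints: /mnt/Hector/Data/benign/binary/benign-pete/,fd0977d5855d1295bd57383b17981a09
```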
