UNIX command line file edit functionality - shell

Back again with another question.
The file I have right now is of the following format:
1234,
1234,
1-23-4
I would like to do two things with this file.
First, remove the - characters in the third line. Therefore, 1-23-4 ==> 1234.
Second, I would like to make it all print in one line.
The final result should look like:
1234,1234,1234
Is this possible using line commands in a script? Kindly advise.
Thank you in advance for your time and help.

With tr:
$ tr -d '\n-' < a | sed 's/$/\n/'
1234,1234,1234
To remove the hyphens:
$ tr -d '-' < file
1234,
1234,
1234
The same applies to newlines, using \n.
Because this removes every newline, the final one is lost as well. To restore it, we use sed (a \n in the replacement part relies on GNU sed).
$prompt tr -d '\n-' < a
1234,1234,1234$prompt
$ tr -d '\n-' < a | sed 's/$/\n/'
1234,1234,1234
Thanks fedorqui. I tried this. I replaced file with my file name, but it
is not creating a new file with the final format for me. Sorry, I think
I should have mentioned this in the question. My bad.
No problem. You just need to redirect it:
$ tr -d '\n-' < a | sed 's/$/\n/' > new_file
$ cat new_file
1234,1234,1234
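The missing final newline can also be restored without sed. A minimal sketch, assuming the sample data from the question; the temporary files created with mktemp are only there to make the example self-contained:

```shell
# Recreate the sample input from the question in a temporary file.
input=$(mktemp)
printf '%s\n' '1234,' '1234,' '1-23-4' > "$input"

# Delete newlines and hyphens, then let echo supply the single final newline.
# Grouping with { ...; } sends the output of both commands to the same file.
output=$(mktemp)
{ tr -d '\n-' < "$input"; echo; } > "$output"

cat "$output"    # prints 1234,1234,1234
```

The grouping matters: without the braces only the output of echo would be redirected.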

(gnu) awk:
awk -v RS="\0" 'gsub(/[\n-]/,"")' file
test
kent$ echo "1234,
1234,
1-23-4"|awk -v RS="\0" 'gsub(/[\n-]/,"")'
1234,1234,1234

tr was a good one.
Another way, in Perl:
perl -pne 's/[-\n]//g' your_file
-n -> acts as a while loop over each line in the file (redundant here, because -p already implies the loop and overrides -n).
-e -> what follows is the expression that is run on each and every line.
-p -> print each line after the expression has been executed on it.
s/search/replace/g
s/ search for either a newline or "-" / replace with nothing /g -> for all occurrences in the line.

Related

printing only specific lines with sed

I have following File (wishlist.txt):
Alligatoah Musik_ist_keine_lösung;https:///uhfhf
Alligatoah STRW;https:///uhfhf?i
Amewu Entwicklungshilfe;https:///uhfhf?i
and I want to get the first word of line n.
So for n = 1:
Alligatoah
What I have so far is:
sed -e 's/\s.*//g' wishlist.txt
Is there an elegant way to get rid of all lines except line n?
Edit:
How can I pass a bash variable "$i" to sed? Neither
sed -n '$is/ .*//p' $wishlist
nor
sed -n "\`${i}\`s/ .*//p" $wishlist
works.
A couple of other techniques to get the first word of the 3rd line:
awk -v line=3 'NR == line {print $1; exit}' file
or
head -n 3 file | tail -n 1 | cut -d ' ' -f 1
Something like this, for the first word of the 3rd line:
sed -n '3s/\s.*//p' wishlist.txt
To use a variable, note the double quotes:
line=3; sed -n "${line}s/\s.*//p" wishlist.txt
sed supports "addresses", so you can tell it what lines to operate on. To print only the first line, you can use
sed -e '1!d; s/\s.*//'
where 1!d means: on lines other than 1, delete the line.
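The attempts in the question fail for two reasons: inside single quotes the shell does not expand $i at all, and inside double quotes a bare $is would be read as a variable named is. Braces, as in ${i}, mark where the variable name ends. A small sketch that wraps this in a function; the name first_word_of_line is made up for illustration, and the sample file holds two lines taken from the question:

```shell
# first_word_of_line N FILE: print the first word of line N (hypothetical helper).
first_word_of_line() {
    # ${1} is expanded by the shell before sed runs, giving e.g. 2{...}.
    # s/// strips everything from the first whitespace on; q stops reading early.
    sed -n "${1}{s/[[:space:]].*//;p;q;}" "$2"
}

wishlist=$(mktemp)
printf '%s\n' \
    'Alligatoah STRW;https:///uhfhf?i' \
    'Amewu Entwicklungshilfe;https:///uhfhf?i' > "$wishlist"

i=2
first_word_of_line "$i" "$wishlist"    # prints Amewu
```

Using [[:space:]] instead of \s keeps the expression independent of GNU extensions, and q avoids reading the rest of a long file.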

How to place comma separated values into newline and remove all the id's before including colon

I have the command output below from a Linux system. It lists all the account names separated by commas. I want each account name on its own line, so the commas should be removed and every name placed on a new line.
$ getent group pi_infra
pi_infra:*:5899:pxf59093,pxv07744,pxa02374,pxa07513,pxa08599,pxa11102,pxa30995,pxa34158,pxf07822,pxf29346,pxf30902,pxf31604,pxf31606,pxf31953,pxf34985,pxf41740,pxf41778,pxf43236,pxf43917,pxf45518,pxf46461,pxf49051,pxf58440,pxf58523,pxf58621,pxf60794,pxf60938,pxf61299,pxf63061,pxp08000,pxp25916,pxp42841,pxp68003,pxp69833,pxp87972
$ cat pi_in| sed 's/,/\n/g'
$ cat pi_in| tr ',' '\n'
Result of the above:
pi_infra:*:5899:pxf59093
pxv07744
pxa02374
pxa07513
pxa08599
pxa11102
pxa30995
pxa34158
pxf07822
pxf29346
pxf30902
pxf31604
pxf31606
pxf31953
pxf34985
pxf41740
pxf41778
pxf43236
pxf43917
pxf45518
pxf46461
pxf49051
pxf58440
pxf58523
pxf58621
pxf60794
pxf60938
pxf61299
pxf63061
pxp08000
pxp25916
pxp42841
pxp68003
pxp69833
pxp87972
As I want to remove everything before the last : and print only the IDs, I have chosen to use the following:
$ cat pi_in| cut -d":" -f4 | tr ',' '\n'
pxf59093
pxv07744
pxa02374
pxa07513
pxa08599
pxa11102
pxa30995
pxa34158
pxf07822
pxf29346
pxf30902
pxf31604
pxf31606
pxf31953
pxf34985
pxf41740
pxf41778
pxf43236
pxf43917
pxf45518
pxf46461
pxf49051
pxf58440
pxf58523
pxf58621
pxf60794
pxf60938
pxf61299
pxf63061
pxp08000
pxp25916
pxp42841
pxp68003
pxp69833
pxp87972
This works fine, but I am looking for a way to do it all in one command rather than using cut and tr separately.
Preferably awk or sed.
Thanks.
Could you please try the following in awk:
awk -F':' '{gsub(",",ORS,$4);print $4}' Input_file
2nd solution:
awk '{sub(/.*:/,"");gsub(/,/,ORS)} 1' Input_file
$ sed 's/.*://; y/,/\n/' file
pxf59093
pxv07744
pxa02374
...
s/.*:// removes everything preceding the last colon, and the colon itself, and y/,/\n/ does what tr does in your approach.
This might work for you (GNU sed):
sed 'y/,/\n/;/:/!P;D' file
Translate ,'s to newlines and don't print any line with a : in it.
N.B. The solution by @oguz ismail is more efficient and faster (as far as sed solutions go).
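For comparison, a sketch that splits the fourth field explicitly instead of rewriting it, which some readers may find easier to follow. The sample line is shortened from the question's output, and printf stands in for getent group pi_infra; the variable names are made up for illustration:

```shell
# Shortened sample of the getent output from the question.
group_line='pi_infra:*:5899:pxf59093,pxv07744,pxa02374'

members=$(printf '%s\n' "$group_line" |
    awk -F: '{
        n = split($4, names, ",")                  # break the member list at commas
        for (i = 1; i <= n; i++) print names[i]    # one account name per line
    }')

printf '%s\n' "$members"
```

On a real system the printf would be replaced by the getent call itself.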

Edit data removing line breaks and putting everything in a row

Hi, I'm new to shell scripting and I have been unable to do this:
My data looks like this (much bigger actually):
>SampleName_ZN189A
01000001000000000000100011100000000111000000001000
00110000100000000000010000000000001100000010000000
00110000000000001110000010010011111000000100010000
00000110000001000000010100000000010000001000001110
0011
>SampleName_ZN189B
00110000001101000001011100000000000000000000010001
00010000000000000010010000000000100100000001000000
00000000000000000000000010000000000010111010000000
01000110000000110000001010010000001111110101000000
1100
Note: There is a line break after every 50 characters, but sometimes after fewer, where the data for one sample ends and a new sample name begins.
I would like the line break after every 50 characters to be removed, so that my data looks like this:
>SampleName_ZN189A
0100000100000000000010001110000000011100000000100000110000100000000000010000000000001100000010000000...
>SampleName_ZN189B
0011000000110100000101110000000000000000000001000100010000000000000010010000000000100100000001000000...
I tried using tr but I got an error:
tr '\n' '' < my_file
tr: empty string2
Thanks in advance
tr with -d deletes the specified characters:
$ cat input.txt
00110000001101000001011100000000000000000000010001
00010000000000000010010000000000100100000001000000
00000000000000000000000010000000000010111010000000
01000110000000110000001010010000001111110101000000
1100
$ cat input.txt | tr -d "\n"
001100000011010000010111000000000000000000000100010001000000000000001001000000000010010000000100000000000000000000000000000010000000000010111010000000010001100000001100000010100100000011111101010000001100
You can use this awk:
awk '/^ *>/{if (s) print s; print; s="";next} {s=s $0;next} END {print s}' file
>SampleName_ZN189A
010000010000000000001000111000000001110000000010000011000010000000000001000000000000110000001000000000110000000000001110000010010011111000000100010000000001100000010000000101000000000100000010000011100011
>SampleName_ZN189B
001100000011010000010111000000000000000000000100010001000000000000001001000000000010010000000100000000000000000000000000000010000000000010111010000000010001100000001100000010100100000011111101010000001100
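The one-liner above can be read as a small state machine: sequence lines are accumulated in a variable, and each header line flushes what has been collected so far. A commented sketch of the same idea, assuming shortened sample data in the question's layout (the variable name seq is chosen here for readability):

```shell
# Sample input in the question's layout (sequences shortened for illustration).
input=$(mktemp)
printf '%s\n' '>SampleName_ZN189A' '0100' '0011' '01' \
              '>SampleName_ZN189B' '0011' '11' > "$input"

joined=$(awk '
    /^ *>/ {                            # header line
        if (seq != "") print seq        # flush the previous record first
        print                           # then print the header itself
        seq = ""
        next
    }
    { seq = seq $0 }                    # sequence line: append without the line break
    END { if (seq != "") print seq }    # flush the last record
' "$input")

printf '%s\n' "$joined"
```

Unlike a plain tr -d '\n', this keeps every sample name on its own line.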
Using awk:
awk '/>/{print (NR==1)?$0:RS $0;next}{printf "%s", $0} END{print ""}' file
If you don't mind an additional newline at the start of the output, here is a shorter one:
awk '{printf "%s", (/>/?RS $0 RS:$0)} END{print ""}' file
This might work for you (GNU sed):
sed '/^\s*>/!{H;$!d};x;s/\n\s*//2gp;x;h;d' file
Build up the record in the hold space and when encountering the start of the next record or the end-of-file remove the newlines and print out.
You can use this sed:
sed '/^>Sample/!{ :loop; N; /\n>Sample/{n}; s/\n//; b loop; }' file.txt
Try this:
cat SampleName_ZN189A | tr -d '\n'
# tr -d deletes the specified character from the input; use '\r' instead if the file uses carriage returns as line breaks
The same can be achieved with plain awk:
awk 'BEGIN{ORS=""} {print}' SampleName_ZN189A
# The output has no line break at the end. If you want one, this works:
awk 'BEGIN{ORS=""} {print} END{print "\n"}' SampleName_ZN189A
# Choose the line-break character (\r, \n, or \r\n) according to the file format.

sed, capture only the number

I have this text file:
some text A=10 some text
some more text A more text
some other text A=30 other text
I'm trying to use sed to capture only the numeric value of A. Using this
cat textfile | sed -r 's/.*A=(\S+).*/\1/'
I get:
10
some more text A more text
30
But what I really need is:
10
0
30
If the string A= does not exist output a 0. How can I accomplish this?
I cannot think of a one-liner, so this is my approach:
while IFS= read -r line
do
    grep -Po '(?<=A=)\d+' <<< "$line" || echo "0"
done < file
I am using grep with a look-behind to get any number after A=. If there is none, grep fails and the || (or-else) branch prints a 0.
I love code-golf!
sed -e 's/^/A=0 /; s/.*\<A=\([0-9]\+\).*/\1/'
This prepends A=0 to the line before substituting.
try this one-liner:
awk -F'A=' 'NF==1{print "0";next}{sub(/ .*/,"",$2);print $2}' file
with your data:
kent$ echo "some text A=10 some text
some more text A more text
some other text A=30 other text"|awk -F'A=' 'NF==1{print "0";next}{sub(/ .*/,"",$2);print $2}'
10
0
30
gawk
awk '{$0=gensub(/^.*A=?([[:digit:]]+).*$/, "\\1", "g"); print($0+0)}' file.txt
This might work for you (GNU sed):
sed '/.*A=\([0-9][0-9]*\).*/s//\1/;t;s/.*/0/' file
Look for the string A= followed by one or more numbers and if it occurs replace the whole line by the back reference. Otherwise replace the whole of the line by 0.
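The t command branches to the end of the script when a preceding substitution has succeeded, which is what lets the second substitution act as a fallback. Writing t; on one line relies on GNU sed. A sketch that should also work with other sed implementations puts each command in its own -e (the sample file reproduces the question's data):

```shell
textfile=$(mktemp)
printf '%s\n' \
    'some text A=10 some text' \
    'some more text A more text' \
    'some other text A=30 other text' > "$textfile"

# 1st -e: keep only the digits after A= (if present)
# 2nd -e: if that substitution happened, skip the rest of the script
# 3rd -e: otherwise replace the whole line with 0
values=$(sed -e 's/.*A=\([0-9][0-9]*\).*/\1/' -e t -e 's/.*/0/' "$textfile")

printf '%s\n' "$values"
```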
I think the best way is to use two different commands: the first replaces lines without 'A=' with the line 'A=0', the second does what you did.
So
cat textfile | sed -r 's/^([^A]|A[^=])*$/A=0/' | sed -r 's/.*A=(\S+).*/\1/'
How about:
sed -r -e 's/.*A=(\S+).*/\1/' -e 's/.*A.*/0/'
Some grep-sed-cut combination:
grep -o 'A=\?[0-9]*' input | sed 's/A$/A=0/' | cut -d= -f2
Produces:
10
0
30

head and grep simultaneously

Is there a unix one liner to do this?
head -n 3 test.txt > out_dir/test.head.txt
grep hello test.txt > out_dir/test.tmp.txt
cat out_dir/test.head.txt out_dir/test.tmp.txt > out_dir/test.hello.txt
rm out_dir/test.head.txt out_dir/test.tmp.txt
I.e., I want to get the header and some grep lines from a given file, simultaneously.
Use awk:
awk 'NR<=3 || /hello/' test.txt > out_dir/test.hello.txt
You can say:
{ head -n 3 test.txt ; grep hello test.txt ; } > out_dir/test.hello.txt
Try using sed
sed -n '1,3p; /hello/p' test.txt > out_dir/test.hello.txt
The awk solution is the best, but I'll add a sed solution for completeness:
$ sed -n -e '1,3p' -e '4,$s/hello/hello/p' test.txt > "$output_file"
The -n says not to print a line unless told to. Each -e introduces a command: 1,3p prints the first three lines, and 4,$s/hello/hello/p looks for lines from line 4 onward that contain the word hello and substitutes hello back in. The p on the end prints every line the substitution operated on.
There should be a way of using 4,$g/HELLO/p, but I couldn't get it to work. It's been a long time since I really messed with sed.
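The construct being reached for here is probably a command group: an address range followed by { ... } applies the enclosed commands only to lines in that range. A sketch, assuming a small made-up stand-in for test.txt:

```shell
test_file=$(mktemp)
printf '%s\n' 'header 1' 'header 2 hello' 'header 3' \
              'foo' 'hello world' 'bar' > "$test_file"

# Print lines 1-3 unconditionally; from line 4 on, print only lines matching hello.
# Because the two ranges do not overlap, a header containing hello is printed once.
selected=$(sed -n -e '1,3p' -e '4,${/hello/p;}' "$test_file")

printf '%s\n' "$selected"
```

This gives the same lines as the awk solution, without the dummy substitution.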
Of course, I would go awk but here is an ed solution for the pre-vi nostalgics:
ed test.txt <<%
4,$ v/hello/d
w test.hello.txt
%

Resources