Awk/sed replace in each line with previous string in the line - bash

I have a file test.txt like this (but containing many more lines)
/foo/bar/how hello
/foo/bar/are hello
/foo/bar/you hello
I want to get this output:
/foo/bar/how how
/foo/bar/are are
/foo/bar/you you
I have tried this:
while read line
do
bla=$(echo $line | cut -f4 -d"/" | cut -f1 -d" ")
sed -i "s/hello/$bla/"
done <test.txt
But the output is:
sed: no input files
sed: no input files
sed: no input files
When I provide the filename (while read line; do bla=$(echo $line | cut -f4 -d"/" | cut -f1 -d" "); sed -i "s/hello/$bla/" test.txt ; done <test.txt), I get this:
/foo/bar/how how
/foo/bar/are how
/foo/bar/you how
I would like to replace on each line some constant pattern by a pattern appearing before on the same line and that changes from line to line. Any idea on how I could do that (using sed or awk)? Many thanks!

$ sed 's~\([^/]*\) .*~\1 \1~' file
/foo/bar/how how
/foo/bar/are are
/foo/bar/you you

Below awk solution might help
awk '{$2=$1;sub(/.*\//,"",$2)}1' test.txt
Output
/foo/bar/how how
/foo/bar/are are
/foo/bar/you you
Notes
By default awk fields are whitespace-separated, so each line here has two fields, $1 and $2.
First assign the first field of every record to the second, i.e. $2=$1.
Then, in the second field, strip everything up to and including the last / using sub(/.*\//,"",$2).
The 1 at the end is the simplest awk command: an always-true condition with no action, so each record is printed.
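The same steps can also be written with split() instead of the copy-then-sub() pair. This is only a sketch of an alternative, not the answerer's method; the function name last_component_twice is made up for illustration, and it assumes the path is always the first whitespace-separated field.

```shell
# Hypothetical helper: set field 2 to the last "/"-separated part of field 1.
last_component_twice() {
    # split() returns the number of pieces, so p[n] is the last path component;
    # assigning to $2 rebuilds the record, and the trailing 1 prints it.
    awk '{ n = split($1, p, "/"); $2 = p[n] } 1'
}

printf '%s\n' '/foo/bar/how hello' '/foo/bar/are hello' | last_component_twice
```

Like the original, this overwrites whatever is in the second field, whether or not it is hello.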

Try this:
$ sed 's~\(.*/\)\([^ ]*\) .*~\1\2 \2~' test.txt
/foo/bar/how how
/foo/bar/are are
/foo/bar/you you
Use the -i option to edit the file in place:
sed -i 's~\(.*/\)\([^ ]*\) .*~\1\2 \2~' test.txt
Explanation:
s: substitute
\(.*/\): everything up to and including the last /
followed by \([^ ]*\) and a space: a run of non-space characters, then the space that separates the two words; the final .* matches the rest of the line
Using backreferences, the matched text is replaced with the first group (/foo/bar/), then the second group (the word after the last /: how, are or you), a space, and the second group again.
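The patterns above replace everything after the first space. If only the literal word hello should change and the rest of the line should be kept, a narrower pattern might be safer. A minimal sketch, assuming the placeholder is literally hello and sits one space after the last path component; replace_hello is a hypothetical name:

```shell
# Hypothetical helper: replace the literal word "hello" that follows the last
# path component with that component. Reads stdin, writes stdout.
replace_hello() {
    # \([^/ ]*\) captures the text after the last "/" (no slashes or spaces);
    # " hello" is the constant placeholder; \1 \1 writes the captured word twice.
    sed 's~\([^/ ]*\) hello~\1 \1~'
}

printf '%s\n' '/foo/bar/how hello' '/foo/bar/are hello world' | replace_hello
```

With this input the second line should keep its trailing " world", since only " hello" is part of the match.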

Related

How to print nth line by replacing cut with sed

I have to write a function taking as argument a csv file named players.csv and a number giving the line to print.
Indeed, I have to print the 2nd and 3rd columns of the nth line with " is " between them, for example Mike is John. The column delimiter is ";".
I have the following code which is working :
sed -n "$2p" players.csv | cut -d ";" -f 2,3 --output-delimiter=' is '
However, I have to do the same without using cut. I can only use sed and wc. Do you have any idea what sed command I can use to get the same behavior as with cut?
Thank you for your attention and your help.
You were almost there:
sed -En "$2"'s/[^;]*;([^;]*);([^;]*).*/\1 is \2/p' players.csv
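Since the question asks for a function, here is one way the one-liner could be wrapped. It is a sketch under assumptions: the name print_pair, the argument order (file first, line number second) and the sample rows are all made up for illustration.

```shell
# Hypothetical function: print "<column 2> is <column 3>" for line $2 of the
# semicolon-separated file $1, using only sed.
print_pair() {
    # "${2}" becomes the sed address (the line number); -n plus the p flag
    # prints only that line, and only if the substitution succeeded.
    sed -En "${2}"'s/^[^;]*;([^;]*);([^;]*).*/\1 is \2/p' "$1"
}

# Sample data, made up for illustration.
tmp=$(mktemp)
printf '%s\n' '1;Mike;John;x' '2;Anna;Lee;y' > "$tmp"
print_pair "$tmp" 2
rm -f "$tmp"
```

The line number is kept in double quotes so the shell expands it, while the s command stays in single quotes so the backreferences reach sed untouched.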

Delete words in a line using grep or sed

I want to delete three words with a special character on a line such as
Input:
\cf4 \cb6 1749,1789 \cb3 \
Output:
1749,1789
I have tried a couple sed and grep statements but so far none have worked, mainly due to the character \.
My unsuccessful attempt:
sed -i 's/ [.\c ] //g' inputfile.ext >output file.ext
Awk accepts a regex Field Separator (in this case, comma or space):
$ awk -F'[ ,]' '$0 = $3 "." $4' <<< '\cf4 \cb6 1749,1789 \cb3 \'
1749.1789
-F'[ ,]' - Use a single character from the set space/comma as Field Separator
$0 = $3 "." $4 - Set the entire line $0 to Field 3 ($3) followed by a literal period "." followed by Field 4 ($4); if the result is non-empty, awk does the default behavior (print entire line)
Replace <<< 'input' with file if every line of that file has the same delimiters (spaces/comma) and number of fields. If your input file is more complex than the sample you shared, please edit your question to show actual input.
The backslash is a special meta-character that confuses bash.
We treat it like any other meta-character, by escaping it, with--you guessed it--a backslash!
But first, we need to grep this pattern out of our file
grep -E '\\... \\... [0-9]+,[0-9]+ \\... \\' our_file # Close enough! (-E is needed so that + means "one or more")
Now, just sed out those pesky backslashes
| sed -e 's/\\//g' # Don't forget the g, otherwise it'll only strip out 1 backslash
Now, finally, sed out the clusters of 2 alpha followed by a number and a space!
| sed -e 's/[a-z][a-z][0-9] //g'
And, finally....
grep -E '\\... \\... [0-9]+,[0-9]+ \\... \\' our_file | sed -e 's/\\//g' | sed -e 's/[a-z][a-z][0-9] //g'
Output:
1749,1789
My guess is you are having trouble because you have backslashes in input and can't figure out how to get backslashes into your regex. Since backslashes are escape characters to shell and regex you end up having to type four backslashes to get one into your regex.
Ben Van Camp already posted an answer that uses single quotes to make the escaping a little easier; however I shall now post an answer that simply avoids the problem altogether.
grep -o '[0-9]*,[0-9]*' | tr , .
Locks on to the comma and selects the digits on either side and outputs the number. Alternately if comma is not guaranteed we can do it this way:
egrep -o ' [0-9,]*|^[0-9,]*' | tr , . | tr -d ' '
Both of these assume there's only one usable number per line.
$ awk '{sub(/,/,".",$3); print $3}' file
1749.1789
$ sed 's/\([^ ]* \)\{2\}\([^ ]*\).*/\2/; s/,/./' file
1749.1789
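To illustrate the point about backslashes and quoting made above, here is a sketch that removes the control tokens directly and keeps the comma, as in the question's desired output. It assumes every unwanted token is a backslash followed by letters or digits; strip_controls is a hypothetical name.

```shell
# Hypothetical helper: drop every backslash-prefixed control token (and a lone
# backslash), keeping whatever else is on the line. Reads stdin, writes stdout.
strip_controls() {
    # Inside single quotes the shell passes the two backslashes through
    # untouched, and sed then reads them as one literal backslash.
    # The second command trims the space left at the end of the line.
    sed 's/\\[[:alnum:]]* *//g; s/ *$//'
}

printf '%s\n' '\cf4 \cb6 1749,1789 \cb3 \' | strip_controls
```

Inside double quotes the shell would consume one level of backslashes first, which is where the four-backslash spelling comes from.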

Trim Line After Third Occurence of Colon

I am parsing through a log file and I am trying to clean up the output.
Here's a sample input line
2016-04-11 12:45:26 : TEXT TO REMOVE
Here's my current code which removes everything after the first colon.
sed 's/:.*//'
which outputs
2016-04-11 12
I'd like to modify this so that it removes everything after the third colon instead (so I end up with just the date and time).
Here's a sample output I would like:
2016-04-11 12:45:26
That's what cut was invented to do:
$ cut -d':' -f1-3 file
2016-04-11 12:45:26
How about looking for the spaces surrounding the colon?
sed 's/ : .*//'
awk -F ' : ' '{print $1}'
You can use this sed:
str='2016-04-11 12:45:26 : TEXT TO REMOVE'
sed 's/ *:[^:]*$//' <<< "$str"
i.e. use [^:]*$ pattern to make sure we match last segment of line after last :
Output:
2016-04-11 12:45:26
Strictly speaking, removing everything after the 3rd : is equivalent to printing only the characters that come before it. sed is easier to use that way.
Give this a try:
sed "s/^\([^:][^:]*:[^:][^:]*:[^:][^:]*\):.*$/\1/"
The same principle can be used to print only the date and time before the " : " separator:
sed "s/^\([0-9][0-9]*-[0-9][0-9]-[0-9][0-9] [0-9][0-9]:[0-9][0-9]:[0-9][0-9]\).*$/\1/g"
The chars in between the \( and the \) can be reused in the replacement section with \1.
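The "keep what comes before the third colon" idea can be written more compactly with an interval expression. A minimal sketch, assuming a sed that accepts -E (GNU and BSD sed both do); keep_first_three is a hypothetical name:

```shell
# Hypothetical helper: keep everything before the third colon on each line.
keep_first_three() {
    # ([^:]*:){2} matches the first two colon-terminated pieces, [^:]* the
    # third piece; the third colon and the rest of the line are discarded.
    # The second command trims the space left in front of the removed colon.
    sed -E 's/^(([^:]*:){2}[^:]*):.*/\1/; s/ +$//'
}

printf '%s\n' '2016-04-11 12:45:26 : TEXT TO REMOVE' | keep_first_three
```

Lines with fewer than three colons do not match the first command and pass through unchanged, apart from the trailing-space trim.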

Delete first characters off of a line in a file with awk or grep

I'm attempting to remove a certain pattern from a line, but not the entire line itself. An example would be:
Original:
user=dannyBoy
Desired:
dannyBoy
I have a file that is full of lines like that, so I was wondering how I would be able to cut a specific part of the text off, whether that be just removing the first five characters from the list or searching for the pattern "user=" and removing it.
There are many ways to do this:
cut -d'=' -f2- file
sed 's/^[^=]*=//' file
awk -F= '{print $2}' file #if just one = is present
cut sets a delimiter (-d'=') and then prints all the fields starting from the 2nd one (-f2-).
sed matches everything from the beginning of the line up to and including the first = and removes it.
awk sets = as field separator and prints the second field.
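The "if just one = is present" caveat is easier to see on a value that itself contains an =. A small sketch with a made-up sample line; the sed form here includes the = in the pattern so that the delimiter itself is removed too:

```shell
# A made-up sample line with two "=" signs.
line='user=danny=Boy'

printf '%s\n' "$line" | cut -d'=' -f2-        # keeps everything after the first "="
printf '%s\n' "$line" | sed 's/^[^=]*=//'      # same result as cut
printf '%s\n' "$line" | awk -F= '{print $2}'   # second field only, the rest is lost
```

The first two commands should print danny=Boy, while awk prints only danny.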
Using ex:
echo user=dannyBoy | ex -s +"norm df=" +%p -cq! /dev/stdin
where ex is equivalent to vi -e/vim -e, which executes the vi command df= (delete up to and including the next =) and then prints the buffer (%p).
If you have multiple lines like that, it is simpler to use substitution:
ex -s +"%s/^.*=//g" +%p -cq! foo.txt
To edit file in place, change -cq! to -cwq.
The command below deletes the first 5 characters:
$ echo "user=dannyboy" | cut -c 6-
You can use it on a file with cut -c 6- inputfilename as well.

Display all fields except the last

I have a file as shown below
1.2.3.4.ask
sanma.nam.sam
c.d.b.test
I want to remove the last field from each line. The delimiter is . and the number of fields is not constant.
Can anybody help me with an awk or sed solution? I can't use perl here.
Both these sed and awk solutions work independent of the number of fields.
Using sed:
$ sed -r 's/(.*)\..*/\1/' file
1.2.3.4
sanma.nam
c.d.b
Note: -r is the flag for extended regexp, it could be -E so check with man sed. If your version of sed doesn't have a flag for this then just escape the brackets:
sed 's/\(.*\)\..*/\1/' file
1.2.3.4
sanma.nam
c.d.b
The sed solution does a greedy match up to the last . and captures everything before it, then replaces the whole line with only the captured part (n-1 fields). Use the -i option if you want the changes to be stored back to the file.
Using awk:
$ awk 'BEGIN{FS=OFS="."}{NF--; print}' file
1.2.3.4
sanma.nam
c.d.b
The awk solution just simply prints n-1 fields, to store the changes back to the file use redirection:
$ awk 'BEGIN{FS=OFS="."}{NF--; print}' file > tmp && mv tmp file
Reverse, cut, reverse back.
rev file | cut -d. -f2- | rev >newfile
Or, replace from last dot to end with nothing:
sed 's/\.[^.]*$//' file >newfile
The regex [^.] matches one character which is not dot (or newline). You need to exclude the dot because the repetition operator * is "greedy"; it will select the leftmost, longest possible match.
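A quick way to see why the dot has to be excluded is to run both variants on one of the sample lines:

```shell
# Sample line from the question.
line='1.2.3.4.ask'

# Without excluding the dot, \. matches the first dot and .* runs to the end.
printf '%s\n' "$line" | sed 's/\..*$//'      # prints: 1

# With [^.]* the match cannot cross another dot, so only the last field goes.
printf '%s\n' "$line" | sed 's/\.[^.]*$//'   # prints: 1.2.3.4
```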
With cut on the reversed string
cat yourFile | rev | cut -d "." -f 2- | rev
If you want to keep the trailing ".", use the following:
awk '{gsub(/[^\.]*$/,"");print}' your_file

Resources