I have the following file and I would like to swap values of the form digits (up to 3 digits)#digits (up to 4 digits) when they are followed by # or by a space/end of line.
If # is followed by non-digits, the values should not be interchanged.
Sample Input
cat file1
xyz xyz xyz 123#456#1#34#123#2
xyz xyz xyz xyz xyz
test test
123#456#1#34#123#212#3#456#1#34#123#2#123#xyzxyz xyz
xyz xyz xyz
Sample output:
xyz xyz xyz 456#123#34#1#2#123
xyz xyz xyz xyz xyz
test test
456#123#34#1#212#123#456#3#34#1#2#123#123#xyzxyz xyz
xyz xyz xyz
I have tried the following logic. It seems split is required in order to interchange the values, but I am not able to check the condition, nor do I know how to save the result back into the same field.
awk '{for(a=1;a<=NF;a++){if($a~/#/){split($a,b,"[##]");val1=b[1];val2=b[2];print val1,val2}}}' file1
123 456
123 456
This simple GNU sed command should be able to do the job:
sed -E 's/\<([0-9]{1,3})#([0-9]{1,4})(#|$)/\2#\1\3/g' file
xyz xyz xyz 456#123#34#1#2#123
xyz xyz xyz xyz xyz
test test
456#123#34#1#212#123#456#3#34#1#2#123#123#xyzxyz xyz
xyz xyz xyz
Here, \< is used for word boundary.
Note that on BSD sed you have to use [[:<:]] for word boundary:
sed -E 's/[[:<:]]([0-9]{1,3})#([0-9]{1,4})(#|$)/\2#\1\3/g' file
Explanation:
\<: Word boundary
([0-9]{1,3}): Match 1 to 3 digits
#: Match a #
([0-9]{1,4}): Match 1 to 4 digits
(#|$): Match a # or end of line
With your shown samples, could you please try the following. Written and tested in GNU awk.
awk -v RS='([0-9]{1,3}#[0-9]{1,4}#)+[0-9]{1,3}#[0-9]{1,4}' '
{
val=""
delete arr
# RT holds the text matched by RS: a run of #-separated numbers
num=split(RT,arr,"#")
# Walk the numbers two at a time and emit each pair swapped
for(i=1;i<num;i+=2){
val=(val?val "#":"") arr[i+1] "#" arr[i]
}
# Print the swapped run in place of the original record separator
ORS=val
}
1
' Input_file
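For awks without RT (it is a GNU awk extension), here is a minimal sketch of the same swap done token by token. It is not the method from any answer above; the names swap_pairs and swap_token are made up for illustration. It assumes tokens are separated by spaces or tabs and that a pair qualifies when the first part is 1 to 3 digits and the second part is 1 to 4 digits.

```shell
# Hypothetical helper: swap digits#digits pairs inside each whitespace-delimited token.
swap_pairs() {
  awk '
    # Swap qualifying neighbours in one token, e.g. 123#456#7 -> 456#123#7
    function swap_token(t,    n, p, i, r) {
      n = split(t, p, "#")
      i = 1; r = ""
      while (i <= n) {
        if (i < n && p[i] ~ /^[0-9]+$/ && length(p[i]) <= 3 &&
            p[i+1] ~ /^[0-9]+$/ && length(p[i+1]) <= 4) {
          r = r (i > 1 ? "#" : "") p[i+1] "#" p[i]
          i += 2
        } else {
          r = r (i > 1 ? "#" : "") p[i]
          i++
        }
      }
      return r
    }
    {
      out = ""; rest = $0
      # Copy the whitespace through unchanged and rewrite only the tokens
      while (match(rest, /[^ \t]+/)) {
        out = out substr(rest, 1, RSTART - 1) swap_token(substr(rest, RSTART, RLENGTH))
        rest = substr(rest, RSTART + RLENGTH)
      }
      print out rest
    }' "$@"
}
```

Usage would be something like swap_pairs file1. Because the line is rebuilt from substrings rather than by assigning to fields, the original spacing is kept.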
Using GNU sed, with \b as the word boundary:
sed -E 's/\b([[:digit:]]{1,3})#([[:digit:]]{1,4})(#|[[:blank:]]*|[[:blank:]]*$)/\2#\1\3/g' infile
Input:
xyz123#456#1#34#1234#2
0123#456# 123#456#
123#456#1#34#123#212#3#456#1#34#123#2#123#xyzx
5678#124 111#110# 002#001 01#010 1111#000
1111#000
Output:
xyz123#456#34#1#1234#2
0123#456# 456#123#
456#123#34#1#212#123#456#3#34#1#2#123#123#xyzx
5678#124 110#111# 001#002 010#01 1111#000
1111#000
I have a file like
abc dog 1.0
abc cat 2.4
abc elephant 1.2
and I want to replace the last word of any line that contains 'elephant' with a string that I know.
The result should be
abc dog 1.0
abc cat 2.4
abc elephant mystring
I have sed '/.*elephant.*/s/%/%/' $file, but what should go in place of each '%'?
EDIT:
odd example
abc dogdogdogdog 1.0
abc cat 2.4
abc elephant 1.2
and now try to change last line.
EDIT: To preserve spaces, could you please try the following.
awk '
match($0,/elephant[^0-9]*/){
val=substr($0,RSTART,RLENGTH-1)
sub("elephant","",val)
$NF=val "my_string"
val=""
}
1
' Input_file
Could you please try the following (if you are OK with awk).
awk '/elephant/{$NF="my_string"} 1' Input_file
In case you want to save the output into Input_file itself, try the following.
awk '/elephant/{$NF="my_string"} 1' Input_file > temp_file && mv temp_file Input_file
Basic:
sed '/elephant/ s/[^[:blank:]]\{1,\}$/mystring/' $file
If there could be some spaces at the end:
sed '/elephant/ s/[^[:blank:]]\{1,\}[[:blank:]]*$/mystring/' $file
An alternative that does the substitution and preserves the spacing:
awk '/elephant/{sub(".{"length($NF)"}$","new")}7' file
with your example:
kent$ cat f
abc dog 1.0
abc cat 2.4
abc elephant 1.2
kent$ awk '/elephant/{sub(".{"length($NF)"}$","new")}7' f
abc dog 1.0
abc cat 2.4
abc elephant new
Robustly in any awk:
$ awk '$2=="elephant"{sub(/[^[:space:]]+$/,""); $0=$0 "mystring"} 1' file
abc dog 1.0
abc cat 2.4
abc elephant mystring
Note that unlike the other answers you have so far it will not fail when the target string (elephant) is part of some other string or appears in some other location than the 2nd field or contains any regexp metachars, or when the replacement string contains &, etc.
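To make the point about & concrete, here is a small sketch built around the same idea. The function name replace_last_field and the environment variable names target and repl are made up for illustration. The values are passed through ENVIRON rather than -v so that backslashes in them are not interpreted, and the replacement is appended by plain concatenation so & stays literal.

```shell
# Hypothetical helper: on lines whose 2nd field equals $1 exactly,
# replace the last word with $2, keeping the whitespace before it.
replace_last_field() {
  target=$1 repl=$2 awk '
    $2 == ENVIRON["target"] {
      sub(/[^ \t]+$/, "")        # drop the last word, keep the spacing before it
      $0 = $0 ENVIRON["repl"]    # concatenation: "&" has no special meaning here
    }
    1'
}
```

For example, replace_last_field elephant 'my&string' < file leaves a line such as "abc dogelephant 1.2" alone, because the comparison is on the whole 2nd field rather than a regexp match anywhere in the line.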
I am given a file. If a line has "xxx" as its third word then I need to replace it with "yyy". My final output must have all the original lines with the modified lines.
The input file is-
abc xyz mno
xxx xyz abc
abc xyz xxx
abc xxx xxx xxx
The required output file should be-
abc xyz mno
xxx xyz abc
abc xyz yyy
abc xxx yyy xxx
I have tried-
grep "\bxxx\b" file.txt | awk '{if ($3=="xxx") print $0;}' | sed -e 's/[^ ]*[^ ]/yyy/3'
but this gives the output as-
abc xyz yyy
abc xxx yyy xxx
The following simple awk may help you with this.
awk '$3=="xxx"{$3="yyy"} 1' Input_file
Output will be as follows.
abc xyz mno
xxx xyz abc
abc xyz yyy
abc xxx yyy xxx
Explanation: The condition checks whether $3 (the 3rd field) is equal to the string xxx and, if so, sets $3 to the string yyy. awk works on a condition-then-action model, so the trailing 1 is a condition that is always TRUE with no action given; by default awk then prints the current line (either unchanged or with the new 3rd field).
sed solution:
sed -E 's/^(([^[:space:]]+[[:space:]]+){2})xxx\>/\1yyy/' file
The output:
abc xyz mno
xxx xyz abc
abc xyz yyy
abc xxx yyy xxx
To modify the file in place, add the -i option: sed -Ei ....
In general the awk command may look like
awk '{command set 1}condition{command set 2}' file
The command set 1 would be executed for every line while command set 2 will be executed if the condition preceding that is true.
My final output must have all the original lines with the modified
lines
In your case
awk 'BEGIN{print "Original File";i=1}
{print}
$3=="xxx"{$3="yyy"}
{rec[i++]=$0}
END{print "Modified File";for(i=1;i<=NR;i++)print rec[i]}' file
should solve that.
Explanation
$3 is the third space-delimited field in awk. If it matches "xxx", it is replaced. The unmodified lines are printed first while the modified lines are stored in an array; at the end, the modified lines are printed. The BEGIN and END blocks are executed only at the beginning and the end respectively. NR is the awk built-in variable that holds the number of records processed so far. Since it is used in the END block, it gives us the total number of records.
All good :-)
Ravinder has already provided you with the shortest awk solution possible.
In sed, the following would work:
sed -E 's/^(([^ ]+ ){2})xxx/\1yyy/'
Or if your sed doesn't include -E, you can use the more painful BRE notation:
sed 's/^\(\([^ ][^ ]* \)\{2\}\)xxx/\1yyy/'
And if you're in the mood to handle this in bash alone, something like this might work:
while read -r line; do
read -r -a a <<<"$line"
[[ "${a[2]}" == "xxx" ]] && a[2]="yyy"
printf '%s ' "${a[@]}"
printf '\n'
done < input.txt
I have two files:
abc
ghi
and the second (aka database file)
abc 123
def 456
ghi 789
and I want to query the database file to print the second column into the second column of the first file if there is a match
So my output would be
abc 123
ghi 789
Logically, I understand what I have to do, but I lack the bash commands for it.
My attempt was to use join with -1, but I do not understand how to use it.
what's wrong with join?
$ cat 1
abc
ghi
$ cat 2
abc 123
def 456
ghi 789
$ join 1 2
abc 123
ghi 789
then if you want to store it somewhere just redirect the stdout.
join is a little overkill here (as it requires sorting) because file1 has just one column. Can you not use grep -f?
grep -Fwf file1 file2
-F treats the content of file1 as strings, not patterns
-w looks for the whole word to match
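One caveat with grep -Fwf is that it matches the word anywhere on the line, not only in the first column. If that matters, here is a minimal awk sketch of an exact first-column lookup that does not need sorted input either. The function name lookup_keys is made up for illustration; it assumes the key is the first whitespace-delimited field in both files.

```shell
# Hypothetical helper: print the lines of the database file ($2)
# whose first field appears as a first field in the key file ($1).
lookup_keys() {
  awk '
    NR == FNR { keys[$1]; next }   # first file: remember every key
    $1 in keys                     # second file: print lines with a known key
  ' "$1" "$2"
}
```

It would be called as lookup_keys file1 file2. The output follows the order of the database file, not the order of the key file.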
How can I use awk on the following file, named "awk.txt", to print all fields with a consistent amount of space or tab padding between them?
# cat /root/awk.txt
abc hij klm
def pqr hij
mmm fgf hgt
yyt ghf jkw
I want to use awk on this and print it in the following properly aligned format.
abc hij klm
def pqr hij
mmm fgf hgt
yyt ghf jkw
Please help!!
Use the column command (part of util-linux on most Linux distributions, not coreutils):
column -t file
In this special case, where all entries have the same length, the following awk command would do the trick as well, however column can do the job even if the entries have different length:
awk '{$1=$1}1' OFS=' ' file
This line of awk will format the output using printf:
awk '{printf "%3s\t%3s\t%3s\n",$1,$2,$3}' awk.txt
If you want to skip the first line, which starts with #:
awk '!/^#/{printf "%3s\t%3s\t%3s\n",$1,$2,$3}' awk.txt
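The fixed %3s widths above only line up when every entry is at most three characters long. If column is not available, a two-pass awk sketch along the following lines could compute the widths instead. The function name align_columns and the two-space gutter are assumptions for illustration, and the whole input is held in memory.

```shell
# Hypothetical helper: left-align whitespace-delimited columns,
# padding each column to its widest entry plus a two-space gutter.
align_columns() {
  awk '
    {
      nf[NR] = NF
      for (i = 1; i <= NF; i++) {
        cell[NR, i] = $i
        if (length($i) > width[i]) width[i] = length($i)
      }
    }
    END {
      for (r = 1; r <= NR; r++) {
        line = ""
        for (i = 1; i <= nf[r]; i++) {
          line = line sprintf("%-" width[i] "s", cell[r, i])
          if (i < nf[r]) line = line "  "
        }
        sub(/ +$/, "", line)   # no trailing padding after the last column
        print line
      }
    }' "$@"
}
```

It would be run as align_columns awk.txt, or at the end of a pipeline.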
If I have a file like this (test.txt):
abc naveen
abc cde
naveen cde
kumar
naveen
abc
cde
abc
naveen
cde
Question 1: The file contains repeated patterns such as abc, naveen, cde, etc.
I need to get the lines from a given occurrence of one pattern to a given occurrence of another pattern.
For example, I want the lines from the 2nd occurrence of abc to the 3rd occurrence of naveen, i.e. the output should be
abc cde
naveen cde
kumar
naveen
Question 2 (a continuation of the question above):
I want to get only the text between them (excluding that abc and naveen).
So I want the output to be
cde
naveen cde
kumar
I believe this can be done with sed, so could anyone please give me an answer for this?
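Try this:
a=2
b=3
# Line number of the a-th line containing abc and of the b-th line containing naveen
abcocc=$(awk '$0~/abc/{print NR}' txt | awk -v occ="$a" 'NR==occ{print $0}')
naveenocc=$(awk '$0~/naveen/{print NR}' txt | awk -v occ="$b" 'NR==occ{print $0}')
# 1) lines from the abc line through the naveen line, inclusive
awk -v abc="$abcocc" -v naveen="$naveenocc" 'NR>=abc&&NR<=naveen{print $0}' txt
# 2) only the lines strictly between them
awk -v abc="$abcocc" -v naveen="$naveenocc" 'NR>abc&&NR<naveen{print $0}' txt
Here a is the occurrence number of abc, b is the occurrence number of naveen, and txt is the input file. Try it and let me know if any modification is needed.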
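The inclusive range can also be produced in a single pass over the file. The following is a rough sketch only; the function name print_between and its argument order are made up for illustration, and unlike the commands above it treats both patterns as literal strings (via index) rather than as regular expressions.

```shell
# Hypothetical helper: print from the line holding the Nth occurrence of a
# start string through the line holding the Mth occurrence of an end string.
# Arguments: start string, N, end string, M; input comes from stdin.
print_between() {
  awk -v s="$1" -v a="$2" -v e="$3" -v b="$4" '
    {
      if (index($0, s) && ++sc == a) on = 1         # a-th line containing s
      hit_end = (index($0, e) && ++ec == b)         # b-th line containing e
      if (on) print
      if (hit_end) exit
    }'
}
```

With the sample file, print_between abc 2 naveen 3 < test.txt would cover the first question. Occurrences are counted per line, as in the two-step version.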