How to trim every nth line? - bash

I would like to cut off the first 9 characters of every 4th line. I could use cut -c 10-, but I don't know how to select only every 4th line without losing the remaining lines.
Input:
#V300059044L3C001R0010004402
AAGTAGATATCATGGAGCCG
+
FFFGFGGFGFGFFGFFGFFGGGGGFFFGG
#V300059044L3C001R0010009240
AAAGGGAGGGAGAATAATGG
+
GFFGFEGFGFGEFDFGGEFFGGEDEGEGF
Output:
#V300059044L3C001R0010004402
AAGTAGATATCATGGAGCCG
+
FGFFGFFGFFGGGGGFFFGG
#V300059044L3C001R0010009240
AAAGGGAGGGAGAATAATGG
+
FGEFDFGGEFFGGEDEGEGF

Please try the following, written and tested with the shown samples in GNU awk.
awk 'FNR%4==0{print substr($0,10);next} 1' Input_file
OR, as per @tripleee's suggestion (in the comments), try:
awk '!(FNR%4) { $0 = substr($0, 10) }1' Input_file
Explanation: Adding detailed explanation for above.
awk ' ##Starting awk program from here.
FNR%4==0{ ##Checking whether this line number is evenly divisible by 4 (every 4th line).
print substr($0,10) ##Printing the line from its 10th character onward.
next ##next skips all further statements for this line.
}
1 ##1 will print current Line.
' Input_file ##Mentioning Input_file name here.
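The same idea can be parameterised so that the interval and the number of dropped characters are not hard-coded. This is only a minimal sketch: the variable names n and k and the temporary sample file are assumptions for illustration, not part of the original answer.

```shell
#!/usr/bin/env bash
# Temporary sample file holding the data from the question.
input=$(mktemp)
cat > "$input" <<'EOF'
#V300059044L3C001R0010004402
AAGTAGATATCATGGAGCCG
+
FFFGFGGFGFGFFGFFGFFGGGGGFFFGG
#V300059044L3C001R0010009240
AAAGGGAGGGAGAATAATGG
+
GFFGFEGFGFGEFDFGGEFFGGEDEGEGF
EOF

# n = line interval, k = number of leading characters to drop (assumed names).
# Every n-th line is rewritten from character k+1 onward; the trailing 1 prints each line.
trimmed=$(awk -v n=4 -v k=9 'FNR % n == 0 { $0 = substr($0, k + 1) } 1' "$input")
printf '%s\n' "$trimmed"
rm -f "$input"
```

Passing the values with -v keeps the awk program itself unchanged when a different interval or offset is needed.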

GNU sed can choose every 4th line with 4~4, e.g.:
sed -E '4~4s/.{9}//'
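The 4~4 address is a GNU extension. Where GNU sed is not available, one possible alternative is to step through the lines with n, which prints the current line and reads the next one. The sketch below assumes the input length is a multiple of 4, and it uses a shortened, made-up sample (fastq_sample is a hypothetical helper).

```shell
#!/usr/bin/env bash
# Hypothetical shortened sample: two 4-line records.
fastq_sample() {
  printf '%s\n' '#id1' 'AAGT' '+' 'FFFGFGGFGFGFF' '#id2' 'AAAG' '+' 'GFFGFEGFGFGEF'
}

# n;n;n prints lines 1 to 3 of each group unchanged and leaves line 4 in the
# pattern space, where the substitution removes its first 9 characters.
portable=$(fastq_sample | sed 'n;n;n;s/^.\{9\}//')
printf '%s\n' "$portable"
```

If the last group is incomplete, the behaviour of n at end of input differs between sed implementations, so the GNU form or the awk answers above are safer for irregular files.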

Related

How to catch xth pattern1 to pattern2

This is my example to explain my question:
Bug Day 2022-01-13:
Security-Fail 248975
Resolve:
...
Bug Day 2022-01-25:
Security-Fail 225489
Security-Fail 225256
Security-Fail 225236
Resolve:
...
Bug Day 2022-02-02:
Security-Fail 222599
Resolve:
So, I have a big file that contains multiple security vulnerabilities.
I want to obtain this:
2022-01-13;248975
2022-01-25;225489,225256,225236
2022-02-02;222599
I thought about doing something like
bugDayNb=$(grep -c "Bug Day" myBugsFile)
for i in $(seq 1 "$bugDayNb"); do
grep -A10 -m"$i" "Bug Day" myBugsFile
done
The problem with this command is that it won't work if there are more than 10 Security-Fail lines, and if I use "-A50" it may pick up the Security-Fail lines of the next Bug Day.
So I would prefer a way, with sed or something similar, to extract from the xth "Bug Day" to the xth "Resolve".
Thank you!
Here's one way to do it:
$ awk '/^Bug Day/{d=$NF; s=""}
/^Security-Fail/{d = d s $NF; s=","}
/^Resolve:/{print d}' ip.txt
2022-01-13:248975
2022-01-25:225489,225256,225236
2022-02-02:222599
/^Bug Day/{d=$NF; s=""} save the date to variable d if line starts with Bug Day and initialize s to empty string
use {d=$NF; sub(/:$/, ";", d); s=""} if you want ; instead of :
/^Security-Fail/{d = d s $NF; s=","} when line starts with Security-Fail append the number to d variable and set s so that further appends will be separated by ,
/^Resolve:/{print d} print the results when Resolve: is seen
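Putting the pieces above together with the ; variant gives the output format requested in the question. This is a sketch run against a temporary copy of the sample data; the file and variable names are placeholders.

```shell
#!/usr/bin/env bash
# Temporary copy of the sample from the question.
bugs=$(mktemp)
cat > "$bugs" <<'EOF'
Bug Day 2022-01-13:
Security-Fail 248975
Resolve:
...
Bug Day 2022-01-25:
Security-Fail 225489
Security-Fail 225256
Security-Fail 225236
Resolve:
...
Bug Day 2022-02-02:
Security-Fail 222599
Resolve:
EOF

summary=$(awk '
  /^Bug Day/       { d = $NF; sub(/:$/, ";", d); s = "" }  # date with ; instead of :
  /^Security-Fail/ { d = d s $NF; s = "," }                # append id, comma after the first
  /^Resolve:/      { print d }                             # end of block: emit the record
' "$bugs")
printf '%s\n' "$summary"
rm -f "$bugs"
```

Because the record is only printed at Resolve:, the number of Security-Fail lines per day does not matter, which avoids the fixed -A10 window from the question.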
With your shown samples, please try following awk program.
awk '
/Bug Day/{
sub(/:$/,"",$NF)
bugVal=$NF
next
}
/^Security-Fail/{
secVal=(secVal?secVal ",":"")$NF
next
}
/^Resolve:/ && bugVal && secVal{
print bugVal";"secVal
bugVal=secVal=""
}
' Input_file
Explanation: Adding detailed explanation for above.
awk ' ##Starting awk program from here.
/Bug Day/{ ##Checking condition if line contains Bug Day then do following.
sub(/:$/,"",$NF) ##Removing the trailing : from $NF in the current line.
bugVal=$NF ##Creating bugVal which has $NF value in it.
next ##next will skip all further statements from here.
}
/^Security-Fail/{ ##Checking if line starts from Security-Fail then do following.
secVal=(secVal?secVal ",":"")$NF ##Creating secVal which has $NF value in it and keep adding value to it with delimiter of comma here.
next ##next will skip all further statements from here.
}
/^Resolve:/ && bugVal && secVal{ ##Checking condition if line starts from Resolve: and bugVal is SET and secVal is SET then do following.
print bugVal";"secVal ##printing bugVal semi-colon secVal here.
bugVal=secVal="" ##Nullifying bugVal and secVal here.
}
' Input_file ##mentioning Input_file name here.
This might work for you (GNU sed):
sed -nE '/Bug Day/{:a;N;/Resolve/!ba;s/.* //mg;y/\n/,/;s/:,(.*),.*/;\1/p}' file
Gather up lines between Bug Day and Resolve and format accordingly.
If you want to be selective about a single day or range of days, use:
sed -nE '/Bug Day/{x;s/^/x/;/^x{1,3}$/!{x;d};x
:a;N;/Resolve/!ba;s/.* //mg;y/\n/,/;s/:,/;/;s/(.*),.*/\1/p}' file
The above command displays the first 3 days, i.e. days 1 to 3.
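A range of days can also be selected in awk by counting the Bug Day headers. The following is a sketch of that idea, not a translation of the sed command; summarize_days, first, last and keep are hypothetical names introduced here, and the sample is shortened.

```shell
#!/usr/bin/env bash
# summarize_days is a hypothetical helper: print "date;id,id,..." for days FIRST..LAST.
summarize_days() {  # usage: summarize_days FIRST LAST FILE
  awk -v first="$1" -v last="$2" '
    /^Bug Day/ {
      day++                                  # count Bug Day headers seen so far
      keep = (day >= first && day <= last)   # is this day inside the requested range?
      d = $NF; sub(/:$/, ";", d); s = ""
    }
    keep && /^Security-Fail/ { d = d s $NF; s = "," }
    keep && /^Resolve:/      { print d; keep = 0 }
  ' "$3"
}

sample=$(mktemp)
printf '%s\n' \
  'Bug Day 2022-01-13:' 'Security-Fail 248975' 'Resolve:' '...' \
  'Bug Day 2022-01-25:' 'Security-Fail 225489' 'Security-Fail 225256' 'Resolve:' '...' \
  'Bug Day 2022-02-02:' 'Security-Fail 222599' 'Resolve:' > "$sample"

picked=$(summarize_days 2 3 "$sample")   # days 2 to 3 only
printf '%s\n' "$picked"
rm -f "$sample"
```

Calling summarize_days 1 3 on the same file would correspond to the "first 3 days" selection described above.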
Would you please try an awk solution:
awk '/^Bug Day/ {f=1; line=$0; next} # start of block
f {line=line ORS $0} # append the line if "f" is set
/^Security-Fail/ {g=1} # the block contains "Security-Fail"
/^Resolve/ {if (g) print line; f=g=0; line=""} # end of block
' input_file
If you prefer a one-liner:
awk '/^Bug Day/{f=1; line=$0; next} f{line=line ORS $0} /^Security-Fail/{g=1} /^Resolve/{if (g) print line; f=g=0; line=""}' input_file

Reformatting text file using awk and cut as a one liner

Data:
CHR SNP BP A1 TEST NMISS BETA SE L95 U95 STAT P
1 chr1:1243:A:T 1243 T ADD 16283 -6.124 0.543 -1.431 0.3534 -1.123 0.14
Desired output:
MarkerName P-Value
chr1:1243 0.14
The actual file is 1.2G worth of lines like the above.
I need to strip everything past the 2nd colon from the 2nd column, print it next to the final (12th) column, and give both columns new headers.
I have tried:
awk '{print $2, $12}' | cut -d: -f1-2
but this removes the whole rest of the line after the 2nd colon, and I want to keep the "P" column.
I wrote this to a new file and then pasted the P-value column onto it using awk, but I was wondering if there is a one-liner method of doing this?
Many thanks
My comment in more understandable form:
$ awk '
BEGIN {
print "MarkerName P-Value" # output header
}
NR>1 { # skip the funky first record
split($2,a,/:/) # split by :
printf "%s:%s %s\n",a[1],a[2],$12 # printf allows easier output formatting
}' file
Output:
MarkerName P-Value
chr1:1243 0.14
EDIT: Adding one more solution here as an alternative, since the OP mentioned that my first solution somehow didn't work for them, although it worked fine for me.
awk '
BEGIN{
print "MarkerName P-Value"
}
FNR>1{
match($2,/([^:]*:){2}/)
print substr($2,RSTART,RLENGTH-1),$NF
}
' Input_file
With the shown samples, please try the following. You don't need cut alongside awk; awk can handle everything by itself.
awk -F' +|:' '
BEGIN{
print "MarkerName P-Value"
}
FNR>1{
print $2":"$3,$NF
}
' Input_file
Explanation: Adding detailed explanation for above.
awk -F' +|:' ' ##Starting awk program from here and setting field separator as spaces or colon for all lines.
BEGIN{ ##Starting BEGIN section of this program from here.
print "MarkerName P-Value" ##Printing headers here.
}
FNR>1{ ##Checking condition if line number is greater than 1 then do following.
print $2":"$3,$NF ##Printing 2nd field, a colon, 3rd field and then the last field, as the OP requested.
}
' Input_file ##Mentioning Input_file name here.
$ awk -F'[: ]+' '{print (NR==1 ? "MarkerName P-Value" : $2":"$3" "$NF)}' file
MarkerName P-Value
chr1:1243 0.14
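The answers above rely on the SNP and P columns being the 2nd and last fields. If the column order is not guaranteed, one option is to look the positions up from the header line. This is a sketch under that assumption; the col and part array names and the temporary file are made up for illustration.

```shell
#!/usr/bin/env bash
# Temporary copy of the sample from the question.
gwas=$(mktemp)
cat > "$gwas" <<'EOF'
CHR SNP BP A1 TEST NMISS BETA SE L95 U95 STAT P
1 chr1:1243:A:T 1243 T ADD 16283 -6.124 0.543 -1.431 0.3534 -1.123 0.14
EOF

markers=$(awk '
  NR == 1 {
    for (i = 1; i <= NF; i++) col[$i] = i   # map header name to column number
    print "MarkerName", "P-Value"
    next
  }
  {
    split($(col["SNP"]), part, ":")         # chr1:1243:A:T becomes chr1, 1243, A, T
    print part[1] ":" part[2], $(col["P"])
  }
' "$gwas")
printf '%s\n' "$markers"
rm -f "$gwas"
```

The file is read once, line by line, so the same approach should stay workable for the 1.2G input mentioned in the question.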
Sed alternative:
sed -En '1{s/^.*$/MarkerName\tP-Value/p};s/([[:digit:]]+[[:space:]]+)([[:alnum:]]+:[[:digit:]]+)(.*)([[:digit:]]+\.[[:digit:]]+$)/\2\t\4/p'
On the first line, replace the whole line with the headers and print it. On every other line, the regular expression splits the line into 4 capture groups, and the 2nd group is printed, followed by a tab and then the 4th group.

How to increase value of a text variable in a file

file1.txt contains the data below.
VARIABLE=00
RATE=14
PRICE=100
I need to increment only the value of VARIABLE by 1, whenever I choose to run the command.
The output should be incremented by 1 on every run, like below:
VARIABLE=01
In the next run VARIABLE=02, and so on.
Please try the following, written and tested with the shown samples in GNU awk.
awk 'BEGIN{FS=OFS="="} /^VARIABLE/{$NF=sprintf("%02d",$NF+1)} 1' Input_file > temp && mv temp Input_file
Explanation: Adding detailed explanation for above.
awk ' ##Starting awk program from here.
BEGIN{ ##Starting BEGIN section of this program from here.
FS=OFS="=" ##Setting FS and OFS as = here.
}
/^VARIABLE/{ ##Checking condition if line starts from VARIABLE then do following.
$NF=sprintf("%02d",$NF+1) ##Adding 1 to the last field and saving it back to the last field as a zero-padded 2 digit value.
}
1 ##1 will print the current line.
' Input_file > temp && mv temp Input_file ##Mentioning Input_file name here.
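The same command can be wrapped in a small function so that each call bumps the counter once, using mktemp instead of a fixed temp file name. This is a sketch; bump_variable is a hypothetical helper name, and it assumes bash (for local).

```shell
#!/usr/bin/env bash
# bump_variable is a hypothetical helper: increment VARIABLE=NN in FILE by 1.
bump_variable() {  # usage: bump_variable FILE
  local file=$1 tmp
  tmp=$(mktemp) || return 1
  # Write the edited copy to a temporary file, then replace the original
  # only if awk succeeded.
  awk 'BEGIN { FS = OFS = "=" } /^VARIABLE=/ { $NF = sprintf("%02d", $NF + 1) } 1' \
    "$file" > "$tmp" && mv "$tmp" "$file"
}

cfg=$(mktemp)
printf '%s\n' 'VARIABLE=00' 'RATE=14' 'PRICE=100' > "$cfg"
bump_variable "$cfg"   # VARIABLE=01
bump_variable "$cfg"   # VARIABLE=02
result=$(cat "$cfg")
printf '%s\n' "$result"
rm -f "$cfg"
```

Note that mv replaces the original with the temporary file, so the result takes on the permissions mktemp assigned to it; adjust them afterwards if that matters.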
You can do it quite simply as a one-liner in Perl:
perl -i -pe '/^VARIABLE=/ && s/(\d+)/sprintf("%02d", $1+1)/e' file
In case you are unfamiliar with Perl, that says...
Run Perl and modify the file in place. On any line starting with VARIABLE=, substitute the digits on that line with an expression calculated as "whatever the number was, plus 1", zero-padded to 2 digits.
Note that Perl has shipped with macOS by default, so it usually needs no separate installation there.

How to add single quote after specific word using sed?

I am trying to write a script that adds single quotes around the value following "GOOD=".
For example, I have file1:
//WER GOOD=ONE
//WER1 GOOD=TWO2
//PR1 GOOD=THR45
...
The desired change is to add single quotes:
//WER GOOD='ONE'
//WER1 GOOD='TWO2'
//PR1 GOOD='THR45'
...
This is the script which I am trying to run:
#!/bin/bash
for item in `grep "GOOD" file1 | cut -f2 -d '='`
do
sed -i 's/$item/`\$item/`\/g' file1
done
Thank you in advance for the help!
Please try the following.
sed "s/\(.*=\)\(.*\)/\1'\2'/" Input_file
OR as per OP's comment to remove empty line use:
sed "s/\(.*=\)\(.*\)/\1'\2'/;/^$/d" Input_file
Explanation: following is only for explanation purposes.
sed " ##Starting sed command from here.
s/ ##Using s to start substitution process from here.
\(.*=\)\(.*\) ##Using sed capture groups to store the matched text: everything up to = goes into the 1st group and the rest of the line into the 2nd group.
/\1'\2' ##Now replacing the line with the 1st group followed by the 2nd group wrapped in single quotes, i.e. adding the quotes after = as the OP needs.
/" Input_file ##Closing block for substitution, mentioning Input_file name here.
Please use the -i option with the above command if you want to save the output into Input_file itself.
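The commands above quote whatever follows the last = on any line. If the file may contain other key=value pairs, a narrower variant is to anchor the match on GOOD= itself. This is only a sketch, and it assumes the value runs to the end of the line.

```shell
#!/usr/bin/env bash
# Sample lines from the question, including the literal "..." line.
quoted=$(printf '%s\n' '//WER GOOD=ONE' '//WER1 GOOD=TWO2' '//PR1 GOOD=THR45' '...' |
  sed "s/\(GOOD=\)\(.*\)/\1'\2'/")   # group 1 is GOOD=, group 2 is the value to wrap
printf '%s\n' "$quoted"
```

Lines without GOOD=, such as the "..." line, pass through unchanged because the substitution does not match them.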
2nd solution with awk:
awk 'match($0,/=.*/){$0=substr($0,1,RSTART) "\047" substr($0,RSTART+1,RLENGTH) "\047"} 1' Input_file
Explanation: Adding explanation for above code.
awk '
match($0,/=.*/){ ##Using match function to match everything from = till the end of the line.
$0=substr($0,1,RSTART) "\047" substr($0,RSTART+1,RLENGTH) "\047" ##Rebuilding $0: the part up to and including =, then a single quote (\047), then the rest of the line, then a closing single quote.
} ##Where RSTART and RLENGTH are variables which will be SET once a TRUE matched regex is found.
1 ##1 will print edited/non-edited line.
' Input_file ##Mentioning Input_file name here.
3rd solution: In case you have only 2 fields in your Input_file, then try this simpler awk:
awk 'BEGIN{FS=OFS="="} {$2="\047" $2 "\047"} 1' Input_file
Explanation of 3rd code: Use only for explanation purposes, for running please use above code itself.
awk ' ##Starting awk program here.
BEGIN{FS=OFS="="} ##Setting FS and OFS values as = for all line for Input_file here.
{$2="\047" $2 "\047"} ##Wrapping $2 in single quotes (\047) as the OP needs.
1 ##Mentioning 1 will print edited/non-edited lines here.
' Input_file ##Mentioning Input_file name here.
