I need to create a border around the output of a command in terminal so that if, for example, the output of a command is this:
Apple
Paper Clip
Water
It will become this:
/==========\
|Apple |
|Paper Clip|
|Water |
\==========/
Thanks ahead of time for any and all responses.
-C.L
awk seems like the least insane way to go about this:
command | expand | awk 'length($0) > length(longest) { longest = $0 } { lines[NR] = $0 } END { gsub(/./, "=", longest); print "/=" longest "=\\"; n = length(longest); for(i = 1; i <= NR; ++i) { printf("| %s %*s\n", lines[i], n - length(lines[i]) + 1, "|"); } print "\\=" longest "=/" }'
expand replaces tabs that may be in the output with the appropriate number of spaces to keep the look of it the same (this is to make sure that every byte of output is rendered with the same width). The awk code works as follows:
length($0) > length(longest) { # Remember the longest line
    longest = $0
}
{ # also remember all lines in order
    lines[NR] = $0
}
END { # when you have everything:
    gsub(/./, "=", longest)    # build a line of = as long as the longest
                               # line
    print "/=" longest "=\\"   # use it to print the top bit
    n = length(longest)        # format the content with left and right
    for(i = 1; i <= NR; ++i) { # delimiters; spacing through printf
        printf("| %s %*s\n", lines[i], n - length(lines[i]) + 1, "|")
    }
    print "\\=" longest "=/"   # print bottom bit.
}
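For a quick sanity check, here is what this produces for the sample input, with the awk program above saved as box.awk (a file name I'm choosing here). Note that this variant pads the content with one space on each side:

$ printf 'Apple\nPaper Clip\nWater\n' | expand | awk -f box.awk
/============\
| Apple      |
| Paper Clip |
| Water      |
\============/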
The most insane way to do it, and I dare you to dispute this, is with sed:
#!/bin/sed -f
# assemble lines in the hold buffer, preceded by the left delimiter
s/^/| /
1h
1!H
$!d
# make a copy of it in the pattern space
x
h
# isolate the longest line (or rather: a line of = as long as the longest
# line)
s/[^\n]/=/g
:a
/^\(=*\)\n\1/ {
s//\1/
ba
}
//! {
s/\n=*//
ta
}
# build top bit, print it
s,.*,/&\\,
p
# build measuring stick
s,.\(.*\).,=\1,
# for all lines in the output:
:lineloop
# fetch the line
G
s/^\(=*\n\)\([^\n]*\).*/\1\2/
# replace it with = to get a second measuring stick
s/[^\n]/=/g
# fetch another copy of the line
G
s/^\(=*\n=*\n\)\([^\n]*\).*/\1\2/
# inner loop:
:spaceloop
# while the line measuring stick is not as long as the overall measuring
# stick
/^\(=*\)\n\1/! {
# append a = to it and a space to the line for output
s/\n/\n=/
s/$/ /
b spaceloop
}
# once that is done, append the second delimiter
s/$/|/
# remove one measuring stick
s/=*\n//
# put the second behind the actual line
s/\(.*\)\n\(.*\)/\2\n\1/
# print the line
P
# remove it. Only the measuring stick remains and can be reused for the
# next line
s/.*\n//
# do this while there are more lines to be processed
x
/\n/ {
s/[^\n]*\n//
x
b lineloop
}
# then build the bottom bit and print it.
x
s/=/\\/
s/$/\//
Put that in a file foo.sed, use command | expand | sed -f foo.sed. But only do it once to confirm that it works. You don't want to run something like that in production.
Not in the language you were looking for, but succinct and readable:
#!/usr/bin/env ruby
input = STDIN.read.split("\n")
width = input.map(&:size).max + 2
bar = '='*(width-2)
puts '/' + bar + '\\'
input.each {|i| puts "|"+i+" "*(width-i.size-2)+"|" }
puts '\\'+ bar + '/'
You can save it in a file, chmod +x it, and pipe your input into it.
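For instance, assuming you saved it as box.rb (again, my choice of name), it reproduces the question's desired output exactly:

$ printf 'Apple\nPaper Clip\nWater\n' | ./box.rb
/==========\
|Apple     |
|Paper Clip|
|Water     |
\==========/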
If you "need" to have it in a one-liner:
echo e"Apple\nPaper Clip\nWater" |
ruby -e 'i=STDIN.read.split("\n");w=i.map(&:size).max+2;b="="*(w-2);i.map! {|j| "|"+j+" "*(w-j.size-2)+"|" };i.unshift "/"+b+"\\"; i<<"\\"+b+"/";puts i'
I have around 65,000 product codes in a text file. I want to split them into groups of 999 each, and then have each group of 999 numbers wrapped in single quotes and separated by commas.
Could you please suggest how I can achieve the above scenario through a Unix script.
87453454
65778445
.
.
.
.
Till 65,000 product codes
Need to arrange in below pattern:
'87453454','65778445',
With awk:
awk '
++c == 1 { out = "\047" $0 "\047"; next }
{ out = out ",\047" $0 "\047" }
c == 999 { print out; c = 0 }
END { if (c) print out }
' file
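To sanity-check the logic without 65,000 lines, you can shrink the group size; this is the same program with 999 replaced by 3, run on a toy input:

$ seq 7 | awk '++c == 1 { out = "\047" $0 "\047"; next }
{ out = out ",\047" $0 "\047" }
c == 3 { print out; c = 0 }
END { if (c) print out }'
'1','2','3'
'4','5','6'
'7'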
Or, with GNU sed:
sed "
:a
\$bb
N
0~999{
:b
s/\n/','/g
s/^/'/
s/$/'/
b
}
ba" file
With Perl:
perl -ne '
sub pq { chomp; print "\x27$_\x27" } pq;
for (1 .. 998) {
if (defined($_ = <>)) {
print ",";
pq
}
}
print "\n"
' < file
Credit to Mauke from #perl on Libera.Chat.
65,000 isn't that many lines for awk - just do it all in one shot:
mawk 'BEGIN { FS = RS; RS = "^$"; OFS = (_="\47")(",")_
} gsub(/^|[^0-9]*$/,_, $!(NF = NF))'
'66771756','69562431','22026341','58085790','22563930',
'63801696','24044132','94255986','56451624','46154427'
That groups them all on one line. To make groups of 999, try
jot -r 50 10000000 99999999 |
# change "5" to "999" here
rs -C= 0 5 |
mawk 'sub(".*", "\47&\47", $!(NF -= _==$NF ))' FS== OFS='\47,\47'
'36452530','29776340','31198057','36015730','30143632'
'49664844','83535994','86871984','44613227','12309645'
'58002568','31342035','72695499','54546650','21800933'
'38059391','36935562','98323086','91089765','65672096'
'17634208','14009291','39114390','35338398','43676356'
'14973124','19782405','96782582','27689803','27438921'
'79540212','49141859','25714405','42248622','25589123'
'11466085','87022819','65726165','86718075','56989625'
'12900115','82979216','65469187','63769703','86494457'
'26544666','89342693','64603075','26102683','70528492'
_==$NF checks whether the rightmost column is empty or not, i.e. whether there's a trailing separator that needs to be trimmed.
If your input file only contains short codes as shown in your example, you could use the following hack:
xargs -L 999 bash -c "printf \'%s\', \"\$@\"; echo" . <inputFile >outputFile
Alternatively, you can use this sed command:
sed -Ene"s/(.*)/'\1',/;H" -e{'0~999','$'}'{z;x;s/\n//g;p}' <inputFile >outputFile
s/(.*)/'\1',/ wraps each line in '...',
but does not print it (-n)
instead, H appends the modified line to the so-called hold space; basically a helper variable storing a single string.
(This also adds a line break as a separator, but we remove that later.)
Every 999 lines (0~999) and at the end of the input file ($) ...
... the hold space is then printed and cleared (z;x;...;p)
while deleting all delimiter-linebreaks (s/\n//g) mentioned earlier.
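Here is the same GNU sed command on a toy input, with the group size dropped from 999 to 2 purely for the demo; note the trailing comma after each group, which matches the requested output format:

$ seq 5 | sed -Ene"s/(.*)/'\1',/;H" -e{'0~2','$'}'{z;x;s/\n//g;p}'
'1','2',
'3','4',
'5',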
I have several lines in a file (input.in) that may look like this (asterisks are not literal; added for emphasis):
200928,121546,00002,**0000004015K**,**0000000641}**,00102020
200928,121546,00002,**0000000227B**,**0000000970R**,84839923
200928,121546,00003,**0000001197A**,**0000000227B**,93877763
I need to be able to look at the value of the last character in the fourth and fifth fields (i.e., at positions 31 and 43) to determine what the actual number should be and whether it's positive or negative. The result should look like the following after modification:
200928,121546,00002,-00000040152,-00000006410,00102020
200928,121546,00002,00000002272,-00000009709,84839923
200928,121546,00003,00000011971,00000002272,93877763
{ABCDEFGHI correspond to positive fields; the substitutions are 0123456789
}JKLMNOPQR correspond to negative fields; the substitutions are 0123456789
I'm able to get all the positive number conversions working correctly but I am having problems with the negative conversions.
My code looks sorta like this for getting the positive switches (This is a "packed field" conversion btw):
sed -i -E "s/^(.{$a})\{/\10/" input.in
This is for the { positive case where the sub will be 0.
$a is introduced by a "for a in 30 42" loop. I have no issue identifying and updating the last character of that string, but I can't figure out how to flip the value to negative only when the corresponding character is found. I was thinking of looking at the entire group of 11 characters (the 4th and 5th fields): if the last character in that group is one of }JKLMNOPQR, insert - at the first position and replace }JKLMNOPQR with 0123456789, respectively. I'm stuck there, though. The objective, of course, is to update the file with the changes after the substitutions are complete.
Code sample:
input="input.in"
for a in 30 42
do
while IFS= read -r line
do
echo "${line:$a:1} found, converting"
edbvalue=${line:$a:1}
case $edbvalue in
{)
echo -n -e "{ being replaced with 0\n"
sed -i -E "s/^(.{$a})\{/\10/" input.in
;;
A)
echo -n -e "A being replaced with 1\n"
sed -i -E "s/^(.{$a})A/\11/" input.in
;;
.
.
.
R)
echo -n -e "R being replaced with 9\n"
sed -i -E "s/^(.{$a})R/\19/" input.in
;;
*)
echo -n -e "no conversion needed\n"
;;
esac
done < "$input"
done
Rewriting the input file repeatedly is horrendously inefficient. You want to perform all the replacements in one go.
sed is rather hard to read once you start doing nontrivial things, so I would recommend switching to Awk (or a proper modern scripting language like Python if you want to invest more into this).
awk -F , 'BEGIN { OFS=FS
pos = "{ABCDEFGHI"; neg = "}JKLMNOPQR";
for (i=0; i<10; ++i) { p[substr(pos, i+1, 1)] = i; n[substr(neg, i+1, 1)] = i }
}
{ for (i=4; i<=5; i++) {
where = length($i)
what = substr($i, where, 1)
if (what ~ "^[" pos "]$") sign = ""
else if (what ~ "^[" neg "]$") sign = "-"
else print "Error: field " i " " $i " malformed" >"/dev/stderr"
$i = sign substr($i, 1, where-1) (sign ? n[what] : p[what])
}
}1' input.in
Demo: https://ideone.com/z8wK0V
This isn't entirely obvious, but here's a quick breakdown.
In the BEGIN block, we create two associative arrays, such that
p["{"] = 0, n["}"] = 0
p["A"] = 1, n["J"] = 1
p["B"] = 2, n["K"] = 2
p["C"] = 3, n["L"] = 3
p["D"] = 4, n["M"] = 4
p["E"] = 5, n["N"] = 5
p["F"] = 6, n["O"] = 6
p["G"] = 7, n["P"] = 7
p["H"] = 8, n["Q"] = 8
p["I"] = 9, n["R"] = 9
(We also set OFS to FS so that Awk will print the output comma-separated, like it reads the input.)
Down in the main block, we loop over fields 4 and 5, extracting the last character and mapping it to the corresponding entry from the correct one of the two arrays, and add a sign if warranted.
This simply writes to standard output; save to a new file and move it back over the original input file, or if you have GNU Awk, explore its -i inplace option.
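A minimal sketch of both options, assuming the program above was saved as convert.awk (my file name, not from the question):

awk -f convert.awk input.in > input.tmp && mv input.tmp input.in
# or, with GNU Awk 4.1 or newer:
gawk -i inplace -f convert.awk input.in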
If you really wanted to do this in sed, it offers a rather convenient y/{ABCDEFGHI/0123456789/ but picking apart the fields and then reassembling the line when you are done is not going to be pleasant.
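That said, if the data really is as regular as the sample (fixed-width fields, digits everywhere else), a rough single-pass GNU sed sketch is possible: flag the negative fields by their overpunch character first, then transliterate. Treat this as a fragile hack, since y/// operates on the whole line and would corrupt these letters anywhere else they appear:

sed -E "s/,([0-9]{10}[}J-R])/,-\1/g; y/{ABCDEFGHI}JKLMNOPQR/01234567890123456789/" input.in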
I've recently picked up awk, which is incredibly fast, since I needed to parse very big files.
I had to parse this kind of input...
ID 001R_FRG3G Reviewed; 256 AA.
AC Q6GZX4;
[...]
SQ SEQUENCE 256 AA; 29735 MW; B4840739BF7D4121 CRC64;
MAFSAEDVLK EYDRRRRMEA LLLSLYYPND RKLLDYKEWS PPRVQVECPK APVEWNNPPS
EKGLIVGHFS GIKYKGEKAQ ASEVDVNKMC CWVSKFKDAM RRYQGIQTCK IPGKVLSDLD
AKIKAYNLTV EGVEGFVRYS RVTKQHVAAF LKELRHSKQY ENVNLIHYIL TDKRVDIQHL
EKDLVKDFKA LVESAHRMRQ GHMINVKYIL YQLLKKHGHG PDGPDILTVK TGSKGVLYDD
SFRKIYTDLG WKFTPL
//
ID 002L_FRG3G Reviewed; 320 AA.
AC Q6GZX3;
[...]
SQ SEQUENCE 320 AA; 34642 MW; 9E110808B6E328E0 CRC64;
MSIIGATRLQ NDKSDTYSAG PCYAGGCSAF TPRGTCGKDW DLGEQTCASG FCTSQPLCAR
IKKTQVCGLR YSSKGKDPLV SAEWDSRGAP YVRCTYDADL IDTQAQVDQF VSMFGESPSL
AERYCMRGVK NTAGELVSRV SSDADPAGGW CRKWYSAHRG PDQDAALGSF CIKNPGAADC
KCINRASDPV YQKVKTLHAY PDQCWYVPCA ADVGELKMGT QRDTPTNCPT QVCQIVFNML
DDGSVTMDDV KNTINCDFSK YVPPPPPPKP TPPTPPTPPT PPTPPTPPTP PTPRPVHNRK
VMFFVAGAVL VAILISTVRW
//
ID 004R_FRG3G Reviewed; 60 AA.
AC Q6GZX1; dog;
[...]
SQ SEQUENCE 60 AA; 6514 MW; 12F072778EE6DFE4 CRC64;
MNAKYDTDQG VGRMLFLGTI GLAVVVGGLM AYGYYYDGKT PSSGTSFHTA SPSFSSRYRY
...filter it with a file like this...
Q6GZX4
dog
...to get an output like this:
Q6GZX4 MAFSAEDVLKEYDRRRRMEALLLSLYYPNDRKLLDYKEWSPPRVQVECPKAPVEWNNPPSEKGLIVGHFSGIKYKGEKAQASEVDVNKMCCWVSKFKDAMRRYQGIQTCKIPGKVLSDLDAKIKAYNLTVEGVEGFVRYSRVTKQHVAAFLKELRHSKQYENVNLIHYILTDKRVDIQHLEKDLVKDFKALVESAHRMRQGHMINVKYILYQLLKKHGHGPDGPDILTVKTGSKGVLYDDSFRKIYTDLGWKFTPL 256
dog MNAKYDTDQGVGRMLFLGTIGLAVVVGGLMAYGYYYDGKTPSSGTSFHTASPSFSSRYRY 60
To do this, I came up with this code:
BEGIN{
    while(getline<"filterFile.txt">0)B[$1];
}
{
    if ($1=="ID")
        len=$4;
    else{
        if ($1=="AC"){
            acc=0;
            line = substr($0,6,length($0)-6);
            split(line,A,"; ");
            for (i in A){
                if (A[i] in B){
                    acc=A[i];
                }
            }
            if (acc){
                printf acc"\t";
            }
        }
        if (acc){
            if(substr($0, 1, 5) == "     "){
                printf $1$2$3$4$5$6;
            }
            if ($1 == "//"){
                print "\t"len
            }
        }
    }
}
However, since I've seen many examples of similar tasks done with awk, I think there probably is a much more elegant and efficient way to do it. But I can't really grasp the super-compact examples usually found around the internet.
Since this is my input, my output and my code, I think this is a good occasion to understand more about awk optimization in terms of performance and coding style, if some awk guru has some time and patience to spend on this task.
Perl to the rescue:
#!/usr/bin/perl
use warnings;
use strict;
open my $FILTER, '<', 'filterFile.txt' or die $!;
my %wanted; # Hash of the wanted ids.
chomp, $wanted{$_} = 1 for <$FILTER>;
$/ = "//\n"; # Record separator.
while (<>) {
my ($id_string) = /^ AC \s+ (.*) /mx;
my @ids = split /\s*;\s*/, $id_string;
if (my ($id) = grep $wanted{$_}, @ids) {
print "$id\t";
my ($seq) = /^ SQ \s+ .* $ ((?s:.*)) /mx;
$seq =~ s/\s+//g; # Remove whitespace.
$seq =~ s=//$==; # Remove the final //.
print "$seq\t", length $seq, "\n";
}
}
An awk solution with a different field separator (this way, you avoid using substr and split):
BEGIN {
while (getline<"filterFile.txt">0) filter[$1] = 1;
FS = "[ \t;]+"; OFS = ""; ORS = "";
}
{
if (flag) {
if (len)
if ($1 == "//") {
print "\t" len "\n";
flag = 0; len = 0;
} else {
$1 = $1;
print;
}
else if ($1 == "SQ") len = $3;
} else if ($1 == "AC") {
for (i = 1; ++i < NF;)
if (filter[$i]) {
flag = 1;
print $i "\t";
break;
}
}
}
END { if (flag) print "\t" len }
Note: this code is not designed to be short but to be fast. That's why I didn't try to remove nested if/else conditions, but I did try to reduce as much as possible the overall number of tests for a whole file.
However, after several changes since my first version and after several benchmarks, I must admit that choroba's Perl version is a little faster.
For that kind of task, one idea is to pipe your second file through awk or sed to create, on the fly, a new awk script that parses the big file. As an example:
Control file (f1):
test
dog
Data (f2):
tree 5
test 2
nothing
dog 1
An idea to start with:
sed 's/^\(.*\)$/\/\1\/ {print $2}/' f1 | awk -f - f2
(where -f - means: read the awk script from the standard input rather than from a named file).
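With the example files above, the sed step emits a tiny awk script, one pattern-action rule per control line, which then runs against f2:

$ sed 's/^\(.*\)$/\/\1\/ {print $2}/' f1
/test/ {print $2}
/dog/ {print $2}
$ sed 's/^\(.*\)$/\/\1\/ {print $2}/' f1 | awk -f - f2
2
1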
It may not be much shorter than the original, but multiple awk scripts make the code simpler: the first awk generates the records of interest, the second extracts the information, and the third formats.
$ awk 'NR==FNR{keys[$0];next}
       {RS="//";
        for(k in keys)
            if($0~k)
               {print "key",k; print $0}}' keys file |
  awk '/key/{key=$2;f=0;next}
       /SQ/{f=1;print "\n\n"key,$3;next}
       f{gsub(" ","");printf $0}
       END{print}' |
  awk -vRS= -vOFS="\t" '{print $1,$3,$2}'
will print
Q6GZX4 MAFSAEDVLKEYDRRRRMEALLLSLYYPNDRKLLDYKEWSPPRVQVECPKAPVEWNNPPSEKGLIVGHFSGIKYKGEKAQASEVDVNKMCCWVSKFKDAMRRYQGIQTCKIPGKVLSDLDAKIKAYNLTVEGVEGFVRYSRVTKQHVAAFLKELRHSKQYENVNLIHYILTDKRVDIQHLEKDLVKDFKALVESAHRMRQGHMINVKYILYQLLKKHGHGPDGPDILTVKTGSKGVLYDDSFRKIYTDLGWKFTPL 256
dog MNAKYDTDQGVGRMLFLGTIGLAVVVGGLMAYGYYYDGKTPSSGTSFHTASPSFSSRYRY 60
Your code looks almost OK as-is. Keep it simple, single-pass like that.
Only a couple suggestions:
1) The business around the split is too messy/brittle. Maybe try it this way:
acc="";
n=split($0,A,"[; ]+");
for (i=2;i<=n;++i){
if (A[i] in B){
acc=A[i];
break;
}
}
2) Don't use input data in the first argument to your printfs. You never know when something that looks like printf formatting might come in and really mess things up:
printf "%s\t",acc";
printf "%s%s%s%s%s%s",$1,$2,$3,$4,$5,$6;
Update with one more possible "elegance":
3) The awk style of pattern{action} is already a form of if/then, so you can avoid a lot of your outer if/then nesting:
$1="ID" {len=$4}
$1="AC" {
acc="";
...
}
acc {
if(substr($0, 1, 5) == " "){
...
}
In Vim it's actually a one-liner to find the pattern:
/^AC.\{-}Q6GZX4;\_.\{-}\nSQ\_.\{-}\n\zs\_.\{-}\ze\/\//
where Q6GZX4; is your pattern to find in order to match the sequence characters.
The above basically does the following:
Search for the line with AC at the beginning (^) which is followed by Q6GZX4;.
Follow across multiple lines (\_.\{-}) to the line starting with SQ (\nSQ).
Then follow to the next line ignoring what's in the current (\_.\{-}\n).
Now start selecting the main pattern (\zs), which is basically everything across multiple lines (\_.\{-}) until (\ze) the // pattern is found.
Then execute a normal-mode Vim command (norm) which selects the match (gn) and yanks it into register x ("xy).
You may now print the register (echo @x) or remove whitespace characters from it.
This can be extended into Ex editor script as below (e.g. cmd.ex):
let s="Q6GZX4"
exec '/^AC.\{-}' . s . ';\_.\{-}\nSQ\_.\{-}\n\zs\_.\{-}\ze\/\//norm gn"xy'
let @x=substitute(@x,'\W','','g')
silent redi>>/dev/stdout
echon s . " " . @x
redi END
q!
Then run from the command-line as:
$ ex inputfile < cmd.ex
Q6GZX4 MAFSAEDVLKEYDRRRRMEALLLSLYYPNDRKLLDYKEWSPPRVQVECPKAPVEWNNPPSEKGLIVGHFSGIKYKGEKAQASEVDVNKMCCWVSKFKDAMRRYQGIQTCKIPGKVLSDLDAKIKAYNLTVEGVEGFVRYSRVTKQHVAAFLKELRHSKQYENVNLIHYILTDKRVDIQHLEKDLVKDFKALVESAHRMRQGHMINVKYILYQLLKKHGHGPDGPDILTVKTGSKGVLYDDSFRKIYTDLGWKFTPL
The above example can be further extended for multiple files or matches.
awk 'FNR == NR { aFilter[ $1 ";"] = $1; next }
/^AC/ {
if (String !~ /^$/) print Taken "\t" String "\t" Len
Taken = ""; String = ""
for ( i = 2; i <= NF && Taken ~ /^$/; i++) {
if( $i in aFilter) Taken = aFilter[ $i]
}
Take = Taken !~ /^$/
next
}
Take && /^SQ/ { Len = $3; next }
Take && /^[[:blank:]]/ {
gsub( /[[:blank:]]*/, "")
String = String $0
}
END { if( String !~ /^$/) print Taken "\t" String "\t" Len }
' filter.txt YourFile
Not really shorter, but maybe a bit more generic. The heavy part is extracting the value that serves as the filter from the line.
I have two files:
File with strings (new line terminated)
File with integers (one per line)
I would like to print the lines from the first file indexed by the lines in the second file. My current solution is to do this
while read index
do
    sed -n "${index}p" "$file1"
done < "$file2"
It essentially reads the index file line by line and runs sed to print that specific line. The problem is that it is slow for large index files (thousands and ten thousands of lines).
Is it possible to do this faster? I suspect awk can be useful here.
I searched SO as best I could, but could only find people trying to print line ranges instead of indexing by a second file.
UPDATE
The index is generally not shuffled, but the lines are expected to appear in the order defined by the indices in the index file.
EXAMPLE
File 1:
this is line 1
this is line 2
this is line 3
this is line 4
File 2:
3
2
The expected output is:
this is line 3
this is line 2
If I understand you correctly, then
awk 'NR == FNR { selected[$1] = 1; next } selected[FNR]' indexfile datafile
should work, under the assumption that the index is sorted in ascending order or you want lines to be printed in their order in the data file regardless of the way the index is ordered. This works as follows:
NR == FNR {          # while processing the first file
    selected[$1] = 1 # remember if an index was seen
    next             # and do nothing else
}
selected[FNR]        # after that, select (print) the selected lines.
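Run against the files from the question (calling them file1 and file2), this prints the selected lines in data-file order, which is exactly the caveat mentioned above:

$ awk 'NR == FNR { selected[$1] = 1; next } selected[FNR]' file2 file1
this is line 2
this is line 3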
If the index is not sorted and the lines should be printed in the order in which they appear in the index:
NR == FNR {              # processing the index:
    ++counter
    idx[$0] = counter    # remember that and at which position you saw
    next                 # the index
}
FNR in idx {             # when processing the data file:
    lines[idx[FNR]] = $0 # remember selected lines by the position of
}                        # the index
END {                    # and at the end: print them in that order.
    for(i = 1; i <= counter; ++i) {
        print lines[i]
    }
}
This can be inlined as well (with semicolons after ++counter and idx[$0] = counter), but I'd probably put it in a file, say foo.awk, and run awk -f foo.awk indexfile datafile. With an index file
1
4
3
and a data file
line1
line2
line3
line4
this will print
line1
line4
line3
The remaining caveat is that this assumes that the entries in the index are unique. If that, too, is a problem, you'll have to remember a list of index positions, split it while scanning the data file and remember the lines for each position. That is:
NR == FNR {
    ++counter
    idx[$0] = idx[$0] " " counter # remember a list here
    next
}
FNR in idx {
    split(idx[FNR], pos)          # split that list
    for(p in pos) {
        lines[pos[p]] = $0        # and remember the line for
                                  # all positions in them.
    }
}
END {
    for(i = 1; i <= counter; ++i) {
        print lines[i]
    }
}
This, finally, is the functional equivalent of the code in the question. How complicated you have to go for your use case is something you'll have to decide.
This awk script does what you want:
$ cat lines
1
3
5
$ cat strings
string 1
string 2
string 3
string 4
string 5
$ awk 'NR==FNR{a[$0];next}FNR in a' lines strings
string 1
string 3
string 5
The first block only runs for the first file, where the line number for the current file FNR is equal to the total line number NR. It sets a key in the array a for each line number that should be printed. next skips the rest of the instructions. For the file containing the strings, if the line number is in the array, the default action is performed (so the line is printed).
Use nl to number the lines in your strings file, then use join to merge the two:
~ $ cat index
1
3
5
~ $ cat strings
a
b
c
d
e
~ $ join index <(nl strings)
1 a
3 c
5 e
If you want the inverse (show lines that are NOT in your index):
$ join -v 2 index <(nl strings)
2 b
4 d
Mind also the comment by @glennjackman: if your files are not lexically sorted, then you need to sort them before passing in:
$ join <(sort index) <(nl strings | sort -b)
In order to complete the answers that use awk, here's a solution in Python that you can use from your bash script:
cat << EOF | python
lines = []
with open("$file2") as f:
for line in f:
lines.append(int(line))
i = 0
with open("$file1") as f:
for line in f:
i += 1
if i in lines:
print line,
EOF
The only advantage here is that Python is much easier to understand than awk :).
The following command is working fine when I am not writing it in a script file, but when I put this command in a script file, it shows an error.
nawk 'c-- >0;$0~s{if(b)for(c=b+1;c >1;c--)print r[(NR-c+1)%b];print;c=a}b{r[NR%b]=$0}' b=10 a=10 s="string pattern" file
The error is:
nawk: syntax error at source line 1 context is >>> ' <<< missing }
nawk: bailing out at source line 1
One of the comment responses to one of the many requests for 'What does your script look like' is:
#!/bin/ksh
Stringname=$1
directory=$2
d=$3
Command="nawk 'c-- >0;$0~s{if(b)for(c=b+1;c >1;c--)print r[(NR-c+1)%b];print;c=a}b{r[NR%b]=$0}' b=10 a=10 s=\"$stringname\" $directory"
$Command> $d
Storing the whole command in a string like that is hugely fraught; don't do it! It's unnecessary and very, very hard to get right.
#!/bin/ksh
Stringname=$1
directory=$2
d=$3
nawk 'c-- >0;$0~s{if(b)for(c=b+1;c >1;c--)print r[(NR-c+1)%b];print;c=a}b{r[NR%b]=$0}' b=10 a=10 s="$stringname" $directory > $d
The quickest way to solve the problem of printing N lines before and M lines after a match is to install GNU grep and use:
grep -B $N -A $M 'string pattern' file
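In terms of the script above, that would be something like this (assuming GNU grep is installed, and noting that $directory, despite its name, holds a file path):

grep -B 10 -A 10 "$Stringname" "$directory" > "$d"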
Failing that, here's a Perl script I wrote about 5 years ago to do the job. Note that there are some complications if you ask for 10 lines before and 10 lines after a match, and you have:
a match at line 7 (not 10 lines before)
a match at line 30 and another at 35 (need to print lines 20-45)
a match at line 60 where the last line is line 65 (not 10 lines after)
and there are multiple files to process.
This code does handle all that. It can probably be improved.
#!/usr/bin/perl -w
#
# #(#)$Id: sgrep.pl,v 1.6 2007/09/18 22:55:20 jleffler Exp $
#
# Perl-based SGREP (special grep) command
#
# Print lines around the line that matches (by default, 3 before and 3 after).
# By default, include file names if more than one file to search.
#
# Options:
# -b n1 Print n1 lines before match
# -f n2 Print n2 lines following match
# -n Print line numbers
# -h Do not print file names
# -H Do print file names
use strict;
use constant debug => 0;
use Getopt::Std;
my(%opts);
sub usage
{
print STDERR "Usage: $0 [-hnH] [-b n1] [-f n2] pattern [file ...]\n";
exit 1;
}
usage unless getopts('hnf:b:H', \%opts);
usage unless @ARGV >= 1;
if ($opts{h} && $opts{H})
{
print STDERR "$0: mutually exclusive options -h and -H specified\n";
exit 1;
}
my $op = shift;
print "# regex = $op\n" if debug;
# print file names if -h omitted and more than one argument
$opts{F} = (defined $opts{H} || (!defined $opts{h} and scalar @ARGV > 1)) ? 1 : 0;
$opts{n} = 0 unless defined $opts{n};
my $before = (defined $opts{b}) ? $opts{b} + 0 : 3;
my $after = (defined $opts{f}) ? $opts{f} + 0 : 3;
print "# before = $before; after = $after\n" if debug;
my @lines = (); # Accumulated lines
my $tail = 0; # Line number of last line in list
my $tbp_1 = 0; # First line to be printed
my $tbp_2 = 0; # Last line to be printed
# Print lines from @lines in the range $tbp_1 .. $tbp_2,
# leaving $leave lines in the array for future use.
sub print_leaving
{
my ($leave) = @_;
while (scalar(@lines) > $leave)
{
my $line = shift @lines;
my $curr = $tail - scalar(@lines);
if ($tbp_1 <= $curr && $curr <= $tbp_2)
{
print "$ARGV:" if $opts{F};
print "$curr:" if $opts{n};
print $line;
}
}
}
# General logic:
# Accumulate each line at end of @lines.
# ** If current line matches, record range that needs printing
# ** When the line array contains enough lines, pop line off front and,
# if it needs printing, print it.
# At end of file, empty line array, printing requisite accumulated lines.
while (<>)
{
# Add this line to the accumulated lines
push @lines, $_;
$tail = $.;
printf "# array: N = %d, last = $tail: %s", scalar(@lines), $_ if debug > 1;
if (m/$op/o)
{
# This line matches - set range to be printed
my $lo = $. - $before;
$tbp_1 = $lo if ($lo > $tbp_2);
$tbp_2 = $. + $after;
print "# $. MATCH: print range $tbp_1 .. $tbp_2\n" if debug;
}
# Print out any accumulated lines that need printing
# Leave $before lines in array.
print_leaving($before);
}
continue
{
if (eof)
{
# Print out any accumulated lines that need printing
print_leaving(0);
# Reset for next file
close ARGV;
$tbp_1 = 0;
$tbp_2 = 0;
$tail = 0;
@lines = ();
}
}
I bet you're trying to execute your script as nawk -f file instead of just ./file.