search through regex pattern in a file - shell

I have a file with data in below format
abc {u_bit_top/connect_down/u_FDIO[6]/u_latch}
ghi {u_bit_top/seq_connect/p_REDEIO[9]/ff_latch
def {u_bit_top/connect_up/shift_reg[7]
I want to search each line of the file for the patterns *bit_top*FDIO* and *bit_top*REDEIO* and delete the complete line if either pattern is found.
I want output as
def {u_bit_top/connect_up/shift_reg[7]
I tried sed "/bit_top/d;/FDIO/d;/REDEIO/d;", but this deletes every line that contains bit_top, FDIO, or REDEIO on its own.
How can I search for the combined pattern and delete the lines containing it?
A shell or Tcl solution would be useful.

Since you tagged tcl
set fh [open "filename"]
set contents [split [read -nonewline $fh] \n]
close $fh
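# -all returns every non-matching line; without it lsearch stops at the first one
set filtered [lsearch -all -inline -not -regexp $contents {bit_top.*(FDIO|REDEIO)}]
puts [join $filtered \n]
results in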
def {u_bit_top/connect_up/shift_reg[7]
lsearch documentation.
But really all you need for this is grep
grep -Ev 'bit_top.*(FDIO|REDEIO)' filename

You were close! ;)
sed '/bit_top.*FDIO/d;/bit_top.*REDEIO/d' input
Just give sed a regex that matches the whole combination you want to delete, rather than each word separately.

Using sed
$ sed -E '/bit_top.*(REDE|FD)IO/d' input_file
def {u_bit_top/connect_up/shift_reg[7]

You might use GNU AWK for this task in the following way. Let the content of file.txt be
abc {u_bit_top/connect_down/u_FDIO[6]/u_latch}
ghi {u_bit_top/seq_connect/p_REDEIO[9]/ff_latch
def {u_bit_top/connect_up/shift_reg[7]
then
awk '/bit_top/&&(/FDIO/||/REDEIO/){next}{print}' file.txt
gives output
def {u_bit_top/connect_up/shift_reg[7]
Explanation: if a line contains bit_top AND (FDIO OR REDEIO), go to the next line, i.e. skip it. Otherwise the line is printed.
(tested in GNU Awk 5.0.1)
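One caveat worth checking against your data: the separate awk conditions do not enforce any order, so a line where FDIO appears before bit_top is also dropped, whereas the single regex bit_top.*(FDIO|REDEIO) requires bit_top to come first. A minimal sketch of the difference follows; the "xyz" line is hypothetical and was added only for illustration, it is not part of the original data.

```shell
# Sample data from the question plus one hypothetical line ("xyz")
# in which FDIO appears BEFORE bit_top.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
abc {u_bit_top/connect_down/u_FDIO[6]/u_latch}
ghi {u_bit_top/seq_connect/p_REDEIO[9]/ff_latch
def {u_bit_top/connect_up/shift_reg[7]
xyz {u_FDIO[1]/u_bit_top/u_latch}
EOF

# Ordered match: bit_top must precede FDIO or REDEIO on the line.
ordered=$(grep -Ev 'bit_top.*(FDIO|REDEIO)' "$tmp")

# Unordered match: both words may appear anywhere on the line.
unordered=$(awk '/bit_top/ && (/FDIO/ || /REDEIO/) {next} {print}' "$tmp")

printf 'ordered:\n%s\n\nunordered:\n%s\n' "$ordered" "$unordered"
rm -f "$tmp"
```

With the ordered regex the hypothetical "xyz" line survives; with the unordered awk test it is removed along with the other two.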

With a small change you can implement the compound pattern (eg, *bit_top*FDIO*) in sed.
A couple variations on OP's current sed:
# daisy-chain the 2 requirements:
$ sed "/bit_top.*FDIO/d;/bit_top.*REDEIO/d" file
def {u_bit_top/connect_up/shift_reg[7]
# enable "-E"xtended regex support:
$ sed -E "/bit_top.*(FDIO|REDEIO)/d" file
def {u_bit_top/connect_up/shift_reg[7]

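You can read the file line by line and print only the lines that do not match the regex.
set fp [open "input.txt" r]
while { [gets $fp data] >= 0 } {
    # print the line only when the pattern does NOT match;
    # the opening brace of the body must be on the same line as the if
    if {![regexp {bit_top.*(FDIO|REDEIO)} $data]} {
        puts $data
    }
}
close $fp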

Related

How to process tr across all files in a directory and output to a different name in another directory?

mpu3$ echo * | xargs -n 1 -I {} | tr "|" "/n"
which outputs:
#.txt
ag.txt
bg.txt
bh.txt
bi.txt
bid.txt
dh.txt
dw.txt
er.txt
ha.txt
jo.txt
kc.txt
lfr.txt
lg.txt
ng.txt
pb.txt
r-c.txt
rj.txt
rw.txt
se.txt
sh.txt
vr.txt
wa.txt
is what I have so far. What is missing is the output; I get none. What I really want is to get a list of txt files, use their name up to the extension, process out the "|" and replace it with a LF/CR and put the new file in another directory as [old-name].ics. HALP. THX in advance. - Idiot me.
You can loop over the files and use sed to process the file:
for i in *.txt; do
sed -e 's/|/\n/g' "$i" > other_directory/"${i%.txt}".ics
done
No need to use xargs, especially with echo which would risk the filenames getting word split and having globbing apply to them, so could well do the wrong thing.
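Then we use sed's s command to substitute | with \n; the g flag makes it a global replace (note that \n in the replacement is understood by GNU sed but not by every sed). We redirect the output to the other directory you want and use the shell's parameter expansion to strip the .txt from the end of the name.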
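If your sed does not accept \n in the replacement, the same loop can be sketched with tr, which is what the question started from. The scratch directories and the sample file content below are placeholders created only for the demonstration; substitute your own source and destination directories.

```shell
# Demo setup: scratch source and destination directories (placeholders).
src=$(mktemp -d)
dest=$(mktemp -d)
printf 'a|b|c\n' > "$src/ag.txt"

for f in "$src"/*.txt; do
    [ -e "$f" ] || continue          # skip if the glob matched nothing
    name=${f##*/}                    # strip the directory part of the path
    # ${name%.txt} removes the .txt suffix; tr turns each | into a newline.
    tr '|' '\n' < "$f" > "$dest/${name%.txt}.ics"
done

result=$(cat "$dest/ag.ics")
printf '%s\n' "$result"
rm -rf "$src" "$dest"
```

Quoting "$f" and the output path keeps file names with spaces or glob characters intact, which is the same concern raised above about echo and xargs.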
Here's an awk solution:
$ awk '
FNR==1 { # for first record of every file
close(f) # close previous file f
f="path_to_dir/" FILENAME # new filename with path
sub(/txt$/,"ics",f) } # replace txt with ics
{
gsub(/\|/,"\n") # replace | with \n
print > f }' *.txt # print to new file

sed: Replacing a range of text with contents of a file

There are many examples here and elsewhere on the interwebs for using sed's 'r' to replace a pattern, but it does not seem to work on a range, but maybe I'm just not holding it right.
The following works as expected, deleting BEGIN PATTERN and replacing it with the contents of /tmp/somefile.
sed -n "/BEGIN PATTERN/{ r /tmp/somefile d }" TARGET_FILE
This, however, only replaces END PATTERN with the contents of /tmp/somefile.
sed -n "/BEGIN PATTERN/,/END PATTERN/ { r /tmp/somefile d }" TARGET_FILE
I suppose I could try perl or awk to do this as well, but it seems like sed should be able to do this.
I believe that this does what you want:
sed $'/BEGIN PATTERN/r somefile\n /BEGIN PATTERN/,/END PATTERN/d' file
Or:
sed -e '/BEGIN PATTERN/r somefile' -e '/BEGIN PATTERN/,/END PATTERN/d' file
How it works
/BEGIN PATTERN/r somefile
Whenever BEGIN PATTERN is found, this inserts the contents of somefile.
/BEGIN PATTERN/,/END PATTERN/d
Whenever we are in the range from a line matching /BEGIN PATTERN/ to a line matching /END PATTERN/, we delete (d) the contents of the pattern space.
Example
Let's consider these two test files:
$ cat file
prelude
BEGIN PATTERN
middle
END PATTERN
afterthought
and:
$ cat somefile
This is
New.
Our command produces:
$ sed $'/BEGIN PATTERN/r somefile\n /BEGIN PATTERN/,/END PATTERN/d' file
prelude
This is
New.
afterthought
This might work for you (GNU sed):
sed -e '/BEGIN PATTERN/,/END PATTERN/{/END PATTERN/!d;r somefile' -e 'd}' file
John1024's answer works if BEGIN PATTERN and END PATTERN are different. If this is not the case, the following works:
sed $'/PATTERN/,/PATTERN/d; 1,/PATTERN/ { /PATTERN/r somefile\n }' file
By preserving the pattern:
sed $'/PATTERN/,/PATTERN/ { /PATTERN/!d; }; 1,/PATTERN/ { /PATTERN/r somefile\n }' file
This solution can yield false positives if the pattern is not paired as potong pointed out.

filter specific attribute from a file

I have an input.txt file with the following text. I have to extract the values of the <id> elements.
- <ci>
<id>a573f0d014c18a5811793aedb5aad3</id>
<viewName>Windows</viewName>
</ci>
- <ci>
<id>7ad9088802ef62d75a15c9d4799fe8</id>
<viewName>Network</viewName>
</ci>
- <ci>
<id>abbbeeb60c4074bbc8483f321e0b43</id>
<viewName>Unix</viewName>
</ci>
Output should be like this:
a573f0d014c18a5811793aedb5aad3
7ad9088802ef62d75a15c9d4799fe8
abbbeeb60c4074bbc8483f321e0b43
With GNU grep you can use a positive lookbehind and a positive lookahead:
$ grep -oP '(?<=<id>).*(?=</id>)' file
a573f0d014c18a5811793aedb5aad3
7ad9088802ef62d75a15c9d4799fe8
abbbeeb60c4074bbc8483f321e0b43
Another grep alternative, based on the shape of the data (30 lowercase hex characters):
grep -o '[a-f0-9]\{30\}' file
Perl solution:
perl -lane 'print $1 if /^\s*<id>(\S+)<\/id>/' file
The regex captures the text between <id> and </id> into the variable $1.
These command-line options are used:
-n : loop over every line of the input file, putting the line in the $_ variable, without automatically printing every line
-l : remove newlines before processing and add them back afterwards
-a : autosplit mode; Perl automatically splits input lines on whitespace into the @F array
-e : execute the Perl code
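Where neither grep -P nor Perl is available, the same extraction can be sketched with plain sed. This assumes, as in the sample, that each <id> element sits alone on one line; it is not a general XML parser.

```shell
# Two records from the question's sample, written to a scratch file.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
- <ci>
<id>a573f0d014c18a5811793aedb5aad3</id>
<viewName>Windows</viewName>
</ci>
- <ci>
<id>7ad9088802ef62d75a15c9d4799fe8</id>
<viewName>Network</viewName>
</ci>
EOF

# -n suppresses the default output; the p flag prints only the lines on
# which the substitution succeeded, leaving just the text between the tags.
# ":" is used as the delimiter so the "/" in </id> needs no escaping.
ids=$(sed -n 's:.*<id>\(.*\)</id>.*:\1:p' "$tmp")
printf '%s\n' "$ids"
rm -f "$tmp"
```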

How to append a line after a search result?

So I grep for something in some file:
grep "import" test.txt | tail -1
In test.txt there is
import-one
import-two
import-three
some other stuff in the file
This will return the last search result:
import-three
Now how do I add some text after import-three but before "some other stuff in the file"? Basically I want to append a line, not at the end of the file but after a search result.
I understand that you want some text after each search result, which would mean after every matching line. So try
grep "import" test.txt | sed '/$/ a\Line to be added'
You can try something like this with sed
sed '/import-three/ a\
Line to be added' t
Test:
$ sed '/import-three/ a\
> Line to be added' t
import-one
import-two
import-three
Line to be added
some other stuff in the file
One way, assuming that you cannot distinguish between the different "import" lines: reverse the file with tac, find the first match (import-three) with sed, insert a line just before it (i\), and reverse the file again.
The :a ; n ; ba part is a loop that prints the remaining lines without testing them against /import/ again.
The command is spread over several lines because sed's insert command is particular about its syntax:
$ tac infile | sed '/import/ { i\
"some text"
:a
n
ba }
' | tac -
It yields:
import-one
import-two
import-three
"some text"
some other stuff in the file
Using ed:
ed test.txt <<END
$
?^import
a
inserted text
.
w
q
END
Meaning: go to the last line of the file, search backwards for the nearest line beginning with import, append the new lines below it (the input ends with a line containing only "."), then save and quit.
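The same idea, appending after the last match only, can also be sketched by building on the grep | tail pipeline from the question: take the line number of the last match and hand it to sed as an address. This is one possible approach rather than the only one; the scratch file stands in for test.txt.

```shell
# Scratch copy of the question's test.txt.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
import-one
import-two
import-three
some other stuff in the file
EOF

# Line number of the last line containing "import".
n=$(grep -n 'import' "$tmp" | tail -n 1 | cut -d: -f1)

# Append after that line only. The backslash-newline form of "a" is the
# POSIX spelling. The guard avoids running sed with an empty address
# (which would append after every line) when nothing matched.
if [ -n "$n" ]; then
    out=$(sed "${n}a\\
Line to be added" "$tmp")
fi
printf '%s\n' "$out"
rm -f "$tmp"
```

This prints the modified text and leaves the file itself unchanged; redirect the output to a new file if you want to keep it.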

sed to grep lines after specific line for further processing

I am working with a script which looks for file lines after a specific line and process them to get data from it.
Let me illustrate with an example,
if file "sample.log" has lines like
qwerty asdf foo bar
foo
time: 1:00 PM
foo1 bar1
foo foo fooo copying file abc/def/ghi/foo.txt
bar bar1 bar2 copying file efg/qwe/bar.txt
foo
My script should search for contents after time: 1:00 PM. After finding those lines, it must look for lines matching the pattern "copying" and get the path specified in the line.
In this case, output written to another file should be
abc/def/ghi/foo.txt
efg/qwe/bar.txt
I tried this using following command but getting empty string as output. Please guide me with this
sed -n '/^time: 1:00 PM/{/^(.*)copying file/s/^(.*)copying file //p}' ../../sample.log
If you're already in Tcl, you could code it in Tcl:
set fid [open "FILE" r]
set have_time false
while {[gets $fid line] != -1} {
if {$have_time && [regexp {copying file (.*)} $line -> filename]} {
puts $filename
} elseif {[string first "time:" $line] > -1} {
set have_time true
}
}
close $fid
If your file is very large, exec sed may be faster, but you will have to measure that yourself.
Note, if you're going to exec sed, keep in mind that inside Tcl, single quotes have no special meaning: use braces to quote the sed program.
exec sed -e {do stuff here} FILE
sed '/1:00 PM/,$ {/copying/s:.*file \(.*\):\1:p};d' FILE
This might work for you (GNU sed):
sed -ne '/1:00 PM/,$!b' -e 's/.*copying.* //w copy' file
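The two-step logic described in the question (ignore everything up to the time line, then pull the path out of each "copying file" line) can also be sketched with a flag variable in awk. The variable name seen is arbitrary, and the scratch file stands in for sample.log.

```shell
# Scratch copy of the question's sample.log.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
qwerty asdf foo bar
foo
time: 1:00 PM
foo1 bar1
foo foo fooo copying file abc/def/ghi/foo.txt
bar bar1 bar2 copying file efg/qwe/bar.txt
foo
EOF

# seen is unset (false) until the time line has been read; after that,
# every "copying file" line has the text up to and including
# "copying file " removed, so only the path is printed.
paths=$(awk '
    seen && /copying file / { sub(/.*copying file /, ""); print }
    /^time: 1:00 PM/        { seen = 1 }
' "$tmp")
printf '%s\n' "$paths"
rm -f "$tmp"
```

To write the paths to another file, as the question asks, redirect the awk output instead of capturing it in a variable.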

Resources