I'm looking for a simple shell script (with sed or awk) that comments out every line of a text file containing a given string. For example, a text file with the following:
line1 word1 word2
line2 word3 word4
line3 word5 word6
line4 word1 word7
line5 word10 word11
To be changed to:
#line1 word1 word2
line2 word3 word4
line3 word5 word6
#line4 word1 word7
line5 word10 word11
As you see, only the lines with the string "word1" are commented out.
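I believe this will do it for you (BSD/macOS sed syntax):
sed -i .backup '/[[:<:]]word1[[:>:]]/s/^/#/' file
With GNU sed, the suffix must be attached to -i and the word boundaries are written \< and \>:
sed -i.backup '/\<word1\>/s/^/#/' file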
I think your question is similar to "How do I add a comment (#) in front of a line after a key word search match".
Please correct me if I am wrong. I hope this helps.
Try this:
$ sed -e '/[[:<:]]word1[[:>:]]/ s/^/# /' < file
# line1 word1 word2
line2 word3 word4
line3 word5 word6
# line4 word1 word7
line5 word10 word11
How does this work? The sed man page says,
The form of a sed command is as follows:
[address[,address]]function[arguments]
Later in the man page, it clarifies that an address can be a regular expression, which causes the function to be applied to each line matching the regular expression. So what the command given above does is this: if the line contains the standalone word word1, apply the substitution function, which matches the empty string at the beginning of the line (the ^ anchor) and inserts "# " there.
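The [[:<:]] and [[:>:]] word-boundary classes used above are specific to BSD sed. As a rough portable sketch, the same "match, then prefix" idea can be written in awk by comparing each whitespace-separated field with the word. The function name comment_lines_with_word is made up for this illustration, and the sketch assumes the word is never attached to punctuation.

```shell
# comment_lines_with_word WORD FILE
# Hypothetical helper: prints FILE, putting "#" in front of every line
# that has WORD as one of its whitespace-separated fields.
comment_lines_with_word() {
    awk -v w="$1" '{
        prefix = ""
        # an exact field comparison, so "word1" does not match "word10"
        for (i = 1; i <= NF; i++) if ($i == w) { prefix = "#"; break }
        print prefix $0
    }' "$2"
}
```

It writes to standard output, so something like comment_lines_with_word word1 file > file.commented keeps the original file untouched.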
I have a list of words that I need to look for in more than one hundred text files.
My word list is in a file named word2search.txt.
This text file contains N words:
Word1
Word2
Word3
Word4
Word5
Word6
Wordn
So far I've written this bash script:
#!/bin/bash
listOfWord2Find=/home/mobaxterm/MyDocuments/word2search.txt
while IFS= read -r listOfWord2Find
do
echo "$listOfWord2Find"
grep -l -R "$listOfWord2Find" /home/mobaxterm/MyDocuments/txt/*.txt
echo "================================================================="
done <"$listOfWord2Find"
The result does not satisfy me; it is hard to do anything with it:
Word1
/home/mobaxterm/MyDocuments/txt/new 6.txt
/home/mobaxterm/MyDocuments/txt/file1.txt
/home/mobaxterm/MyDocuments/txt/file2.txt
/home/mobaxterm/MyDocuments/txt/file3.txt
=================================================================
Word2
/home/mobaxterm/MyDocuments/txt/new 6.txt
/home/mobaxterm/MyDocuments/txt/file1.txt
=================================================================
Word3
/home/mobaxterm/MyDocuments/txt/new 6.txt
/home/mobaxterm/MyDocuments/txt/file4.txt
/home/mobaxterm/MyDocuments/txt/file5.txt
/home/mobaxterm/MyDocuments/txt/file1.txt
=================================================================
Word4
/home/mobaxterm/MyDocuments/txt/new 6.txt
/home/mobaxterm/MyDocuments/txt/file1.txt
=================================================================
Word5
/home/mobaxterm/MyDocuments/txt/new 6.txt
=================================================================
This is what I want to see:
/home/mobaxterm/MyDocuments/txt/file1.txt : Word1, Word2, Word3, Word4
/home/mobaxterm/MyDocuments/txt/file2.txt : Word1
/home/mobaxterm/MyDocuments/txt/file3.txt : Word1
/home/mobaxterm/MyDocuments/txt/file4.txt : Word3
/home/mobaxterm/MyDocuments/txt/file5.txt : Word3
/home/mobaxterm/MyDocuments/txt/new 6.txt : Word1, Word2, Word3, Word4, Word5, Word6
I do not understand why my script doesn't show me Word6 (there are files that contain Word6). It stops at Word5. To work around this, I've added an extra line, blablabla, at the end of the list (a word I'm sure will never be found).
I'd appreciate any help with this :)
Thank you.
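A likely cause of the missing Word6, assuming word2search.txt does not end with a newline: read returns a non-zero status when it reaches end of file in the middle of a line, so the while loop exits before processing the last word, even though read has already stored it in the variable. A common workaround is to also test whether the variable is non-empty. The sketch below only shows the loop; read_words is a made-up name.

```shell
# read_words FILE
# Hypothetical helper: prints every line of FILE, including a final line
# that is not terminated by a newline.
read_words() {
    # "|| [ -n "$word" ]" keeps the loop going for an unterminated last line
    while IFS= read -r word || [ -n "$word" ]; do
        printf '%s\n' "$word"
    done < "$1"
}
```

Using a different variable name for the loop (word) than for the file path also avoids overwriting the path inside the loop.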
Another, much more elegant approach is to search for all the words in each file, one file at a time.
Use grep's multi-pattern option -f, --file=FILE, and print only the matched parts of each line with -o, --only-matching.
Then pipe the resulting words through sort, uniq, awk and sed to turn them into a comma-separated list.
Like this:
script.sh
#!/bin/bash
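for currFile in "$@"; do   # "$@" keeps file names containing spaces intact
    # collect the distinct matched words and join them with ", "
    matched_words_list=$(grep --only-matching --file="$WORDS_LIST" "$currFile" | sort | uniq | awk -v ORS=', ' 1 | sed 's/, $//')
    printf "%s : %s\n" "$currFile" "$matched_words_list"
done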
script.sh output
The word list file is passed in the environment variable WORDS_LIST.
The files to inspect are passed as arguments: input.*.txt
export WORDS_LIST=./words.txt; ./script.sh input.*.txt
input.1.txt : word1, word2
input.2.txt : word4
input.3.txt :
Explanation:
using words.txt:
word2
word1
word5
word4
using input.1.txt:
word1
word2
word3
word3
word1
word3
And post-process the grep output through the pipeline:
grep --file=words.txt -o input.1.txt |sort|uniq|awk -vORS=, 1|sed s/,$//
word1,word2
output 1
List all matched words from words.txt in inspected file input.1.txt
grep --file=words.txt -o input.1.txt
word1
word2
word1
output 2
List all matched words from words.txt in inspected file input.1.txt
Then sort the output words list
grep --file=words.txt -o input.1.txt|sort
word1
word1
word2
output 3
List all matched words from words.txt in inspected file input.1.txt
Then sort the output words list
Then remove duplicate words
grep --file=words.txt -o input.1.txt|sort|uniq
word1
word2
output 4
List all matched words from words.txt in inspected file input.1.txt
Then sort the output words list
Then remove duplicate words
Then create a csv list from the unique words
grep --file=words.txt -o input.1.txt|sort|uniq|awk -vORS=, 1
word1,word2,
output 5
List all matched words from words.txt in inspected file input.1.txt
Then sort the output words list
Then remove duplicate words
Then create a csv list from the unique words
Then remove the trailing , from the csv list
grep --file=words.txt -o input.1.txt|sort|uniq|awk -vORS=, 1|sed s/,$//
word1,word2
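One caveat with the pipeline above: grep treats each line of words.txt as a regular expression and also matches inside longer words, so word1 would be reported for a file that only contains word10. As a hedged variant, assuming a grep that supports -o and -w (GNU and BSD grep both do), -F makes the patterns fixed strings and -w restricts matches to whole words. paste -sd, - joins the lines without leaving a trailing comma. The function name matched_words is made up for this sketch.

```shell
# matched_words WORDS_FILE INSPECTED_FILE
# Hypothetical helper: prints the distinct words from WORDS_FILE that occur
# as whole words in INSPECTED_FILE, as one comma-separated line.
matched_words() {
    grep -o -w -F -f "$1" -- "$2" |   # one matched word per line
        sort -u |                     # sort and drop duplicates
        paste -sd, -                  # join the lines with ","
}
```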
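The suggested strategy is to scan each line once, testing it against all the words.
I suggest writing a gawk script. gawk is the default awk on many Linux distributions, and the ENDFILE pattern used below is a gawk extension.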
script.awk
FNR == NR {                                  # first file only: the list of words to match
    matchWordsArr[++wordsCount] = $0         # read the words into an ordered array
    matchedWordInFile[wordsCount] = 0        # initialize the per-word match counter
}
FNR != NR {                                  # each line of an inspected file
    for (i = 1; i <= wordsCount; i++) {      # test the line against every word, in list order
        if ($0 ~ matchWordsArr[i]) matchedWordInFile[i]++   # count a match for word i
    }
}
ENDFILE {                                    # gawk only: runs after each file has been read
    if (FNR != NR) {                         # skip the word-list file
        outputLine = sprintf("%s: ", FILENAME)        # start the output line with the file name
        for (i = 1; i <= wordsCount; i++) {           # iterate over the words, in list order
            if (matchedWordInFile[i] == 0) continue   # skip unmatched words
            outputLine = sprintf("%s%s%s", outputLine, separator, matchWordsArr[i])   # append the matched word
            matchedWordInFile[i] = 0                  # reset the counter for the next file
            separator = ","                           # later words are preceded by ","
        }
        print outputLine
    }
    outputLine = separator = ""              # reset the output line and the separator
}
input.1.txt:
word1
word2
word3
input.2.txt:
word3
word4
word5
input.3.txt:
word3
word7
word8
words.txt
word2
word1
word5
word4
running:
$ awk -f script.awk words.txt input.*.txt
input.1.txt: word2,word1
input.2.txt: word5,word4
input.3.txt:
Just grep:
grep -f list.txt input.*.txt
-f FILENAME makes grep read its search patterns from a file, one pattern per line.
When several files are given, grep already prefixes each match with its file name. To force the file name even when only one file is given, also pass -H:
grep -Hf list.txt input.*.txt
Consider the following file:
word1 word2 word3
word1 word2 word3
word6 word7 word8
word6 word7 word9
word9 word10 word4
word1 word2 word5
word1 word2 word5
I am looking for a shell command line that outputs the lines whose first two words differ from those of both the previous line and the next line.
Expected output:
word9 word10 word4
Any ideas?
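case 1: each line has the same number of words (fields)
uniq can skip leading fields but not trailing fields.
rev reverses the characters of each line.
Since each line has the same number of fields (with one trailing field to ignore), we can reverse every line, let uniq -u -f1 skip what is now the first field and keep only the lines that are not repeated, then reverse the result back: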
<file rev | uniq -u -f1 | rev
case 2: arbitrary number of words on each line
We can write an awk script that keeps track of the current and the previous two lines and prints the previous one when appropriate:
awk <file '
    {
        # does the current line differ from the previous line?
        diff = !( $1==p1 && $2==p2 )
        # print the stashed line if it differs from both of its neighbours
        if (diff && pdiff) print p0
        # stash the current line data
        pdiff=diff; p0=$0; p1=$1; p2=$2
    }
    END {
        # print the final line if appropriate
        if (pdiff) print p0
    }
'
I guess there is some redundancy here, but it works:
$ awk '{k=$1 FS $2}
k!=p && p!=pp {print p0}
{p0=$0; pp=p; p=k}
END {if(p!=pp) print}' file
word9 word10 word4
I have script1.pl and script2.pl. I want script2.pl to be able to read the value of $string that is set in script1.pl.
script1.pl
$string="word1 word2 word3 word4 word5 word6 word7 word8 word9";
$cmd="perl \"My\\File\\Path\\script2.pl\"";
system ($cmd);
script2.pl
print $string;
Note: I'm using perl for Windows.
Best practice is to use a module. See perlmod.
In your case, you can use require. Make sure the required file returns a true value by ending it with 1;. Note that script1.pl must not launch script2.pl with system when it is loaded this way, otherwise every run of script2.pl starts another one. Run script2.pl directly instead.
script1.pl:
#!/usr/bin/perl
use warnings;
use strict;
our $string = "word1 word2 word3 word4 word5 word6 word7 word8 word9";
1;
script2.pl:
#!/usr/bin/perl
use strict;
use warnings;
use vars qw($string);
require "./script1.pl";   # explicit ./ because Perl 5.26+ no longer searches the current directory
print $string, "\n";
Output:
word1 word2 word3 word4 word5 word6 word7 word8 word9
While you can make that work, you're much better off passing the data as command-line arguments or, if there's a lot of it, via STDIN.
# script1.pl
my @words = qw[word1 word2 word3 word4 word5 word6 word7 word8 word9];
# $^X is the path of the running perl; the list form of system passes each word as a separate argument
system $^X, "My\\File\\Path\\script2.pl", @words;
# script2.pl
print join ", ", @ARGV;
This doesn't scale well. You're better off rewriting script2.pl as a library and calling a function.
# mylibrary.pl
sub print_stuff {
    print join ", ", @_;
}
1;   # a required file must return a true value
# script1.pl
require './mylibrary.pl';
print_stuff(qw[word1 word2 word3 word4 word5 word6 word7 word8 word9]);
For a handful of functions this will work fine. Eventually you'll want to look into writing modules.
I want to copy the value of the last column to the first position and comment out the old value.
For example:
word1 word2 1233425 -----> 1233425 word1 word2 #1233425
word1 word2 word3 49586 -----> 49586 word1 word2 word3 #49586
I don't know the number of words preceding the number.
I tried with an awk script:
awk '{$1="";score=$NF;$NF="";print $score $0 #$score}' file
But it does not work.
What about this? It is pretty similar to yours.
$ awk '{score=$NF; $NF="#"$NF; print score, $0}' file
1233425 word1 word2 #1233425
49586 word1 word2 word3 #49586
Note that in your case you are emptying $1, which is not necessary. Just store score as you did and then add # to the beginning of $NF.
Using awk
awk '{f=$NF;$NF="#" $NF;print f,$0}' file
Since we posted the same answer, here is a shorter variation :)
awk '{$0=$NF FS$0;$NF="#"$NF}1' file
$0=$NF FS$0 prepends the last field to the line.
$NF="#"$NF adds # in front of the last field.
1 prints the line.
A perl way to do it:
perl -pe 's/^(.+) (\d+)$/$2 $1 #$2/' infile
A sed way to do it:
sed 's/\(.*\) \([^[:blank:]]\{1,\}\)/\2 \1 #\2/' YourFile
With GNU sed, add the --posix option.
I'm trying to solve a problem using the sed command.
I have a table of data (a few rows and columns).
I want to be able to replace the string at position i,j with a new string.
For example:
word1 word2 word3 word4
word5 word6 word7 word8
word9 word10 word11 word12
With the input 1,1 and abc, it should return:
word1 word2 word3 word4
word5 abc word7 word8
word9 word10 word11 word12
And, if possible, write the result to a new file.
Thanks
Using awk might be easier:
awk -v c=1 -v r=1 -v w='abc' 'NR==r+1{$(c+1)=w}1' file
word1 word2 word3 word4
word5 abc word7 word8
word9 word10 word11 word12
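To also cover the "print it to a new file" part of the question, the awk output can simply be redirected. A minimal sketch wrapping the command above follows. The function name replace_cell and its argument order are assumptions made for this illustration, and row and column are 0-based as in the question.

```shell
# replace_cell ROW COL VALUE INFILE OUTFILE
# Hypothetical helper: copies INFILE to OUTFILE, replacing the field at
# 0-based position ROW,COL with VALUE. Fields are whitespace-separated and
# the modified line is rewritten with single spaces between fields.
replace_cell() {
    awk -v r="$1" -v c="$2" -v w="$3" 'NR == r + 1 { $(c + 1) = w } 1' "$4" > "$5"
}
```

For example, replace_cell 1 1 abc file newfile leaves file unchanged and writes the modified table to newfile. OUTFILE must be a different file from INFILE, because the redirection truncates it before awk starts reading.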