find and replace in place with grep and sed (and make a log for the files changed) - bash

My script is as follows (the variables are defined earlier from user input):
grep -RlI $OLD $PATH > $LIST
while read line
do
FILE=echo $line
sed -i '' -e 's|$OLD|$NEW|g' $FILE
done < $LIST
It seems to work except that sed fails because
"sed: -i may not be used with stdin"
What am I doing wrong? Maybe that's the wrong approach for what I am trying to do?
(which, by the way, is to replace occurrences of a string in many files, AND to create a file that lists all files that contain a match.)
Many thanks,
C

Try replacing
FILE=echo $line
with
FILE="$line"
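sed is complaining because $FILE is empty. FILE=echo $line does not capture the output of echo: the shell treats FILE=echo as a temporary environment assignment and then tries to run the contents of $line as a command, so FILE is never set in the script itself. With no file argument, sed falls back to reading stdin, which -i does not allow. Also examine the contents of the file referenced by $LIST and make sure there are no empty lines or lines with just whitespace. Note too that the single quotes around 's|$OLD|$NEW|g' stop the shell from expanding $OLD and $NEW, so use double quotes there.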
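Putting those fixes together, a minimal sketch of the whole task might look like the following. The sample files and the names workdir, search_dir, old, new and list are made up for illustration, and the sketch assumes the search string contains no | and no regex metacharacters. It uses search_dir instead of PATH because PATH is the variable the shell uses to locate commands, and -i.bak because that spelling is accepted by both GNU and BSD sed.

```shell
# Demo data in a throwaway directory (hypothetical file names).
workdir=$(mktemp -d)
printf 'hello foo\n' > "$workdir/a.txt"
printf 'nothing here\n' > "$workdir/b.txt"

old='foo'
new='bar'
search_dir=$workdir     # do not call this PATH: the shell needs PATH to find grep and sed
list=$(mktemp)          # log of matching files, kept outside the searched tree

# 1. Log every text file that contains a match.
grep -RlI -- "$old" "$search_dir" > "$list"

# 2. Edit each logged file in place; double quotes let $old and $new expand.
while IFS= read -r file; do
    sed -i.bak "s|$old|$new|g" "$file"
done < "$list"

cat "$list"
```

Each edited file keeps a .bak copy next to it, which can be deleted once the result has been checked.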

sed -i -r 's/\$[[:alnum:]]{32}-[[:digit:]]{8}\$[[:alnum:]+\.\_\-]{2,3}#[[:alnum:]+\.\_\-]*/****/' *.log
My variant, which replaces data like $1BC29B36F623BA82AAF6724FD3B16718-17082022$2sy#domain4.name with ****.
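Before running an expression like this with -i, it can be checked by piping a sample line through it. The sketch below uses -E (the portable spelling of -r) and a slightly tidied bracket expression; the surrounding "user=" and "end" text is invented for the test.

```shell
# Single quotes keep the shell from expanding the $ signs in the sample.
sample='$1BC29B36F623BA82AAF6724FD3B16718-17082022$2sy#domain4.name'

# Mask the token but leave the rest of the line untouched.
masked=$(printf 'user=%s end\n' "$sample" |
    sed -E 's/\$[[:alnum:]]{32}-[[:digit:]]{8}\$[[:alnum:]+._-]{2,3}#[[:alnum:]+._-]*/****/')

printf '%s\n' "$masked"
```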

Related

How to remove characters in filename up to and including second underscore

I've been looking around for a while on this and can't seem to find a solution on how to use sed to do this. I have a file that is named:
FILE_772829345_D594_242_25kd_kljd.mov
that I want to be renamed
D594_242_25kd_kljd.mov
I have been trying to get sed to do this, but so far I have only been able to remove the first section of the filename:
echo 'FILE_772829345_D594_242_25kd_kljd.mov' | sed 's/[^_]*//'
_772829345_D594_242_25kd_kljd.mov
How would I get sed to do the same instruction again, up to the second underscore?
If the filename is in a shell variable, you don't even need to use sed, just use a shell expansion with # to trim through the second underscore:
filename="FILE_772829345_D594_242_25kd_kljd.mov"
echo "${filename#*_*_}" # prints "D594_242_25kd_kljd.mov"
BTW, if you're going to use mv to rename the file, use its -i option to avoid file getting overwritten if there are any name conflicts:
mv -i "$filename" "${filename#*_*_}"
If all your files are named similarly, you can use cut which would be a lot simpler than sed with a regex:
cut -f3- -d_ <<< "FILE_772829345_D594_242_25kd_kljd.mov"
Output:
D594_242_25kd_kljd.mov
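To apply the shell expansion to several files at once, a loop along these lines should work. It is a sketch: the temporary directory and the second file name are made up, and it assumes every file of interest has at least two underscores in its name.

```shell
# Demo files in a throwaway directory.
workdir=$(mktemp -d)
touch "$workdir/FILE_772829345_D594_242_25kd_kljd.mov" "$workdir/FILE_1_A_b.mov"

for path in "$workdir"/*_*_*; do
    name=${path##*/}                            # strip the directory part
    mv -i -- "$path" "$workdir/${name#*_*_}"    # drop everything through the second underscore
done

ls "$workdir"
```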

Unix shell scripting, need assign the text files values to the sed command

I am trying to feed the lines of a text file into a sed command.
observered_list.txt
Uncaught SlingException
cannot render resource
IncludeTag Error
Recursive invocation
Reference component error
I need the result to look like the following:
sed '/Uncaught SlingException\|cannot render resource\|IncludeTag Error\|Recursive invocation\|Reference component error/ d'
Please help me do this.
I would suggest you create a sed script and delete each pattern consecutively:
# write one delete command per pattern (> rather than >>, so reruns don't append duplicates)
while read -r pattern; do
printf "/%s/ d;\n" "$pattern"
done < observered_list.txt > remove_patterns.sed
# now invoke sed on the file you want to modify
sed -f remove_patterns.sed file_to_clean
Alternatively you could construct the sed command like this:
pattern=
while read -r line; do
pattern=$pattern'\|'$line
done < observered_list.txt
# strip off the first and last \|
pattern=${pattern#\\\|}
pattern=${pattern%\\\|}
printf "sed '/%s/ d'\n" "$pattern"
# you still need to invoke the command, it's just printed
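To run the command rather than just print it, the pattern can be passed to sed directly. This is a sketch under a few assumptions: it joins the lines with | and uses sed -E, because \| inside a basic regex is a GNU extension; the patterns contain no / or regex metacharacters; and the sample log lines are invented.

```shell
# Demo data in a throwaway directory.
workdir=$(mktemp -d)
printf '%s\n' 'Uncaught SlingException' 'cannot render resource' > "$workdir/observered_list.txt"
printf '%s\n' 'ok line' 'x Uncaught SlingException y' 'cannot render resource: z' 'another ok line' > "$workdir/app.log"

pattern=
while IFS= read -r line; do
    [ -n "$line" ] || continue              # an empty alternative would match every line
    pattern=${pattern:+$pattern|}$line      # add "|" only between entries
done < "$workdir/observered_list.txt"

# Delete every line that matches any of the alternatives.
result=$(sed -E "/$pattern/d" "$workdir/app.log")
printf '%s\n' "$result"
```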
You can use grep for that:
grep -vFf /file/with/patterns.txt /file/to/process.txt
Explanation:
-v excludes lines of process.txt which match one of the patterns from output
-F treats patterns in patterns.txt as fixed strings instead of regexes (looks like this is desired here)
-f reads patterns from patterns.txt
Check man grep for further information.
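A small end-to-end run of the grep approach, using made-up file contents in a temporary directory, might look like this:

```shell
# Demo data (hypothetical contents).
workdir=$(mktemp -d)
printf '%s\n' 'IncludeTag Error' 'Recursive invocation' > "$workdir/patterns.txt"
printf '%s\n' 'request ok' 'IncludeTag Error in page' 'done' > "$workdir/process.txt"

# Keep only the lines that contain none of the fixed strings.
grep -vFf "$workdir/patterns.txt" "$workdir/process.txt" > "$workdir/cleaned.txt"

cat "$workdir/cleaned.txt"
```

Unlike sed -i, grep writes to a new file, so the result has to be moved over the original afterwards if an in-place change is wanted.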

sed delete not working with cat variable

I have a file named test-domain, the contents of which contain the line 100.am.
When I do this, the line with 100.am is deleted from the test-domain file, as expected:
for x in $(echo 100.am); do sed -i "/$x/d" test-domain; done
However, if instead of echo 100.am, I read each line from a file named unwanted-lines, it does NOT work.
for x in $(cat unwanted-lines); do sed -i "/$x/d" test-domain; done
This happens even if the only content of unwanted-lines is a single line containing exactly 100.am.
Does anyone know why sed delete line works if you use echo in your variable, but not if you use cat?
grep -F -v -f unwanted-lines test-domain > /tmp/Buffer
mv /tmp/Buffer test-domain
sed is a poor fit here because the loop calls it once per unwanted line, which is inefficient and wastes resources. You could still use sed by preloading the lines to delete and matching against them, but that is much heavier than grep -F (the current spelling of fgrep) in this case.
Does anyone know why sed delete line works if you use echo in your
variable, but not if you use cat?
I believe your file of unwanted lines has CR+LF line endings, which is why it doesn't work when you read from the file: each $x then ends in a carriage return, so the pattern no longer matches. You could strip the CR in your loop:
for x in $(cat unwanted-lines); do x="${x//$'\r'}"; sed -i "/$x/d" test-domain; done
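If CR+LF endings are indeed the cause (an assumption; od -c unwanted-lines would confirm it), another option is to strip the carriage returns once with tr and let a single grep do the filtering. The file contents and the temporary directory below are made up for the demonstration.

```shell
# Demo data: the pattern file has Windows line endings.
workdir=$(mktemp -d)
printf '100.am\r\n' > "$workdir/unwanted-lines"
printf '%s\n' 'keep.example' '100.am' > "$workdir/test-domain"

# Strip the CRs, then read the fixed-string patterns from stdin (-f -).
tr -d '\r' < "$workdir/unwanted-lines" |
    grep -vFf - "$workdir/test-domain" > "$workdir/test-domain.new"
mv "$workdir/test-domain.new" "$workdir/test-domain"

cat "$workdir/test-domain"
```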
A better strategy would be to use a genuine editor, e.g., ed, like so:
ed -s test-domain < <(
shopt -s extglob
while IFS= read -r l; do
[[ $l = *([[:space:]]) ]] && continue
l=${l//./\\.}
echo "g/$l/d"
done < unwanted-lines
echo "wq"
)
Caveat. You must make sure that the file unwanted-lines doesn't contain any character that could clash with ed's regexps and commands. I have already included a match for a period (i.e., replace . with \.).
This method is quite efficient, as you're not forking so many times on sed, writing temp files, renaming them, etc.
Another possibility would be to use grep, but then you won't have the editing option ed offers.
Remark. ed is the standard editor.
Why not just apply the sed command directly to your file?
sed -i '/.*100\.am/d' your_file

Trying to write a script to clean <script.aa=([].slice+'hjkbghkj') from multiple htm files, recursively

I am trying to modify a bash script to remove a glob of malicious code from a large number of files.
The community will benefit from this, so here it is:
#!/bin/bash
grep -r -l 'var createDocumentFragm' /home/user/Desktop/infected_site/* > /home/user/Desktop/filelist.txt
for i in $(cat /home/user/Desktop/filelist.txt)
do
cp -f $i $i.bak
done
for i in $(cat /home/user/Desktop/filelist.txt)
do
$i | sed 's/createDocumentFragm.*//g' > $i.awk
awk '/<\/SCRIPT>/{p=1;print}/<\/script>/{p=0}!p'
This is where the script bombs out with this message:
+ for i in '$(cat /home/user/Desktop/filelist.txt)'
+ sed 's/createDocumentFragm.*//g'
+ /home/user/Desktop/infected_site/index.htm
I get 2 errors and the script stops.
/home/user/Desktop/infected_site/index.htm: line 1: syntax error near unexpected token `<'
/home/user/Desktop/infected_site/index.htm: line 1: `<html><head><script>(function (){ '
I have the first 2 parts done.
The files containing createDocumentfragm have been enumerated in a text file correctly.
The files listed in filelist.txt have been duplicated in their original location with .bak appended, i.e. infected_site/some_directory/infected_file.htm and infected_file.htm.bak,
effectively making sure we have a backup.
All I need to do now is write an awk command that takes the list of files in filelist.txt, uses the entire glob of malicious text as a pattern, and removes it from the files. Using just the uppercase SCRIPT tag as the starting point and the lowercase script tag as the end point is too generic and could delete legitimate text.
I suspect this may help me, but I don't know how to use it correctly.
http://backreference.org/2010/03/13/safely-escape-variables-in-awk/
Once I have this part figured out, and after you have verified that the files weren't mangled you can do this to clean out the bak files:
for i in $(cat /home/user/Desktop/filelist.txt)
do
rm -f $i.bak
done
Several things:
You have:
$i | sed 's/var createDocumentFragm.*//g' > $i.awk
You probably meant this (keeping your use of cat, which we'll talk about in a moment):
cat $i | sed 's/var createDocumentFragm.*//g' > $i.awk
You're treating each file in your file list as if it was a command and not a file.
Now, about your use of cat. If you're using cat for almost anything but concatenating multiple files together, you probably are doing something not quite right. For example, you could have done this:
sed 's/var createDocumentFragm.*//g' "$i" > $i.awk
Note that I don't have to print the file to STDOUT and then pipe that into sed. The sed command can take the file name directly.
I'm also a bit confused about the awk statement. Exactly which file are you running awk on? Your awk statement uses STDIN and STDOUT, so it reads from the script's standard input rather than from any of your files, and prints its output on the screen. Is the sed statement supposed to feed into the awk statement?
You also want to avoid for loops over a list of files. They are inefficient and can overflow the command line. That is rarely an issue today, but it can bite you when you least expect it. The reason is that $(cat /home/user/Desktop/filelist.txt) must run to completion before the for loop can even start.
A little rewriting of your program:
cd ~/Desktop
grep -r -l 'var createDocumentFragm' infected_site/* > filelist.txt
while IFS= read -r file
do
cp -f "$file" "$file.bak"
sed 's/var createDocumentFragm.*//g' "$file" > "$file.awk"
# awk is given no file here, so it reads the loop's standard input
awk '/<\/SCRIPT>/{p=1;print}/<\/script>/{p=0}!p'
done < filelist.txt
We can use one loop, and we made it a while loop. I could even feed the grep into that while loop:
grep -r -l 'var createDocumentFragm' infected_site/* | while IFS= read -r file
do
cp -f "$file" "$file.bak"
sed 's/var createDocumentFragm.*//g' "$file" > "$file.awk"
# awk is given no file here, so it reads the loop's standard input
awk '/<\/SCRIPT>/{p=1;print}/<\/script>/{p=0}!p'
done
and then I don't even have to create a temporary file.
Let me know what's going on with the awk. I suspect you wanted something like this:
grep -r -l 'var createDocumentFragm' infected_site/* | while IFS= read -r file
do
cp -f "$file" "$file.bak"
sed 's/var createDocumentFragm.*//g' "$file" \
| awk '/<\/SCRIPT>/{p=1;print}/<\/script>/{p=0}!p' > "$file.awk"
done
Also note that I put quotes around the file names. This prevents problems if a file name has a space in it.
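As a self-contained illustration of the backup-then-clean loop, here is a sketch that runs against a made-up site tree whose directory name contains a space. It covers only the sed step, which removes everything from the marker to the end of that line; the awk stage is left out because its intent is still unclear. The file list is written outside the tree before any file is touched, and the sketch assumes file names contain no newlines.

```shell
# Demo tree (hypothetical contents); note the space in the directory name.
workdir=$(mktemp -d)
site="$workdir/infected site"
mkdir -p "$site"
printf '%s\n' '<html>' 'var createDocumentFragm = evil();' '</html>' > "$site/index.htm"
printf '%s\n' '<html>clean</html>' > "$site/ok.htm"

# Record the infected files first, outside the tree being searched.
list=$(mktemp)
grep -r -l 'var createDocumentFragm' "$site" > "$list"

while IFS= read -r file; do
    cp -f -- "$file" "$file.bak"                               # keep a backup
    sed 's/var createDocumentFragm.*//g' "$file.bak" > "$file" # rewrite from the backup
done < "$list"

cat "$site/index.htm"
```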

using sed to find and replace in bash for loop

I have a large number of words in a text file to replace.
This script is working up until the sed command where I get:
sed: 1: "*.js": invalid command code *
PS... Bash isn't one of my strong points - this doesn't need to be pretty or efficient
cd '/Users/xxxxxx/Sites/xxxxxx'
echo `pwd`;
for line in `cat myFile.txt`
do
export IFS=":"
i=0
list=()
for word in $line; do
list[$i]=$word
i=$[i+1]
done
echo ${list[0]}
echo ${list[1]}
sed -i "s/{$list[0]}/{$list[1]}/g" *.js
done
You're running BSD sed (under OS X), therefore the -i flag requires an argument specifying what you want the suffix to be.
Also, no files match the glob *.js.
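Both points can be handled in a few lines. The sketch below uses a made-up app.js in a temporary directory; it attaches the backup suffix to -i, a spelling that both GNU and BSD sed accept, and checks that the glob matched something before calling sed.

```shell
# Demo file (hypothetical contents).
workdir=$(mktemp -d)
printf 'var a = 1;\n' > "$workdir/app.js"

# Expand the glob first. If nothing matches, "$1" is the literal pattern,
# the -e test fails, and sed is never handed a bogus file name.
set -- "$workdir"/*.js
if [ -e "$1" ]; then
    sed -i.bak 's/var/let/g' "$@"
else
    echo "no .js files in $workdir" >&2
fi

cat "$workdir/app.js"
```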
This looks like a simple typo:
sed -i "s/{$list[0]}/{$list[1]}/g" *.js
Should be:
sed -i "s/${list[0]}/${list[1]}/g" *.js
(just like the echo lines above)
So myFile.txt contains a list of from:to substitutions, and you are looping over each of those. Why don't you create a sed script from this file instead?
cd '/Users/xxxxxx/Sites/xxxxxx'
sed -e 's/^/s:/' -e 's/$/:/' myFile.txt |
# Output from first sed script is a sed script!
# It contains substitutions like this:
# s:from:to:
# s:other:substitute:
sed -f - -i~ *.js
Your sed might not like -f -, which tells sed to read its script from standard input. If that is the case, you can create a temporary script file instead:
sed -e 's/^/s:/' -e 's/$/:/' myFile.txt >script.sed
sed -f script.sed -i~ *.js
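A runnable version of the temporary-script variant, with invented sample data, could look like this. One deliberate difference from the commands above: the generated substitutions end in :g so that every occurrence on a line is replaced, as in the question's s/.../.../g. It assumes neither side of a from:to pair contains a colon or regex metacharacters.

```shell
# Demo data in a throwaway directory.
workdir=$(mktemp -d)
printf '%s\n' 'foo:bar' 'old:new' > "$workdir/myFile.txt"
printf 'foo and old and foo\n' > "$workdir/app.js"

# Turn each "from:to" line into "s:from:to:g".
sed -e 's/^/s:/' -e 's/$/:g/' "$workdir/myFile.txt" > "$workdir/script.sed"

# Apply all substitutions in one pass, keeping .bak backups.
sed -i.bak -f "$workdir/script.sed" "$workdir"/*.js

cat "$workdir/app.js"
```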
Another approach, if you don't feel very confident with sed and think you will forget within a week what those voodoo symbols mean, is to use IFS more efficiently:
# IFS=: applies only to read, which splits each line into fields at ":"
while IFS=: read -r PATTERN REPLACEMENT
do
sed -i~ "s/${PATTERN}/${REPLACEMENT}/g" *.js
done < myFile.txt
The only pitfall I can see (there may be more) is that if either PATTERN or REPLACEMENT contains a slash (/), it will break your sed expression.
You can change the sed separator to a non-printable character and you should be safe.
Anyway, if you know what's in your myFile.txt, you can use any character that doesn't appear in it.
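As a sketch of the non-printable separator idea, the loop below uses the control character \001 as the s command delimiter, so slashes in either field are harmless. The sample data is invented, and the fields are still interpreted as regular expressions, so other metacharacters would need separate handling.

```shell
# Demo data: both fields contain slashes.
workdir=$(mktemp -d)
printf '%s\n' 'src/lib:dist/lib' > "$workdir/myFile.txt"
printf 'import "src/lib/x";\n' > "$workdir/app.js"

d=$(printf '\001')      # a delimiter that will not occur in ordinary text

while IFS=: read -r pattern replacement; do
    sed -i.bak "s${d}${pattern}${d}${replacement}${d}g" "$workdir"/*.js
done < "$workdir/myFile.txt"

cat "$workdir/app.js"
```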

Resources