I have a small script which generates a menu of all the scripts in my ~/scripts folder, displaying next to each one a sentence describing it; that sentence is the third line of the script, commented out. I then plan to pipe this into fzf or dmenu to select a script and start editing it or whatever.
1 #!/bin/bash
2
3 # a script to do
So it would look something like this:
foo.sh a script to do X
bar.sh a script to do Y
Currently I have it run a for loop over all the files in the scripts folder and then run sed -n 3p on all of them.
for i in $(ls -1 ~/scripts); do
echo -n "$i"
sed -n 3p ~/scripts/"$i"
echo
done | column -t -s '#' | ...
I was wondering if there is a more efficient way of doing this that does not involve a for loop and only uses sed. Any help will be appreciated. Thanks!
Instead of a loop that parses ls output and runs sed on each file, you may try this awk command:
awk 'FNR == 3 {
f = FILENAME; sub(/^.*\//, "", f); print f, $0; nextfile
}' ~/scripts/* | column -t -s '#' | ...
Yes, there is a more efficient way, but no, it doesn't only use sed. This is probably a silly optimization for your use case, but it may be worthwhile nonetheless.
The inefficiency is that you're using ls to read the directory and then parsing its output. For large directories, that causes a lot of overhead for keeping the whole list in memory even though you only traverse it once. It's also not done correctly: consider filenames with special characters that the shell interprets.
The more efficient way is to use find in combination with its -exec option, which starts a second program with each found file in turn.
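For instance, a minimal sketch of that pattern applied to the original problem (assuming GNU or BSD find for -maxdepth, and that the description really sits on line 3 of each script):
find ~/scripts -maxdepth 1 -type f -exec sh -c '
  printf "%s" "${1##*/}"   # basename of the script, no trailing newline
  sed -n 3p "$1"           # then its third line (the commented description)
' sh {} \;
With -exec ... \; one sh is started per file, as described above; the output can be piped to column -t -s '#' exactly like the loop version.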
BTW: If you didn't rely on line numbers but maybe a tag to mark the description, you could also use grep -r, which avoids an additional process per file altogether.
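For example, if every script carried a hypothetical tag line such as # DESC: a script to do X, one grep -r call would collect all descriptions at once (the DESC: tag is purely illustrative):
grep -r '^# DESC:' ~/scripts
# prints lines like: /home/user/scripts/foo.sh:# DESC: a script to do X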
This might work for you (GNU sed):
sed -sn '1h;3{H;g;s/\n/ /p}' ~/scripts/*
Use the -s option to reset the line number addresses for each file.
Copy line 1 to the hold space.
Append line 3 to the hold space.
Swap the hold space for the pattern space.
Replace the newline with a space and print the result.
All files in the directory ~/scripts will be processed.
N.B. You may wish to replace the space delimiter by a tab or pipe the results to the column command.
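For example, a small sketch combining both suggestions (GNU sed understands \t in the replacement):
sed -sn '1h;3{H;g;s/\n/\t/p}' ~/scripts/* | column -t -s$'\t'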
Related
I need to see the last characters of a bunch of text files (or alternatively test whether they are "}" and give a list of files that test negative). Is there an easy way to do this from the command line?
(Ideally the solution works without reading the whole file from the start, because in addition to there being many files, they can also be quite large.)
P.S.: Any answer would be great, but I would really appreciate it if the function and syntax of everything in the answer could be fully explained.
It can be done fairly easily with tail and then string indexing in bash. For example, you obtain the last line of a file with tail -n1 file. You will need to store the line in a variable using command substitution, e.g.
lastln=$(tail -n1 file)
Then it is simply a matter of indexing the last characters, e.g.
echo ${lastln:(-1)}
(Note: when indexing from the end of the string, you must either put the offset (e.g. -1) in parentheses, as in ${lastln:(-1)}, or leave a space before the -1; e.g. echo ${lastln: -1} is also valid.)
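Putting the two together, a short sketch of the test described in the question (list files whose last character is not "}"), assuming the files of interest match *.txt:
for f in *.txt; do
  lastln=$(tail -n1 "$f")                  # last line; $(...) strips the trailing newline
  [ "${lastln:(-1)}" = "}" ] || echo "$f"  # print the files that test negative
done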
You can try this:
for file in file1 file2; do tail -n 1 "$file" | grep -q '}$' || echo "$file"; done
where you should replace file1 file2 with the list of files you want to analyze, e.g. * or the like. Now what happens here? The outer part
for file in file1 file2; do ...; done
is a simple loop over the files, where inside the loop, you can refer to the current file as $file. Then,
tail -n 1 "$file"
prints the last line of the given file and
| grep -q '}$'
redirects the output to grep (turned into silent mode with -q), which looks for '}' immediately followed by the end of the line ($). The return value of this command can be used to chain another action: when grep returns non-zero (indicating failure, i.e., the pattern is not matched), the last part
|| echo "$file"
is executed, resulting in the list of files you need.
I was wondering if there is a way to delete everything after a certain line of a text file in bash. Say there's a text file with 10 lines, and I want to delete every line after line number 4 so that only the first 4 lines remain. How would I go about doing that?
You can use GNU sed:
sed -i '5,$d' file.txt
That is, 5,$ means the range line 5 until the end, and d means to delete.
Only the first 4 lines will remain.
The -i flag tells sed to edit the file in-place.
If you have only BSD sed, then the -i flag requires an explicit argument for the backup file suffix:
sed -i.bak '5,$d' file.txt
As @ephemient pointed out, while this solution is simple,
it's inefficient because sed will still read the input until the end of the file, which is unnecessary.
As @agc pointed out, the inverse logic of my first proposal might actually be more intuitive. That is, do not print by default (-n flag),
and explicitly print range 1,4:
sed -ni.bak 1,4p file.txt
Another simple alternative, assuming that the first 4 lines are not excessively long and so they easily fit in memory, and also assuming that the 4th line ends with a newline character,
you can read the first 4 lines into memory and then overwrite the file:
lines=$(head -n 4 file.txt)
echo "$lines" > file.txt
Minor refinements on Janos' answer, ephemient's answer, and cdark's comment:
Simpler (and faster) sed code:
sed -i 4q file
When a filter util can't directly edit a file, there's
sponge:
head -4 file | sponge file
Most efficient on Linux might be coreutils' truncate, which offers the same minimal I/O as ephemient's more portable (but more complex) dd-based answer:
truncate -s `head -4 file | wc -c` file
The sed method that @janos suggested is simple but inefficient. It will read every line from the original file, even ones it could ignore (although that can be fixed using 4q), and -i actually creates a new file (which it renames to replace the original file). And there's the annoying bit where you need to use sed -i '5,$d' file.txt with GNU sed but sed -i '' '5,$d' file.txt with BSD sed in order to edit the file in place without leaving a backup.
Another method that performs less I/O:
dd bs=1 count=0 if=/dev/null of=file.txt \
seek=$(grep -b ^ file.txt | tail -n+5 | head -n1 | cut -d: -f1)
grep -b ^ file.txt prints out byte offsets on each line, e.g.
$ yes | grep -b ^
0:y
2:y
4:y
...
tail -n+5 skips the first 4 lines, outputting the 5th and subsequent lines
head -n1 takes only the next line (e.g. only the 5th line)
After head reads the one line, it will exit. This causes tail to exit because it has nowhere to output to anymore. This causes grep to exit for the same reason. Thus, the rest of file.txt does not need to be examined.
cut -d: -f1 takes only the first part before the : (the byte offset)
dd bs=1 count=0 if=/dev/null of=file.txt seek=N
using a block size of 1 byte, seek to block N of file.txt
copy 0 blocks of size 1 byte from /dev/null to file.txt
truncate file.txt here (because conv=notrunc was not given)
In short, this removes all data on the 5th and subsequent lines from file.txt.
On Linux there is a command named fallocate which can similarly extend or truncate a file, but that's not portable.
UNIX filesystems support efficiently truncating files in-place, and these commands are portable. The downside is that it's more work to write out.
(Also, dd will print some unnecessary stats to stderr, and will exit with an error if the file has fewer than 5 lines, although in that case it will leave the existing file contents in place, so the behavior is still correct. Those can be addressed also, if needed.)
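For example, a small sketch addressing both issues: suppress dd's stats and skip the truncation entirely when the file has fewer than 5 lines (in that case the grep|tail|head pipeline produces no offset):
offset=$(grep -b ^ file.txt | tail -n+5 | head -n1 | cut -d: -f1)
if [ -n "$offset" ]; then
  dd bs=1 count=0 if=/dev/null of=file.txt seek="$offset" 2>/dev/null
fi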
If I don't know the line number, merely the line content (I need to know that there is nothing below the line containing 'knowntext' that I want to preserve), then I use
sed -i '/knowntext/,$d' inputfilename
to directly alter the file, or to be cautious
sed '/knowntext/,$d' inputfilename > outputfilename
where inputfilename is unaltered, and outputfilename contains the truncated version of the input.
I am not competent to comment on the efficiency of this, but I know that files of 20kB or so are dealt with faster than I can blink.
Using GNU awk (v. 4.1.0+, which introduced the -i inplace extension). First we create a test file (NOTICE THE DISCLAIMER):
$ seq 1 10 > file # THIS WILL OVERWRITE FILE NAMED file WITH TEST DATA
Then the code and validation (WILL MODIFY THE ORIGINAL FILE NAMED file):
$ awk -i inplace 'NR<=4' file
$ cat file
1
2
3
4
Explained:
$ awk -i inplace ' # the edit is targeted at the original file (try without -i ...)
NR<=4 # output first 4 records
' file # file
You could also exit on line NR==5, which would be quicker if you redirect the output of the program to a new file (remove the # below to enable that); it would be the same as head -4 file > new_file:
$ awk 'NR==5{exit}1' file # > new_file
When testing, don't forget the seq part first.
I am working with plotting extremely large files with N relevant data entries (N varies between files).
In each of these files, comments are automatically generated at the start and end of the file, and I would like to filter these out before recombining them into one grand data set.
Unfortunately, I am on Mac OS X, where I have encountered some issues when trying to remove the last line of a file. I have read that the most efficient way is to use the head/tail commands to cut off sections of data. Since head -n -1 does not work on Mac OS X, I had to install coreutils through Homebrew, where the ghead command works wonderfully. However, the command
tail -n+9 $COUNTER/test.csv | ghead -n -1 $COUNTER/test.csv >> gfinal.csv
does not work. A less than pleasing workaround was to separate the commands: use ghead > newfile, then use tail on newfile > gfinal. Unfortunately, this will take a while, as I have to write a new file with the first ghead.
Is there a workaround to incorporating both GNU Utils with the standard Mac Utils?
Thanks,
Keven
The problem with your command is that you specify the file operand again for the ghead command, instead of letting it take its input from stdin via the pipe. This causes ghead to ignore stdin, so the first pipe segment is effectively ignored. Simply omit the file operand for the ghead command:
tail -n+9 "$COUNTER/test.csv" | ghead -n -1 >> gfinal.csv
That said, if you only want to drop the last line, there's no need for GNU head - OS X's own BSD sed will do:
tail -n +9 "$COUNTER/test.csv" | sed '$d' >> gfinal.csv
$ matches the last line, and d deletes it (meaning it won't be output).
Finally, as @ghoti points out in a comment, you could do it all using sed:
sed -n '9,$ {$!p;}' file
Option -n tells sed to only produce output when explicitly requested; 9,$ matches everything from line 9 through (,) the end of the file (the last line, $), and {$!p;} prints (p) every line in that range, except (!) the last ($).
I realize that your question is about using head and tail, but I'll answer as if you're interested in solving the original problem rather than figuring out how to use those particular tools to solve the problem. :)
One method using sed:
sed -e '1,8d;$d' inputfile
At this level of simplicity, GNU sed and BSD sed both work the same way. Our sed script says:
1,8d - delete lines 1 through 8,
$d - delete the last line.
If you decide to generate a sed script like this on-the-fly, beware of your quoting; you will have to escape the dollar sign if you put it in double quotes.
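For instance, the same script inside double quotes would have to be written as:
sed -e "1,8d;\$d" inputfile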
Another method using awk:
awk 'NR>9{print last} NR>1{last=$0}' inputfile
This works a bit differently in order to "recognize" the last line, capturing the previous line and printing after line 8, and then NOT printing the final line.
This awk solution is a bit of a hack, and like the sed solution, relies on the fact that you only want to strip ONE final line of the file.
If you want to strip more lines than one off the bottom of the file, you'd probably want to maintain an array that would function sort of as a buffered FIFO or sliding window.
awk -v striptop=8 -v stripbottom=3 '
{ last[NR]=$0; }                             # buffer the current line
NR > striptop*2 { print last[NR-striptop]; } # print striptop lines behind, skipping the top strip
{ delete last[NR-striptop]; }                # discard what has been printed
END { for (r=NR-striptop+1; r<=NR-stripbottom; r++) if (r in last) print last[r]; } # flush the rest in line order, minus the bottom strip
' inputfile
You specify how much to strip in variables. The last array keeps a number of lines in memory, prints from the far end of the stack, and deletes them as they are printed. The END section steps through whatever remains in the array, and prints everything not prohibited by stripbottom.
Parsing the output of ls to iterate through a list of files is bad. So how should I go about iterating through a list of files in the order in which they were first created? I browsed several questions here on SO, and they all seem to parse ls.
The embedded link suggests:
Things get more difficult if you wanted some specific sorting that
only ls can do, such as ordering by mtime. If you want the oldest or
newest file in a directory, don't use ls -t | head -1 -- read Bash FAQ
99 instead. If you truly need a list of all the files in a directory
in order by mtime so that you can process them in sequence, switch to
perl, and have your perl program do its own directory opening and
sorting. Then do the processing in the perl program, or -- worst case
scenario -- have the perl program spit out the filenames with NUL
delimiters.
Even better, put the modification time in the filename, in YYYYMMDD
format, so that glob order is also mtime order. Then you don't need ls
or perl or anything. (The vast majority of cases where people want the
oldest or newest file in a directory can be solved just by doing
this.)
Does that mean there is no native way of doing it in bash? I don't have the liberty to modify the filename to include the time in them. I need to schedule a script in cron that would run every 5 minutes, generate an array containing all the files in a particular directory ordered by their creation time and perform some actions on the filenames and move them to another location.
The following worked, but only because I don't have funny filenames. The files are created by a server, so they will never have special characters, spaces, newlines, etc.
files=( $(ls -1tr) )
I can write a perl script that would do what I need but I would appreciate if someone can suggest the right way to do it in bash. Portable option would be great but solution using latest GNU utilities will not be a problem either.
sorthelper=();
for file in *; do
# We need something that can easily be sorted.
# Here, we use "<date><filename>".
# Note that this works with any special characters in filenames
sorthelper+=("$(stat -n -f "%Sm%N" -t "%Y%m%d%H%M%S" -- "$file")"); # Mac OS X only
# or
sorthelper+=("$(stat --printf "%Y %n" -- "$file")"); # Linux only (gives "<epoch> <name>", so adjust the ${elem:14} offset below, e.g. to 11)
done;
sorted=();
while read -d $'\0' elem; do
# this strips away the first 14 characters (<date>)
sorted+=("${elem:14}");
done < <(printf '%s\0' "${sorthelper[@]}" | sort -z)
for file in "${sorted[@]}"; do
# do your stuff...
echo "$file";
done;
Other than sort and stat, all commands are actual native Bash commands (builtins)*. If you really want, you can implement your own sort using Bash builtins only, but I see no way of getting rid of stat.
The important parts are read -d $'\0', printf '%s\0' and sort -z. All these commands are used with their null-delimiter options, which means that any filename can be processed safely. Also, the use of double quotes in "$file" and "${anarray[*]}" is essential.
*Many people feel that the GNU tools are somehow part of Bash, but technically they're not. So, stat and sort are just as non-native as perl.
With all of the cautions and warnings against using ls to parse a directory notwithstanding, we have all found ourselves in this situation. If you do find yourself needing sorted directory input, then about the cleanest use of ls to feed your loop is ls -opts | while read -r name; do ... done. This will handle spaces in filenames, etc., without requiring a reset of IFS, due to the nature of read itself. Example:
ls -1rt | while read -r fname; do # where '1' is ONE not little 'L'
So do look for cleaner solutions avoiding ls, but if push comes to shove, ls -opts can be used sparingly without the sky falling or dragons plucking your eyes out.
Let me add the disclaimer to keep everyone happy: if you like newlines inside your filenames, then do not use ls to populate a loop. If you do not have newlines inside your filenames, there are no other adverse side effects.
Contra: TLDP Bash Howto Intro:
#!/bin/bash
for i in $( ls ); do
echo item: $i
done
It appears that SO users do not know what the use of contra means -- please look it up before downvoting.
You can try using the stat command piped to sort:
stat -c '%Y %n' * | sort -t ' ' -nk1 | cut -d ' ' -f2-
Update: To deal with filenames containing newlines, we can use the %N format in stat, and instead of cut we can use awk, like this:
LANG=C stat -c '%Y^A%N' *| sort -t '^A' -nk1| awk -F '^A' '{print substr($2,2,length($2)-2)}'
Use of LANG=C is needed to make sure stat uses single quotes only when quoting file names.
^A is the control-A character, typed by pressing Ctrl-V followed by Ctrl-A.
How about a solution with GNU find + sed + sort?
As long as there are no newlines in the file name, this should work:
find . -type f -printf '%T# %p\n' | sort -k 1nr | sed 's/^[^ ]* //'
It may be a little more work to ensure it is installed (it may already be, though), but using zsh instead of bash for this script makes a lot of sense. The filename globbing capabilities are much richer, while still using a sh-like language.
files=( *(oc) )
will create an array whose entries are all the file names in the current directory, but sorted by change time. (Use a capital O instead to reverse the sort order). This will include directories, but you can limit the match to regular files (similar to the -type f predicate to find):
files=( *(.oc) )
find is needed far less often in zsh scripts, because most of its uses are covered by the various glob flags and qualifiers available.
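A quick sketch of how the qualified glob would be used in a script:
#!/usr/bin/env zsh
files=( *(.oc) )       # regular files only, sorted by ctime, oldest first
for f in "${files[@]}"; do
  print -r -- "$f"     # do your stuff; -r prints the name without escape processing
done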
I've just found a way to do it with bash and ls (GNU).
Suppose you want to iterate through the filenames sorted by modification time (-t):
while read -r fname; do
fname=${fname:1:((${#fname}-2))} # remove the leading and trailing "
fname=${fname//\\\"/\"} # remove the \ before any embedded "
fname=$(echo -e "$fname") # interpret the escaped characters
file "$fname" # replace `file` with any command you want
done < <(ls -At --quoting-style=c)
Explanation
Given some filenames with special characters, this is the ls output:
$ ls -A
filename with spaces .hidden_filename filename?with_a_tab filename?with_a_newline filename_"with_double_quotes"
$ ls -At --quoting-style=c
".hidden_filename" " filename with spaces " "filename_\"with_double_quotes\"" "filename\nwith_a_newline" "filename\twith_a_tab"
So you have to process each filename a little to get the actual one. Recalling:
${fname:1:((${#fname}-2))} # remove the leading and trailing "
# ".hidden_filename" -> .hidden_filename
${fname//\\\"/\"} # remove the \ before any embedded "
# filename_\"with_double_quotes\" -> filename_"with_double_quotes"
$(echo -e "$fname") # interpret the escaped characters
# filename\twith_a_tab -> filename with_a_tab
Example
$ ./script.sh
.hidden_filename: empty
filename with spaces : empty
filename_"with_double_quotes": empty
filename
with_a_newline: empty
filename with_a_tab: empty
As seen, file (or the command you want) interprets well each filename.
Each file has three timestamps:
Access time: the file was opened and read. Also known as atime.
Modification time: the file was written to. Also known as mtime.
Inode modification time: the file's status was changed, such as the file had a new hard link created, or an existing one removed; or if the file's permissions were chmod-ed, or a few other things. Also known as ctime.
Neither one represents the time the file was created; that information is not saved anywhere. At file creation time all three timestamps are initialized, and then each one gets updated appropriately: when the file is read, when it is written to, when its permissions are chmoded, or when a hard link is created or destroyed.
So, you can't really list the files according to their file creation time, because the file creation time isn't saved anywhere. The closest match would be the inode modification time.
See the descriptions of the -t, -u, -c, and -r options in the ls(1) man page for more information on how to list files in atime, mtime, or ctime order.
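For quick reference, those combinations look like this (the flags are POSIX, so both BSD and GNU ls support them; the usual caveats about parsing ls output still apply):
ls -lt     # sort by mtime, newest first
ls -ltu    # sort by atime instead
ls -ltc    # sort by ctime (inode change time)
ls -ltr    # add -r to reverse any of these (oldest first)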
Here's a way using stat with an associative array.
n=0
declare -A arr
for file in *; do
# modified=$(stat -f "%m" "$file") # For use with BSD/OS X
modified=$(stat -c "%Y" "$file") # For use with GNU/Linux
# Ensure stat timestamp is unique
if [[ $modified == *"${!arr[@]}"* ]]; then
modified=${modified}.$n
((n++))
fi
arr[$modified]="$file"
done
files=()
for index in $(IFS=$'\n'; echo "${!arr[*]}" | sort -n); do
files+=("${arr[$index]}")
done
Since sort sorts lines, $(IFS=$'\n'; echo "${!arr[*]}" | sort -n) ensures the indices of the associative array get sorted by setting the field separator in the subshell to a newline.
The quoting at arr[$modified]="$file" and files+=("${arr[$index]}") ensures that file names with caveats like a newline are preserved.
Straight to the point, I'm wondering how to use grep/find/sed/awk to match a certain string (that ends with a number) and increment that number by 1. The closest I've come is to concatenate a 1 to the end (which works well enough) because the main point is to simply change the value. Here's what I'm currently doing:
find . -type f | xargs sed -i 's/\(\?cache_version\=[0-9]\+\)/\11/g'
Since I couldn't figure out how to increment the number, I captured the whole thing and just appended a "1". Before, I had something like this:
find . -type f | xargs sed -i 's/\?cache_version\=\([0-9]\+\)/?cache_version=\11/g'
So at least I understand how to capture what I need.
Instead of explaining what this is for, I'll just explain what I want it to do. It should find text in any file, recursively, starting from the current directory (the exact directory isn't important; I'd configure that later), that matches "?cache_version=" followed by a number. It will then increment that number and replace it in the file.
Currently the stuff I have above works, it's just that I can't increment that found number at the end. It would be nicer to be able to increment instead of appending a "1" so that the future values wouldn't be "11", "111", "1111", "11111", and so on.
I've gone through dozens of articles/explanations, and often enough, the suggestion is to use awk, but I cannot for the life of me mix them. The closest I came to using awk, which doesn't actually replace anything, is:
grep -Pro '(?<=\?cache_version=)[0-9]+' . | awk -F: '{ print "match is", $2+1 }'
I'm wondering if there's some way to pipe a sed at the end and pass the original file name so that sed can have the file name and incremented number (from the awk), or whatever it needs that xargs has.
Technically, this number has no importance; this replacement is mainly to make sure there is a new number there, 100% for sure different than the last. So as I was writing this question, I realized I might as well use the system time - seconds since epoch (the technique often used by AJAX to eliminate caching for subsequent "identical" requests). I ended up with this, and it seems perfect:
CXREPLACETIME=`date +%s`; find . -type f | xargs sed -i "s/\(\?cache_version\=\)[0-9]\+/\1$CXREPLACETIME/g"
(I store the value first so all files get the same value, in case it spans multiple seconds for whatever reason)
But I would still love to know the answer to the original question about incrementing a matched number. I'm guessing an easy solution would be to make it a bash script, but still, I thought there would be an easier way than looping through every file recursively, checking its contents for a match, and then replacing, since it's simply incrementing a matched number, not much else logic. I just don't want to write to any other files or something like that; it should do it in place, like sed does with the -i option.
I think finding the files isn't the difficult part for you, so I'll get straight to the point: the +1 calculation. If you have GNU sed, it can be done this way:
sed -r 's/(.*)(\?cache_version=)([0-9]+)(.*)/echo "\1\2$((\3+1))\4"/ge' file
let's take an example:
kent$ cat test
ello
barbaz?cache_version=3fooooo
bye
kent$ sed -r 's/(.*)(\?cache_version=)([0-9]+)(.*)/echo "\1\2$((\3+1))\4"/ge' test
ello
barbaz?cache_version=4fooooo
bye
You could add the -i option if you like.
edit
/e allows you to pass the matched part to an external command, and do the substitution with the execution result. GNU sed only.
See this example, where the external commands echo and bc are used:
kent$ echo "result:3*3"|sed -r 's/(result:)(.*)/echo \1$(echo "\2"\|bc)/ge'
gives output:
result:9
You could use other powerful external commands, like cut, sed (again), awk...
Pure sed version:
This version has no dependencies on other commands or environment variables.
It uses explicit carrying. For the carry I use the # symbol, but another symbol can be used if you like. Use something that is not present in your input file.
First it finds SEARCHSTRING<number> and appends a # to it.
It repeats incrementing digits that have a pending carry (that is, have a carry symbol after it: [0-9]#)
If 9 was incremented, this increment yields a carry itself, and the process will repeat until there are no more pending carries.
Finally, carries that were yielded but not added to a digit yet are replaced by 1.
sed "s/SEARCHSTRING[0-9]*[0-9]/&#/g;:a {s/0#/1/g;s/1#/2/g;s/2#/3/g;s/3#/4/g;s/4#/5/g;s/5#/6/g;s/6#/7/g;s/7#/8/g;s/8#/9/g;s/9#/#0/g;t a};s/#/1/g" numbers.txt
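For example, a value ending in nines exercises both the carry loop and the final carry-to-1 rule (SEARCHSTRING is the literal marker used in the command above):
$ echo "SEARCHSTRING999" | sed "s/SEARCHSTRING[0-9]*[0-9]/&#/g;:a {s/0#/1/g;s/1#/2/g;s/2#/3/g;s/3#/4/g;s/4#/5/g;s/5#/6/g;s/6#/7/g;s/7#/8/g;s/8#/9/g;s/9#/#0/g;t a};s/#/1/g"
SEARCHSTRING1000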
This perl command will search all files in the current directory (without traversing subdirectories; you will need the File::Find module or similar for that more complex task) and will increment the number on any line that matches cache_version=. It uses the /e flag of the substitution, which evaluates the replacement part as code.
perl -i.bak -lpe 'BEGIN { sub inc { my ($num) = @_; ++$num } } s/(cache_version=)(\d+)/$1 . (inc($2))/eg' *
I tested it with a file in the current directory containing the following data:
hello
cache_version=3
bye
It backs up the original file (ls -1):
file
file.bak
And file now with:
hello
cache_version=4
bye
I hope it can be useful for what you are looking for.
UPDATE to use File::Find for traversing directories. The command still accepts * as arguments but discards them, filling @ARGV instead with the files found by File::Find. The directory where the search begins is the current directory at execution time; it is hardcoded in the line find( \&wanted, "." ).
perl -MFile::Find -i.bak -lpe '
BEGIN {
sub inc {
my ($num) = @_;
++$num
}
sub wanted {
if ( -f && ! -l ) {
push @ARGV, $File::Find::name;
}
}
@ARGV = ();
find( \&wanted, "." );
}
s/(cache_version=)(\d+)/$1 . (inc($2))/eg
' *
This is ugly (I'm a little rusty), but here's a start using sed:
orig="something1" ;
text=`echo $orig | sed "s/\([^0-9]*\)\([0-9]*\)/\1/"` ;
num=`echo $orig | sed "s/\([^0-9]*\)\([0-9]*\)/\2/"` ;
echo $text$(($num + 1))
With an original filename ($orig) of "something1", sed splits off the text and numeric portions into $text and $num, then these are combined in the final section with an incremented number, resulting in something2.
Just a start since it doesn't consider cases with numbers within the file name or names with no number at the end, but hopefully helps with your original goal of using sed.
This can actually be simplified within sed by using buffers, I believe (sed can operate recursively), but I'm really rusty with that aspect of it.
perl -pi -e 's/(\?cache_version=)(\d+)/$1.($2+1)/ge' FILE [FILE...]
or for a complete solution:
find . -type f | xargs perl -pi -e 's/(\?cache_version=)(\d+)/$1.($2+1)/ge'
perl substitution operator
/e modifier evaluates the replacement as if it were a Perl statement, using its return value as the replacement text.
. operator concatenates strings in Perl. The parentheses ensures that the arithmetic operation $2+1 takes precedence over concatenation.
/g modifier applies substitution to all matched strings within line
perl options
-p ensures that perl will execute the command on every line of each file
-i ensures that each file will be edited in place
-e specifies the perl command(s) that are executed (in this case, the substitution operation)