Replacing specific lines in multiple files using bash

I'm relatively new to bash scripting, having started out of the need to manage my simulations on supercomputers. I'm currently stuck on writing a script to change specific lines in my PBS files.
There are two stages to my problem. First, I need to replace a number of lines in a text file (another script) and overwrite that file for later use. The rough idea is:
Replace lines 27, 28 and 29 of 'filename005' with 'text1=000', 'text2=005' and 'text3=010'
Next, I'd like to do that recursively for a set of text files with numbered suffixes, and the numbering influences the replaced text.
My code so far is:
#!/bin/bash
for ((i = 1; i < 10; i++))
do
let NUM=i*5
let OLD=NUM-5
let NOW=NUM
let NEW=NUM+5
let FILE=$(printf "filename%03g" $NUM)
sed "27 c\text1=$OLD" $FILE
sed "28 c\text2=$NOW" $FILE
sed "29 c\text3=$NEW" $FILE
done
I know there are some errors in the last 4 lines of my code, and I'm still studying up on the proper way to implement sed. Appreciate any tips!
Thanks!
CS

Taking the first line of your specification:
Replace lines 27:29 of filename005, with text1=000; text2=005; text3=010
That becomes:
sed -e '27,29c\
text1=000\
text2=005\
text3=010' filename005
Rinse and repeat. The backslashes indicate to sed that the change continues. It's easier on yourself if your actual data lines do not need to end with backslashes.
You can play with:
seq 1 35 |
sed -e '27,29c\
text1=000\
text2=005\
text3=010'
to see what happens without risking damage to precious files. Given the specification lines, you could write a sed script to generate sed scripts from the specification (though I'd be tempted to use Perl or awk instead; indeed, I'd probably do the whole job in Perl).
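For instance, a sketch of that generation step in the shell itself, using the question's numbering scheme: print one sed command per file, inspect the output, then pipe it to sh (this assumes GNU sed, both for -i and for \n escapes in the c text):
for ((i = 1; i < 10; i++)); do
  num=$((i * 5))
  # emits e.g.: sed -i "27,29c text1=000\ntext2=005\ntext3=010" filename005
  printf 'sed -i "27,29c text1=%03d\\ntext2=%03d\\ntext3=%03d" filename%03d\n' \
      "$((num - 5))" "$num" "$((num + 5))" "$num"
done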

Okay, I managed to get my code to work after finding out that for in-place replacement, I need to write to a temporary file. So, with the loop and multi-line replacement (and other small tweaks):
for ((i = 1; i < 10; i++ ))
do
let NUM=i*5
let OLD=NUM-5
let NOW=NUM
let NEW=NUM+5
FILE=$(printf "filename%03d" $NUM)
sed -e "27,29 c\
OLD=$OLD\n\
NOW=$NOW\n\
NEW=$NEW" $FILE >temp.tmp && mv temp.tmp $FILE
done
Do let me know if there is a more elegant way to use sed in this context. Thanks again @Jonathan!
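If GNU sed is available, its -i option hides the temporary file for you (it still writes one internally), and arithmetic expansion replaces the let lines; a minimal sketch, assuming GNU sed's one-line form of the c command:
#!/bin/bash
for ((i = 1; i < 10; i++)); do
    num=$((i * 5))
    printf -v file 'filename%03d' "$num"
    sed -i -e "27c OLD=$((num - 5))" \
           -e "28c NOW=$num" \
           -e "29c NEW=$((num + 5))" "$file"
done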

Related

How to use a local variable and 'delete last line' in sed

I am trying to delete the last three lines of a file in a bash shell script.
Since I am using local variables in combination with regex syntax in sed, the answer proposed in How to use sed to remove the last n lines of a file does not cover this case. On the contrary, the cases covered deal with sed in a terminal and do not cover the syntax in shell scripts, nor the use of variables in sed expressions.
The commands I have available are limited, since I am not on Linux but use MINGW64.
sed does a great job so far, but deleting the last three lines gives me some headaches in terms of how to format the expression.
I use wc to find out how many lines the file has and then subtract three with expr.
n=$(wc -l < "$distribution_area")
rel=$(expr $n - 3)
The starting point for deleting lines is defined by rel, but the local variable is accessed through $, and unfortunately sed's syntax uses $ to denote the end of the file. Hence,
sed -i "$rel,$d" "$distribution_area"
won't work, and whatever combination I try, e.g. '"'"$rel"'",$d', gives me sed: -e expression #1, char 1: unknown command: `"' or something similar.
Can somebody show me how to combine the variable with the $d regex syntax of sed?
sed -i "$rel,$d" "$distribution_area"
Here you're missing the variable name (n) for the second arg.
Consider the following example on a file called test that contains 1-10:
n=$(wc -l < test)
rel=$(($n - 3))
sed "$rel,$n d" test
Result:
1
2
3
4
5
6
To make sure the d will not interfere with the $n, you can add a space instead of escaping.
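Note that with rel = n - 3 the range $rel,$n spans four lines, rel through n inclusive, which is why the six-line result above drops 7, 8, 9 and 10. To remove exactly the last three lines, start one line later; a sketch of the in-place version, assuming GNU sed's -i and that $distribution_area is set:
n=$(wc -l < "$distribution_area")
rel=$(( n - 2 ))                         # first of the last three lines
sed -i "$rel,$n d" "$distribution_area"  # delete lines rel through n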
If you have a recent head available, I'd recommend something like:
head -n -3 test
Can somebody show me how to combine the variable with the $d regex syntax of sed?
$d expands to a variable d, so you have to escape it:
"$rel,\$d"
or:
"$rel"',$d'
But I would use:
head -n -3 "$distribution_area" > "$distribution_area".tmp
mv "$distribution_area".tmp "$distribution_area"
You can remove the last N lines using pure Bash alone, without forking additional processes (such as sed). Such scripts look ugly, but they work in any environment where only Bash runs and nothing else is available, with no other binaries like sed, awk, etc.
If the entire file fits in RAM, a straightforward solution is to split it by lines and print all but the N trailing ones:
delete_last_n_lines() {
local -ir n="$1"
local -a lines
readarray lines
((${#lines[@]} > n)) || return 0
printf '%s' "${lines[@]::${#lines[@]} - n}"
}
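A quick check, feeding the function ten lines and dropping the last three:
printf '%s\n' {0..9} | delete_last_n_lines 3   # prints 0 through 6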
If the file does not fit in RAM, you can keep a FIFO buffer that stores N lines (N + 1 in the “implementation” below, but that’s just a technical detail), let the file (arbitrarily large) flow through the buffer and, after reaching the end of the file, not print out what remains in the buffer (the last N lines to remove).
delete_last_n_lines() {
local -ir n="$1 + 1"
local -a lines
local -i pos i
for ((i = 0; i < n; ++i)); do
IFS= read -r lines[i] || return 0
done
printf '%s\n' "${lines[pos]}"
while IFS= read -r lines[pos++]; do
((pos %= n))
printf '%s\n' "${lines[pos]}"
done
}
The following example gets 10 lines of input, 0 to 9, but prints out only 0 to 6, removing 7, 8 and 9 as desired:
printf '%s' {0..9}$'\n' | delete_last_n_lines 3
Last but not least, this simple hack lacks sed’s -i option to edit files in-place. That could be implemented (e.g.) using a temporary file to store the output and then renaming the temporary file to the original. (A more sophisticated approach would be needed to avoid storing the temporary copy altogether. I don’t think Bash exposes an interface like lseek() to read files “backwards”, so this cannot be done in Bash alone.)
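A sketch of that temporary-file approach, wrapping the function above (the wrapper name is made up here, and it does give up the pure-Bash property, since mktemp and mv are external programs):
delete_last_n_lines_in_place() {
  local tmp
  tmp=$(mktemp) || return
  delete_last_n_lines "$1" < "$2" > "$tmp" && mv -- "$tmp" "$2"
}
# usage: delete_last_n_lines_in_place 3 myfile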

Sed through files without using for loop?

I have a small script which basically generates a menu of all the scripts in my ~/scripts folder and, next to each of them, displays a sentence describing it, that sentence being the third line within the script, commented out. I then plan to pipe this into fzf or dmenu to select a script and start editing it or whatever.
1 #!/bin/bash
2
3 # a script to do
So it would look something like this
foo.sh a script to do X
bar.sh a script to do Y
Currently I have it run a for loop over all the files in the scripts folder and then run sed -n 3p on all of them.
for i in $(ls -1 ~/scripts); do
echo -n "$i"
sed -n 3p ~/scripts/"$i"
echo
done | column -t -s '#' | ...
I was wondering if there is a more efficient way of doing this that doesn't involve a for loop and only uses sed. Any help will be appreciated. Thanks!
Instead of a loop that is parsing ls output + sed, you may try this awk command:
awk 'FNR == 3 {
f = FILENAME; sub(/^.*\//, "", f); print f, $0; nextfile
}' ~/scripts/* | column -t -s '#' | ...
Yes there is a more efficient way, but no, it doesn't only use sed. This is probably a silly optimization for your use case, but it may be worthwhile nonetheless.
The inefficiency is that you're using ls to read the directory and then parsing its output. For large directories, that causes a lot of overhead for keeping that list in memory, even though you only traverse it once. Also, it's not done correctly: consider filenames with special characters that the shell interprets.
The more efficient way is to use find in combination with its -exec option, which starts a second program with each found file in turn.
BTW: If you didn't rely on line numbers but maybe a tag to mark the description, you could also use grep -r, which avoids an additional process per file altogether.
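For instance, under a hypothetical convention where every script carries a line such as "#DESC: a script to do X", a single grep scans the whole tree in one process:
grep -r '^#DESC:' ~/scripts | column -t -s ':'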
This might work for you (GNU sed):
sed -sn '1h;3{H;g;s/\n/ /p}' ~/scripts/*
Use the -s option to reset the line number addresses for each file.
Copy line 1 to the hold space.
Append line 3 to the hold space.
Swap the hold space for the pattern space.
Replace the newline with a space and print the result.
All files in the directory ~/scripts will be processed.
N.B. You may wish to replace the space delimiter by a tab or pipe the results to the column command.
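For example, with a tab as the delimiter (GNU sed understands \t in the replacement), piped to column:
sed -sn '1h;3{H;g;s/\n/\t/p}' ~/scripts/* | column -t -s $'\t'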

Block cut with sed and suppress the last line

# cat file
LBL 434
any lines but not block start
...
LBL 75677
...
any
LBL 777
...
LBL 798
...
# sed -ne '/LBL 75677/,/LBL/p' file | head -n -1
LBL 75677
...
any
#
The above command is good for me, but I would like to know:
Can I suppress the last line without the head command, in one sed script only? I know the commands and control flow of sed (N P D b ...) but I couldn't figure it out at the moment.
@Cyrus, thanks! It works fine and I know how it works, thanks again.
But I wanted to find a different way of solving it, if there is one.
I tried putting the lines of the block /LBL 75677/,/LBL/ into sed's pattern space with the N command, having D remove the last line from the pattern space (that being the first line of the new block), and printing the whole pattern space. Can somebody do it that way?
The following script may be what you're looking for:
sed -n '/LBL 75677/{p;:loop;n;/LBL/!{p;b loop}}' file
:loop here is a label and b loop is an unconditional jump to that label.
Here we create a small loop and go on to print the lines until the next LBL is reached.
sed is for simple substitutions on individual lines (s/old/new/), that is all. For anything else you should be using awk:
$ awk '/LBL/{f=0} /LBL 75677/{f=1} f' file
LBL 75677
...
any
In addition to being simpler and clearer than an equivalent sed script, the above will execute faster (especially if you only want one record output and so can change /LBL/{f=0} to /LBL/{exit}), and be more portable as it will work as-is on all awks on all UNIX systems and will be vastly easier to enhance if/when your requirements change (when dealing with anything more than s/old/new/ a tiny requirements change typically means a complete rewrite for a sed script).
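A sketch of that exit variant, with the tests ordered so the LBL 75677 line itself does not trigger the early exit:
awk '/LBL/ && f {exit} /LBL 75677/{f=1} f' file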
If you're using any constructs other than s, g, and p (with -n) in sed then you are using constructs that became obsolete in the mid-1970s when awk was invented and so sed no longer needed all the cryptic runes to perform simple multi-line tasks.

Increment all regex matching numbers throughout an HTML file

I have a bunch of HTML files that have anchors structured like:
<a href="#L217">LinkName</a>
I'm running the files through sed to convert the links into this structure:
<a href="#cl-217">LinkName</a>
The last piece of the puzzle that I'm trying to solve is that I need to increment the numbers within the anchor by 10:
#L217 -> #L227 // first link
#cl-217 -> #cl-227 // transformed link
So the final version of the above example link would be:
<a href="#cl-227">LinkName</a>
I've gotten close =/
awk 'gsub(/#cl-[0-9]+/, "#cl-ABC")' # just can't get the incremented match in ABC
This one works, but only once, or once per line:
awk '{n = substr($0, match($0, /[0-9]+/), RLENGTH) + 10; sub(/[0-9]+/, n); print}'
(* I don't have gawk or GNU sed)
Try this:
1- Create a file named replace.sh
for file in /path/to/files/*.html; do
    while read line; do
        # capture the number following "#cl-" and compute its incremented value
        [[ $line =~ \#cl-([0-9]+) ]]
        match=${BASH_REMATCH[1]}
        replace=$((${BASH_REMATCH[1]} + 10))
        perl -i -pe "s!#cl-$match!#cl-$replace!g" "$file"
    done < "$file"
done
2- chmod +x replace.sh
3- ./replace.sh
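If perl is already in the picture, the whole read loop can collapse into one invocation; a sketch, assuming the links are already in the #cl-N form, using the /e modifier to evaluate the replacement as code:
perl -i -pe 's/#cl-(\d+)/"#cl-" . ($1 + 10)/ge' /path/to/files/*.html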
In Bash you can use let to do arithmetic. First get only the number into a variable, then let my_var++ to increment it.
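A minimal sketch of that idea, on a hypothetical input line:
line='href="#cl-217"'
num=${line//[!0-9]/}   # strip everything but the digits: num is 217
let num+=10            # num is now 227 (let num++ would add just 1)
echo "#cl-$num"        # prints #cl-227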
On the other hand, I'm morally obliged to warn you that manipulating HTML with shell scripts is a maintainability disaster waiting to happen. Python, JavaScript, XSLT or Java would all do a much better job.

Read the n-th line of multiple files into a single output

I have some dump files called dump_mydump_0.cfg, dump_mydump_250.cfg, ..., all the way up to dump_mydump_40000.cfg. For each dump file, I'd like to take the 16th line out, read them, and put them into one single file.
I'm using sed, but I came across some syntax errors. Here's what I have so far:
for lineNo in 16 ;
for fileNo in 0,40000 ; do
sed -n "${lineNo}{p;q;}" dump_mydump_file${lineNo}.cfg >> data.txt
done
Considering your files are named with intervals of 250, you should get it working using:
for lineNo in 16; do
for fileNo in {0..40000..250}; do
sed -n "${lineNo}{p;q;}" dump_mydump_file${fileNo}.cfg >> data.txt
done
done
Note both the bash syntax corrections (do, done, and {0..40000..250}) and the input file name, which should depend on ${fileNo} instead of ${lineNo}.
Alternatively, with (GNU) awk:
awk "FNR==16{print;nextfile}" dump_mydump_{0..40000..250}.cfg > data.txt
(I used the filenames as shown in the OP as opposed to the ones which would have been generated by the bash for loop, if corrected to work. But you can edit as needed.)
The advantage is that you don't need the for loop, and you don't need to spawn 160 processes. But it's not a huge advantage.
This might work for you (GNU sed):
sed -ns '16wdata.txt' dump_mydump_{0..40000..250}.cfg
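The -s flag restarts line numbering for each input file, and the w command writes each file's line 16 straight to data.txt as sed runs, so no shell loop or output redirection is needed.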
