Replace or append a block of text in a file with the contents of another file - bash

I have two files:
super.conf:
someconfig=23;
second line;
#blockbegin
dynamicconfig=12
dynamicconfig2=1323
#blockend
otherconfig=12;
input.conf:
newdynamicconfig=12;
anothernewline=1234;
I want to run a script and have input.conf replace the contents between the #blockbegin and #blockend lines.
I already have this:
sed -i -ne '/^#blockbegin/ {p; r input.conf' -e ':a; n; /#blockend/ {p; b}; ba}; p' super.conf
It works well, but if I change or remove the #blockend line in super.conf, the script replaces all lines after #blockbegin.
In addition, I want the script to replace the block or, if the block doesn't exist in super.conf, append a new block with the content of input.conf to super.conf.
It can be accomplished by remove + append, but how do I remove the block using sed or another unix command?

Though I gotta question the utility of this scheme -- I tend to favor systems that complain loudly when expectations aren't met instead of being more loosey-goosey like this -- I believe the following script will do what you want.
Theory of operation: It reads in everything up-front, and then emits its output all in one fell swoop.
Assuming you name the file injector, call it like injector input.conf super.conf.
#!/usr/bin/env awk -f
#
# Expects to be called with two files. First is the content to inject,
# second is the file to inject into.
FNR == 1 {
    # This switches from "read replacement content" to "read template"
    # at the boundary between reading the first and second files. This
    # will of course do something surprising if you pass more than two
    # files.
    readReplacement = !readReplacement;
}
# Read a line of replacement content.
readReplacement {
    rCount++;
    replacement[rCount] = $0;
    next;
}
# Read a line of template content.
{
    tCount++;
    template[tCount] = $0;
}
# Note the beginning of the replacement area.
/^#blockbegin$/ {
    beginAt = tCount;
}
# Note the end of the replacement area.
/^#blockend$/ {
    endAt = tCount;
}
# Finished reading everything. Process it all.
END {
    if (beginAt && endAt) {
        # Both beginning and ending markers were found; replace what's
        # in between them.
        emitTemplate(1, beginAt);
        emitReplacement();
        emitTemplate(endAt, tCount);
    } else {
        # Didn't find both markers; just append.
        emitTemplate(1, tCount);
        emitReplacement();
    }
}
# Emit the indicated portion of the template to stdout.
function emitTemplate(from, to) {
    for (i = from; i <= to; i++) {
        print template[i];
    }
}
# Emit the replacement text to stdout.
function emitReplacement() {
    for (i = 1; i <= rCount; i++) {
        print replacement[i];
    }
}
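Since the script writes the merged result to stdout, a typical invocation (super.conf.new is just a scratch name) would be:

injector input.conf super.conf > super.conf.new && mv super.conf.new super.conf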

I've written a perl one-liner:
perl -0777lni -e 'BEGIN{open(F,pop(@ARGV))||die;$b="#blockbegin";$e="#blockend";local $/;$d=<F>;close(F);}s|\n$b(.*)$e\n||s;print;print "\n$b\n",$d,"\n$e\n" if eof;' edited.file input.file
Arguments:
edited.file - path to the file being updated
input.file - path to the file with the new block content
The script first deletes the block (if it finds a matching one) and then appends a new block with the new content.

You mean something like:
sed '/^#blockbegin/,/#blockend/d' super.conf
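And to cover the "remove + append" route from the question: delete any existing block, then append a fresh one. A minimal sketch (GNU sed shown; on BSD/macOS use sed -i ''):

sed -i '/^#blockbegin/,/^#blockend/d' super.conf
{ echo '#blockbegin'; cat input.conf; echo '#blockend'; } >> super.conf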


Find, Replace, Remove - within a file

I'm currently using this code:
awk 'BEGIN { s = \"{$CNEW}\" } /WORD_MATCH/ { $0 = s; n = 1 } 1; END { if(!n) print s }' filename > new_filename
To find a match on WORD_MATCH and then replace that line with $CNEW in a file called filename; the results are written to new_filename.
This all works well, but I have an issue where I may want to DELETE the line instead of replacing it.
So I set $CNEW = '', which "works" in that I get a blank line in the file, but it doesn't actually remove the line.
Is there any way to adapt the awk command to allow the removal of the line?
The total aim is:
If there isn't a line in the file containing WORD_MATCH, add one based on $CNEW.
If there is a line in the file containing WORD_MATCH, update that line with the new value from $CNEW.
If $CNEW = '', then delete the line containing WORD_MATCH.
There will only be one line in the file containing WORD_MATCH.
Thanks
awk -v s="$CNEW" '/WORD_MATCH/ { n=1; if (s) $0=s; else next; } 1; END { if(s && !n) print s }' file
How it works
-v s="$CNEW"
This creates s as an awk variable with the value $CNEW. Note that the use of -v neatly eliminates the quoting problems that can occur by trying to define s in a BEGIN block.
/WORD_MATCH/ { n=1; if (s) $0=s; else next; }
If the current line matches WORD_MATCH, then set n to 1. If s is non-empty, then set the current line to s. If not, skip the rest of the commands and start over on the next line.
1
This is cryptic shorthand for "print the line".
END { if(s && !n) print s }
At the end of the file, if n is still not 1 and s is non-empty, then print s.

How to edit previous line from current in text file?

Here is what I need exactly.
I have a file that I'm looping through line by line; when I find the word "search" I need to go back to the previous line and change the word "false" to "true" in that line, but only in that line, not in the whole file. I'm a newbie in bash, and this is all I have.
file="/u01/MyFile.txt"
count=0
while read line
do
    ((count++))
    if [[ $line == *"[search]"* ]]
    then
        ?????????????
    fi
done < $file
You could do the whole thing in pure bash like this:
# Declare a function process_file doing the stuff
process_file() {
    # Always have the previous line ready, hold off printing
    # until we know if it needs to be changed.
    read prev
    while read line; do
        if [[ $line == *"[search]"* ]]; then
            # substitute false with true in $prev. Use ${prev//false/true} if
            # several occurrences may need to be replaced.
            echo "${prev/false/true}"
        else
            echo "$prev"
        fi
        # remember current line as previous for next turn
        prev="$line"
    done
    # in the end, print the last line (it was saved as $prev in the last
    # loop iteration).
    echo "$prev"
}
# call function, feed file to it.
process_file < file
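For instance, with a toy two-line file modeled on the question:

$ printf 'flag=false\n[search]\n' > file
$ process_file < file
flag=true
[search]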
However, there are tools that are better suited to this sort of file processing than pure bash and that are commonly used in shell scripts: awk and sed. These tools process a file by reading it line after line[1], running a piece of code for each line individually while preserving some state between lines (not unlike the code above), and they come with more powerful text processing facilities.
For this, I'd use awk:
awk 'index($0, "[search]") { sub(/false/, "true", prev) } NR != 1 { print prev } { prev = $0 } END { print prev }' filename
That is:
index($0, "[search]") {        # if the currently processed line contains
    sub(/false/, "true", prev) # "[search]", replace false with true in the
                               # saved previous line. (use gsub if more than
                               # one occurrence may have to be replaced)
}
NR != 1 {                      # then, unless we're processing the first line
                               # and don't have a previous line,
    print prev                 # print the previous line
}
{                              # then, for all lines:
    prev = $0                  # remember it as previous line for the next turn
}
END {                          # and after the last line was processed,
    print prev                 # print the last line (that we just saved
                               # as prev)
}
You could also use sed:
sed '/\[search\]/ { x; s/false/true/; x; }; x; ${ p; x; }; 1d' filename
...but as you can see, sed is somewhat more cryptic. It has its strengths, but this problem doesn't play to them.
Addendum, as requested: The main thing to know is that sed reads each line into something called the pattern space (on which most commands operate) and has a hold buffer on the side where you can save things between lines. We'll use the hold buffer to hold the current previous line. The code works as follows:
/\[search\]/ {      # if the currently processed line contains [search],
    x               # eXchange pattern space (PS) and hold buffer (HB)
    s/false/true/   # replace false with true in the pattern space
    x               # swap back. The previous line, now back in the HB,
                    # has false changed to true.
                    # Use s/false/true/g for multiple occurrences.
}
x                   # swap pattern space and hold buffer (the previous line
                    # is now in the PS, the current line in the HB)
${                  # if we're processing the last line,
    p               # print the PS
    x               # swap again (the current line is now in the PS)
}
1d                  # If we're processing the first line, the PS now holds
                    # the empty line that was originally in the HB. Don't
                    # print that.
                    # We're dropping off the end here, and since we didn't
                    # disable auto-print, the PS will be printed now.
                    # That is the previous line, except when we're processing
                    # the last line (then it's the last line).
Well, I did warn you that sed is somewhat more cryptic than awk. A caveat of this code is that it expects the input file to have more than one line.
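A quick check on a toy file (values mirroring the question):

$ printf 'flag=false\n[search]\nflag=false\n' | sed '/\[search\]/ { x; s/false/true/; x; }; x; ${ p; x; }; 1d'
flag=true
[search]
flag=false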
[1] In awk's case, it's records: they don't have to be lines, but are lines by default.
A very simple approach would be to read two lines at a time, check for the condition in the second line, and replace the word in the first. (Note that this only works if the matching line always falls on an even line number.)
while read prev_line       # reads every 1st line
do
    read curr_line         # reads every 2nd line
    if [[ $curr_line == *"[search]"* ]]; then
        echo "${prev_line/false/true}"
        echo "$curr_line"
    else
        echo "$prev_line"
        echo "$curr_line"
    fi
done < "file.txt"
The correct version of your way of doing this would be:
file="/u01/MyFile.txt"
count=0
while read line
do
    ((count++))
    if [[ $line == *"[search]"* ]]
    then
        sed -i.bak "$((count-1))s/false/true/" $file
    fi
done < $file

How to add an input file name to multiple output files in awk?

The question might be trivial. I'm trying to figure out a way to add a part of my input file name to multiple outputs generated by the following awk script.
Script:
zcat "$1" | awk 'BEGIN {
    # the number of sequences per file
    if (!N) N = 10000;
    # file prefix
    if (!prefix) prefix = "seq";
    # file suffix
    if (!suffix) suffix = "fa";
    # this keeps track of the sequences
    count = 0
}
# skip empty lines at the beginning
/^$/ { next; }
# act on fasta header
/^>/ {
    if (count % N == 0) {
        if (output) close(output)
        output = sprintf("%s%07d.%s", prefix, count, suffix)
    }
    print > output
    count++
    next
}
# write the fasta body into the file
{
    print >> output
}'
The input in the $1 variable is 30_C_283_1_5.9.fa.gz
The output files generated by the script are
myseq0000000.fa, myseq1000000.fa and so on...
I would like the output to be
30_C_283_1_5.9_myseq000000.fa, 30_C_283_1_5.9_myseq100000.fa...
Looking forward to some input on this.
There's a way to direct the output from inside the Awk script:
https://www.gnu.org/software/gawk/manual/html_node/Redirection.html
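Since the script only falls back to the "seq" default when prefix is unset, one approach (a sketch, assuming the input name always ends in .fa.gz) is to derive the prefix in the shell and hand it to awk with -v:

base=$(basename "$1" .fa.gz)   # e.g. 30_C_283_1_5.9
zcat "$1" | awk -v prefix="${base}_myseq" '...same script as above...'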

How to pipe program output so as to eliminate specific text

I have a program which writes results to the terminal, including a header and a footer. The header ends at the first line consisting only of '-' characters, and the footer begins at the last line of '-' characters. I would like to pipe the output of this program through another program that cuts out the header and footer, leaving only the data. I am not sure of the most efficient way to do this. The files are roughly 20MB in size. I am running Mac OS X.
You could use awk to do the work. Below is an awk program I wrote, in a file named clip.awk.
You can trim a data file like the one you described, say data.txt, like this:
$ cat data.txt | awk -f clip.awk
Here is the program clip.awk:
BEGIN { state = 0;  # HEADER
}
# match a line of all ----
/^-+$/ {
    if (state == 0)
        state = 1;  # DATA
    else
        state = 2;  # FOOTER
    # Skip to next line
    next;
}
# print any line while in DATA section
{ if (state == 1) print }
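For example, with input shaped like the question describes:

$ printf 'HEADER\n----\ndata 1\ndata 2\n----\nFOOTER\n' | awk -f clip.awk
data 1
data 2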

Reading java .properties file from bash

I am thinking of using sed to read a .properties file, but was wondering if there is a smarter way to do that from a bash script?
This would probably be the easiest way: grep + cut
# Usage: get_property FILE KEY
function get_property
{
grep "^$2=" "$1" | cut -d'=' -f2
}
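For example, assuming an app.properties file containing database.url=jdbc:mysql://localhost/db:

db_url=$(get_property app.properties database.url)
echo "$db_url"   # jdbc:mysql://localhost/db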
The solutions mentioned above will work for the basics. I don't think they cover multi-line values though. Here is an awk program that will parse Java properties from stdin and produce shell environment variables to stdout:
BEGIN {
    FS="=";
    print "# BEGIN";
    n="";
    v="";
    c=0; # Not a line continuation.
}
/^\#/ { # The line is a comment. Breaks line continuation.
    c=0;
    next;
}
/\\$/ && (c==0) && (NF>=2) { # Name value pair with a line continuation...
    e=index($0,"=");
    n=substr($0,1,e-1);
    v=substr($0,e+1,length($0) - e - 1); # Trim off the backslash.
    c=1; # Line continuation mode.
    next;
}
/^[^\\]+\\$/ && (c==1) { # Line continuation. Accumulate the value.
    v= "" v substr($0,1,length($0)-1);
    next;
}
((c==1) || (NF>=2)) && !/^[^\\]+\\$/ { # End of line continuation, or a single line name/value pair
    if (c==0) { # Single line name/value pair
        e=index($0,"=");
        n=substr($0,1,e-1);
        v=substr($0,e+1,length($0) - e);
    } else { # Line continuation mode - last line of the value.
        c=0; # Turn off line continuation mode.
        v= "" v $0;
    }
    # Make sure the name is a legal shell variable name
    gsub(/[^A-Za-z0-9_]/,"_",n);
    # Remove newlines from the value.
    gsub(/[\n\r]/,"",v);
    print n "=\"" v "\"";
    n = "";
    v = "";
}
END {
    print "# END";
}
As you can see, multi-line values make things more complex. To see the values of the properties in shell, just source in the output:
cat myproperties.properties | awk -f readproperties.awk > temp.sh
source temp.sh
The variables will have '_' in the place of '.', so the property some.property will be some_property in shell.
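A quick illustration (hypothetical keys):

$ printf 'db.host=localhost\ndb.port=5432\n' | awk -f readproperties.awk
# BEGIN
db_host="localhost"
db_port="5432"
# END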
If you have ANT properties files that have property interpolation (e.g. '${foo.bar}') then I recommend using Groovy with AntBuilder.
Here is my wiki page on this very topic.
I wrote a script to solve the problem and put it on my github.
See properties-parser
One option is to write a simple Java program to do it for you - then run the Java program in your script. That might seem silly if you're just reading properties from a single properties file. However, it becomes very useful when you're trying to get a configuration value from something like a Commons Configuration CompositeConfiguration backed by properties files. For a time, we went the route of implementing what we needed in our shell scripts to get the same behavior we were getting from CompositeConfiguration. Then we wisened up and realized we should just let CompositeConfiguration do the work for us! I don't expect this to be a popular answer, but hopefully you find it useful.
If you want to use sed to parse -any- .properties file, you may end up with a quite complex solution, since the format allows line breaks, unquoted strings, unicode, etc: http://en.wikipedia.org/wiki/.properties
One possible workaround would be to use java itself to preprocess the .properties file into something bash-friendly, then source it. E.g.:
.properties file:
line_a : "ABC"
line_b = Line\
With\
Breaks!
line_c = I'm unquoted :(
would be turned into:
line_a="ABC"
line_b=`echo -e "Line\nWith\nBreaks!"`
line_c="I'm unquoted :("
Of course, that would yield worse performance, but the implementation would be simpler/clearer.
In Perl:
while(<STDIN>) {
    ($prop,$val) = split(/[=: ]/, $_, 2);
    # and do stuff for each prop/val
}
Not tested, and should be more tolerant of leading/trailing spaces, comments etc., but you get the idea. Whether you use Perl (or another language) over sed is really dependent upon what you want to do with the properties once you've parsed them out of the file.
Note that (as highlighted in the comments) Java properties files can have multiple forms of delimiters (although I've not seen anything used in practice other than colons). Hence the split uses a choice of characters to split upon.
Ultimately, you may be better off using the Config::Properties module in Perl, which is built to solve this specific problem.
I have some shell scripts that need to look up some .properties and use them as arguments to programs I didn't write. The heart of the script is a line like this:
dbUrlFile=$(grep database.url.file etc/zocalo.conf | sed -e "s/.*: //" -e "s/#.*//")
Effectively, that's grep for the key and filter out the stuff before the colon and after any hash.
If you want to use "shell", the best tool to parse files with proper programming control is (g)awk. Use sed only for simple substitution.
I have sometimes just sourced the properties file into the bash script. This will lead to environment variables being set in the script with the names and contents from the file. Maybe that is enough for you, too. If you have to do some "real" parsing, this is not the way to go, of course.
Hmm, I just ran into the same problem today. This is a poor man's solution, admittedly more straightforward than clever ;)
decl=`ruby -ne 'puts chomp.sub(/=(.*)/,%q{="\1";}).gsub(".","_")' my.properties`
eval $decl
then, a property 'my.java.prop' can be accessed as $my_java_prop.
This can be done with sed or whatever, but I finally went with ruby for its 'irb' which was handy for experimenting.
It's quite limited (dots should be replaced only before the '=', and there is no comment handling), but it could be a starting point.
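A quick sanity check (hypothetical property):

$ echo 'my.java.prop=42' > my.properties
$ decl=`ruby -ne 'puts chomp.sub(/=(.*)/,%q{="\1";}).gsub(".","_")' my.properties`
$ eval $decl
$ echo $my_java_prop
42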
@Daniel, I tried to source it, but Bash didn't like dots in variable names.
I have had some success with
PROPERTIES_FILE=project.properties
function source_property {
    local name=$1
    eval "$name=\"$(sed -n '/^'"$name"'=/,/^[A-Z]\+_*[A-Z]*=/p' $PROPERTIES_FILE|sed -e 's/^'"$name"'=//g' -e 's/"/\\"/g'|head -n -1)\""
}
source_property 'SOME_PROPERTY'
This is a solution that properly parses quotes and terminates at a space when not given quotes. It is safe: no eval is used.
I use this code in my .bashrc and .zshrc for importing variables from shell scripts:
# Usage: _getvar VARIABLE_NAME [sourcefile...]
# Echos the value that would be assigned to VARIABLE_NAME
_getvar() {
    local VAR="$1"
    shift
    awk -v Q="'" -v QQ='"' -v VAR="$VAR" '
        function loc(text) { return index($0, text) }
        function unquote(d) { $0 = substr($0, eq+2) d; print substr($0, 1, loc(d)-1) }
        { sub(/^[ \t]+/, ""); eq = loc("=") }
        substr($0, 1, eq-1) != VAR { next } # assignment is not for VAR: skip
        loc("=" QQ) == eq { unquote(QQ); exit }
        loc("=" Q) == eq { unquote( Q); exit }
        { print substr($1, eq + 1); exit }
    ' "$@"
}
This saves the desired variable name and then shifts the argument array so the rest can be passed as files to awk.
Because it's so hard to call shell variables and refer to quote characters inside awk, I'm defining them as awk variables on the command line. Q is a single quote (apostrophe) character, QQ is a double quote, and VAR is that first argument we saved earlier.
For further convenience, there are two helper functions. The first returns the location of the given text in the current line, and the second prints the content between the first two quotes in the line using quote character d (for "delimiter"). There's a stray d concatenated to the first substr as a safety against multi-line strings (see "Caveats" below).
While I wrote the code for POSIX shell syntax parsing, that appears to differ from your format only in whether there is white space around the assignment. You can add that functionality to the above code by adding sub(/[ \t]*=[ \t]*/, "="); before the sub(…) on awk's line 4 (note: line 1 is blank).
The fourth line strips off leading white space and saves the location of the first equals sign. Please verify that your awk supports \t as tab, this is not guaranteed on ancient UNIX systems.
The substr line compares the text before the equals sign to VAR. If that doesn't match, the line is assigning a different variable, so we skip it and move to the next line.
Now we know we've got the requested variable assignment, so it's just a matter of unraveling the quotes. We do this by searching for the first location of =" (line 6) or =' (line 7) or no quotes (line 8). Each of those lines prints the assigned value.
Caveats: If there is an escaped quote character, we'll return a value truncated to it. Detecting this is a bit nontrivial and I decided not to implement it. There's also a problem of multi-line quotes, which get truncated at the first line break (this is the purpose of the "stray d" mentioned above). Most solutions on this page suffer from these issues.
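For instance, with a hypothetical settings.sh:

$ cat settings.sh
GREETING='hello world'
$ _getvar GREETING settings.sh
hello world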
In order to let Java do the tricky parsing, here's a solution using jrunscript to print the keys and values in a bash read-friendy (key, tab character, value, null character) way:
#!/usr/bin/env bash
jrunscript -e '
    p = new java.util.Properties();
    p.load(java.lang.System.in);
    p.forEach(function(k,v) { out.format("%s\t%s\000", k, v); });
' < /tmp/test.properties \
| while IFS=$'\t' read -d $'\0' -r key value; do
    key=${key//./_}
    printf -v "$key" %s "$value"
    printf '=> %s = "%s"\n' "$key" "$value"
done
I found printf -v in this answer by @david-foerster.
To quote jrunscript: Warning: Nashorn engine is planned to be removed from a future JDK release
