Append to the previous line for a match - bash

Can I use sed or awk to append to the previous line if a match is found?
I have a file which has the format:
INT32
FSHL (const TP Buffer)
{
INT32
FSHL_lm (const TP Buffer)
{ WORD32 ugo = 0; ...
What I am trying to do is scan for standalone open braces { and append each one to the previous non-blank line. The match should not occur for an open brace followed by anything else on the same line.
The expected output:
INT32
FSHL (const TP Buffer){
INT32
FSHL_lm (const TP Buffer)
{ WORD32 ugo = 0; ...
Thanks for the replies .

This might work for you (GNU sed):
sed '$!N;s/\n\s*{\s*$/{/;P;D' file
Explanation:
$!N unless this is the last line, append the next line to the pattern space.
s/\n\s*{\s*$/{/ replace a newline, followed by any amount of whitespace (including none), an opening curly brace, and any amount of whitespace up to the end of the string, with an opening curly brace.
P print up to and including the first newline.
D delete up to and including the first newline; if anything is left in the pattern space, restart the cycle without reading a new line of input.
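To see the N/P/D cycle at work on the question's sample, here is a minimal sketch. The function name join_braces is made up for illustration, and [[:space:]] is swapped in for GNU's \s on the assumption that it matches the same characters here while also working in other seds.

```shell
# join_braces: hypothetical helper, reads text on stdin.
join_braces() {
    sed '$!N;s/\n[[:space:]]*{[[:space:]]*$/{/;P;D'
}

# The sample from the question, one argument per line.
printf '%s\n' 'INT32' 'FSHL (const TP Buffer)' '{' \
    'INT32' 'FSHL_lm (const TP Buffer)' '{ WORD32 ugo = 0; ...' | join_braces
```

Only the lone { is pulled up onto the FSHL line; the brace that is followed by code stays where it was.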

One way using perl. I read the whole file in slurp mode and use a regular expression to find lines that contain only a curly brace and join them to the previous line. Note the plain -e: with -n the implicit loop would consume the first line before the slurp.
perl -e '
do {
local $/ = undef;
$data = <>;
};
$data =~ s/\n^\s*(\{\s*)$/$1/mg;
print $data
' infile
Assuming infile with the content of the question, output will be:
INT32
FSHL (const TP Buffer){
INT32
FSHL_lm (const TP Buffer)
{ WORD32 ugo = 0; ...

One way using awk:
awk '!(NF == 1 && $1 == "{") { if (line) print line; line = $0; next; } { sub(/^[ \t]+/, "", $0); line = line $0; } END { print line }' file.txt
Or broken out on multiple lines:
!(NF == 1 && $1 == "{") {
if (line) print line
line = $0
next
}
{
sub(/^[ \t]+/, "", $0)
line = line $0
}
END {
print line
}
Results:
INT32
FSHL (const TP Buffer){
INT32
FSHL_lm (const TP Buffer)
{ WORD32 ugo = 0; ...
HTH

[shyam@localhost ~]$ perl -lne 's/^/\n/ if $.>1 && /^\d+/; printf "%s",$_' appendDateText.txt
That will work.
i/p:
06/12/2016 20:30 Test Test Test
TestTest
06/12/2019 20:30 abbs abcbcb abcbc
06/11/2016 20:30 test test
i123312331233123312331233123312331233123312331233Test
06/12/2016 20:30 abc
o/p:
06/12/2016 20:30 Test Test TestTestTest
06/12/2019 20:30 abbs abcbcb abcbc
06/11/2016 20:30 test ##testi123312331233123312331233123312331233123312331233Test

Related

Find, Replace, Remove - with in file

I'm currently using this code:
awk 'BEGIN { s = \"{$CNEW}\" } /WORD_MATCH/ { $0 = s; n = 1 } 1; END { if(!n) print s }' filename > new_filename
This finds the line matching WORD_MATCH in a file called filename and replaces it with $CNEW; the results are written to new_filename.
This all works well, but I have an issue: I may want to DELETE the line instead of replacing it.
So I set $CNEW = '', which gives me a blank line in the file rather than actually removing the line.
Is there any way to adapt the awk command to allow the removal of the line?
The total aim is :
If there isn't a line in the file containing WORD_MATCH add one, based on $CNEW
If there is a line in the file containing WORD_MATCH update that line with the new value from $CNEW
If $CNEW = '' then delete the line containing WORD_MATCH.
There will only be one line in the file containing WORD_MATCH.
Thanks
awk -v s="$CNEW" '/WORD_MATCH/ { n=1; if (s) $0=s; else next; } 1; END { if(s && !n) print s }' file
How it works
-v s="$CNEW"
This creates s as an awk variable with the value $CNEW. Note that the use of -v neatly eliminates the quoting problems that can occur by trying to define s in a BEGIN block.
/WORD_MATCH/ { n=1; if (s) $0=s; else next; }
If the current line matches WORD_MATCH, then set n to 1. If s is non-empty, then set the current line to s. If not, skip the rest of the commands and start over on the next line.
1
This is cryptic shorthand for print the line.
END { if(s && !n) print s }
At the end of the file, if n is still not 1 and s is non-empty, then print s.
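The three cases from the question (replace, delete, append) can be tried out with a small sketch like the following. The wrapper name update_line and the sample lines are assumptions made for the demonstration; the awk program is the one from the answer, only spread over several lines.

```shell
# update_line: hypothetical wrapper. $1 is the new line (may be empty),
# the file contents arrive on stdin.
update_line() {
    awk -v s="$1" '
        /WORD_MATCH/ { n = 1; if (s) $0 = s; else next }
        1
        END { if (s && !n) print s }
    '
}

printf 'a\nWORD_MATCH=old\nb\n' | update_line 'WORD_MATCH=new'   # replace
printf 'a\nWORD_MATCH=old\nb\n' | update_line ''                 # delete
printf 'a\nb\n'                 | update_line 'WORD_MATCH=new'   # append
```

One thing to keep in mind: because the test is a bare if (s), a replacement value of 0 would be treated like the empty string.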

awk print first occurrence after match

I'm trying to print a portion of a text file between two patterns, then return only the first occurrence. Should be simple but I can't seem to find a solution.
cat test.html
if (var == "Option_1"){
document.write("<td>head1</td>")
document.write("<td>text1</td>")
}
if (var == "Option_2"){
document.write("<td>head2</td>")
document.write("<td>text2</td>")
}
if (var == "Option_1"){
document.write("<td>head3</td>")
document.write("<td>text3</td>")
}
This prints all matches:
awk '/Option_1/,/}/' test.txt
I need it to return only the first, i.e.:
if (var == "Option_1"){
document.write("<td>head1</td>")
document.write("<td>text1</td>")
}
Thanks!
Never use range expressions as they make trivial jobs very slightly briefer but then require a complete rewrite or duplicate conditions for even slightly more interesting tasks. Always use a flag:
$ awk '/Option_1/{f=1} f{print; if (/}/) exit}' file
if (var == "Option_1"){
document.write("<td>head1</td>")
document.write("<td>text1</td>")
}
I assumed that there are no } inside the if blocks.
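As a rough sketch of how the flag idiom generalises, the opening and closing markers can be passed in as parameters. The function name first_block and the variable names opener and closer are invented for this example, and the markers are compared as literal substrings with index() rather than as regular expressions.

```shell
# first_block: hypothetical helper. $1 = text that opens the block,
# $2 = text that closes it; input is read from stdin.
first_block() {
    awk -v opener="$1" -v closer="$2" '
        index($0, opener) { f = 1 }                # raise the flag at the opener
        f { print; if (index($0, closer)) exit }   # print until the closer, then stop
    '
}

first_block 'Option_1' '}' <<'EOF'
if (var == "Option_1"){
document.write("<td>head1</td>")
}
if (var == "Option_1"){
document.write("<td>head3</td>")
}
EOF
```

The same assumption as above applies: the closing marker must not appear inside the block.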
Using GNU sed :
sed -n '/Option_1/{:a N;s/}/}/;Ta;p;q}' file
Here's how it works :
/Option_1/{ #search for Option_1
:a #create label a
N; #append next line to pattern space
s/}/}/; #substitute } with }
Ta; #if substitution failed, jump to label a
p; #print pattern space
q #exit
}
Adding to Ed Morton's answer, you can extend it to handle nested if conditions or any other pair of braces inside the if statement (e.g. the braces of a for loop).
awk '/Option_1/{f=1} f{ if(/{/){count++}; print; if(/}/){count--; if(count==0) exit}}' filename
output for:
if (var == "Option_1"){
document.write("<td>head1</td>")
if (condition){
//code
}
document.write("<td>text1</td>")
}
if (var == "Option_2"){
document.write("<td>head2</td>")
document.write("<td>text2</td>")
}
if (var == "Option_1"){
document.write("<td>head3</td>")
document.write("<td>text3</td>")
}
is:
if (var == "Option_1"){
document.write("<td>head1</td>")
if (condition){
//code
}
document.write("<td>text1</td>")
}
count tracks the number of currently open braces, and lines are printed until it drops back to 0.
My input differs from the one in the question, but the information may be useful.
sed '/Option_1/,/}/ !d;/}/q' YourFile
This deletes everything outside the delimited range and quits after the last line of the range (so only one section is printed).
For non-GNU sed, replace the ; after d with a real newline.
You can do,
awk '/Option_1/,/}/{print; if ($0 ~ /}/) exit}' test.txt
This exits after printing the first match

How to get specific data from block of data based on condition

I have a file like this:
[group]
enable = 0
name = green
test = more
[group]
name = blue
test = home
[group]
value = 48
name = orange
test = out
There may be one or more spaces/tabs between the label, the = and the value.
The number of lines may vary in every block.
I would like to get the name, but only if the block does not contain enable = 0.
So output should be:
blue
orange
Here is what I have managed to create:
awk -v RS="group" '!/enable = 0/ {sub(/.*name[[:blank:]]+=[[:blank:]]+/,x);print $1}'
blue
orange
There are several faults with this:
I am not able to set RS to [group]; both RS="[group]" and RS="\[group\]" fail. As written, this will break if name or another label contains group.
I would prefer not to use RS with multiple characters, since this is GNU awk only.
Does anyone have another suggestion? sed or awk is fine, but not a long chain of commands.
If you know that groups are always separated by empty lines, set RS to the empty string:
$ awk -v RS="" '!/enable = 0/ {sub(/.*name[[:blank:]]+=[[:blank:]]+/,x);print $1}'
blue
orange
@devnull explained in his answer that GNU awk also accepts regular expressions in RS, so you could split only at [group] when it is on its own line:
gawk -v RS='(^|\n)[[]group]($|\n)' '!/enable = 0/ {sub(/.*name[[:blank:]]+=[[:blank:]]+/,x);print $1}'
This makes sure we're not splitting at evil names like
[group]
enable = 0
name = [group]
name = evil
test = more
Your problem seems to be:
I am not able to set RS to [group], both this fails RS="[group]" and
RS="\[group\]".
Saying:
RS="[[]group[]]"
should yield the desired result.
In these situations where there's clearly name = value statements within a record, I like to first populate an array with those mappings, e.g.:
map["<name>"] = <value>
and then just use the names to reference the values I want. In this case:
$ awk -v RS= -F'\n' '
{
delete map
for (i=1;i<=NF;i++) {
split($i,tmp,/ *= */)
map[tmp[1]] = tmp[2]
}
}
map["enable"] !~ /^0$/ {
print map["name"]
}
' file
blue
orange
If your version of awk doesn't support deleting a whole array then change delete map to split("",map).
Compared to using REs and/or sub()s., etc., it makes the solution much more robust and extensible in case you want to compare and/or print the values of other fields in future.
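To illustrate the extensibility point, here is a minimal sketch that prints a second field next to the name. It assumes the blocks are separated by blank lines, as the paragraph-mode answers do; the function name enabled_fields and the choice to also print the test value are made up for the example.

```shell
# enabled_fields: hypothetical helper. Reads blank-line-separated blocks on
# stdin and prints the name and test values of every block that does not
# contain "enable = 0".
enabled_fields() {
    awk -v RS= -F'\n' '
    {
        split("", map)                          # portable way to empty the array
        for (i = 1; i <= NF; i++) {
            if (split($i, kv, /[ \t]*=[ \t]*/) == 2)
                map[kv[1]] = kv[2]              # lines without "=" are skipped
        }
        if (map["enable"] != "0")
            print map["name"], map["test"]
    }'
}

enabled_fields <<'EOF'
[group]
enable = 0
name = green
test = more

[group]
name = blue
test = home
EOF
```

Adding another condition or output column only means referencing another key of map; the parsing loop stays untouched.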
Since you have line-separated records, you should consider putting awk in paragraph mode. If you must test for the [group] identifier, simply add code to handle that. Here's some example code that should fulfill your requirements. Run like:
awk -f script.awk file.txt
Contents of script.awk:
BEGIN {
RS=""
}
{
for (i=2; i<=NF; i+=3) {
if ($i == "enable" && $(i+2) == 0) {
f = 1
}
if ($i == "name") {
r = $(i+2)
}
}
}
!(f) && r {
print r
}
{
f = 0
r = ""
}
Results:
blue
orange
This might work for you (GNU sed):
sed -n '/\[group\]/{:a;$!{N;/\n$/!ba};/enable\s*=\s*0/!s/.*name\s*=\s*\(\S\+\).*/\1/p;d}' file
Read the [group] block into the pattern space then substitute out the colour if the enable variable is not set to 0.
sed -n '...' sets sed to run in silent mode: no output unless requested, i.e. by a p or P command.
/\[group\]/{...} when we have a line which contains [group], do what is found inside the curly braces.
:a;$!{N;/\n$/!ba} to loop we need a place to loop to, and :a is that label. $ is the end-of-file address and $! means not the end of file, so $!{...} means run the commands inside the curly braces when it is not the end of file. N appends a newline and the next line to the pattern space, and /\n$/!ba branches (b) back to a unless the pattern space now ends with an empty line. So this collects all lines from a line that contains [group] up to an empty line (or end of file).
/enable\s*=\s*0/!s/.*name\s*=\s*\(\S\+\).*/\1/p if the lines collected contain enable = 0 then do not substitute out the colour. Or to put it another way, if the lines collected so far do not contain enable = 0 do substitute out the colour.
If you don't want to use the record separator, you could use a dummy variable like this:
#!/usr/bin/awk -f
function endgroup() {
if (e == 1) {
print n
}
}
$1 == "name" {
n = $3
}
$1 == "enable" && $3 == 0 {
e = 0;
}
$0 == "[group]" {
endgroup();
e = 1;
}
END {
endgroup();
}
You could actually use Bash for this.
while read -r line; do
    if [[ $line == "[group]" ]]; then
        skip=0    # new block: assume it is enabled
    elif [[ $line =~ ^enable[[:space:]]*=[[:space:]]*0$ ]]; then
        skip=1    # this block is disabled
    elif (( skip == 0 )) && [[ $line =~ ^name[[:space:]]*=[[:space:]]*([a-z]+) ]]; then
        echo "${BASH_REMATCH[1]}"
    fi
done < file
This will only work, however, if enable = 0 always comes before the name line within its block.

sed: how to replace CR and/or LF with "\r" "\n", so any file will be in one line

I have files like
aaa
bbb
ccc
I need to turn them into aaa\r\nbbb\r\nccc with sed.
It should work for both Unix and Windows files, replacing the line endings with \n or \r\n accordingly.
The problem is that sed adds \n at the end of each line but keeps the lines separated. How can I fix it?
These two commands together should do what you want:
sed ':a;N;$!ba;s/\r/\\r/g'
sed ':a;N;$!ba;s/\n/\\n/g'
Pass your input file through both to get the output you want. There's probably a way to combine them into a single expression.
Adapted from this question:
How can I replace a newline (\n) using sed?
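A single pass is also possible. The following is a sketch in awk rather than sed, under the assumption that a carriage return only matters at the end of a line; the function name escape_eols is made up, and every input line (including the last) gets its ending written out as literal text.

```shell
# escape_eols: hypothetical helper, reads stdin and writes one output line.
escape_eols() {
    awk '{
        if (sub(/\r$/, "")) printf "%s\\r\\n", $0   # DOS ending: CR was stripped, emit literal \r\n
        else                printf "%s\\n", $0      # Unix ending: emit literal \n
    } END { print "" }'                             # finish with one real newline
}

printf 'aaa\r\nbbb\r\nccc\r\n' | escape_eols
printf 'aaa\nbbb\nccc\n'       | escape_eols
```

A bare carriage return in the middle of a line is left untouched by this sketch.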
It's possible to merge lines in sed, but personally, I consider needing to change line breaks a sign that it's time to give up on sed and use a more powerful language instead. What you want is one line of perl:
perl -e 'undef $/; while (<>) { s/\n/\\n/g; s/\r/\\r/g; print $_, "\n" }'
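or a dozen or so lines of Python (written here for Python 3, so the data is handled as bytes):
#!/usr/bin/env python3
import fileinput
from sys import stdout

out = stdout.buffer  # binary output, to match the binary input mode
first = True
for line in fileinput.input(mode="rb"):
    # start a new output line for each additional input file
    if fileinput.isfirstline() and not first:
        out.write(b"\n")
    if line.endswith(b"\r\n"):
        out.write(line[:-2] + b"\\r\\n")
    elif line.endswith(b"\n"):
        out.write(line[:-1] + b"\\n")
    elif line.endswith(b"\r"):
        out.write(line[:-1] + b"\\r")
    else:
        out.write(line)  # last line without any terminator
    first = False
if not first:
    out.write(b"\n")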
or 10 lines of C to do the job, but then a whole bunch more because you have to process argv yourself:
#include <stdio.h>
void process_one(FILE *fp)
{
int c;
while ((c = getc(fp)) != EOF)
if (c == '\n') fputs("\\n", stdout);
else if (c == '\r') fputs("\\r", stdout);
else putchar(c);
fclose(fp);
putchar('\n');
}
int main(int argc, char **argv)
{
FILE *cur;
int i, consumed_stdin = 0, rv = 0;
if (argc == 1) /* no arguments */
{
process_one(stdin);
return 0;
}
for (i = 1; i < argc; i++)
{
if (argv[i][0] == '-' && argv[i][1] == 0)
{
if (consumed_stdin)
{
fputs("cannot read stdin twice\n", stderr);
rv = 1;
continue;
}
cur = stdin;
consumed_stdin = 1;
}
else
{
cur = fopen(argv[i], "rb");
if (!cur)
{
perror(argv[i]);
rv = 1;
continue;
}
}
process_one(cur);
}
return rv;
}
awk '{printf("%s\\r\\n",$0)} END {print ""}' file
tr -s '\r' '\n' <file | unix2dos
EDIT (it's been pointed out that the above misses the point entirely! •///•)
tr -s '\r' '\n' <file | perl -pe 's/\s+$/\\r\\n/'
The tr gets rid of empty lines and dos line endings. The pipe means two processes—good on modern hardware.

Joining lines that matches specific conditions in bash

I need a command that will join lines if:
-following line starts with more than 5 spaces
-length of the joined lines won't be greater than 79 characters
-those lines are not between lines with pattern1 and pattern2
-same as above but with another set of patterns, like pattern3 and pattern4
It will work on a file like this:
Long line that contains too much text for combining it with following one
That line cannot be attached to the previous becouse of the length
This one also
becouse it doesn't start with spaces
This one
could be
expanded
pattern1
here are lines
that shouldn't be
changed
pattern2
Another line
to grow
After running the command, output should be:
Long line that contains too much text for combining it with following one
That line cannot be attached to the previous becouse of the length
This one also
becouse that one doesn't start with spaces
This one could be expanded
pattern1
here are lines
that shouldn't be
changed
pattern2
Another line to grow
It can't move part of the line.
I'm using bash 2.05, sed 3.02, awk 3.1.1 and grep 2.5.1, and I don't know how to solve this problem :)
Here's a start for you:
#!/usr/bin/awk -f
BEGIN {
TRUE = printflag1 = printflag2 = 1
FALSE = 0
}
# using two different flags prevents premature enabling when blocks are
# nested or intermingled
/pattern1/ {
printflag1 = FALSE
}
/pattern2/ {
printflag1 = TRUE
}
/pattern3/ {
printflag2 = FALSE
}
/pattern4/ {
printflag2 = TRUE
}
{
line = $0
sub(/^ +/, " ", line)
sub(/ +$/, "", line)
}
/^ / &&
length(accum line) <= 79 &&
printflag1 &&
printflag2 {
accum = accum line
next
}
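{
if (NR > 1) print accum   # nothing has been accumulated before the first line
accum = line
}
END {
if (NR > 0) print accum   # flush the last accumulated line
}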
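Building on that start, the sketch below tries to cover all four conditions from the question. Several things here are assumptions: "more than 5 spaces" is read as six or more leading spaces, the joined pieces are separated by a single space, the pattern lines themselves count as protected, and the names join_lines, skip1, skip2 and prevprot are invented. The six spaces are written out literally because older awks may not support {6,} intervals.

```shell
# join_lines: hypothetical helper, reads stdin.
join_lines() {
    awk '
    /pattern1/ { skip1 = 1 }                 # entering a protected range
    /pattern3/ { skip2 = 1 }
    {
        protected = skip1 || skip2
        cont = $0
        sub(/^ +/, "", cont)                 # continuation text without its indent
        if (!protected && have && !prevprot && $0 ~ /^      / &&
            length(accum) + 1 + length(cont) <= 79) {
            accum = accum " " cont           # join onto the pending line
        } else {
            if (have) print accum            # flush the pending line unchanged
            accum = $0
            have = 1
        }
        prevprot = protected
    }
    /pattern2/ { skip1 = 0 }                 # leaving a protected range
    /pattern4/ { skip2 = 0 }
    END { if (have) print accum }
    '
}

printf 'This one\n      could be\n      expanded\nAnother line\n      to grow\n' | join_lines
```

Lines are only ever joined whole, so a continuation that would push the result past 79 characters is simply left on its own line.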

Resources