sed: how to replace CR and/or LF with "\r" "\n", so any file will be in one line - windows

I have files like
aaa
bbb
ccc
I need sed to turn them into aaa\r\nbbb\r\nccc
It should work for both Unix and Windows files, replacing the line endings with \n or \r\n accordingly
The problem is that sed adds \n at the end of line but keeps lines separated. How can I fix it?

These two commands together should do what you want:
sed ':a;N;$!ba;s/\r/\\r/g'
sed ':a;N;$!ba;s/\n/\\n/g'
Pass your input file through both to get the output you want. With GNU sed they can also be combined into a single expression:
sed ':a;N;$!ba;s/\r/\\r/g;s/\n/\\n/g'
Stolen and Modified from this question:
How can I replace a newline (\n) using sed?

It's possible to merge lines in sed, but personally, I consider needing to change line breaks a sign that it's time to give up on sed and use a more powerful language instead. What you want is one line of perl:
perl -e 'undef $/; while (<>) { s/\n/\\n/g; s/\r/\\r/g; print $_, "\n" }'
or 12 lines of python:
#!/usr/bin/env python3
import fileinput
from sys import stdout
out = stdout.buffer  # write bytes so the data passes through unchanged
first = True
for line in fileinput.input(mode="rb"):
    if fileinput.isfirstline() and not first:
        out.write(b"\n")  # end the previous file's output line
    out.write(line.replace(b"\r", b"\\r").replace(b"\n", b"\\n"))
    first = False
if not first:
    out.write(b"\n")
or 10 lines of C to do the job, but then a whole bunch more because you have to process argv yourself:
#include <stdio.h>
void process_one(FILE *fp)
{
    int c;
    while ((c = getc(fp)) != EOF)
        if (c == '\n') fputs("\\n", stdout);
        else if (c == '\r') fputs("\\r", stdout);
        else putchar(c);
    fclose(fp);
    putchar('\n');
}
int main(int argc, char **argv)
{
    FILE *cur;
    int i, consumed_stdin = 0, rv = 0;
    if (argc == 1) /* no arguments */
    {
        process_one(stdin);
        return 0;
    }
    for (i = 1; i < argc; i++)
    {
        if (argv[i][0] == '-' && argv[i][1] == 0) /* "-" means stdin */
        {
            if (consumed_stdin)
            {
                fputs("cannot read stdin twice\n", stderr);
                rv = 1;
                continue;
            }
            cur = stdin;
            consumed_stdin = 1;
        }
        else
        {
            cur = fopen(argv[i], "rb");
            if (!cur)
            {
                perror(argv[i]);
                rv = 1;
                continue;
            }
        }
        process_one(cur);
    }
    return rv;
}

awk '{printf("%s\\r\\n",$0)} END {print ""}' file

tr -s '\r' '\n' <file | unix2dos
EDIT (it's been pointed out that the above misses the point entirely! •///•)
tr -s '\r' '\n' <file | perl -pe 's/\s+$/\\r\\n/'
The tr turns every CR into LF and squeezes runs of newlines, which gets rid of DOS line endings but also of empty lines. The pipe means two processes, which is fine on modern hardware.

Related

using sed to move a string in a multi line pattern

how can I use sed to change this:
typedef struct
{
uint8_t foo;
uint8_t bar;
} a_somestruct_b;
to
pre_somestruct_post = restruct.
int8lu('foo').
int8lu('bar')
I have many "somestruct" structs to convert.
awk solution to get you started:
$ cat tst.awk
/typedef struct/{p=1;next}    # start capturing
p && $1=="}" {
    split($2,a,"_")           # capture "somestruct" in a[2]
    # possibly "pre" and "post" should be "a" and "b" here?
    printf "%s_%s_%s = restruct.\n", "pre", a[2], "post"
    for (j=1;j<=i;j++) printf "%s%s\n", s[j], (j<i?".":"")  # print saved struct fields
    delete s; i=0; p=0        # reinitialize
}
p && NF==2{
    split($1, b, "_")         # capture type
    sub(/;/,"",$2)            # remove ";"
    s[++i]=sprintf(" %slu('%s')", b[1], $2)  # save struct field in array s
}
Testing this with file input.txt:
$ cat input.txt
typedef struct
{
uint8_t foo;
uint8_t bar;
} a_atruct_b;
typedef struct {
uint8_t foo;
uint8_t bar;
} a_bstruct_b;
typedef struct
{
uint8_t foo;
uint8_t bar;
} a_cstruct_b;
gives:
$ awk -f tst.awk input.txt
pre_atruct_post = restruct.
uint8lu('foo').
uint8lu('bar')
pre_bstruct_post = restruct.
uint8lu('foo').
uint8lu('bar')
pre_cstruct_post = restruct.
uint8lu('foo').
uint8lu('bar')
Same thing, as a one-liner:
$ awk '/typedef struct/{p=1;next} p && $1=="}" {split($2,a,"_");printf "%s_%s_%s = restruct.\n", "pre", a[2], "post";for (j=1;j<=i;j++) printf "%s%s\n", s[j], (j<i?".":"");delete s; i=0; p=0} p && NF==2 {split($1, b, "_");sub(/;/,"",$2);s[++i]=sprintf(" %slu('%s')", b[1], $2)}' input.txt
$ cat sed_script
/typedef struct/{ # find the line with "typedef struct"
n;n; # advance two lines
/uint8_t/{ # Find the line with "uint8_t"
s/uint8_t (.*);/int8lu(\x27\1\x27)./; # substitute the line, i.e. int8lu('foo').
h;n; # copy the pattern space to the hold space,
# then go to next line
s/uint8_t (.*);/int8lu(\x27\1\x27)/; # substitute the line, i.e. int8lu('bar')
H;n # append the pattern space to the hold space
# then go to next line
};
s/.*_(.*)_.*/pre_\1_post = restruct./p; # substitute and print the line,
# i.e., pre_somestruct_post = restruct.
g;p # copy the hold space to the pattern space
# and then print
}
$ sed -rn -f sed_script input
pre_somestruct_post = restruct.
int8lu('foo').
int8lu('bar')
After checking that the output is what you want, add the -i option so that sed edits the file in place.
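For comparison, the same conversion can be sketched outside sed. This is only a rough sketch, assuming every struct has the shape shown in the question (uintN_t fields, a name of the form a_NAME_b) and that uintN_t maps to intNlu as in the desired output; the names convert_structs, STRUCT_RE and FIELD_RE are made up for this example.

```python
import re

# One "typedef struct { ... } name;" block; body and name are captured.
STRUCT_RE = re.compile(r"typedef\s+struct\s*\{(.*?)\}\s*(\w+)\s*;", re.S)
# One "uintN_t field;" declaration (assumption: only unsigned fixed-width fields).
FIELD_RE = re.compile(r"uint(\d+)_t\s+(\w+)\s*;")


def convert_structs(source: str) -> str:
    """Convert every typedef struct in source to the restruct form."""
    out = []
    for body, name in STRUCT_RE.findall(source):
        # a_somestruct_b -> somestruct (drop the first and last "_" parts)
        middle = "_".join(name.split("_")[1:-1])
        fields = [f"int{bits}lu('{field}')" for bits, field in FIELD_RE.findall(body)]
        out.append(f"pre_{middle}_post = restruct.\n" + ".\n".join(fields))
    return "\n".join(out)


if __name__ == "__main__":
    sample = "typedef struct\n{\nuint8_t foo;\nuint8_t bar;\n} a_somestruct_b;\n"
    print(convert_structs(sample))
```

Unlike the sed script, this does not depend on each struct having exactly two fields.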

What cases should I use dollar sign and single quotes in transliterate (eg. tr -d '\n'), or any other function?

Say I'm trying to delete newlines or carriage returns. I notice that when I use transliterate to delete the newline characters with tr -d '\n', I get the same results as with tr -d $"\n" or tr -d $'\n'. What's the difference?
I'm not sure how the same applies in sed or grep because they are more complicated. So, I'm trying to figure out tr first as that seems to be a simpler bash program.
tr does its own escaping:
When you write tr -d '\n', the shell passes the two characters \ and n through unchanged, and the tr program itself recognises that sequence and substitutes a newline.
When you write tr -d $'\n', Bash converts \n to a newline character, and tr sees it literally.
If you're experimenting to understand what the shell does, it's probably worth writing a short C program to print out each argument letter by letter - something like:
#include <stdio.h>
int main(int argc, char **argv)
{
int i;
/* Ignore argv[0] - the program name is not interesting */
for (i = 1; i < argc; ++i) {
char *p = argv[i];
printf("argv[%d] =", i);
while (*p)
printf(" %3d", (int)*p++);
printf("\n");
}
return 0;
}
This one prints in decimal, but it's easy to change it to use hex or octal. Running it with $'\n' \n "\n" as arguments gives:
argv[1] = 10
argv[2] = 110
argv[3] = 92 110
showing that in the first case, Bash passes a single newline character, in the second case, just the 'n', and in the final case, both '\' and 'n'.
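If a C compiler is not at hand, the same experiment can be sketched in Python. This is an illustrative equivalent, not part of the original answer: char_codes is a made-up helper name, and it prints Unicode code points, which match the byte values only for ASCII arguments.

```python
import sys


def char_codes(args):
    """Return, for each argument, the list of its character codes."""
    return [[ord(ch) for ch in arg] for arg in args]


if __name__ == "__main__":
    # sys.argv[0] is the script name, so skip it like the C version does.
    for i, codes in enumerate(char_codes(sys.argv[1:]), start=1):
        print(f"argv[{i}] =", " ".join(f"{c:3d}" for c in codes))
```

Saved under any file name and run from Bash with $'\n' \n "\n" as arguments, it should report the same three sequences as the C program.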

awk print first occurrence after match

I'm trying to print a portion of a text file between two patterns, then return only the first occurrence. Should be simple but I can't seem to find a solution.
cat test.html
if (var == "Option_1"){
document.write("<td>head1</td>")
document.write("<td>text1</td>")
}
if (var == "Option_2"){
document.write("<td>head2</td>")
document.write("<td>text2</td>")
}
if (var == "Option_1"){
document.write("<td>head3</td>")
document.write("<td>text3</td>")
}
This prints all matches:
awk '/Option_1/,/}/' test.txt
I need it to return only the first, i.e.:
if (var == "Option_1"){
document.write("<td>head1</td>")
document.write("<td>text1</td>")
}
Thanks!
Never use range expressions: they make trivial jobs very slightly briefer, but then require a complete rewrite or duplicate conditions for even slightly more interesting tasks. Always use a flag:
$ awk '/Option_1/{f=1} f{print; if (/}/) exit}' file
if (var == "Option_1"){
document.write("<td>head1</td>")
document.write("<td>text1</td>")
}
I assumed that there are no } inside the if blocks.
Using GNU sed :
sed -n '/Option_1/{:a N;s/}/}/;Ta;p;q}' file
Here's how it works :
/Option_1/{ #search for Option_1
:a #create label a
N; #append next line to pattern space
s/}/}/; #substitute } with }
Ta; #if substitution failed, jump to label a
p; #print pattern space
q #exit
}
Adding to Ed Morton's answer: you can extend it to handle a nested if, or any other pair of braces inside the if statement (e.g. the braces of a for loop).
awk '/Option_1/{f=1} f{ if(/{/){count++}; print; if(/}/){count--; if(count==0) exit}}' filename
output for:
if (var == "Option_1"){
document.write("<td>head1</td>")
if (condition){
//code
}
document.write("<td>text1</td>")
}
if (var == "Option_2"){
document.write("<td>head2</td>")
document.write("<td>text2</td>")
}
if (var == "Option_1"){
document.write("<td>head3</td>")
document.write("<td>text3</td>")
}
is:
if (var == "Option_1"){
document.write("<td>head1</td>")
if (condition){
//code
}
document.write("<td>text1</td>")
}
count tracks how many opening braces are still unclosed, and lines are printed until it drops back to 0.
My input differs from the one in the question, but the approach may still be useful.
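The brace-counting idea can be sketched in Python as follows. It is only a sketch: first_block is a hypothetical name, and unlike the awk one-liner it counts every brace on a line rather than at most one opening and one closing brace per line.

```python
def first_block(lines, marker):
    """Return the first brace-balanced block starting at a line that contains marker."""
    block = []
    depth = 0
    for line in lines:
        if not block and marker not in line:
            continue  # still looking for the start of the block
        block.append(line)
        depth += line.count("{") - line.count("}")
        # Stop once a closing brace has brought the depth back to zero.
        if depth <= 0 and "}" in line:
            break
    return block


if __name__ == "__main__":
    text = [
        'if (var == "Option_1"){',
        'document.write("<td>head1</td>")',
        "if (condition){",
        "//code",
        "}",
        'document.write("<td>text1</td>")',
        "}",
        'if (var == "Option_2"){',
        "}",
    ]
    print("\n".join(first_block(text, "Option_1")))
```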
sed '/Option_1/,/}/ !d;/}/q' YourFile
This deletes everything outside the range and quits after the range's closing line, so only the first section is printed.
For non-GNU sed, replace the ; after d with a real newline.
You can do,
awk '/Option_1/,/}/{print; if ($0 ~ /}/) exit}' test.txt
This exits after printing the first match

Append to the previous line for a match

can I use sed or awk to append to the previous line if a match is found ?
I have a file which has the format :
INT32
FSHL (const TP Buffer)
{
INT32
FSHL_lm (const TP Buffer)
{ WORD32 ugo = 0; ...
What I am trying to do is scan for independent open braces { and append each one to the previous non-blank line. The match should not occur for an open brace followed by anything else on the same line.
The expected output :
INT32
FSHL (const TP Buffer){
INT32
FSHL_lm (const TP Buffer)
{ WORD32 ugo = 0; ...
Thanks for the replies .
This might work for you (GNU sed):
sed '$!N;s/\n\s*{\s*$/{/;P;D' file
Explanation:
$!N unless this is the last line, append the next line to the pattern space.
s/\n\s*{\s*$/{/ replace a newline followed by optional whitespace, an opening curly brace, and optional whitespace up to the end of the string with a single opening curly brace.
P print up to and including the first newline.
D delete up to and including the first newline and, if anything is left in the pattern space, restart the cycle without reading a new line.
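The effect of those four commands can be approximated in Python. This is a rough sketch under the assumption that lines arrive without their trailing newlines; join_open_braces and LONE_BRACE are names invented for the example, and edge cases such as two lone braces in a row are handled differently from the sed command.

```python
import re

# A line that holds nothing but an opening brace and optional whitespace.
LONE_BRACE = re.compile(r"^\s*\{\s*$")


def join_open_braces(lines):
    """Append each lone '{' line to the line before it."""
    out = []
    for line in lines:
        if out and LONE_BRACE.match(line):
            out[-1] += "{"
        else:
            out.append(line)
    return out


if __name__ == "__main__":
    source = [
        "INT32",
        "FSHL (const TP Buffer)",
        "{",
        "INT32",
        "FSHL_lm (const TP Buffer)",
        "{ WORD32 ugo = 0; ...",
    ]
    print("\n".join(join_open_braces(source)))
```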
One way using perl. I read the whole file in slurp mode and use a regular expression to find lines that contain only a curly brace, joining each one to the previous line without its leading spaces.
perl -e '
do {
    local $/ = undef;
    $data = <>;
};
$data =~ s/\n^\s*(\{\s*)$/$1/mg;
print $data
' infile
Assuming infile with the content of the question, output will be:
INT32
FSHL (const TP Buffer){
INT32
FSHL_lm (const TP Buffer)
{ WORD32 ugo = 0; ...
One way using awk:
awk '!(NF == 1 && $1 == "{") { if (line) print line; line = $0; next; } { sub(/^[ \t]+/, "", $0); line = line $0; } END { print line }' file.txt
Or broken out on multiple lines:
!(NF == 1 && $1 == "{") {
if (line) print line
line = $0
next
}
{
sub(/^[ \t]+/, "", $0)
line = line $0
}
END {
print line
}
Results:
INT32
FSHL (const TP Buffer){
INT32
FSHL_lm (const TP Buffer)
{ WORD32 ugo = 0; ...
HTH
[shyam#localhost ~]$ perl -lne 's/^/\n/ if $.>1 && /^\d+/; printf "%s",$_' appendDateText.txt
That will work.
Input:
06/12/2016 20:30 Test Test Test
TestTest
06/12/2019 20:30 abbs abcbcb abcbc
06/11/2016 20:30 test test
i123312331233123312331233123312331233123312331233Test
06/12/2016 20:30 abc
Output:
06/12/2016 20:30 Test Test TestTestTest
06/12/2019 20:30 abbs abcbcb abcbc
06/11/2016 20:30 test ##testi123312331233123312331233123312331233123312331233Test

Replace a line with multiple lines in a file

I want to replace a single line in a file with multiple lines, e.g., I want to replace a particular function call, say,
foo(1,2)
with
if (a > 1) {
foo(1,2)
} else {
bar(1,2)
}
How can I do it in bash?
This is what the sed s command was built for:
shopt -s extglob
ORIG="foo(1,2)"
REP="if (a > 1) {
foo(1,2)
} else {
bar(1,2)
}"
REP="${REP//+(
)/\\n}"
sed "s/$ORIG/$REP/g" inputfile > outputfile
Note that the REP="${REP//\+( )/\\n}" lines are only needed if you want to define the REP in the formatted way that I did on line two. It might be simpler if you just used \n and \t in REP to begin with.
Edit: Note! You also need to escape /, & and \ in your REP if it contains them, because they are special in a sed replacement.
Edit in response to the OP's question
To change your original file without creating a new file, use sed's --in-place flag, like so:
sed --in-place "s/$ORIG/$REP/g" inputfile
Please be careful with the --in-place flag. Make backups before you run it because all changes will be permanent.
This might work for you:
cat <<\! |
> a
> foo(1,2)
> b
> foo(1,2)
> c
> !
> sed '/foo(1,2)/c\
> if (a > 1) {\
> foo(1,2)\
> } else {\
> bar(1,2)\
> }'
a
if (a > 1) {
foo(1,2)
} else {
bar(1,2)
}
b
if (a > 1) {
foo(1,2)
} else {
bar(1,2)
}
c
To replace strings in-place in a file, you can use ed (as conveniently tagged in the question). Assuming your input file looks like this:
line before
foo(1,2)
line between
foo(1,2)
line after
You can write a script to do the substitution and store it in a file such as script.ed:
%s/\([[:blank:]]*\)foo(1,2)/\1if (a > 1) {\
\1 foo(1,2)\
\1} else {\
\1 bar(1,2)\
\1}/
w
q
Notice that this takes indentation into account; every line is prepended with whatever blanks were there before the function call in the original file, so the result would look like this:
$ ed -s infile < script.ed
$ cat infile
line before
if (a > 1) {
foo(1,2)
} else {
bar(1,2)
}
line between
if (a > 1) {
foo(1,2)
} else {
bar(1,2)
}
line after
Should the function call not be on a line on its own but potentially prepended by other characters that shouldn't be removed, you could use this as the first line of the substitution:
%s/\([[:blank:]]*\)\(.*\)foo(1,2)/\1\2if (a > 1) {\
So this
} something; foo(1,2)
would become
} something; if (a > 1) {
foo(1,2)
} else {
bar(1,2)
}
with indentation still properly accounted for.
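The indentation-preserving idea can also be sketched in Python. This covers only the simpler case where the call sits on a line of its own; wrap_call is a made-up name and the four-space inner indent is an assumption, since the original indentation of the ed script is not visible here.

```python
def wrap_call(lines, call="foo(1,2)"):
    """Replace each line holding only `call` with an if/else block at the same indent."""
    out = []
    for line in lines:
        if line.strip() != call:
            out.append(line)
            continue
        # Reuse whatever blanks preceded the call, like \1 in the ed script.
        indent = line[: len(line) - len(line.lstrip())]
        out.extend([
            f"{indent}if (a > 1) {{",
            f"{indent}    {call}",
            f"{indent}}} else {{",
            f"{indent}    bar(1,2)",
            f"{indent}}}",
        ])
    return out


if __name__ == "__main__":
    print("\n".join(wrap_call(["line before", "  foo(1,2)", "line after"])))
```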

Resources