Removing two consecutive line breaks - bash

My file has a lot of line breaks, like this:
This is a line.
This is another line.
I would like to remove these, but only in cases where the first line ends with }, e.g.:
\macro{This is a line.}
This is another line.
That should become:
\macro{This is a line.}This is another line.
How can I remove the line breaks in this situation?

This is what I figured out:
$ sed -n '/}$/{h;:a;n;/^$\|}$/{H;$!ba};H;g;s#}\n*#}#g};p' input.txt
The idea is:
Accumulate all consecutive empty lines and lines ending with '}'.
Substitute }\n* with }.
The last line needs special consideration.

You can just use an editor that supports regular expressions and do a replace in your file. Replace:
}$\n\n
with
}
If you need to do it programmatically, the same principle applies (i.e. using regex for string replacement) but the actual answer will obviously depend on language/environment.
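As a rough sketch of the programmatic route from the shell: assuming GNU sed (its -z option is not POSIX) and that the line ending in } is followed by exactly one empty line, the file can be handled as a single record so the editor-style pattern applies directly. The function name join_after_brace and the sample file are made up for illustration.

```shell
# Hypothetical helper: remove the line break after "}" plus the empty line that follows.
# Assumes GNU sed: -z splits records on NUL instead of newline, so an ordinary
# text file is read as one record and \n can be matched inside the pattern.
join_after_brace() {
  sed -z 's/}\n\n/}/g' "$1"
}

# Demo on a throwaway sample file.
sample=$(mktemp)
printf '%s\n' '\macro{This is a line.}' '' 'This is another line.' > "$sample"
join_after_brace "$sample"
rm -f "$sample"
```

Because the whole file is held in memory as one record, this is probably only a good fit for files of moderate size.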

This might work for you:
sed '$!N;s/}\n$/}/;P;D' file
if there is white space involved, try:
sed '$!N;s/}\s*\n\s*$/}/;P;D' file
or more formally:
sed '$!N;s/}[[:space:]]*\n[[:space:]]*$/}/;P;D' file


extract data between similar patterns

I am trying to use sed to print the contents between two patterns including the first one. I was using this answer as a source.
My file looks like this:
>item_1
abcabcabacabcabcabcabcabacabcabcabcabcabacabcabc
>item_2
bcdbcdbcdbcdbbcdbcdbcdbcdbbcdbcdbcdbcdbbcdbcdbcdbcdbbcdbcdbcdbcdb
>item_3
cdecde
>item_4
defdefdefdefdefdefdef
I want it to start at item_2 (inclusive) and stop at the next occurring > (exclusive). So my code is sed -n '/item_2/,/>/{/>/!p;}'.
The result wanted is:
item_2
bcdbcdbcdbcdbbcdbcdbcdbcdbbcdbcdbcdbcdbbcdbcdbcdbcdbbcdbcdbcdbcdb
but I get it without item_2.
Any ideas?
Using awk, split input by >s and print part(s) matching item_2.
$ awk 'BEGIN{RS=">";ORS=""} /item_2/' file
item_2
bcdbcdbcdbcdbbcdbcdbcdbcdbbcdbcdbcdbcdbbcdbcdbcdbcdbbcdbcdbcdbcdb
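One thing to watch with the plain /item_2/ match is that it would also select a record named item_20. A minimal sketch of a stricter variant, assuming every header sits on its own line; the function name extract_item and the sample data are hypothetical:

```shell
# Hypothetical helper: print the record whose header line is exactly NAME.
# With RS=">" each record looks like "header\nsequence\n", so comparing the
# start of the record against NAME plus a newline matches the header exactly.
extract_item() {
  awk -v name="$1" 'BEGIN { RS = ">"; ORS = "" } index($0, name "\n") == 1' "$2"
}

# Demo on a throwaway sample file.
sample=$(mktemp)
printf '%s\n' '>item_1' 'abcabc' '>item_2' 'bcdbcd' '>item_20' 'cdecde' > "$sample"
extract_item item_2 "$sample"
rm -f "$sample"
```

Note that awk -v interprets backslash escapes in the value, so names containing backslashes would need extra care.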
I would go for the awk method suggested by oguz for its simplicity. Now if you are interested in a sed way, out of curiosity, you could fix what you have already tried with a minor change:
sed -n '/^>item_2/ s/.// ; //,/>/ { />/! p }' input_file
The empty regex // recalls the previous regex, which is handy here to avoid duplicating /item_2/. But keep in mind that // is actually dynamic, it recalls the latest regex evaluated at runtime, which is not necessarily the closest regex on its left (although it's often the case). Depending on the program flow (branching, address range), the content of the same // can change and... actually here we have an interesting example ! (and I'm not saying that because it's my baby ^^)
On a line where /^>item_2/ matches, the s/.// command is executed and the latest regex before // becomes /./, so the following address range is equivalent to /./,/>/.
On a line where /^>item_2/ does not match, the latest regex before // is /^>item_2/ so the range is equivalent to /^>item_2/,/>/.
To avoid confusion here as the effect of // changes during execution, it's important to note that an address range evaluates only its left side when not triggered and only its right side when triggered.
This might work for you (GNU sed):
sed -n ':a;/^>item_2/{s/.//;:b;p;n;/^>/!bb;ba}' file
Turn off implicit printing -n.
If a line begins with >item_2, remove the first character, print the line and fetch the next line.
If that line does not begin with a >, repeat the last two instructions.
Otherwise, repeat the whole set of instructions.
If there will always be only one line following >item_2, then:
sed '/^>item_2/!d;s/.//;n' file

Replacing Middle Part of String Occurring Multiple Times

I have a file, that has variations of this line multiple times:
source = "git::https://github.com/ORGNAME/REPONAME.git?ref=develop"
I am passing through a tag name in a variable. I want to find every line that starts with source and update that line in the file to be
source = "git::https://github.com/ORGNAME/REPONAME.git?ref=$TAG"
This should be doable with awk and sed, but I am having some difficulty making it work. Any help would be much appreciated!
Best,
Keren
Edit: In this scenario, it says "develop", but it could also be set to "feature/test1" or "0.0.1" as well.
Edit2: The line with "source" is also indented by three or four spaces.
This should do:
sed 's/^\([[:blank:]]*source.*[?]ref=\)[^"]*\("\)/\1'"$TAG"'\2/' file
with sed
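$ sed '/^[[:blank:]]*source/s/ref=develop"$/ref='"$TAG"'"/' file
This replaces ref=develop" at the end of the line with ref=$TAG" on lines starting with source (after optional indentation). The $TAG part sits outside the single quotes so that the shell expands it.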
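Since a tag such as feature/test1 contains a /, putting it straight into an s/.../.../ command can break the command. A minimal sketch of one way around that, using | as the delimiter and escaping the characters that are special in a sed replacement; the function name set_ref and the sample file are assumptions for illustration:

```shell
# Hypothetical helper: set ?ref=... on every (possibly indented) "source" line.
# $1 = new tag, $2 = file. Output goes to stdout; the file is not modified.
set_ref() {
  # Escape backslash, ampersand and the | delimiter so the tag is inserted literally.
  escaped=$(printf '%s' "$1" | sed 's/[\\&|]/\\&/g')
  sed 's|^\([[:blank:]]*source.*[?]ref=\)[^"]*"|\1'"$escaped"'"|' "$2"
}

# Demo on a throwaway sample file.
sample=$(mktemp)
printf '%s\n' '    source = "git::https://github.com/ORGNAME/REPONAME.git?ref=develop"' > "$sample"
set_ref 'feature/test1' "$sample"
rm -f "$sample"
```

This assumes the tag is a single line; a tag containing a newline would need different handling.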

search a pattern in each line and append it at the end of that line

I have a file with the following entries:
folder1/a_b.csv folder1/generated/
folder2/folder3/a_b1.csv folder12/generated/
folder4/b_c.csv folder123/generated/
folder5/d.csv folder1/new_folder/generated/
folder6/12.csv folder/anotherfolder/morefolder/evenmorefolder/generated/
I want to copy the csv file name from each line, paste it at the end of that line, and append ".org" to it. Hence, the changed file would look like
folder1/a_b.csv folder1/generated/a_b.csv.org
folder2/folder3/a_b1.csv folder12/generated/a_b1.csv.org
folder4/b_c.csv folder123/generated/b_c.csv.org
folder5/d.csv folder1/new_folder/generated/d.csv.org
folder6/12.csv folder/anotherfolder/morefolder/evenmorefolder/generated/12.csv.org
Basically, I am looking for a command in vim or sed using which I can search a pattern in each line and append it at the end of that line. Is it possible?
Thanks in advance.
Vim
Here's how to do this in Vim:
:%s/\([^/]*\.csv\)\( .*\)/&\1.org/
This global (:%) substitution matches the filename (characters that don't contain /, ending in .csv), and captures \(...\) it. It then matches the rest of the line, and captures that, too.
As a replacement, first keep the original match & (or \0), then append the first capture (\1) with the additional suffix.
sed
Though the regular expression syntax is somewhat different than in Vim, the identical expression can be used with sed:
sed -e 's/\([^/]*\.csv\)\( .*\)/&\1.org/' input
Alternatives
It looks like you want to do file renaming in batches. On Linux, the mmv command-line tool is well suited for that; you'll probably find many similar tools on the web, too.
This might work for you (GNU sed):
sed -r 's|([^/ ]+) .*|&\1.org|' file
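If a regex feels fragile here, the same result can be sketched with awk (a different tool than the question asked for) by splitting the first field on /. This assumes two whitespace-separated fields per line, no spaces inside the paths and no empty lines; the function name append_org_name is made up:

```shell
# Hypothetical helper: append "<csv file name>.org" to each line.
# split() breaks the first field on "/" and parts[n] is the last component.
append_org_name() {
  awk '{ n = split($1, parts, "/"); print $0 parts[n] ".org" }' "$1"
}

# Demo on a throwaway sample file.
sample=$(mktemp)
printf '%s\n' 'folder1/a_b.csv folder1/generated/' \
              'folder2/folder3/a_b1.csv folder12/generated/' > "$sample"
append_org_name "$sample"
rm -f "$sample"
```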

How can I replace a word at a specific line in a file in unix

I've researched other questions on here, but haven't really found one that works for me. I'm trying to select a specific line from a file and replace a string on that line with another string. So I have a file named my_course. I'm trying to modify a line in my_course that starts with "123". On that line I want to replace the string "0," with "1,". Help?
One possibility would be to use sed:
sed '/^123/ s/0,/1,/' my_course
In the first /../ part you specify the pattern you are looking for: ^123 for a line starting with 123.
In the s/from/to/ part you have to specify the substitution to be performed.
Note that by default after substitution the file will be written to stdout. You might want to:
redirect the output using ... > my_new_course
perform the substitution "in place" using the -i switch to sed
If you are using the destructive in place variant you might want to use -iEXTENSION in addition to keep a copy with the given EXTENSION of the original version in case something goes wrong.
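A small sketch of the in-place variant with a backup, assuming a sed that accepts a suffix attached directly to -i (GNU and BSD sed both do); the .bak suffix, the function name bump_123 and the sample data are just examples:

```shell
# Hypothetical helper: edit the file in place and keep the original as FILE.bak.
bump_123() {
  sed -i.bak '/^123/ s/0,/1,/' "$1"
}

# Demo on a throwaway sample file.
course=$(mktemp)
printf '%s\n' '122,0,foo' '123,0,bar' > "$course"
bump_123 "$course"
cat "$course"        # the edited file
cat "$course.bak"    # the untouched original
rm -f "$course" "$course.bak"
```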
EDIT:
To match the desired line with a prefix stored in a variable you have to enclose the sed script in double quotes ", as using single quotes ' will prevent variable expansion:
sed "/^$input/ s/0/1/" my_course
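One caveat with interpolating $input into the sed script is that any regex metacharacters in it (a dot, a slash, and so on) are interpreted rather than matched literally. A possible sketch that compares the prefix as a plain string, using awk instead of sed; the function name replace_on_prefix is hypothetical:

```shell
# Hypothetical helper: on lines that start with the literal prefix $1,
# replace the first "0," with "1,". $2 is the file; output goes to stdout.
replace_on_prefix() {
  awk -v prefix="$1" 'index($0, prefix) == 1 { sub(/0,/, "1,") } { print }' "$2"
}

# Demo: the prefix "1.3" must not match the line starting with "123".
course=$(mktemp)
printf '%s\n' '123,0,x' '1.3,0,y' > "$course"
replace_on_prefix '1.3' "$course"
rm -f "$course"
```

awk -v still processes backslash escapes in the value, so a prefix containing backslashes would need extra care.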
Have you tried this:
sed -e '[line]s/old_string/new_string/' my_course
PS: the [ ] shouldn't be typed; they are there just to make it clear that you should put the line number right before the "s".
Cheers!
In fact, the -e in this case is not necessary, I can write just
sed '<line number>s/<old string>/<new string>/' my_course
This is what worked for me on Fedora 36, GNU bash, version 5.2.15(1)-release (x86_64-redhat-linux-gnu):
sed -i '1129s/additional/extra/' en-US/Design.xml
I know you said you couldn't use line numbers; I don't know how to address that part, but this replaced "additional" with "extra" on line 1129 of that file.

bash templating

I have a template with a placeholder LINKS
and a data file, links.txt, with one URL per line.
How can I substitute LINKS with the content of links.txt in bash?
If I do
#!/bin/bash
LINKS=$(cat links.txt)
sed "s/LINKS/$LINKS/g" template.xml
two problems:
$LINKS has the content of links.txt without newline
sed: 1: "s/LINKS/http://test ...": bad flag in substitute command: '/'
sed is not escaping the // in the links.txt file
thanks
Use some better language instead. I'd write a solution for bash + awk... but that's simply too much effort to go into. (See http://www.gnu.org/manual/gawk/gawk.html#Getline_002fVariable_002fFile if you really want to do that)
Just use any language where you don't have to mix control and content text. For example in python:
#!/usr/bin/env python3
# Read both files whole and replace the placeholder as a plain string.
links = open('links.txt').read()
template = open('template.xml').read()
print(template.replace('LINKS', links))
Watch out if you're trying to force sed solution with some other separator - you'll get into the same problems unless you find something disallowed in urls (but are you verifying that?) If you don't, you already have another problem - links can contain < and > and break your xml.
You can do this using ed:
ed -s template.xml <<EOF
/LINKS/r links.txt
?LINKS?d
w output.txt
q
EOF
The first command goes to the line containing LINKS and reads the contents of links.txt in after it.
The second command searches backwards for the LINKS line and deletes it.
The third command writes the result to output.txt (if you omit output.txt the edits will be saved to template.xml), and q quits.
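The same whole-line replacement can probably be sketched with sed's r command as well, which sidesteps the escaping problem because the link text never becomes part of the sed script. This assumes LINKS sits on a line of its own and that the file name contains no newline; the function name fill_template is made up:

```shell
# Hypothetical helper: replace every line containing LINKS with the contents of a file.
# $1 = template, $2 = file to insert. "r" queues the file for output at the end
# of the cycle and "d" drops the placeholder line itself.
fill_template() {
  sed -e "/LINKS/{r $2" -e 'd' -e '}' "$1"
}

# Demo on throwaway sample files.
template=$(mktemp)
links=$(mktemp)
printf '%s\n' '<list>' 'LINKS' '</list>' > "$template"
printf '%s\n' 'http://test/one' 'http://test/two' > "$links"
fill_template "$template" "$links"
rm -f "$template" "$links"
```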
Try running sed twice. On the first run, replace / with \/. The second run will be the same as what you currently have.
The character following the 's' in the sed command becomes the separator, so you'll want to use a character that is not present in the value of $LINK. For example, you could try a comma:
sed "s,LINKS,${LINK}\n,g" template.xml
Note that I also added a \n to add an additional newline.
Another option is to escape the forward slashes in $LINK, possibly using sed. If you don't have guarantees about the characters in $LINK, this may be safer.
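A minimal sketch of that escaping step, assuming the replacement is a single line (a value with embedded newlines would need more work); the function name substitute_link and the sample URL are hypothetical:

```shell
# Hypothetical helper: replace LINKS in a template with a literal single-line string.
# $1 = replacement text, $2 = template file.
substitute_link() {
  # Escape /, & and backslash, the characters that are special in the
  # replacement part of s/.../.../ when / is the delimiter.
  escaped=$(printf '%s' "$1" | sed 's|[/&\\]|\\&|g')
  sed "s/LINKS/$escaped/g" "$2"
}

# Demo on a throwaway sample file.
template=$(mktemp)
printf '%s\n' '<link>LINKS</link>' > "$template"
substitute_link 'http://test/a&b' "$template"
rm -f "$template"
```

This only makes the text safe for sed; as noted above, characters such as < and > in a link can still break the XML.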
