Remove "../" from text file using sed - bash

I have a text file containing text such as this
../path-to-image/folder1/image.jpg path-to-another-image/folder2/image.png
I would like to remove the "../" part and obtain
path-to-image/folder1/image.jpg path-to-another-image/folder2/image.png
I have tried using sed with
sed -i 's#../##g' file.txt
But I obtain the following:
path-to-imafoldeimage.jpg path-to-another-imafoldeimage.png
All the slashes and some other characters were removed and thus the path to my images was broken.
I looked up how to make it match the exact string using \< and \>:
sed 's#\<../\>#%%#g' file.txt
But the output is identical to input. Is there a way to remove "../" using sed? I need this from command line since I have about 10 files with similar path structures which I will copy into a bunch of directories. Meaning I can't do this manually.

Dots have special meaning in regex syntax (a . matches any single character) and need to be escaped to match literally.
Either [.] (creating a character class of size one) or \. will suffice; I strongly advise the former, as it works properly in a wider array of quoting contexts. Thus:
sed -i 's#[.][.]/##g' file.txt

Dots are special characters in regex. They mean any character (except a newline). So you need to escape them with backslashes in the sed command:
sed -i 's#\.\./##g' file.txt
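To see the difference concretely, here is a small sketch that runs both the original and the corrected expression on the sample line from the question. The variable names (line, broken, fixed) are only for illustration, and the commands print to stdout instead of editing a file in place.

```shell
# Sample line taken from the question.
line='../path-to-image/folder1/image.jpg path-to-another-image/folder2/image.png'

# Unescaped dots: ".." matches ANY two characters, so every
# "two characters followed by a slash" run is deleted.
broken=$(printf '%s\n' "$line" | sed 's#../##g')

# Bracketed dots: only a literal "../" is deleted.
fixed=$(printf '%s\n' "$line" | sed 's#[.][.]/##g')

printf 'broken: %s\n' "$broken"
printf 'fixed:  %s\n' "$fixed"
```

The first result reproduces the mangled output shown in the question; the second is the desired one.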

Do sed -i 's/\.\.\///g' file.txt
's/\.\.\///g' replaces ../ with an empty string, following the syntax 's/string/replacement/g'.
\.\.\/ escapes the dots and the slash. The dots must be escaped because a dot is a special character in regex; the slash must be escaped because it is the delimiter of the s command here. After escaping, \.\.\/ matches the literal string ../.
The following two slashes surround the replacement string, which is empty in this case.
Edit:
For easier legibility (and to avoid escaping the slash):
sed -i 's#\.\./##g' file.txt
This is much closer to your initial attempt. As a revised explanation, \.\./ matches ../; with # as the delimiter, the slash no longer needs to be escaped. The dots are still special characters and must be escaped with a backslash.
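Since the question mentions about 10 files, a sketch of applying the same substitution to several files in one call may help. The temporary directory and the file names one.txt and two.txt are made up for the demonstration. The -i.bak form is used here on the assumption that both GNU and BSD sed accept it; it keeps a backup copy of each file next to the edited one.

```shell
# Hypothetical scratch directory with two sample files.
tmp=$(mktemp -d)
printf '%s\n' '../a/b.jpg c/d.png' > "$tmp/one.txt"
printf '%s\n' '../e/f.jpg ../g/h.png' > "$tmp/two.txt"

# sed accepts several files at once; each is edited in place
# and the original is kept as <name>.bak.
sed -i.bak 's#[.][.]/##g' "$tmp"/*.txt

one=$(cat "$tmp/one.txt")
two=$(cat "$tmp/two.txt")
printf '%s\n%s\n' "$one" "$two"

rm -rf "$tmp"
```

Once the results look right, the .bak files can be deleted.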

Related

Extract text between two special characters

Trying to extract the text between the special characters "\ and \" through sed
Ex: "\hell##$\"},
expected output : hell##$
You can do it quite easily with using a capture-group and backreference with basic regular-expressions:
sed 's/^["][\]\([^\]*\).*$/\1/'
Explanation
Normal substitution sed 's/find/replace/', where
find is ^["][\] a double-quote and \ before beginning the capture \(...\) which contains [^\]* (zero or more characters not a \), the closing of the capture \) and then .*$ the remainder of the string;
replace is \1 (the first backreference) containing the text captured between \(...\).
(note: if your "\ doesn't begin the string, remove the first '^' anchor)
Example
$ echo '"\hell##$\"},' | sed 's/^["][\]\([^\]*\).*$/\1/'
hell##$
Look things over and let me know if you have questions.
This might work for you (GNU sed):
sed -nE '/"\\[^\\]*\\+([^\\"][^\\]*\\+)*"/{s/"\\/\n/;s/.*\n//;s/\\"/\n/;P;D}' file
The solution comes in two parts:
Firstly, a regexp to determine whether a pair of the two delimiters exists. This can be tricky: a negated class alone is insufficient, because edge cases can easily defeat a simplistic approach.
Secondly, once a pair of delimiters does exist, the text between them must be extracted piecemeal.

Using sed to substitute text around dynamic filename

I'm trying to figure out the best method for substituting text in a BASH script. Sed seems to be the best option, but correct me if I'm wrong.
What I'd like to do is take every instance of images/<filename>.png in a file and add surrounding text: {{media("images/<filename>.png")}}. The following code is the closest I've been able to get:
sed -i -e 's:images/.*.png:{{media("images/.*.png")}}:g' file.html
How can I make this happen?
In sed, & in the replacement is replaced with the matched string, so if we can assume no spaces in a filename and a word boundary before and after each, this does what you want (\b and \S are GNU sed extensions):
s:\bimages/\S*\.png\b:{{media("&")}}:g
Apart from doing the substitution, there are a couple issues with your code worth mentioning:
images/.*.png will match images/foo.png, but it will also match images/foopng. Don't forget to escape regex characters: images/.*\.png.
sed quantifiers are always greedy. Suppose you had this input:
Foo images/bar.png baz images/qux.png quux
In this case, the expression images/.*\.png would match everything from the first images to the last .png. The solution above avoids this by using \S instead of . to match only non-whitespace characters.
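The greedy-match problem can be reproduced with the sample input above. This is only a sketch: it swaps in the POSIX class [^[:space:]] in place of \S and drops \b, on the assumption that the script may need to run on a non-GNU sed as well. The variable names are illustrative.

```shell
line='Foo images/bar.png baz images/qux.png quux'

# Greedy .* swallows everything between the first "images/" and the last ".png".
greedy=$(printf '%s\n' "$line" | sed 's:images/.*\.png:{{media("&")}}:g')

# Restricting the match to non-whitespace keeps each filename separate;
# & re-inserts whatever was matched.
fixed=$(printf '%s\n' "$line" | sed 's:images/[^[:space:]]*\.png:{{media("&")}}:g')

printf '%s\n%s\n' "$greedy" "$fixed"
```

The first command wraps both filenames and the text between them in a single media call; the second wraps each filename on its own.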

Remove prefix of each line in a file and output to another file using sed

I have a source code file in which comments are prefixed with // (ie. double slashes and an empty space), I want to convert the source code into a document so I tried to cat file.c and pipe it to sed, the thinking is to replace "double slash and a space" if a line starts with it, with empty string, but it looks like the slash has some special meaning in sed, so what's the best way of constructing the sed arguments?
Thanks!
If you want to remove the special meaning of / in sed, the following may help:
sed 's/^\/\/ //g' Input_file
Here I escape each / with a preceding \, so it is taken as a literal character rather than as the delimiter of the s command. If you are happy with the command's output, add -i to save the changes in Input_file itself. Hope this helps.
The slash only has meaning if you allow it.
sed -E 's#^// +##' < file.c
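As a rough end-to-end sketch of the whole task (strip the prefix and write the result to another file), the following uses an invented file.c in a temporary directory; the file contents and the output name file.txt are assumptions for the demo. Only a // at the very start of a line is removed, so trailing comments stay untouched.

```shell
# Hypothetical sample source file.
tmp=$(mktemp -d)
printf '%s\n' \
  '// Adds two numbers.' \
  'int add(int a, int b) {' \
  '    return a + b; // inline' \
  '}' > "$tmp/file.c"

# Use # as the delimiter so the slashes need no escaping,
# and redirect the output to a separate document file.
sed 's#^// ##' "$tmp/file.c" > "$tmp/file.txt"

result=$(cat "$tmp/file.txt")
printf '%s\n' "$result"

rm -rf "$tmp"
```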

Using dollar sign in sed for both variable replacement and character

I try to use sed to change a line in a file named convergence.gnu
I have a variable named lafila, which is a file name
Currently, I can do:
lafila="nGas060.dat"
sed -i "6s/.*/plot \"$lafila\" using 1:2 with l/" convergence.gnu
This correctly changes the sixth line of my convergence.gnu file to:
plot "nGas060.dat" using 1:2 with l
However, now I want to include a dollar sign in the replaced line to obtain instead:
plot "nGas060.dat" using ($1/1000):2 with l
Can you tell me what to change in my sed command? If I escape a dollar sign it does not work properly. Double dollars also don't work.
I believe your issue is actually being caused by the forward slash in ($1/1000), which clashes with the slashes being used to delimit the various components of the sed command. You either need to escape that forward slash as well, or alternatively use a different character for delimiting the sed strings. Either of the below should work:
lafila="nGas060.dat"
sed -i "6s/.*/plot \"$lafila\" using (\$1\/1000):2 with l/" convergence.gnu
or
lafila="nGas060.dat"
sed -i "6s,.*,plot \"$lafila\" using (\$1/1000):2 with l," convergence.gnu
Using a different delimiting character can be a good way to make your sed string look neater and avoid the leaning toothpick syndrome.
echo foo | sed "s,foo,/there/are/a/lot/of/slashes/here,"
is much nicer than
echo foo | sed "s/foo/\/there\/are\/a\/lot\/of\/slashes\/here/"
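To check the comma-delimited variant without touching a real convergence.gnu, here is a small sketch against a throwaway six-line file. The placeholder file contents are made up, and -i is left out so the result is only printed.

```shell
# Hypothetical stand-in for convergence.gnu: six placeholder lines.
tmp=$(mktemp)
printf 'line %s\n' 1 2 3 4 5 6 > "$tmp"

lafila="nGas060.dat"

# Inside double quotes, \$ reaches sed as a plain $, and with , as the
# delimiter the / in ($1/1000) needs no escaping.
result=$(sed "6s,.*,plot \"$lafila\" using (\$1/1000):2 with l," "$tmp" | tail -n 1)

printf '%s\n' "$result"
rm -f "$tmp"
```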
Use single quotes:
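sed -i '6s/.*/plot "'"$lafila"'" using ($1\/1000):2 with l/' convergence.gnu
Single quotes protect double quotes, and $ is not interpreted inside them. The variable is expanded in a short double-quoted segment between the two single-quoted parts, so it is safe even if the file name contains spaces. However, you do need to escape /.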
See also:
Difference between single and double quotes in Bash
Try it:
lafila="nGas060.dat"
sed -i "6s#.*#plot \"$lafila\" using (\\\$1/1000):2 with l#" convergence.gnu
You only need to escape the backslash and then the dollar sign: inside double quotes the shell reduces \\ to \ and \$ to $, so sed receives \$1.
\\ + \$ == \\\$

Need to diff two text files in linux with some patterns in filelines

File A contains
Test-1.2-3
Test1-2.2-3
Test2-4.2-3
File B contains
Test1
Expected output should be
Test-1.2-3
Test2-4.2-3
diff A B doesn't work as expected.
Kindly let me know if there are any solutions.
Using grep:
grep -vf B A
-f FILE, --file=FILE
Obtain patterns from FILE, one per line. The empty file
contains zero patterns, and therefore matches nothing.
-v, --invert-match
Invert the sense of matching, to select non-matching lines.
Edit:
Optionally, you may want to use the -w option for a more precise match on whole "words" only, which seems to fit your example since your match is followed by '-'. As DevSolar points out, you may also want to use the -F option to prevent the patterns in file B from being interpreted as regular expressions.
grep -vFwf B A
-w, --word-regexp
Select only those lines containing matches that form whole
words. The test is that the matching substring must either be
at the beginning of the line, or preceded by a non-word
constituent character. Similarly, it must be either at the end
of the line or followed by a non-word constituent character.
Word-constituent characters are letters, digits, and the
underscore.
-F, --fixed-strings
Interpret PATTERN as a list of fixed strings (rather than regular
expressions), separated by newlines, any of which is to be matched.
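A quick way to try this is to recreate the two files from the question in a scratch directory. This is only a sketch; the temporary directory is an assumption, while the file names A and B and their contents come from the question.

```shell
tmp=$(mktemp -d)
printf '%s\n' 'Test-1.2-3' 'Test1-2.2-3' 'Test2-4.2-3' > "$tmp/A"
printf '%s\n' 'Test1' > "$tmp/B"

# -v: print non-matching lines, -F: fixed strings, -w: whole words,
# -f B: read the patterns from file B.
result=$(grep -vFwf "$tmp/B" "$tmp/A")

printf '%s\n' "$result"
rm -rf "$tmp"
```

One caveat worth checking in real data: an empty line in B is an empty pattern, which can change what gets filtered.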
To complement Julien Lopez's helpful answer:
If you want to ensure that lines from File B only match at the beginning of lines from File A, you can prepend ^ to each line from file B, using sed:
grep -vf <(sed 's/^/^/' fileB) fileA
grep, which by default interprets its search strings as BREs (basic regular expressions), then interprets the ^ as the beginning-of-line anchor.
If the lines in File B may contain characters that are regex metacharacters (such as ^, *, ?, ...) but should be treated as literals, you must escape them first:
grep -vf <(sed 's/[^^]/[&]/g; s/\^/\\^/g; s/^/^/' fileB) fileA
An explanation of this grim-looking but generically robust sed command can be found in this answer of mine.
Note:
Assumes bash, ksh, or zsh due to use of <(...), a process substitution, which makes the output from sed act as if it were provided via a file.
The sed command s/^/^/ looks like it won't do anything, but the first ^, in the regex part of the call, is the beginning-of-line anchor[1], whereas the second ^, in the substitution part of the call, is a literal to place at the beginning of the line (which will later itself be interpreted as the beginning-of-line anchor in the context of grep).
[1] Strictly speaking, to sed it is the beginning-of-pattern-space anchor, because it is possible to read multiple lines at once with sed, in which case ^ refers to the beginning of the pattern space (input buffer) as a whole, not to individual lines.
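The effect of the escaping step can be seen with a pattern that contains a dot. The sketch below is an illustration under assumptions: the file contents are invented, and the generated patterns are written to temporary files instead of using <(...), so that it also runs in a plain POSIX shell.

```shell
tmp=$(mktemp -d)
printf '%s\n' 'Test.1-a' 'Testx1-b' 'Other-Test.1' > "$tmp/fileA"
printf '%s\n' 'Test.1' > "$tmp/fileB"

# Anchor only: the pattern becomes ^Test.1, where . still matches any character.
sed 's/^/^/' "$tmp/fileB" > "$tmp/anchored"
loose=$(grep -vf "$tmp/anchored" "$tmp/fileA")

# Escape every character first, then anchor: the pattern becomes ^[T][e][s][t][.][1].
sed 's/[^^]/[&]/g; s/\^/\\^/g; s/^/^/' "$tmp/fileB" > "$tmp/escaped"
strict=$(grep -vf "$tmp/escaped" "$tmp/fileA")

printf 'anchored only:\n%s\n' "$loose"
printf 'escaped and anchored:\n%s\n' "$strict"
rm -rf "$tmp"
```

With anchoring alone, Testx1-b is also filtered out because the dot matches x; with escaping, only the line that literally starts with Test.1 is removed.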
