Removing Time Stamp With Sed - bash

I have found a couple of examples online, but I could not
find a combination that works, as the syntax for sed
is very tricky. If you could kindly point me in the
right direction, I would be highly grateful.
Here is the timestamp that I would like to remove from the file:
00:02:06.580 --> 00:02:07.380
Here is what I already tried:
cat sometextfile.txt | sed -r 's /\[0-9]{2}:[0-9]{2}:[0-9]{2}\/ g'
But I keep getting an error: sed: -e expression #1, char 34: unterminated `s' command
Thanks!!

The syntax is s/ what to replace / what to replace it with /. You are missing the second part. Even if you want to replace it with nothing, you need all three slashes; just don't put anything between the last two. As it is, you have only one slash, because the second one is quoted with \, meaning sed will treat it as part of the expression and look for a literal / in the input.
The beginning of your regex is also wrong. \[0-9]{2} matches the literal string [0-9 followed by exactly two right brackets (]]). Remove the initial backslash (\) if you want to match "exactly two digits".
Also, you never need to do cat filename |; you can just do < filename. In this specific case, sed takes a filename parameter, so you can do without the <, too.
So it should be something like this;
sed -E 's/[0-9]{2}:[0-9]{2}:[0-9]{2}//' sometextfile.txt
(I used -E because it's more portable than -r, which is a GNUism.)
You don't need the g on the end unless there's more than one timestamp per line.
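The command above removes only the first hh:mm:ss group. If the aim is to drop the whole cue timing, the pattern has to cover the milliseconds and the arrow too. A minimal sketch, assuming every timestamp has exactly the shape shown in the question; the function name strip_timestamps is made up for illustration:

```shell
# Hypothetical helper: remove "hh:mm:ss.mmm --> hh:mm:ss.mmm" from each line on stdin.
# Assumes the timestamp layout shown in the question and nothing else.
strip_timestamps() {
  sed -E 's/[0-9]{2}:[0-9]{2}:[0-9]{2}\.[0-9]{3} --> [0-9]{2}:[0-9]{2}:[0-9]{2}\.[0-9]{3}//'
}

# A timestamp-only line becomes an empty line; other lines pass through unchanged.
printf '%s\n' '00:02:06.580 --> 00:02:07.380' 'some subtitle text' | strip_timestamps
```

This leaves an empty line where the timestamp was. To delete such lines outright, the same pattern could be used as an address with the d command instead of in s///.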

Related

How to add a character to a number using "sed"?

I got some output file, with a bug. I can't correct the source code, as it is not mine.
The output is a table of numbers (14 columns, hundreds of rows) that look like "1.5398E+02" (format X.XXXXE-YY). The bug is that if the exponent has three digits, the number looks like "6.4492-137" (missing the "E"). I have been told to run sed with something like:
sed ' s/([0-9]\.[0-9]{4})-([0-9]{3})/\1E-\2/' model.txt > modelCorrect.txt
or maybe
sed ' s/([0-9]\.[0-9]{4})-([0-9]{3})/([0-9]\.[0-9]{4})-E([0-9]{3})' model.txt > modelCorrect.txt
But it doesn't work (sed: -e expression #1, char 39: invalid reference \2 on `s' command's RHS, or sed: -e expression #1, char 63: unterminated `s' command). What am I doing wrong?
Plain sed (without -E) uses basic regular expressions (BREs). There, groups are written as \(...\), so your command contains no groups and the reference \2 is invalid.
To use the "normal" extended regular expressions (EREs), use sed -E.
Other than that, you forgot to allow the + in your regex. And since your file has 14 columns, you may want to replace all matches in each line (using s/.../.../g) instead of just the first one.
Also, it probably is safer to match numbers with an arbitrary number of places. Why invest so much work into checking that the number has the format 1.4444-333 if you could just allow all numbers?
sed -E 's/([0-9])([-+][0-9]+)/\1E\2/g' <<< "6.4492-137 1.23+4"
prints 6.4492E-137 1.23E+4.
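For comparison, the same substitution can be written in BRE syntax, without -E. This is only a sketch of the equivalent form: groups become \(...\), and since + is not a portable BRE quantifier, [0-9][0-9]* stands in for [0-9]+. The function name fix_exponent is made up for illustration:

```shell
# Hypothetical helper: insert the missing "E" before a signed exponent, BRE syntax.
fix_exponent() {
  sed 's/\([0-9]\)\([-+][0-9][0-9]*\)/\1E\2/g'
}

# Values that already contain "E" are left alone, because the character
# before their sign is a letter, not a digit.
printf '%s\n' '6.4492-137 1.5398E+02 1.23+4' | fix_exponent
```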

How to filter a file with a regular expression using the sed command?

Can anyone please tell me what these two commands do?
sed -i 's!{[^{]*\;}! !' file.txt
sed -i 's!{[^{]*{! !' file.txt
I found this example and I cannot figure out the result produced when running the code.
sed -i 's!{[^{]*\;}! !' file.txt
sed -i means edit in place, so file.txt may be altered.
's!....!....!' is a substitute command, split by exclamation marks. Most often you will see slashes used, but sed accepts other delimiters; the delimiter is whichever character follows the s. Note that exclamation marks can cause problems in the shell (history expansion). Since there is no slash in either the pattern or the replacement, I don't see a reason to use them here.
{[^{]*\;} is the pattern to match.
' ' is the replacement: presumably a single blank, if it was copied faithfully, but it might also be a tab, a narrow space, or something else.
Now what is the complicated expression:
{[^{]*\;} a literal pair of curly braces containing ...
[^{] a negated character class (negation is signalled by '^' as the first character), so any character that is not an opening curly brace, followed by
the quantifier *, meaning any number of repetitions, including 0, followed by
a backslash, which is an escape character, as so often (the semicolon does not actually need escaping here),
and a semicolon.
So
'{aaaaa;}' should match
'{a;}' should match
'{;}' should match
'{};}' should match
'{{;}' should not match as a whole, but since the pattern is not anchored, the inner '{;}' still matches
'{a}' should not match
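These cases are easy to check by running the substitution on each string. A small sketch; the helper name blank_block is made up, and the redundant backslash before the semicolon is dropped:

```shell
# Hypothetical helper: apply the first substitution from the question to one string.
# The original \; is written as a plain ; because a semicolon is not special in a regex.
blank_block() {
  printf '%s\n' "$1" | sed 's!{[^{]*;}! !'
}

# Print each input next to its result; brackets make a lone space visible.
# Note that sed matches anywhere in the line, so in '{{;}' the inner '{;}' is replaced.
for s in '{aaaaa;}' '{a;}' '{;}' '{};}' '{{;}' '{a}'; do
  printf '%-10s -> [%s]\n' "$s" "$(blank_block "$s")"
done
```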

Sed capturing too much during substring extraction

I'm trying to parse a curl response in order to retrieve an img src, identified with the alt tag captcha.
So to test my sed expression I tried the following:
echo 'alt="captcha" src="http://example.com/foo.html" /></p>' | sed -n 's/.*alt="captcha" src="\([^"]*\)/\1/p'
However, this echoes
http://example.com/foo.html" /></p>
How can I simply return
http://example.com/foo.html
?
I am new to sed so I would like to know where I'm going wrong.
This answer explains sed's behavior, but 123 - who also gave the right answer to the sed problem succinctly in a comment - points to a potentially better alternative, if you have GNU grep: grep -oP 'alt="captcha" src="\K[^"]*'. GNU grep's -P option supports PCREs, which are more powerful regular expressions than those available in sed.
The issue is not related to greediness, but to the fact that your regex only matches part of the line:
To extract a substring in sed, your regex must match the entire line. Otherwise, any parts not matched by your regex are simply passed through, as happened with the trailing " /></p> in your case; here's a fix:
$ echo 'alt="captcha" src="http://example.com/foo.html" /></p>' |
sed -n 's/.*alt="captcha" src="\([^"]*\).*/\1/p'
http://example.com/foo.html
Note the trailing .* I've added, which ensures that the remainder of the line is matched as well.
Without it, what is left of the input line after the match is simply appended to the result of your substitution; i.e., the " /></p> part. More correctly: the remaining part of the line is simply not replaced.
Therefore, generally, you'd use an approach such as the following (pseudo notation):
sed 's/^...<capture-group>...$/\1/p'
Again, the regex must match the whole line for this to work.
Due to sed's greedy matching, you need neither ^ nor $, though you may choose to add them for clarity of intent.
Caveat: If your capture group has no ambiguity, .* is fine to match the remainder of the line, but .* to match everything before the capture group will not work in all cases - see below.
A simple example to demonstrate the problem:
$ sed -n 's/[^"]*"\([^"]*\)/>>\1<</p' <<<'before"foo"after' # WRONG
>>foo<<"after
Note how \1 does contain the substring of interest captured by \([^"]*\), as intended - the string foo between "..." - but, because the regex stopped matching just before the closing ", the remainder of the line - "after - is still output.
Fixed version, with .* appended to ensure that the whole line matches:
$ sed -n 's/[^"]*"\([^"]*\).*/>>\1<</p' <<<'before"foo"after'
>>foo<<
Also note how [^"]*" is used to match the beginning of the line up to the capture group; .* would not work here, due to sed's greedy matching:
$ sed -n 's/.*"\([^"]*\).*/>>\1<</p' <<<'before"foo"after' # WRONG
>>after<<
.*" greedily matches everything up to the last ", and so the capture group then captures after, which is the run of non-" chars after the closing ".
Use sed grouping. It's always my go-to!
Sed regex:
echo 'alt="captcha" src="http://example.com/foo.html" /></p>' | sed 's/\(^alt.*src=\"\)\(.*\)\(\".*p>\)/\2/g'
Output
http://example.com/foo.html

unterminated address regex while using sed

I am trying to use the sed command to find and print the number that appears between "\MP2=" and "\" in a portion of a line that appears like this in a large .log file
\MP2=-193.0977448\
I am using the command below and getting the following error:
sed "/\MP2=/,/\/p" input.log
sed: -e expression #1, char 12: unterminated address regex
Advice on how to alter this would be greatly appreciated!
Superficially, you just need to double up the backslashes (and it's generally best to use single quotes around the sed program):
sed '/\\MP2=/,/\\/p' input.log
Why? The double-backslash is necessary to tell sed to look for one backslash. The shell also interprets backslashes inside double quoted strings, which complicates things (you'd need to write 4 backslashes to ensure sed sees 2 and interprets it as 'look for 1 backslash') — using single quoted strings avoids that problem.
However, the /pat1/,/pat2/ notation refers to two separate lines. It looks like you really want:
sed -n '/\\MP2=.*\\/p' input.log
The -n suppresses the default printing (probably a good idea on the first alternative too), and the pattern looks for a single line containing \MP2= followed eventually by a backslash.
If you want to print just the number (as the question says), then you need to work a little harder. You need to match everything on the line, but capture just the 'number' and remove everything except the number before printing what's left (which is just the number):
sed -n '/.*\\MP2=\([^\]*\)\\.*/ s//\1/p' input.log
You don't need the double backslash in the [^\] (negated) character class, though it does no harm.
If the starting and ending pattern are on the same line, you need a substitution. The range expression /r1/,/r2/ is true from (an entire) line which matches r1, through to the next entire line which matches r2.
You want this instead;
sed -n 's/.*\\MP2=\([^\\]*\)\\.*/\1/p' file
This extracts just the match, by replacing the entire line with just the match (the escaped parentheses create a group which you can refer back to in the substitution; this is called a back reference. Some sed dialects don't want backslashes before the grouping parentheses.)
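To try the substitution without the real log file, it can be run on a single sample line. A minimal sketch: the fields around \MP2=...\ in the sample are invented for illustration, and extract_mp2 is a made-up name:

```shell
# Hypothetical helper: print only the value between "\MP2=" and the next backslash.
extract_mp2() {
  sed -n 's/.*\\MP2=\([^\\]*\)\\.*/\1/p'
}

# Invented sample line; only the \MP2=-193.0977448\ part comes from the question.
line='\HF=-192.5\MP2=-193.0977448\RMSD=1.0e-09\'
printf '%s\n' "$line" | extract_mp2

# Lines without \MP2= produce no output because of -n and the p flag.
printf '%s\n' 'no match here' | extract_mp2
```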
awk is a better tool for this:
awk -F= '$1=="MP2" {print $2}' RS='\\' input.log
Set the record separator to \ and the field separator to '=', and it's pretty trivial.

Print all characters up to a matching pattern from a file

Maybe a silly question, but I have a text file and need to display everything up to the first pattern match, which is a '/'. (No line contains blank spaces.)
Example.txt:
somename/for/example/
something/as/another/example
thisfile/dir/dir/example
Preferred output:
somename
something
thisfile
I know this grep code will display everything after a matching pattern:
grep -o '/[^\n]*' '/my/file.txt'
So is there any way to do the complete opposite, maybe rm everything after matching pattern or invert to display my preferred output?
Thanks.
If you're calling an external command like grep, you can get the results you require with the sed command, i.e.
echo "something/as/another/example" | sed 's:/.*::'
something
Instead of focusing on what you want to keep, think about what you want to remove, in this case everything after the first '/' char. This is what this sed command does.
The leading s means substitute; the /.* between the first two colons is the pattern to match, meaning the first '/' char and all characters after it. The second half of the sed command is the replacement. With ::, this means replace with nothing.
The traditional idiom for sed is to use s/str/rep/, using / chars to delimit the search from the replacement, but you can use any character you want after the initial s (substitute) command.
Any character other than a backslash or newline can follow the s as the delimiter, so s:/.*:: needs no special marking. A leading backslash before the delimiter (as in \:/.*:d) is needed only when a non-slash delimiter is used in an address.
IHTH.
You can use a much simpler regexp:
/[^/]*/
The forward slash after the caret is the character you are matching up to.
jsFiddle
Assuming filename as "file.txt"
cat file.txt | cut -d "/" -f 1
Here, we are cutting the input line with "/" as the delimiter (-d "/"). Then we select the first field (-f 1).
You just need to include starting anchor ^ and also the / in a negated character class.
grep -o '^[^/]*' file
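The sed, cut and grep approaches can be compared side by side on the sample input from the question. A small sketch; the function names are made up for illustration, and each helper reads from standard input:

```shell
# Hypothetical wrappers around three of the approaches above.
first_field_sed()  { sed 's:/.*::'; }          # delete from the first / onward
first_field_cut()  { cut -d / -f 1; }          # keep the first /-separated field
first_field_grep() { grep -o '^[^/]*'; }       # print the leading run of non-/ chars

# The three lines of Example.txt from the question.
sample='somename/for/example/
something/as/another/example
thisfile/dir/dir/example'

printf '%s\n' "$sample" | first_field_sed
printf '%s\n' "$sample" | first_field_cut
printf '%s\n' "$sample" | first_field_grep
```

One difference worth knowing: for a line that starts with '/', sed and cut print an empty line, while grep -o prints nothing, because it skips empty matches.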
