How to truncate extraneous output using shell script? [duplicate] - bash

This question already has answers here:
How to insert strings containing slashes with sed? [duplicate]
(11 answers)
Why it's not possible to use regex to parse HTML/XML: a formal explanation in layman's terms
(10 answers)
Closed 3 years ago.
I am trying to eliminate everything before and after the JSON contained in a specific part of a webpage so I can send that to a PHP script. I've tried a number of ways to get rid of the container content but all of them so far have failed, including one method that has worked in the exact same syntax for related purposes:
The characters that are between the two asterisks (**) at the beginning and end I need removed:
**var songs = [**{"timestamp":1555176393000,"title":"Enter Sandman","trackId":"ba_5cbb546d-5c1c-490e-9908-761b89dd5166","artist":"Metallica","artistId":"52_65f4f0c5-ef9e-490c-aee3-909e7ae6b2ab","album":"Metallica","albumId":"d0_6e729716-c0eb-3f50-a740-96ac173be50d","npe_id":"3cc5fe24d0ffcbb9152d861f27ae801660"},{"timestamp":1555176702000,"title":"Start Me Up","trackId":"76_d0b86399-11e5-4d11-b4fe-ce4b3f9a4736","artist":"The Rolling Stones","artistId":"1b_b071f9fa-14b0-4217-8e97-eb41da73f598","album":"Tattoo You","albumId":"d1_778b345b-e8a1-4054-b5ba-c611d3fda421","npe_id":"f0dc0ab12ef99a6e0087cad12886509b7b"},{"timestamp":1555176909000,"title":"Fame","trackId":"4e_cdef4b88-7314-431a-9cdd-d457296a65b7","artist":"David Bowie","artistId":"ab_5441c29d-3602-4898-b1a1-b77fa23b8e50","album":"Best of Bowie","albumId":"21_3709ee5a-d087-370f-afb4-f730092c7a94","npe_id":"2b8b3a170baa77125891d72a0474d3343a"},{"timestamp":1555177158000,"title":"Rocket","trackId":"34_aa5b9053-849e-4788-972f-7941303175b6","artist":"Def Leppard","artistId":"c1_7249b899-8db8-43e7-9e6e-22f1e736024e","album":"Hysteria","albumId":"06_de5cf055-d875-41f8-9261-89b11b7ff145","npe_id":"0d87b580f140a85feaebc7d77f75db2a3d"},{"timestamp":1555177826000,"title":"Mama, I'm Coming Home","trackId":"cb_e5b09171-9527-4d24-8ab6-1e922fdd66d3","artist":"Ozzy Osbourne","artistId":"4b_8aa5b65a-5b3c-4029-92bf-47a544356934","album":"No More Tears","albumId":"66_8f3d5a65-036c-3260-b9bb-36f1d0d80c11","npe_id":"6b766464fe945f275bf478192dcd33cfdc"},{"timestamp":1555178076000,"title":"Gold Dust Woman","trackId":"a4_ef8c1eca-f344-4bfb-82ea-763aa8aeaad9","artist":"Fleetwood Mac","artistId":"66_bd13909f-1c29-4c27-a874-d4aaf27c5b1a","album":"2010-01-08: The Rock Boat X, Lido Deck, Carnival Inspiration","albumId":"80_4f229af0-2afc-431d-87ff-f7f6af66268e","npe_id":"f6417d98fd1fefcca227d82a8ac9b84197"},{"timestamp":1555178363000,"title":"With or Without 
You","trackId":"79_6b9a509f-6907-4a6e-9345-2f12da09ba4b","artist":"U2","artistId":"26_a3cb23fc-acd3-4ce0-8f36-1e5aa6a18432","album":"The Joshua Tree","albumId":"0c_d287c703-5c25-3181-85d4-4d8c1a7d8ecd","npe_id":"23b19420196b28e2156ecda87c11b882e0"},{"timestamp":1555178654000,"title":"Who Are You","trackId":"7d_431b9746-c6ec-489d-9199-c83676171ae8","artist":"The Who","artistId":"22_f2fa2f0c-b6d7-4d09-be35-910c110bb342","album":"Who Are You","albumId":"40_b255da2c-6583-35f9-95e3-ef5f9c14e868","npe_id":"e01896f74f24968bb7727eaafbf6250b8f"},{"timestamp":1555179031000,"title":"Authority Song","trackId":"31_f5ff19f7-95f3-4a22-8996-3788c264e0b8","artist":"John Mellencamp","artistId":"4d_0aad6b52-fd93-4ea4-9c5d-1f66e1bc9f0a","album":"Words & Music: John Mellencamp's Greatest Hits","albumId":"9e_1240c510-7015-4484-baac-ce17f5277ea1","npe_id":"244785e3b1d75effb9fdecbb6df76b009f"},{"timestamp":1555179256000,"title":"Touch Me","trackId":"9d_1dd1f86c-2120-45f3-ac9f-3c87257fe414","artist":"The Doors","artistId":"13_9efff43b-3b29-4082-824e-bc82f646f93d","album":"The Soft Parade","albumId":"db_c29d7552-b5df-42b8-aae7-03d1e250cb3a","npe_id":"1b5d155eb2eeee6fc1fdb50a94b100669c"}]**; <ol class="songs tracks"></ol>**
Here is the shell script which produces the above at present:
#!/bin/sh
curl -v --silent http://player.listenlive.co/41851/en/songhistory >/var/tmp/wklh$1.a.txt
pta=`cat /var/tmp/wklh$1.a.txt | grep songs > /var/tmp/wklh$1.b.txt`
ptb=`cat /var/tmp/wklh$1.b.txt | sed -n -e '/var songs = /,/; <span title/ p' > /var/tmp/wklh$1.c.txt`
ptc=`cat /var/tmp/wklh$1.c.txt | grep songs > /var/tmp/wklh$1.d.txt`
#ptd=`cat /var/tmp/wklh$1.d.txt | sed -i 's/var songs = [//g' /var/tmp/wklh$1.d.txt`
#ptd=`cat /var/tmp/wklh$1.d.txt | sed -i 's/}]; <ol class="songs tracks"></ol>//g' /var/tmp/wklh$1.d.txt`
json=`cat /var/tmp/wklh$1.d.txt`
echo $json
metadata=`php /etc/asterisk/scripts/music/wklh.php $json`
echo $metadata
The commented out lines are what I was trying to use to remove the extraneous content, since it is predictable every time. However, when uncommented, I get the following errors:
sed: -e expression #1, char 18: unterminated `s' command
sed: -e expression #1, char 38: unknown option to `s'
I've examined my sed statement, but I can't find any discrepancies between how I use it here and in other working shell scripts.
Is there actually a syntax error here (or unallowed characters)? Or is there a better way I can do this?

Your shell script has serious issues.
The syntax
variable=`commands`
takes the output of commands and assigns it to variable. But in every case, you are redirecting all output to a file; so the variable will always be empty.
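A minimal sketch of that point, using a throwaway temp file rather than the paths from the question (the variable names here are made up for illustration): once the output is redirected, nothing is left for the backticks to capture.

```shell
# Redirecting inside the command substitution leaves nothing to capture.
tmp=$(mktemp)
empty=`printf 'hello\n' > "$tmp"`   # output goes to the file, variable stays empty
captured=`printf 'hello\n'`         # output goes to the variable
printf 'empty=[%s] captured=[%s]\n' "$empty" "$captured"
rm -f "$tmp"
```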
Unless you need the temporary files for reasons which are not revealed in your question (such as maybe being able to check how many bytes of output you got in each temporary file for a monitoring report, or something like that), a pipeline would be much superior.
#!/bin/sh
curl -v --silent http://player.listenlive.co/41851/en/songhistory |
grep songs |
sed -n -e '/var songs = /,/; <span title/ p' |
grep songs |
php /etc/asterisk/scripts/music/wklh.php
This also does away with the useless uses of cat and echo, and so coincidentally removes the quoting errors as well. The grep x | sed -n 's/y/z/p' pattern is a useless use of grep, which can easily be refactored to sed -n '/x/s/y/z/p'.
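As a quick check of that refactoring, here is a small sketch on made-up input (the sample lines are an assumption, not data from the question). Both forms should print the same thing.

```shell
# Compare grep | sed with the single-sed equivalent on sample input.
input='keep x here y
drop this y
x without match'
with_grep=$(printf '%s\n' "$input" | grep x | sed -n 's/y/z/p')
sed_only=$(printf '%s\n' "$input" | sed -n '/x/s/y/z/p')
printf '%s\n' "$with_grep" "$sed_only"
```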

Square brackets are special to sed. Simply escape them.
s/var songs = \[//g
If you use slash / as the regex delimiter, it becomes special. Either escape it or use a different delimiter.
s/}]; <ol class="songs tracks"><\/ol>//g
s|}]; <ol class="songs tracks"></ol>||g
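Putting the two fixes together, a sketch on a shortened stand-in for the page line (the two-song sample is an assumption; the real line is much longer). It escapes the opening bracket and uses | as the delimiter for the part that contains a slash.

```shell
# Shortened stand-in for the line extracted from the page.
line='var songs = [{"title":"Enter Sandman"},{"title":"Fame"}]; <ol class="songs tracks"></ol>'
# \[ matches a literal bracket; | as delimiter avoids escaping </ol>.
stripped=$(printf '%s\n' "$line" |
  sed -e 's/^var songs = \[//' -e 's|]; <ol class="songs tracks"></ol>$||')
printf '%s\n' "$stripped"
```

Note that this removes both the opening [ and the closing ], as the question asked, so the result is a comma-separated run of objects rather than a JSON array.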

If your data is in the file 'd', try GNU sed:
sed -Ez 's/^\*\*[^\*]+\*\*(.+)]\*\*[^\*]+\*\*\s*$/\1/' d
This removes the last ] too, so that the JSON stays correctly balanced.


Append text to top of file using sed doesn't work for variable whose content has "/" [duplicate]

This question already has answers here:
Using different delimiters in sed commands and range addresses
(3 answers)
Closed 1 year ago.
I have a Visual Studio project, which is developed locally. Code files have to be deployed to a remote server. The only problem is the URLs they contain, which are hard-coded.
The project contains URLs such as ?page=one. For the link to be valid on the server, it must be /page/one .
I've decided to replace all URLs in my code files with sed before deployment, but I'm stuck on slashes.
I know this is not a pretty solution, but it's simple and would save me a lot of time. The total number of strings I have to replace is fewer than 10. A total number of files which have to be checked is ~30.
An example describing my situation is below:
The command I'm using:
sed -f replace.txt < a.txt > b.txt
replace.txt which contains all the strings:
s/?page=one&/pageone/g
s/?page=two&/pagetwo/g
s/?page=three&/pagethree/g
a.txt:
?page=one&
?page=two&
?page=three&
Content of b.txt after I run my sed command:
pageone
pagetwo
pagethree
What I want b.txt to contain:
/page/one
/page/two
/page/three
The easiest way would be to use a different delimiter in your search/replace lines, e.g.:
s:?page=one&:pageone:g
You can use any character as a delimiter that's not part of either string. Or, you could escape it with a backslash:
s/\//foo/
Which would replace / with foo. You'd want to use the escaped backslash in cases where you don't know what characters might occur in the replacement strings (if they are shell variables, for example).
The s command can use any character as a delimiter; whatever character comes after the s is used. I was brought up to use a #. Like so:
s#?page=one&#/page/one#g
A very useful but lesser-known fact about sed is that the familiar s/foo/bar/ command can use any punctuation, not only slashes. A common alternative is s#foo#bar#, from which it becomes obvious how to solve your problem.
Add \ before the slashes in the replacement (an unescaped ? is already literal in sed's basic regular expressions):
s/?page=one&/\/page\/one/g
etc.
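For reference, a self-contained sketch of the alternative-delimiter approach applied to the files from the question. The temp directory is an assumption made only so the example can run anywhere.

```shell
# Recreate a.txt and a replace.txt that uses # as the delimiter.
workdir=$(mktemp -d)
printf '%s\n' '?page=one&' '?page=two&' '?page=three&' > "$workdir/a.txt"
cat > "$workdir/replace.txt" <<'EOF'
s#?page=one&#/page/one#g
s#?page=two&#/page/two#g
s#?page=three&#/page/three#g
EOF
result=$(sed -f "$workdir/replace.txt" < "$workdir/a.txt")
printf '%s\n' "$result"
rm -rf "$workdir"
```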
In a system I am developing, the string to be replaced by sed is input text from a user which is stored in a variable and passed to sed.
As noted earlier on this post, if the string contained within the sed command block contains the actual delimiter used by sed - then sed terminates on syntax error. Consider the following example:
This works:
$ VALUE=12345
$ echo "MyVar=%DEF_VALUE%" | sed -e s/%DEF_VALUE%/${VALUE}/g
MyVar=12345
This breaks:
$ VALUE=12345/6
$ echo "MyVar=%DEF_VALUE%" | sed -e s/%DEF_VALUE%/${VALUE}/g
sed: -e expression #1, char 21: unknown option to `s'
Replacing the default delimiter is not a robust solution in my case as I did not want to limit the user from entering specific characters used by sed as the delimiter (e.g. "/").
However, escaping any occurrences of the delimiter in the input string would solve the problem.
Consider the below solution of systematically escaping the delimiter character in the input string before having it parsed by sed.
Such escaping can be implemented as a replacement using sed itself, this replacement is safe even if the input string contains the delimiter - this is since the input string is not part of the sed command block:
$ VALUE=$(echo ${VALUE} | sed -e "s#/#\\\/#g")
$ echo "MyVar=%DEF_VALUE%" | sed -e s/%DEF_VALUE%/${VALUE}/g
MyVar=12345/6
I have converted this to a function to be used by various scripts:
escapeForwardSlashes() {
    # Validate parameters
    if [ -z "$1" ]
    then
        echo "Error - no parameter specified!" >&2
        return 1
    fi
    # Perform replacement (quote the argument so whitespace and globs survive)
    printf '%s\n' "$1" | sed -e 's#/#\\/#g'
    return 0
}
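User input can also contain & or a backslash, which sed treats specially in the replacement part. The following is a sketch of a broader variant, not part of the answer above: the function name escape_sed_replacement is hypothetical, and it assumes the value is a single line.

```shell
# Hypothetical helper: escape /, & and \ so sed treats them literally
# in the replacement part of s/.../.../ (single-line values only).
escape_sed_replacement() {
  printf '%s\n' "$1" | sed -e 's#[/&\]#\\&#g'
}

VALUE='12345/6&7'
escaped=$(escape_sed_replacement "$VALUE")
result=$(echo "MyVar=%DEF_VALUE%" | sed -e "s/%DEF_VALUE%/${escaped}/g")
printf '%s\n' "$result"
```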
this line should work for your 3 examples:
sed -r 's#\?(page)=([^&]*)&#/\1/\2#g' a.txt
I used -r to save some escaping .
the line should be generic for your one, two three case. you don't have to do the sub 3 times
test with your example (a.txt):
kent$ echo "?page=one&
?page=two&
?page=three&"|sed -r 's#\?(page)=([^&]*)&#/\1/\2#g'
/page/one
/page/two
/page/three
replace.txt should be
s/?page=/\/page\//g
s/&//g
please see this article
http://netjunky.net/sed-replace-path-with-slash-separators/
Just using | instead of /
Great answer from Anonymous. \ solved my problem when I tried to escape quotes in HTML strings.
So if you use sed to return some HTML templates (on a server), use double backslash instead of single:
var htmlTemplate = "<div style=\\"color:green;\\"></div>";
A simpler alternative is using AWK, as in this answer:
awk '$0="prefix"$0' file > new_file
You may use an alternative regex delimiter in a search pattern by backslashing it:
sed '\,{some_path},d'
For the s command:
sed 's,{some_path},{other_path},'
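A small sketch of both forms with concrete values. The paths and variable names are placeholders chosen for illustration, and the comma delimiter only works because neither path contains a comma.

```shell
some_path=/usr/local/bin
other_path=/opt/bin
paths='/usr/local/bin/tool
/opt/tool'
# Address with a custom delimiter: the first comma is introduced by a backslash.
kept=$(printf '%s\n' "$paths" | sed "\\,${some_path},d")
# s command with a custom delimiter: no backslash needed.
moved=$(printf '%s\n' "$paths" | sed "s,${some_path},${other_path},")
printf '%s\n' "$kept" "$moved"
```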

Variable contains backslash not working in sed bash

Our log timestamps are in the format dd/Mon/year:time (for example 22/Feb/2018:13).
Our goal is to find the log entries between two different times. We used sed to get the logs between the two times.
sed -n '/22\/Feb\/2018:13:/,/22\/Feb\/2018:16/p' /var/log/apache2/domlogs/access.log
The above command works when run manually. We created two variables called LAST and NOW in the script and assigned the date values as shown below.
NOW="22/Feb/2018:16"
LAST="22/Feb/2018:13"
We used the following sed command to print the same output, but it does not work.
sed -n '/'"$LAST"'/'"$NOW"'/p' /var/log/apache2/domlogs/access.log
The command gives the below error
sed: -e expression #1, char 5: unknown command: `F'
If we use plain strings for LAST and NOW, the above command works fine. The only problem is when the variables contain /.
You can freely change the delimiter of a sed address regular expression by preceding the first delimiter with a backslash, e.g. \!. The following command should work:
sed -n '\!'"$LAST"'!,\!'"$NOW"'!p' /var/log/apache2/domlogs/access.log
If ! is expected to show up in your date-time format, choose another delimiter.
It is worth reading a good sed tutorial before working with sed, e.g.: http://www.grymoire.com/Unix/sed.html
I think you wanted a range search but missed the syntax for it: the two addresses must be separated by a comma. I fixed that above, and the command is tested.
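To try the range address without touching a real access log, here is a sketch against a few made-up log lines (the log content and temp file are assumptions).

```shell
LAST="22/Feb/2018:13"
NOW="22/Feb/2018:16"
log=$(mktemp)
cat > "$log" <<'EOF'
1.1.1.1 - - [22/Feb/2018:12:59:01] "GET /a"
1.1.1.1 - - [22/Feb/2018:13:05:10] "GET /b"
1.1.1.1 - - [22/Feb/2018:14:30:00] "GET /c"
1.1.1.1 - - [22/Feb/2018:16:00:02] "GET /d"
1.1.1.1 - - [22/Feb/2018:17:00:00] "GET /e"
EOF
# \! starts each address with ! as the delimiter; the comma makes it a range.
selected=$(sed -n '\!'"$LAST"'!,\!'"$NOW"'!p' "$log")
printf '%s\n' "$selected"
rm -f "$log"
```

The range ends at the first line matching NOW, so later lines within the same hour are not printed.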

How to scrape end of line in grep? [duplicate]

This question already has answers here:
How to find patterns across multiple lines using grep?
(28 answers)
Closed 6 years ago.
I have a file that contains a sequence already broken into lines, something like this:
CGCCCATGGGTCGTATACGTAATGGGAAAACAAAGCATGGTGTAACTATGGTAAGTGCTA
GACAATACAAGAAGGCTGATATTTGTAGAATAATTCATTTGAATTATTATGCTGTAAATA
GCTAGATTATTATGCATAATTACTTTGAGAGGTGATCAATCAATTCGACCCTTGCCAATT
I want to search a specific pattern in this file like GCTGTAAATAGCTAGATTA for example.
The problem is that the pattern may be cut by a newline at an unpredictable place.
I can use :
grep -e "pattern" file
but it does not skip the newline characters, so it gives no result. How can I modify my command to ignore \n in my search?
Edit:
I don't know whether my query exists in the file, and if it does, I don't know where.
The best solution that came into my mind is
tr -d '\n' < file | grep -e "CTACCCCAGACAAACTGGTCAGATACCAACCATCAGCGAAACTAACCAAACAAA"
but I know there should be more efficient ways to do that.
pattern="GCTGTAAATA"$'\n'"GCTAGATTA" # $'\n' is Bash's way of mentioning special chars
grep -e "$pattern" file
OR
pattern="GCTGTAAATA
GCTAGATTA" # with an actual newline at the end of the first line
grep -e "$pattern" file
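One caveat: grep matches line by line and treats a pattern containing a newline as a list of alternative patterns, so the commands above select lines that contain either half rather than the sequence spanning the break. A sketch of the tr approach from the question, run on the sample lines (the temp file and the found variable are assumptions for illustration):

```shell
seq_file=$(mktemp)
cat > "$seq_file" <<'EOF'
CGCCCATGGGTCGTATACGTAATGGGAAAACAAAGCATGGTGTAACTATGGTAAGTGCTA
GACAATACAAGAAGGCTGATATTTGTAGAATAATTCATTTGAATTATTATGCTGTAAATA
GCTAGATTATTATGCATAATTACTTTGAGAGGTGATCAATCAATTCGACCCTTGCCAATT
EOF
# Join the lines first, then search; -q only reports whether a match exists.
if tr -d '\n' < "$seq_file" | grep -q 'GCTGTAAATAGCTAGATTA'; then
  found=yes
else
  found=no
fi
printf 'found=%s\n' "$found"
rm -f "$seq_file"
```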

How to parse a config file using sed

I've never used sed apart from the few hours trying to solve this. I have a config file with parameters like:
test.us.param=value
test.eu.param=value
prod.us.param=value
prod.eu.param=value
I need to parse these and output this if REGIONID is US:
test.param=value
prod.param=value
Any help on how to do this (with sed or otherwise) would be great.
This works for me:
sed -n 's/\.us\././p'
i.e. if the ".us." can be replaced by a dot, print the result.
If there are hundreds and hundreds of lines it might be more efficient to first search for lines containing .us. and then do the string replacement... AWK is another good choice or pipe grep into sed
cat INPUT_FILE | grep "\.us\." | sed 's/\.us\./\./g'
Of course if '.us.' can be in the value this isn't sufficient.
You could also do this with the address syntax (technically you can embed the second sed into the first statement as well; I just can't remember the syntax)
sed -n '/\(prod\|test\).us.[^=]*=/p' FILE | sed 's/\.us\./\./g'
We should probably do something cleaner. If the format is always environment.region.param we could look at forcing this only to occur on the text PRIOR to the equal sign.
sed -n 's/^\([^.=]*\)\.us\.\([^=]*\)=/\1.\2=/p'
This will only work on lines starting with any number of characters followed by '.', then 'us', then '.', and then any number of characters before the '=' sign. This way we won't modify '.us.' if it is found within a "value".
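Since the question mentions a REGIONID variable, here is a sketch that builds the region into the expression. Lowercasing REGIONID to match the keys is an assumption about how the two relate.

```shell
REGIONID=US
# Assumption: config keys use the lowercase form of REGIONID.
region=$(printf '%s' "$REGIONID" | tr '[:upper:]' '[:lower:]')
config='test.us.param=value
test.eu.param=value
prod.us.param=value
prod.eu.param=value'
# Only touch the key (text before =), and print only lines that changed.
filtered=$(printf '%s\n' "$config" |
  sed -n "s/^\([^.=]*\)\.${region}\.\([^=]*\)=/\1.\2=/p")
printf '%s\n' "$filtered"
```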

Insert line after match using sed

For some reason I can't seem to find a straightforward answer to this and I'm on a bit of a time crunch at the moment. How would I go about inserting a chosen line of text after the first line matching a specific string using the sed command? I have ...
CLIENTSCRIPT="foo"
CLIENTFILE="bar"
And I want to insert a line after the CLIENTSCRIPT= line, resulting in ...
CLIENTSCRIPT="foo"
CLIENTSCRIPT2="hello"
CLIENTFILE="bar"
Try doing this using GNU sed:
sed '/CLIENTSCRIPT="foo"/a CLIENTSCRIPT2="hello"' file
if you want to substitute in-place, use
sed -i '/CLIENTSCRIPT="foo"/a CLIENTSCRIPT2="hello"' file
Output
CLIENTSCRIPT="foo"
CLIENTSCRIPT2="hello"
CLIENTFILE="bar"
Doc
see sed doc and search \a (append)
Note the standard sed syntax (as in POSIX, so supported by all conforming sed implementations around (GNU, OS/X, BSD, Solaris...)):
sed '/CLIENTSCRIPT=/a\
CLIENTSCRIPT2="hello"' file
Or on one line:
sed -e '/CLIENTSCRIPT=/a\' -e 'CLIENTSCRIPT2="hello"' file
(-e expressions (and the contents of -f files) are joined with newlines to make up the sed script that sed interprets.)
The -i option for in-place editing is also a GNU extension, some other implementations (like FreeBSD's) support -i '' for that.
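Where -i is not available or behaves differently, one option is to write to a temporary file and move it into place. A sketch using the standard a\ form (the temp file stands in for the real config file):

```shell
conf=$(mktemp)
printf '%s\n' 'CLIENTSCRIPT="foo"' 'CLIENTFILE="bar"' > "$conf"
# POSIX form of the a command: backslash, newline, then the text to append.
sed '/CLIENTSCRIPT=/a\
CLIENTSCRIPT2="hello"' "$conf" > "$conf.new" && mv "$conf.new" "$conf"
result=$(cat "$conf")
printf '%s\n' "$result"
rm -f "$conf"
```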
Alternatively, for portability, you can use perl instead:
perl -pi -e '$_ .= qq(CLIENTSCRIPT2="hello"\n) if /CLIENTSCRIPT=/' file
Or you could use ed or ex:
printf '%s\n' /CLIENTSCRIPT=/a 'CLIENTSCRIPT2="hello"' . w q | ex -s file
Sed command that works on MacOS (at least, OS 10) and Unix alike (ie. doesn't require gnu sed like Gilles' (currently accepted) one does):
sed -e '/CLIENTSCRIPT="foo"/a\'$'\n''CLIENTSCRIPT2="hello"' file
This works in bash and maybe other shells too that know the $'\n' evaluation quote style. Everything can be on one line and work in older/POSIX sed commands. If there might be multiple lines matching the CLIENTSCRIPT="foo" (or your equivalent) and you wish to only add the extra line the first time, you can rework it as follows:
sed -e '/^ *CLIENTSCRIPT="foo"/b ins' -e b -e ':ins' -e 'a\'$'\n''CLIENTSCRIPT2="hello"' -e ': done' -e 'n;b done' file
(this creates a loop after the line insertion code that just cycles through the rest of the file, never getting back to the first sed command again).
You might notice I added a '^ *' to the matching pattern in case that line shows up in a comment, say, or is indented. It's not 100% perfect but covers some other situations likely to be common. Adjust as required...
These two solutions also get round the problem (for the generic solution to adding a line) that if your new inserted line contains unescaped backslashes or ampersands they will be interpreted by sed and likely not come out the same, just like the \n is - eg. \0 would be the first line matched. Especially handy if you're adding a line that comes from a variable where you'd otherwise have to escape everything first using ${var//} before, or another sed statement etc.
This solution is a little less messy in scripts (though the quoting and \n are not easy to read) when you don't want to put the replacement text for the a command at the start of a line, say in a function with indented lines. I've taken advantage of the fact that $'\n' is evaluated to a newline by the shell, which it is not in regular '\n' single-quoted values.
It's getting long enough, though, that I think perl or even awk might win due to being more readable.
A POSIX compliant one using the s command:
sed '/CLIENTSCRIPT="foo"/s/.*/&\
CLIENTSCRIPT2="hello"/' file
Maybe a bit late to post an answer for this, but I found some of the above solutions a bit cumbersome.
I tried simple string replacement in sed and it worked:
sed 's/CLIENTSCRIPT="foo"/&\nCLIENTSCRIPT2="hello"/' file
& sign reflects the matched string, and then you add \n and the new line.
As mentioned, if you want to do it in-place:
sed -i 's/CLIENTSCRIPT="foo"/&\nCLIENTSCRIPT2="hello"/' file
Another thing. You can match using an expression:
sed -i 's/CLIENTSCRIPT=.*/&\nCLIENTSCRIPT2="hello"/' file
Hope this helps someone
The awk variant :
awk '1;/CLIENTSCRIPT=/{print "CLIENTSCRIPT2=\"hello\""}' file
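The one-liner above adds the line after every match. If only the first match should get it, as the question asks, a flag can guard the insert. A sketch on made-up input with two matching lines (the done and new variable names are arbitrary):

```shell
input='CLIENTSCRIPT="foo"
CLIENTFILE="bar"
CLIENTSCRIPT="again"'
# Print every line; after the first match, print the new line once.
result=$(printf '%s\n' "$input" |
  awk -v new='CLIENTSCRIPT2="hello"' \
    '{ print } !done && /CLIENTSCRIPT=/ { print new; done = 1 }')
printf '%s\n' "$result"
```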
I had a similar task, and was not able to get the above perl solution to work.
Here is my solution:
perl -i -pe "BEGIN{undef $/;} s/^\[mysqld\]$/[mysqld]\n\ncollation-server = utf8_unicode_ci\n/sgm" /etc/mysql/my.cnf
Explanation:
It uses a regular expression to find the line in my /etc/mysql/my.cnf file that contains only [mysqld] and replaces it with
[mysqld]
collation-server = utf8_unicode_ci
effectively adding the collation-server = utf8_unicode_ci line after the line containing [mysqld].
I had to do this recently as well, on both macOS and Linux. After browsing many posts and trying many things, I never got where I wanted to be: a solution that is simple to understand, uses well-known standard commands with simple patterns, fits on one line, is portable, and can be extended with more constraints. Then I looked at it from a different perspective and realized I could drop the one-liner requirement if a two-liner met the rest of my criteria. In the end I came up with this solution, which works on both Ubuntu and Mac:
insertLine=$(( $(grep -n "foo" sample.txt | cut -f1 -d: | head -1) + 1 ))
sed -i -e "$insertLine"' i\'$'\n''bar'$'\n' sample.txt
In the first command, grep prints the numbers of the lines containing "foo", cut and head select the first occurrence, and the arithmetic expansion adds 1 to that line number, since I want to insert after the occurrence.
In the second command, which edits the file in place, "i" inserts: an ANSI-C quoted newline, "bar", then another newline. The result is a new line containing "bar" after the "foo" line. Each of these two commands can be expanded to more complex operations and matching.
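If "foo" is not found at all, the arithmetic above evaluates to 1 and the line is inserted at the top of the file. A sketch of a guarded variant, under the assumptions that the a command is acceptable in place of i and that writing through a temporary file is fine instead of -i:

```shell
sample=$(mktemp)
printf '%s\n' 'alpha' 'foo' 'omega' 'foo' > "$sample"
# Line number of the first match; empty when nothing matches.
first=$(grep -n 'foo' "$sample" | head -n 1 | cut -d: -f1)
if [ -n "$first" ]; then
  # Append "bar" after that line only.
  sed "${first}a\\
bar" "$sample" > "$sample.new" && mv "$sample.new" "$sample"
fi
result=$(cat "$sample")
printf '%s\n' "$result"
rm -f "$sample"
```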
