Trouble with piping through sed - bash

I am having trouble piping through sed. Once I have piped output to sed, I cannot pipe the output of sed elsewhere.
wget -r -nv http://127.0.0.1:3000/test.html
Outputs:
2010-03-12 04:41:48 URL:http://127.0.0.1:3000/test.html [99/99] -> "127.0.0.1:3000/test.html" [1]
2010-03-12 04:41:48 URL:http://127.0.0.1:3000/robots.txt [83/83] -> "127.0.0.1:3000/robots.txt" [1]
2010-03-12 04:41:48 URL:http://127.0.0.1:3000/shop [22818/22818] -> "127.0.0.1:3000/shop.29" [1]
I pipe the output through sed to get a clean list of URLs:
wget -r -nv http://127.0.0.1:3000/test.html 2>&1 | grep --line-buffered -v ERROR | sed 's/^.*URL:\([^ ]*\).*/\1/g'
Outputs:
http://127.0.0.1:3000/test.html
http://127.0.0.1:3000/robots.txt
http://127.0.0.1:3000/shop
I would like to then dump the output to file, so I do this:
wget -r -nv http://127.0.0.1:3000/test.html 2>&1 | grep --line-buffered -v ERROR | sed 's/^.*URL:\([^ ]*\).*/\1/g' > /tmp/DUMP_FILE
I interrupt the process after a few seconds and check the file, yet it is empty.
Interestingly, the following yields no output (same as above, but piping sed output through cat):
wget -r -nv http://127.0.0.1:3000/test.html 2>&1 | grep --line-buffered -v ERROR | sed 's/^.*URL:\([^ ]*\).*/\1/g' | cat
Why can I not pipe the output of sed to another program like cat?

When sed writes to another process or to a file rather than to a terminal, it buffers its output.
Try adding the --unbuffered (-u) option to sed.
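As a rough sketch of what the fixed pipeline might look like, assuming GNU grep and GNU sed (the -u/--unbuffered flag is a GNU extension). The function name extract_urls is hypothetical, and the sample lines copied from the question stand in for a live wget run:

```shell
#!/bin/bash
# Hypothetical helper: read "wget -nv" style log lines on stdin and print
# only the URLs, flushing after every line so that whatever comes next
# (a file, cat, tee, ...) sees each URL as soon as it is produced.
extract_urls() {
    grep --line-buffered -v ERROR |
        sed -u 's/^.*URL:\([^ ]*\).*/\1/'
}

# Sample input copied from the question, in place of a live wget run.
sample_log='2010-03-12 04:41:48 URL:http://127.0.0.1:3000/test.html [99/99] -> "127.0.0.1:3000/test.html" [1]
2010-03-12 04:41:48 URL:http://127.0.0.1:3000/robots.txt [83/83] -> "127.0.0.1:3000/robots.txt" [1]'

urls=$(printf '%s\n' "$sample_log" | extract_urls)
printf '%s\n' "$urls"
```

With a real crawl, the same helper would sit after wget -r -nv ... 2>&1 and before the redirection to /tmp/DUMP_FILE. A finite sample like this one cannot show the buffering itself, only the shape of the command.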

You can also use awk. Since the URL appears in field 3, you can use $3, and you can drop the grep as well.
awk '!/ERROR/{sub("URL:","",$3);print $3}' file
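awk buffers its output too when it writes to a pipe or a file, so in a live pipeline it may need an explicit flush. A minimal sketch, assuming an awk that supports fflush() without arguments (gawk and mawk do); the function name urls_with_awk and the ERROR line in the sample are made up for illustration:

```shell
#!/bin/bash
# Hypothetical helper: same filtering as the awk one-liner above, plus an
# explicit flush so each URL reaches a pipe or file as soon as it is printed.
urls_with_awk() {
    awk '!/ERROR/ { sub("URL:", "", $3); print $3; fflush() }'
}

# Two lines from the question plus one made-up ERROR line to show the filter.
sample_log='2010-03-12 04:41:48 URL:http://127.0.0.1:3000/test.html [99/99] -> "127.0.0.1:3000/test.html" [1]
made-up ERROR line for illustration
2010-03-12 04:41:48 URL:http://127.0.0.1:3000/shop [22818/22818] -> "127.0.0.1:3000/shop.29" [1]'

awk_urls=$(printf '%s\n' "$sample_log" | urls_with_awk)
printf '%s\n' "$awk_urls"
```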

Related

Why doesn't this sed command put a newline

I have a file, ciao.py, that has only one line in it: print("ciao")
I want to print that line twice, on separate lines, using a pipe. I know that cat ciao.py | sed 's/.*/&\n&/' would work, but I want to do it in two separate parts, simulating the case where I want to print it and then pass it to further commands.
If I do this:
cat ciao.py | sed 's/.*/&\n/' |tee >(xargs echo) | xargs echo
it does not work. It prints print("ciao") print("ciao") on the same line. I don't understand why, since I am adding \n with sed.
I'd guess print(ciao) is appearing twice on the same line because xargs is calling echo with multiple strings: by default, xargs passes groups of input lines to a single invocation of the command you give it.
Is this what you're trying to do?
$ cat ciao.py | sed 's/.*/&\n/' |tee >(xargs -n 1 echo) | xargs -n 1 echo
print(ciao)
print(ciao)
or:
$ cat ciao.py | sed 's/.*/&\n/' |tee >(cat) | xargs -n 1 echo
print(ciao)
print(ciao)
There are, of course, better ways to get that output from that input, e.g.:
$ sed 'p' ciao.py
print("ciao")
print("ciao")
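The grouping behaviour can be checked in isolation. A small sketch with made-up input (the variable names are only illustrative):

```shell
#!/bin/bash
# By default xargs packs as many input items as fit into one command
# line, so echo is called once with both items.
grouped=$(printf 'one\ntwo\n' | xargs echo)

# -n 1 runs the command once per input item, giving one line per item.
separate=$(printf 'one\ntwo\n' | xargs -n 1 echo)

# xargs also interprets quotes in its input, which is why the double
# quotes around ciao are missing from the output shown above.
unquoted=$(printf 'print("ciao")\n' | xargs echo)

printf '%s\n' "$grouped" "$separate" "$unquoted"
```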

tail -f | sed to file doesn't work [duplicate]

I am having an issue with filtering a log file that is being written and writing the output to another file (if possible using tee, so I can see it working as it goes).
I can get it to output on stdout, but not write to a file, either using tee or >>.
I can also get it to write to the file, but only if I drop the -f option from tail, which I need.
So, here is an overview of the commands:
1. tail -f without writing to file: tail -f test.log | sed 's/a/b/' works
2. tail writing to file: tail test.log | sed 's/a/b/' | tee -a a.txt works
3. tail -f writing to file: tail -f test.log | sed 's/a/b/' | tee -a a.txt neither prints to stdout nor writes to the file.
I would like 3. to work.
It's the sed buffering. Use sed -u. man sed:
-u, --unbuffered
load minimal amounts of data from the input files and flush the
output buffers more often
And here's a test for it (creates files foo and bar):
$ for i in {1..3} ; do echo a $i ; sleep 1; done >> foo &
[1] 12218
$ tail -f foo | sed -u 's/a/b/' | tee -a bar
b 1
b 2
b 3
Be quick or increase the {1..3} to suit your skillz.
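If typing fast is a problem, the same test can be scripted so that it stops by itself. This is only a sketch, assuming GNU sed and GNU coreutils (timeout, fractional sleep); the temporary directory and the variable names are illustrative:

```shell
#!/bin/bash
# Self-terminating variant of the foo/bar test above.
workdir=$(mktemp -d)
foo="$workdir/foo"
bar="$workdir/bar"
: > "$foo"

# Writer: append one line per second, as in the interactive test.
( for i in 1 2 3; do echo "a $i"; sleep 1; done >> "$foo" ) &

# timeout stops tail -f after 4 seconds so the whole pipeline can finish.
timeout 4 tail -f "$foo" | sed -u 's/a/b/' | tee -a "$bar" > /dev/null &

# Look at bar while the writer is still running; with sed -u the first
# line(s) should already be there.
sleep 1.5
seen_midway=$(cat "$bar")

# Wait for the writer and the pipeline, then read the final result.
wait
seen_final=$(cat "$bar")
rm -r "$workdir"

printf 'midway:\n%s\nfinal:\n%s\n' "$seen_midway" "$seen_final"
```

Dropping -u from the sed call should leave the midway snapshot empty, since sed then holds its output until its buffer fills or it exits.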

Curl and xargs in piped commands

I want to process an old database where passwords are stored in plain text (comma-separated; the password is the 5th field in the CSV file the database was exported to) and hash them for later use by DokuWiki. Here is my bash command (grep and sed are there to extract the hashed password from the curl output):
cat users.csv | awk 'FS="," { print $4 }' | xargs -l bash -c 'curl -s --data-binary "pass1=$0&pass2=$0" "https://sprhost.com/tools/SMD5.php" -o - ' | xargs | grep -o '<tt.*tt>' | sed -e 's/tt//g' | sed -e 's/<[^>]*>//g'
I get the following message from xargs:
xargs: unmatched single quote; by default quotes are special to xargs unless you use the -0 option
Only the first line of the file is processed, and nothing happens after that.
Using the -0 option and playing around with quotes doesn't solve anything. Where am I going wrong in the command line? Maybe a more advanced language would be better suited to this.
Thanks for your help, LM
In general, if you have such a long pipe of commands, it is better to split them if things go wrong. Going through your pipe:
cat users.csv |
Nothing unexpected there.
awk 'FS="," { print $4 }' |
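You probably wanted awk 'BEGIN {FS=","} { print $4 }'. As written, FS="," is a pattern that is evaluated only after the first record has been read, so the first line is typically still split on whitespace rather than on commas. Try the first two commands in the pipe on their own and check that they produce the correct output.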
xargs -l bash -c 'curl -s --data-binary "pass1=$0&pass2=$0" "https://sprhost.com/tools/SMD5.php" -o - ' |
Nothing wrong there, although there might be better ways to do an MD5 hash.
xargs |
What is this xargs doing in the pipe? It should be removed.
grep -o '<tt.*tt>' |
Note that this will produce two lines:
<tt>$1$17ab075e$0VQMuM3cr5CtElvMxrPcE0</tt>
<tt><your_docuwiki_root>/conf/users.auth.php</tt>
which is probably not what you expected.
sed -e 's/tt//g' |
sed -e 's/<[^>]*>//g'
which will remove the html-tags, though
sed 's/<tt>//;s/<.tt>//'
will do the same.
So I'd say a wrong awk and an xargs too many.
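The awk point can be tried without touching the real data. A minimal sketch with a made-up CSV (the names, addresses and passwords are invented; users.csv is not reproduced here):

```shell
#!/bin/bash
# Made-up CSV in the spirit of users.csv; the first line contains a
# space so that whitespace splitting and comma splitting give different fields.
csv='1,john doe,john@example.com,secret1
2,jane,jane@example.com,secret2'

# Separator set before any record is read: every line is split on commas.
with_begin=$(printf '%s\n' "$csv" | awk 'BEGIN { FS = "," } { print $4 }')

# Equivalent shorthand using the -F option.
with_flag=$(printf '%s\n' "$csv" | awk -F, '{ print $4 }')

# Original form: FS="," is evaluated as a pattern for each record, so the
# first record has already been read when the separator changes.
as_pattern=$(printf '%s\n' "$csv" | awk 'FS="," { print $4 }')

printf 'BEGIN:\n%s\n-F:\n%s\npattern:\n%s\n' "$with_begin" "$with_flag" "$as_pattern"
```

The first two forms should print both passwords. What the third prints for the first line depends on the awk implementation, which is one more reason to avoid it.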

Redirecting piped command into a file in bash

I'm trying to do the following:
ping some.server.com | grep -Po '(?<=\=)[0-9].\.[0-9]' >> file.dat
i.e. I run a command (ping), grep part of the output and redirect the result of grep into a file to be inspected later. While the command itself works (i.e. the part before '>>'), nothing gets written into the file.
How do I do this correctly?
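Use grep's --line-buffered option. When its output goes to a file or a pipe instead of a terminal, grep buffers it in blocks, so nothing appears in file.dat until the buffer fills or grep exits.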
ping some.server.com | grep --line-buffered -Po '(?<=\=)[0-9].\.[0-9]' >> file.dat
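To try the filter without a network, here is a sketch that feeds it made-up lines shaped like Linux ping output. It assumes GNU grep built with -P (PCRE) support; the function name extract_rtt and the addresses are hypothetical:

```shell
#!/bin/bash
# Hypothetical helper: pull the round-trip time out of ping-style lines,
# flushing after each match so a redirect or pipe sees it immediately.
extract_rtt() {
    grep --line-buffered -Po '(?<=\=)[0-9].\.[0-9]'
}

# Made-up lines in place of a live ping run.
sample_ping='64 bytes from 192.0.2.1: icmp_seq=1 ttl=64 time=12.3 ms
64 bytes from 192.0.2.1: icmp_seq=2 ttl=64 time=45.6 ms'

rtts=$(printf '%s\n' "$sample_ping" | extract_rtt)
printf '%s\n' "$rtts"
```

In the real command the helper would take the place of the grep call: ping some.server.com | extract_rtt >> file.dat.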

Pipe output to terminal and file using tee from grep and sed pipe

I'm trying to get the output from a grep and sed pipe to go to the terminal and a text file.
Neither
grep -Filr "string1" * 2>&1 | tee ~/outputfile.txt | sed -i "s|string1|string2|g"
nor
grep -Filr "string1" * | sed -i "s|string1|string2|g" 2>&1 | tee ~/outputfile.txt
work. I get "sed: no input files" going to the terminal so sed is not getting the correct input. I just want to see and write out to a text file which files are modified from the search and replace. I know using find instead of grep would be more efficient since the search wouldn't be done twice, but I'm not sure how to output the file name using find and sed when there is a search hit.
EDIT:
Oops I forgot to include xargs in the code. It should have been:
grep -Filr "string1" * 2>&1 | tee ~/outputfile.txt | xargs sed -i "s|string1|string2|g"
and
grep -Filr "string1" * | xargs sed -i "s|string1|string2|g" 2>&1 | tee ~/outputfile.txt
To be clear, I'm looking for a solution that modifies the matched files with the search and replace, and then outputs the modified files' file names to the terminal and a log file.
The -i option to sed is only useful when sed operates on a file, not on standard input. Drop it, and your first option is correct.
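I'd use a loop (-i.bak edits in place and keeps a backup copy of each file; this spelling works with both GNU and BSD sed):
for i in `grep -lr string1 *`; do sed -i.bak 's/string1/string2/g' "$i"; echo "$i" >> ~/outputfile.txt; done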
I'd advise against using the 'i' option for grep, because it would match files which the sed command won't actually modify.
You can do the same with find and exec, but that's a dangerous tool.
I almost forgot about this. I eventually went with a for loop in a bash script:
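#!/bin/bash
# Replace string1 with string2 in every file that contains it,
# printing each file name and appending it to the log.
# Note: the $( ... ) list still splits file names that contain whitespace.
for i in $( grep -Flr "string1" * ); do
    sed -i "s|string1|string2|g" "$i"
    echo "$i"
    echo "$i" >> ~/outputfile.txt
done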
I'm using the vertical pipe | as the separator, because I'm replacing URL paths with lots of forward slashes.
Thank you both for your help.
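For file names that contain spaces, a variant along these lines might be safer. It is only a sketch, assuming GNU sed (-i without a suffix); the function name replace_and_log and the demo files are made up:

```shell
#!/bin/bash
# Hypothetical helper: replace string1 with string2 in every file under
# directory $1 that contains string1, printing each modified file name
# and appending it to log file $2.
replace_and_log() {
    local dir=$1 log=$2 list
    list=$(mktemp)
    # Collect the matching files first, so grep never scans a file
    # that sed is rewriting at the same moment.
    grep -Flr "string1" "$dir" > "$list"
    # Reading line by line keeps names with spaces intact
    # (names containing newlines would still break).
    while IFS= read -r f; do
        sed -i "s|string1|string2|g" "$f"
        printf '%s\n' "$f"
    done < "$list" | tee -a "$log"
    rm -f "$list"
}

# Demo in a throwaway directory; the log lives outside it so that
# grep does not scan the log itself.
demo=$(mktemp -d)
printf 'see string1 here\n' > "$demo/with space.txt"
printf 'nothing to do\n'    > "$demo/other.txt"
replace_and_log "$demo" "$demo.log"
```

tee -a does the "terminal and log file" part in one step, replacing the two echo lines of the loop above.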
