Convert String to HEX using sed command - bash

I need to convert a string in Chinese to its hexadecimal representation. I can do it using sed in the following way:
echo -n 欢迎 | xxd -p -u | sed 's/.\{2\}/&\\x/g' | sed 's/^\(.\{0\}\)/\1\\x/' | sed -r 's/(.*)\\x/\1 /'
which gives me output as:
\xE6\xAC\xA2\xE8\xBF\x8E
This is the output I am looking for. Please suggest how to use sed more efficiently in the above command. The command is being run in an Ubuntu 16.04 terminal.

You can chain sed-commands with ";":
echo -n 欢迎 | xxd -p -u | sed 's/.\{2\}/&\\x/g;s/^\(.\{0\}\)/\1\\x/' | sed -r 's/(.*)\\x/\1 /'
\xE6\xAC\xA2\xE8\xBF\x8E
Since you use sed and sed -r interchangeably, the remaining sed -r expression has to be rewritten in basic-regex syntax before it can be merged with the others:
echo -n 欢迎 | xxd -p -u | sed 's/.\{2\}/&\\x/g;s/^\(.\{0\}\)/\1\\x/;s/\(.*\)\\x/\1 /'
Taking a second look at what xxd outputs without sed, I noticed that the solution is much easier:
echo -n 欢迎 | xxd -p -u | sed -r 's/(..)/\\x\1/g'
Your initial approach appended \x after every 2 characters, but you can prepend it to each pair instead. However, chaining multiple sed commands might still be a useful thing to know.

From an efficiency standpoint, about the best option I could come up with would be to replace xxd, 3 pipes, and 3 calls to sed with od and 2 bash parameter expansions. (There may be more efficient ways, but this is what came to mind.)
For example, you could assign the result of the command substitution $(printf "欢迎" | od -A none -t x1) to a variable, which would then contain ' e6 ac a2 e8 bf 8e'. Then it is simply a matter of converting to upper-case and replacing each space with '\x' (both provided by bash parameter expansions), e.g.
a=$(printf "欢迎" | od -A none -t x1); \
a=${a^^}; \
a=${a// /\\x}; \
echo $a
\xE6\xAC\xA2\xE8\xBF\x8E
(shown with line-continuations above, you can just copy/paste into your terminal to test)
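One caveat with the od approach: as far as I can tell, od wraps its output every 16 bytes, so for longer strings the line breaks survive the space substitution. Below is a minimal sketch of a variant that strips od's layout first. The function name to_hex_escapes is made up for this example, and ${hex^^} assumes bash 4 or later.

```shell
#!/usr/bin/env bash
# to_hex_escapes STRING: print STRING as upper-case \xHH escapes.
# (The function name is hypothetical, chosen for this sketch.)
to_hex_escapes() {
    local hex
    # od prints each byte as two hex digits; -v stops it from collapsing
    # repeated lines into "*". tr removes od's spaces and line breaks.
    hex=$(printf '%s' "$1" | od -An -v -tx1 | tr -d ' \n')
    hex=${hex^^}                              # upper-case, needs bash 4+
    printf '%s\n' "$hex" | sed 's/../\\x&/g'  # put \x in front of each pair
}

to_hex_escapes '欢迎'
```

In a UTF-8 terminal this should give the same \xE6\xAC\xA2\xE8\xBF\x8E as the commands above, and it keeps working when the input is longer than one od output line.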
From Your Request in Comment for C
The code in C to output the upper-case hex bytes contained in your string is trivial, e.g.
#include <stdio.h>
int main (void) {
char *s = "欢迎";
while (*s) /* output each byte in upper-case hex */
printf ("\\x%02hhX", (unsigned char)*s++); /* %02 keeps a leading zero for bytes below 0x10 */
putchar ('\n');
return 0;
}
Example Use/Output
$ ./bin/str2hexbytes
\xE6\xAC\xA2\xE8\xBF\x8E
(note: you could use the exact-width types in stdint.h and the exact-width format specifiers provided in inttypes.h for a more formal solution, but it would accomplish the same thing. Similarly, you could use wide-character types, but virtually all modern compilers have no problem handling multibyte characters in an ordinary string or array of char)

Related

Command execution in sed while preserving unmatched part of the line

It is simple - I have a data stream with IPv4 addresses encoded into hexadecimal representation like for example 0c22384e which stands for 12.34.56.78.
I figured out sed command with substitution of captured octets into decimal numbers separated by dot.
echo "0c22384e" | sed -E 's/([0-9a-f]{2})([0-9a-f]{2})([0-9a-f]{2})([0-9a-f]{2})/printf "%d.%d.%d.%d" 0x\1 0x\2 0x\3 0x\4/eg'
This works with a single number, but as soon as I add some text that is not supposed to be matched, that text is also passed on for execution (via printf in this case).
How can I preserve the unmatched part of the line without being passed for the execution?
With only one address in a line you could use
echo "Something 0c22384e more" |
sed -r 's/(.*)([0-9a-f]{2})([0-9a-f]{2})([0-9a-f]{2})([0-9a-f]{2})(.*)/"\1" 0x\2 0x\3 0x\4 0x\5 "\6"/' |
xargs -n6 printf '%s%d.%d.%d.%d%s\n'
EDIT:
Replaced the solution for one line with several addresses
by a solution for several lines (assuming no '\r' in the stream):
echo "Something 0c22384e more 0c22385e
Second line: 0c22386e and 0c223870
Third line: 0c22388e and 0c223890
4th line: 0c2238ae and 0c2238b0" |
sed 's/$/\r/' |
sed -r 's/[0-9a-f]{8}/\n&\n/g' |
sed -r 's/([0-9a-f]{2})([0-9a-f]{2})([0-9a-f]{2})([0-9a-f]{2})/printf '%d.%d.%d.%d' 0x\1 0x\2 0x\3 0x\4/e' |
tr -d '\n' |
tr '\r' '\n'
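If spawning printf through sed's e flag feels fragile, the same conversion can be sketched in bash alone. This assumes bash (for [[ =~ ]] and 16# arithmetic); hex_ips_to_dotted is a made-up name, and like the sed versions it treats any run of 8 lower-case hex digits as an address.

```shell
#!/usr/bin/env bash
# hex_ips_to_dotted LINE: replace every run of 8 lower-case hex digits in
# LINE with its dotted-decimal form; everything else is copied verbatim.
# (Hypothetical helper name, not from the answers above.)
hex_ips_to_dotted() {
    local line=$1 out='' m
    while [[ $line =~ [0-9a-f]{8} ]]; do
        m=${BASH_REMATCH[0]}
        out+=${line%%"$m"*}        # text before the match, untouched
        # 16#xx makes bash arithmetic read xx as a base-16 number
        out+=$(( 16#${m:0:2} )).$(( 16#${m:2:2} )).$(( 16#${m:4:2} )).$(( 16#${m:6:2} ))
        line=${line#*"$m"}         # carry on after the match
    done
    printf '%s\n' "$out$line"
}

hex_ips_to_dotted 'Something 0c22384e more'
```

For a multi-line stream, wrap the call in while IFS= read -r l; do hex_ips_to_dotted "$l"; done. Nothing outside the matched digits is ever handed to a command, so the unmatched text cannot be executed by accident.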

Elegant way to replace tr '\n' '\0' (Null byte generating warnings at runtime)

I doubt that my code makes the best use of grep, and I would like to find a better and cleaner way of extracting the session ID and security level from my cookie file:
cat mycookie
# Netscape HTTP Cookie File
# https://curl.haxx.se/docs/http-cookies.html
# This file was generated by libcurl! Edit at your own risk.
#HttpOnly_127.0.0.1 FALSE / FALSE 0 PHPSESSID 1hjs18icittvqvpa4tm2lv9b12
#HttpOnly_127.0.0.1 FALSE /mydir/ FALSE 0 security medium
The expected output is the SSID hash :
1hjs18icittvqvpa4tm2lv9b12
Piping grep into tr '\n' '\0' works like a charm on the command line, but generates warnings ("warning: command substitution: ignored null byte in input") when the bash script runs. Here is the related code (with warnings):
ssid=$(grep -Po 'PHPSESSID.*' path/sessionFile | grep -Po '[a-z]|[0-9]' | tr '\n' '\0')
I am using bash 4.4.12 (x86_64-pc-linux-gnu) and could read here this crystal clear explanation :
Bash variables are stored as C strings. C strings are NUL-terminated.
They thus cannot store NULs by definition.
I could see here and there in both cases a coding solution using read:
# read content from stdin into array variable and a scalar variable "suffix"
array=( )
while IFS= read -r -d '' line; do
array+=( "$line" )
done < <(process that generates NUL stream here)
suffix=$line # content after last NUL, if any
# emit recorded content
printf '%s\0' "${array[@]}"; printf '%s' "$suffix"
I don't want to use arrays nor a while loop for this specific case, or others. I found this workaround using sed:
ssid=$(grep -Po 'PHPSESSID.*' path/sessionFile | grep -Po '[a-z]|[0-9]' | tr '\n' '_' | sed -e 's/_//g')
My two questions are:
1) Is there a better way to replace tr '\n' '\0' without using read in a while loop?
2) Is there a better way to extract the SSID and security level properly?
Thx
It looks like you're trying to get rid of the newlines in the output from grep, but turning them into nulls doesn't do this. Nulls aren't visible in your terminal, but are still there and (like many other nonprinting characters) will wreak havoc if they get treated as part of your actual data. If you want to get rid of the newlines, just tell tr to delete them for you with ... | tr -d '\n'. But if you're trying to get the PHPSESSID value from a Netscape-format cookie file, there's a much much better way:
ssid=$(awk '($6 == "PHPSESSID") {print $7}' path/sessionFile)
This looks for "PHPSESSID" only in the sixth field (not in e.g. the path or cookie values -- both places it could legally appear), and specifically prints the seventh field of matching lines (not just anything after "PHPSESSID" that happens to be a digit or lowercase letter).
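Since the question also asks for the security level, the same awk test can be wrapped in a small function that takes the cookie name as a parameter. This is only a sketch: cookie_value is a hypothetical helper name, and it assumes the cookie value itself contains no whitespace.

```shell
#!/usr/bin/env bash
# cookie_value NAME FILE: print the value of cookie NAME from a
# Netscape-format cookie file. (Hypothetical helper for illustration.)
cookie_value() {
    # Field 6 is the cookie name and field 7 its value.
    awk -v name="$1" '$6 == name { print $7 }' "$2"
}

# Example usage (path/sessionFile is the placeholder path from the question):
# ssid=$(cookie_value PHPSESSID path/sessionFile)
# level=$(cookie_value security path/sessionFile)
```

Because no newlines are turned into NUL bytes anywhere, the command substitution warning does not come up.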
You could also try this, if you don't want to use awk:
ssid=$(grep -P '\bPHPSESSID\b' you_cookies_file)
echo $ssid # for debug only
which outputs something like
#HttpOnly_127.0.0.1 FALSE / FALSE 0 PHPSESSID 1hjs18icittvqvpa4tm2lv9b12
Then with cut(1) extract the relevant field:
echo $ssid |cut -d" " -f7
which outputs
1hjs18icittvqvpa4tm2lv9b12
Of course you should capture the last echo.
UPDATE
If you don't want to use cut, it is possible to emulate it with:
echo $ssid | (read a1 b2 c3 d4 e5 f6 g7; echo $g7)
Demonstration to capture in a variable:
$ field=$(echo $ssid | (read a1 b2 c3 d4 e5 f6 g7; echo $g7))
$ echo $field
1hjs18icittvqvpa4tm2lv9b12
$
Another way is to use positional parameters passing the string to a function which then refers to $7. Perhaps cleaner. Otherwise, you can use an array:
array=($(echo $ssid))
echo ${array[6]} # outputs the 7th field
It should also be possible to use regular expressions and/or string manipulation in bash, but they seem a little more difficult to me.
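For what it's worth, the regular-expression route in bash is not too bad. Here is a rough sketch using [[ =~ ]] and BASH_REMATCH; value_after is a made-up name, and the cookie name is assumed to contain no regex metacharacters because it is pasted into the pattern as-is.

```shell
#!/usr/bin/env bash
# value_after NAME LINE: print the whitespace-delimited word that follows
# NAME in LINE. (Hypothetical helper; NAME must not contain regex
# metacharacters, since it is inserted into the pattern unescaped.)
value_after() {
    local re="(^|[[:space:]])$1[[:space:]]+([^[:space:]]+)"
    [[ $2 =~ $re ]] && printf '%s\n' "${BASH_REMATCH[2]}"
}

line='#HttpOnly_127.0.0.1 FALSE / FALSE 0 PHPSESSID 1hjs18icittvqvpa4tm2lv9b12'
value_after PHPSESSID "$line"
```

Unlike the awk version, this does not check that the name sits in the sixth field, so it is better suited to a line that has already been selected.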

How to search & replace arbitrary literal strings in sed and awk (and perl)

Say we have some arbitrary literals in a file that we need to replace with some other literal.
Normally, we'd just reach for sed(1) or awk(1) and code something like:
sed "s/$target/$replacement/g" file.txt
But what if $target and/or $replacement could contain characters that are special to sed(1), such as regular-expression metacharacters? You could escape them, but suppose you don't know what they are; they are arbitrary, OK? You'd need to code up something to escape all possible special characters, including the '/' separator, e.g.
t=$( echo "$target" | sed 's/\./\\./g; s/\*/\\*/g; s/\[/\\[/g; ...' ) # arghhh!
That's pretty awkward for such a simple problem.
perl(1) has \Q ... \E quotes but even that can't cope with the '/' separator in $target.
perl -pe "s/\Q$target\E/$replacement/g" file.txt
I just posted an answer!! So my real question is, "is there a better way to do literal replacements in sed/awk/perl?"
If not, I'll leave this here in case it comes in useful.
The quotemeta, which implements \Q, absolutely does what you ask for
all ASCII characters not matching /[A-Za-z_0-9]/ will be preceded by a backslash
Since this is presumably in a shell script, the problem is really of how and when shell variables get interpolated and so what the Perl program ends up seeing.
The best way is to avoid working out that interpolation mess and instead properly pass those shell variables to the Perl one-liner. This can be done in several ways; see this post for details.
Either pass the shell variables simply as arguments
#!/bin/bash
# define $target
perl -pe'BEGIN { $patt = shift; $repl = shift }; s{\Q$patt}{$repl}g' "$target" "$replacement" file.txt
where the needed arguments are removed from @ARGV and utilized in a BEGIN block, so before the runtime; then file.txt gets processed. The program is in single quotes so that the shell leaves $patt and $repl for Perl to see. There is no need for \E in the regex here.
Or, use the -s switch, which enables command-line switches for the program
# define $target, etc
perl -s -pe's{\Q$patt}{$repl}g' -- -patt="$target" -repl="$replacement" file.txt
The -- is needed to mark the start of arguments, and switches must come before filenames.
Finally, you can also export the shell variables, which can then be used in the Perl script via %ENV; but in general I'd rather recommend either of the above two approaches.
A full example
#!/bin/bash
# Last modified: 2019 Jan 06 (22:15)
target="/{"
replacement="&"
echo "Replace $target with $replacement"
perl -wE'
BEGIN { $p = shift; $r = shift };
$_=q(ah/{yes); s/\Q$p/$r/; say
' "$target" "$replacement"
This prints
Replace /{ with &
ah&yes
where I've used characters mentioned in a comment.
The other way
#!/bin/bash
# Last modified: 2019 Jan 06 (22:05)
target="/{"
replacement="&"
echo "Replace $target with $replacement"
perl -s -wE'$_ = q(ah/{yes); s/\Q$patt/$repl/; say' \
-- -patt="$target" -repl="$replacement"
where code is broken over lines for readability here (and thus needs the \). Same printout.
Me again!
Here's a simpler way using xxd(1):
t=$( echo -n "$target" | xxd -p | tr -d '\n')
r=$( echo -n "$replacement" | xxd -p | tr -d '\n')
xxd -p file.txt | sed "s/$t/$r/g" | xxd -p -r
... so we're hex-encoding the original text with xxd(1) and doing search-replacement using hex-encoded search strings. Finally we hex-decode the result.
EDIT: I forgot to remove \n from the xxd output (| tr -d '\n') so that patterns can span the 60-column output of xxd. Of course, this relies on GNU sed's ability to operate on very long lines (limited only by memory).
EDIT: this also works on multi-line targets eg
target=$'foo\nbar'
replacement=$'bar\nfoo'
With awk you could do it like this:
awk -v t="$target" -v r="$replacement" '{gsub(t,r); print}' file
The above expects t to be a regular expression; to treat it as a literal string you can use
awk -v t="$target" -v r="$replacement" '{while(i=index($0,t)){$0 = substr($0,1,i-1) r substr($0,i+length(t))} print}' file
Inspired from this post
Note that this won't work properly if the replacement string contains the target. The above link has solutions for that too.
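One way around that limitation, sketched below, is to search only in the part of the line that has not been rewritten yet, so text that was just inserted is never scanned again. The name awk_replace_literal is made up, and the strings are passed through ENVIRON instead of -v on the assumption that backslashes in them should stay literal (-v processes escape sequences).

```shell
#!/usr/bin/env bash
# awk_replace_literal TARGET REPLACEMENT [FILE...]
# Literal search and replace that is safe when REPLACEMENT contains TARGET.
# (Hypothetical helper; reads stdin when no FILE is given.)
awk_replace_literal() {
    T=$1 R=$2 awk '
        BEGIN { t = ENVIRON["T"]; r = ENVIRON["R"]; n = length(t) }
        n == 0 { print; next }            # nothing to search for
        {
            out = ""; rest = $0
            # index() does a plain substring search, no regex involved
            while ((i = index(rest, t)) > 0) {
                out = out substr(rest, 1, i - 1) r
                rest = substr(rest, i + n)   # skip past the match
            }
            print out rest
        }' "${@:3}"
}

printf 'a.b a.b axb\n' | awk_replace_literal 'a.b' 'a.b.c'
```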
This is an enhancement
of wef’s answer.
We can remove the issue of the special meaning of various special characters
and strings (^, ., [, *, $, \(, \), \{, \}, \+, \?,
&, \1, …, whatever, and the / delimiter)
by removing the special characters. 
Specifically, we can convert everything to hex;
then we have only 0-9 and a-f to deal with. 
This example demonstrates the principle:
$ echo -n '3.14' | xxd
0000000: 332e 3134 3.14
$ echo -n 'pi' | xxd
0000000: 7069 pi
$ echo '3.14 is a transcendental number. 3614 is an integer.' | xxd
0000000: 332e 3134 2069 7320 6120 7472 616e 7363 3.14 is a transc
0000010: 656e 6465 6e74 616c 206e 756d 6265 722e endental number.
0000020: 2020 3336 3134 2069 7320 616e 2069 6e74 3614 is an int
0000030: 6567 6572 2e0a eger..
$ echo "3.14 is a transcendental number. 3614 is an integer." | xxd -p \
| sed 's/332e3134/7069/g' | xxd -p -r
pi is a transcendental number. 3614 is an integer.
whereas, of course, sed 's/3.14/pi/g' would also change 3614.
The above is a slight oversimplification; it doesn’t account for boundaries. 
Consider this (somewhat contrived) example:
$ echo -n 'E' | xxd
0000000: 45 E
$ echo -n 'g' | xxd
0000000: 67 g
$ echo '$Q Eak!' | xxd
0000000: 2451 2045 616b 210a $Q Eak!.
$ echo '$Q Eak!' | xxd -p | sed 's/45/67/g' | xxd -p -r
&q gak!
Because $ (24) and Q (51)
combine to form 2451,
the s/45/67/g command rips it apart from the inside. 
It changes 2451 to 2671, which is &q (26 + 71). 
We can prevent that by separating the bytes of data in the search text,
the replacement text and the file with spaces. 
Here’s a stylized solution:
encode() {
xxd -p -- "$@" | sed 's/../& /g' | tr -d '\n'
}
decode() {
xxd -p -r -- "$@"
}
left=$( printf '%s' "$search" | encode)
right=$(printf '%s' "$replacement" | encode)
encode file.txt | sed "s/$left/$right/g" | decode
I defined an encode function because I used that functionality three times,
and then I defined decode for symmetry. 
If you don’t want to define a decode function, just change the last line to
encode file.txt | sed "s/$left/$right/g" | xxd -p -r
Note that the encode function triples the size of the data (text)
in the file, and then sends it through sed as a single line
— without even having a newline at the end. 
GNU sed seems to be able to handle this;
other versions might not be able to.
As an added bonus, this solution handles multi-line search and replace
(in other words, search and replacement strings that contain newline(s)).
I can explain why this doesn't work:
perl(1) has \Q ... \E quotes but even that can't cope with the '/' separator in $target.
The reason is because the \Q and \E (quotemeta) escapes are processed after the regex is parsed, and a regex is not parsed unless there are valid pattern delimiters defining a regex.
As an example, here's an attempt to replace the string /etc/ in /etc/hosts by using a variable in a string passed to perl:
target="/etc/"
perl -pe "s/\Q$target\E/XXX/" <<<"/etc/hosts";
After the shell expands the variable in the string, perl receives the command s/\Q/etc/\E/XXX/ which is not a valid regex because it doesn't contain three pattern delimiters (perl sees five delimiters, i.e., s/…/…/…/…/). Therefore, the \Q and \E are never even executed.
The solution, as @zdim suggested, is to pass the variables to perl in a way that they are included in the regex after the regex is parsed, like this:
perl -s -pe 's/\Q$target\E/XXX/ig' -- -target="/etc/" <<<"/etc/123"
awk escaping is not all that complex either :
For the search regex, just these 2 gsub() calls suffice to escape a string for any and all awk variants: simply "cage" every special character in a bracket expression, with additional backslash-escaping performed only for the circumflex/caret and the backslash itself:
-- technically you don't need to escape space at all - sometimes i like using it for marking an unambiguous anchoring point for the character instead of letting awk be too agile about how it handles spaces and tabs. swap the space for "!" inside the regex if u like
jot -s '' -c - 32 126 |
mawk 'gsub("[[-\440{-~:-@ -/]", "[&]") \
gsub(/\\|\^/, "\\\\&")^_' FS='^$' RS='^$'
\440 is (`) - i'm just not a fan of having those exposed in my code
|
[ ][!]["][#][$][%][&]['][(][)][*][+] [,] [-][.] [/] # re-aligned for
0123456789 [:][;] [<] [=][>] [?] # readability
[@]ABCDEFGHIJKLMNOPQRSTUVWXYZ [[][\\] []][\^][_]
[`]abcdefghijklmnopqrstuvwxyz [{] [|] [}][~]
as for replacement, only literal "&" needs to be escaped via
gsub(target_regex, "&") # nothing escaped
matched text
gsub(target_regex, "\\&") # 2 backslashes
literal "&"
gsub("[[:punct:]]", "\\\\&") # 4 backslashes
\!\"\#\$\%\&\'\(\)\*\+\,\-\.\/\:\;\<\=\>\?\@\[\\\]\^\_\`\{\|\}\~
—- (personally prefer using square-brackets i.e. char classes as an escaping mechanism than having backslash galore)
gsub("[[:punct:]]", "\\\\\\&") # 6 backslashes
\&\&\&\&\&\&\&\&\&\&\&\&\&\&\&\&\&\&\&\&\&\&\&\&\&\&\&\&\&\&\&\&
Use 6-backslashes only if you're planning to feed this output further down to another gsub()/match() function call

How to parse strace in shell into plain text?

I have a trace log generated by strace while running PHP, using:
sudo strace -e sendto -fp $(pgrep -n php) -o strace.log
And the output looks like:
11208 sendto(4, "set 29170397297_-cache-schema 85 0 127240\r\n\257\202\v\0?\0\0\0\2\27\10stdClass\24\7\21\3cid\21\6schema\21\4d\37ata\25\n\247\21\5block\24\6\21\6fields\24\f\21\3bid\24\2\5\21\4type 0\37erial\21\10not null\5\21\6module\24\4\16\7\21\7va\37rchar\21\6length\6#\16\t\5\21\7default\r\21\5de\2lta#\5\16\v\16\f\6 \35\7\16\r\21\0010\21\5t \207C\30#6\2\16\r\r n\4tatus#0\4\21\3int/\7\6\0\21\4size \222\finy\21\6weight\24\3 ;\0\22\300 \6\6region#8\340\5P\5custom\27\300,\17\16\23\16\24\21\nvisibility\340\t\34\7\5pages\24\2 \205\3\4tex#\206 \261\1it \365\0\5\240\0\377y\10\r\21\ftransl!N\2ble %\1ca!a\340\3Q\0\1n\31\vprimary key\24\1\6\0\16\6\21\vunique#\21\ts\24\1\21\3tmd\24\3 \31\0\20 2\v\n\6\2\16\16\21\7index \210\10\1\21\4list\24\5\240\36\0\21 \36\10\26\6\3\16\25\6\4\16\n \1\6\4\21\4name \7\0\na\317\2_ro\252\0\5!$\0\n \3\341\2\23\0\16\340\0\16A\214\2\21\3r!\354# \v\22\21\10unsigned\5#\332\0\36\213\0\n \213\0\16 l\6%\16!\24\1\16%\271\0%#p\5\16#\16$\21\f\200l\241b#n\2\4\16\6M\2\10\16&#E\4\21\4bod\201_\5\32\16\t\4\16\23B\\\2g\16\34 \30\3info .\0\7a\255\0\200#q!L\5\6forma\201\332B/!d\2\4\16\37 y\0*y\0 \225a;\240\201\2'\21\van\0_\207\200\2\5\16\1\340\0U =#U\1\16\3#\222 \212\2lob#O\n\23\16)\21\6expire#\30\342\0\26\7\21\7create\241\17< \25\0\n\203\1\"\177\0dY\0\22 \305\5\5small\240!a\32\0.\230\0.\240\240\0\1\240\240\3,\21\vb S\2kpo\"\313\2s\24\6!\220\2\t\21\2\241q\0\10 ?\4\21\tno \213\6ort\5\21\fm\";\3ine_A\313\232\241\3\2\5\16#\340\4\16!\345\340\0U\223\340\0'AC\4sourc\202\202\340\3\27\0\v\200\27\0_C\326\340\0074\1\16\21_\240\363\2\1\16\25\340\3\16\r\0\21\vmultipliers\31\0- \223\1\21\t\341\0\30B-\0\1!\10\0003a\253\0005\v\0005ac \327Dz\"\364 \20\0\10 \6\0 #\333\r\0165\16\36\0163\21\nidenti$x\nr\0166\21\vadmin_ce\10\21\5label\21\f\244H\6 hook\21\23\240\r\0_\340\1\375\fs\21\3api\24\4\21\5own F\0062\16C\16B\21\17 H\5imum_v \260$\25\7\6\1\21\17curr m\340\1\22!\242\0002\"\305\0022\21\20\340\1N\5_groupa\247\2\6\0163\352\0\10 \352\2\0164\5 
\325C%\341\0P\341\5\220\1\0162aQA\26\4\16:\5\21\17\201\321\1 c\"$\5back\21#\340\7b\0_\200!\340\3\311\1\16\7C\340\0a!\312\1\no \300#\240!&}\241\237\0\0\242e\341\4n\5\16;\24\10\16< \7\2=\21\35\340\1m\0\320\0 \342\3XAz\v\16>\16G\16?\16#\16A\21\30\341\tT\201\5\1\21\22\200\243\0 B0\6 string#o\4toolsbD\1\16C \260\0D!D\4C\16L\16E!P\0F \3\201T\16G\21\21ckeditor_set%\266\0gE\323\0\5%Q\0# 4#\345!)\"w#\372\1\21\10\340\0!\0\1 \31\0\32\240\334\4#\16\n\21\10\300D \r\2O\21\25\300\r\6_input_\244+\340\16V\1\16+ \31\340\4h X\0\2!;\0# \245\0+ \247\0Q T\7R\21\26comme#/\0_%\266\2cko W\3pane ;\4\5\24\10\21\7#\v\0_\243\257\301\231\1\21\4F\35 !\340\1\22F\323\0021\21\10\"\311'B\0e#\223A\254&f`\346\"~\6\vcollap&q%\227\340\6\35\2\0\21\t\240\35\344\1a\3009\0\0#\212\300.\0001\200L$\247\1enFl\344\0\216\300,\0\1G\5\3view\340\0002\300\177 \372\0\1 K\0T!"..., 8196, MSG_NOSIGNAL|MSG_MORE, NULL, 0) = 8196
It sounds like these are represented by ordinary C escape codes.
I've tried to decode them in shell by printf like:
while read line; do printf "%s" "$line"; done < <(cat strace.log | head -n2)
but it failed (looks like it doesn't make any sense):
11208 sendto(4, "set 29170397297_-cache-schema 85 0 127240rn257202v0?00022710stdClass247213cid216schema214d37ata25n247215block246216fields24f213bid2425214type 037erial2110not null5216module244167217va37rchar216length6#16t5217defaultr215de2lta#516v16f6 35716r210010215t 207C30#6216rr n4tatus#04213int/760214size 222finy216weight243 ;022300 66region#83405P5custom27300,171623162421nvisibility340t3475pages242 20534tex#206 2611it 365052400377y10r21ftransl!N2ble %1ca!a3403Q01n31vprimary key2416016621vunique#21ts241213tmd243 31020 2vn621616217index 210101214list24524036021 3610266316256416n 164214name 70na3172_ro25205!$0n 3341223016340016A2142213r!354# v222110unsigned5#3320362130n 213016 l6%16!24116%2710%#p516#16$21f200l241b#n24166M21016&#E4214bod201_53216t41623B\2g1634 303info .07a2550200#q!L56forma201332B/!d241637 y0*y0 225a;2402012'21van0_207200251613400U =#U1163#222 2122lob#On2316)216expire#303420267217create24117< 250n2031"1770dY022 30555small240!a320.`2300.240240012402403,21vb S2kpo"3132s246!2202t212241q010...
Is there any better way to parse the output of strace command to see plain strings passed to recvfrom/sendto?
Ideally it is possible to print printable characters including new lines (\r\n), but cut-off NULLs and other non-printable characters?
read doesn't work here because, without -r, it treats each backslash as an escape character and strips it, so \r\n is printed as rn.
To stop read from interpreting backslashes, use read -r, which keeps them in the line literally so that printf "%b" can expand the escape sequences afterwards. Here is an example:
while read -r line; do printf "%b\n" "$line"; done < strace.log | strings
Since it's binary data, the above example also pipes the result through the strings command to display only printable strings.
strace can also print strings in hex (-x for non-ASCII strings, -xx for all strings), but the decoding works the same way.
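The difference between read and read -r is easy to see on a short sample. The two helper names below are invented for this demonstration, and the sample text is not real strace output.

```shell
#!/usr/bin/env bash
# show_plain / show_raw: echo stdin back line by line, once with plain read
# and once with read -r. (Hypothetical helpers for this demonstration.)
show_plain() { local line; while IFS= read    line; do printf '%s\n' "$line"; done; }
show_raw()   { local line; while IFS= read -r line; do printf '%s\n' "$line"; done; }

sample='set key\r\nvalue'            # made-up text containing escape codes

printf '%s\n' "$sample" | show_plain # backslashes are eaten: set keyrnvalue
printf '%s\n' "$sample" | show_raw   # backslashes survive:   set key\r\nvalue
```

Only the second form leaves the escape codes intact for printf "%b" to expand.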
Here is the version to parse strace output in real-time:
while read -r line;
do printf "%b\n" "$line" | strings
done < <(sudo strace -e recvfrom,sendto -s 1000 -fp $(pgrep -n php) 2>/dev/stdout)
Furthermore, strings can be replaced by a more specific filter using grep, to get only what is inside the double quotes:
grep -o '".\+[^"]"' | grep -o '[^"]\+[^"]'
however this may still print binary formats.
To avoid that, let's simplify the whole process by defining the following formatter alias:
alias format-strace='grep --line-buffered -o '\''".\+[^"]"'\'' | grep --line-buffered -o '\''[^"]*[^"]'\'' | while read -r line; do printf "%b" $line; done | tr "\r\n" "\275\276" | tr -d "[:cntrl:]" | tr "\275\276" "\r\n"'
where:
grep -o '".\+[^"]"' - select double-quoted string with quotes
grep -o '[^"]*[^"]' - select text within the double quotes
while read -r line - store each line into $line and do some action (help read)
printf "%b" $line - print line by expanding backslash escape sequences
tr "\r\n" "\275\276" - temporarily replace \r\n with \275\276
tr -d "[:cntrl:]" - remove all control characters
tr "\275\276" "\r\n" - restore new line endings
then the complete example to trace some command (e.g. php) can look like:
strace -e trace=read,write,recvfrom,sendto -s 1000 -fp $(pgrep -n php) 2>&1 | format-strace
Check for similar example: How to view the output of a running process in another bash session? at Unix.SE
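The three tr steps of the alias can be tried on their own. Here is a small sketch, assuming GNU tr (which works byte by byte); strip_controls is a made-up name. One caveat of the trick: any \275 or \276 bytes that are already in the data will come out as \r and \n.

```shell
#!/usr/bin/env bash
# strip_controls: delete control characters from stdin but keep CR and LF.
# (Hypothetical helper; same three tr steps as in the alias above.)
strip_controls() {
    tr '\r\n' '\275\276' |      # park CR and LF on two non-control bytes
        tr -d '[:cntrl:]' |     # drop every remaining control character
        tr '\275\276' '\r\n'    # put CR and LF back
}

# \001 is removed, the CR LF pair and the final newline are kept
printf 'a\001b\r\nc\n' | strip_controls | od -An -c
```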

Invisible 0131 number in bash

I am trying to extract a number from a file using bash. My file looks like this:
number = 123;
I use following command to get it:
grep number file.txt | tr -dc 0-9
And what I get is:
0131123
I tried different numbers and different methods (e.g. sed 's/[^0-9]//g'), but the result is always 0131 followed by my number.
Any ideas why?
UPDATE
Giving answers to your questions
user@domain:/tmp$ grep number file.txt
number = 123;
user@domain:/tmp$ grep number file.txt | tr -dc 0-9
0131123user@domain:/tmp$
user@domain:/tmp$ grep number file.txt | sed 's/[^0-9]//g'
0131123
user@domain:/tmp$ cat file.txt
number = 123;
As you can see file doesn't contain anything except one line.
UPDATE 2
Results of commands given by antak and Gordon Davisson
user@domain:/tmp$ grep number file.txt | cat -v
^[[01;31m^[[Knumber^[[m^[[K = 123;
user@domain:/tmp$ hexdump -C file.txt
00000000 6e 75 6d 62 65 72 20 3d 20 31 32 33 3b 0a |number = 123;.|
0000000e
After cat -v I can see magic 0131, but I have no idea where it comes from.
You can use sed:
sed -n 's/number = \(.*\);/\1/p' file.txt
or with grep and friends:
grep number file.txt | cut -d' ' -f3 | tr -d ';'
or (as you suggested) :
grep number file.txt | tr -dc 0-9
Update: I hadn't tried your example before giving the original answer. I just wanted to say that your example works for me (see above).
It comes from your grep --colour=always setting. When it is set, grep colours the matching substring (red by default; see details here) by wrapping it in escape sequences of the form ^[[01;31m^[[K...^[[m^[[K, where ... is the matched substring. The digits 0131 are the 01;31 inside the first sequence, which tr -dc 0-9 keeps along with your number. Turn off the --colour option, or change it to --colour=auto, and the problem goes away.
btw, I think I deserve an up-vote for the answer :)
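To see the effect without touching the grep settings, the coloured line from the cat -v output can be rebuilt with printf. The sketch below also shows one way to strip such sequences when the colour setting cannot be changed. strip_ansi is a made-up helper, and its pattern only covers the m and K sequences seen in this question.

```shell
#!/usr/bin/env bash
# strip_ansi: remove ESC[...m and ESC[...K sequences from stdin.
# (Hypothetical helper; covers only the sequences shown in this question.)
strip_ansi() {
    # $'...' makes bash turn \x1b into a literal ESC byte for sed
    sed $'s/\x1b\\[[0-9;]*[mK]//g'
}

# The same bytes that a colourising grep emitted for "number = 123;"
coloured=$(printf '\033[01;31m\033[Knumber\033[m\033[K = 123;')

printf '%s\n' "$coloured" | tr -dc '0-9'; echo               # 0131123
printf '%s\n' "$coloured" | strip_ansi | tr -dc '0-9'; echo  # 123
```

In a script, the simpler fix is to call grep --color=never number file.txt so that the escape sequences are never produced.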
Here is a more general solution which works with numbers and other data as well:
#!/bin/bash
FILENAME='file.txt'
SEARCH_PATTERN='number'
foundLine=$(grep -m 1 "$SEARCH_PATTERN" "$FILENAME")
if [[ $foundLine =~ =.*\; ]]; then
value=$BASH_REMATCH
value=$(echo "${value:1:-1}" | xargs) # remove = and ; and then trim
echo "Your value is: _${value}_"
else
echo 'Line not found :('
fi
This is very useful because on a file with this content:
number = 'complex thing :3' ; # my comment with numbers 123 512
number = 'sadfhsah';
it must return
Your value is: _complex thing :3_
So it does not depend on numbers in comments and other garbage.
P.S. Please note that it will not work with variables that have newlines in them and with comments containing ';' characters
