In a terminal I can clear another terminal by running:
echo -e "\033\0143" > /dev/pts/14
However, if I try this from my C program by doing:
system("echo -e '\033\0143' > /dev/pts/14");
it doesn't clear the screen and leaves some garbage. Any ideas on how to do this?
I have been programming in C for 30 years and thought this would be easy.
The \0143 escape is being interpreted by C, not by the shell or echo, and it's treated as having \014 and then the numeral 3. This ends up writing the bytes 1b 0c 33 0a to the tty, rather than 1b 63 0a. -e isn't doing anything at all at this point.
I don't understand why you would do this rather than opening the tty file and writing the bytes directly with write(2), though. In any case, this is really a programming question.
You must escape each backslash; otherwise C interprets the sequences itself, before the shell or echo ever sees them.
system("echo -e '\\033\\0143' > /dev/pts/14");
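For comparison, here is a minimal shell sketch of the byte sequence that should reach the terminal. The function name reset_seq is made up for this illustration, and /dev/pts/14 is simply the device from the question. printf interprets \033 on its own, so echo -e is not needed.

```shell
# Emit ESC c as two raw bytes (0x1b 0x63).
reset_seq() {
    printf '\033c'
}

# Inspect the bytes instead of sending them to a terminal.
reset_seq | od -An -t x1

# To clear the other terminal, redirect to its device instead, e.g.:
# reset_seq > /dev/pts/14
```

The od line should show 1b 63 and nothing else, which is the sequence the broken C string failed to produce.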
Reading lines from a 'somefile' and writing them to 'sample.org' file.
echo "$line" 1>>sample.org gives correct result, which is 'Субъективная оценка (от 1 до 5): 4 - отличный, понятный и богатый вкусом ..' (russian letters)
echo "$line" | fold -w 160 1>>sample.org gives this, which is technically correct if you copy-paste it anywhere outside Emacs. But still: why does using fold result in Emacs displaying the 'sample.org' buffer as 'RAW-TEXT' instead of 'UTF-8'?
To reproduce it create 2 files in same directory - test.sh, which will contain
cat 'test.org' |
while read -r line; do
    # echo "$line" 1>'newfile.org' # works fine
    # line below writes those weird chars to the output file
    echo "$line" | fold -w 160 1>'newfile.org'
done
and test.org file, which will contain just 'Среднеферментированный среднепрожаренный улун полусферической скрутки. Содержание ГАМК 200мг/100г.'
Run the script with bash test.sh and hopefully you will see the problem in the output file newfile.org
I can't repro this on MacOS, but in an Ubuntu Docker image, it happens because fold inserts a newline in the middle of a UTF-8 multibyte sequence.
root@ef177a152b15:/# cat test.org
Среднеферментированный среднепрожаренный улун полусферической скрутки. Содержание ГАМК 200мг/100г.
root@ef177a152b15:/# fold -w 160 test.org >newfile.org
root@ef177a152b15:/# cat newfile.org
Среднеферментированный среднепрожаренный улун полусферической скрутки. Содержание Г?
?МК 200мг/100г.
root@ef177a152b15:/# cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=20.04
DISTRIB_CODENAME=focal
DISTRIB_DESCRIPTION="Ubuntu 20.04.2 LTS"
(Perhaps also notice that your demo script can be reduced to a one-liner.)
I would have thought that GNU fold is locale-aware, but that you have to configure a UTF-8 locale for the support to be active; but that changes nothing for me.
root@ef177a152b15:/# locale -a
C
C.UTF-8
POSIX
root@ef177a152b15:/# LC_ALL=C.UTF-8 fold -w 160 test.org
Среднеферментированный среднепрожаренный улун полусферической скрутки. Содержание Г?
?МК 200мг/100г.
Under these circumstances, the best I can offer is to replace fold with a simple replacement.
#!/usr/bin/python3
from sys import argv
maxlen = int(argv.pop(1))
for file in argv[1:]:
    with open(file) as lines:
        for line in lines:
            while len(line) > maxlen:
                print(line[0:maxlen])
                line = line[maxlen:]
            print(line, end='')
For simplicity, this doesn't have any option processing; just pass in the maximum length as the first argument.
(Python 3 uses UTF-8 throughout on any sane platform. Unfortunately, that excludes Windows; but I am restating the obvious.)
Bash, of course, is entirely innocent here; the shell does not control external utilities like fold. (But not much help, either; echo "${tekst:48:64}" produces similar mojibake.)
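To check whether a split really landed inside a multibyte sequence, one option is to run the output through iconv as a validator. This is only a sketch: the helper name is_valid_utf8 is invented here, and it assumes an iconv that exits non-zero on malformed input (glibc and macOS both appear to do so).

```shell
# Succeeds only if stdin is well-formed UTF-8.
is_valid_utf8() {
    iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1
}

# The Cyrillic letter "Г" is the two bytes D0 93 (octal 320 223) in UTF-8.
if printf '\320\223' | is_valid_utf8; then
    echo "whole character: valid"
fi

# A newline inserted between the two bytes leaves a dangling lead byte.
if ! printf '\320\n\223' | is_valid_utf8; then
    echo "split character: invalid"
fi
```

Running is_valid_utf8 < newfile.org on the folded file would show the same failure without needing Emacs to report it.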
I'm not sure where that image comes from; however, fold and coreutils in general, as well as a huge number of other common CLI utilities, can only be safely used with input consisting of characters from the POSIX Portable Character Set, not with multibyte UTF-8, regardless of what bullshit websites such as utf8everywhere.org state. fold suffers from the common problem: it assumes that each character occupies just a single char, so multibyte UTF-8 input gets corrupted when it splits the lines.
I'd like to save output of a command to a variable for later multiple uses. Bash provides Here String functionality for that purpose. However it is not binary safe. It sometimes adds new lines:
$ a=''
$ xxd <<< "$a"
00000000: 0a
Is there any binary safe alternative?
I use the variable in a for loop, so IIUC that rules out the tee command and any pipe-like solution. I'd also prefer something other than temporary files, as they are slow and clumsy to work with (they require a writable directory and clean-up).
The answer depends on what, exactly, it is that you need. If your problem is only the newline that here-strings add, then all you need is echo -n:
$ foo=bar
$ echo -n "$foo" | od -t x1
0000000 62 61 72
If you need to preserve the trailing newline(s) that command-substitution strips, or you truly need full binary safety, however, then there are no "work-arounds", unfortunately. Command-substitution will always strip trailing newline no matter what, and as mentioned in the comments, shell variables are not binary safe as they cannot contain NULs. If you need any of those things, then I'm pretty sure your only option is using temporary files.
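As a side note on the trailing-newline half of the problem only: a commonly used idiom, not part of the answer above, is to append a sentinel character inside the substitution and strip it afterwards. This sketch does nothing for NUL bytes, and the variable name out is arbitrary.

```shell
# Capture the output plus a sentinel, so the newlines are no longer trailing.
out=$(printf 'bar\n\n'; printf x)

# Remove the sentinel again; the two newlines survive.
out=${out%x}

# Show the stored bytes.
printf '%s' "$out" | od -An -t x1
```

The dump should show 62 61 72 0a 0a, i.e. both newlines are still present.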
As for using temporary files, however, the problem you state of finding a writable directory should be a small one, as /tmp is always guaranteed to be writable by all unless you're working on a really weird system, or your script is supposed to run during an incomplete or failed boot, perhaps. In that case, you'll just have to switch to C instead. Otherwise, just use the mktemp command. As for cleanup, you may want to use the trap built-in command.
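A rough sketch of the mktemp plus trap combination mentioned above. The variable name tmpfile and the sample data are placeholders; mktemp is not in POSIX, so this assumes a system that ships it.

```shell
# Create a private temporary file and remove it when the script exits.
tmpfile=$(mktemp) || exit 1
trap 'rm -f "$tmpfile"' EXIT

# Store command output byte for byte, trailing newlines included.
printf 'bar\n\n' > "$tmpfile"

# Reuse the stored data as often as needed.
od -An -t x1 < "$tmpfile"
wc -c < "$tmpfile"
```

Because the data never passes through a shell variable, NUL bytes and trailing newlines are kept as written.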
I picked up a copy of the book 10 PRINT CHR$(205.5+RND(1)); : GOTO 10
http://www.amazon.com/10-PRINT-CHR-205-5-RND/dp/0262018462
This book discusses the art produced by the single line of Commodore 64 BASIC:
10 PRINT CHR$(205.5+RND(1)); : GOTO 10
This just repeatedly prints randomly character 205 or 206 to the screen from the PETSCII set:
http://en.wikipedia.org/wiki/PETSCII
https://vimeo.com/26472518
I'm not sure why the original uses the characters 205 and 206 instead of the identical 109 and 110. Also, I prefer to add a clear at the beginning. This is what I usually type into the C64:
1?CHR$(147)
2?CHR$(109.5+RND(1));:GOTO2
RUN
You can try this all for yourself in an emulator, such as this one using Flash or JavaScript:
http://codeazur.com.br/stuff/fc64_final/
http://www.kingsquare.nl/jsc64
When inputting the above code into the emulators listed, you'll need to realize that
( is *
) is (
+ is ]
I decided it would be amusing to write a bash line to do something similar.
I currently have:
clear; while :; do [ $(($RANDOM%2)) -eq 0 ] && (printf "\\") || (printf "/"); done;
Two questions:
Any suggestions for making this more concise?
Any suggestions for a better output character? The forward and backward slash are not nearly as beautiful since their points don't line up. The characters used from PETSCII are special characters, not slashes. I didn't see anything in ASCII that could work as well, but maybe you can suggest a way to pull in a character from UTF-8 or something else?
Best ANSWERS So Far
Shortest for bash (40 characters):
yes 'c=(╱ ╲);printf ${c[RANDOM%2]}'|bash
Here is a short one for zsh (53 characters):
c=(╱ ╲);clear;while :;do printf ${c[RANDOM%2+1]};done
Here is an alias I like to put in my .bashrc or .profile
alias art='c=(╱ ╲);while :;do printf "%s" ${c[RANDOM%2]};done'
Funny comparing this to the shortest I can do for C64 BASIC (23 characters):
1?C_(109.5+R_(1));:G_1
The underscores are shift+H, shift+N, and shift+O respectively. I can't paste the characters here since they are specific to PETSCII. Also, the C64 output looks prettier ;)
You can read about the C64 BASIC abbreviations here:
http://www.commodore.ca/manuals/c64_programmers_reference/c64-programmers_reference_guide-02-basic_language_vocabulary.pdf
How about this?
# The characters you want to use
chars=( $'\xe2\x95\xb1' $'\xe2\x95\xb2' )
# Precompute the size of the array chars
nchars=${#chars[@]}
# clear screen
clear
# The loop that prints it:
while :; do
    printf -- "${chars[RANDOM%nchars]}"
done
As a one-liner with shorter variable names to make it more concise:
c=($'\xe2\x95\xb1' $'\xe2\x95\xb2'); n=${#c[@]}; clear; while :; do printf -- "${c[RANDOM%n]}"; done
You can get rid of the loop if you know in advance how many characters to print (here 80*24=1920)
c=($'\xe2\x95\xb1' $'\xe2\x95\xb2'); n=${#c[@]}; clear; printf "%s" "${c[RANDOM%n]"{1..1920}"}"
Or, if you want to include the characters directly instead of their code:
c=(╱ ╲); n=${#c[@]}; clear; while :; do printf "${c[RANDOM%n]}"; done
Finally, with the size of the array c precomputed and removing unnecessary spaces and quotes (and I can't get shorter than this):
c=(╱ ╲);clear;while :;do printf ${c[RANDOM%2]};done
Number of bytes used for this line:
$ wc -c <<< 'c=(╱ ╲);clear;while :;do printf ${c[RANDOM%2]};done'
59
Edit. A funny way using the command yes:
clear;yes 'c=(╱ ╲);printf ${c[RANDOM%2]}'|bash
It uses 50 bytes:
$ wc -c <<< "clear;yes 'c=(╱ ╲);printf \${c[RANDOM%2]}'|bash"
51
or 46 characters:
$ wc -m <<< "clear;yes 'c=(╱ ╲);printf \${c[RANDOM%2]}'|bash"
47
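For experimenting without an endless loop, here is an ungolfed sketch that prints a fixed number of diagonals. The function name maze and the count argument are inventions for this example. RANDOM is a bash/zsh/ksh feature rather than POSIX sh, so the randomness assumes one of those shells.

```shell
# Print $1 random box-drawing diagonals followed by a newline.
maze() {
    count=$1
    i=0
    while [ "$i" -lt "$count" ]; do
        # Pick one of the two diagonals based on the low bit of RANDOM.
        if [ $((RANDOM % 2)) -eq 0 ]; then
            printf '╱'
        else
            printf '╲'
        fi
        i=$((i + 1))
    done
    printf '\n'
}

maze 80
```

Each diagonal is three bytes in UTF-8, which is why the byte counts above are larger than the character counts.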
After looking at some UTF stuff:
2571 BOX DRAWINGS LIGHT DIAGONAL UPPER RIGHT TO LOWER LEFT
2572 BOX DRAWINGS LIGHT DIAGONAL UPPER LEFT TO LOWER RIGHT
(╱ and ╲) seem best.
f="╱╲";while :;do print -n ${f[(RANDOM % 2) + 1]};done
also works in zsh (thanks Clint on OFTC for giving me bits of that)
Here is my 39 character command line solution I just posted to #climagic:
grep -ao "[/\\]" /dev/urandom|tr -d \\n
In bash, you can remove the double quotes around the [/\] match expression and make it even shorter than the C64 solution, but I've included them for good measure and cross shell compatibility. If there was a 1 character option to grep to make grep trim newlines, then you could make this 27 characters.
I know this doesn't use the Unicode characters so maybe it doesn't count. It is possible to grep for the Unicode characters in /dev/urandom, but that will take a long time because that sequence comes up less often and if you pipe it the command pipeline will probably "stick" for quite a while before producing anything due to line buffering.
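A variation on the same idea, offered only as a sketch: tr -dc can delete every byte except the two slashes, which sidesteps the newlines grep -o adds. The function name random_slashes and the 4096-byte default are arbitrary, head -c is assumed to be available, and LC_ALL=C is set so tr treats the random input as raw bytes.

```shell
# Keep only "/" and "\" from a bounded chunk of random bytes.
random_slashes() {
    head -c "${1:-4096}" /dev/urandom | LC_ALL=C tr -dc '/\\'
    printf '\n'
}

random_slashes 4096
```

Dropping the head stage and reading /dev/urandom directly would give the endless version.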
Bash supports Unicode now, so we don't need to use UTF-8 character sequences such as $'\xe2\x95\xb1'.
This is my most-correct version: it loops, prints either / or \ based on a random number as others do.
for((;;x=RANDOM%2+2571)){ printf "\U$x";}
41
My previous best was:
while :;do printf "\U257"$((RANDOM%2+1));done
45
And this one 'cheats' using embedded Unicode (I think for obviousness, maintainability, and simplicity, this is my favourite).
Z=╱╲;for((;;)){ printf ${Z:RANDOM&1:1};}
40
My previous best was:
while Z=╱╲;do printf ${Z:RANDOM&1:1};done
41
And here are some more.
while :;do ((RANDOM&1))&&printf "\U2571"||printf "\U2572";done
while printf -v X "\\\U%d" $((2571+RANDOM%2));do printf $X;done
while :;do printf -v X "\\\U%d" $((2571+RANDOM%2));printf $X;done
while printf -v X '\\U%d' $((2571+RANDOM%2));do printf $X;done
c=('\U2571' '\U2572');while :;do printf ${c[RANDOM&1]};done
X="\U257";while :;do printf $X$((RANDOM%2+1));done
Now, this one runs until we get a stack overflow (not another one!) since bash does not seem to support tail-call elimination yet.
f(){ printf "\U257"$((RANDOM%2+1));f;};f
40
And this is my attempt to implement a crude form of tail-process elimination. But when you have had enough and press ctrl-c, your terminal will vanish.
f(){ printf "\U257"$((RANDOM%2+1));exec bash -c f;};export -f f;f
UPDATE:
And a few more.
X=(╱ ╲);echo -e "\b${X[RANDOM&1]"{1..1000}"}" 46
X=("\U2571" "\U2572");echo -e "\b${X[RANDOM&1]"{1..1000}"}" 60
X=(╱ ╲);while :;do echo -n ${X[RANDOM&1]};done 46
Z=╱╲;while :;do echo -n ${Z:RANDOM&1:1};done 44
Sorry for necroposting, but here's bash version in 38 characters.
yes 'printf \\u$[2571+RANDOM%2]'|bash
using for instead of yes inflates this to 40 characters:
for((;;)){ printf \\u$[2571+RANDOM%2];}
109 characters for Python 3, which was the smallest I could get it.
#!/usr/bin/python3
import random
while True:
    if random.randrange(2)==1:print('\u2572',end='')
    else:print('\u2571',end='')
#!/usr/bin/python3
import random
import sys
while True:
    if random.randrange(2)==1:sys.stdout.write("\u2571")
    else:sys.stdout.write("\u2572")
    sys.stdout.flush()
Here's a version for Batch which fits in 127 characters:
cmd /v:on /c "for /l %a in (0,0,0) do @set /a "a=!random!%2" >nul & if "!a!"=="0" (set /p ".=/" <nul) else (set /p ".=\" <nul)"
While trying to process a list of file-/foldernames correctly (see my other questions) through the use of a NULL-character as a delimiter I stumbled over a strange behaviour of Bash that I don't understand:
When assigning a string containing one or more NULL-character to a variable, the NULL-characters are lost / ignored / not stored.
For example,
echo -ne "n\0m\0k" | od -c # -> 0000000 n \0 m \0 k
But:
VAR1=`echo -ne "n\0m\0k"`
echo -ne "$VAR1" | od -c # -> 0000000 n m k
This means that I would need to write that string to a file (for example, in /tmp) and read it back from there if piping directly is not desired or feasible.
When executing these scripts in Z shell (zsh) the strings containing \0 are preserved in both cases, but sadly I can't assume that zsh is present in the systems running my script while Bash should be.
How can strings containing \0 chars be stored or handled efficiently without losing any (meta-) characters?
In Bash, you can't store the NULL-character in a variable.
You may, however, store a plain hex dump of the data (and later reverse this operation again) by using the xxd command.
VAR1=`echo -ne "n\0m\0k" | xxd -p | tr -d '\n'`
echo -ne "$VAR1" | xxd -r -p | od -c # -> 0000000 n \0 m \0 k
As others have already stated, you can't store/use NUL char:
in a variable
in an argument of the command line.
However, you can handle any binary data (including NUL char):
in pipes
in files
So to answer your last question:
can anybody give me a hint how strings containing \0 chars can be
stored or handled efficiently without losing any (meta-) characters?
You can use files or pipes to store and handle efficiently any string with any meta-characters.
If you plan to handle data, you should note additionally that:
Only the NUL char will be eaten by variable and argument of the command line, you can check this.
Be wary that command substitution (as $(command..) or `command..`) has an additional twist above being a variable as it'll eat your ending new lines.
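As a small illustration of keeping NUL-delimited data in a pipe the whole way, here is a sketch that counts records without ever assigning them to a variable. The name count_records and the sample file names are made up.

```shell
# Count NUL-terminated records on stdin by discarding every other byte.
count_records() {
    tr -cd '\000' | wc -c | tr -d ' '
}

printf 'file one\0file two\0' | count_records
```

This prints 2. The same pipeline shape works with find -print0 as the producer.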
Bypassing limitations
If you want to use variables, then you must get rid of the NUL char by encoding it, and various other solutions here give clever ways to do that (an obvious way is to use for example base64 encoding/decoding).
If you are concerned by memory or speed, you'll probably want to use a minimal parser and only quote NUL character (and the quoting char). In this case this would help you:
quote() { sed 's/\\/\\\\/g;s/\x0/\\x00/g'; }
Then you can secure your data before storing it in variables and command-line arguments by piping the sensitive data into quote, which outputs a safe data stream without NUL chars. You can get back the original string (with NUL chars) by using echo -en "$var_quoted", which sends the correct string to standard output.
Example:
## Our example output generator, with NUL chars
ascii_table() { echo -en "$(echo '\'0{0..3}{0..7}{0..7} | tr -d " ")"; }
## store
myvar_quoted=$(ascii_table | quote)
## use
echo -en "$myvar_quoted"
Note: use | hd to get a clean view of your data in hexadecimal and check that you didn't lose any NUL chars.
Changing tools
Remember you can go pretty far with pipes without using variables nor argument in command line, don't forget for instance the <(command ...) construct that will create a named pipe (sort of a temporary file).
EDIT: the first implementation of quote was incorrect and would not deal correctly with \ special characters interpreted by echo -en. Thanks @xhienne for spotting that.
EDIT2: the second implementation of quote had a bug because it used only \0, which would actually eat up more zeroes, as \0, \00, \000 and \0000 are equivalent. So \0 was replaced by \x00. Thanks to @MatthijsSteen for spotting this one.
Use uuencode and uudecode for POSIX portability
xxd and base64 are not POSIX 7 but uuencode is.
VAR="$(uuencode -m <(printf "a\0\n") /dev/stdout)"
uudecode -o /dev/stdout <(printf "$VAR") | od -tx1
Output:
0000000 61 00 0a
0000003
Unfortunately I don't see a POSIX 7 alternative for the Bash process <() substitution extension except writing to file, and they are not installed in Ubuntu 12.04 by default (sharutils package).
So I guess that the real answer is: don't use Bash for this, use Python or some other saner interpreted language.
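If staying with POSIX tools matters, one possible alternative is to encode the data as octal escapes with od and decode them with printf. This is a sketch, not a vetted replacement: the function names nul_safe_encode and nul_safe_decode are invented, and it assumes od separates its columns with spaces, as the GNU and BSD versions do.

```shell
# Encode stdin as a string of \ooo octal escapes (a NUL byte becomes \000).
nul_safe_encode() {
    od -An -v -t o1 | tr ' ' '\n' | sed -n 's/^[0-7][0-7]*$/\\&/p' | tr -d '\n'
}

# Decode by letting printf interpret the escapes in its format string.
# Safe here because the encoded text contains only backslashes and digits.
nul_safe_decode() {
    printf "$1"
}

VAR=$(printf 'a\0\n' | nul_safe_encode)
nul_safe_decode "$VAR" | od -An -t x1
```

The encoded form is four characters per byte, so this trades space for portability compared with base64 or uuencode.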
I love jeff's answer. I would use Base64 encoding instead of xxd. It saves a little space and would be (I think) more recognizable as to what is intended.
VAR=$(echo -ne "foo\0bar" | base64)
echo -n "$VAR" | base64 -d | xargs -0 ...
As for -e, it is needed for the echo of a literal string with an encoded null ('\0'), though I also seem to recall something about "echo -e" being unsafe if you're echoing any user input as they could inject escape sequences that echo will interpret and end up with bad things. The -e flag is not needed when echoing the encoded stored string into the decode.
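On the concern about echo -e and user input: a minimal sketch of the usual way around it is printf with a fixed %s format, which prints its argument verbatim. The variable user_input is a stand-in for any untrusted text.

```shell
# Untrusted text that happens to contain a backslash sequence.
user_input='a\tb'

# %s prints the argument as-is; the escape is not interpreted.
printf '%s\n' "$user_input"
```

This prints a\tb literally, with no tab, because only the format string is scanned for escapes.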
Here’s a maximally memory-efficient solution, that just escapes the NULL bytes with an \xFF.
(Since I wasn’t happy with base64 or the like. :)
esc0() { sed 's/\xFF/\xFF\xFF/g; s/\x00/\xFF0/g'; }
cse0() { sed 's/\xFF0/\xFF\x00/g; s/\xFF\(.\)/\1/g'; }
It of course escapes any actual \xFF by doubling it too, so it works exactly like when backslashes are used for escaping. This is also why a simple mapping can’t be used, and referring to the match in the replacement is required.
Here’s an example that paints gradients onto the framebuffer (doesn’t work in X), using variables to pre-render blocks and lines for speed:
width=7680; height=1080; # Set these to your framebuffer’s size.
blocksPerLine=$(( $width / 256 ))
block="$( for i in 0 1 2 3 4 5 6 7 8 9 A B C D E F; do for j in 0 1 2 3 4 5 6 7 8 9 A B C D E F; do echo -ne "\x$i$j"; done; done | esc0 )"
line="$( for ((b=0; b < blocksPerLine; b++)); do echo -en "$block"; done )"
for ((l=0; l < height; l++)); do echo -en "$line"; done | cse0 > /dev/fb0
Note how $block contains escaped NULLs (plus \xFFs), and at the end, before writing everything to the framebuffer, cse0 unescapes them.
I'm trying to do something like
read -d EOF stdin
for word in $stdin; do stuff; done
where I want to replace 'EOF' for an actual representation of the end of file character.
Edit: Thanks for the answers, that was indeed what I was trying to do. I actually had a facepalm moment when I saw stdin=$(cat) lol
Just for kicks though how would you go about matching something like a C-d (or C-v M-v etc), basically just a character combined with Control, Alt, Shift, whatever in bash?
There isn't really an end-of-file character. When you press Ctrl-d or a similar key, the terminal driver signals to the reading application that the end of file has been reached, and the application's read function reports this by returning an invalid value. The operating system does the same when you have reached the end of a real file. This works by using an integer instead of a byte (so the range is far wider than 0..255) and returning an out-of-range value, usually -1. But there is no character that represents EOF, because its whole purpose is to not be a character. If you want to read everything from stdin, up until the end of file, try
stdin=$(cat)
for word in $stdin; do stuff; done
That will however read the whole standard input into the variable. You can get away with only allocating memory for one line using an array, and make read read words of a line into that array:
while read -r -a array; do
    for word in "${array[@]}"; do
        stuff;
    done
done
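To illustrate that end of file is a condition rather than a byte, here is a short sketch using pipes instead of a terminal. Nothing in it is specific to the question's script.

```shell
# An empty stream delivers zero bytes: end of file adds nothing to the data.
printf '' | wc -c

# read reports end of input through its exit status, not through a character.
if printf '' | read -r line; then
    echo "read a line"
else
    echo "end of input"
fi

# A literal 0x04 byte in a pipe is ordinary data, not an end-of-file marker.
printf 'a\004b' | od -An -t x1
```

Only the terminal's line discipline gives Ctrl-d its special meaning; in a pipe or a file the same byte passes through like any other.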
To find what a control character is, run
$ cat | od -b
^D
0000000 004 012
0000002
I typed ^V^D after issuing the command, and then RET and another ^D (unquoted) and the result is that EOF is octal 004.
Combining that result with read(1):
$ read -d "$(echo -e '\004')" stdin
foo
bar quuz^Hx
^D
$ echo "$stdin"
foo
bar quux
$ for word in $stdin; do echo $word; done
foo
bar
quux
Yes, I typed ^H above for backspace to see if read(1) did the right thing. It does.
Two things...
The EOF character is represented by C-d (or C-v C-d if you want to type it), but to do what you're trying, it's better to do this:
while read line; do stuff "${line}"; done
litb & Daniel are right, I will just answer your "Just for kick" question:
Bash (like any command-line Unix program in general) only sees characters as bytes. So you cannot match Alt-v; you will match whatever bytes are sent to you by the UI (pseudo-tty) that interprets these keypresses from the user. It can even be Unix signals, not bytes at all. It depends on the terminal program used, the user settings and all kinds of other things, so I would advise you not to try to match them.
But if you know that your terminal sends C-v as the byte number 22 (0x16), you can use things like:
if test "$char" = '^V'; then...
by entering a real ^V char under your editor (C-q C-v under emacs, C-v C-v under an xterm , ...), not the two chars ^ and V
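If typing a raw control character into the editor is awkward, the byte can also be built from its octal code. A sketch, with ctrl_v and is_ctrl_v as made-up names, and still assuming the terminal sends C-v as byte 22:

```shell
# Build the control character from its octal code (026 = 22 = 0x16).
ctrl_v=$(printf '\026')

# Succeeds if the argument is exactly that single byte.
is_ctrl_v() {
    [ "$1" = "$ctrl_v" ]
}

if is_ctrl_v "$ctrl_v"; then
    echo "matched C-v"
fi
```

This keeps the script itself free of unprintable characters, which makes it easier to review and copy around.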
My own terminal driver, when getc returns the EOT, fclose's stdout and reopens. That way, when reader's getc senses an empty write queue and returns the EOF (non char value) to signal it's closed, user sub-routines such as the `cat' can shift the argument and eventually quit. Thus renders the EOF a stream condition or file marker, no value in the range of ``char''.