Clearing the screen by printing a character? - terminal

I'm using chez-scheme and I can't find a way to clear the screen completely. (If someone knows a better way than printing I'd be interested in that too but it's not my question here)
From what I can find clearing the screen by ^L (control-L) or giving the clear command (in bash at least) is equivalent to outputting ASCII character 12: Form feed.
However, printing this does nothing. If I use (display (integer->char 12)) it just prints a newline. Another way to encode this character is \f (analogous to \n for newline), but in Python print("\f") as well as in Scheme (display "\f") is just a newline.
Is my understanding of the meaning of ASCII 12 just wrong, or are implementations lacking?
Is there any way to clear the screen that should work across languages, analogous to \n for a newline?

If you want to clear the screen, the "ANSI" sequence in a printf
\033[2J
clears the entire screen, e.g.,
printf '\033[2J'
The command-line clear program uses this, along with moving the cursor to the "home" position, again an "ANSI" sequence:
\033[H
The program gets the information from the terminal database. For example, for TERM=vt100, it might see this (using \E as \033):
clear=\E[H\E[J$<50>
(the $<50> indicates padding needed for real VT100s). You might notice that the 2 is absent from this string. That is because the cursor is first moved to the home (upper left) position, and the 2 (entire screen) is not necessary. Eliminating that from the string made VT100s a little faster.
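As a rough illustration of the two sequences discussed above, here is a minimal Python sketch. The names ESC, HOME, ERASE_BELOW, ERASE_ALL, clear_sequence and clear_screen are my own, and it assumes the terminal honours ECMA-48 ("ANSI") control sequences; it does not consult the terminal database the way clear does.

```python
import sys

ESC = "\033"              # the escape character, ASCII 27
HOME = ESC + "[H"         # move the cursor to the upper-left corner
ERASE_BELOW = ESC + "[J"  # erase from the cursor to the end of the screen
ERASE_ALL = ESC + "[2J"   # erase the entire screen


def clear_sequence(full=True):
    """Return the text that homes the cursor and then erases the display."""
    return HOME + (ERASE_ALL if full else ERASE_BELOW)


def clear_screen(stream=sys.stdout):
    """Write the clearing sequence to a stream (assumed to be a terminal)."""
    stream.write(clear_sequence())
    stream.flush()
```

Calling clear_screen() from an interactive program should have the same visible effect as the printf above on such a terminal; clear_sequence(full=False) gives the shorter vt100-style string without the 2.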
On the other hand, if you just want to reset the terminal, you can use the VT100-style RIS:
\033c
but that has side-effects, besides not being in ECMA-48. These bug reports were for side-effects of \033c:
Debian Bug report logs - #60377
"reset" broken for dumb terminals
Debian Bug report logs - #239205
"reset changes a unicode console to non-unicode"
Further reading:
Why doesn't the screen clear when I type control/L?
XTerm Control Sequences
CSI Ps J Erase in Display (ED).
Ps = 0 -> Erase Below (default).
Ps = 1 -> Erase Above.
Ps = 2 -> Erase All.
Ps = 3 -> Erase Saved Lines (xterm).
ECMA-48: Control Functions for Coded Character Sets

You can print \033c which resets the terminal:
petite -q <<< '(display "\033c")'
\033 is escape and c is literal c.
I can't give you any information about how widely this is supported.

Related

What does writing "\r\027[1A\027[K" to stdout do?

I came across some code for a chat application in the terminal (in OCaml) and saw this string (in ASCII?) "\r\027[1A\027[K" being printed to the terminal before a new user message is printed.
I have tried googling the literals one by one, so I know that "\r" stands for carriage return and \027 for ESC in ASCII, but what do "[1A" and "[K" do? What character encoding is this?
And finally, what is the aggregate effect of this command?
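ESC followed by [ introduces a control sequence. A is the control sequence for "cursor up", and [1A moves the cursor up 1 line. K erases from the cursor to the end of the line. Since the \r has already moved the cursor to the first column, \x1b[1A\x1b[K moves up one line and deletes it (replaces it with spaces).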
Of course, that is only valid if the terminal that receives that string recognizes the control sequences. Not all do.
See https://en.wikipedia.org/wiki/ANSI_escape_code
The \027 looks odd if you are used to C, where ESC is written \033 in octal. In OCaml, however, a \ddd escape in a string literal is decimal, so \027 is character 27, which is ESC.
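To see the aggregate effect, here is a small Python sketch that applies the string to a toy screen model. Everything in it (the apply function, the list-of-strings screen, the sample chat lines) is invented for illustration; it handles only CR, cursor up and erase-to-end-of-line, and it assumes well-formed sequences. Note that Python spells ESC as "\x1b" (or octal "\033"), while OCaml's "\027" is a decimal escape for the same character.

```python
def apply(screen, row, col, data):
    """Apply CR, ESC [ n A and ESC [ K to a list-of-strings screen model."""
    i = 0
    while i < len(data):
        ch = data[i]
        if ch == "\r":
            col = 0                      # carriage return: back to column 0
            i += 1
        elif data.startswith("\x1b[", i):
            j = i + 2
            while data[j].isdigit():     # collect the numeric parameter
                j += 1
            param, final = data[i + 2:j], data[j]
            if final == "A":             # cursor up, default 1 line
                row = max(0, row - int(param or "1"))
            elif final == "K":           # erase from cursor to end of line
                screen[row] = screen[row][:col]
            i = j + 1
        else:                            # printable: overwrite at the cursor
            line = screen[row].ljust(col)
            screen[row] = line[:col] + ch + line[col + 1:]
            col += 1
            i += 1
    return row, col


# The user typed a line, pressed Enter, and the cursor is on the empty row 2.
screen = ["alice: hi", "> typed text", ""]
row, col = apply(screen, 2, 0, "\r\x1b[1A\x1b[K" + "bob: hello")
```

In this model the line the user typed is blanked and the new message is drawn in its place, which is presumably why the chat program prints the string before each message.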

Where does the extra 'D' come from in dup1.go?

I'm new to golang and learning it now. I'm reading "The Go Programming Language" book and trying to run the dup1 example on my Mac. But I noticed a very weird issue: the output of the count contains an extra "D". Does anyone have any idea why?
> go run dup1New.go test
test
test
hello
hello
world
3D test
2 hello
> cat dup1New.go
package main

import (
	"bufio"
	"fmt"
	"os"
)

func main() {
	counts := make(map[string]int)
	input := bufio.NewScanner(os.Stdin)
	for input.Scan() {
		counts[input.Text()]++
	}
	// NOTE: ignoring potential errors from input.Err()
	for line, n := range counts {
		if n > 1 {
			fmt.Printf("%d\t%s\n", n, line)
		}
	}
}
go version go1.13.5 darwin/amd64
You're getting that D character from Ctrl+D because of the echoctl option in your terminal device interface. You can turn it off by running this command in your shell/terminal:
stty -echoctl
Ref: man stty
As wlisrausr answered, this is in part from your MacOS Terminal stty settings. (You probably should not turn off echoctl, though.)
To be more complete: when you type the CTRL+D sequence to signal EOF,[1] the tty driver[2] "displays" the character as the two-character sequence ^D, but then prints two backspace or CTRL+H characters. More precisely, it does so as long as the ECHOCTL flag is set in the lflags control field in the underlying tty settings.
The window that is displaying the interactive Terminal session is treating output as directives to draw particular characters, move (position) the cursor, and have other interesting effects. Some character codes, particularly those in the range 0x20 (32 decimal) through 0x7e (126 decimal), are displayable ASCII characters. Others are control characters (as used in ANSI escape codes) or Unicode characters that have been encoded in UTF-8. Go itself uses UTF-8 extensively, to encode runes, so Go's use of UTF-8 dovetails nicely with Terminal's use of UTF-8.[3]
The CTRL+H, ASCII code 8—which they call BACKSPACE or BS—has the effect of moving the cursor back one display-column. That is, it is a cursor-positioning control code. (There are many of these; see the ANSI escape codes page. This stuff has a very long history, going back to just after the first glass tty.)
So, the CTRL+D has been displayed as ^D, but the cursor is positioned over the ^ (hat or caret or circumflex) character. Now you, in your Go program, send to the Terminal display-handling code, a sequence of ASCII codes: 3, which is 0x33 or 51 decimal; then TAB or CTRL+I or ASCII Horizontal Tab (HT), which is code 9; then the ASCII codes for the letters test (0x74, 0x65, 0x73, 0x74), then a newline or CTRL+J or ASCII NL, which is code 10.
Like backspace, a horizontal tab is a cursor positioning operation. It directs the terminal (or window emulation of terminal) to move the cursor to the next tab-stop, without changing anything else on the display. So you first overwrite the ^ with 3, leaving 3D visible, and the cursor positioned over the letter D. Then you have Terminal move the cursor to column 9 (columns are numbered from 1 and the default tab stop is at every eighth column) and display the word test, and then move the cursor to column 1 of a new line. The result is that the line shows:
3D test
(with exactly six blank positions between D and the first t). On the newly exposed or created line, which is currently all-blank, you print the character 2, move to column 9, and print the letters hello (and another newline directive).
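The overwrite described above can be reproduced with a toy model of a single terminal row. This is only a sketch: the render function is made up for illustration, it handles just backspace, horizontal tab and printable characters, and it assumes tab stops every eight columns.

```python
def render(data, tab=8):
    """Render one terminal row, handling only BS, HT and printable characters."""
    cells = []
    col = 0  # zero-based column index
    for ch in data:
        if ch == "\b":
            col = max(0, col - 1)          # backspace only moves the cursor
        elif ch == "\t":
            col = (col // tab + 1) * tab   # jump to the next tab stop
        else:
            while len(cells) <= col:       # pad with blanks up to the cursor
                cells.append(" ")
            cells[col] = ch                # overwrite whatever was there
            col += 1
    return "".join(cells)


echoed_by_tty = "^D\b\b"      # what the tty driver sends when ECHOCTL is set
printed_by_go = "3\ttest"     # what fmt.Printf("%d\t%s\n", ...) sends, minus the newline
row = render(echoed_by_tty + printed_by_go)
```

The 3 lands on top of the ^, the tab skips over the D without erasing it, and the row ends up reading 3D followed by six blanks and test.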
[1] In fact, control-D simply pushes the accumulating line through the "input canonization" queue as is. If the line is empty, this sends a zero-length record up the tty's read side. Reading zero bytes from a file or device-file is interpreted as EOF by many systems, including Go's os.File reader. If you type a partial line, without a terminating newline, and then use control-D to send it, you can no longer edit that partial line, and a reader that is reading and is not concerned with newlines will have obtained the data and be using it at this point. A second control-D is then required to signal the EOF: the reader simply got the non-newline terminated input from the first control-D.
[2] This link describes Linux tty drivers, but Linux tty drivers are derived from the same common ancestor behind MacOS tty drivers.
[3] This is not an accident, even though the Go folks are not the Darwin folks: again, all this stuff goes back (via different paths) to some common ancestors.

An obscure one: Documented VT100 'soft-wrap' escape sequence?

When connected to a remote BASH session via SSH (with the terminal type set to vt100), the console command line will soft-wrap when the cursor hits column 80.
What I am trying to discover is if the <space><carriage return> sequence that gets sent at this point is documented anywhere?
For example sending the following string
std::string str = "0123456789" // 1
"0123456789"
"0123456789" // 3
"0123456789"
"0123456789" // 5
"012345678 9"
"0123456789_" // 7
"0123456789"
"0";
gets the following response back from the host (Linux Mint as it happens)
01234567890123456789012345678901234567890123456789012345678<WS><WS><CR>90123456789_01234567890
The behaviour observed is not really part of bash; rather, it is part of the behaviour of the readline library. It doesn't happen if you simply use echo (which is a bash builtin) to output enough text to force an automatic line wrap, nor does it happen if bash produces an error message which is wider than the console. (Try, for example, the command . with an argument of more than 80 characters not corresponding to any existing file.)
So it's not an official "soft-wrap sequence", nor is it part of any standard. Rather, it's a pragmatic solution to one of the many irritating problems related to console display management.
There is an ambiguity in terminal implementation of line wrapping:
1. The terminal wraps after a character is inserted at the rightmost position.
2. The terminal wraps just before the next character is sent.
As a result, it is not possible to reliably send a newline after the last column position. If the terminal had already wrapped (option 1 above), then the newline will create an extra blank line. Otherwise (option 2), the following newline will be "eaten".
These days, almost all terminals follow some variant of option 2, which was the behaviour of the DEC VT-100 terminal. In the vocabulary of the terminfo terminal description database, this is called xenl: the "eat-newline-glitch".
There are actually two possible subvariants of option 2. In the one actually implemented by the VT-100 (and xterm), the cursor ends up in an anomalous state at the end of the line; effectively, it is one character position off the screen, so you can still backspace the cursor in the same line. Other historic terminals "ate" the newline, but positioned the cursor at the beginning of the next line anyway, so that a backspace would not be possible. (Unless the terminal has the bw capability.)
This creates a problem for programs which need to accurately keep track of the cursor position, even for apparently simple applications like echoing input. (Obviously, the easiest way to echo input is to let the terminal do that itself, but that precludes being able to implement extra control characters like tab completion.) Suppose the user has entered text right up to the right margin, and then types the backspace character to delete the last character typed. Normally, you could implement a backspace-delete by outputting a cub1 (move left 1) code and then an el (clear to end of line). (It's more complicated if the deletion is in the middle of a line, but the principle is the same.)
However, if the cursor could possibly be at the beginning of the next line, this won't work. If you knew the cursor was at the beginning of the next, you could move up and then to the right before doing the el, but that wouldn't work if the cursor was still on the same line.
Historically, what was considered "correct" was to force the cursor to the next line with a hard return. (Following quote is taken from the file terminfo.src found in the ncurses distribution. I don't know who wrote it or when):
# Note that the <xenl> glitch in vt100 is not quite the same as on the Concept,
# since the cursor is left in a different position while in the
# weird state (concept at beginning of next line, vt100 at end
# of this line) so all versions of vi before 3.7 don't handle
# <xenl> right on vt100. The correct way to handle <xenl> is when
# you output the char in column 80, immediately output CR LF
# and then assume you are in column 1 of the next line. If <xenl>
# is on, am should be on too.
But there is another way to handle the issue which doesn't require you to even know whether the terminal has the xenl "glitch" or not: output a space character, after which the terminal will definitely have line-wrapped, and then return to the leftmost column.
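A minimal Python sketch of that idea follows. It is an illustration only, not readline's actual code: the function name echo_with_wrap and its parameters are invented here, and it assumes every character occupies exactly one column.

```python
def echo_with_wrap(text, cols=80, start_col=0):
    """Return the characters to send so that wrapping happens at a known point.

    After a character lands in the last column, send a space and a CR.
    The space forces any pending wrap to take place, and the CR returns the
    cursor to the first column of the new line, where the next character
    overwrites the space.
    """
    out = []
    col = start_col            # zero-based column of the cursor
    for ch in text:
        out.append(ch)
        col += 1
        if col == cols:        # we just filled the rightmost column
            out.append(" \r")
            col = 0            # we now know exactly where the cursor is
    return "".join(out)


sent = echo_with_wrap("abcdef", cols=4)
```

With a 4-column "terminal" the six characters come out as abcd, then space and CR, then ef, which has the same shape as the <WS><CR> seen in the question.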
As it turns out, this trick has another benefit if the terminal emulator is xterm (and probably other such emulators), which allows you to select a "word" by double-clicking on it. If the automatic line wrap happens in the middle of a word, it would be ideal if you could still select the entire word even though it is split over two lines. If you follow the suggestion in the terminfo file above, then xterm will (quite reasonably) treat the split word as two words, because they have an explicit newline between them. But if you let the terminal wrap automatically, xterm treats the result as a single word. (It does this despite the output of the space character, presumably because the space character was overwritten.)
In short, the SPCR sequence is not in any way a standardized feature of the VT100 terminal. Rather, it is a pragmatic response to a specific feature of terminal descriptions combined with the observed behaviour of a specific (and common) terminal emulator. Variants of this code can be found in a variety of codebases, and although as far as I know it is not part of any textbook or formal documentation, it is certainly part of terminal-handling folkcraft [note 2].
In the case of readline, you'll find a comment in the code which is much more telegraphic than this answer: [note 1]
/* If we're at the right edge of a terminal that supports xn, we're
ready to wrap around, so do so. This fixes problems with knowing
the exact cursor position and cut-and-paste with certain terminal
emulators. In this calculation, TEMP is the physical screen
position of the cursor. */
(xn is the short form of xenl.)
Notes
1. The comment is at line 1326 of display.c in the current view of the git repository as I type this answer. In future versions it may be at a different line number, and the provided link will therefore not work. If you notice that it has changed, please feel free to correct the link.
2. In the original version of this answer, I described this procedure as "part of terminal handling folklore", in which I used the word "folklore" to describe knowledge passed down from programmer to programmer rather than being part of the canon of academic texts and international standards. While "folklore" is often used with a negative connotation, I use it without such prejudice. "lore" (according to wiktionary) refers to "all the facts and traditions about a particular subject that have been accumulated over time through education or experience", and is derived from an Old Germanic word meaning "teach". Folklore is therefore the accumulated education and experience of the "folk", as opposed to the establishment: in Eric S. Raymond's analogy of the Cathedral and the Bazaar, folklore is the knowledge base of the Bazaar.
This usage raised the eyebrows of at least one highly-skilled practitioner, who suggested the use of the word "esoteric" to describe this bit of information about terminal-handling. "Esoteric" (again according to wiktionary) applies to information "intended for or likely to be understood by only a small number of people with a specialized knowledge or interest, or an enlightened inner circle", being derived from the Greek ἐσωτερικός, "inner circle". (In other words, the knowledge of the Cathedral.)
While the semantic discussion is, at least, amusing, I changed the text by using the hopefully less emotionally-charged word "folkcraft".
There is more than one reason for making line-wrapping a special case (and "folklore" seems an inappropriate term):
The xterm FAQ entry "That description of wrapping is odd, say more?" is one of many places discussing vt100 line-wrapping.
vim and screen both take care to not use cursor-addressing to avoid the wrapping, since that would interfere with selecting a wrapped line in xterm. Instead (and the sample seems to show bash doing this too) they send a series of printable characters which step across the margin before sending other control sequences which would prevent the line-wrapping flag from being set in xterm. This is noted in xterm's manual page:
Logical words and lines selected by double- or triple-clicking may wrap
across more than one screen line if lines were wrapped by xterm itself
rather than by the application running in the window.
As for "comments in code" - there certainly are, to explain to maintainers what should not be changed. This from Sven Mascheck's XTerm resource file gives a good explanation:
! Wether this works also with _wrapped_ selections, depends on
! - the terminal emulator: Neither MIT X11R5/6 nor Suns openwin xterm
! know about that. Use the 'xfree xterm' or 'rxvt'. Both compile on
! all major platforms.
! - It only works if xterm is wrapping the line itself
! (not always really obvious for the user, though).
! - Among the different vi's, vim actually supports this with a
! clever and little hackish trick (see screen.c):
!
! But before: vim inspects the _name_ of the value of TERM.
! This must be similar to "xterm" (like "xterm-xfree86", which is
! better than "xterm-color", btw, see his FAQ).
! The terminfo entry _itself_ doesn't matter here
! (e.g.: 'xterm' and 'vs100' are the same entry, but with
! the latter it doesn't work).
!
! If vim has to wrap a word, it appends a space at the first part,
! this space will be wrapped by xterm. Going on with writing, vim
! in turn then positions the cursor again at the _beginning_ of this
! next line. Thus, the space is not visible. But xterm now believes
! that the two lines are actually a single one--as xterm _has_ done
! some wrapping also...
The comment which @rici quotes came from the terminfo file which Eric Raymond incorporated from SCO in 1995. The history section of the terminfo source refers to this. Some of the material in that is based on the BSD termcap sources, but differs, as one would notice when comparing the BSD termcap in this section with ncurses. The four paragraphs beginning with the "not quite" are the same (aside from line-wrapping) with the SCO file. Here is a cut/paste from that file:
# # --------------------------------
#
# dec: DEC (DIGITAL EQUIPMENT CORPORATION)
#
# Manufacturer: DEC (DIGITAL EQUIPTMENT CORP.)
# Class: II
#
# Info:
# Note that xenl glitch in vt100 is not quite the same as concept,
# since the cursor is left in a different position while in the
# weird state (concept at beginning of next line, vt100 at end
# of this line) so all versions of vi before 3.7 don't handle
# xenl right on vt100. The correct way to handle xenl is when
# you output the char in column 80, immediately output CR LF
# and then assume you are in column 1 of the next line. If xenl
# is on, am should be on too.
#
# I assume you have smooth scroll off or are at a slow enough baud
# rate that it doesn't matter (1200? or less). Also this assumes
# that you set auto-nl to "on", if you set it off use vt100-nam
# below.
#
# The padding requirements listed here are guesses. It is strongly
# recommended that xon/xoff be enabled, as this is assumed here.
#
# The vt100 uses rs2 and rf rather than is2/tbc/hts because the
# tab settings are in non-volatile memory and don't need to be
# reset upon login. Also setting the number of columns glitches
# the screen annoyingly. You can type "reset" to get them set.
#
# smkx and rmkx, given below, were removed.
# smkx=\E[?1h\E=, rmkx=\E[?1l\E>,
# Somtimes smkx and rmkx are included. This will put the auxilliary keypad in
# dec application mode, which is not appropriate for SCO applications.
vt100|vt100-am|dec vt100 (w/advanced video),
If you compare the two, the ncurses version has angle brackets added around the terminfo capability names, and a minor grammatical change was made in the first sentence. But the author of the comment clearly was not Raymond.

How can I make my terminal prompt extend the width of the terminal?

I noticed in this video, that the terminal prompt extends the entire width of the terminal before breaking down to a new line. How can I set my PS1 variable to fill the remaining terminal space with some character, like the way this user did?
The issue is, I don't know how to update the PS1 variable per command. It seems to me, that the string value for PS1 is only read in once just as the .bashrc file is only read in once. Do I have to write some kind of hook after each command or something?
I should also point out that the PS1 variable will be evaluated to a different length based on the escape characters that make it up. For example, \w prints the path.
I know I can get the terminal width using $COLUMNS, and the width of the current PS1 variable with ${#PS1}, do the math, and print the right amount of buffer characters, but how do I get it to update every time? Is there a preferred way?
Let's suppose you want your prompt to look something like this:
left text----------------------------------------------------------right text
prompt$
This is pretty straight-forward provided that right text has a known size. (For example, it might be the current date and time.) What we do is to print the right number of dashes (or, for utf-8 terminals, the prettier \u2500), followed by right text, then a carriage return (\r, not a newline) and the left text, which will overwrite the dashes. The only tricky bit is "the right number of dashes", but we can use $(tput cols) to see how wide the terminal is, and fortunately bash will command-expand PS1. So, for example:
PS1='\[$(printf "%*s" $(($(tput cols)-20)) "" | sed "s/ /-/g") \d \t\r\u@\h:\w \]\n\$ '
Here, $(($(tput cols)-20)) is the width of the terminal minus 20, which is based on \d \t being exactly 20 characters wide (including the initial space).
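The arithmetic is easier to see outside of the prompt-escape syntax. Here is a short Python sketch of the same layout; the function rule_line and the sample texts are invented for illustration, and len() counts characters, which only matches screen columns for single-width text.

```python
import shutil


def rule_line(cols, right_text, left_text, fill="-"):
    """Build: fill characters, the right text, a CR, then the left text.

    The fill plus the right text is exactly cols wide; after the carriage
    return the left text is drawn over the start of the fill.
    """
    dashes = fill * max(0, cols - len(right_text))
    return dashes + right_text + "\r" + left_text


# Hypothetical values standing in for \d \t and \u@\h:\w in the prompt.
right = " Mon Jan 01 12:00:00"   # 20 characters, including the leading space
left = "user@host:~ "
line = rule_line(40, right, left)

# For a real terminal, the width would come from the environment instead:
cols = shutil.get_terminal_size().columns
```

shutil.get_terminal_size() plays the role of $(tput cols) here; it falls back to 80 columns when the size cannot be determined.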
PS1 does not understand utf-8 escapes (\uxxxx), and inserting the appropriate substitution into the sed command involves an annoying embedded quote issue, although it's possible. However, printf does understand utf-8 escapes, so it is easier to produce the sequence of dashes in a different way:
PS1='\[$(printf "\\u2500%.0s" $(seq 21 $(tput cols))) \d \t\r\u@\h:\w \]\n\$ '
Yet another way to do this involves turning off the terminal's autowrap, which is possible if you are using xterm or a terminal emulator which implements the same control codes (or the linux console itself). To disable autowrap, output the sequence ESC[?7l. To turn it back on, use ESC[?7h. With autowrap disabled, once output reaches the end of a line, the last character will just get overwritten with the next character instead of starting a new line. With this technique, it's not really necessary to compute the exact length of the dash sequence; we just need a string of dashes which is longer than any console will be wide, say the following:
DASHES="$(printf '\u2500%0.s' {1..1000})"
PS1='\[\e[?7l\u@\h:\w $DASHES \e[19D \d \t\e[?7h\]\n\$ '
Here, \e[19D is the terminal-emulator code for "move cursor backwards 19 characters". I could have used $(tput cub 19) instead. (There might be a tput parameter for turning autowrap on and off, but I don't know what it would be.)
The example in the video also involves inserting a right-aligned string in the actual command line. I don't know any clean way of doing this with bash; the console in the video is almost certainly using zsh with the RPROMPT feature. Of course, you can output right-aligned prompts in bash, using the same technique as above, but readline won't know anything about them, so as soon as you do something to edit the line, the right prompt will vanish.
Use PROMPT_COMMAND to reset the value of PS1 before each command.
PROMPT_COMMAND=set_prompt
set_prompt () {
PS1=...
}
Some system script (or you yourself) may already use PROMPT_COMMAND for something, in which case you can simply add to it:
PROMPT_COMMAND="$PROMPT_COMMAND; set_prompt"

How to escape unicode characters in bash prompt correctly

I have a specific method for my bash prompt, let's say it looks like this:
CHAR="༇ "
my_function="
prompt=\" \[\$CHAR\]\"
echo -e \$prompt"
PS1="\$(${my_function}) \$ "
To explain the above, I'm building my bash prompt by executing a function stored in a string, which was a decision made as the result of this question. Let's pretend like it works fine, because it does, except when unicode characters get involved.
I am trying to find the proper way to escape a unicode character, because right now it messes with the bash line length. An easy way to test if it's broken is to type a long command, execute it, press CTRL-R and type to find it, and then pressing CTRL-A CTRL-E to jump to the beginning / end of the line. If the text gets garbled then it's not working.
I have tried several things to properly escape the unicode character in the function string, but nothing seems to be working.
Special characters like this work:
COLOR_BLUE=$(tput sgr0 && tput setaf 6)
my_function="
prompt=\"\\[\$COLOR_BLUE\\] \"
echo -e \$prompt"
Which is the main reason I made the prompt a function string. That escape sequence does NOT mess with the line length, it's just the unicode character.
The \[...\] sequence says to ignore this part of the string completely, which is useful when your prompt contains a zero-length sequence, such as a control sequence which changes the text color or the title bar, say. But in this case, you are printing a character, so the length of it is not zero. Perhaps you could work around this by, say, using a no-op escape sequence to fool Bash into calculating the correct line length, but it sounds like that way lies madness.
The correct solution would be for the line length calculations in Bash to correctly grok UTF-8 (or whichever Unicode encoding it is that you are using). Uhm, have you tried without the \[...\] sequence?
Edit: The following implements the solution I propose in the comments below. The cursor position is saved, then two spaces are printed, outside of \[...\], then the cursor position is restored, and the Unicode character is printed on top of the two spaces. This assumes a fixed font width, with double width for the Unicode character.
PS1='\['"`tput sc`"'\] \['"`tput rc`"'༇ \] \$ '
At least in the OSX Terminal, Bash 3.2.17(1)-release, this passes cursory [sic] testing.
In the interest of transparency and legibility, I have ignored the requirement to have the prompt's functionality inside a function, and the color coding; this just changes the prompt to the character, space, dollar prompt, space. Adapt to suit your somewhat more complex needs.
@tripleee wins it, posting the final solution here because it's a pain to post code in comments:
CHAR="༇"
my_function="
prompt=\" \\[`tput sc`\\] \\[`tput rc`\\]\\[\$CHAR\\] \"
echo -e \$prompt"
PS1="\$(${my_function}) \$ "
The trick as pointed out in @tripleee's link is the use of the commands tput sc and tput rc which save and then restore the cursor position. The code is effectively saving the cursor position, printing two spaces for width, restoring the cursor position to before the spaces, then printing the special character so that the width of the line is from the two spaces, not the character.
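The width bookkeeping behind this trick can be sketched in Python. This is a model, not bash's code: the helper names are invented, "\0337" and "\0338" are assumed to be what tput sc and tput rc emit (true for xterm-like terminfo entries, but check your own terminal), and \001 and \002 are the readline markers that bash's \[ and \] are turned into.

```python
RL_START = "\001"   # readline's "start ignoring" marker (bash's \[)
RL_END = "\002"     # readline's "stop ignoring" marker (bash's \])
SAVE = "\0337"      # assumed output of `tput sc`: ESC 7
RESTORE = "\0338"   # assumed output of `tput rc`: ESC 8


def hide(s):
    """Wrap text so that readline does not count it towards the prompt width."""
    return RL_START + s + RL_END


def prompt_with_wide_char(char, width=2):
    """Save cursor, print counted spaces, restore, draw char over the spaces."""
    return hide(SAVE) + " " * width + hide(RESTORE + char) + " $ "


def counted_width(prompt):
    """Width as readline would count it: everything outside hidden spans."""
    visible, hidden = 0, False
    for ch in prompt:
        if ch == RL_START:
            hidden = True
        elif ch == RL_END:
            hidden = False
        elif not hidden:
            visible += 1
    return visible


prompt = prompt_with_wide_char("༇")
```

The counted width is the two spaces plus the trailing " $ ", regardless of how the character itself is encoded, which is the point of the workaround.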
(Not the answer to your problem, but some pointers and general experience related to your issue.)
I see the behaviour you describe about cmd-line editing (Ctrl-R, ... Ctrl-A Ctrl-E ...) all the time, even without unicode chars.
At one work-site, I spent the time to figure out the diff between the terminal's interpretation of the TERM setting vs. the TERM definition used by the OS (well, stty I suppose).
NOW, when I have this problem, I escape out of my current attempt to edit the line, bring the line up again, and then immediately go to the 'vi' mode, which opens the vi editor. (press just the 'v' char, right?). All the ease of use of a full-fledged session of vi; why go with less ;-)?
Looking again at your problem description, when you say
my_function="
prompt=\" \[\$CHAR\]\"
echo -e \$prompt"
That is just a string definition, right? I'm assuming you're simplifying the problem definition by assuming this is the output of your my_function. It seems very likely that the steps of creating the function definition, calling the function AND using the values returned offer a lot of opportunities for shell-quoting to not work the way you want it to.
If you edit your question to include the my_function definition, and its complete use (reducing your function to just what is causing the problem), it may be easier for others to help with this too. Finally, do you use set -vx regularly? It can help show the how/when/what of variable expansions; you may find something there.
Failing all of those, look at O'Reilly's termcap & terminfo. You may need to look at the man page for your local system's stty and related cmds AND you may do well to look for user groups specific to your Linux system (I'm assuming you use a Linux variant).
I hope this helps.
