Any way to embed a comment within a Tcl command?

I would like to have a comment within a command and it appears that this is not possible given that the '#' character is defined in Tcl 8.4 to be:
If a hash character (``#'') appears at a point where Tcl is expecting the first character of the first word of a command, then the hash character and the characters that follow it, up through the next newline, are treated as a comment and ignored. The comment character only has significance when it appears at the beginning of a command.
Imagine this example of how this might work (none of these comments worked in my experiments):
array set myArray [list red 3 \
blue 4 ;# Blue is before purple.
purple 5 # Purple is after red.
green 7 \
yellow 8]
Seems the tricky part is how to continue the list command with a comment embedded? Perhaps something like the C++ style of /* Embedded comment here. */ but I only see # as being used in Tcl for comments to end of line, nothing for begin and end comment syntax.

No, you cannot embed a comment within the invocation of a command. Comments in Tcl don't quite work the same way as they do in other languages. Some people stumble over this; most experienced Tcl programmers don't give it a second thought.
On the rare occasions when you truly need to do this, you can usually work around it easily enough. Using your example:
set myArray(red) 3
set myArray(blue) 4 ;# Blue is before purple
set myArray(purple) 5 ;# Purple is after red
set myArray(green) 7
set myArray(yellow) 8
You might think this is slower than doing it all on one line but the difference is negligible in all but the most time-critical situations, probably on the order of just a few microseconds.

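Yes, there is a way to embed a comment into a command. It's not pretty, but it's possible. Add a command substitution containing only a comment to a list member, like this (the newline after the comment is mandatory, since the comment would otherwise swallow the closing bracket). The substitution evaluates to an empty string, so the value of the list member is unchanged: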
array set myArray [list red 3 \
blue 4[
# Blue is before purple.
] \
purple 5[
# Purple is after red.
] \
green 7 \
yellow 8]
% array get myArray
yellow 8 purple 5 blue 4 green 7 red 3

REGEX strategy to replace line breaks inside group

I'm trying to make a replacement in one pass with just one Regular expression but I think this is not possible at all. I'm using RegexBuddy but I'm always getting a catastrophic result and the expression cannot be evaluated.
For this text:
3 bla bla! !
4 yep yep! ?
FROM HERE
5 something randdom here!
6 perhaps some HTML there
TO HERE
7 what ever you like over here
8 and that's all folks!
I want to find a REGEX that replaces the line breaks by something else, let's say $$, but only on the section "from here" "to here". So basically the end result would be this:
3 bla bla! !
4 yep yep! ?
FROM HERE$$
$$
5 something randdom here!$$
$$
6 perhaps some HTML there$$
$$
TO HERE
7 what ever you like over here
8 and that's all folks!
I have this expression
((FROM HERE))((.*)(\n))+(TO HERE)
But I'm stuck so far trying to replace just the \n group by something else. I have done similar things in the past so I would say this should be possible in one go.
If this is not possible in regex I would simply create a C# console app to take first that text to a string and then replace each \n by $$, then put it back. That shouldn't be that difficult.

If you are using .NET, one option could be the following (with RegexOptions.Multiline enabled, so that ^ matches at the start of each line):
(?s)(?<=^FROM HERE\b.*?)\r?\n\r?\n(?=.*?^TO HERE\b)
(?s) Inline modifier, dot matches a newline
(?<=^FROM HERE\b.*?) Assert FROM HERE at the left at the start of the line
\r?\n\r?\n Match 2 newlines
(?=.*?^TO HERE\b) Assert TO HERE at the right at the start of the line
In the replacement, write each literal $ as $$, so the text $$ becomes $$$$:
\n$$$$\n
See a .NET regex demo and a C# demo.
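For comparison, here is a rough Ruby sketch of the fallback the question mentions: find the section first, then replace the line breaks inside it. As far as I know, Ruby's regex engine does not accept the unbounded lookbehind used above, so this swaps in a block-form gsub instead. The helper name replace_breaks_in_section and the decision to replace every newline in the section with the marker are assumptions made for illustration.

```ruby
# Hypothetical helper: replaces each line break between the FROM HERE
# and TO HERE marker lines with `marker`, leaving the rest untouched.
def replace_breaks_in_section(text, marker = '$$')
  # /m lets . match newlines; ^ already matches at every line start in Ruby.
  text.gsub(/^FROM HERE\b.*?^TO HERE\b/m) do |section|
    # Plain string replacement: "$" has no special meaning here,
    # so no doubling is needed (unlike the .NET replacement above).
    section.gsub("\n", marker)
  end
end

sample = "4 yep yep! ?\nFROM HERE\n5 something\n6 perhaps\nTO HERE\n7 what ever\n"
puts replace_breaks_in_section(sample)
```

Because the inner gsub only ever sees the matched section, lines outside the two markers keep their line breaks.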

An obscure one: Documented VT100 'soft-wrap' escape sequence?

When connected to a remote BASH session via SSH (with the terminal type set to vt100), the console command line will soft-wrap when the cursor hits column 80.
What I am trying to discover is if the <space><carriage return> sequence that gets sent at this point is documented anywhere?
For example sending the following string
std::string str = "0123456789" // 1
"0123456789"
"0123456789" // 3
"0123456789"
"0123456789" // 5
"012345678 9"
"0123456789_" // 7
"0123456789"
"0";
gets the following response back from the host (Linux Mint as it happens)
01234567890123456789012345678901234567890123456789012345678<WS><WS><CR>90123456789_01234567890

The behaviour observed is not really part of bash; rather, it is part of the behaviour of the readline library. It doesn't happen if you simply use echo (which is a bash builtin) to output enough text to force an automatic line wrap, nor does it happen if bash produces an error message which is wider than the console. (Try, for example, the command . with an argument of more than 80 characters not corresponding to any existing file.)
So it's not an official "soft-wrap sequence", nor is it part of any standard. Rather, it's a pragmatic solution to one of the many irritating problems related to console display management.
There is an ambiguity in terminal implementation of line wrapping:
1. The terminal wraps after a character is inserted at the rightmost position.
2. The terminal wraps just before the next character is sent.
As a result, it is not possible to reliably send a newline after the last column position. If the terminal had already wrapped (option 1 above), then the newline will create an extra blank line. Otherwise (option 2), the following newline will be "eaten".
These days, almost all terminals follow some variant of option 2, which was the behaviour of the DEC VT-100 terminal. In the vocabulary of the terminfo terminal description database, this is called xenl: the "eat-newline-glitch".
There are actually two possible subvariants of option 2. In the one actually implemented by the VT-100 (and xterm), the cursor ends up in an anomalous state at the end of the line; effectively, it is one character position off the screen, so you can still backspace the cursor in the same line. Other historic terminals "ate" the newline, but positioned the cursor at the beginning of the next line anyway, so that a backspace would not be possible. (Unless the terminal has the bw capability.)
This creates a problem for programs which need to accurately keep track of the cursor position, even for apparently simple applications like echoing input. (Obviously, the easiest way to echo input is to let the terminal do that itself, but that precludes being able to implement extra control characters like tab completion.) Suppose the user has entered text right up to the right margin, and then types the backspace character to delete the last character typed. Normally, you could implement a backspace-delete by outputting a cub1 (move left 1) code and then an el (clear to end of line). (It's more complicated if the deletion is in the middle of a line, but the principle is the same.)
However, if the cursor could possibly be at the beginning of the next line, this won't work. If you knew the cursor was at the beginning of the next line, you could move up and then to the right before doing the el, but that wouldn't work if the cursor was still on the same line.
Historically, what was considered "correct" was to force the cursor to the next line with a hard return. (The following quote is taken from the file terminfo.src found in the ncurses distribution. I don't know who wrote it or when):
# Note that the <xenl> glitch in vt100 is not quite the same as on the Concept,
# since the cursor is left in a different position while in the
# weird state (concept at beginning of next line, vt100 at end
# of this line) so all versions of vi before 3.7 don't handle
# <xenl> right on vt100. The correct way to handle <xenl> is when
# you output the char in column 80, immediately output CR LF
# and then assume you are in column 1 of the next line. If <xenl>
# is on, am should be on too.
But there is another way to handle the issue which doesn't require you to even know whether the terminal has the xenl "glitch" or not: output a space character, after which the terminal will definitely have line-wrapped, and then return to the leftmost column.
As it turns out, this trick has another benefit if the terminal emulator is xterm (and probably other such emulators), which allows you to select a "word" by double-clicking on it. If the automatic line wrap happens in the middle of a word, it would be ideal if you could still select the entire word even though it is split over two lines. If you follow the suggestion in the terminfo file above, then xterm will (quite reasonably) treat the split word as two words, because they have an explicit newline between them. But if you let the terminal wrap automatically, xterm treats the result as a single word. (It does this despite the output of the space character, presumably because the space character was overwritten.)
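A minimal Ruby sketch of the space-then-return trick, to make the byte sequence concrete. It is not readline's actual code: the helper name soft_wrap is hypothetical, and it assumes the cursor starts in column 1, there is no prompt, and every character occupies exactly one cell.

```ruby
# Hypothetical helper: returns the bytes a program might send so that the
# terminal itself performs the wrap after each full row of `columns` cells.
def soft_wrap(text, columns = 80)
  text.chars.each_slice(columns).map do |cells|
    row = cells.join
    # After filling the last column, emit a space (which forces the wrap
    # whether or not the terminal has xenl) and a carriage return (which
    # puts the cursor back in column 1 of the new line).
    cells.length == columns ? row + " \r" : row
  end.join
end

# Inspect the raw string rather than sending it to a terminal.
p soft_wrap("0123456789" * 2 + "abc", 10)
```

The next character written after the carriage return overwrites the space, which is why the space is never visible on screen.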
In short, the SP CR sequence is not in any way a standardized feature of the VT100 terminal. Rather, it is a pragmatic response to a specific feature of terminal descriptions combined with the observed behaviour of a specific (and common) terminal emulator. Variants of this code can be found in a variety of codebases, and although as far as I know it is not part of any textbook or formal documentation, it is certainly part of terminal-handling folkcraft [note 2].
In the case of readline, you'll find a comment in the code which is much more telegraphic than this answer: [note 1]
/* If we're at the right edge of a terminal that supports xn, we're
ready to wrap around, so do so. This fixes problems with knowing
the exact cursor position and cut-and-paste with certain terminal
emulators. In this calculation, TEMP is the physical screen
position of the cursor. */
(xn is the short form of xenl.)
Notes
1. The comment is at line 1326 of display.c in the current view of the git repository as I type this answer. In future versions it may be at a different line number, and the provided link will therefore not work. If you notice that it has changed, please feel free to correct the link.
2. In the original version of this answer, I described this procedure as "part of terminal handling folklore", in which I used the word "folklore" to describe knowledge passed down from programmer to programmer rather than being part of the canon of academic texts and international standards. While "folklore" is often used with a negative connotation, I use it without such prejudice. "lore" (according to Wiktionary) refers to "all the facts and traditions about a particular subject that have been accumulated over time through education or experience", and is derived from an Old Germanic word meaning "teach". Folklore is therefore the accumulated education and experience of the "folk", as opposed to the establishment: in Eric S. Raymond's analogy of the Cathedral and the Bazaar, folklore is the knowledge base of the Bazaar.
This usage raised the eyebrows of at least one highly-skilled practitioner, who suggested the use of the word "esoteric" to describe this bit of information about terminal-handling. "Esoteric" (again according to wiktionary) applies to information "intended for or likely to be understood by only a small number of people with a specialized knowledge or interest, or an enlightened inner circle", being derived from the Greek ἐσωτερικός, "inner circle". (In other words, the knowledge of the Cathedral.)
While the semantic discussion is, at least, amusing, I changed the text by using the hopefully less emotionally-charged word "folkcraft".

There is more than one reason for making line-wrapping a special case (and "folklore" seems an inappropriate term):
The xterm FAQ entry "That description of wrapping is odd, say more?" is one of many places discussing vt100 line-wrapping.
vim and screen both take care to not use cursor-addressing to avoid the wrapping, since that would interfere with selecting a wrapped line in xterm. Instead (and the sample seems to show bash doing this too) they send a series of printable characters which step across the margin before sending other control sequences which would prevent the line-wrapping flag from being set in xterm. This is noted in xterm's manual page:
Logical words and lines selected by double- or triple-clicking may wrap
across more than one screen line if lines were wrapped by xterm itself
rather than by the application running in the window.
As for "comments in code" - there certainly are, to explain to maintainers what should not be changed. This from Sven Mascheck's XTerm resource file gives a good explanation:
! Wether this works also with _wrapped_ selections, depends on
! - the terminal emulator: Neither MIT X11R5/6 nor Suns openwin xterm
! know about that. Use the 'xfree xterm' or 'rxvt'. Both compile on
! all major platforms.
! - It only works if xterm is wrapping the line itself
! (not always really obvious for the user, though).
! - Among the different vi's, vim actually supports this with a
! clever and little hackish trick (see screen.c):
!
! But before: vim inspects the _name_ of the value of TERM.
! This must be similar to "xterm" (like "xterm-xfree86", which is
! better than "xterm-color", btw, see his FAQ).
! The terminfo entry _itself_ doesn't matter here
! (e.g.: 'xterm' and 'vs100' are the same entry, but with
! the latter it doesn't work).
!
! If vim has to wrap a word, it appends a space at the first part,
! this space will be wrapped by xterm. Going on with writing, vim
! in turn then positions the cursor again at the _beginning_ of this
! next line. Thus, the space is not visible. But xterm now believes
! that the two lines are actually a single one--as xterm _has_ done
! some wrapping also...
The comment which @rici quotes came from the terminfo file which Eric Raymond incorporated from SCO in 1995. The history section of the terminfo source refers to this. Some of the material in that is based on the BSD termcap sources, but differs, as one would notice when comparing the BSD termcap in this section with ncurses. The four paragraphs beginning with the "not quite" are the same (aside from line-wrapping) as in the SCO file. Here is a cut/paste from that file:
# # --------------------------------
#
# dec: DEC (DIGITAL EQUIPMENT CORPORATION)
#
# Manufacturer: DEC (DIGITAL EQUIPTMENT CORP.)
# Class: II
#
# Info:
# Note that xenl glitch in vt100 is not quite the same as concept,
# since the cursor is left in a different position while in the
# weird state (concept at beginning of next line, vt100 at end
# of this line) so all versions of vi before 3.7 don't handle
# xenl right on vt100. The correct way to handle xenl is when
# you output the char in column 80, immediately output CR LF
# and then assume you are in column 1 of the next line. If xenl
# is on, am should be on too.
#
# I assume you have smooth scroll off or are at a slow enough baud
# rate that it doesn't matter (1200? or less). Also this assumes
# that you set auto-nl to "on", if you set it off use vt100-nam
# below.
#
# The padding requirements listed here are guesses. It is strongly
# recommended that xon/xoff be enabled, as this is assumed here.
#
# The vt100 uses rs2 and rf rather than is2/tbc/hts because the
# tab settings are in non-volatile memory and don't need to be
# reset upon login. Also setting the number of columns glitches
# the screen annoyingly. You can type "reset" to get them set.
#
# smkx and rmkx, given below, were removed.
# smkx=\E[?1h\E=, rmkx=\E[?1l\E>,
# Somtimes smkx and rmkx are included. This will put the auxilliary keypad in
# dec application mode, which is not appropriate for SCO applications.
vt100|vt100-am|dec vt100 (w/advanced video),
If you compare the two, the ncurses version has angle brackets added around the terminfo capability names, and a minor grammatical change was made in the first sentence. But the author of the comment clearly was not Raymond.

How to write a hashtag matching regex

I have a problem with writing a regex (in Ruby, but I don't think that it changes anything) that selects all proper hashtags.
I used ( /(^|\s)(#+)(\w+)(\s|$)/ ), which doesn't work and I have no idea why.
In this example:
#start #middle #middle2 #middle3 bad#example #another#bad#example #end
it should mark #start, #middle, #middle2, #middle3 and #end.
Why doesn't my code work and how should a proper regex look?

As for why the original does not work, let's look at each part:
1. (^|\s) start of line or white space
2. (#+) one or more #
3. (\w+) one or more word characters (letters, digits or underscore)
4. (\s|$) white space or end of line
The main problem is a conflict between 1 and 4. When a tag is matched, the white space after it is consumed by part 4, so it is no longer available for part 1 of the next tag. That next tag therefore cannot match, and the search moves on to the next possible position.
Part 4 is not really needed, since part 3 will not match white space.
So here is the result
(?:^|\s)#(\w+)
https://regex101.com/r/iU4dZ3/3
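The conflict described above can be seen by running the patterns against the question's sample text in Ruby. This is only a sketch: the constant names are made up for the example, and the STRICT pattern is a lookaround variant of my own, not something taken from the answers. It checks the surrounding white space without consuming it, which also keeps #another#bad#example out (the shorter pattern above would still pick up #another).

```ruby
TEXT = '#start #middle #middle2 #middle3 bad#example #another#bad#example #end'

# The question's pattern: the trailing (\s|$) consumes the space that the
# next match would need for its leading (^|\s), so every other tag is lost.
ORIGINAL = /(^|\s)(#+)(\w+)(\s|$)/
p TEXT.scan(ORIGINAL).map { |groups| groups[2] }

# The pattern from the answer above: no trailing group, one capture.
ANSWER = /(?:^|\s)#(\w+)/
p TEXT.scan(ANSWER).flatten

# Assumed variant: a tag must not touch any non-space character on
# either side; the lookarounds test this without consuming anything.
STRICT = /(?<!\S)#\w+(?!\S)/
p TEXT.scan(STRICT)
```

With the question's pattern only start, middle2 and end come back, which matches the explanation of parts 1 and 4 competing for the same space.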

Does [^#\w](#[\w]*)|^(#[\w]*) work?
It matches a # that does not follow a word character or another #, and captures everything up to the first non-word character.
The alternative after the | handles the case where the # is the first character.
Live demo: http://regexr.com/3al01

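How does this work for you?
(#[^\s+]+)
This says: find a hash sign, then everything up to the next whitespace. (The + inside the character class is a literal plus sign, so the match also stops at a +.)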

One more regex:
\B#\w+\b
This one doesn't capture whitespace...
https://regex101.com/r/iU4dZ3/4

How do I retrieve only lines with specific words or phrases?

I need to read a file in a series of lines, and then retrieve specific lines depending on words that are contained in them. How can I do this?
So far I read the lines like this:
lines = File.readlines("myfile.txt")
Now, I need to scan for lines that contain "red", "rabbit", "blue". I want to do this part in as few lines of code as possible.
So if my file was:
the red queen of hearts.
yet another sentence.
and this has no relevant words.
the blue sky
the white chocolate rabbit.
another irrelevant line.
I would like to see only the lines:
the red queen of hearts.
the blue sky
the white chocolate rabbit.

lines = File.readlines("myfile.txt").grep(/red|rabbit|blue/)

Regular expressions are your friend. They will make quick work of this task.
http://www.tutorialspoint.com/ruby/ruby_regular_expressions.htm
You would want a regex along the lines of
/^.*(red|rabbit|blue).*$/
The ^ means start of line, the .* means match anything, (red|rabbit|blue) means exactly what you think it means, and lastly the $ means end of line.
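As a small sketch of how the word list and the pattern can be kept together, Regexp.union can build the alternation from an array. The constant names and the matching_lines helper are hypothetical, and the whole-word variant is an assumption about what is wanted, since a bare red also matches inside words such as "bored".

```ruby
WORDS = %w[red rabbit blue].freeze

# Regexp.union builds /red|rabbit|blue/ from the list.
ANY_WORD = Regexp.union(WORDS)
# Stricter variant (assumed intent): match the words only as whole words.
WHOLE_WORD = /\b#{ANY_WORD}\b/

# Hypothetical helper: keeps only the lines that match `pattern`.
def matching_lines(lines, pattern = ANY_WORD)
  lines.grep(pattern)
end

lines = [
  "the red queen of hearts.\n",
  "yet another sentence.\n",
  "the blue sky\n",
  "the white chocolate rabbit.\n",
]
puts matching_lines(lines)
```

The array returned by File.readlines("myfile.txt") can be passed in the same way as the literal array here.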

I think an each loop would be best in this situation:
lines.each do |line|
  # Print the line if it contains any of the words.
  if %w[red rabbit blue].any? { |word| line.include?(word) }
    puts line
  end
end
Give that a shot.

How to stop ANSI colour codes messing up printf alignment?

I discovered this while using ruby printf, but it also applies to C's printf.
If you include ANSI colour escape codes in an output string, it messes up the alignment.
Ruby:
ruby-1.9.2-head > printf "%20s\n%20s\n", "\033[32mGreen\033[0m", "Green"
      Green # 6 spaces to the left of this one
               Green # correctly padded to 20 chars
=> nil
The same line in a C program produces the same output.
Is there any way to get printf (or something else) to align output and not add spaces for non-printed characters?
Is this a bug, or is there a good reason for it?
Update: Since printf can't be relied upon to align data when there are ANSI codes and wide chars, is there a best-practice way of lining up coloured tabular data in the console in Ruby?

It's not a bug: there's no way ruby should know (at least within printf, it would be a different story for something like curses) that its stdout is going to a terminal that understands VT100 escape sequences.
If you're not adjusting background colours, something like this might be a better idea:
GREEN = "\033[32m"
NORMAL = "\033[0m"
printf "%s%20s%s\n", GREEN, "Green", NORMAL

I disagree with your characterization of '9 spaces after the green Green'. I use Perl rather than Ruby, but if I use a modification of your statement, printing a pipe symbol after the string, I get:
perl -e 'printf "%20s|\n%20s|\n", "\033[32mGreen\033[0m", "Green";'
      Green|
               Green|
This shows to me that the printf() statement counted 14 characters in the string, so it prepended 6 spaces to produce 20 characters right-aligned. However, the terminal swallowed 9 of those characters, interpreting them as colour changes. So, the output appeared 9 characters shorter than you wanted it to. However, the printf() did not print 9 blanks after the first 'Green'.
Regarding the best practices for aligned output (with colourization), I think you'll need to have each sized-and-aligned field surrounded by simple '%s' fields which deal with the colourization:
printf "%s%20.20s%s|%s%-10d%s|%s%12.12s%s|\n",
co_green, column_1_data, co_plain,
co_blue, column_2_data, co_plain,
co_red, column_3_data, co_plain;
Where, obviously, the co_XXXX variables (constants?) contain the escape sequences to switch to the named colour (and co_plain might be better as co_black). If it turns out that you don't need colourization on some field, you can use the empty string in place of the co_XXXX variables (or call it co_empty).

printf field width specifiers are not useful for aligning tabular data, interface elements, etc. Aside from the issue of control characters which you have already discovered, there are also nonspacing and double-width characters which your program will have to deal with if you don't want to limit things to legacy character encodings (which many users consider deprecated).
If you insist on using printf this way, you probably need to do something like:
printf("%*s\n%*s\n", bytestopad("\033[32mGreen\033[0m", 20), "\033[32mGreen\033[0m", bytestopad("Green", 20), "Green");
where bytestopad(s,n) is a function you write that computes how many total bytes are needed (string plus padding spaces) to result in the string s taking up n terminal columns. This would involve parsing escapes and processing multibyte characters and using a facility (like the POSIX wcwidth function) to lookup how many terminal columns each takes. Note the use of * in place of a constant field width in the printf format string. This allows you to pass an int argument to printf for runtime-variable field widths.
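A rough Ruby counterpart of this idea, as a sketch only: instead of returning a width for %*s, it strips the colour sequences to measure the visible text and pads the string itself. The names ANSI_SGR, visible_length and rjust_visible are hypothetical, and the code assumes that only SGR colour sequences occur and that every remaining character is one column wide, so it does nothing for the double-width and nonspacing characters mentioned above.

```ruby
# Matches SGR colour sequences such as "\e[32m" and "\e[0m" only.
ANSI_SGR = /\e\[[0-9;]*m/

# Number of characters left once colour sequences are stripped.
def visible_length(str)
  str.gsub(ANSI_SGR, "").length
end

# Hypothetical helper: right-aligns `str` in `width` columns, counting
# only the visible characters.
def rjust_visible(str, width)
  padding = [width - visible_length(str), 0].max
  " " * padding + str
end

puts rjust_visible("\e[32mGreen\e[0m", 20)
puts rjust_visible("Green", 20)
```

On a terminal that interprets the escape sequences, both lines should end in the same column, because each is padded according to its five visible characters.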

I would separate out any escape sequences from actual text to avoid the whole matter.
# in Ruby
printf "%s%20s\n%s%20s\n", "\033[32m", "Green", "\033[0m", "Green"
or
/* In C */
printf("%s%20s\n%s%20s\n", "\033[32m", "Green", "\033[0m", "Green");
Since ANSI escape sequences are not part of either Ruby or C, neither thinks that it needs to treat these characters specially, and rightfully so.
If you are going to be doing a lot of terminal color stuff then you should look into curses and ncurses which provide functions to do color changes that work for many different types of terminals. They also provide much much more functionality, like text based windows, function keys, and sometimes even mouse interaction.

Here's a solution I came up with recently. This allows you to use color("my string", :red) in a printf statement. I like using the same formatting string for headers and the data -- DRY. This makes that possible. Also, I use the rainbow gem to generate the color codes; it's not perfect but gets the job done. The CPAD hash contains two values for each color, corresponding to left and right padding, respectively. Naturally, this solution should be extended to facilitate other colors and modifiers such as bold and underline.
CPAD = {
:default => [0, 2],
:green => [0, 3],
:yellow => [0, 2],
:red => [0, 1],
}
def color(text, color)
"%*s%s%*s" % [CPAD[color][0], '', text.color(color), CPAD[color][1], '']
end
Example:
puts "%-10s %-10s %-10s %-10s" % [
color('apple', :red),
color('pear', :green),
color('banana', :yellow),
color('kiwi', :default)
]
