Weird behaviour in Elixir with whitespace character - syntax

I'm encountering a weird behaviour in Elixir when defining, for example, function default arguments or using head|tail in a list definition.
This does not work and results in an error unexpected token: " ":
def a(b \\ "test") do
  b
end
But this one does:
def a(b \\"test") do
  b
end
The difference being the whitespace character " " preceding the default string argument "test"
Also this does not work and results in an error unexpected token: " ":
[0 | [1,2,3,4,5]]
But this one does work:
[0 |[1,2,3,4,5]]
Once again the difference being the whitespace character " " preceding the tail list definition [1,2,3,4,5]
The problem exists in IEx and in compiled code. I'm running Elixir 1.4 on macOS Sierra, with iTerm as my terminal app.
So the question is: is this the correct behaviour, or is something wrong in my environment, and if so, what could it be? All the examples and guides allow whitespace in these positions, but for some reason my environment does not. Is there something I can do about this?
Thank you in advance!

The issue got resolved as stated in the comments.
On macOS, alt+space produces a non-breaking space character instead of a normal space. The issue described occurred most often when a space followed a character typed with an alt-combination: I just wasn't fast enough to release the alt key, so the wrong whitespace character was inserted.
For instructions on resolving this on macOS (in case you want to disable the alternative space), check out this question: https://superuser.com/questions/78245/how-to-disable-the-option-space-key-combination-for-non-breaking-spaces
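To check whether a file already contains such characters, a small scan for U+00A0 is enough. The following is only a sketch in Ruby; the helper name find_nbsp and the sample strings are made up for illustration and are not part of the original question.

```ruby
# Hypothetical helper: list the [line, column] position of every
# non-breaking space (U+00A0) found in a source string.
NBSP = "\u00A0"

def find_nbsp(source)
  hits = []
  source.each_line.with_index(1) do |line, line_no|
    line.each_char.with_index(1) do |char, column|
      hits << [line_no, column] if char == NBSP
    end
  end
  hits
end

# Sample inputs (assumed): the same Elixir snippet with a normal space
# and with a non-breaking space before the default argument.
good = "def a(b \\\\ \"test\") do\n  b\nend\n"
bad  = "def a(b \\\\#{NBSP}\"test\") do\n  b\nend\n"

p find_nbsp(good) # => []
p find_nbsp(bad)  # => [[1, 11]]
```

Running it over a real file would be a matter of passing File.read(path, encoding: "UTF-8") instead of the sample strings.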

Are there ways to modify RStudio's Console behavior when adding missing quotes?

I've searched both SO and RStudio's community pages and failed to find a question, much less an answer, about this annoyance I have experienced with RStudio. (The RStudio help pages won't let me post a second question within 12 hours of my first, which was explained as a bug.)
If I type:
(test)
... and then realize that test should be quoted, then putting the cursor at the end of test and entering a double-quote will give me two double-quotes "". It will not do this if I first enter a quote between ( and t, and then it will also not give me a doubled double-quote character at the end of test. Why should it matter whether I first correct my error at the end of the symbol or at the beginning? Is there anything I can do to modify this quirk? It seems that a syntax-aware console editor ought to be able to tell when a doubling of quotes does not make sense. It's obviously making that "decision" when the quotes are entered between an open-paren and a character. Why not suppress the unhelpful behavior when it is between a character and a close-paren?

Sphinx issues mysterious error in literal blocks

In Sphinx (the ReStructuredText publishing system), are there any obscure rules that limit what a literal block can contain?
Background: My document contains many literal blocks that follow a double-colon paragraph, like this:
Background:... follow a double-colon paragraph, like this::
    $ sudo su
    # echo ttyS0,115200 > /sys/module/kgdboc/parameters/kgdboc
This block (with a different preceding paragraph) is one of the ones that issues an error: "WARNING: Inconsistent literal block quoting." The message indicates that the error is in the "echo" line. In the HTML output the literal block contains only the "sudo" line; the "echo" line is treated as ordinary text.
I haven't been able to identify any common property in the lines that report errors, or anything that distinguishes them, as a class, from lines in other literal blocks that don't get errors.
I stripped down the project to isolate the problem, and I identified it that way.
I had a numbered list item that contained a double-colon literal block that was indented only as far as the list item's text, like this:
2. Set up the... directory::
   $ A Linux command
   $ Another Linux command
   $ And ANOTHER Linux command
   $ etc.
When I indented the literal block further, the problem went away.
I was misled by two things:
The message does not point to the first line in the literal block, but to some apparently random line within it. In the case above, it pointed to the fifth line (out of eight) in the block!
In most cases this form of indentation, although incorrect, works just fine.
Isolating the problem is a brute-force method of solving it, but is often effective when deduction fails. I'll keep that in mind in the future.

What is the difference between "hello".length and "hello" .length?

I was surprised when I ran the following examples in the Ruby console. They both produce the same output.
"hello".length
and
"hello" .length
How does the Ruby console remove the space and provide the right output?
You can put spaces wherever you want; the interpreter looks for the end of the line. For example:
Valid
"hello".
length
Invalid
"hello"
.length
The interpreter sees the dot at the end of the line and knows something has to follow it, while in the second case it thinks the line is finished. The same goes for the number of spaces in one line. Does it matter how the interpreter removes the spaces? What matters is that you know the behavior.
If you want you can even
"hello" . length
and it will still work.
I know this is not an answer to your question, but does the "how" matter?
EDIT: I was corrected in the comments below. The multi-line examples given above are both valid when run in a script instead of IRB. I had mixed them up with operators, for which the following applies even when running a script:
Valid
result = true || false
Valid
result = true ||
false
Invalid
result = true
|| false
This doesn't have as much to do with the console as it has to do with how the language itself is parsed by the compiler.
Most languages are parsed in such a way that items to be parsed are first grouped into TOKENS. Then the compiler is defined to expect a certain SEQUENCE of tokens in order to interpret each programming statement.
Because the compiler is only looking for a TOKEN SEQUENCE, it doesn't matter if there is space in between or not.
In this case the compiler is looking for:
STRING DOT METHOD_NAME
So it won't matter if you write "hello".length, or even "hello" . length. The same sequence of tokens is present in both, and that is all that matters to the compiler.
If you are curious how these token sequences are defined in the Ruby source code, you can look at parse.y starting around line 1042:
https://github.com/ruby/ruby/blob/trunk/parse.y#L1042
This file is written in YACC, a language used to define parsers.
Even without knowing anything about YACC, you should already be able to get some clues on how it works by just looking around the file a bit.
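The token-sequence explanation can be checked from Ruby itself with the standard library's Ripper lexer. This is only a sketch: Ripper's token names (:on_tstring_beg, :on_period, :on_ident and so on) are its own and differ from the simplified STRING DOT METHOD_NAME used above, and the helper name token_types is made up for illustration.

```ruby
require "ripper"

# Hypothetical helper: return the token types of a Ruby snippet,
# dropping the whitespace tokens (:on_sp) that the parser ignores.
def token_types(source)
  Ripper.lex(source)
        .map { |_position, type, _text, _state| type }
        .reject { |type| type == :on_sp }
end

compact = token_types('"hello".length')
spaced  = token_types('"hello" . length')

p compact           # the token types without whitespace
p compact == spaced # => true
```

Once the whitespace tokens are dropped, both spellings give the same sequence, which is why both produce the same result.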

Tools for automatically simplifying regexes

I'm trying to squash warnings in an open source project, and
/[\.\,\;\:\(\)\[\]\{\}\<\>\"\'\`\~\/\|\?\!\&\#\#\s\x00-\x1f\x7f]+/
is giving me
(irb):1: warning: character class has duplicated range
Are there any tools that automatically point out which parts of the regexp cause the overlap?
I don't know of any tool, but I've spotted the overlap: \s contains \t, \n, \v, \f and \r, so that overlaps with the \x00-\x1f part.
So, unless there's a way to get Ruby itself to tell you that it found a "problem", you can write this regex as (removing all those unnecessary backslashes along the way):
/[.,;:()\[\]{}<>"'`~\/|?!&## \x00-\x1f\x7f]+/
If you ever reach that point of desperation, I guess you could add some debug output to the Ruby source and rebuild. :) I believe this is the place where the warning is raised:
https://github.com/ruby/ruby/blob/trunk/regparse.c#L1787
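Short of rebuilding Ruby, the overlap described above can be found by brute force: test every code point of the explicit range against the shorthand class. This is a sketch rather than a general tool, and the helper name overlapping_codes is made up for illustration.

```ruby
# Hypothetical helper: return the code points of an explicit range
# that a given pattern already matches on its own.
def overlapping_codes(range, pattern)
  range.select { |code| code.chr.match?(pattern) }
end

overlap = overlapping_codes(0x00..0x1f, /\s/)

# Print the overlapping code points in hex.
puts overlap.map { |code| format("0x%02x", code) }.join(" ")
# prints: 0x09 0x0a 0x0b 0x0c 0x0d
```

Those five code points are \t, \n, \v, \f and \r, all of which are covered by both \s and \x00-\x1f in the original character class.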

How to use regex for negation of a string

Can anybody tell me how to use a regex for negation of a string?
I want to find all lines that start with public class, followed by anything except first or second, and finally anything else.
For example, in the result I expect to see public class base but not public class myfirst:base.
Can anybody help me, please?
Use a negative lookahead:
public\s+class\s+(?!first|second).+
If Peter is correct and you're using Visual Studio's Find feature, this should work:
^:b*public:b+class:b+~(first|second):i.*$
:b matches a space or tab
~(...) is how VS does a negative lookahead
:i matches a C/C++ identifier
The rest is standard regex syntax:
^ for beginning of line
$ for end of line
. for any character
* for zero or more
+ for one or more
| for alternation
Both of the other two answers come close, but probably fail for different reasons.
public\s+class\s+(?:(?!first|second).)+
Note how there is a (non-capturing) group around the negative lookahead, to ensure it applies to more than just the first position.
And that group is less restrictive - since . excludes newline, it's using that instead of \S, and the $ is not necessary - this will exclude the specified words and match others.
There are no slashes wrapping the expression, since those aren't required everywhere and may confuse people who have only encountered string-based regex use.
If this still fails, post the exact content that is wrongly matched or missed, and what language/IDE you are using.
Update:
Turns out you're using Visual Studio, which has its own special regex implementation, for some unfathomable reason. So, you'll be wanting to try this instead:
public:b+class:b+~(first|second)+$
I have no way of testing that - if it doesn't work, try dropping the $, but otherwise you'll have to find a VS user. Or better still, the VS engineer(s) responsible for this stupid non-standard regex.
Here is something that should work for you
/public\sclass\s(?:[^fs\s]+|(?!first|second)\S)+(?=\s|$)/
The second lookahead could be changed to a $ (end of line) or another anchor that works for your particular use case, like maybe a '{'
Edit: Try changing the last part to:
(?=\s|$)
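To compare the lookahead variants from the answers above on the two sample lines from the question, here is a small sketch in Ruby (chosen only for illustration; the question itself concerns Visual Studio's Find syntax). The ^ and $ anchors on the second pattern and the constant names are assumptions added here, not part of the original answers.

```ruby
# Lookahead checked only at the first position after "class".
SIMPLE = /public\s+class\s+(?!first|second).+/

# Lookahead checked before every character, anchored to the whole line
# (the ^ and $ anchors are an addition made for this sketch).
ANCHORED = /^public\s+class\s+(?:(?!first|second).)+$/

lines = [
  "public class base",
  "public class myfirst:base",
]

lines.each do |line|
  puts "#{line}: simple=#{SIMPLE.match?(line)} anchored=#{ANCHORED.match?(line)}"
end
# prints:
# public class base: simple=true anchored=true
# public class myfirst:base: simple=true anchored=false
```

The simple form lets myfirst:base through because the name does not start with first; the per-character form rejects it only when the match is forced to reach the end of the line.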
