How to syntax-highlight function arguments inside function in Sublime? - ruby

I would like to highlight the arguments of a Ruby function in Sublime, when they are used inside the function. Like so:
def my_func(arg1, arg2 = nil)
puts arg1 # should be highlighted
puts arg2 # should be highlighted
end
I've been messing with Sublime's plist syntax highlighting format for a while (same as TextMate's), but I'm having trouble figuring out how to capture one group (the args in the def line) and use it to match more expressions in another group (the whole method).
I have seen \1 and \2 being used in EndCapture groups before, which gives me hope that this is possible, for example by using \1 in a match group. But I just can't seem to get it to work. Anybody have any ideas?

(too long for comment)
If writing regexes in XML/PLIST is driving you batty, try installing the PackageDev plugin via Package Control. There is an option to convert PLIST .tmLanguage syntax files to YAML, and when you're done editing you can convert it back to PLIST. This way, you don't have to mess around with trying to get all the <dict><array><whatever> tags correct in the .tmLanguage file, and you can focus on the regexes, capturing groups, etc. It also uses the Oniguruma syntax, which I assume you're at least somewhat familiar with if you're a Rubyist. I maintain an improved syntax for Python, and my work has been so much easier since I started using the .YAML-tmlanguage format.
Good luck!

Related

Monarch Syntax Highlighting and Multi-line Patterns

Right now I am trying to write a syntax highlighter for my custom language using Monarch. However, in my language (and many others), sometimes the syntax highlighting could change based on what's on the next line. For example, here is one such case in Javascript:
fn(foo // here, foo is an argument
=> bar)
fn(foo // but here, foo is an identifier
+ bar)
Unfortunately it seems like by default, monarch regexes work line by line. So I can't figure out how I would achieve this. Is there a way to enable multi-line regexes in Monarch? Or is there some other way to solve this?

Vim HereDoc Highlighting

I want to get syntax highlighting inside of a bash heredoc. In addition, I want bash variable dereference and command substitution highlighting in the heredoc.
Using these instructions as a starting point, I was able to add variable dereferencing by examining sh.vim. I imagine you could accomplish command substitution in a similar fashion.
let s:bcs = b:current_syntax
unlet b:current_syntax
syntax include @YAML syntax/yaml.vim
syntax region hereDocYAML matchgroup=Statement start=/<<-\?\s*\z(YML\)/ end=/^\s*\z1/ contains=@YAML,hereDocDeref,hereDocDerefSimple
syn match hereDocDerefSimple "\$\%(\h\w*\|\d\)"
syn region hereDocDeref matchgroup=PreProc start="\${" end="}" contains=@shDerefList,shDerefVarArray
hi def link hereDocDeref PreProc
hi def link hereDocDerefSimple PreProc
My problem is, this doesn't appear to work in any sort of block (if, function, for, etc.). For example:
The only thing I know is that this is not an indentation problem. Changing indentation has no effect, and using indentation outside of a block works properly.
The contains on the definition of shIf appears to limit the possible highlights to whatever items are in shIfList group.
Adding your item to the group as follows will give you highlighting within that region:
syn cluster shIfList add=hereDocYAML
The same principle holds for all other blocks. For example:
syn cluster shFunctionList add=hereDocYAML
syn cluster shLoopList add=hereDocYAML
It should be noted that the approach you've outlined is not generally useful. First of all, it requires that the included syntax file use groups for their contains definitions.
In addition, it works for YAML because bash dereference syntax and YAML syntax do not conflict. For more complicated syntaxes like sed or awk, you will have conflicts, and the rules to resolve those conflicts will likely require a new syntax entirely.

Ignore commented out code when using YARD

I have some Ruby code that looks like this:
# some_string = "{really?}"
where the curly braces need to be part of the string. This line is commented out code that I'd like to remain there. I'm additionally using YARD to document code, so when I run yard doc it (naturally) throws a warning about being unable to link "really".
Is there a way I can tell YARD to ignore commented out code?
On the one hand, YARD is documented as supporting Rdoc markup. And Rdoc is documented to support a couple of ways to hide parts.
RDoc stops processing comments if it finds a comment line starting
with -- right after the # character (otherwise, it will be treated as
a rule if it has three dashes or more). This can be used to separate
external from internal comments, or to stop a comment being associated
with a method, class, or module. Commenting can be turned back on with
a line that starts with ++.
:stopdoc: / :startdoc:
Stop and start adding new documentation elements to the current
container. For example, if a class has a number of constants that you
don’t want to document, put a :stopdoc: before the first, and a
:startdoc: after the last. If you don’t specify a :startdoc: by the end
of the container, disables documentation for the rest of the current
file.
Source
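For illustration, the two RDoc mechanisms quoted above would look roughly like this in a source file. This is only a sketch of the markup as RDoc documents it; the constant name is made up, and whether YARD honors either directive is not established here.

```ruby
class Something
  # Public comment that RDoc is expected to pick up.
  #--
  # some_string = "{really?}"   (RDoc stops processing the comment here)
  #++
  # Comment processing resumes after the ++ line.
  def this_method
    :ok
  end

  # :stopdoc:
  INTERNAL_LIMIT = 10 # hypothetical constant, left out of the docs
  # :startdoc:
end
```

The directives are ordinary comments, so they have no effect on how the class behaves at runtime.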
On the other hand, I have never persuaded Rdoc or YARD to follow that markup. If your luck is better than mine, you can stop reading here.
If you, too, can't persuade YARD to follow that markup, I think your best bet might be to cut that line, and commit the file with a distinctive commit message--one that you'll be able to find easily by grepping the source control logs.
Finally, rake lets you transform text (code) files in arbitrary ways. You can write a Rakefile to delete lines before processing them through YARD.
$ cat silly-ruby-file.src
class Something
def this_method
end
def that_method
# some_string = "{really?}" # Hide me
end
end
I appended the text # Hide me; it's a lot easier to filter that specific text than it is to filter commented lines of arbitrary code.
$ cat Rakefile
task :default => "silly-ruby-file.rb"
file "silly-ruby-file.rb" => "silly-ruby-file.src" do
  sh "grep -v '# Hide me' silly-ruby-file.src > silly-ruby-file.rb"
end
This tells rake to run grep, copying all lines except those that have the text "# Hide me" to stdout, which is redirected to "silly-ruby-file.rb".
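If grep is not available, the same filtering can probably be done in plain Ruby. The sketch below assumes a hypothetical helper named strip_marked_lines; it is not part of rake or YARD, and it mirrors grep -v by dropping every line that contains the marker text.

```ruby
# Hypothetical helper (not part of rake or YARD): return the source text
# without any line that contains the marker.
def strip_marked_lines(source, marker = "# Hide me")
  source.each_line.reject { |line| line.include?(marker) }.join
end

source = <<~'RUBY'
  class Something
    def that_method
      # some_string = "{really?}" # Hide me
    end
  end
RUBY

# Prints the class without the marked line.
puts strip_marked_lines(source)
```

Inside a Rakefile, the block of a file task could call such a helper with File.read and File.write instead of shelling out.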

Most reliable way to get text into ruby script

I have a ruby script that’ll do some text parsing (à la Markdown). It does it in a sequence of steps, like
string = string.gsub # more code here
string = string.gsub # more code here
# and so on
What is the best (i.e. most reliable) way to feed text into string in the first place? It’s a script, and the text it’ll be fed can vary a lot — it can be multilingual, have some characters that might trip a shell (like ", ', ’, &, $ you get the idea), and will likely be multi-line.
Is there some trick on the lines of
cat << EOF
bunch of text here
EOF
Additional considerations
I’m not looking for a markdown parser, this is something I want to do, not something I want a tool for.
I’m not a big ruby user (I’m starting to use it), so the more detailed the answer you can provide, the better.
It must be completely scriptable (i.e., no interrupting to ask the user for information).
The Kernel#gets method will read a string separated using the record separator from stdin or files specified on the command line. So if you use that you can do things like:
yourscript <filename #read from filename
yourscript file1 file2 # read both file1 and file2
yourscript #lets you type at your script
So to run something like:
cat <<'eof' |ruby yourscript.rb
This' & will $all 'eof' be 'fine'''
eof
Script might contain something like:
s = gets() # read a line
lines = readlines() # read all lines into an array
That's fairly standard for command-line scripts. If you want to have a user-interface then you'll want something more complex. There is an option to the Ruby interpreter to set the encoding of files as they are read.
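As a rough sketch of how this fits the question's gsub pipeline: the helper below (the name transform and both substitution rules are made up for illustration) reads everything from an IO-like object and applies the steps in order. In a real script it could be handed ARGF, which reads the files named on the command line, or stdin when none are given.

```ruby
require "stringio"

# Hypothetical helper: read all text from an IO-like object and run it
# through a sequence of gsub steps.
def transform(io)
  text = io.read                                  # whole input, newlines included
  text = text.gsub(/\*\*(.+?)\*\*/, '<b>\1</b>')  # illustrative rule only
  text = text.gsub(/_(.+?)_/, '<i>\1</i>')        # illustrative rule only
  text
end

# In a script: puts transform(ARGF)
# Here a StringIO stands in for the piped input.
sample = StringIO.new("It's **bold** & $HOME\n")
puts transform(sample)
# prints: It's <b>bold</b> & $HOME
```

Because the text arrives through a pipe or a file rather than as a shell argument, quotes, ampersands, and dollar signs reach the script unchanged.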
Just read from stdin (which is an IO object):
$stdin.read
As you can see, stdin is provided in the global variable $stdin. Since it’s an IO object, there are a lot of other methods available if read doesn’t suit your needs.
Here’s a simple one-line example in the shell:
$ printf 'foo\nbar\n' | ruby -e 'puts $stdin.read.upcase'
FOO
BAR
Obviously reading from stdin is extremely flexible since you can pipe input in from anywhere.
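Since the input may be multilingual, it can help to settle the encoding right after reading. The following is a minimal sketch under the assumption that the input is meant to be UTF-8; read_utf8 is a hypothetical helper name, and replacing bad bytes with "?" is just one possible policy.

```ruby
require "stringio"

# Hypothetical helper: read all input, label it as UTF-8, and replace any
# invalid byte sequences so that later regex-based gsub calls do not raise.
def read_utf8(io)
  io.read.force_encoding(Encoding::UTF_8).scrub("?")
end

# In a script this would be called as: text = read_utf8($stdin)
sample = StringIO.new("caf\xC3\xA9 \xFF\n".b) # valid UTF-8 plus one stray byte
text = read_utf8(sample)
puts text.valid_encoding?
```

Scrubbing once at the boundary keeps the rest of the script free of encoding checks.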
Ruby is very adept at encodings (see e.g. the Encoding docs). To get text into Ruby, one typically uses either gets, or reads File objects, or uses a GUI, which one can build with gtk2 gem or rugui (if already finished). In case you are getting texts from the wild internet, security should be your concern. Ruby used to offer $SAFE levels for this, but the mechanism was deprecated in Ruby 2.7 and removed in Ruby 3.0, so it should not be relied on. In any case, the best strategy to handle strings is to know as much as possible about the properties of the string that you expect in advance. Handling absolutely arbitrary strings is a surprisingly difficult task. Try to limit the number of possible encodings and figure the maximum size for the string that you expect.
Also, with respect to your original stated goal of writing a markdown-processor-like something, you might not want to reinvent the wheel (unless it is for didactic purposes). There is this SO post:
Better ruby markdown interpreter?
The answer will direct you to kramdown gem, which gets a lot of praise, though I have not tried it personally.

String.scan returning empty array in Ruby

I've written a very basic regex in Ruby for scraping email-addresses off the web. It looks like the following:
/\b\w+(\.\w+)*@\w+\.\w+(\.\w+)*\b/
When I load this into irb or rubular, I create the following string:
"example@live.com"
When I run the Regexp.match(string) command in irb, I get this:
regexp.match(string) =>#<MatchData "example@live.com" 1:nil 2:nil>
So the match seems to be recorded in the MatchData object. However, when I run the String.scan(regex) command (which is what I'm primarily interested in), I get the following:
string.scan(regex) => [[nil, nil]]
Why isn't scan returning the matched email address? Is it a problem with the regular expression? Or is it a nuance of String.scan/Regexp/MatchData that somebody could make me aware of?
The main issue is that your capturing groups (the stuff matched by whatever's in parentheses) aren't capturing what you want. When a pattern contains groups, scan returns the captures of each match instead of the whole matched text.
Let's say you want just the username and domain. You should use something along the lines of /\b(\w+(?:\.\w+)*)@(\w+(?:\.\w+)*)\.\w+\b/. As it stands, your pattern matches the input text, but both groups are optional and match zero times, so they don't actually capture any text and scan reports nil for each.
Also, why not just use /([\w\.]+)@([\w\.]+)\.\w+/? (I'm not too familiar with Ruby's regex engine, but that should be about right. You don't even need to check for word boundaries if you're using greedy quantifiers.)
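The difference can be seen side by side in a short sketch. It uses @ as the separator in the address, and the variable names are only for illustration.

```ruby
email = "example@live.com"

# Both groups are optional repetitions that match zero times for this
# input, so scan reports nil for each group of the single match.
with_groups = /\b\w+(\.\w+)*@\w+\.\w+(\.\w+)*\b/
p email.scan(with_groups)      # => [[nil, nil]]

# With non-capturing groups (?:...), scan returns the whole match.
without_groups = /\b\w+(?:\.\w+)*@\w+\.\w+(?:\.\w+)*\b/
p email.scan(without_groups)   # => ["example@live.com"]

# Capturing the user name and the domain without its final label.
parts = /\b(\w+(?:\.\w+)*)@(\w+(?:\.\w+)*)\.\w+\b/
p email.scan(parts)            # => [["example", "live"]]
```

If only whole addresses are needed, the non-capturing version is probably the simplest choice.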
