Parslet grammar for rules that start identically - Ruby

I want to provide a parser for so-called Subversion auth config files (see path-based authorization in the Subversion red book). Here I want to define rules for directories like
[/]
* = r
[/trunk]
#PROJECT = rw
The part of the grammar I have problems with is the path definition. I currently have the following rules in Parslet:
rule(:auth_rule_head) { (str('[') >> path >> str(']') >> newline).as(:arh) }
rule(:top) { (str('/')).as(:top) }
rule(:path) { (top | ((str('/') >> path_ele).repeat)).as(:path) }
rule(:path_ele) { ((str('/').absent? >> any).repeat).as(:path_ele) }
So I want to distinguish two cases:
To find only [/] (the root directory)
In all other cases [/<dir>], where <dir> may be repeated, but the path has to end without a /
The problematic rule seems to be path, which defines an alternative: / XOR something like /trunk.
I have defined test cases for those, and get the following error when running the test case:
Failed to match sequence (SPACES '[' PATH ']' NEWLINE) at line 1 char 3.
`- Expected "]", but got "t" at line 1 char 3.
So the problem seems to be that the alternative top (in rule :path) is chosen all the time.
What is a solution (as a grammar) for this problem? I think there should be one, and this looks like an idiomatic situation that must come up all the time. I am not an expert at all with PEG parsers or parser/compiler generators, so if this is a fundamental problem that cannot be solved, I would like to know that as well.

In short: Swap the OR conditions around.
Parslet rules consume the input stream until they get a match, then they stop.
If you have two possible options (an OR), the first is tried, and only if it doesn't match is the second tried.
In your case, since all your paths start with '/', they all match the first part of the path rule, so the second half is never explored.
You need to try to match the full path first, and only match top if that fails.
# changing this
rule(:path) { (top | ((str('/') >> path_ele).repeat)).as(:path) }
# to this
rule(:path) { (((str('/') >> path_ele).repeat) | top).as(:path) }
# fixes your first problem :)
Also... Be careful of rules that can consume nothing being used inside a loop.
Repeat by default is repeat(0). Usually it needs to be repeat(1).
rule(:path) { (((str('/') >> path_ele).repeat(1)) | top).as(:path) }
also...
Is "top" really a special case? All paths end in a "/", so top is just the zero length path.
rule(:path) { (path_ele.repeat(0) >> str('/')).as(:path) }
Or
rule(:path) { (str('/') >> path_ele.repeat(0)).as(:path) }
rule(:path_ele) { ((str('/').absent? >> any).repeat(0)).as(:path_ele) >> str('/') }
# assuming "//" is valid otherwise repeat(1)
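Putting it together, a minimal runnable sketch of the reordered grammar could look like the following (the class name and the repeat(1) inside path_ele are my assumptions, not your original rules):
require 'parslet'
class PathParser < Parslet::Parser
  # Try the longer alternative first; the bare root is the fallback.
  rule(:path_ele) { ((str('/').absent? >> any).repeat(1)).as(:path_ele) }
  rule(:top)      { str('/').as(:top) }
  rule(:path)     { (((str('/') >> path_ele).repeat(1)) | top).as(:path) }
  root(:path)
end
PathParser.new.parse('/')            # only top can match here
PathParser.new.parse('/trunk/tags')  # matched by the repeated branch (parse raises Parslet::ParseFailed on bad input)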

It seems I did not capture the problem correctly. I have tried to reproduce the problem by creating a small example grammar including some unit tests, but now the thing is working.
If you are interested, have a look at the gist https://gist.github.com/mliebelt/a36ace0641e61f49d78f. You should be able to download the file and run it directly from the command line. You have to install parslet first; minitest should already be included in a current Ruby version.
I have added there only the (missing) rule for newline, and added 3 unit tests to test all cases:
The root: /
A path with only one element: /my
A path with more than one element: /my/path
Works as expected, so I get two cases here:
Top element only
One or more path elements
Perhaps this may help others debug a situation like that.
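For reference, the three unit tests described above could look roughly like this in minitest (a hedged sketch; the parser class name follows the hypothetical PathParser sketch earlier, and the gist's actual code may differ):
require 'minitest/autorun'
class PathRuleTest < Minitest::Test
  def setup
    @parser = PathParser.new   # hypothetical parser class from the sketch above
  end
  def test_root_only
    assert @parser.parse('/')        # parse raises Parslet::ParseFailed if the rule does not match
  end
  def test_single_path_element
    assert @parser.parse('/my')
  end
  def test_multiple_path_elements
    assert @parser.parse('/my/path')
  end
end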

Related

Avoid "stack smashing detected" when recursively listing file or directories

I have to rename around 5,000 folders located in a remote storage. Running Dir['**/*/'] returns an error "*** stack smashing detected ***" and invites me to report the bug, as it might occur during the interpretation process (see bug report).
If it helps, here's the script I was planning to run (it works fine in a test environment, though it's quite specific to my needs):
#!/usr/bin/env ruby
# Fetch root directories
dirs = Dir['**/*/'].select { |d| d =~ /\d([\.-]{1}\d{2,})?/ }
# Order subdirectories first
dirs = dirs.sort_by { |d| d.count('/') }.reverse
# Substitute "." and "-" placed after the last "/" with "_"
dirs.each do |dir|
  File.rename(dir, dir.gsub(/[\.-](?!.*\/.*)/, '_'))
end
Any suggestions for mitigating this issue?
It's neither a well-formed question nor a generic answer, but I managed to get around the issue by limiting the depth to look at. Concretely, I replaced Dir['**/*/'] with Dir['*/*/*/*/'].
However, I'm open to other suggestions, as others may face similar issues without the possibility of hard-coding the depth to look at.
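If hard-coding a single depth is not an option, one hedged alternative is to build fixed-depth patterns programmatically and glob level by level up to a chosen maximum, avoiding the recursive '**' entirely (max_depth is an assumption you would tune to your tree):
max_depth = 6
dirs = (1..max_depth).flat_map do |depth|
  Dir[Array.new(depth, '*').join('/') + '/']   # "*/", "*/*/", "*/*/*/", ...
end
dirs = dirs.select { |d| d =~ /\d([\.-]{1}\d{2,})?/ }   # same filter as the original script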

Nested directory searching

I'm trying to make a program that searches through hopefully every directory, subdirectory, sub-subdirectory and so on in C:\. I feel like I can take care of that part, but there's also the issue of the folder names. There may be case issues, like a folder named FOO not being detected when my program searches for Foo, and the need for a giant if/else or case statement for multiple search criteria.
My questions are: 1. Is there a way to ignore letter case? 2. Is there a way to write a more efficient statement for searching?
My current code:
#foldersniffer by Touka, ©2015
base = Dir.entries("C:\\")
trees = Dir.entries("#{base}")
trees.each do |tree|
  if Dir.exist?("Foo")
    puts "Found Folder \"Foo\" in C:\\"
  elsif Dir.exist?("Bar")
    puts "Found Folder \"Bar\" in C:\\"
  else
    puts "No folders found"
  end
end
sleep
Any help is appreciated.
Edit: it's trying to scan files like bootmgr and it's giving me errors... I'm not sure how to fix that.
Consider using Dir.glob(...) and regular expressions for case insensitive matching:
Dir.glob('c:\\**\*') do |filename|
  if filename =~ /c:\\(foo|bar)($|\\)/i
    puts "Found #{filename}"
  end
end
Case sensitivity for the Dir.glob argument is likely not relevant on Windows systems:
Note that this pattern is not a regexp, it’s closer to a shell glob. See File.fnmatch for the meaning of the flags parameter. Note that case sensitivity depends on your system (so File::FNM_CASEFOLD is ignored), as does the order in which the results are returned.
I am not expert enough to say for sure, but I would look into File::FNM_CASEFOLD:
https://lostechies.com/derickbailey/2011/04/14/case-insensitive-dir-glob-in-ruby-really-it-has-to-be-that-cryptic/
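Building on both answers, a hedged sketch that keeps Dir.glob for the walk but does the case-insensitive comparison with File.fnmatch?, which does honor File::FNM_CASEFOLD (unlike Dir.glob itself, per the docs quoted above); the folder names are placeholders:
targets = %w[foo bar]
Dir.glob('C:/**/*/') do |dir|        # the trailing '/' restricts matches to directories
  name = File.basename(dir)
  if targets.any? { |t| File.fnmatch?(t, name, File::FNM_CASEFOLD) }
    puts "Found folder #{dir}"
  end
end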

Use issue_closing_pattern variable to close multiple issues in gitlab

I'd like to have the ability to close multiple issues with one commit by referencing them with the default pattern ^([Cc]loses|[Ff]ixes) +#\d+a. I know that this will only affect fixes #number patterns at the beginning of lines, and that's what I want.
But I wasn't able to get it to work yet.
I'm currently using GitLab 6.1, installed according to the installation readme on GitHub, and didn't change anything other than the code snippet below.
Here's what I tried:
First I changed in {gitlab-directory}/app/models/commit.rb the following (original code commented out):
def closes_issues project
  md = safe_message.scan(/(?i)((\[)\s*(close|fix)(s|es|d|ed)*\s*#\d+\s*(\])|(\()\s*(close|fix)(s|es|d|ed)*\s*#\d+\s*(\)))/)
  #md = issue_closing_regex.match(safe_message)
  if md
    extractor = Gitlab::ReferenceExtractor.new
    md.each do |n|
      extractor.analyze(n[0])
    end
    extractor.issues_for(project)
    #extractor = Gitlab::ReferenceExtractor.new
    #extractor.analyze(md[0])
    #extractor.issues_for(project)
  else
    []
  end
end
But the regex used in this code snippet doesn't fit my needs and isn't really correct (e.g. (fixs #123) and (closees #123) would both work).
After testing this code snippet and confirming that it works with patterns matching the regex used in the snippet, I tried to change the regex. At first, I tried to do this in the second line:
md = safe_message.scan(/#{Gitlab.config.gitlab.issue_closing_pattern}/)
This one didn't work. I didn't find any error messages in log/unicorn.stderr.log, so I tried to use the default regex from the config file directly, without the variable:
md = safe_message.scan(/^([Cc]loses|[Ff]ixes) +#\d+a/)
But this one didn't work either. Again, no error messages in log/unicorn.stderr.log.
How do I use the variable issue_closing_pattern from the config file as regex pattern in this code snippet?
If the regex you provide to the String#scan method contains capture groups, it returns an array of arrays containing the patterns matched by each group:
irb(main):014:0> regex = "^([Cc]loses|[Ff]ixes) +#\\d+"
=> "^([Cc]loses|[Ff]ixes) +#\\d+"
irb(main):017:0> safe_message = "foo\ncloses #1\nfixes #2\nbar"
=> "foo\ncloses #1\nfixes #2\nbar"
irb(main):018:0> safe_message.scan(/#{regex}/)
=> [["closes"], ["fixes"]]
Because the default regex has a capture group for just the "closes/fixes" bit, that's all the loop is seeing, and those strings don't contain the issue references! To fix it, just add a capture group around the entire pattern:
irb(main):019:0> regex = "^(([Cc]loses|[Ff]ixes) +#\\d+)"
=> "^(([Cc]loses|[Ff]ixes) +#\\d+)"
irb(main):020:0> safe_message.scan(/#{regex}/)
=> [["closes #1", "closes"], ["fixes #2", "fixes"]]

RSpec - script to upgrade from 'should' to 'expect' syntax?

I have hundreds of files that also have hundreds of 'should' statements.
Is there any sort of automated way to update these files to the new syntax?
I'd like options to both create new files and also modify the existing files inline.
sed is a good tool for this.
The following will process all the files in the current directory and write them out to new files in a _spec_seded directory. It currently handles about 99% of the changes, but might still leave you with a couple of manual changes to make (the amount will depend on your code and coding style).
As always with a sed script, you should check the results, run diffs and look at the files manually. Ideally you are using git, which makes the diffs even easier.
filenum=1
find . -type f -name '*_spec.rb' | while read file; do
mkdir -p ../_spec_seded/"${file%/*}"
echo "next file...$filenum...$file"
let filenum+=1
cp "$file" ../_spec_seded/"$file"
sed -i ' # Exclude:
/^ *describe .*do/! { # -describe...do descriptions
/^ *it .*do/! { # -it...do descriptions
/^[[:blank:]]*\#/! { # -comments
/^ *def .*\.should.*/! { # -inline methods
/\.should/ {
s/\.should/)\.to/ # Change .should to .to
s/\(\S\)/expect(\1/ # Add expect( at start of line.
/\.to\( \|_not \)>\=/ s/>\=/be >\=/ # Change operators for
/\.to\( \|_not \)>[^=]/ s/>/be >/ # >, >=, <, <= and !=
/\.to\( \|_not \)<\=/ s/<\=/be <\=/
/\.to\( \|_not \)<[^=]/ s/</be </
/\.to\( \|_not \)\!\=/ s/\!\=/be \!\=/
}
/\.to +==\( +\|$\)/ s/==/eq/
/=\~/ { # Change match operator
s/=\~/match(/
s/$/ )/
s/\[ )$/\[/
}
s/[^}.to|end.to]\.to /).to / # Add paren
/eq ({.*} )/ s/ ({/ ( {/ # Add space
/to\(_\|_not_\)receive/ s/_receive/ receive/ # receive
/\.to eq \[.*\]/ {
s/ eq \[/ match_array([/
s/\]$/\])/
}
/expect.*(.*lambda.*{.*})/ { # Remove unneeded lambdas
s/( *lambda *{/{/
s/ })\.to / }\.to /
}
/expect *{ *.*(.*) *})\.to/ { # Fix extra end paren
s/})\.to/}\.to/
}
}
}
}
}' ../_spec_seded/"$file"
done
Please use with caution. Currently the script creates new files under ../_spec_seded/ for review first, for safety. The script is placed in the /spec directory and run from there.
If you have hundreds of files this could save you hours or days of work!
If you use this, I recommend that "step 2" be to manually copy files from _spec_seded to spec itself and run them. I recommend that you don't just rename the whole directories. For one thing, files such as spec_helper.rb aren't currently copied to _spec_seded.
11/18/2013 note: I continue to upgrade this script, covering more edge cases, making matches more specific, and excluding more edge cases, e.g. comment lines.
P.S. The differences which should be reviewed can be seen with (from the project directory root):
diff -r /spec /_spec_seded
git also has nice diff options but I like to look before adding files to git at all.
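For reference, the kind of conversion being discussed looks roughly like this (a hand-written before/after illustration with a placeholder user object, not actual script output):
# RSpec 2 'should' syntax
user.name.should == "Alice"
user.errors.should_not be_empty
lambda { user.save! }.should raise_error
# becomes RSpec 3 'expect' syntax
expect(user.name).to eq "Alice"
expect(user.errors).to_not be_empty
expect { user.save! }.to raise_error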
Belated update, mainly for those who may find their way to this page via a search engine.
Use Yuji Nakayama's excellent Transpec gem for this purpose. I've used it over 10 times now on different projects without issue.
From the website:
Transpec lets you upgrade your RSpec 2 specs to RSpec 3 in no time. It supports conversions for almost all of the RSpec 3 changes, and it’s recommended by the RSpec team.
Also, you can use it on your RSpec 2 project even if you’re not going to upgrade it to RSpec 3 for now.

A problem with folding bash functions in vim

I have a bash script file which starts with a function definition, like this:
#!/bin/bash
# .....
# .....
function test {
...
...
}
...
...
I use vim 7.2, and I have set g:sh_fold_enabled=1 such that folding is enabled with bash. The problem is that the folding of the function test is not ended correctly, i.e. it lasts until the end of file. It looks something like this:
#!/bin/bash
# .....
# .....
+-- 550 lines: function test {----------------------------------------
~
~
The function itself is just about 40 lines, and I want something that looks like this ("images" say more than a thousand words, they say...):
#!/bin/bash
# .....
# .....
+-- 40 lines: function test {----------------------------------------
...
...
...
~
~
Does anyone know a good solution to this problem?
I have done some research, and found a way to fix the problem: To stop vim from folding functions until the end of file, I had to add a skip-statement to the syntax region for shExpr (in the file sh.vim, usually placed somewhere like /usr/share/vim/vim70/syntax/):
syn region shExpr ... start="{" skip="^function.*\_s\={" end="}" ...
This change stops the syntax file from thinking that the { and } belong to the shExpr group, when they actually belong to the function group. Or that is how I have understood it, anyway.
Note: This fix only works for the following syntax:
function test
{
....
}
and not for this:
function test {
....
}
A quick and dirty fix for the last bug is to remove shExpr from the #shFunctionList cluster.
With vim 8.2+ the following worked for me:
syntax enable
let g:sh_fold_enabled=5
let g:is_sh=1
set filetype=on
set foldmethod=syntax
" :filteype plugin indent on
foldnestmax=3 "i use 3, change it to whatever you like.
It did not matter where I put it in my vimrc.
This turns on syntax folding and the file type plugin for all installed file types.
It should just work, but there seems to be a bug in the syntax file. The fold region actually starts at the word 'function' and tries to continue to the closing '}', but the highlighting for the '{...}' region takes over the closing '}', and the fold continues on, searching for another one. If you add another '}' you can see this in action:
function test {
...
}
}
There seems to be a simpler solution on Reddit.
To quote the author in the post:
The options I use are:
syntax=enable
filetype=sh
foldmethod=syntax
let g:sh_fold_enabled=3
g:is_sh=1
EDIT: Workaround
vim -u NONE -c 'let g:sh_fold_enabled=7' -c ':set fdm=syntax' -c 'sy on' file.sh
g:sh_fold_enabled=4 seemed to be the agreed-upon fold level in the discussion. This solution is working perfectly for me. I did not have to edit the syntax file.
Edit: g:sh_fold_enabled=5 is actually the right one, not 4.
Also, as the poster showed on Reddit, those commands must go before any other setting in vimrc, except the plugins.
