How do I use regular expressions in a Cucumber table (multiline argument) to diff against table? - ruby

I am using a scenario table (multiline step arguments) to check some data from a screen using cucumber, using the in built .diff! method on the Cucumber AST table.
I would like to check the content matches against regular expressions.
Scenario: One
Then the table appears as:
| One | Two | Three |
| /\d+/ | /\d+/ | /\d+/ |
The actual table could look something like
| One | Two | Three |
| 123 | 456 | 789 |
which this scenario is translated to "as long as there are some digits, I don't care"
An example step implementation that fails:
Then /^the table appears as:$/ do |expected_table|
actual_table = [['One','Two', 'Three'],['123', '456', '789']]
expected_table.diff! actual_table
end
Error:
Then the table appears as: # features/step_definitions/my_steps.rb:230
| One | Two | Three |
| /\\d+/ | /\\d+/ | /\\d+/ |
| 123 | 456 | 789 |
Tables were not identical (Cucumber::Ast::Table::Different)
I have tried using step transforms to transform the cells into regular expressions, but they still aren't identical.
Transform code:
expected_table.raw[0].each do |column|
expected_table.map_column! column do |cell|
if cell.respond_to? :start_with?
if cell.start_with? "/"
cell.to_regexp
else
cell
end
else
cell
end
end
end
which provides the eror:
Then the table appears as: # features/step_definitions/my_steps.rb:228
| One | Two | Three |
| (?-mix:\\d+) | (?-mix:\\d+) | (?-mix:\\d+) |
| 123 | 456 | 789 |
Tables were not identical (Cucumber::Ast::Table::Different)
Any ideas? I am stuck.

Using regular expressions in a scenario is almost certainly the wrong approach. Cucumber features are intended to be read and understood by business-focussed stakeholders.
How about writing the step at a higher level, such as as:
Then the first three columns of the table should contain a digit

There is no way to do it without writing your own implementation of diff! method from Ast::Table. Take a look into cucumber/lib/ast/table.rb. Internally it uses diff-lcs library to do an actual comparison which doesn't support regex match.

It seems that you want to write this in a way that provides the cool diff output. Otherwise, I'd look at writing this such that you simply check the rows. It won't be as pretty, and it won't get you the diff of the entire table, but it's something.
Then /^the table appears as:$/ do |expected_table|
actual_table = [['One','Two', 'Three'],['123', '456', '789']]
expected_table.raw.each_with_index { |row, y|
row.each_with_index { |cell, x|
actual_table[x][y].should == cell
}
}
end

Related

How to split string using regex to split between +,-,*,/ symbols?

I need to tell Ruby in regex to split before and after the + - * / symbols in my program.
Examples:
I need to turn "1+12" into [1.0, "+", 12.0]
and "6/0.25" into [6.0, "/", 0.25]
There could be cases like "3/0.125" but highly unlikely. If first two I listed above are satisfied it should be good.
On the Ruby docs, "hi mom".split(%r{\s*}) #=> ["h", "i", "m", "o", "m"]
I looked up a cheat-sheet to try to understand %r{\s*}, and I know that the stuff inside %r{} such as \s are skipped and \s means white space in regex.
'1.0+23.7'.scan(/(((\d\.?)+)|[\+\-\*\/])/)
instead of splitting, match with capture groups to parse your inputs:
(?<operand1>(?:\d+(?:\.\d+)?)|(?:\.\d+))\s*(?<operator>[+\/*-])\s*(?<operand2>(?:\d+(?:\.\d+)?)|(?:\.\d+))
explanation:
I've used named groups (?<groupName>regex) but they aren't necessary and could just be ()'s - either way, the sub-captures will still be available as 1,2,and 3. Also note the (?:regex) constructs that are for grouping only and do not "remember" anything, and won't mess up your captures)
(?:\d+(?:\.\d+)?)|(?:\.\d+)) first number: either leading digit(s) followed optionally by a decimal point and digit(s), OR a leading decimal point followed by digit(s)
\s* zero or more spaces in between
[+\/*-] operator: character class meaning a plus, division sign, minus, or multiply.
\s* zero or more spaces in between
(?:\d+(?:\.\d+)?)|(?:\.\d+) second number: same pattern as first number.
regex demo output:
I arrived a little late to this party, and found that many of the good answers had already been taken. So, I set out to expand on the theme slightly and compare the performance and robustness of each of the solutions. It seemed like a fun way to entertain myself this morning.
In addition to the 3 examples given in the question, I added test cases for each of the four operators, as well as for some new edge cases. These edge cases included handling of negative numbers and arbitrary spaces between operands, as well as how each of the algorithms handled expected failures.
The answers revolved around 3 methods: split, scan, and match. I also wrote new solutions using each of these 3 methods, specifically respecting the additional edge cases that I added to here. I ran all of the algorithms against this full set of test cases, and ended up with a table of pass/fail results.
Next, I created a benchmark that created 1,000,000 test strings that each of the solutions would be able to parse properly, and ran each solution against that sample set.
On first benchmarking, Cary Swoveland's solution performed far better than the others, but didn't pass the added test cases. I made very minor changes to his solution to produce a solution that supported both negative numbers and arbitrary spaces, and included that test as Swoveland+.
The final results printed from to the console are here (note: horizontal scroll to see all results):
| Test Case | match | match | scan | scan |partition| split | split | split | split |
| | Gaskill | sweaver | Gaskill | techbio |Swoveland| Gaskill |Swoveland|Swoveland+| Lilue |
|------------------------------------------------------------------------------------------------------|
| "1+12" | Pass | Pass | Pass | Pass | Pass | Pass | Pass | Pass | Pass |
| "6/0.25" | Pass | Pass | Pass | Pass | Pass | Pass | Pass | Pass | Pass |
| "3/0.125" | Pass | Pass | Pass | Pass | Pass | Pass | Pass | Pass | Pass |
| "30-6" | Pass | Pass | Pass | Pass | Pass | Pass | Pass | Pass | Pass |
| "3*8" | Pass | Pass | Pass | Pass | Pass | Pass | Pass | Pass | Pass |
| "20--4" | Pass | -- | Pass | -- | Pass | Pass | -- | Pass | Pass |
| "33+-9" | Pass | -- | Pass | -- | Pass | Pass | -- | Pass | Pass |
| "-12*-2" | Pass | -- | Pass | -- | Pass | Pass | -- | Pass | Pass |
| "-72/-3" | Pass | -- | Pass | -- | Pass | Pass | -- | Pass | Pass |
| "34 - 10" | Pass | Pass | Pass | Pass | Pass | Pass | Pass | Pass | Pass |
| " 15+ 9" | Pass | Pass | Pass | Pass | Pass | Pass | Pass | Pass | Pass |
| "4*6 " | Pass | Pass | Pass | Pass | Pass | Pass | Pass | Pass | Pass |
| "b+0.5" | Pass | Pass | Pass | -- | -- | -- | -- | -- | -- |
| "8---0.5" | Pass | Pass | Pass | -- | -- | -- | -- | -- | -- |
| "8+6+10" | Pass | -- | Pass | -- | -- | -- | -- | -- | -- |
| "15*x" | Pass | Pass | Pass | -- | -- | -- | -- | -- | -- |
| "1.A^ff" | Pass | Pass | Pass | -- | -- | -- | -- | -- | -- |
ruby 2.2.5p319 (2016-04-26 revision 54774) [x86_64-darwin14]
============================================================
user system total real
match (Gaskill): 4.770000 0.090000 4.860000 ( 5.214996)
match (sweaver2112): 4.640000 0.040000 4.680000 ( 4.911849)
scan (Gaskill): 7.360000 0.080000 7.440000 ( 7.719646)
scan (techbio): 12.930000 0.140000 13.070000 ( 13.791613)
partition (Swoveland): 5.390000 0.050000 5.440000 ( 5.648762)
split (Gaskill): 5.150000 0.100000 5.250000 ( 5.455094)
split (Swoveland): 3.860000 0.060000 3.920000 ( 4.040774)
split (Swoveland+): 4.240000 0.040000 4.280000 ( 4.537570)
split (Lilue): 7.540000 0.090000 7.630000 ( 8.022252)
In order to keep this post from being far too long, I've included the complete code for this test at https://gist.github.com/mgaskill/96f04e7e1f72a86446f4939ac690759a
The robustness test cases can be found in the first table above. The Swoveland+ solution is:
f,op,l = formula.split(/\b\s*([+\/*-])\s*/)
return [f.to_f, op, l.to_f]
This includes a \b metacharacter prior to splitting on an operator ensures that the previous character is a word character, giving support for negative numbers in the second operand. The \s* metacharacter expressions support arbitrary spaces between operands and operator. These changes incur less than 10% performance overhead for the additional robustness.
The solutions that I provided are here:
def match_gaskill(formula)
return [] unless (match = formula.match(/^\s*(-?\d+(?:\.\d+)?)\s*([+\/*-])\s*(-?\d+(?:\.\d+)?)\s*$/))
return [match[1].to_f, match[2], match[3].to_f]
end
def scan_gaskill(formula)
return [] unless (match = formula.scan(/^\s*(-?\d+(?:\.\d+)?)\s*([+*\/-])\s*(-?\d+(?:\.\d+)?)\s*$/))[0]
return [match[0][0].to_f, match[0][1], match[0][2].to_f]
end
def split_gaskill(formula)
match = formula.split(/(-?\d+(?:\.\d+)?)\s*([+\/*-])\s*(-?\d+(?:\.\d+)?)/)
return [match[1].to_f, match[2], match[3].to_f]
end
The match and scan solutions are very similar, but perform significantly differently, which is very interesting, because they use the exact same regex to do the work. The split solution is slightly simpler, and only splits on the entire expression, capturing each operand and the operator, separately.
Note that none of the split solutions was able to properly identify failures. Adding this support requires additional parsing of the operands, which significantly increases the overhead of the solution, typically running about 3 times slower.
For both performance and robustness, match is the clear winner. If robustness isn't a concern, but performance is, use split. On the other hand, scan provided complete robustness, but was more than 50% slower than the equivalent match solution.
Also note that using an efficient way to extract the results from the solution into the result array is as important to performance as is the algorithm chosen. The technique of capturing the results array into multiple variables (used in Woveland) outperformed the map solutions dramatically. Early testing showed that the map extraction solution more than doubled the runtimes of even the highest-performing solutions, hence the exceptionally high runtime numbers for Lilue.
I think this could be useful:
"1.2+3.453".split('+').flat_map{|elem| [elem, "+"]}[0...-1]
# => ["1.2", "+", "3.453"]
"1.2+3.453".split('+').flat_map{|elem| [elem.to_f, "+"]}[0...-1]
# => [1.2, "+", 3.453]
Obviously this work only for +. But you can change the split character.
EDIT:
This version work for every operator
"1.2+3.453".split(%r{(\+|\-|\/|\*)}).map do |x|
unless x =~ /(\+|\-|\/|\*)/ then x.to_f else x end
end
# => [1.2, "+", 3.453]
R = /
(?<=\d) # match a digit in a positive lookbehind
[^\d\.] # match any character other than a digit or period
/x # free-spacing regex definition mode
def split_it(str)
f,op,l = str.delete(' ').partition(R)
[convert(f), op, convert(l)]
end
def convert(str)
(str =~ /\./) ? str.to_f : str.to_i
end
split_it "1+12"
#=> [1, "+", 12]
split_it "3/ 5.2"
#=> [3, "/", 5.2]
split_it "-4.1 * 6"
#=> [-4.1, "*", 6]
split_it "-8/-2"
#=> [-8, "/", -2]
The regex can of course be written in the conventional way:
R = /(?<=\d)[^\d\.]/

Ruby split pipes in regex

I have put the data from a file into an array, then I am just staying with the data I want of that array which looks like follows:
Basically what I want, is to access each column independently. As the file will keep changing I don't want something hard coded, I would have done it already :).
Element0: | data | address | type | source | disable |
Element1: | 0x000001 | 0x123456 | in | D | yes |
Element2: | 0x0d0f00 | 0xffffff | out | M | yes |
Element3: | 0xe00ab4 | 0xaefbd1 | in | E | no |
I have tried with the regexp /\|\s+.*\s+\|/it prints just few lines (it removes the data I care of). I also tried with /\|.*\|/ and it prints all empty.
I have googled the split method and I know that this is happening it is because of the .* removing the data I care of. I have also tried with the regexp \|\s*\| but it prints the whole line. I have tried with many regexp's but at this moment I can't think of a way to solve this.
Any recommendation?
`line_ary = ary_element.split(/\|\s.*\|/)
unless line_ary.nil? puts line_ary`
You should use the csv class instead of trying to regex parse it. Something like this will do:
require 'csv'
data = CSV.read('data.csv', 'r', col_sep: '|')
You can access rows and columns as a 2 dimentional array, e.g. to access row 2, column 4: data[1][3].
If for example you just wanted to print the address column for all rows you could do this instead:
CSV.foreach('data.csv', col_sep: '|') do |row|
puts row[2]
end
I'd probably use a CSV parser for this but if you want to use a regex and you're sure that you'll never have | inside one of the column values, then you want to say:
row = line.split(/\s*\|\s*/)
so that the whitespace on either side of the pipe becomes part of the delimiter. For example:
> 'Element0: | data | address | type | source | disable |'.split(/\s*\|\s*/)
=> ["Element0:", "data", "address", "type", "source", "disable"]
> 'Element1: | 0x000001 | 0x123456 | in | D | yes |'.split(/\s*\|\s*/)
=> ["Element1:", "0x000001", "0x123456", "in", "D", "yes"]
Split together with strip might be the easiest option. Have you tried something like this?
"Element3:...".split(/\|/).collect(&:strip)

Cucumber Scenario Outline with Special Characters

I'm currently trying to test a service that should be properly replacing certain special characters within a certain Unicode range, including emojis, transportation icons, emoticons and dingbats. I have been using Cucumber and Ruby to do the testing, and my latest scenario outline won't work. I've tried looking up other ways of getting the character from the examples table, however I can't seem to get it working, and the cucumber printout just complains that the Given step isn't defined.
Here is my feature scenario:
Scenario Outline: I update a coupon with a name including emojis/emoticons/dingbats/symbols
Given I have a name variable with a <character> included
When I patch my coupon with this variable
Then the patch should succeed
And the name should include replacement characters
Examples:
| character |
| 🐳 |
| πŸ… |
| πŸ˜₯ |
| πŸš€ |
| β˜€ |
| β˜‚ |
| βœ‹ |
| βœ‚ |
| ⨕ |
| πŠ€ |
| π¦Ώ° |
And Here is my step definition for the Given (which is the step that is complaining that it isn't defined)
Given(/^I have a name variable with a (\w+) included$/) do |char|
#name = 'min length ' + char
#json = { 'name' => #name }.to_json
end
I've tried using some regex's to capture the character, and a (\w+) and (\d+), although I can't find information on how to capture the special character. It's possible for me to write 11 different step definitions, but that would be such poor practice it would drive me nuts.
Unless you have spaces in your specials, it’s safe to use non-space \S:
Given(/^I have a name variable with a (\S+) included$/) do |char|
...
\w would not give you the desired result, since \w is resolved to [a-zA-Z0-9_].

How write table inside of cucumber step with table

I'm writing scenarios for QA engineers, and now I face a problem such as step encapsulation.
This is my scenario:
When I open connection
And device are ready to receive files
I send to device file with params:
| name | ololo |
| type | txt |
| size | 123 |
All of each steps are important for people, who will use my steps.
And I need to automate this steps and repeat it 100 times.
So I decide create new step, which run it 100 times.
First variant was to create step with other steps inside like:
Then I open connection, check device are ready and send file with params 100 times:
| name | ololo |
| type | txt |
| size | 123 |
But this version is not appropriate, because:
people who will use it, will don't understand which steps executes inside
and some times name of steps like this are to long
Second variant was to create step with other steps in parameters table:
I execute following steps 100 times:
| When I open connection |
| And device are ready to receive files |
| I send to device file |
It will be easy to understand for people, who will use me steps and scenarios.
But also I have some steps with parameters,
and I need to create something like two tier table:
I execute following steps 100 times:
| When I open connection |
| And device are ready to receive files |
| I send to device file with params: |
| | name | ololo | |
| | type | txt | |
| | size | 123 | |
This is the best variant in my situation.
But of cause cucumber can't parse it without errors ( it's not correct as cucumber code ).
How can I fix last example of step? (mark with bold font)
Does cucumber have some instruments, which helps me?
Can you suggest some suggest your type of solution?
Does someone have similar problems?
I decide to change symbols "|" to "/" in parameters table, which are inside.
It's not perfect, but it works:
This is scenarios steps:
I execute following steps 100 times:
| I open connection |
| device are ready to receive files |
| I send to device file with params: |
| / name / ololo / |
| / type / txt / |
| / size / 123 / |
This is step definition:
And /^I execute following steps (.*) times:$/ do |number, table|
data = table.raw.map{ |raw| raw.last }
number.to_i.times do
params = []
step_name = ''
data.each_with_index do |line,index|
next_is_not_param = data[index+1].nil? || ( data[index+1] && !data[index+1].include?('/') )
if !line.include?('/')
step_name = line
#p step_name if next_is_not_param
step step_name if next_is_not_param
else
params += [line.gsub('/','|')]
if next_is_not_param
step_table = Cucumber::Ast::Table.parse( params.join("\n"), nil, nil )
#p step_name
#p step_table
step step_name, step_table
params = []
end
end
end
#p '---------------------------------------------------------'
end
end

IRB and large variables?

How can I print a large variable nicely in an irb prompt? I have a variable that contains many variables which are long and the printout becomes a mess to wade through. What if I just want the variable names without their values? Or, can I print each one on a separate line, tabbed-in depending on depth?
Or, can I print each one on a separate line, tabbed-in depending on depth?
Use pp (pretty print):
require 'pp'
very_long_hash = Hash[(1..23).zip(20..42)]
pp very_long_hash
# Prints:
{1=>20,
2=>21,
3=>22,
4=>23,
5=>24,
6=>25,
7=>26,
8=>27,
9=>28,
10=>29,
11=>30,
12=>31,
13=>32,
14=>33,
15=>34,
16=>35,
17=>36,
18=>37,
19=>38,
20=>39,
21=>40,
22=>41,
23=>42}
If you want something that is even more awesome than "pretty" print, you can use "awesome" print. And for a truly spaced out experience, sprinkle some hirbal medicine on your IRb!
Hirb, for example, renders ActiveRecord objects (or pretty much any database access library) as actual ASCII tables:
+-----+-------------------------+-------------+-------------------+-----------+-----------+----------+
| id | created_at | description | name | namespace | predicate | value |
+-----+-------------------------+-------------+-------------------+-----------+-----------+----------+
| 907 | 2009-03-06 21:10:41 UTC | | gem:tags=yaml | gem | tags | yaml |
| 906 | 2009-03-06 08:47:04 UTC | | gem:tags=nomonkey | gem | tags | nomonkey |
| 905 | 2009-03-04 00:30:10 UTC | | article:tags=ruby | article | tags | ruby |
+-----+-------------------------+-------------+-------------------+-----------+-----------+----------+

Resources