With the HTML above, I need to get the third mfc-tree-item based on the mfc-text-type text of the second mfc-tree-item. In other words, I need to click on the mfc-tree-item[data-qa-id='fma-tree-nav-headcount-planning-component'] that is within the mfc-tree-item[data-qa-id='fma-tree-nav-scenario'] whose mfc-text-type text is exactly "Baseline" (contains is not enough). I was thinking I could use a regex with ^ at the beginning and $ at the end of Baseline, but I'm not sure that's even supported. Any ideas?
scenarioName = 'Baseline'
cy.get('mfc-tree-item[data-qa-id="fma-tree-nav-scenario"]')
.filter(':contains("^"' + scenarioName + '"$", "g")')
.find(mfc-tree-item[data-qa-id='fma-tree-nav-headcount-planning-component'])
To use a regex with text from a variable, you need to build it with a RegExp object.
const scenarioName = 'Baseline'
const regex = new RegExp(scenarioName)
cy.contains('mfc-tree-item[data-qa-id="fma-tree-nav-scenario"]', regex)
.find('mfc-tree-item[data-qa-id="fma-tree-nav-headcount-planning-component"]')
.should('have.attr', 'icon-color', 'Moonstone')
But you can't use ^ and $ to denote the beginning and end of the string, because the text of the top element includes all the text of its children plus white-space and new lines (text nodes).
You could try strengthening the regex, but it would be time consuming and give you a fragile test.
Instead, try going directly to the element with the text, which allows a strong regex, and traverse up the tree to its ancestor:
const scenarioName = 'Baseline'
const regex2 = new RegExp(`^${scenarioName}$`)
cy.contains('mfc-text-type', regex2)
.parents('mfc-tree-item[data-qa-id="fma-tree-nav-scenario"]')
.find('mfc-tree-item[data-qa-id="fma-tree-nav-headcount-planning-component"]')
.should('have.attr', 'icon-color', 'Moonstone')
string = 'xabcdexfghijk'
In the example above, 'x' appears twice. I want to capture everything between the first 'x' and the next 'x'. Thus, the desired result is a new string that equals 'xabcdex'. Any ideas?
You could use a simple regular expression: /x.*?x/. This basically means "match any characters in between two x characters, as few times as possible (non-greedy)".
The matched text can be extracted with String#[regexp]
string = 'xabcdexfghijk'
string[/x.*?x/] # => "xabcdex"
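The difference between the non-greedy and the greedy form only shows up when there are more than two 'x' characters. A small sketch, using a made-up sample string with three of them (the sample and the helper names are assumptions, not part of the question):

```ruby
# Hypothetical sample with three 'x' characters.
SAMPLE = 'xabcdexfghxijk'

# Shortest run that starts and ends with 'x' (non-greedy .*?).
def first_x_span(str)
  str[/x.*?x/]
end

# Longest run that starts and ends with 'x' (greedy .*), for comparison.
def widest_x_span(str)
  str[/x.*x/]
end

puts first_x_span(SAMPLE)   # prints xabcdex
puts widest_x_span(SAMPLE)  # prints xabcdexfghx
```

Both helpers return nil when the string contains fewer than two 'x' characters.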
I have a string as below:
str1='"{\"#Network\":{\"command\":\"Connect\",\"data\":
{\"Id\":\"xx:xx:xx:xx:xx:xx\",\"Name\":\"somename\",\"Pwd\":\"123456789\"}}}\0"'
I want to extract the somename string from the above string. The values of xx:xx:xx:xx:xx:xx, somename and 123456789 can change, but the syntax will remain the same as above.
I saw similar posts on this site, but I don't know how to use a regex in this case.
Any ideas on how to extract it?
Parse the string to JSON and get the values that way.
require 'json'
str = "{\"#Network\":{\"command\":\"Connect\",\"data\":{\"Id\":\"xx:xx:xx:xx:xx:xx\",\"Name\":\"somename\",\"Pwd\":\"123456789\"}}}\0"
json = JSON.parse(str.strip)
name = json["#Network"]["data"]["Name"]
pwd = json["#Network"]["data"]["Pwd"]
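The same idea can be wrapped in a small helper. This is only a sketch: network_field is a hypothetical name, and it assumes the payload always has the "#Network" / "data" nesting shown in the question. It removes NUL bytes explicitly rather than relying on strip, and uses Hash#dig so a missing key yields nil instead of raising.

```ruby
require 'json'

# Raw payload as in the answer above: JSON followed by a trailing NUL byte.
RAW = "{\"#Network\":{\"command\":\"Connect\",\"data\":{\"Id\":\"xx:xx:xx:xx:xx:xx\",\"Name\":\"somename\",\"Pwd\":\"123456789\"}}}\0"

# Hypothetical helper: parse the payload and look up one field under
# "#Network" -> "data". Returns nil if any level is missing.
def network_field(raw, field)
  json = JSON.parse(raw.delete("\0")) # drop every NUL byte before parsing
  json.dig('#Network', 'data', field)
end

puts network_field(RAW, 'Name') # prints somename
puts network_field(RAW, 'Pwd')  # prints 123456789
```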
Since you don't know regex, let's leave them out for now and try manual parsing which is a bit easier to understand.
Your original input, without the outer apostrophes and name of variable is:
"{\"#Network\":{\"command\":\"Connect\",\"data\":{\"Id\":\"xx:xx:xx:xx:xx:xx\",\"Name\":\"somename\",\"Pwd\":\"123456789\"}}}\0"
You say that you need to get the 'somename' value and that the 'grammar will not change'. Cool!
First, look at what delimits that value: it has quotes, then there's a colon to the left and comma to the right. However, looking at other parts, such layout is also used near the command and near the pwd. So, colon-quote-data-quote-comma is not enough. Looking further to the sides, there's a \"Name\". It never occurs anywhere in the input data except this place. This is just great! That means, that we can quickly find the whereabouts of the data just by searching for the \"Name\" text:
inputdata = .....
estposition = inputdata.index('\"Name\"')
raise "well-known marker was not found in the input" unless estposition
now, we know:
where the part starts
and that after the "Name" text there's always a colon, a quote, and then the-interesting-data
and that there's always a quote after the interesting-data
let's find all of them:
colonquote = inputdata.index(':\"', estposition)
datastart = colonquote+3
lastquote = inputdata.index('\"', datastart)
dataend = lastquote-1
The index returns the start position of the match, so it would return the position of : and the position of \. Since we want to get the text between them, we must add/subtract a few positions to move past the :\" at the beginning or move back from the \" at the end.
Then, fetch the data from between them:
value = inputdata[datastart..dataend]
And that's it.
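Put together, the manual scan might look like the sketch below. The method name value_after_name and the constant SCAN_INPUT are made up for illustration, and the input is assumed to be the single-quoted string from the question, joined onto one line, so it contains literal backslashes.

```ruby
# The question's input as one single-quoted literal (backslashes are literal).
SCAN_INPUT = '"{\"#Network\":{\"command\":\"Connect\",\"data\":{\"Id\":\"xx:xx:xx:xx:xx:xx\",\"Name\":\"somename\",\"Pwd\":\"123456789\"}}}\0"'

# Hypothetical helper: find the \"Name\" marker, then take the text between
# the following :\" and the next \".
def value_after_name(inputdata)
  estposition = inputdata.index('\"Name\"')
  raise ArgumentError, 'well-known marker was not found in the input' unless estposition

  colonquote = inputdata.index(':\"', estposition)
  datastart = colonquote + 3                   # skip the colon, backslash and quote
  lastquote = inputdata.index('\"', datastart) # position of the closing backslash
  inputdata[datastart...lastquote]             # exclusive range stops before it
end

puts value_after_name(SCAN_INPUT) # prints somename
```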
Now, step back and look at the input data once again. You say that grammar is always the same. The various bits are obviously separated by colons and commas. Let's try using it directly:
parts = inputdata.split(/[:,]/)
=> ["\"{\\\"#Network\\\"",
"{\\\"command\\\"",
"\\\"Connect\\\"",
"\\\"data\\\"",
"\n{\\\"Id\\\"",
"\\\"xx",
"xx",
"xx",
"xx",
"xx",
"xx\\\"",
"\\\"Name\\\"",
"\\\"somename\\\"",
"\\\"Pwd\\\"",
"\\\"123456789\\\"}}}\\0\""]
Please ignore the regex for now. Just assume it says a colon or comma. Now, in parts you will get all the, well, parts, that were detected by cutting the inputdata to pieces at every colon or comma.
If the layout never changes and is always the same, then your interesting-data will always be in the 13th place (index 12):
almostvalue = parts[12]
=> "\\\"somename\\\""
Now, just strip the spurious characters. Since the grammar is constant, there's 2 chars to be cut from both sides:
value = almostvalue[2..-3]
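As a runnable sketch of the split approach (name_by_position and SPLIT_INPUT are hypothetical names, and the input is assumed to be the question's string on one line), with one extra guard that checks the preceding piece really is the Name key, since a fixed index silently breaks if the layout ever shifts:

```ruby
# The question's input as one single-quoted literal (backslashes are literal).
SPLIT_INPUT = '"{\"#Network\":{\"command\":\"Connect\",\"data\":{\"Id\":\"xx:xx:xx:xx:xx:xx\",\"Name\":\"somename\",\"Pwd\":\"123456789\"}}}\0"'

# Hypothetical helper: cut the input at every colon or comma and take the
# 13th piece, trimming the \" on each side.
def name_by_position(inputdata)
  parts = inputdata.split(/[:,]/)
  # Guard (an addition to the original steps): the 12th piece should be the key.
  raise ArgumentError, 'unexpected layout' unless parts[11].to_s.include?('Name')

  almostvalue = parts[12] # \"somename\" with literal backslashes
  almostvalue[2..-3]      # drop two characters from each side
end

puts name_by_position(SPLIT_INPUT) # prints somename
```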
Ok, another way. Since regex already showed up, let's try with them. We know:
data is prefixed with \"Name\" then a colon and a backslash-quote
data consists of some text without quotes inside (well, at least I guess so)
data ends with a backslash-quote
the parts in regex syntax would be, respectively:
\"Name\":\"
[^\"]*
\"
together:
inputdata =~ /\\"Name\\":\\"([^\"]*)\\"/
value = $1
Note that I surrounded the interesting part with (), hence after a successful match that part is available in the $1 special variable.
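A variant that avoids the $1 global is String#[] with a capture index, which returns nil when nothing matches. A minimal sketch, assuming the same single-line input as in the question (name_by_regex and REGEX_INPUT are hypothetical names):

```ruby
# The question's input as one single-quoted literal (backslashes are literal).
REGEX_INPUT = '"{\"#Network\":{\"command\":\"Connect\",\"data\":{\"Id\":\"xx:xx:xx:xx:xx:xx\",\"Name\":\"somename\",\"Pwd\":\"123456789\"}}}\0"'

# Hypothetical helper: return capture group 1, i.e. the text between
# \"Name\":\" and the next \". The character class excludes both quotes
# and backslashes, so the value cannot run into the closing \".
def name_by_regex(inputdata)
  inputdata[/\\"Name\\":\\"([^"\\]*)\\"/, 1]
end

puts name_by_regex(REGEX_INPUT) # prints somename
```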
Yet another way:
If you look at the grammar carefully, it really resembles a set of embedded hashes:
\"
{ \"#Network\" :
{ \"command\" : \"Connect\",
\"data\" :
{ \"Id\" : \"xx:xx:xx:xx:xx:xx\",
\"Name\" : \"somename\",
\"Pwd\" : \"123456789\"
}
}
}
\0\"
If we'd write something similar as Ruby hashes:
{ "#Network" =>
{ "command" => "Connect",
"data" =>
{ "Id" => "xx:xx:xx:xx:xx:xx",
"Name" => "somename",
"Pwd" => "123456789"
}
}
}
What's the difference? The colon was replaced with =>, and the backslashes before the quotes are gone. Oh, and the quote that wraps the whole input is gone, and so is the \0 at the end. Let's play:
tmp = inputdata[1..-4] # remove the opening " and the closing \0"
tmp.gsub!('\"', '"') # replace every \" with just "
Now, what about colons.. We cannot just replace : with =>, because it would damage the internal colons of the xx:xx:xx:xx:xx:xx part.. But, look: all the other colons have always a quote before them!
tmp.gsub!('":', '"=>') # replace every quote-colon with quote-arrow
Now our tmp is:
{"#Network"=>{"command"=>"Connect","data"=>{"Id"=>"xx:xx:xx:xx:xx:xx","Name"=>"somename","Pwd"=>"123456789"}}}
formatted a little:
{ "#Network"=>
{ "command"=>"Connect",
"data"=>
{ "Id"=>"xx:xx:xx:xx:xx:xx","Name"=>"somename","Pwd"=>"123456789" }
}
}
So, it looks just like a Ruby hash. Let's try 'destringizing' it:
packeddata = eval(tmp)
value = packeddata['#Network']['data']['Name']
Done.
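Since eval executes whatever text it is handed, a more cautious variation on the same steps is to stop after the unescaping and give the result to a JSON parser instead. This is a sketch of a swapped-in technique, not the method above: unwrap_payload and name_from_payload are hypothetical names, and it assumes the input is wrapped in one plain quote at the start and \0" at the end, as in the split output shown earlier.

```ruby
require 'json'

# The question's input as one single-quoted literal (backslashes are literal).
WRAPPED_INPUT = '"{\"#Network\":{\"command\":\"Connect\",\"data\":{\"Id\":\"xx:xx:xx:xx:xx:xx\",\"Name\":\"somename\",\"Pwd\":\"123456789\"}}}\0"'

# Hypothetical helper: strip the outer wrapper and unescape the quotes.
def unwrap_payload(inputdata)
  tmp = inputdata.sub(/\A"/, '').sub(/\\0"\z/, '') # drop leading " and trailing \0"
  tmp.gsub('\"', '"')                              # replace every \" with just "
end

# Hypothetical helper: parse the unwrapped text as JSON and read the name.
def name_from_payload(inputdata)
  JSON.parse(unwrap_payload(inputdata)).dig('#Network', 'data', 'Name')
end

puts name_from_payload(WRAPPED_INPUT) # prints somename
```

The colon-to-arrow substitution is not needed here, because the unescaped text is already valid JSON.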
Well, this has grown a bit and Jonas was obviously faster, so I'll leave the JSON part to him since he wrote it already ;) The data looked so much like a Ruby hash because it is evidently formatted as JSON, which is a hash-like structure too. Using the proper format-reading tools is usually the best idea, but mind that a JSON library, when asked to read the data, will read all of it, and only then can you ask it "what was inside at the key xx/yy/zz", just like I showed you with the read-it-as-a-Hash attempt. Sometimes, when your program is very short on time, you cannot afford to read it all. Then scanning with a regex, or scanning manually for "known markers", may (not must) be much faster and thus preferable. But it is still much less convenient. Have fun.
I need to read a text file like this,
regular: 12/04/2013, 13/04/2013
extract 'regular', and save it in a variable and all the dates in an array. How can I do this?
Based on what you say you tried, would the following do what you want?
data = line.split(/: */) # => ["regular", "12/04/2013, 13/04/2013"]
customer = data[0] # => "regular"
dates_array = data[1].split(/, */) # => ["12/04/2013", "13/04/2013"]
I used * to match (and eliminate) multiple blanks. I'm assuming here that you don't want the blanks, comma, or the colon (:) separator included in your results. If that's not correct, adjust the regular expressions accordingly.
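As a self-contained sketch of the same split, wrapped in a method (parse_line is a hypothetical name; it assumes every line has exactly the "label: date, date" shape from the question):

```ruby
# Hypothetical helper: split one line into its label and its list of dates.
def parse_line(line)
  # Limit of 2 so only the first colon separates label from dates.
  customer, dates = line.strip.split(/: */, 2)
  [customer, dates.split(/, */)]
end

customer, dates_array = parse_line("regular: 12/04/2013, 13/04/2013\n")
puts customer            # prints regular
puts dates_array.inspect # prints ["12/04/2013", "13/04/2013"]
```

For a whole file, the method could be called on each line, for example from File.foreach.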
I have a large file in a ruby variable, it follows a common pattern like so:
// ...
// comment
$myuser['bla'] = 'bla';
// comment
$myuser['bla2'] = 'bla2';
// ...
Given a 'key', I am trying to replace its 'value'.
This replaces the entire string. How do I fix it? Another method I thought of is to do it in two steps: first find the value within the quotes, then perform a string replace. Which is best?
def keyvalr(content, key, value)
return content.gsub(/\$myuser\[\'#{key}\'\]\s+\=\s+\'(.*)\'/) {|m| value }
end
The .* is greedy and consumes as much as possible (everything until the very last '). Make that . a [^'] then it is impossible for it to go past the first closing '.
/(\$myuser\[\'#{key}\'\]\s+\=\s+\')[^']*(\')/
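I also added parentheses to capture everything except the value, which is the part to be replaced. The first set of parentheses corresponds to \1 and the second to \2, so you replace the match with the following (single-quoted, because inside double quotes Ruby would treat \1 and \2 as character escapes instead of passing them to gsub):
'\1yournewvaluehere\2'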
I'd use something like:
text = %q{
// ...
// comment
$myuser['bla'] = 'bla';
// comment
$myuser['bla2'] = 'bla2';
// ...
}
from_to = {
'bla' => 'foo',
'bla2' => 'bar'
}
puts text.gsub(/\['([^']+)'\] = '([^']+)'/) { |t|
key, val = t.scan(/'([^']+)'/).flatten
"['%s'] = '%s'" % [ key, from_to[key] ]
}
Which outputs:
// ...
// comment
$myuser['bla'] = 'foo';
// comment
$myuser['bla2'] = 'bar';
// ...
This is how it works:
If I do:
puts text.gsub(/\['([^']+)'\] = '([^']+)'/) { |t|
puts t
}
I see:
['bla'] = 'bla'
['bla2'] = 'bla2'
Then I tried:
"['bla'] = 'bla'".scan(/'([^']+)'/).flatten
=> ["bla", "bla"]
That gave me a key, "value" pair, so I could use a hash to look-up the replacement value.
Sticking it inside a gsub block meant whatever matched got replaced by my return value for the block, so I created a string to replace the "hit" and let gsub do its "thang".
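A variation on the same idea, as a sketch rather than the code above: it reads the key and the old value from the capture groups instead of re-scanning the match, and it keeps the old value when the key is missing from the hash (the version above would write an empty value in that case). replace_values and FROM_TO are hypothetical names.

```ruby
# Replacement table, as in the example above.
FROM_TO = { 'bla' => 'foo', 'bla2' => 'bar' }.freeze

# Hypothetical helper: for every ['key'] = 'value' pair, substitute the
# value looked up by key, or keep the old value if the key is unknown.
def replace_values(text, from_to)
  text.gsub(/\['([^']+)'\] = '([^']*)'/) do
    key = Regexp.last_match(1)
    old = Regexp.last_match(2)
    "['#{key}'] = '#{from_to.fetch(key, old)}'"
  end
end

puts replace_values("$myuser['bla'] = 'bla';", FROM_TO)     # prints $myuser['bla'] = 'foo';
puts replace_values("$myuser['other'] = 'keep';", FROM_TO)  # prints $myuser['other'] = 'keep';
```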
I'm not a big believer in using long regex. I've had to maintain too much code that tried to use complex patterns, and got something wrong, and failed to accomplish what was intended 100% of the time. They're very powerful, but maintenance of code is a lot harder/worse than developing it, so I try to keep patterns I write in spoon-size pieces, having mercy on those who follow me in maintaining the code.
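Going back to the keyvalr method from the question, a compact sketch that combines the [^'] fix with a block replacement could look like this. It assumes the variable in the file is $myuser, as in the sample data, and it escapes the key with Regexp.escape so that keys containing regex metacharacters are matched literally; both choices are assumptions rather than part of the original answers.

```ruby
# Sketch of the question's method: replace only the quoted value that
# belongs to the given key, leaving everything else untouched.
def keyvalr(content, key, value)
  pattern = /(\$myuser\['#{Regexp.escape(key)}'\]\s*=\s*')[^']*(')/
  # A block is used so the new value is inserted verbatim, even if it
  # contains backslashes.
  content.gsub(pattern) { "#{Regexp.last_match(1)}#{value}#{Regexp.last_match(2)}" }
end

content = "$myuser['bla'] = 'bla';\n$myuser['bla2'] = 'bla2';\n"
puts keyvalr(content, 'bla', 'new')
# prints:
# $myuser['bla'] = 'new';
# $myuser['bla2'] = 'bla2';
```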