I'm using Ruby 2.2 and have a string that looks like this:
myvar = '{"myval1"=>"value1","mayval2"=>"value2"}'
How can I get this into a key-value pair and/or hash of some sort? When I do myvar['myval1'] I get back 'myval1', which isn't quite what I'm after. The answer's probably staring right at me but nothing's worked so far.
As I've seen time and time again, simply mentioning eval makes people instantly upset, even when it is a proper use case (which this is not).
So I'm going to go with another hate magnet - parsing nested structures with regexes.
Iteration (1) - a naive approach:
JSON.parse(myvar.gsub(/=>/, ':'))
Problem - will mess up your data if the string key/values contain =>.
Iteration (2) - even number of "s remaining mean you are not inside a string:
JSON.parse(myvar.gsub(/=>(?=(?:[^"]*"){2}*[^"]*$)/, ':'))
Problem - there might be a " inside a string, escaped with a backslash.
Iteration (3) - like iteration (2), but count only the " that are not escaped. A " is escaped when it is preceded by an unescaped backslash, i.e. by a run of an odd number of backslashes:
eq_gt_finder = /(?<non_quote>
(?:
[^"\\]|
\\{2}*\\.
)*
){0}
=>(?=
(?:
\g<non_quote>
"
\g<non_quote>
){2}*
$
)/x
JSON.parse(myvar.gsub(eq_gt_finder, ':'))
See it in action
Q: Are you an infallible divine creature that is absolutely certain this will work 100% of the time?
A: Nope.
Q: Isn't this slow and unreadable as shit?
Q: Ok?
A: Yep.
You can change that string to valid JSON easily and use JSON.parse then:
require 'json'
myvar = '{"myval1"=>"value1","mayval2"=>"value2"}'
hash = JSON.parse(myvar.gsub(/=>/, ': '))
#=> { "myval1" => "value1", "mayval2" => "value2" }
hash['myval1']
#=> "value1"
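The one-liner above shares the caveat already noted for the naive replacement: a => inside a key or value gets rewritten too. Here is a minimal sketch of that failure and of the quote-counting variant. The names naive_parse and quote_aware_parse are made up for this illustration, and the second helper still assumes the strings contain no escaped quotes:

```ruby
require 'json'

# Hypothetical helper: replace every => with a colon, then parse as JSON.
def naive_parse(str)
  JSON.parse(str.gsub('=>', ':'))
end

# Hypothetical helper: only replace a => that is followed by an even number
# of double quotes, i.e. one that sits outside a string literal.
# Assumes the strings contain no escaped quotes.
def quote_aware_parse(str)
  JSON.parse(str.gsub(/=>(?=(?:(?:[^"]*"){2})*[^"]*\z)/, ':'))
end

simple = '{"myval1"=>"value1","mayval2"=>"value2"}'
tricky = '{"a=>b"=>"c"}'

simple_hash  = naive_parse(simple)        # fine for the string from the question
naive_tricky = naive_parse(tricky)        # the key is silently changed to "a:b"
aware_tricky = quote_aware_parse(tricky)  # the key stays "a=>b"
```

For input that is known never to contain => inside its strings, the naive version is enough.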
I got a task on Codewars.
The task is
In this simple Kata your task is to create a function that turns a string into a Mexican Wave. You will be passed a string and you must return that string in an array where an uppercase letter is a person standing up.
Rules are
The input string will always be lower case but maybe empty.
If the character in the string is whitespace then pass over it as if it was an empty seat
Example
wave("hello") => []string{"Hello", "hEllo", "heLlo", "helLo", "hellO"}
I have found a solution, but I want to understand its logic. It is minimalistic and looks cool, but I don't understand what happens in it. The solution is:
fun wave(str: String) = str.indices.map { str.take(it) + str.drop(it).capitalize() }.filter { it != str }
Could you please explain?
str.indices just returns the valid indices of the string. This means the numbers from 0 to and including str.length - 1 - a total of str.length numbers.
Then, these numbers are mapped (in other words, transformed) into strings. We will now refer to each of these numbers as "it", as that is what it refers to in the map lambda.
Here's how we do the transformation: we first take the first it characters of str, then combine that with the last str.length - it characters of str, but with the first of those characters capitalized. How do we get the last str.length - it characters? We drop the first it characters.
Here's an example for when str is "hello", illustrated in a table:
it | str.take(it) | str.drop(it) | str.drop(it).capitalize() | Combined
0 | (empty) | hello | Hello | Hello
1 | h | ello | Ello | hEllo
2 | he | llo | Llo | heLlo
3 | hel | lo | Lo | helLo
4 | hell | o | O | hellO
Lastly, the solution also filters out transformed strings that are the same as str. This is to handle Rule #2. Transformed strings can only be the same as str if the capitalised character is a whitespace (because capitalising a whitespace character doesn't change it).
Side note: capitalize is deprecated. For other ways to capitalise the first character, see Is there a shorter replacement for Kotlin's deprecated String.capitalize() function?
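For comparison, the same take/drop idea can be sketched in Ruby. This is a translation for illustration only, not part of the Kotlin solution. Ruby's capitalize also lowercases the rest of the string, which is harmless here because the input is assumed to be all lower case:

```ruby
# Ruby sketch of the Kotlin one-liner: for each index i, keep the first i
# characters and capitalize the remainder, then drop results equal to the
# input (those are the positions where the character was whitespace).
def wave(str)
  (0...str.length)
    .map { |i| str[0, i] + str[i..-1].capitalize }
    .reject { |candidate| candidate == str }
end

hello_wave  = wave('hello')
spaced_wave = wave('a b')
```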
Here's another way you could do it:
fun wave2(str: String) = str.mapIndexed { i, c -> str.replaceRange(i, i + 1, c.uppercase()) }
.filter { it.any(Char::isUpperCase) }
The filter in the original is way more elegant IMO; this is just an example of how else you might check for a condition. replaceRange makes a copy of a string with some of the characters changed; in this case we're just replacing the one at the current index with its uppercase form. Not as clever as the original, but good to know!
I've written this code in Ruby:
string = "hahahah"
print string.gsub("a","b")
How do I add more letter replacements into gsub?
string.gsub("a","b")("h","l") and string.gsub("a","b";"h","l")
didn't work...
Update: I have tried this too, but without any success.
letters = {
"a" => "l"
"b" => "n"
...
"z" => "f"
}
string = "hahahah"
print string.gsub(\/w\,letters)
You're overcomplicating. As with most method calls in Ruby, you can simply chain #gsub calls together, one after the other:
str = 'adfh'
print str.gsub("a","b").gsub("h","l") #=> 'bdfl'
What you're doing here is applying the second #gsub to the result of the first one.
Of course, that gets a bit long-winded if you do too many of them. So, when you find yourself stringing too many together, you'll want to look for a regex solution. Rubular is a great place to tinker with them.
The way to use your hash trick with #gsub and a regex is to provide a hash covering all possible matches. This has the same result as the two #gsub calls:
print str.gsub(/[ah]/, {'a'=>'b', 'h'=>'l'}) #=> 'bdfl'
The regex matches either a or h (/[ah]/), and the hash is saying what to substitute for each of them.
All that said, str.tr('ah', 'bl') is the simplest way to solve your problem as specified, as some commenters have mentioned, so long as you are working with single letters. If you need to work with two or more characters per substitution, you'll need to use #gsub.
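A short side-by-side sketch of the three approaches. The 'ab' swap at the end is an extra illustration, not from the question: chained calls work on the previous result, so they can undo each other, while the hash form looks at each original character once:

```ruby
str = 'adfh'

chained = str.gsub('a', 'b').gsub('h', 'l')             # one pass per replacement
hashed  = str.gsub(/[ah]/, { 'a' => 'b', 'h' => 'l' })  # single pass, hash lookup per match
mapped  = str.tr('ah', 'bl')                            # single characters only

# Swapping two letters shows the difference between chaining and the hash form.
swapped_chained = 'ab'.gsub('a', 'b').gsub('b', 'a')             # "ab" -> "bb" -> "aa"
swapped_hashed  = 'ab'.gsub(/[ab]/, { 'a' => 'b', 'b' => 'a' })  # "ba"
```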
I need to check if the last character in a string is a digit, and if so, increment it.
I have a directory structure of /u01/app/oracle/... and that's where it goes off the rails. Sometimes it ends with the version number, sometimes it ends with dbhome_1 (or 2, or 3), and sometimes, I have to assume, it will take some other form. If it ends with dbhome_X, I need to parse that and bump that final digit, if it is a digit.
I use split to split the directory structure on '/', and use include? to check if the final element is something like "dbhome". As long as my directory structure ends with dbhome_X it seems to work. As I was testing, though, I tried a path that ended with dbhome, and found that my check for the last character being a digit didn't work.
db_home = '/u01/app/oracle/product/11.2.0/dbhome'
if db_home.split('/')[-1].include?('dbhome')
homedir=db_home.split('/')[-1]
if homedir[-1].to_i.is_a? Numeric
homedir=homedir[0...-1]+(homedir[-1].to_i+1).to_s
new_path="/"+db_home.split('/')[1...-1].join("/")+"/"+homedir.to_s
end
else
new_path=db_home+"/dbhome_1"
end
puts new_path
I did not expect the output to be /u01/app/oracle/product/11.2.0/dbhom1 - it seems to have fallen into the if block that added 1 to the final character.
If I set the initial path to /u01/app/.../dbhome_1, I get the expected /u01/app/.../dbhome_2 as the output.
You could use a regular expression to make the matching a bit easier:
if !!(db_home[/.*dbhome.*\z/]) ..
You could use a regex:
/[0-9]$/.match("How3").nil?
I need to check if the last character in a string is a digit, and if
so, increment it.
This is one option:
s = 'string9'
s[-1].then { |last| last.to_i.to_s == last ? [s[0..-2], last.to_i+1].join : s }
#=> "string10"
'/u01/app/11.2.0/dbhome'.sub(/\d\z/) { |s| s.succ }
#=> "/u01/app/11.2.0/dbhome"
'/u01/app/11.2.0/dbhome9'.sub(/\d\z/) { |s| s.succ }
#=> "/u01/app/11.2.0/dbhome10"
This is a starting point if you're running Ruby v2.6+:
fname = 'filename1'
fname[/\d+$/].then { |digits|
fname[/\d+$/] = digits.to_i.next.to_s if digits
}
fname # => "filename2"
And it's safe if the filename doesn't end with a digit:
fname = 'filename'
fname[/\d+$/].then { |digits|
fname[/\d+$/] = digits.to_i.next.to_s if digits
}
fname # => "filename"
I'm not sure if I like doing it that way better than the more traditional way which works with much older Rubies:
digits = fname[/\d+$/]
fname[/\d+$/] = digits.to_i.next.to_s if digits
except for the fact that digits gets stuck into the variable space after only being used once. There's probably worse things that happen in my code though.
This is taking advantage of String's [] and []= methods.
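Tying this back to the path from the question: the check homedir[-1].to_i.is_a? Numeric is always true, because String#to_i returns 0 for a non-numeric string such as "e". Below is a minimal sketch that matches trailing digits instead. The name next_db_home is made up, the /dbhome_1 fallback mirrors the question's else branch, and returning a digit-less dbhome path unchanged is an assumption about the desired behaviour:

```ruby
# Hypothetical helper: bump the trailing number of a dbhome directory,
# or append dbhome_1 when the path does not end in a dbhome directory.
def next_db_home(path)
  last = File.basename(path)
  return "#{path}/dbhome_1" unless last.include?('dbhome')

  # \d+\z only matches when the path really ends in digits, so "dbhome"
  # is returned unchanged instead of becoming "dbhom1".
  path.sub(/\d+\z/) { |digits| digits.to_i.next.to_s }
end

bumped    = next_db_home('/u01/app/oracle/product/11.2.0/dbhome_1')
unchanged = next_db_home('/u01/app/oracle/product/11.2.0/dbhome')
appended  = next_db_home('/u01/app/oracle/product/11.2.0')
```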
I am having a string as below:
str1='"{\"#Network\":{\"command\":\"Connect\",\"data\":
{\"Id\":\"xx:xx:xx:xx:xx:xx\",\"Name\":\"somename\",\"Pwd\":\"123456789\"}}}\0"'
I wanted to extract the somename string from the above string. Values of xx:xx:xx:xx:xx:xx, somename and 123456789 can change but the syntax will remain same as above.
I saw similar posts on this site but don't know how to use regex in the above case.
Any ideas how to extract the above string.
Parse the string to JSON and get the values that way.
require 'json'
str = "{\"#Network\":{\"command\":\"Connect\",\"data\":{\"Id\":\"xx:xx:xx:xx:xx:xx\",\"Name\":\"somename\",\"Pwd\":\"123456789\"}}}\0"
json = JSON.parse(str.strip)
name = json["#Network"]["data"]["Name"]
pwd = json["#Network"]["data"]["Pwd"]
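A slightly more defensive variant of the same idea, as a sketch: it removes the NUL byte explicitly instead of relying on strip, and uses Hash#dig (Ruby 2.3+) so that a missing level yields nil rather than raising NoMethodError. It assumes, as the snippet above does, that str holds the JSON text itself:

```ruby
require 'json'

str = "{\"#Network\":{\"command\":\"Connect\",\"data\":{\"Id\":\"xx:xx:xx:xx:xx:xx\",\"Name\":\"somename\",\"Pwd\":\"123456789\"}}}\0"

# Drop the trailing NUL explicitly, then parse.
data = JSON.parse(str.delete("\0"))

# Hash#dig walks the nested keys and returns nil if any level is missing.
name    = data.dig('#Network', 'data', 'Name')
missing = data.dig('#Network', 'nope', 'Name')
```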
Since you don't know regex, let's leave them out for now and try manual parsing which is a bit easier to understand.
Your original input, without the outer apostrophes and name of variable is:
"{\"#Network\":{\"command\":\"Connect\",\"data\":{\"Id\":\"xx:xx:xx:xx:xx:xx\",\"Name\":\"somename\",\"Pwd\":\"123456789\"}}}\0"
You say that you need to get the 'somename' value and that the 'grammar will not change'. Cool!.
First, look at what delimits that value: it has quotes, then there's a colon to the left and comma to the right. However, looking at other parts, such layout is also used near the command and near the pwd. So, colon-quote-data-quote-comma is not enough. Looking further to the sides, there's a \"Name\". It never occurs anywhere in the input data except this place. This is just great! That means, that we can quickly find the whereabouts of the data just by searching for the \"Name\" text:
inputdata = .....
estposition = inputdata.index('\"Name\"')
raise "well-known marker was not found in the input" unless estposition
now, we know:
where the part starts
and that after the "Name" text there's always a colon, a quote, and then the-interesting-data
and that there's always a quote after the interesting-data
let's find all of them:
colonquote = inputdata.index(':\"', estposition)
datastart = colonquote+3
lastquote = inputdata.index('\"', datastart)
dataend = lastquote-1
index returns the start position of the match, so it returns the position of the : and the position of the \. Since we want the text between them, we must add/subtract a few positions to move past the :\" at the beginning or move back from the \" at the end.
Then, fetch the data from between them:
value = inputdata[datastart..dataend]
And that's it.
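Put together as a runnable sketch, assuming inputdata holds the text exactly as shown above (written here as a single-quoted Ruby literal so the backslashes stay in the string):

```ruby
inputdata = '"{\"#Network\":{\"command\":\"Connect\",\"data\":{\"Id\":\"xx:xx:xx:xx:xx:xx\",\"Name\":\"somename\",\"Pwd\":\"123456789\"}}}\0"'

estposition = inputdata.index('\"Name\"')
raise 'well-known marker was not found in the input' unless estposition

colonquote = inputdata.index(':\"', estposition)  # position of the colon
datastart  = colonquote + 3                       # skip the colon, backslash and quote
lastquote  = inputdata.index('\"', datastart)     # position of the closing backslash
value      = inputdata[datastart...lastquote]     # exclusive range, same as datastart..lastquote-1
```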
Now, step back and look at the input data once again. You say that grammar is always the same. The various bits are obviously separated by colons and commas. Let's try using it directly:
parts = inputdata.split(/[:,]/)
=> ["\"{\\\"#Network\\\"",
"{\\\"command\\\"",
"\\\"Connect\\\"",
"\\\"data\\\"",
"\n{\\\"Id\\\"",
"\\\"xx",
"xx",
"xx",
"xx",
"xx",
"xx\\\"",
"\\\"Name\\\"",
"\\\"somename\\\"",
"\\\"Pwd\\\"",
"\\\"123456789\\\"}}}\\0\""]
Please ignore the regex for now. Just assume it says a colon or comma. Now, in parts you will get all the, well, parts, that were detected by cutting the inputdata to pieces at every colon or comma.
If the layout never changes and is always the same, then your interesting-data will be always at place 13th:
almostvalue = parts[12]
=> "\\\"somename\\\""
Now, just strip the spurious characters. Since the grammar is constant, there's 2 chars to be cut from both sides:
value = almostvalue[2..-3]
Ok, another way. Since regex already showed up, let's try with them. We know:
data is prefixed with \"Name\" then colon and backslash-quote
data consists of some text without quotes inside (well, at least I guess so)
data ends with a backslash-quote
the parts in regex syntax would be, respectively:
\"Name\":\"
[^\"]*
\"
together:
inputdata =~ /\\"Name\\":\\"([^\"]*)\\"/
value = $1
Note that I surrounded the interesting part with (), hence after a successful match that part is available in the $1 special variable.
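The same match can also be written with String#[], which takes a regex plus a capture-group index and returns nil when nothing matches, so the global $1 is not needed. This is a sketch: NAME_PATTERN is a made-up constant, the character class here also excludes backslashes, and inputdata is assumed to be the single-quoted literal used throughout this answer:

```ruby
inputdata = '"{\"#Network\":{\"command\":\"Connect\",\"data\":{\"Id\":\"xx:xx:xx:xx:xx:xx\",\"Name\":\"somename\",\"Pwd\":\"123456789\"}}}\0"'

# Backslash-quote, Name, backslash-quote, colon, backslash-quote, then the
# captured value (no quotes or backslashes), then the closing backslash-quote.
NAME_PATTERN = /\\"Name\\":\\"([^"\\]*)\\"/

value  = inputdata[NAME_PATTERN, 1]
absent = 'no marker here'[NAME_PATTERN, 1]  # nil, nothing to match
```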
Yet another way:
If you look at the grammar carefully, it really resembles a set of embedded hashes:
\"
{ \"#Network\" :
{ \"command\" : \"Connect\",
\"data\" :
{ \"Id\" : \"xx:xx:xx:xx:xx:xx\",
\"Name\" : \"somename\",
\"Pwd\" : \"123456789\"
}
}
}
\0\"
If we'd write something similar as Ruby hashes:
{ "#Network" =>
{ "command" => "Connect",
"data" =>
{ "Id" => "xx:xx:xx:xx:xx:xx",
"Name" => "somename",
"Pwd" => "123456789"
}
}
}
What's the difference? The colon was replaced with =>, and the backslashes before the quotes are gone. Oh, and also the opening/closing quote is gone, and that \0 at the end is gone too. Let's play:
tmp = inputdata[1..-4] # remove the opening " and the closing \0" (keep the leading {)
tmp.gsub!('\"', '"') # replace every \" with just "
Now, what about colons.. We cannot just replace : with =>, because it would damage the internal colons of the xx:xx:xx:xx:xx:xx part.. But, look: all the other colons have always a quote before them!
tmp.gsub!('":', '"=>') # replace every quote-colon with quote-arrow
Now our tmp is:
{"#Network"=>{"command"=>"Connect","data"=>{"Id"=>"xx:xx:xx:xx:xx:xx","Name"=>"somename","Pwd"=>"123456789"}}}
formatted a little:
{ "#Network"=>
{ "command"=>"Connect",
"data"=>
{ "Id"=>"xx:xx:xx:xx:xx:xx","Name"=>"somename","Pwd"=>"123456789" }
}
}
So, it looks just like a Ruby hash. Let's try 'destringizing' it:
packeddata = eval(tmp)
value = packeddata['#Network']['data']['Name']
Done.
Well, this has grown a bit and Jonas was obviously faster, so I'll leave the JSON part to him since he wrote it already ;) The data was so similar to a Ruby hash because it was obviously formatted as JSON, which is a hash-like structure too. Using the proper format-reading tools is usually the best idea, but mind that the JSON library, when asked to read the data, will read all of the data, and only then can you ask it "what was inside at the key xx/yy/zz", just like I showed you with the read-it-as-a-Hash attempt. Sometimes, when your program is very short on time, you cannot afford to read it all. Then, scanning with a regex or scanning manually for "known markers" may (not must) be much faster and thus preferable. But, still, much less convenient. Have fun.
I never tried regex before today, and I like it so far, but I'm lost on some things.
I have a string that looks like this:
Type OtherType ThirdType - SubType AnotherSubType QuiteTheType
I want two regex, both care about the '-' character.
First I want all words before that character, then all words after it. I will be using Ruby's gsub to turn them into an array of strings, two arrays, which is why I need two regex expressions.
So far I have this: ([a-zA-z]{1,}) (?=-) but that only gets me the word right before the dash, I.E. ThirdType.
If I just use ([a-zA-z]{1,}) I get all words highlighted, but that includes the ones AFTER the - which I don't want yet.
How can I get all occurrences of [a-zA-z]{1,} that happen before - but not necessarily IMMEDIATELY before it?
s = "Type OtherType ThirdType - SubType AnotherSubType QuiteTheType"
words_before, words_after = s.split(/\s*-\s*/).map do |t|
t.split(/\s+/)
end
p words_before # => ["Type", "OtherType", "ThirdType"]
p words_after # => ["SubType", "AnotherSubType", "QuiteTheType"]
Here's how this works:
s.split(/\s*-\s*/)
This splits the string in two, using a regular expression delimiter. The delimiter means "any amount of white-space, then a dash, then any amount of white-space." The result is an array with two strings in it: the part on the left of the delimiter, and the part on the right.
...map do |t|
...
end
map takes an array and transforms it into another array with the same number of elements. It takes each element of the array, passes it to the block, and uses the return value from the block as the new value for that element. We'll use it to transform the two strings into two arrays of words.
So, what's in the block?
t.split(/\s+/)
It's another split. This time we'll split on one or more whitespace characters. That results in an array of words.
Since the map applies that split to first the left side and then the right side, the result of the entire s.split... expression is an array of two arrays.
Now we'll use one of Ruby's fun syntaxes:
words_before, words_after = s.split...
Whenever you have multiple variables on the left side of an assignment, Ruby will "decompose" the array on the right side, assigning the first element of the array to the first variable, the second element of the array to the second variable, and so on. Since our array has two elements (the first being an array of words from the left side, and the second being an array of words from the right side), we'll use two variables to hold them.
I don't know exactly how Ruby's regex implementation works, but the following regex in Perl should get you what you want:
/^([a-zA-Z\s]+) \- ([a-zA-Z\s]+)$/
For example:
perl -e '$_="Type OtherType ThirdType - SubType AnotherSubType QuiteTheType";
if(/^([a-zA-Z\s]+) \- ([a-zA-Z\s]+)$/){print "$1\n";print "$2\n";}'
produces
Type OtherType ThirdType
SubType AnotherSubType QuiteTheType
ETA: To explain what's going on, the initial ^ denotes the beginning of the line and the ending $ denotes the end of the line. So, ^([a-zA-Z\s]+) starts at the beginning and (greedily) matches all of the words from the beginning of the line up until the space before the dash (the dash is escaped with a backslash; that is harmless but not required, since - is only special inside a character class). Likewise with ([a-zA-Z\s]+)$.
You can use look-ahead:
(\w+)(?=.*?-)
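With scan, that look-ahead collects every word that still has a dash somewhere after it. A sketch; the negative look-ahead used for the words after the dash is an addition for illustration and assumes the string contains exactly one dash:

```ruby
s = 'Type OtherType ThirdType - SubType AnotherSubType QuiteTheType'

# Words that are followed (anywhere later in the string) by a dash.
words_before = s.scan(/\w+(?=.*?-)/)

# Words that have no dash anywhere after them.
words_after = s.scan(/\w+(?!.*-)/)
```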
While regex is powerful and useful, it often leads to a more complicated solution than you need, and complexity results in more work and maintenance.
sentence = 'Type OtherType ThirdType - SubType AnotherSubType QuiteTheType'
sentence.split('-') # => ["Type OtherType ThirdType ", " SubType AnotherSubType QuiteTheType"]
sentence.scan(/[^-]+/) # => ["Type OtherType ThirdType ", " SubType AnotherSubType QuiteTheType"]
If the whitespace surrounding the hyphen is annoying pass the returned sections through strip:
sentence.split('-').map{ |w| w.strip } # => ["Type OtherType ThirdType", "SubType AnotherSubType QuiteTheType"]
sentence.scan(/[^-]+/).map{ |w| w.strip } # => ["Type OtherType ThirdType", "SubType AnotherSubType QuiteTheType"]
If you want the individual words, and not the sentences before and after the hyphen:
sentence.split('-').map{ |w| w.strip.split(' ') } # => [["Type", "OtherType", "ThirdType"], ["SubType", "AnotherSubType", "QuiteTheType"]]
sentence.scan(/[^-]+/).map{ |w| w.strip.split(' ') } # => [["Type", "OtherType", "ThirdType"], ["SubType", "AnotherSubType", "QuiteTheType"]]