Incrementing numeric parameter in a URL parameter string? - ruby

I've had a look round and can't find what I need on Stack Overflow, and was wondering if someone had a simple solution.
I want to find a parameter within a URL and increment its value, so, as an example:
?kws=&pstc=&cty=&prvnm=1
I want to be able to locate the prvnm parameter no matter where it is in the string and increment its value by 1.
I know I could split the parameters into an array, find the key, increment it and write it back but that seems rather long winded and wondered if someone else had any ideas!

require "uri"
url = "http://example.com/?kws=&pstc=&cty=&prvnm=1"
def new_url(url)
uri = URI.parse(url)
hsh = Hash[URI.decode_www_form(uri.query)]
hsh['prvnm'] = hsh['prvnm'].next
uri.query = URI.encode_www_form(hsh).to_s
uri.to_s
end
new_url(url) # => "http://example.com/?kws=&pstc=&cty=&prvnm=2"

There are already four answers, so I had to come up with something a little different:
s = "?kws=&pstc=&cty=&prvnm=1"
head, sep, tail = s.partition(/(?<=[?&]prvnm=)\d+/)
head + (sep.to_i + 1).to_s + tail # => "?kws=&pstc=&cty=&prvnm=2"
'String#partition' returns an array of three strings [head, sep, tail], such that head + sep + tail => s, where separator is partition's argument, which can be a string or a regex.
We want the separator to be the digits following &prvnm=. We therefore use a regex with \d+ preceeded by the aforementioned string which we want to treat as having zero length, so it will not be included in the separator. That calls for a "positive look-behind": (?<=&prvnm=). \d+ is "greedy", so it take all consequetive digits.
For the given value of s, head, sep, tail = s.partition(/(?<=&prvnm=)(\d+)/)
=> ["?kws=&pstc=&cty=&prvnm=", "1", ""].
Edit: my thanks to #quetzalcoatl for pointing out that I needed to change (?<=&prvnm=) in my regex to what I have now, as what I had would fail when ?prvnm= was at the beginning of the string.

split the string by `&`
then iterate over the parts
then split each part by `=` and inspect the results
when found `prvnm`, parse the integer and increment it
then join the bits by '='
then join the parts by '&'
Or, use regex like:
/[?&]prvnm=\d+/
and parse the result and then do a replacement.
Or, get some URL-parsing library..

Try something like this:
params = "?kws=&pstc=&cty=&prvnm=1"
num = params.scan(/prvnm=(\d)/)[0].join.to_i
puts num + 1

Use:
require 'uri'
Then:
parsed-url= URI.parse( ur full url)
r = CGI.parse(parsed_url.query)
r is now a hash of all your query parameters.
You can easily access it by using:
r["prsvn"].to_i + 1

Related

How to verify that the last character in a string is a number

I need to check if the last character in a string is a digit, and if so, increment it.
I have a directory structure of /u01/app/oracle/... and that's where it goes off the rails. Sometimes it ends with the version number, sometimes it ends with dbhome_1 (or 2, or 3), and sometimes, I have to assume, it will take some other form. If it ends with dbhome_X, I need to parse that and bump that final digit, if it is a digit.
I use split to split the directory structure on '/', and use include? to check if the final element is something like "dbhome". As long as my directory structure ends with dbhome_X it seems to work. As I was testing, though, I tried a path that ended with dbhome, and found that my check for the last character being a digit didn't work.
db_home = '/u01/app/oracle/product/11.2.0/dbhome'
if db_home.split('/')[-1].include?('dbhome')
homedir=db_home.split('/')[-1]
if homedir[-1].to_i.is_a? Numeric
homedir=homedir[0...-1]+(homedir[-1].to_i+1).to_s
new_path="/"+db_home.split('/')[1...-1].join("/")+"/"+homedir.to_s
end
else
new_path=db_home+"/dbhome_1"
end
puts new_path
I did not expect the output to be /u01/app/oracle/11.2.0/product/dbhom1 - it seems to have fallen into the if block that added 1 to the final character.
If I set the initial path to /u01/app/.../dbhome_1, I get the expected /u01/app/.../dbhome_2 as the output.
You could use a regular expression to make matching a tad bit easier
if !!(db_home[/.*dbhome.*\z]) ..
You could use regex's
/[0-9]$/.match("How3").nil?
I need to check if the last character in a string is a digit, and if
so, increment it.
This is one option:
s = 'string9'
s[-1].then { |last| last.to_i.to_s == last ? [s[0..-2], last.to_i+1].join : s }
#=> "string10"
'/u01/app/11.2.0/dbhome'.sub(/\d\z/) { |s| s.succ }
#=> "/u01/app/11.2.0/dbhome"
'/u01/app/11.2.0/dbhome9'.sub(/\d\z/) { |s| s.succ }
#=> "/u01/app/11.2.0/dbhome10"
This is a starting point if you're running Ruby v2.6+:
fname = 'filename1'
fname[/\d+$/].then { |digits|
fname[/\d+$/] = digits.to_i.next.to_s if digits
}
fname # => "filename2"
And it's safe if the filename doesn't end with a digit:
fname = 'filename'
fname[/\d+$/].then { |digits|
fname[/\d+$/] = digits.to_i.next.to_s if digits
}
fname # => "filename"
I'm not sure if I like doing it that way better than the more traditional way which works with much older Rubies:
digits = fname[/\d+$/]
fname[/\d+$/] = digits.to_i.next.to_s if digits
except for the fact that digits gets stuck into the variable space after only being used once. There's probably worse things that happen in my code though.
This is taking advantage of String's [] and []= methods.

Regex to extract last number portion of varying URL

I'm creating a URL parser and have three kind of URLs from which I would like to extract the number portion from the end of the URL and increment the extracted number by 10 and update the URL. I'm trying to use regex to extract but I'm new to regex and having trouble.
These are three URL structures of which I'd like to increment the last number portion of:
Increment last number 20 by 10:
http://forums.scamadviser.com/site-feedback-issues-feature-requests/20/
Increment last number 50 by 10:
https://forums.questionablecontent.net/index.php/board,1.50.html
Increment last number 30 by 10:
https://forums.comodo.com/how-can-i-help-comodo-please-we-need-you-b39.30/
With \d+(?!.*\d) regex, you will get the last digit chunk in the string. Then, use s.gsub with a block to modify the number and put back to the result.
See this Ruby demo:
strs = ['http://forums.scamadviser.com/site-feedback-issues-feature-requests/20/', 'https://forums.questionablecontent.net/index.php/board,1.50.html', 'https://forums.comodo.com/how-can-i-help-comodo-please-we-need-you-b39.30/']
arr = strs.map {|item| item.gsub(/\d+(?!.*\d)/) {$~[0].to_i+10}}
Note: $~ is a MatchData object, and using the [0] index we can access the whole match value.
Results:
http://forums.scamadviser.com/site-feedback-issues-feature-requests/30/
https://forums.questionablecontent.net/index.php/board,1.60.html
https://forums.comodo.com/how-can-i-help-comodo-please-we-need-you-b39.40/
Try this regex:
\d+(?=(\/)|(.html))
It will extract the last number.
Demo: https://regex101.com/r/zqUQlF/1
Substitute back with this regex:
(.*?)(\d+)((\/)|(.html))
Demo: https://regex101.com/r/zqUQlF/2
this regex matches only the last whole number in each URL by using a lookahead (which 'sees' patterns but doesn't eat any characters):
\d+(?=\D*$)
online demo here.
Like this:
urls = ['http://forums.scamadviser.com/site-feedback-issues-feature-requests/20/', 'https://forums.questionablecontent.net/index.php/board,1.50.html', 'https://forums.comodo.com/how-can-i-help-comodo-please-we-need-you-b39.30/']
pattern = /(\d+)(?=[^\d]+$)/
urls.each do |url|
url.gsub!(pattern) {|m| m.to_i + 10}
end
puts urls
You can also test it online here: https://ideone.com/smBJCQ

best way to find substring in ruby using regular expression

I have a string https://stackverflow.com. I want a new string that contains the domain from the given string using regular expressions.
Example:
x = "https://stackverflow.com"
newstring = "stackoverflow.com"
Example 2:
x = "https://www.stackverflow.com"
newstring = "www.stackoverflow.com"
"https://stackverflow.com"[/(?<=:\/\/).*/]
#⇒ "stackverflow.com"
(?<=..) is a positive lookbehind.
If string = "http://stackoverflow.com",
a really easy way is string.split("http://")[1]. But this isn't regex.
A regex solution would be as follows:
string.scan(/^http:\/\/(.+)$/).flatten.first
To explain:
String#scan returns the first match of the regex.
The regex:
^ matches beginning of line
http: matches those characters
\/\/ matches //
(.+) sets a "match group" containing any number of any characters. This is the value returned by the scan.
$ matches end of line
.flatten.first extracts the results from String#scan, which in this case returns a nested array.
You might want to try this:
#!/usr/bin/env ruby
str = "https://stackoverflow.com"
if mtch = str.match(/(?::\/\/)(/S)/)
f1 = mtch.captures
end
There are two capturing groups in the match method: the first one is a non-capturing group referring to your search pattern and the second one referring to everything else afterwards. After that, the captures method will assign the desired result to f1.
I hope this solves your problem.

RegEx to remove new line characters and replace with comma

I scraped a website using Nokogiri and after using xpath I was left with the following string (which is a few td's pushed into one string).
"Total First Downs\n\t\t\t\t\t\t\t\t359\n\t\t\t\t\t\t\t\t274\n\t\t\t\t\t\t\t"
My goal is to make this into an array that looks like the following(it will be a nested array):
["Total First Downs", "359", "274"]
The issue is creating a regex equation that removes the escaped characters, subs in one "," but does not sub in a "," after the last set of integers. If the comma after the last set of integers is necessary, I could use #compact to get rid of the nil that occurs in the array. If you need the code on how I scraped the website here it is: (please note i saved the webpage for testing in order for my ip address to not get burned during the trial phase)
f = File.open('page')
doc = Nokogiri::HTML:(f)
f.close
number = doc.xpath('//tr[#class="tbdy1"]').count
stats = Array.new(number) {Array.new}
i = 0
doc.xpath('//tr[#class="tbdy1"]').each do |tr|
stats[i] << tr.text
i += 1
end
Thanks for your help
I don't fully understand your problem, but the result can be easily achieved with this:
"Total First Downs\n\t\t\t\t\t\t\t\t359\n\t\t\t\t\t\t\t\t274\n\t\t\t\t\t\t\t"
.split(/[\n\t]+/)
# => ["Total First Downs", "359", "274"]
Try with gsub
"Total First Downs\n\t\t\t\t\t\t\t\t359\n\t\t\t\t\t\t\t\t274\n\t\t\t\t\t\t\t".gsub("/[\n\t]+/",",")

How do I convert a Ruby string with brackets to an array?

I would like to convert the following string into an array/nested array:
str = "[[this, is],[a, nested],[array]]"
newarray = # this is what I need help with!
newarray.inspect # => [['this','is'],['a','nested'],['array']]
You'll get what you want with YAML.
But there is a little problem with your string. YAML expects that there's a space behind the comma. So we need this
str = "[[this, is], [a, nested], [array]]"
Code:
require 'yaml'
str = "[[this, is],[a, nested],[array]]"
### transform your string in a valid YAML-String
str.gsub!(/(\,)(\S)/, "\\1 \\2")
YAML::load(str)
# => [["this", "is"], ["a", "nested"], ["array"]]
You could also treat it as almost-JSON. If the strings really are only letters, like in your example, then this will work:
JSON.parse(yourarray.gsub(/([a-z]+)/,'"\1"'))
If they could have arbitrary characters (other than [ ] , ), you'd need a little more:
JSON.parse("[[this, is],[a, nested],[array]]".gsub(/, /,",").gsub(/([^\[\]\,]+)/,'"\1"'))
For a laugh:
ary = eval("[[this, is],[a, nested],[array]]".gsub(/(\w+?)/, "'\\1'") )
=> [["this", "is"], ["a", "nested"], ["array"]]
Disclaimer: You definitely shouldn't do this as eval is a terrible idea, but it is fast and has the useful side effect of throwing an exception if your nested arrays aren't valid
Looks like a basic parsing task. Generally the approach you are going to want to take is to create a recursive function with the following general algorithm
base case (input doesn't begin with '[') return the input
recursive case:
split the input on ',' (you will need to find commas only at this level)
for each sub string call this method again with the sub string
return array containing the results from this recursive method
The only slighlty tricky part here is splitting the input on a single ','. You could write a separate function for this that would scan through the string and keep a count of the openbrackets - closedbrakets seen so far. Then only split on commas when the count is equal to zero.
Make a recursive function that takes the string and an integer offset, and "reads" out an array. That is, have it return an array or string (that it has read) and an integer offset pointing after the array. For example:
s = "[[this, is],[a, nested],[array]]"
yourFunc(s, 1) # returns ['this', 'is'] and 11.
yourFunc(s, 2) # returns 'this' and 6.
Then you can call it with another function that provides an offset of 0, and makes sure that the finishing offset is the length of the string.

Resources