I want to replace 'hoge' with 'foo' using a regex. But the user's value is dynamic, so I can't just use str.gsub('hoge', 'foo').
str = '?user=hoge&tab=fuga'
What should I do?
Don't do this with a regular expression.
This is how to manipulate URIs using the existing wheels:
require 'uri'
str = 'http://example.com?user=hoge&tab=fuga'
uri = URI.parse(str)
query = URI.decode_www_form(uri.query).to_h # => {"user"=>"hoge", "tab"=>"fuga"}
query['user'] = 'foo'
uri.query = URI.encode_www_form(query)
uri.to_s # => "http://example.com?user=foo&tab=fuga"
Alternately:
require 'addressable'
uri = Addressable::URI.parse('http://example.com?tab=fuga&user=hoge')
query = uri.query_values # => {"tab"=>"fuga", "user"=>"hoge"}
query['user'] = 'foo'
uri.query_values = query
uri.to_s # => "http://example.com?tab=fuga&user=foo"
Note that the two examples pass the parameters in a different order, and the code handled the difference without problems.
The reason you want to use URI or Addressable is that parameters and values have to be correctly encoded when they contain characters that are illegal in a URL. URI and Addressable know the rules and follow them, whereas naive code tends to skip encoding entirely, producing broken URIs.
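For example, a quick illustration of the encoding being handled for you (the value here is a made-up string containing characters that need escaping):
require 'uri'

uri = URI.parse('http://example.com')
# The space, '&' and '=' inside the value would corrupt a hand-built query string
uri.query = URI.encode_www_form('user' => 'foo bar&baz=qux')
uri.to_s # => "http://example.com?user=foo+bar%26baz%3Dqux"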
URI is part of the Ruby Standard Library, and Addressable is more full-featured. Take your pick.
You can try the regex below:
([?&]user=)([^&]+)
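If you do go the regex route, a minimal sketch of using that pattern with gsub (the capture group keeps the ?user= or &user= prefix intact; 'foo' is the replacement from the question):
str = '?user=hoge&tab=fuga'
str.gsub(/([?&]user=)([^&]+)/, '\1foo') # => "?user=foo&tab=fuga"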
You probably want to find out what the user parameter maps to first, before using gsub to replace whatever its value is.
First, parse the URL string into a URI object using the URI module. Then use the CGI module to get the key/value pairs of the query params off that URI object. Finally, you can gsub (or simply reassign) the value you want in that hash and rebuild the query string.
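A minimal sketch of that approach, assuming plain URI and CGI from the standard library (note that CGI.parse wraps every value in an array):
require 'uri'
require 'cgi'

str = 'http://example.com?user=hoge&tab=fuga'
uri = URI.parse(str)

query = CGI.parse(uri.query) # => {"user"=>["hoge"], "tab"=>["fuga"]}
query['user'] = ['foo']

# encode_www_form repeats the key for each element of an array value
uri.query = URI.encode_www_form(query)
uri.to_s # => "http://example.com?user=foo&tab=fuga"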
I want to merge a hash of default parameters with the actual parameters given in a request. When I call this seemingly innocent script:
#!/usr/bin/env ruby
require 'sinatra'
get '/' do
defaults = { 'p1' => 'default1', 'p2' => 'default2' }
# params = request.params
params = defaults.merge(params)
params
end
with curl http://localhost:4567?p0=request, it crashes with:
Listening on localhost:4567, CTRL+C to stop
2016-06-17 11:10:34 - TypeError - no implicit conversion of nil into Hash:
sinatrabug:8:in `merge'
sinatrabug:8:in `block in <main>'
When I access the Rack request.params directly it works. I looked into the Sinatra sources but I couldn't figure it out.
So I have a solution for my actual problem. But I don't know why it works.
My question is: why can I assign to params, and why does defaults.merge(params) raise an exception even though params is supposed to be a Hash?
Any idea?
This is caused by the way Ruby handles local variables and setter methods (i.e. methods that end in =) with the same name. When Ruby parses the line
params = defaults.merge(params)
it assumes you want to create a new local variable named params rather than call the setter method, and the bare params on the right-hand side of that same line now refers to the new local variable as well. Since it hasn't been assigned yet, its value is nil, and nil is what merge receives.
If you want to use the setter method, you have to name the receiver explicitly and write self.params = defaults.merge(params); with no bare assignment left in the statement, the params on the right-hand side is the Sinatra method again. This applies to any object that has such a method, not just Sinatra.
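A stripped-down illustration outside Sinatra (Fake is just a made-up class with a params accessor, standing in for the Sinatra behaviour):
class Fake
  attr_accessor :params

  def initialize
    @params = { 'p0' => 'request' }
  end

  def broken
    defaults = { 'p1' => 'default1' }
    # `params = ...` introduces a local variable named params; the bare
    # params on the right-hand side is that (still unassigned, nil) variable.
    params = defaults.merge(params) # TypeError: no implicit conversion of nil into Hash
  end

  def fixed
    defaults = { 'p1' => 'default1' }
    # With an explicit receiver there is no bare assignment, so the
    # params on the right-hand side is the getter method again.
    self.params = defaults.merge(params)
    params
  end
end

Fake.new.fixed # => {"p1"=>"default1", "p0"=>"request"}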
A better solution, to avoid this confusion altogether, might be to use a different name. Something like:
get '/' do
defaults = { 'p1' => 'default1', 'p2' => 'default2' }
normalized_params = defaults.merge(params)
normalized_params.inspect
end
Your code throws an error because params is nil when you call defaults.merge(params). I assume you are trying to merge defaults with request.params, which should contain the parameters from your GET request.
Change this line
params = defaults.merge(params)
to this
params = defaults.merge(request.params)
I found this in the rack gem:
http://www.rubydoc.info/gems/rack/Rack/Request#params-instance_method
It seems you can retrieve GET and POST data via the params method, but you can't write to it. You have to use update_param and delete_param instead.
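For example, a hedged sketch of filling in defaults through the Rack request itself with update_param (the route and defaults are from the question):
get '/' do
  defaults = { 'p1' => 'default1', 'p2' => 'default2' }
  # Write each missing default into the Rack request's parameters
  defaults.each do |key, value|
    request.update_param(key, value) unless request.params.key?(key)
  end
  request.params.inspect
end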
I am writing a test to check if a meta tag exists and if its content matches. Look at the example below:
def have_meta(name, expected)
page.find "meta[property='#{name}'][content='#{expected}']", :visible => false
end
I am calling the method like this:
have_meta("og:title", "this is the title")
have_meta("og:url", "http://www.lorem.com/ipsum")
The problem is that my URL content changes based on which environment I am working in. E.g. it can be:
http://www.lorem.com/ipsum
or
http://www.sdfsd.com/ipsum
I want to have a regex match for the "content" part of the meta tag, ignoring the host but checking the "ipsum" part. Do you know how I can do that?
I would use CSS's ends-with attribute selector:
def have_meta_with_path(name, path)
page.find "meta[property='#{name}'][content$='#{path}']", :visible => false
end
where path is just e.g. '/ipsum'. No regex needed. It will be safest to have two methods: your original have_meta, which tests the entire value, and the one above, which tests the end. You wouldn't want to just test the end of the og:title.
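Usage would then look something like this (reusing the helper names above):
have_meta("og:title", "this is the title")
have_meta_with_path("og:url", "/ipsum")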
You could use XPath instead of CSS.
page.should have_xpath "//meta[@property='og:url'][contains(@content, 'ipsum')]", :visible => false
In code like this,
get '/posts.?:format?' do
# ...
end
How can I get the format that was requested?
Look in the params hash, just as you would for a non-optional parameter:
params[:format]
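For example, a small sketch of branching on it (the JSON branch is just an assumption about what you might return for that format):
require 'sinatra'
require 'json'

get '/posts.?:format?' do
  posts = [{ 'title' => 'Hello' }]
  if params[:format] == 'json'
    content_type :json
    posts.to_json
  else
    posts.map { |post| post['title'] }.join("\n")
  end
end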
How can I check if a string is a valid URL?
For example:
http://hello.it => yes
http:||bra.ziz, => no
If this is a valid URL how can I check if this is relative to a image file?
Note:
As pointed out by @CGuess, there's a bug here: it has been documented for over 9 years now that validation is not the purpose of this regular expression (see https://bugs.ruby-lang.org/issues/6520).
Use the URI module distributed with Ruby:
require 'uri'
if url =~ URI::regexp
# Correct URL
end
Like Alexander Günther said in the comments, it checks if a string contains a URL.
To check if the string is a URL, use:
url =~ /\A#{URI::regexp}\z/
If you only want to check for web URLs (http or https), use this:
url =~ /\A#{URI::regexp(['http', 'https'])}\z/
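For instance (a match position of 0 means the whole string matched; nil means no match):
require 'uri'

'http://hello.it' =~ /\A#{URI::regexp(['http', 'https'])}\z/ # => 0
'http:||bra.ziz'  =~ /\A#{URI::regexp(['http', 'https'])}\z/ # => nil
'ftp://hello.it'  =~ /\A#{URI::regexp(['http', 'https'])}\z/ # => nil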
Similar to the answers above, I find using this regex to be slightly more accurate:
URI::DEFAULT_PARSER.regexp[:ABS_URI]
That will invalidate URLs with spaces, as opposed to URI.regexp which allows spaces for some reason.
I recently found a shortcut for the different URI regexps: every key of URI::DEFAULT_PARSER.regexp is also available as a constant directly on URI.
For example, the :ABS_URI regexp can be accessed from URI::ABS_URI.
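For instance (the second string is just an example with an internal space):
require 'uri'

abs_uri = URI::DEFAULT_PARSER.regexp[:ABS_URI] # same pattern as URI::ABS_URI

'http://hello.it'       =~ abs_uri # => 0   (valid absolute URI)
'http://hello world.it' =~ abs_uri # => nil (the space is rejected)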
The problem with the current answers is that a URI is not necessarily a URL.
A URI can be further classified as a locator, a name, or both. The
term "Uniform Resource Locator" (URL) refers to the subset of URIs
that, in addition to identifying a resource, provide a means of
locating the resource by describing its primary access mechanism
(e.g., its network "location").
Since URLs are a subset of URIs, it is clear that matching specifically for URIs will successfully match undesired values. For example, URNs:
"urn:isbn:0451450523" =~ URI::regexp
=> 0
That being said, as far as I know, Ruby doesn't have a default way to parse URLs specifically, so you'll most likely need a gem to do so. If you need to match URLs specifically in HTTP or HTTPS format, you could do something like this:
uri = URI.parse(my_possible_url)
if uri.kind_of?(URI::HTTP) or uri.kind_of?(URI::HTTPS)
# do your stuff
end
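A hedged sketch of wrapping that check in a helper, with a rescue for strings URI refuses to parse at all (the name http_url? is mine):
require 'uri'

def http_url?(my_possible_url)
  uri = URI.parse(my_possible_url)
  uri.kind_of?(URI::HTTP) or uri.kind_of?(URI::HTTPS)
rescue URI::InvalidURIError
  false
end

http_url?('http://hello.it')     # => true
http_url?('urn:isbn:0451450523') # => false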
I prefer the Addressable gem. I have found that it handles URLs more intelligently.
require 'addressable/uri'
SCHEMES = %w(http https)
def valid_url?(url)
parsed = Addressable::URI.parse(url) or return false
SCHEMES.include?(parsed.scheme)
rescue Addressable::URI::InvalidURIError
false
end
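For example (remember the whitelist above only allows http and https):
valid_url?('http://hello.it') # => true
valid_url?('ftp://hello.it')  # => false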
This is a fairly old entry, but I thought I'd go ahead and contribute:
String.class_eval do
def is_valid_url?
uri = URI.parse self
uri.kind_of? URI::HTTP
rescue URI::InvalidURIError
false
end
end
Now you can do something like:
if "http://www.omg.wtf".is_valid_url?
p "huzzah!"
end
For me, I use this regular expression:
/\A(http|https):\/\/[a-z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,5}(:[0-9]{1,5})?(\/.*)?\z/ix
Options:
i - case insensitive
x - ignore whitespace in regex
You can set this method to check URL validation:
def valid_url?(url)
return false if url.include?("<script")
url_regexp = /\A(http|https):\/\/[a-z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,5}(:[0-9]{1,5})?(\/.*)?\z/ix
url =~ url_regexp ? true : false
end
To use it:
valid_url?("http://stackoverflow.com/questions/1805761/check-if-url-is-valid-ruby")
Testing with invalid URLs:
http://ruby3arabi - result is invalid
http://http://ruby3arabi.com - result is invalid
http:// - result is invalid
http://test.com\n<script src=\"nasty.js\"> - result is invalid (we simply check for "<script")
127.0.0.1 - result is invalid (IP addresses are not supported)
Testing with valid URLs:
http://ruby3arabi.com - result is valid
http://www.ruby3arabi.com - result is valid
https://www.ruby3arabi.com - result is valid
https://www.ruby3arabi.com/article/1 - result is valid
https://www.ruby3arabi.com/websites/58e212ff6d275e4bf9000000?locale=en - result is valid
In general,
/^#{URI::regexp}$/
will work well, but if you only want to match http or https, you can pass those in as options to the method:
/^#{URI::regexp(%w(http https))}$/
That tends to work a little better if you want to reject protocols like ftp://.
This is a little bit old, but here is how I do it: use Ruby's URI module to parse the URL. If it can be parsed, it's a valid URL (though that doesn't mean it's accessible).
URI supports many schemes, plus you can add custom schemes yourself:
irb> uri = URI.parse "http://hello.it" rescue nil
=> #<URI::HTTP:0x10755c50 URL:http://hello.it>
irb> uri.instance_values
=> {"fragment"=>nil,
"registry"=>nil,
"scheme"=>"http",
"query"=>nil,
"port"=>80,
"path"=>"",
"host"=>"hello.it",
"password"=>nil,
"user"=>nil,
"opaque"=>nil}
irb> uri = URI.parse "http:||bra.ziz" rescue nil
=> nil
irb> uri = URI.parse "ssh://hello.it:5888" rescue nil
=> #<URI::Generic:0x105fe938 URL:ssh://hello.it:5888>
irb> uri.instance_values
=> {"fragment"=>nil,
"registry"=>nil,
"scheme"=>"ssh",
"query"=>nil,
"port"=>5888,
"path"=>"",
"host"=>"hello.it",
"password"=>nil,
"user"=>nil,
"opaque"=>nil}
See the documentation for more information about the URI module.
You could also use a regex, maybe something like the one at http://www.geekzilla.co.uk/View2D3B0109-C1B2-4B4E-BFFD-E8088CBC85FD.htm. Assuming that regex is correct (I haven't fully checked it), the following will show the validity of the URL.
url_regex = %r{((https?|ftp|file):((//)|(\\\\))+[\w\d:\##%/;$()~_?\+-=\\.&]*)}
urls = [
"http://hello.it",
"http:||bra.ziz"
]
urls.each do |url|
  if url =~ url_regex
    puts "%s is valid" % url
  else
    puts "%s not valid" % url
  end
end
The above example outputs:
http://hello.it is valid
http:||bra.ziz not valid