ruby sinatra how to redirect with regex - ruby

I am trying to move stuff at root to /en/ directory to make my little service multi-lingual.
So, I want to redirect this url
mysite.com/?year=2018
to
mysite.com/en/?year=2018
My code is like
get %r{^/(\?year\=\d{4})$} do |c|
  redirect "/en/#{c}"
end
but it seems like I never get #{c} part from the url.
Why is that? or are there just better ways to do this?
Thanks!

You can use the request.path variable to get the information you're looking for.
For example,
get "/something" do
  puts request.path # => "/something"
  redirect "/en#{request.path}"
end
However, if you are using query parameters (e.g. ?year=2000), you'll have to pass them along to the redirect target yourself.
Somewhat non-intuitively, there's a helper method for this in ActiveSupport (the library ActiveRecord is built on): Hash#to_param, an alias of Hash#to_query.
require 'active_support/core_ext/object/to_query'
get "/something" do
  puts params.to_param
  # if params[:year] is 2000, you'll get "year=2000"
  redirect "/en#{request.path}?#{params.to_param}"
end
Alternatively, you could write your own helper method pretty easily:
def hash_to_param_string(hash)
  # Join "key=value" pairs with "&" (no trailing separator; values are not URL-escaped)
  hash.map { |key, val| "#{key}=#{val}" }.join("&")
end
puts hash_to_param_string({key1: "val1", key2: "val2"})
# => "key1=val1&key2=val2"
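As for why the capture in the question is always empty: Sinatra matches route patterns against the path only, so the query string never reaches the regex. It is available separately through params or request.query_string. Here is a minimal sketch of building the redirect target with only the standard library. Note that localized_target and its prefix: argument are hypothetical names, not part of Sinatra.

```ruby
require 'uri'

# Hypothetical helper: prepend a locale prefix to a path and re-attach the
# query parameters, URL-encoding them with URI.encode_www_form.
def localized_target(path, params, prefix: '/en')
  query = URI.encode_www_form(params)
  target = "#{prefix}#{path}"
  query.empty? ? target : "#{target}?#{query}"
end

puts localized_target('/', { 'year' => '2018' })  # prints /en/?year=2018
```

Inside a route this could be called as redirect localized_target(request.path_info, request.GET), assuming the route is a plain get '/' and the localized pages live under /en.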

Related

Anemone Ruby spider - create key value array without domain name

I'm using Anemone to spider a domain and it works fine.
the code to initiate the crawl looks like this:
require 'anemone'
Anemone.crawl("http://www.example.com/") do |anemone|
  anemone.on_every_page do |page|
    puts page.url
  end
end
This very nicely prints out all the page urls for the domain like so:
http://www.example.com/
http://www.example.com/about
http://www.example.com/articles
http://www.example.com/articles/article_01
http://www.example.com/contact
What I would like to do is create an array of key value pairs using the last part of the url for the key, and the url 'minus the domain' for the value.
E.g.
[
['','/'],
['about','/about'],
['articles','/articles'],
['article_01','/articles/article_01']
]
Apologies if this is rudimentary stuff but I'm a Ruby novice.
I would define an array or hash first outside of the block of code and then add your key value pairs to it:
require 'anemone'
path_array = []
crawl_url = "http://www.example.com/"
Anemone.crawl(crawl_url) do |anemone|
  anemone.on_every_page do |page|
    path_array << page.url
    puts page.url
  end
end
From here you can then .map your array into a usable multi-dimensional array (page.url is a URI object, so convert it to a string first):
path_array.map { |x| [x.to_s[crawl_url.length..-1], x.to_s.gsub("http://www.example.com", "")] }
=> [["", "/"], ["about", "/about"], ["articles", "/articles"], ["articles/article_01", "/articles/article_01"], ["contact", "/contact"]]
Note that nested pages keep their full relative path as the key ("articles/article_01"); take the last path segment if you only want "article_01".
I'm not sure it will work in every scenario, but it should give you a good start on collecting and manipulating the data. Also, if you want key/value pairs, look into Ruby's Hash class for more on creating and using hashes.
The simplest and possibly least robust way to do this would be to use
page.url.split('/').last
to obtain your 'key'. You would need to test various edge cases to ensure it worked reliably.
edit: this will return 'www.example.com' as the key for 'http://www.example.com/' which is not the result you require
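One way around that edge case is to split the path rather than the whole URL. This is a rough sketch using the standard URI library; key_and_path is a hypothetical helper name, and the URLs are the ones from the question.

```ruby
require 'uri'

# Hypothetical helper: turn a URL into [last path segment, path without domain].
def key_and_path(url)
  path = URI.parse(url.to_s).path
  path = '/' if path.empty?            # "http://host" has an empty path
  key = path.split('/').last.to_s      # "/" splits to [], so the key becomes ""
  [key, path]
end

urls = [
  'http://www.example.com/',
  'http://www.example.com/about',
  'http://www.example.com/articles/article_01'
]
pairs = urls.map { |u| key_and_path(u) }
p pairs
# prints [["", "/"], ["about", "/about"], ["article_01", "/articles/article_01"]]
```

Inside the Anemone block you would call key_and_path(page.url) for each page, and pairs.to_h gives a Hash if you prefer real key/value pairs.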

Using Ruby to wait for a page to load an element

I currently have a working piece of Ruby that looks like this:
def error_message
  browser.span(:id => 'ctl00_cphMainContent_lblMessage').wait_until_present(30) do
    not errors.empty?
  end
  errors
end
However, I'd prefer something more like this:
span(:errors, :id => 'ctl00_cphMainContent_lblMessage')
##farther down##
def error_message
browser.errors.wait_until_present(30) do
etc...
I'm new to Ruby, but how can I do something like this, assuming it's possible?
Typically, you make use of the Watir::Wait.until or <element>.wait_until_present methods.
In this way, you could do something like:
# Earlier in code
@browser = Watir::Browser.start('http://mypage.com/')
# ...
errors = { my_first_error: "You forgot your keys in the car!" }
check_for_errors(errors[:my_first_error])
# Wherever
def check_for_errors(error, expiry=30)
  # Wait for the message span to appear with the expected text
  error_element = @browser.span(:id => 'ctl00_cphMainContent_lblMessage', :text => error)
  error_element.wait_until_present(expiry)
end
See the watir documentation for more information.
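If it helps to see what such a wait does, here is a rough sketch of the polling idea in plain Ruby. It is not Watir's implementation: wait_until, WaitTimeout, and the interval default are made up for illustration.

```ruby
# Hypothetical error class for this sketch.
class WaitTimeout < StandardError; end

# Call the block repeatedly until it returns a truthy value or the timeout passes.
def wait_until(timeout: 30, interval: 0.5)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    result = yield
    return result if result
    if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
      raise WaitTimeout, "condition not met within #{timeout} seconds"
    end
    sleep interval
  end
end
```

With something like this, the check from the question could read wait_until(timeout: 30) { !errors.empty? }, assuming errors is whatever collection your page code fills in.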

Adding a querystring to rack-rewrite response

I've got a simple 301 redirect to capture all non .com domains I have registered for my site as follows:
DOMAIN = 'www.mywebsite.com'
use Rack::Rewrite do
  r301 %r{.*}, "http://#{DOMAIN}$&", :if => Proc.new {|rack_env|
    rack_env['SERVER_NAME'] != DOMAIN && ENV["RACK_ENV"] == 'production'
  }
end
I'd like to add a querystring to the response to add the original domain in the format
?utm_source=#{rack_env['SERVER_NAME']}
But can't quite work out how not to crash the server :) Can it be done & retain any original query string?
It's unlikely that anyone will hit any subpages under the main domain, but when I drop the $& from the rewrite, and replace it with my string, it blows up with no errors in the logs...
I think the reason your original code won't work is because rack_env is not available within your second argument as it's a block argument to the third. Does that make sense?
You can however pass a Proc as the second argument of a redirect, so I think something like this should work (partially tested :)
DOMAIN = 'www.mywebsite.com'
ORIGINAL_REQUEST = Proc.new do |match, rack_env|
  # Absolute URL on the canonical domain, tagged with the host the request came in on
  "http://#{DOMAIN}/?utm_source=#{rack_env['SERVER_NAME']}"
end
use Rack::Rewrite do
  r301 %r{.*}, ORIGINAL_REQUEST, :if => Proc.new {|rack_env|
    rack_env['SERVER_NAME'] != DOMAIN && ENV["RACK_ENV"] == 'production'
  }
end
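To also retain any original query string, the target URL can be assembled from the Rack environment. The following is only a sketch: redirect_with_source is a hypothetical helper, and it assumes the standard Rack keys PATH_INFO, QUERY_STRING and SERVER_NAME are present.

```ruby
require 'uri'

# Hypothetical helper: rebuild the request on the canonical domain, keep the
# original query parameters, and append utm_source with the incoming host name.
def redirect_with_source(domain, rack_env)
  pairs = URI.decode_www_form(rack_env['QUERY_STRING'].to_s)
  pairs << ['utm_source', rack_env['SERVER_NAME']]
  "http://#{domain}#{rack_env['PATH_INFO']}?#{URI.encode_www_form(pairs)}"
end

env = { 'SERVER_NAME' => 'mywebsite.org', 'PATH_INFO' => '/about', 'QUERY_STRING' => 'a=1' }
puts redirect_with_source('www.mywebsite.com', env)
# prints http://www.mywebsite.com/about?a=1&utm_source=mywebsite.org
```

It could then be wrapped in the Proc passed as the second argument, for example Proc.new { |_match, rack_env| redirect_with_source(DOMAIN, rack_env) }.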

URI Response Code

I would like to use Ruby's OpenURI to check whether the URL can be properly accessed. So I would like to check its response code (4xx or 5xx means error, etc.) Is it possible to find that?
You can use the status method, which returns an array containing the status code and message.
require "open-uri"
# On Ruby 3.0 and later, Kernel#open no longer opens URLs; use URI.open
URI.open("http://www.example.org") do |f|
  puts f.base_uri #=> http://www.example.org
  puts f.status #=> ["200", "OK"]
end
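One thing to watch for: open-uri raises OpenURI::HTTPError on 4xx and 5xx responses, so the block above only runs on success. A small sketch that returns the status either way is below. fetch_status is a hypothetical helper, and its opener: argument is an assumption added only so the logic can be tried without network access.

```ruby
require 'open-uri'

# Hypothetical helper: return [code, message] for a URL, including error responses.
# By default the request goes through URI.open.
def fetch_status(url, opener: ->(u) { URI.open(u) { |f| f.status } })
  opener.call(url)
rescue OpenURI::HTTPError => e
  e.io.status   # the failed response is still available on the exception
end
```

Usage would be code, message = fetch_status("http://www.example.org"). Connection-level failures (DNS errors, timeouts) are not HTTP responses and are not rescued here.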

Using a Regex in the URI of a Mongrel Handler

I'm currently using Mongrel to develop a custom web application project.
I would like Mongrel to use a defined HTTP handler based on a regular expression. For example, every time someone calls a URL like http://test/bla1.js or http://test/bla2.js, the same HTTP handler is called to manage the request.
My code so far looks like this:
http_server = Mongrel::Configurator.new :host => config.get("http_host") do
  listener :port => config.get("http_port") do
    uri Regexp.escape("/[a-z0-9]+.js"), :handler => BLAH::CustomHandler.new
    uri '/ui/public', :handler => Mongrel::DirHandler.new("#{$d}/public/")
    uri '/favicon', :handler => Mongrel::Error404Handler.new('')
    trap("INT") { stop }
    run
  end
end
As you can see, I am trying to use a regex instead of a string here:
uri Regexp.escape("/[a-z0-9]+.js"), :handler => BLAH::CustomHandler.new
but that does not work. Any solution?
Thanks for that.
You should consider creating a Rack application instead. Rack is:
the standard for Ruby web applications
used internally by all popular Ruby web frameworks (Rails, Merb, Sinatra, Camping, Ramaze, ...)
much easier to extend
ready to be run on any application server (Mongrel, Webrick, Thin, Passenger, ...)
Rack has a URL mapping DSL, Rack::Builder, which allows you to map different Rack applications to particular URL prefixes. You typically save it as config.ru, and run it with rackup.
Unfortunately, it does not allow regular expressions either. But because of the simplicity of Rack, it is really easy to write an "application" (a lambda, actually) that will call the proper app if the URL matches a certain regex.
Based on your example, your config.ru may look something like this:
require "my_custom_rack_app" # Whatever provides your MyCustomRackApp.
js_handler = MyCustomRackApp.new
default_handlers = Rack::Builder.new do
  map "/public" do
    run Rack::Directory.new("my_dir/public")
  end
  # Uncomment this to replace Rack::Builder's 404 handler with your own:
  # map "/" do
  #   run lambda { |env|
  #     [404, {"Content-Type" => "text/plain"}, ["My 404 response"]]
  #   }
  # end
end
run lambda { |env|
  if env["PATH_INFO"] =~ %r{/[a-z0-9]+\.js}
    js_handler.call(env)
  else
    default_handlers.call(env)
  end
}
Next, run your Rack app on the command line:
% rackup
If you have mongrel installed, it will be started on port 9292. Done!
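If you end up with more than one pattern, the same idea generalizes to a small table of regex routes. This is just a sketch of the dispatching lambda described above: regex_router and the two stand-in apps are hypothetical, and it needs nothing beyond plain Ruby because a Rack app is only an object that responds to call(env).

```ruby
# Build a Rack-style app (a lambda taking env) that dispatches on PATH_INFO.
# routes is an array of [Regexp, app] pairs, tried in order.
def regex_router(routes, fallback)
  lambda do |env|
    _pattern, app = routes.find { |pattern, _app| pattern.match?(env['PATH_INFO'].to_s) }
    (app || fallback).call(env)
  end
end

# Stand-in apps for illustration only.
js_app    = ->(_env) { [200, { 'Content-Type' => 'application/javascript' }, ['// js']] }
not_found = ->(_env) { [404, { 'Content-Type' => 'text/plain' }, ['Not found']] }

app = regex_router([[%r{\A/[a-z0-9]+\.js\z}, js_app]], not_found)
status, _headers, _body = app.call({ 'PATH_INFO' => '/bla1.js' })
puts status  # prints 200
```

In config.ru you would pass the result to run, with your real handler in place of js_app and default_handlers as the fallback.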
You have to inject new code into part of Mongrel's URIClassifier, which is otherwise blissfully unaware of regular expression URIs.
Below is one way of doing just that:
#
# Must do the following BEFORE Mongrel::Configurator.new
# Augment some of the key methods in Mongrel::URIClassifier
# See lib/ruby/gems/XXX/gems/mongrel-1.1.5/lib/mongrel/uri_classifier.rb
#
Mongrel::URIClassifier.class_eval <<-EOS, __FILE__, __LINE__
  # Save original methods
  alias_method :register_without_regexp, :register
  alias_method :unregister_without_regexp, :unregister
  alias_method :resolve_without_regexp, :resolve
  def register(uri, handler)
    if uri.is_a?(Regexp)
      unless (@regexp_handlers ||= []).any? { |(re,h)| re==uri ? h.concat(handler) : false }
        @regexp_handlers << [ uri, handler ]
      end
    else
      # Original behaviour
      register_without_regexp(uri, handler)
    end
  end
  def unregister(uri)
    if uri.is_a?(Regexp)
      raise Mongrel::URIClassifier::RegistrationError, "\#{uri.inspect} was not registered" unless (@regexp_handlers ||= []).reject! { |(re,h)| re==uri }
    else
      # Original behaviour
      unregister_without_regexp(uri)
    end
  end
  def resolve(request_uri)
    # Try original behaviour FIRST
    result = resolve_without_regexp(request_uri)
    # If a match is not found with non-regexp URIs, try regexp
    # (blank? comes from ActiveSupport)
    if result[0].blank?
      (@regexp_handlers ||= []).any? { |(re,h)| (m = re.match(request_uri)) ? (result = [ m.pre_match + m.to_s, (m.to_s == Mongrel::Const::SLASH ? request_uri : m.post_match), h ]) : false }
    end
    result
  end
EOS
http_server = Mongrel::Configurator.new :host => config.get("http_host") do
  listener :port => config.get("http_port") do
    # Can pass a regular expression as URI
    # (URI must be of type Regexp, no escaping please!)
    # Regular expression can match any part of an URL, start with "^/..." to
    # anchor match at URI beginning.
    # The way this is implemented, regexp matches are only evaluated AFTER
    # all non-regexp matches have failed (mostly for performance reasons.)
    # Also, for regexp URIs, the :in_front is ignored; adding multiple handlers
    # to the same URI regexp behaves as if :in_front => false
    uri %r{^/[a-z0-9]+\.js}, :handler => BLAH::CustomHandler.new
    uri '/ui/public', :handler => Mongrel::DirHandler.new("#{$d}/public/")
    uri '/favicon', :handler => Mongrel::Error404Handler.new('')
    trap("INT") { stop }
    run
  end
end
Seems to work just fine with Mongrel 1.1.5.
