Convert string with white space into URL - ruby

I'm using Ruby and Google's reverse geocode YQL table to automate some search queries I have. The problem I've hit is turning the query into a legal URL format: the encoding I'm using is producing illegal URLs. The query I'm running is as follows:
query="select * from google.geocoding where q='40.714224,-73.961452'"
pQuery = CGI::escape(query)
The eventual output for the processed query looks like this:
http://query.yahooapis.com/v1/public/yql?q=select+%2A+from+google.geocoding+where+q%3D%2740.3714224%2C--73.961452%27+format=json&diagnostics=true&env=store%3A%2F%2Fdatatables.org%2Falltableswithkeys&callback=
Alas, the URL is illegal. When I check what the query should look like in the YQL console, I get the following:
http://query.yahooapis.com/v1/public/yql?q=select%20*%20from%20google.geocoding%20where%20q%3D%2240.714224%2C-73.961452%22&format=json&diagnostics=true&env=store%3A%2F%2Fdatatables.org%2Falltableswithkeys&callback=
As you can hopefully see :), the encoding is all wrong. Does anyone know how I can go about generating correct URLs?

If you want to escape a URI, you should use URI::escape:
require 'uri'
URI.escape("select * from google.geocoding where q='40.714224,-73.961452'")
# => "select%20*%20from%20google.geocoding%20where%20q='40.714224,-73.961452'"

Related

How to work with base64 encoded params that include dynamic variables in Jmeter

I am working on a performance script in JMeter that contains a number of HTTP requests. One of the parameters I pass in my request will always be formatted as follows:
{"a":"transition9","ap":"203867"}
Everything about the above remains constant, with the exception of "ap". I need to pull "ap" from a regular expression extractor, which I can do.
So at the end of the day the above will actually look something like this:
{"a":"transition9","ap":"${regexExtractedValue}"}
Here is the really tricky part. If I can achieve the above, I then need to base64 encode the value, which I know can be done using ${__base64Encode(test string)}. See https://jmeter-plugins.org/wiki/Functions/#base64Encodesupfont-color-gray-size-1-since-1-2-0-font-sup.
I have tried a number of approaches, mainly involving splitting up the hardcoded values and trying to combine them with the dynamic values, but the comma seems to throw it off. An example of something I have tried:
prefix = eyJhIjoidHJhbnNpdGlvbjkiLCJhcCI6Ij
ap = ${__base64Encode(203867"})
Then you would combine the two, and the value being passed into the param would look something like this:
{"stuff":"thing","__Action":"${prefix}${app}","__Scroll":"base64:MA=="}
This yields strange results. Is there a way to get what I need here?
In the parameter value I used this format:
${prefix}${__base64Encode("${post}"})}
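As an aside on why stitching pre-encoded fragments together misbehaves: base64 works on 3-byte groups, so Base64(prefix) + Base64(rest) only equals Base64(prefix + rest) when the prefix length is a multiple of 3. A quick Ruby illustration (Ruby only because the rest of this page uses it; the strings are taken from the question):
require 'base64'
prefix = '{"a":"transition9","ap":"'   # 25 bytes, not a multiple of 3
rest   = '203867"}'
# Encoding the pieces separately does not reproduce the encoding of the whole string.
Base64.strict_encode64(prefix + rest) == Base64.strict_encode64(prefix) + Base64.strict_encode64(rest)
# => false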

Request::getQueryString() without some parameters

I am using the following code to append the query string to two links, but I want to exclude the page parameter of the pagination from the query string.
<li>Teachers</li>
<li>Courses</li>
What is the way to do it? I tried the following code, but it generates an error.
<li>Teachers</li>
<li>Courses</li>
Well, getQueryString() just returns a string. Instead you can use Request::except() directly and then call http_build_query() to generate the query string:
<li><a href="...?{{ http_build_query(Request::except('page')) }}">Teachers</a></li>
Note that if you have POST values, those will be included too. If you want to avoid that, do this:
<li><a href="...?{{ http_build_query(array_except(Request::query(), 'page')) }}">Teachers</a></li>

Strange characters returned after screen scraping using Ruby/Nokogiri?

I'm using Ruby and Nokogiri to scrape data off a client's legacy system.
The text I'm getting contains a trademark symbol. But when I display it on the console or save it to the database, the TM gets converted to a different character.
Diet™ BECOMES Dietâ¢
I'm pretty sure it's just an encoding problem and I'm pretty sure Ruby has an easy way to deal with it, but after several minutes of googling and trying a few obvious options, I'm not any closer.
Thanks in advance!
You have an encoding mismatch, but you haven't told us enough to help you.
Things to check:
What encoding does the server say their page is? It'll be in the HTTP headers returned.
Is the document REALLY encoded as the server says, or are there characters that are not in that codeset?
Typically, you'll get documents as UTF-8, ISO-8859-1 or Win-1252, so try using those values to give Nokogiri a hint. The documentation for Nokogiri::HTML.parse says:
parse(thing, url = nil, encoding = nil, options = XML::ParseOptions::DEFAULT_HTML, &block)
Where:
encoding is the encoding that should be used when processing the document.
One way to figure out what the server is sending back is:
require 'open-uri'
open('http://www.example.net') { |io| io.charset }
# => "iso-8859-1"
Warning: What the server sends back is not necessarily what the content really is, so it's only a preliminary hint. The document returned could be anything, and at that point you're on your own to figure out what it is.
Typically we use Nokogiri::HTML('some html to parse'), but you can use:
Nokogiri::HTML('some html to parse', nil, 'UTF-8')
Look at Ruby's Encoding to figure out what the available codesets are:
Encoding.constants
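Putting those pieces together, a minimal sketch (the URL is just an example, and as noted above the server-reported charset is only a hint):
require 'nokogiri'
require 'open-uri'
url = 'http://www.example.net'
html, charset = open(url) { |io| [io.read, io.charset] }
# Hand the server-reported charset to Nokogiri as its encoding hint.
doc = Nokogiri::HTML(html, nil, charset)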

Elasticsearch URI based query with AND operator

How do I specify AND operation in URI based query? I'm looking for something like:
http://localhost:9200/_search?q="profiletype:student AND username:s*"
For URI search in Elasticsearch, you can use the AND operator, profiletype:student AND username:s*, as you say, but without the quotes:
_search?q=profiletype:student AND username:s*
You can also set the default operator to AND
_search?q=profiletype:student username:s*&default_operator=AND
Or you can use the + operator for terms that must be present, i.e. one would use +profiletype:student +username:s* as the query string. This doesn't work without URL encoding, though. In URL encoding + is %2B and space is %20, therefore the alternative would be
_search?q=%2Bprofiletype:student%20%2Busername:s*
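As a quick Ruby aside (Ruby only because the rest of this page uses it), ERB::Util.url_encode produces that percent-encoding for you:
require 'erb'
# '+' becomes %2B and the space becomes %20.
ERB::Util.url_encode('+profiletype:student +username:s*')
# => "%2Bprofiletype%3Astudent%20%2Busername%3As%2A"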
According to the documentation, it should work as you described it. See http://www.elasticsearch.org/guide/reference/query-dsl/query-string-query.html
That said, you can also use the following:
http://localhost:9200/_search?q="+profiletype:student +username:s*"
You can try the lines below:
http://localhost:9200/_search?q=profiletype:student%20AND%20username:s*
http://localhost:9200/_search?q=profiletype:student AND username:s*
I had to combine the default operator and the + sign to make it work:
curl -XGET 'localhost:9200/apm/inventory/_search?default_operator=AND&q=tenant_id:t2+_all:node_1'

Hpricot error parsing special characters in URI

I'm working on a Ruby script to grab historical stock prices from Yahoo, using Hpricot to parse the pages. This is mostly straightforward: the URL is "http://finance.yahoo.com/q/hp?s=TickerSymbol". For example, to look up Google, I would use "http://finance.yahoo.com/q/hp?s=GOOG".
Unfortunately, it breaks down when I'm looking up the price of an index. The indexes are prefixed with a caret, such as "http://finance.yahoo.com/q/hp?s=^DJI" for the Dow.
The line:
ticker_symbol = '^DJI'
doc = Hpricot(open("http://finance.yahoo.com/q/hp?s=#{ticker_symbol}"))
throws this exception:
bad URI(is not URI?): http://finance.yahoo.com/q/hp?s=^DJI
Hpricot chokes on the caret (I think because the underlying Ruby URI library does). Is there a way to escape that character or force the library to try it?
Well, don't I feel dumb. Five more minutes and I got this working:
doc = Hpricot(open(URI.encode("http://finance.yahoo.com/q/hp?s=#{ticker_symbol}")))
So if anyone else is wondering, that's how you do it. facepalm
The escape for ^ is %5E; you could do a straight substitution on the URL.
http://finance.yahoo.com/q/hp?s=%5EDJI
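A slightly more general variant is to escape just the ticker symbol rather than the whole URL; CGI.escape here is my suggestion rather than something from the original answers:
require 'cgi'
require 'open-uri'
require 'hpricot'
ticker_symbol = '^DJI'
# CGI.escape turns the caret into %5E before the URL is built.
url = "http://finance.yahoo.com/q/hp?s=#{CGI.escape(ticker_symbol)}"
doc = Hpricot(open(url))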
