I want to scrape the "gem install" command from a rubygems.org gem page. It is not in the page text; it is the value of an input box. How can I scrape it?
This is the input box on rubygems.org whose value I want to scrape:
<input type="text" class="gem__code" id="install_text" value="gem install nokogiri" readonly="readonly">
My Ruby code:
require "httparty"
require "nokogiri"
req=HTTParty.get("https://rubygems.org/gems/nokogiri")
parse=Nokogiri::HTML(req.body)
val=parse.xpath("//*[@id=\"install_text\"]").text
puts val
Please help me out!
You were very close to the solution. I think the best option for working with ids is the at_css method, and the value has to be read from the element's attributes rather than from its text. Try this:
require "httparty"
require "nokogiri"
req = HTTParty.get("https://rubygems.org/gems/nokogiri")
doc = Nokogiri::HTML req
puts doc.at_css('[id="install_text"]').attributes["value"].value
Then run the script from the command line:
☸ sandas-nonpro (qa) in learn/ruby/stackoverflow via 💎 v3.0.1
❯ ruby ./nokogiri_rubygems.rb
gem install nokogiri
Or inspect the element in an interactive pry session:
[79] pry(main)> doc.at_css('[id="install_text"]').attributes
=> {"type"=>#(Attr:0xe1c8 { name = "type", value = "text" }),
"class"=>#(Attr:0xe1dc { name = "class", value = "gem__code" }),
"id"=>#(Attr:0xe1f0 { name = "id", value = "install_text" }),
"value"=>#(Attr:0xe204 { name = "value", value = "gem install nokogiri" }),
"readonly"=>#(Attr:0xe218 { name = "readonly", value = "readonly" })}
[80] pry(main)> doc.at_css('[id="install_text"]').attributes["value"].value
=> "gem install nokogiri"
I was trying to test with Selenium, but I cannot download a PDF; PDFs keep opening in the browser instead.
See my other post: RUBY: Selenium webdriver, setup to download pdf files instead of opening them
It was advised to try Watir, so I did, and I get the same result. Here is my Watir setup. Please advise on how to fix this issue.
require 'watir'
require 'pry'
prefs = {
download: {
prompt_for_download: false,
default_directory: '/Users/ar/pdf_downloads'
}
}
browser = Watir::Browser.new :chrome, options: {prefs: prefs}
# Goto Login Page (file)
url="file:///Users/ar/info.html"
browser.goto url
browser.button(id: 'formsubmit').click
sleep 5
# Goto info
info_url = 'https://webapp.domain.com/info'
browser.goto info_url
sleep 5
elements = browser.elements(css: "#ar-pdfreport a")
link = elements.first.attribute("href")
browser.goto link
There is a bug in Selenium-WebDriver v3.142.7 where using symbols for the prefs keys generates the wrong result, e.g. the download directory is not set. See https://github.com/SeleniumHQ/selenium/issues/7917 for more details.
Switch the symbols to Strings:
prefs = {
download: {
'prompt_for_download' => false,
'default_directory' => '/Users/ar/pdf_downloads'
},
plugins: {
'always_open_pdf_externally' => true
}
}
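If the prefs hash is built elsewhere with symbol keys, it can be converted before it is handed to Watir. The helper below is a hypothetical sketch, not part of Watir or Selenium: stringify_keys is a name invented here, and it converts every key, including the top-level download and plugins keys that the answer leaves as symbols, on the assumption that string keys are passed through unchanged at every level.

```ruby
# Recursively convert every Hash key to a String, leaving values untouched.
# (Hypothetical helper; not provided by Watir or Selenium.)
def stringify_keys(value)
  case value
  when Hash
    value.each_with_object({}) do |(key, val), out|
      out[key.to_s] = stringify_keys(val)
    end
  when Array
    value.map { |item| stringify_keys(item) }
  else
    value
  end
end

symbol_prefs = {
  download: {
    prompt_for_download: false,
    default_directory: '/Users/ar/pdf_downloads'
  },
  plugins: {
    always_open_pdf_externally: true
  }
}

string_prefs = stringify_keys(symbol_prefs)
p string_prefs
```

The result would then be passed the same way as in the question, i.e. options: {prefs: string_prefs}.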
I'm trying to automate my fedex processing via Ruby gem Mechanize.
I've set up the initial script similar to what has worked for me on the Amazon Seller Central page. However, I seem to be unable to fully log in to the fedex website and I appear to just be "staying" on the login page.
If you have any idea what my code is missing, it would be appreciated!
CODE
Dir.chdir 'C:\Ruby22\bin'
require "rubygems"
require "mechanize"
require "certified"
require "logger"
#Initialize and set agent settings:
agent = Mechanize.new
agent.user_agent_alias = 'Windows Chrome'
agent.follow_meta_refresh = true
agent.redirect_ok = true
#Log in to fedex
login_url = "https://www.fedex.com/dk/"
page_login = agent.get(login_url)
puts "Login Page: " + page_login.title.to_s
form = page_login.form_with(:name => "logonForm")
#form.fields.each { |f| puts f.name }
form.username = "************"
form['ap_signin_existing_radio'] = "1"
form.password = "*********"
agent.submit(form)
page_home = agent.get(login_url)
puts "Fedex Dashboard Page: " + page_home.title
I'm new to Ruby, and to using HTTParty, and was trying to follow the HTTParty examples from their github page to execute a basic POST. When I run the code below I get an error:
require 'pp'
require 'HTTParty'
require 'pry'
class Partay
include HTTParty
base_uri "http://<myapidomain>/search/semanticsearch/query/"
end
options= {
query: {
version: "0.4",
query: "lawyer"
}}
response = Partay.post(options)
puts response
The error I get is:
rbenv/versions/2.2.0/lib/ruby/2.2.0/uri/common.rb:715:in `URI': bad argument (expected URI object or URI string) (ArgumentError)
from ~/.ruby/2.2.0/gems/httparty-0.13.3/lib/httparty/request.rb:47:in `path='
from ~/.ruby/2.2.0/gems/httparty-0.13.3/lib/httparty/request.rb:34:in `initialize'
from ~/.ruby/2.2.0/gems/httparty-0.13.3/lib/HTTParty.rb:539:in `new'
from ~/.ruby/2.2.0/gems/httparty-0.13.3/lib/HTTParty.rb:539:in `perform_request'
from ~/.ruby/2.2.0/gems/httparty-0.13.3/lib/HTTParty.rb:491:in `post'
from json-to-csv.rb:16:in `<main>'
What I am looking for is calling a post that receives JSON in the same way that calling this URL works:
http://somedomain.com/search/semanticsearch/query/?version=0.4&query=lawyer
Noting a solution that uses the suggested gem, unirest:
require 'unirest'
url = "http://somedomain.com/search/semanticsearch/query"
response = Unirest.post url,
headers:{ "Accept" => "application/json" },
parameters:{ :version => 0.4, :query => "lawyer" }
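As for the original error: HTTParty's post expects a path string as its first argument and the options hash second, so Partay.post(options) ends up handing a Hash to URI(), which is what the ArgumentError reports. To see what the query hash should turn into, the sketch below builds the target URL with Ruby's standard library only. It uses neither HTTParty nor unirest, and it constructs the request without sending it; somedomain.com is the placeholder host from the question.

```ruby
require "uri"
require "net/http"

# Base endpoint from the question (placeholder host).
uri = URI("http://somedomain.com/search/semanticsearch/query/")

# The same parameters the question passes under the :query option.
params = { version: "0.4", query: "lawyer" }
uri.query = URI.encode_www_form(params)

# Build a POST request object that asks for JSON; nothing is sent here.
request = Net::HTTP::Post.new(uri)
request["Accept"] = "application/json"

puts uri
```

The printed URL matches the one that works in the browser, which is a quick way to confirm the parameters are being encoded as intended before switching HTTP libraries.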
I have a WSDL URL that I want to send requests to and get responses from. This is my code (using the savon gem):
client = Savon.new('http://services.chromedata.com/Description/7a?wsdl')
service = :Description7a
port = :Description7aPort
operation = :getDivisions
division = client.operation(service, port, operation)
I am able to print the example_body like this:
division.example_body
=> {:DivisionsRequest=>{:accountInfo=>{:_number=>"string", :_secret=>"string", :_country=>"string", :_language=>"string", :_behalfOf=>"string"}, :_modelYear=>"int"}}
and I'm able to set values like this:
division.body = {.........}
The other operations are listed like this:
operationlist = client.operations(service, port)
=> ["getVersionInfo", "getModelYears", "getDivisions", "getSubdivisions", "getModels", "getStyles", "describeVehicle", "getCategoryDefinitions", "getTechnicalSpecificationDefinitions"]
I then tried describeVehicle:
desc_veh = client.operation(service, port, "describeVehicle")
whose example_body looks like this:
desc_veh.example_body
=> {:VehicleDescriptionRequest=>{}}
so I am unable to set values for desc_veh.body and then use the .call function.
I don't know whether this is a problem with the savon gem or with the WSDL.
Your code could look like this:
gem "savon", "~> 2.0"
require "savon"
client = Savon.client(
:wsdl => 'http://services.chromedata.com/Description/7a?wsdl',
:convert_request_keys_to => :camelcase,
:log => true,
:log_level => :debug,
:pretty_print_xml => true
)
res = client.call(:get_divisions,
message: { :param1 => 'value1', :param2 => 'value2' }
)
print res.to_hash
The parameters are simply a hash in key/value pairs.
I've captured the login HTTP headers using the Firefox plugin LiveHTTPHeaders.
I've found the following URL and variables:
POST /login
email=myemail%40gmail.com&password=something&remember=1&loginSubmit=Login
And here's the code I am running:
require 'rubygems'
require 'mechanize'
browser = Mechanize.new
browser.post('http://www.mysite.com/login',
[
["email","myemail%40gmail.com"],
["password","something"],
["remember","1"],
["loginSubmit","Login"],
["url"=>""]
]
) do |page|
puts page.body
end
However, this gives me nothing! Is something wrong with my POST parameters?
post() doesn't take a block. Try this:
page = browser.post('http://www.mysite.com/login', {
"email" => "myemail%40gmail.com",
"password" => "something",
"remember" => "1",
"loginSubmit" => "Login",
"url" => ""
})
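One more thing that may matter here: the body captured by LiveHTTPHeaders is already percent-encoded (%40 is an encoded @). Assuming Mechanize form-encodes the values it is given, as HTTP clients usually do, passing "myemail%40gmail.com" would encode the % a second time. The sketch below uses only Ruby's standard URI module, not Mechanize itself, to show the effect.

```ruby
require "uri"

raw_email     = "myemail@gmail.com"      # what the user actually typed
captured_form = "myemail%40gmail.com"    # as it appears in the captured POST body

# Encoding the raw value reproduces the captured request body.
encoded_once = URI.encode_www_form("email" => raw_email)

# Encoding the already-encoded value escapes the "%" again.
encoded_twice = URI.encode_www_form("email" => captured_form)

# Decoding the captured value recovers the raw address to pass to the client.
decoded = URI.decode_www_form_component(captured_form)

puts encoded_once
puts encoded_twice
puts decoded
```

If the login still fails after the block is removed, passing the decoded address instead of the captured one is worth trying.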