ruby code to search and get a string from a html content - ruby

I am new to Ruby.
I am trying to fetch a webpage's contents, then search that response and return a string from it.
The following code returns the webpage as HTML:
require 'rubygems'
require 'uri'
require 'net/http'
AppArgs = Array.new
def get()
content = Net::HTTP.get('integration.twosmiles.com', '/status')
puts content
end
get()
HTML content:
<!-- PAGE CONTENT -->
<div class="container-fluid page-content">
<div class="row-fluid">
<h1>Status</h1>
<p>The rails app is up. Nothing to see here, move on.</p>
<br>
<p>uptime:</p>
22:09:18 up 66 days, 22:37, 0 users, load average: 0.00, 0.01, 0.05
<br>
<br><br>
<p>other</p>
# On branch deploy
<br>
commit bc1407b29697bab36bc2f5e35aa197228181e225
<br>
</div>
</div>
<!-- END PAGE CONTENT -->
Above is part of the web page content. From this content I want to find the line commit bc1407b29697bab36bc2f5e35aa197228181e225
and return only the key value bc1407b29697bab36bc2f5e35aa197228181e225. How is this possible using Ruby code?

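Note that get() in the question returns nil, because puts is its last expression, so apply the regex to the fetched string directly:
content = Net::HTTP.get('integration.twosmiles.com', '/status')
key = content[/commit\s+([a-f0-9]{10,})/i, 1]
puts key
The pattern matches the word commit, then one or more whitespace characters, then captures a run of at least 10 hexadecimal characters. The /i flag makes the match case-insensitive, and the second argument 1 returns only the first capture group.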
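To try the pattern without hitting the network, here is a minimal sketch that runs it against a sample string. SAMPLE_HTML is copied from the question's page fragment, and extract_commit is a hypothetical helper name introduced only for illustration.

```ruby
# Sample markup standing in for the Net::HTTP.get response
# (assumption: the live page keeps the layout shown in the question).
SAMPLE_HTML = <<~HTML
  <p>other</p>
  # On branch deploy
  <br>
  commit bc1407b29697bab36bc2f5e35aa197228181e225
  <br>
HTML

# Hypothetical helper: returns the first commit hash found in the HTML,
# or nil when no "commit <hex>" line is present.
def extract_commit(html)
  html[/commit\s+([a-f0-9]{10,})/i, 1]
end

puts extract_commit(SAMPLE_HTML)
```

String#[] with a regex and a capture index returns nil when nothing matches, so the caller can check for nil instead of rescuing an exception.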

Related

Scraping the href value of anchor in Ruby

I'm working on a project where I have to scrape a "website," which is just an HTML file in one of the local folders. I've been trying to scrape down to the href value (a URL) of the anchor tag for each student object. I am also scraping for other things, so ignore the rest. Here is what I have so far:
def self.scrape_index_page(index_url) #responsible for scraping the index page that lists all of the students
#return an array of hashes in which each hash represents one student.
html = index_url
doc = Nokogiri::HTML(open(html))
# doc.css(".student-name").first.text
# doc.css(".student-location").first.text
#student_card = doc.css(".student-card").first
#student_card.css("a").text
end
Here is one of the student profiles. They are all the same, so I'm just interested in scraping the href url value.
<div class="student-card" id="eric-chu-card">
<a href="students/eric-chu.html">
<div class="view-profile-div">
<h3 class="view-profile-text">View Profile</h3>
</div>
<div class="card-text-container">
<h4 class="student-name">Eric Chu</h4>
<p class="student-location">Glenelg, MD</p>
</div>
</a>
</div>
thanks for your help!
Once you get an anchor tag in Nokogiri, you can get the href like this:
anchor["href"]
So in your example, you could get the href by doing the following:
student_card = doc.css(".student-card").first
href = student_card.css("a").first["href"]
If you wanted to collect all of the href values at once, you could do something like this:
hrefs = doc.css(".student-card a").map { |anchor| anchor["href"] }

Run Ruby file from html form submit

I have a Ruby program that reads a file and returns a certain output. I have to now create a web app of this program using Sinatra. I created a form with all the file options and I want to now run that Ruby code with that selected file from the form after the submit button is pressed.
Basically, I’m not sure how to get this external Ruby program to run with the filename that was selected by the user from the HTML form.
The Ruby program (example.rb) starts with the definition def read_grammar_defs(filename).
# sinatra_main.rb
require 'sinatra'
require 'sinatra/reloader' if development? #gem install sinatra-contrib
require './rsg.rb'
get '/' do
erb :home
end
post '/p' do
# call program to read file with the parameter from form
end
// layout.erb
<!doctype html>
<html lang="en">
<head>
<title><%= @title || "RSG" %></title>
<meta charset="UTF8">
</head>
<body>
<h1>RubyRSG Demo</h1>
<p>Select grammar file to create randomly generated sentence</p>
<form action="/p" method="post">
<select name="grammar_file">
<option value="Select" hidden>Select</option>
<option value="Poem">Poem</option>
<option value="Insult">Insult</option>
<option value="Extension-request">Extension-request</option>
<option value="Bond-movie">Bond-movie</option>
</select>
<br><br>
<button type="submit">submit</button>
</form>
<section>
<%= yield %>
</section>
</body>
</html>
The easiest way is as follows:
Package the example.rb code into a class or module like so:
class FileReader
def self.read_grammar_defs(filename)
# ...
end
end
Require the file from your Sinatra server.
Inside the post action, read the params and call the method:
post '/p' do
@result = FileReader.read_grammar_defs(params[:grammar_file])
erb :home
end
With this code, submitting the form populates the @result variable and renders the :home template. Instance variables are accessible from ERB, so you could access it from there if you wanted to display the result.
There is one potential issue with this, though: when the page is rendered, the URL will still say "your_host.com/p", and if the user reloads the page they will get a 404 / "route not found" error because there is no get "/p" defined.
As a workaround, you can redirect to '/' and use the session, as described in this StackOverflow answer or Sinatra's official FAQ, to pass the result value.
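Since read_grammar_defs takes a filename while the form submits an option value such as "Poem", params[:grammar_file] has to be mapped to a path first. The following is only a sketch under assumptions: GRAMMAR_DIR, GRAMMAR_EXT, and grammar_path are hypothetical names, and the directory and extension are guesses about where the grammar files live.

```ruby
# Hypothetical location and extension of the grammar files; adjust to the real layout.
GRAMMAR_DIR = 'grammars'
GRAMMAR_EXT = '.g'

# Option values copied from the <select> in layout.erb.
ALLOWED_GRAMMARS = %w[Poem Insult Extension-request Bond-movie].freeze

# Hypothetical helper: returns the file path for a submitted option value,
# or nil when the value is not one of the known options.
def grammar_path(selection)
  return nil unless ALLOWED_GRAMMARS.include?(selection)

  File.join(GRAMMAR_DIR, "#{selection}#{GRAMMAR_EXT}")
end

puts grammar_path('Poem')
puts grammar_path('Select').inspect
```

Checking the value against a fixed list means the placeholder "Select" option, or any hand-crafted value, never reaches the file-reading code. Inside the post action the call would then look like FileReader.read_grammar_defs(grammar_path(params[:grammar_file])), with a nil check before it.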

Undefined Method Join-error when running CarrierWave and Sinatra.

I'm following the Gentle Introduction to CarrierWave tutorial using the Sinatra web framework. When I run my app, it starts just fine and asks me to upload a file. However, while uploading the file, the app throws an "undefined method `join' for #<String:0x3480d50>" error.
I've looked around a little bit on the internet and I found this issue at github where they say that the error may be due to incompatibilities between Rack and Sinatra or for having installed duplicate versions of Sinatra.
Does anybody know what's happening?
My uploader_app:
require 'carrierwave'
require 'sinatra'
require 'sqlite3'
require 'sequel'
require 'carrierwave/sequel'
DB = Sequel.sqlite
DB.create_table :uploads do
String :file
end
# uploader
class MyUploader < CarrierWave::Uploader::Base
storage :file
end
# model
class Upload < Sequel::Model
mount_uploader :file, MyUploader
end
# sinatra app
get '/' do
@uploads = Upload.all
erb :index
end
post '/' do
upload = Upload.new
upload.file = params[:image]
upload.save
redirect to('/')
end
__END__
@@ index
<!DOCTYPE html>
<html>
<body>
<div>
<form action="/" method="post" enctype="multipart/form-data">
<input type="file" name="image" />
<input type="submit" name="submit" value="Upload" />
</form>
<% @uploads.each do |upload| %>
<img src="<%= upload.file.url %>" />
<% end %>
</div>
</body>
</html>
The error is occurring on this line in the Carrierwave Library:
path = encode_path(file.path.gsub(File.expand_path(root), ''))
It fails because root is nil, so File.expand_path(root) raises an error. I don't know why root isn't set, but the following code (that I modified from this answer) worked for me:
CarrierWave.configure do |config|
config.root = settings.root
end
I just added it to the code after declaring the Sequel class and before defining the route. Probably best to stick it in a configure block too. Note that settings.root in the code above is Sinatra's root setting.
This doesn't seem to be caused by the current problems between Rack 1.6.0 and Sinatra 1.4.5 as that's what I was running, although I'm on Ruby v2.1.2 as I mentioned in the comments above.
Depending on what you want, Sinatra's root might not be the best place to put things, as I ended up with a directory inside the project root called "uploads" which had the files in, but config.root obviously needs to be set to something.
Hope that helps.

How do I change a dom node input and parse relevant information with Mechanize?

How do I change the dom node input and parse relevant information with Mechanize?
I want the website to show a range of trucks manufactured from 2010 upwards so I can parse the relevant information.
What I want it to do:
Visit https://www.kleyntrucks.com/trucks/tractorunit/
Set "MATRICULATION YEAR" (i.e. year of manufacture) to the range 2010 to 2014 (http://d.pr/i/kDky)
Scrape information about all the trucks that are listed (and are manufactured between 2010 and 2014)
This is the code:
require "mechanize"
@url = "https://www.kleyntrucks.nl/vrachtwagens/trekker/"
a = Mechanize.new do |agent|
agent.user_agent_alias = 'Mac Safari'
agent.follow_meta_refresh = true
end
a.get(@url) do |page|
# Put range input to "2010"
bouwjaar_range_field = page.search("#imprp0")
bouwjaar_range_details = bouwjaar_range_field.search(".details")
input = bouwjaar_range_details.search("input")[0]
input['value'] = "2010"
puts bouwjaar_range_field
end
This is the output:
<li id="imprp0" class="">
<a name="bouwjaar"></a>
<div class="title">Bouwjaar</div>
<div class="details">
<input type="hidden" name="slider_from" value="2010"><input type="hidden" name="slider_till" value="2014"><input type="hidden" name="slider_selected_from" value="0"><input type="hidden" name="slider_selected_till" value="0"><input type="hidden" name="slider_units" value="">
</div>
</li>
It does not show all the truck related information I need.
Any ideas?
Unfortunately, you won't be able to use Mechanize that easily because the page is using AJAX updates.
Unlike a browser, Mechanize doesn't execute any Javascript on the page, and thus doesn't update the results via AJAX when you change an input.
It's still possible to grab the results using Mechanize, but you'll have to craft the request with the necessary parameters manually (the request that is posted with AJAX; you can see it with Developer Tools), post it, and parse the result page.
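As a rough sketch of what crafting that request by hand could look like, the snippet below builds (but does not send) a form-encoded POST with Ruby's standard Net::HTTP. Everything specific here is an assumption: AJAX_URL is a made-up placeholder for the endpoint you would find in Developer Tools, and reusing the hidden input names slider_selected_from and slider_selected_till as parameter names is a guess based on the output in the question. With Mechanize itself, the analogous call would be agent.post(url, params).

```ruby
require 'net/http'
require 'uri'

# Hypothetical endpoint: replace with the real AJAX URL seen in Developer Tools.
AJAX_URL = URI('https://www.kleyntrucks.com/hypothetical-ajax-endpoint')

# Hypothetical helper: builds a form-encoded POST carrying the year range.
# The parameter names are assumed to match the hidden inputs on the page.
def build_year_filter_request(from_year, till_year)
  request = Net::HTTP::Post.new(AJAX_URL)
  request.set_form_data(
    'slider_selected_from' => from_year.to_s,
    'slider_selected_till' => till_year.to_s
  )
  request
end

request = build_year_filter_request(2010, 2014)
puts request.method
puts request.body
```

Building the request separately from sending it makes it easy to compare the body against what the browser posts before pointing it at the real site.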

Watir Webdriver : Entering text in <p> tag within an iframe

I am really stuck now. I have an iframe in which there is a <p> tag where I want to send some text, but I am just not able to do it.
HTML:
<iframe id="edit-field-verdict-0-value_ifr" frameborder="0" src="javascript:""" style="width: 100%; height: 100px;">
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head xmlns="http://www.w3.org/1999/xhtml">
<body id="tinymce" class="mceContentBody " spellcheck="false" dir="ltr">
<p>
<br mce_bogus="1">
</p>
</body>
</html>
</iframe>
The code that I have tried is:
@browser.elements(:xpath => '//*[@id="tinymce"]').p.send_keys [:control, 'a']
The error that I am getting is:
undefined method `elements' for #<String:0x24ba570> (NoMethodError)
I also tried
$browser.frame(:id,'edit-field-verdict-0-value_ifr').html.body(:id,'tinymce').p.send_keys [:control, 'a']
But as the body is not recognized by Watir, I tried elements_by_xpath as well. It didn't work.
How can I make this thing work?
For the first attempt, the error message is saying that @browser is a string rather than a Watir::Browser object. You should verify that @browser is correctly set. Based on your second example, perhaps it is meant to be the global variable $browser.
For the second attempt, body is supported in Watir. However, html will return the page's html rather than the html element. Given that there should only be one body element, the html element can be omitted.
$browser.frame(:id,'edit-field-verdict-0-value_ifr').body(:id,'tinymce').p.send_keys [:control, 'a']
But also keep in mind that you only need to include the frame method (to tell watir to look inside the frame) and as little as needed to reliably find the element you are interacting with. Anything extra is just making the code more verbose, and also perhaps making things more brittle and easy to break. So the above could be shortened down to just
$browser.frame(:id,'edit-field-verdict-0-value_ifr').p.send_keys [:control, 'a']
Based on the id of the element you are testing, I assume it is a WYSIWYG editor. You should look at the Watir-Webdriver page for an example - http://watirwebdriver.com/wysiwyg-editors/. The TinyMCE Editor example from the webpage:
require 'watir-webdriver'
b = Watir::Browser.new
b.goto 'http://tinymce.moxiecode.com/tryit/full.php'
b.execute_script("tinyMCE.get('content').execCommand('mceSetContent',false, 'hello world' );")
b.frame(:id => "content_ifr").send_keys 'hello world again'
