I am performing a screen grab to get football results, and the score comes as a string, 2-2 for example. Ideally I would like to split that score into home_score and away_score, which are then saved into my model for each result.
At the moment I do this:
def get_results # Get me all results
doc = Nokogiri::HTML(open(RESULTS_URL))
days = doc.css('.table-header').each do |h2_tag|
date = Date.parse(h2_tag.text.strip).to_date
matches = h2_tag.xpath('following-sibling::*[1]').css('tr.report')
matches.each do |match|
home_team = match.css('.team-home').text.strip
away_team = match.css('.team-away').text.strip
score = match.css('.score').text.strip
Result.create!(home_team: home_team, away_team: away_team, score: score, fixture_date: date)
end
end
end
From some further reading I can see that you can use the .split method:
.split("x").map(&:to_i)
So would I be able to do this?
score.each do |s|
home_score, away_score = s.split("-").map(&:to_i)
Result.create!(home_score: home_score, away_score: away_score)
end
But how to integrate this into my current setup is what's throwing me, and that's assuming my logic is even correct. I still want home_score and away_score to be assigned to the correct result.
Thanks in advance for any help.
EDIT
OK, so far the answer is no, I cannot do it this way. After running the rake task I get an error:
undefined method `each' for "1-2":String
The reason .each doesn't work is that each was a method of String in Ruby 1.8 and was removed in Ruby 1.9. I have tried each_char, which saves some results and not others, and when it does save, home_score and away_score are not assigned correctly.
Answer
As @seph pointed out, the each was not needed. In case it helps anyone else, my final task looks like this:
def get_results # Get me all results
doc = Nokogiri::HTML(open(RESULTS_URL))
days = doc.css('.table-header').each do |h2_tag|
date = Date.parse(h2_tag.text.strip).to_date
matches = h2_tag.xpath('following-sibling::*[1]').css('tr.report')
matches.each do |match|
home_team = match.css('.team-home').text.strip
away_team = match.css('.team-away').text.strip
score = match.css('.score').text.strip
home_score, away_score = score.split("-").map(&:to_i)
Result.create!(home_team: home_team, away_team: away_team, fixture_date: date, home_score: home_score, away_score: away_score)
end
end
end
No need for the each. Do this:
home_score, away_score = score.split("-").map(&:to_i)
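To see why the each_char attempt misbehaves while the plain split works, here is a minimal sketch in plain Ruby. parse_score is a hypothetical helper name, not part of the original task, and returning nil for text that is not a played score (for example "P-P" for a postponed match) is an assumption about what the scraped page might contain.

```ruby
# each_char walks the string one character at a time, so the dash is
# handed to the block as if it were a score too.
"1-2".each_char.to_a # three elements: "1", "-", "2"

# Hypothetical helper: split a "home-away" string into two integers.
# Returns nil when the text does not look like a played score.
def parse_score(score)
  match = score.strip.match(/\A(\d+)\s*-\s*(\d+)\z/)
  return nil unless match

  [match[1].to_i, match[2].to_i]
end

home_score, away_score = parse_score("2-2")
puts "home=#{home_score} away=#{away_score}"
```

Inside the matches.each loop, a nil result could be used to skip the row instead of saving zeros for a match that was never played.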
Related
I'm trying to scrape a website; however, I cannot seem to get my while loop to break out once it hits a page with no more information:
def scrape_verse_items(keyword)
pg = 1
while pg < 1000
puts "page #{pg}"
url = "https://www.bible.com/search/bible?page=#{pg}&q=#{keyword}&version_id=1"
doc = Nokogiri::HTML(open(url))
items = doc.css("ul.search-result li.reference")
error = doc.css('div#noresults')
until error.any? do
if keyword != ''
item_hash = {}
items.each do |item|
title = item.css("h3").text.strip
content = item.css("p").text.strip
item_hash[title] = content
end
else
puts "Please enter a valid search"
end
if error.any?
break
end
end
pg += 1
end
item_hash
end
puts scrape_verse_items('joy')
I know this doesn't exactly answer your question, but perhaps you might consider using a different approach altogether.
Nested while and until loops can get a bit confusing, and it is easy to end up with an exit condition that never changes, as with the inner until loop here.
Maybe you would consider using recursion instead.
I've written a small script that seems to work:
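require 'nokogiri'
require 'open-uri'
class MyScrapper
def initialize;end
def call(keyword)
# reject a missing or empty keyword before hitting the network
if keyword.nil? || keyword.empty?
puts "Please enter a valid search"
return
end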
scrape({}, keyword, 1)
end
private
def scrape(results, keyword, page)
doc = load_page(keyword, page)
return results if doc.css('div#noresults').any?
build_new_items(doc).merge(scrape(results, keyword, page+1))
end
def load_page(keyword, page)
url = "https://www.bible.com/search/bible?page=#{page}&q=#{keyword}&version_id=1"
# URI.open comes from open-uri; bare open(url) stopped accepting URLs in Ruby 3.0
Nokogiri::HTML(URI.open(url))
end
def build_new_items(doc)
items = doc.css("ul.search-result li.reference")
items.reduce({}) do |list, item|
title = item.css("h3").text.strip
content = item.css("p").text.strip
list[title] = content
list
end
end
end
You call it by doing MyScrapper.new.call("Keyword"). (It might make more sense to have this as a module you include, or even to make these class methods, to avoid the need to instantiate the class.)
What this does is call a method named scrape with the starting results, the keyword, and the page. It loads the page, and if there are no results it returns the results it has found so far.
Otherwise it builds a hash from the page it loaded, then the method calls itself and merges the results with the new hash it just built. It does this until there are no more results.
If you want to limit the number of pages, you can just change this line:
return results if doc.css('div#noresults').any?
to this:
return results if doc.css('div#noresults').any? || page > 999
Note: You might want to double-check the results that are being returned are correct. I think they should be but I wrote this quite quickly, so there could always be a small bug hiding somewhere in there.
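For comparison, the same stop-on-empty-page logic can also be written as a plain loop, which avoids deep recursion when there are many pages. This is only a sketch: collect_pages, the max_pages cap, and the fetch_page lambda are hypothetical names, and the lambda stands in for the Nokogiri page load so the example runs without network access. Returning nil plays the role of the div#noresults check.

```ruby
# Hypothetical sketch: walk pages 1, 2, 3, ... until a page reports no
# results or the page cap is reached, merging each page's items.
def collect_pages(fetch_page, max_pages: 999)
  results = {}
  (1..max_pages).each do |page|
    items = fetch_page.call(page) # nil means "no more results"
    break if items.nil?

    results.merge!(items)
  end
  results
end

# Stand-in for the real page load: two pages of placeholder data, then nothing.
fake_pages = {
  1 => { "Title A" => "content A" },
  2 => { "Title B" => "content B" }
}
fetch = ->(page) { fake_pages[page] }

verses = collect_pages(fetch)
puts verses.keys.inspect
```

Passing the loader in as an argument keeps the stopping logic separate from the HTTP and parsing code, so it can be checked without touching the real site.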
I am trying to do a post and run some if statements. What I want to do is:
check that all fields are filled
if all fields are filled, move on to the next step, or else reload the page
check if the movie is already in the database
add it if it is not already in the database
post "/movies/new" do
title = params[:title]
year = params[:year]
gross = params[:gross]
poster = params[:poster]
trailer = params[:trailer]
if title && year && gross && poster && trailer
movie = Movie.find_by(title: title, year: year, gross: gross)
if movie
redirect "/movies/#{movie.id}"
else
movie = Movie.new(title: title, year: year, gross: gross, poster: poster, trailer: trailer)
if movie.save
redirect "/movies/#{movie.id}"
else
erb :'movies/new'
end
end
else
erb :'movies/new'
end
end
I don't think my if statement is correct. It passes even when not all of my fields are filled in.
Your code is doing a lot of work in one single method. I would suggest restructuring it into smaller chunks to make it easier to manage. I mostly code for Rails, so apologies if parts of this do not apply to your framework.
post "/movies/new" do
movie = find_movie || create_movie
if movie
redirect "/movies/#{movie.id}"
else
erb :'movies/new'
end
end
def find_movie
# guard condition to ensure that the required parameters are there
required_params = [:title, :year, :gross]
return nil unless params_present?(required_params)
Movie.find_by(params_from_keys(required_params))
end
def create_movie
required_params = [:title, :year, :gross, :poster, :trailer]
return nil unless params_present?(required_params)
movie = Movie.new(params_from_keys(required_params))
movie.save ? movie : nil # only return the movie if it is successfully saved
end
# utility method to check whether all provided params are present
# (blank? comes from ActiveSupport, so that library must be loaded)
def params_present?(keys)
keys.none? { |key| params[key].blank? }
end
# utility method to convert params into the hash format required to create / find a record
def params_from_keys(keys)
paras = {}
keys.each { |key| paras[key] = params[key] } # merge!(key: ...) would store everything under the literal symbol :key
paras
end
Even if you type nothing in the HTML fields, they will still be submitted as empty strings.
You can avoid having empty parameters by, for example, filtering them:
post '/movies/new' do
params.reject! { |key, value| value.empty? }
# rest of your code
end
Also, I would post to /movies rather than to /movies/new, which is more RESTful.
Try an unless condition like the one below to check that no field is blank:
unless [title, year, gross, poster, trailer].any?(&:blank?)
This ensures that none of the fields is nil or blank ("").
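Note that blank? comes from ActiveSupport, so it is only available when that library is loaded (it normally is when the app uses ActiveRecord). A minimal sketch of the same check in plain Ruby follows; all_present? is a hypothetical helper name, and the hashes are hand-written stand-ins for the submitted form with placeholder values.

```ruby
# Hypothetical helper: true only when every key has a non-empty value.
# nil, "" and whitespace-only strings all count as missing.
def all_present?(params, keys)
  keys.all? { |key| !params[key].to_s.strip.empty? }
end

required = [:title, :year, :gross, :poster, :trailer]

filled = { title: "Example", year: "2000", gross: "1", poster: "p.jpg", trailer: "t.mp4" }
partly = filled.merge(poster: "")

puts all_present?(filled, required) # every field filled in
puts all_present?(partly, required) # poster was left empty
```

This matters because an untouched form field arrives as an empty string, which is truthy in Ruby, so a bare `if title && year` test lets it through.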
I have been trying to puts the results of some prepared statements after executing them. The purpose of this is to sanitize my data inputs, which I have never done before. I followed the steps here, but I am not getting the result I want.
Here's what I have:
require 'sqlite3'
$db = SQLite3::Database.open "congress_poll_results.db"
def rep_pull(state)
pull = $db.prepare("SELECT name, location FROM congress_members WHERE location = ?")
pull.bind_param 1, state
puts pull.execute
end
rep_pull("MN")
=> #<SQLite3::ResultSet:0x2e69e00>
What I am expecting is a list of reps in MN, but instead I just get this "SQLite3::ResultSet:0x2e69e00" thing.
What am I missing here? Thanks very much.
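execute returns a SQLite3::ResultSet rather than the rows themselves, so puts only prints the object. Iterate over the result set instead:
def rep_pull(state)
pull = $db.prepare("SELECT name, location FROM congress_members WHERE location = ?")
pull.bind_param 1, state
pull.execute.each do |row|
p row
end
end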
I am trying to follow a tutorial on big data; it wants to read data from a keyspace defined with cqlsh.
I have run this piece of code successfully:
require 'rubygems'
require 'cassandra'
db = Cassandra.new('big_data', '127.0.0.1:9160')
# get a specific user's tags
row = db.get(:user_tags,"paul")
###
def tag_counts_from_row(row)
tags = {}
row.each_pair do |pair|
column, tag_count = pair
#tag_name = column.parts.first
tag_name = column
tags[tag_name] = tag_count
end
tags
end
###
# insert a new user
db.add(:user_tags, "todd", 3, "postgres")
db.add(:user_tags, "lili", 4, "win")
tags = tag_counts_from_row(row)
puts "paul - #{tags.inspect}"
But when I write this part to output everyone's tags, I get an error:
user_ids = []
db.get_range(:user_tags, :batch_size => 10000) do |id|
# user_ids << id
end
rows_with_ids = db.multi_get(:user_tags, user_ids)
rows_with_ids.each do |row_with_id|
name, row = row_with_id
tags = tag_counts_from_row(row)
puts "#{name} - #{tags.inspect}"
end
The error is:
line 33: warning: multiple values for a block parameter (2 for 1)
I think the error may have come from incompatible versions of Cassandra and Ruby. How can I fix it?
It's a little hard to tell which line is 33, but it looks like the problem is that get_range yields two values while your block only takes the first one. If you only care about the row keys and not the columns, then you should use get_range_keys.
It looks like you do in fact care about the column values because you fetch them out again using db.multi_get. This is an unnecessary additional query. You can update your code to something like:
db.get_range(:user_tags, :batch_size => 10000) do |id, columns|
tags = tag_counts_from_row(columns)
puts "#{id} - #{tags.inspect}"
end
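The warning comes down to block arity: the method yields two values, so the block should declare two parameters. The sketch below reproduces that shape in plain Ruby without Cassandra; each_user_row and the sample data are hypothetical stand-ins for get_range and the user_tags column family.

```ruby
# Hypothetical stand-in for get_range: yields a row key and its columns.
def each_user_row(rows)
  rows.each { |key, columns| yield key, columns }
end

rows = {
  "paul" => { "ruby" => 2 },
  "todd" => { "postgres" => 3 }
}

# Declaring both block parameters receives both yielded values,
# so no second lookup (such as multi_get) is needed.
lines = []
each_user_row(rows) do |id, columns|
  lines << "#{id} - #{columns.inspect}"
end
puts lines
```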
I'm trying to optimize my code as much as possible and I've reached a dead end.
My code looks like this:
class Person
attr_accessor :age
def initialize(age)
@age = age
end
end
people = [Person.new(10), Person.new(20), Person.new(30)]
newperson1 = [Person.new(10)]
newperson2 = [Person.new(20)]
newperson3 = [Person.new(30)]
Is there a way I can get Ruby to automatically pull data out of the people array and name the results newperson1, newperson2, and so on?
Best regards
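That is definitely a code smell. You should refer to them as [people[0]], [people[1]], and so on.
But if you insist on doing so, Ruby 2.1 and later provide Binding#local_variable_set. Variables created this way exist only in that binding, so keep a reference to it and read them back with local_variable_get:
b = binding
people.each.with_index(1) do |person, i|
b.local_variable_set("newperson#{i}", [person])
end
b.local_variable_get(:newperson1) # the array holding people[0]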
I think this is what you're trying to do...
newperson1 = people[0]
puts newperson1.age
The output of this is 10, as expected.
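Putting the pieces together, here is a runnable sketch. It assumes the constructor is meant to store the age in the @age instance variable, and it adds a hash keyed by name as one possible alternative to numbered variables; people_by_name is a hypothetical name.

```ruby
class Person
  attr_accessor :age

  def initialize(age)
    @age = age # instance variable read by the age accessor
  end
end

people = [Person.new(10), Person.new(20), Person.new(30)]

# Index into the array directly instead of creating numbered variables.
newperson1 = people[0]
puts newperson1.age

# Hypothetical alternative: use the "newpersonN" names as hash keys.
pairs = people.each_with_index.map do |person, i|
  ["newperson#{i + 1}", person]
end
people_by_name = pairs.to_h
puts people_by_name["newperson3"].age
```

A hash keeps the generated names as ordinary data, so the code still works if the array grows or shrinks.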