I'm creating a small app whose behavior depends on the result of the last game played, i.e. the last row that contains game data (win/loss and game number).
My issue is accessing the first column of the last row (most recent game played). How is that accomplished?
require 'open-uri'
class BrooklynPizzaController < ApplicationController
def index
# URL for dynamic content
url = "http://www.basketball-reference.com/teams/BRK/2015_games.html"
# Open URL using nokogiri
doc = Nokogiri::HTML(open(url))
# Scrape result from Web site
@result = doc.css("#teams_games").xpath("//table/tbody/tr/td[8]/text()")
# IN PROGRESS - Get date of last game played
@result_date = doc.xpath('//table/tbody/tr/td[2]/a/text()') do |link|
@result_date[link.text.strip] = link['a']
end
###############################################################
# IN PROGRESS - Get number of last game played from 1st column
# doc.xpath('//table/tbody/tr/td[1]/text()') do |game|
# last_game_number =
# end
################################################################
# @result_date = doc.css("#teams_games").xpath("//table/tbody/tr/td[2]/text()")
# Set date to current
@date = Date.today
# Get date of last game played
if (@result.last.next == nil)
flag = doc.xpath("//table/tbody/tr[#{@result}]")
@result_date = doc.xpath("//table/tbody/tr#{flag}/td[2]/a/text()")
end
end
end
Please let me know what information is missing, because I feel like I've left out some things.
To get the row you would do this:
win_loss_tds = doc.css("#teams_games tbody tr td:nth-child(8):not(:empty)")
last_win_loss_row = win_loss_tds.last.parent
There's undoubtedly a way to do that in a single XPath expression, but I'll leave that as an exercise to the reader since I don't care for XPath.
To get the game number from the first column you would do this:
game_num_col = last_win_loss_row.at("td:first-child")
game_num = game_num_col.text.to_i
# => 82
And to get the date from the second column:
date_col = last_win_loss_row.at("td:nth-child(2)") # XPath: td[2]
date = DateTime.parse(date_col.text)
# => 2015-04-15T00:00:00+00:00
If you want date and time, you could do this:
time_col = last_win_loss_row.at("td:nth-child(3)")
date_time = DateTime.parse("#{date_col.text} #{time_col.text}")
# => 2015-04-15T08:00:00-03:00
Well, I'd do this:
require 'open-uri'
require 'nokogiri'
doc = Nokogiri::HTML(open("http://www.basketball-reference.com/teams/BRK/2015_games.html"))
latest_score_row = doc.search('//tr/td/a[contains(.,"Box Score")]/../..').last
latest_text = latest_score_row.search('td').map(&:text)
# => ["13",
# "Sat, Nov 22, 2014",
# "8:30p EST",
# "",
# "Box Score",
# "#",
# "San Antonio Spurs",
# "L",
# "",
# "87",
# "99",
# "5",
# "8",
# "L 1",
# ""]
But YMMV.
How does it work? Easy. The XPath looks for <a> nodes in the page containing "Box Score", then, for each one found, backs up two levels to the enclosing <tr> node. search returns those rows as a NodeSet, and last takes the last one found.
Then it's just a matter of looking in that row for the <td> nodes and grabbing their text.
The time stamp is then a matter of pulling the date and time from the array, doing a tiny bit of massaging of the "am/pm" and letting Ruby build an object:
latest_time = Time.strptime(
[
latest_text[1], # => "Sat, Nov 22, 2014"
latest_text[2].sub(/([ap])/, '\1m') # => "8:30pm EST"
].join(' '), # => "Sat, Nov 22, 2014 8:30pm EST"
'%a, %b %d, %Y %H:%M%P %Z' # => "%a, %b %d, %Y %H:%M%P %Z"
) # => 2014-11-22 18:30:00 -0700
I am scraping the website https://www.bananatic.com/de/forum/games/. I want to extract only the year of the dates.
require 'nokogiri'
require 'open-uri'
require 'pp'
unless File.readable?('data.html')
url = 'https://www.bananatic.com/de/forum/games/'
data = URI.open(url).read
File.open('data.html', 'wb') { |f| f << data }
end
data = File.read('data.html')
document = Nokogiri::HTML(data)
links3 = document.css('.topics ul li div')
re = links3.map do |lk3|
name = lk3.css('.name').children.text.strip.split("\n")[2]
end
date = ' '
size_dates = re.length
(0..size_dates).each do |i|
unless i.nil?
date = re[i]
print date
end
end
As a result of the execution I get dates in what appears to be a String with the following format:
day .month.year, hour:minutes
But I only need the year. I tried a split, but I get an error.
Your issue is that if you look at the output from this block
re = links3.map do |lk3|
lk3.css('.name').children.text.strip.split("\n")[2]
end
You will see:
[" 07.08.2016, 13:47", nil, nil, nil, nil, " 06.08.2016, 9:24", nil, nil, nil, nil,...]
So you could solve your immediate issue by just adding .compact to the end or switching map to filter_map.
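As a minimal sketch of that immediate fix, the two variants could look like this. The helper names and the sample array are made up for illustration; the array is a hand-written stand-in for the scraped values, not real output.

```ruby
# Hypothetical helper: drop the nil entries first, then slice the
# four-digit year out of each remaining string.
def years_with_compact(entries)
  entries.compact.map { |text| text[/\d{4}/] }
end

# Hypothetical helper (Ruby 2.7+): filter_map keeps only truthy block
# results, so nil entries are skipped in a single pass.
def years_with_filter_map(entries)
  entries.filter_map { |text| text && text[/\d{4}/] }
end

# Stand-in for the array produced by the map block above.
sample = [" 07.08.2016, 13:47", nil, nil, nil, nil, " 06.08.2016, 9:24", nil]

p years_with_compact(sample)     # => ["2016", "2016"]
p years_with_filter_map(sample)  # => ["2016", "2016"]
```

Both return the same array; filter_map simply avoids building the intermediate nil-free copy.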
That being said here is another way to solve your issue:
You can get just the year from that text on that page using the following:
require 'nokogiri'
require 'open-uri'
url = "https://www.bananatic.com/de/forum/games/"
doc = Nokogiri::HTML(URI.open(url))
doc
.xpath('//div[@class="name"]/text()[string-length(normalize-space(.)) > 0]')
.map {|node| node.to_s[/\d{4}/]}
#=> ["2016", "2016", "2022", "2022", "2022", "2021", "2022", "2017", "2022", "2021", "2019", "2016", "2021", "2021", "2021", "2021", "2020", "2021", "2017", "2021"]
The 2 parts are:
//div[@class="name"]/text()[string-length(normalize-space(.)) > 0] - the XPath which finds all divs with the class "name" and then pulls the non-zero-length (trimmed of white space) text nodes.
.map {|node| node.to_s[/\d{4}/]} - map these into an array by slicing the String based on a regex for 4 contiguous digits.
If you would like the XPath to be as specific as your post you can use:
'//div[@class="topics"]/ul/li//div[@class="name"]/text()[string-length(normalize-space(.)) > 0]'
You could use a regex to extract only the year once you have the list.
Provided every entry follows the pattern you showed, this will work, because the year is the only run of exactly four digits.
Example:
17.01.2023, 17:40
Here \b\d{4}\b matches 2023.
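To compare the regex with actually parsing the timestamp, here is a small sketch. It assumes every string follows the "day.month.year, hour:minutes" layout shown in the question, and both helper names are made up for illustration.

```ruby
require 'date'

# Hypothetical helper: slice the year out with the regex from the answer.
# Returns a String, or nil when no four-digit run is present.
def year_by_regex(text)
  text[/\b\d{4}\b/]
end

# Hypothetical helper: parse the whole timestamp, assuming the
# "day.month.year, hour:minutes" layout, and read the year from it.
# Returns an Integer and raises if the text does not match the layout.
def year_by_parsing(text)
  DateTime.strptime(text.strip, '%d.%m.%Y, %H:%M').year
end

puts year_by_regex("17.01.2023, 17:40")    # prints 2023
puts year_by_parsing(" 06.08.2016, 9:24")  # prints 2016
```

The regex is more forgiving of stray text; the parsing variant fails loudly when the layout changes, which may or may not be what you want.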
I have ISO 8601 compliant date strings like "2016" or "2016-09" representing a year or a month. How can I get the start and end dates from these?
For example:
2016 -> ["2016-01-01", "2016-12-31"]
2016-09 -> ["2016-09-01", "2016-09-30"]
Thank you
Try this
require 'date'
def iso8601_range(str)
parts = str.scan(/\d+/).map(&:to_i)
date = Date.new(*parts)
case parts.size
when 1
date .. date.next_year - 1
when 2
date .. date.next_month - 1
else
date .. date
end
end
iso8601_range('2016') # => 2016-01-01..2016-12-31
iso8601_range('2016-09') # => 2016-09-01..2016-09-30
iso8601_range('2016-09-20') # => 2016-09-20..2016-09-20
If you are cool with using send you can replace the case statement with
date .. date.send([:next_year,:next_month,:next_day][parts.size - 1]) - 1
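Spelled out as a complete method, that variant might look like the sketch below. The method name iso8601_range_send is made up to keep it apart from the version above, and public_send is swapped in for send because the three Date methods involved are public.

```ruby
require 'date'

# Sketch of the case-free variant: the number of parts in the string
# (1 = year, 2 = year-month, 3 = full date) picks the method that jumps
# to the start of the next period; subtracting one day gives the end.
def iso8601_range_send(str)
  parts = str.scan(/\d+/).map(&:to_i)
  date = Date.new(*parts)
  step = [:next_year, :next_month, :next_day][parts.size - 1]
  date..(date.public_send(step) - 1)
end

puts iso8601_range_send('2016')        # prints 2016-01-01..2016-12-31
puts iso8601_range_send('2016-02')     # prints 2016-02-01..2016-02-29
puts iso8601_range_send('2016-09-20')  # prints 2016-09-20..2016-09-20
```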
require 'date'
def create_start_end(string)
year, month = string.split('-').map(&:to_i)
if month && !month.zero?
[Date.new(year, month, 1).to_s, Date.new(year, month, -1).to_s]
else
[Date.new(year, 1, 1).to_s, Date.new(year, 12, -1).to_s]
end
end
create_start_end('2016')
#=> ["2016-01-01", "2016-12-31"]
create_start_end('2016-01')
#=> ["2016-01-01", "2016-01-31"]
create_start_end('2016-09')
#=> ["2016-09-01", "2016-09-30"]
One more solution, following @AndreyDeineko :)
require 'date'
def create_date(date)
  year, month = date.split('-').map(&:to_i)
  # Without a month, start in January and end in the last month (-1)
  [Date.new(year, month || 1, 1), Date.new(year, month || -1, -1)].map(&:to_s)
end
I have a few strings that I am retrieving from a file birthdays.txt. An example of a string is below:
Christopher Alexander, Oct 4, 1936
I would like to separate the strings and let variable name be a hash key and birthdate the hash value. Here is my code:
birthdays = {}
File.read('birthdays.txt').each_line do |line|
line = line.chomp
name, birthdate = line.split(/\s*,\s*/).first
birthdays = {"#{name}" => "#{birthdate}"}
puts birthdays
end
I managed to assign name to the key. However, birthdate returns "".
File.new('birthdays.txt').each.with_object({}) do
|line, birthdays|
birthdays.store(*line.chomp.split(/\s*,\s*/, 2))
puts birthdays
end
I feel like some of the other solutions are overthinking this a bit. All you need to do is split each line into two parts, the part before the first comma and the part after, which you can do with line.split(/,\s*/, 2), then call to_h on the resulting array of arrays:
data = <<END
Christopher Alexander, Oct 4, 1936
Winston Churchill, Nov 30, 1874
Max Headroom, Apr 4, 1985
END
data.each_line.map do |line|
line.chomp.split(/,\s*/, 2)
end.to_h
# => { "Christopher Alexander" => "Oct 4, 1936",
# "Winston Churchill" => "Nov 30, 1874",
# "Max Headroom" => "Apr 4, 1985" }
(You will, of course, want to replace data with your File object.)
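A sketch of what that replacement could look like when reading from disk. The helper name birthdays_from is made up, and the temporary file only stands in for birthdays.txt so the example is self-contained.

```ruby
require 'tempfile'

# Hypothetical helper: build a name => birthdate hash from a file path.
def birthdays_from(path)
  File.foreach(path, chomp: true)
      .reject { |line| line.strip.empty? }   # skip blank lines
      .map { |line| line.split(/,\s*/, 2) }  # split at the first comma only
      .to_h
end

# Usage with a temporary file standing in for birthdays.txt
Tempfile.create('birthdays') do |file|
  file.write("Christopher Alexander, Oct 4, 1936\nMax Headroom, Apr 4, 1985\n")
  file.flush
  p birthdays_from(file.path)
end
```

File.foreach reads one line at a time, so the whole file never has to sit in memory as a single string.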
birthdays = Hash.new
File.read('birthdays.txt').each_line do |line|
line = line.chomp
name, birthdate = line.split(/\s*,\s*/, 2)
birthdays[name]= birthdate
puts birthdays
end
Using @Jordan's data:
data.each_line.with_object({}) do |line, h|
name, _, bdate = line.chomp.partition(/,\s*/)
h[name] = bdate
end
#=> {"Christopher Alexander"=>"Oct 4, 1936",
# "Winston Churchill"=>"Nov 30, 1874",
# "Max Headroom"=>"Apr 4, 1985"}
In Ruby, how can I get every 14th day of the year, going backwards and forwards from a given date?
Say I'm billed for 2 weeks of recycling today, 6-16-2015. How can I get an array of every recycling billing day this year based on that date?
Date has a step method:
require 'date'
d = Date.strptime("6-16-2015", '%m-%d-%Y') # strange date format
end_year = Date.new(d.year, -1, -1)
p d.step(end_year, 14).to_a
# =>[#<Date: 2015-06-16 ((2457190j,0s,0n),+0s,2299161j)>, #<Date: 2015-06-30 ((2457204j,0s,0n),+0s,2299161j)>, ...
# Going backward:
begin_year = Date.new(d.year, 1, 1)
p d.step(begin_year,-14).to_a
# =>[#<Date: 2015-06-16 ((2457190j,0s,0n),+0s,2299161j)>, #<Date: 2015-06-02 ((2457176j,0s,0n),+0s,2299161j)>,...
A more descriptive and easy to understand solution:
require 'date'
current_date = Date.parse "16-june-15"
start_date = Date.parse '1-jan-15'
end_date = Date.parse '31-dec-15'
interval = 14
result = current_date.step(start_date, -interval).to_a
result.sort!.pop
result += current_date.step(end_date, interval).to_a
You could do that as follows:
require 'date'
date_str = "6-16-2015"
d = Date.strptime(date_str, '%m-%d-%Y')
f = Date.new(d.year)
((f + (f-d).abs % 14)..Date.new(d.year,-1,-1)).step(14).to_a
#=> [#<Date: 2015-01-13 ((2457036j,0s,0n),+0s,2299161j)>,
# #<Date: 2015-01-27 ((2457050j,0s,0n),+0s,2299161j)>,
# ...
# #<Date: 2015-06-16 ((2457190j,0s,0n),+0s,2299161j)>,
# ...
# #<Date: 2015-12-29 ((2457386j,0s,0n),+0s,2299161j)>]
Based on the second sentence of your question, I assume you simply want an array of all dates in the given year that are two-weeks apart and include the given day.
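Under that same assumption, the backward and forward Date#step calls from the earlier answers can be wrapped in one method. This is only a sketch; the name billing_days and its interval parameter are made up for illustration.

```ruby
require 'date'

# Hypothetical helper: all dates in date's year that lie a whole number
# of intervals away from date, in ascending order, date itself included.
def billing_days(date, interval = 14)
  year_start = Date.new(date.year, 1, 1)
  year_end   = Date.new(date.year, 12, 31)

  # Walk backwards to the start of the year, then put the dates in order.
  earlier = date.step(year_start, -interval).to_a.reverse
  # Walk forwards to the end of the year; drop the first element so the
  # given date is not listed twice.
  later = date.step(year_end, interval).drop(1)

  earlier + later
end

days = billing_days(Date.strptime("6-16-2015", '%m-%d-%Y'))
puts days.first  # prints 2015-01-13
puts days.last   # prints 2015-12-29
puts days.size   # prints 26
```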
I attempted a mathy, modulus-based approach, which turned out to be unexpectedly confusing.
require 'date'
a_recycle_date_string = "6-17-2015"
interval = 14
a_recycle_date = Date.strptime(a_recycle_date_string, '%m-%d-%Y')
current_year = a_recycle_date.year
end_of_year = Date.new(current_year, -1, -1)
# Find out which day of the first interval is the first recycle day
# of the year (1-indexed)
remainder = (a_recycle_date.yday) % interval
# => 0
# make sure remainder 0 is treated as interval-1 so it doesn't louse
# the equation up
n_days_from_first_recycling_yday_of_year = (remainder - 1) % interval
first_recycle_date_this_year = Date.new(current_year,
1,
1 + n_days_from_first_recycling_yday_of_year)
first_recycle_date_this_year.step(end_of_year, interval).to_a
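The same arithmetic can be written a little more compactly by shifting to a zero-based day of the year before taking the modulus, which removes the special case for a remainder of 0. This is a sketch of that idea with a made-up method name, not the answerer's code.

```ruby
require 'date'

# Hypothetical helper: the first date in the year that falls on the same
# interval cycle as the given date.
def first_recycle_date(date, interval = 14)
  # yday is 1-indexed, so subtract 1 to count days elapsed since January 1.
  offset = (date.yday - 1) % interval
  Date.new(date.year, 1, 1) + offset
end

puts first_recycle_date(Date.new(2015, 6, 17))  # prints 2015-01-14
puts first_recycle_date(Date.new(2015, 6, 16))  # prints 2015-01-13
```

Calling step(end_of_year, interval).to_a on the returned date then yields the full list, as in the code above.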
I'm working on an example problem from Chris Pine's Learn to Program book and I'm having an issue removing white space in my hash values.
I start with a txt file that contains names and birthday information, like so:
Christopher Alexander, Oct 4, 1936
Christopher Lambert, Mar 29, 1957
Christopher Lee, May 27, 1922
Christopher Lloyd, Oct 22, 1938
Christopher Pine, Aug 3, 1976
Then I go through each line, split at the first comma, and then try to go through each key,value to strip the white space.
birth_dates = Hash.new {}
File.open 'birthdays.txt', 'r' do |f|
f.read.each_line do |line|
name, date = line.split(/,/, 2)
birth_dates[name] = date
birth_dates.each_key { |a| birth_dates[a].strip! }
end
end
But nothing is getting stripped.
{"Christopher Alexander"=>" Oct 4, 1936", "Christopher Lambert"=>" Mar 29, 1957", "Christopher Lee"=>" May 27, 1922", "Christopher Lloyd"=>" Oct 22, 1938", "Christopher Pine"=>" Aug 3, 1976", "Christopher Plummer"=>" Dec 13, 1927", "Christopher Walken"=>" Mar 31, 1943", "The King of Spain"=>" Jan 5, 1938"}
I've seen a handful of solutions for Arrays using .map - but this was the only hash example I came across. Any idea why it may not be working for me?
UPDATE: removed the redundant chomp as per sawa's comment.
For parsing comma-delimited files I use CSV, like this:
require 'csv'
def parse_birthdays(file='birthdays.txt', hash={})
CSV.foreach(file, :converters=> lambda {|f| f ? f.strip : nil}){|name, date, year|hash[name] = "#{year}-#{date.gsub(/ +/,'-')}" }
hash
end
parse_birthdays
# {"Christopher Alexander"=>"1936-Oct-4", "Christopher Lambert"=>"1957-Mar-29", "Christopher Lee"=>"1922-May-27", "Christopher Lloyd"=>"1938-Oct-22", "Christopher Pine"=>"1976-Aug-3"}
Or, if you need real dates, you can drop the lambda. Put the year last so that Date.parse picks it up; otherwise it falls back to the current year:
require 'csv'
require 'date'
def parse_birthdays(file='birthdays.txt', hash={})
CSV.foreach(file){|name, date, year|hash[name] = Date.parse("#{date} #{year}")}
hash
end
parse_birthdays
# {"Christopher Alexander"=>#<Date: 1936-10-04 ...>, "Christopher Lambert"=>#<Date: 1957-03-29 ...>, "Christopher Lee"=>#<Date: 1922-05-27 ...>, "Christopher Lloyd"=>#<Date: 1938-10-22 ...>, "Christopher Pine"=>#<Date: 1976-08-03 ...>}
I would write this
File.open 'birthdays.txt', 'r' do |f|
f.read.each_line do |line|
name, date = line.split(/,/, 2)
birth_dates[name] = date.chomp
birth_dates.each_key { |a| birth_dates[a].strip! }
end
end
as below:
File.open 'birthdays.txt', 'r' do |f|
f.read.each_line do |line|
name, date = line.split(/,/, 2)
birth_dates[name] = date.chomp.strip
end
end
or
birth_dates = File.readlines('birthdays.txt').each_with_object({}) do |line,hsh|
name, date = line.split(/,/, 2)
hsh[name] = date.chomp.strip
end
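As a side note, String#strip already removes the trailing newline along with the leading space, so the chomp is optional. A minimal sketch with in-memory lines standing in for the file (the method name parse_birth_dates is made up for illustration):

```ruby
# Hypothetical helper: build the hash from any collection of raw lines.
def parse_birth_dates(lines)
  lines.each_with_object({}) do |line, hsh|
    name, date = line.split(',', 2)  # split at the first comma only
    hsh[name] = date.strip           # strip removes " " and "\n" alike
  end
end

# Lines as File.readlines would return them, newline included.
lines = [
  "Christopher Alexander, Oct 4, 1936\n",
  "Christopher Lee, May 27, 1922\n"
]

parse_birth_dates(lines).each do |name, date|
  puts "#{name}: #{date.inspect}"
end
# prints:
# Christopher Alexander: "Oct 4, 1936"
# Christopher Lee: "May 27, 1922"
```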