Feedjira: Feed update returns duplicates - ruby

I am trying to print rss/atom feed updates using ruby. Feedjira seems to be the best bet for this. Unfortunately the update feature does not seem to work properly. I get duplicate entries.
Here is a simple example that produces the problem:
require 'feedjira'
require 'pp'

feed = Feedjira::Feed.fetch_and_parse "http://lorem-rss.herokuapp.com/feed?unit=second&interval=10"

loop do
  feed = Feedjira::Feed.update(feed)
  pp feed.new_entries
  sleep 20
end
Any suggestions? Maybe other libraries? Or am I missing something important when using Feedjira?
There are several questions around this topic for Feedzirra, the former name of Feedjira, but the update feature appears to be newer: http://feedjira.com/updating-feeds.html

The update functionality was removed from Feedjira due to serious problems. See:
(commit) https://github.com/feedjira/feedjira/commit/6f56516934a9bdb8691f2bbe98be0f2b7c25b7ea
(discussion) https://github.com/feedjira/feedjira/issues/218
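If you only need to print entries you have not seen before, a simple workaround is to track the IDs yourself and re-fetch the feed on each pass. A minimal sketch, assuming the parsed entries respond to entry_id and url (as Feedjira entries generally do):

require 'feedjira'
require 'pp'

url  = "http://lorem-rss.herokuapp.com/feed?unit=second&interval=10"
seen = {}

loop do
  feed = Feedjira::Feed.fetch_and_parse(url)
  feed.entries.each do |entry|
    key = entry.entry_id || entry.url   # assumed accessors on the parsed entries
    next if seen[key]                   # skip anything already printed
    seen[key] = true
    pp entry
  end
  sleep 20
end

In a long-running process you would want to cap the size of seen so it does not grow forever.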

Related

Accessing and scraping sporadically available Wikipedia sections

I need to fetch some data but I'm completely stumped after trying a few things.
I want to access Airlines & Destinations from the Albuquerque_International_Sunport's wiki page - keep in mind, I'll be going through a prepopulated list of airports with this data.
There are multiple "types" of airlines: Passenger, Cargo; sometimes there are other (sub)sections, other times there are none.
Articles for multiple airports will be accessed automatically - including some less known airports. This means I need to:
Check if "Airlines & Destinations" section exists
Take all data inside of any table
Scrape it; otherwise do nothing
I've tried using the Ruby wikipedia-client gem; however, the .raw_data method isn't even returning the section data.
Next, I went to Wikipedia's API: unless I am mistaken, it doesn't return "section" names! This doesn't seem right, but I wasn't able to get it working.
So I suppose that leaves Nokogiri. I can grab and parse the pages fine, but:
How would I go about detecting the presence of the "Airlines & Destinations" section and getting all the table data before the end of that section? I have a suspicion I need some tricky XPath for this.
Seems to be the only viable solution.
Any thoughts welcome. Putting a bounty on this question when I can.
Edit: Perhaps it's better to simply somehow grab a list of all airlines in the world and hit them against HTML? Seems like it could be computationally expensive.
Well, I'm not an expert user of Nokogiri but maybe this can give you some idea.
require 'nokogiri'
require 'open-uri'

page = Nokogiri::HTML(open("https://en.wikipedia.org/wiki/Albuquerque_International_Sunport"))

# this is the passenger table
page.xpath('//*[@id="mw-content-text"]/div/table[2]/tr').each do |tr|
  p tr.text
  puts "-" * 50
end

# this is the cargo table
page.xpath('//*[@id="mw-content-text"]/div/table[3]/tr').each do |tr|
  p tr.text
  puts "-" * 50
end
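Relying on fixed positions like table[2] and table[3] will break on airports whose articles are laid out differently. A more defensive sketch: locate the section heading by its anchor id and collect every table until the next top-level heading. The id Airlines_and_destinations is an assumption based on the usual section title "Airlines and destinations", and on some pages tables are wrapped in container elements, so verify both against the pages you crawl.

require 'nokogiri'
require 'open-uri'

page = Nokogiri::HTML(URI.open("https://en.wikipedia.org/wiki/Albuquerque_International_Sunport"))

anchor = page.at_xpath('//*[@id="Airlines_and_destinations"]')
if anchor
  heading = anchor.ancestors('h2').first || anchor   # the h2 that owns the anchor
  tables  = []
  node = heading.next_element
  while node && node.name != 'h2'                    # stop at the next section heading
    tables << node if node.name == 'table'
    node = node.next_element
  end
  tables.each { |table| table.xpath('.//tr').each { |tr| p tr.text } }
else
  # no "Airlines & Destinations" section on this page: do nothing
end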

Ruby gem for retrieving details/information on a torrent via info hash

Is there a Ruby gem, usable with plain Ruby or Ruby on Rails, that accepts an info hash and returns information on the torrent, like seeders, leechers, size, etc.?
If not is there any other way I can get this information using Ruby? Is there an API that I can easily digest?
Thanks in advance.
Take a look at the thepiratebay gem.
Although it doesn't seem to be actively maintained anymore, it should solve your problem.
You can find a torrent:
ThePirateBay::Torrent.find("123123123")
You can also sort results by size, seeders and leechers:
ThePirateBay::SortBy::Size # Size, largest first
ThePirateBay::SortBy::Seeders # Most seeders first
ThePirateBay::SortBy::Leechers # Most leechers first
So, why not give it a try?
It really depends on what torrents you are talking about. Different torrent trackers have different APIs.
You might want to dig into a specific tracker's API (please be mindful that these are not Ruby APIs):
https://getstrike.net/api/
https://www.npmjs.com/package/thepiratebay
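If no gem fits, you can also hit a tracker's HTTP API directly with the standard library and parse the JSON yourself. A rough sketch; the endpoint, parameter and response field names below are purely hypothetical placeholders, since every tracker exposes something different:

require 'net/http'
require 'json'
require 'uri'

info_hash = "B415C913643E5FF49FE37D304BBB5E6E11AD5101"

# Hypothetical endpoint and parameter name; check your tracker's documentation.
uri = URI("https://tracker.example.com/api/torrents/info?hash=#{info_hash}")
response = Net::HTTP.get_response(uri)

if response.is_a?(Net::HTTPSuccess)
  torrent = JSON.parse(response.body)
  # Hypothetical field names; adjust to whatever the tracker actually returns.
  puts "Seeders:  #{torrent['seeds']}"
  puts "Leechers: #{torrent['leeches']}"
  puts "Size:     #{torrent['size']}"
end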

trim song using ruby

On my site, built on Ruby on Rails, I need to provide functionality to trim songs (say, the first 20 seconds). Does anybody know any relevant API for manipulating songs (like 'rmagick' for images)?
You could try https://github.com/fugalh/ruby-audio. It looks a little out of date, but there's probably a fork with updates.
Another solution might be to limit how much the song plays via javascript.
And yet another option might be just to make the snippets yourself.
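For the "make the snippets yourself" route, the simplest option is usually to shell out to ffmpeg from Ruby rather than decode the audio in-process. A minimal sketch, assuming ffmpeg is installed on the server:

# Trim a song to its first 20 seconds by copying the stream (no re-encode).
def trim_song(input_path, output_path, seconds = 20)
  system("ffmpeg", "-y",
         "-i", input_path,
         "-t", seconds.to_s,
         "-acodec", "copy",
         output_path)
end

trim_song("song.mp3", "snippet.mp3")

Passing each argument separately to system avoids shell-quoting problems with user-uploaded file names; for anything fancier (fades, re-encoding) a wrapper gem such as streamio-ffmpeg is worth a look.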

Ruby XML Parsing with Nokogiri/XPath

I have a Shopify store whose product variant inventory levels I want to update automatically, using a live XML feed from my wholesaler.
I'm learning to program (Ruby) and this is my first project, but after researching here is how I think it should work.
Use Ruby/Nokogiri to parse the XML feed from the wholesaler, then use XPath to locate both the unique product variant SKU code and the stock level.
Somehow I then need to use this SKU to refer back to my Shopify store's product XML list and pull out the variant's unique ID using the SKU code.
Then use something like the builder gem to build the XML format that Shopify needs, and then use curl to PUT the changes. I'm guessing I loop this process for every product?
I know Shopify only has a 300-call limit, so I've read the article on putting a delay in the script, but I get the feeling the above method isn't the easiest way to go about this.
With Shopify you need to apply the variant stock level update against a unique variant XML file, so I need to build the unique XML and PUT it against /admin/variants/#[thevariantid].xml
I'm looking forward to trying to put this together and learning in the process, but am I on the right track with this? Are there simpler gems I should be looking at?
N.B. I've only recently started learning Ruby, and will head to Rails afterwards. I know a bit about XML and its structure, so I should be OK finding what I need with XPath.
You’re on the right track, but I’d use the shopify_api gem to do the talking to Shopify instead of having to form the XML and URIs yourself: https://github.com/Shopify/shopify_api
There’s an article on our wiki that might also help you out with regards to the API call limit but just let me know if you need more space – we’re pretty flexible and the limit is really just there to keep scripts from going wild and affecting service for everyone else.
Your proposed path seems good, except that there's no need to use the 'builder' gem, as Nokogiri has some very nice XML-building built into it.
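For the parsing half of the job, the Nokogiri side can stay very small. A sketch of collecting SKU to stock-level pairs from the wholesaler feed; the URL and the element names (product, sku, stock_level) are hypothetical, so adjust the XPath to match the actual feed before pushing anything to Shopify via shopify_api:

require 'nokogiri'
require 'open-uri'
require 'pp'

doc = Nokogiri::XML(URI.open("http://wholesaler.example.com/stock.xml"))

stock_levels = {}
doc.xpath('//product').each do |product|
  sku   = product.at_xpath('sku')
  stock = product.at_xpath('stock_level')
  next unless sku && stock                    # skip incomplete entries
  stock_levels[sku.text.strip] = stock.text.to_i
end

pp stock_levels  # e.g. {"ABC-123" => 14, ...}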

Using Rack::Session::Datamapper

mkristgan's rack_datamapper gem says that it "can be wrapped to be used in a specific environment, i.e. Rack::Session::Datamapper".
Unfortunately, I don't know quite enough about Ruby to accomplish this task yet; modules and classes in Ruby are still above my head (coming from PHP).
Can anyone offer assistance with using rack_datamapper to implement Rack::Session::Datamapper?
You probably don't want to do this anyway.
The answer below is great, but upon closer consideration, I realized I shouldn't do it anyway. Instead, I'm placing the user_id, ip and first name (for convenience) in a cookie and protecting it.
The Moneta gem should help.
In Sinatra, just add:

require 'rack/session/moneta'

use Rack::Session::Moneta,
  store: Moneta.new(:DataMapper, setup: (ENV['DATABASE_URL'] || "sqlite://#{Dir.pwd}/development.db"))

and then use the session hash at will.
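If you end up going the cookie route mentioned in the question's edit, Rack ships with a signed cookie session out of the box, so there is nothing DataMapper-specific to wire up. A minimal Sinatra sketch; the key name, secret fallback and lifetime are just example values:

require 'sinatra'

use Rack::Session::Cookie,
  key: 'my_app.session',
  secret: ENV['SESSION_SECRET'] || 'change_me_to_a_long_random_string',  # signs the cookie so it can't be tampered with
  expire_after: 86_400                                                   # seconds

get '/' do
  session[:user_id] ||= 42   # example value
  "user_id: #{session[:user_id]}"
end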
