How to get an RSS feed in XML format for a Ruby script

I am using the following Ruby script from this Dashing widget. It retrieves an RSS feed, parses it, and sends the parsed title and description to a widget.
require 'net/http'
require 'uri'
require 'nokogiri'
require 'htmlentities'

news_feeds = {
  "seattle-times" => "http://seattletimes.com/rss/home.xml",
}

Decoder = HTMLEntities.new

class News
  def initialize(widget_id, feed)
    @widget_id = widget_id
    # pick apart feed into domain and path
    uri = URI.parse(feed)
    @path = uri.path
    @http = Net::HTTP.new(uri.host)
  end

  def widget_id()
    @widget_id
  end

  def latest_headlines()
    response = @http.request(Net::HTTP::Get.new(@path))
    doc = Nokogiri::XML(response.body)
    news_headlines = []
    doc.xpath('//channel/item').each do |news_item|
      title = clean_html( news_item.xpath('title').text )
      summary = clean_html( news_item.xpath('description').text )
      news_headlines.push({ title: title, description: summary })
    end
    news_headlines
  end

  def clean_html( html )
    html = html.gsub(/<\/?[^>]*>/, "")
    html = Decoder.decode( html )
    return html
  end
end

@news = []
news_feeds.each do |widget_id, feed|
  begin
    @news.push(News.new(widget_id, feed))
  rescue Exception => e
    puts e.to_s
  end
end

SCHEDULER.every '60m', :first_in => 0 do |job|
  @news.each do |news|
    headlines = news.latest_headlines()
    send_event(news.widget_id, { :headlines => headlines })
  end
end
The example RSS feed works correctly because the URL points to an actual XML file. However, I want to use this with a different RSS feed that does not serve a plain XML file. The feed I want is at http://www.ttc.ca/RSS/Service_Alerts/index.rss
This doesn't seem to display anything on the widget. Instead of "http://www.ttc.ca/RSS/Service_Alerts/index.rss", I also tried "http://www.ttc.ca/RSS/Service_Alerts/index.rss?format=xml" and "view-source:http://www.ttc.ca/RSS/Service_Alerts/index.rss", but with no luck. Does anyone know how I can get the actual XML data for this RSS feed so that I can use it with this Ruby script?

You're right, that link does not serve regular RSS XML, so your script won't parse it; it's written specifically for the structure of the example feed. The feed you're trying to parse is RDF/XML, and you can use the rdf-rdfxml gem to parse it.
Something like:
require 'nokogiri'
require 'rdf/rdfxml'

rss_feed = 'http://www.ttc.ca/RSS/Service_Alerts/index.rss'

RDF::RDFXML::Reader.open(rss_feed) do |reader|
  # use reader to iterate over elements within the document
end
From here you can learn how to use rdf-rdfxml to extract the content you want. I'd begin by inspecting the reader object for methods I could use:
puts reader.methods.sort - Object.methods
That will print out the reader's own methods; look for one you might be able to use for your purposes, such as reader.each_entry
To further dig down you can inspect what each entry looks like:
reader.each_entry do |entry|
  puts "----here's an entry----"
  puts entry.inspect
end
or see what methods you can call on the entry:
reader.each_entry do |entry|
  puts "----here's an entry's methods----"
  puts entry.methods.sort - Object.methods
  break
end
I was able to crudely find some titles and descriptions using this hack job:
RDF::RDFXML::Reader.open('http://www.ttc.ca/RSS/Service_Alerts/index.rss') do |reader|
  reader.each_object do |object|
    puts object.to_s if object.is_a? RDF::Literal
  end
end
# returns:
# TTC Service Alerts
# http://www.ttc.ca/Service_Advisories/index.jsp
# TTC Service Alerts.
# TTC.ca
# http://www.ttc.ca
# http://www.ttc.ca/images/ttc-main-logo.gif
# Service Advisory
# http://www.ttc.ca/Service_Advisories/all_service_alerts.jsp#Service+Advisory
# 196 York University Rocket route diverting northbound via Sentinel, Finch due to a collision that has closed the York U Bus way.
# - Affecting: Bus Routes: 196 York University Rocket
# 2013-12-17T13:49:03.800-05:00
# Service Advisory (2)
# http://www.ttc.ca/Service_Advisories/all_service_alerts.jsp#Service+Advisory+(2)
# 107B Keele North route diverting northbound via Keele, Lepage due to a collision that has closed the York U Bus way.
# - Affecting: Bus Routes: 107 Keele North
# 2013-12-17T13:51:08.347-05:00
But I couldn't quickly find a way to know which one was a title, and which a description :/
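One approach that might get you there (a sketch I haven't verified against this feed, and it assumes the feed uses the standard RSS 1.0 vocabulary at http://purl.org/rss/1.0/): every statement the reader yields carries its predicate, so you can filter on that instead of dumping bare literals:
require 'rdf/rdfxml'

# Assumption: titles/descriptions are expressed with the RSS 1.0 vocabulary
RSS10 = RDF::Vocabulary.new('http://purl.org/rss/1.0/')

RDF::RDFXML::Reader.open('http://www.ttc.ca/RSS/Service_Alerts/index.rss') do |reader|
  reader.each_statement do |statement|
    if statement.predicate == RSS10.title
      puts "title: #{statement.object}"
    elsif statement.predicate == RSS10.description
      puts "description: #{statement.object}"
    end
  end
end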
Finally, if you still can't find how to extract what you want, start a new question with this info.
Good luck!

Related

Why do I get "undefined local variable or method" in my code?

I want to scrape links from a Google search query.
I can't save the results in an array (links):
error : test.rb:17:in `parse_result': undefined local variable or
method `links' for main:Object (NameError)
This is my code:
require 'open-uri'
require 'nokogiri'

doc = Nokogiri::HTML(open('https://www.google.fr/search?q=estimation+immobilier'))
links = []

def parse_results(doc)
  doc.search('.g').map do |element|
    parse_block(element)
  end
end

def parse_block(element)
  tempo = element.search('.r').to_s
  links << tempo.scan(/<a href=\"\/url\?q=(.*)&sa=U/)[0][0]
end

parse_results(doc)
puts links
The problem is variable scope, and it's a very common one.
I'd rewrite the code like this:
require 'open-uri'
require 'nokogiri'

doc = Nokogiri::HTML(open('https://www.google.fr/search?q=estimation+immobilier'))

def parse_results(doc)
  _links = []
  doc.search('.g').each do |element|
    _links << parse_block(element)
  end
  _links
end

def parse_block(element)
  tempo = element.search('.r').to_s
  tempo.scan(/<a href=\"\/url\?q=(.*)&sa=U/)[0][0]
end

links = parse_results(doc)
puts links
links could be defined as an instance, class, or global variable, but all of those have code smell. You'd be trying to circumvent scoping, which is really your friend when it comes to keeping variables from living longer, and being visible more widely, than they need to be.
scan is going to return an array of results, so push its results to _links.
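A quick illustration with toy strings of my own, which also shows why the original code needed [0][0]: when the pattern contains capture groups, scan returns an array of group arrays rather than an array of matches:
"a q=1&x b q=2&x".scan(/q=(\d)&/) # => [["1"], ["2"]]
"ab12 cd34".scan(/[a-z]+\d+/)     # => ["ab12", "cd34"]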
map wasn't the right method for what you're doing; each is more appropriate since you're looping over the results of searching for class="g" in the HTML. Using map, you could write parse_results() like:
def parse_results(doc)
  doc.search('.g').map { |element| parse_block(element) }
end
parse_block() isn't written correctly, or at least it can be written a lot more idiomatically for Nokogiri. If you ever have to resort to using regex when using an XML or HTML parser, you know there's something that should be reconsidered. Looking at what's happening, here's what the code sees as it dives through parse_results() and parse_block():
doc.search('.g').first.search('.r').to_s
# => "<h3 class=\"r\"><b>Estimation immobiliere</b> gratuite - MeilleursAgents.com</h3>\n"
You're trying to grab a parameter from the links, so use Nokogiri to do that cleanly, instead of trying to use a pattern and scan. I opened the page and parsed it as you did, then tried this:
doc.search('.g h3.r a').map(&:to_html)
# => ["<b>Estimation immobiliere</b> gratuite - MeilleursAgents.com",
# "<b>Estimation immobili\u00E8re</b> gratuite (maison, appartement <b>...</b> - Drimki",
# "<b>Estimation immobili\u00E8re</b> avec Particulier \u00E0 Particulier | De <b>...</b> - P.a.p",
# "LaCoteImmo - <b>Estimation immobili\u00E8re</b> et Prix immobilier",
# "<b>Estimation immobili\u00E8re</b> - Efficity",
# "<b>Estimation</b> gratuite d'un bien <b>immobilier</b> - ParuVendu",
# "<b>Estimer</b> la valeur de son bien <b>immobilier</b>- Meilleurtaux.com",
# "<b>Estimation immobiliere</b> gratuite avec MeilleursAgents.com",
# "<b>Estimation</b> gratuite de votre appartement en ligne - Refleximmo",
# "<b>Estimation Immobili\u00E8re</b> - Immobilier - Capital.fr"]
A bit more comprehensive CSS narrowed down the returned results significantly.
A bit of tweaking results in:
doc.search('.g h3.r a').map{ |a| a['href'] }
# => ["/url?q=http://www.meilleursagents.com/estimation-immobiliere/&sa=U&ei=OoHSU7KaEszwoAS__YDoCA&ved=0CBQQFjAA&usg=AFQjCNFNCH0iR3pr0fQX6wSjcj1_s3CsRg",
# "/url?q=http://www.drimki.fr/estimation-immobiliere-gratuite&sa=U&ei=OoHSU7KaEszwoAS__YDoCA&ved=0CBoQFjAB&usg=AFQjCNGUbFcsWWQY-bc8Vu-d-GD9YFcbVg",
# "/url?q=http://www.pap.fr/evaluation/estimation-immobiliere&sa=U&ei=OoHSU7KaEszwoAS__YDoCA&ved=0CCAQFjAC&usg=AFQjCNGztbZlDWWGS4kNPHzR06ayRdAQKg",
# "/url?q=http://www.lacoteimmo.com/&sa=U&ei=OoHSU7KaEszwoAS__YDoCA&ved=0CCYQFjAD&usg=AFQjCNEZK_JVduJKJvFpDDXu4yIsTXGMFg",
# "/url?q=http://www.efficity.com/estimation-immobiliere/&sa=U&ei=OoHSU7KaEszwoAS__YDoCA&ved=0CCwQFjAE&usg=AFQjCNHHc-GuJoHXTx3N3_Ex_fz1KUp1cg",
# "/url?q=http://www.paruvendu.fr/pa/prix-immobilier-prix-m2-estimation-gratuite-bien-immobilier/&sa=U&ei=OoHSU7KaEszwoAS__YDoCA&ved=0CDIQFjAF&usg=AFQjCNGmwWmo19asoooWz6Lbh0YMOC8wlg",
# "/url?q=http://www.meilleurtaux.com/services-immo/vendre-un-bien-immobilier/estimation-immobiliere.html&sa=U&ei=OoHSU7KaEszwoAS__YDoCA&ved=0CDgQFjAG&usg=AFQjCNFJ_fAsPBmZvVU60jRLh-yKzvuEiw",
# "/url?q=http://prix-immobilier.latribune.fr/estimation-immobiliere/&sa=U&ei=OoHSU7KaEszwoAS__YDoCA&ved=0CD4QFjAH&usg=AFQjCNHHaVmKGg4jiaT-6AwZAfby2-H4sg",
# "/url?q=http://www.refleximmo.com/estimation-immobiliere-gratuite-appartement&sa=U&ei=OoHSU7KaEszwoAS__YDoCA&ved=0CEQQFjAI&usg=AFQjCNGiBMMYrK-EO9wqIh82eW2uFT0n8w",
# "/url?q=http://www.capital.fr/immobilier/estimation-immobiliere&sa=U&ei=OoHSU7KaEszwoAS__YDoCA&ved=0CEoQFjAJ&usg=AFQjCNEf8FQuKCYBMXBB5FA2dJ2gor4Wmg"]
At this point it's obvious we're looking at an array of absolute URLs, which can be handled using Ruby's built-in URI class:
require 'uri'

doc.search('.g h3.r a').map{ |a|
  uri = URI.parse(a['href'])
  query_hash = Hash[URI::decode_www_form(uri.query)]
  query_hash['q']
}
# => ["http://www.meilleursagents.com/estimation-immobiliere/",
#     "http://www.drimki.fr/estimation-immobiliere-gratuite",
#     "http://www.pap.fr/evaluation/estimation-immobiliere",
#     ...
That should give you enough information to rewrite your code a bit more robustly. Regular expressions are not good tools for parsing HTML, and it's better to use well-tested, pre-built wheels whenever possible, like URI.
The reason I say this approach is more robust is because of this piece of code:
links << tempo.scan(/<a href=\"\/url\?q=(.*)&sa=U/)[0][0]
That search is very prone to breaking. URL formats can change quickly, especially if a site suspects people are scraping its pages and doesn't want that to happen, as Google does. They could easily change the order of the parameters or the way the link is written in the page, since HTML allows very liberal formatting of the source and a browser will still render the same view to the user. Imagine the fun you'd have if Google chose to render a link like:
<a
href="/url?amp;sa=U&q=...
The regex would break, causing your code to break, whereas using URI and Nokogiri to drill down would continue to work.
It works if you make links an instance variable:
require 'open-uri'
require 'nokogiri'

doc = Nokogiri::HTML(open('https://www.google.fr/search?q=estimation+immobilier'))
@links = []

def parse_results(doc)
  doc.search('.g').map do |element|
    parse_block(element)
  end
end

def parse_block(element)
  tempo = element.search('.r').to_s
  @links << tempo.scan(/<a href=\"\/url\?q=(.*)&sa=U/)[0][0]
end

parse_results(doc)
puts @links
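For what it's worth, the reason this works: at the top level, @links belongs to the main object, and methods defined with def at the top level also execute with main as self, so they all see the same instance variable. A tiny illustration (names are mine):
@counter = 0

def bump
  @counter += 1
end

bump
puts @counter # => 1
Still, the scope-respecting rewrite above is the cleaner approach.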

Structuring Nokogiri output without HTML tags

I got Ruby to travel to a web site, iterate through a list of campaigns, and scrape the pages for specific data. The problem I have now is getting that data out of the structure Nokogiri gives me and outputting it in a readable form.
require 'watir-webdriver'
require 'nokogiri'

campaign_list = Array.new
campaign_list.push(1042360, 1042386, 1042365, 992307)

browser = Watir::Browser.new :chrome
browser.goto '<redacted>'
browser.text_field(:id => 'email').set '<redacted>'
browser.text_field(:id => 'password').set '<redacted>'
browser.send_keys :enter

file = File.new('hourlysales.csv', 'w')
data = {}

campaign_list.each do |campaign|
  browser.goto "<redacted>"
  if browser.text.include? "Application Error"
    puts "Error loading page, I recommend restarting script"
    # Possibly automatic restart of script
  else
    hourly_data = Nokogiri::HTML.parse(browser.html).text
    # file.write data
    puts hourly_data
  end
end
This is the output I get:
{"views":[[17,145],[18,165],[19,99],[20,71],[21,31],[22,26],[23,10],[0,15],[1,1], [2,18],[3,19],[4,35],[5,47],[6,44],[7,67],[8,179],[9,141],[10,112],[11,95],[12,46],[13,82],[14,79],[15,70],[16,103]],"orders":[[17,10],[18,9],[19,5],[20,1],[21,1],[22,0],[23,0],[0,1],[1,0],[2,1],[3,0],[4,1],[5,2],[6,1],[7,5],[8,11],[9,6],[10,5],[11,3],[12,1],[13,2],[14,4],[15,6],[16,7]],"conversion_rates":[0.06870229007633588,0.05442176870748299,0.050505050505050504,0.014084507042253521,0.03225806451612903,0.0,0.0,0.06666666666666667,0.0,0.05555555555555555,0.0,0.02857142857142857,0.0425531914893617,0.022727272727272728,0.07462686567164178,0.06134969325153374,0.0425531914893617,0.044642857142857144,0.031578947368421054,0.021739130434782608,0.024390243902439025,0.05063291139240506,0.08571428571428572,0.06741573033707865]}
The "views" value is a list of [hour, number of views] pairs, and "orders" works the same way. I don't need the conversion rates.
I also need to add up the values for each key, so after doing this for 5 pages I'll have one key for each hour of the day and the total number of views for that hour. I tried a couple of each loops, but couldn't make any progress.
I appreciate any help you guys can give me.
It looks like the output (which from your code I assume is the content of hourly_data) is JSON. In that case, it's easy to parse and add up the numbers. Something like this:
require "json" # at the top of your script
# ...
def sum_hours_values(data, hours_values=nil)
# Start with an empty hash that automatically initializes missing keys to `0`
hours_values ||= Hash.new {|hsh,hour| hsh[hour] = 0 }
# Iterate through the [hour, value] arrays, adding `value` to the running
# count for that `hour`, and return `hours_values`
data.each_with_object(hours_values) do |(hour, value), hsh|
hsh[hour] += value
end
end
# ... Watir/Nokogiri stuff here...
# Initialize these so they persist outside the loop
hours_views, orders_views = nil
campaign_list.each do |campaign|
browser.goto "<redacted>"
if browser.text.include? "Application Error"
# ...
else
# ...
hourly_data_parsed = JSON.parse(hourly_data)
hours_views = sum_hours_values(hourly_data_parsed["views"], hours_views)
hours_orders = sum_hours_values(hourly_data_parsed["orders"], orders_views)
end
end
puts "Views by hour:"
puts hours_views.sort.map {|hour_views| "%2i\t%4i" % hour_views }
puts "Orders by hour:"
puts hours_orders.sort.map {|hour_orders| "%2i\t%4i" % hour_orders }
P.S. There's a really nice recursive version of sum_hours_values I didn't include since the iterative version is clearer to most Ruby programmers. If you're into recursion I leave it as an exercise for you. ;)
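If you want a quick sanity check of the accumulator before wiring it into the scraper, feed it hand-made pairs (numbers invented for illustration):
sample = [[17, 145], [18, 165], [17, 5]]
totals = sum_hours_values(sample)
p totals.sort # => [[17, 150], [18, 165]]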

Jekyll - generating JSON files alongside the HTML files

I'd like to make Jekyll create an HTML file and a JSON file for each page and post. This is to offer a JSON API of my Jekyll blog - e.g. a post can be accessed either at /posts/2012/01/01/my-post.html or /posts/2012/01/01/my-post.json
Does anyone know if there's a Jekyll plugin, or how I would begin to write such a plugin, to generate two sets of files side-by-side?
I was looking for something like this too, so I learned a bit of Ruby and made a script that generates JSON representations of Jekyll blog posts. I'm still working on it, but most of it is there.
I put this together with Grunt, Sass, Backbone.js, Require.js and CoffeeScript. If you like, you can take a look at my jekyll-backbone project on GitHub.
# encoding: utf-8
#
# Title:
# ======
# Jekyll to JSON Generator
#
# Description:
# ============
# A plugin for generating JSON representations of your
# site content for easy use with JS MVC frameworks like Backbone.
#
# Author:
# ======
# Jezen Thomas
# jezenthomas@gmail.com
# http://jezenthomas.com
module Jekyll
  require 'json'

  class JSONGenerator < Generator
    safe true
    priority :low

    def generate(site)
      # Converter for .md > .html
      converter = site.getConverterImpl(Jekyll::Converters::Markdown)

      # Iterate over all posts
      site.posts.each do |post|
        # Encode the HTML to JSON
        hash = { "content" => converter.convert(post.content) }
        title = post.title.downcase.tr(' ', '-').delete("’!")

        # Start building the path
        path = "_site/dist/"

        # Add categories to path if they exist
        if (post.data['categories'].class == String)
          path << post.data['categories'].tr(' ', '/')
        elsif (post.data['categories'].class == Array)
          path << post.data['categories'].join('/')
        end

        # Add the sanitized post title to complete the path
        path << "/#{title}"

        # Create the directories from the path
        FileUtils.mkpath(path) unless File.exists?(path)

        # Create the JSON file and inject the data
        f = File.new("#{path}/raw.json", "w+")
        f.puts JSON.generate(hash)
      end
    end
  end
end
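If you want to try this, saving it as a .rb file in your site's _plugins directory should be enough for Jekyll to load it on the next build (note that custom plugins are skipped when Jekyll runs in safe mode).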
There are two ways you can accomplish this, depending on your needs. If you want to use a layout to accomplish the task, then you want to use a Generator. You would loop through each page of your site and generate a new .json version of the page. You could optionally make which pages get generated conditional upon the site.config or the presence of a variable in the YAML front matter of the pages. Jekyll uses a generator to handle slicing blog posts up into indices with a given number of posts per page.
The second way is to use a Converter (same link, scroll down). The converter will allow you to execute arbitrary code on your content to translate it to a different format. For an example of how this works, check out the markdown converter that comes with Jekyll.
I think this is a cool idea!
Take a look at JekyllBot and the following code.
require 'json'

module Jekyll
  class JSONPostGenerator < Generator
    safe true

    def generate(site)
      site.posts.each do |post|
        render_json(post, site)
      end
      site.pages.each do |page|
        render_json(page, site)
      end
    end

    def render_json(post, site)
      # add `json: false` to the YAML front matter to prevent JSONification
      if post.data.has_key? "json" and !post.data["json"]
        return
      end

      path = post.destination( site.source )

      # only act on posts/pages indexed in /index.html
      return if /\/index\.html$/.match(path).nil?

      # change the file path
      path['/index.html'] = '.json'

      # render the post using no template(s)
      post.render( {}, site.site_payload )

      # prepare the output for JSON
      post.data["related_posts"] = related_posts(post, site)
      output = post.to_liquid
      output["next"] = output["next"].id unless output["next"].nil?
      output["previous"] = output["previous"].id unless output["previous"].nil?

      # write
      # TODO: figure out how to overwrite post.destination
      # so we can just use post.write
      FileUtils.mkdir_p(File.dirname(path))
      File.open(path, 'w') do |f|
        f.write(output.to_json)
      end
    end

    def related_posts(post, site)
      related = []
      return related unless post.instance_of?(Post)

      post.related_posts(site.posts).each do |post|
        related.push :url => post.url, :id => post.id, :title => post.to_liquid["title"]
      end
      related
    end
  end
end
Both should do exactly what you want.

Understanding Ruby code?

I was wondering if anyone can help me understand the Ruby code below? I'm pretty new to Ruby programming and I'm having trouble understanding what each function does.
When I run this with my Twitter username and password as parameters, I get a stream of sample tweets. What do I need to do to make this code display only the hashtags?
I'm trying to gather the hashtags every 30 seconds, then sort them from least to most occurrences.
Not looking for solutions, but for ideas. Thanks!
require 'eventmachine'
require 'em-http'
require 'json'

usage = "#{$0} <user> <password>"
abort usage unless user = ARGV.shift
abort usage unless password = ARGV.shift

url = 'https://stream.twitter.com/1/statuses/sample.json'

def handle_tweet(tweet)
  return unless tweet['text']
  puts "#{tweet['user']['screen_name']}: #{tweet['text']}"
end

EventMachine.run do
  http = EventMachine::HttpRequest.new(url).get :head => { 'Authorization' => [ user, password ] }
  buffer = ""
  http.stream do |chunk|
    buffer += chunk
    while line = buffer.slice!(/.+\r?\n/)
      handle_tweet JSON.parse(line)
    end
  end
end
puts "#{tweet['user']['screen_name']}: #{tweet['text']}"
That line shows you a user name followed by the content of the tweet.
Let's take a step back for a sec.
Hash tags appear inside the tweet's content--this means they're inside tweet['text']. A hash tag always takes the form of a # followed by a bunch of non-space characters. That's really easy to grab with a regex. Ruby's core API facilitates that via String#scan. Example:
"twitter is short #foo yawn #bar".scan(/\#\w+/) # => ["#foo", "#bar"]
What you want is something like this:
def handle_tweet(tweet)
  return unless tweet['text']
  # puts "#{tweet['user']['screen_name']}: #{tweet['text']}" # OLD
  puts tweet['text'].scan(/\#\w+/).to_s
end
tweet['text'].scan(/#\w+/) is an array of strings. You can do whatever you want with that array. Supposing you're new to Ruby and want to print the hash tags to the console, here's a brief note about printing arrays with puts:
puts array # => "#foo\n#bar"
puts array.to_s # => '["#foo", "#bar"]'
# Load libraries
require 'eventmachine'
require 'em-http'
require 'json'

# Looks like this section assumes you're calling this from the command line.
usage = "#{$0} <user> <password>" # $0 returns the name of the program
abort usage unless user = ARGV.shift # Return first argument passed when program called
abort usage unless password = ARGV.shift

# The URL
url = 'https://stream.twitter.com/1/statuses/sample.json'

# Method which, when called later, prints out the tweets
def handle_tweet(tweet)
  return unless tweet['text'] # Ensures tweet object has 'text' property
  puts "#{tweet['user']['screen_name']}: #{tweet['text']}" # write the result
end

# Create an HTTP request obj to URL above with user authorization
EventMachine.run do
  http = EventMachine::HttpRequest.new(url).get :head => { 'Authorization' => [ user, password ] }

  # Initiate an empty string for the buffer
  buffer = ""

  # Read the stream by line
  http.stream do |chunk|
    buffer += chunk
    while line = buffer.slice!(/.+\r?\n/) # cut each line at newline
      handle_tweet JSON.parse(line) # send each tweet object to handle_tweet method
    end
  end
end
The above is a commented version of what the source is doing. If you just want the hashtags, you'll want to rewrite handle_tweet to something like this:
def handle_tweet(tweet)
  return unless tweet['text']
  tweet['text'].scan(/#\w+/) do |tag|
    puts tag
  end
end
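Since you also want to gather hashtags every 30 seconds and sort them by occurrences, here's a hedged sketch of how that could be layered onto the script above. The counts hash and the timer interval are my additions, and it reuses url, user, and password from the original setup:
counts = Hash.new(0) # tag => number of occurrences

EventMachine.run do
  http = EventMachine::HttpRequest.new(url).get :head => { 'Authorization' => [ user, password ] }
  buffer = ""

  http.stream do |chunk|
    buffer += chunk
    while line = buffer.slice!(/.+\r?\n/)
      tweet = JSON.parse(line)
      next unless tweet['text']
      tweet['text'].scan(/#\w+/) { |tag| counts[tag.downcase] += 1 }
    end
  end

  # Every 30 seconds, print tags sorted from least to most frequent, then reset
  EventMachine.add_periodic_timer(30) do
    counts.sort_by { |_tag, n| n }.each { |tag, n| puts "#{n}\t#{tag}" }
    counts.clear
  end
end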

Ruby EOFError with open-uri and loop

I'm attempting to build a web crawler and ran into a bit of a snag. Basically what I'm doing is extracting the links from a web page and pushing each link to a queue. Whenever the Ruby interpreter hits this section of code:
links.each do |link|
  url_frontier.push(link)
end
I receive the following error:
/home/blah/.rvm/rubies/ruby-1.9.3-p0/lib/ruby/1.9.1/net/protocol.rb:141:in `read_nonblock': end of file reached (EOFError)
If I comment out the above block of code I get no errors. Please, any help would be appreciated. Here is the rest of the code:
require 'open-uri'
require 'net/http'
require 'uri'

class WebCrawler
  def self.Spider(root)
    eNDCHARS = %{.,'?!:;}
    num_documents = 0
    token_list = []
    url_repository = Hash.new
    url_frontier = Queue.new

    url_frontier.push(root.to_s)
    while !url_frontier.empty? && num_documents < 10
      url = url_frontier.pop
      if !url_repository.has_key?(url)
        document = open(url)
        html = document.read

        # extract urls
        links = URI.extract(html, ['http']).collect { |u| eNDCHARS.index(u[-1]) ? u.chop : u }
        links.each do |link|
          url_frontier.push(link)
        end

        # tokenize
        Tokenizer.tokenize(document).each do |word|
          token_list.push(IndexStructures::Term.new(word, url))
        end

        # add to the repository
        url_repository[url] = true
        num_documents += 1
      end
    end

    # sort by term (primary) and document id (secondary) in reverse to aid in
    # the construction of the inverted index
    return num_documents, token_list.sort_by! { |term| [term.term, term.document_id] }.reverse!
  end
end
I encountered the same error, but with watir-webdriver, running Firefox in headless mode. What I found was that if I ran two of my applications in parallel and destroyed "headless" in one of them, it automatically killed the other one as well, with the exact error you quoted. Though my situation is not the same as yours, I think the issue is related to a file handle being closed prematurely from outside while your application is still using it. I removed the destroy command from my application and the error disappeared.
Hope this helps.
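More generally, since the trace points at net/protocol during the read, another option is to treat EOFError as a per-URL failure and skip that link instead of letting it kill the crawl. A sketch (the list of rescued errors is my guess at what's worth skipping, not something taken from your trace):
require 'open-uri'

def fetch_html(url)
  open(url).read
rescue EOFError, OpenURI::HTTPError, SocketError, Errno::ECONNRESET => e
  warn "Skipping #{url}: #{e.class}: #{e.message}"
  nil
end
Then in the crawl loop, call html = fetch_html(url) and move on to the next URL whenever it returns nil.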
