I am using GDAL 1.7.1 from Ruby 1.9 to generate GeoTIFF files. The tutorial recommends calling GDALClose() to close a dataset and flush any remaining content to the filesystem. The same happens in the dataset's destructor. The problem is that the Ruby bindings rely on this destructor mechanism to close the dataset, and I need the finished file within the same process that generates it. Since Ruby is garbage collected, it seems I cannot reliably close my files without exiting the Ruby process. For now I have patched my copy of GDAL to expose the GDALClose method, but this doesn't seem to be a good long-term solution.
require 'gdal/gdal'
[...]
# open the driver for geotiff format
driver = Gdal::Gdal.get_driver_by_name('GTiff')
# create a new file
target_map = driver.create(output_path,
                           xsize,
                           ysize, 3,
                           Gdal::Gdalconst::GDT_UINT16, ["PHOTOMETRIC=RGB"])
# write band data
3.times do |i|
  band = target_map.band(i + 1)
  target_map.write_band(i + 1, mapped_data)
end
# now I would like to use the file in output_path, but at this point
# it seems large parts of the data still reside in memory until
# target_map is destroyed
file = File.open( output_path, "r" )
[...]
Is there something in either ruby or swig to force the destructor call, that I may have overlooked?
Normally what is done with the GDAL bindings in Python is to set the objects to None. So in Ruby, this would be nil:
band = nil
target_map = nil
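It's a funny way to save/flush/close the data, but it is how it is done. Note that Ruby, unlike CPython, does not use reference counting, so setting the variables to nil only makes the objects eligible for collection; you may also need to call GC.start afterwards, and even then the timing of the destructor is not strictly guaranteed.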
I'd like to get a word list from a text file using Ruby. I found how to use regex to parse only words here, so I made a script like following:
src = File.open("text.txt")
word_list = []
src.each do |line|
  word_list << line.downcase.split(/[^[:alpha:]]/).delete_if {|x| x == ""}
end
word_list.flatten!.uniq!.sort!
p word_list
And the following is a sample text file text.txt:
TextMate may be the latest craze for developing Ruby on Rails
applications, but Vim is forever. This plugin offers the following
features for Ruby on Rails application development.
Automatically detects buffers containing files from Rails applications, and applies settings to those buffers (and only those
buffers). You can use an autocommand to apply your own custom
settings as well.
Unintrusive. Only files in a Rails application should be affected; regular Ruby scripts are left untouched. Even when enabled, the
plugin should keep out of your way if you're not using its features.
Easy navigation of the Rails directory structure. gf considers context and knows about partials, fixtures, and much more. There are
two commands, :A (alternate) and :R (related) for easy jumping between
files, including favorites like model to migration, template to
helper, and controller to functional test. For more advanced usage,
:Rmodel, :Rview, :Rcontroller, and several other commands are
provided.
As a Ruby novice, I'd like to learn better solutions to this problem (clearer, more concise, and following conventions).
Thanks for any advice and corrections.
More idiomatic code would be:
word_list = File.foreach("text.txt")
  .flat_map { |line| line.downcase.split(/[^[:alpha:]]/).reject(&:empty?) }
  .uniq
  .sort
# I suppose you want to process each line and collect the results
word_list = File.foreach("text.txt").collect do |line|
  # collecting is done via collect above, so no explicit << is needed
  # .reject(&:empty?) calls .empty? on each element
  line.downcase.split(/[^[:alpha:]]/).reject(&:empty?)
  # you can chain on blocks as well; the non-bang methods are used because
  # flatten! and uniq! return nil when they change nothing
end.flatten.uniq.sort
p word_list
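As a further sketch of the same idea, the split-and-reject step can be avoided by matching the words directly with String#scan. The helper name word_list and the sample sentence are made up for illustration; this is one possible approach, not the only idiomatic one.

```ruby
# Hypothetical helper: returns the sorted, de-duplicated words in +text+.
# scan collects every run of alphabetic characters, so no empty strings
# are produced and nothing has to be rejected afterwards.
def word_list(text)
  text.downcase.scan(/[[:alpha:]]+/).uniq.sort
end

words = word_list("TextMate may be the latest craze, but Vim is forever. Vim!")
p words
```

For a file, the same helper could be called as word_list(File.read("text.txt")), assuming the file is small enough to read into memory at once.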
When trying to open() remote images, some come back as a StringIO and others as a File. How do I force a File?
data = open("http://graph.facebook.com/61700024/picture?type=square")
=> #<StringIO:0x007fd09b013948>
data = open("http://28.media.tumblr.com/avatar_7ef57cb42cb0_64.png")
=> #<StringIO:0x007fd098bf9490>
data = open("http://25.media.tumblr.com/avatar_279ec8ee3427_64.png")
=> #<File:/var/folders/_z/bb18gdw52ns0x5r8z9f2ncj40000gn/T/open-uri20120229-9190-mn52fu>
I'm using Paperclip to save remote images (which are stored in S3), so basically I want to do:
user = User.new
user.avatar = open(url)
user.save
Open-URI has a 10KB limit on StringIO objects; anything above that is stored as a temp file.
One way around this is to change the constant that Open-URI uses as the StringIO size limit. Setting it to 0 means every response is written to a temp file:
OpenURI::Buffer.send :remove_const, 'StringMax' if OpenURI::Buffer.const_defined?('StringMax')
OpenURI::Buffer.const_set 'StringMax', 0
Add that to your initialiser and you should be good to go.
While steiger's solution is a simple all-around solution, some of us might be repelled by the "nasty hack" feeling of it and by the way it changes behaviour globally, including for other gems that might benefit from or depend on this feature of OpenURI. Of course, you could also use the above approach and reset the constant to its original value when you're done, and because of the GIL you might get away with that sort of nastiness as well (though be sure to stay away from JRuby and threads then!).
Alternatively you could do something like this, which basically ensures that if you get a stream, it is piped to a temp file:
require 'tempfile'
require 'mime/types'

def write_stream_to_a_temp_file(stream)
  ext = begin
    "." + MIME::Types[stream.meta["content-type"]].first.extensions.first
  rescue # in case metadata is not available
    # It seems sometimes the content-type is binary/octet-stream.
    # In this case we should grab the original extension.
    File.extname(stream.base_uri.path)
  end
  file = Tempfile.new ["temp", ext]
  begin
    file.binmode
    file.write stream.read
  ensure
    file.flush rescue nil
    file.close rescue nil
  end
  file
end
# and when you want to enforce that data must be a temp file then just...
data = write_stream_to_a_temp_file data unless data.is_a? Tempfile
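If pulling in the mime-types gem is not desirable, a smaller variant of the same idea might look like the sketch below. The helper name ensure_tempfile and the caller-supplied extension are assumptions made for illustration, and a StringIO stands in for the result of open(url) so that the example runs without network access. Unlike the version above, it leaves the temp file open and rewound.

```ruby
require 'stringio'
require 'tempfile'

# Hypothetical helper: returns +io+ unchanged if it is already a Tempfile,
# otherwise copies its content into a new Tempfile with the given extension.
def ensure_tempfile(io, ext = "")
  return io if io.is_a?(Tempfile)
  file = Tempfile.new(["temp", ext])
  file.binmode
  file.write(io.read)
  file.flush
  file.rewind # leave the file open and positioned at the start
  file
end

small_download = StringIO.new("fake image bytes") # stands in for open(url)
data = ensure_tempfile(small_download, ".png")
```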
Is it possible to open every link in a certain div and collect the values of the opened fields all together in one file, or at least in terminal output?
I am trying to get a list of coordinates from all markers visible on a Google map.
all_links = b.div(:id, "kmlfolders").links
all_links.each do |link|
  b.link.click
  b.link(:text, "Norādījumi").click
  puts b.text_field(:title, "Galapunkta_adrese").value
end
Are there easier or more effective ways to automatically collect coordinates from all markers?
Unless there is other data already in the HTML that you could pick through (alt tags? elements invoked on hover?), that does seem like the most practical way to iterate through the links. However, from what I can see you are not actually using the 'link' object inside your loop. You'd need something more like this, I think:
all_links = b.div(:id, "kmlfolders").links
all_links.each do |thelink|
  b.link(:href => thelink.href).click
  b.link(:text, "Norādījumi").click
  puts b.text_field(:title, "Galapunkta_adrese").value
end
Using their API is probably a much more effective way to get what you want, however. That's why people make APIs, after all, and if one is available, using it is almost always best. Using a test tool as a screen scraper to gather the information is liable to be a lot harder in the long run than learning how to make some API calls and get the data that way.
For web-based APIs and Ruby I find the rest-client gem works great; other folks like HTTParty.
As I'm not already familiar with the Google API, I find it hard to dig into it for one particular need. Therefore I made a short watir-webdriver script for collecting the coordinates of markers on a protected Google map. The resulting file is used by a Python script that creates speedcam files for navigation devices.
In this case it's a speedcam map maintained and updated by the Latvian police, but the script can probably be used with any Google map just by replacing the URL.
# encoding: utf-8
require "rubygems"
require "watir-webdriver"
@b = Watir::Browser.new :ff
#--------------------------------
@b.goto "http://maps.google.com/maps?source=s_q&f=q&hl=lv&geocode=&q=htt%2F%2Fmaps.google.com%2Fmaps%2Fms%3Fmsid%3D207561992958290099079.0004b731f1c645294488e%26msa%3D0%26output%3Dkml&aq=&sll=56.799934,24.5753&sspn=3.85093,8.64624&ie=UTF8&ll=56.799934,24.5753&spn=3.610137,9.887695&z=7&vpsrc=0&oi=map_misc&ct=api_logo"
@b.div(:id, "kmlfolders").wait_until_present
all_markers = @b.div(:id, "kmlfolders").divs(:class, "fdrlt")
@prev_coordinates = 1
puts "#{all_markers.length} speedcam markers detected"
File.open("list_of_coordinates.txt","w") do |outfile|
  all_markers.each do |marker|
    sleep 1
    marker.click
    sleep 1
    description = @b.div(:id => "iw_kml").text
    @b.span(:class, "actbar-text").click
    sleep 2
    coordinates = @b.text_field(:name, "daddr").value
    redo if coordinates == @prev_coordinates
    puts coordinates
    outfile.puts coordinates
    @prev_coordinates = coordinates
  end
end
puts "Coordinates saved in file!"
@b.close
Works on both Mac OS X 10.7 and Windows 7.
I'm using an opaque API in some Ruby code which takes a File/IO as a parameter. I want to be able to pass it an IO object that only gives access to a given range of data in the real IO object.
For example, I have an 8GB file, and I want to give the API an IO object that exposes a 1GB range in the middle of my real file.
real_file = File.new('my-big-file')
offset = 1 * 2**30 # start 1 GB into it
length = 1 * 2**30 # end 1 GB after start
filter = IOFilter.new(real_file, offset, length)
# The api only sees the 1GB of data in the middle
opaque_api(filter)
The filter_io project looks like it would be the easiest to adapt to do this, but doesn't seem to support this use case directly.
I think you would have to write it yourself, as it seems like a rather specific thing: you would have to implement all of IO's methods (or the subset you need) using a chunk of the opened file as the data source. An example of the difficulty is writing to such a stream: you would have to take care not to cross the boundary of the given segment, i.e. constantly keep track of your current position in the big file. It doesn't seem like a trivial job, and I don't see any shortcuts that could help you there.
Perhaps you can find some OS-based solution, e.g. making a loopback device out of the part of the large file (see man losetup and particularly -o and --sizelimit options, for example).
Variant 2:
If you are ok with keeping the contents of the window in memory all the time, you may wrap StringIO like this (just a sketch, not tested):
require 'stringio'

def sliding_io(filename, offset, length)
  File.open(filename, 'r+') do |f|
    # read the window into a buffer
    f.seek(offset)
    buf = f.read(length)
    # wrap the buffer in a StringIO and pass it to the given block
    StringIO.open(buf) do |buf_io|
      yield(buf_io)
    end
    # write the altered buffer back to the big file
    f.seek(offset)
    f.write(buf[0, length])
  end
end
Use it as you would use the block form of File.open.
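For the read-only case, the "write it yourself" route can stay fairly small if the opaque API only calls a few methods. The sketch below assumes it needs nothing beyond read, rewind and eof?; the class name IOWindow is made up for illustration, and a StringIO stands in for the big file. A real API may call other IO methods (seek, pos, size, each_line, ...), which would have to be added in the same way.

```ruby
require 'stringio'

# Hypothetical read-only window over an existing IO; not part of any library.
class IOWindow
  def initialize(io, offset, length)
    @io = io
    @offset = offset
    @length = length
    @pos = 0 # position relative to the start of the window
  end

  # Reads up to +length+ bytes (or the rest of the window when +length+ is
  # nil), never crossing the end of the window.
  def read(length = nil)
    remaining = @length - @pos
    return (length.nil? || length.zero? ? "" : nil) if remaining <= 0
    count = length.nil? ? remaining : [length, remaining].min
    @io.seek(@offset + @pos)
    data = @io.read(count)
    @pos += data.bytesize if data
    data
  end

  def rewind
    @pos = 0
  end

  def eof?
    @pos >= @length
  end
end

source = StringIO.new("0123456789")  # stands in for the big file
window = IOWindow.new(source, 3, 4)  # exposes only the bytes "3456"
first = window.read(2)               # first two bytes of the window
rest  = window.read                  # remainder of the window
```

Because every read seeks the underlying IO to its own position first, the window stays consistent even if something else moves the file pointer between calls, at the cost of one extra seek per read.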
I believe the IO object has the functionality you are looking for. I've used it before for MD5 hash summing similarly sized files.
require 'digest/md5'

incr_digest = Digest::MD5.new
File.open(filename, 'rb') do |io|
  # read the file in 50000-byte chunks and feed each one to the digest
  while chunk = io.read(50000)
    incr_digest << chunk
  end
end
This was the block I used, where I was passing the chunk to the MD5 Digest object.
http://www.ruby-doc.org/core/classes/IO.html#M000918
I want to preserve the order of the keys in a YAML file loaded from disk, processed in some way and written back to disk.
Here is a basic example of loading YAML in Ruby (v1.8.7):
require 'yaml'
configuration = nil
File.open('configuration.yaml', 'r') do |file|
  configuration = YAML::load(file)
  # at this point configuration is a hash with keys in an undefined order
end
# process configuration in some way
File.open('output.yaml', 'w+') do |file|
  YAML::dump(configuration, file)
end
Unfortunately, this will destroy the order of the keys in configuration.yaml once the hash is built. I cannot find a way of controlling what data structure is used by YAML::load(), e.g. alib's orderedmap.
I've had no luck searching the web for a solution.
Use Ruby 1.9.x. Previous versions of Ruby do not preserve the order of Hash keys, but 1.9 does.
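A minimal sketch of that behaviour, assuming Ruby 1.9 or later with the standard yaml library: keys come back from YAML.load in document order, and YAML.dump writes them out in insertion order. The key names and the inline YAML string are made up for illustration, standing in for configuration.yaml.

```ruby
require 'yaml'

# Keys deliberately not in alphabetical order.
source = "zebra: 1\napple: 2\nmango: 3\n"

config = YAML.load(source)
config["banana"] = 4          # new keys are appended at the end

output = YAML.dump(config)    # keys are emitted in insertion order
puts output
```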
If you're stuck using 1.8.7 for whatever reason (like I am), one workaround, and the one I've resorted to, is active_support/ordered_hash. I know ActiveSupport seems like a big include, but it has been refactored in later versions so that you can require just the part you need and leave the rest out. Just gem install activesupport and require it as shown below. Also, in your YAML file, be sure to use an !!omap declaration (and an array of Hashes). Example time!
# config.yml #
months: !!omap
- january: enero
- february: febrero
- march: marzo
- april: abril
- may: mayo
Here's what the Ruby behind it looks like.
# loader.rb #
require 'yaml'
require 'active_support/ordered_hash'
# Load up the file into a Hash
config = File.open('config.yml','r') { |f| YAML::load f }
# So long as you specified an !!omap, this is actually a
# YAML::PrivateClass, an array of Hashes
puts config['months'].class
# Parse through its value attribute, stick results in an OrderedHash,
# and reassign it to our hash
ordered = ActiveSupport::OrderedHash.new
config['months'].value.each { |m| ordered[m.keys.first] = m.values.first }
config['months'] = ordered
I'm looking for a solution that allows me to recursively dig through a Hash loaded from a .yml file, look for those YAML::PrivateClass objects, and convert them into an ActiveSupport::OrderedHash. I may post a question on that.
Someone else ran into the same issue. There is an ordered hash gem. Note that what it gives you is not a plain Hash but a subclass of Hash. You might give it a try, but if you run into problems using it with YAML, you should consider upgrading to Ruby 1.9.