bad char after creating a Database from csv - ruby

I am trying to create a database using mongoid but it fails to find the create method. I am trying to create 2 databases based on csv files:
extract_data class:
class ExtractData
  include Mongoid::Document
  include Mongoid::Timestamps
  def self.create_all_databases
    @cbsa2msa = DbForCsv.import!('./share/private/csv/cbsa_to_msa.csv')
    @zip2cbsa = DbForCsv.import!('./share/private/csv/zip_to_cbsa.csv')
  end
  def self.show_all_database
    ap @cbsa2msa.all.to_a
    ap @zip2cbsa.all.to_a
  end
end
the class DbForCSV works as below:
require 'csv'

class DbForCsv
  include Mongoid::Document
  include Mongoid::Timestamps
  include Mongoid::Attributes::Dynamic
  def self.import!(file_path)
    columns = []
    instances = []
    CSV.foreach(file_path, encoding: 'iso-8859-1:UTF-8') do |row|
      if columns.empty?
        # We don't want attributes with whitespaces
        columns = row.collect { |c| c.downcase.gsub(' ', '_') }
        next
      end
      instances << create!(build_attributes(row, columns))
    end
    instances
  end
  def self.build_attributes(row, columns)
    attrs = {}
    columns.each_with_index do |column, index|
      attrs[column] = row[index]
    end
    ap attrs
    attrs
  end
  # `private` has no effect on `def self.` methods, so hide the helper explicitly
  private_class_method :build_attributes
end
I do not know all the fields in advance and they may change over time. That is why I created a generic database class and generic methods.
I also have another issue now that the 'create!' problem is fixed.
I set the encoding to make sure only UTF-8 characters are handled, but I still see:
{
"zip" => "71964",
"cbsa" => "31680",
"res_ratio" => "0.086511098",
"bus_ratio" => "0.012048193",
"oth_ratio" => "0.000000000",
"tot_ratio" => "0.082435345"
}
when doing 'ap attrs' in the code. How can I make sure that 'zip' -> 'zip'?
Thanks

create! is a class method but you're trying to call it as an instance method. Your import! method shouldn't be an instance method either, it should be a class method since it produces instances of your class:
def self.import!(file_path)
#-^^^^
# everything else would be the same...
end
You'd also make build_attributes a class method since it is just a helper method for another class method:
def self.build_attributes
#...
end
And then you don't need that odd looking new call when using import!:
def self.create_all_databases
  @cbsa2msa = DbForCsv.import!('./share/private/csv/cbsa_to_msa.csv')
  @zip2cbsa = DbForCsv.import!('./share/private/csv/zip_to_cbsa.csv')
end
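The class-method versus instance-method distinction can be seen without Mongoid at all. The following is only a minimal sketch in plain Ruby: FakeRecord and its in-memory store are hypothetical stand-ins for a Mongoid document, and the rows are given inline instead of being read from a CSV file.

```ruby
# Hypothetical stand-in for a Mongoid document. create! is defined on the
# class (def self.create!), so it is called on the class, not on an instance.
class FakeRecord
  attr_reader :attrs

  def initialize(attrs)
    @attrs = attrs
  end

  # Class-level store; plays the role of the collection.
  def self.store
    @store ||= []
  end

  def self.create!(attrs)
    record = new(attrs)
    store << record
    record
  end

  # Because import! is itself a class method, the bare create! call
  # below resolves to FakeRecord.create!.
  def self.import!(rows)
    header, *data = rows
    columns = header.map { |c| c.downcase.gsub(' ', '_') }
    data.map { |row| create!(columns.zip(row).to_h) }
  end
end

rows = [['Zip', 'CBSA'], ['71964', '31680']]
records = FakeRecord.import!(rows)
p records.first.attrs

# Calling create! on an instance fails, which mirrors the original error.
begin
  FakeRecord.new({}).create!({})
rescue NoMethodError => e
  puts "instance call fails: #{e.class}"
end
```

If import! were written as a plain `def import!`, the bare create! inside it would be looked up on the instance and fail in the same way.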

Related

How do I implement hashids in ruby on rails

I will go ahead and apologize upfront as I am new to ruby and rails and I cannot for the life of me figure out how to implement using hashids in my project. The project is a simple image host. I have it already working using Base58 to encode the sql ID and then decode it in the controller. However I wanted to make the URLs more random hence switching to hashids.
I have placed the hashids.rb file in my lib directory from here: https://github.com/peterhellberg/hashids.rb
Now some of the confusion starts here. Do I need to initialize hashids on every page that uses hashids.encode and hashids.decode via
hashids = Hashids.new("mysalt")
I found this post (http://zogovic.com/post/75234760043/youtube-like-ids-for-your-activerecord-models) which leads me to believe I can put it into an initializer however after doing that I am still getting NameError (undefined local variable or method `hashids' for ImageManager:Class)
so in my ImageManager.rb class I have
require 'hashids'
class ImageManager
class << self
def save_image(imgpath, name)
mime = %x(/usr/bin/exiftool -MIMEType #{imgpath})[34..-1].rstrip
if mime.nil? || !VALID_MIME.include?(mime)
return { status: 'failure', message: "#{name} uses an invalid format." }
end
hash = Digest::MD5.file(imgpath).hexdigest
image = Image.find_by_imghash(hash)
if image.nil?
image = Image.new
image.mimetype = mime
image.imghash = hash
unless image.save!
return { status: 'failure', message: "Failed to save #{name}." }
end
unless File.directory?(Rails.root.join('uploads'))
Dir.mkdir(Rails.root.join('uploads'))
end
#File.open(Rails.root.join('uploads', "#{Base58.encode(image.id)}.png"), 'wb') { |f| f.write(File.open(imgpath, 'rb').read) }
File.open(Rails.root.join('uploads', "#{hashids.encode(image.id)}.png"), 'wb') { |f| f.write(File.open(imgpath, 'rb').read) }
end
link = ImageLink.new
link.image = image
link.save
#return { status: 'success', message: Base58.encode(link.id) }
return { status: 'success', message: hashids.encode(link.id) }
end
private
VALID_MIME = %w(image/png image/jpeg image/gif)
end
end
And in my controller I have:
require 'hashids'
class MainController < ApplicationController
MAX_FILE_SIZE = 10 * 1024 * 1024
MAX_CACHE_SIZE = 128 * 1024 * 1024
@links = Hash.new
@files = Hash.new
@tstamps = Hash.new
@sizes = Hash.new
@cache_size = 0
class << self
attr_accessor :links
attr_accessor :files
attr_accessor :tstamps
attr_accessor :sizes
attr_accessor :cache_size
attr_accessor :hashids
end
def index
end
def transparency
end
def image
#@imglist = params[:id].split(',').map{ |id| ImageLink.find(Base58.decode(id)) }
@imglist = params[:id].split(',').map{ |id| ImageLink.find(hashids.decode(id)) }
end
def image_direct
#linkid = Base58.decode(params[:id])
linkid = hashids.decode(params[:id])
file =
if Rails.env.production?
puts "#{Base58.encode(ImageLink.find(linkid).image.id)}.png"
File.open(Rails.root.join('uploads', "#{Base58.encode(ImageLink.find(linkid).image.id)}.png"), 'rb') { |f| f.read }
else
puts "#{hashids.encode(ImageLink.find(linkid).image.id)}.png"
File.open(Rails.root.join('uploads', "#{hashids.encode(ImageLink.find(linkid).image.id)}.png"), 'rb') { |f| f.read }
end
send_data(file, type: ImageLink.find(linkid).image.mimetype, disposition: 'inline')
end
def upload
imgparam = params[:image]
if imgparam.is_a?(String)
name = File.basename(imgparam)
imgpath = save_to_tempfile(imgparam).path
else
name = imgparam.original_filename
imgpath = imgparam.tempfile.path
end
File.chmod(0666, imgpath)
%x(/usr/bin/exiftool -all= -overwrite_original #{imgpath})
logger.debug %x(which exiftool)
render json: ImageManager.save_image(imgpath, name)
end
private
def save_to_tempfile(url)
uri = URI.parse(url)
http = Net::HTTP.new(uri.host, uri.port)
http.use_ssl = uri.scheme == 'https'
http.start do
resp = http.get(uri.path)
file = Tempfile.new('urlupload', Dir.tmpdir, :encoding => 'ascii-8bit')
file.write(resp.body)
file.flush
return file
end
end
end
Then in my image.html.erb view I have this:
<%
@imglist.each_with_index { |link, i|
id = hashids.encode(link.id)
ext = link.image.mimetype.split('/')[1]
if ext == 'jpeg'
ext = 'jpg'
end
puts id + '.' + ext
%>
Now if I add
hashids = Hashids.new("mysalt")
in ImageManager.rb main_controller.rb and in my image.html.erb I am getting this error:
ActionView::Template::Error (undefined method `id' for #<Array:0x000000062f69c0>)
So all in all implementing hashids.encode/decode is not as easy as implementing Base58.encode/decode and I am confused on how to get it working... Any help would be greatly appreciated.
I would suggest loading it as a gem by including it into your Gemfile and running bundle install. It will save you the hassle of requiring it in every file and allow you to manage updates using Bundler.
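Yes, you do need to initialize it wherever it is going to be used, with the same salt. I would suggest that you define the salt as a constant, perhaps in application.rb. Also note that hashids.decode returns an array of integers rather than a single integer, so ImageLink.find(hashids.decode(id)) returns an array of records; that is the likely cause of the undefined method `id' for Array error. Take the first element instead, as in hashids.decode(id).first.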
The link you provided injects hashids into ActiveRecord, which means it will not work anywhere else. I would not recommend the same approach as it will require a high level of familiarity with Rails.
You might want to spend some time understanding ActiveRecord and ActiveModel. Will save you a lot of reinventing the wheel. :)
Before anything else, you should test whether Hashids is loaded in your project. Run rails c in your project folder and try a small test:
>> my_id = ImageLink.last.id
>> puts Hashids.new("mysalt").encode(my_id)
If that does not work, add the gem to your Gemfile (which makes a lot more sense anyway).
Then, I think you should add a getter for your hash_id in your ImageLink model.
Even if you don't want to save the hash in the database, it has its place in your model. See virtual attributes for more info.
Remember "Skinny Controller, Fat Model".
class ImageLink < ActiveRecord::Base
  def hash_id
    # cache the hash; the salt must match the one used for decoding
    @hash_id ||= Hashids.new("mysalt").encode(id)
  end
  def extension
    # you could add the logic of extension here also.
    ext = image.mimetype.split('/')[1]
    if ext == 'jpeg'
      'jpg'
    else
      ext
    end
  end
end
Change the return in your ImageManager#save_image
link = ImageLink.new
link.image = image
# Be sure your image link has been saved (validation errors, etc.)
if link.save
  { status: 'success', message: link.hash_id }
else
  { status: 'failure', message: link.errors.full_messages.join(", ") }
end
In your template
<%
@imglist.each_with_index do |link, i|
  puts link.hash_id + '.' + link.extension
end # <- I prefer do..end so I don't forget the closing brace
%>
All this code is not tested...
I was looking for something similar so I could disguise the ids of my records. I came across acts_as_hashids.
https://github.com/dtaniwaki/acts_as_hashids
This little gem integrates seamlessly. You can still find your records through their ids, or with the hash. On nested records you can use the method with_hashids.
To get the hash, call to_param on the object itself, which returns a string similar to ePQgabdg.
Since I just implemented this, I can't tell yet how useful this gem will be. So far I only had to adjust my code a little bit.
I also gave the records a virtual attribute hashid so I can access it easily.
attr_accessor :hashid
after_find :set_hashid
private
def set_hashid
self.hashid = self.to_param
end

How to create a migration file dynamically from library in rails 4.0

How to create a migration file dynamically in rails 4.0?
I want to add some columns to different tables dynamically via my library module. There would be a method that create a migration file and add content to it.
How can I create it from a library?
Yes, I found a way to add a migration file from lib. I created a few methods for this. The following code snippet shows them and should give a better idea of how to create a migration file dynamically.
def create_columns(tb_with_cols)
  add_columns = ""
  tb_name = tb_with_cols.keys.first
  columns = tb_with_cols.values.first
  # One add_column call per column, e.g. add_column(:users, :name, :string)
  columns.each { |c_name, c_type| add_columns << "\tadd_column(:#{tb_name}, :#{c_name}, :#{c_type})\n" }
  add_columns
end
def migration_file_content(tb_with_cols)
  cols = create_columns(tb_with_cols)
  # The migration method must be named change (or up/down) for Rails to run it
  <<-RUBY
class AddMissingColumnsToTable < ActiveRecord::Migration
  def change
#{cols}
  end
end
  RUBY
end
def write_content_to_file(path, content)
File.open(path, 'w+') do |f|
f.write(content)
end
end
Just call the method migration_file_content in your code. Pass the parameter tb_with_cols as a Hash in which the key is the table name and the value is a hash of the columns that should be added to that table. Example:
tb_with_cols = {:users => {:name => :string, :age => :integer, :address => :text} }
content = migration_file_content(tb_with_cols)
write_content_to_file("#{Rails.root}/db/migrate/", content)
After that, call the method write_content_to_file with your new migration file path and the content returned by migration_file_content. The path has to be the full path of the new file inside db/migrate, including a file name with a timestamp prefix like the ones the Rails generators produce; the directory alone is not enough.
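As a rough illustration of the full round trip, here is a sketch that also builds the timestamped file name. The helper names (migration_source, migration_file_name, write_migration) and the class name AddMissingColumnsToUsers are made up for this example, and the target directory is passed in as a parameter so the code can run outside a Rails app; inside an app it would be the db/migrate directory under Rails.root.

```ruby
require 'tmpdir'
require 'fileutils'

# Builds the source of a migration that adds the given columns.
# tb_with_cols maps one table name to a { column => type } hash.
def migration_source(class_name, tb_with_cols)
  table, columns = tb_with_cols.first
  lines = columns.map { |name, type| "    add_column :#{table}, :#{name}, :#{type}" }
  "class #{class_name} < ActiveRecord::Migration\n" \
    "  def change\n#{lines.join("\n")}\n  end\nend\n"
end

# Migration files are named <timestamp>_<snake_case_class_name>.rb.
def migration_file_name(class_name, now = Time.now.utc)
  snake = class_name.gsub(/([a-z\d])([A-Z])/, '\1_\2').downcase
  "#{now.strftime('%Y%m%d%H%M%S')}_#{snake}.rb"
end

# Writes the migration into migrate_dir and returns the path of the new file.
def write_migration(migrate_dir, class_name, tb_with_cols)
  FileUtils.mkdir_p(migrate_dir)
  path = File.join(migrate_dir, migration_file_name(class_name))
  File.write(path, migration_source(class_name, tb_with_cols))
  path
end

# A temporary directory stands in for db/migrate here.
dir = Dir.mktmpdir
path = write_migration(dir, 'AddMissingColumnsToUsers',
                       { users: { name: :string, age: :integer } })
puts File.read(path)
```

The class name inside the file and the snake_case part of the file name need to agree, which is why both are derived from the same string here.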

How do I integrate a module?

UPDATE: I changed my model a bit, but it still does not work. I get the following error message: ActionController::RoutingError (undefined local variable or method `stop_words_finder' for #< Class:0x007facb57f6908 >)
models/pool.rb
class Pool < ActiveRecord::Base
include StopWords
attr_accessible :fragment
def self.delete_stop_words(data)
words = data.scan(/\w+/)
stop_words = stop_words_finder
key_words = words.select { |word| !stop_words.include?(word) }
pool_frag = Pool.create :fragment => key_words.join(' ')
end
end
lib/stop_words.rb
module StopWords
def stop_words_finder
%w{house}
end
end
controllers/tweets_controller.rb
class TweetsController < ApplicationController
def index
@tweets = Pool.all
respond_with(@tweets)
end
end
stop_words = :stop_words_finder
assigns the symbol :stop_words_finder to stop_words. What you want to do is call the stop_words_finder method that you included from StopWords, which will return the array. In this case, all you have to do is remove the colon.
stop_words = stop_words_finder
Add this to your model to make stop_words_finder available to Pool instances:
include StopWords
Pool.new.stop_words_finder will work
To make stop_words_finder available to the Pool class, use extend:
extend StopWords
Pool.stop_words_finder will work.
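To make the difference concrete, here is a small self-contained sketch. WithInclude and WithExtend are hypothetical plain classes used in place of the ActiveRecord model, and key_words is a made-up helper that mirrors the filtering in delete_stop_words.

```ruby
module StopWords
  def stop_words_finder
    %w[house]
  end
end

class WithInclude
  include StopWords   # methods become instance methods
end

class WithExtend
  extend StopWords    # methods become class methods

  # A class method can now call stop_words_finder directly.
  def self.key_words(data)
    data.scan(/\w+/).reject { |word| stop_words_finder.include?(word) }
  end
end

p WithInclude.new.stop_words_finder               # => ["house"]
p WithExtend.stop_words_finder                    # => ["house"]
p WithExtend.key_words('my house is red')         # => ["my", "is", "red"]
p WithInclude.respond_to?(:stop_words_finder)     # => false
```

The last line is the situation from the question: with include only, a class method such as self.delete_stop_words cannot see stop_words_finder.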
Also, why on earth are you creating an instance of Pool inside the Pool class definition?
As it stands you're including the module into your ApplicationController class. This has absolutely no effect on the Pool class. In addition, creating instances of the Pool class inside its definition is rather unorthodox: do you really want to create a new row in your database every time the code of your app is loaded? I would refactor things along these lines:
class Pool < ActiveRecord::Base
class << self
include StopWords
def create_from_data(data)
words = data.scan(/\w+/)
stop_words = stop_words_finder
key_words = words.select { |word| !stop_words.include?(word) }
pool = Pool.create :pooltext => key_words.join(' ')
end
end
end
You'd then call Pool.create_from_data %q{Ich gehe heute schwimmen. Und du?} when you want to create it.

Ruby structure for extendable handler/plugin architecture

I'm writing something that is a bit like Facebook's shared link preview.
I would like to make it easily extendable for new sites by just dropping in a new file for each new site I want to write a custom parser for. I have the basic idea of the design pattern figured out but don't have enough experience with modules to nail the details. I'm sure there are plenty of examples of something like this in other projects.
The result should be something like this:
> require 'link'
=> true
> Link.new('http://youtube.com/foo').preview
=> {:title => 'Xxx', :description => 'Yyy', :embed => '<zzz/>' }
> Link.new('http://stackoverflow.com/bar').preview
=> {:title => 'Xyz', :description => 'Zyx' }
And the code would be something like this:
#parsers/youtube.rb
module YoutubeParser
url_match /(youtube\.com)|(youtu.be)\//
def preview
get_stuff_using youtube_api
end
end
#parsers/stackoverflow.rb
module SOFParser
url_match /stackoverflow\.com\//
def preview
get_stuff
end
end
#link.rb
class Link
def initialize(url)
# extend self with the module whose regexp matches url
end
end
# url_processor.rb
class UrlProcessor
  # registers url handler for given pattern
  def self.register_url pattern, &block
    @patterns ||= {}
    @patterns[pattern] = block
  end
  def self.process_url url
    _, handler = @patterns.find{|p, _| url =~ p}
    if handler
      handler.call(url)
    else
      {}
    end
  end
end
# plugins/so_plugin.rb
class SOPlugin
UrlProcessor.register_url /stackoverflow\.com/ do |url|
{:title => 'foo', :description => 'bar'}
end
end
# plugins/youtube_plugin.rb
class YoutubePlugin
UrlProcessor.register_url /youtube\.com/ do |url|
{:title => 'baz', :description => 'boo'}
end
end
p UrlProcessor.process_url 'http://www.stackoverflow.com/1234'
#=>{:title=>"foo", :description=>"bar"}
p UrlProcessor.process_url 'http://www.youtube.com/1234'
#=>{:title=>"baz", :description=>"boo"}
p UrlProcessor.process_url 'http://www.foobar.com/1234'
#=>{}
You just need to require every .rb from plugins directory.
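Loading "every .rb from plugins directory" could look roughly like the sketch below. Registry is a hypothetical, cut-down stand-in for UrlProcessor, and the two plugin files are written to a temporary directory only so that the example runs on its own; in a real project the files would already be sitting in plugins/.

```ruby
require 'tmpdir'

# Hypothetical registry standing in for UrlProcessor above.
module Registry
  def self.handlers
    @handlers ||= {}
  end

  def self.register(pattern, &block)
    handlers[pattern] = block
  end

  def self.process(url)
    _, handler = handlers.find { |pattern, _| url =~ pattern }
    handler ? handler.call(url) : {}
  end
end

# Throwaway plugin files, created here only to keep the sketch self-contained.
plugins_dir = Dir.mktmpdir('plugins')
File.write(File.join(plugins_dir, 'so_plugin.rb'),
           "Registry.register(/stackoverflow\\.com/) { |url| { title: 'foo' } }\n")
File.write(File.join(plugins_dir, 'youtube_plugin.rb'),
           "Registry.register(/youtube\\.com/) { |url| { title: 'baz' } }\n")

# Load every plugin; each file registers itself as a side effect of being required.
Dir[File.join(plugins_dir, '*.rb')].sort.each { |file| require file }

p Registry.process('http://www.youtube.com/1234')
```

Sorting the file list only makes the load order predictable; dropping a new file into the directory is enough to add a handler.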
If you're willing to take this approach, you should probably scan the files for the matching string and then include the right one.
In the same situation I attempted a different approach. I'm extending the module with new methods and registering them (in @@registry) so that I won't register two identically named methods. So far it works well, though the project I started is nowhere near leaving the specific domain of one tangled mess of a particular web site.
This is the main file.
require 'nokogiri'

module Onigiri
  extend self
  @@registry ||= {}
  class OnigiriHandlerTaken < StandardError
    def description
      "There was an attempt to override registered handler. This usually indicates a bug in Onigiri."
    end
  end
  def clean(data, *params)
    dupe = Onigiri::Document.parse data
    params.flatten.each do |method|
      dupe = dupe.send(method) if @@registry[method]
    end
    dupe.to_html
  end
  class Document < Nokogiri::HTML::DocumentFragment
  end
  private
  def register_handler(name)
    unless @@registry[name]
      @@registry[name] = true
    else
      raise OnigiriHandlerTaken
    end
  end
end
And here's the extending file.
# encoding: utf-8
module Onigiri
register_handler :fix_backslash
class Document
def fix_backslash
dupe = dup
attrset = ['src', 'longdesc', 'href', 'action']
dupe.css("[#{attrset.join('], [')}]").each do |target|
attrset.each do |attr|
target[attr] = target[attr].gsub("\\", "/") if target[attr]
end
end
dupe
end
end
end
Another way I see is to use a set of different (but behaviorally indistinguishable) classes with a simple decision making mechanism to call a right one. A simple hash that holds class names and corresponding url_matcher would probably suffice.
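A sketch of that last idea, with everything in it assumed for illustration: the parser classes, their canned return values, and the PARSERS constant are all hypothetical, and no real site is contacted.

```ruby
# Hypothetical parser classes; each exposes the same #preview interface.
class YoutubeParser
  def initialize(url)
    @url = url
  end

  def preview
    { title: 'a video', embed: "<iframe src='#{@url}'></iframe>" }
  end
end

class StackOverflowParser
  def initialize(url)
    @url = url
  end

  def preview
    { title: 'a question', description: @url }
  end
end

class DefaultParser
  def initialize(url)
    @url = url
  end

  def preview
    {}
  end
end

class Link
  # The "simple hash": url matcher => class that handles it.
  PARSERS = {
    /youtube\.com|youtu\.be/ => YoutubeParser,
    /stackoverflow\.com/ => StackOverflowParser
  }.freeze

  def initialize(url)
    _, klass = PARSERS.find { |pattern, _| url =~ pattern }
    @parser = (klass || DefaultParser).new(url)
  end

  def preview
    @parser.preview
  end
end

p Link.new('http://stackoverflow.com/bar').preview
```

Compared with extending each Link instance with a module, delegation keeps Link itself unchanged and makes the fallback case explicit.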
Hope this helps.

How to extend DataMapper::Resource with custom method

I have following code:
module DataMapper
  module Resource
    @@page_size = 25
    attr_accessor :current_page
    attr_accessor :next_page
    attr_accessor :prev_page
    def first_page?
      @prev_page
    end
    def last_page?
      @next_page
    end
    def self.paginate(page)
      if(page && page.to_i > 0)
        @current_page = page.to_i - 1
      else
        @current_page = 0
      end
      entites = self.all(:offset => @current_page * @@page_size, :limit => @@page_size + 1)
      if @current_page > 0
        @prev_page = @current_page
      end
      if entites.size == @@page_size + 1
        entites.pop
        @next_page = (@current_page || 1) + 2
      end
      entites
    end
  end
end
Then I have call of #paginate:
#photos = Photo.paginate(params[:page])
And getting following error:
application error
NoMethodError at /dashboard/photos/
undefined method `paginate' for Photo:Class
With ActiveRecord this concept works fine for me... Note that I am using JRuby. What am I doing wrong?
Andrew,
You can think of DataMapper::Resource as the instance (a row) and of DataMapper::Model as the class (a table). Now to alter the default capabilities at either the resource or the model level, you can either append inclusions or extensions to your model.
First you will need to wrap your #paginate method in a module. I've also added a probably useless #page method to show how to append to a resource in case you ever need to.
module Pagination
module ClassMethods
def paginate(page)
# ...
end
end
module InstanceMethods
def page
# ...
end
end
end
In your case, you want #paginate to be available on the model, so you would do:
DataMapper::Model.append_extensions(Pagination::ClassMethods)
If you find yourself in need of adding default capabilities to every resource, do:
DataMapper::Model.append_inclusions(Pagination::InstanceMethods)
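What extending a model with Pagination::ClassMethods amounts to can be tried in plain Ruby. In this sketch Photo is a hypothetical stand-in whose all method slices an in-memory array instead of querying a database, PAGE_SIZE replaces the @@page_size class variable, and paginate returns a pair instead of setting instance variables; none of this is DataMapper's own API.

```ruby
module Pagination
  module ClassMethods
    PAGE_SIZE = 25

    # Returns [records, has_next]. Asking for one row more than the page
    # size reveals whether a next page exists without a second query.
    def paginate(page)
      current = page.to_i > 0 ? page.to_i - 1 : 0
      rows = all(offset: current * PAGE_SIZE, limit: PAGE_SIZE + 1)
      has_next = rows.size > PAGE_SIZE
      [rows.first(PAGE_SIZE), has_next]
    end
  end
end

# Hypothetical stand-in for a model class; 60 integers play the role of rows.
class Photo
  extend Pagination::ClassMethods
  RECORDS = (1..60).to_a

  def self.all(offset:, limit:)
    RECORDS[offset, limit] || []
  end
end

photos, has_next = Photo.paginate(3)
p photos.size   # => 10
p has_next      # => false
```

Here extend is applied to one class by hand; append_extensions does the equivalent for every model.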
Hope that helps!
