This is a sort of followup to my other MongoDB question about the torrent indexer.
I'm making an open source torrent indexer (like a mini TPB, in essence), and offer both SQLite and MongoDB for backend, currently.
However, I'm having trouble with the MongoDB part of it. In Sinatra, I get an error when trying to upload a torrent or search for one.
In uploading, one needs to tag the torrent — and it fails here. The code for adding tags is as follows:
def add_tag(tag)
  if $sqlite
    unless tag_exists? tag
      $db.execute("insert into #{$tag_table} values ( ? )", tag)
    end
    id = $db.execute("select oid from #{$tag_table} where tag = ?", tag)
    return id[0]
  elsif $mongo
    unless tag_exists? tag
      $tag.insert({:tag => tag})
    end
    return $tag.find({:tag => tag})[:_id] #this is the line it presumably crashes on
  end
end
It reaches line 105 (noted above), and then fails. What's going on? Also, as an FYI this might turn into a few other questions as solutions come in.
Thanks!
EDIT
So instead of returning the tag result with [:_id], I changed the block inside the elsif to:
id = $tag.find({:tag => tag})
puts id.inspect
return id
and still get an error. You can see a demo at http://torrent.hypeno.de and the source at http://github.com/tekknolagi/indexer/
Given that you are doing an insert(), the easiest way to get the id is:
id = $tag.insert({:tag => tag})
id will be a BSON::ObjectId, so you can use appropriate methods depending on the return value you want:
return id # BSON::ObjectId('5017cace1d5710170b000001')
return id.to_s # "5017cace1d5710170b000001"
In your original question you are trying to use the Collection.find() method. This returns a Mongo::Cursor, but you are trying to reference the cursor as a document. You need to iterate over the cursor using each or next, e.g.:
cursor = $tag.find({:tag => tag})
return cursor.next['_id']
If you want a single document, you should be using Collection.find_one().
For example, you can find and return the _id using:
return $tag.find_one({:tag => tag})['_id']
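To see why indexing the result of find fails while pulling out a single document works, here is a minimal sketch that runs without MongoDB. A plain Enumerator over an array of hashes stands in for the Mongo::Cursor; the stand-in and the sample document are assumptions for illustration, not the driver's API.

```ruby
# Sample "collection": one tag document (made-up data).
docs = [{ '_id' => 'abc123', 'tag' => 'linux' }]

# Stand-in for what find returns: something you iterate, not a document.
cursor = docs.each

# Indexing the iterator itself fails, similar in spirit to indexing a cursor.
indexing_failed = begin
  cursor[:_id]
  false
rescue NoMethodError
  true
end

# Taking one document out first works; find_one does this step for you.
doc = cursor.next
tag_id = doc['_id']
```

The point is only the shape of the return value: an iterator needs each or next before you can read a field, while a single document can be indexed directly.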
I think the problem here is [:_id]. I don't know much about Mongo, but $tag.find({:tag => tag}) is probably returning an array, and passing a symbol to the [] array operator is not defined.
I'm working on a web-scraping solution that grabs totally different webpages and lets the user define rules/scripts in order to extract information from the page.
I started scraping from a single domain and build a parser based on Nokogiri.
Basically everything works fine.
I could now add a ruby class each time somebody wants to add a webpage with a different layout/style.
Instead I thought about using an approach where the user specifies elements where content is stored using xpath and storing this as a sort of recipe for this webpage.
Example: The user wants to scrape a table-structure extracting the rows using a hash (column-name => cell-content)
I was thinking about writing a ruby function for extraction of this generic table information once:
# extracts a table's rows as an array of hashes (column_name => cell content)
# html - the html-file as a string
# xpath_table - specifies the html table as xpath which hold the data to be extracted
def basic_table(html, xpath_table)
  xpath_headers = "#{xpath_table}/thead/tr/th"
  html_doc = Nokogiri::HTML(html)
  row_headers = html_doc.xpath(xpath_headers)
  row_headers = row_headers.map do |column|
    column.inner_text
  end
  row_contents = Array.new
  # double quotes are needed here so that xpath_table is interpolated
  table_rows = html_doc.xpath("#{xpath_table}/tbody/tr")
  table_rows.each do |table_row|
    cells = table_row.xpath('td')
    cells = cells.map do |cell|
      cell.inner_text
    end
    row_content_hash = Hash.new
    cells.each_with_index do |cell_string, column_index|
      row_content_hash[row_headers[column_index]] = cell_string
    end
    # append the hash itself so the result is an array of hashes
    row_contents << row_content_hash
  end
  return row_contents
end
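The inner loop that pairs each header with its cell text could also be written with Array#zip. A small sketch with made-up header and cell values, no Nokogiri required:

```ruby
row_headers = ['Name', 'Size', 'Seeds']       # made-up header texts
cells       = ['ubuntu.iso', '700 MB', '42']  # made-up cell texts for one row

# zip pairs elements by position; to_h turns the pairs into a hash.
row_content_hash = row_headers.zip(cells).to_h
```

One difference to keep in mind: zip is driven by the header list, so a row with fewer cells than headers gets nil values, and extra cells beyond the last header are dropped.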
The user could now specify a website-recipe-file like this:
<basic_table xpath='//div[@id="grid"]/table[@id="displayGrid"]' />
The function basic_table is referenced here, so that by parsing the website-recipe-file I would know that I can use the function basic_table to extract the content from the table referenced by the xPath.
This way the user can specify simple recipe-scripts and only has to dive into writing actual code if he needs a new way of extracting information.
The code would not change every time a new webpage needs to be parsed.
Whenever the structure of a webpage changes only the recipe-script would need to be changed.
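A minimal sketch of how a parsed recipe might be dispatched to extractor functions, under stated assumptions: the recipe has already been read into an array of steps, the registry hash and run_recipe are hypothetical names, and the lambda is a stand-in that would call a Nokogiri-based function such as basic_table in real use.

```ruby
# Hypothetical registry mapping a recipe element name to an extractor.
# The lambda is a stand-in; it returns a canned row instead of parsing HTML.
extractors = {
  'basic_table' => ->(html, xpath) { [{ 'extractor' => 'basic_table', 'xpath' => xpath }] }
}

# Runs every step of a recipe and collects the extracted rows.
# Unknown extractor names are rejected instead of being called blindly.
def run_recipe(html, recipe, extractors)
  recipe.flat_map do |step|
    extractor = extractors.fetch(step[:name]) do
      raise ArgumentError, "unknown extractor: #{step[:name]}"
    end
    extractor.call(html, step[:xpath])
  end
end

recipe = [{ name: 'basic_table', xpath: '//div[@id="grid"]/table[@id="displayGrid"]' }]
rows = run_recipe('<html></html>', recipe, extractors)
```

Keeping the registry explicit means a recipe can only name functions that were deliberately exposed, which matters if recipes come from users.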
I was thinking that someone might be able to tell me how he would approach this. Rules/rule engines pop into my mind, but I'm not sure if that really is the solution to my problem.
Somehow I have the feeling that I don't want to "invent" my own solution to handle this problem.
Does anybody have a suggestion?
J.
How do I create an object if one is not found? This is the query I was running:
@event_object = @event_entry.event_objects.find_all_by_plantype('dog')
and I was trying this:
@event_object = EventObject.new unless @event_entry.event_objects.find_all_by_plantype('dog')
but that does not seem to work. I know I'm missing something very simple like normal :( Thanks for any help!!! :)
find_all style methods return an array of matching records. That is an empty array if no matching records are found. And an empty array is truthy. Which means:
arr = []
if arr
  puts 'arr is considered truthy!' # this line will execute
end
Also, the dynamic finder methods (like find_by_whatever) are officially deprecated, so you shouldn't be using them.
You probably want something more like:
@event_object = @event_entry.event_objects.where(plantype: 'dog').first || EventObject.new
But you can also configure the event object better, since you obviously want it to belong to @event_entry.
@event_object = @event_entry.event_objects.where(plantype: 'dog').first
@event_object ||= @event_entry.event_objects.build(plantype: 'dog')
In this last example, we try to find an existing object by getting an array of matching records and asking for the first item. If there are no items, @event_object will be nil.
Then we use the ||= operator, which says "assign the value on the right if this is currently set to a falsy value". And nil is falsy. So if it's nil, we can build the object from the association it should belong to. And we can preset its attributes while we are at it.
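The truthiness rules involved can be checked in plain Ruby, without Rails. In this sketch an empty array stands in for the result of a find_all-style call and a hash stands in for a newly built record; both are assumptions for illustration.

```ruby
matches = []   # what a find_all-style call returns when nothing matches

# An empty array is truthy, so `unless matches` would never fire.
empty_is_truthy = matches ? true : false

# Asking for the first element of an empty array gives nil, which is falsy.
first_match = matches.first

# ||= assigns only when the left side is nil or false.
event_object = first_match
event_object ||= { plantype: 'dog' }   # hash stands in for a newly built record
```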
Why not use built-in query methods like find_or_create_by or find_or_initialize_by?
@event_object = @event_entry.event_objects.find_or_create_by(plantype: 'dog')
This will find an @event_entry.event_object with plantype = 'dog'; if one does not exist, it will create one instead.
find_or_initialize_by is probably more what you want, as it will leave @event_object in an unsaved state with just the association and plantype set:
@event_object = @event_entry.event_objects.find_or_initialize_by(plantype: 'dog')
This assumes you are looking for a single event_object, as it will return the first one it finds with plantype = 'dog'. If more than one event_object can have plantype = 'dog' within the @event_entry scope, then this might not be the best solution, but it seems to fit your description.
EDIT
The problem was with something else so the trouble wasn't really Qt, still I don't know why this happened.
The thing was that in @yt.get_filesize(row_id, format), which is called from display_filesize, I used Nokogiri to parse the XML. I don't know if the XML was corrupted (it was loaded from quvi), but it was definitely the culprit. After switching to XmlSimple everything works fine.
The code I used:
def get_filesize(video_id, format)
  video = @videos[video_id]
  if video.formats[format].empty?
    to_parse = `quvi --xml --format #{format} #{video.player_url}`
    parsed = Nokogiri.parse(to_parse)
    video.formats[format] = { :size => parsed.at('length_bytes').text,
                              :url => parsed.at('link').at('url').text }
  end
  video.formats[format][:size]
end
Now I use something like this:
def get_filesize(video_id, format)
  video = @videos[video_id]
  if video.formats[format].empty?
    to_parse = `quvi --xml --format #{format} #{video.player_url}`
    parsed = XmlSimple.xml_in(to_parse, {'KeyAttr' => 'name'})
    video.formats[format] = { :size => parsed['link'][0]['length_bytes'][0],
                              :url => URI.decode(parsed['link'][0]['url'][0]) }
  end
  video.formats[format][:size]
end
It works beautifully. Still, I don't know why it crashed. This is the real question.
OLD QUESTION
I have a Qt::TableView that contains Qt::StandardItemModel. A row in the model consists of text, Qt::PushButton, checkbox and Qt::ComboBox. It works like this:
The user is presented with text values and can explore further if they want to.
The user clicks Qt::PushButton and the next cell is populated with a Qt::ComboBox containing other possible values to choose from.
If the user chooses an option from Qt::ComboBox, magic happens, objects get created, hashes populated and the cell on the right gets populated with appropriate text (through a Qt::StandardItem)
Then the checkbox can be checked.
After selecting the rows the user wants, a Qt::PushButton located outside of the Qt::TableView can be clicked. It then iterates through the model, tests if the checkbox is selected and should it be, tries to access the value in the appropriate ComboBox.
The problem is, when I insert code that tries to access the Qt::ComboBox, I can't insert the Qt::StandardItem, because I can't get the model, because Qt::TableView.model returns NilClass (at some point).
I don't know why and how this happens. It's a random thing, sometimes the value of Qt::ComboBox can be changed a couple times, sometimes the first try ends with an error.
Here is how I create the Qt::StandardItem:
def display_filesize
  row_id = row_id_from_object_name(sender.objectName)
  format = sender.currentText
  filesize = @yt.get_filesize(row_id, format) # get the text
  filesize_item = Qt::StandardItem.new("#{(filesize.to_i/1024/1024)} MB ")
  # @tc simply stores the indexes of columns so I can access them easily
  @ui.tableView.model.setItem(row_id, @tc[:filesize], filesize_item)
end
And here is how I try to access the Qt::ComboBox value:
model = @ui.tableView.model
checked = model.rowCount.times.map do |i|
  if model.item(i, @tc[:check]).checkState == Qt::Checked
    # if I remove the following two lines it works...
    index = model.index(i, @tc[:formats])
    format = @ui.tableView.indexWidget(index).currentText
    @yt.videos[i][format]
  end
end
And this is the error I am trying to get rid of:
searcher.rb:86:in `display_filesize': undefined method `index' for nil:NilClass (NoMethodError)
from /var/lib/gems/1.9.1/gems/qtbindings-4.8.3.0/lib/Qt/qtruby4.rb:469:in `qt_metacall'
from /var/lib/gems/1.9.1/gems/qtbindings-4.8.3.0/lib/Qt/qtruby4.rb:469:in `method_missing'
from /var/lib/gems/1.9.1/gems/qtbindings-4.8.3.0/lib/Qt/qtruby4.rb:469:in `exec'
from qutub-player.rb:17:in `<main>'
I'm having some issues deleting my document using Mongoid...
The code actually does delete the gallery, but I get a browser error which looks like:
Mongoid::Errors::DocumentNotFound at /admin/galleries/delete/4e897ce07df6d15a5e000001
The suspect code is below:
def self.removeGalleryFor(user_session_id, gallery_id)
  person = Person.any_in(session_ids: [user_session_id])
  return false if person.count != 1
  return false if person[0].userContent.nil?
  return false if person[0].userContent.galleries.empty?
  gallery = person[0].userContent.galleries.find(gallery_id) #ERROR is on this line
  gallery.delete if !gallery.nil?
end
My Person class embeds one userContent which embeds many galleries.
Strangely enough I've got a couple of tests around this which work fine...
I'm really not sure what's happening - my gallery seems to be found fine, and is even deleted from Mongo.
Any ideas?
find raises an error if it can't find a document with the given id. Instead of loading the gallery and checking whether it exists, you can ask MongoDB directly, as part of the update, to remove any such gallery.
def self.remove_gallery_for(user_session_id, gallery_id)
  user_session_id = BSON::ObjectId.from_string(user_session_id) if user_session_id.is_a?(String)
  gallery_id = BSON::ObjectId.from_string(gallery_id) if gallery_id.is_a?(String)
  # dropping to mongo collection object wrapped by mongoid,
  # as I don't know how to do it using mongoid's convenience methods
  last_error = Person.collection.update(
    # only remove gallery for user matching user_session_id
    {"session_ids" => user_session_id},
    # remove gallery if there exists any; the embedded path must be a string key
    {"$pull" => {"userContent.galleries" => {:_id => gallery_id}}},
    # [optional] check if successfully removed the gallery
    :safe => true
  )
  return last_error["err"].nil?
end
This way you do not load the Person; you don't even fetch the data from MongoDB to the application server. The gallery is simply removed if it exists.
But you should prefer @fl00r's answer if you need to fire callbacks, and switch to destroy instead of delete.
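The effect of a $pull on an embedded array can be pictured with plain Ruby data. This is an in-memory illustration only: the document, the ids, and the pull_gallery helper are made up, and nothing here touches the MongoDB driver.

```ruby
# In-memory stand-in for one Person document (made-up ids).
person = {
  'session_ids' => ['s1'],
  'userContent' => { 'galleries' => [{ '_id' => 'g1' }, { '_id' => 'g2' }] }
}

# Roughly what a $pull on the embedded galleries array asks the server to do:
# drop every element that matches, and do nothing if none match.
def pull_gallery(doc, gallery_id)
  doc['userContent']['galleries'].reject! { |g| g['_id'] == gallery_id }
  doc
end

pull_gallery(person, 'g1')        # removes the matching gallery
pull_gallery(person, 'missing')   # no match: nothing removed, nothing raised
remaining_ids = person['userContent']['galleries'].map { |g| g['_id'] }
```

Because a non-matching pull is simply a no-op, there is no "not found" case to handle, which is what sidesteps the DocumentNotFound error.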
def self.removeGalleryFor(user_session_id, gallery_id)
  # person = Person.where(session_ids: user_session_id).first
  # any_in returns a criteria, so take the first matching document
  person = Person.any_in(session_ids: [user_session_id]).first
  if person && person.userContent && person.userContent.galleries.any?
    gallery = person.userContent.galleries.where(id: gallery_id).first
    gallery.delete if gallery
  end
end
ps:
In Ruby, snake_case naming is usually used for methods rather than camelCase.
Kudos to Rubish for pointing me to a solution that at least passes my tests - for some reason fl00r's code didn't work - it looks like it should, but doesn't for some reason...
Person.collection.update(
  {"session_ids" => user_session_id},
  {"$pull" => {'userContent.galleries' => {:_id => gallery_id}}},
  :safe => true
)
=> this code passes my tests, but once it's running in Sinatra it doesn't work... so frustrating!
I have posted this code with tests on GitHub: https://github.com/LouisSayers/bugFixes/tree/master/mongoDelete
This is probably really easy, but I'm having trouble finding documentation online about it.
I have two ActiveRecord queries in Ruby that I want to join together via an OR operator:
@pro = Project.where(:manager_user_id => current_user.id )
@proa = Project.where(:account_manager => current_user.id)
I'm new to Ruby, but I tried this myself using ||:
@pro = Project.where(:manager_user_id => current_user.id || :account_manager => current_user.id)
This didn't work. So: 1. I'd like to know how to actually do this in Ruby, and 2. I'd appreciate a heads-up on the boolean syntax in a Ruby statement like this altogether,
e.g. AND, OR, XOR...
You can't use the Hash syntax in this case.
Project.where("manager_user_id = ? OR account_manager = ?", current_user.id, current_user.id)
You should also take a look at the API documentation and follow its conventions, in this case for the conditions you can pass to the where method.
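As for why the || attempt fails: Ruby's || works on values, not on hash conditions. A small sketch in plain Ruby, where user_id is a made-up stand-in for current_user.id and where_args only shows the shape of the arguments, without calling ActiveRecord:

```ruby
user_id = 42   # stands in for current_user.id

# || returns the left operand unless that operand is nil or false.
value = user_id || :account_manager

# So a hash built this way still holds a single condition.
conditions = { :manager_user_id => (user_id || :account_manager) }

# Two `=>` pairs joined by || are not valid Ruby at all.
parses = begin
  eval('{ :manager_user_id => 1 || :account_manager => 1 }')
  true
rescue SyntaxError
  false
end

# An OR across columns lives in the SQL fragment; the values are passed separately.
where_args = ["manager_user_id = ? OR account_manager = ?", user_id, user_id]
```

Ruby's own boolean operators are && (and), || (or), and ^ (exclusive or on true/false), but they combine Ruby values before the query is built, so they cannot express an OR between database columns.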
This should work:
@projects = Project.where("manager_user_id = '#{current_user.id}' or account_manager_id = '#{current_user.id}'")
This should be safe since I'm assuming current_user's id value comes from your own app and not from an external source such as form submissions. If you are using form-submitted data that you intend to use in your queries, you should use placeholders so that Rails creates properly escaped SQL.
# with placeholders
@projects = Project.where(["manager_user_id = ? or account_manager_id = ?", some_value_from_form1, some_value_from_form_2])
When you pass an array to the where method (the example with placeholders), Rails treats the first element as a template for the SQL. The remaining elements are substituted at runtime, in order, for the placeholders (?) in that template.
Metawhere can do OR operations, plus a lot of other nifty things.