find() from MongoDB, return result, suppress fields and turn into JSON - ruby

I have data saved in MongoDB in the following format:
{"_id": "VALVE22","state": "1","element": "BNK1FLOW","data":{"type": "SEN","descr": "TOWER6"}}
I have the following code in a Ruby script;
db = Mongo::Connection.new.db("cooler-lookup")
coll = db.collection("elements")
kitty = coll.find({"_id" => table[address][i], "state" => char}).to_a
'table[address][i]' and 'char' are variables defined & used elsewhere in the bigger script feeding data into this lookup section. For testing these can be replaced with "VALVE22" and "1" respectively (and that's how I've been testing in irb)
When run from the command line the script outputs the following correct result from a valid query.
{"_id"=>"VLAVE22", "state"=>"1", "element"=>"BNK1FLOW", "data"=>{"type"=>"SEN", "descr"=>"TOWER6"}}
But I need to suppress the _id and state fields. I've tried using :fields modifier in all sorts of ways but can't remove the fields. I have tested this in irb and along with the valid lookup I also get => nil returned. I'm sure this is something really simple but I can't see what I need to be able to JSON.generate the query results without the ID & State fields and then puts it.
Using the code below I was able to get this working; however, when I tried to do kittylitter = JSON.generate(kitty) I was getting a lot of empty []'s as well as my valid result. It looks like they were the failed queries from the DB coming back with no record.
After many hours of being confused I managed to find this bit of code to fix the problem
kitty.each do |key|
keyjson = JSON.generate(key)
puts keyjson
end
That gave me exactly what I needed: the result on one line as valid JSON. Part of my head-hurting confusion comes from the fact that to_a makes an array, yet when I tried to do array-type things with the result kitty, nothing worked as expected. I then tried treating it like a hash, which led me to that bit of code above! Once I'd done that everything worked... Am I wrong to be confused by arrays and hashes, or have I missed something really obvious, like my array being, or containing, a hash?

This works for me:
kitty = coll.find({"_id" => table[address][i], "state" => char}, :fields => {"_id" => 0, "state" => 0}).to_a
It returns
[{"element"=>"BNK1FLOW", "data"=>{"type"=>"SEN", "descr"=>"TOWER6"}}]
See http://api.mongodb.org/ruby/current/Mongo/Collection.html#find-instance_method for usage instructions for Mongo::Collection#find

Using gem mongo -v 2.4.3 , the following works for me
mongo_results = collection.find({"shop_id" => shop_id}, :projection => {"_id" => 0, "child_products" => 0}).to_a
In the example above, I'm omitting "_id" and "child_products" from showing up in the results.
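On the array-versus-hash confusion in the question: to_a returns an Array whose elements are Hashes (one per matching document), which is why iterating with each and treating every element as a hash worked. A minimal sketch of that shape follows, using a hard-coded array as a stand-in for the query result so it runs without a database. The client-side reject is only an illustration, not a replacement for the :fields / :projection option shown above.

```ruby
require 'json'

# Stand-in for coll.find(...).to_a: an Array containing one Hash per document.
kitty = [
  { "_id" => "VALVE22", "state" => "1", "element" => "BNK1FLOW",
    "data" => { "type" => "SEN", "descr" => "TOWER6" } }
]

# Drop the unwanted keys from each document, then serialise each one
# to a single line of JSON.
lines = kitty.map do |doc|
  JSON.generate(doc.reject { |key, _value| %w[_id state].include?(key) })
end

puts lines
```

An empty result set simply yields an empty array here, so nothing is printed for lookups that match no record.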


Fix deprecation warning `Dangerous query method` on `.order`

I have a custom gem which creates an AR query with input that comes from an Elasticsearch instance.
# record_ids: are the returned ids of the ES results
# order: is the order of the ids that ES returns
search_class.where(search_class.primary_key => record_ids).order(order)
Right now the implementation is that I build the order string directly into the order variable so it looks like this: ["\"positions\".\"id\" = 'fcdc924a-21da-440e-8d20-eec9a71321a7' DESC"]
This works fine but throws a deprecation warning, and it will ultimately stop working in Rails 6.
DEPRECATION WARNING: Dangerous query method (method whose arguments are used as raw SQL) called with non-attribute argument(s): "\"positions\".\"id\" = 'fcdc924a-21da-440e-8d20-eec9a71321a7' DESC". Non-attribute arguments will be disallowed in Rails 6.0. This method should not be called with user-provided values, such as request parameters or model attributes. Known-safe values can be passed by wrapping them in Arel.sql()
So I tried a couple of different approaches, but none of them succeeded.
order = ["\"positions\".\"id\" = 'fcdc924a-21da-440e-8d20-eec9a71321a7' DESC"]
# Does not work since order is an array
.order(Arel.sql(order))
# No errors but only returns an ActiveRecord_Relation
# on .inspect it returns `PG::SyntaxError: ERROR: syntax error at or near "["`
.order(Arel.sql("#{order}"))
# .to_sql: ORDER BY [\"\\\"positions\\\".\\\"id\\\" = 'fcdc924a-21da-440e-8d20-eec9a71321a7' DESC\"]"
order = ['fcdc924a-21da-440e-8d20-eec9a71321a7', ...]
# Won't work since its only for integer values
.order("idx(ARRAY#{order}, #{search_class.primary_key})")
# .to_sql ORDER BY idx(ARRAY[\"fcdc924a-21da-440e-8d20-eec9a71321a7\", ...], id)
# Only returns an ActiveRecord_Relation
# on .inspect it returns `PG::InFailedSqlTransaction: ERROR:`
.order("array_position(ARRAY#{order}, #{search_class.primary_key})")
# .to_sql : ORDER BY array_position(ARRAY[\"fcdc924a-21da-440e-8d20-eec9a71321a7\", ...], id)
I am sort of stuck, since Rails will force attribute arguments in the future and offers no way to opt out of this. Since the order is a code-generated array and I have full control of the values, I am curious how I can implement this. Maybe someone has had this issue before and can give some useful insight or ideas?
You could try applying Arel.sql to the elements of the array; that should work, i.e.:
search_class.where(search_class.primary_key => record_ids)
.order(order.map {|i| i.is_a?(String) ? Arel.sql(i) : i})
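As a side note on the array_position attempt in the question: interpolating a Ruby array into a string uses Array#to_s, which wraps each element in double quotes, and PostgreSQL appears to read those as identifiers rather than string literals. Below is a rough sketch of building the fragment with single-quoted literals instead. The helper name order_by_ids_sql, the ::uuid[] cast (which assumes a uuid primary key) and the second id are all made up for illustration. The resulting string would still need to be wrapped in Arel.sql(...) before being passed to .order.

```ruby
# Hypothetical helper: builds an ORDER BY fragment that keeps rows in the
# same order as the given ids. Only safe for values the code fully controls.
def order_by_ids_sql(ids, column)
  # Single-quote each id and double any embedded single quotes.
  quoted = ids.map { |id| "'" + id.to_s.gsub("'", "''") + "'" }
  "array_position(ARRAY[#{quoted.join(', ')}]::uuid[], #{column})"
end

ids = [
  'fcdc924a-21da-440e-8d20-eec9a71321a7',
  '11111111-2222-3333-4444-555555555555' # made-up second id
]

naive    = "ARRAY#{ids.first(1)}"        # what plain interpolation produces
fragment = order_by_ids_sql(ids, 'id')

puts naive
puts fragment
```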

Ruby RDF query - extracting simple data from Seq and Bag items

I am receiving XML-serialised RDF (as part of XMP media descriptions, in case that is relevant) and processing it in Ruby. I am trying to work with the rdf gem, although I am happy to look at other solutions.
I have managed to load and query the most basic data, but am stuck when trying to build a query for items which contain sequences and bags.
Example XML RDF:
<rdf:RDF xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'>
<rdf:Description rdf:about='' xmlns:dc='http://purl.org/dc/elements/1.1/'>
<dc:date>
<rdf:Seq>
<rdf:li>2013-04-08</rdf:li>
</rdf:Seq>
</dc:date>
</rdf:Description>
</rdf:RDF>
My best attempt at putting together a query:
require 'rdf'
require 'rdf/rdfxml'
require 'rdf/vocab/dc11'
graph = RDF::Graph.load( 'test.rdf' )
date_query = RDF::Query.new( :subject => { RDF::DC11.date => :date } )
results = date_query.execute(graph)
results.map { |result| { result.subject.to_s => result.date.inspect } }
=> [{"test.rdf"=>"#<RDF::Node:0x3fc186b3eef8(_:g70100421177080)>"}]
I get the impression that my results at this stage ("query solutions"?) are a reference to the rdf:Seq container. But I am lost as to how to progress. For the example above, I'd expect to end up, eventually, with an array ["2013-04-08"].
When there is incoming data without the rdf:Seq and rdf:li containers, I am able to extract the strings I want using RDF::Query, following examples at http://rdf.rubyforge.org/RDF/Query.html - unfortunately I cannot find any examples of more complex queries or RDF structures processed in Ruby.
Edit: In addition, when I try to find appropriate methods to use with the RDF::Node object, I cannot see any way to explore any further relations it may have:
results[0].date.methods - Object.methods
=> [:original, :original=, :id, :id=, :node?, :anonymous?, :unlabeled?, :labeled?, :to_sym, :resource?, :constant?, :variable?, :between?, :graph?, :literal?, :statement?, :iri?, :uri?, :valid?, :invalid?, :validate!, :validate, :to_rdf, :inspect!, :type_error, :to_ntriples]
# None of the above leads AFAICS to more data in the graph
I know how to get the same data in xpath (well, at least provided we always get the same paths in the serialisation), but feel it is not the best query language to use in this case (it's my backup plan, however, if it turns out too complex to implement an RDF-query solution)
I think you're correct when saying "my results at this stage ("query solutions"?) are a reference to the rdf:Seq container". RDF/XML is a really horrible serialisation format; instead, think of the data as a graph. Here is a picture of an rdf:Bag. rdf:Seq works the same way, and the #students node in the example is analogous to the #date in your case.
So to get to the date literal, you need to hop one node further in the graph. I'm not familiar with the syntax of this Ruby library, but something like:
require 'rdf'
require 'rdf/rdfxml'
require 'rdf/vocab/dc11'
graph = RDF::Graph.load( 'test.rdf' )
date_query = RDF::Query.new({
:yourThing => {
RDF::DC11.date => :dateSeq
},
:dateSeq => {
RDF.type => RDF.Seq,
RDF._1 => :dateLiteral
}
})
date_query.execute(graph).each do |solution|
puts "date=#{solution.dateLiteral}"
end
Of course, if you expect the Seq to actually contain multiple dates (otherwise it wouldn't make sense to have a Seq), you will have to match them with RDF._1 => :dateLiteral1, RDF._2 => :dateLiteral2, RDF._3 => :dateLiteral3 etc.
Or for a more generic solution, match all the properties and objects on the dateSeq with:
:dateSeq => {
:property => :dateLiteral
}
and then filter out the solution where :property ends up being rdf:type, in which case :dateLiteral isn't actually a date but rdf:Seq. Maybe the library also has a special method to get all of a Seq's contents.
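The filtering step for the generic match could look roughly like the sketch below. It deliberately avoids the rdf gem and works on plain [property, object] string pairs, which are a hypothetical stand-in for the :property / :dateLiteral bindings the query would return. The second date is invented so that the ordering is visible.

```ruby
RDF_NS = 'http://www.w3.org/1999/02/22-rdf-syntax-ns#'

# Hypothetical stand-in for the (property, object) pairs bound on :dateSeq.
pairs = [
  ["#{RDF_NS}type", "#{RDF_NS}Seq"],
  ["#{RDF_NS}_2", '2013-04-09'], # invented second entry
  ["#{RDF_NS}_1", '2013-04-08']
]

# Keep only the container membership properties (rdf:_1, rdf:_2, ...),
# which drops the rdf:type pair, then return the objects in sequence order.
def seq_members(pairs, namespace)
  membership = /\A#{Regexp.escape(namespace)}_(\d+)\z/
  pairs
    .map { |property, object| [property[membership, 1], object] }
    .select { |index, _object| index }
    .sort_by { |index, _object| index.to_i }
    .map { |_index, object| object }
end

dates = seq_members(pairs, RDF_NS)
puts dates.inspect
```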

Sinatra can't convert Symbol into Integer when making MongoDB query

This is a sort of followup to my other MongoDB question about the torrent indexer.
I'm making an open source torrent indexer (like a mini TPB, in essence), and offer both SQLite and MongoDB for backend, currently.
However, I'm having trouble with the MongoDB part of it. In Sinatra, I get a "can't convert Symbol into Integer" error when trying to upload a torrent or search for one.
In uploading, one needs to tag the torrent — and it fails here. The code for adding tags is as follows:
def add_tag(tag)
if $sqlite
unless tag_exists? tag
$db.execute("insert into #{$tag_table} values ( ? )", tag)
end
id = $db.execute("select oid from #{$tag_table} where tag = ?", tag)
return id[0]
elsif $mongo
unless tag_exists? tag
$tag.insert({:tag => tag})
end
return $tag.find({:tag => tag})[:_id] #this is the line it presumably crashes on
end
end
It reaches line 105 (noted above), and then fails. What's going on? Also, as an FYI this might turn into a few other questions as solutions come in.
Thanks!
EDIT
So instead of returning the tag result with [:_id], I changed the block inside the elsif to:
id = $tag.find({:tag => tag})
puts id.inspect
return id
and still get an error. You can see a demo at http://torrent.hypeno.de and the source at http://github.com/tekknolagi/indexer/
Given that you are doing an insert(), the easiest way to get the id is:
id = $tag.insert({:tag => tag})
id will be a BSON::ObjectId, so you can use appropriate methods depending on the return value you want:
return id # BSON::ObjectId('5017cace1d5710170b000001')
return id.to_s # "5017cace1d5710170b000001"
In your original question you are trying to use the Collection.find() method. This returns a Mongo::Cursor, but you are trying to reference the cursor as a document. You need to iterate over the cursor using each or next, e.g.:
cursor = $tag.find({:tag => tag})
return cursor.next['_id']
If you want a single document, you should be using Collection.find_one().
For example, you can find and return the _id using:
return $tag.find_one({:tag => tag})['_id']
I think the problem here is [:_id]. I don't know much about Mongo, but $tag.find({:tag => tag}) is probably returning an array, and passing a symbol to the [] array operator is not defined.
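The error text itself can be reproduced without MongoDB. The sketch below uses a plain Array of Hashes as an analogy for a multi-document result; it is not the driver's actual return type, only an illustration of why indexing a collection with a Symbol fails while indexing a single document works. The sample document is made up.

```ruby
# Plain-Ruby analogy: a collection of documents versus a single document.
docs = [{ :_id => 'abc123', :tag => 'linux' }] # made-up sample data

error_class = nil
begin
  docs[:_id]                 # indexing the collection with a Symbol
rescue TypeError => e
  error_class = e.class      # Array#[] expects an Integer index
  puts e.message
end

id = docs.first[:_id]        # index one document (a Hash) instead
puts id
```

The exact wording of the message varies between Ruby versions, so only the exception class is relied on here.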

How do I combine map with to_s?

I am using Mongoid and retrieving a bunch of BSON::ObjectId instances. Ideally, I'd like to convert them to strings upon retrieval. What's the correct syntax? It can be done in two lines like this:
foo = Bar.where(:some_id => N).map(&:another_id)
ids_as_strings = foo.map(&:to_s)
What's the proper Ruby way to chain to_s after the map invocation above?
This works fine, but don't do it!
ids_as_string = Bar.where(:some_id => N).map(&:another_id).map(&:to_s)
It certainly looks neat, but think about it: you are doing two maps. A map loops over an array (or another enumerable), operates on each element, and returns a new array with the results.
So why do two loops if you want to do two operations?
ids_as_string = Bar.where(:some_id => N).map {|v| v.another_id.to_s}
This should be the way to go in this situation, and actually looks nicer.
You can just chain it directly:
ids_as_string = Bar.where(:some_id => N).map(&:another_id).map(&:to_s)
I tried this out with a model and I got what you expected, something like:
["1", "2", ...]

Why does LIKE in SQLite3 work in this statement but = does not?

I use SQLite3 and have a table called blobs that stores content and *hash_value*.
Here is the schema:
CREATE TABLE "blobs" (
"id" INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
"content" blob,
"hash_value" text,
"created_at" datetime NOT NULL,
"updated_at" datetime NOT NULL
);
Now I inserted some data that looks like this:
1|--- foo
...
|34dc86f45b3dc92b352fd45f525192c0|2012-04-09 17:02:54.219504|2012-04-09 17:02:54.219504
And I tried the following two queries:
select * from blobs where hash_value = '34dc86f45b3dc92b352fd45f525192c0';
select * from blobs where hash_value LIKE '34dc86f45b3dc92b352fd45f525192c0';
The first does not work, but the second one does. I do not understand why the = operator does not work.
I tried to break this down to a simple example where my hash is just 'abc' and = works. I mean this string is hardly too long.
EDIT
Ok I actually narrowed it down to this:
I am using Ruby to generate the hash like this Digest::MD5.hexdigest("foobar")
This generates a string like this: '3858f62230ac3c915f300c664312c63f'
My test looks somewhat like this: b = Blob.new(...);b.save!;Blob.find_by_hash(b.hash)
And find_by_hash is Blob.find(:all, :conditions => ["hash_value = ?", hash_value])
It works if I set the hash manually to '3858f62230ac3c915f300c664312c63f' (hardcoded string).
But if this string is generated I get the following error:
Failure/Error: Blob.find_by_hash(b.hash_value)[0].load.should == txt
ArgumentError: wrong number of arguments (0 for 1)
And I cannot query SQLite3 as stated above.
Solution
The solution is:
Instead of using Digest::MD5.hexdigest("foobar") use Digest::MD5.base64digest("foobar")
I do not know why sqlite3 has problems with hexdigest, but there is definitely something fishy about this.
The difference between the two is encoding:
Digest::MD5.hexdigest("foobar").encoding #=> #<Encoding:ASCII-8BIT>
Digest::MD5.base64digest("foobar").encoding #=> #<Encoding:US-ASCII>
I don't think there's a particular reason why hexdigest has the 8-bit encoding (which effectively means 'this is raw data'), but that's what Ruby seems to do. When the Ruby sqlite3 driver sees something with the ASCII-8BIT encoding, it binds the value to the query as a blob rather than as text. This in turn affects how sqlite3 does the comparison (although I don't understand exactly how).
See also this question
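If the encoding tag is what makes the driver bind the value as a blob, another option might be to keep hexdigest and re-tag the string before it reaches the query. This is an assumption following from the explanation above, not something confirmed in the thread. The sketch only shows the re-tagging; it does not touch SQLite.

```ruby
require 'digest/md5'

raw = Digest::MD5.hexdigest("foobar")

# Re-tag a copy of the string as UTF-8. The bytes are unchanged (hex digits
# are plain ASCII); only the encoding label differs.
digest = raw.dup.force_encoding('UTF-8')

puts raw.encoding     # depends on the Ruby version
puts digest.encoding
puts digest
```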
