Ruby RDF query - extracting simple data from Seq and Bag items - ruby

I am receiving XML-serialised RDF (as part of XMP media descriptions, in case that is relevant) and processing it in Ruby. I am trying to work with the rdf gem, although I am happy to look at other solutions.
I have managed to load and query the most basic data, but am stuck when trying to build a query for items which contain sequences and bags.
Example XML RDF:
<rdf:RDF xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'>
<rdf:Description rdf:about='' xmlns:dc='http://purl.org/dc/elements/1.1/'>
<dc:date>
<rdf:Seq>
<rdf:li>2013-04-08</rdf:li>
</rdf:Seq>
</dc:date>
</rdf:Description>
</rdf:RDF>
My best attempt at putting together a query:
require 'rdf'
require 'rdf/rdfxml'
require 'rdf/vocab/dc11'
graph = RDF::Graph.load( 'test.rdf' )
date_query = RDF::Query.new( :subject => { RDF::DC11.date => :date } )
results = date_query.execute(graph)
results.map { |result| { result.subject.to_s => result.date.inspect } }
=> [{"test.rdf"=>"#<RDF::Node:0x3fc186b3eef8(_:g70100421177080)>"}]
I get the impression that my results at this stage ("query solutions"?) are a reference to the rdf:Seq container. But I am lost as to how to progress. For the example above, I'd expect to end up, eventually, with an array ["2013-04-08"].
When there is incoming data without the rdf:Seq and rdf:li containers, I am able to extract the strings I want using RDF::Query, following examples at http://rdf.rubyforge.org/RDF/Query.html - unfortunately I cannot find any examples of more complex queries or RDF structures processed in Ruby.
Edit: In addition, when I try to find appropriate methods to use with the RDF::Node object, I cannot see any way to explore any further relations it may have:
results[0].date.methods - Object.methods
=> [:original, :original=, :id, :id=, :node?, :anonymous?, :unlabeled?, :labeled?, :to_sym, :resource?, :constant?, :variable?, :between?, :graph?, :literal?, :statement?, :iri?, :uri?, :valid?, :invalid?, :validate!, :validate, :to_rdf, :inspect!, :type_error, :to_ntriples]
# None of the above leads AFAICS to more data in the graph
I know how to get the same data in xpath (well, at least provided we always get the same paths in the serialisation), but feel it is not the best query language to use in this case (it's my backup plan, however, if it turns out too complex to implement an RDF-query solution)

I think you're correct when saying "my results at this stage ("query solutions"?) are a reference to the rdf:Seq container". RDF/XML is a really horrible serialisation format; think of the data as a graph instead. Picture an rdf:Bag as a node of its own with numbered edges to its members. rdf:Seq works the same way, and the #students property in the usual Bag example is analogous to #date in your case.
So to get to the date literal, you need to hop one node further in the graph. I'm not familiar with the syntax of this Ruby library, but something like:
require 'rdf'
require 'rdf/rdfxml'
require 'rdf/vocab/dc11'
graph = RDF::Graph.load( 'test.rdf' )
date_query = RDF::Query.new({
:yourThing => {
RDF::DC11.date => :dateSeq
},
:dateSeq => {
RDF.type => RDF.Seq,
RDF._1 => :dateLiteral
}
})
date_query.execute(graph).each do |solution|
puts "date=#{solution.dateLiteral}"
end
Of course, if you expect the Seq to actually contain multiple dates (otherwise it wouldn't make sense to have a Seq), you will have to match them with RDF._1 => :dateLiteral1, RDF._2 => :dateLiteral2, RDF._3 => :dateLiteral3 etc.
Or for a more generic solution, match all the properties and objects on the dateSeq with:
:dateSeq => {
:property => :dateLiteral
}
and then filter out the solution where :property ends up being rdf:type, because in that case :dateLiteral isn't the date but rdf:Seq. The library may also have a special method to get all of a Seq's contents.
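The "match everything, then filter" idea above can be sketched without any RDF library. In this sketch the triples are plain [subject, predicate, object] arrays standing in for whatever statements your library returns, container_members is a hypothetical helper name, and the second date is made-up sample data added only to show ordering. It keeps the rdf:_1, rdf:_2, ... membership properties, which automatically drops the rdf:type triple.

```ruby
# Plain-Ruby stand-in for graph statements (an assumption of this sketch;
# a real RDF library would give you statement objects instead of arrays).
triples = [
  ['test.rdf', 'http://purl.org/dc/elements/1.1/date', '_:seq'],
  ['_:seq', 'http://www.w3.org/1999/02/22-rdf-syntax-ns#type',
   'http://www.w3.org/1999/02/22-rdf-syntax-ns#Seq'],
  ['_:seq', 'http://www.w3.org/1999/02/22-rdf-syntax-ns#_2', '2013-04-09'],
  ['_:seq', 'http://www.w3.org/1999/02/22-rdf-syntax-ns#_1', '2013-04-08']
]

# Hypothetical helper: return the members of one container node, ordered by
# their rdf:_n index. Anything that is not a membership property is skipped.
def container_members(triples, node)
  membership = %r{\Ahttp://www\.w3\.org/1999/02/22-rdf-syntax-ns#_(\d+)\z}
  members = triples.filter_map do |subject, predicate, object|
    next unless subject == node

    index = predicate[membership, 1]
    [index.to_i, object] if index
  end
  members.sort_by(&:first).map(&:last)
end

dates = container_members(triples, '_:seq')
puts dates.inspect
```

Sorting by the numeric index matters for a Seq, since the serialisation order of the triples is not guaranteed to match the sequence order.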

Searching a Hash

I'm trying to complete this Codewars Challenge and I'm confused as to where I'm going wrong. Could someone please give me a hand?
The question provides a "database" of translations for Welcome, and the instructions say:
Think of a way to store the languages as a database (eg an object). The languages are listed below so you can copy and paste!
Write a 'welcome' function that takes a parameter 'language' (always a string), and returns a greeting - if you have it in your database. It should default to English if the language is not in the database, or in the event of an invalid input.
My attempt:
def greet(language)
greeting = { 'english'=>'Welcome',
'czech'=>'Vitejte',
'danish'=>'Velkomst',
'dutch'=>'Welkom',
'estonian'=>'Tere tulemast',
'finnish'=>'Tervetuloa',
'flemish'=>'Welgekomen',
'french'=>'Bienvenue',
'german'=>'Willkommen',
'irish'=>'Failte',
'italian'=>'Benvenuto',
'latvian'=>'Gaidits',
'lithuanian'=>'Laukiamas',
'polish'=>'Witamy',
'spanish'=>'Bienvenido',
'swedish'=>'Valkommen',
'welsh'=>'Croeso'
}
greeting.key?(language) ? greeting.each { |k, v| return v if language == k } : 'IP_ADDRESS_INVALID'
end
To my eyes when I run my code through the IDE it seems to be working as per request but I guess I must be wrong somehow.
It's telling me it :
Expected: "Laukiamas", instead got: "Welcome"
But when I type:
p greet("lithuanian")
I get Laukiamas.
You can give your greeting hash a default value. It is as simple as
greeting.default = "Welcome"
This enhanced hash does all the work for you. Just look up the key; when it is not there you'll get "Welcome".
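A minimal sketch of that suggestion, using a shortened copy of the hash from the question (only three languages are kept here for brevity):

```ruby
def greet(language)
  greeting = {
    'english' => 'Welcome',
    'lithuanian' => 'Laukiamas',
    'welsh' => 'Croeso'
  }
  # Any key that is missing, including nil, now falls back to the English text.
  greeting.default = greeting['english']
  greeting[language]
end

puts greet('lithuanian')
puts greet('klingon')
```

Note that a default only affects lookups with []; greeting.key?('klingon') would still return false.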
Preface
First of all, please don't post links to exercises or homework questions. Quote them in your original question to avoid link rot and to avoid creating additional work for people trying to help you out.
Understanding the Problem Defined by the Linked Question
Secondly, you're misunderstanding the core question. The requirement is basically to return the Hash value for a given language key if the key exists in the Hash. If it doesn't, then return the value of the 'english' key instead. Implicit in the exercise is to understand the various types of improper inputs that would fail to find a matching key; the solution below addresses most of them, and will work even if your Ruby has frozen strings enabled.
A Working Solution
There are lots of ways to do this, but here's a simple example that will handle invalid keys, nil as a language argument, and abstract away capitalization as a potential issue.
DEFAULT_LANG = 'english'
TRANSLATIONS = {
'english' => 'Welcome',
'czech' => 'Vitejte',
'danish' => 'Velkomst',
'dutch' => 'Welkom',
'estonian' => 'Tere tulemast',
'finnish' => 'Tervetuloa',
'flemish' => 'Welgekomen',
'french' => 'Bienvenue',
'german' => 'Willkommen',
'irish' => 'Failte',
'italian' => 'Benvenuto',
'latvian' => 'Gaidits',
'lithuanian' => 'Laukiamas',
'polish' => 'Witamy',
'spanish' => 'Bienvenido',
'swedish' => 'Valkommen',
'welsh' => 'Croeso'
}
# Return a translation of "Welcome" into the language
# passed as an argument.
#
# @param language [String, #to_s] any object that can
# be coerced into a String, and therefore to
# String#downcase
# @return [String] a translation of "Welcome" or the
# string-literal +Welcome+ if no translation found
def greet language
language = language.to_s.downcase
TRANSLATIONS.fetch language, TRANSLATIONS[DEFAULT_LANG]
end
# Everything in the following Array of examples except
# +Spanish+ should return the Hash value for +english+.
['Spanish', 'Español', 123, nil].map { greet(_1) }
This will correctly return:
#=> ["Bienvenido", "Welcome", "Welcome", "Welcome"]
because only Spanish (when lower-cased) will match any of the keys currently defined in the TRANSLATIONS Hash. All the rest will use the default value defined for the exercise.
Test Results
Since there are some RSpec tests included with the linked question:
describe "Welcome! Translation" do
it "should translate input" do
Test.assert_equals(greet('english'), 'Welcome', "It didn't work out this time, keep trying!");
Test.assert_equals(greet('dutch'), 'Welkom', "It didn't work out this time, keep trying!");
Test.assert_equals(greet('IP_ADDRESS_INVALID'), 'Welcome', "It didn't work out this time, keep trying!")
end
end
The code provided not only passes the provided tests, but also handles a number of edge cases that the unit tests do not define. When run against the defined tests, the code above passes cleanly.
If this is homework, then you might want to create additional tests to cover all the various edge cases. You might also choose to refactor to less idiomatic code if you want more explanatory variables, more explicit intermediate conversions, or more explicit key handling. The point of good code is to be readable, so be as explicit in your code and as thorough in your tests as you need to be in order to make debugging easier.

find() from MongoDB, return result, supress fields and turn into JSON

I have data saved in MongoDB in the following format
{"_id": "VALVE22","state": "1","element": "BNK1FLOW","data":{"type": "SEN","descr": "TOWER6"}}
I have the following code in a Ruby script;
db = Mongo::Connection.new.db("cooler-lookup")
coll = db.collection("elements")
kitty = coll.find({"_id" => table[address][i], "state" => char}).to_a
'table[address][i]' and 'char' are variables defined & used elsewhere in the bigger script feeding data into this lookup section. For testing these can be replaced with "VALVE22" and "1" respectively (and that's how I've been testing in irb)
When run from the command line the script outputs the following correct result from a valid query.
{"_id"=>"VALVE22", "state"=>"1", "element"=>"BNK1FLOW", "data"=>{"type"=>"SEN", "descr"=>"TOWER6"}}
But I need to suppress the _id and state fields. I've tried using :fields modifier in all sorts of ways but can't remove the fields. I have tested this in irb and along with the valid lookup I also get => nil returned. I'm sure this is something really simple but I can't see what I need to be able to JSON.generate the query results without the ID & State fields and then puts it.
Using the code below I was able to get this working; however, when I tried to do kittylitter = JSON.generate(kitty) I was getting a lot of empty []'s as well as my valid result. It looks like they were the failed queries from the DB coming back with no record.
After many hours of being confused I managed to find this bit of code to fix the problem
kitty.each do |key|
keyjson = JSON.generate(key)
puts keyjson
end
That gave me exactly what I needed: the result on one line as valid JSON. Part of my head-hurting confusion comes from the fact that to_a makes an array, yet when I tried to do array-type stuff on the result kitty, nothing worked as expected. I then tried treating it like a hash, which led me to that bit of code above! Once I'd done that, everything worked... Am I wrong to be confused by arrays and hashes, or have I missed something really obvious, like my array is or contains a hash?
This works for me:
kitty = coll.find({"_id" => table[address][i], "state" => char}, :fields => {"_id" => 0, "state" => 0}).to_a
It returns
[{"element"=>"BNK1FLOW", "data"=>{"type"=>"SEN", "descr"=>"TOWER6"}}]
See http://api.mongodb.org/ruby/current/Mongo/Collection.html#find-instance_method for usage instructions for Mongo::Collection#find
Using gem mongo -v 2.4.3 , the following works for me
mongo_results = collection.find({"shop_id" => shop_id}, :projection => {"_id" => 0, "child_products" => 0}).to_a
In the example above, I'm omitting "_id" and "child_products" from showing up in the results.
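On the array-versus-hash confusion raised in the question: to_a gives an Array, and each element of that Array is a Hash (one per matching document). A small sketch with hand-written sample data in place of a live query result, so the names and values below are stand-ins rather than driver output:

```ruby
require 'json'

# Stand-in for coll.find(...).to_a after the two fields have been projected out.
kitty = [
  { 'element' => 'BNK1FLOW', 'data' => { 'type' => 'SEN', 'descr' => 'TOWER6' } }
]

puts kitty.class        # the container is an Array
puts kitty.first.class  # each element is a Hash

# One JSON object per document, one per line.
lines = kitty.map { |doc| JSON.generate(doc) }
puts lines

# A query that matched nothing yields an empty Array, which serialises as "[]".
empty = JSON.generate([])
puts empty
```

This is why JSON.generate(kitty) printed [] for the lookups that found no record, while iterating with each and generating JSON per element printed only the documents that exist.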

Any string to XML in Ruby

I am trying to convert an arbitrary string (which is built in XML format) into an XML document, so I can apply the "to_hash" function to it.
This is what I have:
model = live_requests[3]
parser = XML::Parser.string(model)
model_xml = parser.parse
puts model.to_hash
Now why am I getting an error when 'model_xml' should be an XML file?
I am using LibXML by the way.
http://libxml.rubyforge.org/rdoc/index.html
Libxml does not support the to_hash method. If you are looking for a way to do this that doesn't require traversing XML nodes and building the hash manually, you should take a look at Nori.
Nori.parse("<tag>This is the contents</tag>")
# => { 'tag' => 'This is the contents' }
If you want to learn how to traverse Libxml's node trees take a look at the answer to this question.

Can I avoid transposing an array in Ruby on Rails?

I have a Rails app that has a COUNTRIES list with full country names and abbreviations created inside the Company model. The array for the COUNTRIES list is used for a select tag on the input form to store abbreviations in the DB. See below. VALID_COUNTRIES is used for validations of abbreviations in the DB. FULL_COUNTRIES is used to display the full country name from the abbreviation.
class Company < ActiveRecord::Base
COUNTRIES = [["Afghanistan","AF"],["Aland Islands","AX"],["Albania","AL"],...]
COUNTRIES_TRANSPOSE = COUNTRIES.transpose
VALID_COUNTRIES = COUNTRIES_TRANSPOSE[1]
FULL_COUNTRIES = COUNTRIES_TRANSPOSE[0]
validates :country, inclusion: { in: VALID_COUNTRIES, message: "enter a valid country" }
...
end
On the form:
<%= select_tag(:country, options_for_select(Company::COUNTRIES, 'US')) %>
And to convert back to the full country name:
full_country = FULL_COUNTRIES[VALID_COUNTRIES.index(:country)]
This seems like an excellent application for a hash, except the key/value order is wrong. For the select I need:
COUNTRIES = {"Afghanistan" => "AF", "Aland Islands" => "AX", "Albania" => "AL",...}
While to take the abbreviation from the DB and display the full country name I need:
COUNTRIES = {"AF" => "Afghanistan", "AX" => "Aland Islands", "AL" => "Albania",...}
Which is a shame, because COUNTRIES.keys or COUNTRIES.values would give me the validation list (depending on which hash layout is used).
I'm relatively new to Ruby/Rails and am looking for the more Ruby-like way to solve the problem. Here are the questions:
Does the transpose occur only once, and if so, when is it executed?
Is there a way to specify the FULL_ and VALID_ lists that do not require the transpose?
Is there a better or reasonable alternate way to do this? For instance, VALID_COUNTRIES is COUNTRIES[x][1] and FULL_COUNTRIES is COUNTRIES[x][0], but VALID_ must work with the validation.
Is there a way to make this work with just one hash, rather than one for the select_tag and one for converting the abbreviations in the DB back to full names for display?
1) Does the transpose occur only once, and if so, when is it executed?
Yes. It runs once, when the class body is evaluated as the file is loaded, because the result is assigned to a constant. If you want it to be evaluated every time, use a lambda:
FULL_COUNTRIES = lambda { COUNTRIES_TRANSPOSE[0] }
2) Is there a way to specify the FULL_ and VALID_ lists that do not require the transpose?
Yes, use map or collect (they are the same thing). Each pair is [full name, abbreviation], so the abbreviations are the last elements:
VALID_COUNTRIES = COUNTRIES.map(&:last)
FULL_COUNTRIES = COUNTRIES.map(&:first)
3) Is there a better or reasonable alternate way to do this? For instance, VALID_COUNTRIES is COUNTRIES[x][1] and FULL_COUNTRIES is COUNTRIES[x][0], but VALID_ must work with the validation.
See above.
4) Is there a way to make the hash work?
Yes. I am not sure why a hash isn't working: according to the Rails docs, options_for_select uses hash.to_a.map(&:first) for the option text and hash.to_a.map(&:last) for the option values, so the first hash you give should work. If you can clarify why it does not, I can help you more.
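One way the single-hash approach could look, as a sketch only: the first hash layout from the question drives the select, its values give the validation list, and Hash#invert builds the reverse lookup once. Local variables are used here so the snippet runs on its own; in the Company model they would presumably be constants, and country_names is a name invented for this example.

```ruby
# Name => abbreviation, the layout options_for_select expects.
countries = {
  'Afghanistan' => 'AF',
  'Aland Islands' => 'AX',
  'Albania' => 'AL'
}.freeze

# Abbreviations only, for the inclusion validation.
valid_countries = countries.values.freeze

# Abbreviation => name, built once from the same data.
country_names = countries.invert.freeze

full_country = country_names['AX']
puts full_country
```

Hash#invert assumes the abbreviations are unique; if two names shared an abbreviation, the later entry would win.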

How do you stop a MongoDB search from being applied recursively to the key-value tree?

Imagine I have this object (written with Ruby literals) stored in a MongoDB:
{"tags" => ["foo", "bar"],
"jobs" => [{"title" => "Chief Donkey Wrangler", "tags" => ["donkeys"]}] }
Now, I want to search for objects based on the tags on the first level of data, not the second. I can write a query like this (using the Ruby MongoDB library):
things.find("tags" => {"$exists" => "foo"})
This will obviously match the first example, but it will also match an example like this:
{"tags" => ["baz", "bar"],
"jobs" => [{"title" => "Trainee Donkey Wrangler", "tags" => ["donkeys", "foo"]}] }
How do I ensure that I am searching only the top-level of keys? I'm interested in knowing the answer in both JavaScript, Ruby and in a language-agnostic way, as I'd like to use MongoDB as a cross-language store.
Obviously, I could pass a map-reduce function to the datastore to pick out the stuff I'm trying to get, but I'm interested to see if it is supported at a higher level (and to reduce the amount of time I spend writing JavaScript map-reduce functions!)
Actually, the query you specify won't match your second example. To match the second example, you'd do:
things.find({"jobs.tags" => "foo"})
There's no recursive application of the query selector.
You're not using $exists properly. $exists does not let you match a field's value; it only checks whether the field exists. I'm guessing that the Ruby MongoDB library is treating your 'foo' as equivalent to true, because $exists only accepts true/false as an argument.
As @kb points out, you want to use the dot notation to reach into the objects.
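To make the difference concrete, here is a plain-Ruby illustration of what the two selectors match. It does not call the MongoDB driver; the select blocks are hand-written equivalents of {"tags" => "foo"} (top-level field only) and {"jobs.tags" => "foo"} (dot notation into the embedded documents), applied to the two documents from the question.

```ruby
docs = [
  { 'tags' => %w[foo bar],
    'jobs' => [{ 'title' => 'Chief Donkey Wrangler', 'tags' => %w[donkeys] }] },
  { 'tags' => %w[baz bar],
    'jobs' => [{ 'title' => 'Trainee Donkey Wrangler', 'tags' => %w[donkeys foo] }] }
]

# Equivalent of {"tags" => "foo"}: only the top-level tags array is consulted.
top_level = docs.select { |doc| doc['tags'].include?('foo') }

# Equivalent of {"jobs.tags" => "foo"}: looks inside each embedded job.
nested = docs.select do |doc|
  doc['jobs'].any? { |job| job['tags'].include?('foo') }
end

puts top_level.map { |doc| doc['jobs'].first['title'] }.inspect
puts nested.map { |doc| doc['jobs'].first['title'] }.inspect
```

The two result sets do not overlap, which is the point made above: a selector on "tags" never descends into "jobs" unless the path says so.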
