Pagination with MongoDB - ruby

I have been using MongoDB and RoR to store logging data. I am pulling out the data and looking to page the results. Has anyone done paging with MongoDB, or does anyone know of any online resources that might help get me started?
Cheers
Eef

Pagination in MongoDB can be accomplished by using a combination of limit() and skip().
For example, assume we have a collection called users in our active database.
>> db.users.find().limit(3)
This retrieves a list of the first three user documents for us. Note, this is essentially the same as writing:
>> db.users.find().skip(0).limit(3)
For the next three, we can do this:
>> db.users.find().skip(3).limit(3)
This skips over the first three user records, and gives us the next three. If there is only one more user in your database, don't worry; MongoDB is smart enough to only return data that is present, and won't crash.
This can be generalised like so, and would be roughly equivalent to what you would do in a web application. Assuming we have a variable called PAGE_SIZE, which is set to 3, and an arbitrary 1-based PAGE_NUMBER:
>> db.users.find().skip(PAGE_SIZE * (PAGE_NUMBER - 1)).limit(PAGE_SIZE)
I cannot speak directly as to how to employ this method in Ruby on Rails, but I suspect the Ruby MongoDB library exposes these methods.
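As a rough illustration of the same arithmetic in Ruby, here is a minimal sketch. The names page_window and fetch_page are hypothetical helpers, and a plain Array stands in for the users collection so the example runs without a database.

```ruby
# Translate a 1-based page number into the skip/limit pair used above.
# page_window is a hypothetical helper name, not part of any driver.
def page_window(page_number, page_size)
  raise ArgumentError, "page_number must be >= 1" if page_number < 1
  raise ArgumentError, "page_size must be >= 1" if page_size < 1

  { skip: page_size * (page_number - 1), limit: page_size }
end

# Stand-in for a collection: seven user documents held in a plain Array.
def sample_users
  (1..7).map { |i| { "_id" => i, "name" => "user#{i}" } }
end

# Emulates find().skip(n).limit(m) on an Array of documents.
def fetch_page(docs, page_number, page_size)
  window = page_window(page_number, page_size)
  docs.drop(window[:skip]).first(window[:limit])
end

page_two  = fetch_page(sample_users, 2, 3) # documents 4, 5 and 6
last_page = fetch_page(sample_users, 3, 3) # only document 7 remains
```

With the official mongo Ruby driver, the equivalent query should look close to collection.find.skip(window[:skip]).limit(window[:limit]), but check that against the driver version you are using.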

Related

Umbraco 8 - Get Children Of Node Using ContentAtXPath() Method

I've been refactoring an existing Umbraco project to use more performant querying when getting back document data as everything was previously being returned using LINQ. I have been using a combination of Umbraco's querying via XPaths and Examine.
I am currently stumped on trying to get child documents using the Umbraco.ContentAtXPath() method. What I would like to do is get child documents based on a path I pass to the method. This is what I have currently:
IEnumerable<IPublishedContent> umbracoPages = Umbraco.ContentAtXPath("//* [@isDoc]/descendant::/About [@isDoc]");
Running this returns an "Object reference not set to an instance of an object." error, and I am unable to see exactly where I'm going wrong (I'm new to this form of querying in Umbraco).
Ideally, I'd like to enhance the querying to also carry out sorting using the non-LINQ approach, as demonstrated here.
Up until Umbraco 8, content was cached in an XML file, which made XPath perfect for querying content efficiently. In v8, however, the so-called "NuCache" is neither file-based nor XML-based, so the XPath query support is only there for... well... old times' sake, I guess? Either way, it's probably not going to be super efficient and (I'd advise) not something to "aim for". That said, I of course don't know what you are changing from (LINQ can be a lot of things) :-/
It certainly depends on how big your dataset is.
As Umbraco has moved away from the XML-backed cache, you should look into LINQ queries against your content models. Make sure you use ModelsBuilder to generate the models.
On a small dataset, LINQ will be much quicker than Examine. On a large dataset, Examine/Lucene will be much more consistent in performance.
Querying NuCache is pretty fast in Umbraco 8, only beaten by an Examine search.
Assuming you're using Models Builder, and your About page is a child of Home page, you could use:
var homePage = (HomePage) Model.Root();
var aboutPage = homePage?.Children<AboutPage>().FirstOrDefault();
var umbracoPages = aboutPage?.Children();
Where HomePage is your home page Document Type Alias and AboutPage is your About page Document Type alias.

If I eager load associated child records, then that means future WHERE retrievals won't dig through database again?

Just trying to understand... if at the start of some method I eager load a record and its associated children like this:
@object = Object.includes(:children).where(email: "test@example.com").first
Then does that mean that if later I have to look through that object's children this will not generate more database queries?
I.e.,
@found_child = @object.children.where(type_of_child: "this type").first
Unfortunately not - using ActiveRecord::Relation methods such as where will query the database again.
You could however filter the data without any further queries, using the standard Array / Enumerable methods:
@object.children.detect { |child| child.type_of_child == "this type" }
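To make the in-memory filtering concrete, here is a small sketch in plain Ruby. The Struct classes below are stand-ins for the ActiveRecord models and are purely an assumption for illustration; the point is only that detect and select work on records that are already loaded, without touching the database.

```ruby
# Plain Structs stand in for ActiveRecord models; no database is involved.
Child  = Struct.new(:name, :type_of_child)
Parent = Struct.new(:email, :children)

# Imagine these children were already loaded by the eager-loading query.
PARENT = Parent.new("test@example.com", [
  Child.new("a", "other type"),
  Child.new("b", "this type"),
  Child.new("c", "this type")
])

# detect returns the first matching child, or nil when nothing matches.
def first_child_of_type(parent, type)
  parent.children.detect { |child| child.type_of_child == type }
end

# select returns every matching child as an Array.
def children_of_type(parent, type)
  parent.children.select { |child| child.type_of_child == type }
end
```

detect plays the role of where(...).first and select the role of where(...).to_a, except that both operate on the collection already held in memory.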
It will generate another database query in your case.
Eager loading is used to avoid N+1 queries. This is done by loading all associated objects up front. But this doesn't help when you later filter that list with where: Rails will then build a new query and run it.
That said, in your example the includes call actually makes your code slower, because it loads the associated objects but cannot use them.
I would change your example to:
@object = Object.find_by(email: "test@example.com")
@found_child = @object.children.find_by(type_of_child: "this type")

How do I get a count of the Shopify products in a collection using the Ruby API

I want to get the count of products in each collection in the shop as part of a Shopify App that I'm building.
I know that for a single collection Product.all(params: {collection_id: 29238895}).count will show me the count in the shopify console, but I'm not certain about how it is implemented.
The API documentation describes a call that counts all products that belong to a certain collection, GET /admin/products/count.json?collection_id=841564295, but I have been unable to write a Ruby expression that runs this.
Is there a more complete document on the Ruby API?
If you want to know exactly what is going on with the API, may I suggest the simple command: bundle open shopify_api
That will load the entire API into your text editor, allowing you to quickly determine the answer to your question. The /lib/resources directory is especially rich, but do not forget to check the base class as well. In fact, I think the count option is declared right in the base itself. Nothing beats a few minutes of examining the code.

Caching dataset results in Sequel and Sinatra

I'm building an API with Sinatra along with Sequel as ORM on a postgres database.
I have some complex datasets to query page by page, so I'd like to keep the dataset cached for subsequent page requests after the first call.
I've read that Sequel datasets are cached by default, but I need to keep this object between two requests to benefit from this behavior.
So I was thinking of storing this object somewhere and retrieving it later if the same query is called again, rather than building a whole new dataset each time.
I tried the Sinatra session hash, but I get TypeError: can't dump anonymous class #<Class:0x000000028c13b8> when putting the dataset object in it.
I'm wondering whether to use memcached for that.
Any advice on the best way to do that would be much appreciated, thanks.
Memcached or Redis (using LRU) would likely be appropriate solutions for what you are describing. The Ruby Dalli gem makes it pretty easy to get started with memcached. You can find it at https://github.com/mperham/dalli.
On the GitHub page you will see the following basic example:
require 'dalli'
options = { :namespace => "app_v1", :compress => true }
dc = Dalli::Client.new('localhost:11211', options)
dc.set('abc', 123)
value = dc.get('abc')
This illustrates the basics of using the gem. Consider that Memcached is simply a key/value store utilizing LRU (least recently used) eviction. This means you allocate memory to Memcached and let your keys organically expire unless there is a reason to manually expire the key.
From there, it is simply a matter of attempting to fetch a key from memcached, and then only running your real queries if no match is found.
found = dc.get('my_unique_key')
unless found
  # Cache miss: do your Sequel query here and keep its result
  found = 'value_goes_here'
  dc.set('my_unique_key', found)
end
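The fetch-or-query pattern above could be wrapped in a helper along these lines. This is only a sketch under stated assumptions: FakeCache is an in-memory stand-in that mimics the get/set calls shown earlier (it is not the Dalli API itself), and cached_page and its key format are hypothetical. Note that it caches the materialised rows rather than the dataset object, since the dataset object is what failed to serialise in the question.

```ruby
# In-memory stand-in for a memcached client. It mimics only the get/set
# calls used above and is NOT the Dalli API itself.
class FakeCache
  def initialize
    @store = {}
  end

  def get(key)
    @store[key]
  end

  def set(key, value)
    @store[key] = value
  end
end

# Hypothetical helper: return the cached rows for a page, running the
# (expensive) query block only on a cache miss.
def cached_page(cache, query_name, page, page_size)
  key = "#{query_name}:page=#{page}:size=#{page_size}"
  rows = cache.get(key)
  return rows unless rows.nil?

  rows = yield          # e.g. a Sequel query materialised into an Array
  cache.set(key, rows)  # store plain rows, not the dataset object
  rows
end

cache = FakeCache.new
rows = cached_page(cache, "logs", 1, 2) { [{ id: 1 }, { id: 2 }] }
```

Building the key from the query name, page number, and page size keeps each page in its own cache entry, so a request for page 2 never returns the rows cached for page 1.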

Advice on how to array a MongoDB document

I am building an API using Codeigniter and MongoDB.
I have some questions about how to "model" the MongoDB data.
A user should have basic data like a name, and a user should also be able to follow other users. As it stands, each user document keeps track of all the people who are following him and all the people he is following. This is done using arrays of user _ids.
Like this:
"following": [323424,2323123,2312312],
"followers": [355656,5656565,5656234234,23424243,234246456],
"fullname": "James Bond"
Is this a good way? Perhaps the user document should only contain the ids of people that the user is following, and not of those who are following him? I can imagine that keeping potentially thousands of ids (for followers) in an array will make the document too big?
All input is welcome!
The maximum document size is currently limited to 16MB (v1.8.x and up), which is pretty big. But I still think it would be OK in this case to move the follower relations to their own collection; you never know how big your project will get.
However, I would recommend using database references for storing the follower relations: it's much easier to resolve the user from a database reference. Have a look at:
http://www.mongodb.org/display/DOCS/Database+References
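One way to picture the separate-collection idea is one small document per "follows" relationship. The sketch below is an assumption for illustration only: the field names follower_id and followee_id are hypothetical, and a plain Array of Hashes stands in for the MongoDB collection.

```ruby
# One document per "follows" relationship, kept in its own collection.
# A plain Array of Hashes stands in for the MongoDB collection here.
FOLLOWS = [
  { "follower_id" => 1, "followee_id" => 2 },
  { "follower_id" => 3, "followee_id" => 2 },
  { "follower_id" => 2, "followee_id" => 1 }
]

# Ids of everyone who follows the given user.
def followers_of(follows, user_id)
  follows.select { |f| f["followee_id"] == user_id }
         .map { |f| f["follower_id"] }
end

# Ids of everyone the given user follows.
def following_of(follows, user_id)
  follows.select { |f| f["follower_id"] == user_id }
         .map { |f| f["followee_id"] }
end
```

With this layout the user document stays small no matter how many followers a user has, at the cost of an extra query to resolve the relationship in either direction.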
