Advice on how to structure a MongoDB document - CodeIgniter

I am building an API using CodeIgniter and MongoDB.
I have some questions about how to model the data in MongoDB.
A user should have basic data such as a name, and should also be able to
follow other users. As it stands, each user document keeps track of everyone
who follows that user and everyone that user follows. This is done with arrays
of user _ids.
Like this:
"following": [323424,2323123,2312312],
"followers": [355656,5656565,5656234234,23424243,234246456],
"fullname": "James Bond"
Is this a good approach? Perhaps the user document should only contain the ids of the people the user is following, and not the ids of those following him? I can imagine that keeping potentially thousands of ids (for followers) in an array will make the document too big.
All input is welcome!

The maximum document size is currently 16MB (v1.8.x and up), which is pretty big. Still, I think it would be reasonable in this case to move the follower relations into their own collection; you never know how big your project will get.
However, I would recommend using database references to store the follower relations: it is much easier to resolve a user from a database reference. Have a look at:
http://www.mongodb.org/display/DOCS/Database+References
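As a rough illustration of moving the follower relations into their own collection, the sketch below models each relation as one small document. It uses plain Python lists and dicts as an in-memory stand-in for a collection. The collection name `follows`, the field names `follower_id` and `followee_id`, and the user id `1` are assumptions made for this example, not something from the question.

```python
# In-memory stand-in for a separate "follows" collection (hypothetical name).
# Each relation is its own small document, so no user document grows unbounded.
follows = [
    {"follower_id": 323424, "followee_id": 1},
    {"follower_id": 2323123, "followee_id": 1},
    {"follower_id": 1, "followee_id": 355656},
]


def followers_of(user_id, relations):
    """Return the ids of everyone who follows user_id."""
    return [r["follower_id"] for r in relations if r["followee_id"] == user_id]


def following_of(user_id, relations):
    """Return the ids of everyone user_id follows."""
    return [r["followee_id"] for r in relations if r["follower_id"] == user_id]


print(followers_of(1, follows))
print(following_of(1, follows))
```

In a real collection each lookup would be a query on a single field, so indexing both fields would be the natural starting point.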

Related

Umbraco 8 - Get Children Of Node Using ContentAtXPath() Method

I've been refactoring an existing Umbraco project to use more performant querying when getting back document data as everything was previously being returned using LINQ. I have been using a combination of Umbraco's querying via XPaths and Examine.
I am currently stumped on trying to get child documents using the Umbraco.ContentAtXPath() method. What I would like to do is get child documents based on a path I pass to the method. This is what I have currently:
IEnumerable<IPublishedContent> umbracoPages = Umbraco.ContentAtXPath("//* [@isDoc]/descendant::/About [@isDoc]");
Running this returns an "Object reference not set to an instance of an object." error, and I am unable to see exactly where I'm going wrong (I am new to this form of querying in Umbraco).
Ideally, I'd like to enhance the querying to also carry out sorting using the non-LINQ approach, as demonstrated here.
Up until Umbraco 8, content was cached in an XML file, which made XPath perfect for querying content efficiently. In v8, however, the so-called "NuCache" is neither file based nor XML based, so the XPath query support is only there for... well... old times' sake, I guess? Either way it's probably not going to be super efficient and (I'd advise) not something to "aim for". That said, I of course don't know what you are changing from (LINQ can be a lot of things) :-/
It certainly depends on how big your dataset is.
As Umbraco has moved away from the XML-backed cache, you should look into LINQ queries against your content models. Make sure you use ModelsBuilder to generate the models.
On a small dataset LINQ will be much quicker than Examine. On a large dataset Examine/Lucene will give much steadier performance.
Querying NuCache is pretty fast in Umbraco 8, only beaten by an Examine search.
Assuming you're using Models Builder, and your About page is a child of Home page, you could use:
var homePage = (HomePage) Model.Root();
// Null-conditional access avoids a NullReferenceException if either page is missing
var aboutPage = homePage?.Children<AboutPage>().FirstOrDefault();
var umbracoPages = aboutPage?.Children();
Where HomePage is your home page Document Type Alias and AboutPage is your About page Document Type alias.

How to best create sorted sets with Redis

I'm still a bit lost when it comes to Sorted Sets and how to best construct them. Currently I have a simple set of activities on my site. Normally it will display things like User Followed, User Liked, User Post, etc. The JSON looks something like...
id: 2808697,
activity_type: "created_follower",
description: "Bob followed this profile",
body: null,
user: "Bob",
user_id: 99384,
user_profile_id: 233007,
user_channel_id: 2165811,
user_cube_url: "bob-anerson",
user_action: "followed this profile",
buddy: "http://s3.amazonaws.com/stuff/ju-logo.jpg",
affected: "Bill Anerson is following Jon Denver.",
created_at: "2014-06-24T20:34:11-05:00",
created_ms: 1403660051902,
profile_id: 232811,
channel_id: 2165604,
cube_url: "jondenver",
type: "profiles",
So if the activity type can be multiple things (e.g. Created Follow, Liked Event, Posted News, etc.), how would I go about putting this all in a sorted set? I'm already sure I want the score to be created_ms, but the question is: can I store multiple values in a sorted set that all have keys as fields? Should most of this be in a hash? I realize this is a fairly open question, but after trying to wrap my head around all the tutorials I'm concerned about setting up the data structure beforehand so I don't get caught too deep in the weeds.
A sorted set is useful if you want to... keep stuff sorted! ;)
So, I assume you're interested in keeping the activities sorted by their creation time (ms). As for storing the actual data, you have two options:
Use the sorted set itself to store the data, even in native JSON format. Note that with this approach you'll only be able to fetch the entire JSON, and you'll have to parse it on the client.
Alternatively, use the sorted set to store "pointers" to hashes, i.e. the members will be key names under which you'll store the data. From your description, this appears to be the preferable approach.
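The second option can be sketched as follows. This is a minimal in-memory model in Python, not Redis client code: a dict stands in for the hashes and a sorted list of (score, member) pairs stands in for the sorted set. The `activity:<id>` key scheme, the function names, and the second sample activity are assumptions made for illustration. The comments name the Redis commands each step roughly corresponds to.

```python
import bisect

# In-memory stand-ins: `hashes` plays the role of Redis hashes,
# `timeline` the role of one sorted set of (score, member) pairs.
hashes = {}
timeline = []


def add_activity(activity):
    """Store the fields in a 'hash' and index its key by created_ms."""
    key = "activity:%d" % activity["id"]  # hypothetical key scheme
    hashes[key] = dict(activity)  # roughly: HSET key field value ...
    bisect.insort(timeline, (activity["created_ms"], key))  # roughly: ZADD
    return key


def latest(count):
    """Return the newest `count` activities, newest first."""
    if count <= 0:
        return []
    newest = reversed(timeline[-count:])  # roughly: ZREVRANGE timeline 0 count-1
    return [hashes[key] for _score, key in newest]  # roughly: HGETALL per key


add_activity({"id": 2808697, "activity_type": "created_follower",
              "created_ms": 1403660051902})
add_activity({"id": 2808698, "activity_type": "liked_event",
              "created_ms": 1403660059000})
print([a["activity_type"] for a in latest(2)])
```

Because the sorted set holds only key names, every activity type can share one timeline while each hash keeps whatever fields that type needs.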

Using Product and Location Models, how to find "deals" near locations (Rails 3.1.1)

I have a Product model holding deals of stores (another model, Store) across a whole city. Now if someone selects a particular store, I want my view to display deals of all stores in geographically nearby areas of that store (say within a range of 3 miles).
One way would be to find all deals by zip code, but I am wondering if there is a better way to do this. Maybe some gem...
Thanks.
Use the geokit gem: http://geokit.rubyforge.org/ . Example:
Store.find(:all, :origin =>[37.792,-122.393], :within=>10)
It works with a relational database. However, it is not optimized the way geospatial databases are.
What you're looking for is a spatial database. You can achieve this with Postgres via PostGIS. I'd also highly recommend using GeoServer or MapServer as a front-end to PostGIS. You're going to want to do some serious reading on GIS in general. This is not a topic to cover in a single answer. You may want to spend some time poking around the OSGeo site.
If you're feeling trendy, you can use MongoDB's spatial indexes. This is probably what I would recommend if you're looking for a quick fix. FourSquare actually runs entirely on MongoDB's spatial functionality. It's what they use to find people close-by. So with Mongo you could find nearby deals with something like
db.deals.find({
    loc: {
        $near: [YOUR_X, YOUR_Y],
        $maxDistance: DEAL_DISTANCE
    }
});
This will return all deals that are within DEAL_DISTANCE of your coordinates.
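To make the idea of a "within N miles" query concrete, here is a brute-force sketch in Python that computes the great-circle (haversine) distance from an origin to every store and keeps the close ones. It is only an illustration of what a spatial index answers far more efficiently, not how geokit, PostGIS, or MongoDB implement it. The store names and coordinates are made up, and the origin reuses the coordinates from the geokit example.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points, in miles."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def stores_within(origin, stores, max_miles):
    """Return names of stores no farther than max_miles from origin."""
    lat0, lon0 = origin
    return [s["name"] for s in stores
            if haversine_miles(lat0, lon0, s["lat"], s["lon"]) <= max_miles]


# Hypothetical sample data
stores = [
    {"name": "near", "lat": 37.80, "lon": -122.40},
    {"name": "far", "lat": 37.90, "lon": -122.393},
]
print(stores_within((37.792, -122.393), stores, 3))
```

Scanning every row like this is fine for a few hundred stores; beyond that, a spatial index avoids computing the distance to each one.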

Pagination with MongoDB

I have been using MongoDB and RoR to store logging data. I am pulling out the data and looking to page the results. Has anyone done paging with MongoDB or know of any resources online that might help get me started?
Cheers
Eef
Pagination in MongoDB can be accomplished by using a combination of limit() and skip().
For example, assume we have a collection called users in our active database.
>> db.users.find().limit(3)
This retrieves a list of the first three user documents for us. Note, this is essentially the same as writing:
>> db.users.find().skip(0).limit(3)
For the next three, we can do this:
>> db.users.find().skip(3).limit(3)
This skips over the first three user records, and gives us the next three. If there is only one more user in your database, don't worry; MongoDB is smart enough to only return data that is present, and won't crash.
This can be generalised like so, and would be roughly equivalent to what you would do in a web application. Assume we have a variable called PAGE_SIZE, which is set to 3, and an arbitrary 1-based PAGE_NUMBER:
>> db.users.find().skip(PAGE_SIZE * (PAGE_NUMBER - 1)).limit(PAGE_SIZE)
I cannot speak directly as to how to employ this method in Ruby on Rails, but I suspect the Ruby MongoDB library exposes these methods.
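The skip/limit arithmetic above can be checked without a database. The sketch below, in Python as a neutral illustration, computes the same skip value for a 1-based page number and applies it to a plain list standing in for the query result. The function names and the sample users are assumptions made for this example.

```python
def page_window(page_number, page_size):
    """Return (skip, limit) for a 1-based page number."""
    if page_number < 1 or page_size < 1:
        raise ValueError("page_number and page_size must be >= 1")
    return page_size * (page_number - 1), page_size


def paginate(items, page_number, page_size):
    """Apply skip and limit to a list, mimicking skip().limit()."""
    skip, limit = page_window(page_number, page_size)
    return items[skip:skip + limit]


users = ["user%d" % i for i in range(1, 8)]  # seven hypothetical users
print(paginate(users, 1, 3))
print(paginate(users, 3, 3))
```

With seven users and a page size of 3, the third page holds a single user and the fourth is empty, which mirrors the behaviour described above when fewer documents remain than the limit.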

How to quickly find a sharepoint document library by id?

Given the SPList.ID and a site collection (or an SPWeb with subwebs), how do I quickly find the document library with the given ID?
I can recursively enumerate through all webs and perform a web.Lists[guid] on each one of them, but there might be thousands of subwebs in my case, and I'm looking for a realtime solution.
If there is no way to do this quickly, any other suggestions on how to uniquely identify a document library? I could store the full path (url), but the identification will be publicly visible and I don't feel very comfortable giving away our exact SharePoint document structure like that. Should I resort to maintaining a manual ID <-> library mapping in a separate list?
I vote for the manual ID -> URL pair matching in a top-level, well-known list that's visible only to the elevated privileges account.
Since you are storing the ListID somewhere, you may also store the WebId. Lists are always opened through the context SPWeb, so if you go to:
http://toplevel/_layouts/ListGeneralSettings.aspx?ID={GUID1} // OK
http://toplevel/sub1/_layouts/ListGeneralSettings.aspx?ID={GUID1} // Won't work (same GUID)
Having the WebId and ListId you can simply:
// Dispose the SPSite as well as the SPWeb to avoid leaking resources
using (SPSite site = new SPSite("http://url"))
using (SPWeb subweb = site.OpenWeb(new Guid("{000...}")))
{
    SPList list = subweb.Lists.GetList(new Guid("{111...}"), true);
    // list logic
}
MS does not support this :)...
But take a look at this for giggles: http://weblogs.sqlteam.com/jhermiz/archive/2007/08/15/60288.aspx
If you have MOSS Search available, then it might help, depending on the lag you have between these lists getting created and needing to search for them. You could probably map list id as a managed property and do a quick search for list objects with the id in question.
For lots of classes of problems, search seems to be the fastest way to rip through huge sets of data. In fact, if this approach worked for you, you wouldn't even need to know the site collection up front. I don't have access to any of my MOSS environments at the moment, so I can't verify that this will work.

Resources