How to select distinct row in CouchDB? - view

I have the view:
function (doc) {
var obj;
obj = {
one:doc.document.someParameter1,
two:doc.document.someParameter2
};
emit(doc.document.id, obj);}
On request it returns several something like
{"total_rows":511,"offset":381,"rows":[
{"id":"CDOC_2.16.840.1.113883.3.59.3:0947___QCPR___80717","key":"7012979","value":{"one":"one","two":"two"}},
{"id":"CDOC_2.16.840.1.113883.3.59.3:0947___QCPR___80921","key":"7012979","value":{"one":"one","two":"two"}}
]}
Is there a way instead of several results get just one?
Of course, I could do filtering on the application side, but this could be very expensive since I have to transfer all unnecessary results.

Related

Angular Meteor objects not acting as expected

I am working with Angular Meteor and am having an issue with my objects/arrays. I have this code:
angular.module("learn").controller("CurriculumDetailController", ['$scope', '$stateParams', '$meteor',
function($scope, $stateParams, $meteor){
$scope.curriculum = $meteor.object(CurriculumList, $stateParams.curriculumId);
$scope.resources = _.map($scope.curriculum.resources, function(obj) {
return ResourceList.findOne({_id:obj._id})
});
console.log($scope.resources)
}]);
I am attempting to iterate over 'resources', which is a nested array in the curriculum object, look up each value in the 'ResourceList' collection, and return the new array in the scope.
Problem is, sometimes it works, sometimes it doesnt. When I load up the page and access it through a UI-router link. I get the array as expected. But if the page is refreshed, $scope.resources is an empty array.
My thought is there is something going on with asynchronous calls but have not been able for find a solution. I still have the autopublish package installed. Any help would be appreciated.
What you're going to do is return a cursor containing all the information you want, then you can work with $meteor.object on the client side if you like. Normally, publishComposite would look something like this: (I don't know what your curriculum.resources looks like)
Use this method if the curriculum.resources has only ONE id:
// this takes the place of the publish method
Meteor.publishComposite('curriculum', function(id) {
return {
find: function() {
// Here you are getting the CurriculumList based on the id, or whatever you want
return CurriculumList.find({_id: id});
},
children: [
{
find: function(curr) {
// (curr) will be each of the CurriculumList's found from the parent query
// Normally you would do something like this:
return ResourceList.find(_id: curr.resources[0]._id);
}
}
]
}
})
This method if you have multiple resources:
However, since it looks like your curriculum is going to have a resources list with one or many objects with id's then we need to build the query before returning anything. Try something like:
// well use a function so we can send in an _id
Meteor.publishComposite('curriculum', function(id){
// we'll build our query before returning it.
var query = {
find: function() {
return CurriculumList.find({_id: id});
}
};
// now we'll fetch the curriculum so we can access the resources list
var curr = CurriculumList.find({_id: id}).fetch();
// this will pluck the ids from the resources and place them into an array
var rList = _.pluck(curr.resources, '_id');
// here we'll iterate over the resource ids and place a "find" object into the query.children array.
query.children = [];
_.each(rList, function(id) {
var childObj = {
find: function() {
return ResourceList.find({_id: id});
}
};
query.children.push(childObj)
})
return query;
});
So what should happen here (I didn't test) is with one publish function you will be getting the Curriculum you want, plus all of it's resourceslist children.
Now you will have access to these on the client side.
$scope.curriculum = $meteor.object(CurriculumList, $stateParams.curriculumId);
// collection if more than one, object if only one.
$scope.resources = $meteor.collection(ResoursesList, false);
This was thrown together somewhat quickly so I apologize if it doesn't work straight off, any trouble I'll help you fix.

Scalable Contains method for LINQ against a SQL backend

I'm looking for an elegant way to execute a Contains() statement in a scalable way. Please allow me to give some background before I come to the actual question.
The IN statement
In Entity Framework and LINQ to SQL the Contains statement is translated as a SQL IN statement. For instance, from this statement:
var ids = Enumerable.Range(1,10);
var courses = Courses.Where(c => ids.Contains(c.CourseID)).ToList();
Entity Framework will generate
SELECT
[Extent1].[CourseID] AS [CourseID],
[Extent1].[Title] AS [Title],
[Extent1].[Credits] AS [Credits],
[Extent1].[DepartmentID] AS [DepartmentID]
FROM [dbo].[Course] AS [Extent1]
WHERE [Extent1].[CourseID] IN (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
Unfortunately, the In statement is not scalable. As per MSDN:
Including an extremely large number of values (many thousands) in an IN clause can consume resources and return errors 8623 or 8632
which has to do with running out of resources or exceeding expression limits.
But before these errors occur, the IN statement becomes increasingly slow with growing numbers of items. I can't find documentation about its growth rate, but it performs well up to a few thousands of items, but beyond that it gets dramatically slow. (Based on SQL Server experiences).
Scalable
We can't always avoid this statement. A JOIN with the source data in stead would generally perform much better, but that's only possible when the source data is in the same context. Here I'm dealing with data coming from a client in a disconnected scenario. So I have been looking for a scalable solution. A satisfactory approach turned out to be cutting the operation into chunks:
var courses = ids.ToChunks(1000)
.Select(chunk => Courses.Where(c => chunk.Contains(c.CourseID)))
.SelectMany(x => x).ToList();
(where ToChunks is this little extension method).
This executes the query in chunks of 1000 that all perform well enough. With e.g. 5000 items, 5 queries will run that together are likely to be faster than one query with 5000 items.
But not DRY
But of course I don't want to scatter this construct all over my code. I am looking for an extension method by which any IQueryable<T> can be transformed into a chunky executing statement. Ideally something like this:
var courses = Courses.Where(c => ids.Contains(c.CourseID))
.AsChunky(1000)
.ToList();
But maybe this
var courses = Courses.ChunkyContains(c => c.CourseID, ids, 1000)
.ToList();
I've given the latter solution a first shot:
public static IEnumerable<TEntity> ChunkyContains<TEntity, TContains>(
this IQueryable<TEntity> query,
Expression<Func<TEntity,TContains>> match,
IEnumerable<TContains> containList,
int chunkSize = 500)
{
return containList.ToChunks(chunkSize)
.Select (chunk => query.Where(x => chunk.Contains(match)))
.SelectMany(x => x);
}
Obviously, the part x => chunk.Contains(match) doesn't compile. But I don't know how to manipulate the match expression into a Contains expression.
Maybe someone can help me make this solution work. And of course I'm open to other approaches to make this statement scalable.
I’ve solved this problem with a little different approach a view month ago. Maybe it’s a good solution for you too.
I didn’t want my solution to change the query itself. So a ids.ChunkContains(p.Id) or a special WhereContains method was unfeasible. Also should the solution be able to combine a Contains with another filter as well as using the same collection multiple times.
db.TestEntities.Where(p => (ids.Contains(p.Id) || ids.Contains(p.ParentId)) && p.Name.StartsWith("Test"))
So I tried to encapsulate the logic in a special ToList method that could rewrite the Expression for a specified collection to be queried in chunks.
var ids = Enumerable.Range(1, 11);
var result = db.TestEntities.Where(p => Ids.Contains(p.Id) && p.Name.StartsWith ("Test"))
.ToChunkedList(ids,4);
To rewrite the expression tree I discovered all Contains Method calls from local collections in the query with a view helping classes.
private class ContainsExpression
{
public ContainsExpression(MethodCallExpression methodCall)
{
this.MethodCall = methodCall;
}
public MethodCallExpression MethodCall { get; private set; }
public object GetValue()
{
var parent = MethodCall.Object ?? MethodCall.Arguments.FirstOrDefault();
return Expression.Lambda<Func<object>>(parent).Compile()();
}
public bool IsLocalList()
{
Expression parent = MethodCall.Object ?? MethodCall.Arguments.FirstOrDefault();
while (parent != null) {
if (parent is ConstantExpression)
return true;
var member = parent as MemberExpression;
if (member != null) {
parent = member.Expression;
} else {
parent = null;
}
}
return false;
}
}
private class FindExpressionVisitor<T> : ExpressionVisitor where T : Expression
{
public List<T> FoundItems { get; private set; }
public FindExpressionVisitor()
{
this.FoundItems = new List<T>();
}
public override Expression Visit(Expression node)
{
var found = node as T;
if (found != null) {
this.FoundItems.Add(found);
}
return base.Visit(node);
}
}
public static List<T> ToChunkedList<T, TValue>(this IQueryable<T> query, IEnumerable<TValue> list, int chunkSize)
{
var finder = new FindExpressionVisitor<MethodCallExpression>();
finder.Visit(query.Expression);
var methodCalls = finder.FoundItems.Where(p => p.Method.Name == "Contains").Select(p => new ContainsExpression(p)).Where(p => p.IsLocalList()).ToList();
var localLists = methodCalls.Where(p => p.GetValue() == list).ToList();
If the local collection passed in the ToChunkedList method was found in the query expression, I replace the Contains call to the original list with a new call to a temporary list containing the ids for one batch.
if (localLists.Any()) {
var result = new List<T>();
var valueList = new List<TValue>();
var containsMethod = typeof(Enumerable).GetMethods(BindingFlags.Static | BindingFlags.Public)
.Single(p => p.Name == "Contains" && p.GetParameters().Count() == 2)
.MakeGenericMethod(typeof(TValue));
var queryExpression = query.Expression;
foreach (var item in localLists) {
var parameter = new List<Expression>();
parameter.Add(Expression.Constant(valueList));
if (item.MethodCall.Object == null) {
parameter.AddRange(item.MethodCall.Arguments.Skip(1));
} else {
parameter.AddRange(item.MethodCall.Arguments);
}
var call = Expression.Call(containsMethod, parameter.ToArray());
var replacer = new ExpressionReplacer(item.MethodCall,call);
queryExpression = replacer.Visit(queryExpression);
}
var chunkQuery = query.Provider.CreateQuery<T>(queryExpression);
for (int i = 0; i < Math.Ceiling((decimal)list.Count() / chunkSize); i++) {
valueList.Clear();
valueList.AddRange(list.Skip(i * chunkSize).Take(chunkSize));
result.AddRange(chunkQuery.ToList());
}
return result;
}
// if the collection was not found return query.ToList()
return query.ToList();
Expression Replacer:
private class ExpressionReplacer : ExpressionVisitor {
private Expression find, replace;
public ExpressionReplacer(Expression find, Expression replace)
{
this.find = find;
this.replace = replace;
}
public override Expression Visit(Expression node)
{
if (node == this.find)
return this.replace;
return base.Visit(node);
}
}
Please allow me to provide an alternative to the Chunky approach.
The technique involving Contains in your predicate works well for:
A constant list of values (no volatile).
A small list of values.
Contains will do great if your local data has those two characteristics because these small set of values will be hardcoded in the final SQL query.
The problem begins when your list of values has entropy (non-constant). As of this writing, Entity Framework (Classic and Core) do not try to parameterize these values in any way, this forces SQL Server to generate a query plan every time it sees a new combination of values in your query. This operation is expensive and gets aggravated by the overall complexity of your query (e.g. many tables, a lot of values in the list, etc.).
The Chunky approach still suffers from this SQL Server query plan cache pollution problem, because it does not parametrizes the query, it just moves the cost of creating a big execution plan into smaller ones that are more easy to compute (and discard) by SQL Server, furthermore, every chunk adds an additional round-trip to the database, which increases the time needed to resolve the query.
An Efficient Solution for EF Core
🎉 NEW! QueryableValues EF6 Edition has arrived!
For EF Core keep reading below.
Wouldn't it be nice to have a way of composing local data in your query in a way that's SQL Server friendly? Enter QueryableValues.
I designed this library with these two main goals:
It MUST solve the SQL Server's query plan cache pollution problem ✅
It MUST be fast! ⚡
It has a flexible API that allows you to compose local data provided by an IEnumerable<T> and you get back an IQueryable<T>; just use it as if it were another entity of your DbContext (really), e.g.:
// Sample values.
IEnumerable<int> values = Enumerable.Range(1, 1000);
// Using a Join (query syntax).
var query1 =
from e in dbContext.MyEntities
join v in dbContext.AsQueryableValues(values) on e.Id equals v
select new
{
e.Id,
e.Name
};
// Using Contains (method syntax)
var query2 = dbContext.MyEntities
.Where(e => dbContext.AsQueryableValues(values).Contains(e.Id))
.Select(e => new
{
e.Id,
e.Name
});
You can also compose complex types!
It goes without saying that the provided IEnumerable<T> is only enumerated at the time that your query is materialized (not before), preserving the same behavior of EF Core in this regard.
How Does It Works?
Internally QueryableValues creates a parameterized query and provides your values in a serialized format that is natively understood by SQL Server. This allows your query to be resolved with a single round-trip to the database and avoids creating a new query plan on subsequent executions due to the parameterized nature of it.
Useful Links
Nuget Package
GitHub Repository
Benchmarks
SQL Server Cache Pollution Problem
QueryableValues is distributed under the MIT license
Linqkit to the rescue! Might be a better way that does it directly, but this seems to work fine and makes it pretty clear what's being done. The addition being AsExpandable(), which lets you use the Invoke extension.
using LinqKit;
public static IEnumerable<TEntity> ChunkyContains<TEntity, TContains>(
this IQueryable<TEntity> query,
Expression<Func<TEntity,TContains>> match,
IEnumerable<TContains> containList,
int chunkSize = 500)
{
return containList
.ToChunks(chunkSize)
.Select (chunk => query.AsExpandable()
.Where(x => chunk.Contains(match.Invoke(x))))
.SelectMany(x => x);
}
You might also want to do this:
containsList.Distinct()
.ToChunks(chunkSize)
...or something similar so you don't get duplicate results if something this occurs:
query.ChunkyContains(x => x.Id, new List<int> { 1, 1 }, 1);
Another way would be to build the predicate this way (of course, some parts should be improved, just giving the idea).
public static Expression<Func<TEntity, bool>> ContainsPredicate<TEntity, TContains>(this IEnumerable<TContains> chunk, Expression<Func<TEntity, TContains>> match)
{
return Expression.Lambda<Func<TEntity, bool>>(Expression.Call(
typeof (Enumerable),
"Contains",
new[]
{
typeof (TContains)
},
Expression.Constant(chunk, typeof(IEnumerable<TContains>)), match.Body),
match.Parameters);
}
which you could call in your ChunkContains method
return containList.ToChunks(chunkSize)
.Select(chunk => query.Where(ContainsPredicate(chunk, match)))
.SelectMany(x => x);
Using a stored procedure with a table valued parameter could also work well. You in effect write a joint In the stored procedure between your table / view and the table valued parameter.
https://learn.microsoft.com/en-us/dotnet/framework/data/adonet/sql/table-valued-parameters

Is it possible to retrieve data of all classes in Parse using single REST URL request?

just a scenario :
I have 4 classes created in Parse cloud database for a particular Application - ClassA, ClassB, ClassC, ClassD.
I can retrieve data related to ClassA using REST URL like - https://api.parse.com/1/classes/ClassA
Is it possible to retrieve data of all 4 classes using single REST URL ?
No, it's not possible to do this. You can query from a single class at a time, and a maximum of 1,000 objects.
A cloud function can make multiple queries and merge the results, meaning that a single REST call (to call the function) could return results from multiple classes (but a maximum of 1,000 objects per query). Something like this:
Parse.Cloud.define("GetSomeData", function(request, response) {
var query1 = new Parse.Query("ClassA");
var query2 = new Parse.Query("ClassB");
query1.limit(1000);
query2.limit(1000);
var output = {};
query1.find().then(function(results) {
output['ClassA'] = results;
return query2.find();
}).then(function(results) {
output['ClassB'] = results;
response.success(output);
}, function(error) {
response.error(error);
});
});

Ordering backbone views together with collection

I have a collection which contains several items that should be accessible in a list.
So every element in the collection gets it own view element which is then added to the DOM into one container.
My question is:
How do I apply the sort order I achieved in the collection with a comparator function to the DOM?
The first rendering is easy: you iterate through the collection and create all views which are then appended to the container element in the correct order.
But what if models get changed and are re-ordered by the collection? What if elements are added? I don't want to re-render ALL elements but rather update/move only the necessary DOM nodes.
model add
The path where elements are added is rather simple, as you get the index in the options when a model gets added to a collection. This index is the sorted index, based on that if you have a straightforward view, it should be easy to insert your view at a certain index.
sort attribute change
This one is a bit tricky, and I don't have an answer handy (and I've struggled with this at times as well) because the collection doesn't automatically reshuffle its order after you change an attribute the model got sorted on when you initially added it.
from the backbone docs:
Collections with comparator functions will not automatically re-sort
if you later change model attributes, so you may wish to call sort
after changing model attributes that would affect the order.
so if you call sort on a collection it will trigger a reset event which you can hook into to trigger a redraw of the whole list.
It's highly ineffective when dealing with lists that are fairly long and can seriously reduce user experience or even induce hangs
So the few things you get walking away from this is knowing you can:
always find the index of a model after sorting by calling collection.indexOf(model)
get the index of a model from an add event (3rd argument)
Edit:
After thinking about if for a bit I came up with something like this:
var Model = Backbone.Model.extend({
initialize: function () {
this.bind('change:name', this.onChangeName, this);
},
onChangeName: function ()
{
var index, newIndex;
index = this.collection.indexOf(this);
this.collection.sort({silent: true});
newIndex = this.collection.indexOf(this);
if (index !== newIndex)
{
this.trigger('reindex', newIndex);
// or
// this.collection.trigger('reindex', this, newIndex);
}
}
});
and then in your view you could listen to
var View = Backbone.View.extend({
initialize: function () {
this.model.bind('reindex', this.onReindex, this);
},
onReindex: function (newIndex)
{
// execute some code that puts the view in the right place ilke
$("ul li").eq(newIndex).after(this.$el);
}
});
Thanks Vincent for an awesome solution. There's however a problem with the moving of the element, depending on which direction the reindexed element is moving. If it's moving down, the index of the new location doesn't match the index of what's in the DOM. This fixes it:
var Model = Backbone.Model.extend({
initialize: function () {
this.bind('change:name', this.onChangeName, this);
},
onChangeName: function () {
var fromIndex, toIndex;
fromIndex = this.collection.indexOf(this);
this.collection.sort({silent: true});
toIndex = this.collection.indexOf(this);
if (fromIndex !== toIndex)
{
this.trigger('reindex', fromIndex, toIndex);
// or
// this.collection.trigger('reindex', this, fromIndex, toIndex);
}
}
});
And the example listening part:
var View = Backbone.View.extend({
initialize: function () {
this.model.bind('reindex', this.onReindex, this);
},
onReindex: function (fromIndex, toIndex) {
var $movingEl, $replacingEl;
$movingEl = this.$el;
$replacingEl = $("ul li").eq(newIndex);
if (fromIndex < toIndex) {
$replacingEl.after($movingEl);
} else {
$replacingEl.before($movingEl);
}
}
});

CouchDB "Join" two documents

I have two documents that looks a bit like so:
Doc
{
_id: AAA,
creator_id: ...,
data: ...
}
DataKey
{
_id: ...,
credits_left: 500,
times_used: 0,
data_id: AAA
}
What I want to do is create a view which would allow me to pass the DataKey id (key=DataKey _id) and get both the information of the DataKey and the Doc.
My attempt:
I first tried embedding the DataKey inside the Doc and used a map function like so:
function (doc)
{
if (doc.type == "Doc")
{
var ids = [];
for (var i in doc.keys)
ids.push(doc.keys[i]._id);
emit(ids, doc);
}
}
But i ran into two problems:
There can be multiple DataKey's per
Doc so using startkey=[idhere...]
and endkey=[idhere..., {}] didn't
work (only worked if the key happend
to be the first one in the array).
All the data keys need to be unique, and I would prefer not making a seperate document like {_id = datakey} to reserve the key.
Does anyone have ideas how I can accomplish this? Let me know if anything is unclear.
-----EDIT-----
I forgot to mention that in my application I do not know what the Doc ID is, so I need to be able to search on the DataKey's ID.
I think what you want is
function (doc)
{
if (doc.type == "Doc")
{
emit([doc._id, 0], doc);
}
if(doc.type == "DataKey")
{
emit([doc.data_id, 1], doc);
}
}
Now, query the view with key=["AAA"] and you will see a list of all docs. The first one will be the real "Doc" document. All the rest will be "DataKey" documents which reference the first doc.
This is a common technique, called CouchDB view collation.

Resources