using NEST with Elastic Search for collections - elasticsearch

I'm trying to get my hands dirty with Elasticsearch via NEST, the .NET API, and am running into a couple of problems. I suspect I've misunderstood something or am modelling my documents incorrectly, and would appreciate some help.
I have a document with collections in it. A similarly trite example is below:
public class Company
{
    public DateTime RegisteredOn { get; set; }
    public string Name { get; set; }

    [ElasticProperty(Type = FieldType.nested)]
    public List<Employee> Employees { get; set; }
}

public class Employee
{
    public string FirstName { get; set; }
    public string LastName { get; set; }

    [ElasticProperty(Type = FieldType.nested)]
    public List<SalesFigure> SalesFigures { get; set; }
}

public class SalesFigure
{
    public int AverageMonthlySaleValue { get; set; }
    public int AverageVolumeSold { get; set; }
}
I've created an index with some data at each level of the hierarchy, and before indexing I called client.MapFromAttributes<Company>();
The following works, but I'd like to understand how to find all companies with employees whose FirstName is Bob, and/or all companies with employees whose AverageMonthlySaleValue is greater than $1100:
client.Search<Company>(query => query.Index("companies").Type("company")
    .From(0)
    .Size(100)
    .Filter(x => x.Term(n => n.Name, "Microsoft")));
Nested queries/filters have been suggested, as has flattening my document. I can flatten it, but I'm trying to create a model that better represents the real domain, so I'm in a quandary.
Equally, I know that I'll also have to use facets at some point so want to structure everything correctly to support that.
Thanks
Tim

So it turns out there wasn't much wrong with the structure of my document. The example is trite; the real property I was querying on the collection was a string, not an int, so case sensitivity kicked in.
I had to change the query to use a lower-case string value for the comparison. Something like the following worked:
client.Search<Company>(query => query.Index("companies")
    .Type("company")
    .From(0)
    .Size(100)
    .Filter(x => x.Term("company.employees.firstName", "bob")));
I've still to work out how to use a lambda in place of "company.employees.firstName", but it works for now.
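The case-sensitivity effect can be illustrated without Elasticsearch at all. The sketch below is a toy model, not NEST or Elasticsearch code: it assumes the field is analyzed into lower-cased tokens at index time (as the standard analyzer does), while a term lookup compares the value exactly as given. The names TermLookupSketch, Analyze, BuildIndex and TermLookup are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TermLookupSketch
{
    // Hypothetical stand-in for an analyzer: split on a few separators and lower-case.
    public static IEnumerable<string> Analyze(string text) =>
        text.Split(new[] { ' ', ',', '.', '-' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(token => token.ToLowerInvariant());

    // Toy inverted index: token -> ids of the documents that contain it.
    public static Dictionary<string, HashSet<int>> BuildIndex(IDictionary<int, string> docs)
    {
        var index = new Dictionary<string, HashSet<int>>(StringComparer.Ordinal);
        foreach (var doc in docs)
        {
            foreach (var token in Analyze(doc.Value))
            {
                if (!index.TryGetValue(token, out var ids))
                {
                    ids = new HashSet<int>();
                    index[token] = ids;
                }
                ids.Add(doc.Key);
            }
        }
        return index;
    }

    // A term lookup is not analyzed: the value must match a stored token exactly.
    public static HashSet<int> TermLookup(Dictionary<string, HashSet<int>> index, string term) =>
        index.TryGetValue(term, out var ids) ? ids : new HashSet<int>();
}
```

With documents 1 = "Bob" and 2 = "Alice", TermLookup(index, "Bob") finds nothing while TermLookup(index, "bob") finds document 1, which mirrors why the lower-case value was needed in the filter.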

Related

Entity Splitting For One-To-Many table relationships

Following this article (What are best practices for multi-language database design?), I have split all my database tables in two: the first table contains only language-neutral data (primary key, etc.) and the second table contains one record per language, holding the localized data plus the ISO code of the language. The relationship between the two tables is one-to-many.
Here is a screenshot of the data model: https://dl.dropboxusercontent.com/u/17099565/datamodel.jpg
Because the website has 8 languages, for each record in table "CourseCategory" I have 8 records in table "CourseCategoryContents". The same happens with "Course" and "CourseContent".
Then I use Entity Splitting in order to have only one entity for the Course Category and one entity for the Course:
public class CourseCategoryConfiguration : EntityTypeConfiguration<WebCourseCategory>
{
    public CourseCategoryConfiguration()
    {
        Map(m =>
        {
            m.Properties(i => new { i.Id, i.Order, i.Online });
            m.ToTable("CourseCategories");
        });
        Map(m =>
        {
            m.Properties(i => new { i.LanguageCode, i.Name, i.Permalink, i.Text, i.MetaTitle, i.MetaDescription, i.MetaKeywords });
            m.ToTable("CourseCategoryContents");
        });
    }
}

public class CourseConfiguration : EntityTypeConfiguration<WebCourse>
{
    public CourseConfiguration()
    {
        Map(m =>
        {
            m.Properties(i => new { i.Id, i.CategoryId, i.Order, i.Label, i.ThumbnailUrl, i.HeaderImageUrl });
            m.ToTable("Courses");
        });
        Map(m =>
        {
            m.Properties(i => new { i.LanguageCode, i.Name, i.Permalink, i.Text, i.MetaTitle, i.MetaDescription, i.MetaKeywords, i.Online });
            m.ToTable("CourseContents");
        });
    }
}
Then, to retrieve the courses in a desired language including their category, I do this:
using (WebContext dbContext = new WebContext())
{
    // all courses of all categories in the desired language (lan holds its ISO code)
    return dbContext.Courses
        .Include(course => course.Category)
        .Where(course => course.LanguageCode == lan
            && course.Category.LanguageCode == lan)
        .ToList();
}
Entity splitting works fine with one-to-one relationships, but here I have one-to-many relationships.
The website has content (CourseCategories and Courses) in 3 languages ("en", "de", "fr").
EF correctly returns all the Courses with their Category in the right language (e.g. in English), but it returns each record 3 times. This is because I have the CourseCategory in 3 languages too.
The only working solution I came up with is to avoid ".Include(Category)": get all the courses in the desired language first, then, in a foreach loop, retrieve the Category of each Course in that language. I don't like this lazy loading approach; I would like to retrieve all the desired data in one shot.
Thanks!

The best solution is to map the tables to the model as they are, so that the Course class in your model has a navigation property of type ICollection<CourseCategoryContent>.
You then just project this model to a DTO or ViewModel, according to your application design.
For example, your model will look like this:
public class Course
{
    public int Id { get; set; }
    public int Order { get; set; }
    public ICollection<CourseCategoryContent> CourseCategoryContents { get; set; }
}

public class CourseCategoryContent
{
    public string LanguageId { get; set; }
    public string Name { get; set; }
}
Then just create new DTO or ViewModel like :
public class CourseDTO
{
    public int Id { get; set; }
    public int Order { get; set; }
    public string Name { get; set; }
}
Finally, do the projection:
public IQueryable<CourseDTO> GetCourseDTOQuery(string lang)
{
    return dbContext.Courses.Select(x => new CourseDTO
    {
        Id = x.Id,
        Order = x.Order,
        // pick the content row for the requested language
        Name = x.CourseCategoryContents.FirstOrDefault(c => c.LanguageId == lang).Name,
    });
}
Note that the return type is IQueryable, so you can apply any filtering, ordering or grouping operation to it before the query hits the database.
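The projection idea can be tried against an in-memory list before wiring it to a real context. The following is only a sketch under assumptions: it reuses the class names from this answer, replaces the EF context with a plain IQueryable passed in as a parameter, and uses Where/Select/FirstOrDefault so that a course with no content in the requested language yields a null Name instead of throwing in LINQ to Objects. ProjectionSketch and ToDtos are hypothetical names.

```csharp
using System.Collections.Generic;
using System.Linq;

public class CourseCategoryContent
{
    public string LanguageId { get; set; }
    public string Name { get; set; }
}

public class Course
{
    public int Id { get; set; }
    public int Order { get; set; }
    public ICollection<CourseCategoryContent> CourseCategoryContents { get; set; }
        = new List<CourseCategoryContent>();
}

public class CourseDTO
{
    public int Id { get; set; }
    public int Order { get; set; }
    public string Name { get; set; }
}

public static class ProjectionSketch
{
    // One DTO per course, carrying only the name in the requested language.
    public static IQueryable<CourseDTO> ToDtos(IQueryable<Course> courses, string lang) =>
        courses.Select(course => new CourseDTO
        {
            Id = course.Id,
            Order = course.Order,
            // Null when the course has no content row for this language.
            Name = course.CourseCategoryContents
                .Where(content => content.LanguageId == lang)
                .Select(content => content.Name)
                .FirstOrDefault()
        });
}
```

Because each course produces exactly one DTO, the number of languages stored per course no longer multiplies the result rows.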
Hope this helped.

No fix-all answer, I'm afraid; every way has a compromise.
I've used both the database approach (10+ language-dependent tables) and the resource file approach in fairly large projects. If the data is static and doesn't change (i.e. you don't charge a different price or whatever), I would definitely consider abstracting language away from your database model, using resource keys, and loading your data from files.
The reason for this is the problem you are experiencing right now: you can't filter includes (this may have changed in EF6 perhaps? I know it's on the list of things to do). You might be able to get away with reading the data into memory and filtering it there, as you're doing, but that wasn't very performant for us, and I had to write stored procedures that I passed the ISO language code to and executed through EF.
From a maintenance point of view it was easier as well: for the DB project I had to write an admin console so people could log on and edit values for different languages etc. Using resource files, I just copy-pasted the values into Excel and emailed them to the people we use to translate.
It depends on the complexity of your project and what you prefer; I'd still consider both approaches in future.
TLDR: the options that I've found are:
1) filter in memory
2) lazy load with filter
3) write a stored procedure, call it from EF and map the result
4) use resources instead
Hope this helps
EDIT: After looking at the diagram, it looks like you may need to search against the language-dependent values? In that case resources probably won't work. If you're just letting users navigate from a menu then you're good to go.

OData $orderby clause on collection property

I have the following classes:
public class Parent
{
    public string ParentProp { get; set; }
    public IEnumerable<Child> ManyChildren { get; set; }
}

public class Child
{
    public string ChildName { get; set; }
    public int Value { get; set; }
}
Say I have an OData operation defined which returns IEnumerable<Parent>. Can I write an $orderby clause which performs the following operation ('parents' is an IEnumerable<Parent>):
parents.OrderBy(x => x.ManyChildren.Single(y => y.ChildName == "Child1").Value);
I know I can write custom actions (http://msdn.microsoft.com/en-us/library/hh859851(v=vs.103).aspx) to do this ordering for me, but I'd rather use an $orderby clause.
(The only SO question which asked something similar is a little dated - How can I order objects according to some attribute of the child in OData?)

From what I have tried, this is possible by nesting $orderby inside $expand, like this:
odata/User?&$select=Active,Description,Name,UserId&$expand=Company($select=Active,Name,CreatedBy,CompanyId;$orderby=Active asc)
What you get is something like:
ORDER BY [Project2].[UserId] ASC, [Project2].[C19] ASC
which orders the Company collection of each user separately.
I think this is supported in OData Client for .NET 6.7.0; the release notes say:
In query options
$id, $select, $expand(including nested query options)....
In version 6.1 I can see that the values for nested options exist, under:
DataQueryOptions->SelectExpand->SelectExpandClause->SelectedItems->ExpandNavigationItem->OrderByOption
but the ordering is not applied.
I also tried System.Web.OData 5.6 and all related dependencies, but it does not seem to work there either.
My conclusion:
Everything seems to be prepared (the nested orderby is present in DataQueryOptions), but it does not work.
From what I found, the standard seems to be going in that direction:
https://issues.oasis-open.org/browse/ODATA-32

It depends on your OData service implementation. Which kind of service are you using: WCFDS, WebAPI, or a service you implemented yourself?
The URL parser can parse a URL such as root/People?$orderby=Company/Name; the translation is implemented by the service.
And I agree with the answer in the related question: "it's not possible to do this with a navigation property that has a cardinality of many". Since the property has a cardinality of many, the service cannot know which item should be used for sorting.
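If the ordering ends up being done on the server (for example inside a custom action, as the question mentions), the ambiguity has to be resolved by naming the child explicitly. A minimal LINQ to Objects sketch, assuming the Parent and Child classes from the question; ParentOrderingSketch and OrderByChildValue are hypothetical names, and this is not an OData API:

```csharp
using System.Collections.Generic;
using System.Linq;

public class Child
{
    public string ChildName { get; set; }
    public int Value { get; set; }
}

public class Parent
{
    public string ParentProp { get; set; }
    public IEnumerable<Child> ManyChildren { get; set; }
}

public static class ParentOrderingSketch
{
    // Orders parents by the Value of the child with the given name.
    // Parents without such a child get a null key, which sorts first.
    public static List<Parent> OrderByChildValue(IEnumerable<Parent> parents, string childName) =>
        parents.OrderBy(parent => parent.ManyChildren
                .Where(child => child.ChildName == childName)
                .Select(child => (int?)child.Value)
                .FirstOrDefault())
            .ToList();
}
```

Unlike Single() in the question, this version tolerates parents that have no matching child, and it takes the first match if there are several.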

How to use a Dictionary or Hashtable for LINQ query performance underneath an OData service

I am very new to OData (only started on it yesterday) so please excuse me if this question is too dumb :-)
I have built a test project as a Proof of Concept for migrating our current web services to OData. For this test project, I am using Reflection Providers to expose POCO classes via OData. These POCO classes come from in-memory cache. Below is the code so far:
public class DataSource
{
    public IQueryable<Category> CategoryList
    {
        get
        {
            List<Category> categoryList = GetCategoryListFromCache();
            return categoryList.AsQueryable();
        }
    }

    // The property below is only required to allow navigation
    // from Category to Product via OData URLs,
    // e.g. OData.svc/CategoryList(1)/ProductList(2) and so on
    public IQueryable<Product> ProductList
    {
        get
        {
            return null;
        }
    }
}
[DataServiceKeyAttribute("CategoryId")]
public class Category
{
    public int CategoryId { get; set; }
    public string CategoryName { get; set; }
    public List<Product> ProductList { get; set; }
}

[DataServiceKeyAttribute("ProductId")]
public class Product
{
    public int ProductId { get; set; }
    public string ProductName { get; set; }
}
To the best of my knowledge, OData is going to use LINQ behind the scenes to query these in-memory objects (a List in this case) if somebody navigates to OData.svc/CategoryList(1)/ProductList(2) and so on.
Here is the problem though: In the real world scenario, I am looking at over 18 million records inside the cache representing over 24 different entities.
The current production web services make very good use of .NET Dictionary and Hashtable collections to ensure very fast look ups and to avoid a lot of looping. So to get to a Product having ProductID 2 under Category having CategoryID 1, the current web services just do 2 look ups, ie: first one to locate the Category and the second one to locate the Product inside the Category. Something like a btree.
I wanted to know how I could follow a similar architecture with OData, where I could tell OData and LINQ to use Dictionary or Hashtable collections for locating records rather than looping over a generic List?
Is it possible using Reflection Providers, or am I left with no other choice but to write my own custom provider for OData?
Thanks in advance.

You will need to process expression trees, so you will need at least a partial IQueryable implementation over the underlying LINQ to Objects. You don't need a full-blown custom provider for this, though; just return your IQueryable from the properties on the context class.
In that IQueryable you would have to recognize filters on the "key" properties (.Where(p => p.ProductID == 2)) and translate them into a dictionary/hashtable lookup. Then you can use LINQ to Objects to process the rest of the query.
But if the client issues a query with a filter which doesn't touch the key property, it will end up doing a full scan. Your custom IQueryable could, however, detect that and fail such a query if you so choose.
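The core of that idea, recognizing a key filter in an expression tree and answering it from a dictionary, can be sketched without a full IQueryable implementation. This is only an illustration under assumptions: KeyLookupSketch, TryGetKey and Find are hypothetical names, only the shape p => p.ProductId == <value> is recognized, and the right-hand side is assumed not to reference the lambda parameter.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public class Product
{
    public int ProductId { get; set; }
    public string ProductName { get; set; }
}

public static class KeyLookupSketch
{
    // True (with the key value) when the predicate is p => p.ProductId == <value>.
    public static bool TryGetKey(Expression<Func<Product, bool>> predicate, out int key)
    {
        key = 0;
        if (predicate.Body is BinaryExpression binary
            && binary.NodeType == ExpressionType.Equal
            && binary.Left is MemberExpression member
            && member.Expression == predicate.Parameters[0]
            && member.Member.Name == nameof(Product.ProductId))
        {
            // Evaluate the right-hand side so literals and captured variables both work.
            key = Expression.Lambda<Func<int>>(binary.Right).Compile()();
            return true;
        }
        return false;
    }

    public static IEnumerable<Product> Find(
        IDictionary<int, Product> byId, Expression<Func<Product, bool>> predicate)
    {
        if (TryGetKey(predicate, out var key))
        {
            // Key filter: one dictionary lookup instead of a scan.
            return byId.TryGetValue(key, out var product)
                ? new[] { product }
                : Enumerable.Empty<Product>();
        }
        // Any other filter falls back to a full scan with LINQ to Objects.
        return byId.Values.Where(predicate.Compile());
    }
}
```

A real IQueryable would apply the same check inside its provider while visiting the Where call; the fallback branch is where a stricter implementation could reject the query instead of scanning.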

Dynamic Linq Search Expression on Navigation Properties

We are building dynamic search expressions using the Dynamic Linq library. We have run into an issue with how to construct a lambda expression using the Dynamic Linq library for navigation properties that have a one-to-many relationship.
We have the following, which we are using with a Contains statement:
Person.Names.Select(FamilyName).FirstOrDefault()
It works, but there are two problems:
1. It of course only selects the FirstOrDefault() name. We want it to use all the names for each person.
2. If there are no names for a person, the Select throws an exception.
It is not that difficult with a regular query because we can use two from clauses, but the lambda expression is more challenging.
Any recommendations would be appreciated.
EDIT-
Additional code information...a non dynamic linq expression would look something like this.
var results = persons.Where(p => p.Names.Select(n => n.FamilyName).FirstOrDefault().Contains("Smith")).ToList();
and the class looks like the following-
public class Person
{
    public bool IsActive { get; set; }
    public virtual ICollection<Name> Names { get; set; }
}

public class Name
{
    public string GivenName { get; set; }
    public string FamilyName { get; set; }
    public virtual Person Person { get; set; }
}

We hashed it out and made it, but it was quite challenging. Below are the various methods on how we progressed to the final result. Now we just have to rethink how our SearchExpression class is built...but that is another story.
1. Equivalent Query Syntax
var results = from person in persons
              from name in person.Names
              where name.FamilyName.Contains("Smith")
              select person;
2. Equivalent Lambda Syntax
var results = persons.SelectMany(person => person.Names)
    .Where(name => name.FamilyName.Contains("Smith"))
    .Select(personName => personName.Person);
3. Equivalent Lambda Syntax with Dynamic Linq
var results = persons.AsQueryable().SelectMany("Names")
    .Where("FamilyName.Contains(@0)", "Smith")
    .Select("Person");
Note - You will have to add a Contains method to the Dynamic Linq library.
EDIT - Alternatively, use just a Select, which is much simpler but still requires the Contains method addition noted above.
var results = persons.AsQueryable().Where("Names.Select(FamilyName).Contains(@0)", "Smith");
We originally tried this, but ran into the dreaded 'No applicable aggregate method Contains exists.' error. In a roundabout way we resolved the problem while trying to get SelectMany working, and therefore just went back to the Select method.
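For comparison, here is a strongly typed LINQ to Objects sketch of the same search written with Any, which was not part of the approach above. It assumes the Person and Name classes from the question, with Names initialized to an empty list; NameSearchSketch, WithAnyName and ViaSelectMany are hypothetical names. Any checks every name of a person, returns each person at most once, and does not throw when a person has no names.

```csharp
using System.Collections.Generic;
using System.Linq;

public class Name
{
    public string GivenName { get; set; }
    public string FamilyName { get; set; }
    public Person Person { get; set; }
}

public class Person
{
    public bool IsActive { get; set; }
    public ICollection<Name> Names { get; set; } = new List<Name>();
}

public static class NameSearchSketch
{
    // One result per person who has at least one matching family name.
    public static List<Person> WithAnyName(IEnumerable<Person> persons, string text) =>
        persons.Where(p => p.Names.Any(n => n.FamilyName.Contains(text))).ToList();

    // The SelectMany form from the answer: one result per matching name,
    // so a person with two matching names appears twice.
    public static List<Person> ViaSelectMany(IEnumerable<Person> persons, string text) =>
        persons.SelectMany(p => p.Names)
            .Where(n => n.FamilyName.Contains(text))
            .Select(n => n.Person)
            .ToList();
}
```

Whether the Dynamic Linq string syntax accepts an equivalent Any expression depends on the version of the library in use, so that part is left untested here.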

LINQ to Objects question

I am writing a method that is passed a List<AssetMovements> where AssetMovements looks something like
public class AssetMovements
{
    public string Description { get; set; }
    public List<DateRange> Movements { get; set; }
}
I want to be able to flatten out these objects into a list of all Movements regardless of Description and am trying to figure out the LINQ query I need to do this. I thought that
from l in list select l.Movements
would do it and return IEnumerable<DateRange>, but instead it returns IEnumerable<List<DateRange>> and I'm not really sure how to correct this. Any suggestions?

This one's been asked before. You want the SelectMany() method, which flattens out a list of lists. So:
var movements = list.SelectMany(l => l.Movements);
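A self-contained sketch of the flattening, in both method and query syntax. The DateRange type is not shown in the question, so the version here (with Start and End properties) is an assumption, as are the names FlattenSketch, AllMovements and AllMovementsQuery.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed shape; the question does not show DateRange.
public class DateRange
{
    public DateTime Start { get; set; }
    public DateTime End { get; set; }
}

public class AssetMovements
{
    public string Description { get; set; }
    public List<DateRange> Movements { get; set; }
}

public static class FlattenSketch
{
    // Method syntax: SelectMany concatenates the inner lists into one sequence.
    public static List<DateRange> AllMovements(IEnumerable<AssetMovements> list) =>
        list.SelectMany(l => l.Movements).ToList();

    // Query syntax: a second "from" clause compiles to the same SelectMany call.
    public static List<DateRange> AllMovementsQuery(IEnumerable<AssetMovements> list) =>
        (from l in list
         from m in l.Movements
         select m).ToList();
}
```

The single from ... select l.Movements in the question maps to Select, which is why it produced a sequence of lists rather than a flat sequence.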

Resources