MongoDB C# low performance issue

I'm testing the speed of MongoDB 1.6.5 with C# on a Win64 machine. I use Yahoo GeoPlanet as the source to load states, counties and towns, but performance is poor. It currently takes more than 5 seconds to load the US states from this source and pass a List to a web page on localhost.
The only index is on id. Can someone suggest a way to improve performance? Thanks
class BsonPlaces
{
[BsonId]
public String Id { get; set; }
public String Iso { get; set; }
public String Name { get; set; }
public String Language { get; set; }
public String Place_Type { get; set; }
public String Parent_Id { get; set; }
}
public List<BsonPlaces> Get_States(string UseCountry)
{
using (var helper = BsonHelper.Create())
{
var query = Query.EQ("Place_Type", "State");
if (!String.IsNullOrEmpty(UseCountry))
query = Query.And(query, Query.EQ("Iso", UseCountry));
var cursor = helper.GeoPlanet.PlacesRepository.Db.Places
.FindAs<BsonPlaces>(query);
if (!String.IsNullOrEmpty(UseCountry))
cursor.SetSortOrder(SortBy.Ascending("Name"));
return cursor.ToList();
}
}

I suppose the problem is not in MongoDB. Loading can be slow for two reasons:
You are trying to load a large number of 'BsonPlaces' (20,000, for example, or even more).
Some other code on the page takes a long time.
To improve speed you can:
1. Set a limit on the number of items returned by the query:
cursor.SetLimit(100);
2. Create indexes for 'Name', 'Iso' and 'Place_Type':
helper.GeoPlanet.PlacesRepository.Db.Places.EnsureIndex("Name");

The C# driver probably has a big performance problem. Running a simple query 100k times in the shell takes 3 seconds; the same query (written in C# LINQ with the official C# driver 1.5) takes 30 seconds. The profiler says each query from the C# client takes less than 1 ms, so I assume the C# driver is doing a lot of unnecessary work that makes the query so slow.
Tested with MongoDB 2.0.7 on Windows 7 with 16 GB of RAM.

I downloaded the GeoPlanet Data from Yahoo and found that the geoplanet_places_7.6.0.tsv file has 5,653,969 lines of data.
That means that in the absence of an index you are doing a "full table scan" of over 5 million entries to retrieve the 50 US states.
When querying for states within a country, the following index would probably be the most helpful:
...EnsureIndex("Iso", "Place_Type");
Not sure why you would want to search for all "States" without specifying a country, but you would need another index for that.
Sometimes when a sort is involved it can be advantageous for the index to match the sort order. For example, since you are sorting on "Name" the index could be:
...EnsureIndex("Iso", "Place_Type", "Name");
However, with only 50 states the sort will probably be very fast anyway.
I doubt any of your slow performance is attributable to the C# driver, but if after adding the indexes you still have performance problems let us know.
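One way to check whether the index is actually being used is to ask the server to explain the query. The following is only a sketch against the legacy C# driver used in the question: the `PlaceIndexes` class and the `places` parameter are hypothetical names, and `places` is assumed to be the collection behind `helper.GeoPlanet.PlacesRepository.Db.Places`.

```csharp
using MongoDB.Bson;
using MongoDB.Driver;
using MongoDB.Driver.Builders;

public static class PlaceIndexes
{
    // Hypothetical helper: 'places' is assumed to be the Places collection
    // from the question.
    public static BsonDocument EnsureAndExplain(MongoCollection places, string iso)
    {
        // Compound index: the two equality filters first, then the sort key.
        places.EnsureIndex("Iso", "Place_Type", "Name");

        var query = Query.And(
            Query.EQ("Iso", iso),
            Query.EQ("Place_Type", "State"));

        var cursor = places.FindAs<BsonDocument>(query)
                           .SetSortOrder(SortBy.Ascending("Name"));

        // Returns the server's explain document instead of the results.
        return cursor.Explain();
    }
}
```

On servers of that era the explain document should include the cursor type and the number of entries scanned. If the scanned count is still in the millions, the index is not being used.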

Related

ElasticSearch / NEST 6 - Serialization of enums as strings in terms query

I've been trying to update to ES6 and NEST 6 and running into issues with NEST serializing of search requests - specifically serializing Terms queries where the underlying C# type is an enum.
I've got a Status enum mapped in my index as a Keyword, and correctly being stored in its string representation by using NEST.JsonNetSerializer and setting the contract json converter as per Elasticsearch / NEST 6 - storing enums as string
The issue comes when trying to search based on this Status enum. When I try to use a Terms query to specify multiple values, these values are being serialized as integers in the request and causing the search to find no results due to the type mismatch.
Interestingly the enum is serialized correctly as a string in a Term query, so I'm theorizing that the StringEnumConverter is being ignored in a scenario where it's having to serialize a collection of enums rather than a single enum.
Let's show it a little more clearly in code. Here's the enum and the (simplified) model used to define the index:
public enum CampaignStatus
{
Active = 0,
Sold = 1,
Withdrawn = 2
}
public class SalesCampaignSearchModel
{
[Keyword]
public Guid Id { get; set; }
[Keyword(DocValues = true)]
public CampaignStatus CampaignStatus { get; set; }
}
Here's a snippet of constructing the settings for the ElasticClient:
var pool = new SingleNodeConnectionPool(new Uri(nodeUri));
var connectionSettings = new ConnectionSettings(pool, (builtin, serializerSettings) =>
new JsonNetSerializer(builtin,
serializerSettings,
contractJsonConverters: new JsonConverter[]{new StringEnumConverter()}
)
)
.EnableHttpCompression();
Here's the Term query that correctly returns results:
var singleTermFilterQuery = new SearchDescriptor<SalesCampaignSearchModel>()
.Query(x => x.Term(y => y.Field(z => z.CampaignStatus).Value(CampaignStatus.Active)));
Generating the request:
{
"query": {
"term": {
"campaignStatus": {
"value": "Active"
}
}
}
}
Here's the Terms query that does not return results:
var termsFilterQuery = new SearchDescriptor<SalesCampaignSearchModel>()
.Query(x => x.Terms(y => y.Field(z => z.CampaignStatus).Terms(CampaignStatus.Active, CampaignStatus.Sold)));
Generating the request:
{
"query": {
"terms": {
"campaignStatus": [
0,
1
]
}
}
}
So far I've had a pretty good poke around at the options being presented by the JsonNetSerializer, tried a bunch of the available attributes (NEST.StringEnumAttribute, [JsonConverter(typeof(StringEnumConverter))] rather than using the global one on the client, having an explicit filter object with ItemConverterType set on the collection of CampaignStatuses, etc.) and the only thing that has had any success was a very brute-force .ToString() every time I need to query on an enum.
These are toy examples from a reasonably extensive codebase that I'm trying to migrate across to NEST 6, so what I'm wanting is to be able to specify global configuration somewhere rather than multiple developer teams needing to be mindful of this kind of eccentricity.
So yeah... I've been looking at this for a couple of days now. Good chances there's something silly I've missed. Otherwise I'm wondering if I need to be providing some JsonConverter with a contract that would match to an arbitrary collection of enums, and whether NEST and their tweaked Json.NET serializer should just be doing that kind of recursive resolution out of the box already.
Any help would be greatly appreciated, as I'm going a bit crazy with this one.
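Until a global setting turns up, the `.ToString()` workaround mentioned above can at least be kept in one place. This is a minimal sketch, not a NEST feature: `EnumTerms` and `ToTermStrings` are hypothetical names, and it assumes the enum names are exactly what is stored in the index (which is what `StringEnumConverter` writes by default).

```csharp
using System;
using System.Linq;

public enum CampaignStatus
{
    Active = 0,
    Sold = 1,
    Withdrawn = 2
}

public static class EnumTerms
{
    // Converts enum values to their names so a terms query sends strings,
    // not the underlying integers.
    public static string[] ToTermStrings<TEnum>(params TEnum[] values) where TEnum : struct
    {
        return values.Select(v => v.ToString()).ToArray();
    }
}

public static class Program
{
    public static void Main()
    {
        string[] terms = EnumTerms.ToTermStrings(CampaignStatus.Active, CampaignStatus.Sold);
        Console.WriteLine(string.Join(",", terms)); // Active,Sold
    }
}
```

The resulting array would be passed to `.Terms(...)` in place of the enum values. It does not answer the question of global configuration, it only confines the workaround to a single helper.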

Telerik OpenAccess - Search With Non-Persistent Property

I'm using Telerik OpenAccess and SQL Server on a project and I need to be able to search by what someone's age will be on a certain date. The problem that I am running into is that the person's date of birth is stored in one table and the date to compare to is in another table, which prevents me from using a computed column. They are, however, joined together so that I can calculate the age by creating my own non-persistent property in the partial class like so:
public partial class Student
{
[Telerik.OpenAccess.Transient]
private int? _ageUponArrival;
public virtual int? AgeUponArrival
{
get
{
try
{
var dob = DateTime.Parse(this.StudentProfiles.First().Person.YearOfBirth);
var programStart = (DateTime)(this.StudentPrograms.First().ProgramStart);
this._ageUponArrival = programStart.Year - dob.Year;
if (dob > programStart.AddYears(-(int)(this._ageUponArrival)))
{
(this._ageUponArrival)--;
}
}
catch (Exception e)
{
this._ageUponArrival = null;
}
return _ageUponArrival;
}
set { }
}
}
Please ignore how badly the tables are set up; it's something that I inherited and can't change at this point. The problem with this approach is that the property is not available to search on with LINQ. I know that I could create a view that would do this for me, but I would much rather not have to maintain a view just for this. Is there any way at all to create a calculated property through Telerik that would be calculated on the db server in such a way as to be searchable?
It appears that this is not possible at this point. http://www.telerik.com/community/forums/orm/linq-questions/dynamic-query-with-extended-field.aspx

Fluent nHibernate query on Oracle is extremely slow

I'm new to NHibernate and am getting some really slow results from a simple select query. Maybe I'm missing something obvious. The situation is as follows:
I am using fluent nHibernate.
I am querying an oracle database (10g), I am trying to return a person object.
It's taking around 16 seconds per record!
Here's my fluent nHibernate code:
public class Person
{
public virtual string PersonId { get; set; }
public virtual string FirstName { get; set; }
public virtual string LastName { get; set; }
}
public class PersonMap : ClassMap<Person>
{
public PersonMap()
{
Schema("MyTestDB");
Table("Person");
Id(i => i.PersonId);
Map(i => i.FirstName);
Map(i => i.LastName);
}
}
Here is the code that is supposed to retrieve the actual data:
var sessionFactory = Fluently.Configure().Database(OracleClientConfiguration.Oracle10.ConnectionString(@"User Id=tester;Password=tester99!;Data Source=MyTestDB;").ShowSql()).Mappings(m => m.FluentMappings.AddFromAssembly(Assembly.GetExecutingAssembly())).BuildSessionFactory();
using (var session = sessionFactory.OpenSession())
{
using (session.BeginTransaction())
{
Stopwatch stopWatch = new Stopwatch();
stopWatch.Start();
var person = session.QueryOver<Person>()
.Where(p => p.PersonId == "1").SingleOrDefault();
stopWatch.Stop();
var ts = stopWatch.Elapsed;
var time = string.Format("{0:00}:{1:00}:{2:00}.{3:00}", ts.Hours, ts.Minutes, ts.Seconds, ts.Milliseconds/10);
Console.WriteLine("Retrieved object: Person, Id: {0}, First Name: {1}, Last Name: {2} in [{3}]", person.PersonId, person.FirstName, person.LastName, time);
}
}
The PersonId column is indexed and is the primary key.
My attempt to figure this out so far has been to run the same SQL generated by NHibernate with ADO.NET. The query ran extremely fast (the stopwatch reports an elapsed time of 0).
Using PL/SQL Developer to run the same query on the database gave the same fast results. This suggests to me that the problem is neither the query nor the database.
How can I debug this further? Will nHibernate profiler help with this (I don't have this available to me at the moment)?
Any ideas guys?
Firstly, you should try capturing a few more points in time throughout your program's execution. You've assumed that it's the NHibernate component, but without more data points that's going to be hard to prove, especially when your initial test comes back with 0.
Secondly, the big cost in your NHibernate scenario is the call to BuildSessionFactory(). NHibernate is optimized to have cheap session construction, so it expects you to create this factory once and re-use it throughout the lifetime of your program. If you put trace points around this event, you may find your "expense".
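A rough sketch of both suggestions, timing each phase separately and building the expensive object only once. To keep it runnable without NHibernate, the factory here is a hypothetical stand-in (a string built after a simulated delay); `PhaseTiming` and `Measure` are illustrative names, not part of any library.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public static class PhaseTiming
{
    private static int _buildCount;

    // Hypothetical stand-in for an expensive one-time setup such as
    // BuildSessionFactory(); Lazy<T> ensures it runs only once.
    private static readonly Lazy<string> Factory = new Lazy<string>(() =>
    {
        Interlocked.Increment(ref _buildCount);
        Thread.Sleep(200); // simulate slow start-up work
        return "factory";
    });

    public static int BuildCount { get { return _buildCount; } }

    // Runs an action and returns how long it took.
    public static TimeSpan Measure(Action action)
    {
        var sw = Stopwatch.StartNew();
        action();
        sw.Stop();
        return sw.Elapsed;
    }

    public static void Main()
    {
        TimeSpan first = Measure(() => GC.KeepAlive(Factory.Value));
        TimeSpan second = Measure(() => GC.KeepAlive(Factory.Value));
        Console.WriteLine("first access:  {0}", first);
        Console.WriteLine("second access: {0}", second);
        Console.WriteLine("factory built {0} time(s)", BuildCount);
    }
}
```

In the real program the same pattern would wrap the factory construction, the session opening and the query in separate `Measure` calls, so the 16 seconds can be attributed to one of them.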
If I had to take a wild guess, the issue is not with NHibernate querying the database. I think it is the initial cost you are paying to build the sessionFactory and/or session, and that is why you are seeing the unusual latency.
Why don't you use JetBrains dotTrace to see where the actual performance hit is, whether it's in running the query or in something else? Just run a sampling profile and you will get the timings along with the exact number of calls to each function.
P.S: I have no association with jetbrains just a happy customer recommending the product.
The problem ended up being that I didn't specify the data type for the primary key column, which turned out to be varchar (non-Unicode). It turns out you need to specify the data type for non-Unicode columns, because Fluent NHibernate assumes that a string maps to Unicode.
This is how to set the custom type in fluent notation:
Map(x=>x.PersonId).CustomType("AnsiString");

Guid values in Oracle with fluentnhibernate

I've only been using Fluent NHibernate for a few days and it's been going fine until I tried to deal with guid values and Oracle. I have read a good few posts on the subject but none that help me solve the problem I am seeing.
I am using Oracle 10g express edition.
I have a simple test table in oracle
CREATE TABLE test (Field RAW(16));
I have a simple class and interface for mapping to the table
public class Test : ITest
{
public virtual Guid Field { get; set; }
}
public interface ITest
{
Guid Field { get; set; }
}
Class map is simple
public class TestMap : ClassMap<Test>
{
public TestMap()
{
Id(x => x.Field);
}
}
I start by trying to insert a simple, easily recognised guid value
00112233445566778899AABBCCDDEEFF
Here's the code
var test = new Test {Field = new Guid("00112233445566778899AABBCCDDEEFF")};
// test.Field == 00112233445566778899AABBCCDDEEFF here.
session.Save(test);
// after save the guid is changed, test.Field == 09a3f4eefebc4cdb8c239f5300edfd82
// this value is different for each run so I presume NHibernate is assigning
// a value internally.
transaction.Commit();
IQuery query = session.CreateQuery("from Test");
// or
// IQuery query = session.CreateSQLQuery("select * from Test").AddEntity(typeof(Test));
var t = query.List<Test>().Single();
// t.Field == 8ef8a3b10e704e4dae5d9f5300e77098
// this value never changes between runs.
The value actually stored in the database differs each time also, for the run above it was
EEF4A309BCFEDB4C8C239F5300EDFD82
Truly confused....
Any help much appreciated.
EDIT: I always delete data from the table before each test run. Also using ADO directly works no problem.
EDIT: OK, my first problem was that, even though I thought I was deleting the data from the table via the Oracle SQL command line, the table still had data when I viewed it via the Oracle UI, and the first guid was, as I should have expected, 8ef8a3b10e704e4dae5d9f5300e77098.
Fluent NHibernate still appears to be altering the guid value on save. It alters it to the value it stores in the database, but I'm still not sure why it is doing this or how (or if) I can control it.
If you intend to assign the id yourself, you will need to use a different id generator from the default, which is guid.comb. You should be using assigned instead, so your mapping would look something like this:
Id(x => x.Field).GeneratedBy.Assigned();
You can read more about id generators in the nhibernate documentation here:
http://www.nhforge.org/doc/nh/en/index.html#mapping-declaration-id-generator
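As a side note on the stored bytes: the RAW(16) value quoted in the question (EEF4A309BCFEDB4C8C239F5300EDFD82) appears to be the same guid that was reported after the save, written in the byte order that `Guid.ToByteArray()` produces, where the first three groups are little-endian. A small sketch to check this, assuming the driver writes those bytes unchanged (`GuidBytes` and `ToRawHex` are illustrative names):

```csharp
using System;

public static class GuidBytes
{
    // Returns the Guid's bytes as upper-case hex, in the order produced
    // by Guid.ToByteArray().
    public static string ToRawHex(Guid value)
    {
        return BitConverter.ToString(value.ToByteArray()).Replace("-", "");
    }

    public static void Main()
    {
        var saved = new Guid("09a3f4eefebc4cdb8c239f5300edfd82");
        Console.WriteLine(saved.ToString("N")); // 09a3f4eefebc4cdb8c239f5300edfd82
        Console.WriteLine(ToRawHex(saved));     // EEF4A309BCFEDB4C8C239F5300EDFD82
    }
}
```

So the database value and the in-memory value are the same guid; only the textual representation of the first eight bytes differs.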

linq (to nhibernate) where clause on dynamic property in sql

My entity has a property which is derived from other properties
public class MyClass
{
public DateTime Deadline { get; set; }
public Severity Severity
{
get { return (DateTime.Now - Deadline < new TimeSpan(5, 0, 0, 0)) ? Severity.High : Severity.Low; }
}
}
is there a way I can modify the following
return repository.Query().Where(myClass => myClass.Severity == Severity.High);
so that the where clause evaluates in sql rather than in code?
I tried to do something like this, but to no avail
public class MyClass
{
public DateTime Deadline { get; set; }
public Func<MyClass, Severity> SeverityFunc = myClass => (DateTime.Now - myClass.Deadline < new TimeSpan(5, 0, 0, 0)) ? Severity.High : Severity.Low;
public Severity Severity
{
get { return SeverityFunc.Invoke(this); }
}
}
return repository.Query().Where(myClass => myClass.SeverityFunc(myClass) == Severity.High);
I'm guessing this fails because the Func can't be translated to SQL. Is there some other way to do this without ending up with duplicate calculations for severity?
Any help appreciated, ta
Edit: This is a simplified version of what I'm trying to do. I'm looking for answers that cover the theory of this rather than a specific fix (though those are still welcome). I'm interested in what's possible, and in best practices for achieving this sort of thing.
Andrew
I have used something similar in a mapper. Make sure to wrap the Func in an Expression, like:
public Expression<Func<MyClass, bool>> SeverityFunc ...
By wrapping it in Expression, LINQ to SQL will be able to look at the full expression and translate it appropriately. I haven't used it as part of a class instance like the one you have, so I am not sure how that would affect it.
As for where to put it, I had to move on the last time I worked on a similar scenario. In my case it ended up in the mapper, but mostly because it was more a mapping concern arising from an awful database schema than domain logic. I didn't even have the property dynamically calculated on the domain entity, for that matter (a different scenario for sure).
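A minimal sketch of that idea, with the rule defined once as an expression and reused both in a query and in the in-memory property. The `IsHighSeverity` method name is hypothetical, `AsQueryable()` over a list stands in for `repository.Query()`, and whether a particular LINQ provider translates the predicate to SQL is an assumption that would need checking against the real repository.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public enum Severity { Low, High }

public class MyClass
{
    public DateTime Deadline { get; set; }

    // Single definition of the rule as an expression tree. The cutoff is
    // computed outside the lambda so the query only compares a column
    // against a constant value.
    public static Expression<Func<MyClass, bool>> IsHighSeverity(DateTime now)
    {
        DateTime cutoff = now.AddDays(-5);
        return m => m.Deadline > cutoff;
    }

    // In-memory property reuses the same rule by compiling the expression.
    // Compiling on every access is slow; a real version would cache it.
    public Severity Severity
    {
        get { return IsHighSeverity(DateTime.Now).Compile()(this) ? Severity.High : Severity.Low; }
    }
}

public static class Program
{
    public static void Main()
    {
        var now = new DateTime(2024, 1, 10);
        var items = new List<MyClass>
        {
            new MyClass { Deadline = new DateTime(2024, 1, 8) },  // 2 days before 'now': high
            new MyClass { Deadline = new DateTime(2023, 12, 1) }  // 40 days before 'now': low
        };

        // AsQueryable() stands in for repository.Query() here.
        int high = items.AsQueryable().Where(MyClass.IsHighSeverity(now)).Count();
        Console.WriteLine(high); // 1
    }
}
```

The condition `Deadline > now - 5 days` is the original `DateTime.Now - Deadline < 5 days` rearranged, so the query and the property cannot drift apart.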
One option is the LINQ Dynamic Query Library, See http://weblogs.asp.net/scottgu/archive/2008/01/07/dynamic-linq-part-1-using-the-linq-dynamic-query-library.aspx
Another option is to use the PredicateBuilder from http://www.albahari.com/nutshell/predicatebuilder.aspx
Hope this answers your question,
Roelof.

Resources