How does LLBLGen Pro stack up against NHibernate performance-wise?

I have searched the internet high and low looking for any performance information on LLBLGen Pro and found none. I just wanted to know how LLBLGen Pro performs compared to NHibernate. Thanks

Your question is essentially impossible to answer without context. The questions I would ask in response would start with:
What kind of application? Data-centric? Business logic-centric?
How much data are we talking about?
What kind of data operations are we talking about? Mostly reads? Mostly writes?
As a general matter, LLBLGen performs very well. We have used it on 10+ projects (including a few enterprise-scale projects) where I work, and the few performance issues we've seen were always the result of misunderstanding what the code was doing (there is a learning curve) or of a poorly implemented physical model (e.g. missing indexes).
The two frameworks approach the problem of data access very differently. LLBLGen's operations generally translate into SQL that is fairly easy to understand if you have a strong data background. NHibernate uses sessions and caching to keep data in memory where possible to improve performance (disclaimer: I am not an NHibernate expert). LLBLGen does not support this sort of concept; instead, it works in a disconnected state and stores change-tracking information directly on its entity objects.
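To make the contrast concrete, here is a minimal conceptual sketch of what "change tracking stored on the entity itself" means. This is an illustration of the idea only, not LLBLGen's actual API:

    using System.Collections.Generic;
    using System.Linq;

    // Conceptual sketch only (not LLBLGen's actual API): the entity keeps its
    // original field values alongside the current ones, so an UPDATE can be
    // built from the entity alone, with no session holding a second copy.
    class TrackedEntity
    {
        private readonly Dictionary<string, object> _original =
            new Dictionary<string, object>();
        private readonly Dictionary<string, object> _current =
            new Dictionary<string, object>();

        public void Set(string field, object value)
        {
            // Remember the first value we ever saw for this field.
            if (!_original.ContainsKey(field))
                _original[field] = Get(field);
            _current[field] = value;
        }

        public object Get(string field)
        {
            object v;
            return _current.TryGetValue(field, out v) ? v : null;
        }

        // After a fetch or save, the current values become the new baseline.
        public void AcceptChanges()
        {
            foreach (var kv in _current)
                _original[kv.Key] = kv.Value;
        }

        // Only fields that differ from their original value need to appear
        // in the generated UPDATE statement.
        public IEnumerable<string> DirtyFields()
        {
            return _current.Keys.Where(f => !Equals(_original[f], _current[f]));
        }
    }

Because the entity carries its own baseline, it can be shipped to a client, modified, and saved later without any server-side session remembering it.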
Bottom line, the approaches the frameworks take are very different, and it's hard to predict which will perform better without knowing more about what your system does. In the right hands, either framework can be used to design a system where data access performance is not a major performance bottleneck.

Initially we tested LLBLGen at ORMBattle.NET; it was ~2 times faster than NH on materialization, and LINQ query compilation time was pretty good (~4,000 queries/sec.), but CUD (create/update/delete) operations were noticeably slower than in NH (there is no CUD batching in LLBLGen).
Both frameworks are relatively slow when you deal with a large number of objects in the same session:
NH is relatively slow because of its materialization pipeline. I'm not fully sure why, but, e.g., to implement dirty checking, NH must store a clone of every materialized object somewhere. At least two times more RAM ~= at least two times slower.
LLBLGen uses relatively "fat" entities - it seems they store fields in dictionaries. Obviously, this isn't good from a performance point of view, since RAM consumption is one of the essential factors affecting it.
See this FAQ question and the Test Suite Summary for a deeper explanation.
So, in short, LLBLGen Pro should be faster than NH on reads, but slower on writes.

Related

Tracing ORM performance

This isn't a question of "which is the fastest ORM", nor is it a question of "how to write good code with ORMs". This is the other side: the code's been written, it's gone live, several thousand users are hitting the application, but there's a perceived overall performance problem. A SQL Profiler trace can only be run for a short amount of time: five minutes gives several hundred thousand results.
The question is simply this: having used SQL Profiler to narrow down a number of slow queries (duration greater than a given amount of time), what techniques and solutions exist for tracing these SQL queries back to the problematic component? A related question is: if a specific area is slow, how can we identify the SQL that this area is executing, so it can be suitably filtered in SQL Profiler?
The background to this is that we have a rather large application with a fairly complex table structure, which is currently based around data access via stored procedures. If a SQL performance problem arises, it's usually a case of pulling out SQL Profiler, finding out if there's anything slow (filter by duration) or if the area being complained about is slow (filter by stored procedure), and tuning the stored procedures (or the schema, through indexing).
Now there's a push to move our code over from a mostly-sproc solution to a mostly-ORM solution; however, the big argument against the move is how performance problems, if they arise, can be traced back to problematic code. I've read around, and it seems that more often than not this involves third-party tools (ORM tracing utilities like NHProf, or .NET tracing utilities like dotTrace) that we'd need to install on the server. Whether additional tools can be installed on a live environment is another question, so if things like this can be done without additional tools, that would be a bonus.
I'm mostly interested in solutions for SQL Server 2008, but this is probably generic enough for any RDBMS. As far as the ORM technology goes, I have no specific focus, as nothing's currently in use, so I'd be interested to hear how techniques differ (or are common) between NHibernate, Fluent NHibernate, and Entity Framework. Other ORMs are welcome too if they offer something else :-)
I've read through How to find and fix performance problems (...), and I think the issue is simply the section in there that says "isolate". A problem that is reproducible only on a live system is going to be difficult to isolate. The figures I quoted in the second paragraph are the sorts of volumes we get from a profile as well...
If you have real-world experience of ORM tracing on live, so much the better :-)
Update, 2016-10-21: just for completeness, we eventually solved this for NHibernate by writing code and overriding NHibernate methods. Full details are in this other SO question I asked: NHibernate and Interceptors - measuring SQL round trip times. I expect a similar approach would work for many different ORMs.
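For a flavour of what that looks like, here is a minimal, hedged sketch of an NHibernate interceptor that sees each statement before it runs; the details differ from the linked answer:

    using System;
    using NHibernate;
    using NHibernate.SqlCommand;

    // Override OnPrepareStatement to observe, tag, or time every SQL
    // statement NHibernate is about to execute.
    public class SqlLoggingInterceptor : EmptyInterceptor
    {
        public override SqlString OnPrepareStatement(SqlString sql)
        {
            Console.WriteLine("NHibernate SQL: " + sql);
            return sql; // return the (optionally modified) statement
        }
    }

Register it once via Configuration.SetInterceptor (or per session), and every generated statement passes through it, which gives you a hook for timing round trips.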
There exist profilers for ORM tools, like UberProf. These find out which of the SQL statements generated by the ORM can be problematic.
Take the select N+1 problem, for instance. These kinds of tools might give you an indication of which ORM query statements result in poor SQL, and perhaps even how you could improve them.
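As a hedged illustration of what select N+1 looks like in NHibernate code (Order and Customer are assumed mapped entities with a lazy many-to-one from Order to Customer):

    using System;
    using System.Linq;
    using NHibernate;
    using NHibernate.Linq;

    // Assumed mapped entities for the sake of the example.
    public class Customer { public virtual string Name { get; set; } }
    public class Order { public virtual Customer Customer { get; set; } }

    static class SelectNPlusOneDemo
    {
        public static void Run(ISession session)
        {
            // N+1: one query for the orders, then one lazy load per order.
            foreach (var order in session.Query<Order>())
                Console.WriteLine(order.Customer.Name);

            // Fix: eager-fetch the association into a single round trip.
            var orders = session.Query<Order>()
                                .Fetch(o => o.Customer)
                                .ToList();
            Console.WriteLine(orders.Count);
        }
    }

A profiler like UberProf flags the first pattern because the trace shows one near-identical SELECT per parent row.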
We had a Java/Hibernate app with issues, so we used SET CONTEXT_INFO with a different value per module. If we saw, say, 0x14 on the same SPID just before a WTF query, we could narrow it down to module x.
Not being a Java guy, I don't know exactly what they did, and of course it may not apply to .NET. IIRC, you have to be careful about when connections are opened/closed.
We could also control the client load at this time so we didn't have too much superfluous traffic.
YMMV, of course, but it may be useful.
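In .NET terms, the idea might look like the following minimal sketch; the module codes and helper are hypothetical, while CONTEXT_INFO itself is a standard SQL Server varbinary(128) session slot:

    using System.Data;
    using System.Data.SqlClient;

    // Hypothetical helper: tag the statements a module runs on this
    // connection with a one-byte marker that shows up against the SPID
    // in a Profiler trace.
    static class QueryTagger
    {
        public static void TagConnection(SqlConnection conn, byte moduleCode)
        {
            using (var cmd = conn.CreateCommand())
            {
                cmd.CommandText = "SET CONTEXT_INFO @ctx;";
                var buffer = new byte[128];  // CONTEXT_INFO is varbinary(128)
                buffer[0] = moduleCode;      // e.g. 0x14 for "module x"
                cmd.Parameters.Add("@ctx", SqlDbType.VarBinary, 128).Value = buffer;
                cmd.ExecuteNonQuery();
            }
        }
    }

Call TagConnection right after opening the connection for a given module, and keep in mind how connection pooling interacts with session-scoped settings like this.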
I just found these, which could be useful too:
Temporary tables, sessions and logging in SQL Server?
Why is my CONTEXT_INFO() empty?

How to approach performance issues?

We are developing a client-server desktop application (WinForms with SQL Server 2008, using LINQ to SQL). We are now finding many issues related to performance. These relate to querying too much data with LINQ, bad database design, not much caching, etc. What do you suggest we should do - how do we go about solving these performance issues? One thing I am doing is SQL profiling and trying to fix some queries. As far as caching is concerned, we have static lists. But how do we keep them updated? We don't have any server-side implementation, so these lists can be stale if someone changes the data.
Performance analysis without tools is fruitless, with the wrong tools frustrating. SQL Profiler is the wrong tool to rely on for what you are looking at. I think it is at best giving you a hint of what is wrong.
You need to use a code profiler to determine why/when these queries are being executed. You should be able to find one by Googling and run it on an x-day trial.
The key questions are:
Are queries being run multiple times when there is no reason to at all? Is the data already in memory, even if not stored statically? This happens a lot where data has already been retrieved, but because of some action in the code it gets loaded again. Class properties are a big culprit here (see the sketch after these questions).
Should certain data be stored statically across the application? How volatile is that data? Can you afford to show stale data?
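A minimal sketch of the class-property trap mentioned in the first question; CustomerOrders and its database call are hypothetical names for illustration:

    using System.Collections.Generic;

    class Order { }

    class CustomerOrders
    {
        private List<Order> _orders;

        // Bad: a getter like "get { return LoadOrdersFromDatabase(); }"
        // would round-trip to the database on every property read.

        // Better: load once, then reuse for the lifetime of the object.
        public List<Order> Orders
        {
            get { return _orders ?? (_orders = LoadOrdersFromDatabase()); }
        }

        private List<Order> LoadOrdersFromDatabase()
        {
            // Placeholder for the real data access call.
            return new List<Order>();
        }
    }

A code profiler makes this pattern obvious: LoadOrdersFromDatabase shows up with a call count far higher than the number of customers on screen.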
The only way to decide on #2 is to have hard data on the cost of a particular transaction. For example, if I know it takes me 1983 ms to create a new invoice, what will it be after I start caching data? Is that saving significant once the cache is in place? But recognize that you can't answer that question until you know it takes 1983 ms to create an invoice.
When I profile an application transaction I focus on the big contributor and try to determine why it is so big. I look for individual methods that are slow and for any code that is executed frequently. It is often the latter, the death of a thousand cuts, that gets you.
And I want to add this: it is also very important to know when to stop working on a performance issue.
I found Jeff Atwood's articles on this quite interesting:
Compiled Or Bust
All Abstractions Are Failed Abstractions
For updating, you can create a table; I called it ListVersions.
Just store the list id, name, and version.
When you make changes to a list, just increment its version. In your application, you'll only need to compare versions and update a list when its version has changed - update the lists whose version was incremented, not all of them.
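A minimal sketch of the client-side check, assuming a ListVersions table with ListId, Name, and Version columns; the cache class and query are illustrative:

    using System.Collections.Generic;
    using System.Data.SqlClient;

    // Remembers the last version seen per list and reports which lists
    // changed, so only those are re-fetched from the server.
    class ListVersionCache
    {
        private readonly Dictionary<int, int> _knownVersions =
            new Dictionary<int, int>();

        public List<int> GetStaleListIds(SqlConnection conn)
        {
            var stale = new List<int>();
            using (var cmd = new SqlCommand(
                "SELECT ListId, Version FROM ListVersions", conn))
            using (var reader = cmd.ExecuteReader())
            {
                while (reader.Read())
                {
                    int id = reader.GetInt32(0);
                    int version = reader.GetInt32(1);
                    int known;
                    if (!_knownVersions.TryGetValue(id, out known) || known != version)
                    {
                        stale.Add(id);   // re-fetch this list only
                        _knownVersions[id] = version;
                    }
                }
            }
            return stale;
        }
    }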
I've described it in my answer to this question
What is the preferred method of refreshing a combo box when the data changes?
Good Luck!
A general recipe for performance issues:
Measure (wall clock time, CPU time, memory consumption etc.)
Design & implement an algorithm that you think could be faster than current code.
Measure again to assess the impact of your fix.
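A minimal sketch of steps 1 and 3, bracketing the operation under test with a Stopwatch; CreateInvoice is a hypothetical stand-in:

    using System;
    using System.Diagnostics;

    // Measure before and after a change so the impact is a number, not a guess.
    var sw = Stopwatch.StartNew();
    CreateInvoice();
    sw.Stop();
    Console.WriteLine("CreateInvoice took " + sw.ElapsedMilliseconds + " ms");

    static void CreateInvoice()
    {
        // Placeholder for the real transaction being measured.
    }

Run the same measurement after the fix; if the number doesn't move, the bottleneck was elsewhere.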
Many times, the biggest bottlenecks aren't exactly where you thought they were, so base your actions on measured data.
Try to keep the number of SQL queries small. You're more likely to get performance improvements by lowering the number of queries than by restructuring the SQL syntax of an individual query.
I recommend adding some server-side logic instead of firing the SQL queries directly from the client. You could then implement caching shared by all clients on the server side.

Are there any resources for language-independent performance tips?

I work with many people who program video games for a living. I have quite a bit of knowledge in C++, and I know a number of general performance strategies to utilize in day-to-day programming, like using prefix ++/-- over postfix.
My problem is that people often come to me for tips on general optimizations they can make on a regular basis when programming, but these people program in all sorts of languages: some use C++, C#, Java, ActionScript, etc.
I am wondering if there are any general performance tips that can be utilized on a day-to-day programming basis. For example, I would suggest prefix ++/-- over postfix for people programming in another language, but I am just not sure that holds everywhere.
My guess is that this is language-specific, and that the best way to approach general optimization is to make sure you are not using majorly bloated algorithms, but maybe someone has some advice.
Without going into language specifics, or even knowing whether this is embedded, web, CAD, game, or iPhone programming, there isn't much that can be said. All we know is that there are multiple languages involved, and that for some unknown reason performance is always slower than desirable.
First, check your algorithms. A slow algorithm can cause horrible performance. Read up on algorithms and their complexity.
Second, note if there are any really slow operations, such as hitting a database or transmitting information or moving a robot arm. See if the program is doing more of those than it should.
Third, profile. If there's a section of code that's taking 5% of the time, no optimization will make your program more than 5% faster. If a section of code is taking a lot of the time, it's worth looking at.
Fourth, get somebody who knows what they're doing to make any specific optimizations. Test them when they're done to make sure they actually speed up performance. When performance was an issue, I've improved it with some counterintuitive measures, like rolling up loops.
I don't think you can generalize optimization as such. To optimize execution time, you need to dig deep into the language and understand how things work in detail. Just guessing or making assumptions based on experience with other languages won't work! For example, writing x = x << 1 instead of x = x * 2 might be a big benefit in C++. In JavaScript it will slow you down.
With all the differences between languages, it's hard to find generic optimization tips. Maybe for some languages that are similar (e.g. C# and Java), but if you add both JavaScript and Python to that list, I'm pretty sure not many common optimization techniques will be left over.
Also keep in mind that premature optimization is often considered bad practice. Developer-hours are much more expensive than buying additional hardware.
However, there is one thing which comes to mind. Over the past decade or so, Object Relational Mappers have become quite popular. And hence, they emerge(d) in pretty much all popular languages. But you have to be careful with those. It's easy to load tons of data into memory that you will never use in your code if not properly configured. Keep that in mind. Lazy loading might be of some help here. But your mileage will vary.
Optimization depends on so many things that answering such a generic question would make this post explode into a full-fledged paper. In my opinion, optimization should be considered on a project-by-project basis, not just a language-by-language one.
I think you need to split this into two separate questions:
1) Are there language-agnostic ways to find performance problems? YES. Profile, but avoid the myths around that subject.
2) Are there language-agnostic ways to fix performance problems? IT DEPENDS.
A general language-agnostic principle is: do (1) before you do (2).
In other words, Ready-Aim-Fire, not Ready-Fire-Aim.
Here's an example of performance tuning, in C, but it could be any language.
A few things I have learned since asking this:
I/O operations are usually the most expensive for performance. This holds especially true when you are doing disk or network I/O (network is usually the most expensive, because waiting for a response from the other host means waiting for all the processing and I/O that the remote host does). Only do these operations when absolutely necessary, and consider using a cache where possible.
Database operations can be very expensive because of network/disk I/O and the translation time to and from SQL. Using an in-memory DB or cache can help reduce the I/O cost, and some (not all) NoSQL databases can reduce SQL translation time.
Only log important information. Logging libraries like log4j help here, because you can put logging to your heart's desire in your application while assigning each message a log level; whichever level you set the application to, only messages at that level or higher are logged. This way, if you need to troubleshoot functionality, you only have to change a quick config and restart your application to get the additional messages, and when you are done you just set the application back to its default level so that you do not log too much (see the sketch after this list).
Only include functionality that is needed. Additional functionality may be nice to have but can increase processing time, provide additional locations for the application to fail, and costs your team development time that could be spent on more important tasks.
Use and configure your memory manager correctly. Garbage collection routines can kill performance if they are not configured correctly. If your application freezes for a second or two every minute for garbage collection, your customers probably will not be happy.
Profile only after you have discovered a performance issue. Profilers make an application's performance look worse than it is, because the application and the profiler run on the same host, consuming the same hardware resources.
Do not do performance tuning prematurely. There are general practices you can follow that should be better for performance in each language, but starting performance tuning in the middle of application development can cost you a lot, because there is still functionality to be added.
This is not necessarily going to help performance, but keep class dependencies to a minimum. When you get into performance tuning, there is a good chance you will have to rewrite whole portions of code, and the more dependencies there are on the section you are tuning, the greater the chance you will break something. It can often be a domino effect, because after fixing the performance issue you have to fix all the dependencies, and possibly the dependencies of those dependencies. A performance tuning exercise estimated at a few hours can quickly turn into months in an application with a lot of dependencies.
If performance is a concern do not use interpreted languages (scripting languages).
Only use the hardware you need. Having a system with a 64-core processor may seem cool, but if you only have two or three threads running in your application, you are getting little benefit from all those cores. In fact, in rare instances, overly excessive hardware can hurt performance, because the application may spend more time switching between cores or processors than actually doing work.
Make any timing metrics you report as granular as possible. Currently you may only need to worry about the number of milliseconds a process takes, but in the future, as you make your application faster and faster, you may need more granular timings. If version A reports milliseconds and version B reports microseconds, how can you compare performance when version B takes about the same number of milliseconds? Version B may be better, but you just can't tell, because version A did not use granular enough metrics.
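A hedged sketch of the level-gating idea from the logging point above, in plain C# rather than log4j; the names and threshold are illustrative:

    using System;

    // Messages below the configured threshold are skipped entirely, so
    // verbose logging can stay in the code but cost little in production.
    enum LogLevel { Debug = 0, Info = 1, Warn = 2, Error = 3 }

    static class Log
    {
        // Flip to LogLevel.Debug via config when troubleshooting.
        public static LogLevel Threshold = LogLevel.Warn;

        public static void Write(LogLevel level, string message)
        {
            if (level < Threshold)
                return;              // gated out: no I/O performed
            Console.WriteLine("[" + level + "] " + message);
        }
    }

With Threshold at Warn, a call like Log.Write(LogLevel.Debug, ...) costs only a comparison.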

Performance gains using straight ado.net vs an ORM?

Would I get any performance gains if I replaced the data access part of the application from NHibernate with straight ADO.NET?
I know that NHibernate uses ADO.NET at its core!
Short answer:
It depends on what kind of operations you perform. You probably would get a performance improvement if you write good SQL, but in some cases you might get worse performance since you lose the NHibernate caching etc.
Long answer:
As you mention, NHibernate sits on top of ADO.NET and provides a layer of abstraction. This makes your life easier in many ways, but as all layers of abstraction it has a certain performance cost.
The main case where you would probably see a performance benefit is when you are operating on many objects at once, such as updating or fetching a large number of entities. This is because of the work the NHibernate session does to keep track of which objects have been modified, etc. My experience is that NHibernate's performance degrades significantly as the number of entities in the session grows (one common mitigation is sketched below).
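A hedged sketch of that mitigation: flush and clear the session in batches during bulk work so it never tracks tens of thousands of entities at once (the entity type and batch size are illustrative):

    using System.Collections.Generic;
    using NHibernate;

    // Flush() pushes pending changes to the database; Clear() drops the
    // tracked entities so the session's memory footprint stays bounded.
    static class BulkOperations
    {
        public static void BulkSave<T>(ISession session, IEnumerable<T> items)
        {
            int i = 0;
            foreach (var item in items)
            {
                session.Save(item);
                if (++i % 100 == 0)
                {
                    session.Flush();
                    session.Clear();
                }
            }
            session.Flush();   // push any remainder
        }
    }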
NHibernate has a lot of ways to improve performance, and if you really know it well, you can get it to perform quite close to ADO.NET. However, if you are not that familiar with it, you can easily shoot yourself in the foot, performance-wise (the select N+1 problem, etc.).
There are some situations where you could actually get worse performance when switching from NHibernate to straight ADO.NET. This is because the NHibernate abstraction layer introduces features that can improve performance, such as caching. NHibernate also includes functionality for optimizing the generated SQL for the current database management system; for example, if you are using SQL Server, it might generate slightly different SQL than if you were using Oracle.
It is worth mentioning that it does not have to be an all-or-nothing situation. You could use NHibernate for the 90% of your database access for which it works great, and then use straight SQL for the 10% where you do complex queries, batch inserts/updates, and so on.
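A hedged sketch of that split, using NHibernate's raw-SQL escape hatch for the hot path; the SQL and names are illustrative, and plain ADO.NET would serve equally well here:

    using System.Collections;
    using NHibernate;

    static class ReportingQueries
    {
        // The 90%: routine entity work (session.Get, session.Save, LINQ)
        // goes through NHibernate elsewhere in the application as usual.

        // The 10%: a hand-tuned aggregate that generated SQL handles poorly.
        public static IList OrderTotalsPerCustomer(ISession session)
        {
            return session.CreateSQLQuery(
                    "SELECT CustomerId, SUM(Total) AS Total " +
                    "FROM Orders GROUP BY CustomerId")
                .List();
        }
    }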

LINQ2SQL performance vs. custom DAL vs. NHibernate

Given a straightforward user-driven, high traffic web application (no fancy reporting/BI):
If my utmost goal is performance (not ease of maintainability, ease of queryability, etc.), I would surmise that in most cases a roll-your-own DAL would be the best choice.
However, if I were to choose Linq2SQL or NHibernate, roughly what kind of performance hit would we be talking about? 10%? 20%? 200%? And of the two, which would be faster?
Does anyone have any real world numbers that could shed some light on this? (and yes, I know Stackoverflow runs on Linq2SQL..)
If you know your stuff (esp. in SQL and ADO.NET), then yes - most likely, you'll be able to create a highly tweaked, highly optimized custom DAL for your particular interest and be faster overall than a general-purpose ORM like Linq-to-SQL or NHibernate.
As to how much, that's really, really hard to say without knowing your concrete table structure, data, and usage patterns. I remember Rico Mariani did some Linq-to-SQL vs. raw SQL comparisons, and his end result was that Linq-to-SQL achieved over 90% of the performance of a highly skilled SQL programmer.
See: http://blogs.msdn.com/ricom/archive/2007/07/05/dlinq-linq-to-sql-performance-part-4.aspx
Not too shabby in my book, especially if you factor in the productivity gains you get - but that's the big trade-off always: productivity vs. raw performance.
Here's another blog post on Entity Framework and Linq-to-SQL compared to DataReader and DataTable performance.
I don't have any such numbers for NHibernate, unfortunately.
In two high-traffic web apps, refactoring an ORM call to use a stored procedure via ADO.NET only got us about a 1-2% change in CPU and time.
Going from an ORM to a custom DAL is an exercise in micro optimization.

Resources