High-performance product catalog in ASP.NET? - linq

I am planning a high-performance e-commerce project in ASP.NET and need help selecting the optimal data retrieval model for the product catalog.
Some details,
products in 10-20 categories
1000-5000 products in every category
products listed with name, price, brand and image, 15-40 on every page
products need to be listed without table tags
product info in 2-4 tables that will be joined together (product images not stored in db)
web server and SQL database on different hardware
MS SQL 2005 on a shared DB server (pretty bad performance to start with...)
enable users to search products by combining different criteria such as price range, brand, and with/without image.
My questions are,
what technique shall I use to retrieve the products?
what technique shall I use to present the products?
what cache strategy do you recommend?
how do I solve filtering, sorting, and paging in the most efficient way?
do you have any recommendations for further reading on this subject?
Thanks in advance!

Let SQL Server retrieve the data.
With fairly good indexing, SQL Server should be able to cope.
In SQL 2005 you can do paging of results on the server; that way you have less data to shuffle back and forth.
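For example, a minimal sketch of server-side paging with ROW_NUMBER() called from ADO.NET; the Product table, its columns, and the connection string are assumptions for illustration, not part of the original question:

```csharp
using System.Data;
using System.Data.SqlClient;

// Sketch: page products on the SQL Server (2005+) side with ROW_NUMBER(),
// so only one page of rows crosses the network. Table/column names assumed.
public static DataTable GetProductPage(string connectionString, int categoryId,
                                       int pageIndex, int pageSize)
{
    const string sql = @"
        WITH Numbered AS (
            SELECT p.ProductId, p.Name, p.Price, p.Brand, p.ImagePath,
                   ROW_NUMBER() OVER (ORDER BY p.Name) AS RowNum
            FROM dbo.Product p
            WHERE p.CategoryId = @CategoryId
        )
        SELECT ProductId, Name, Price, Brand, ImagePath
        FROM Numbered
        WHERE RowNum BETWEEN @First AND @Last
        ORDER BY RowNum;";

    using (var conn = new SqlConnection(connectionString))
    using (var cmd = new SqlCommand(sql, conn))
    {
        cmd.Parameters.AddWithValue("@CategoryId", categoryId);
        cmd.Parameters.AddWithValue("@First", pageIndex * pageSize + 1);
        cmd.Parameters.AddWithValue("@Last", (pageIndex + 1) * pageSize);

        var table = new DataTable();
        conn.Open();
        using (var reader = cmd.ExecuteReader())
        {
            table.Load(reader);
        }
        return table;
    }
}
```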

I think you will end up with a lot of text searching. Give either Lucene or Solr (an HTTP server on top of Lucene) a try. CNET developed Solr for its product catalog search.

Have you thought about looking at an existing shopping cart platform that allows you to purchase the source code?
I've used www.aspdotnetstorefront.com
They have lots of examples of major e-commerce stores running on this platform. I built www.ElegantAppliance.com on this platform. Several thousand products, over 100 categories/sub-categories.

Make sure your database design is normalised as much as possible - use lookup tables where necessary to make sure you are not repeating data unnecessarily.
Store your images on the server filesystem and store a relative (not full) path reference to them in the database.
Use stored procedures as much as possible, and always retrieve the least data you can from the server to help with memory and network traffic efficiency (see the sketch below).
Don't bother with caching; your database should be fast enough to produce results immediately, and if not, make it faster.
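As a rough illustration of the stored-procedure advice: the sketch below calls a hypothetical dbo.GetProductListing procedure and materialises only the columns the listing page actually displays. The procedure name, parameters, column order, and ProductListItem shape are all assumptions.

```csharp
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;

public class ProductListItem
{
    public int Id;
    public string Name;
    public decimal Price;
    public string Brand;
    public string ImagePath; // relative path; the image itself lives on the file system
}

// Sketch: fetch only the listing columns via a (hypothetical) stored procedure.
public static List<ProductListItem> GetProductListing(string connectionString, int categoryId)
{
    var items = new List<ProductListItem>();

    using (var conn = new SqlConnection(connectionString))
    using (var cmd = new SqlCommand("dbo.GetProductListing", conn))
    {
        cmd.CommandType = CommandType.StoredProcedure;
        cmd.Parameters.AddWithValue("@CategoryId", categoryId);

        conn.Open();
        using (var reader = cmd.ExecuteReader())
        {
            while (reader.Read())
            {
                items.Add(new ProductListItem
                {
                    Id = reader.GetInt32(0),
                    Name = reader.GetString(1),
                    Price = reader.GetDecimal(2),
                    Brand = reader.GetString(3),
                    ImagePath = reader.GetString(4)
                });
            }
        }
    }
    return items;
}
```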

Related

Can Umbraco handle millions of nodes?

I have evaluated Umbraco and believe it is a good CMS for my project. But there is something really bugging me. When I open the backoffice, all the child nodes are displayed for a certain node. What if a node has millions of child nodes? Will all these nodes load in the backoffice? This is the only problem I have with Umbraco, as my next project will require loading tens of thousands of nodes per day.
This is not a question for SO; you should really ask it on the http://our.umbraco.org site.
However... you need to look at your architecture: how you plan to implement your project and also how users are going to use the backoffice. Loading purchase orders, sales orders, etc. as nodes in the CMS does not make a great deal of sense. After all, Umbraco is a CMS, not an e-commerce solution. And navigating thousands of nodes would be horrible for a backoffice user.
Generally, you should restrict nodes to being related specifically to content. If it isn't content, then it shouldn't be a node. There are exceptions; e.g. you could create content categories and date folders as nodes. These are not strictly content, but they affect how content is displayed.
Products displayed on a site are an interesting case, because you could argue that a product is content. But then it depends on how many products you are listing. If you have a catalogue of 10,000 product SKUs, yes, Umbraco would probably handle it, but is this the best use of Umbraco?
The alternative to creating data as nodes is to have separate database tables that hold the relevant data (in this case orders) and then have a custom section within the CMS that provides access to listing/detail/edit screens. This approach is probably much more appropriate when dealing with large volumes of data, as you are not putting the load on Umbraco: a custom section essentially bypasses Umbraco and allows you to access data directly from the database in whichever implementation you like (MVC/Web Forms).
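A very small sketch of that idea: a custom backoffice screen reading order data straight from its own table with plain ADO.NET, bypassing the node tree entirely. The table and column names here are hypothetical.

```csharp
using System.Data;
using System.Data.SqlClient;

// Sketch only: a custom section can list orders from its own table instead of
// creating a node per order. dbo.ShopOrder and its columns are hypothetical.
public static DataTable GetRecentOrders(string connectionString, int top)
{
    const string sql = @"
        SELECT TOP (@Top) OrderId, OrderNumber, CustomerName, Total, CreatedOn
        FROM dbo.ShopOrder
        ORDER BY CreatedOn DESC;";

    using (var conn = new SqlConnection(connectionString))
    using (var cmd = new SqlCommand(sql, conn))
    {
        cmd.Parameters.AddWithValue("@Top", top);
        var table = new DataTable();
        conn.Open();
        using (var reader = cmd.ExecuteReader())
        {
            table.Load(reader);
        }
        return table;
    }
}
```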
Finally, I should point out that there are already several e-commerce packages available that will do this for you. See teaCommerce and uCommerce.
Updated at 03/2015:
I've just completed an Umbraco e-commerce project using uWebshop (an open-source option). It does create products as nodes, and I thought I should probably update this answer. In this circumstance the shop had a very small catalogue (< 50 SKUs), so having products as nodes didn't pose a massive issue. I could see, however, that managing a much larger catalogue (e.g. 500+) in this manner would become extremely unwieldy.

Could disabled products slow down the site?

I have more and more disabled products and I am wondering if they could slow down my site.
All operations will be much faster on a smaller database.
Each product has tens of records in the EAV tables. 10-20k inactive products will have a major impact on anything that is not cached. Reindexing will take longer.
You should take into consideration the following when you delete products:
Reorder will not work (as the same product ordered is expected to exist)
Check whether you have custom reports/extensions that pull data from the product tables.
Orders/invoices already created will still display all the information.
If you know how to optimize the whole thing, and your site was working fine before, deactivating products does not have much impact on the speed of the store as a whole.

More than 200 stores with multi store and same products?

We have 250k products and we want to create more than 200 different sites (multi-site). All stores will have the same products but a different domain and a different design. Magento's multi-site function is good for us, but you need to match every product with every site, which means 50m (minimum) records for tables like "catalog_product_website" (and some other tables); flat catalog and indexing will also be a real problem.
If we use stores and store views, the user base will be the same, and the flat catalog will still be a problem.
So my question, is there any way to make all these stores work like single store? Or is there any way to make it work with nice performance?
The short answer is no.
While Magento is built for multi-site applications, I don't think it's suited to running 200+ stores with 250,000 products.
At that size, I'd say contact Magento regarding Magento Enterprise to see what they have to say. Oh, and be prepared to pay the $15,000/Server/Year price.

How to improve Search performance on e-commerce site?

I have an e-commerce website built on ASP.NET MVC 3. It has approx. 200k products. We are currently searching the product table to provide search on the site.
The issue is that it is deadly slow, and of course, by analyzing performance with a profiler, we found that the SQL search is the main issue.
What are the alternatives that could be used to speed up the search? Do we need to build a separate cache for search, or does anything else need to be done?
If you look at other large e-commerce sites like eBay, Amazon, or Flipkart, they are very fast. How do they do it?
They usually build a full-text index of what is searchable, using, for example, Lucene.NET or Solr (which uses Java, but an instance of it can be called from .NET using SolrNet).
The index is built from your SQL database and any searches you do would need to make use of that index by sending queries to it like you would on a SQL database, but with a different syntax of course. The index needs to be updated/recreated periodically to stay up to date with the SQL database.
Such a text index is built for querying large amounts of string data and can easily handle hundreds of thousands of products in a product search function on your website. Aside from speed, there are other benefits that would be very hard to do without a text index, such as spelling corrections and fuzzy searches.
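As a rough sketch of that approach using the Lucene.NET 3.x API: build the index from product rows pulled out of SQL, then query the index instead of the product table. The IndexedProduct shape and the field names are assumptions, not part of the original answer.

```csharp
using System.Collections.Generic;
using System.IO;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.QueryParsers;
using Lucene.Net.Search;
using Lucene.Net.Store;
using Version = Lucene.Net.Util.Version;

public class IndexedProduct
{
    public int Id;
    public string Name;
    public string Brand;
}

public static class ProductSearchIndex
{
    // Rebuild the full-text index from product rows pulled out of SQL.
    // Run this periodically (or after catalog changes) to keep it in sync.
    public static void Rebuild(string indexPath, IEnumerable<IndexedProduct> products)
    {
        var dir = FSDirectory.Open(new DirectoryInfo(indexPath));
        var analyzer = new StandardAnalyzer(Version.LUCENE_30);
        using (var writer = new IndexWriter(dir, analyzer, true, IndexWriter.MaxFieldLength.UNLIMITED))
        {
            foreach (var p in products)
            {
                var doc = new Document();
                doc.Add(new Field("id", p.Id.ToString(), Field.Store.YES, Field.Index.NOT_ANALYZED));
                doc.Add(new Field("name", p.Name ?? "", Field.Store.YES, Field.Index.ANALYZED));
                doc.Add(new Field("brand", p.Brand ?? "", Field.Store.YES, Field.Index.ANALYZED));
                writer.AddDocument(doc);
            }
            writer.Optimize();
        }
    }

    // Query the index instead of the SQL product table; returns matching names.
    public static List<string> Search(string indexPath, string text, int maxHits)
    {
        var dir = FSDirectory.Open(new DirectoryInfo(indexPath));
        var analyzer = new StandardAnalyzer(Version.LUCENE_30);
        var results = new List<string>();
        using (var searcher = new IndexSearcher(dir, true))
        {
            var parser = new QueryParser(Version.LUCENE_30, "name", analyzer);
            var hits = searcher.Search(parser.Parse(text), maxHits);
            foreach (var scoreDoc in hits.ScoreDocs)
            {
                results.Add(searcher.Doc(scoreDoc.Doc).Get("name"));
            }
        }
        return results;
    }
}
```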

How to access data in Dynamics CRM?

What is the best way in terms of speed of the platform and maintainability to access data (read only) on Dynamics CRM 4? I've done all three, but interested in the opinions of the crowd.
Via the API
Via the webservices directly
Via DB calls to the views
...and why?
My thoughts normally center around DB calls to the views but I know there are purists out there.
Given both requirements I'd say you want to call the views. Properly crafted SQL queries will fly.
Going through the API is required if you plan to modify data, but it isn't the fastest approach around because it doesn't allow deep loading of entities. For instance, if you want to look at customers and their orders, you'll have to load both up individually and then join them manually, whereas a SQL query will already have the data joined.
Never mind that the TDS stream is a lot more efficient than the SOAP messages used by the API and web services.
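For illustration, a read-only sketch against the filtered views. The view and column names (FilteredAccount, FilteredSalesOrder, customerid) are typical of a CRM 4 organisation database but are assumptions here; check them against your own schema.

```csharp
using System.Data;
using System.Data.SqlClient;

// Sketch: join customers and their orders directly from the CRM filtered views
// instead of loading each entity separately through the API.
public static DataTable GetAccountOrders(string crmConnectionString)
{
    const string sql = @"
        SELECT a.name, o.ordernumber, o.totalamount, o.createdon
        FROM dbo.FilteredAccount a
        INNER JOIN dbo.FilteredSalesOrder o ON o.customerid = a.accountid
        ORDER BY a.name, o.createdon DESC;";

    using (var conn = new SqlConnection(crmConnectionString))
    using (var cmd = new SqlCommand(sql, conn))
    {
        var table = new DataTable();
        conn.Open();
        using (var reader = cmd.ExecuteReader())
        {
            table.Load(reader);
        }
        return table;
    }
}
```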
UPDATE
I should point out, in regard to the views and the CRM database in general, that CRM does not optimize the indexes on the tables or views for custom entities (how could it?). So if you have a truckload entity that you look up by destination all the time, you'll need to add an index for that property. Depending upon your application, it could make a huge difference in performance.
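A sketch of what adding such an index might look like; the entity and attribute names (new_truckloadExtensionBase, new_destination) are hypothetical, so verify them against your own schema first.

```csharp
using System.Data.SqlClient;

// Sketch: add a nonclustered index on the attribute the custom entity is
// constantly looked up by. Table and column names are hypothetical.
public static void AddDestinationIndex(string crmConnectionString)
{
    const string sql = @"
        CREATE NONCLUSTERED INDEX IX_new_truckload_destination
        ON dbo.new_truckloadExtensionBase (new_destination);";

    using (var conn = new SqlConnection(crmConnectionString))
    using (var cmd = new SqlCommand(sql, conn))
    {
        conn.Open();
        cmd.ExecuteNonQuery();
    }
}
```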
I'll add to jake's comment by saying that querying against the tables directly instead of the views (*base & *extensionbase) will be even faster.
In order of speed it'd be:
direct table query
view query
filtered view query
api call
Direct table updates:
I disagree with Jake that all updates must go through the API. The correct statement is that going through the API is the only supported way to do updates. There are in fact several instances where directly modifying the tables is the most reasonable option:
One time imports of large volumes of data while the system is not in operation.
Modification of specific fields across large volumes of data.
I agree that this sort of direct modification should only be a last resort when the performance of the API is unacceptable. However, if you want to modify a boolean field on thousands of records, doing a direct SQL update to the table is a great option.
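A sketch of that kind of last-resort update: flipping a boolean attribute across every record of a custom entity in one statement instead of thousands of API calls. The entity and attribute names are hypothetical, this is unsupported, and you should back up the database and take the system offline first.

```csharp
using System.Data.SqlClient;

// Sketch: unsupported direct table update for a bulk boolean change.
// dbo.new_widgetExtensionBase and new_isarchived are hypothetical names.
public static int BulkSetFlag(string crmConnectionString)
{
    const string sql = @"
        UPDATE dbo.new_widgetExtensionBase
        SET new_isarchived = 1
        WHERE new_isarchived = 0 OR new_isarchived IS NULL;";

    using (var conn = new SqlConnection(crmConnectionString))
    using (var cmd = new SqlCommand(sql, conn))
    {
        conn.Open();
        return cmd.ExecuteNonQuery(); // number of rows touched
    }
}
```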
Relative Speed
I agree with XVargas as far as relative speed.
Unfiltered Views vs Tables: I have not found the performance advantage to be worth the hassle of manually joining the base and extension tables.
Unfiltered views vs filtered views: I was recently working with a complicated query which took about 15 minutes to run using the filtered views. After switching to the unfiltered views, this query ran in about 10 seconds. Looking at the respective query plans, the raw query had 8 operations, while the query against the filtered views had over 80 operations.
Unfiltered Views vs API: I have never compared querying through the API against querying views, but I have compared the cost of writing data through the API vs inserting directly through SQL. Importing millions of records through the API can take several days, while the same operation using insert statements might take several minutes. I assume the difference isn't as great during reads but it is probably still large.

Resources