How do I update an expensive in-memory cache across a SharePoint farm?

We have 3 front-end servers each running multiple web applications. Each web application has an in memory cache.
Recreating the cache is very expensive (>1 min). Therefore we repopulate it every 5 minutes using a web service call to each web application on each front-end server.
The main problems with this setup are maintaining the list of targets to update and the cost of rebuilding the cache several times every few minutes.
We are considering AppFabric or something similar, but I am unsure how time-consuming it is to get up and running. We also really need the easiest solution.
How would you update an expensive in memory cache across multiple front-end servers?

The problem with in-memory caching is that each cache is local to its server. I'm going with the idea that this is why you want to use AppFabric. I'm also assuming that you're re-creating the cache every few minutes to keep the in-memory caches in sync across all servers. With all this work, I can well appreciate that caching is expensive for you.
It sounds like you're doing a lot of work that probably isn't necessary. This article has some detail about the caching mechanisms available within SharePoint. You may be interested in the output cache discussed near the top of the article. You may also want to read the linked TechNet article and the linked article called "Custom Caching Overview".

The only SharePoint way to do that is to use the Service Application infrastructure. The only problem is that it takes some time to understand how it works, and it's too complicated to build from scratch. You might consider downloading one of the existing sample applications and renaming its classes/GUIDs to match your naming conventions. I used this one: http://www.parago.de/2011/09/paragoservices-a-sharepoint-2010-service-application-sample/. This way you can have a single cache for N front-end servers.


How to speed up the TYPO3 Backend?

Given: Each call to a BE module takes several seconds even with an SSD. (A well-configured setup runs below 1 second for general BE tasks.)
What are likely bottlenecks?
How to check for them?
What options to speed up?
On purpose I don't give a special configuration, but ask for a general checklist, so that the answer is suitable for many people as first entry point.
General tips on performance tuning for TYPO3 can be found here: https://wiki.typo3.org/Performance_tuning
However, in my experience most general performance problems are due to one of a few reasons:
1. Bad/no caching. Usually this is a problem with one or more extensions (partly) disabling cache. Try disabling all third party extensions and enabling them one by one to see which causes the site to slow down the most. $GLOBALS['TSFE']->set_no_cache() will disable all cache, so you could search for that. USER_INT and COA_INT in TypoScript also disable cache for anything that's configured inside there.
2. A lot of data. Check the database for any tables containing a lot of data. How many constitutes "a lot" depends on a lot of factors, but generally anything below a million records shouldn't be too much of a problem unless for example you do queries with things like LIKE '%...%' on fields containing a lot of data.
3. Not enough resources on the server. To fix this, add more memory and/or CPU cores to the server. Or if it's a shared server, reduce the number of sites running on it.
4. Heavy traffic. No matter how many resources a server has, it will always have a limit to the number of requests it can process in a given time. If this is your problem you will have to look into load balancing and caching servers. If you don't (normally) have a lot of visitors, high traffic can still be caused by robots crawling your site too quickly. These are usually easy to block by IP address in your firewall or webserver configuration.
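To illustrate the LIKE '%...%' remark in the list above, here is a minimal sketch. It uses Python's built-in SQLite as a stand-in for the real TYPO3 database (an assumption, as are the table and index names), and only compares the query plans for a prefix pattern and a leading-wildcard pattern. MySQL/MariaDB report their plans differently, but the underlying reason is the same: an ordinary index cannot help when the pattern starts with a wildcard.

```python
import sqlite3


def query_plan(conn, sql):
    """Return the plan detail strings SQLite reports for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[-1] for row in rows]  # the detail text is the last column


conn = sqlite3.connect(":memory:")
# Hypothetical table. NOCASE collation lets SQLite's case-insensitive LIKE
# use the index for prefix patterns.
conn.execute("CREATE TABLE pages (title TEXT COLLATE NOCASE)")
conn.execute("CREATE INDEX idx_pages_title ON pages (title)")
conn.executemany(
    "INSERT INTO pages VALUES (?)", [(f"page {i}",) for i in range(1000)]
)

# Prefix pattern: the index can narrow the search to a range of titles.
prefix_plan = query_plan(
    conn, "SELECT title FROM pages WHERE title LIKE 'page 1%'"
)
# Leading wildcard: every row has to be examined.
infix_plan = query_plan(
    conn, "SELECT title FROM pages WHERE title LIKE '%ge 1%'"
)

print("prefix:", prefix_plan)
print("infix: ", infix_plan)
```

If the plan for your own slow query starts with a full scan of a large table, that table is a good first candidate to look at.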
A slow backend on a server without any other traffic (you're the only one who can access it) rules out 1 (can only cause a slow backend if users are accessing the frontend and causing a high server load) and 4 (no other traffic).
One further aspect you could inspect: a lot of things are stored in the user record, for example the settings you used in the log module.
One setting which can consume a lot of memory (and time to serialize and deserialize) is the state of the page tree (which pages are expanded and which are not).
Cleaning the user settings can make the backend faster for this user.
If you have a large page tree and the user has to navigate through many pages, the effect will wear off. Another drawback: you lose all settings, as there is still no selective cleaning.
I cannot comment here, but I need to say: the TSFE object does absolutely nothing in the TYPO3 backend. The backend is always uncached. The TYPO3 backend is a standalone module for editing and maintaining the frontend output. Tons of Google search results ignore this fact.
Possible performance bottlenecks are poorly written extensions that do rendering or data processing. Hooks into core functions are usually no big deal, but rendering many elements for edit forms (especially in TYPO3's Fluid template engine) can cause performance problems.
The Extbase-DBAL-Layer can also cause massive performance problems. The reason is that the data model does not know about indexes. It's simple but stupid. A SQL join on a big table of 2000+ records will delay the output perceptibly, depending on the data model.
Also, the TYPO3 backend does not really depend on the TypoScript configuration, but to control some output, or because extensions load it, the *.ts files must be fully parsed. And this parser is very slow.
If you want to speed things up you need to know what goes wrong. Because the TYPO3 framework is very complex, the only way to debug this behaviour is to inspect the runtime with a PHP profiling tool like Xdebug. It's using some kind of Doctrine Framework and will load tons of files on every request. Thus a well-configured OpCache is a must.
The most likely reason the whole thing is slow is that it is poorly written. You can confirm that by inspecting the runtime.
In addition to what already has been said, put the runtime environment onto your checklist:
Memory:
If a heavy IDE and other tools are open at the same time, available memory can become an issue. To check the memory profile, you may start a tool that monitors the memory usage of the machine.
If virtualization is used, check the memory assigned to the box. Check whether assigning more memory improves behaviour.
If required and possible, add more memory to your machine. This should not be a fix for poorly written code. Bad code can blow up any amount of memory.
File access:
TYPO3 reads and writes thousands of files. If you work with a contemporary SSD, this is surprisingly fast. I did measure this. Loading all class files of TYPO3 takes just a fraction of a second.
However this may look different if you do not work with a standard setup. Many factors may slow you down:
USB-Sticks as storage.
Memory cards as storage.
All kinds of external storage may be limited by slow drivers.
Virtualization can become an issue. Again it's a question of drivers.
When in doubt, store your files and DB on a different drive and compare the behaviour.
Routing
The database itself may be fast, yet bad routing of your requests may still slow you down. Think of firewalls, proxies etc., even on your local machine and especially if virtualisation is used.
Database connection:
A fast database connection is crucial. If database access is slow, TYPO3 can't be fast.
Especially due to Extbase, TYPO3 often queries much more data than really required, and more often than really required, because a lot of relations are resolved in the PHP layer instead of the DB layer itself. Loading data structures like the root line may cause a lot of ping-pong between the PHP and the DB layer.
I can't give advice on how to measure your DB connection. You have to ask your admin for that. What you can always do is test and compare with another DB from a completely different environment.
The speed of the database may depend on the type of database itself. Typically you use MySQL/MariaDB, which should be fast. It also depends on the factors mentioned above: memory, file access and routing.
Strategy:
Even without being an admin and knowing all the performance tools, you can always exchange parts of your system and check whether matters improve. With this approach you can localise the culprit without being an expert. Once you have spotted the culprit, Google may help you to get more information.
When it comes to a clean and performant setup of routing or virtualisation it's still the best idea to ask an experienced admin.
Summary
This is all in addition to what others have already pointed to.
What would be really helpful is a BE plugin that analyses and measures the environment. Maybe there are some out there that I don't know of.

What are the size limits for Laravel's file-based caching?

I am a new developer and am trying to implement Laravel's (5.1) caching facility to improve the speed of my app. I started out caching a large DB table that my app constantly references - but it got too large so I have backed away from that and am now 'forever' caching smaller chunks of data - for example, for each page only the portions of that large DB table that are relevant.
I have watched 'Caching Essentials' on Laracasts, done some Googling and had a search in this forum (and Laracasts') but I still have a couple of questions:
I am not totally clear on how the cache size limits work when you are using Laravel's file-based system - is there an overall in-app size limit for the cache or is one limited size-wise only per key and by your server size?
What are the signs you should switch from file-based caching to something like Memcached or Redis - and what are the benefits of using one of those services? Is it the fact that your caching is handled on a different server (thereby lightening the load on your own)? Do you switch over to one of these services when your local, file-based cache gets too big for your server?
My app utilizes several tables that have 3,000-4,000 rows - the data in these tables is constantly referenced and will remain static unless I decide to add new options. I am basically looking for the best way to speed up queries to the data in these tables.
Thanks!
I don't think Laravel imposes any limitations on its file I/O at all. The limitations will be in how much PHP can read from / write to a file at once, or hold in its memory / process at any one time.
It does serialise the data that you cache, and unserialise it when you reload it, so your PHP environment would have to be able to process the entire cache file (which is equivalent to the top level cache key) at once. So, if you are getting cacheduser.firstname, it would have to load the whole cacheduser key from the file, unserialise it, then get the firstname key from that.
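A minimal sketch of that behaviour, written in Python rather than PHP and not taken from Laravel's source: a toy file cache (the FileCache class, file naming and sample data are all hypothetical) that stores one serialized file per top-level key, so reading a single nested value still means reading and unserialising the whole entry.

```python
import os
import pickle
import tempfile


class FileCache:
    """Toy file cache: one serialized file per top-level key.

    Hypothetical stand-in, not Laravel's implementation.
    """

    def __init__(self, directory):
        self.directory = directory
        self.bytes_read = 0  # how much data reads have pulled from disk

    def _path(self, key):
        return os.path.join(self.directory, key + ".cache")

    def put(self, key, value):
        with open(self._path(key), "wb") as fh:
            pickle.dump(value, fh)

    def get(self, dotted_key):
        # "cacheduser.firstname" -> top-level key "cacheduser", rest "firstname"
        top, _, rest = dotted_key.partition(".")
        with open(self._path(top), "rb") as fh:
            raw = fh.read()  # the entire entry is read...
        self.bytes_read += len(raw)
        value = pickle.loads(raw)  # ...and unserialised in memory
        for part in filter(None, rest.split(".")):
            value = value[part]  # only now is the sub-key picked out
        return value


with tempfile.TemporaryDirectory() as tmp:
    cache = FileCache(tmp)
    cache.put(
        "cacheduser",
        {"firstname": "Ada", "history": list(range(100_000))},
    )
    firstname = cache.get("cacheduser.firstname")
    bytes_read = cache.bytes_read

print(firstname, "cost", bytes_read, "bytes read")
```

The few characters of the first name cost a read of the full entry, which is why keeping cached entries small (or splitting them into several keys) matters with a file-based driver.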
I would take the PHP memory limit (classic, I know!) as the first thing to investigate if you want to keep going down this road.
Caching services like Redis or memcached are bespoke, optimised caching solutions. They take some of the logic and responsibility out of your PHP environment.
They can, for example, retrieve sub-keys from items without having to process the whole thing, so can retrieve part of some cached data in a memory efficient way. So, when you request cacheduser.firstname from redis, it just returns you the firstname attribute.
They have other advantages regarding tagging / clearing out subsets of caches (see the cache tags section of the Laravel docs: https://laravel.com/docs/5.4/cache#cache-tags).
Another thing to think about is scaling. If your site is large enough, and is load-balanced across multiple servers, the filesystem caching may be different across those servers, as each server can only check their local filesystem for the cache files. A caching service can be on a different server (many hosts will have a separate redis / memcached services available), so isn't victim to this issue.
Also - as I understand it (and this might be the most important thing), the file cache driver in Laravel is mainly for local development and testing. Although it can work fine for simple applications with basic caching needs, it's not intended for large scalable production environments.
Personally, I develop locally and test with file caching, as I'm only dealing with small amounts of data then, and use Redis to cache on production environments.
It doesn't necessarily need to be on a separate server to get the benefits. If you are never going to scale to multiple application servers, then using a caching service on the same server will already be a large improvement to caching large documents.

Performance difference between Azure Redis cache and In-role cache for outputcaching

We are moving an asp.net site to Azure Web Role and Azure Sql Database. The site is using output cache and normal Cache[xxx] (i.e. HttpRuntime.Cache). These are now stored in the classic way in the web role instance memory.
The low hanging fruit is to first start using a distributed cache for output caching. I can use in-role cache, either as co-located or with a dedicated cache role, or Redis cache. Both have outputcache providers ready made.
Are there any performance differences between the two (three with co-located/dedicated) cache methods?
One thing to consider: will getting the page from Redis for every page load on every server be faster or slower than composing the page from scratch on every server every 120 seconds, but in between just getting it from local memory?
Which will scale better when we want to start caching our own data (i.e. pocos) in a distributed cache instead of HttpRuntime.Cache?
-Mathias
Answering each of your questions individually:
Are there any performance differences between the two (three with co-located/dedicated) cache methods?
A co-located caching solution is definitely faster than a dedicated cache server: with a co-located/in-proc solution the request is handled locally within the process, whereas a dedicated cache solution involves getting data over the network. However, since the data is held in memory on the cache server, getting it will still be faster than getting it from the DB.
One thing to consider: will getting the page from Redis for every page load on every server be faster or slower than composing the page from scratch on every server every 120 seconds, but in between just getting it from local memory?
It will depend on the number of objects on the page, i.e. the time taken to compose the page from scratch. Getting it from the cache does involve a network round trip, but that mostly takes fractions of a millisecond.
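For comparison, the "compose every 120 seconds, serve from local memory in between" variant from the question can be sketched like this. It is a Python illustration, not ASP.NET output caching; the TTLCache class, the render_page function and the fake clock are assumptions made so the example runs without waiting.

```python
import time


class TTLCache:
    """Per-process cache whose entries expire after a fixed lifetime."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so the demo can fake the time
        self._entries = {}  # key -> (expires_at, value)

    def get_or_compute(self, key, compute):
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: served from local memory
        value = compute()  # expired or missing: pay the full cost
        self._entries[key] = (now + self.ttl, value)
        return value


# Demo with a fake clock instead of real waiting.
fake_now = [0.0]
renders = []


def render_page():
    """Stand-in for composing the page from scratch."""
    renders.append(fake_now[0])
    return f"<html>rendered at {fake_now[0]}</html>"


cache = TTLCache(120, clock=lambda: fake_now[0])

first = cache.get_or_compute("/home", render_page)   # t=0: rendered
fake_now[0] = 60.0
second = cache.get_or_compute("/home", render_page)  # t=60: cached copy
fake_now[0] = 121.0
third = cache.get_or_compute("/home", render_page)   # t=121: rendered again

print("page was rendered at:", renders)
```

Each server pays the composition cost once per lifetime and nothing in between, whereas the distributed variant pays a small network cost on every request but composes the page only once for the whole farm.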
Which will scale better when we want to start caching our own data (i.e. pocos) in a distributed cache instead of HttpRuntime.Cache?
Since HttpRuntime.Cache is an in-process cache, it is limited to a single process's memory and therefore does not scale. A distributed cache, on the other hand, is a scalable solution where you can always add more servers to increase cache space and throughput. The out-of-process nature of a distributed cache also makes it possible for data cached by one application process to be used by any other process.
You can also look into NCache for Azure as a distributed caching solution. NCache is a native .Net distributed caching solution.
The following blog posts by Iqbal Khan will help you better understand the need for a distributed cache in ASP.NET applications:
Improve ASP.NET Performance in Microsoft Azure with Output Cache
How to use a Distributed Cache for ASP.NET Output Cache
I hope this helps :-)
-Sameer

What steps do you take to increase performance of a SharePoint site?

SharePoint isn't the speediest of server applications, and I've read about a few tips to speed it up. What steps do you think are necessary to increase performance so it can be used to host a high traffic site?
At the end of the day SharePoint is just a complicated web site with all the standard components.
In order to optimize performance you need to analyze each component and determine which one is a problem, and then adjust it accordingly.
We're in the process of implementing a SharePoint website for 1000 concurrent users, which may or may not count as large. Some steps we are taking are:
Implementing a detailed caching strategy, to cache webpart content intelligently.
Using load-balanced servers to ensure all our hardware is utilised rather than lying idle.
We've undertaken capacity planning for the existing solution, so we have a good idea which component is the bottleneck for us (the SQL Server), and we will ensure that server can cope with the expected load and future growth of the site.
We're also using hardware load balancers, which will ensure our network and the related servers operate as expected. Again, this is something to investigate before you implement a SharePoint website.
We're also ensuring our webparts don't generate unnecessary HTML and don't return unnecessary data, as this will slow down loading times.
Something which I definitely think is a good idea is to have a goal to work towards, as you can spend a huge amount of money and time optimizing SharePoint, which may prove unnecessary.
My additional best bets are:
use x64 to allow more RAM on your server
Make the best use of your application pool recycling http://blogs.msdn.com/joelo/archive/2007/10/29/sharepoint-app-pool-settings.aspx
Make sure all custom code properly disposes SPWeb and SPSite objects using this http://blogs.msdn.com/rogerla/archive/2008/02/12/sharepoint-2007-and-wss-3-0-dispose-patterns-by-example.aspx
utilize MS Capacity Planning Tool http://technet.microsoft.com/en-us/library/bb961988.aspx
Plan your site collection and database sizes. Keeping your databases and site collections under control will be key
GOVERNANCE GOVERNANCE GOVERNANCE - Plan for site size limits and expiration strategy. Old data should be deleted or archived for better performance. http://technet.microsoft.com/en-us/office/sharepointserver/bb507202.aspx
I cannot emphasize enough that proper early planning is essential for a successful SharePoint implementation.
In addition to caching and hardware, try to make sure that your masterpages and page layouts are not ghosted in the database (requiring a database call to retrieve).
Do this by ensuring the files get released to the 12 hive in your solution.
Don't forget careful selection of the built-in cache settings (choose the right one for your situation).
Use the BLOBCache.
Use IIS Compression/caching (the defaults are not enough BTW).
Ensure your SQL box can keep up, especially during indexing/crawling. Splitting the Application roles (indexing vs search query and dedicated WFE for indexing/crawling) helps.
BTW if you're running VMware VMs for your WFEs, Windows NLB breaks (though not consistently), so use hardware NLBs or DNS round-robin, etc.
If you don't need more than 2 GB of RAM for the IIS Application Pool on a WFE, don't bother with 64-bit on the WFE.
Just my 2c.

Caching with multiple server

I'm building an application with multiple servers involved (4 servers, each with a database and a webserver: 1 master database and 3 slaves, plus one load balancer).
There are several approaches to caching. Right now it's fairly simple and not efficient at all.
All the caching is done on an NFS partition shared between all servers. NFS is the bottleneck in the architecture.
I have several ideas for implementing caching.
It can be done at the server level (local file system), but the problem is invalidating a cache file on all servers when the content has been updated. That can be done with a short cache lifetime (not efficient, because most of the time the cache will be refreshed sooner than it needs to be).
It can also be done with a messaging system (XMPP for example) where the servers communicate with each other. The server responsible for invalidating the cache sends a request to all the others to let them know that the cache has been invalidated. Latency is probably higher (it takes more time for everybody to know that the cache has been invalidated), but my application doesn't require atomic cache invalidation.
A third approach is to use a cloud system to store the cache (like CouchDB), but I have no idea of the performance of this one. Is it faster than using a SQL database?
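The messaging idea can be sketched as below. This is a single-process Python illustration, not XMPP: the Node class and its direct method calls are assumptions standing in for real servers and a real message transport, and the shared dict stands in for the master database.

```python
class Node:
    """One web server with a local cache in front of a shared backing store."""

    def __init__(self, name, backing_store):
        self.name = name
        self.store = backing_store  # stand-in for the master database
        self.local = {}             # this server's local cache
        self.peers = []

    def read(self, key):
        if key not in self.local:
            self.local[key] = self.store[key]  # cache miss: hit the store
        return self.local[key]

    def write(self, key, value):
        self.store[key] = value
        self.local[key] = value
        # Tell every other server that its copy is stale. In a real setup
        # this would be a message on the bus and arrive with some delay.
        for peer in self.peers:
            peer.on_invalidate(key)

    def on_invalidate(self, key):
        self.local.pop(key, None)  # drop the stale entry; next read refills


def connect(nodes):
    """Make every node a peer of every other node."""
    for node in nodes:
        node.peers = [peer for peer in nodes if peer is not node]


store = {"title": "old"}
nodes = [Node(f"web{i}", store) for i in range(1, 5)]
connect(nodes)

before = [node.read("title") for node in nodes]  # warms all four caches
nodes[0].write("title", "new")                   # web1 updates the content
after = [node.read("title") for node in nodes]   # peers refetch on demand

print(before, "->", after)
```

Entries stay cached for as long as nothing changes, which is the advantage over a short lifetime. The price is that a server which misses an invalidation message keeps serving stale data, so some fallback lifetime is still worth considering.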
I plan to use Zend Framework, but I don't think that's really relevant (except that other frameworks probably have packages for dealing with XMPP or CouchDB).
Requirements: persistent cache (if a server restarts, the cache shouldn't be lost, to avoid bringing down the server while the cache is re-created).
http://www.danga.com/memcached/
Memcached covers most of the requirements you lay out - message-based read, commit and invalidation. High availability and high speed, but very little atomic reliability (sacrificed for performance).
(Also, memcached powers things like YouTube, Wikipedia, Facebook, so I think it can be fairly well-established that organizations with the time, money and talent to seriously evaluate many distributed caching options settle with memcached!)
Edit (in response to comment)
The idea of a cache is for it to be relatively transitory compared to your backing store. If you need to persist the cache data long-term, I recommend looking at either (a) denormalizing your data tier to get more performance, or (b) adding a middle-tier database server that stores high-volume data in straight key-value-pair tables, or something closely approximating that.
In defence of memcached as a cache store, if you want high performance with low impact from a server reboot, why not just have 4 memcached servers? Or 8? Each 'reboot' would have correspondingly less effect on the database server.
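A rough sketch of that arithmetic, assuming keys are spread over the servers by a simple hash-modulo scheme (an assumption for illustration; real memcached clients may distribute keys differently, for example with consistent hashing). The key names are made up.

```python
import hashlib


def node_for(key, node_count):
    """Pick a cache server for a key by hashing it (simple modulo scheme)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % node_count


def share_lost_on_reboot(keys, node_count, rebooted_node=0):
    """Fraction of cached keys that disappear when one server restarts."""
    lost = sum(1 for key in keys if node_for(key, node_count) == rebooted_node)
    return lost / len(keys)


keys = [f"page:{i}" for i in range(10_000)]  # hypothetical cache keys
loss_by_cluster_size = {
    n: share_lost_on_reboot(keys, n) for n in (1, 4, 8)
}

for n, loss in loss_by_cluster_size.items():
    print(f"{n} server(s): about {loss:.0%} of the cache lost on one reboot")
```

With one server a reboot empties the whole cache; with N servers roughly 1/N of the entries have to be rebuilt from the database.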
I think I found a relatively good solution.
I use Zend_Cache to store each cache file locally.
I've created a small daemon based on nanoserver which also manages cache files locally.
When one server creates/modifies/deletes a cache file locally, it sends the same action to all the other servers through the daemon, which performs the same action there.
That means I have local cache files and remote actions at the same time.
Probably not perfect, but should work for now.
CouchDB was too slow and NFS is not reliable enough.
