How does caching work in OpenCPU?

This question is directed towards Jeroen and is a follow-up to this answer: https://stackoverflow.com/a/12482918/177984
Jeroen wrote "the server does caching" .. "so if enough memory is available it will automatically be available from memory."
How can I confirm if an object is cached 'in-memory' or not? From what I can tell (by performance) all of my objects are being read from disk. I'd like to have things read from memory to speed up data load times. Is there a way to view what's in the in-memory cache? Is there a way to force caching objects in-memory?
Thanks for your help.

The OpenCPU project is rapidly evolving. Things have changed in OpenCPU 1.0. Have a look at the website for the latest information: http://www.opencpu.org.
The answer that you cited is outdated. Currently all caching is indeed done on disk. In a previous version, OpenCPU used Varnish for caching, which is completely in-memory. However, this turned out to make things more complicated (especially https), and performance was a bit disappointing (especially in comparison with today's fast disks). So now we're back to nginx, which caches on disk but is much more mature and configurable as a web server, and has other performance benefits.
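To see whether the disk cache is being hit at all, you can request the same resource twice and compare timings and response headers from the client side. A minimal sketch using Python's requests package, with a placeholder URL for one of your own OpenCPU resources; which cache headers (if any) are exposed depends entirely on how nginx is configured:
    import time
    import requests

    # Placeholder: replace with a resource on your own OpenCPU server.
    URL = "https://your-opencpu-server/ocpu/library/yourpkg/data/yourdata/json"

    for attempt in (1, 2):
        start = time.time()
        r = requests.get(URL)
        elapsed = time.time() - start
        print("attempt %d: status %d, %.3f s" % (attempt, r.status_code, elapsed))
        # Header names depend on the server configuration; these are common ones.
        for header in ("Cache-Control", "ETag", "X-Cache", "X-Cache-Status"):
            if header in r.headers:
                print("  %s: %s" % (header, r.headers[header]))
A noticeably faster second request, or an explicit cache-status header, suggests the response is coming from the nginx cache rather than being recomputed by R.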

Related

How to speed up the TYPO3 Backend?

Given: each call to a BE module takes several seconds, even with an SSD drive. (A well-configured setup runs below 1 second for general BE tasks.)
What are the likely bottlenecks?
How do I check for them?
What are the options to speed things up?
I deliberately don't give a specific configuration, but ask for a general checklist, so that the answer is suitable for many people as a first entry point.
General tips on performance tuning for TYPO3 can be found here: https://wiki.typo3.org/Performance_tuning
However, in my experience most general performance problems are due to one of a few reasons:
1. Bad/no caching. Usually this is a problem with one or more extensions (partly) disabling the cache. Try disabling all third-party extensions and enabling them one by one to see which causes the site to slow down the most. $GLOBALS['TSFE']->set_no_cache() will disable all caching, so you could search for that. USER_INT and COA_INT in TypoScript also disable the cache for anything configured inside them.
2. A lot of data. Check the database for any tables containing a lot of data. How much constitutes "a lot" depends on many factors, but generally anything below a million records shouldn't be too much of a problem, unless, for example, you run queries with things like LIKE '%...%' on fields containing a lot of data.
3. Not enough resources on the server. To fix this, add more memory and/or CPU cores to the server. Or, if it's a shared server, reduce the number of sites running on it.
4. Heavy traffic. No matter how many resources a server has, there is always a limit to the number of requests it can process in a given time. If this is your problem, you will have to look into load balancing and caching servers. If you don't (normally) have a lot of visitors, high traffic can still be caused by robots crawling your site too quickly. These are usually easy to block by IP address in your firewall or web server configuration.
A slow backend on a server without any other traffic (you're the only one who can access it) rules out 1 (caching problems can only slow down the backend if users are hitting the frontend and causing high server load) and 4 (no other traffic).
One further aspect you could inspect: a lot of things are stored in the backend user record, for example the settings you used in the log module.
One setting which can consume a lot of memory (and time to serialize and deserialize) is the state of the page tree (which pages are expanded and which are not).
Cleaning the user settings can make the backend faster for this user.
If you have a large page tree and the user has to navigate through many pages, the effect will wear off again. Another drawback: you lose all settings, as there is still no selective cleaning.
Cannot comment here, but I need to say: the TSFE object does absolutely nothing in the TYPO3 backend. The backend is always uncached. The TYPO3 backend is a standalone module to edit and maintain the frontend output. There are tons of Google search results that ignore this fact.
Possible performance bottlenecks are poorly written extensions that do rendering or data processing. Hooks into core functions are usually no big deal, but rendering many elements in edit forms (especially with TYPO3's Fluid template engine) can cause performance problems.
The Extbase DBAL layer can also cause massive performance problems. The reason is that the database model does not know about indexes. It's simple but stupid. An SQL join on a bigger table of 2000+ records can delay the output perceptibly, depending on the data model.
Also, the TYPO3 backend does not really depend on the TypoScript configuration, but to control some output, or because extensions load it, the *.ts files still need to be fully parsed. And this parser is very slow.
If you want to speed things up, you need to know what goes wrong. The only way to debug this behaviour is to inspect the runtime with a PHP profiling tool like Xdebug, because the TYPO3 framework is very complex. It uses some kind of Doctrine framework and loads tons of files on every request. Thus a well-configured OPcache is a must.
The main reason the whole thing is slow is that it is poorly written. You can confirm that by inspecting the runtime.
In addition to what already has been said, put the runtime environment onto your checklist:
Memory:
If a heavy IDE and other tools are open at the same time, available memory can become an issue. To check the memory profile, you can start a tool that monitors the machine's memory usage.
If virtualization is used, check the memory assigned to the box. Check whether assigning more memory improves the behaviour.
If required and possible, give your machine more memory. This should not be a bugfix for poorly written code, though: bad code can exhaust any amount of memory.
File access:
TYPO3 reads and writes thousands of files. If you work with a contemporary SSD, this is surprisingly fast. I measured this: loading all of TYPO3's class files takes just a fraction of a second.
However, this may look different if you do not work with a standard setup. Many factors may slow you down:
USB sticks as storage.
Memory cards as storage.
All kinds of external storage may be limited by slow drivers.
Virtualization can become an issue. Again, it's a question of drivers.
If in doubt, store your files and DB on a different drive and compare the behaviour, for example with a small timing script like the one below.
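A minimal Python sketch for such a comparison, timing a full read of a directory tree; the path is a placeholder, and note that the OS file cache will make a second run on the same drive much faster:
    import os
    import time

    # Placeholder: point this at the directory whose read speed you want to test,
    # e.g. the TYPO3 core or an extension folder, once per drive you compare.
    ROOT = "/var/www/typo3/vendor"

    start = time.time()
    total_bytes = 0
    file_count = 0
    for dirpath, _dirs, filenames in os.walk(ROOT):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as handle:
                    total_bytes += len(handle.read())
                    file_count += 1
            except OSError:
                pass  # skip unreadable entries (permissions, sockets, ...)

    elapsed = time.time() - start
    print("read %d files (%.1f MB) in %.2f seconds"
          % (file_count, total_bytes / 1e6, elapsed))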
Routing:
The database itself may be fast, but bad routing of your requests may still slow you down. Think of firewalls, proxies etc., even on your local machine, and especially if virtualization is used.
Database connection:
A fast database connection is crucial. If the database access is slow, TYPO3 can't be fast.
Especially due to Extbase, TYPO3 often queries much more data than really required, and more often than really required, because a lot of relations are resolved in the PHP layer instead of in the DB layer itself. Loading data structures like the root line can cause a lot of ping-pong between the PHP and the DB layer.
I can't give detailed advice on how to measure your DB connection; you'll have to ask your admin for that. What you can always do is test and compare with another DB from a completely different environment, for example with a crude timing script like the one below.
The speed of the database may also depend on the type of the database itself. Typically you use MySQL/MariaDB, which should be fast. It also depends on the factors mentioned above: memory, file access and routing.
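A crude, hedged sketch of such a comparison in Python, using the PyMySQL driver and placeholder credentials; it mostly measures connection and round-trip overhead rather than query complexity, which is often exactly the bottleneck:
    import time
    import pymysql

    # Placeholder credentials; use the same connection settings as your TYPO3 site,
    # then repeat the test against a database in a different environment.
    conn = pymysql.connect(host="127.0.0.1", user="typo3", password="secret",
                           database="typo3")

    start = time.time()
    with conn.cursor() as cursor:
        for _ in range(200):
            cursor.execute("SELECT 1")  # trivial query: measures round trips only
            cursor.fetchone()
    elapsed = time.time() - start

    print("200 round trips took %.3f s (%.2f ms each)"
          % (elapsed, elapsed / 200 * 1000))
    conn.close()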
Strategy:
Even without being an admin and knowing all the performance tools, you can always exchange parts of your system and check whether matters improve. With this approach you can localise the culprit without being an expert. Once you have spotted the culprit, Google may help you to get more information.
When it comes to a clean and performant setup of routing or virtualization, it's still best to ask an experienced admin.
Summary:
This is all in addition to what others have already pointed out.
What would be really helpful is a BE plugin that analyses and measures the environment. Maybe there are some out there that I don't know of.

Is memcache(d) necessary when using Cloudflare/Incapsula

If you need caching on your website to reduce database load, do you have to do it with memcache or memcached (in PHP, for example), or can you achieve this by using services like CloudFlare, Incapsula or others that do some caching for you?
Services like Cloudflare cache your HTML and/or assets like images and CSS files in a CDN, so that your entire server is hit less often. This is great for semi-static sites but may not be the best fit for highly dynamic sites.
Local caches like memcached just store any data in a way that's fast to access. You can use that to cache database queries and lower your database activity, but you can also use it to store pre-computed data that would be expensive to re-create all the time or whatever else you may want to store non-permanently in a fast-to-access way.
Both solutions solve different problems. You may use both together, or either, or neither. It really depends on where exactly your bottleneck is and which solution fits your problem better.
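For the local-cache side, the usual pattern is cache-aside: look in memcached first and fall back to the database only on a miss. A minimal sketch (shown in Python with the python-memcached client; load_article_from_db is a hypothetical stand-in for the expensive query):
    import memcache  # python-memcached client

    mc = memcache.Client(["127.0.0.1:11211"], debug=0)

    def load_article_from_db(article_id):
        # Hypothetical placeholder for the expensive database query.
        return "<html>article %s</html>" % article_id

    def get_article(article_id):
        key = "article:%s" % article_id
        html = mc.get(key)                    # first look in memcached
        if html is None:
            html = load_article_from_db(article_id)  # miss: hit the database
            mc.set(key, html, time=300)              # keep the result for 5 minutes
        return html

    print(get_article(42))
A CDN in front of the site and a cache-aside layer like this behind it address different layers and can happily coexist.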
I'm the CEO of CloudFlare, and I'd say: more (intelligent) caching is almost always a good thing. While we can significantly decrease the load coming to your web server, to get the best performance it's still extremely important to optimize your web application and its interaction with your database. To that end, memcache and other fast caching layers can play an important role, and I'd never discourage them.
PS - we work great with dynamic sites. 95%+ of our sites are highly dynamic web applications.

Is SQLite suitable for use as a read only cache on a web server?

I am currently building a high-traffic GIS system which uses Python on the web front end. The system is 99% read-only. In the interest of performance, I am considering using an externally generated cache of pre-generated, read-optimised GIS information and storing it in an SQLite database on each individual web server. In short, it's going to be used as a distributed read-only cache which doesn't have to hop over the network. The back-end OLTP store will be PostgreSQL, but that will handle less than 1% of the requests.
I have considered using Redis but the dataset is quite large and therefore it will push up the administrative cost and memory cost on the virtual machines this is being hosted on. Memcache is not suitable as it cannot do range queries.
Am I going to hit read-concurrency problems with SQLite doing this?
Is this a sensible approach?
OK, after much research and performance testing, SQLite is suitable for this. It has good read concurrency on static data; SQLite only becomes an issue if you are doing writes as well as heavy reads.
More information here:
http://www.sqlite.org/lockingv3.html
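A minimal sketch of that read-only usage with Python's built-in sqlite3 module; the file name, table and columns are hypothetical. Opening the database in read-only mode sidesteps write locking entirely:
    import sqlite3

    # Hypothetical pre-generated cache file deployed to each web server.
    conn = sqlite3.connect("file:gis_cache.db?mode=ro", uri=True,
                           check_same_thread=False)
    conn.execute("PRAGMA query_only = ON")  # belt and braces: refuse any writes

    def features_in_range(min_x, max_x):
        # The kind of range query memcached cannot do; assumes an index on x.
        cur = conn.execute(
            "SELECT id, x, y, payload FROM features WHERE x BETWEEN ? AND ?",
            (min_x, max_x))
        return cur.fetchall()

    print(len(features_in_range(10.0, 20.0)))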
If the use case is just a cache, why don't you use something like
http://memcached.org/?
You can find memcached bindings for Python in the PyPI repository.
Another option is to use materialized views in Postgres; this way you keep things simple and have everything in one place.
http://tech.jonathangardner.net/wiki/PostgreSQL/Materialized_Views
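As a hedged sketch of that route, assuming a recent PostgreSQL with native materialized views (the linked article describes the older, manually maintained approach) and hypothetical tables and connection details, psycopg2 usage might look like this:
    import psycopg2

    # Placeholder connection string for the OLTP PostgreSQL store.
    conn = psycopg2.connect("dbname=gis user=web password=secret host=db1")
    cur = conn.cursor()

    # Run once: a read-optimised snapshot of the expensive query.
    cur.execute("""
        CREATE MATERIALIZED VIEW features_flat AS
        SELECT id, x, y, payload
        FROM features JOIN layers USING (layer_id)
    """)
    conn.commit()

    # Refresh periodically (e.g. from cron) when the source data changes.
    cur.execute("REFRESH MATERIALIZED VIEW features_flat")
    conn.commit()

    # Reads hit the pre-computed view instead of the raw tables.
    cur.execute("SELECT id, payload FROM features_flat WHERE x BETWEEN %s AND %s",
                (10.0, 20.0))
    print(len(cur.fetchall()))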

Difference between Memcache, APC, XCache and other alternatives I've not heard of

At work, we've recently started designing an application to be "large scale" (we're engineering for the potential to serve up many millions of hits a day). One of the senior devs and the sysadmin have set up memcache on the server.
As I understand it, Memcache will hold query results and certain tables in memory for X amount of time and keep everything hunky dory.
A drawback of memcache, it seems, is that I just can't for the life of me manage to set it up in my local dev environment. I've followed a few different tutorials on how to compile it yourself. Most, if not all, of the steps seem to work properly, but I get this error when PHP loads:
[11-Sep-2010 16:02:30] PHP Warning: PHP Startup: Unable to load dynamic library '/Applications/MAMP/bin/php5.3/lib/php/extensions/no-debug-non-zts-20090626/memcached.so' - dlopen(/Applications/MAMP/bin/php5.3/lib/php/extensions/no-debug-non-zts-20090626/memcached.so, 9): image not found in Unknown on line 0
Not the primary question, but incidentally, if you've been able to compile Memcache for MAMP 1.9 on Snow Leopard, please let me know the trick.
My primary question is about what the differences are between the various web caching technologies. I've seen mention of Memcache, APC and Xcache (here: Cache results of a mysql query manually to a txt file) but don't know the pros, cons and differences between each.
To my mind, Memcache has the advantage of being the one that the project's lead dev and our sysadmin chose. It has the disadvantage of being utter foobar to try and set up and compile on a Mac. :-^)
I'd love to hear from anyone who can enumerate the pros and cons of each (or even one) of these caching technologies: where they are best used, how they are best used, and so on.
It's all useful information I think.
Thanks so much for lending your time to expanding my knowledge.
- Alex.
First, a list of opcode caches for PHP.
Second, Memcache/Memcached is not an opcode cache. It is a distributed memory caching system. It does not improve the speed/performance of your PHP code itself; it can be used to store data only.
APC, eAccelerator, XCache and the others are non-distributed, meaning you can only store data on the local web server. However, all of these are opcode caches and can improve the performance of your PHP app. Most of them, excluding eAccelerator (in its current version), can also store data.
I generally choose APC as the opcode cache (it reportedly will be included in the core of PHP 6). However, if I also have more than one web server for the site, I will also make use of Memcached.
Edit 1: I agree it is very annoying to set up APC and Memcache on MAMP. There are, however, tutorials out there dealing with this.
Edit 2: Which opcode cache is best for your app really depends on which server you are using. Some work better on some systems. It also depends on the size and scale of your app as to how the caches perform.
Edit 3: A very interesting article here compares the performance of a few different caches. (It appears to have been written in 2006 and should not really be used as a current reference.)
APC is an opcode cache. It stores the parsed (compiled) form of your PHP code so that your PHP files do not need to be parsed on every request.
Memcache is a data cache. It stores data as key-value pairs.

wanting a good memory + disk caching solution

I'm currently storing generated HTML pages in a memcached in-memory cache. This works great; however, I want to increase the storage capacity of the cache beyond the available memory. What I would really like is:
memcached semantics (i.e. not reliable, just a cache)
memcached api preferred (but not required)
large in-memory first level cache (MRU)
huge on-disk second level cache (main)
evicted from on-disk cache at maximum storage using LRU or LFU
proven implementation
In searching for a solution I've found the following candidates, but they all miss the mark in some way. Does anyone know of either:
other options that I haven't considered
a way to make memcachedb do evictions
Already considered are:
memcachedb
best fit but doesn't do evictions: explicitly "not a cache"
can't see any way to do evictions (either manual or automatic)
tugela cache
abandoned, no support
don't want to recommend it to customers
nmdb
doesn't use memcache api
new and unproven
don't want to recommend it to customers
Tokyo Cabinet/Tokyo Tyrant?
It seems that later versions of memcachedb can be cleaned up manually, if desired, by using the rget command and storing the expiry time in the data record. Of course, this means I pound both the server and the network with requests for the entire data block even though I only want the expiry time. Not the best solution, but seemingly the only one currently available.
I have worked with Ehcache and it works very well. It has an in-memory cache and disk storage with different eviction policies. It's a mature library with good support. There is also a memcached API that wraps Ehcache, developed specifically for GAE support.
Regards,
Jonathan.
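For what it's worth, the behaviour described in the question (small in-memory first level, large disk-backed second level, LRU eviction) is easy to sketch. The following is a hedged Python illustration of the idea only, not a production implementation and not memcached-protocol compatible:
    import hashlib
    import os
    import pickle
    from collections import OrderedDict

    class TwoLevelCache:
        def __init__(self, directory, mem_items=1000, disk_items=100000):
            self.mem = OrderedDict()      # first level: in-memory, LRU order
            self.mem_items = mem_items
            self.dir = directory          # second level: one file per entry
            self.disk_items = disk_items
            os.makedirs(directory, exist_ok=True)

        def _path(self, key):
            return os.path.join(self.dir, hashlib.sha1(key.encode()).hexdigest())

        def get(self, key):
            if key in self.mem:
                self.mem.move_to_end(key)     # mark as recently used
                return self.mem[key]
            path = self._path(key)
            if os.path.exists(path):
                with open(path, "rb") as f:
                    value = pickle.load(f)
                self._remember(key, value)    # promote to the memory level
                return value
            return None                        # miss: caller regenerates

        def set(self, key, value):
            self._remember(key, value)
            with open(self._path(key), "wb") as f:
                pickle.dump(value, f)
            self._evict_disk()

        def _remember(self, key, value):
            self.mem[key] = value
            self.mem.move_to_end(key)
            while len(self.mem) > self.mem_items:
                self.mem.popitem(last=False)  # evict least recently used

        def _evict_disk(self):
            names = [os.path.join(self.dir, n) for n in os.listdir(self.dir)]
            if len(names) > self.disk_items:
                # crude LRU: drop the files with the oldest modification time
                names.sort(key=os.path.getmtime)
                for path in names[: len(names) - self.disk_items]:
                    os.remove(path)

    cache = TwoLevelCache("/tmp/html_cache", mem_items=2, disk_items=10)
    cache.set("page:/home", "<html>home</html>")
    print(cache.get("page:/home"))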
