How to cache Zend Lucene search results in Code Igniter? - caching

I'm not sure if this is the best way to go about this, but my aim is to have pagination of my lucene search results.
I thought it would make sense to run the search, store all the results in the cache, and then have a page function on my results controller that could return any particular subset of results from the cached results.
Is this a bad approach? I've never used caching of any sort, so don't know where to begin. The CI Caching Driver looked promising, but everything throws a server error. I don't know if I need to install APC, or Memcached, or what to do.
Help!

Lucene is a search engine that is built for scale. You can push it pretty far till the need arises to cache the search results. I would suggest you use the default settings and run it.
If you still feel the need for cache, first look at this Lucene FAQ and then the next level would perhaps be something on the lines of memcache.
Hope it helps!

Zend Search Lucene is indexed on the file system and as the user above has stated, built for scale. Unless you are indexing hundreds of thousands of documents, then caching is not really necessary - especially since all you would effectively be doing is taking data from one file and storing it in another.
On the other hand, if you are only storing, say, product Id in your search index and then selecting the products from the database when you get a result, it's well worth caching. This can easily be achived by using Zend_Cache.
A basic example of Zend Db caching is here:
$frontendOptions = array(
'automatic_serialization' => true
);
$backendOptions = array(
'cache_dir' => YOUR_CACHE_PATH_ON_THE_FILE_SYSTEM,
'file_name_prefix' => 'my_cache_prefix',
);
$cache = Zend_Cache::factory('Core',
'File',
$frontendOptions,
$backendOptions
);
Zend_Db_Table_Abstract::setDefaultMetadataCache($cache);
This should be added to your bootstrap file in an _initDbCache (call it whatever you want) method.
Of course that is a very simple implementation and does not achieve full result caching, more information on Zend Caching with Zend Db can be found here.

Related

Umbraco 8 - Get Children Of Node Using ContentAtXPath() Method

I've been refactoring an existing Umbraco project to use more performant querying when getting back document data as everything was previously being returned using LINQ. I have been using a combination of Umbraco's querying via XPaths and Examine.
I am currently stumped on trying to get child documents using the Umbraco.ContentAtXPath() method. What I would like to do is get child document based on a path I parse to the method. This is what I have currently:
IEnumerable<IPublishedContent> umbracoPages = Umbraco.ContentAtXPath("//* [#isDoc]/descendant::/About [#isDoc]");
Running this returns a "Object reference not set to an instance of an object." error and unable to see exactly where I'm going wrong (new to this form of querying in Umbraco).
Ideally, I'd like to enhance the querying to also carry out sorting using the non-LINQ approach, as demonstrated here.
Up until Umbraco 8, content was cached in an XML file, which made XPath perfect for querying content efficiently. In v8, however, the so called "NuCache" is not file based nor XML based, so the XPath query support is only there for ... well... Old times sake, I guess? Either way it's probably not going to be super efficient and (I'd advise) not something to "aim for". That said I of course don't know what you are changing from (Linq can be a lot of things) :-/
It certainly depends on how big your dataset is.
As Umbraco has moved away from the XML backed cache, you should look into Linq queries against your content models. Make sure you use ModelsBuilder to generate the models.
On a small dataset Linq will be much quicker than examine. On a large dataset Examine/Lucene will be much more steady on performance.
Querying NuCache is pretty fast in Umbraco 8, only beaten by an Examine search.
Assuming you're using Models Builder, and your About page is a child of Home page, you could use:
var homePage = (HomePage) Model.Root();
var aboutPage = homePage?.Children<AboutPage>().FirstOrDefault();
var umbracoPages = aboutPage.Children();
Where HomePage is your home page Document Type Alias and AboutPage is your About page Document Type alias.

Increase performance of large tables from Plone search results

The traditional way of Zope to handle large search results is batched output: The first batchsize items are displayed, and to get the next chunk of data, you click on a "next" link to get the next chunk from the server.
Nowadays there are cool Javascript solutions which allow client-side sorting and filtering of tables, e.g. Datatables. These work fine; but if the table is large, and Zope generates the complete HTML, it sometimes takes a long time before the page loads (seems like the search is reasonably fast, but the TAL engine is the performance bottleneck).
So, how is this tackled best?
Generate the whole table from JSON? (needs Javascript for anything to work)
Use the standard paging, and replace it by a client-side table solution if Javascript is available?
provide the data of pages 2+ by JSON
provide all data by JSON
let the table engine load contents for next page or filtering
Is there some plug-in solution to apply such enhancements to standard views (like folder contents)?
I have a page which contains about 1600 items and takes 60s+ to load, which certainly needs to be improved ...
Any pointers and/or code snippets? Thank you!
You'll have to roll your own customizations. The new folder contents in plone 5 works only with json data; however, it does not return the whole folder resultset at a time--it still does paging server side. I've actually not encountered anyone the use-case of sorting thousands of results client-side. I usually do that server side and bring in the data with ajax.
You'll want to generate json data structures yourself from catalog results(brains). Something like this in a view perhaps:
import json
from Products.CMFCore.utils import getToolByName
catalog = getToolByName(context, 'portal_catalog')
query = {
'path': {
'query': '/'.join(context.getPhysicalPath()),
'depth': 1
}
}
result = []
for brain in catalog(**query):
result.append({
'id': brain.id,
'uid': brain.UID,
'title': brain.Title,
'url': brain.getURL()
})
return json.dumps(result)
Along with JavaScript. There are plenty of libraries to present json data in tables.
Things you can investigate if you're interested:
https://github.com/plone/plone.app.content/blob/master/plone/app/content/browser/vocabulary.py
https://github.com/plone/mockup/tree/master/mockup/patterns/structure

Kohana ORM Caching/Caching design approach

This questions is related to Kohana ORM AND Caching module. I use version 3.2 if it matters. I tried to research trust me, but I really couldn't find some good answer... so here it is:
What are the correct ways to use ORM::cached() and ORM::serialize() and ORM::$reload_on_wakeup?
I've seen many 2-line code examples but never anything really solid on the userguide/api...
What is the difference between enabling Cache module and 'caching' => true in Kohana::init?
Anyone has any recommended approach for the following specific situations? I have a catalogue page that upon profiling, I realized two very expensive actions:
I queried database each time for a currency model for each item, when the currency information can really be reused.
I queried database each time for each item's inventory item, this is an expensive query, which I wish I can cache until inventory level changes.
References that I found but couldn't answer fully my questions:
http://forum.kohanaframework.org/discussion/1782/tip-for-caching-orm-objects/p1
http://forum.kohanaframework.org/discussion/10600/does-kohana-orm-and-cache-work-together/p1
Just found your question, maybe too late, but maybe is useful to others:
cached, will force the Query builder to cache the DB query. It uses the KOhana:cache method (file cache) I am trying to find a workaround for this.
enables caching for the file search as says in the Kohana/Core.php file: Whether to use internal caching for [Kohana::find_file], does not apply to [Kohana::cache]. Set by [Kohana::init]
Enable caching true to speed up the file search, and enable the cache module, I am working on a way to cache the queries of DB using the instance used by the module. That would be better than using the file cache. Maybe I am missing something but stuck there right now.

Doctrine2 Caching of updated Elements

I have a problem with doctrine. I like the caching, but if i update an Entity and flush, shouldn't doctrine2 be able to clear it's cache?
Otherwise the cache is of very little use to me since this project has a lot of interaction and i would literally always have to disable the cache for every query.
The users wouldn't see their interaction if the cache would always show them the old, cached version.
Is there a way arround it?
Are you talking about saving and fetching a new Entity within the same runtime (request)? If so then you need to refresh the entity.
$entity = new Entity();
$em->persist($entity);
$em->flush();
$em->refresh($entity);
If the entity is managed and you make changes, these will be applied to Entity object but only persisted to your database when calling $em->flush().
If your cache is returning an old dataset for a fresh request (despite it being updated successfully in the DB) then it sounds like you've discovered a bug. Which you can file here >> http://www.doctrine-project.org/jira/secure/Dashboard.jspa
Doctrine2 never has those delete methods such as deleteByPrefix, which was in Doctrine1 at some point (3 years ago) and was removed because it caused more trouble.
The page http://docs.doctrine-project.org/projects/doctrine-orm/en/latest/reference/caching.html#deleting is outdated (The next version of the doctrine2 document will see those methods removed). The only thing you can do now is manually managing the cache: find the id and delete it manually after each update.
More advanced doctrine caching is WIP: https://github.com/doctrine/doctrine2/pull/580.
This is according to the documentation on Doctrine2 on how to clear the cache. I'm not even sure this is what you want, but I guess it is something to try.
Doctrine2's cache driver has different levels of deleting cached entries.
You can delete by the direct id, using a regex, by suffix, by prefix and plain deleting all values in the cache
So to delete all you'd do:
$deleted = $cacheDriver->deleteAll();
And to delete by prefix, you'd do:
$deleted = $cacheDriver->deleteByPrefix('users_');
I'm not sure how Doctrine2 names their cache ids though, so you'd have to dig for that.
Information on deleting cache is found here: http://docs.doctrine-project.org/projects/doctrine-orm/en/latest/reference/caching.html#deleting
To get the cache driver, you can do the following. It wasn't described in the docs, so I just traced through the code a little.
I'm assuming you have an entity manager instance in this example:
$config = $em->getConfiguration(); //Get an instance of the configuration
$queryCacheDriver = $config->getQueryCacheImpl(); //Gets Query Cache Driver
$metadataCacheDriver = $config->getMetadataCacheImpl(); //You probably don't need this one unless the schema changed
Alternatively, I guess you could save the cacheDriver instance in some kind of Registry class and retrieve it that way. But depends on your preference. Personally I try not to depend on Registries too much.
Another thing you can do is tell the query you're executing to not use the result cache. Again I don't think this is what you want, but just throwing it out there. Mainly it seems you might as well turn off the query cache altogether. That is unless it's only a few specific queries where you don't want to use the cache.
This example is from the docs: http://docs.doctrine-project.org/projects/doctrine-orm/en/latest/reference/caching.html#result-cache
$query = $em->createQuery('select u from \Entities\User u');
$query->useResultCache(false); //Don't use query cache on this query
$results = $query->getResult();

How to cache a collection in Magento?

I have a collection that takes significant time to load. What I would like is to cache it (APC, Memcache). It is not possible to cache the entire object (as it cannot be unserialized and it is over 1 MB). I'm thinking that caching the collection data ($col->getData() ) is the way to go, but I found no way to rebuild the object based on this array. Any clues?
Collections already have some caching built in but they need a little prompting so put this in the constructor of a collection:
$cache = Mage::app()->getCacheInstance();
$prefix = "SomeUniqueValue";
$this->initCache($cache, $prefix, array(Mage_Catalog_Model_Product::CACHE_TAG));
Choose tags appropriate to the content of the collection so that it will be flushed automatically. This way builds an ID based on the query being executed, it is most useful when the collection is filtered, ordered or paged - it avoids a version conflict.
Generally this hardly gets used because when you retrieve data you almost always end up displaying it, probably as HTML, so it makes sense to cache the output instead. Block caching is widely used and better documented.
I really don't know, but I searched for all the files that have the word "cache" in them with file names of "Collection.php" and got a few results. The most promising example to look at might be Mage_Sales_Model_Entity_Quote_Item_Collection (_getProductCollection() method). Looks like Varien_Data_Collection (which is a parent class of any magento collection) has a few cache-related methods: initCache() and _getCacheInstance().
Can't say I have used them before but might be useful someday.
Good luck.
You can get more information here: Can I use Magento's Caching layer as a Key/Value Store?
I'll be posting more info there as I find it.

Resources