How to customize a Plone 4 collection to sort by multiple fields

I'm building a Plone 4.1 based site and am trying to find the best way to either sort a collection by multiple sort criteria, or at least customize a collection portlet to do so for the front page of the site. I believe the portlet uses the collection's sort settings unless you choose random. Here is the section of code that produces the standard results in the portlet:
    def _standard_results(self):
        results = []
        collection = self.collection()
        if collection is not None:
            limit = self.data.limit
            if limit and limit > 0:
                # pass on batching hints to the catalog
                results = collection.queryCatalog(batch=True, b_size=limit)
                results = results._sequence
            else:
                results = collection.queryCatalog()
            if limit and limit > 0:
                results = results[:limit]
        return results
For example, I would like to be able to sort by Expiration Date if present and otherwise fall back to the Creation Date. Or sort by tags and then Creation Date. Any feedback on the best approach to this would be appreciated.

As Ross said, you'll need AdvancedQuery to sort on multiple criteria.
If you just need this for the front page, I'd suggest creating a custom portlet based on the collection portlet.
Where the collection portlet calls collection.queryCatalog(), you'll want to add some additional logic for your sorting:
>>> uids = [brain.UID for brain in collection.queryCatalog()]
>>> query = AdvancedQuery.In('UID', uids)
>>> results = catalog.evalAdvancedQuery(query, (('sortable_title', 'asc'), ('date', 'desc')))
Then you can use these results in place of the results in your code sample above.

As in this answer, multiple sort is only available through AdvancedQuery, and there's no integration of AdvancedQuery into collections that I'm aware of. So basically this isn't possible unless you integrate AdvancedQuery into collections yourself, which would be a non-trivial task.
A hackish workaround might be to use plone.indexer to write an indexer that returns the right sort value according to your logic, create a new FieldIndex in the catalog (profiles/default/catalog.xml), register that new index as valid for sort criterion in profiles/default/portal_atct.xml, then use that as your sort index.
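The sort logic such an indexer would compute, plus a multi-key sort done in plain Python, can be sketched as follows. This is not Plone API: the dicts stand in for catalog results, and the field names (expires, created, tag) and the sort_value helper are hypothetical.

```python
from datetime import datetime


def sort_value(item):
    """Return the expiration date when set, otherwise the creation date."""
    expires = item.get("expires")
    return expires if expires is not None else item["created"]


# Hypothetical stand-ins for catalog results.
items = [
    {"id": "a", "tag": "news", "created": datetime(2012, 1, 10), "expires": None},
    {"id": "b", "tag": "events", "created": datetime(2012, 1, 1),
     "expires": datetime(2012, 3, 1)},
    {"id": "c", "tag": "news", "created": datetime(2012, 2, 5), "expires": None},
]

# Single computed key: expiration date if present, else creation date.
by_fallback = [i["id"] for i in sorted(items, key=sort_value)]

# Two criteria: sort by the secondary key first, then by the primary key.
# Python's sort is stable, so ties on "tag" keep the newest-first order.
newest_first = sorted(items, key=lambda i: i["created"], reverse=True)
by_tag_then_newest = [i["id"] for i in sorted(newest_first, key=lambda i: i["tag"])]

print(by_fallback)
print(by_tag_then_newest)
```

Sorting in Python like this only makes sense for small result sets such as a front page portlet, since every result has to be loaded first.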


Django haystack narrow with OR operator between fields

I do a search. I narrow by field A. I narrow by field B. I get results that include burlap AND sack. What I want is to get results that include burlap OR sack.
sqs = sqs.narrow(fieldA='burlap')
sqs = sqs.narrow(fieldB='sack')
You can do some level of OR narrowing with the following:
sqs = sqs.narrow(fieldA=('burlap' or 'tweed' or 'plastic'))
sqs = sqs.narrow(fieldB='sack')
But you still end up with results with burlap AND sack. An alternative to this method is the following, but it is not ideal since it seems to be slow on large data sets:
sqs = sqs.filter_or(fieldA='burlap')
sqs = sqs.filter_or(fieldB='sack')
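One thing worth checking about the narrow(fieldA=('burlap' or 'tweed' or 'plastic')) call above: Python evaluates the parenthesized expression before the method is called, and `or` between non-empty strings returns the first one. A minimal sketch, using a hypothetical show_kwargs function in place of the real narrow():

```python
def show_kwargs(**kwargs):
    """Hypothetical stand-in that returns the keyword arguments it received."""
    return kwargs


# The `or` chain is evaluated by Python first; non-empty strings are truthy,
# so the expression short-circuits to the first value.
value = ('burlap' or 'tweed' or 'plastic')
received = show_kwargs(fieldA=('burlap' or 'tweed' or 'plastic'))

print(value)
print(received)
```

So the callee only ever sees 'burlap', and no OR is expressed to the search backend by that syntax.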
Where is Daniel Lindsay when you need him?
YMMV -- the docs (http://django-haystack.readthedocs.org/en/latest/searchqueryset_api.html#narrow) point out that this method is not portable between backends and that the syntax depends on the backend. The example in that section even has a lucene looking "SearchQuerySet().narrow('title:smoothie')" example.
In the source it looks like haystack pretty trustingly passes whatever you give as your narrow argument to the backend. You didn't say what backend you are using, but maybe something like this would get you the fq you want in Solr:
sqs = sqs.narrow('fieldA:burlap OR fieldB:sack')
filter_or is a different animal than narrow, at least with Solr. filter_or adds that clause to the main query, resulting in a different set of results, different scoring, etc. narrow creates a filter query instead, which is used to filter your original results (shocking, right?) and can be cached, which can help performance if you're going to be using that filter a lot.
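If the field/value pairs are only known at runtime, the string passed to narrow can be assembled first. A minimal sketch, assuming a Lucene-style backend syntax as in the example above; build_or_narrow is a hypothetical helper and does no escaping of special characters:

```python
def build_or_narrow(terms):
    """Join (field, value) pairs into a Lucene-style OR filter string."""
    return " OR ".join(f"{field}:{value}" for field, value in terms)


narrow_query = build_or_narrow([("fieldA", "burlap"), ("fieldB", "sack")])
print(narrow_query)

# The result would then be handed to haystack as in the answer above:
# sqs = sqs.narrow(narrow_query)
```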
D'oh, I typed all that stuff and still don't know where Daniel Lindsay is.

'Maximum number of expressions in a list is 1000' error with Grails and Oracle

I'm using Grails with an Oracle database. Most of the data in my application is part of a hierarchy that goes something like this (each item containing the following one):
Direction
    Group
        Building site
            Contract
                Inspection
                    Non-conformity
Data visible to a user is filtered according to his accesses which can be at the Direction, Group or Building Site level depending on user role.
We easily accomplished this by creating a listWithSecurity method for the BuildingSite domain class which we use instead of list across most of the system. We created another listWithSecurity method for Contract. It basically does a Contract.findAllByContractIn(BuildingSite.listWithSecurity). And so on with the other classes. This has the advantage of keeping all the actual access logic in BuildingSite.listWithSecurity.
The problem came when we started getting real data in the system. We quickly hit the "ORA-01795: maximum number of expressions in a list is 1000" error. Fair enough, passing a list of over 1000 literals is not the most efficient thing to do, so I tried other ways even though it meant I would have to move the security logic into each controller.
The obvious way seemed to use a criteria such as this (I only put the Direction level access here for simplicity):
def c = NonConformity.createCriteria()
// the closure must start on the same line as the call to be passed as its argument
def listToReturn = c.list(max: params.max, offset: params.offset?.toInteger() ?: 0) {
    inspection {
        contract {
            buildingSite {
                group {
                    'in'("direction", listOfOneOrTwoDirections)
                }
            }
        }
    }
}
I was expecting Grails to generate a single query with joins that would avoid the ORA-01795 error, but it seems to be running a separate query for each level and passing the result back to Oracle as literals in an 'in' clause to query the next level. In other words, it does exactly what I was doing, so I get the same error.
Actually, it might be optimising a bit. It seems to be solving the problem but only for one level. In the previous example, I wouldn't get an error for 1001 inspections but I would get it for 1001 contracts or building sites.
I also tried to do basically the same thing with findAll and a single HQL where statement to which I passed a single direction to get the nonConformities in one query. Same thing. It solves the first levels but I get the same error for other levels.
I did manage to patch it by splitting my 'in' criteria into many 'in' clauses inside an 'or' so that no single list of literals is more than 1000 long, but that's profoundly ugly code. A single findAllBy[…]In becomes over 10 lines of code. And in the long run, it will probably cause performance problems since we're stuck doing queries with a very large number of parameters.
Has anyone encountered and solved this problem in a more elegant and efficient way?
This won't win any efficiency awards, but I thought I'd post it as an option if you just plainly need to query a list of more than 1000 items and none of the more efficient options are available/appropriate. (This Stack Overflow question is at the top of Google search results for "grails oracle 1000".)
In a grails criteria you can make use of Groovy's collate() method to break up your list...
Instead of this:
def result = MyDomain.createCriteria().list {
    'in'('id', idList)
}
...which throws this exception:
could not execute query
org.hibernate.exception.SQLGrammarException: could not execute query
    at grails.orm.HibernateCriteriaBuilder.invokeMethod(HibernateCriteriaBuilder.java:1616)
    at TempIntegrationSpec.oracle 1000 expression max in a list(TempIntegrationSpec.groovy:21)
Caused by: java.sql.SQLSyntaxErrorException: ORA-01795: maximum number of expressions in a list is 1000
    at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:440)
You'll end up with something like this:
def result = MyDomain.createCriteria().list {
    or { idList.collate(1000).each { 'in'('id', it) } }
}
It's unfortunate that Hibernate or Grails doesn't do this for you behind the scenes when you try to do an inList of > 1000 items and you're using an Oracle dialect.
I agree with the many discussions on this topic of refactoring your design to not end up with 1000+ item lists but regardless, the above code will do the job.
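For readers unfamiliar with collate(), the chunking step it performs can be sketched in Python. The collate function below is a hypothetical re-implementation for illustration, not the Groovy method itself:

```python
def collate(items, size):
    """Split a list into consecutive chunks of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


id_list = list(range(2500))
chunks = collate(id_list, 1000)
sizes = [len(chunk) for chunk in chunks]
print(sizes)

# Each chunk stays within Oracle's 1000-expression limit; the query then
# ORs together one IN clause per chunk, as in the criteria above.
```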
Along the same lines as Juergen's comment, I've approached a similar problem by creating a DB view that flattens out user/role access rules at their most granular level (Building Site in your case?). At a minimum, this view might contain just two columns: a Building Site ID and a user/group name. So, in the case where a user has Direction-level access, he/she would have many rows in the security view: one row for each child Building Site of the Direction(s) that the user is permitted to access.
Then, it would be a matter of creating a read-only GORM class that maps to your security view, joining this to your other domain classes, and filtering using the view's user/role field. With any luck, you'll be able to do this entirely in GORM (a few tips here: http://grails.1312388.n4.nabble.com/Grails-Domain-Class-and-Database-View-td3681188.html)
You might, however, need to have some fun with Hibernate: http://grails.org/doc/latest/guide/hibernate.html
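To make the shape of that flattened security view concrete, here is a small in-memory sketch in Python. All data and names (site_to_direction, direction_access, visible_sites) are hypothetical; in practice the rows would come from the database view rather than be computed in application code:

```python
# Hypothetical hierarchy: which Direction each Building Site belongs to.
site_to_direction = {"site-1": "north", "site-2": "north", "site-3": "south"}

# Hypothetical Direction-level access granted to each user.
direction_access = {"alice": {"north"}, "bob": {"south"}}

# The flattened "view": one (building site, user) row per site the user can see.
security_rows = sorted(
    (site, user)
    for user, directions in direction_access.items()
    for site, direction in site_to_direction.items()
    if direction in directions
)


def visible_sites(user):
    """Filter the flattened rows by user, as a join on the view would."""
    return {site for site, row_user in security_rows if row_user == user}


print(security_rows)
print(visible_sites("alice"))
```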

How to get the collection based upon a wildcard Redis key using the redis-rb gem?

The Redis objects are created using the redis-rb gem:
$redis = Redis.new
$redis.sadd("work:the-first-task", 1)
$redis.sadd("work:another-task", 2)
$redis.sadd("work:yet-another-task", 3)
Is there any method to get the collection that has "work:*" keys?
Actually, if you just want to build a collection on Redis, you only need one key.
The example you provided builds 3 distinct collections, each of them with a single item. This is probably not what you wanted to do. The example could be rewritten as:
$redis = Redis.new
$redis.sadd("work", "the-first-task|1")
$redis.sadd("work", "another-task|2")
$redis.sadd("work", "yet-another-task|3")
To retrieve all the items of this collection, use the following code:
x = $redis.smembers("work")
If you need to keep track of the order of the items in your collection, it would be better to use a list instead of a set.
In any case, usage of the KEYS command should be restricted to tooling/debug code only. It is not meant to be used in a real application because of its linear complexity.
If you really need to build several collections, and retrieve items from all these collections, the best way is probably to introduce a new "catalog" collection to keep track of the keys corresponding to these collections.
For instance:
$redis = Redis.new
$redis.sadd("catalog:work", "work:the-first-task" )
$redis.sadd("catalog:work", "work:another-task" )
$redis.sadd("work:the-first-task", 1)
$redis.sadd("work:the-first-task", 2)
$redis.sadd("work:another-task", 3)
$redis.sadd("work:another-task", 4)
To efficiently retrieve all the items:
keys = $redis.smembers("catalog:work")
res = $redis.pipelined do
  keys.each do |x|
    $redis.smembers(x)
  end
end
res.flatten!(1)
The idea is to perform a first query to get the content of catalog:work, and then iterate on the result using pipelining to fetch all the data. I'm not a Ruby user, so there is probably a more idiomatic way to implement it.
Another, simpler option can be used if the number of collections you want to retrieve is limited, and if you do not care about the ownership of the items (that is, which set each item is stored in):
keys = $redis.smembers("catalog:work")
res = $redis.sunion(*keys)
Here the SUNION command is used to build a set resulting from the union of all the sets you are interested in. It also filters out the duplicates in the result (this was not done in the previous example).
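The difference between the two retrieval strategies can be modelled without a Redis server. The sketch below uses plain Python sets in a dict as a hypothetical stand-in for the Redis keyspace; the member values are chosen so that one item appears in both sets:

```python
# Hypothetical in-memory model of the keyspace: key -> set of members.
store = {
    "catalog:work": {"work:the-first-task", "work:another-task"},
    "work:the-first-task": {1, 2},
    "work:another-task": {2, 3},
}

# Step 1: read the catalog to learn which sets exist (SMEMBERS catalog:work).
keys = sorted(store["catalog:work"])

# Step 2a: fetch each set and flatten, like the pipelined SMEMBERS calls.
per_key = [store[key] for key in keys]
flattened = sorted(item for members in per_key for item in members)

# Step 2b: take the union instead, like SUNION, which drops duplicates.
union = set().union(*per_key)

print(flattened)
print(sorted(union))
```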
Well, I could get it by $redis.keys("work:*").

Retrieve list of mongo documents by ids preserving order

What is the best way to retrieve a list of MongoDB documents using Mongoid, in the order specified in the list?
My current solution is:
docs = Doc.where(:_id.in => ids).sort { |x, y| ids.index(x.id) <=> ids.index(y.id) }
It seems there should be a better solution for this using mongoid query interface. Any ideas?
If the number of ids is small you might get away with this (no need to sort it though):
docs = ids.map { |id| Doc.find(id) }
The drawback is of course that it will still go to the database for every document.
The closest method I could find is Doc.criteria.for_ids(ids), but it does not honor the order of the ids and returns each document only once, even if an id is repeated. See this question.
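An alternative that keeps a single query is to fetch the documents in any order and then reorder them in memory with a lookup table, which avoids the repeated ids.index scans of the sort-based version. A language-neutral sketch in Python, where plain dicts stand in for the fetched documents and in_requested_order is a hypothetical helper:

```python
def in_requested_order(ids, docs):
    """Reorder fetched docs to match `ids`, skipping ids that were not found."""
    by_id = {doc["_id"]: doc for doc in docs}
    return [by_id[doc_id] for doc_id in ids if doc_id in by_id]


ids = [3, 1, 2, 99]
# Hypothetical result of one "where _id in ids" query, in arbitrary order.
fetched = [{"_id": 1, "name": "one"}, {"_id": 2, "name": "two"},
           {"_id": 3, "name": "three"}]

ordered = in_requested_order(ids, fetched)
print([doc["name"] for doc in ordered])
```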

Query a Django queryset without creating a new queryset?

Not even sure I'm stating the question correctly. Here's the situation. I have a queryset generated by accessing the foreign key relationship. Using the standard Blog/Entry models in the Django documentation, let's say I have selected a blog and now have a set of entries:
entries = Blog.objects.get(id=1).entry_set.all()
So we have some entries for the blog, possibly zero. I'd like to then say construct a calendar and indicate which days have blog entries. So my thinking is to iterate over list of days in the month or whatever, and check the entries queryset for an entry with that date. Question is, what is the best way to do this? My first thought was to do something like
dayinfo = [] # we will iterate over this in the template
for curday in month:
    dayinfo.append({'day':curday, 'entry':entries.filter(day=curday)})
Problem is that the filter call returns a new queryset, and that generates a new sql call for each loop iteration. I just need to pluck the entry object from entries if it exists and stick it into my calendar. So what is the best way to do this? I did get this to work:
dayinfo.append({'day':curday, 'entry':[e for e in entries if e.day == curday][0]})
That does not generate new sql calls. But it sure seems ugly.
Resist the urge to put everything on one line - I think the code would be cleaner with something like this:
from collections import defaultdict
calendar = defaultdict(list)
for entry in entries:
    calendar[entry.day].append(entry)
The defaultdict part is simple but you might want to initialize it with all of the days in a month if you're just planning to use a for loop in the template. Also note that if you're using Django 1.1 you can use the new annotate() method to simply calculate the post count if you're not actually planning to generate links to individual posts.
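Putting the grouping together with a full month of days might look like the sketch below. The Entry namedtuple and the sample dates are hypothetical stand-ins for the Django model instances; with a real queryset, iterating over entries once triggers a single SQL query:

```python
from collections import defaultdict, namedtuple
from datetime import date, timedelta

# Hypothetical stand-in for the Entry model: only the fields used here.
Entry = namedtuple("Entry", ["title", "day"])
entries = [
    Entry("first", date(2009, 8, 3)),
    Entry("second", date(2009, 8, 3)),
    Entry("third", date(2009, 8, 5)),
]

# Group the entries by day in one pass.
calendar = defaultdict(list)
for entry in entries:
    calendar[entry.day].append(entry)

# Build one row per day of the month, with an empty list for days without entries.
month = [date(2009, 8, 1) + timedelta(days=n) for n in range(31)]
dayinfo = [{"day": day, "entries": calendar.get(day, [])} for day in month]

print([(row["day"].day, len(row["entries"])) for row in dayinfo if row["entries"]])
```

Using calendar.get(day, []) rather than calendar[day] avoids adding empty keys to the defaultdict while the template data is being built.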
