Search in default_collection minus a specific collection - google-search-appliance

In our GSA index of 500K documents half of the documents are coming from an internal bug tracking system.
We have been hearing some power users complain about results from the bug tracking system pushing down other useful results from many other sources.
We discussed using result biasing to lower the importance of bug tracking documents, but I am not keen on this approach because I believe we should let the GSA do its magic and decide on the relevancy of the results.
Instead, what I want to offer users is a UI (a checkbox for each collection) where they can pick which collections they want to search.
My non-default collections do not cover everything that is under the default_collection. So when a user checks each and every checkbox, they may think they are searching everything in the index when they are not.
Because of this I want the checkboxes to behave as exclude rather than include (i.e. check to exclude this collection).
Finally, my question: is there a way to search in the default collection but filter out results that belong to a specific collection (the bug tracking collection)?
When you want to use multiple collections you do &site=col1|col2|col3..
What I am after is something like &site=default_collection-col1 (that's a minus in between).
Is there a way to do this?
Any alternative approaches to this problem?

Personally, I would rethink the design of your collections and build more modular collections that you can include. That way, as you mentioned, you can OR them together in the site parameter.
http://www.google.com/support/enterprise/static/gsa/docs/admin/70/gsa_doc_set/xml_reference/request_format.html#1076953
A less ideal but more specific solution to your problem is to exclude by URL in your search query. Be aware that the exclusion can appear in the search box on the results page and looks ugly, but this can be fixed with a simple XSLT change.
To exclude results for a specific site, see http://www.google.com/support/enterprise/static/gsa/docs/admin/70/gsa_doc_set/xml_reference/request_format.html#1076964. I would use this sparingly and opt for a better design of the collections.
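As a rough sketch of the exclude-by-site idea, the snippet below builds a search URL that keeps site=default_collection and asks the appliance to drop one host. It assumes the as_sitesearch and as_dt request parameters behave as described in the GSA search protocol reference (as_dt=e meaning "exclude"); the host names and the buildExcludeSiteUrl helper are made up for illustration, so check the result against your own appliance.

```javascript
// Hypothetical appliance URL; replace with your own GSA host.
var GSA_SEARCH_URL = "http://gsa.example.com/search";

// Build a search URL that searches default_collection but excludes one host.
// Assumption: as_sitesearch names the host (or directory) to act on and
// as_dt=e tells the appliance to exclude it rather than include it.
function buildExcludeSiteUrl(query, excludedHost) {
  var params = new URLSearchParams({
    q: query,
    site: "default_collection",
    as_sitesearch: excludedHost,
    as_dt: "e"
  });
  return GSA_SEARCH_URL + "?" + params.toString();
}

// "bugs.example.com" stands in for the bug tracker's host.
console.log(buildExcludeSiteUrl("login timeout", "bugs.example.com"));
```

Because the exclusion travels in separate parameters rather than in q, it should not show up in the search box, but that is worth confirming with your front end's XSLT.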

By far the best way to do this is in your collection config. Just create a new collection that has the same include pattern as your default collection and add the pattern from your bug tracking collection as an exclude pattern.
There's no way to do what you're asking purely with query parameters, unless you list every individual collection except the one you want, separated by '|', and then you're likely to run into URL length issues.
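For completeness, here is a minimal sketch of the "list everything except the excluded collections" workaround, as it might look behind the exclude checkboxes. The collection names and both helper functions are hypothetical. Note that it only covers documents that belong to at least one named collection, so anything that lives only in default_collection is lost, which is the gap the question describes.

```javascript
// Hypothetical list of every named collection on the appliance.
var ALL_COLLECTIONS = ["wiki", "intranet", "bugtracker", "docs"];

// Join every collection that is NOT excluded with "|", the OR separator
// accepted by the site parameter.
function buildSiteParam(allCollections, excluded) {
  var skip = new Set(excluded);
  return allCollections
    .filter(function (name) { return !skip.has(name); })
    .join("|");
}

// Build the full search URL; URLSearchParams percent-encodes "|" as %7C.
function buildSearchUrl(baseUrl, query, excluded) {
  var params = new URLSearchParams({
    q: query,
    site: buildSiteParam(ALL_COLLECTIONS, excluded)
  });
  return baseUrl + "?" + params.toString();
}

console.log(buildSearchUrl("http://gsa.example.com/search", "printer error", ["bugtracker"]));
```

With many collections the site value grows quickly, which is where the URL length concern comes from; a dedicated collection with an exclude pattern avoids that entirely.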

Update your front end to exclude the URL patterns used by the bug tracking collection.
Check this URL on your box:
http://yourGSAEnterpriseCcontroller:8000/EnterpriseController/serve_remove.html

Related

Exclude objects from all collections/queries unless explicitly asked for

I'll have some items in a model's database table that, more often than not, I won't want to include in queries for that model. So, rather than writing a query to exclude these items everywhere I call for the model, either directly or via a relationship, it would be nice to tell Laravel 'in one place' to exclude these items from all collections. The criterion for excluding will be a column value.
Perhaps somewhere in the model I can put this criteria?
Ideally the solution will also provide a way to easily explicitly re-include those excluded items in collections, at the point of querying.
Laravel's model scopes are almost there, but I need it the other way around. Perhaps scopes will solve the second part of my quest (in the paragraph above this one).
I found the answer: Global Scopes. https://laravel.com/docs/8.x/eloquent#global-scopes
I was previously looking at an older Laravel version's doc, which didn't have Global Scopes.

Is there a way to sort a content query by the value of a field programmatically?

I'm working on a portal based on Orchard CMS. We're using Orchard to manage the "normal" content of the site, as well as to model what's essentially data for a small application embedded in it.
We figured that doing it that way is "recommended" for working in Orchard, and that it would save us duplicating a bunch of effort in features that Orchard already provides, mainly generating a good enough admin UI. This is also why we're using fields wherever possible.
However, for said application, the client wants to be able to display the data in the regular UI in a garden-variety datagrid that can be filtered, sorted, and paged.
I first tried to implement this by cobbling together a page with a bunch of form elements for the filtering, above a projection with filters bound to query string parameters. However, I ran into the following issues with this approach:
Filters for numeric fields crash when the value is missing - as would be pretty common to indicate that the given field shouldn't be considered when filtering. (This I could achieve by changing the implementation in the Orchard source, which would however make upgrading trickier later. I'd prefer to keep anything I haven't written untouched.)
It seems the sort order can only be defined in the administration UI. It doesn't appear to support tokens, so the field to sort by cannot be changed when querying.
So I decided to dump that approach and switched to trying to do this with just MVC controllers that access data using IContentQuery. However, there I found out that:
I have no clue how, if at all, it's possible to sort the query based on field values.
Or, for that matter, how / if I can filter.
I did take a look at the code of Orchard.Projections, however, how it handles sorting is pretty inscrutable to me, and there doesn't seem to be a straightforward way to change the sort order for just one query either.
So, is there any way to achieve what I need here with the rest of the setup (which isn't little) unchanged, or am I in a trap here, and I'll have to move every single property I wish to use for sorting / filtering into a content part and code the admin UI myself? (Or do something ludicrous, like create one query for every sortable property and direction.)
EDIT: Another thought I had was having my custom content part duplicate the fields that are displayed in the datagrids into Hibernate-backed properties accessible to query code, and whenever the content item is updated, copy values from these fields into the properties before saving. However, again, I'm not sure if this is feasible, and how I would be able to modify a content item just before it's saved on update.
Right, so I have actually done something similar. I ended up going down both routes. First I created some custom filters for projections so I could manage filters on the frontend. It turned out pretty cool, but in the end projections lacked the raw querying power I needed: I had to filter and sort based on joins to aggregated tables, and I didn't know how to do that in projections, or whether the way they build queries would even allow it. I then decided to move all my data into a record so I could query and filter it. This felt like the right way to go about it, since if I was building a UI to filter records, it made sense that those records should be defined in code. However, I was sorting on users, where each site had different registration data associated with users, and (I think the following is a terrible affliction many Orchard devs suffer from) I wanted to build a reusable, modular system so I wouldn't have to change anything, ever!
Didn't really work out quite like I hoped, but to eventually answer the question in your title: yes, you can query fields. Orchard projections builds an index that it uses for querying fields. You can access these index records in HQL, get the ids of the content items, then call GetMany to get them all. I did this several years ago and I can't remember much, but I do remember having a distinctly unenjoyable time with it haha. So once you have an NHibernate session you can write your HQL:
select distinct civr.Id
from Orchard.ContentManagement.Records.ContentItemVersionRecord civr
join civr.ContentItemRecord cir
join cir.FieldIndexPartRecord fipr
join fipr.StringFieldIndexRecord sfir
This just shows you how to join to the field indexes. There are several, one for each data type; the string one is what I'm joining here. They are all basically the same, with a PropertyName and a Value field. HQL allows you to add conditions to your join, so we can use that to join only the relevant field index records. If you have a field called Group attached directly to your content type, it would look like this:
join fipr.StringFieldIndexRecord sfir
with sfir.PropertyName = 'MyContentType.Group.'
where sfir.Value = 'HR'
If your field is attached to a part, replace MyContentType with the name of your part. HQL is pretty awesome, and you can learn more here: https://docs.jboss.org/hibernate/orm/3.3/reference/en/html/queryhql.html But I dunno, it gave me a headache haha. At least HQL has documentation though, unlike Orchard's query layer. You can also always fall back to pure SQL when HQL won't do what you want; there is an option to write SQL queries from the NHibernate session.
Your other option is to index your content types with Lucene (easy if you are using fields), then filter and search with that. I quite liked using that, although sometimes indexes get corrupted or need to be rebuilt, so I've found it dangerous to rely on for something that populates pages regularly.
And pretty much whatever you do, you should accept that the way to go is one query to filter and sort, then another query calling GetMany on the content manager to get the content items. Good luck!
You can use indexing and the Orchard Search API for this. Sebastien demoed something similar to what you're trying to achieve at Orchard Harvest recently: https://www.youtube.com/watch?v=7v5qSR4g7E0

XPages: can i filter a view to show only entries that belong to a group?

I have a view in an XPage with some entries (let's say clients). I have an ACL group of persons (clients) that contains some of the clients in the view. Now I want to use the search attribute of the view to show only entries that belong to the group.
I already use the search attribute to select users by name, e.g.:
FIELD Name Contains "Chuck Norris"
Is there a similar query for group membership? (Maybe using @IsMember on the field?)
UPDATE: I will also have the group entries (client names) in a text list in a document. So can I filter the "name" field of the view based on the values of a text list?
Perhaps using a reader field is a good idea. You're talking about restricting document access to a group of Domino users - that's exactly what reader fields are for.
For example, make your text list field containing client names into a reader field like this:
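// Fetch the text list item that holds the client names
var item = document1.getFirstItem("myfield");
// Flag it as a Readers item so only the listed names can see the document
item.setReaders(true);
document1.save();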
myfield needs to contain canonical names (CN=firstname lastname/O=organisation).
Using reader fields, you don't need to do any view filtering at all; it happens automatically. If you have a very large number of documents (say, half a million or so), it could slow things down; otherwise, it's a nice approach.
Reader fields are no solution, though, if you only want to restrict which documents are displayed in one particular view. In that case, you need to do the view filtering yourself, as you tried.
If you want to filter for only ONE client, then a categorized view is the way to go. You can then give the view panel the name of that client as its category filter.
If you want to filter for multiple clients, you need to do it with a fulltext search, just as you already tried. In that case, make sure you're working with Domino 9. Previous Domino versions don't apply the view sorting order to a fulltext search result, which means you have to sort it manually using custom JavaScript or similar, which is complicated.
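For the multiple-client case, the search string can be assembled from the text list of client names. The sketch below is plain JavaScript and assumes the fulltext query syntax accepts several FIELD ... Contains clauses joined with OR; the buildNameQuery helper and the sample names are hypothetical, and it has not been verified inside the XPages server-side JavaScript engine.

```javascript
// Build a fulltext query such as:
//   FIELD Name Contains "A" OR FIELD Name Contains "B"
// Names that are empty or contain a double quote are skipped, because
// quoting/escaping rules for the query syntax are not covered here.
function buildNameQuery(fieldName, names) {
  var clauses = [];
  for (var i = 0; i < names.length; i++) {
    var name = names[i];
    if (name.length === 0 || name.indexOf('"') !== -1) {
      continue;
    }
    clauses.push("FIELD " + fieldName + ' Contains "' + name + '"');
  }
  return clauses.join(" OR ");
}

// Sample names standing in for the values of the text list.
var query = buildNameQuery("Name", ["Chuck Norris", "Bruce Lee"]);
console.log(query);
```

An empty list produces an empty string, so the caller should decide what that means (for example, show nothing) before handing it to the view's search attribute.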
Or, as Frantisek suggested, write a scheduled agent which puts documents in folders depending on their clients - but depending on the number of clients you want to filter the view for this may lead to many folders, which may lead to other problems. Furthermore, you need to make sure to remove folders when they are not needed anymore, and you have a lag until new documents appear in a folder.
So in a nutshell, if you want to do an application wide restriction based on client names, use reader fields.
If you want to restrict for one client name at a time, use categories.
Otherwise, use fulltext search with Domino 9.

How do I create a custom role for entities in Sphinx?

In my project we define threats/risks and countermeasures. I want to keep track and refer to both types of entities in Sphinx, as well as generating a list of both threats/risks and the countermeasures. Let's say I have 30 risks and 50 countermeasures (many-to-many relationship).
I'd be happy just to have lists of both and the ability to refer to each item by number (e.g. "risk #23", "countermeasure #12"). It would be even better if the system could display the relationships automatically.
Each item is a single paragraph at most, which is why I would rather not use regular headings, and I cannot refer to items in lists or table rows. So I'm looking for something like a Figure in Sphinx (numbered, with a caption), but for arbitrary types of entities.
My current approach is to create a custom RST role for this. Is this the right approach? If so, where to start?

What's the easiest way to sort an EF4 EntityCollection<T>?

I'd love to add some sorting to an EntityCollection that is bound to an ItemsControl (in xaml). I'd also like to do it as simply as possible. It appears that this is not possible.
If I wrap the collection in a "sorted" version of the collection property within the Entity I lose collection change notifications. I can't use a CollectionViewSource because the entity collection's BindingListCollectionView does not support sorting for some goddamned reason (note: I've seen the blog post with the "dirty" hack to get around this, so please don't answer with that kthx).
Is there a simple (couple lines of xaml, couple lines of code, whatever) way to achieve this??
The EntityCollection type cannot be directly filtered or sorted. It's a common LINQ-to-Entities problem, see:
Sort child objects while selecting the parent using LINQ-to-Entities
One solution would be to sort the entity collection separately using LINQ when you need the data, and incur the additional performance hit. If you're working with a collection you expect to be small and/or infrequently used, the difference in processing time could be negligible.
If you want the database to perform the sorting and make use of any indexes, you can project the main entity along with the child entities. Alex James posts an example in his MSDN blog: http://blogs.msdn.com/b/alexj/archive/2009/02/25/tip-1-sorting-relationships-in-entity-framework.aspx. You're not limited to anonymous types, of course.
