Lucene.NET - sorting by int

In the latest version of Lucene (or Lucene.NET), what is the proper way to get the search results back in sorted order?
I have a document like this:
var document = new Lucene.Document();
document.AddField("Text", "foobar");
document.AddField("CreationDate", DateTime.Now.Ticks.ToString()); // store the date as a long (ticks)
indexWriter.AddDocument(document);
Now I want to do a search and get my results back with the most recent first.
How can I do a search that orders results by CreationDate? All the documentation I see is for old Lucene versions that use now-deprecated APIs.

After doing some research and poking around with the API, I've finally found some non-deprecated APIs (as of v2.9 and v3.0) that will allow you to order by date:
// Find all docs whose .Text contains "hello", ordered by .CreationDate.
var query = new QueryParser(Lucene.Net.Util.Version.LUCENE_29, "Text", new StandardAnalyzer()).Parse("hello");
var indexDirectory = FSDirectory.Open(new DirectoryInfo("c:\\foo"));
var searcher = new IndexSearcher(indexDirectory, true);
try
{
// Pass true as a third SortField argument to reverse the order (most recent first).
var sort = new Sort(new SortField("CreationDate", SortField.LONG));
var filter = new QueryWrapperFilter(query);
var results = searcher.Search(query, filter, 1000, sort);
foreach (var hit in results.scoreDocs)
{
Document document = searcher.Doc(hit.doc);
Console.WriteLine("\tFound match: {0}", document.Get("Text"));
}
}
finally
{
searcher.Close();
}
Note I'm sorting the creation date with the LONG comparison. That's because I store the creation date as DateTime.Now.Ticks, which is a System.Int64, or long in C#.
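To illustrate why the numeric comparison matters, here is a minimal sketch that does not use Lucene at all. It assumes the dates were stored as tick strings, as in the question; the class and method names (TicksSortDemo, NewestFirst) are made up for this illustration. Compared as plain strings, "9" sorts after "10", whereas parsing to long first gives the true chronological order.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public static class TicksSortDemo
{
    // Parses stored tick strings as long values and returns the dates newest first.
    public static List<DateTime> NewestFirst(IEnumerable<string> storedTicks)
    {
        return storedTicks
            .Select(s => long.Parse(s, CultureInfo.InvariantCulture))
            .OrderByDescending(ticks => ticks)
            .Select(ticks => new DateTime(ticks))
            .ToList();
    }

    public static void Main()
    {
        // Same storage format as the question: DateTime.Ticks converted to a string.
        var stored = new[]
        {
            new DateTime(2009, 1, 1).Ticks.ToString(CultureInfo.InvariantCulture),
            new DateTime(2010, 6, 15).Ticks.ToString(CultureInfo.InvariantCulture),
            new DateTime(2008, 3, 2).Ticks.ToString(CultureInfo.InvariantCulture),
        };

        foreach (var date in NewestFirst(stored))
        {
            Console.WriteLine(date.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture));
        }
    }
}
```

In the Lucene case the parsing is done by the LONG sort type itself; this sketch only shows the ordering it is expected to produce.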

Related

ElasticSearch NEST recreate index with zero downtime

I am writing backend in C# for a website. I'd like to recreate index with little downtime.
After reading these two posts:
Nest Client c# 7.0 for elastic search removing Aliases
Recreate ElasticSearch Index with Nest 7.x
I came up with this:
var alias_exist = await _client.Indices.ExistsAsync(index_string_alias);
if (alias_exist.Exists)
{
var oldIndices = await _client.GetIndicesPointingToAliasAsync(index_string_alias);
var oldIndexName = oldIndices.First().ToString();
await _client.Indices.BulkAliasAsync(new BulkAliasRequest
{
Actions = new List<IAliasAction>
{
new AliasRemoveAction {Remove = new AliasRemoveOperation {Index = oldIndexName, Alias = index_string_alias}},
new AliasAddAction {Add = new AliasAddOperation {Index = index_string_unique, Alias = index_string_alias}}
}
});
} else
{
var putAliasResponse = await _client.Indices.PutAliasAsync(new PutAliasRequest(index_string_unique, index_string_alias));
}
I'd like to remove index_string_alias from the old index if it exists and assign the alias to the newly created index_string_unique.
Also, I'd like to confirm that I can treat the alias as the index name in my other queries.
I am really new to Elasticsearch and wonder how people figure these things out. I searched through the official documentation and found little information about the async functions in NEST. Where should I look for explanations of these functions?

How can I full text search response documents and have the parent show in the view/data table

I'm at a bit of an impasse. I am working in Domino with XPages, and I am trying to allow full text searching through a view that includes response documents, while also showing the parent document of any response that matches the query in the view or data table. Currently I'm just using the search term in a view datasource, and then using that datasource in a view control, but any workable solution would be welcome. There may be additional search criteria on the parent document.
Any ideas?
Richard,
you can't directly use the view as data source, so you won't use the view control. You can use the data table or (probably better, since it gives you full layout control) the repeat control.
Run the search against the view in code:
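var v = database.getView("yourView");
// Alternative: database.FTSearch(...) returns a document collection instead
// FTSearchSorted (like FTSearch) filters the view and returns the number of hits
var hits = v.FTSearchSorted(...);
var datasource = [];
var doc = v.getFirstDocument();
var next;
var parent;
while (doc != null) {
addResult(doc, datasource);
if (doc.isResponse()) {
parent = database.getDocumentByUNID(doc.getParentDocumentUNID());
addResult(parent, datasource);
// Careful here - if the parent is part of the resultset on its own
parent.recycle();
}
// Fetch the next document before recycling the current one
next = v.getNextDocument(doc);
doc.recycle();
doc = next;
}
try {
v.recycle();
} catch (e) {
// We suffer silently
}
return datasource;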
function addResult(doc, datasource) {
var oneResult = {};
//Adjust that to your needs
oneResult.subject = doc.getItemValueString("Subject");
oneResult.unid = doc.getUniversalId();
datasource.push(oneResult);
}
See the FTSearchSorted documentation. I typed the code off the top of my head, so there might be little syntax snafus, but you get the idea. Don't return documents or Notes objects to the XPage, and use recycle() wisely.

Using a list in a query in Entity Framework

I am trying to find a way to pass in an optional string list to a query. What I am trying to do is filter a list of tags by the relationship between them. For example if c# was selected my program would suggest only tags that appear in documents with a c# tag and then on the selection of the next, say SQL, the tags that are linked to docs for those two tags together would be shown, whittling it down so that the user can get closer and closer to his goal.
At the moment all I have is:
List<Tag> _tags = (from t in Tags
where t.allocateTagDoc.Count > 0
select t).ToList();
This is in a method that would be called repeatedly with the optional args as tags were selected.
I think I have been coming at it arse-backwards. If I make two(or more) queries one for each supplied tag, find the docs where they all appear together and then bring out all the tags that go with them... Or would that be too many hits on the db? Can I do it entirely through an entity context variable and just query the model?
Thanks again for any help!
You can try this.
First, collect the tags to search for in a list of strings.
List<string> tagStrings = new List<string>{"c#", "sql"};
Pass this list to your query and check whether it is empty. If it is empty, the query returns all the tags; otherwise it returns the tags that match tagStrings.
var _tags = (from t in Tags
where t.allocateTagDoc.Count > 0
&& (tagStrings.Count ==0 || tagStrings.Contains(t.tagName))
select t).ToList();
You can also try this. The Dictionary maps the ID of a document to its tags:
Dictionary<int, string[]> documents =
new Dictionary<int, string[]>();
documents.Add(1, new string[] { "C#", "SQL", "EF" });
documents.Add(2, new string[] { "C#", "Interop" });
documents.Add(3, new string[] { "Javascript", "ASP.NET" });
documents.Add(4, new string[] { });
// returns tags belonging to documents with IDs 1, 2
string[] filterTags = new string[] { "C#" };
var relatedTags = GetRelatedTags(documents, filterTags);
Debug.WriteLine(string.Join(",", relatedTags));
// returns tags belonging to document with ID 1
filterTags = new string[] { "C#", "SQL" };
relatedTags = GetRelatedTags(documents, filterTags);
Debug.WriteLine(string.Join(",", relatedTags));
// returns tags belonging to all documents
// since no filtering tags are specified
filterTags = new string[] { };
relatedTags = GetRelatedTags(documents, filterTags);
Debug.WriteLine(string.Join(",", relatedTags));
public static string[] GetRelatedTags(
Dictionary<int, string[]> documents,
string[] filterTags)
{
var documentsWithFilterTags = documents.Where(o =>
filterTags
.Intersect(o.Value).Count() == filterTags.Length);
string[] relatedTags = new string[0];
foreach (string[] tags in documentsWithFilterTags.Select(o => o.Value))
relatedTags = relatedTags
.Concat(tags)
.Distinct()
.ToArray();
return relatedTags;
}
Thought I would pop back and share my solution which was completely different to what I first had in mind.
First I altered the database a little getting rid of a useless field in the allocateDocumentTag table which enabled me to use the entity framework model much more efficiently by allowing me to leave that table out and access it purely through the relationship between Tag and Document.
When I fill my form the first time I just display all the tags that have a relationship with a document. Using my search filter after that, when a Tag is selected in a checkedListBox, the Document ids that are associated with the selected Tag(s) are returned and are then fed back to fill the used tag listbox.
public static List<Tag> fillUsed(List<int> docIds = null)
{
List<Tag> used = new List<Tag>();
if (docIds == null || docIds.Count() < 1)
{
used = (from t in frmFocus._context.Tags
where t.Documents.Count >= 1
select t).ToList();
}
else
{
used = (from t in frmFocus._context.Tags
where t.Documents.Any(d => docIds.Contains(d.id))
select t).ToList();
}
return used;
}
From there the tags feed into the doc search and vice versa. Hope this can help someone else, if the answer is unclear or you need more code then just leave a comment and I'll try and sort it.
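For comparison, the narrowing described in the question (only suggest tags from documents that carry every selected tag) can be sketched against in-memory data. This is a sketch under assumptions rather than the code used above: Doc and TagNarrowing are hypothetical stand-ins for the entity model, the sample data is borrowed from the Dictionary example in the earlier answer, and the result still contains the selected tags themselves.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-in for the Document entity.
public class Doc
{
    public int Id { get; set; }
    public List<string> Tags { get; set; }
}

public static class TagNarrowing
{
    // Returns the distinct tags of every document that carries ALL selected tags.
    // An empty selection matches every document, so all used tags come back.
    public static List<string> RelatedTags(IEnumerable<Doc> docs, ICollection<string> selected)
    {
        return docs
            .Where(d => selected.All(s => d.Tags.Contains(s)))
            .SelectMany(d => d.Tags)
            .Distinct()
            .OrderBy(t => t, StringComparer.Ordinal)
            .ToList();
    }

    public static List<Doc> SampleDocs()
    {
        return new List<Doc>
        {
            new Doc { Id = 1, Tags = new List<string> { "C#", "SQL", "EF" } },
            new Doc { Id = 2, Tags = new List<string> { "C#", "Interop" } },
            new Doc { Id = 3, Tags = new List<string> { "Javascript", "ASP.NET" } },
            new Doc { Id = 4, Tags = new List<string>() },
        };
    }

    public static void Main()
    {
        var docs = SampleDocs();
        // Selecting "C#" keeps documents 1 and 2.
        Console.WriteLine(string.Join(",", RelatedTags(docs, new[] { "C#" })));
        // Adding "SQL" narrows the match to document 1 only.
        Console.WriteLine(string.Join(",", RelatedTags(docs, new[] { "C#", "SQL" })));
    }
}
```

Whether the same All/Contains shape translates to a single SQL statement depends on the Entity Framework version, so it is worth checking the generated query before relying on it against the database.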

Is there a way to get results of solr grouping using Solr Net

I want to try the new Solr collapsing/grouping included in Solr 3.3. I have tried queries on the Solr admin page and they work correctly, but when I query from my C# code using SolrNet, it does not seem to work as expected. Here is how I am setting the param values:
options.ExtraParams = new List<KeyValuePair<string, string>>
{
new KeyValuePair<string,string>("group","true"),
new KeyValuePair<string,string>("group.field","AuthorID"),
};
Yes, you can use Grouping (formerly known as Field Collapsing) with SolrNet. It was introduced in the SolrNet 0.4.0 alpha1 release; the release notes on the author's blog describe the support that was added. So you will need to grab that version (or later) from Google Code (binaries) or GitHub (source). Here is an example of using grouping, taken from the Grouping Tests among the unit tests in the source:
public void FieldGrouping()
{
var solr = ServiceLocator.Current.GetInstance<ISolrBasicOperations<Product>>();
var results = solr.Query(SolrQuery.All, new QueryOptions
{
Grouping = new GroupingParameters()
{
Fields = new [] { "manu_exact" },
Format = GroupingFormat.Grouped,
Limit = 1,
}
});
Console.WriteLine("Group.Count {0}", results.Grouping.Count);
Assert.AreEqual(1, results.Grouping.Count);
Assert.AreEqual(true, results.Grouping.ContainsKey("manu_exact"));
Assert.GreaterThanOrEqualTo(results.Grouping["manu_exact"].Groups.Count,1);
}

Custom search in Dynamics CRM 4.0

I have two related questions.
First:
I'm looking to do a full text search against a custom entity in Dynamics CRM 4.0. Has anyone done this before or know how to do it?
I know that I can build QueryExpressions with the web service and sdk but can I do a full text search with boolean type syntax using this method? As far as I can tell that won't do the trick.
Second:
Does anyone else feel limited by the searching abilities provided with Dynamics CRM 4.0? I know there are some 3rd-party search products out there, but I haven't found one I like yet. Any suggestions would be appreciated.
Searching and filtering via the CRM SDK does take some time to get used to. In order to simulate full text search, you need to use nested FilterExpressions as your QueryExpression.Criteria (see the SDK page for nested filters). The hardest part is figuring out how to build the parent-child relationships. There's so much boolean logic going on that it's easy to get lost.
I had a requirement to build a "search engine" for one of our custom entities. Using this method for a complex search string ("one AND two OR three") with multiple searchable attributes was ugly. If you're interested though, I can dig it up. While it's not really supported, if you can access the database directly, I would suggest using SQL's full text search capabilities.
--
OK, here you go. I don't think you'll be able to copy and paste this and fulfill your needs. My customer was only doing two- or three-keyword searches, and they were happy with the results from this. You can see what a pain it is to do this even in a simple search scenario. I basically puked out code until it was 'working'.
private FilterExpression BuildFilterV2(string[] words, string[] seachAttributes)
{
FilterExpression filter = new FilterExpression();
List<FilterExpression> allchildfilters = new List<FilterExpression>();
List<string> andbucket = new List<string>();
List<string> orBucket = new List<string>();
// clean up commas, quotes, etc
words = ScrubWords(words);
int index = 0;
while (index < words.Length)
{
// if current word is 'and' then add the next word to the and bucket
if (words[index].ToLower() == "and")
{
// guard against a trailing 'and' with no word after it
if (index + 1 < words.Length)
{
andbucket.Add(words[index + 1]);
}
index += 2;
}
else
{
if (andbucket.Count > 0)
{
List<FilterExpression> filters = new List<FilterExpression>();
foreach (string s in andbucket)
{
filters.Add(BuildSingleWordFilter(s, seachAttributes));
}
// send existing and bucket to condition builder
FilterExpression childFilter = new FilterExpression();
childFilter.FilterOperator = LogicalOperator.And;
childFilter.Filters = filters.ToArray();
// add to child filter list
allchildfilters.Add(childFilter);
//new 'and' bucket
andbucket = new List<string>();
}
if (index + 1 < words.Length && words[index + 1].ToLower() == "and")
{
andbucket.Add(words[index]);
if (index + 2 < words.Length)
{
andbucket.Add(words[index + 2]);
}
index += 3;
}
else
{
orBucket.Add(words[index]);
index++;
}
}
}
if (andbucket.Count > 0)
{
List<FilterExpression> filters = new List<FilterExpression>();
foreach (string s in andbucket)
{
filters.Add(BuildSingleWordFilter(s, seachAttributes));
}
// send existing and bucket to condition builder
FilterExpression childFilter = new FilterExpression();
childFilter.FilterOperator = LogicalOperator.And;
childFilter.Filters = filters.ToArray();
// add to child filter list
allchildfilters.Add(childFilter);
//new 'and' bucket
andbucket = new List<string>();
}
if (orBucket.Count > 0)
{
filter.Conditions = BuildConditions(orBucket.ToArray(), seachAttributes);
}
filter.FilterOperator = LogicalOperator.Or;
filter.Filters = allchildfilters.ToArray();
return filter;
}
private FilterExpression BuildSingleWordFilter(string word, string[] seachAttributes)
{
List<ConditionExpression> conditions = new List<ConditionExpression>();
foreach (string attr in seachAttributes)
{
ConditionExpression expr = new ConditionExpression();
expr.AttributeName = attr;
expr.Operator = ConditionOperator.Like;
expr.Values = new string[] { "%" + word + "%" };
conditions.Add(expr);
}
FilterExpression filter = new FilterExpression();
filter.FilterOperator = LogicalOperator.Or;
filter.Conditions = conditions.ToArray();
return filter;
}
private ConditionExpression[] BuildConditions(string[] words, string[] seachAttributes)
{
List<ConditionExpression> conditions = new List<ConditionExpression>();
foreach (string s in words)
{
foreach (string attr in seachAttributes)
{
ConditionExpression expr = new ConditionExpression();
expr.AttributeName = attr;
expr.Operator = ConditionOperator.Like;
expr.Values = new string[] { "%" + s + "%" };
conditions.Add(expr);
}
}
return conditions.ToArray();
}
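The bucketing idea in the code above can also be sketched independently of the CRM SDK. The following is a simplified illustration, not the code that was shipped: SearchStringParser and Parse are hypothetical names, and it assumes that AND binds tighter than OR, that bare adjacent words are OR-ed together, and that a literal "or" is treated as a separator. Each inner list would then become one FilterExpression with LogicalOperator.And, and the outer list would be combined with LogicalOperator.Or.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SearchStringParser
{
    // Splits a search string into OR-ed groups of AND-ed terms,
    // e.g. "one AND two OR three" becomes [["one", "two"], ["three"]].
    public static List<List<string>> Parse(string input)
    {
        var groups = new List<List<string>>();
        var current = new List<string>();
        bool joinNext = false; // true right after an AND keyword

        var words = input.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        foreach (var word in words)
        {
            if (word.Equals("and", StringComparison.OrdinalIgnoreCase))
            {
                joinNext = true;
                continue;
            }
            if (word.Equals("or", StringComparison.OrdinalIgnoreCase))
            {
                joinNext = false;
                continue;
            }
            // A term that does not follow AND starts a new group.
            if (!joinNext && current.Count > 0)
            {
                groups.Add(current);
                current = new List<string>();
            }
            current.Add(word);
            joinNext = false;
        }

        if (current.Count > 0)
        {
            groups.Add(current);
        }
        return groups;
    }

    public static void Main()
    {
        var groups = Parse("one AND two OR three");
        // Render the groups back as a bracketed expression for inspection.
        Console.WriteLine(string.Join(" OR ",
            groups.Select(g => "(" + string.Join(" AND ", g) + ")")));
    }
}
```

A leading or trailing AND is simply ignored here, which avoids the index arithmetic needed in the array-based version.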
Hm, that's a pretty interesting scenario...
You could certainly do a 'Like' query, and 'or' together the column/attribute conditions you want included in the search. This seems to be how CRM does queries from the box above entity lists (and they're plenty fast). It looks like the CRM database has a full-text index, although exactly which columns are used to populate it is a bit foggy to me after a brief peek.
And remember LinqtoCRM for CRM query love (I started the project, sorry about the shameless plug).
Second - I can recommend "Global Search" by Akvelon, which lets you search all custom entities and attributes as well as the out-of-the-box entities and attributes. They also use FTS to search the contents of attached documents. You can find more details on their official site: http://www.akvelon.com/Products/Dynamics%20CRM%20global%20Search/default.aspx
I would suggest utilizing the Dynamics CRM filtered views provided for you in the database. Then you can utilize all the power of native SQL to do any LIKE's or other logic you need. Plus, the filtered views are security trimmed, so you won't have to worry about users accessing records they do not have permission to.

Resources