Elastic Search using NEST - elasticsearch

How can I sort data on multiple fields in Elasticsearch using NEST queries?
I need to sort on two fields, say price and kilometer, for cars in the result set. I want the results sorted by these fields in ASC or DESC order.
How do I get the top five results based on conditions or filters?
Please provide some links if available.

This should answer your questions:
IEnumerable<string> searchableFields = new List<string>() { "price", "kilometer" };
SearchDescriptor<T> descriptor = new SearchDescriptor<T>();
// Size(5) limits the response to the top five hits.
descriptor = descriptor.Size(5).OnFields(searchableFields)
    .Sort(s => s.OnField("price").Descending().OnField("kilometer").Ascending());
var result = client.Search<T>(body => descriptor);
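For reference, the request body this kind of search is aiming for can be sketched independently of NEST. The snippet below is a minimal Python sketch that only builds the JSON body; the helper name top_sorted_search_body and the match_all query are assumptions for illustration, and the field names price and kilometer come from the question.

```python
import json


def top_sorted_search_body(size=5):
    """Build a search body returning `size` hits sorted by price, then kilometer."""
    return {
        "size": size,  # top N results only
        "query": {"match_all": {}},  # assumption: replace with your own filters
        "sort": [
            {"price": {"order": "desc"}},      # primary sort key
            {"kilometer": {"order": "asc"}},   # tie-breaker
        ],
    }


body = top_sorted_search_body()
print(json.dumps(body, indent=2))
```

The order of the entries in the sort array matters: Elasticsearch sorts by the first entry and uses later entries only to break ties.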

Related

Single query to return documents sorted by distance based on one document's Id rather than its geopoint

I have an index in Elasticsearch which contains an Id field and a geopoint.
Right now, in order to get the nearest documents, I have to make two queries: one to get the original document by its Id, and a second that uses its coordinates to do a geo sort. I was wondering if there is any way to execute this as a single query.
public IEnumerable<RestaurantSearchItem> GetNearbyRestaurants(double latitude, double longitude)
{
    var query = _elasticClient.Search<RestaurantSearchItem>(s =>
        s.Index(RestaurantSearchItem.IndexName)
            .Sort(
                ss => ss.GeoDistance(
                    g => g
                        .Field(p => p.Location)
                        .DistanceType(GeoDistanceType.Plane)
                        .Unit(DistanceUnit.Meters)
                        .Order(SortOrder.Ascending)
                        .Points(new GeoLocation(latitude, longitude)))));
    var nearByRestaurants = query.Documents;
    foreach (var restaurant in nearByRestaurants)
    {
        restaurant.Distance = Convert.ToDouble(query.Hits.Single(x => x.Id == restaurant.Id).Sorts.Single());
    }
    return nearByRestaurants;
}
I don't think it's possible to do this in one query; the latitude and longitude used for sorting can't be looked up from elsewhere in the data, so they need to be supplied in the request.
As far as I know, the only Elasticsearch query that accepts the id of a document as a parameter is the terms query, which fetches the list of terms for the query from the given document.
But you want to find relevant documents based on location, not exact terms.
This can be achieved by denormalizing your data, for example by storing the list of nearby restaurants in a nested field.
With denormalization you will have to pre-compute all nearby restaurants before inserting the document into the index.
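If you stay with two requests, the second one can be sketched as follows. This is a minimal Python sketch that only builds the request body. It assumes the geopoint field is mapped as location and that the first request has already returned the source document; the helper name, the sample id and the coordinates are hypothetical.

```python
def geo_distance_sort_body(lat, lon, field="location", exclude_id=None):
    """Build a search body that sorts hits by distance from (lat, lon)."""
    body = {
        "query": {"match_all": {}},
        "sort": [
            {
                "_geo_distance": {
                    field: {"lat": lat, "lon": lon},
                    "order": "asc",
                    "unit": "m",
                    "distance_type": "plane",
                }
            }
        ],
    }
    if exclude_id is not None:
        # Leave the source document itself out of its own "nearby" list.
        body["query"] = {"bool": {"must_not": {"ids": {"values": [exclude_id]}}}}
    return body


# Hypothetical result of step one (fetching the document by its Id).
source_doc = {"id": "42", "location": {"lat": 52.37, "lon": 4.89}}

body = geo_distance_sort_body(
    source_doc["location"]["lat"],
    source_doc["location"]["lon"],
    exclude_id=source_doc["id"],
)
```

Each hit's sort value then carries the computed distance in the requested unit, which is what the C# code in the question reads from Sorts.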

Simple query without a specified field searching in whole ElasticSearch index

Say we have an Elasticsearch instance and one index. I now want to search the whole index for documents that contain a specific value. The search needs to cover multiple fields, so I don't want to specify every field to search in.
My attempt so far (using NEST) is the following:
var res2 = client.Search<ElasticCompanyModelDTO>(s => s.Index("cvr-permanent").AllTypes()
    .Query(q => q
        .Bool(bo => bo
            .Must(sh => sh
                .Term(c => c.Value(query))
            )
        )
    ));
However, the query above results in an empty request body.
I get the following output, ### ES REQEUST ### {} , after adding the following debug options to my connection settings:
.DisableDirectStreaming()
.OnRequestCompleted(details =>
{
    Debug.WriteLine("### ES REQEUST ###");
    if (details.RequestBodyInBytes != null) Debug.WriteLine(Encoding.UTF8.GetString(details.RequestBodyInBytes));
})
.PrettyJson();
How do I do this? Why is my query wrong?
Your problem is that a TermQuery needs a single field to search. Most Elasticsearch queries require a field or fields to be specified as part of the query. If you want to search every field in your document, you can use the built-in "_all" field (unless you've disabled it in your mapping).
You should be sure you really want a TermQuery, too, since that will only match exact strings in the text. This type of query is typically used when querying short, unanalyzed string fields (for example, a field containing an enumeration of known values like US state abbreviations.)
If you'd like to query longer full-text fields, consider the MultiMatchQuery (it lets you specify multiple fields, too.)
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-multi-match-query.html
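As a rough illustration of the multi_match suggestion, here is a minimal Python sketch that builds the request body. The helper name and the field names name and address.city are hypothetical and would need to match your own mapping.

```python
import json


def multi_match_body(text, fields):
    """Build a full-text search body that looks for `text` in several fields."""
    return {
        "query": {
            "multi_match": {
                "query": text,
                "fields": list(fields),  # a "^2" suffix boosts a field
            }
        }
    }


# Hypothetical field names; "name^2" weights matches in `name` twice as much.
body = multi_match_body("acme", ["name^2", "address.city"])
print(json.dumps(body, indent=2))
```

Unlike a term query, multi_match analyzes the query text, so it suits longer full-text fields better than exact-value lookups.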
Try this:
var res2 = client.Search<ElasticCompanyModelDTO>(s =>
    s.Index("cvr-permanent").AllTypes()
        .Query(qry => qry
            .Bool(b => b
                .Must(m => m
                    .QueryString(qs => qs
                        .DefaultField("_all")
                        .Query(query))))));
The existing answers rely on the presence of _all. In case anyone comes across this question at a later date, it is worth knowing that _all was removed in Elasticsearch 6.0.
There's a really good video from ElasticOn explaining the reasons behind this and how the replacements work, starting at around 07:30.
In short, the _all query can be replaced by a simple_query_string query, and it will work the same way. The form for the _search API would be:
GET <index>/_search
{
    "query": {
        "simple_query_string": {
            "query": "<queryTerm>"
        }
    }
}
The NEST pages on Elastic's documentation for this query are here;

Group By Elasticsearch

I have documents A, B, and C of the same document type. Each has an is_type property, with the values 'Normal', 'Normal', and 'AbNormal' respectively. I want to get the search response in a single query and then use the Search Response API to get the lists of documents whose type is normal and abnormal. I know that an aggregation on its own will not return the documents, since it only aggregates. Any help would be appreciated.

How to get the total documents count, containing a specific field, using aggregations?

I am moving from Elasticsearch 1.7 to 2.0. Previously, when calculating term facets, I also got the total count, which tells me how many documents contain the field. This is how I did it:
TermsFacet termsFacet = (TermsFacet) facet;
termsFacet.getTotalCount();
It worked with multi-valued fields as well.
In the current version, the terms aggregation has nothing like a total count. I get DocCount inside each aggregation bucket, but summing those does not work for multi-valued fields.
Terms termsAggr = (Terms) aggr;
for (Terms.Bucket bucket : termsAggr.getBuckets()) {
    String bucketKey = bucket.getKey();
    totalCount += bucket.getDocCount();
}
Is there any way to get the total count of the field from the terms aggregation?
I don't want to fire a separate exists filter query. I want the result in a single query.
I would use the exists query:
https://www.elastic.co/guide/en/elasticsearch/reference/2.x/query-dsl-exists-query.html
For instance to find the documents that contain the field user you can use:
{
    "exists" : { "field" : "user" }
}
There is of course also a Java API:
https://www.elastic.co/guide/en/elasticsearch/client/java-api/current/java-term-level-queries.html#java-query-dsl-exists-query
QueryBuilder qb = existsQuery("name");
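Since the question asks for a single request, one option worth trying is to put the exists check into a filter aggregation next to the terms aggregation. The Python sketch below only builds the request body and reads the count back. The aggregation names docs_with_field and top_values and both helper names are made up for illustration, and this is an assumption about what fits the use case rather than something the answer above proposes.

```python
def field_count_and_terms_body(field, top_n=10):
    """Build one request that returns term buckets plus a per-document field count."""
    return {
        "size": 0,  # only aggregations are needed, no hits
        "aggs": {
            # Counts each document once, even if the field holds several values.
            "docs_with_field": {"filter": {"exists": {"field": field}}},
            "top_values": {"terms": {"field": field, "size": top_n}},
        },
    }


def total_with_field(response):
    """Read the number of documents that contain the field from a search response."""
    return response["aggregations"]["docs_with_field"]["doc_count"]


body = field_count_and_terms_body("user")
```

The filter aggregation's doc_count plays the role the old getTotalCount() did, while the bucket counts stay available under top_values.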

How to use wildcards with ngrams in ElasticSearch

Is it possible to combine wildcard matches and ngrams in ElasticSearch? I'm already using ngrams of length 3-11.
As a very small example, I have records C1239123 and C1230123. The user wants to return both of these. This is the only info they know: C123?12
The above case won't work on my full match analyzer because the query is missing the 3 on the end. I was under the impression wildcard matches would work out of the box, but if I perform a search similar to the above I get gibberish.
Query:
.Search<ElasticSearchProject>(a => a
    .Size(100)
    .Query(q => q
        .SimpleQueryString(query => query
            .OnFieldsWithBoost(b => b
                .Add(f => f.Summary, 2.1)
                .Add(f => f.Summary.Suffix("ngram"), 2.0))
            .Query(searchQuery))));
Analyzer:
var projectPartialMatch = new CustomAnalyzer
{
    Filter = new List<string> { "lowercase", "asciifolding" },
    Tokenizer = "ngramtokenizer"
};
Tokenizer:
.Tokenizers(t => t
    .Add("ngramtokenizer", new NGramTokenizer
    {
        TokenChars = new[] { "letter", "digit", "punctuation" },
        MaxGram = 11,
        MinGram = 3
    }))
EDIT:
The main purpose is to allow the user to tell the search engine exactly where the unknown characters are. This preserves the match order. I do not ngram the query, only the indexed fields.
EDIT 2 with more test results:
I had simplified my prior example a bit too much. The gibberish was caused by punctuation filters. With a proper example there's no gibberish, but the results aren't returned in a relevant order. Looking at the results below, I'm unsure why the first two match at all. Ngram is not applied to the query.
Searching for c.a123?.7?0 gives results in this order:
C.A1234.560
C.A1234.800
C.A1234.700 <--Shouldn't this be first?
C.A1234.950
To anyone looking for a resolution to this, wildcards are used on ngrammed tokens by default. My problem was due to my queries having punctuation in them and using a standard analyzer on my query (which breaks on punctuation).
Duc.Duong's suggestion to use the Inquisitor plugin helped show exactly how data would be analyzed.
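To see why a wildcard can match ngrammed tokens, here is a small pure-Python sketch. It is a simplification, not Elasticsearch's implementation: it lowercases each record, generates character ngrams of length 3 to 11 as in the tokenizer above, and tests the wildcard against whole tokens with fnmatch. The helper names and the third record C9990000 are hypothetical and added only for contrast.

```python
from fnmatch import fnmatchcase


def ngrams(text, min_gram=3, max_gram=11):
    """Return the set of lowercase character ngrams of `text`."""
    text = text.lower()
    return {
        text[i:i + n]
        for n in range(min_gram, max_gram + 1)
        for i in range(len(text) - n + 1)
    }


def wildcard_hits(pattern, records):
    """Return the records with at least one ngram token matching the wildcard."""
    pattern = pattern.lower()
    return [r for r in records if any(fnmatchcase(tok, pattern) for tok in ngrams(r))]


hits = wildcard_hits("C123?12", ["C1239123", "C1230123", "C9990000"])
print(hits)  # ['C1239123', 'C1230123']
```

Both records from the question match because each has a 7-character ngram (c123912 and c123012) that fits the pattern. If the query is split on punctuation before matching, as the standard analyzer does, the pattern no longer lines up with a single token, which is consistent with the resolution described above.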
