Single query to return documents sorted by distance based on one document's Id rather than its geopoint - elasticsearch

I have an index in Elasticsearch which contains an Id field and a geopoint.
Right now, in order to get the nearest documents, I have to make two queries: one to get the original document by its Id, and a second that uses its coordinates to do a geo sort. I was wondering if there is any way to execute this as a single query.
public IEnumerable<RestaurantSearchItem> GetNearbyRestaurants(double latitude, double longitude)
{
    var query = _elasticClient.Search<RestaurantSearchItem>(s =>
        s.Index(RestaurantSearchItem.IndexName)
            .Sort(
                ss => ss.GeoDistance(
                    g => g
                        .Field(p => p.Location)
                        .DistanceType(GeoDistanceType.Plane)
                        .Unit(DistanceUnit.Meters)
                        .Order(SortOrder.Ascending)
                        .Points(new GeoLocation(latitude, longitude)))));
    var nearByRestaurants = query.Documents;
    foreach (var restaurant in nearByRestaurants)
    {
        restaurant.Distance = Convert.ToDouble(query.Hits.Single(x => x.Id == restaurant.Id).Sorts.Single());
    }
    return nearByRestaurants;
}

I don't think it's possible to do this in one query; the latitude and longitude used for sorting can't be looked up from elsewhere in the data, so they need to be supplied in the request.
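To make the two-step flow concrete, here is a minimal sketch in Python of the second step: taking the `_source` returned by a get-by-id request and turning its coordinates into a `_geo_distance` sort. The field name `location`, the helper name `geo_sort_body` and the sample values are assumptions for illustration; the body is a plain dictionary that any client could send as the search request.

```python
def geo_sort_body(source_doc, location_field="location"):
    """Build a search body that sorts hits by distance from source_doc's geo point.

    source_doc is assumed to hold its geo point as {"lat": ..., "lon": ...}.
    """
    point = source_doc[location_field]
    return {
        "sort": [
            {
                "_geo_distance": {
                    location_field: {"lat": point["lat"], "lon": point["lon"]},
                    "order": "asc",
                    "unit": "m",
                    "distance_type": "plane",
                }
            }
        ]
    }


# Made-up _source of the document fetched by Id in the first request.
origin = {"id": 42, "location": {"lat": 52.37, "lon": 4.89}}
body = geo_sort_body(origin)
```

The coordinates still travel from the first response into the second request, which is exactly the round trip the question hoped to avoid.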

As far as I know, the only Elasticsearch query that accepts the id of a document as a parameter is the terms query, which can fetch its list of terms from the given document.
But you want to find relevant documents based on location, not exact terms.
This can be achieved by denormalizing your data, for example by storing the list of nearby restaurants in a nested field.
With denormalization, you will have to pre-compute all nearby restaurants before inserting the document into the index.
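As a rough illustration of that pre-computation, the sketch below attaches a `nearby` list to each document before indexing. It is only a sketch under stated assumptions: the field names `id`, `location` and `nearby`, the helper names and the sample coordinates are all made up, and the distance uses the haversine formula rather than whatever Elasticsearch would compute.

```python
import math

EARTH_RADIUS_M = 6371000.0


def haversine_m(a, b):
    """Great-circle distance in meters between two {"lat", "lon"} points."""
    lat1, lon1 = math.radians(a["lat"]), math.radians(a["lon"])
    lat2, lon2 = math.radians(b["lat"]), math.radians(b["lon"])
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def with_nearby(restaurants, limit=10):
    """Return copies of the documents with a precomputed 'nearby' list, closest first."""
    enriched = []
    for r in restaurants:
        others = [
            {"id": o["id"], "distance": haversine_m(r["location"], o["location"])}
            for o in restaurants
            if o["id"] != r["id"]
        ]
        others.sort(key=lambda n: n["distance"])
        enriched.append({**r, "nearby": others[:limit]})
    return enriched


restaurants = [
    {"id": "a", "location": {"lat": 52.370, "lon": 4.890}},
    {"id": "b", "location": {"lat": 52.371, "lon": 4.890}},
    {"id": "c", "location": {"lat": 52.400, "lon": 4.890}},
]
docs = with_nearby(restaurants)
```

This compares every pair of documents, so it grows quadratically with the number of restaurants, and the lists have to be recomputed whenever a restaurant is added, moved or removed.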

Related

Simple query without a specified field searching in whole ElasticSearch index

Say we have an Elasticsearch instance and one index. I now want to search the whole index for documents that contain a specific value. The query needs to cover multiple fields, so I don't want to specify every field to search in.
My attempt so far (using NEST) is the following:
var res2 = client.Search<ElasticCompanyModelDTO>(s => s
    .Index("cvr-permanent")
    .AllTypes()
    .Query(q => q
        .Bool(bo => bo
            .Must(sh => sh
                .Term(c => c.Value(query))
            )
        )
    ));
However, the query above results in an empty request body.
I get the following output, ### ES REQEUST ### {} , after adding the following debug hooks to my connection settings:
.DisableDirectStreaming()
.OnRequestCompleted(details =>
{
Debug.WriteLine("### ES REQEUST ###");
if (details.RequestBodyInBytes != null) Debug.WriteLine(Encoding.UTF8.GetString(details.RequestBodyInBytes));
})
.PrettyJson();
How do I do this? Why is my query wrong?
Your problem is that a TermQuery must specify a single field to search. NEST treats a term query without a field as conditionless and leaves it out of the request, which is why the request body is empty. If you want to search every field in your document, you can use the built-in "_all" field (unless you've disabled it in your mapping).
You should be sure you really want a TermQuery, too, since that will only match exact strings in the text. This type of query is typically used when querying short, unanalyzed string fields (for example, a field containing an enumeration of known values like US state abbreviations.)
If you'd like to query longer full-text fields, consider the MultiMatchQuery (it lets you specify multiple fields, too.)
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-multi-match-query.html
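For reference, a multi_match request body can be sketched as a plain dictionary. This is a minimal illustration only: the helper name `multi_match_body` and the field names `name` and `description` are hypothetical, not taken from the question's mapping.

```python
import json


def multi_match_body(text, fields):
    """Build a search body that runs one analyzed query against several fields."""
    return {"query": {"multi_match": {"query": text, "fields": list(fields)}}}


# Hypothetical field names; substitute the full-text fields of your own mapping.
body = multi_match_body("acme", ["name", "description"])
print(json.dumps(body, indent=2))
```

Unlike a term query, the query text here is analyzed, so it matches against the analyzed terms of each listed field rather than requiring an exact string.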
Try this:
var res2 = client.Search<ElasticCompanyModelDTO>(s => s
    .Index("cvr-permanent")
    .AllTypes()
    .Query(qry => qry
        .Bool(b => b
            .Must(m => m
                .QueryString(qs => qs
                    .DefaultField("_all")
                    .Query(query))))));
The existing answers rely on the presence of _all. In case anyone comes across this question at a later date, it is worth knowing that _all was deprecated in Elasticsearch 6.0 (it can no longer be enabled on indices created in 6.0 or later) and removed in 7.0.
There's a really good video from ElasticOn explaining the reasons behind this and the way the replacements work, starting at around 07:30.
In short, a query against _all can be replaced by a simple_query_string query, and it works the same way. The form for the _search API would be:
GET <index>/_search
{
  "query": {
    "simple_query_string": {
      "query": "<queryTerm>"
    }
  }
}
Elastic's NEST documentation also has pages covering this query.

Elasticsearch projections onto new type

Is it possible to get a projection as a query result in elasticsearch?
For example:
I have 3 types in my index:
User { Id, Name, Groups[], Location { Lat, Lon } }
Group { Id, Name, Topics[] }
Message { Id, UserId, GroupId, Content}
And I want to get the number of messages and users in a group in a given area, so my input would be:
{ Lat, Lon, Distance, GroupId }
and the output would be:
Group { Id, Name, Topics, NumberOfUsers, NumberOfMessages }
where the actual output of the query is a combination of data returned by the query and aggregations within that data.
Is this possible?
There are no JOINs in Elasticsearch (except for parent-child, but those shouldn't be used for heavy joining either). With your current data model you'll only be able to do application-side JOINs, and depending on your actual data that might be a lot of round trips. I don't think this will work out too well.
PS: Generally, please provide some simple test documents with usable data. If someone has to put together a test data set to try out your problem, the chances that anybody will actually try it become rather slim.
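To show what an application-side JOIN could look like for this question, here is a hedged sketch in Python. Everything about the mapping is assumed: index names `users` and `messages`, field names `groups`, `location`, `groupId` and `userId`, and the `search(index, body)` callable, which stands in for whatever client call you use and returns the parsed response. It also assumes `hits.total` is a plain integer, as in Elasticsearch versions before 7.0.

```python
def group_summary(search, group, lat, lon, distance):
    """Combine two searches into one projected result for a group.

    search(index, body) is an injected callable returning the parsed response dict.
    """
    # Round trip 1: users who belong to the group and are inside the area.
    users_body = {
        "size": 10000,  # assumes the matching users fit in a single page
        "_source": False,
        "query": {"bool": {"filter": [
            {"term": {"groups": group["id"]}},
            {"geo_distance": {"distance": distance,
                              "location": {"lat": lat, "lon": lon}}},
        ]}},
    }
    users = search("users", users_body)
    user_ids = [hit["_id"] for hit in users["hits"]["hits"]]

    # Round trip 2: count the messages those users posted in the group.
    messages_body = {
        "size": 0,
        "query": {"bool": {"filter": [
            {"term": {"groupId": group["id"]}},
            {"terms": {"userId": user_ids}},
        ]}},
    }
    messages = search("messages", messages_body)

    # The projection itself is assembled in the application.
    return {
        **group,
        "numberOfUsers": len(user_ids),
        "numberOfMessages": messages["hits"]["total"],
    }
```

Even this small case needs two round trips plus a prior lookup of the group document, and the list of user ids passed to the second query grows with the area, which is the cost the answer warns about.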

How can I find the true score from Elasticsearch query string with a wildcard?

My ElasticSearch 2.x NEST query string search contains a wildcard:
Using NEST in C#:
var results = _client.Search<IEntity>(s => s
.Index(Indices.AllIndices)
.AllTypes()
.Query(qs => qs
.QueryString(qsq => qsq.Query("Micro*")))
.From(pageNumber)
.Size(pageSize));
This produces something like:
$ curl -XGET 'http://localhost:9200/_all/_search?q=Micro*'
This code was derived from the Elasticsearch documentation page on covariant search results. The results are covariant; they are of mixed types coming from multiple indices. The problem I am having is that all of the hits come back with a score of 1.
This is regardless of type or boosting. Can I boost by type or, alternatively, is there a way to reveal or "explain" the search result so I can order by score?
Multi-term queries such as the wildcard query are given a constant score equal to the query boost by default. You can change this behaviour using .Rewrite().
var results = client.Search<IEntity>(s => s
.Index(Indices.AllIndices)
.AllTypes()
.Query(qs => qs
.QueryString(qsq => qsq
.Query("Micro*")
.Rewrite(RewriteMultiTerm.ScoringBoolean)
)
)
.From(pageNumber)
.Size(pageSize)
);
With RewriteMultiTerm.ScoringBoolean, the rewrite method first translates each term into a should clause in a bool query and keeps the scores as computed by the query.
Note that this can be CPU intensive, and the default limit of 1024 bool query clauses is easily hit on a large document corpus; running your query on the complete StackOverflow data set (questions, answers and users), for example, hits the clause limit for questions. As an alternative, you may want to index the text with an analyzer that uses an edge_ngram token filter, so that prefixes are indexed as terms and can be matched and scored without a wildcard.
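To illustrate what an edge n-gram filter contributes, the sketch below mimics its output for a single token in plain Python. It is not the Elasticsearch implementation: the function name and the `min_gram`/`max_gram` defaults are chosen for the example, and the lowercasing stands in for a lowercase filter that would normally sit earlier in the analyzer chain.

```python
def edge_ngrams(token, min_gram=1, max_gram=5):
    """Return the prefixes of token from min_gram to max_gram characters long."""
    token = token.lower()  # stands in for a separate lowercase token filter
    upper = min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, upper + 1)]


print(edge_ngrams("Microsoft"))  # ['m', 'mi', 'mic', 'micr', 'micro']
```

With prefixes such as "micro" stored in the index, a search for "micro" becomes an ordinary term match and is scored normally, instead of being expanded into many clauses at query time.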
Wildcard searches return a score of 1 by default.
You can boost by a particular type. See this:
How to boost index type in elasticsearch?

How to get the total documents count, containing a specific field, using aggregations?

I am moving from Elasticsearch 1.7 to 2.0. Previously, when calculating term facets, I also got the total count, which tells me how many documents contain the field. This is how I did it:
TermsFacet termsFacet = (TermsFacet) facet;
termsFacet.getTotalCount();
It worked with multi-valued fields as well.
In the current version, the terms aggregation has nothing like a total count. I get a DocCount inside each aggregation bucket, but summing those does not work for multi-valued fields.
Terms termsAggr = (Terms) aggr;
for (Terms.Bucket bucket : termsAggr.getBuckets()) {
String bucketKey = bucket.getKey();
totalCount += bucket.getDocCount();
}
Is there any way I can get the total count for the field from a terms aggregation?
I don't want to fire a separate exists filter query. I want the result in a single query.
I would use the exists query:
https://www.elastic.co/guide/en/elasticsearch/reference/2.x/query-dsl-exists-query.html
For instance to find the documents that contain the field user you can use:
{
"exists" : { "field" : "user" }
}
There is of course also a Java API:
https://www.elastic.co/guide/en/elasticsearch/client/java-api/current/java-term-level-queries.html#java-query-dsl-exists-query
QueryBuilder qb = existsQuery("name");
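The sketch below shows, on made-up in-memory documents, why summing bucket doc counts overshoots for a multi-valued field, and then one possible way to keep everything in a single request: a filter aggregation wrapping the exists query, placed next to the terms aggregation. The field name `tags` and the aggregation names are assumptions for illustration.

```python
docs = [
    {"tags": ["a", "b"]},
    {"tags": ["a"]},
    {"title": "no tags here"},
]

# Summing terms-bucket doc counts counts a document once per distinct value it holds.
bucket_counts = {}
for doc in docs:
    for value in set(doc.get("tags", [])):
        bucket_counts[value] = bucket_counts.get(value, 0) + 1
summed = sum(bucket_counts.values())

# What the question actually wants: the number of documents that have the field.
with_field = sum(1 for doc in docs if doc.get("tags"))

print(summed, with_field)  # 3 2

# Single request: terms buckets plus a doc count of documents where the field exists.
body = {
    "size": 0,
    "aggs": {
        "tags_terms": {"terms": {"field": "tags"}},
        "docs_with_tags": {"filter": {"exists": {"field": "tags"}}},
    },
}
```

In the response, the `doc_count` of the `docs_with_tags` aggregation would be the number of documents containing the field, independent of how many values each one holds.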

Elastic Search using NEST

How can I sort data on multiple fields in Elasticsearch using NEST queries?
I need to sort the result set on two fields, let's say price and kilometer for cars, each in ascending or descending order.
How can I get the top five results based on some conditions or filters?
Please provide some links if available.
This should answer your questions:
IEnumerable<string> searchableFields = new List<string>() { "price", "kilometer" };
SearchDescriptor<T> descriptor = new SearchDescriptor<T>();
descriptor = descriptor.Size(5).OnFields(searchableFields)
.Sort(s => s.OnField("price").Descending().OnField("kilometer").Ascending());
var result = client.Search<T>(body => descriptor);
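For comparison, the same intent can be sketched as a raw request body: a sort array with one entry per field, plus a size of 5. The Python below builds that body and applies the equivalent ordering to a few made-up cars in memory to show how the second field only breaks ties in the first. The helper names and sample data are assumptions for illustration.

```python
def sort_body(size=5):
    """Search body: highest price first, lowest kilometer first among equal prices."""
    return {
        "size": size,
        "sort": [
            {"price": {"order": "desc"}},
            {"kilometer": {"order": "asc"}},
        ],
    }


def top_cars(cars, size=5):
    """In-memory equivalent of the sort above, for illustration only."""
    return sorted(cars, key=lambda c: (-c["price"], c["kilometer"]))[:size]


cars = [
    {"model": "A", "price": 9000, "kilometer": 120000},
    {"model": "B", "price": 12000, "kilometer": 80000},
    {"model": "C", "price": 12000, "kilometer": 30000},
]
print([c["model"] for c in top_cars(cars)])  # ['C', 'B', 'A']
```

Any filters or conditions would go into the `query` part of the same body; the sort and size then apply to whatever documents match.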
