How can I sort by a String field with has_parent in ElasticSearch

I created a parent-child relationship in ElasticSearch.
I use has_parent to query the child documents, but I want to sort the results by a parent field that is a String.
While looking for a solution, I found this sorting example in the ElasticSearch documentation:
"query": {
"has_parent" : {
"parent_type" : "blog",
"score" : true,
"query" : {
"function_score" : {
"script_score": {
"script": "_score * doc['view_count'].value"
}
}
}
}
}
https://www.elastic.co/guide/en/elasticsearch/reference/5.5/query-dsl-has-parent-query.html#_sorting_2
It multiplies _score by doc['view_count'].value, which is an integer, but I want to do a similar calculation on a String field (with Groovy) so that I can sort by it, and I don't know how to do that.
If you have another idea, please tell me.
Thanks for helping me.

Related

How can we do a case-insensitive cardinality aggregation?

We can use cardinality to get a distinct count on a field, but cardinality is case sensitive: if we have emails like user@x.com, User@x.com and USER@x.com, they count as 3 emails, whereas I need them to count as a single email.
This is the aggregation I am using:
"aggs" : {
"emails" : {
"cardinality" : {
"field" : "emails.keyword"
}
}
}
I would need something like:
"aggs" : {
"emails" : {
"cardinality" : {
"field" : "emails.keyword",
"casesensitive": false ????
}
}
}
How can we make a cardinality aggregation case insensitive?
Although I would go with Val's suggestion, here is a query that may be useful if you do not control the mapping. It uses a custom script in the Cardinality Aggregation.
Aggregation Query:
POST <your_index_name>/_search
{
"size":0,
"aggs":{
"email_count":{
"cardinality":{
"script":{
"source":"doc['emails.keyword'].toString().toLowerCase()"
}
}
}
}
}
Note that you would find more details on Scripting in the aforementioned link.
Hope this helps!
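To see why lowercasing inside the script collapses the three addresses from the question into one value, here is a minimal Python sketch. It only mimics the counting on the client side and assembles a request body like the one above as a dict. The helper names `distinct_case_insensitive` and `build_cardinality_body` are hypothetical, the field name `emails.keyword` is taken from the question, and nothing here talks to a cluster.

```python
def distinct_case_insensitive(values):
    """Count distinct strings after lowercasing each one."""
    return len({v.lower() for v in values})


def build_cardinality_body(field):
    """Assemble a search body with a scripted cardinality aggregation.

    Hypothetical helper: it only builds a dict, it is not a client API.
    """
    return {
        "size": 0,
        "aggs": {
            "email_count": {
                "cardinality": {
                    "script": {
                        "source": "doc['%s'].toString().toLowerCase()" % field
                    }
                }
            }
        },
    }


emails = ["user@x.com", "User@x.com", "USER@x.com"]
print(len(set(emails)))                   # 3: case-sensitive distinct count
print(distinct_case_insensitive(emails))  # 1: distinct count after lowercasing
```

Keep in mind that the cardinality aggregation itself is approximate, so on large data sets the count returned by Elasticsearch may differ slightly from an exact count like the one computed here.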

Fuzzy searching with query_string in Elasticsearch

I have a record saved in Elasticsearch which contains a string exactly equal to Clash of clans.
Now I want to search for this string with Elasticsearch, and I am using this:
{
"query_string" : {
"query" : "clash"
}
}
It works perfectly, but if I write
"query" : "class"
it doesn't return any record. I realized I need fuzzy searching, and I learned that query_string accepts a fuzziness parameter, so I tried
{
"query_string" : {
"query" : "clas",
"fuzziness":1
}
}
but Elasticsearch still returns nothing!
Kindly help. I can't use the fuzzy query; I can only use query_string.
Thanks
You need to use the ~ operator to have fuzzy searching in query_string:
{
"query": {
"query_string": {
"query": "class~"
}
}
}
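The answer above relies on the ~ operator of the query_string syntax. As far as I know, the fuzziness parameter only caps the edit distance that ~ may use and does not switch fuzzy matching on by itself, which would explain the behaviour in the question. A minimal Python sketch that appends the operator to every term could look like this. The helper name `fuzzy_query_string` is hypothetical, and it does not escape reserved query_string characters.

```python
def fuzzy_query_string(text, max_edits=None):
    """Build a query_string body with the ~ fuzzy operator on every term.

    Hypothetical helper: terms are split on whitespace and reserved
    characters of the query_string syntax are NOT escaped.
    """
    suffix = "~" if max_edits is None else "~%d" % max_edits
    terms = [term + suffix for term in text.split()]
    return {"query": {"query_string": {"query": " ".join(terms)}}}


print(fuzzy_query_string("class"))
print(fuzzy_query_string("clas of clans", max_edits=1))
```

An explicit distance such as ~1 keeps the match tighter than the bare ~, which may matter for short terms like the ones in the question.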

How to use Lucene SpanQuery in ElasticSearch

For my project, I thought of using the Span Near Query of ElasticSearch, with the constraint that certain tokens may have to be searched with fuzziness. I was able to generate a set of SpanQuery (org.apache.lucene.search.spans.SpanQuery) objects, some with fuzziness enabled and some without. I couldn't figure out how to use this set of SpanQuery objects in the ElasticSearch spanNearQuery.
Can someone point me to the right samples or docs? And is there any way to construct an ES SpanNearQueryBuilder with some clauses fuzzy enabled?
You can wrap a fuzzy query in a span query with the Span Multi Term Query:
{
"span_near" : {
"clauses" : [
{ "span_term" : { "field" : "value1" } },
{ "span_multi" : {
"match" : {
"fuzzy" : { "field" : { "value" : "value2" } }
}
}
}
],
...
}
}
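As a rough illustration of the idea above, the following Python sketch assembles a span_near body in which only selected terms are wrapped in span_multi with a fuzzy query. The helper name `span_near_with_fuzzy`, its parameters, and the example field and terms are all hypothetical. It only builds the JSON body and is not the Java SpanNearQueryBuilder API.

```python
def span_near_with_fuzzy(field, terms, fuzzy_terms, slop=0, in_order=True):
    """Build a span_near body where terms listed in fuzzy_terms match fuzzily.

    Hypothetical helper: exact terms become span_term clauses, fuzzy ones
    become span_multi clauses wrapping a fuzzy query.
    """
    clauses = []
    for term in terms:
        if term in fuzzy_terms:
            clauses.append(
                {"span_multi": {"match": {"fuzzy": {field: {"value": term}}}}}
            )
        else:
            clauses.append({"span_term": {field: term}})
    return {"span_near": {"clauses": clauses, "slop": slop, "in_order": in_order}}


# Example values are made up for illustration.
body = span_near_with_fuzzy("field", ["value1", "value2"], fuzzy_terms={"value2"})
print(body)
```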

Filter facet returns count of all documents and not range

I'm using Elasticsearch and Nest to create a query for documents within a specific time range as well as doing some filter facets. The query looks like this:
{
"facets": {
"notfound": {
"query": {
"term": {
"statusCode": {
"value": 404
}
}
}
}
},
"filter": {
"bool": {
"must": [
{
"range": {
"time": {
"from": "2014-04-05T05:25:37",
"to": "2014-04-07T05:25:37"
}
}
}
]
}
}
}
In this specific case, the total hits of the search is 21 documents, which matches the documents within that time range in Elasticsearch. But the "notfound" facet returns 38, which matches the total number of ErrorDocuments with a StatusCode value of 404.
As I understand the documentation, facets collect data from within the search. In this case, the "notfound" facet should never be able to return a count higher than 21.
What am I doing wrong here?
There's a distinct difference between filter/query/filtered_query/facet filter which is good to know.
Top level filter
{
"filter": {}
}
This acts as a post-filter, meaning it filters the results after the query phase has ended. Since facets are part of the query phase, top-level filters do not influence the documents that are faceted over. Filters do not alter the score and are therefore very cacheable.
Top level query
{
"query": {}
}
Queries influence the score of a document and are therefore less cacheable than filters. Queries run in the query phase and thus also influence the documents that are faceted over.
Filtered query
{
"query": {
"filtered": {
"filter": {},
"query": {}
}
}
}
This allows you to run filters in the query phase, taking advantage of their better cacheability, and have them influence the documents that are faceted over.
Facet filter
"facets" : {
"<FACET NAME>" : {
"<FACET TYPE>" : {
...
},
"facet_filter" : {
"term" : { "user" : "kimchy"}
}
}
}
This allows you to apply a filter to the documents that the facet is run over. Remember that the facet sees the combination of the query phase and the facet filter unless you also specify global: true on the facet.
Query Facet/Filter Facet
{
"facets" : {
"wow_facet" : {
"query" : {
"term" : { "tag" : "wow" }
}
}
}
}
This is the one that @thomasardal is using in this case, which is perfectly fine: it's a facet type that returns a single value, the query hit count.
Your Query Facet returns 38 and not 21 because you use a top-level filter for your time range.
You can fix this either by moving the filter into a filtered query so that it runs in the query phase, or by applying a facet filter (not a filter facet) to your query facet. Because filters are cached better, though, you are better off using a facet filter inside a filter facet.
Confusingly, Filter Facets are specified using .FacetFilter() on the search object. I will change this in 1.0 to avoid future confusion.
Sadly, .FacetFilter() and .FacetQuery() in NEST do not allow you to specify a facet filter like you can with other facets:
var results = typedClient.Search<object>(s => s
.FacetTerm(ft=>ft
.OnField("myfield")
.FacetFilter(f=>f.Term("filter_facet_on_this_field", "value"))
)
);
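The first fix described above (moving the time range into a filtered query so that the query facet only sees documents in range) could look roughly like the following Python sketch. It just assembles the request body as a dict for the facet-era DSL discussed here. The helper name `build_notfound_search` is hypothetical, and the field names and dates are copied from the question.

```python
import json


def build_notfound_search(time_from, time_to):
    """Build a body whose range filter runs in the query phase.

    Hypothetical helper: the range lives inside query.filtered.filter,
    not in a top-level filter, so the facet is restricted by it.
    """
    time_range = {"range": {"time": {"from": time_from, "to": time_to}}}
    return {
        "query": {
            "filtered": {
                "query": {"match_all": {}},
                "filter": time_range,
            }
        },
        "facets": {
            "notfound": {
                "query": {"term": {"statusCode": {"value": 404}}}
            }
        },
    }


body = build_notfound_search("2014-04-05T05:25:37", "2014-04-07T05:25:37")
print(json.dumps(body, indent=2))
```

With the range inside the filtered query there is no top-level filter left, so the hit count and the facet count are computed over the same set of documents.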
Your issue here is that you are performing a Filter Facet and not a normal facet on your query (which will follow the restrictions applied via the query filter). In the JSON, the issue is caused by the "query" between the facet name "notfound" and the "term" entry. This tells Elasticsearch to run this as a separate query and facet on the results of that separate query, not on your main query with the date range filter. So your JSON should look like the following:
{
"facets": {
"notfound": {
"term": {
"statusCode": {
"value": 404
}
}
}
},
"filter": {
"bool": {
"must": [
{
"range": {
"time": {
"from": "2014-04-05T05:25:37",
"to": "2014-04-07T05:25:37"
}
}
}
]
}
}
}
Since I see you have this tagged with NEST as well: in your call using NEST, you are probably using FacetFilter on your search request. Switch this to just Facet to get the desired result.

Full-text schema in ElasticSearch

I'm (extremely) new to ElasticSearch so forgive my potentially ridiculous question. I currently use MySQL to perform full-text searches, and want to move this to ElasticSearch. Currently my table has a fulltext index spanning three columns:
title,description,tags
In ES, each document would therefore have title, description and tags fields, allowing me to do a fulltext search for a general phrase, or filter on a given tag.
I also want to add further searchable fields such as username (so I can retrieve posts by a given user). So, how do I specify that a fulltext search should match title OR description OR tags but not username?
From the OR filter example, I'd assume I'd have to use something like this:
{
"filtered" : {
"query" : {
"match_all" : {}
},
"filter" : {
"or" : [
{
"term" : { "title" : "foobar" }
},
{
"term" : { "description" : "foobar" }
},
{
"term" : { "tags" : "foobar" }
}
]
}
}
}
Coming at this new, it doesn't seem like this is very efficient. Is there a better way of doing this, or do I need to move the username field to a separate index?
This is fine.
In general, I would suggest getting familiar with ElasticSearch mapping types and options.
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping.html
