elasticsearch regroup aggs by prefix - elasticsearch

Hi i make a search facets with elasticsearch so i use aggs to get this facets but i would like to regroup all term start with text until the first '/'
Example some value are indexing like 'levelone/leveltwo'
but i would like to regroup all same levelone value
i try that but it does not work
'aggs' = array( 'tags => array ('terms' => array (
'field' => $filter,
'size' => 0,
'include' => ".*/.*",
)
)
)
Is it possible ?

Related

Query one field with multiple values in elasticsearch nest

I have a combination of two queries with Elasticsearch and nest, the first one is a full-text search for a specific term and the second one is to filter or query another field which is file-path but it should be for many files paths and the path could be part or full path, I can query one file-path but I couldn't manage to do it for many file paths, any suggestion?
Search<SearchResults>(s => s
.Query(q => q
.Match(m => m.Field(f => f.Description).Query("Search_term"))
&& q
.Prefix(t => t.Field(f => f.FilePath).Value("file_Path"))
)
);
For searching for more than one path you can use bool Query in elasticsearch and then use Should Occur to search like logical OR, so you code should look like this:
Search<SearchResults>(s => s
.Query(q => q.
Bool(b => b
.Should(
bs => bs.Wildcard(p => p.FilePath, "*file_Pathfile_Path*"),
bs => bs.Wildcard(p => p.FilePath, "*file_Pathfile_Path*"),
....
))
&& q.Match(m => m.Field(f => f.description).Query("Search_term")
)));
Also you should use WildCard Query to get result for paths that could be part or full path. For more information check ES offical documentation about WildQuery and Bool Query below:
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-bool-query.html
https://www.elastic.co/guide/en/elasticsearch/client/net-api/current/bool-queries.html
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-wildcard-query.html

Elasticsearch aggregations: strategy for accessing additional fields

I am using the following code for aggregations. It only return the Id and I would also like to return the Name. We did a project a few years ago and we indexed a field called idName (separated by |) but it was so messy solution. I am wondering if there are any better ways to do this with the recent version of Elastic?
.Aggregations(aggs => aggs
.Nested("nested_cat", nested => nested
.Path(p => p.Categories)
.Aggregations(a1 => a1
.Terms("terms_cat_id", terms1 => terms1
.Field(f1 => f1.Categories.First().Id)
)
)
)
)
I read there are 2 other options. One is to do a sub-aggregation, however the following doesn't seem to work:
.Aggregations(aggs => aggs
.Nested("nested_cat", nested => nested
.Path(p => p.Categories)
.Aggregations(a1 => a1
.Terms("terms_cat_id", terms1 => terms1
.Field(f1 => f1.Categories.First().Id)
.Aggregations(a2 => a2
.Terms("terms_cat_name", terms2 => terms2
.Field(f2 => f2.Categories.First().Name)
)
)
)
)
)
)
I also read that I can use Scripts, but I haven't gotten that to work either.
I was able to get the sub-aggregation to work by mapping category.name as "Keyword" instead of "Text"

ElasticSearch _count and _search apply same query

i'm trying get count of total results for paginator, but the problem is that _filter and _count have different arguments, for example:
_search:
'from' => $offset,
'size' => $limit,
'query' => [
'match_phrase' => $q,
]
_count:
'index' => $index,
'type' => $type,
'q' => $q,
I need apply match_phrase also inside _count, because when will keep there just q then count/number will don't be correct...but does _count accept match_phrase?
It's _count right way or must use different way? I was searching long hours, but found just this Elastic Search _search vs. _count syntax but it's saying nothing for me..

ElasticSearch NEST custom word joiner Analyzer not returning the correct result

I created an autocomplete filter with ElasticSearch using NEST API. I cant seem to get the word joiner to work.
So basically if I search for something like Transhex i also want to be able to return Trans Hex
My Index looks as follow...I think the WordDelimiter filter might be wrong.
Also, I followed the following article Link. They use the low-level API so it is possible that I am doing it completely wrong using the NEST API
var response = this.Client.CreateIndex(
"company-index",
index => index.Mappings(
ms => ms.Map<CompanyDocument>(m => m.Properties(p => p
.Text(t => t.Name(n => n.CompanyName).Analyzer("auto-complete")
.Fields(ff => ff.Keyword(k => k.Name("keyword")))))))
.Settings(f => f.Analysis(
analysis => analysis
.Analyzers(analyzers => analyzers
.Custom("auto-complete", a => a.Tokenizer("standard").Filters("lowercase", "word-joiner-filter", "auto-complete-filter")))
.TokenFilters(tokenFilter => tokenFilter
.WordDelimiter("word-joiner-filter", t => t.CatenateAll())
.EdgeNGram("auto-complete-filter", t => t.MinGram(3).MaxGram(30))))));
UPDATE
So I updated the Analyzer to look as follows, noticed that I updated the Analyzer from standard to keyword.
var response = this.Client.CreateIndex(
this.indexName,
index => index.Mappings(
ms => ms.Map<CompanyDocument>(m => m.Properties(p => p
.Text(t => t.Name(n => n.CompanyName).Analyzer("auto-complete")
.Fields(ff => ff.Keyword(k => k.Name("keyword")))))))
.Settings(f => f.Analysis(
analysis => analysis
.Analyzers(analyzers => analyzers
.Custom("auto-complete", a => a.Tokenizer("keyword").Filters("lowercase", "word-joiner-filter", "auto-complete-filter")))
.TokenFilters(tokenFilter => tokenFilter
.WordDelimiter("word-joiner-filter", t => t.CatenateAll())
.EdgeNGram("auto-complete-filter", t => t.MinGram(1).MaxGram(20))))));
The Results will look as follows
Search Keyword : perfect pools
Results
perfect pools -> This is the correct one at the top
EXCLUSIVE POOLS
Perfect Painters
Search Keyword : perfectpools Or PerfectPools
Results
Perfect Hideaways (Pty) Ltd -> this is the wrong one i would like to display perfect pools
PERFORMANTA APAC PTY LTD
Perfect Laser Technologies (PTY) LTD
Use Keyword tokenizer. The standard tokenizer will split the word in 2 tokens, then apply the filters on them.
UPDATE:
I used a search like this one and seems ok.
var searchResult = EsClient.Search<CompanyDocument>(q => q
.Index("test_index")
.Type("companydocument")
.TrackScores(true)
.Query(qq =>
{
QueryContainer queryContainer = null;
queryContainer = qq.QueryString(qs => qs.Fields(fs => fs.Field(f => f.CompanyName)).Query("perfectpools").DefaultOperator(Operator.And).Analyzer("auto-complete"));
return queryContainer;
})
.Sort(sort => sort.Descending(SortSpecialField.Score))
.Take(10)
);

Is it possible to query aggregations on NEST for multiple term fields (.NET)?

Below is the NEST query and aggregations:
Func<QueryContainerDescriptor<ConferenceWrapper>, QueryContainer> query =
q =>
q.Term(p => p.type, "conference")
// && q.Term(p => p.conference.isWaitingAreaCall, true)
&& q.Range(d => d.Field("conference.lengthSeconds").GreaterThanOrEquals(minSeconds))
&& q.DateRange(qd => qd.Field("conference.firstCallerStart").GreaterThanOrEquals(from.ToUniversalTime().ToString("yyyy-MM-ddTHH:mm:ss.fffZ")))
&& q.DateRange(qd => qd.Field("conference.firstCallerStart").LessThanOrEquals(to.ToUniversalTime().ToString("yyyy-MM-ddTHH:mm:ss.fffZ")));
Func<AggregationContainerDescriptor<ConferenceWrapper>, IAggregationContainer> waitingArea =
a => a
.Terms("both", t => t
.Field(p => p.conference.orgNetworkId) // seems ignore this field
.Field(p => p.conference.isWaitingAreaCall)
// .Field(new Field( p => p.conference.orgNetworkId + "-ggg-" + p.conference.networkId)
.Size(300)
.Aggregations(a2 => a2.Sum("sum-length", d2 => d2.Field("conference.lengthSeconds"))));
I have called .Field(p => p.conference.orgNetworkId) followed by .Field(p => p.conference.isWaitingAreaCall) But it seems the NEST client tries to ignore the first field expression.
Is is possible to have multiple fields to be the terms group by?
Elasticsearch doesn't support a terms aggregation on multiple fields directly; the calls to .Field(...) within NEST are assignative rather than additive, so the last call will overwrite any previously set values.
In order to aggregate on multiple fields, you can either
Create a composite field at index time that incorporates the values that you wish to aggregate on
or
Use a Script to generate the terms on which to aggregate at query time, by combining the two field values.
The performance of the first option will be better than the second.

Resources