Creating a custom analyzer in ElasticSearch Nest client

I'm very new to Elasticsearch and the NEST client. I am creating an index with a custom analyzer, but when I test it with the Analyze API it does not seem to use the custom analyzer: no edge n-gram tokens appear. Is there anything I am missing that would make my custom analyzer the default for the index? When I check my mappings using ElasticHQ, they show my custom analyzer.
ConnectionSettings settings = new ConnectionSettings(new Uri("http://localhost:9200"), defaultIndex: "forum-app");
IndexSettings indsettings = new IndexSettings();
var an = new CustomAnalyzer();
an.CharFilter = new List<string>();
an.CharFilter.Add("html_strip");
an.Tokenizer = "edgeNGram";
an.Filter = new List<string>();
an.Filter.Add("standard");
an.Filter.Add("lowercase");
an.Filter.Add("stop");
indsettings.Analysis.Tokenizers.Add("edgeNGram", new Nest.EdgeNGramTokenizer
{
    MaxGram = 15,
    MinGram = 3
});
indsettings.Analysis.Analyzers.Add("forumanalyzer", an);
ElasticClient client = new ElasticClient(settings);
client.CreateIndex("forum-app", c => c
    .NumberOfReplicas(0)
    .NumberOfShards(1)
    .AddMapping<Forum>(e => e.MapFromAttributes())
    .Analysis(analysis => analysis
        .Analyzers(a => a
            .Add("forumanalyzer", an)
        )));
// To index I just do this
client.Index(aForum);

You've added your custom analyzer to your index, but now you need to apply it to your fields. You can do this at the field mapping level:
client.CreateIndex("forum-app", c => c
    .NumberOfReplicas(0)
    .NumberOfShards(1)
    .AddMapping<Forum>(e => e
        .MapFromAttributes()
        .Properties(p => p
            // The analyzer name must match the name registered under Analysis below.
            .String(s => s.Name(f => f.SomeProperty).Analyzer("forumanalyzer")))
    )
    .Analysis(analysis => analysis
        .Analyzers(a => a
            .Add("forumanalyzer", an)
        )
    )
);
Or you can apply it to all fields by default by setting it as the default analyzer of your index:
client.CreateIndex("forum-app", c => c
    .NumberOfReplicas(0)
    .NumberOfShards(1)
    .AddMapping<Forum>(e => e.MapFromAttributes())
    .Analysis(analysis => analysis
        .Analyzers(a => a
            .Add("default", an)
        )
    )
);
See the Elasticsearch documentation on analyzer defaults for more information.
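When testing with the Analyze API, it helps to know roughly which tokens to expect once the analyzer is really applied. The sketch below is plain C#, not NEST or Lucene code: the hypothetical helper Prefixes only emits the leading substrings of a single word for the MinGram = 3 and MaxGram = 15 values from the question, and ignores character filters, token filters and any token_chars setting.

```csharp
using System;
using System.Collections.Generic;

public static class EdgeNGramSketch
{
    // Hypothetical helper (not part of NEST): returns the prefixes of `text`
    // whose lengths lie between minGram and maxGram, shortest first.
    public static List<string> Prefixes(string text, int minGram, int maxGram)
    {
        var tokens = new List<string>();
        int longest = Math.Min(maxGram, text.Length);
        for (int length = minGram; length <= longest; length++)
        {
            tokens.Add(text.Substring(0, length));
        }
        return tokens;
    }

    public static void Main()
    {
        // Prints: for, foru, forum (one per line)
        foreach (var token in Prefixes("forum", 3, 15))
        {
            Console.WriteLine(token);
        }
    }
}
```

If the Analyze API returns only whole words for a field rather than prefixes like these, the field is probably still being analyzed with the standard analyzer.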

Add a custom analyzer:
var indexSettings = new IndexSettings
{
    NumberOfReplicas = 0, // If this is set to 1 or more, then the index becomes yellow.
    NumberOfShards = 5
};
indexSettings.Analysis = new Analysis();
indexSettings.Analysis.Analyzers = new Analyzers();
indexSettings.Analysis.TokenFilters = new TokenFilters();
var customAnalyzer = new CustomAnalyzer
{
    //CharFilter = new List<string> { "mapping " },
    Tokenizer = "standard",
    Filter = new List<string> { "lowercase", "asciifolding" }
};
indexSettings.Analysis.Analyzers.Add("customAnalyzerLowercaseSynonymAsciifolding", customAnalyzer);
And then when creating the index, you specify the analyzer:
var indexConfig = new IndexState
{
    Settings = indexSettings
};
var createIndexResponse = elasticClient.CreateIndex(indexName, c => c
    .InitializeUsing(indexConfig)
    .Mappings(m => m
        .Map<ElasticsearchModel>(mm => mm
            .Properties(p => p
                .Text(t => t.Name(elasticsearchModel => elasticsearchModel.StringTest).Analyzer("customAnalyzerLowercaseSynonymAsciifolding"))
            )
        )
    )
);
elasticClient.Refresh(indexName);
And then you query it with something like:
var response = elasticClient.Search<ElasticsearchModel>(s => s
    .Index(indexName)
    .Query(q => q
        .SimpleQueryString(qs => qs
            .Fields(fs => fs
                .Field(f => f.StringTest, 4.00)
            )
            .Query(query)
        )
    )
);
// Collect the source documents of all hits.
var results = response.Hits.Select(hit => hit.Source).ToList();

Related

ElasticSearch NEST ObjectInitializer syntax to fluent syntax translation not working

Given this ObjectInitializer NEST query
var mustClauses = new List<QueryContainer>
{
    new QueryStringQuery
    {
        Query = queryFilter.Query,
        Lenient = true
    },
    new MatchQuery
    {
        Field = new Field("status"),
        Query = queryFilter.Status,
        Lenient = true,
        Operator = Operator.And
    },
    new DateRangeQuery
    {
        Field = new Field("timeSent"),
        LessThanOrEqualTo = now,
        GreaterThanOrEqualTo = GetDateTimeFor(queryFilter.TimeCriteria, now)
    }
};
return client.SearchAsync<Ingestion.Entities.ElasticSearch.MessageData>(sd => sd
    .Query(q => q.Bool(b => b.Must(mustClauses.ToArray())))
    .Sort(x => x.Descending(b => b.TimeSent))
    .From(from)
    .Size(pageSize));
which works, and outputs the following query to my Visual Studio Output window:
{"from":0,"query":{"bool":{"must":[{"match":{"status":{"lenient":true,"operator":"and","query":"Success"}}},{"range":{"timeSent":{"gte":"2021-12-14T03:39:26.5126419Z","lte":"2021-12-21T03:39:26.5126419Z"}}}]}},"size":20,"sort":[{"timeSent":{"order":"desc"}}]}
I am trying to convert it to fluent query syntax like this:
return client.SearchAsync<Ingestion.Entities.ElasticSearch.MessageData>(sd => sd
    .Query(q => q.Bool(b => b.Must(
        mu => mu
            .QueryString(qs => qs
                .Query(queryFilter.Query)
                .Lenient(true)),
        mu =>
        {
            if (string.IsNullOrEmpty(queryFilter.Status))
                return null;
            return mu
                .Match(ma => ma.Field(f => f.Status == queryFilter.Status)
                    .Lenient(true)
                    .Operator(Operator.And));
        },
        mu =>
        {
            if (queryFilter.TimeCriteria == TimeCriteria.All)
                return null;
            return mu.DateRange(dr => dr
                .Field(f => f.TimeSent)
                .LessThanOrEquals(now)
                .GreaterThanOrEquals(GetDateTimeFor(queryFilter.TimeCriteria, now)));
        })))
    .Sort(x => x.Descending(b => b.TimeSent))
    .From(from)
    .Size(pageSize));
and it's not working. When I run this query, the Match query on that Status field does not appear in the NEST output.
Any guidance/help would be appreciated.

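The Field selector should only pick the property, and the value to match belongs in .Query(). Without a .Query() value the match query is conditionless, and NEST leaves conditionless queries out of the request. The status lambda expression should be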
mu =>
{
    if (string.IsNullOrEmpty(queryFilter.Status))
        return null;
    return mu
        .Match(ma => ma.Field(f => f.Status)
            .Query(queryFilter.Status)
            .Lenient(true)
            .Operator(Operator.And));
},
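To illustrate the difference between the two selectors, here is a small sketch in plain C# with no NEST dependency. It compares the expression trees the compiler builds for f => f.Status and for f => f.Status == value. The MessageData class is a stand-in for the real entity and MemberName is a hypothetical helper; NEST's own field inference is more involved, so treat this only as an illustration of why a comparison cannot name a field.

```csharp
using System;
using System.Linq.Expressions;

// Stand-in for the real entity; only the property used here is modelled.
public class MessageData
{
    public string Status { get; set; }
}

public static class FieldExpressionSketch
{
    // Hypothetical helper: returns the property name when the lambda is a
    // plain member access, or null when it is anything else.
    public static string MemberName(Expression<Func<MessageData, object>> selector)
    {
        Expression body = selector.Body;

        // Value-typed results (such as bool) are boxed to object through a
        // Convert node, so unwrap it before inspecting the body.
        if (body.NodeType == ExpressionType.Convert && body is UnaryExpression unary)
        {
            body = unary.Operand;
        }

        var member = body as MemberExpression;
        return member == null ? null : member.Member.Name;
    }

    public static void Main()
    {
        string status = "Success";

        // A member access names a property: prints "Status".
        Console.WriteLine(MemberName(f => f.Status) ?? "(no member)");

        // A comparison is a binary expression: prints "(no member)".
        Console.WriteLine(MemberName(f => f.Status == status) ?? "(no member)");
    }
}
```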

Composite Aggregation with After functionality

I am looking for a code snippet that shows how to use the After functionality of the composite aggregation with the NEST library:
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-composite-aggregation.html#_after
Thanks in advance for the code snippet.

You can pass the CompositeKey from a previous composite aggregation as the .After() parameter for a new composite aggregation. For example:
var pool = new SingleNodeConnectionPool(new Uri("http://localhost:9200"));
var settings = new ConnectionSettings(pool);
var client = new ElasticClient(settings);
var searchResponse = client.Search<object>(s => s
    .From(0)
    .AllIndices()
    .AllTypes()
    .Aggregations(a => a
        .Composite("composite_agg", c => c
            .Sources(so => so
                .DateHistogram("date", dh => dh
                    .Field("timestamp")
                    .Interval("1d")
                )
                .Terms("product", t => t
                    .Field("product")
                )
            )
        )
    )
);
var compositeAgg = searchResponse.Aggregations.Composite("composite_agg");
searchResponse = client.Search<object>(s => s
    .From(0)
    .AllIndices()
    .AllTypes()
    .Aggregations(a => a
        .Composite("composite_agg", c => c
            .Sources(so => so
                .DateHistogram("date", dh => dh
                    .Field("timestamp")
                    .Interval("1d")
                )
                .Terms("product", t => t
                    .Field("product")
                )
            )
            .After(compositeAgg.AfterKey) // <-- pass the after key from previous agg response
        )
    )
);
Assuming you're using Elasticsearch 6.x (which you must be to be using Composite Aggregation), please update NEST client to latest (6.6.0 at this time), as it contains a bug fix for a CompositeKey with null values.

Does Elasticsearch Nest support Update By Query

I want to use the UpdateByQuery method on the high-level client but can't find any documentation for NEST. There is great documentation for making a curl request, but nothing for NEST: https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-update-by-query.html
If anyone has an example of using it, or can share documentation they have found, that would be awesome!

Update By Query API is supported in NEST. Here's an example adapted from the integration tests. NEST Documentation for Index and Update APIs is planned :)
private static void Main()
{
    var pool = new SingleNodeConnectionPool(new Uri("http://localhost:9200"));
    var settings = new ConnectionSettings(pool)
        .DefaultMappingFor<Test>(m => m
            .IndexName("tests")
            .TypeName("test")
        );
    var client = new ElasticClient(settings);
    var index = IndexName.From<Test>();
    if (client.IndexExists(index).Exists)
        client.DeleteIndex(index);
    client.CreateIndex(index, c => c
        .Mappings(m => m
            .Map<Test>(map => map
                .Properties(props => props
                    .Text(s => s.Name(p => p.Text))
                    .Keyword(s => s.Name(p => p.Flag))
                )
            )
        )
    );
    client.Bulk(b => b
        .IndexMany(new[] {
            new Test { Text = "words words", Flag = "bar" },
            new Test { Text = "words words", Flag = "foo" }
        })
        .Refresh(Refresh.WaitFor)
    );
    client.Count<Test>(s => s
        .Query(q => q
            .Match(m => m
                .Field(p => p.Flag)
                .Query("foo")
            )
        )
    );
    client.UpdateByQuery<Test>(u => u
        .Query(q => q
            .Term(f => f.Flag, "bar")
        )
        .Script("ctx._source.flag = 'foo'")
        .Conflicts(Conflicts.Proceed)
        .Refresh(true)
    );
    client.Count<Test>(s => s
        .Query(q => q
            .Match(m => m
                .Field(p => p.Flag)
                .Query("foo")
            )
        )
    );
}
public class Test
{
    public string Text { get; set; }
    public string Flag { get; set; }
}
Observe that the count from the first Count API call is 1, and on the second Count API call after the Update By Query API call, it's 2.

NEST aggs collection is read only

I'm using the NEST client with the following syntax:
_server.Search<Document>(s => s.Index(_config.SearchIndex)
    .Query(q => q.MatchAll(p => p)).Aggregations(
        a => a
            .Terms("type", st => st
                .Field(p => p.Type)
            )));
However I keep getting the following exception
A first chance exception of type 'System.NotSupportedException' occurred in mscorlib.dll
Additional information: Collection is read-only.
It only seems to occur when I'm using aggregations, the field of Type has the following mapping:
[Keyword(Name = "Type")]
public string Type { get; set; }

I would double check the versions of NEST and Elasticsearch.Net that you are using. I just tried the following example with Elasticsearch 5.1.2 and NEST 5.0.1 and don't see the issue:
void Main()
{
    var pool = new SingleNodeConnectionPool(new Uri("http://localhost:9200"));
    var defaultIndex = "default-index";
    var connectionSettings = new ConnectionSettings(pool)
        .DefaultIndex(defaultIndex);
    var client = new ElasticClient(connectionSettings);
    if (client.IndexExists(defaultIndex).Exists)
        client.DeleteIndex(defaultIndex);
    client.CreateIndex(defaultIndex, c => c
        .Mappings(m => m
            .Map<Document>(mm => mm.AutoMap())
        )
    );
    var documents = Enumerable.Range(1, 100).Select(i =>
        new Document
        {
            Type = $"Type {i % 5}"
        }
    );
    client.IndexMany(documents);
    client.Refresh(defaultIndex);
    var searchResponse = client.Search<Document>(s => s
        .Index(defaultIndex)
        .Query(q => q.MatchAll(p => p))
        .Aggregations(a => a
            .Terms("type", st => st
                .Field(p => p.Type)
            )
        )
    );
    foreach (var bucket in searchResponse.Aggs.Terms("type").Buckets)
        Console.WriteLine($"key: {bucket.Key}, count {bucket.DocCount}");
}
public class Document
{
    [Keyword(Name = "Type")]
    public string Type { get; set; }
}
This outputs:
key: Type 0, count 20
key: Type 1, count 20
key: Type 2, count 20
key: Type 3, count 20
key: Type 4, count 20

Nest multisearch query writing as object initializer Syntax

I have a multi search query as below. Basically, I query the product and category types. I would like to make this query optional without writing the same code again.
In some cases I want to query only the product type, which means it would not be a multi search but a single search query. How can I split this query into two search queries? Something like below, I think.
return Client.MultiSearch(ms => ms
.Search<Product>("products", s => s
.Index(IndexName)
.Explain(explain)
.Query(q => q
.Bool(b => b
.Should(
sh => sh.MultiMatch(qs => qs
.Fields(d => d
.Field(Name + ".raw", NameBoost + 0.5)
.Field(Name, NameBoost)
.Type(TextQueryType.BestFields)
.Query(key))
))).From(startfrom).Size(size))
.Search<Category>("categories", s => s
.Index(IndexName)
.Explain(explain)
.Query(q => q.
Bool(b => b.
Should(sh => sh.
MultiMatch(m => m
.Fields(d => d
.Field(f => f.Name, NameBoost)
.Field(p => p.Name.Suffix("raw"), NameBoost + 0.5)).Type(TextQueryType.BestFields)
.Query(key)
)
))).From(startfrom).Size(size))
);
Something like the line below. I guess it is called object initializer syntax, according to this article:
Client.MultiSearch (SearchProductQuery && SearchCategoryQuery)
Is it possible?

This fluent API multi search
client.MultiSearch(ms => ms
    .Search<Product>("products", s => s
        .Index(IndexName)
        .Explain(explain)
        .Query(q => q
            .Bool(b => b
                .Should(sh => sh
                    .MultiMatch(qs => qs
                        .Fields(d => d
                            .Field(Name + ".raw", NameBoost + 0.5)
                            .Field(Name, NameBoost)
                        )
                        .Type(TextQueryType.BestFields)
                        .Query(key)
                    )
                )
            )
        )
        .From(startfrom)
        .Size(size)
    )
    .Search<Category>("categories", s => s
        .Index(IndexName)
        .Explain(explain)
        .Query(q => q
            .Bool(b => b
                .Should(sh => sh
                    .MultiMatch(m => m
                        .Fields(d => d
                            .Field(f => f.Name, NameBoost)
                            .Field(p => p.Name.Suffix("raw"), NameBoost + 0.5)
                        )
                        .Type(TextQueryType.BestFields)
                        .Query(key)
                    )
                )
            )
        )
        .From(startfrom)
        .Size(size)
    )
);
would be this Object Initializer Syntax (OIS) API multi search
var multiSearch = new MultiSearchRequest
{
    Operations = new Dictionary<string, ISearchRequest>
    {
        { "products", new SearchRequest<Product>(IndexName)
            {
                // Use the same explain variable as the fluent version.
                Explain = explain,
                Query = new BoolQuery
                {
                    Should = new QueryContainer[] {
                        new MultiMatchQuery
                        {
                            Fields =
                                ((Fields)Field.Create(Name + ".raw", NameBoost + 0.5))
                                .And(Name, NameBoost),
                            Type = TextQueryType.BestFields,
                            Query = key
                        }
                    }
                },
                From = startfrom,
                Size = size
            }
        },
        { "categories", new SearchRequest<Category>(IndexName)
            {
                Explain = explain,
                Query = new BoolQuery
                {
                    Should = new QueryContainer[] {
                        new MultiMatchQuery
                        {
                            Fields =
                                ((Fields)Infer.Field<Category>(f => f.Name, NameBoost))
                                .And<Category>(f => f.Name.Suffix("raw"), NameBoost + 0.5),
                            Type = TextQueryType.BestFields,
                            Query = key
                        }
                    }
                },
                From = startfrom,
                Size = size
            }
        },
    }
};
client.MultiSearch(multiSearch);
Take a look at the multi search integration tests for another example. I'll look at getting this added to the documentation.
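For the optional part of the question, one possible approach is to build each search request once and decide at call time which ones go into the Operations dictionary. The sketch below assumes the NEST package is referenced and uses only the MultiSearchRequest, ISearchRequest and Operations members shown above. The SearchRequests class, the Combine method and the includeCategories flag are hypothetical names introduced for illustration.

```csharp
using System.Collections.Generic;
using Nest;

public static class SearchRequests
{
    // Hypothetical helper: wraps already built search requests in one
    // MultiSearchRequest. The product search is always included; the
    // category search is added only when includeCategories is true.
    public static MultiSearchRequest Combine(
        ISearchRequest productSearch,
        ISearchRequest categorySearch,
        bool includeCategories)
    {
        var operations = new Dictionary<string, ISearchRequest>
        {
            { "products", productSearch }
        };

        if (includeCategories)
        {
            operations.Add("categories", categorySearch);
        }

        return new MultiSearchRequest { Operations = operations };
    }
}
```

The two arguments would be the SearchRequest<Product> and SearchRequest<Category> instances from the object initializer example, and the result is passed to client.MultiSearch as before. A multi search with a single operation keeps the response handling identical in both cases, at the cost of a slightly larger request than a plain search.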
