Elasticsearch bad request on mapping put method - elasticsearch

I'm trying to create an index in Elasticsearch and add an analyzer and mapping that handle special characters like @, so that I can search on the Email field. Here is my code:
Analyzer.txt
{
"analysis": {
"analyzer": {
"email_search": {
"type": "custom",
"tokenizer": "uax_url_email",
"filter": [ "lowercase", "stop" ]
}
}
}
Mapping.txt
{"users":
{
"properties": {
"email": {
"type": "string",
"analyzer": "email_search"
}
}
}
CreatedIndex
string _docuementType ="users";
string _indexName="usermanager";
public bool Index(T entity)
{
SetupAnalyzers();
SetupMappings();
var indexResponse = _elasticClient.Index(entity, i => i.Index(_indexName)
.Type(_docuementType)
.Id(entity.Id));
return indexResponse.IsValid;
}
SetupAnalyzer
protected void SetupAnalyzers()
{
if (!_elasticClient.IndexExists(_indexName).Exists)
{
_elasticClient.CreateIndex(_indexName);
}
using (var client = new System.Net.WebClient())
{
string analyzers = File.ReadAllText("analyzers.txt");
client.UploadData("http://localhost:9200/usermanager/_settings", "PUT", Encoding.ASCII.GetBytes(analyzers));
}
}
SetupMappings
protected void SetupMappings()
{
using (var client = new System.Net.WebClient())
{
var mappings = File.ReadAllText("mappings.txt");
client.UploadData("http://localhost:9200/usermanager/users/_mapping", "PUT", Encoding.ASCII.GetBytes(mappings));
}
}
But I get an error in the SetupMappings method:
The remote server returned an error: (400) Bad Request.
Elastic version is 1.7.5
Nest version is 5.5

The problem is
var indexResponse = _elasticClient.Index(entity, i => i.Index(_indexName)
.Type(_docuementType)
.Id(entity.Id));
You're indexing documents before setting up the mapping and analyzers; in this case, Elasticsearch will automatically create the index, infer a mapping from the first document it sees and string properties will be mapped as analyzed string fields using the standard analyzer.
To fix this, create the index, analyzers and mappings before indexing any documents. You can also disable automatic index creation if you need to, by setting action.auto_create_index: false in the elasticsearch.yml config. Whether this is a good idea depends on your use case; a search use case with known indices and explicit mappings is one where you might consider disabling it.
So, the procedural flow would be something like
void Main()
{
var indexName = "usermanager";
var typeName = "users";
var settings = new ConnectionSettings(new Uri("http://localhost:9200"))
.MapDefaultTypeNames(d => d
// map the default type name to use for the User type
.Add(typeof(User), typeName)
)
.MapDefaultTypeIndices(d => d
// map the default index to use for the User type
.Add(typeof(User), indexName)
);
var client = new ElasticClient(settings);
if (!client.IndexExists(indexName).Exists)
{
client.CreateIndex(indexName, c => c
.Analysis(a => a
.Analyzers(aa => aa
.Add("email_search", new CustomAnalyzer
{
Tokenizer = "uax_url_email",
Filter = new [] { "lowercase", "stop" }
})
)
)
.AddMapping<User>(m => m
.Properties(p => p
.String(s => s
.Name(n => n.Email)
.Analyzer("email_search")
)
)
)
);
}
client.Index(new User { Email = "me@example.com" });
}
public class User
{
public string Email { get; set; }
}
which will send the following requests (assuming index doesn't exist)
HEAD http://localhost:9200/usermanager
POST http://localhost:9200/usermanager
{
"settings": {
"index": {
"analysis": {
"analyzer": {
"email_search": {
"tokenizer": "uax_url_email",
"filter": [
"lowercase",
"stop"
],
"type": "custom"
}
}
}
}
},
"mappings": {
"users": {
"properties": {
"email": {
"analyzer": "email_search",
"type": "string"
}
}
}
}
}
POST http://localhost:9200/usermanager/users
{
"email": "me@example.com"
}
NOTE: To change the mapping of an existing field, you will need to delete the index and recreate it. It's a good idea to use aliases to interact with indices, so that you can iterate on mappings whilst developing.
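As a rough illustration of the alias suggestion, the sketch below only builds the body of an alias request and prints it; it does not call Elasticsearch. The names usermanager_v1 (a versioned physical index) and usermanager (the alias the application would talk to) are assumptions made for this example. Since an alias cannot share its name with an existing index, the physical index would need a different name from the one used above.

```csharp
using System;
using System.Text.Json;

// Hypothetical names: a versioned physical index and the alias the application uses.
var physicalIndex = "usermanager_v1";
var alias = "usermanager";

// Body for: POST http://localhost:9200/_aliases
var body = new
{
    actions = new object[]
    {
        // "add" points the alias at the physical index
        new { add = new { index = physicalIndex, alias } }
    }
};

Console.WriteLine(JsonSerializer.Serialize(body));
```

When the mapping changes, a new index (say usermanager_v2, again a made-up name) can be created with the new mapping and the alias moved to it, so the application code keeps using the same name.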

Related

Elasticsearch search templates - How to construct the search terms in NEST

I currently have a search template to which I am trying to pass a couple of parameters.
How can I construct my search terms using NEST to get the following result?
Template
PUT _scripts/company-index-template
{
"script": {
"lang": "mustache",
"source": "{\"query\": {\"bool\": {\"filter\":{{#toJson}}clauses{{/toJson}},\"must\": [{\"query_string\": {\"fields\": [\"companyTradingName^2\",\"companyName\",\"companyContactPerson\"],\"query\": \"{{query}}\"}}]}}}",
"params": {
"query": "",
"clauses": []
}
}
}
The DSL query looks as follows
GET company-index/_search/template
{
"id": "company-index-template",
"params": {
"query": "sky*",
"clauses": [
{
"terms": {
"companyGroupId": [
1595
]
}
},
{
"terms": {
"companyId": [
158,
836,
1525,
2298,
2367,
3176,
3280
]
}
}
]
}
}
I would like to construct the above query in NEST but can't seem to find a good way to generate the clauses value.
This is what I have so far...
var responses = this.client.SearchTemplate<Company>(descriptor =>
descriptor
.Index(SearchConstants.CompanyIndex)
.Id("company-index-template")
.Params(objects => objects
.Add("query", queryBuilder.Query)
.Add("clauses", "*How do I construct this JSON*")));
UPDATE:
This is how I ended up doing it. I just created a dictionary with all my terms in it.
I think there might be a better way of doing it, but I can't find one.
var filterClauses = new List<Dictionary<string, object>>
{
new() {{"terms", new Dictionary<string, object> {{"companyGroupId", companyGroupId}}}},
new() {{"terms", new Dictionary<string, object> {{"companyId", availableCompanies}}}}
};
And then I had to serialize it when I passed it to the Params method.
var response = this.client.SearchTemplate<Company>(descriptor =>
descriptor.Index(SearchConstants.CompanyIndex)
.Id("company-index-template")
.Params(objects => objects
.Add("query", "*" + query + "*")
.Add("clauses", JsonConvert.SerializeObject(filterClauses))));
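To check what the serialized clauses value looks like, here is a minimal standalone sketch. It uses System.Text.Json rather than the Json.NET call above, purely so that it runs without extra packages, and the sample ids are taken from the DSL query in the question. It only prints the JSON and does not involve NEST or the template.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Sample values, assumed for illustration (taken from the DSL query in the question).
var companyGroupId = new[] { 1595 };
var availableCompanies = new[] { 158, 836, 1525 };

// One dictionary per "terms" clause, mirroring the structure used in the update.
var filterClauses = new List<Dictionary<string, object>>
{
    new() { ["terms"] = new Dictionary<string, object> { ["companyGroupId"] = companyGroupId } },
    new() { ["terms"] = new Dictionary<string, object> { ["companyId"] = availableCompanies } }
};

// Values declared as object are serialized using their runtime type.
Console.WriteLine(JsonSerializer.Serialize(filterClauses));
```

The printed array has the same shape as the clauses parameter in the DSL request above.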

Elastic Search NEST client raw + custom query

I'm using NEST client for querying ES, but now I have a specific situation - I'm trying to proxy query to ES, but with specific query applied by default:
public IEnumerable<TDocument> Search<TDocument>(string indexName, string query, string sort, int page, int pageSize) where TDocument : class
{
var search = new SearchRequest(indexName)
{
From = page,
Size = pageSize,
Query = new RawQuery(query),
};
var response = this.client.Search<TDocument>(search);
return response.Documents;
}
The code above just proxies the query to ES, but what if I need a specific filter that should always be applied along with the passed query?
For example, I'd want the Active field to be true by default. How can I merge this raw query with a filter that is always applied (without concatenating strings to build the merged ES API call, if possible)?
Assuming that query is well formed JSON that corresponds to the query DSL, you could deserialize it into an instance of QueryContainer and combine it with other queries. For example
var client = new ElasticClient();
string query = @"{
""multi_match"": {
""query"": ""hello world"",
""fields"": [
""description^2.2"",
""myOtherField^0.3""
]
}
}";
QueryContainer queryContainer = null;
using (var stream = client.ConnectionSettings.MemoryStreamFactory.Create(Encoding.UTF8.GetBytes(query)))
{
queryContainer = client.RequestResponseSerializer.Deserialize<QueryContainer>(stream);
}
queryContainer = queryContainer && +new TermQuery
{
Field = "another_field",
Value = "term"
};
var searchResponse = client.Search<TDocument>(s => s.Query(q => queryContainer));
which will translate to the following query (assuming default index is _all)
POST http://localhost:9200/_all/_search?pretty=true&typed_keys=true
{
"query": {
"bool": {
"filter": [{
"term": {
"another_field": {
"value": "term"
}
}
}],
"must": [{
"multi_match": {
"fields": ["description^2.2", "myOtherField^0.3"],
"query": "hello world"
}
}]
}
}
}
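For comparison, the same wrapping can be sketched at the JSON level, without NEST. This is not the approach from the answer: it uses System.Text.Json.Nodes (assuming .NET 6 or later) to parse the raw query and place it in the must clause of a bool query next to a fixed filter. The Active field name comes from the question; the shortened sample query is an assumption.

```csharp
using System;
using System.Text.Json.Nodes;

// Raw query as it might arrive from the caller (shortened sample).
string query = @"{ ""multi_match"": { ""query"": ""hello world"", ""fields"": [""description^2.2""] } }";

// Wrap the parsed query in a bool query and add the always-on filter.
var combined = new JsonObject
{
    ["bool"] = new JsonObject
    {
        ["must"] = new JsonArray(JsonNode.Parse(query)),
        ["filter"] = new JsonArray(new JsonObject
        {
            ["term"] = new JsonObject { ["Active"] = true }
        })
    }
};

Console.WriteLine(combined.ToJsonString());
```

The resulting JSON could then be passed on as a raw query. The QueryContainer approach above keeps everything in NEST's own types, which is usually preferable when the rest of the request is built with NEST.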

Translate ElasticSearch query to Nest c#

I need some help in creating an AggregationDictionary from the following elasticsearch query
GET organisations/_search
{
"size": 0,
"aggs": {
"by_country": {
"nested": {
"path": "country"
},
"aggs": {
"by_country2": {
"filter": {
"bool": {
"must": [
{
"term": {
"country.isDisplayed": "true"
}
}
]
}
},
"aggs": {
"by_country3": {
"terms": {
"field": "country.displayName.keyword",
"size": 9999
}
}
}
}
}
}
}
}
I managed to write this horrible piece of code, which I am pretty sure is wrong; I am totally new to this.
AggregationDictionary aggs = new AggregationDictionary()
{
{
"countries_step1",
new NestedAggregation("countries_step1")
{
Path = "country",
Aggregations = new AggregationDictionary()
{
{
"countries_step2",
new FilterAggregation("countries_step2")
{
Filter = new BoolQuery
{
Must = new QueryContainer[] {
new NestedQuery
{
Query = new TermQuery
{
Field = "country.isDisplayed",
Value = true
}
}
}
},
Aggregations = new AggregationDictionary
{
{
"countries_step3",
new TermsAggregation("countries_step3")
{
Field = "country.displayName.keyword",
Size = 9999
}
}
}
}
}
}
}
}
};
Can someone tell me if I am heading in the right direction? I am using NEST 6.6.0. Is there any tool that helps with these translations?
What you have so far is pretty solid, but when you try to execute this aggregation with the following call
var searchAsync = await client.SearchAsync<Document>(s => s.Size(0).Aggregations(aggs));
you will get this error
{
"error" : {
"root_cause" : [
{
"type" : "illegal_argument_exception",
"reason" : "query malformed, empty clause found at [14:22]"
}
],
"type" : "illegal_argument_exception",
"reason" : "query malformed, empty clause found at [14:22]"
},
"status" : 400
}
Checking the request that was sent to Elasticsearch shows why this happened
{
"aggs": {
"countries_step1": {
"aggs": {
"countries_step2": {
"aggs": {
"countries_step3": {
"terms": {
"field": "country.displayName.keyword",
"size": 9999
}
}
},
"filter": {}
}
},
"nested": {
"path": "country"
}
}
},
"size": 0
}
The filter clause is empty because you used a nested query without passing the path parameter; NEST treats a nested query with no path as conditionless and leaves it out of the request. A nested query isn't needed here (your example query doesn't use one), so we can simplify the whole thing to
var aggs = new AggregationDictionary()
{
{
"countries_step1",
new NestedAggregation("countries_step1")
{
Path = "country",
Aggregations = new AggregationDictionary()
{
{
"countries_step2",
new FilterAggregation("countries_step2")
{
Filter = new BoolQuery
{
Must = new QueryContainer[]
{
new TermQuery
{
Field = "country.isDisplayed",
Value = true
}
}
},
Aggregations = new AggregationDictionary
{
{
"countries_step3",
new TermsAggregation("countries_step3")
{
Field = "country.displayName.keyword",
Size = 9999
}
}
}
}
}
}
}
}
};
Now we have a valid request sent to elasticsearch.
There are a couple of things we can improve here:
1. Remove unnecessary bool query
Filter = new BoolQuery
{
Must = new QueryContainer[]
{
new TermQuery
{
Field = "country.isDisplayed",
Value = true
}
}
},
to
Filter =
new TermQuery
{
Field = "country.isDisplayed",
Value = true
},
2. Replace string field names
Usually, when making calls from .NET, there is some kind of POCO type that lets us write strongly-typed requests to Elasticsearch, which helps keep the code clean and easier to refactor. With this, we can change the field definition from
"country.displayName.keyword"
to
Infer.Field<Document>(f => f.Country.FirstOrDefault().DisplayName.Suffix("keyword"))
My type definitions:
public class Document
{
public int Id { get; set; }
[Nested]
public List<Country> Country { get; set; }
}
public class Country
{
public bool IsDisplayed { get; set; }
public string DisplayName { get; set; }
}
3. Consider using a fluent syntax
With NEST you can write queries in two ways: using object initializer syntax (which you did) or using the fluent syntax. Writing the above query with the fluent syntax gives something like
var searchResponse = await client.SearchAsync<Document>(s => s
.Size(0)
.Aggregations(a => a.Nested("by_country", n => n
.Path(p => p.Country)
.Aggregations(aa => aa
.Filter("by_country2", f => f
.Filter(q => q
.Term(t => t
.Field(field => field.Country.FirstOrDefault().IsDisplayed)
.Value(true)))
.Aggregations(aaa => aaa
.Terms("by_country3", t => t
.Field(field => field.Country.FirstOrDefault().DisplayName.Suffix("keyword"))
.Size(9999)
)))))));
which I find a little bit easier to follow and write, maybe it will be better for you as well.
As a final note, have a look at the docs to see how you can debug your queries.
Hope that helps.

Is an nGram fuzzy search possible?

I'm trying to get an nGram filter to work with a fuzzy search, but it won't. Specifically, I'm trying to get "rugh" to match on "rough".
I don't know whether it's just not possible, or it is possible but I've defined the mapping wrong, or the mapping is fine but my search isn't defined correctly.
Mapping:
{
settings = new
{
index = new
{
number_of_shards = 1,
number_of_replicas = 1,
analysis = new
{
filter = new
{
edge_ngram_filter = new
{
type = "nGram",
min_gram = 3,
max_gram = 8
}
}, // filter
analyzer = new
{
analyzer_ngram = new
{
type = "custom",
tokenizer = "standard",
filter = new string[]
{
"lowercase",
"edge_ngram_filter"
}
}
} // analyzer
} // analysis
} // index
}, // settings
mappings = new
{
j_cv = new
{
properties = new
{
Text = new
{
type = "text",
include_in_all = false,
analyzer = "analyzer_ngram",
search_analyzer = "standard"
}
}
} // j_cv
} // mappings
}
Document:
{
Id = Guid.NewGuid(),
Name = "Jimmy Riddle",
Keyword = new List<string>(new string[] { "Hunting", "High", "Hotel", "California" }),
Text = "Rough Justice was a program on BBC some years ago. It was quite interesting. Will this match?"
}
Search:
{
query = new
{
query_string = new
{
fields = new string[] { "Text" },
fuzziness = "3",
query = "rugh"
}
}
}
Incidentally, "ugh" does match which is what you'd expect.
Thanks for any help you can give,
Adam.
The same analyzer should usually be applied at index and search time, so search_analyzer=standard is wrong; it should work if you remove it.
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-analyzer.html
Edit:
You forgot the fuzzy operator "~" in your query; if you append it to "rugh", it will work!
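A minimal sketch of the adjusted search body, in the same anonymous-object style as the question. The fuzziness setting from the question is left out here to keep the example small, which is an assumption on my part and not something the answer prescribes. The snippet only serializes and prints the body; it does not send it.

```csharp
using System;
using System.Text.Json;

// Same search as in the question, with "~" appended to the term
// so that query_string treats it as a fuzzy term.
var search = new
{
    query = new
    {
        query_string = new
        {
            fields = new[] { "Text" },
            query = "rugh~"
        }
    }
};

Console.WriteLine(JsonSerializer.Serialize(search));
```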

Elasticsearch NEST - Filtering on multilevel nested types

I have a document model like this:
"customer": {
"properties": {
"id": { "type": "integer" },
"name": { "type": "string" },
"orders": {
"type": "nested",
"properties": {
"id": { "type": "integer" },
"orderDate" : { "type": "date", "format" : "YYYY-MM-dd" },
"orderLines": {
"type": "nested",
"properties": {
"seqno": { "type": "integer" },
"quantity": { "type": "integer" },
"articleId": { "type": "integer" }
}
}
}
}
}
}
A customer can have 0, 1 or multiple orders and an order can have 0, 1 or multiple orderLines
(this is a model I created for this question, as I think this is data everyone can understand, so if you spot any mistakes, please let me know, but don't let them distract you from my actual question)
I want to create a query with NEST which selects a (or all) customers with a specific value for customer.id, but only if they have at least one orderLine with a specific articleId.
I've looked at Need concrete documentation / examples of building complex index using NEST ElasticSearch library and Matching a complete complex nested collection item instead of separate members with Elastic Search, but was unable to create the query. Based upon the second question, I got to the point where I wrote
var results = client.Search<customer>(s => s
.From(0)
.Size(10)
.Types(typeof(customer))
.Query(q =>
q.Term(c => c.id, 12345)
&& q.Nested(n => n
.Path(c => c.order)
.Query(q2 => q2.Nested(n2 => n2
.Path(o => o.???))))
)
);
I expected the second Path to use order (orders is List) as generic type, but it is customer.
What is the code for the correct query?
In addition: is there more elaborate documentation of the search/query/filtering methods of NEST than the documentation on http://nest.azurewebsites.net/? In the first referenced question, both the links to the complex query tutorial (in question) and the unit test examples (accepted answer) do not work (timeout and 404 respectively).
Assuming we model the customer along these lines
class customer
{
public int id { get; set; }
public string name { get; set;}
public class Orders {
public int id { get; set;}
public string orderDate { get; set;}
public class OrderLines
{
public int seqno { get; set; }
public int quantity { get; set; }
public int articleId { get; set; }
}
[ElasticProperty(Type = FieldType.Nested)]
public List<OrderLines> orderLines { get; set; }
}
[ElasticProperty(Type = FieldType.Nested)]
public List<Orders> orders { get; set; }
};
The query in the above case would be:
var response = client.Search<customer>(
s => s.Index(<index_name_here>).Type("customer")
.Query(q => q.Term(p=>p.id, 1)
&&
q.Nested(n =>
n.Path("orders")
.Query(q2=> q2.Nested(
n2 => n2.Path("orders.orderLines")
.Query(q3 =>
q3.Term(c=>c.orders.First().orderLines.First().articleId, <article_id_here>)))))
));
As far as documentation goes, the best I have come across is the one you linked in the question and the resources linked from there.
