ElasticSearch NEST simple Terms query requires .keyword - elasticsearch

I am trying to retrieve a single document with a specific name (exactly that name) using NEST 7.5.1 (.NET Core 3.1):
var queryByTerm = client.Search<SomeDto>(s =>s.Query(q => q.Term(p => p.NameField, "example name")));
But it does not return any documents (the call succeeds).
The actual query being sent (as seen in DebugInformation with .EnableDebugMode on client's ConnectionSettings):
{"query":{"term":{"nameField":{"value":"example name"}}}}
But it only works (in Kibana) when I add .keyword to the nameField:
{"query":{"term":{"nameField.keyword":{"value":"example name"}}}}
Do I somehow have to force NEST to use nameField.keyword instead of nameField?

You can do this with the .Suffix() extension method, which is described in the NEST documentation on field inference.
var queryByTerm = client.Search<SomeDto>(s =>s.Query(q => q.Term(p => p.NameField.Suffix("keyword"), "example name")));
Hope that helps.
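As for why the suffix is needed: assuming the index was created with Elasticsearch's default dynamic mapping, a string property is mapped as an analyzed text field with a keyword sub-field. A term query is not analyzed, so the exact value "example name" only matches the un-analyzed nameField.keyword sub-field. The following is a minimal sketch for NEST 7.x showing the typed form next to a string-based alternative; the cluster URL, the index name "some-index", and the shape of SomeDto are assumptions for illustration.

```csharp
using System;
using Nest;

public class SomeDto
{
    public string NameField { get; set; }
}

public static class Program
{
    public static void Main()
    {
        // Assumed values: adjust the URL and index name for your cluster.
        var settings = new ConnectionSettings(new Uri("http://localhost:9200"))
            .DefaultIndex("some-index");
        var client = new ElasticClient(settings);

        // Strongly typed: NEST infers "nameField" and appends ".keyword".
        var bySuffix = client.Search<SomeDto>(s => s
            .Query(q => q
                .Term(p => p.NameField.Suffix("keyword"), "example name")));

        // String-based alternative that targets the same sub-field.
        var byString = client.Search<SomeDto>(s => s
            .Query(q => q
                .Term(t => t
                    .Field("nameField.keyword")
                    .Value("example name"))));

        Console.WriteLine($"{bySuffix.Documents.Count} / {byString.Documents.Count}");
    }
}
```

The typed form keeps working if the property is renamed, while the string form is easier to compare against the JSON seen in Kibana.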

Related

Why is elasticsearch's Nest lowlevel Search method ignoring type and index name defined in SearchDescriptor<>() object

NEST/Elasticsearch.Net version:5.6.5
Elasticsearch version:5.4.3
We are trying to fetch results from our index using the low-level client, via the SearchAsync API shown below:
var searchDescriptor = new SearchDescriptor<MyType>()
    .Type("mytype")
    .Index("myindex")
    .Query(....)
    .Aggregations(ag => ag.Terms(... Aggregations(ag1 => ag1.Min(...TopHits(...)))));
// _client is an instance of Nest.ElasticClient
var memoryStream = new MemoryStream();
_client.Serializer.Serialize(searchDescriptor, memoryStream);
var response = await _client.LowLevel.SearchAsync<byte[]>(memoryStream.ToArray()).ConfigureAwait(false);
// Next step: deserialize the response
This gives me results from other indices as well (a combination of results from the various indices), and my deserialization breaks. The client ignores the type and index name and calls POST /_search instead of POST /myindex/mytype/_search on Elasticsearch.
Note:
We need to call the low-level client because we are using a custom deserializer for performance reasons.
What is the issue here?
Found a workaround:
The SearchAsync<>() method has an overload, _client.LowLevel.SearchAsync<T>(string indexName, string typeName, T t).
Passing the index name and type name narrows the search to that particular index.
But the question still remains: why does it not take the index and type name from the SearchDescriptor object?
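A likely explanation is that the index and type are route values that end up in the request URL, not in the request body, so serializing the descriptor only produces the JSON body and the low-level call has nothing to build the path from. The following is a minimal sketch of the workaround for NEST 5.x, under that assumption; the MatchAll query is a placeholder, and "myindex" and "mytype" are the names used in the question.

```csharp
using System.IO;
using System.Threading.Tasks;
using Nest;

public class MyType { }

public static class LowLevelSearch
{
    // Sketch for NEST 5.x: serialize the body with NEST, send it with the low-level client.
    public static async Task<byte[]> SearchRawAsync(ElasticClient client)
    {
        var searchDescriptor = new SearchDescriptor<MyType>()
            .Query(q => q.MatchAll());   // placeholder query

        byte[] body;
        using (var memoryStream = new MemoryStream())
        {
            // Writes the request body only; index and type are not part of it.
            client.Serializer.Serialize(searchDescriptor, memoryStream);
            body = memoryStream.ToArray();
        }

        // Index and type belong to the URL, so pass them to the low-level call explicitly.
        var response = await client.LowLevel
            .SearchAsync<byte[]>("myindex", "mytype", body)
            .ConfigureAwait(false);

        // Raw response bytes, ready for a custom deserializer.
        return response.Body;
    }
}
```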

Elasticsearch's NEST API does not return query results while the same query is successful when submitting by POSTMAN

The following code snippet is a MoreLikeThis query built using NEST API:
private class Temp
{
    public string Content { get; set; }
    public string TextToSearch { get; set; }
}
var temp = new Temp
{
    TextToSearch = "empire",
};
var response = await model.ElasticClient.SearchAsync<Temp>(s => s
    .Query(q => q
        .MoreLikeThis(qd => qd
            .Like(l => l.Text(temp.TextToSearch))
            .MinTermFrequency(1)
            .MinDocumentFrequency(1)
            .Fields(fd => fd.Fields(r => r.Content)))));
After executing this code snippet response.Documents did not return any records. But when the following JSON payload is POSTed by POSTMAN, the results are received successfully:
{"query":{"more_like_this":{"fields":["content"],"like":["advanced technology"],"min_doc_freq":1,"min_term_freq":1}}}
This payload is the one generated by the C# code snippet above, captured by enabling the audit trail. The credentials are passed correctly in both cases, so why does NEST version 6.5.0 not receive any documents from the Elasticsearch instance?
Is there a bug in the library, or are we missing something?
Besides the TextToSearch being "empire" in the C# example and "advanced technology" in the JSON query DSL example, I strongly suspect that the issue here is that of the index and type being targeted in the NEST case.
When no index and type are provided in the API call, NEST infers them as follows.
For the index:
1. It looks for a default index for the Temp type configured with DefaultMappingFor<T> on ConnectionSettings.
2. If there is no default index for Temp, it uses the DefaultIndex configured on ConnectionSettings.
3. If no default index is configured on ConnectionSettings either, the API call is not made and NEST throws an exception to indicate that it does not have enough information to make the call.
For the type:
1. It looks for a default type name for the Temp type configured with DefaultMappingFor<T> on ConnectionSettings.
2. It looks for a type name convention configured using DefaultTypeNameInferrer on ConnectionSettings. If none is configured, or the delegate returns null or "" for a given type, it continues to the next step.
3. It looks for a default type name specified with DefaultTypeName on ConnectionSettings. If none is specified, a type name is inferred for a POCO type by lowercasing the type name. For Temp, this is temp.
So, assuming you have a default index configured and no convention for type names, the request URI for your NEST example will be
<configured uri>/<default index>/temp/_search
which probably does not match what you are using in Postman.
Check out the documentation to see more details about Index name inference and Type name inference.
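The inference rules above can be made explicit in configuration. The following is a minimal sketch for NEST 6.x; the URL and the names "fallback-index", "articles", "article", and "other-index" are hypothetical and should be replaced with whatever the Postman request targets. Printing the request URI is a quick way to compare the two clients.

```csharp
using System;
using Nest;

public class Temp
{
    public string Content { get; set; }
}

public static class Program
{
    public static void Main()
    {
        // Hypothetical URL, index, and type names.
        var settings = new ConnectionSettings(new Uri("http://localhost:9200"))
            .DefaultIndex("fallback-index")
            .DefaultMappingFor<Temp>(m => m
                .IndexName("articles")
                .TypeName("article"));
        var client = new ElasticClient(settings);

        // Relies on the mapping above, so the index and type are inferred for Temp.
        var inferred = client.Search<Temp>(s => s
            .Query(q => q.MatchAll()));

        // Overrides inference for this call only.
        var explicitTarget = client.Search<Temp>(s => s
            .Index("other-index")
            .AllTypes()
            .Query(q => q.MatchAll()));

        // Shows which URL each request was actually sent to.
        Console.WriteLine(inferred.ApiCall.Uri);
        Console.WriteLine(explicitTarget.ApiCall.Uri);
    }
}
```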

Reindexing using NEST V5.4 - ElasticSearch

I'm quite new to Elasticsearch. I'm trying to reindex an index in order to rename it. I'm using NEST API v5.4.
I saw this example:
var reindex =
    elasticClient.Reindex<Customer>(r =>
        r.FromIndex("customers-v1")
            .ToIndex("customers-v2")
            .Query(q => q.MatchAll())
            .Scroll("10s")
            .CreateIndex(i =>
                i.AddMapping<Customer>(m =>
                    m.Properties(p =>
                        p.String(n => n.Name(name => name.Zipcode).Index(FieldIndexOption.not_analyzed))))));
Source: http://thomasardal.com/elasticsearch-migrations-with-c-and-nest/
However, I can't reproduce this using NEST 5.4; I think the example targets version 2.4.
I checked the NEST breaking changes and tried reindexing using this:
Source: https://www.elastic.co/guide/en/elasticsearch/client/net-api/current/nest-breaking-changes.html
public method Nest.ReindexDescriptor..ctor Declaration changed (Breaking)
2.x: public .ctor(IndexName from, IndexName to)
5.x: public .ctor()
var reindex = new client.Reindex(oldIndexName, newIndexName);
But this did not work either.
I also searched the documentation, but I didn't find any C# code, just JSON.
Source: https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-reindex.html
Can someone give me an example of how to reindex using NEST 5.4 in C#?
Thanks in advance!
After searching for 2 long days, I found a way to reindex an index. In case it helps others with the same problem, here is my solution.
Nest Version - 5.4
var reindex = client.Reindex<object>(r => r
    .BackPressureFactor(10)
    // ScrollAll: scroll through all documents of the index, keeping the scroll alive for 1 minute
    .ScrollAll("1m", 2, s => s
        .Search(ss => ss
            .Index(oldIndexName)
            .AllTypes())
        // there needs to be some degree of parallelism for this to work
        .MaxDegreeOfParallelism(4))
    .CreateIndex(c => c
        // New index here
        .Index(newIndexName)
        .Settings(s => s
            // settings go here
        )
        .Mappings(m => m
            // mappings go here
        ))
    .BulkAll(b => b
        // New index here!
        .Index(newIndexName)
        .Size(100)
        .MaxDegreeOfParallelism(2)
        .RefreshOnCompleted()));
The Reindex method returns a cold IObservable, on which you have to call .Subscribe() to kick everything off.
So you need to add this to your code:
var o = new ReindexObserver(
    onError: (e) => { /* do something */ },
    onCompleted: () => { /* do something */ });
reindex.Subscribe(o);
Useful links to check this are:
Documentation
Issue 2660 on GitHub
Issue 2771 on GitHub

Nest - Reindexing

Elasticsearch released its new Reindex API in version 2.3.0. Does the current version of NEST (2.1.1) make use of this API yet? If not, are there plans to do so?
I am aware that the current version has a reindex method, but it forces you to create the new index. For my use case, the index already exists.
Any feedback or insights will be greatly appreciated. Thanks!
This kind of question is best asked on the GitHub issues for NEST, since the committers on the project are best placed to answer :)
A commit went in on 6 April to map the new Reindex API available in Elasticsearch 2.3.0, along with other features like the Task Management API and Update By Query. This made its way into NEST 2.3.0.
NEST 2.x already contains a helper for reindexing that uses scan/scroll under the covers and returns an IObservable<IReindexResponse<T>> that can be used to observe progress:
public class Document {}
var observable = client.Reindex<Document>("from-index", "to-index", r => r
// settings to use when creating to-index
.CreateIndex(c => c
.Settings(s => s
.NumberOfShards(5)
.NumberOfReplicas(2)
)
)
// query to optionally limit documents re-indexed from from-index to to-index
.Query(q => q.MatchAll())
// the number of documents to reindex in each request.
// NOTE: The number of documents in each request will actually be
// NUMBER * NUMBER OF SHARDS IN from-index
// since reindex uses scan/scroll
.Size(100)
);
ExceptionDispatchInfo e = null;
var waitHandle = new ManualResetEvent(false);
var observer = new ReindexObserver<Document>(
onNext: reindexResponse =>
{
// do something with notification. Maybe log total progress
},
onError: exception =>
{
e = ExceptionDispatchInfo.Capture(exception);
waitHandle.Set();
},
completed: () =>
{
// Maybe log completion, refresh the index, etc..
waitHandle.Set();
}
);
observable.Subscribe(observer);
// wait for the handle to be signalled
waitHandle.WaitOne();
// throw the exception if one was captured
e?.Throw();
Take a look at the ReIndex API tests for some ideas.
The new Reindex API is named client.ReindexOnServer() in the client to differentiate it from the existing observable implementation.
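For the case in the question, where the destination index already exists, the server-side API avoids the index-creation step entirely. The following is a minimal sketch, assuming NEST 2.3 or later and that the method is spelled ReindexOnServer in the client; the cluster URL is hypothetical, and "from-index" and "to-index" are the placeholder names used above.

```csharp
using System;
using Nest;

public static class Program
{
    public static void Main()
    {
        // Hypothetical cluster URL; both indices are assumed to exist already.
        var client = new ElasticClient(
            new ConnectionSettings(new Uri("http://localhost:9200")));

        // Runs the copy on the Elasticsearch side instead of scrolling through the client.
        var response = client.ReindexOnServer(r => r
            .Source(s => s.Index("from-index"))
            .Destination(d => d.Index("to-index"))
            .WaitForCompletion());   // block until the server has finished

        Console.WriteLine(response.IsValid);
    }
}
```

The observable helper remains the better fit when documents need to be inspected or changed on the client during the copy.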

How to present NEST query results?

I want to return NEST query results as console output.
My query is:
private static void PerformTermQuery(string query)
{
var result =
client.Search<Post>(s => s
.Query(p => p.Term(q => q.PostText, query)));
}
What I get back is an object containing 2 documents. How can I "unpack" it to print the documents as JSON (full or partial) to the console?
Assuming you are using version 1.3.1 of NEST, you can:
- get the raw JSON response using result.RequestInformation.ResponseRaw.Utf8String()
- parse the JSON to get _source
- include or exclude _source properties using SearchSourceDescriptor on SearchDescriptor:
var result =
client.Search<Post>(s => s
.Query(p => p.Term(q => q.PostText, query)).Source(...));
For NEST / Elasticsearch 5.x, result.RequestInformation is no longer available. Instead, you can access the raw request and response data by first disabling direct streaming on the request:
var results = elasticClient.Search<MyObject>(s => s
.Index("myindex")
.Query(q => q
...
)
.RequestConfiguration(rc => rc
.DisableDirectStreaming()
)
);
After you've disabled direct streaming, you can access results.ApiCall.ResponseBodyInBytes (if you read this property without disabling direct streaming, it will be null):
string rawResponse = Encoding.UTF8.GetString(results.ApiCall.ResponseBodyInBytes);
This probably has a performance impact, so I would avoid using it in production. You can also disable direct streaming at the connection / client level if you need it across all your queries. Take a look at the documentation for more information.
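The connection-level variant mentioned above can be sketched as follows for NEST 5.x. The URL, the index name, and the empty MyObject type are assumptions for illustration, and the same performance caveat applies because every response is buffered in memory.

```csharp
using System;
using System.Text;
using Nest;

public class MyObject { }

public static class Program
{
    public static void Main()
    {
        // Hypothetical URL and index name.
        var settings = new ConnectionSettings(new Uri("http://localhost:9200"))
            .DefaultIndex("myindex")
            .DisableDirectStreaming();   // buffer request and response bytes for every call
        var client = new ElasticClient(settings);

        var results = client.Search<MyObject>(s => s
            .Query(q => q.MatchAll()));

        // Null unless direct streaming has been disabled.
        var bytes = results.ApiCall.ResponseBodyInBytes;
        if (bytes != null)
        {
            Console.WriteLine(Encoding.UTF8.GetString(bytes));
        }
    }
}
```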
