Change Elasticsearch track_total_hits in NEST

I was running through examples of Elasticsearch and read that total hit tracking has a default limit of 10,000, which can also be changed on individual search calls, as in this example:
GET twitter/_search
{
"track_total_hits": 100,
"query": {
"match" : {
"message" : "Elasticsearch"
}
}
}
The problem is that I'm trying to do the same in NEST, but I can't manage to replicate it. The only similar option I found accepts only a Boolean value, not a number. Is it possible to change the total through NEST?
Here is the code that I tried:
var results = elasticClient.Search<MyClass>(s => s
.Query(q => q.QueryString(q2 => q2.Query(readLine)
.Fields(f => f.Field(p => p.MyField)))).TrackTotalHits(true));

As stated by @russcam, at the moment you can do it by casting ISearchRequest to IRequest<SearchRequestParameters>:
var client = new ElasticClient();
var searchResponse = client.Search<Document>(s =>
{
IRequest<SearchRequestParameters> request = s;
request.RequestParameters.SetQueryString("track_total_hits", 1000);
return s;
});
This applies it as a query string parameter on the request.
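For reference, here is a rough sketch of how the capped total might be read back from the response. It assumes NEST 7.x, where the hits metadata exposes a total with a `Value` and a `Relation`; the `Document` class, the `twitter` index name and the local URI are placeholders, not part of the original answer. With a numeric `track_total_hits`, Elasticsearch reports a relation of `gte` once the threshold is reached, which means the value is only a lower bound.

```csharp
using System;
using Nest;

// Placeholder POCO, assumed for illustration only.
public class Document
{
    public string Message { get; set; }
}

public static class Program
{
    public static void Main()
    {
        // Assumed local single-node cluster and index name.
        var settings = new ConnectionSettings(new Uri("http://localhost:9200"))
            .DefaultIndex("twitter");
        var client = new ElasticClient(settings);

        var searchResponse = client.Search<Document>(s =>
        {
            // Same workaround as above: set track_total_hits on the query string.
            IRequest<SearchRequestParameters> request = s;
            request.RequestParameters.SetQueryString("track_total_hits", 1000);
            return s.Query(q => q
                .Match(m => m.Field(f => f.Message).Query("Elasticsearch")));
        });

        var total = searchResponse.HitsMetadata?.Total;
        if (total != null)
        {
            // "gte" means counting stopped at the threshold, so Value is a lower bound.
            var prefix = total.Relation == TotalHitsRelation.GreaterThanOrEqualTo
                ? "at least "
                : "";
            Console.WriteLine(prefix + total.Value + " hits");
        }
    }
}
```

If the property names differ in your client version, the same information is available in the raw JSON response under hits.total.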

Related

Adding FunctionScore/FieldValueFactor to a MultiMatch query

We've got a pretty basic query that lets users provide query text and boosts matches on different fields. Now we want to add another boost based on votes, but we're not sure where to nest the FunctionScore.
Our original query is:
var results = await _ElasticClient.SearchAsync<dynamic>(s => s
.Query(q => q
.MultiMatch(mm => mm
.Fields(f => f
.Field("name^5")
.Field("hobbies^2")
)
.Query(queryText)
)
)
);
If I try to nest in FunctionScore around the MultiMatch, it basically ignores the query/fields, and just returns everything in the index:
var results = await _ElasticClient.SearchAsync<dynamic>(s => s
.Query(q => q
.FunctionScore(fs => fs
.Query(q2 => q2
.MultiMatch(mm => mm
.Fields(f => f
.Field("name^5")
.Field("hobbies^2")
)
.Query(queryText)
)
)
)
)
);
My expectation is that since I'm not providing a FunctionScore or any Functions, this should do exactly the same thing as the query above. Then, adding functions to the FunctionScore should boost the results based on the functions I give it (in my case, boosting based on the votes field using FieldValueFactor).
The documentation around this is a little fuzzy, particularly with certain combinations, like MultiMatch, FunctionScore, and query text. I did find this answer, but it doesn't cover when including query text.
I'm pretty sure it boils down to my still foggy understanding of how Elastic queries work, but I'm just not finding much to cover the (what I would think is a pretty common) scenario of:
A user entering a query
Boosting matches of that query with certain fields
Boosting all results based on the value of a numeric field
Your function_score query is correct, but the reason you are not seeing the results you expect is a feature in NEST called conditionless queries. A function_score query is considered conditionless when it has no functions, so NEST omits it from the serialized form sent in the request.
The easiest way to see this is with a small example:
private static void Main()
{
var defaultIndex = "my-index";
var pool = new SingleNodeConnectionPool(new Uri("http://localhost:9200"));
var settings = new ConnectionSettings(pool, new InMemoryConnection())
.DefaultIndex(defaultIndex)
.DisableDirectStreaming()
.PrettyJson()
.OnRequestCompleted(callDetails =>
{
if (callDetails.RequestBodyInBytes != null)
{
Console.WriteLine(
$"{callDetails.HttpMethod} {callDetails.Uri} \n" +
$"{Encoding.UTF8.GetString(callDetails.RequestBodyInBytes)}");
}
else
{
Console.WriteLine($"{callDetails.HttpMethod} {callDetails.Uri}");
}
Console.WriteLine();
if (callDetails.ResponseBodyInBytes != null)
{
Console.WriteLine($"Status: {callDetails.HttpStatusCode}\n" +
$"{Encoding.UTF8.GetString(callDetails.ResponseBodyInBytes)}\n" +
$"{new string('-', 30)}\n");
}
else
{
Console.WriteLine($"Status: {callDetails.HttpStatusCode}\n" +
$"{new string('-', 30)}\n");
}
});
var client = new ElasticClient(settings);
var queryText = "query text";
var results = client.Search<dynamic>(s => s
.Query(q => q
.FunctionScore(fs => fs
.Query(q2 => q2
.MultiMatch(mm => mm
.Fields(f => f
.Field("name^5")
.Field("hobbies^2")
)
.Query(queryText)
)
)
)
)
);
}
which emits the following request
POST http://localhost:9200/my-index/object/_search?pretty=true&typed_keys=true
{}
You can disable the conditionless feature by marking a query as Verbatim
var results = client.Search<dynamic>(s => s
.Query(q => q
.FunctionScore(fs => fs
.Verbatim() // <-- send the query *exactly as is*
.Query(q2 => q2
.MultiMatch(mm => mm
.Fields(f => f
.Field("name^5")
.Field("hobbies^2")
)
.Query(queryText)
)
)
)
)
);
This now sends the query
POST http://localhost:9200/my-index/object/_search?pretty=true&typed_keys=true
{
"query": {
"function_score": {
"query": {
"multi_match": {
"query": "query text",
"fields": [
"name^5",
"hobbies^2"
]
}
}
}
}
}
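To get the boost on votes that the question asks for, a function can be added to the function_score query. The following is a minimal sketch rather than a definitive implementation: the `votes` field name comes from the question, while the `Log1P` modifier, the `Missing(1)` fallback and the `Multiply` boost mode are illustrative assumptions you would tune for your data. Because the query now has a function, it is no longer conditionless, so `.Verbatim()` should not be required.

```csharp
using System;
using Nest;

public static class Program
{
    public static void Main()
    {
        // Assumed local cluster and index name, as in the example above.
        var settings = new ConnectionSettings(new Uri("http://localhost:9200"))
            .DefaultIndex("my-index");
        var client = new ElasticClient(settings);
        var queryText = "query text";

        var results = client.Search<dynamic>(s => s
            .Query(q => q
                .FunctionScore(fs => fs
                    // The user's text query with per-field boosts.
                    .Query(q2 => q2
                        .MultiMatch(mm => mm
                            .Fields(f => f
                                .Field("name^5")
                                .Field("hobbies^2")
                            )
                            .Query(queryText)
                        )
                    )
                    // Boost every match by a function of the numeric votes field.
                    .Functions(fu => fu
                        .FieldValueFactor(fvf => fvf
                            .Field("votes")
                            .Modifier(FieldValueFactorModifier.Log1P) // assumed modifier
                            .Missing(1)                                // assumed fallback value
                        )
                    )
                    // Combine the query score and function score by multiplication.
                    .BoostMode(FunctionBoostMode.Multiply)
                )
            )
        );

        Console.WriteLine(results.IsValid);
    }
}
```

Logging the request as in the first example is a quick way to confirm that both the multi_match query and the field_value_factor function appear in the body.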

Nest DeleteByQuery without the Object name

I want to send a Nest delete request to elasticsearch without specifying the object which I don't have. I've seen solutions like:
var response = elasticClient.DeleteByQuery<MyClass>(q => q
.Match(m => m.OnField(f => f.Guid).Equals(someObject.Guid))
);
From: DeleteByQuery using NEST and ElasticSearch
As I'm just reading plain text from a queue, I don't have access to the MyClass object to use with the delete request. Basically, I just want to delete all documents in an index (whose name I know) where a field matches a value, for example orgId = 1234. Something like:
var response = client.DeleteByQuery<string>( q => q
.Index(indexName)
.AllTypes()
.Routing(route)
.Query(rq => rq
.Term("orgId", "1234"))
);
I see that the NEST IElasticClient interface does have a DeleteByQuery method that doesn't require the mapping object, but I'm not sure how to use it.
You can just specify object as the document type T for DeleteByQuery<T>; just be sure to explicitly provide the index name and type name to target in this case. T is used only to provide strongly typed access within the body of the request. For example,
var client = new ElasticClient();
var deleteByQueryResponse = client.DeleteByQuery<object>(d => d
.Index("index-name")
.Type("type-name")
.Query(q => q
.Term("orgId", "1234")
)
);
This will generate the following request:
POST http://localhost:9200/index-name/type-name/_delete_by_query
{
"query": {
"term": {
"orgId": {
"value": "1234"
}
}
}
}
Replace _delete_by_query with _search in the URI first, to ensure you're targeting the expected documents :)
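The same check can be done from NEST by sending the identical query to search first. This is a sketch under the same assumptions as the example above (a NEST version that still has `.Type()`, and the placeholder names `index-name` and `type-name`); it only reads the total and does not delete anything.

```csharp
using System;
using Nest;

public static class Program
{
    public static void Main()
    {
        var client = new ElasticClient();

        // Same index, type and query as the delete request, but sent to _search.
        var preview = client.Search<object>(s => s
            .Index("index-name")
            .Type("type-name")
            .Size(0) // no documents needed, only the total
            .Query(q => q
                .Term("orgId", "1234")
            )
        );

        Console.WriteLine(preview.Total + " documents match and would be deleted");
    }
}
```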

NEST Search not found any result

I just found out about NEST. I have already inserted a number of documents into Elasticsearch, and now I want to search the data in my type by subscribeId. The search works fine through curl, but when I try it with NEST, no results are found.
My curl request, which works:
http://localhost:9200/20160902/_search?q=subscribeId:aca0ca1a-c96a-4534-ab0e-f844b81499b7
My NEST code:
var local = new Uri("http://localhost:9200");
var settings = new ConnectionSettings(local);
var elastic = new ElasticClient(settings);
var response = elastic.Search<IntegrationLog>(s => s
.Index(DateTime.Now.ToString("yyyyMMdd"))
.Type("integrationlog")
.Query(q => q
.Term(p => p.SubscribeId, new Guid("aca0ca1a-c96a-4534-ab0e-f844b81499b7"))
)
);
Can someone point out what I did wrong?
A key difference between your curl request and your NEST query is that the former uses a query_string query and the latter a term query. The input to a query_string query undergoes analysis at query time whilst the input to a term query does not, so depending on how subscribeId is analyzed (or not), you may see different results. Additionally, your curl request searches across all document types within the index 20160902.
The NEST equivalent of your curl request would be
void Main()
{
var pool = new SingleNodeConnectionPool(new Uri("http://localhost:9200"));
var connectionSettings = new ConnectionSettings(pool)
// set up NEST with the convention to use the type name
// "integrationlog" for the IntegrationLog
// POCO type
.InferMappingFor<IntegrationLog>(m => m
.TypeName("integrationlog")
);
var client = new ElasticClient(connectionSettings);
var searchResponse = client.Search<IntegrationLog>(s => s
.Index("20160902")
// search across all types. Note that documents found
// will be deserialized into instances of the
// IntegrationLog type
.AllTypes()
.Query(q => q
// use query_string query
.QueryString(qs => qs
.Fields(f => f
.Field(ff => ff.SubscribeId)
)
.Query("aca0ca1a-c96a-4534-ab0e-f844b81499b7")
)
)
);
}
public class IntegrationLog
{
public Guid SubscribeId { get; set; }
}
This yields
POST http://localhost:9200/20160902/_search
{
"query": {
"query_string": {
"query": "aca0ca1a-c96a-4534-ab0e-f844b81499b7",
"fields": [
"subscribeId"
]
}
}
}
This specifies the query_string query in the body of the request, which is analogous to using the q query string parameter to specify the query.

Elasticsearch NEST 2 How to correctly map and use nested classes and bulk index

I have three main questions I need help answering.
How do you correctly map and store a nested map?
How do you search a nested part of a document?
How do you bulk index?
I'm using NEST version 2 and have been looking over the new documentation. It has been useful for creating certain parts of the code but unfortunately doesn't explain how they fit together.
Here is the class I'm trying to map.
[ElasticsearchType(Name = "elasticsearchproduct", IdProperty = "ID")]
public class esProduct
{
public int ID { get; set; }
[Nested]
public List<PriceList> PriceList { get; set; }
}
[ElasticsearchType(Name = "PriceList")]
public class PriceList
{
public int ID { get; set; }
public decimal Price { get; set; }
}
and my mapping code
var node = new Uri(HOST);
var settings = new ConnectionSettings(node).DefaultIndex("my-application");
var client = new ElasticClient(settings);
var map = new CreateIndexDescriptor("my-application")
.Mappings(ms => ms
.Map<esProduct>(m => m
.AutoMap()
.Properties(ps => ps
.Nested<PriceList>(n => n
.Name(c => c.PriceList)
.AutoMap()
)
)
)
);
var response = client.Index(map);
This is the response I get:
Valid NEST response built from a succesful low level call on POST: /my-application/createindexdescriptor
So that seems to work. Next, indexing:
foreach (DataRow dr in ProductTest.Tables[0].Rows)
{
    int id = Convert.ToInt32(dr["ID"].ToString());
    // Collect the prices that belong to this product
    List<PriceList> PriceList = new List<PriceList>();
    DataRow[] resultPrice = ProductPriceTest.Tables[0].Select("ID = " + id);
    foreach (DataRow drPrice in resultPrice)
    {
        PriceList.Add(new PriceList
        {
            ID = Convert.ToInt32(drPrice["ID"].ToString()),
            Price = Convert.ToDecimal(drPrice["Price"].ToString())
        });
    }
    esProduct product = new esProduct
    {
        ID = id,
        PriceList = PriceList
    };
    var updateResponse = client.Update<esProduct>(DocumentPath<esProduct>.Id(id), descriptor => descriptor
        .Doc(product)
        .RetryOnConflict(3)
        .Refresh()
    );
    var index = client.Index(product);
}
Again this seems to work, but when I come to search it doesn't seem to work as expected.
var searchResults = client.Search<esProduct>(s => s
.From(0)
.Size(10)
.Query(q => q
.Nested(n => n
.Path(p => p.PriceList)
.Query(qq => qq
.Term(t => t.PriceList.First().Price, 100)
)
)
));
It does return results but I was expecting
.Term(t => t.PriceList.First().Price, 100)
to look more like
.Term(t => t.Price, 100)
and to know that it was searching the nested PriceList class. Is this not the case?
In the new version 2 documentation I can't find the bulk index section. I tried using this code
var descriptor = new BulkDescriptor();
// inside the foreach loop
descriptor.Index<esProduct>(op => op
.Document(product)
.Id(id)
);
// outside the foreach loop
var result = client.Bulk(descriptor);
which does return a success response but when I search I get no results.
Any help would be appreciated.
UPDATE
After a bit more investigation on @Russ's advice, I think the error must be with my bulk indexing of a class with a nested object.
When I use
var index = client.Index(product);
to index each product I can use
var searchResults = client.Search<esProduct>(s => s
.From(0)
.Size(10)
.Query(q => q
.Nested(n => n
.Path(p => p.PriceList)
.Query(qq => qq
.Term(t => t.PriceList.First().Price, 100)
)
)
)
);
to search and return results, but when I bulk index this no longer works, whereas
var searchResults = client.Search<esProduct>(s => s
.From(0)
.Size(10)
.Query(q => q
.Term(t => t.PriceList.First().Price, 100)
)
);
will work. This second query doesn't work when I index each product individually. Does anyone know why this happens?
UPDATE 2
As @Russ suggested, I have taken a look at the mapping.
The code I'm using to create the mapping is
var map = new CreateIndexDescriptor(defaultIndex)
.Mappings(ms => ms
.Map<esProduct>(m => m
.AutoMap()
.Properties(ps => ps
.Nested<PriceList>(n => n
.Name(c => c.PriceList)
.AutoMap()
)
)
)
);
var response = client.Index(map);
Which is posting
http://HOST/fresh-application2/createindexdescriptor {"mappings":{"elasticsearchproduct":{"properties":{"ID":{"type":"integer"},"priceList":{"type":"nested","properties":{"ID":{"type":"integer"},"Price":{"type":"double"}}}}}}}
and on the call to http://HOST/fresh-application2/_all/_mapping?pretty I'm getting
{
"fresh-application2" : {
"mappings" : {
"createindexdescriptor" : {
"properties" : {
"mappings" : {
"properties" : {
"elasticsearchproduct" : {
"properties" : {
"properties" : {
"properties" : {
"priceList" : {
"properties" : {
"properties" : {
"properties" : {
"ID" : {
"properties" : {
"type" : {
"type" : "string"
}
}
},
"Price" : {
"properties" : {
"type" : {
"type" : "string"
}
}
}
}
},
"type" : {
"type" : "string"
}
}
},
"ID" : {
"properties" : {
"type" : {
"type" : "string"
}
}
}
}
}
}
}
}
}
}
}
}
}
}
The mapping returned for fresh-application2 doesn't mention the nested type at all, which I'm guessing is the issue.
The mapping for the index where my nested query works looks more like this
{
"my-application2" : {
"mappings" : {
"elasticsearchproduct" : {
"properties" : {
"priceList" : {
"type" : "nested",
"properties" : {
"ID" : {
"type" : "integer"
},
"Price" : {
"type" : "double"
}
}
},
"ID" : {
"type" : "integer"
}
}
}
}
}
}
This has the nested type returned. I think the one which isn't returning nested as a type is when I started using .AutoMap() , am I using it correctly?
UPDATE
I have fixed my mapping problem. I have changed my mapping code to
var responseMap = client.Map<esProduct>(ms => ms
.AutoMap()
.Properties(ps => ps
.Nested<PriceList>(n => n
.Name(c => c.PriceList)
.AutoMap()
)
)
);
Whilst you're developing, I would recommend logging out requests and responses to Elasticsearch so you can see what is being sent when using NEST; this'll make it easier to relate to the main Elasticsearch documentation and also ensure that the body of the requests and responses match your expectations (for example, useful for mappings, queries, etc).
The mappings that you have look fine, although you can forgo the attributes since you are using fluent mapping; there's no harm in having them there but they are largely superfluous (the type name for the esProduct is the only part that will apply) in this case because .Properties() will override inferred or attribute based mapping that is applied from calling .AutoMap().
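One thing worth noting about the earlier attempt in the question: passing a CreateIndexDescriptor to client.Index() indexes the descriptor itself as a document, which is why the returned mapping shows a createindexdescriptor type with string fields instead of a nested mapping. As an alternative to client.Map on an existing index, the mapping can be supplied when the index is created. The following is a sketch assuming NEST 2.x (later versions expose index creation differently), the classes from the question, and a local cluster URI that is only a placeholder.

```csharp
using System;
using System.Collections.Generic;
using Nest;

[ElasticsearchType(Name = "elasticsearchproduct", IdProperty = "ID")]
public class esProduct
{
    public int ID { get; set; }
    public List<PriceList> PriceList { get; set; }
}

public class PriceList
{
    public int ID { get; set; }
    public decimal Price { get; set; }
}

public static class Program
{
    public static void Main()
    {
        // Placeholder URI; the question uses a HOST constant.
        var settings = new ConnectionSettings(new Uri("http://localhost:9200"))
            .DefaultIndex("my-application");
        var client = new ElasticClient(settings);

        // Create the index and register the mapping in one request,
        // instead of indexing the descriptor as a document.
        var response = client.CreateIndex("my-application", c => c
            .Mappings(ms => ms
                .Map<esProduct>(m => m
                    .AutoMap()
                    .Properties(ps => ps
                        .Nested<PriceList>(n => n
                            .Name(p => p.PriceList)
                            .AutoMap()
                        )
                    )
                )
            )
        );

        Console.WriteLine(response.IsValid);
    }
}
```

Checking the index mapping afterwards should show priceList with "type": "nested", as in the working mapping in the question.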
In your indexing part, you update the esProduct and then immediately index the same document again. I'm not sure what the intention is here, but the update call looks superfluous to me: the index call will overwrite the document with the given id straight after the update (and the result will be visible in search results after the refresh interval). The .RetryOnConflict(3) on the update uses optimistic concurrency control to perform the update (which is effectively a get then index operation on the document inside the cluster, retried up to 3 times if the version of the document changes between the get and the index). If you're replacing the whole document with an update, i.e. not a partial update, then retry on conflict is not really necessary (and, as noted, the update call in your example looks unnecessary altogether since the index call overwrites the document anyway).
The nested query looks correct; you specify the path to the nested type, and the query on a field of the nested type also includes the path. I'll update the NEST nested query usage documentation to better demonstrate this.
The bulk call looks fine; if you need to index a lot of documents, you may want to send them in batches, e.g. bulk index 500 documents at a time. How many to send in one bulk call depends on a number of factors, including the document size, how the documents are analyzed and the performance of the cluster, so you will need to experiment to find a good bulk size for your circumstances.
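As an illustration of the batching idea, here is a small self-contained sketch that splits a sequence into batches of at most 500 items. The `Batching.InBatches` helper is hypothetical (it is not part of NEST), and the integers stand in for esProduct documents; the comment marks where a bulk request per batch would be sent.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Batching
{
    // Splits a sequence into consecutive batches of at most batchSize items.
    public static IEnumerable<List<T>> InBatches<T>(IEnumerable<T> source, int batchSize)
    {
        if (batchSize <= 0) throw new ArgumentOutOfRangeException(nameof(batchSize));

        var batch = new List<T>(batchSize);
        foreach (var item in source)
        {
            batch.Add(item);
            if (batch.Count == batchSize)
            {
                yield return batch;
                batch = new List<T>(batchSize);
            }
        }

        // Emit the final, possibly smaller, batch.
        if (batch.Count > 0) yield return batch;
    }
}

public static class Program
{
    public static void Main()
    {
        // Stand-in for 1200 esProduct documents.
        var products = Enumerable.Range(1, 1200).ToList();

        foreach (var batch in Batching.InBatches(products, 500))
        {
            // With NEST, one bulk request per batch would go here, for example
            // client.Bulk(b => b.IndexMany(batch)) for a batch of esProduct.
            Console.WriteLine("Sending batch of " + batch.Count + " documents");
        }
    }
}
```

With 1200 items and a batch size of 500, this yields batches of 500, 500 and 200. The batch size is the value to experiment with.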
I'd check to make sure that you are hitting the right index, that the index contains the number of documents that you expect, and then find a document that you know has a PriceList.Price of 100 and see what is indexed for it. It might be quicker to do this using Sense while you're getting up and running.

example of how to use synonyms in nest

I haven't found a solid example of how to create and use synonyms using NEST for Elasticsearch. If anyone has one, it would be helpful.
My attempt looks like this, but I don't know how to apply it to a field.
var syn = new SynonymTokenFilter
{
Synonyms = new [] { "pink, p!nk => pink", "lil, little", "ke$ha, kesha => ke$ha" },
IgnoreCase = true,
Tokenizer = "standard"
};
client.CreateIndex("myindex", i =>
{
i
.Analysis(a => a.Analyzers(an => an
.Add("fullTermCaseInsensitive", fullTermCaseInsensitive)
)
.TokenFilters(x => x
.Add("synonym", syn)
)
)
...
It's very simple :)
You first need to define the synonym filter; then you can use it in your custom analyzer, where you can also add other types of filters.
Small example:
.Analysis(descriptor => descriptor
.Analyzers(bases => bases
.Add("folded_word", new CustomAnalyzer()
{
Filter = new List<string> { "icu_folding", "trim", "synonym" },
Tokenizer = "standard"
}
)
)
.TokenFilters(i => i
.Add("synonym", new SynonymTokenFilter()
{
SynonymsPath="analysis/synonym.txt",
Format = "Solr"
}
)
)
Then you can use the custom analyzer in the mapping part
Assuming your fullTermCaseInsensitive analyzer is custom, you need to add your synonym filter to it:
var fullTermCaseInsensitive = new CustomAnalyzer()
{
.
.
.
Filter = new string[] { "syn" }
};
And upon creating your index, you can add a mapping and apply the fullTermCaseInsensitive analyzer to your field(s):
client.CreateIndex("myindex", c => c
.Analysis(a => a
.Analyzers(an => an.Add("fullTermCaseInsensitive", fullTermCaseInsensitive))
.TokenFilters(tf => tf.Add("syn", syn)))
.AddMapping<MyType>(m => m
.Properties(p => p
.String(s => s.Name(t => t.MyField).Analyzer("fullTermCaseInsensitive")))));
