NEST (ElasticSearch) search response drops aggregates

Here is a query that works in ElasticSearch.
"query":{
"match_all":{
}
},
"size":20,
"aggs":{
"CompanyName.raw":{
"terms":{
"field":"CompanyName.raw",
"size":20,
"order":{
"_count":"desc"
}
}
}
}
}
The response from ElasticSearch has a property aggregations['CompanyName.raw']['buckets'] which is an array.
I use this code to execute the same query via NEST:
string responseJson = null;
ISearchResponse<ProductPurchasing> r = Client.Search<ProductPurchasing>(rq);
using (MemoryStream ms = new MemoryStream())
{
    Client.RequestResponseSerializer.Serialize<ISearchResponse<ProductPurchasing>>(r, ms);
    ms.Position = 0;
    using (StreamReader sr = new StreamReader(ms))
    {
        responseJson = sr.ReadToEnd();
    }
}
However, in the resulting responseJson this array is always empty.
Where has it gone?
How can I get it back?
Or is it that NEST doesn't support aggregates?

NEST does support aggregations; have a look at the docs on how to handle an aggregation response with NEST.
Here is a short example of writing data and reading back the results of a simple terms aggregation:
using System;
using System.Threading.Tasks;
using Elasticsearch.Net;
using Nest;

class Program
{
    public class Document
    {
        public int Id { get; set; }
        public string Name { get; set; }
        public string Brand { get; set; }
        public string Category { get; set; }

        public override string ToString() => $"Id: {Id} Name: {Name} Brand: {Brand} Category: {Category}";
    }

    static async Task Main(string[] args)
    {
        var pool = new SingleNodeConnectionPool(new Uri("http://localhost:9200"));
        var connectionSettings = new ConnectionSettings(pool);
        connectionSettings.DefaultIndex("documents");
        var client = new ElasticClient(connectionSettings);

        // Recreate the index with mappings inferred from the Document POCO.
        var deleteIndexResponse = await client.Indices.DeleteAsync("documents");
        var createIndexResponse = await client.Indices.CreateAsync("documents", d => d
            .Map(m => m.AutoMap<Document>()));

        var indexDocument = await client
            .IndexDocumentAsync(new Document { Id = 1, Brand = "Tommy", Category = "men" });
        var indexDocument2 = await client
            .IndexDocumentAsync(new Document { Id = 2, Brand = "Diesel", Category = "men" });
        var indexDocument3 = await client
            .IndexDocumentAsync(new Document { Id = 3, Brand = "Boss", Category = "men" });

        // Await the refresh so the freshly indexed documents are visible to the search below.
        await client.Indices.RefreshAsync();

        var searchResponse = await client.SearchAsync<Document>(s => s
            .Query(q => q.MatchAll())
            .Aggregations(a => a
                .Terms("brand", t => t
                    .Field(f => f.Brand.Suffix("keyword")))));

        // Typed access to the terms aggregation and its buckets.
        var brands = searchResponse.Aggregations.Terms("brand");
        foreach (var bucket in brands.Buckets)
        {
            Console.WriteLine(bucket.Key);
        }
    }
}
Prints:
Boss
Diesel
Tommy
Hope that helps.
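If the goal of the original snippet was to capture the raw JSON that Elasticsearch returned (aggregations included), rather than re-serializing the typed ISearchResponse, one option is to buffer the response bytes. A minimal sketch, assuming NEST 7.x; the index name is assumed, the field and sizes come from the question:

// Sketch only: DisableDirectStreaming buffers request and response bytes on every
// call so they can be read back afterwards (at some memory cost).
var settings = new ConnectionSettings(new Uri("http://localhost:9200"))
    .DefaultIndex("products")        // assumed index name
    .DisableDirectStreaming();
var client = new ElasticClient(settings);

var response = client.Search<ProductPurchasing>(s => s
    .Size(20)
    .Query(q => q.MatchAll())
    .Aggregations(a => a
        .Terms("CompanyName.raw", t => t
            .Field("CompanyName.raw")
            .Size(20))));

// The body exactly as Elasticsearch sent it, including the aggregations section.
string responseJson = System.Text.Encoding.UTF8.GetString(response.ApiCall.ResponseBodyInBytes);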

Related

How to stream many documents from Elasticsearch index (Scroll, Sliced Scroll)

I am looking for a way to stream all (~10^6+) documents from an index via the .NET NEST client.
I want to boost performance by using parallel async requests (e.g. ActionBlock, Task.WhenAll()).
The old-fashioned way, without any boosting:
var objects = new List<object>();
var searchResponse = await elasticClient.SearchAsync<object>(
    new SearchRequest<object>("myIndex")
    {
        Size = 7000,
        Query = new BoolQuery
        {
            //...
        },
        // why here and in scroll itself?
        Scroll = "2s",
        Sort = new List<ISort>
        {
            //..
        }
    });

while (searchResponse.Documents.Any())
{
    objects.AddRange(searchResponse.Documents);
    searchResponse = await elasticClient.ScrollAsync<object>("2s", searchResponse.ScrollId).ConfigureAwait(false);
}

return objects;
Then an attempt using a parallel sliced scroll:
var result = new ConcurrentBag<object>();
var tasks = Enumerable.Range(0, 4).Select(
    id => new SearchRequest<object>("myIndex")
    {
        // has to be lower than 1024?
        Size = 1000,
        Query = new BoolQuery
        {
            //...
        },
        // why here and in scroll itself?
        Scroll = "2s",
        Sort = new List<ISort>
        {
            //..
        }
    })
    .Select(async searchRequest =>
    {
        var searchResponse = await elasticClient.SearchAsync<object>(searchRequest).ConfigureAwait(false);
        while (searchResponse.Documents.Any())
        {
            searchResponse.Documents.Each(result.Add);
            searchResponse = await elasticClient.ScrollAsync<object>("2s", searchResponse.ScrollId).ConfigureAwait(false);
        }
        // good idea right?
        //await elasticClient.ClearScrollAsync(x => x.ScrollId(searchResponse.ScrollId)).ConfigureAwait(false);
    });

await Task.WhenAll(tasks).PreserveAllExceptions().ConfigureAwait(false);
return result.ToList();
But this only gives me a fraction of the documents that are actually available.
Moreover, sliced scroll is limited to 1024 documents per slice.
I was not able to increase this value to 7000:
{
  "myIndex_template": {
    "settings": {
      "index": {
        "number_of_shards": "1",
        "number_of_replicas": "0",
        "max_slices_per_scroll": "10000"
      }
    }
  }
}
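For what it's worth, a sliced scroll only partitions the result set if each parallel request actually declares which slice it is. A minimal sketch of one way to do that, reusing elasticClient and result from the snippet above; the Slice / SlicedScroll names are assumptions based on NEST 7.x, not taken from the question:

// Sketch only: every parallel request carries its own slice id out of `slices`,
// so together the tasks cover the result set exactly once.
// Slice / SlicedScroll (with Id and Max) are assumed NEST 7.x names.
const int slices = 4;
var tasks = Enumerable.Range(0, slices).Select(async id =>
{
    var searchResponse = await elasticClient.SearchAsync<object>(
        new SearchRequest<object>("myIndex")
        {
            Size = 1000,
            Scroll = "2s",
            Slice = new SlicedScroll { Id = id, Max = slices }, // assumed API
            Query = new BoolQuery
            {
                //...
            }
        }).ConfigureAwait(false);

    while (searchResponse.Documents.Any())
    {
        foreach (var doc in searchResponse.Documents)
            result.Add(doc);
        searchResponse = await elasticClient.ScrollAsync<object>("2s", searchResponse.ScrollId).ConfigureAwait(false);
    }
});

await Task.WhenAll(tasks).ConfigureAwait(false);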

Elasticsearch - NEST - Upsert

I have the following two classes
Entry
public class Entry
{
    public Guid Id { get; set; }
    public IEnumerable<EntryData> Data { get; set; }
}
EntryData
public class EntryData
{
    public string Type { get; set; }
    public object Data { get; set; }
}
I have a bunch of different applications that produce messages to a queue, which I then consume in a separate application to store the data in Elasticsearch.
I'm using a CorrelationId for all the messages and I want to use this ID as the ID in Elasticsearch.
So given the following data:
var id = Guid.Parse("1befd5b62b944b4aa600c85632159e11");
var entries = new List<Entry>
{
    new Entry
    {
        Id = id,
        Data = new List<EntryData>
        {
            new EntryData
            {
                Data = DateTime.UtcNow,
                Type = "APPLICATION1_Received"
            },
            new EntryData
            {
                Data = DateTime.UtcNow,
                Type = "APPLICATION1_Validated"
            },
            new EntryData
            {
                Data = DateTime.UtcNow,
                Type = "APPLICATION1_Published"
            },
        }
    },
    new Entry
    {
        Id = id,
        Data = new List<EntryData>
        {
            new EntryData
            {
                Data = DateTime.UtcNow,
                Type = "APPLICATION2_Received"
            },
            new EntryData
            {
                Data = DateTime.UtcNow,
                Type = "APPLICATION2_Validated"
            },
            new EntryData
            {
                Data = DateTime.UtcNow,
                Type = "APPLICATION2_Published"
            },
        }
    },
    new Entry
    {
        Id = id,
        Data = new List<EntryData>
        {
            new EntryData
            {
                Data = DateTime.UtcNow,
                Type = "APPLICATION3_Received"
            },
            new EntryData
            {
                Data = DateTime.UtcNow,
                Type = "APPLICATION3_Validated"
            },
            new EntryData
            {
                Data = DateTime.UtcNow,
                Type = "APPLICATION3_Published"
            },
        }
    },
};
I want this to be saved as one entry in Elasticsearch where ID == 1befd5b6-2b94-4b4a-a600-c85632159e11, with a data array that contains 9 elements.
I'm struggling a bit with getting this to work when trying the following:
var result = await _elasticClient.BulkAsync(x => x
    .Index("journal")
    .UpdateMany(entries, (descriptor, entry) =>
    {
        descriptor.Doc(entry);
        return descriptor.Upsert(entry);
    }), cancellationToken);
But this just overwrites whatever already exists in the data array, and the count is 3 instead of 9 (only the Application3 entries are saved).
So, is it possible to do what I want to do? I have never worked with Elasticsearch before, so it feels like maybe I'm missing something simple here... :)
Managed to solve it like this:
var result = await _elasticClient.BulkAsync(x => x
    .Index("journal")
    .UpdateMany(entries, (descriptor, entry) =>
    {
        var script = new InlineScript("ctx._source.data.addAll(params.data)")
        {
            Params = new Dictionary<string, object> { { "data", entry.Data } }
        };
        descriptor.Script(b => script);
        return descriptor.Upsert(entry);
    }), cancellationToken);
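For context on why this works: a bulk update item that carries both a script and an upsert document behaves differently depending on whether the id already exists. For the first entry with a given id the upsert document is indexed as-is (three data elements); for the following entries the document already exists, so the Painless script runs instead and appends params.data to the stored data array, which is how the nine elements accumulate.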

Low level Elasticsearch search not working properly

I am trying to do an Elasticsearch search with a sort option. My query is like this:
var client = new ElasticClient(settings);
var query = new
{
    query = new
    {
        term = new { title = "7-0 v Spurs" }
    },
    Sort = new List<ISort>
    {
        new SortField { Field = "releaseFrom", Order = SortOrder.Descending }
    }
};
and my search is like this:
var stream = new MemoryStream();
client.Serializer.Serialize(query, stream);
var jsonQuery = System.Text.Encoding.UTF8.GetString(stream.ToArray());
var qRequest = new SearchRequest(jsonQuery);
var searchResponse = client.LowLevel.Search<SearchResponse<dynamic>>(IndexingService.IndexName, "article_en", qRequest);
I am getting a result, but it returns records that do not match the title, and it also does not sort.
This is the query which is generated:
{ "query": { "term": { "title": "7-0 v Spurs" } }, "sort": [ { "releaseFrom": { "order": "desc" } } ] }
Can anybody suggest what I might be missing here?
Found the solution.
Used ElasticLowLevelClient instead of ElasticClient.
Code is like this:
var lowlevelClient = new ElasticLowLevelClient(settings);
var stream = new MemoryStream();
lowlevelClient.Serializer.Serialize(query, stream);
var jsonQuery = System.Text.Encoding.UTF8.GetString(stream.ToArray());
var searchResponse = lowlevelClient.Search<SearchResponse<dynamic>>(IndexingService.IndexName, "article_en", jsonQuery);
There is one change in the query as well:
match = new { title = "7-0 v Spurs" }
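The switch from term to match is likely what makes the title actually hit: a term query looks up the query string as one exact token, while a default analyzed text field such as title was split into separate tokens at index time, so only an analyzed query like match will find "7-0 v Spurs". A sketch of the full corrected query object, keeping the sort from the question but written as plain anonymous objects with lowercase names so the serialized JSON matches what Elasticsearch expects:

// Sketch only: the same query as in the question, with match instead of term.
// Field names are taken from the question; the sort serializes to
// {"releaseFrom": {"order": "desc"}}.
var query = new
{
    query = new
    {
        match = new { title = "7-0 v Spurs" }
    },
    sort = new[]
    {
        new { releaseFrom = new { order = "desc" } }
    }
};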

How to get attributes from OData response via AJAX?

I'm working on an MVC application which uses OData & Web API via AJAX. I'm trying to do paging from the server side by using OData filter attributes. Here is the code of my controller:
[RoutePrefix("OData/Products")]
public class ProductsController : ODataController
{
    private List<Product> products = new List<Product>
    {
        new Product() { Id = 1, Name = "Thermo King MP-3000", Price = 300, Category = "Thermo King" },
        new Product() { Id = 2, Name = "Thermo King MP-4000", Price = 500, Category = "Thermo King" },
        new Product() { Id = 3, Name = "Daikin Decos III c", Price = 200, Category = "Daikin" },
        new Product() { Id = 4, Name = "Daikin Decos III d", Price = 400, Category = "Daikin" },
        new Product() { Id = 5, Name = "Starcool RCC5", Price = 600, Category = "Starcool" },
        new Product() { Id = 6, Name = "Starcool SCC5", Price = 700, Category = "Starcool" }
    };

    [EnableQuery(PageSize = 2)]
    public IQueryable<Product> Get()
    {
        return products.AsQueryable<Product>();
    }

    //[EnableQuery]
    //public SingleResult<Product> Get([FromODataUri] int id)
    //{
    //    var result = products.Where(x => x.Id.Equals(id)).AsQueryable();
    //    return SingleResult.Create<Product>(result);
    //}

    [EnableQuery]
    public Product Get([FromODataUri] int id)
    {
        return products.First(x => x.Id.Equals(id));
    }
}
And here is my JavaScript code:
<script type="text/javascript">
    $(document).ready(function () {
        var apiUrl = "http://localhost:56963/OData/Products";
        $.getJSON(apiUrl,
            function (data) {
                $("#div_content").html(window.JSON.stringify(data));
            }
        );
        //$.get(apiUrl,
        //function (data) {
        //    alert(data[0]);
        //    $("#div_content").html(data);
        //});
    });
</script>
The response from OData is a JSON result like:
{"#odata.context":"http://localhost:56963/OData/$metadata#Products","value":[{"Id":1,"Name":"Thermo King MP-3000","Price":300,"Category":"Thermo King"},{"Id":2,"Name":"Thermo King MP-4000","Price":500,"Category":"Thermo King"}],"#odata.nextLink":"http://localhost:56963/OData/Products?$skip=2"}
I was trying to get "#odata.nextLink" but failed; there is no way to get "odata.nextLink" via "data.#odata.nextLink" from JavaScript.
Can anyone help me get through this?
After parsing the string to JSON, data['#odata.nextLink'] works:
var data = '{"#odata.context":"http://localhost:56963/OData/$metadata#Products","value":[{"Id":1,"Name":"Thermo King MP-3000","Price":300,"Category":"Thermo King"},{"Id":2,"Name":"Thermo King MP-4000","Price":500,"Category":"Thermo King"}],"#odata.nextLink":"http://localhost:56963/OData/Products?$skip=2"}';
data = JSON.parse(data);
alert(data['#odata.nextLink']);

Not able to get TermVector results properly in SolrNet

I'm not able to get TermVector results properly through SolrNet. I tried the following code:
QueryOptions options = new QueryOptions()
{
    OrderBy = new[] { new SortOrder("markupId", Order.ASC) },
    TermVector = new TermVectorParameters
    {
        Fields = new[] { "text" },
        Options = TermVectorParameterOptions.All
    }
};
var results = SolrMarkupCore.Query(query, options);

foreach (var docVectorResult in results.TermVectorResults)
{
    foreach (var vectorResult in docVectorResult.TermVector)
        System.Diagnostics.Debug.Print(vectorResult.ToString());
}
In the above code, results.TermVectorResults in the outer foreach gives the proper count whereas docVectorResult.TermVector in the inner foreach is empty.
I've copied the generated Solr query from the above code, issued it against the Solr admin, and I properly get the termVectors values. The actual query I issued is below:
http://localhost:8983/solr/select/?sort=markupId+asc&tv.tf=true&start=0&q=markupId:%2823%29&tv.offsets=true&tv=true&tv.positions=true&tv.fl=text&version=2.2&rows=50
First you should check the HTTP query to make sure the term vector feature is set up properly.
If it's not OK, change your indexing based on:
The Term Vector Component
If it is OK, you can use "ExtraParams" and change the handler to the term vector handler. Try this:
public SolrQueryExecuter<Product> instance { get; private set; }

public ICollection<TermVectorDocumentResult> resultDoc(string q)
{
    string SERVER = "http://localhost:7080/solr/core"; // change this
    var container = ServiceLocator.Current as SolrNet.Utils.Container;
    instance = new SolrQueryExecuter<Product>(
        container.GetInstance<ISolrAbstractResponseParser<Product>>(),
        new SolrConnection(SERVER),
        container.GetInstance<ISolrQuerySerializer>(),
        container.GetInstance<ISolrFacetQuerySerializer>(),
        container.GetInstance<ISolrMoreLikeThisHandlerQueryResultsParser<Product>>());
    instance.DefaultHandler = "/tvrh";
    SolrQueryResults<Product> results =
        instance.Execute(new SolrQuery(q),
            new QueryOptions
            {
                Fields = new[] { "*" },
                Start = 0,
                Rows = 10,
                ExtraParams = new Dictionary<string, string> {
                    { "tv.tf", "false" },
                    { "tv.df", "false" },
                    { "tv.positions", "true" },
                    { "tv", "true" },
                    { "tv.offsets", "false" },
                    { "tv.payloads", "true" },
                    { "tv.fl", "message" }, // change the field name here
                }
            }
        );
    return results.TermVectorResults;
}
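A hypothetical usage of the helper above, mirroring the loop from the question; the query string is only an example and the field configured via tv.fl must exist in your schema:

// Assumes the resultDoc helper above; "markupId:(23)" is just an example query.
var termVectors = resultDoc("markupId:(23)");
foreach (var docVectorResult in termVectors)
{
    foreach (var vectorResult in docVectorResult.TermVector)
        System.Diagnostics.Debug.Print(vectorResult.ToString());
}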
