Filtering global aggregation in Elastica - elasticsearch

I have an Elasticsearch query built with ruflin/Elastica that uses a global aggregation. Is it possible to add filters to the aggregation that are separate from my main query?
It looks like this:
$query = new Query($boolQuery);
$categoryAggregation = new Terms('category_ids');
$categoryAggregation->setField('category_ids');
$categoryAggregation->setSize(0);
$manufacturerAggregation = new Terms('manufacturer_ids');
$manufacturerAggregation->setField('manufacturer_id');
$manufacturerAggregation->setSize(0);
$globalAggregation = new GlobalAggregation('global');
$globalAggregation->addAggregation($categoryAggregation);
$globalAggregation->addAggregation($manufacturerAggregation);
$query->addAggregation($globalAggregation);
I would like to add some custom filters to the manufacturer_ids and category_ids aggregations. At the moment they are aggregated from all documents. Is there a way to apply such filtering through the Elastica API?

I found it myself through trial and error. It goes as follows:
$categoryAggregation = new Terms('category_ids');
$categoryAggregation->setField('category_ids');
$categoryAggregation->setSize(0);
$filter = new Filter('category_ids', $merchantIdQuery);
$filter->addAggregation($categoryAggregation);
$globalAggregation = new GlobalAggregation('global');
$globalAggregation->addAggregation($filter);
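For reference, the aggregation tree built above corresponds to a request body in which the filter aggregation sits between the global aggregation and the terms aggregation. The following is a minimal Java sketch that only assembles that JSON with string concatenation. The class name, the merchant_id term filter and its value are hypothetical stand-ins for $merchantIdQuery, and the size setting and the main query are left out.

```java
public class GlobalFilterAggSketch {

    // Wraps a terms aggregation in a filter aggregation, and that in a global aggregation.
    // filterJson is any query clause as JSON; it is inserted verbatim and not validated.
    public static String build(String name, String field, String filterJson) {
        String terms = "{\"terms\":{\"field\":\"" + field + "\"}}";
        String filtered = "{\"filter\":" + filterJson
                + ",\"aggs\":{\"" + name + "\":" + terms + "}}";
        return "{\"aggs\":{\"global\":{\"global\":{},\"aggs\":{\""
                + name + "\":" + filtered + "}}}}";
    }

    public static void main(String[] args) {
        // Hypothetical filter standing in for $merchantIdQuery.
        String filter = "{\"term\":{\"merchant_id\":42}}";
        System.out.println(build("category_ids", "category_ids", filter));
    }
}
```

Because the filter lives inside the global aggregation, it narrows only the documents counted by the nested terms aggregation and leaves the main query's hits untouched.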

Related

How can I create Bulk CRUD Operations request in Elasticsearch version 8 using JAVA APIs?

We want to create IndexRequest, DeleteRequest, UpdateRequest and BulkRequest objects in Elasticsearch version 8 using the Java APIs, but I can't find any Java documentation for this on the official Elasticsearch v8 website. Previously, in Elasticsearch version 7, we used the code below to perform these operations.
IndexRequest indexRequest = Requests.indexRequest(index).id(key).source(source);
BulkRequest bulkRequest = Requests.bulkRequest();
bulkRequest.add(indexRequest);
I am also following the Elasticsearch Java API Client [8.1] documentation, but no luck.
The problem arises when we try to call Requests.indexRequest(): the Requests class is not available in version 8.
So, is it possible to create similar requests in ES version 8?
Update 1:-
My point here is that I need to keep a list of request operations in arbitrary order (for example, the first five are inserts, the next two are updates, the next two are deletes, and the last one is an insert). That list needs to be flushed via a bulk request while preserving the type of each request received. I am using BulkRequest.Builder bulkRequestBuilder = new BulkRequest.Builder();
But my issue is with bulk update. I am unable to find any update API for a bulk request in Elasticsearch version 8.
For Insert:-
bulkRequestBuilder.operations(op -> op.index(idx -> idx.index(index).id(key).document(source)));
For Delete:-
bulkRequestBuilder.operations(op -> op.delete(d -> d.index(index).id(key)));
And flushing the bulk operation:-
BulkResponse bulkResponse = client.bulk(bulkRequestBuilder.build());
I am looking for something similar to the insert and delete operations shown above.
Like, bulkRequestBuilder.operations(op->op.update(u->u.index(index).id(key)....))
You can refer to the following code, which I used for bulk indexing.
String PRODUCT_INDEX = "product";
final BulkResponse bulkResponse = esUtil.getESClient().bulk(builder -> {
    for (Product product : products) {
        builder.index(PRODUCT_INDEX)
            .operations(ob -> {
                // One index operation per product, routed through the "score" pipeline.
                ob.index(ib -> ib.document(product).pipeline("score").id(product.getId()));
                return ob;
            });
    }
    return builder;
});
You can use the fluent DSL, as mentioned here:
List<Product> products = fetchProducts();
BulkRequest.Builder br = new BulkRequest.Builder();
for (Product product : products) {
br.operations(op -> op
.index(idx -> idx
.index("products")
.id(product.getSku())
.document(product)
)
);
}
BulkResponse result = esClient.bulk(br.build());
You can use the classic builder as shown below (not recommended):
// Build the index operation, then wrap it in a BulkOperation before adding it to the list.
IndexOperation<Product> indexOp = new IndexOperation.Builder<Product>()
    .index("product")
    .id("id")
    .document(product)
    .build();
List<BulkOperation> list = new ArrayList<BulkOperation>();
list.add(new BulkOperation.Builder().index(indexOp).build());
BulkRequest.Builder br = new BulkRequest.Builder();
br.index("");
br.operations(list);
client.bulk(br.build());
Update Request:
As mentioned here in the documentation, the UpdateRequest class takes TDocument and TPartialDocument as type parameters. When you want to send the document as a partial document (an update only), use TPartialDocument; when you want to send the document as an upsert, use TDocument. You can pass the Void class for the other parameter.
You can also check this discussion, which should give you some more understanding.
client.update(b -> b
.index("")
.id("")
.doc(product),
Product.class
);
client.update(new UpdateRequest.Builder<Void, Product>()
.index("product")
.id("")
.doc(product)
.build(),
Void.class
);
Bulk Update request:
BulkRequest.Builder br = new BulkRequest.Builder();
for (Product product : products) {
ObjectMapper mapper = new ObjectMapper();
String json = mapper.writeValueAsString(product);
br.operations(op -> op
.update(idx -> idx.index("products").id("").withJson(new StringReader(json))));
}
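To relate the builders above to the original question about flushing inserts, updates and deletes in arrival order, it may help to look at what a bulk request carries at the REST level: newline-delimited JSON in which each action line is followed by a source line, except for delete, which has none, and where an update's source line wraps the partial document in a "doc" field. The following is a minimal stdlib-only sketch of that layout. The Op holder and the class name are hypothetical and not part of any Elasticsearch client, and index names and ids are assumed to need no JSON escaping.

```java
import java.util.ArrayList;
import java.util.List;

public class BulkBodySketch {

    // Hypothetical holder for one queued operation; not an Elasticsearch client type.
    public static final class Op {
        final String action;  // "index", "update" or "delete"
        final String index;
        final String id;
        final String payload; // JSON source line, or null for delete

        public Op(String action, String index, String id, String payload) {
            this.action = action;
            this.index = index;
            this.id = id;
            this.payload = payload;
        }
    }

    // Serializes the operations in list order as newline-delimited JSON.
    public static String toNdjson(List<Op> ops) {
        StringBuilder sb = new StringBuilder();
        for (Op op : ops) {
            sb.append("{\"").append(op.action).append("\":{\"_index\":\"")
              .append(op.index).append("\",\"_id\":\"").append(op.id).append("\"}}\n");
            if (op.payload != null) {
                sb.append(op.payload).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<Op> ops = new ArrayList<>();
        ops.add(new Op("index", "products", "1", "{\"name\":\"bike\"}"));
        // An update carries the partial document under "doc".
        ops.add(new Op("update", "products", "1", "{\"doc\":{\"name\":\"city bike\"}}"));
        ops.add(new Op("delete", "products", "2", null));
        System.out.print(toNdjson(ops));
    }
}
```

Keeping the operations in a single ordered list mirrors the requirement in the question: the mix of operation types is preserved exactly as it was queued.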

Aggregations in Jest client without JSON query

While exploring aggregations in Elasticsearch, I found that aggregations can be implemented via a JSON query in the HTTP-based Jest client but not in the TCP-based Java client.
I am using the Jest client and have implemented aggregations through a query string, which works fine. But I feel it gets quite cumbersome as the filters increase.
I want to know whether there is a way to implement aggregations in the Jest client other than using a JSON query (something like the aggregation builders in the TCP client), and how to implement it.
This was my solution for implementing aggregations. Don't forget to add the query first.
org.elasticsearch.search.builder.SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
TermsAggregationBuilder termsAggregationBuilder =
AggregationBuilders.terms(fieldName).field("keyword");
termsAggregationBuilder.minDocCount(minDocCount);
termsAggregationBuilder.size(size);
searchSourceBuilder.aggregation(termsAggregationBuilder);
Search search = new io.searchbox.core.Search.Builder(searchSourceBuilder.toString())
.addIndex(indexNameBuilder.getIndexName(searchContext.getCompanyId()))
.build();
SearchResult result = jestConnectionManager.getClient().execute(search);
Here's what I landed on:
SearchSourceBuilder searchBuilder = SearchSourceBuilder
.searchSource()
.size(0)
.query(QueryBuilders.termQuery("field", "my value"))
.aggregation(
AggregationBuilders
.sum("number_field_sum")
.field("number_field")
);
Search search = new Search.Builder(searchBuilder.toString())
.addIndex("my-index")
.build();
SearchResult result = jestClient.execute(search);
This is largely the same as what you came up with, but I put the query into the SearchSourceBuilder and I simplified the example a bit.

NEST: How can I do different operations and mapping types in one bulk request?

I have a list of "event" objects. Each event has its operation (delete, update, index, etc), its mapping type (document, folder, etc.), and the actual content to be indexed into Elasticsearch, if any. I don't know what any of these operations will be in advance. How can I use NEST to dynamically choose the bulk operation and mapping type for each of these events?
The Bulk method on ElasticClient should fit your requirements.
You can pass various bulk operations into the BulkRequest. This is a simple usage:
var bulkRequest = new BulkRequest();
bulkRequest.Operations = new List<IBulkOperation>
{
new BulkCreateDescriptor<Document>().Id(1).Document(new Document{}),
new BulkDeleteDescriptor<Document>().Id(2)
};
var bulkResponse = client.Bulk(bulkRequest);
Hope it helps.

Get raw query from NEST client

Is it possible to get the raw search query from the NEST client?
var result = client.Search<SomeType>(s => s
.AllIndices()
.Type("SomeIndex")
.Query(query => query
.Bool(boolQuery => BooleanQuery(searchRequest, mustMatchQueries)))
);
I'd really like to debug why I am getting certain results.
The methods to do this seem to change with each major version, hence the confusing number of answers. If you want this to work in NEST 6.x, and you want to see the serialized request before it's actually sent, it's fairly easy:
var json = elasticClient.RequestResponseSerializer.SerializeToString(request);
If you're debugging in Visual Studio, it's handy to put a breakpoint right after this line, and when you hit it, hover over the json variable above and hit the magnifying glass thingy. You'll get a nice formatted view of the JSON.
You can get raw query json from RequestInformation:
var rawQuery = Encoding.UTF8.GetString(result.RequestInformation.Request);
Or enable tracing on your ConnectionSettings object, so that NEST prints every request to the trace output:
var connectionSettings = new ConnectionSettings(new Uri(elasticsearchUrl));
connectionSettings.EnableTrace(true);
var client = new ElasticClient(connectionSettings);
NEST 7.x
Enable debug mode when creating settings for a client:
var settings = new ConnectionSettings(connectionPool)
.DefaultIndex("index_name")
.EnableDebugMode();
var client = new ElasticClient(settings);
Then response.DebugInformation will contain information about the request sent to Elasticsearch and the response it returned. Docs.
For NEST / Elasticsearch.NET v6.0.2, use the ApiCall property of the IResponse object. You can write a handy extension method like this:
public static string ToJson(this IResponse response)
{
return Encoding.UTF8.GetString(response.ApiCall.RequestBodyInBytes);
}
Or, if you want to log all requests made to Elastic, you can intercept responses with the connection object:
var node = new Uri("https://localhost:9200");
var pool = new SingleNodeConnectionPool(node);
var connectionSettings = new ConnectionSettings(pool, new HttpConnection());
connectionSettings.OnRequestCompleted(call =>
{
Debug.Write(Encoding.UTF8.GetString(call.RequestBodyInBytes));
});
In ElasticSearch 5.x, the RequestInformation.Request property does not exist in ISearchResponse<T>, but similar to the answer provided here you can generate the raw query JSON using the Elastic Client Serializer and a SearchDescriptor. For example, for the given NEST search query:
var results = elasticClient.Search<User>(s => s
.Index("user")
.Query(q => q
.Exists(e => e
.Field("location")
)
)
);
You can get the raw query JSON as follows:
SearchDescriptor<User> debugQuery = new SearchDescriptor<User>()
.Index("user")
.Query(q => q
.Exists(e => e
.Field("location")
)
)
;
using (MemoryStream mStream = new MemoryStream())
{
elasticClient.Serializer.Serialize(debugQuery, mStream);
string rawQueryText = Encoding.UTF8.GetString(mStream.ToArray());
}
Before making the request, you can serialize the NEST query (for NEST 5.3.0):
var stream = new System.IO.MemoryStream();
elasticClient.Serializer.Serialize(query, stream );
var jsonQuery = System.Text.Encoding.UTF8.GetString(stream.ToArray());
Edit: This changed in NEST 6.x, where you can do the following:
var json = elasticClient.RequestResponseSerializer.SerializeToString(request);
In NEST version 6, use
connectionSettings.DisableDirectStreaming();
Then response.DebugInformation will show all the information.
Use result.ConnectionStatus.Request.
If you are using NEST 7 and don't want to enable debug mode:
public static string GetQuery<T>(this IElasticClient client, SearchDescriptor<T> searchDescriptor) where T : class
{
using (System.IO.MemoryStream ms = new System.IO.MemoryStream())
{
client.RequestResponseSerializer.Serialize(searchDescriptor, ms);
return Encoding.UTF8.GetString(ms.ToArray());
}
}
While it's possible to get the raw request and response through code, I find it much easier to analyze them with Fiddler.
The reason is that I can easily inspect the raw request, response, headers, full URL and execution time all together, without the hassle of any code changes.
Here are some reference links in case someone unfamiliar with Fiddler wants to check the details:
#1 https://www.elastic.co/guide/en/elasticsearch/client/net-api/current/logging-with-fiddler.html
#2 NEST 1.0: See request on Fiddler
#3 https://newbedev.com/how-to-get-nest-to-work-with-proxy-like-fiddler
How about using Fiddler ?! :)

How to execute RemoveAliasMapping in ElasticSearch using JEST

I am trying to remove an alias mapping for an index in ES using Jest.
Here is what I have tried:
// create Jest Client.
JestClient client = factory.getObject();
// create RemoveAliasMapping Object.
RemoveAliasMapping removeAliasMapping = new RemoveAliasMapping.Builder("oldIndex", "alias").build();
After creating the removeAliasMapping object, I couldn't find a way to execute it.
If I call client.execute(removeAliasMapping), it says: The method execute(Action<T>) in the type JestClient is not applicable for the arguments (RemoveAliasMapping)
Also, I couldn't find any other API exposed to execute an AliasMapping.
Can anyone help me out with this? If possible, please include an example too.
Try this:
ModifyAliases modifyAliases = new ModifyAliases.Builder(new RemoveAliasMapping.Builder("oldIndex", "alias").build()).build();
JestResult result = client.execute(modifyAliases);
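At the REST level, alias changes are sent to the _aliases endpoint as a list of actions, which is presumably why a lone RemoveAliasMapping is not executable on its own and has to be wrapped in ModifyAliases. The following stdlib-only sketch shows the shape of that request body. The class and method names are hypothetical, and index and alias names are assumed to need no JSON escaping.

```java
public class RemoveAliasBodySketch {

    // One "remove" entry for the actions list of the _aliases endpoint.
    public static String removeAction(String index, String alias) {
        return "{\"remove\":{\"index\":\"" + index + "\",\"alias\":\"" + alias + "\"}}";
    }

    // Wraps any number of action entries into a single request body.
    public static String aliasesBody(String... actions) {
        return "{\"actions\":[" + String.join(",", actions) + "]}";
    }

    public static void main(String[] args) {
        System.out.println(aliasesBody(removeAction("oldIndex", "alias")));
    }
}
```

Several actions, for example a remove followed by an add, can be passed to the same body so that they travel in one request.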

Resources