Parsing query string with custom aggregation, using Search API - spring-boot

I've installed a plugin in Elasticsearch (https://github.com/opendatasoft/elasticsearch-aggregation-geoclustering) and I have an endpoint (using the high-level REST client) that takes an ES query body string and builds the request from it with SearchSourceBuilder.
Using Kibana, the plugin works fine:
{
"size": 0,
"query": {
"match_all": {}
},
"aggs": {
"clusters": {
"geo_point_clustering": {
"field": "geo.point",
"zoom": 5
}
}
}
}
And I get the expected result.
However, SearchSourceBuilder has trouble parsing the "geo_point_clustering" aggregation.
Exception:
threw exception [Request processing failed; nested exception is ParsingException[Unknown aggregation type [geo_point_clustering]]; nested: NamedObjectNotFoundException[[31:30] unknown field [geo_point_clustering]];] with root cause
The exception is thrown where the request body is converted into a SearchSourceBuilder:
val searchSourceBuilder = SearchSourceBuilder().timeout(TimeValue(5, TimeUnit.SECONDS))
val searchModule = SearchModule(Settings.EMPTY, false, emptyList())
val xContentRegistry = NamedXContentRegistry(searchModule.namedXContents)
searchSourceBuilder.parseXContent(
XContentFactory.xContent(XContentType.JSON)
.createParser(xContentRegistry,
DeprecationHandler.THROW_UNSUPPORTED_OPERATION, body)
)
So my question is (I think): how can I add the custom aggregation type to this parser?
Thanks!

Related

Multimatch query failure

Hi team, I am using Elasticsearch again after a long time and am having some difficulty with multi_match queries.
Essentially, my query needs an exact match on 2 fields and a text search on 4 fields.
Here is the query I prepared, which gives the expected results:
GET /maintenance_logs/_search
{
"query": {
"bool": {
"must": [
{
"term": {
"vinNumber.keyword": "DH34ASD7SDFF84742"
}
},
{
"term": {
"organizationId": 1
}
}
],
"minimum_should_match": 1,
"should": [
{
"multi_match": {
"query": "Cabin pressure",
"fields": [
"dtcCode",
"subSystem",
"maintenanceActivity",
"description"
]
}
}
]
}
}
}
However, when I try to convert this to the Elasticsearch Java client, I am unable to get data back.
Here is the error:
org.elasticsearch.ElasticsearchStatusException: Elasticsearch exception [type=search_phase_execution_exception, reason=all shards failed]
Suppressed: org.elasticsearch.client.ResponseException: method [POST], host [http://localhost:9200], URI [/maintenance_logs/_search?typed_keys=true&max_concurrent_shard_requests=5&ignore_unavailable=false&expand_wildcards=open&allow_no_indices=true&ignore_throttled=true&search_type=query_then_fetch&batched_reduce_size=512&ccs_minimize_roundtrips=true], status line [HTTP/1.1 400 Bad Request]
{"error":{"root_cause":[{"type":"query_shard_exception","reason":"failed to create query: For input string: \"DH34ASD7SDFF84742\"","index_uuid":"VnZg-NkFQmS-nSHbYemZkQ","index":"maintenance_logs"}],"type":"search_phase_execution_exception","reason":"all shards failed","phase":"query","grouped":true,"failed_shards":[{"shard":0,"index":"maintenance_logs","node":"_XMtzvY5TW-02IijVrR2Ww","reason":{"type":"query_shard_exception","reason":"failed to create query: For input string: \"DH34ASD7SDFF84742\"","index_uuid":"VnZg-NkFQmS-nSHbYemZkQ","index":"maintenance_logs","caused_by":{"type":"number_format_exception","reason":"For input string: \"DH34ASD7SDFF84742\""}}}]},"status":400}
at org.elasticsearch.client.RestClient.convertResponse(RestClient.java:326) ~[elasticsearch-rest-client-7.12.1.jar:7.12.1]
at org.elasticsearch.client.RestClient.performRequest(RestClient.java:296) ~[elasticsearch-rest-client-7.12.1.jar:7.12.1]
at org.elasticsearch.client.RestClient.performRequest(RestClient.java:270) ~[elasticsearch-rest-client-7.12.1.jar:7.12.1]
at org.elasticsearch.client.RestHighLevelClient.internalPerformRequest(RestHighLevelClient.java:1654) ~[elasticsearch-rest-high-level-client-7.12.1.jar:7.12.1]
Here is the Request:
public static SearchRequest buildSearchRequest(final String indexName,
final ElasticSearchDTO dto,
final MaintenanceLogsSearchDto maintenanceLogsSearchDto,
Pageable pageable) {
try {
final int page = pageable.getPageNumber();
final int size = pageable.getPageSize();
final int from = page <= 0 ? 0 : pageable.getPageSize();
SearchRequest searchRequest = new SearchRequest(indexName);
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
final QueryBuilder vinQuery = QueryBuilders.termsQuery("vinNumber.keyword", maintenanceLogsSearchDto.getVinNumber());
final QueryBuilder orgIdQuery = QueryBuilders.termsQuery("organizationId", maintenanceLogsSearchDto.getVinNumber());
boolQueryBuilder.must(vinQuery);
boolQueryBuilder.must(orgIdQuery);
boolQueryBuilder.minimumShouldMatch(1);
MultiMatchQueryBuilder multiMatchQueryBuilder = new MultiMatchQueryBuilder(dto.getSearchTerm(), "dtcCode", "subSystem", "maintenanceActivity", "description");
multiMatchQueryBuilder.operator(Operator.AND);
boolQueryBuilder.should(multiMatchQueryBuilder);
searchSourceBuilder.query(boolQueryBuilder);
searchSourceBuilder
.from(from)
.size(size)
.sort(SortBuilders.fieldSort("statsDate")
.order(SortOrder.DESC));
searchRequest.source(searchSourceBuilder);
return searchRequest;
} catch (final Exception e) {
e.printStackTrace();
return null;
}
}
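A side note on the paging arithmetic in the method above: `from` is set to the page size for every page after the first, so pages 2, 3 and onward would all start at the same offset. A minimal sketch of the usual zero-based calculation, assuming `page` and `size` carry the values of `pageable.getPageNumber()` and `pageable.getPageSize()`; the class and method names are illustrative only and not part of the original code:

```java
final class PageOffset {
    // Zero-based page index -> index of the first hit to request.
    // Negative page numbers are clamped to the first page.
    static int from(int page, int size) {
        return Math.max(page, 0) * size;
    }

    public static void main(String[] args) {
        System.out.println(from(0, 10)); // 0
        System.out.println(from(2, 10)); // 20
    }
}
```

Spring Data's Pageable also exposes getOffset(), which may be simpler than computing the product by hand.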
I would also like to improve my search functionality with another API, where I would like to run something like this:
"select * from maintenance_logs where subSystem in ["cabin pressure", "Engine, ABS"] or dtcCode in ["p100", "p200", "p300"]"
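The SQL-like filter above maps fairly naturally onto a bool query with two terms clauses under should. Below is a rough sketch that assembles such a request body with plain JDK string handling. The `.keyword` sub-fields are an assumption (terms matches exact values, so it normally needs a non-analyzed field), and the helper names are made up for illustration:

```java
import java.util.List;
import java.util.stream.Collectors;

final class TermsQueryJson {
    // Builds {"terms":{"<field>":["v1","v2"]}}.
    // Assumes field names and values contain nothing that needs JSON escaping.
    static String terms(String field, List<String> values) {
        String joined = values.stream()
                .map(v -> "\"" + v + "\"")
                .collect(Collectors.joining(","));
        return "{\"terms\":{\"" + field + "\":[" + joined + "]}}";
    }

    // Wraps the clauses in bool/should so that matching any one clause is enough.
    static String anyOf(List<String> clauses) {
        return "{\"query\":{\"bool\":{\"should\":[" + String.join(",", clauses)
                + "],\"minimum_should_match\":1}}}";
    }

    public static void main(String[] args) {
        String body = anyOf(List.of(
                terms("subSystem.keyword", List.of("cabin pressure", "Engine, ABS")),
                terms("dtcCode.keyword", List.of("p100", "p200", "p300"))));
        System.out.println(body);
    }
}
```

With the high-level REST client, the same structure would presumably be built with QueryBuilders.boolQuery().should(QueryBuilders.termsQuery(...)) rather than by concatenating strings.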
The issue appears to come from the value returned by maintenanceLogsSearchDto.getVinNumber(). In your Java code it is passed not only to vinNumber.keyword (vinQuery) but also to organizationId (orgIdQuery). Your working query sends a number (1) for organizationId, so when the VIN DH34ASD7SDFF84742 is sent for that field, Elasticsearch tries to convert it to a number and fails.
The following part of your log hints at the issue:
"reason":"failed to create query: For input string: \"DH34ASD7SDFF84742\"","index_uuid":"VnZg-NkFQmS-nSHbYemZkQ","index":"maintenance_logs","caused_by":{"type":"number_format_exception","reason":"For input string: \"DH34ASD7SDFF84742\""}}}]},"status":400}
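The number_format_exception in that response has the same shape as the message the JDK produces when a non-numeric string is parsed as a number. A small illustration, assuming a VIN-like string reaches a numeric field (in the posted code, getVinNumber() is also passed to organizationId). This is not Elasticsearch's own code, only a demonstration of where such a message comes from, and the class name is hypothetical:

```java
final class VinParseDemo {
    // Returns the parse error message, or null if the value is a valid long.
    static String parseError(String value) {
        try {
            Long.parseLong(value);
            return null;
        } catch (NumberFormatException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // A VIN is not numeric, so parsing fails with a "For input string" message.
        System.out.println(parseError("DH34ASD7SDFF84742"));
        // A numeric id parses cleanly, so no error message is returned.
        System.out.println(parseError("1"));
    }
}
```

Passing the organization id (for example via a separate getter on the DTO, if one exists) instead of the VIN should avoid the conversion error.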

Elasticsearch Jest client add condition to json query

I am using Elasticsearch 6.3 with Jest client 6.3 (Java API)
Search search = new Search.Builder(jsonQueryString)
.addIndex("SOME_INDEX")
.build();
SearchResult result = jestClient.execute(search);
And this is my sample JSON query
{
"query": {
"bool" : {
"filter": {
"match" :{
"someField" : "some value"
}
}
}
}
}
The JSON query string is accepted as a POST request body and then passed to the Jest client. Before I can execute the JSON query on the Jest client, I need to add conditions to the query, for example:
{
"query": {
"bool" : {
"filter": {
"match" :{
"someField" : "some value"
}
},
"must": {
"match" :{
"systemField" : "pre-defined value"
}
}
}
}
}
Is there an API that can parse the JSON query and add conditions to it before it is executed on the Jest client? The JSON query can be any query supported by the Query DSL and does not necessarily contain a bool condition. I need to add a pre-defined condition to the query. I appreciate any help on this. Thanks very much.
There is no out-of-the-box Elasticsearch or Jest API for this; the workaround I implemented uses Jackson's ObjectMapper:
// convert the search request body into object node
ObjectNode searchRequestNode = objectMapper.readValue(queryString, ObjectNode.class);
// extract the query
String query = searchRequestNode.get("query").toString();
// wrap the original query and add conditions
BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
boolQueryBuilder.must(QueryBuilders.wrapperQuery(query));
boolQueryBuilder.filter(QueryBuilders.termsQuery("fieldA", listOfValues));
boolQueryBuilder.filter(QueryBuilders.termQuery("fieldB", value));
// convert querybuilder to json query string
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
searchSourceBuilder.query(boolQueryBuilder);
String queryWithFilters = searchSourceBuilder.toString();
// convert json string to object node
ObjectNode queryNode = objectMapper.readValue(queryWithFilters, ObjectNode.class);
// replace original query with the new query containing added conditions
searchRequestNode.set("query", queryNode.get("query"));
String finalSearchRequestWithOwnFilters = searchRequestNode.toString();

ElasticSearch NEST Executing raw Query DSL

I'm trying to create the simplest proxy possible in an API to execute searches on ElasticSearch nodes. The only reason for the proxy to be there is to "hide" the credentials and abstract ES from the API endpoint.
Using Nest.ElasticClient, is there a way to execute a raw string query?
Example query that is valid in vanilla ES:
{
"query": {
"fuzzy": { "title": "potato" }
}
}
In my API, I tried deserializing the raw string into a SearchRequest, but it fails. I'm assuming it cannot deserialize the field:
var req = m_ElasticClient.Serializer.Deserialize<SearchRequest>(p_RequestBody);
var res = m_ElasticClient.Search<T>(req);
return m_ElasticClient.Serializer.SerializeToString(res);
System.InvalidCastException: Invalid cast from 'System.String' to 'Newtonsoft.Json.Linq.JObject'.
Is there a way to just forward the raw string query to ES and return the string response? I tried using the LowLevel.Search method without luck.
NEST does not support deserializing the short form "field_name" : "your_value" of the Elasticsearch Query DSL, but it does support the long form "field_name" : { "value" : "your_value" }, so the following works:
var client = new ElasticClient();
var json = @"{
""query"": {
""fuzzy"": {
""title"": {
""value"": ""potato""
}
}
}
}";
SearchRequest searchRequest;
using (var stream = new MemoryStream(Encoding.UTF8.GetBytes(json)))
{
searchRequest = client.Serializer.Deserialize<SearchRequest>(stream);
}
As Rob has answered, NEST also supports supplying a raw JSON string as a query.
Yes, you can do this with NEST; check out the following:
var searchResponse = client.Search<object>(s => s
.Type("type").Query(q => q.Raw(@"{""match_all"":{}}")));
Hope that helps.

Obtaining string query (JSON) from SearchQuery object

For debugging purposes, I need to know what query spring-data-elasticsearch is sending to the Elasticsearch cluster. I have tried calling the toString method on the SearchQuery object, but it doesn't return what I need.
What I am doing in Java (using spring-data-elasticsearch) is:
private FilterBuilder getFilterBuilder(String id) {
return orFilter(
termFilter("yaddayaddayadda.id", id),
termFilter("blahblahblah.id", id)
);
}
SearchQuery sq = new NativeSearchQueryBuilder()
.withQuery(new MatchAllQuery())
.withFilter(fb)
.build();
I expect to get back something like the plain query that the ES cluster's REST API would receive:
{
"query": {
"filtered": {
"filter": {
"or": [
{
"term": {
"yaddayaddayadda.id": "9"
}
},
{
"term": {
"blahblahblah.id": "9"
}
}
]
}
}
}
}
Thanks in advance!
One way to achieve this is to log the queries on the ES/server-side into the slowlog file. Open your elasticsearch.yml config file and towards the bottom uncomment/edit the two lines below:
...
index.search.slowlog.threshold.query.info: 1ms
...
index.search.slowlog.threshold.fetch.info: 1ms
...
The advantage of this solution is that whatever client technology you're using to query your ES server (Spring Data, Ruby, Browser, Javascript, etc), you'll be able to dump and debug your queries in a single location.
The SearchQuery interface has getQuery() and getFilter() methods that give you the information you need.
System.out.println(searchQuery.getQuery());
System.out.println(searchQuery.getFilter());
Hope this helps.
When using SearchRequest or SearchSourceBuilder, calling toString() on the instance gives you the actual JSON query:
SearchRequest searchRequest = new SearchRequest("index");
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
// building the query
// ...
searchSourceBuilder.query(query);
searchRequest.source(searchSourceBuilder);
System.out.println(searchSourceBuilder.toString()); // prints json query
System.out.println(searchRequest.toString()); // prints json query + other information

Different set of results for "significant terms" in Elasticsearch using REST Api or Transportclient

We use the new significant terms plugin in Elasticsearch. Using the transport client, I get fewer results than when I use the REST API, and I don't understand why. Using the node client is unfortunately not possible, since my service using ES is not in the same network. Why are the results different?
Here is the REST call:
POST /searchresults_sharded/article/_search
{
"query": {
"match": {
"titlebody": {
"query": "japanische hundenamen",
"operator": "and"
}
}
},
"aggregations": {
"searchresults": {
"significant_terms": {
"field": "titlebody",
"size": 100
}
}
}
}
and here is the Scala request-building code:
val builder = reqBuilder.searchReqBuilder
builder.setIndices(indexCoords.indexName)
builder.setTypes(indexCoords.typeName)
builder.setQuery(QueryBuilders.matchQuery(indexCoords.field, keywords.mkString(" ")).operator(MatchQueryBuilder.Operator.AND))
val sigTermAggKey: String = "significant-term"
val sigTermBuilder = new SignificantTermsBuilder(sigTermAggKey)
sigTermBuilder.field(indexCoords.field)
sigTermBuilder.size(size)
builder.addAggregation(sigTermBuilder)
I called toString on the builder and found the reason: the two requests used different size parameters. Both sizes (20 and 100) were larger than the number of significant-terms buckets actually returned (7 compared to 1), but the size parameter still seems to affect how many buckets are returned, even when that number is far below the requested size.
