Initialize an Elasticsearch SearchResponse object before running a query

I am running various Elasticsearch queries from my Java code. To consolidate my code, I would like to initialize a SearchResponse object before my conditional blocks, each of which runs an Elasticsearch query with different settings. That way, the code that gets the total hits from the query only needs to appear once. The code below shows what I mean:
@GET
@Path("/search")
public SearchResultsAndFacets search() {
SearchResultsAndFacets srf = new SearchResultsAndFacets();
RestHighLevelClient client = createHighLevelRestClient();
// Build the base query that applies to all searches
SearchSourceBuilder querySourceBuilder = buildQueryWrapper(colNames, sro.q, sro.f,
facetsToUpdate, sro.u, sro.lc);
SearchResponse searchResponse; // This line does not work. How can I initialize this object here (outside of the following conditional loops)?
// Searches executed from the table view to populate a table of documents
if (searchType.equals("table")) {
List<SortParameters> sortParametersList = sortAdapter(sro.s);
searchResponse = runTableQuery(client, querySourceBuilder, sortParameters, offset, limit);
}
// Searches involving geo_point data to populate a leaflet map
if (searchType.equals("contacts")) {
RestHighLevelClient client = createHighLevelRestClient();
ElasticSearchMapService esms = new ElasticSearchMapService();
searchResponse = esms.runContactsMapQuery(querySourceBuilder, client, <some geographic coordinate parameters necessary for this search>);
MapSearchResponse mapSearchResponse = esms.getLocationsFromSearchResponse(searchResponse);
srf.mapSearchResponse = mapSearchResponse;
}
// I would like to include these next few lines here at the end of the conditional loops.
// Currently they must be inside each if clause.
srf.totalHits = searchResponse.getHits().getTotalHits().value;
srf.elapsed = searchResponse.getTook().getMillis();
srf.facetsData = getUpdatedFacetData(facetsToUpdate,
searchResponse, sro.f);
return srf;
}
Elastic's high-level REST client for Java does not allow initializing a SearchResponse object like this. It is also not possible to do so with
SearchResponse searchResponse = new SearchResponse();
And there is a null pointer error if we do:
SearchResponse searchResponse = new SearchResponse(null);
How can I rewrite this code so that I can fetch totalHits, elapsed and facetsData outside of the conditional loops?
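One possible approach, sketched here under assumptions rather than taken from an accepted answer: Java lets you declare a local variable without a value, but the compiler rejects any read of it unless every path to that read assigns it first (the "definite assignment" rule). With two independent if blocks, neither might run, so the shared lines after them do not compile. Chaining the branches with else if and ending with an else that throws lets the compiler prove the variable is always set. In the sketch below, FakeResponse and the two query methods are hypothetical stand-ins so that it runs without the Elasticsearch client; in the real code the variable would be a SearchResponse.

```java
final class DeferredResponseSketch {

    // Hypothetical stand-in for SearchResponse (not part of the Elasticsearch client).
    static final class FakeResponse {
        final long totalHits;
        final long tookMillis;

        FakeResponse(long totalHits, long tookMillis) {
            this.totalHits = totalHits;
            this.tookMillis = tookMillis;
        }
    }

    // Hypothetical stand-ins for runTableQuery(...) and runContactsMapQuery(...).
    static FakeResponse runTableQuery() {
        return new FakeResponse(42, 7);
    }

    static FakeResponse runContactsMapQuery() {
        return new FakeResponse(3, 12);
    }

    static String search(String searchType) {
        // Declared without an initializer: every branch below must assign it.
        final FakeResponse response;
        if (searchType.equals("table")) {
            response = runTableQuery();
        } else if (searchType.equals("contacts")) {
            response = runContactsMapQuery();
        } else {
            // Throwing here closes the chain, so no path reaches the shared
            // code below with the variable unassigned.
            throw new IllegalArgumentException("Unknown search type: " + searchType);
        }
        // Shared post-processing, written once for all search types.
        return "totalHits=" + response.totalHits + ", elapsed=" + response.tookMillis + "ms";
    }

    public static void main(String[] args) {
        System.out.println(search("table"));
        System.out.println(search("contacts"));
    }
}
```

Writing SearchResponse searchResponse = null; before the if blocks would also compile, but an unmatched search type would then fail later with a NullPointerException instead of a clear error at the point of the mistake.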

Related

Mock Elasticsearch response in .NET

I have code that uses the Elasticsearch NEST library, and I need to mock the response I get from the Elasticsearch index.
var obj = service.Search<TestDocument>(new student().Query());
var Name= obj.Aggs.Terms("Name");
For testing:
I am building the NEST object by hand after inspecting it in Quick Watch, but I am running into a problem: Aggregations is an internal protected property, so I am not able to set its value.
new Nest.KeyedBucket<object>
{
Key="XYZ school",
KeyAsString=null,
Aggregations=new Dictionary<string, IAggregationContainer>{}
}
Please suggest a solution, or any other approach I can use to mock the Elasticsearch NEST object.
If you really want to stub the response from the client, you could do something like the following with Moq:
var client = new Mock<IElasticClient>();
var searchResponse = new Mock<ISearchResponse<object>>();
var aggregations = new AggregateDictionary(new Dictionary<string, IAggregate> {
["Name"] = new BucketAggregate
{
Items = new List<KeyedBucket<object>>
{
new Nest.KeyedBucket<object>(new Dictionary<string, IAggregate>())
{
Key = "XYZ school",
KeyAsString = null,
DocCount = 5
}
}.AsReadOnly()
}
});
searchResponse.Setup(s => s.Aggregations).Returns(aggregations);
client.Setup(c => c.Search<object>(It.IsAny<Func<SearchDescriptor<object>, ISearchRequest>>()))
.Returns(searchResponse.Object);
var response = client.Object.Search<object>(s => s);
var terms = response.Aggregations.Terms("Name");
Another way would be to use the InMemoryConnection and return known JSON in response to a request.
For testing purposes however, it may be better to have an instance of Elasticsearch running, and perform integration tests against it. Take a look at Elastic.Xunit which provides an easy way to spin up an Elasticsearch cluster for testing purposes. This is used by the client in integration tests.
You can get Elastic.Xunit from the Appveyor feed.

QueryContainerDescriptor vs QueryContainer vs QueryBase

Can anyone explain the difference between QueryContainerDescriptor, QueryContainer and QueryBase?
How can I assign a query (or QueryBase) to QueryContainer?
In the code below, I can assign the same TermQuery to QueryBase and QueryContainer objects:
QueryBase bq = new TermQuery
{
Field = Field<POCO>(p => p.Title),
Value = "my_title"
};
QueryContainer tq = new TermQuery
{
Field = Field<POCO>(p => p.Title),
Value = "my_title"
};
Also, I am not sure whether there is any difference between creating a TermQuery using QueryContainerDescriptor and the method above:
QueryContainer qcd = new QueryContainerDescriptor<POCO>().
Term(r => r.Field(f => f.Title).Value("my_title"));
QueryBase is the base type for all concrete query implementations.
QueryContainer is a container for a query. It is used in places where a query is expected.
QueryContainerDescriptor<T> is a type for building a QueryContainer using a builder / fluent interface pattern.
NEST supports both an Object Initializer syntax where requests can be composed through instantiating types and composing an object graph by assigning types to properties, and also a Fluent API syntax, where requests can be composed using Lambda expressions and a fluent interface pattern. All *Descriptor types within NEST are builders for the Fluent API syntax. Use whichever syntax you prefer, or mix and match as you see fit :)
You might be thinking, why do we need QueryContainer, why not just use QueryBase? Well, within the JSON representation, a query JSON object is keyed against the name of the query as a property of an outer containing JSON object, i.e.
{
"query": { // <-- start of outer containing JSON object
"term": { // <-- start of JSON query object
"field": {
"value": "value"
}
}
}
}
Relating back to C# types, QueryBase will be serialized to the query JSON object and QueryContainer will be the outer containing JSON object. To make it easier to compose queries, there are implicit conversions from QueryBase to QueryContainer, so often you just need to instantiate a derived QueryBase implementation and assign it to a property of type QueryContainer:
var client = new ElasticClient();
var termQuery = new TermQuery
{
Field = "field",
Value = "value"
};
var searchRequest = new SearchRequest<MyDocument>
{
Query = termQuery // <-- Query property is of type QueryContainer
};
var searchResponse = client.Search<MyDocument>(searchRequest);
With QueryContainerDescriptor<T>, you often don't need to create an instance outside of the client call, as one will be created within the call. Here's the same request with the Fluent API:
client.Search<MyDocument>(s => s
.Query(q => q
.Term("field", "value")
)
);

Convert Elastic Search Results to POJO

I have a project using the spring-data-elasticsearch library. I've got my system returning results, but I was wondering how to get my results in the form of my domain POJO class.
I'm not seeing much documentation on how to accomplish this, and I don't know what the right question to Google is.
Currently, my code looks like this, and in my tests, it retrieves the right results, but not as a POJO.
QueryBuilder matchQuery = QueryBuilders.queryStringQuery(searchTerm).defaultOperator(QueryStringQueryBuilder.Operator.AND);
Client client = elasticsearchTemplate.getClient();
SearchRequestBuilder request = client
.prepareSearch("mediaitem")
.setSearchType(SearchType.QUERY_THEN_FETCH)
.setQuery(matchQuery)
.setFrom(0)
.setSize(100)
.addFields("title", "description", "department");
System.out.println("SEARCH QUERY: " + request.toString());
SearchResponse response = request.execute().actionGet();
SearchHits searchHits = response.getHits();
SearchHit[] hits = searchHits.getHits();
Any help is greatly appreciated.
One option is to use jackson-databind to map JSON from the search hits to POJOs.
For example:
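// Needs java.io.IOException and java.io.UncheckedIOException in addition to the Jackson import
ObjectMapper objectMapper = new ObjectMapper();
SearchHit[] hits = searchHits.getHits();
// A plain loop is used because readValue throws a checked exception,
// which a forEach lambda cannot propagate
for (SearchHit hit : hits) {
String source = hit.getSourceAsString();
try {
MediaItem mediaItem = objectMapper.readValue(source, MediaItem.class);
// Use media item...
} catch (IOException e) {
throw new UncheckedIOException(e);
}
}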

Elasticsearch NEST: get TopHits result directly without using bucket.TopHits()

With NEST, I am doing a Terms aggregation with an inner TopHits aggregation.
The response object gives me all of the result information except the TopHits values, which I can only read through the TopHits() method.
I would like to have the TopHits values directly in the result, without using NEST's TopHits() method to read them from the aggregations. I would like all of the data in one place, as with a plain Elasticsearch request.
This is what I am currently doing.
My aggregation request:
var response = Client.Search<myclass>(s => s
.Type("type")
.Aggregations(a => a
.Terms("code_bucket", t => t
.Field("field_of_aggregation")
.Size(30)
.Order(TermsOrder.CountAscending)
.Aggregations(a2 => a2
.TopHits("code_bucket_top_hits", th => th.Size(20))
)
)));
I receive a result object in which I can access all of the information except TopHits.
If we examine the result, we can see that the TopHits values are stored in the private field "_hits".
If I serialize the result object to a string, I can see the total number of TopHits, but not the _hits field that holds the documents:
JavaScriptSerializer js = new JavaScriptSerializer();
string json = js.Serialize(response);
The JSON does not contain the TopHits results.
I can access the values, but I need to use the NEST method TopHits():
var firstBucket= response.Aggs.Terms("code_bucket");
foreach (var bucket in firstBucket.Buckets)
{
var hits = bucket.TopHits("code_bucket_top_hits");
foreach (var hit in hits.Documents<myclass>())
{
var prop1= hit.prop1;
var prop2= hit.prop2;
}
}
But it would be really useful if I could have all of the information in one place, as we do when making an Elasticsearch request without NEST.
Do you know if there is a way?
NEST is a higher level abstraction over Elasticsearch that models each request and response with strong types, providing fluent and object initializer syntaxes to build requests, and methods to access portions of the response, without having to handle JSON serialization yourself.
Sometimes however, you might want to manage this yourself, which is what it sounds like you'd like to do. In these cases, Elasticsearch.Net can be used, which is a low level client for Elasticsearch and is unopinionated in how you model your requests and responses.
You can use the client in Elasticsearch.Net instead of NEST, however, the good news is NEST uses Elasticsearch.Net under the covers and also exposes the low level client through the .LowLevel property on IElasticClient. Why would you want to use the lowlevel client on NEST as opposed to just using Elasticsearch.Net directly? A major reason to do so is that you can take advantage of strong types for requests and responses when you need to and leverage NEST's usage of Json.NET for serialization, but bypass this and make calls with the low level client when you want/need to.
Here's an example:
var client = new ElasticClient();
var searchRequest = new SearchRequest<Question>
{
Size = 0,
Aggregations = new TermsAggregation("top_tags")
{
Field = "tags",
Size = 30,
Order = new[] { TermsOrder.CountAscending },
Aggregations = new TopHitsAggregation("top_tag_hits")
{
Size = 20
}
}
};
var searchResponse = client.LowLevel.Search<JObject>("posts", "question", searchRequest);
// searchResponse.Body will be of type JObject. Do something with it
var body = searchResponse.Body;
Here, I can use NEST's object initializer syntax to construct a request, but use the low level client to deserialize the response to a Json.NET JObject. You can deserialize to a T of your choosing by changing it in client.LowLevel.Search<T>(). You could for example use
var searchResponse = client.LowLevel.Search<string>("posts", "question", searchRequest);
to return a string, or
var searchResponse = client.LowLevel.Search<Stream>("posts", "question", searchRequest);
to return a stream, etc.

Bulk Update on Elasticsearch using NEST

I am trying to replace documents in Elasticsearch using NEST. I see that the following options are available.
Option #1:
var documents = new List<dynamic>();
var blkOperations = documents.Select(doc => new BulkIndexOperation<T>(doc)).Cast<IBulkOperation>().ToList();
var blkRequest = new BulkRequest()
{
Refresh = true,
Index = indexName,
Type = typeName,
Consistency = Consistency.One,
Operations = blkOperations
};
var response1 = _client.Raw.BulkAsync<T>(blkRequest);
Option #2:
var descriptor = new BulkDescriptor();
foreach (var eachDoc in document)
{
var doc = eachDoc;
descriptor.Index<T>(i => i
.Index(indexName)
.Type(typeName)
.Document(doc));
}
var response = await _client.Raw.BulkAsync<T>(descriptor);
So can anyone tell me which one is better, or whether there is another option for doing bulk updates or deletes using NEST?
You are passing the bulk request to the ElasticsearchClient i.e. ElasticClient.Raw, when you should be passing it to ElasticClient.BulkAsync() or ElasticClient.Bulk() which can accept a bulk request type.
Using BulkRequest or BulkDescriptor are two different approaches that are offered by NEST for writing queries; the former uses an Object Initializer Syntax for building up a request object while the latter is used within the Fluent API to build a request using lambda expressions.
In your example, BulkDescriptor is used outside of the context of the fluent API, but both BulkRequest and BulkDescriptor implement IBulkRequest so can be passed to ElasticClient.Bulk(IBulkRequest).
As for which to use, in this case it doesn't matter, so use whichever you prefer.
