I have a simple method that performs a terms aggregation using Elasticsearch 8.0. I am able to do it with RestHighLevelClient, but with the new ElasticsearchClient I am getting empty buckets.
Can someone please help me resolve this?
public void aggregate(ElasticsearchClient client) throws ElasticsearchException, IOException {
    String field = "loglevel";
    Map<String, Long> buckets = new HashMap<>();
    SearchResponse<SspDevLog> response = client.search(fn -> fn
            .aggregations("loglevel", a -> a.terms(v -> v.field(field))), SspDevLog.class);

    Map<String, Aggregate> aggrs = response.aggregations();
    for (Map.Entry<String, Aggregate> entry : aggrs.entrySet()) {
        Aggregate aggregate = entry.getValue();
        StringTermsAggregate sterms = aggregate.sterms();
        Buckets<StringTermsBucket> sbuckets = sterms.buckets();
        List<StringTermsBucket> bucArr = sbuckets.array();
        for (StringTermsBucket bucObj : bucArr) {
            buckets.put(bucObj.key(), bucObj.docCount());
        }
    }
    System.out.println(buckets);
}
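For reference, the RestHighLevelClient version that returns the expected buckets looks roughly like this (a sketch only; the index name "logs" is an assumption on my part):

// "logs" is a placeholder index name
SearchRequest request = new SearchRequest("logs");
SearchSourceBuilder source = new SearchSourceBuilder()
        .size(0)
        .aggregation(AggregationBuilders.terms("loglevel").field("loglevel"));
request.source(source);

SearchResponse response = restHighLevelClient.search(request, RequestOptions.DEFAULT);
Terms terms = response.getAggregations().get("loglevel");
for (Terms.Bucket bucket : terms.getBuckets()) {
    System.out.println(bucket.getKeyAsString() + " -> " + bucket.getDocCount());
}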
I have configured a job as follows, which reads from a database and writes into files, partitioning the data on the basis of a sequence.
//Job Config
@Bean
public Job job(JobBuilderFactory jobBuilderFactory) throws Exception {
    Flow masterFlow1 = (Flow) new FlowBuilder<Object>("masterFlow1").start(masterStep()).build();
    return (jobBuilderFactory.get("Partition-Job")
            .incrementer(new RunIdIncrementer())
            .start(masterFlow1)
            .build()).build();
}

@Bean
public Step masterStep() throws Exception {
    return stepBuilderFactory.get(MASTERPPREPAREDATA)
            //.listener(customSEL)
            .partitioner(STEPPREPAREDATA, new DBPartitioner())
            .step(prepareDataForS1())
            .gridSize(gridSize)
            .taskExecutor(new SimpleAsyncTaskExecutor("Thread"))
            .build();
}

@Bean
public Step prepareDataForS1() throws Exception {
    return stepBuilderFactory.get(STEPPREPAREDATA)
            //.listener(customSEL)
            .<InputData, InputData>chunk(chunkSize)
            .reader(JDBCItemReader(0, 0))
            .writer(writer(null))
            .build();
}

@Bean(destroyMethod = "")
@StepScope
public JdbcCursorItemReader<InputData> JDBCItemReader(@Value("#{stepExecutionContext[startingIndex]}") int startingIndex,
                                                      @Value("#{stepExecutionContext[endingIndex]}") int endingIndex) {
    JdbcCursorItemReader<InputData> ir = new JdbcCursorItemReader<>();
    ir.setDataSource(batchDataSource);
    ir.setMaxItemCount(DBPartitioner.partitionSize);
    ir.setSaveState(false);
    ir.setRowMapper(new InputDataRowMapper());
    ir.setSql("SELECT * FROM FIF_INPUT fi WHERE fi.SEQ > ? AND fi.SEQ < ?");
    ir.setPreparedStatementSetter(new PreparedStatementSetter() {
        @Override
        public void setValues(PreparedStatement ps) throws SQLException {
            ps.setInt(1, startingIndex);
            ps.setInt(2, endingIndex);
        }
    });
    return ir;
}

@Bean
@StepScope
public FlatFileItemWriter<InputData> writer(@Value("#{stepExecutionContext[index]}") String index) {
    System.out.println("writer initialized!!!!!!!!!!!!!" + index);
    //Create writer instance
    FlatFileItemWriter<InputData> writer = new FlatFileItemWriter<>();
    //Set output file location
    writer.setResource(new FileSystemResource(batchDirectory + relativeInputDirectory + index + inputFileForS1));
    //All job repetitions should "append" to same output file
    writer.setAppendAllowed(false);
    //Name field values sequence based on object properties
    writer.setLineAggregator(customLineAggregator);
    return writer;
}
The partitioner used for partitioning the database is written separately in another file, as follows:
//PartitionDb.java
public class DBPartitioner implements Partitioner {

    public static int partitionSize;
    private static Log log = LogFactory.getLog(DBPartitioner.class);

    @SuppressWarnings("unchecked")
    @Override
    public Map<String, ExecutionContext> partition(int gridSize) {
        log.debug("START: Partition" + "grid size:" + gridSize);
        @SuppressWarnings("rawtypes")
        Map partitionMap = new HashMap<>();
        int startingIndex = -1;
        int endSize = partitionSize + 1;
        for (int i = 0; i < gridSize; i++) {
            ExecutionContext ctxMap = new ExecutionContext();
            ctxMap.putInt("startingIndex", startingIndex);
            ctxMap.putInt("endingIndex", endSize);
            ctxMap.put("index", i);
            startingIndex = endSize - 1;
            endSize += partitionSize;
            partitionMap.put("Thread:-" + i, ctxMap);
        }
        log.debug("END: Created Partitions of size: " + partitionMap.size());
        return partitionMap;
    }
}
This executes properly, but the problem is that even after partitioning on the basis of the sequence, I am getting the same rows in multiple files, which is not right, since I am providing a different set of data for each partition. Can anyone tell me what's wrong? I am using HikariCP for DB connection pooling and Spring Batch 4.
This executes properly, but the problem is that even after partitioning on the basis of the sequence, I am getting the same rows in multiple files, which is not right, since I am providing a different set of data for each partition.
I'm not sure your partitioner is working properly. A quick test shows that it is not providing different sets of data as you are claiming:
DBPartitioner dbPartitioner = new DBPartitioner();
Map<String, ExecutionContext> partition = dbPartitioner.partition(5);
for (String s : partition.keySet()) {
    System.out.println(s + " : " + partition.get(s));
}
This prints:
Thread:-0 : {endingIndex=1, index=0, startingIndex=-1}
Thread:-1 : {endingIndex=1, index=1, startingIndex=0}
Thread:-2 : {endingIndex=1, index=2, startingIndex=0}
Thread:-3 : {endingIndex=1, index=3, startingIndex=0}
Thread:-4 : {endingIndex=1, index=4, startingIndex=0}
As you can see, almost all partitions will have the same startingIndex and endingIndex.
I recommend you unit test your partitioner before using it in a partitioned step.
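Note that the collapse in that output happens because the static partitionSize field defaults to 0 when nothing has set it, which makes the partitioner fragile. As a rough sketch (not the asker's code), a partition() that derives each range directly from the loop index, assuming partitionSize is assigned a positive value before the job runs and that SEQ values start at 1, could look like this; the bounds stay exclusive to match the SEQ > ? AND SEQ < ? query:

@Override
public Map<String, ExecutionContext> partition(int gridSize) {
    Map<String, ExecutionContext> partitionMap = new HashMap<>();
    for (int i = 0; i < gridSize; i++) {
        ExecutionContext ctx = new ExecutionContext();
        ctx.putInt("startingIndex", i * partitionSize);          // SEQ > startingIndex
        ctx.putInt("endingIndex", (i + 1) * partitionSize + 1);  // SEQ < endingIndex
        ctx.putString("index", String.valueOf(i));               // used in the writer's file name
        partitionMap.put("Thread:-" + i, ctx);
    }
    return partitionMap;
}

A unit test like the one above should then show non-overlapping ranges for each partition.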
In a Spring Boot 2.0 REST controller, I have created the following code, which works as desired:
@ResponseBody
@GetMapping("/test3")
Mono<List<String>> test3() {
    List<String> l1 = Arrays.asList("one", "two", "three");
    List<String> l2 = Arrays.asList("four", "five", "six");
    return Flux
            .concat(Flux.fromIterable(l1), Flux.fromIterable(l2))
            .collectList();
}
My problem comes from trying to do the same thing from an external datasource. I have created the following test case:
@ResponseBody
@GetMapping("/test4")
Flux<Object> test4() {
    List<String> indecies = Arrays.asList("1", "2");
    return Flux.concat(
            Flux.fromIterable(indecies)
                    .flatMap(k -> Flux.just(myRepository.getList(k))
                            .subscribeOn(Schedulers.parallel()), 2
                    )
            ).collectList();
}
Where myRepository is the following:
@Repository
public class MyRepository {

    List<String> l1 = Arrays.asList("one", "two", "three");
    List<String> l2 = Arrays.asList("four", "five", "six");
    Map<String, List<String>> pm = new HashMap<String, List<String>>();

    MyRepository() {
        pm.put("1", l1);
        pm.put("2", l2);
    }

    List<String> getList(String key) {
        List<String> list = pm.get(key);
        return list;
    }
}
My code labeled test4 gives me this code-hint error:
Type mismatch: cannot convert from Flux<List<String>> to Publisher<? extends Publisher<? extends Object>>
So a few questions:
I thought that a Flux was a Publisher, so why the error?
What am I doing wrong in test 4 so that it will output the same result as in test3?
The expected output is: [["one","two","three","four","five","six"]]
Using M. Deinum's comment, here is what works. The key change is to use Flux.fromIterable(...) on each repository result instead of wrapping the whole List in Flux.just(...), so that flatMap flattens the individual strings:
@ResponseBody
@GetMapping("/test6")
Mono<List<String>> test6() {
    List<String> indecies = Arrays.asList("1", "2");
    return Flux.fromIterable(indecies)
            .flatMap(k -> Flux.fromIterable(myRepository.getList(k)).subscribeOn(Schedulers.parallel()), 2)
            .collectList();
}
I have trouble executing pipelined commands in Spring Data Redis. I am using StringRedisTemplate with spring-data-redis 1.6.1, Spring Boot 1.3.2, and Jedis (tried both 2.7.3 and 2.8.0).
The code:
public void saveUserActivityEvents(Event... events) {
    List<Object> results = stringRedisTemplate.executePipelined(
        new RedisCallback<Object>() {
            public Object doInRedis(RedisConnection connection) throws DataAccessException {
                StringRedisConnection stringRedisConn = (StringRedisConnection) connection;
                for (int i = 0; i < events.length; i++) {
                    Event event = events[i];
                    String userId = getUserId(event.getUser());
                    String eventType = event.getEventType();
                    String itemId = event.getItem();
                    Integer amount = event.getAmount() == null ? 0 : Integer.parseInt(event.getAmount());
                    Double timestamp = Double.valueOf(event.getTimestamp());

                    Map<String, String> valueMap = new HashMap<String, String>();
                    valueMap.put("itemId", itemId);
                    valueMap.put("userId", userId);
                    String userItemEventsKey = StrSubstitutor.replace(Constants.KEY_USER_ITEM_EVENTS, valueMap);
                    valueMap.put("userId", userId);
                    String userItemsKey = StrSubstitutor.replace(Constants.KEY_USER_ITEMS, valueMap);

                    stringRedisConn.zAdd(userItemsKey, timestamp, itemId);
                    stringRedisConn.hIncrBy(userItemEventsKey, eventType, amount);
                    long expireInMs = TimeoutUtils.toMillis(getExpiryTimeInDays(event.getUser()), TimeUnit.DAYS);
                    stringRedisConn.pExpire(userItemEventsKey, expireInMs);
                }
                return null;
            }
        });
}
It blows up with the exception in the subject when executing pExpire.
I've also tried the other flavour suggested in the reference guide:
execute(redisCallback, true, true)
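That variant, applied to the callback above, looks roughly like this (a sketch only; the loop body is the same as in saveUserActivityEvents and is elided here):

stringRedisTemplate.execute(new RedisCallback<Object>() {
    public Object doInRedis(RedisConnection connection) throws DataAccessException {
        StringRedisConnection stringRedisConn = (StringRedisConnection) connection;
        // ... same zAdd / hIncrBy / pExpire calls as above ...
        return null;
    }
}, true, true); // exposeConnection = true, pipeline = true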
The same result. Any idea?
Thanks
I need to perform a faceted search using Elasticsearch repositories developed with Spring Data.
One of the repositories I have created is:
public interface EmployeeSearchRepository extends ElasticsearchRepository<Employee, Long> {
}
It does provide a search method with the signature:
FacetedPage<Employee> search(QueryBuilder query, Pageable pageable);
but the getFacets method of the returned FacetedPage returns null. How can I query so that the facets are generated?
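For reference, this is roughly how I call it (the match-all query here is just an illustration):

FacetedPage<Employee> page = employeeSearchRepository.search(QueryBuilders.matchAllQuery(), new PageRequest(0, 10));
System.out.println(page.getFacets()); // always null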
I have the same problem and it seems that it is not implemented (yet).
If you look at DefaultResultMapper.mapResults(), it calls response.getFacets(), which is always null.
Note that facets are deprecated in Elasticsearch and you should use aggregations instead, so maybe the contributors of the project are refactoring it?
I worked around this by writing my own results mapper class, which extends DefaultResultMapper but also converts the aggregations to FacetResults.
SomethingResultsMapper:
@Override
public <T> FacetedPage<T> mapResults(SearchResponse response, Class<T> clazz, Pageable pageable) {
    FacetedPage<T> facetedPage = super.mapResults(response, clazz, pageable);
    // Process aggregations.
    if (response.getAggregations() != null) {
        for (Aggregation aggregations : response.getAggregations().asList()) {
            final Filter filterAggregations = (Filter) aggregations;
            for (Aggregation filterAgg : filterAggregations.getAggregations().asList()) {
                if (filterAgg instanceof Terms) {
                    final Terms aggTerm = (Terms) filterAgg;
                    if (!aggTerm.getBuckets().isEmpty()) {
                        facetedPage.getFacets().add(processTermAggregation(aggTerm));
                    }
                } else if (filterAgg instanceof Nested) {
                    final Nested nestedAgg = (Nested) filterAgg;
                    for (Aggregation aggregation : nestedAgg.getAggregations().asList()) {
                        final Terms aggTerm = (Terms) aggregation;
                        if (!aggTerm.getBuckets().isEmpty()) {
                            facetedPage.getFacets().add(processTermAggregation(aggTerm));
                        }
                    }
                } else {
                    throw new IllegalArgumentException("Aggregation type not (yet) supported: " + filterAgg.getClass().getName());
                }
            }
        }
    }
    return facetedPage;
}

private FacetResult processTermAggregation(final Terms aggTerm) {
    long total = 0;
    List<Term> terms = new ArrayList<>();
    List<Terms.Bucket> buckets = aggTerm.getBuckets();
    for (Terms.Bucket bucket : buckets) {
        terms.add(new Term(bucket.getKey(), (int) bucket.getDocCount()));
        total += bucket.getDocCount();
    }
    return new FacetTermResult(aggTerm.getName(), FacetConfig.fromAggregationTerm(aggTerm.getName()).getLabel(),
            terms, total, aggTerm.getSumOfOtherDocCounts(), aggTerm.getDocCountError());
}
Then I created a custom Spring Data repository (see the docs) and defined a custom method where I provide my SomethingResultsMapper:
@Override
public FacetedPage<Something> searchSomething(final SearchQuery searchQuery) {
    return elasticsearchTemplate.queryForPage(searchQuery, Something.class, new SomethingResultsMapper());
}
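For completeness, the wiring around that method follows the standard Spring Data custom-repository pattern; the interface and class names below are just placeholders:

// Custom fragment interface
public interface SomethingRepositoryCustom {
    FacetedPage<Something> searchSomething(SearchQuery searchQuery);
}

// The main repository picks up the fragment by extending it
public interface SomethingRepository
        extends ElasticsearchRepository<Something, String>, SomethingRepositoryCustom {
}

// Implementation class; the "Impl" suffix is what Spring Data looks for
public class SomethingRepositoryImpl implements SomethingRepositoryCustom {

    @Autowired
    private ElasticsearchTemplate elasticsearchTemplate;

    // searchSomething(...) as shown above
}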
EDIT: I think this one is being fixed by https://jira.spring.io/browse/DATAES-211
Is it possible to add an additional parameter to a Solr query using Spring Data Solr that generates the following request?
"params": {
"indent": "true",
"q": "*.*",
"_": "1430295713114",
"wt": "java",
"AuthenticatedUserName": "user#domain.com"
}
I want to add a parameter needed by Apache ManifoldCF, AuthenticatedUserName, and its value, alongside the other ones that are automatically populated by Spring Data Solr (q, wt).
Thank you,
V.
I managed to make it work by looking at the source code of the SolrTemplate class, but I was wondering if there is a less intrusive solution.
public Page<Document> searchDocuments(DocumentSearchCriteria criteria, Pageable page) {
    String[] words = criteria.getTitle().split(" ");
    Criteria conditions = createSearchConditions(words);
    SimpleQuery query = new SimpleQuery(conditions);
    query.setPageRequest(page);

    SolrQuery solrQuery = queryParsers.getForClass(query.getClass()).constructSolrQuery(query);
    solrQuery.add(AUTHENTICATED_USER_NAME, criteria.getLoggedUsername());
    try {
        String queryString = this.queryParsers.getForClass(query.getClass()).getQueryString(query);
        solrQuery.set(CommonParams.Q, queryString);
        QueryResponse response = solrTemplate.getSolrServer().query(solrQuery);
        List<Document> beans = convertQueryResponseToBeans(response, Document.class);
        SolrDocumentList results = response.getResults();
        return new SolrResultPage<>(beans, query.getPageRequest(), results.getNumFound(), results.getMaxScore());
    } catch (SolrServerException e) {
        log.error(e.getMessage(), e);
        return new SolrResultPage<>(Collections.<Document>emptyList());
    }
}

private <T> List<T> convertQueryResponseToBeans(QueryResponse response, Class<T> targetClass) {
    return response != null ? convertSolrDocumentListToBeans(response.getResults(), targetClass) : Collections.<T>emptyList();
}

public <T> List<T> convertSolrDocumentListToBeans(SolrDocumentList documents, Class<T> targetClass) {
    if (documents == null) {
        return Collections.emptyList();
    }
    return solrTemplate.getConverter().read(documents, targetClass);
}

private Criteria createSearchConditions(String[] words) {
    return new Criteria("title").contains(words)
            .or(new Criteria("description").contains(words))
            .or(new Criteria("content").contains(words))
            .or(new Criteria("resourcename").contains(words));
}