Partial match with Spring Data Elasticsearch - elasticsearch

I can search by whole words: searching for "secret" finds the text "This is a secret word". But if I search for "secre", I get an empty array. I need partial matching for an autocomplete function. I'm using Spring Boot 1.3.1.RELEASE (it uses Elasticsearch 1.5.2, and I can't upgrade Spring). What am I doing wrong? I'd also be thankful for a link to a working example. I know Elasticsearch is too heavy for this purpose, but I want to learn how to use it. Many thanks in advance!
SearchConfig.java
@Configuration
@EnableElasticsearchRepositories(basePackages = {"cz.project.search"})
public class SearchConfig {
@Autowired
private Client elasticsearchClient;
@Bean
public ConstructionWorkIndexInitializer constructionWorkIndexInitializer() {
return new ConstructionWorkIndexInitializer();
}
}
ConstructionWorkIndexRepository.java
public interface ConstructionWorkIndexRepository extends ElasticsearchRepository<ConstructionWorkIndex, String> {
Page<ConstructionWorkIndex> findByCodeOrDescription(String code, String description, Pageable pageable);
}
ConstructionWorkIndex.java
@Data
@AllArgsConstructor
@NoArgsConstructor
@Document(indexName = "works", type = "work", shards = 1, replicas = 0, indexStoreType = "memory", refreshInterval = "-1")
@Setting(settingPath = "classpath:construction-works-settings.json")
public class ConstructionWorkIndex {
@Id
@Field(indexAnalyzer = "standard", searchAnalyzer = "standard", type = FieldType.String, store = true)
private String code;
@Field(indexAnalyzer = "standard", searchAnalyzer = "standard", type = FieldType.String, store = true)
private String description;
public ConstructionWorkIndex(ConstructionWorkVO constructionWork) {
requireNonNull(constructionWork, "constructionWork must not be null");
this.code = constructionWork.getCode();
this.description = constructionWork.getDescription();
}
}
construction-works-settings.json
{
"filter": {
"autocomplete_filter": {
"type": "edge_ngram",
"min_gram": 1,
"max_gram": 20
}
},
"analyzer": {
"autocomplete": {
"type": "custom",
"tokenizer": "standard",
"filter": [
"lowercase",
"autocomplete_filter"
]
}
}
}
ConstructionWorkIndexInitializer.java
@Slf4j
public class ConstructionWorkIndexInitializer {
@Autowired
private ConstructionWorkRepository workRepository;
@Autowired
private ConstructionWorkIndexRepository workIndexRepository;
@PostConstruct
@Transactional(readOnly = true)
@Async
public void init() {
initWorks();
}
private void initWorks() {
List<ConstructionWorkVO> works = workRepository.findAll();
works.forEach(work -> {
workIndexRepository.save(new ConstructionWorkIndex(work));
log.debug("Added construction work code '{}'", work.getCode());
});
log.debug("Indexed {} construction works", works.size());
}
}
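One likely cause, judging only from the code shown: both fields are mapped with indexAnalyzer = "standard", so the autocomplete analyzer defined in construction-works-settings.json is never applied and only whole words are indexed. Pointing indexAnalyzer at "autocomplete" (while keeping searchAnalyzer = "standard") would be the usual fix; whether the settings file also needs its content wrapped in an "analysis" object is worth checking against the Elasticsearch version in use. The plain-Java sketch below does not call Elasticsearch. It only imitates what a lowercase plus edge n-gram filter with min_gram 1 and max_gram 20 would emit, to show why "secre" can match only after such a filter has run at index time. The class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class EdgeNgramSketch {

    /** Lowercased whole words, roughly what a standard-style analyzer keeps. */
    static List<String> wholeWords(String text) {
        List<String> tokens = new ArrayList<>();
        for (String word : text.split("\\W+")) {
            if (!word.isEmpty()) {
                tokens.add(word.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    /** Prefixes of one token with lengths from minGram up to maxGram (capped at the token length). */
    static List<String> edgeNgrams(String token, int minGram, int maxGram) {
        List<String> grams = new ArrayList<>();
        int upper = Math.min(maxGram, token.length());
        for (int len = minGram; len <= upper; len++) {
            grams.add(token.substring(0, len));
        }
        return grams;
    }

    /** Whole words followed by an edge n-gram step, imitating the "autocomplete" analyzer. */
    static List<String> autocompleteTokens(String text, int minGram, int maxGram) {
        List<String> tokens = new ArrayList<>();
        for (String word : wholeWords(text)) {
            tokens.addAll(edgeNgrams(word, minGram, maxGram));
        }
        return tokens;
    }

    public static void main(String[] args) {
        String text = "This is a secret word";
        // Only whole words are indexed, so the partial term is absent.
        System.out.println(wholeWords(text).contains("secre"));                // false
        // With edge n-grams every prefix is indexed, so the partial term is present.
        System.out.println(autocompleteTokens(text, 1, 20).contains("secre")); // true
    }
}
```

Keeping the search analyzer as "standard" matters in this setup: the query term should be matched as typed, not expanded into its own prefixes.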

Related

Spring Boot, query Elasticsearch specific fields from already indexed data created by Elastic Stack

The goal is to query specific fields from an index via a Spring Boot app.
Questions are at the end.
The data in Elasticsearch is created by the Elastic Stack (Beats, Logstash, etc.). There is some inconsistency, e.g. some fields may be missing on some hits.
The Spring app does not add the data and has no control over the fields and indexes.
The query I need, using _source:
GET index-2022.07.27/_search
{
"from": 0,
"size": 100,
"_source": ["@timestamp","message", "agent.id"],
"query": {
"match_all": {}
}
}
returns hits like this:
{
"_index": "index-2022.07.27",
"_id": "C1zzPoIBgxar5OgxR-cs",
"_score": 1,
"_ignored": [
"event.original.keyword"
],
"_source": {
"agent": {
"id": "ddece977-9fbb-4f63-896c-d3cf5708f846"
},
"@timestamp": "2022-07-27T09:18:27.465Z",
"message": """a message"""
}
},
and with fields instead of _source, a hit looks like this:
{
"_index": "index-2022.07.27",
"_id": "C1zzPoIBgxar5OgxR-cs",
"_score": 1,
"_ignored": [
"event.original.keyword"
],
"fields": {
"@timestamp": [
"2022-07-27T09:18:27.465Z"
],
"agent.id": [
"ddece977-9fbb-4f63-896c-d3cf5708f846"
],
"message": [
"""a message"""
]
}
},
How can I run this query with Spring Boot?
I am leaning toward StringQuery with the RestHighLevelClient as below, but I can't get it to work:
Query searchQuery = new StringQuery("{\"_source\":[\"@timestamp\",\"message\",\"agent.id\"],\"query\":{\"match_all\":{}}}");
SearchHits<Items> productHits = elasticsearchOperations.search(
searchQuery,
Items.class,
IndexCoordinates.of(CURRENT_INDEX));
What form must Items.class have? What fields?
I just need timestamp, message, and agent.id. The latter is optional; it may not exist.
How will the mapping work?
versions:
Elastic: 8.3.2
Spring boot: 2.6.6
elastic (mvn): 7.15.2
spring-data-elasticsearch (mvn): 4.3.3
The official documentation states that with the RestHighLevelClient these versions should be supported:
"Support for upcoming versions of Elasticsearch is being tracked and general compatibility should be given assuming the usage of the high-level REST client."
You can define an entity class for the data you want to read (note I have a nested class for the agent):
@Document(indexName = "index-so", createIndex = false)
public class SO {
@Id
private String id;
@Field(name = "@timestamp", type = FieldType.Date, format = DateFormat.date_time)
private Instant timestamp;
@Field(type = FieldType.Object)
private Agent agent;
@Field(type = FieldType.Text)
private String message;
public String getId() {
return id;
}
public void setId(String id) {
this.id = id;
}
public Instant getTimestamp() {
return timestamp;
}
public void setTimestamp(Instant timestamp) {
this.timestamp = timestamp;
}
public Agent getAgent() {
return agent;
}
public void setAgent(Agent agent) {
this.agent = agent;
}
public String getMessage() {
return message;
}
public void setMessage(String message) {
this.message = message;
}
class Agent {
@Field(name = "id", type = FieldType.Keyword)
private String id;
public String getId() {
return id;
}
public void setId(String id) {
this.id = id;
}
}
}
The query then would be:
var query = new NativeSearchQueryBuilder()
.withQuery(matchAllQuery())
.withSourceFilter(new FetchSourceFilter(
new String[]{"@timestamp", "message", "agent.id"},
new String[]{}))
.build();
var searchHits = operations.search(query, SO.class);

Elasticsearch query on array of composite objects along with date ranges

Hi, I have a question about how to write an Elasticsearch query for a nested composite object with date ranges and additional field parameters, like so:
[{
"name": "A",
"availability": [
{
"partial": true,
"dates": {
"gte": "2020-12-01",
"lte": "2020-12-02"
}
}
]
},
{
"name": "B",
"availability": [
{
"partial": true,
"dates": {
"gte": "2020-12-05",
"lte": "2020-12-06"
}
},
{
"partial": false,
"dates": {
"gte": "2020-12-08",
"lte": "2020-12-11"
}
}
]
}]
This is my entity data
@Document(indexName = "workers")
public class Worker {
@Id
private String id;
@Field(type = FieldType.Text)
private String name;
@Field(type = FieldType.Nested)
private List<Availability> availability;
}
public class Availability {
@Field(type = FieldType.Boolean)
private boolean partial;
@Field(type = FieldType.Date_Range, format = DateFormat.custom, pattern = "uuuu-MM-dd")
private Map<String, LocalDate> dates;
}
This is the search query I have written so far, but it returns no results:
final BoolQueryBuilder queryBuilder = QueryBuilders.boolQuery();
queryBuilder.must(QueryBuilders.termQuery("availability.partial", query.isPartial()));
RangeQueryBuilder availability = QueryBuilders.rangeQuery("availability.dates")
.gte(query.getStartDate())
.lte(query.getEndDate());
queryBuilder.must(availability);
Pageable pageable = PageRequest.of(pageNumber, pageSize);
// @formatter:off
return new NativeSearchQueryBuilder()
.withPageable(pageable)
.withQuery(queryBuilder)
.build();
This is my query dto
public class WorkerQuery {
private boolean partial;
private LocalDate startDate;
private LocalDate endDate;
}
// Request data
{
"partial": true,
"startDate": "2020-12-01",
"endDate": "2020-12-02"
}
Great start!! You're just missing a nested query since availability is nested. The Java query needs to be like this:
final BoolQueryBuilder queryBuilder = QueryBuilders.boolQuery();
queryBuilder.must(QueryBuilders.termQuery("availability.partial", query.isPartial()));
RangeQueryBuilder availability = QueryBuilders.rangeQuery("availability.dates")
.gte(query.getStartDate())
.lte(query.getEndDate())
.relation("within");
queryBuilder.must(availability);
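// ScoreMode is org.apache.lucene.search.join.ScoreMode; the Elasticsearch 7 client requires this third argument
final NestedQueryBuilder nested = QueryBuilders.nestedQuery("availability", queryBuilder, ScoreMode.None);
Pageable pageable = PageRequest.of(pageNumber, pageSize);
// @formatter:off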
return new NativeSearchQueryBuilder()
.withPageable(pageable)
.withQuery(nested)
.build();
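The relation("within") call changes which stored ranges count as a match. Elasticsearch range fields default to the intersects relation, where any overlap matches; within requires the stored range to lie entirely inside the queried one. The plain-Java sketch below mirrors those two rules on inclusive LocalDate bounds so the difference is easy to see. It is an illustration with made-up method names, not Elasticsearch code, and the 2020-11-30 to 2020-12-05 range is a hypothetical value that does not appear in the question.

```java
import java.time.LocalDate;

public class RangeRelationSketch {

    /** True when the stored range [docGte, docLte] overlaps the query range [qGte, qLte] at all. */
    static boolean intersects(LocalDate docGte, LocalDate docLte, LocalDate qGte, LocalDate qLte) {
        return !docGte.isAfter(qLte) && !docLte.isBefore(qGte);
    }

    /** True when the stored range lies entirely inside the query range (bounds inclusive). */
    static boolean within(LocalDate docGte, LocalDate docLte, LocalDate qGte, LocalDate qLte) {
        return !docGte.isBefore(qGte) && !docLte.isAfter(qLte);
    }

    public static void main(String[] args) {
        // Query range from the request data in the question.
        LocalDate qGte = LocalDate.parse("2020-12-01");
        LocalDate qLte = LocalDate.parse("2020-12-02");

        // Worker A's stored range equals the query range, so both relations match.
        LocalDate aGte = LocalDate.parse("2020-12-01");
        LocalDate aLte = LocalDate.parse("2020-12-02");
        System.out.println(intersects(aGte, aLte, qGte, qLte)); // true
        System.out.println(within(aGte, aLte, qGte, qLte));     // true

        // Hypothetical wider stored range: it overlaps the query but is not contained in it.
        LocalDate wideGte = LocalDate.parse("2020-11-30");
        LocalDate wideLte = LocalDate.parse("2020-12-05");
        System.out.println(intersects(wideGte, wideLte, qGte, qLte)); // true
        System.out.println(within(wideGte, wideLte, qGte, qLte));     // false
    }
}
```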

Spring Boot - Get Data from DB and store it in list and parse it to JSON using jackson

I'm trying to get data from multiple tables, put it in an ArrayList of objects, and then convert it to a JSON object.
But when I serialize it with Jackson's ObjectMapper, all the lists end up grouped as shown below.
I am using new ObjectMapper().writeValueAsString to serialize the class objects to JSON.
{
"College": [
{
"institution": [
{
"instId": "T34",
"Country": "India",
"Code": "T33"
},
{
"instId": "T22",
"Country": "India",
"Code": "T22"
}
],
"Rating": [
{
"star": "4",
"comments": "good"
},
{
"star": "2",
"comments": "ok"
}
]
}
]
}
But I want the result as below:
{
"College": [
{
"institution": [
{
"instId": "T34",
"Country": "India",
"Code": "T33"
}
],
"Rating": [
{
"star": "4",
"comments": "good"
}
]
},
{
"institution": [
{
"instId": "T22",
"Country": "India",
"Code": "T22"
}
],
"Rating": [
{
"star": "2",
"comments": "ok"
}
]
}
]
}
The above is just an example.
Please help in getting the desired output.
Below are the class files used.
public class AllCollege{
List<College> college = new ArrayList<>();
public List<College> getCollege() {
return college;
}
public void setCollege(List<College> college) {
this.college = college;
}
}
public class College{
private List<Institution> institution = new ArrayList<>();
private List<Rating> rating = new ArrayList<>();
public List<Institution> getInstitution() {
return institution;
}
public void setInstitution(List<Institution> institution) {
this.institution = institution;
}
public List<Rating> getRating() {
return rating;
}
public void setRating(List<Rating> rating) {
this.rating = rating;
}
}
public class Institution {
private String instId;
private String country;
private String code;
public String getInstId() {
return instId;
}
public void setInstId(String instId) {
this.instId = instId;
}
public String getCountry() {
return country;
}
public void setCountry(String country) {
this.country = country;
}
public String getCode() {
return code;
}
public void setCode(String code) {
this.code = code;
}
}
public class Rating {
private String star;
private String comments;
public String getStar() {
return star;
}
public void setStar(String star) {
this.star = star;
}
public String getComments() {
return comments;
}
public void setComments(String comments) {
this.comments = comments;
}
}
Below is where the data from tables is set into ArrayList and then converted to json string.
session = sessionFactory.openSession();
String sql = "from institution";
Query<InstDto> query = session.createQuery(sql);
List<Institution> configdtoList =query.list();
College alc = new College();
alc.setInstitution(configdtoList);
.
.
.
// the Rating table is loaded similarly
List<College> clist = new ArrayList<>();
clist.add(alc);
AllCollege ac = new AllCollege();
ac.setCollege(clist);
String responseJson = new ObjectMapper().writeValueAsString(ac);
Use a class structure like the one below; the @JsonProperty annotations map the fields to the capitalized JSON names:
public class Sample {
@JsonProperty("College")
private List<College> college;
}
public class College {
private List<Institution> institution;
@JsonProperty("Rating")
private List<Rating> rating;
}
public class Rating {
private String comments;
private String star;
}
public class Institution {
@JsonProperty("Code")
private String code;
@JsonProperty("Country")
private String country;
private String instId;
}
I created a HashMap that holds the List<AllCollege> as its value and then used the JSON parser, which worked as expected.
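The output in the question has a single College because the code builds one College object and puts every Institution and every Rating into it. If the intent is one College entry per institution/rating pair, the grouping has to happen before serialization. The sketch below shows that step in plain Java, under the assumption (not stated in the question) that the two lists have the same length and that the i-th rating belongs to the i-th institution. The trimmed-down classes and the pairByIndex helper are hypothetical stand-ins for the question's classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CollegeGroupingSketch {

    // Minimal stand-ins for the question's classes; only one field each is kept.
    static class Institution {
        final String instId;
        Institution(String instId) { this.instId = instId; }
    }

    static class Rating {
        final String star;
        Rating(String star) { this.star = star; }
    }

    static class College {
        final List<Institution> institution = new ArrayList<>();
        final List<Rating> rating = new ArrayList<>();
    }

    /** Builds one College per position, pairing institutions.get(i) with ratings.get(i). */
    static List<College> pairByIndex(List<Institution> institutions, List<Rating> ratings) {
        if (institutions.size() != ratings.size()) {
            throw new IllegalArgumentException("lists must have the same length");
        }
        List<College> colleges = new ArrayList<>();
        for (int i = 0; i < institutions.size(); i++) {
            College college = new College();
            college.institution.add(institutions.get(i));
            college.rating.add(ratings.get(i));
            colleges.add(college);
        }
        return colleges;
    }

    public static void main(String[] args) {
        List<Institution> institutions = Arrays.asList(new Institution("T34"), new Institution("T22"));
        List<Rating> ratings = Arrays.asList(new Rating("4"), new Rating("2"));
        List<College> colleges = pairByIndex(institutions, ratings);
        // Two College entries, each holding exactly one institution and one rating.
        System.out.println(colleges.size());
        System.out.println(colleges.get(1).institution.get(0).instId + " / " + colleges.get(1).rating.get(0).star);
    }
}
```

The resulting list could then be passed to setCollege on AllCollege before calling writeValueAsString. If the tables share a key, joining on that key would be more robust than pairing by position.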

n-gram implementation in spring-boot ElasticSearch

I am trying to implement autocomplete in Elasticsearch from within Spring Boot. I have tried many examples from the internet but have not been able to get it working. My code is below; please help.
Main Class:-
@SpringBootApplication
@EnableNatsAnnotations
@EnableAutoConfiguration
@EnableConfigurationProperties(ElasticsearchProperties.class)
@EntityScan(basePackages = {
"com.text.model"
})
@ComponentScan(
{
"com.text.elastic",
"com.text.elastic.controller",
"com.text.elastic.service",
"com.text.elastic.service.impl",
"com.text.nats.utils"
}
)
public class ElasticServicesApplication {
public static void main(String[] args) {
SpringApplication.run(ElasticServicesApplication.class, args);
}
}
Bean Class:-
@Setting(settingPath = "elasticsearch-settings.json")
@Document(indexName = "content", type = "content", shards = 1, replicas = 0, createIndex = true, refreshInterval = "-1")
public class Content {
@Id
private String id;
private Locale locale;
// @Field(type = text, index = true, store = true, analyzer = "standard")
@Field(
type = FieldType.String,
index = FieldIndex.analyzed,
searchAnalyzer = "standard",
//indexAnalyzer = "type_ahead",
analyzer = "standard"
/*,
store = true*/
)
private String contentTitle;
I want autocomplete on the contentTitle field.
Mapping
Concise way of using annotation:
@CompletionField()
private Completion suggest;
Or more powerful but tedious way:
{
"content" : {
"properties" : {
"contentTitle" : { "type" : "string" },
"suggest" : { "type" : "completion",
"analyzer" : "simple",
"search_analyzer" : "simple"
}
}
}
}
Then refer to the mapping with @Mapping:
@Setting(settingPath = "elasticsearch-settings.json")
@Document(indexName = "content", type = "content", shards = 1, replicas = 0, createIndex = true, refreshInterval = "-1")
@Mapping(mappingPath = "/mappings/content-mapping.json")
public class Content {...}
Index
We can index it like any other entity:
esTemplate.save(new File(...));
Query
The ElasticsearchTemplate has a method for suggest queries:
public SuggestResponse suggest(SuggestBuilder.SuggestionBuilder<?> suggestion, String... indices);

Adding more information to the HATEOAS response in Spring Boot Data Rest

I have the following REST controller.
@RepositoryRestController
@RequestMapping(value = "/booksCustom")
public class BooksController extends ResourceSupport {
@Autowired
public BooksService booksService;
@Autowired
private PagedResourcesAssembler<Books> booksAssembler;
@RequestMapping("/search")
public HttpEntity<PagedResources<Resource<Books>>> search(@RequestParam(value = "q", required = false) String query, @PageableDefault(page = 0, size = 20) Pageable pageable) {
pageable = new PageRequest(0, 20);
Page<Books> booksResult = booksService.findBookText(query, pageable);
return new ResponseEntity<PagedResources<Resource<Books>>>(booksAssembler.toResource(booksResult), HttpStatus.OK);
}
}
The call Page<Books> booksResult = booksService.findBookText(query, pageable); is backed by a SolrCrudRepository. When it runs, booksResult has several fields in it: the content field and several others, one being highlighted. Unfortunately, the only thing I get back in the REST response is the data in the content field plus the HATEOAS metadata (e.g. page information, links, etc.). What would be the proper way of adding the highlighted field to the response? I assume I would need to modify the ResponseEntity, but I am unsure of the proper way.
Edit:
Model:
@SolrDocument(solrCoreName = "Books_Core")
public class Books {
@Field
private String id;
@Field
private String filename;
@Field("full_text")
private String fullText;
//Getters and setters omitted
...
}
When a search runs and the Solr repository is called (e.g. booksService.findBookText(query, pageable);), I get back these objects.
However, in my REST response I only see the "content". I would like to be able to add the "highlighted" object to the REST response. It just appears that HATEOAS is only sending the information in the "content" object (see below for the object).
{
"_embedded" : {
"solrBooks" : [ {
"filename" : "ABookName",
"fullText" : "ABook Text"
} ]
},
"_links" : {
"first" : {
"href" : "http://localhost:8080/booksCustom/search?q=ABook&page=0&size=20"
},
"self" : {
"href" : "http://localhost:8080/booksCustom/search?q=ABook"
},
"next" : {
"href" : "http://localhost:8080/booksCustom/search?q=ABook&page=0&size=20"
},
"last" : {
"href" : "http://localhost:8080/booksCustom/search?q=ABook&page=0&size=20"
}
},
"page" : {
"size" : 1,
"totalElements" : 1,
"totalPages" : 1,
"number" : 0
}
}
Just so you can get a full picture, this is the repository that is backing the BooksService. All the service does is call this SolrCrudRepository method.
public interface SolrBooksRepository extends SolrCrudRepository<Books, String> {
@Highlight(prefix = "<highlight>", postfix = "</highlight>", fragsize = 20, snipplets = 3)
HighlightPage<SolrTestDocuments> findBookText(@Param("fullText") String fullText, Pageable pageable);
}
Ok, here is how I did it:
I wrote my own HighlightPagedResources:
public class HighlightPagedResources<R,T> extends PagedResources<R> {
private List<HighlightEntry<T>> phrases;
public HighlightPagedResources(Collection<R> content, PageMetadata metadata, List<HighlightEntry<T>> highlightPhrases, Link... links) {
super(content, metadata, links);
this.phrases = highlightPhrases;
}
@JsonProperty("highlighting")
public List<HighlightEntry<T>> getHighlightedPhrases() {
return phrases;
}
}
and HighlightPagedResourcesAssembler:
public class HighlightPagedResourcesAssembler<T> extends PagedResourcesAssembler<T> {
public HighlightPagedResourcesAssembler(HateoasPageableHandlerMethodArgumentResolver resolver, UriComponents baseUri) {
super(resolver, baseUri);
}
public <R extends ResourceSupport> HighlightPagedResources<R,T> toResource(HighlightPage<T> page, ResourceAssembler<T, R> assembler) {
final PagedResources<R> rs = super.toResource(page, assembler);
final Link[] links = new Link[rs.getLinks().size()];
return new HighlightPagedResources<R, T>(rs.getContent(), rs.getMetadata(), page.getHighlighted(), rs.getLinks().toArray(links));
}
}
I had to add this to my Spring RepositoryRestMvcConfiguration.java:
@Primary
@Bean
public HighlightPagedResourcesAssembler solrPagedResourcesAssembler() {
return new HighlightPagedResourcesAssembler<Object>(pageableResolver(), null);
}
In the controller I had to replace PagedResourcesAssembler with the newly implemented one and use the new HighlightPagedResources in the request method:
@Autowired
private HighlightPagedResourcesAssembler<Object> highlightPagedResourcesAssembler;
@RequestMapping(value = "/conversations/search", method = POST)
public HighlightPagedResources<PersistentEntityResource, Object> findAll(
@RequestBody ConversationSearch search,
@SortDefault(sort = FIELD_LATEST_SEGMENT_START_DATE_TIME, direction = DESC) Pageable pageable,
PersistentEntityResourceAssembler assembler) {
HighlightPage page = conversationRepository.findByConversationSearch(search, pageable);
return highlightPagedResourcesAssembler.toResource(page, assembler);
}
RESULT:
{
"_embedded": {
"conversations": [
..our stuff..
]
},
"_links": {
...as you know them...
},
"page": {
"size": 1,
"totalElements": 25,
"totalPages": 25,
"number": 0
},
"highlighting": [
{
"entity": {
"conversationId": "a2127d01-747e-4312-b230-01c63dacac5a",
...
},
"highlights": [
{
"field": {
"name": "textBody"
},
"snipplets": [
"Additional XXX License for YYY Servers DCL-2016-PO0422 \n  \n<em>hi</em> bodgan \n  \nwe urgently need the",
"Additional XXX License for YYY Servers DCL-2016-PO0422\n \n<em>hi</em> bodgan\n \nwe urgently need the permanent"
]
}
]
}
]
}
I was using Page<Books> instead of HighlightPage to create the response page. Page does not carry the highlight data, which is why the highlighted portion was missing from the response. I ended up creating a new page based on the HighlightPage and returning that as my result instead of the plain Page.
@RepositoryRestController
@RequestMapping(value = "/booksCustom")
public class BooksController extends ResourceSupport {
@Autowired
public BooksService booksService;
@Autowired
private PagedResourcesAssembler<Books> booksAssembler;
@RequestMapping("/search")
public HttpEntity<PagedResources<Resource<HighlightPage>>> search(@RequestParam(value = "q", required = false) String query, @PageableDefault(page = 0, size = 20) Pageable pageable) {
HighlightPage solrBookResult = booksService.findBookText(query, pageable);
Page<Books> highlightedPages = new PageImpl(solrBookResult.getHighlighted(), pageable, solrBookResult.getTotalElements());
return new ResponseEntity<PagedResources<Resource<HighlightPage>>>(booksAssembler.toResource(highlightedPages), HttpStatus.OK);
}
}
There is probably a better way of doing this, but I couldn't find anything that would do what I wanted without having to change a ton of code. Hope this helps!
