How to create and use an ingest pipeline with Spring Data Elasticsearch - spring

I need to upload a PDF file to Elasticsearch so that the content inside the PDF can be searched. I used the ingest pipeline curl APIs through Postman and they work fine, but I am unable to integrate and use them in my Spring Boot project to index and search the PDF file. Can anyone suggest how to create and use an ingest pipeline with Spring Data Elasticsearch?
For the index and document mapping we just annotate the entity class, but how do we use an ingest pipeline?
@Document(indexName = "blog", type = "article")
public class Article {

    @Id
    private String id;

    private String title;

    @Field(type = FieldType.Nested, includeInParent = true)
    private List<Author> authors;

    // standard getters and setters
}
I need clarity, from a Spring Boot perspective, on how to configure an ingest pipeline and how to use it in the entity class so that the file data can be saved and searched.

This is not supported in Spring Data Elasticsearch.
And please note that Spring Boot and Spring Data Elasticsearch are different things: Spring Boot can autoconfigure Spring Data Elasticsearch, but that's it.
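As a workaround, you can drop down to the Elasticsearch high-level REST client (which Spring Boot can also auto-configure) and set the pipeline on each index request yourself. The following is only a sketch, assuming a 7.x-era client, an index named `blog`, and an ingest pipeline named `pdf-pipeline` (e.g. one using the attachment processor) that was already created via the `_ingest/pipeline` API; the class and field names are illustrative:

```java
import java.io.IOException;
import java.util.Collections;

import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.action.index.IndexResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.springframework.stereotype.Service;

@Service
public class ArticleIndexer {

    private final RestHighLevelClient client;

    public ArticleIndexer(RestHighLevelClient client) {
        this.client = client;
    }

    public String indexPdf(String id, String base64Pdf) throws IOException {
        IndexRequest request = new IndexRequest("blog")
                .id(id)
                // the attachment processor expects the base64-encoded file content
                .source(Collections.singletonMap("data", base64Pdf))
                // routes the document through the ingest pipeline before indexing
                .setPipeline("pdf-pipeline");
        IndexResponse response = client.index(request, RequestOptions.DEFAULT);
        return response.getId();
    }
}
```

Searching the extracted text can then still be done through Spring Data Elasticsearch repositories; only the write path bypasses them.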

Related

inserting a record to elasticsearch using Springboot Restful service

Can anyone please guide me on how to insert a record into Elasticsearch using a Spring Boot REST service?
We are using Spring Boot with a RESTful service.
I have tried with the Elasticsearch documentation but couldn't get much from it.
I have used the approach below, but the index is not getting created in ES:
IndexRequest indexRequest = new IndexRequest("INDEX_NAME", "type", "id")
        .source(jsonMap)
        .opType(DocWriteRequest.OpType.CREATE);
try {
    ActionFuture<IndexResponse> response = client.index(indexRequest);
    IndexResponse indexResponse = response.actionGet(); // block until the request completes
} catch (ElasticsearchException e) {
    e.printStackTrace();
}
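For comparison, a self-contained sketch of the same indexing call using the synchronous `RestHighLevelClient` API (a swap from the `ActionFuture`-based `TransportClient` style above): the synchronous call throws on failure, so a missing index or mapping problem surfaces immediately instead of silently. The index name `employee` and the field values are illustrative, and an auto-configured `RestHighLevelClient` bean is assumed:

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

import org.elasticsearch.action.DocWriteRequest;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.action.index.IndexResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;

public class IndexExample {

    public static void indexRecord(RestHighLevelClient client) throws IOException {
        Map<String, Object> jsonMap = new HashMap<>();
        jsonMap.put("name", "test");

        IndexRequest indexRequest = new IndexRequest("employee")
                .id("1")
                .source(jsonMap)
                .opType(DocWriteRequest.OpType.CREATE);

        // Throws ElasticsearchException / IOException on failure
        IndexResponse response = client.index(indexRequest, RequestOptions.DEFAULT);
        System.out.println(response.status()); // CREATED when the document is new
    }
}
```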

Spring Boot Couchbase request to repo stops abruptly

Software versions:
Spring Boot v2.0.1
Enterprise Edition 5.1.0 build 5552
I have an account repository where I want to find a user by email:
@N1qlPrimaryIndexed
@ViewIndexed(designDoc = "Account")
public interface AccountRepo extends CouchbasePagingAndSortingRepository<AccountEntity, String> {

    Optional<AccountEntity> findByEmail(String email);

    @Query("#{#n1ql.selectEntity} where #{#n1ql.filter} and email = $1")
    Optional<AccountEntity> findByEmailN1QL(String email);
}
Now when I call the findByEmail method of the repo, the request abruptly stops execution with no error or message thrown whatsoever. Both methods, the derived Spring Data query and the N1QL query, result in the same behaviour.
The entity exists and is created on server startup and the only thing I did on couchbase server is creating the bucket, no views or indexes for now.
What can be the source of this behaviour?
How can I debug this?
The problem was that I used a UUID attribute on my Account class: the serialization POJO -> Couchbase document worked, but the deserialization Couchbase document -> POJO did not.
Moving to String instead of UUID made it work.
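The adjusted entity might look like the following sketch (field and class names beyond those in the question are illustrative); the UUID is still generated, but stored as a plain String so the JSON round-trip is lossless:

```java
import java.util.UUID;

import org.springframework.data.annotation.Id;
import org.springframework.data.couchbase.core.mapping.Document;

@Document
public class AccountEntity {

    @Id
    private String id = UUID.randomUUID().toString(); // was: private UUID id;

    private String email;

    public String getId() {
        return id;
    }

    public String getEmail() {
        return email;
    }

    public void setEmail(String email) {
        this.email = email;
    }
}
```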

Spring Data MongoDB eliminate POJO's

My system is a dynamic telemetry system. We have hundreds of different spiders all sending telemetry back to the Spring Boot server. Everything is dynamic, driven by JSON files in Mongo, including the UI. We don't build the UI; individual teams configure their own UI for their needs, all by editing JSON docs.
We have the majority of the UI running and I began the middleware piece. We are using Spring Boot for the first time along with Spring Data MongoDB, with several MQ listeners for events. The problem is Spring Data: when I started reading the docs, I realized they do not address using it without POJOs. I have this wonderfully dynamic model that changes per user per minute as the telemetry spiders dictate; I couldn't shackle it to a POJO if I tried. Is there a way to use Spring Data with a Map?
It seems from my experiments that the big issue is there is no way to tell the CRUD routines of the repository class which collection to query without a POJO.
Are my suspicions correct that this won't work, and am I better off ditching Spring Data and using the Mongo driver directly?
I don't think you can do it without a POJO when using Spring Data. The least you could do is this:
public interface NoPojoRepository extends MongoRepository<DummyPojo, String> {
}
and create a dummy POJO with just an id and a Map:
@Data
public class DummyPojo {

    @Id
    private String id;

    private Map<String, Object> value;
}
Since this value field is a map, you can store pretty much anything.
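Alternatively, you can skip repositories entirely and use MongoTemplate, which accepts an explicit collection name and raw BSON documents, so no entity class is needed at all. A minimal sketch, assuming an auto-configured MongoTemplate bean; the class and collection names are illustrative:

```java
import java.util.List;
import java.util.Map;

import org.bson.Document;
import org.springframework.data.mongodb.core.MongoTemplate;
import org.springframework.stereotype.Service;

@Service
public class TelemetryStore {

    private final MongoTemplate mongoTemplate;

    public TelemetryStore(MongoTemplate mongoTemplate) {
        this.mongoTemplate = mongoTemplate;
    }

    // Store an arbitrary map in any collection, chosen at runtime
    public void save(Map<String, Object> payload, String collection) {
        mongoTemplate.save(new Document(payload), collection);
    }

    // Read back raw BSON documents without mapping them to a POJO
    public List<Document> findAll(String collection) {
        return mongoTemplate.findAll(Document.class, collection);
    }
}
```

This keeps Spring's connection management and converter infrastructure while leaving the document shape fully dynamic.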

Spring Data MongoDB auto create indexes not working

I am using Spring Data MongoDB v1.6.2 and Spring 4.2.1. Today I noticed that the @Indexed annotation on my entities did not trigger index creation.
The entity is annotated with org.springframework.data.mongodb.core.mapping.Document:
@Document
public class Entity {

    @Indexed(unique = true)
    private String name;
}
After some investigation it appeared that MongoPersistentEntityIndexCreator did not receive the MappingContextEvent. Spring 4.2 altered the way generics are handled for ApplicationEvents.
Spring Data MongoDB fixed this in the following commit: https://github.com/spring-projects/spring-data-mongodb/commit/2a27eb74044d6480b228a216c1f93b2b0488c59a
The issue tracker can be found here: https://jira.spring.io/browse/DATAMONGO-1224
This was fixed in all maintained versions, so upgrading to 1.6.3 fixed the issue.
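If upgrading is not immediately possible, one workaround is to create the index programmatically at startup instead of relying on the annotation. A sketch, assuming an auto-configured MongoTemplate and the Entity class from the question (the IndexInitializer class name is illustrative):

```java
import javax.annotation.PostConstruct;

import org.springframework.data.domain.Sort;
import org.springframework.data.mongodb.core.MongoTemplate;
import org.springframework.data.mongodb.core.index.Index;
import org.springframework.stereotype.Component;

@Component
public class IndexInitializer {

    private final MongoTemplate mongoTemplate;

    public IndexInitializer(MongoTemplate mongoTemplate) {
        this.mongoTemplate = mongoTemplate;
    }

    @PostConstruct
    public void ensureIndexes() {
        // Creates the same unique index that @Indexed(unique = true)
        // on the "name" field would have produced
        mongoTemplate.indexOps(Entity.class)
                .ensureIndex(new Index().on("name", Sort.Direction.ASC).unique());
    }
}
```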

spring-data-elasticsearch - Jackson can't serialize using global configuration

I'm developing a project using Elasticsearch and I'm having some problems with serialization/deserialization with Jackson. My project was created using JHipster, so I'm using Spring to store my entity in the database and to index it in Elasticsearch. All entities and other objects can be (de)serialized with Jackson, except when I try to add them to ES.
This is my global configuration for Jackson:
@Configuration
public class JacksonConfiguration {

    @Bean
    Jackson2ObjectMapperBuilder jackson2ObjectMapperBuilder() {
        SimpleModule timeModule = new JavaTimeModule();
        timeModule.addSerializer(OffsetDateTime.class, JSR310DateTimeSerializer.INSTANCE);
        timeModule.addSerializer(ZonedDateTime.class, JSR310DateTimeSerializer.INSTANCE);
        timeModule.addSerializer(LocalDateTime.class, JSR310DateTimeSerializer.INSTANCE);
        timeModule.addSerializer(Instant.class, JSR310DateTimeSerializer.INSTANCE);
        timeModule.addDeserializer(LocalDate.class, JSR310LocalDateDeserializer.INSTANCE);

        SimpleModule geoModule = new GeoModule();
        geoModule.addSerializer(Point.class, PointSerializer.INSTANCE);
        geoModule.addDeserializer(Point.class, PointDeserializer.INSTANCE);

        return new Jackson2ObjectMapperBuilder()
                .featuresToDisable(SerializationFeature.WRITE_DATES_AS_TIMESTAMPS)
                .findModulesViaServiceLoader(true)
                .modulesToInstall(timeModule, geoModule);
    }
}
This configuration works fine, except when I try to add an entity to ES; for example, PointSerializer is never called. The only way I can get this serializer to run (and consequently index correctly) for ES is by adding @JsonSerialize(using = PointSerializer.class) to the field. Why is this happening, and how can I configure it globally?
It seems that Spring Data Elasticsearch doesn't use the default Spring Jackson2ObjectMapperBuilder for this. By default this configuration is used:
https://github.com/spring-projects/spring-data-elasticsearch/blob/master/src/main/java/org/springframework/data/elasticsearch/core/DefaultEntityMapper.java
... which you can overwrite by providing some custom object mapper as described here:
https://github.com/spring-projects/spring-data-elasticsearch/wiki/Custom-ObjectMapper
Here you can of course directly use your Jackson ObjectMapper. For more details, have a look at this issue in the JHipster GitHub repo:
https://github.com/jhipster/generator-jhipster/issues/2241#issuecomment-151933768
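The approach from the wiki can be sketched as follows: wire the globally configured Jackson2ObjectMapperBuilder into Spring Data Elasticsearch's EntityMapper, so the same custom modules (including PointSerializer) run when documents are written to ES. This targets the EntityMapper API of the older (2.x/3.x-era) Spring Data Elasticsearch releases that JHipster used at the time:

```java
import java.io.IOException;

import com.fasterxml.jackson.databind.ObjectMapper;
import org.elasticsearch.client.Client;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.elasticsearch.core.ElasticsearchTemplate;
import org.springframework.data.elasticsearch.core.EntityMapper;
import org.springframework.http.converter.json.Jackson2ObjectMapperBuilder;

@Configuration
public class ElasticsearchConfiguration {

    @Bean
    public ElasticsearchTemplate elasticsearchTemplate(Client client,
            Jackson2ObjectMapperBuilder builder) {
        // Build an ObjectMapper from the same builder used for the web layer,
        // so all globally registered modules apply to ES (de)serialization too
        ObjectMapper mapper = builder.createXmlMapper(false).build();
        return new ElasticsearchTemplate(client, new CustomEntityMapper(mapper));
    }

    public static class CustomEntityMapper implements EntityMapper {

        private final ObjectMapper objectMapper;

        public CustomEntityMapper(ObjectMapper objectMapper) {
            this.objectMapper = objectMapper;
        }

        @Override
        public String mapToString(Object object) throws IOException {
            return objectMapper.writeValueAsString(object);
        }

        @Override
        public <T> T mapToObject(String source, Class<T> clazz) throws IOException {
            return objectMapper.readValue(source, clazz);
        }
    }
}
```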