How to combine neo4j and elasticsearch - maven

I am developing a question-answering application, and for that I need to use neo4j and elasticsearch in the same maven project. I am using elasticsearch to make my application more robust.
As far as I know, neo4j and elasticsearch depend on different versions of lucene, so whichever version I include as a dependency, I get an error.
Here is what I am doing:
First, elasticsearch indexes the data, and the data and relationships are stored as a graph database using neo4j. Then the user enters a query, and the matching data is retrieved with the help of the indexes. This data is then triggered in the graph database using a trigger score, which is propagated along the graph to find results relevant to the user's query.
Is there any way to integrate neo4j and elasticsearch in the same maven project, or is there any other way in which these two modules can interact separately?
Thanks

Please check out our integration page:
http://neo4j.com/developer/elastic-search/
It has some discussion and also an example project to get you started:
http://github.com/neo4j-contrib/neo4j-elasticsearch
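Separately from the linked project, one way to let the two modules interact without sharing a classpath is to run Elasticsearch as its own server and call it over HTTP, so that no Elasticsearch (and no second Lucene) jar ends up in the maven project next to neo4j. The following is only a minimal sketch under that assumption, using the HTTP client built into Java 11. The class name EsRestSearch, the index name "questions", the field name "text" and the URL are all made up for illustration.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical helper: talks to Elasticsearch over plain HTTP so that
// no Elasticsearch or Lucene dependency is needed in this project.
class EsRestSearch {

    // Builds a minimal match query. Assumes field and text need no JSON escaping.
    static String matchQueryBody(String field, String text) {
        return String.format("{\"query\":{\"match\":{\"%s\":\"%s\"}}}", field, text);
    }

    // Builds the POST request for the _search endpoint of one index.
    static HttpRequest searchRequest(String baseUrl, String index, String body) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/" + index + "/_search"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    public static void main(String[] args) throws Exception {
        // Assumed: Elasticsearch listens on localhost:9200 and has an index "questions".
        HttpRequest request = searchRequest("http://localhost:9200", "questions",
                matchQueryBody("text", "graph database"));
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        // The IDs of the hits could then serve as start nodes for the neo4j traversal.
        System.out.println(response.body());
    }
}
```

With this split, only the neo4j dependency (and its Lucene version) remains in the pom, and the search hits are passed to the graph side as plain IDs.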

Related

What is the best way to maintain queries in Spring boot application?

My application uses the following technologies:
Spring Boot 2.7.x
Cassandra
Spring Batch 5.x
Java 11
As part of this, I need to extract data from the Cassandra database and write it out to a file, so I need queries to fetch the data.
I want to know the best way to maintain all queries in one place, so that when a query changes in the future I don't have to rebuild the app and only need to modify the query.
Using a repository class is necessary. If you are using JPA, I recommend using a repository for each entity class. With JDBC it is possible to create a single repository which contains all the queries. To access the query methods I would use a service class. This way your code is well structured and maintainable for future changes.
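To address the "no rebuild" part of the question, the query strings themselves could be kept in a properties file outside the jar and looked up by name from the repository. This is only a sketch of that idea using plain Java, not a Spring or Cassandra API. The class name QueryRegistry, the key "orders.byCustomer" and the table in the sample query are hypothetical.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

// Hypothetical registry: all query strings live in one properties file outside the jar.
class QueryRegistry {
    private final Properties queries = new Properties();

    // Reads "name=query" pairs from the given source.
    QueryRegistry(Reader source) {
        try {
            queries.load(source);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Loads the file from disk, e.g. from a path supplied through configuration.
    static QueryRegistry fromFile(Path file) throws IOException {
        try (Reader reader = Files.newBufferedReader(file)) {
            return new QueryRegistry(reader);
        }
    }

    // Fails fast when a query name is missing instead of returning null.
    String get(String name) {
        String query = queries.getProperty(name);
        if (query == null) {
            throw new IllegalArgumentException("No query named " + name);
        }
        return query;
    }

    public static void main(String[] args) {
        // Inline text stands in for an external queries.properties file.
        QueryRegistry registry = new QueryRegistry(new StringReader(
                "orders.byCustomer=SELECT * FROM orders WHERE customer_id = ?\n"));
        System.out.println(registry.get("orders.byCustomer"));
    }
}
```

Because the file is read when the registry is created, a changed query takes effect after a restart (or a reload of the registry) rather than a rebuild. The repository class would call get(...) wherever it currently has a hard-coded query string.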

Spring data elasticsearch - migration documents to new index

I'm developing a Spring application for search purposes. I use the Spring Data Elasticsearch library for creating indices and managing documents. For querying (searching) I use the regular client from Elasticsearch, not the one from Spring Data.
I noticed that Spring Data only creates the index if it is missing in Elasticsearch. Whenever a new field is added to the class annotated with @Document, the mapping is not updated. Thus, searching on the newly added field causes a bad request.
The application is already running in production, in multiple instances. I would like to change the mapping of the index and keep the existing data.
The solution I found on the internet and in the documentation is to create a new index, copy the data (possibly changing it on the fly) with the reindex functionality, and switch the alias to the new index.
I implemented a solution with this approach. The migration procedure runs on application startup (if required, which is decided by an env param).
However, this approach seems cheap and shoddy to me. Changing documents with a Painless script is error-prone. It is difficult to test the migration. I need to manually keep track of which env I am running the migration on and make sure the proper index name is set. During deployment I need to keep an eye on the process to check that everything worked correctly. Some manual changes might be required as well. What if the reindex procedure fails partway through?
There are a lot of questions bothering me. I was wondering why there isn't a library similar to Flyway. Also, I understand that it is not possible to change the mapping of existing fields in an index, but it is possible to add a new field, and this is not supported in Spring Data Elasticsearch.
Could you please give me some advice on how you tackle such situations?
This is not an answer on how to do these migrations in general, but a clarification of what Spring Data Elasticsearch can do and what it does.
Spring Data Elasticsearch creates an index with the corresponding mapping if you are using a Spring Data Elasticsearch repository for your entity and if the index does not exist on application startup. It does not update the mapping of an index by itself.
You can nevertheless update an index mapping from the program code; there's IndexOperations.putMapping(java.lang.Class<?>) for that. So if you add a new property to your entity and then call this method with the changed entity class on application start, the index mapping will be updated. This can only add new fields to the mapping, not change existing ones. This is a restriction of Elasticsearch.
If your application is running in multiple instances, it is up to you to synchronize them when updating or to handle errors correctly.
If you add fields make sure to update the mapping before adding data, otherwise the new field type will be autodetected by Elasticsearch and you will have to do a manual reindex process.
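For the reindex-and-alias-switch approach described in the question, the two REST calls involved are POST /_reindex and POST /_aliases. The sketch below only builds the JSON bodies for them in plain Java, to show what a migration step boils down to. The class name IndexMigrationRequests and the index and alias names are hypothetical, and sending the requests, waiting for the reindex to finish and handling errors are left out.

```java
// Hypothetical helper that builds the JSON bodies for the two REST calls
// behind a reindex-and-alias-switch migration.
class IndexMigrationRequests {

    // Body for POST /_reindex: copies the documents from oldIndex into newIndex.
    static String reindexBody(String oldIndex, String newIndex) {
        return String.format(
                "{\"source\":{\"index\":\"%s\"},\"dest\":{\"index\":\"%s\"}}",
                oldIndex, newIndex);
    }

    // Body for POST /_aliases: the remove and add actions are applied atomically.
    static String aliasSwitchBody(String alias, String oldIndex, String newIndex) {
        return String.format(
                "{\"actions\":["
                        + "{\"remove\":{\"index\":\"%s\",\"alias\":\"%s\"}},"
                        + "{\"add\":{\"index\":\"%s\",\"alias\":\"%s\"}}]}",
                oldIndex, alias, newIndex, alias);
    }

    public static void main(String[] args) {
        // Assumed naming scheme: versioned indices behind one stable alias.
        System.out.println(reindexBody("products_v1", "products_v2"));
        System.out.println(aliasSwitchBody("products", "products_v1", "products_v2"));
    }
}
```

Since the alias switch is a single atomic request, readers that query through the alias never see a state without an index. If the reindex fails before the switch, the alias still points at the old index and the new index can be dropped and rebuilt.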

What is the best way to use Spring and ElasticSearch?

I have to implement an application using the Spring framework.
All I have to do is read from a repository (not an RDBMS; maybe Lucene or Elasticsearch) and display some view pages for customers. That is, no saves or updates, only reads.
What is the best way to query such a repository in the Spring framework?
You can use spring-data-elasticsearch, which is the Spring Data implementation for Elasticsearch.
To get started, you may like to refer to https://www.mkyong.com/spring-boot/spring-boot-spring-data-elasticsearch-example/ which explains the integration with an example. Although it is a bit old, it provides enough information to get it working.

What options do I have regarding indexing PDFs while running on Elasticsearch 1.x and Spring Data 1.x, especially if I want to upgrade?

We have a new requirement for our Elasticsearch setup: to index PDFs. We are still running on Elasticsearch version 1.x (and Spring Data 1.3.4).
I looked at the documentation for Elasticsearch 5, and it has new ways of supporting PDFs (and I would like to upgrade).
So given all this, the way I see it I have the following options:
1. Sit tight and wait for Spring Data to support Elasticsearch 5. This is viable if it is not too far away (please let us know, Spring Data and Elasticsearch devs), although given the business urgency of this feature I don't think I have much leeway.
2. Move off Spring Data altogether. This is not as crazy as it sounds: given the complexity of my queries, I don't use the Spring Data repositories a great deal. I do however use them for inserting data. I would have to provide my own implementations of the current repository interfaces. It would be work, but I wouldn't need to wait for anyone and would not need to use any outdated plugins etc.
3. Somehow run on Elasticsearch 5 with Spring Data 2.x/3.x. Will this work at all? Chances are it won't even start up.
4. Upgrade my Elasticsearch/Spring Data to 2.x and use the "old" way of indexing PDFs.
Which option is the best way to go?

ElasticSearch - Index a large file using Java API

We have a requirement to use Elasticsearch for full-text search. We have a Spring-based application, and for the integration with ES we can use either the Java API of Elasticsearch or Spring Data Elasticsearch.
The input will be a file of around 5 MB.
I went through examples for both the ES Java API and Spring Data; they do have tutorials available for inserting a JSON document.
But I could not find any help on using a file as the input to create documents or an index.
I am a newbie with Elasticsearch; any guidance or help on this will be much appreciated.
EDIT:
I can see that there is an Ingest Attachment Processor plugin available in ES (https://www.elastic.co/guide/en/elasticsearch/plugins/master/ingest-attachment.html).
Can anybody point me to a sample curl request or any Java code that uses this plugin?
1. You may use the Elasticsearch mapper attachments plugin. This plugin uses Apache Tika to ingest almost any well-known type of document and make it searchable by Elasticsearch.
https://www.elastic.co/guide/en/elasticsearch/plugins/2.3/mapper-attachments.html
2. You can use Apache Tika to extract the useful content from the file and use the Elasticsearch Bulk API to index it into ES.
Hope that helps
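For the Ingest Attachment Processor mentioned in the EDIT, the flow is to define an ingest pipeline with an attachment processor and then index a document whose field holds the file as base64, passing the pipeline name as a request parameter. The sketch below only builds the two JSON bodies in plain Java and assumes the plugin is installed. The class name AttachmentRequests, the field name "data" and the file name are chosen for illustration, and the HTTP calls themselves are left out.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper that builds the request bodies for indexing a file
// through an ingest pipeline with an attachment processor.
class AttachmentRequests {

    // Body for PUT /_ingest/pipeline/<name>: the processor reads base64 content from "data".
    static String pipelineBody() {
        return "{\"description\":\"Extract attachment text\","
                + "\"processors\":[{\"attachment\":{\"field\":\"data\"}}]}";
    }

    // Body for the index request sent with ?pipeline=<name>.
    // The file bytes are base64-encoded, as the attachment processor expects.
    static String documentBody(byte[] fileContent) {
        String encoded = Base64.getEncoder().encodeToString(fileContent);
        return "{\"data\":\"" + encoded + "\"}";
    }

    public static void main(String[] args) {
        // A real caller would read the bytes with Files.readAllBytes(Path.of("report.pdf")).
        byte[] content = "hello".getBytes(StandardCharsets.UTF_8);
        System.out.println(pipelineBody());
        System.out.println(documentBody(content));
    }
}
```

After indexing, the extracted text is stored under the attachment field of the document (for example attachment.content) and can be searched like any other text field. For a file of around 5 MB the base64 body grows by about a third, which may matter for HTTP request size limits.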
