Most useful plugins for ElasticSearch - elasticsearch

What are some of the most widely used Elasticsearch plugins?
For example, plugins for monitoring, mapping, or analysis.

The OP didn't mention which ES version is in use. I am suggesting the plugins below because they are easy to set up, free, and provide an admin interface for an Elasticsearch cluster.
For ES versions below 2.x I would recommend the kopf plugin, and for the latest versions of ES, Cerebro, which is by the same author as kopf.
It offers an easy way of performing common tasks on an Elasticsearch cluster. Not every single API is covered by this plugin, but it does offer a REST client which allows you to explore the full potential of the Elasticsearch API.
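To illustrate the kind of call such a REST client issues, here is a minimal Python sketch that builds a request against the cluster health endpoint. It assumes a node reachable at http://localhost:9200 (an assumption, not something stated above). `_cluster/health` is a standard Elasticsearch endpoint, while the helper names are made up for illustration.

```python
import json
import urllib.request

# Assumed address of a local node; adjust for your cluster.
BASE_URL = "http://localhost:9200"


def build_request(path, base_url=BASE_URL):
    """Build a GET request for an Elasticsearch REST path (hypothetical helper)."""
    url = base_url.rstrip("/") + "/" + path.lstrip("/")
    return urllib.request.Request(url, method="GET")


def fetch_json(request, timeout=5):
    """Send the request and decode the JSON body. Needs a running cluster."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


health_request = build_request("_cluster/health")
# fetch_json(health_request) would return the cluster health document.
```

Only `fetch_json` touches the network, so the request can be inspected or logged before anything is sent.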

Related

Versions for integration of apache flink, elasticsearch and kafka

I have problems with the different versions of Flink, Kafka and Elasticsearch. I'm using Flink 1.8.1, but I don't know which version of Kafka to use. I also want to use version 6 of Elasticsearch. Which versions do you think are suitable for Flink, Kafka and Elasticsearch?
The following link is for a version of Kafka, but in the comments section it is described as a beta
As listed in the table, Kafka 0.11 (and higher) will work fine. The beta label refers to a version of the Flink connector, not to Kafka itself.
Plus, Kafka Connect for Elasticsearch, should you choose to use it, works with Elasticsearch 6.
As @cricket_007 said, it's safe to use the Kafka connector even though it is labeled beta (a label that should be removed, as this connector has now been battle-tested in production for over a year).
The setup Kafka -> Flink -> ES6 is quite common, so you can and should use recent versions of all the components involved.

What options do I have regarding indexing PDFs while running on Elasticsearch 1.x and Spring Data 1.x, especially if I want to upgrade?

We have a new requirement for our Elasticsearch setup: indexing PDFs. We are still running Elasticsearch version 1.x (and Spring Data 1.3.4).
I looked at the documentation for Elasticsearch 5, which has new ways of supporting PDFs (and I would like to upgrade).
Given all this, I see the following options:
1. Sit tight and wait for Spring Data to support Elasticsearch 5. This is viable if it is not too far away (please let us know, Spring Data and Elasticsearch devs), although given the business urgency of this feature I don't think I have much leeway.
2. Move off Spring Data altogether. This is not as crazy as it sounds: given the complexity of my queries, I don't use the Spring Data repositories a great deal. I do, however, use them for inserting data, so I would have to provide my own implementations of the current repository interfaces. It would be work, but I wouldn't need to wait for anyone and would not need to use any outdated plugins.
3. Somehow run on Elasticsearch 5 with Spring Data 2.x/3.x. Will this work at all? Chances are it won't even start up.
4. Upgrade my Elasticsearch/Spring Data to 2.x and use the "old" way of indexing PDFs.
Which option is the best way to go?
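For reference, both the older mapper-attachments plugin and the ingest attachment processor available for Elasticsearch 5 expect the PDF as a base64-encoded string inside the JSON document. Below is a minimal Python sketch of preparing such a document and, for the 5.x route, a pipeline body. The field name `data` and the helper names are assumptions chosen for illustration, and the sketch does not talk to a cluster.

```python
import base64
import json


def build_pdf_document(pdf_bytes, field="data"):
    """Wrap raw PDF bytes as a base64 string in a JSON-serialisable document."""
    encoded = base64.b64encode(pdf_bytes).decode("ascii")
    return {field: encoded}


def build_attachment_pipeline(field="data"):
    """Body for an ingest pipeline using the attachment processor (ES 5.x route)."""
    return {
        "description": "Extract text from base64-encoded attachments",
        "processors": [{"attachment": {"field": field}}],
    }


# Stand-in bytes; in practice these would be read from a PDF file.
document = build_pdf_document(b"%PDF-1.4 example")
pipeline = build_attachment_pipeline()
print(json.dumps(pipeline, indent=2))
```

On the 5.x route the pipeline body would be registered under `_ingest/pipeline/<name>` and the document indexed with the `pipeline` request parameter. Since both routes consume the same base64 payload, the encoding step should carry over whichever option is chosen.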

Couchbase plugin for ElasticSearch deprecated?

I was reading https://www.elastic.co/blog/deprecating-rivers which states that ES rivers (plugins) are being deprecated. That is, as I understand it, any plugin integrated directly with the Elasticsearch server will stop working from ES 3.x onwards.
The Couchbase plugin is one of that kind.
I searched all the Couchbase plugin documentation at http://developer.couchbase.com/documentation/server/4.5/connectors/elasticsearch-2.1/elastic-intro.html but could not find out whether it uses the deprecated mechanism or not.
Does anyone know? Should we keep using the Couchbase plugin, or should we start planning to write data directly to ES from our application?
We currently have Couchbase data replicated to ES using the Couchbase plugin and XDCR.
I'm the maintainer of the Couchbase ES transport plugin. As Roi mentions in his answer, the plugin doesn't use rivers, so it won't be deprecated. It currently supports any version of ES from 1.3 to 2.x, and I'm working on adding support for 5.x. It's taking a bit longer because ES 5.x broke some configuration sharing features in unexpected ways.
I'd suggest always looking at our github repo for the latest plugin releases:
https://github.com/couchbaselabs/elasticsearch-transport-couchbase
The Couchbase plugin does not use rivers. There is another plugin, a river plugin, which is no longer valid.
Take a look here: https://github.com/couchbaselabs/elasticsearch-transport-couchbase

How to combine neo4j and elasticsearch

I am developing a question answering application, and for that I need to use neo4j and elasticsearch in the same Maven project. I am using elasticsearch to make my application more robust.
As we know, neo4j and elasticsearch work with different versions of Lucene, so whichever version I include as a dependency, I get an error.
Here is what I am doing:
First, elasticsearch will index the data, and the data and relationships will be stored as a graph database using neo4j. Then the user will enter a query, through which the data will be retrieved with the help of the indexes. This data will then be triggered in the graph database using a trigger score, which will be propagated along the graph to find results relevant to the user query.
Is there any way I can integrate neo4j and elasticsearch in the same Maven project, or is there any other way these two modules can interact separately?
Thanks
Please check out our integration page:
http://neo4j.com/developer/elastic-search/
Which has some discussion and also an example project to get you started.
http://github.com/neo4j-contrib/neo4j-elasticsearch
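The flow described in the question (look up matching items through the index, then propagate a trigger score along the graph) can be sketched without either library. The following is a toy in-memory Python version, not the neo4j or elasticsearch API: the graph is a plain dict, the seed score stands in for a search hit, and the `decay` and `threshold` parameters are assumptions made for illustration.

```python
from collections import deque


def propagate_scores(graph, seeds, decay=0.5, threshold=0.05):
    """Spread an initial score from seed nodes along outgoing edges.

    graph: dict mapping node -> list of neighbour nodes
    seeds: dict mapping node -> initial score (e.g. from a search hit)
    Each hop multiplies the score by `decay`; propagation stops below `threshold`.
    """
    scores = dict(seeds)
    queue = deque(seeds.items())
    while queue:
        node, score = queue.popleft()
        passed_on = score * decay
        if passed_on < threshold:
            continue
        for neighbour in graph.get(node, []):
            # Keep only the best score seen for a node so cycles terminate.
            if passed_on > scores.get(neighbour, 0.0):
                scores[neighbour] = passed_on
                queue.append((neighbour, passed_on))
    return scores


# Toy data: the seed stands in for a document returned by the index lookup.
graph = {"q1": ["a1", "a2"], "a1": ["a3"], "a2": [], "a3": ["q1"]}
ranked = propagate_scores(graph, {"q1": 1.0})
```

Keeping the search step and the graph step behind separate functions like this also mirrors the option of letting the two modules interact separately, for example as two processes that exchange only node identifiers and scores.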

Elasticsearch / Storm integration methods

Looking for a simple integration path between Elasticsearch and Apache Storm. Support for this is included in the elasticsearch-hadoop library, but this brings in tons of dependencies on the Hadoop stack, from Hive to Cascading, that I simply don't need. Has anyone out there succeeded in this integration without bringing in elasticsearch-hadoop? Thanks.
In my project we're using the rabbitmq river for indexing the Storm output. It's a very efficient and convenient way to write to elasticsearch. You basically put the messages on the queue and the river does the rest. If something gets stuck, the data is simply buffered on the queue.
So I would say, use this river approach for writing and the elasticsearch Java API for reading, as Kit Menke suggests (or the Jest client; we've found it cool, and it offers an async API based on ApacheHttpAsyncClient, though we're not reading from elasticsearch in the Storm topology but in different services).
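As far as its documentation describes, the RabbitMQ river consumes messages written in the bulk API format, so the Storm side only needs to serialise its tuples accordingly and publish them. Below is a minimal Python sketch of building such a message body. The index name, type name, and helper are hypothetical, and publishing to the queue (for example with a RabbitMQ client library) is left out.

```python
import json


def to_bulk_message(docs, index, doc_type):
    """Serialise documents into the newline-delimited bulk format.

    docs: iterable of (doc_id, source_dict) pairs.
    """
    lines = []
    for doc_id, source in docs:
        action = {"index": {"_index": index, "_type": doc_type, "_id": doc_id}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(source))
    # The bulk format requires a trailing newline after the last line.
    return "\n".join(lines) + "\n"


message = to_bulk_message(
    [("1", {"word": "storm", "count": 3})], "wordcount", "tuple"
)
```

Each document becomes two lines, an action line followed by the source, so several tuples can be batched into one queue message.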

Resources