I am trying to integrate Camel with Elasticsearch.
In applicationContext.xml I added the following:
<route id="timer-to-console">
  <from uri="timer://foo?fixedRate=true&amp;period=10s"/>
  <transform>
    <simple>Hello Web Application, how are you?</simple>
  </transform>
  <to uri="stream:out"/>
  <to uri="elasticsearch://local"/>
</route>
Then when I run
mvn jetty:run
I am getting the following:
veryCounter=0, firedTime=Mon Apr 21 13:14:43 PDT 2014}
BodyType String
Body Hello Web Application, how are you?
]
Stacktrace
----------------------------------------------------------------------------------------
java.lang.IllegalArgumentException: operation is missing
at org.apache.camel.component.elasticsearch.ElasticsearchProducer.process(ElasticsearchProducer.java:54)
at org.apache.camel.util.AsyncProcessorConverterHelper$ProcessorToAsyncProcessorBridge.process(AsyncProcessorConverterHelper.java:61)
My Elasticsearch instance is running locally; I am using ES 1.1.1.
What do I need to specify for
elasticsearch://clusterName[?options]
Thanks,
From a quick glance at the Apache Camel Elasticsearch component page, the docs show the following example:
elasticsearch://local?operation=INDEX&indexName=twitter&indexType=tweet
This would INDEX (add) documents into an index named twitter with a type of tweet. You can use whatever values you want for indexName and indexType.
Update: Looking at the Elasticsearch Camel component documentation again, I think that in order to use local as the cluster name in the Elasticsearch connection, you would need to be running your local Elasticsearch instance with a cluster name of local. The default Elasticsearch configuration (elasticsearch.yml) is set up to run with a cluster name of elasticsearch.
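Putting both points together, the asker's route would become something like the following sketch. The indexName/indexType values are arbitrary, the cluster name assumes the elasticsearch.yml default of elasticsearch, and I've changed the body to a JSON document since that is what Elasticsearch expects as a source:

<route id="timer-to-console">
  <from uri="timer://foo?fixedRate=true&amp;period=10s"/>
  <transform>
    <!-- Elasticsearch indexes JSON documents, so build one here -->
    <simple>{ "greeting": "Hello Web Application, how are you?" }</simple>
  </transform>
  <to uri="stream:out"/>
  <to uri="elasticsearch://elasticsearch?operation=INDEX&amp;indexName=twitter&amp;indexType=tweet"/>
</route>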
You have to use the cluster name specified in elasticsearch.yml, i.e. if it says cluster.name: elasticsearch123, then you have to route, for example, as follows:
from("file:data_json?noop=true&moveFailed=data_json/.error")
.convertBodyTo(byte[].class)
.to("elasticsearch://elasticsearch123?operation=INDEX&indexName=myindexname&indexType=mytypename");
Using local simply doesn't work for me. Unfortunately, it doesn't even throw an error, which makes it difficult to debug.
Also notice the conversion .convertBodyTo(byte[].class): it's essential, because otherwise Camel converts the JSON files to the Properties class (it looks for a conversion from GenericFile to Map and finds a fallback type converter to Properties). Of course, you could also convert to one of the other types supported by the component.
It's also worth mentioning that the camel-elasticsearch component in version 2.14.1 depends on org.elasticsearch:elasticsearch:1.0.0, which is quite old given how fast Elasticsearch moves (the current version is 1.4.4) and how often it breaks compatibility. That said, the component seems to work with Elasticsearch 1.4.3.
Update: the current master branch on GitHub has Elasticsearch upgraded to version 1.4.2: https://github.com/apache/camel/blob/2470023e25b8814279cbadb3ebc8002bed6d8fc8/parent/pom.xml#L144
The parameter/cluster name local in fact means that the component will look for an Elasticsearch instance started in the same JVM. From Elasticsearch's JavaDoc (NodeBuilder.local(boolean)):
A local node is a node that uses a local (JVM level) discovery and transport. Other (local) nodes started within the same JVM (actually, class-loader) will be discovered and communicated with. Nodes outside of the JVM will not be discovered.
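In other words, elasticsearch://local only finds something if an embedded node has already been started inside the same JVM, along these lines (a sketch against the ES 1.x NodeBuilder API; the class and setup are illustrative):

import org.elasticsearch.node.Node;

import static org.elasticsearch.node.NodeBuilder.nodeBuilder;

public class EmbeddedEsNode {
    public static void main(String[] args) {
        // local(true) restricts discovery/transport to this JVM (class loader),
        // which is exactly the mode the JavaDoc above describes.
        Node node = nodeBuilder().local(true).node(); // node() builds and starts
        // ... start Camel routes pointing at elasticsearch://local here ...
        node.close();
    }
}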
Related
Spring Data Elasticsearch uses a 7.x client version, but my production Elasticsearch version is 6.4.2. So I changed the version and got the following exception. How do I safely change the version in Spring Data ES?
Repository: https://github.com/Yungdi/spring-data-reactive-elasticsearch
An attempt was made to call a method that does not exist. The attempt was made from the following location:
org.springframework.data.elasticsearch.core.ReactiveElasticsearchTemplate.<init>(ReactiveElasticsearchTemplate.java:108)
The following method did not exist:
org.elasticsearch.action.support.IndicesOptions.strictExpandOpenAndForbidClosedIgnoreThrottled()Lorg/elasticsearch/action/support/IndicesOptions;
The method's class, org.elasticsearch.action.support.IndicesOptions, is available from the following locations:
jar:file:/Users/we/DevEnv/gradle-6.4.1/caches/modules-2/files-2.1/org.elasticsearch/elasticsearch/6.4.2/29a4003b7e28ae8d8896041e2e16caa7c4272ee3/elasticsearch-6.4.2.jar!/org/elasticsearch/action/support/IndicesOptions.class
The class hierarchy was loaded from the following locations:
org.elasticsearch.action.support.IndicesOptions: file:/Users/we/DevEnv/gradle-6.4.1/caches/modules-2/files-2.1/org.elasticsearch/elasticsearch/6.4.2/29a4003b7e28ae8d8896041e2e16caa7c4272ee3/elasticsearch-6.4.2.jar
Action:
Correct the classpath of your application so that it contains a single, compatible version of org.elasticsearch.action.support.IndicesOptions
You can't use an Elasticsearch 6 cluster with Spring Data Elasticsearch 4, which uses Elasticsearch 7 libraries. The Elasticsearch REST API that is used had breaking changes between versions 6 and 7.
You can try Spring Data Elasticsearch 3.2.x, which targets 6.8. I currently don't know whether there were breaking changes between Elasticsearch 6.4 and 6.8; you'll have to try it.
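If you go that route, pinning the version explicitly would look something like this in Maven terms (a sketch; 3.2.5.RELEASE is just an example of a 3.2.x release, and you would translate this to the Gradle build used in the repository above):

<dependency>
    <groupId>org.springframework.data</groupId>
    <artifactId>spring-data-elasticsearch</artifactId>
    <version>3.2.5.RELEASE</version>
</dependency>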
I'm struggling to set up Hibernate Search with the Elasticsearch backend in a Spring Boot setup.
What I have is Spring Boot and the following dependencies:
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-elasticsearch</artifactId>
    <version>1.4.0.M3</version>
</dependency>
<dependency>
    <groupId>org.hibernate</groupId>
    <artifactId>hibernate-search-backend-elasticsearch</artifactId>
    <version>5.6.0.Alpha3</version>
</dependency>
What happens is that Hibernate Search initializes before Elasticsearch has finished starting.
Using the following property exposes the REST interface as well:
spring:
  data:
    elasticsearch:
      properties:
        http:
          enabled: true
This causes an exception:
Caused by: org.apache.http.conn.HttpHostConnectException: Connect to localhost:9200 [localhost/127.0.0.1, localhost/0:0:0:0:0:0:0:1] failed: Connection refused: connect
Now, how do I define a dependency here?
I tried using a custom BeanFactoryPostProcessor to inject a dependency on Elasticsearch, but that seems to be ignored in the auto-configuration scenario.
Is there any way to introduce a wait until Elasticsearch is up?
The setup works when I set the Hibernate index_management_strategy to NONE, but then the index is not configured and all custom analyzer annotations are ignored, falling back to the default mappings in Elasticsearch, which cannot be configured in the auto-configuration scenario.
Ideally Elasticsearch would be hosted outside the JVM, but an embedded instance is convenient in testing scenarios.
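For reference, the strategy mentioned above is controlled by a Hibernate Search property; in the 5.6 Elasticsearch integration it should look roughly like the line below (the exact property name is an assumption from memory, so verify it against the docs for your version):

# NONE skips index creation/validation, which avoids the startup race
# but also drops the custom mappings, as described above
hibernate.search.default.elasticsearch.index_management_strategy=NONE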
My understanding is that this is an issue you're hitting during integration tests.
You could have a look at how we start ES during the integration tests within Hibernate Search itself, using a Maven plugin which makes sure the server is started before the tests:
- https://github.com/hibernate/hibernate-search/blob/5.6.0.Beta1/elasticsearch/pom.xml#L341-L368
N.B. this uses a custom ES configuration, tuned to start quickly even though it's only a single-node cluster:
- https://raw.githubusercontent.com/hibernate/hibernate-search/5.6.0.Beta1/elasticsearch/elasticsearchconfiguration/elasticsearch.yml
Hibernate Search uses the Jest client to connect to ES, so it will require you to enable the HTTP connector of ES; let's not confuse this with a NodeClient, which is a different operating mode.
If your question isn't related to automated testing but rather to production clusters, then I'd suggest using a service orchestrator like Kubernetes.
Thanks to some help from the Spring Boot team, I was able to solve the issue; the solution is below.
The problem is that there's no dependency between the EntityManagerFactory bean and the Elasticsearch Client bean, so there's no guarantee that Elasticsearch will start before Hibernate. As it happens, Hibernate starts first and then fails to connect to Elasticsearch.
This can be fixed by setting up a dependency between the two beans. An easy way to do that is with a subclass of EntityManagerFactoryDependsOnPostProcessor:
@Configuration
static class ElasticsearchJpaDependencyConfiguration extends EntityManagerFactoryDependsOnPostProcessor {

    public ElasticsearchJpaDependencyConfiguration() {
        // Make the JPA EntityManagerFactory depend on the "elasticsearchClient"
        // bean, so Elasticsearch is fully started before Hibernate bootstraps.
        super("elasticsearchClient");
    }
}
Now all that is needed is to set the number of replicas to 0 to fix the health status of the cluster in the single-node deployment. This can be done by specifying an additional property in the application.properties file:
spring.data.elasticsearch.properties.index.number_of_replicas=0
I checked the spring-data docs, and it looks like you misunderstood this piece (which, to be fair, is confusingly worded):
By default the instance will attempt to connect to a local in-memory server (a NodeClient in Elasticsearch terms), but you can switch to a remote server (i.e. a TransportClient) by setting spring.data.elasticsearch.cluster-nodes to a comma-separated ‘host:port’ list.
A NodeClient is not a "local server"; it's a special type of ES client. This local client can connect to ES cluster nodes containing data, and, as I said in the comment, you don't have any ES data nodes running.
Read this for a better understanding: https://www.elastic.co/guide/en/elasticsearch/guide/current/_transport_client_versus_node_client.html
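In other words, to talk to a standalone server you switch Spring Data to a TransportClient by pointing it at the node's transport address, e.g. in application.properties (assuming the default cluster name and transport port 9300):

spring.data.elasticsearch.cluster-name=elasticsearch
spring.data.elasticsearch.cluster-nodes=localhost:9300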
I have a question for someone who is familiar with Tomcat and Coherence.
I am using Tomcat 8 and Coherence 12.2.1, and here I have, maybe not a problem, but an interesting case.
I am trying to start a web application on Tomcat as a Coherence node. I already know about ExtendTcpCacheService, and I am currently using it to create an additional node which can communicate with the Coherence cluster.
But my question is: is there a way to make Tomcat start a node which IS NOT Extend? I mean, I need Tomcat to start a Coherence node the way a Grizzly REST server does (automatically connecting to the existing cluster), not the way I have it now, where it needs all the IP addresses and configuration to connect to an existing Coherence node.
Thank you for any advice!
I am assuming that the other nodes in the cluster have the ExtendTcpCacheService enabled and you just want to disable this one service when running in Tomcat. This is easy to do, and you can continue to use one cache config file for all cluster nodes, but you will need to make a slight change to your Coherence cache configuration file. Go to the <proxy-scheme> section pertaining to your ExtendTcpCacheService service and give the <autostart> tag a system-property attribute, as shown below:
<proxy-scheme>
  <scheme-name>some-name</scheme-name>
  <service-name>ExtendTcpCacheService</service-name>
  ....
  <autostart system-property="ExtendTcpCacheService.enabled">true</autostart>
</proxy-scheme>
In the JVM start-up parameters for Tomcat you will need to pass -DExtendTcpCacheService.enabled=false to turn off starting the service. In the other JVMs you will not need to do anything, since the property defaults to true.
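For example, a common place to set this for Tomcat is bin/setenv.sh (create it if it doesn't exist; on Windows use setenv.bat):

# bin/setenv.sh: these options are passed to the JVM on Tomcat startup
CATALINA_OPTS="$CATALINA_OPTS -DExtendTcpCacheService.enabled=false"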
You can use this feature to override almost any XML tag in the Coherence config with system properties. This feature is covered in more detail in the Coherence docs.
Is the web interface for Logstash still available, or has it been replaced with Kibana?
I am using logstash-1.5.3.
Nope. From version 1.5 onwards, the Logstash web UI has been completely removed and replaced with Kibana.
I am trying to integrate Apache Nutch 2.2.1 with Elasticsearch 0.90.11.
I have followed all the tutorials available (although there are not many) and even changed bin/crawl.sh to index with Elasticsearch instead of Solr.
Everything seems to work when I run the script, until Elasticsearch tries to index the crawled data.
I checked hadoop.log inside the logs folder under Nutch and found the following errors:
Error injecting constructor, java.lang.NoSuchFieldError: STOP_WORDS_SET
Error injecting constructor, java.lang.NoClassDefFoundError: Could not initialize class org.apache.lucene.analysis.en.EnglishAnalyzer$DefaultSetHolder
If you have managed to get this working, I would very much appreciate the help.
Thanks,
Andrei.
Having never used Apache Nutch, but having briefly read about it, I suspect that your inclusion of Elasticsearch is causing a classpath collision with a different version of Lucene that is also on the classpath. Based on its Maven POM, which does not specify Lucene directly, I would suggest only including the Lucene bundled with Elasticsearch, which should be Apache Lucene 4.6.1 for your version.
Duplicated code (differing versions of the same jar) tends to be the cause of NoClassDefFoundError when you are certain that you have the necessary code. Given that you switched from Solr to Elasticsearch, it would make sense that you left the Solr jars on your classpath, which would cause the collision at hand. The current release of Solr is 4.7.0, which ships the matching Lucene 4.7.0, and that would collide with 4.6.1.
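A quick way to confirm the suspected collision is to list the Lucene and Solr jars on Nutch's runtime classpath; the path below assumes the default ant-built Nutch layout:

# Look for multiple Lucene versions or leftover Solr jars
ls runtime/local/lib | grep -i -e lucene -e solr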