We are going to deploy Elasticsearch in a VM and configure our Logstash output to point to it. We don't plan to run a multi-node cluster or host Elasticsearch in the cloud, but we would like the option to fall back to an Elasticsearch service running locally on the same system if the connection to the VM-hosted Elasticsearch fails.
Is it possible to configure Logstash in any way to provide such a fallback when the connection to Elasticsearch is unavailable?
We use version 5.6.5 of both Logstash and Elasticsearch. Thanks!
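For reference, the elasticsearch output plugin's hosts option accepts a list of endpoints and skips hosts it cannot reach, which gives a degree of failover, although it load-balances across all reachable hosts rather than treating one as a strict primary. A minimal sketch, with placeholder addresses for the VM-hosted and local nodes:

output {
  elasticsearch {
    # "vm-es-host" is a placeholder for the VM-hosted node; localhost is the local fallback.
    # The plugin round-robins bulk requests across reachable hosts and marks
    # unreachable ones as dead, retrying them periodically.
    hosts => ["http://vm-es-host:9200", "http://localhost:9200"]
    index => "logstash-%{+YYYY.MM.dd}"
  }
}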
Related
I want to connect Confluent Cloud to Elasticsearch (in a local environment). Is it possible to connect the local Elasticsearch to Confluent Cloud Kafka?
Thanks,
Bala
Yes; a local instance of Kafka Connect with the Elasticsearch sink connector can be installed and can consume from Confluent Cloud (or any Kafka cluster).
A connector running in the cloud would be unlikely to be able to connect and write to your local instance without port forwarding on your router, which is why you should run the connector locally, consuming from the remote cluster and writing to Elasticsearch locally.
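As a rough sketch of what that could look like (the bootstrap endpoint, API key/secret, and topic name below are placeholders), the local Connect worker points at the Confluent Cloud cluster over SASL_SSL, while the sink connector writes to the local Elasticsearch:

# Worker properties (local Kafka Connect, consuming from Confluent Cloud)
bootstrap.servers=<CLOUD_BOOTSTRAP_ENDPOINT>:9092
security.protocol=SASL_SSL
sasl.mechanism=PLAIN
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="<API_KEY>" password="<API_SECRET>";
# The consumers used by sink connectors need the same credentials
consumer.security.protocol=SASL_SSL
consumer.sasl.mechanism=PLAIN
consumer.sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="<API_KEY>" password="<API_SECRET>";

# Connector properties (Elasticsearch sink writing to the local instance)
connector.class=io.confluent.connect.elasticsearch.ElasticsearchSinkConnector
topics=<your-topic>
connection.url=http://localhost:9200
type.name=logdata
key.ignore=true
schema.ignore=true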
We have an application that uses Spring Data Elasticsearch (version 3.1.20.RELEASE) with the transport client to connect to Elasticsearch 6.1.2.
Now we have to use the same application to connect to Elasticsearch 6.0.1, which is AWS-managed ES. The problem is that it doesn't seem to expose the transport port (9300) to clients.
If we move to a higher version of Spring Data Elasticsearch, it doesn't seem to support a 6.0.x Elasticsearch cluster, and the current and lower versions of Spring Data Elasticsearch don't seem to support the REST clients.
We cannot upgrade our ES cluster version. So either we have to find a way to connect to AWS with the transport client, or we have to make our application work with the REST client. How can we solve this?
AWS does not expose the transport port or protocol, so you must use the REST protocol. For Spring Data Elasticsearch that means at least version 3.2, but that in turn requires Elasticsearch 6.8. So the only way to keep using Spring Data Elasticsearch is to upgrade your ES cluster.
The other solution is to implement access to the cluster with Elasticsearch's RestClient and RestHighLevelClient instead of Spring Data Elasticsearch.
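A minimal sketch of the second option, assuming a placeholder AWS ES endpoint and an access policy that allows unsigned HTTPS requests (otherwise each request must be SigV4-signed):

import org.apache.http.HttpHost;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.action.index.IndexResponse;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.common.xcontent.XContentType;

public class RestClientExample {
    public static void main(String[] args) throws Exception {
        // AWS-managed ES only exposes HTTPS (port 443); the endpoint below is a placeholder.
        try (RestHighLevelClient client = new RestHighLevelClient(
                RestClient.builder(new HttpHost("my-domain.us-east-1.es.amazonaws.com", 443, "https")))) {

            IndexRequest request = new IndexRequest("myindex", "mytype", "1")
                    .source("{\"field\":\"value\"}", XContentType.JSON);

            // The 6.0.x/6.1.x high-level client accepts optional headers here;
            // from 6.4 onwards the call takes RequestOptions.DEFAULT instead.
            IndexResponse response = client.index(request);
            System.out.println(response.getResult());
        }
    }
}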
I have a topic with 7 million records (3 partitions) and deploy an Elasticsearch sink with 1 task using mostly the default configurations. The sink starts by creating the index in Elasticsearch and then starts writing at a rate of 10,000 msgs/second. If I make any changes to the connector's tasks, such as:
pause the connector, restart the task, start the connector
leave connector running but restart the task
The throughput drops to 400 msgs/second and never recovers to the original 10,000/sec.
If I stop the connector, delete the index from Elasticsearch, and resume the connector, it goes back to sinking 10k messages/sec.
I've tried changing the connector configs away from the defaults with no results.
connection.timeout.ms=1000
batch.size=2000
max.retries=5
max.in.flight.requests=5
retry.backoff.ms=100
max.buffered.records=20000
flush.timeout.ms=10000
read.timeout.ms=3000
My connector config
connector.class=io.confluent.connect.elasticsearch.ElasticsearchSinkConnector
type.name=logdata
errors.log.include.messages=true
tasks.max=1
topics=d8.qa.id.log.sso.transformed.0
key.ignore=true
schema.ignore=true
value.converter.schemas.enable=false
elastic.security.protocol=PLAINTEXT
name=elasticsearch-sink-d8.qa.id.log.transformed
connection.url=http://172.30.2.23:9200,http://172.30.0.158:9200,http://172.30.1.63:9200
client.id=elasticsearch-sink-d8.qa.id.log.transformed
Environment Details
Elasticsearch 6.8 (10 data nodes, 3 master)
Elasticsearch connector (version 2.2.1)
Kafka Connect (2 workers with 16GB memory, version 2.2.1)
Kafka Broker (3 brokers with 32GB memory, version 2.2.1)
NOTES:
Same behaviour with ES 7.2 and Elasticsearch connector version 2.3.1
This is the only connector deployed to the Connect cluster.
This is a known issue in Confluent Platform 5.3.x and below, caused by the index not being cached when it isn't created by JestElasticsearchClient. The fixes PR-340 and PR-309 have been merged and will ship with Confluent Platform 5.4.
There are a lot of examples of the SMACK stack, but in my infrastructure I would like to use ElasticSearch with Confluent Kafka Connect and Kafka Streams.
There is a great tutorial on deploying a CloudFormation-based SMACK stack environment and another on creating an IoT pipeline with SMACK.
Since I am working on a Lambda architecture, I am starting with my batch data using ElasticSearch (not Cassandra) and would like to know if there are CloudFormation templates that use Kafka Connect and ElasticSearch. Eventually we want to use Kafka Streams with InfluxDB as well.
DC/OS has AWS CloudFormation templates and install instructions. Once you have DC/OS installed, you can install ElasticSearch and Kafka from the Mesosphere Universe as DC/OS packages.
I am using the Camel ElasticSearch component: http://camel.apache.org/elasticsearch.html
My assumption, based on the docs, is that the Elasticsearch server must be on the same network as the running Camel route in order for this to work. Is this correct?
To clarify, the only connection property available is 'clustername'. I assume the cluster is discovered by searching the network for it via multicast.
My code needs to connect to a remote service. Is this just not possible?
I am fairly new to Elasticsearch in general.
I had a similar problem with the autodiscovery of Elasticsearch. I had a Camel route that tried to index some exchanges, but the cluster was located in another subnet and thus was not discovered.
With the Java API of ES it is possible to connect to a remote cluster with a TransportClient by specifying an IP address. I don't have access to the code at the moment, but the Java API section of the ES documentation provides clean example code. You could make such a connection from within a bean in the route, for example.
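For illustration, a sketch of that approach with the pre-2.x TransportClient API, which matches the Camel 2.12 timeframe (the cluster name and IP address are placeholders):

import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.ImmutableSettings;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.transport.InetSocketTransportAddress;

public class RemoteEsBean {
    public void index(String json) {
        // Cluster name and address are placeholders for the remote cluster.
        Settings settings = ImmutableSettings.settingsBuilder()
                .put("cluster.name", "my-remote-cluster")
                .build();
        TransportClient client = new TransportClient(settings)
                .addTransportAddress(new InetSocketTransportAddress("10.0.1.50", 9300));
        try {
            // Index the exchange body into a placeholder index/type.
            client.prepareIndex("myindex", "mytype")
                  .setSource(json)
                  .execute()
                  .actionGet();
        } finally {
            client.close();
        }
    }
}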
I also submitted a patch to Camel to add an IP parameter to the route, which should then connect to the remote cluster with such a TransportClient. The documentation states that this should be available with Camel 2.12.
Hope this helps.