How to get jobId in a Spring Job like the jobId in a Quartz Job - spring

In a Quartz Job, we can get the jobId.
But in a Spring Job I tried to get the jobId and am not able to find any similar field.
Need help.
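For reference, this is roughly what "getting the jobId" looks like on the Quartz side (a minimal sketch against the standard org.quartz API; the class name is just a placeholder). Spring's @Scheduled methods receive no comparable context object, which is presumably what the question is about.

import org.quartz.Job;
import org.quartz.JobExecutionContext;
import org.quartz.JobExecutionException;
import org.quartz.JobKey;

public class MyQuartzJob implements Job {
    @Override
    public void execute(JobExecutionContext context) throws JobExecutionException {
        // Quartz exposes the job's identity through the JobDetail key
        JobKey key = context.getJobDetail().getKey();
        String jobId = key.getName();   // e.g. "myJob"
        String group = key.getGroup();  // e.g. "DEFAULT"
    }
}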

Related

Migrating data from RDBMS to Elasticsearch using Apache NiFi

We are trying to migrate data from an RDBMS to Elasticsearch using Apache NiFi. We have created pipelines in NiFi and are able to transfer data, but are facing some issues and wanted to check if someone has already got past them.
Please provide inputs on the below items.
1. How to avoid auto-generating _id in Elasticsearch? We want this to be set from a DB column. We tried providing the column name in the "Identifier Record Path" attribute of the PutElasticsearchHttpRecord processor but got an error that the attribute name is not valid. Can you please let us know the acceptable format?
2. How to load nested objects into the index using NiFi? We want to maintain one-to-many relationships in the index using nested objects but were unable to find a configuration to do so. Are there any processors to do this in NiFi? Please let us know.
Thanks in Advance!
It needs to be a RecordPath statement like /myidfield
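For example (record and field names are hypothetical), if the records flowing into the processor look like

{ "employee_id": 42, "name": "Jane" }

then "Identifier Record Path" should be set to /employee_id rather than just employee_id.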
You need to manually create nested fields in Elasticsearch. This is not a NiFi thing, but how Elasticsearch works. If you were to post a document with cURL, you would run into the same issues.
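In other words, the nested field has to be declared in the index mapping before documents are indexed. A minimal ES 6.x sketch (index, type and field names are placeholders):

PUT /employees
{
  "mappings": {
    "employee": {
      "properties": {
        "addresses": { "type": "nested" }
      }
    }
  }
}

NiFi can then write documents whose addresses field is an array of objects, and Elasticsearch will store them as nested documents.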

Elasticsearch: check if an index is receiving read traffic

We are using Elasticsearch 6.0.
We have an index in our cluster and would like to know if there are any clients who are querying it.
Is there a way to know if an index is receiving reads (get, search, aggregation, etc.)?
If you don't have monitoring enabled on your cluster, have a look at the index stats API. You'll find a lot of metrics worth monitoring.
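For example (the index name is a placeholder), the read-related counters for a single index can be pulled with:

GET /my-index/_stats/search,get

If the search and get totals in the response don't change over time, nobody is querying that index.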
You can see how many write tasks each thread pool has completed with this command:
GET /_cat/thread_pool/write?v&h=id,node_name,active,rejected,completed
or, for get tasks:
GET /_cat/thread_pool/get?v&h=id,node_name,active,rejected,completed

No index mapper found for field [...] returning default posting formats

I'm creating a custom mapper plugin for Elasticsearch version 2.4.0, but I get this warning in the log every time I insert new data:
no index mapper found for field [....] returning default posting formats
I searched for this, but all I can find is about upgrade issues, nothing like this. Any idea what is wrong? Thanks.

Is an upsert possible with Kafka Connect to Elasticsearch?

I'm receiving events which end up in Kafka. From these events I fetch the id using a Kafka Streams application and post it back to Kafka as a pair of (id, 1) in another topic. Then I would like to see if the id already exists in Elasticsearch and, if so, update its counter; otherwise create a new record in Elasticsearch with the id from Kafka and the counter set to 1, i.e. an upsert of the record (id, 1) to ES.
I was hoping to use Kafka Connect to Elasticsearch for this, but it seems it is not that straightforward, if possible at all. I can see that adding records to ES works, but merging with existing records is something I haven't found out how to do yet. Is this possible already, and if so, how? If not, is it planned for a nearby release?
I forked the datamountaineer ES sink connector to allow upserts. With it you can specify a PK and run an update with docAsUpsert into ES. You can grab the project and compile the JAR from my GitHub fork.
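For context, this is what the upsert has to do at the Elasticsearch level (index, type, id and field names below are placeholders, and this is the raw update API rather than anything the connector emits verbatim). Writing the document with doc_as_upsert only overwrites it; actually incrementing a counter needs a scripted upsert:

POST /counters/counter/42/_update
{
  "script": {
    "source": "ctx._source.count += params.inc",
    "params": { "inc": 1 }
  },
  "upsert": { "count": 1 }
}

The script runs when the document already exists; the upsert body is indexed when it does not.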

elasticsearch-hadoop for Spark: send documents from an RDD to different indices (by day)

I work on a complex workflow using Spark (parsing, cleaning, Machine Learning ...).
At the end of the workflow I want to send aggregated results to Elasticsearch so my portal can query the data.
There will be two types of processing: streaming and the possibility to relaunch workflow on all available data.
Right now I use elasticsearch-hadoop, and in particular the Spark part, to send documents to Elasticsearch with the saveJsonToEs("myindex/mytype") method.
The target is to have one index per day, using the proper template that we built.
AFAIK elasticsearch-hadoop cannot take a field of the document into account to send it to the proper index.
What is the proper way to implement this feature?
Have a special step using Spark and _bulk so that each executor sends documents to the proper index based on the field of each line?
Is there something that I missed in elasticsearch-hadoop?
I tried to send JSON to _bulk using saveJsonToEs("_bulk"), but the pattern has to follow index/type.
Thanks to Costin Leau, I have the solution.
Simply use dynamic indexing with something like saveJsonToEs("my-index-{date}/my-type"). "date" has to be a field in the document that is being sent.
Discussion on elasticsearch google group: https://groups.google.com/forum/#!topic/elasticsearch/5-LwjQxVlhk
Documentation: http://www.elasticsearch.org/guide/en/elasticsearch/hadoop/master/spark.html#spark-write-dyn
You can use ("custom-index-{date}/customtype") to create a dynamic index. "date" can be any field in the given RDD.
If you want to format the date: ("custom-index-{date:{YYYY.mm.dd}}/customtype")
[Answer to the question asked by Amit_Hora in the comments; as I don't have enough privilege to comment, I am adding this here]
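A minimal sketch of the dynamic-index write from Java (cluster address, index name, type and document contents are placeholders; the Scala saveJsonToEs calls above behave the same way):

import java.util.Arrays;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.elasticsearch.spark.rdd.api.java.JavaEsSpark;

public class DailyIndexWriter {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf()
                .setAppName("daily-index-writer")
                .set("es.nodes", "localhost:9200"); // point at your cluster
        JavaSparkContext sc = new JavaSparkContext(conf);

        // Each JSON document carries the field used to resolve the target index.
        JavaRDD<String> docs = sc.parallelize(Arrays.asList(
                "{\"date\":\"2018-03-01\",\"value\":1}",
                "{\"date\":\"2018-03-02\",\"value\":2}"));

        // "{date}" is resolved per document, so each one lands in my-index-<its date>/my-type.
        JavaEsSpark.saveJsonToEs(docs, "my-index-{date}/my-type");

        sc.stop();
    }
}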
