I am streaming real-time data with Kinesis Data Streams into Kinesis Data Analytics, and I want to send the transformed data to DynamoDB. Currently this can be done by sending the transformed data to another stream, which triggers Lambdas that write to DynamoDB.
But I was wondering: is there a way to call Lambda directly from Kinesis Data Analytics?
Officially, Apache Flink does not provide a sink connector for DynamoDB.
You either have to build your own sink connector or use an unofficial one. You can try the following:
Flink connector Dynamodb
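If you decide to build your own, a minimal sketch of a custom sink could look like the one below. It assumes the upstream operators have already mapped each record to a DynamoDB attribute map, uses the AWS SDK v2 `DynamoDbClient`, and leaves out the batching, retries, and checkpoint integration a production sink would need; the table name is whatever you pass in.

```java
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.sink.RichSinkFunction;
import software.amazon.awssdk.services.dynamodb.DynamoDbClient;
import software.amazon.awssdk.services.dynamodb.model.AttributeValue;
import software.amazon.awssdk.services.dynamodb.model.PutItemRequest;

import java.util.Map;

// Sketch of a hand-rolled sink that writes each record as an item into a DynamoDB table.
public class DynamoDbSink extends RichSinkFunction<Map<String, AttributeValue>> {

    private final String tableName;
    private transient DynamoDbClient client;

    public DynamoDbSink(String tableName) {
        this.tableName = tableName;
    }

    @Override
    public void open(Configuration parameters) {
        // Create the client once per parallel sink instance; it is not serialized with the job graph.
        client = DynamoDbClient.create();
    }

    @Override
    public void invoke(Map<String, AttributeValue> item, Context context) {
        // One PutItem per record; a real sink would batch and handle throttling/retries.
        client.putItem(PutItemRequest.builder()
                .tableName(tableName)
                .item(item)
                .build());
    }

    @Override
    public void close() {
        if (client != null) {
            client.close();
        }
    }
}
```

It would be attached like any other sink, e.g. `stream.addSink(new DynamoDbSink("my-table"))`, where `my-table` is a placeholder table name.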
Related
I have a use case where I source data from JDBC and sink to S3. I want to be able to kick off a downstream process once the sink for a specific dataset has landed successfully. How can I do this?
Is it possible to configure Kafka Streams to use RocksDB-Cloud rather than the default RocksDB database as its storage engine? If so, is there a configuration recipe? I would like to persist data in S3 buckets instead of the local filesystem.
I have a scenario where I need to capture real-time row inserts in an Oracle table using Spark Structured Streaming.
One option is to write the Oracle rows to a Kafka topic and make that topic the source for Spark, but my client does not want to post anything to Kafka.
So is it possible at all to capture real-time inserts in an Oracle table using Spark Streaming, i.e. to make Oracle the source for Spark Streaming?
I might be interested in using Kinesis Analytics to transform some streams in real time and output them to an ES cluster.
However, in the AWS documentation I cannot see whether I will be able to manually set a custom document ID when Amazon Kinesis Data Firehose pushes documents to ES.
Could someone confirm whether this is possible?
I don't think it is possible with Firehose; Firehose auto-generates an ID for each message. You could instead stream the data to Kinesis Data Streams and use a Lambda to set the document ID.
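To sketch what that could look like: the hypothetical Lambda below is triggered by the Kinesis data stream and indexes each record under an explicit document ID via a plain HTTP PUT to `/{index}/_doc/{id}`. The endpoint URL, index name, and ID scheme (the partition key here) are assumptions, and request signing/authentication for a managed Amazon ES domain is omitted.

```java
import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;
import com.amazonaws.services.lambda.runtime.events.KinesisEvent;

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Hypothetical Lambda triggered by a Kinesis data stream that indexes each record
// into Elasticsearch with an explicit document ID.
public class KinesisToEsHandler implements RequestHandler<KinesisEvent, Void> {

    // Assumed endpoint and index; replace with your cluster's URL and index name.
    private static final String ES_ENDPOINT = "https://my-es-domain.example.com/my-index/_doc/";
    private final HttpClient http = HttpClient.newHttpClient();

    @Override
    public Void handleRequest(KinesisEvent event, Context context) {
        for (KinesisEvent.KinesisEventRecord record : event.getRecords()) {
            // Records arrive as raw bytes; assume they are UTF-8 JSON documents.
            String json = StandardCharsets.UTF_8.decode(record.getKinesis().getData()).toString();
            // Choose your own ID scheme; the partition key is used here for illustration.
            String docId = record.getKinesis().getPartitionKey();

            HttpRequest request = HttpRequest.newBuilder()
                    .uri(URI.create(ES_ENDPOINT + docId))
                    .header("Content-Type", "application/json")
                    .PUT(HttpRequest.BodyPublishers.ofString(json))
                    .build();
            try {
                http.send(request, HttpResponse.BodyHandlers.discarding());
            } catch (Exception e) {
                throw new RuntimeException("Failed to index record " + docId, e);
            }
        }
        return null;
    }
}
```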
I am saving data to BigQuery through a Google Dataflow streaming job.
I want to insert this data into Elasticsearch for rapid access.
Is it good practice to call Logstash from Dataflow over HTTP?
The Apache Beam Java SDK has a connector, ElasticsearchIO, to read from and write to Elasticsearch. This should optimize the IO so that it stays consistent with the Beam model, rather than hand-rolling HTTP calls to Logstash from a DoFn.
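As an illustration, a minimal pipeline writing JSON documents with ElasticsearchIO might look like the sketch below. The cluster address, index name, and inlined test documents are placeholders; in the actual job the PCollection would come from the existing streaming source rather than Create.

```java
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.elasticsearch.ElasticsearchIO;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.Create;
import org.apache.beam.sdk.values.PCollection;

import java.util.Arrays;

public class WriteToElasticsearch {
    public static void main(String[] args) {
        Pipeline pipeline = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());

        // Stand-in input; in the real job this would be the transformed streaming data.
        PCollection<String> jsonDocs = pipeline.apply(Create.of(Arrays.asList(
                "{\"user\":\"alice\",\"score\":42}",
                "{\"user\":\"bob\",\"score\":17}")));

        // Placeholder cluster address and index name.
        jsonDocs.apply(ElasticsearchIO.write()
                .withConnectionConfiguration(
                        ElasticsearchIO.ConnectionConfiguration.create(
                                new String[] {"http://localhost:9200"}, "my-index", "_doc")));

        pipeline.run();
    }
}
```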