Embedded Debezium doesn't capture changes - spring-boot

I'm running an embedded Debezium (1.2.0) in a Spring application, but it only captures changes at startup.
My setup looks like this:
final Properties props = new Properties();
props.setProperty("name", "engine");
props.setProperty("connector.class", "io.debezium.connector.sqlserver.SqlServerConnector");
props.setProperty("offset.storage", "org.apache.kafka.connect.storage.FileOffsetBackingStore");
props.setProperty("offset.storage.file.filename", "/tmp/offsets.dat");
props.setProperty("offset.flush.interval.ms", "60000");
/* begin connector properties */
props.setProperty("database.hostname", "xxxx");
props.setProperty("database.port", "xxxx");
props.setProperty("database.user", "xxxx");
props.setProperty("database.password", "xxxx");
props.setProperty("database.server.id", "xxxx");
props.setProperty("database.server.name", "xxxx");
props.setProperty("database.dbname", "xxxx");
props.setProperty("database.history", "io.debezium.relational.history.FileDatabaseHistory");
props.setProperty("database.history.file.filename", "~logs/dbhistory.dat");
props.setProperty("snapshot.lock.timeout.ms", "-1");
try (DebeziumEngine<ChangeEvent<String, String>> engine = DebeziumEngine.create(Json.class)
        .using(props)
        .notifying(this::handleEvent)
        .build()) {
    // Run the engine asynchronously ...
    ExecutorService executor = Executors.newSingleThreadExecutor();
    executor.execute(engine);
    // Do something else or wait for a signal or an event
} catch (IOException | InterruptedException e) {
    logger.error("Unable to start debezium " + e);
}

private void handleEvent(ChangeEvent<String, String> changeEvent) {
    logger.info(changeEvent.toString());
}
When I boot the application it captures the latest changes, but then the log ends with:
INFO i.d.p.ChangeEventSourceCoordinator - Finished streaming
INFO i.d.p.m.StreamingChangeEventSourceMetrics - Connected metrics set to 'false'
After that, no subsequent changes are captured until the next application restart.
No errors are thrown

You must not leave the try block, because that closes the engine and stops the streaming. So either put some waiting logic in place of the comment // Do something else or wait for a signal or an event, or don't place the engine in a try-with-resources block that closes it automatically.
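For example, here is a minimal sketch of the second option in a Spring bean: keep the engine as a field and only close it on shutdown. The props field and the lifecycle annotations are assumptions for illustration, not part of the original question.

private final ExecutorService executor = Executors.newSingleThreadExecutor();
private DebeziumEngine<ChangeEvent<String, String>> engine;

@PostConstruct
public void start() {
    // Build the engine but do NOT wrap it in try-with-resources,
    // so it keeps streaming after this method returns.
    engine = DebeziumEngine.create(Json.class)
            .using(props) // assumption: props is built elsewhere and kept as a field
            .notifying(this::handleEvent)
            .build();
    executor.execute(engine);
}

@PreDestroy
public void stop() throws IOException {
    // Close the engine only when the application shuts down.
    if (engine != null) {
        engine.close();
    }
    executor.shutdown();
}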

Related

Need to execute code before Ignite cache ExpiryPolicy

I am using an Ignite cache with ModifiedExpiryPolicy and need to execute a line of code before the expiry event is processed. Any help?
IgniteCache<String, Object> expiresCache =
    cache.withExpiryPolicy(new ModifiedExpiryPolicy(new Duration(Time.MINUTES, timeInMins)));

public class ClassName {
    public IgnitePredicate<CacheEvent> functionName() {
        return new IgnitePredicate<CacheEvent>() {
            @Override
            public boolean apply(CacheEvent evt) {
                // code to be executed after the event.
                return true;
            }
        };
    }
}
I think you need to use Ignite events to listen for expiry events.
Ignite ignite = Ignition.ignite();

// Local listener that listens to local events.
IgnitePredicate<CacheEvent> locLsnr = evt -> {
    System.out.println("Received expiry event [evt=" + evt.name() + ", key=" + evt.key() + ']');
    return true; // Continue listening.
};

// Subscribe to the specified cache events occurring on the local node.
ignite.events().localListen(locLsnr, EventType.EVT_CACHE_OBJECT_EXPIRED);
Note that this is just a local (node) listener; you'll need a remote listener to catch expiry events on remote nodes. You'll also need to configure includeEventTypes in your configuration file (events are disabled by default for performance reasons).
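For illustration, a rough sketch of both pieces, assuming programmatic (rather than XML) configuration; the IgniteConfiguration setup and the remoteListen lambdas below are assumptions to adapt to your own setup:

// Expiry events are disabled by default; enable them in the node configuration.
IgniteConfiguration cfg = new IgniteConfiguration();
cfg.setIncludeEventTypes(EventType.EVT_CACHE_OBJECT_EXPIRED);
Ignite ignite = Ignition.start(cfg);

// Remote listen: the filter runs on every node where the event occurs,
// and matching events are delivered back to this local listener.
ignite.events().remoteListen(
    (UUID nodeId, CacheEvent evt) -> {
        System.out.println("Expired on node " + nodeId + ": key=" + evt.key());
        return true; // keep listening
    },
    evt -> true,     // remote filter: accept all expiry events
    EventType.EVT_CACHE_OBJECT_EXPIRED);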

New KafkaSpout Issue in Apache Storm

I have a basic topology that includes a Kafka spout and Kafka bolts.
When I submit my topology I get this error in the Storm UI:
Unable to get offset lags for kafka. Reason: org.apache.kafka.shaded.common.errors.InvalidTopicException: Topic '[enrich-topic]' is invalid
I checked that enrich-topic exists, so there is no problem there.
TopologyBuilder streamTopologyBuilder = new TopologyBuilder();

KafkaSpoutRetryService kafkaSpoutRetryService = new KafkaSpoutRetryExponentialBackoff(
        KafkaSpoutRetryExponentialBackoff.TimeInterval.microSeconds(500),
        KafkaSpoutRetryExponentialBackoff.TimeInterval.milliSeconds(2),
        Integer.MAX_VALUE,
        KafkaSpoutRetryExponentialBackoff.TimeInterval.seconds(10));

KafkaSpoutConfig spoutConf = KafkaSpoutConfig.builder(configProvider.getBootstrapServers(), configProvider.getSpoutTopic())
        .setGroupId("consumerGroupId")
        .setOffsetCommitPeriodMs(10_000)
        .setFirstPollOffsetStrategy(UNCOMMITTED_LATEST)
        .setMaxUncommittedOffsets(1000000)
        .setRetry(kafkaSpoutRetryService)
        .build();

KafkaSpout kafkaSpout = new KafkaSpout(spoutConf);
streamTopologyBuilder.setSpout("kafkaSpout", kafkaSpout, 1);

KafkaWriterBolt2 kafkaWriterBolt2 = null;
try {
    kafkaWriterBolt2 = new KafkaWriterBolt2(configProvider.getBootstrapServers(), configProvider.getStreamKafkaWriterTopicName());
} catch (IOException e) {
    e.printStackTrace();
}
streamTopologyBuilder.setBolt("kafkaWriterBolt2", kafkaWriterBolt2, 1).setNumTasks(1)
        .shuffleGrouping("kafkaSpout");
KafkaWriterBolt2 is my own class, which extends BaseRichBolt.

How to know which queue is allocated to which consumer - RocketMQ?

Consumer queues are allocated on the client side; the broker knows nothing about this.
So how can we monitor which queue is allocated to which consumer client?
Though there is no existing command, you can find out which client owns each message queue of a consumer group using the provided admin infrastructure. Here is a snippet that achieves this:
private Map<MessageQueue, String> getClientConnection(DefaultMQAdminExt defaultMQAdminExt, String groupName) {
    Map<MessageQueue, String> results = new HashMap<MessageQueue, String>();
    try {
        ConsumerConnection consumerConnection = defaultMQAdminExt.examineConsumerConnectionInfo(groupName);
        for (Connection connection : consumerConnection.getConnectionSet()) {
            String clientId = connection.getClientId();
            ConsumerRunningInfo consumerRunningInfo = defaultMQAdminExt.getConsumerRunningInfo(groupName, clientId, false);
            for (MessageQueue messageQueue : consumerRunningInfo.getMqTable().keySet()) {
                results.put(messageQueue, clientId + " " + connection.getClientAddr());
            }
        }
    } catch (Exception e) {
        // ignore and return whatever was collected so far
    }
    return results;
}
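A rough usage sketch of the method above; the name server address and consumer group name are placeholders, and the surrounding method is assumed to declare throws Exception since admin.start() can throw:

DefaultMQAdminExt admin = new DefaultMQAdminExt();
admin.setNamesrvAddr("localhost:9876"); // placeholder name server address
admin.start();
try {
    getClientConnection(admin, "my-consumer-group") // placeholder group name
            .forEach((queue, client) -> System.out.println(queue + " -> " + client));
} finally {
    admin.shutdown();
}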
In case you have not used the RocketMQ-Console project, please try and run it: https://github.com/rocketmq/rocketmq-console-ng
In the Consumer tab, click the "consumer detail" button and you will see the message queue allocation result visually, as below:
(Screenshot: message queue allocation result)

How can I know when the Amazon MapReduce task is complete?

I am trying to run a MapReduce task on Amazon EC2.
I set all the configuration params and then call the runJobFlow method of the AmazonElasticMapReduce service.
I wonder whether there is any way to know whether the job has completed and what its status was.
(I need to know when I can pick up the MapReduce results from S3 for further processing.)
Currently the code just keeps executing, because the call to runJobFlow is non-blocking.
public void startMapReduceTask(String accessKey, String secretKey,
        String eC2KeyPairName, String endPointURL, String jobName,
        int numInstances, String instanceType, String placement,
        String logDirName, String bucketName, String pigScriptName) {

    log.info("Start running MapReduce");

    // config.set
    ClientConfiguration config = new ClientConfiguration();
    AWSCredentials credentials = new BasicAWSCredentials(accessKey, secretKey);
    AmazonElasticMapReduce service = new AmazonElasticMapReduceClient(credentials, config);
    service.setEndpoint(endPointURL);

    JobFlowInstancesConfig conf = new JobFlowInstancesConfig();
    conf.setEc2KeyName(eC2KeyPairName);
    conf.setInstanceCount(numInstances);
    conf.setKeepJobFlowAliveWhenNoSteps(true);
    conf.setMasterInstanceType(instanceType);
    conf.setPlacement(new PlacementType(placement));
    conf.setSlaveInstanceType(instanceType);

    StepFactory stepFactory = new StepFactory();
    StepConfig enableDebugging = new StepConfig()
            .withName("Enable Debugging")
            .withActionOnFailure("TERMINATE_JOB_FLOW")
            .withHadoopJarStep(stepFactory.newEnableDebuggingStep());
    StepConfig installPig = new StepConfig()
            .withName("Install Pig")
            .withActionOnFailure("TERMINATE_JOB_FLOW")
            .withHadoopJarStep(stepFactory.newInstallPigStep());
    StepConfig runPigScript = new StepConfig()
            .withName("Run Pig Script")
            .withActionOnFailure("TERMINATE_JOB_FLOW")
            .withHadoopJarStep(stepFactory.newRunPigScriptStep("s3://" + bucketName + "/" + pigScriptName, ""));

    RunJobFlowRequest request = new RunJobFlowRequest(jobName, conf)
            .withSteps(enableDebugging, installPig, runPigScript)
            .withLogUri("s3n://" + bucketName + "/" + logDirName);

    try {
        RunJobFlowResult res = service.runJobFlow(request);
        log.info("Mapreduce job with id[" + res.getJobFlowId() + "] completed successfully");
    } catch (Exception e) {
        log.error("Caught Exception: ", e);
    }
    log.info("End running MapReduce");
}
thanks,
aviad
From the AWS documentation:
Once the job flow completes, the cluster is stopped and the HDFS partition is lost. To prevent loss of data, configure the last step of the job flow to store results in Amazon S3.
It goes on to say:
If the JobFlowInstancesDetail : KeepJobFlowAliveWhenNoSteps parameter is set to TRUE, the job flow will transition to the WAITING state rather than shutting down once the steps have completed.
A maximum of 256 steps are allowed in each job flow.
For long running job flows, we recommended that you periodically store your results.
So it looks like there is no way of knowing when it is done. Instead you need to save your data as part of the job.

Programmatically bridge a QueueChannel to a MessageChannel in Spring

I'm attempting to wire a queue to the front of a MessageChannel, and I need to do so programmatically so it can be done at run time in response to an osgi:listener being triggered. So far I've got:
public void addService(MessageChannel mc, Map<String, Object> properties)
{
    //Create the queue and the QueueChannel
    BlockingQueue<Message<?>> q = new LinkedBlockingQueue<Message<?>>();
    QueueChannel qc = new QueueChannel(q);

    //Create the Bridge and set the output to the input parameter channel
    BridgeHandler b = new BridgeHandler();
    b.setOutputChannel(mc);

    //Presumably, I need something here to poll the QueueChannel
    //and drop it onto the bridge. This is where I get lost
}
Looking through the various relevant classes, I came up with:
PollerMetadata pm = new PollerMetadata();
pm.setTrigger(new IntervalTrigger(10));
PollingConsumer pc = new PollingConsumer(qc, b);
but I'm not able to put it all together. What am I missing?
So, the solution that ended up working for me was:
public void addEngineService(MessageChannel mc, Map<String, Object> properties)
{
    //Create the queue and the QueueChannel
    BlockingQueue<Message<?>> q = new LinkedBlockingQueue<Message<?>>();
    QueueChannel qc = new QueueChannel(q);

    //Create the Bridge and set the output to the input parameter channel
    BridgeHandler b = new BridgeHandler();
    b.setOutputChannel(mc);

    //Set up a PollingConsumer to poll the queue channel and
    //retrieve 1 thing at a time
    PollingConsumer pc = new PollingConsumer(qc, b);
    pc.setMaxMessagesPerPoll(1);

    //Now use an interval trigger to poll every 10 ms and attach it
    IntervalTrigger trig = new IntervalTrigger(10, TimeUnit.MILLISECONDS);
    trig.setInitialDelay(0);
    trig.setFixedRate(true);
    pc.setTrigger(trig);

    //Now set a task scheduler and start it
    pc.setTaskScheduler(taskSched);
    pc.setAutoStartup(true);
    pc.start();
}
I'm not entirely sure whether all of the above is strictly needed, but neither the trigger nor the task scheduler alone worked; I did appear to need both. I should also note that the taskSched used was the default taskScheduler bean injected from Spring via
<property name="taskSched" ref="taskScheduler"/>
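For completeness, a minimal sketch of the setter that such an injection would rely on; the field and setter names simply mirror the property name above and are otherwise an assumption:

private TaskScheduler taskSched;

// Called by Spring when wiring <property name="taskSched" ref="taskScheduler"/>
public void setTaskSched(TaskScheduler taskSched) {
    this.taskSched = taskSched;
}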
