Time-sensitive Work Node Disaster Recovery and Status Synchronization - go

My project has one main node and one worker node; the worker node executes jobs fetched from the main node (most jobs are time-sensitive, e.g. send an email after three hours).
We expect the worker node to be stateless: it registers with the main node and fetches some jobs to do, and when a job is done it sends a finish signal. The main node checks the worker node's health every once in a while; if the worker node has died, it sets that worker's jobs back to unfetched.
The catch is that we chose HTTP to connect the main and worker nodes, so the main node only gets information when the worker node sends a KEEPALIVE request. This causes some confusion in disaster recovery and status synchronization.
I'd like to know whether this is good practice, and what the best way to do it would be.
Thanks in advance.
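Purely as an illustration of the setup described above (not the project's actual code), here is a minimal Go sketch of the main-node side of the lease/heartbeat idea. The Job and Registry types, the /keepalive path, and the 30-second timeout are all made up for the example, and a real system would keep the job table in a database rather than in memory.

```go
package main

import (
	"net/http"
	"sync"
	"time"
)

// Job is a unit of work handed to a worker. Fetched holds the ID of the
// worker currently leasing the job; "" means unfetched.
type Job struct {
	ID      string
	RunAt   time.Time
	Fetched string
	Done    bool
}

// Registry is the main node's in-memory view of workers and jobs.
type Registry struct {
	mu       sync.Mutex
	lastSeen map[string]time.Time // workerID -> time of last KEEPALIVE
	jobs     map[string]*Job
}

// KeepAlive is the HTTP handler workers call periodically.
func (r *Registry) KeepAlive(w http.ResponseWriter, req *http.Request) {
	workerID := req.URL.Query().Get("worker")
	r.mu.Lock()
	r.lastSeen[workerID] = time.Now()
	r.mu.Unlock()
	w.WriteHeader(http.StatusOK)
}

// Reap periodically returns jobs leased by silent workers to the
// unfetched state so another worker can pick them up.
func (r *Registry) Reap(timeout time.Duration) {
	for range time.Tick(timeout / 2) {
		now := time.Now()
		r.mu.Lock()
		for id, seen := range r.lastSeen {
			if now.Sub(seen) > timeout {
				delete(r.lastSeen, id)
				for _, j := range r.jobs {
					if j.Fetched == id && !j.Done {
						j.Fetched = "" // back to unfetched
					}
				}
			}
		}
		r.mu.Unlock()
	}
}

func main() {
	reg := &Registry{lastSeen: map[string]time.Time{}, jobs: map[string]*Job{}}
	go reg.Reap(30 * time.Second)
	http.HandleFunc("/keepalive", reg.KeepAlive)
	http.ListenAndServe(":8080", nil)
}
```

The design point this is meant to show is that jobs are leased rather than owned: a missed heartbeat only returns the lease. The finish signal should also be idempotent, because a worker that was merely slow (not dead) may still complete and report a job that has already been handed to someone else.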

Related

Spring Batch - restart behavior upon worker crash

I've been exploring how Spring Batch works in certain failure cases when remote partitioning is used.
Let's say I have 3 worker nodes and 1 manager node. The manager node creates 30 partitions that the workers can pick up. The messaging layer is Kafka.
The workers are up, waiting for work to arrive on the specific topic. The manager node creates the partitions, puts them into the DB and sends the messages on the Kafka topic which has 3 partitions.
All nodes have started processing, but suddenly one node crashes. The crashed node leaves its step execution states set to STARTED/STARTING for the partitions it initially picked up.
Another node will come to the rescue, since the Kafka partitions get revoked and reassigned, so one of the two remaining nodes will read the partition the crashed node did.
In this case nothing will happen, of course, because the original Kafka offset was committed by the crashed node even though the processing hadn't finished. So let's say that when partitions get reassigned, I set the consumer back to the beginning of the topic, for the partitions it manages.
Awesome, this way the consumer will start consuming the messages from the crashed node's partition.
And here's the catch. Even though some of the step executions the crashed node processed reached COMPLETED state, the node that took over will reprocess those step executions once more, even though they were already finished by the crashed node.
This seems strange to me.
Maybe I'm trying to solve this the wrong way, not sure but I appreciate any suggestions how to make the workers fault-tolerant for crashes.
Thanks!
If a StepExecution is marked as COMPLETED in the job repository, it will not be reprocessed. No data will be run again. A new StepExecution may be created (I don't have the code in front of me right now) but when Spring Batch evaluates what to do based on the previous run, it won't process it again. That's a key feature of how Spring Batch's partitioning works. You can send the workers 100 messages to process each partition, but it will only actually get processed once due to the synchronization in the job repository. If you are seeing other behavior, we would need more information (details from your job repository and configuration specifics).
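Stated language-agnostically, the guarantee above boils down to "consult the shared repository before doing any work, and skip anything already COMPLETED". A rough Go sketch of that check follows, with entirely hypothetical Repo and Status types; in Spring Batch the job repository performs this role for you.

```go
package worker

import "errors"

// Status mirrors the kind of state a shared job repository records.
type Status string

const (
	Started   Status = "STARTED"
	Completed Status = "COMPLETED"
)

// Repo is a stand-in for the shared repository (normally a database).
type Repo interface {
	Status(partitionID string) (Status, error)
	MarkCompleted(partitionID string) error
}

var ErrAlreadyDone = errors.New("partition already completed")

// ProcessOnce re-checks the shared store before doing any work, so a
// redelivered message for a COMPLETED partition becomes a no-op.
func ProcessOnce(repo Repo, partitionID string, work func() error) error {
	status, err := repo.Status(partitionID)
	if err != nil {
		return err
	}
	if status == Completed {
		return ErrAlreadyDone // safe to acknowledge and move on
	}
	if err := work(); err != nil {
		return err
	}
	return repo.MarkCompleted(partitionID)
}
```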

Execute a method only once at start of an Apache Storm topology

Suppose I have a simple Apache Storm topology with a spout (set to a parallelism of 2) running on two separate nodes. How can I write a method that will be run once, and only once, at the start of the topology, before any processing of tuples has begun?
Any implementation of a singleton/static class, or a synchronized method alone, will not work, as the two instances are running on separate nodes.
Perhaps there are some Storm methods that I can use to decide if I'm the first Spout to be instantiated, and run only then? I tried playing around with the getThisTaskId() & getThisWorkerTasks() methods, but was unsuccessful.
NOTE: The parallelism of 2 is to keep things simple. A solution should work for any number of nodes/workers.
Edit: Thought of an easier solution. I'll leave the original answer below in case it is helpful.
You can use TopologyContext.getThisTaskIndex() to do this. If you make your spout's open method run the code only when TopologyContext.getThisTaskIndex() == 0, your code will run only once, before any tuples are emitted.
If the worker that ran this code crashes, the code will be run again when the spout instance with task index 0 restarts. To fix this, you can use Zookeeper to store state that should carry over across restarts: put a flag in Zookeeper once the only-once code has run, and have the spout's open method check that the flag is not set before running the code.
You can use TopologyContext.getStormId() to get a constant unique string identifying the topology, so you can tell whether the flag was set by this topology or by a previous deployment.
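The flag-in-Zookeeper idea is not tied to any particular language; to stay consistent with the Go question at the top of this thread, here is a rough sketch using the github.com/go-zookeeper/zk client (inside a Storm spout you would do the equivalent from Java in open, keyed by getStormId). The /once path and the 5-second session timeout are made up for the example.

```go
package main

import (
	"time"

	"github.com/go-zookeeper/zk"
)

// runOncePerDeployment creates /once/<deploymentID> as a persistent znode.
// Only the caller that actually creates the node runs initFn; everyone
// else (including the same instance after a crash and restart) sees
// ErrNodeExists and skips it.
func runOncePerDeployment(servers []string, deploymentID string, initFn func() error) error {
	conn, _, err := zk.Connect(servers, 5*time.Second)
	if err != nil {
		return err
	}
	defer conn.Close()

	// Ensure the parent node exists; ignore "already exists".
	if _, err := conn.Create("/once", nil, 0, zk.WorldACL(zk.PermAll)); err != nil && err != zk.ErrNodeExists {
		return err
	}

	_, err = conn.Create("/once/"+deploymentID, nil, 0, zk.WorldACL(zk.PermAll))
	switch err {
	case nil:
		return initFn() // we won the race: run the one-time code
	case zk.ErrNodeExists:
		return nil // someone already ran (or started) it
	default:
		return err
	}
}
```

The caveat from the answer still applies: if the process crashes after creating the flag but before initFn finishes, the work is never completed, so for anything critical the flag should only be written once initFn has succeeded (at the cost of a possible double run).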
Original answer:
The easiest way to run some code only once on deployment of a topology is to call the code when you submit the topology. You can call the only-once code at the same time as you wire your topology with TopologyBuilder. This will only get run once. The downside is that it runs on the machine you call storm jar from.
If you for some reason can't do it this way, or need to run the code from one of the worker nodes, there isn't anything built into Storm to allow you to do this. The reason there isn't such a mechanism is that it requires extra coordination between the worker JVMs, and I don't think anyone has needed something like this.
The best option for you would probably be to look at Zookeeper/Curator to do this coordination (see https://curator.apache.org/curator-recipes/index.html). This should allow you to make only one worker in the cluster run your code. You'll have to consider what should happen if the worker chosen to run your code crashes/stalls.
Storm already uses Zookeeper for coordination, so you can just connect to that cluster.

NiFi - Stop upon failure

I've been trying to Google and search Stack Overflow for the answer but have been unable to find one.
Using NiFi, is it possible to stop a process upon previous job failure?
We have user data we need to process, but the data is sequentially constructed, so if a job fails we need to stop further jobs from running.
I understand we can create scripts to fail a process upon a previous process failure, but what if I need the entire group to halt upon failure? Is this possible? We don't want each job in the queue to follow the failure path; we want it to halt until we can look at the data and analyze the failure.
TL;DR: can we STOP a process upon a failure, rather than just funnel all remaining jobs into the failure flow? We want the data in the queues to wait until we fix the issue, hence stopping the process, not just failing again and again.
Thanks for any feedback, cheers!
Edit: typos
You can configure backpressure on the queues to stop upstream processes. If you set the backpressure threshold to 1 on a failure queue, it would effectively stop the processor until you had a chance to address the failure.
A typical setup routes the failure relationship back to the same processor, but this is not required. What matters is that the next processor does not remove flowfiles from that queue, so the backpressure is maintained until you take action.

How does Quartz detect node failures?

My production environment runs a Java scheduled job using Quartz 2.1.4 on a WebLogic cluster with 4 machines, and a single scheduled job normally executes on one cluster node (node 1). This ran fine for a few months, but last night node 2 suddenly decided that node 1 had failed and took over the executing job. In fact, node 1 showed no errors (according to the server, network, database, and application logs), and the event caused duplicate messages to be created because two processes executed concurrently.
What mechanism does Quartz use to detect node failure? A ping scan, a heartbeat via UCP broadcast, database response time, or something else? Is there any configuration for it?
I have read the Quartz configuration guide at http://quartz-scheduler.org/documentation/quartz-2.1.x/configuration/ConfigJDBCJobStoreClustering, but it does not answer this.
I am using JDBCJobStore. After detailed checking, we found a database (Oracle) statement that executed abnormally long (from 5 sec to 30 sec). The incident happened during that period of time. Do you think it is related?
My configuration is:

```
org.quartz.threadPool.threadCount=10
org.quartz.threadPool.threadPriority=5
org.quartz.jobStore.misfireThreshold=10000
org.quartz.jobStore.class=org.quartz.impl.jdbcjobstore.JobStoreTX
```
Anyone have this information? Thanks.
I know the answer is very late, but maybe somebody in the same situation as both of us will still need it.
Short version: it is all handled through the DB. The important property is org.quartz.jobStore.clusterCheckinInterval.
Long version (all credit goes to http://flylib.com/books/en/2.65.1.91/1/):
Detecting Failed Scheduler Nodes
When a Scheduler instance performs the check-in routine, it looks to see if there are other Scheduler instances that didn't check in when they were supposed to. It does this by inspecting the SCHEDULER_STATE table and looking for schedulers that have a value in the LAST_CHECK_TIME column that is older than the property org.quartz.jobStore.clusterCheckinInterval (discussed in the next section). If one or more nodes haven't checked in, the running Scheduler assumes that the other instance(s) have failed.
Additionally the next paragraph might also be important:
Running Nodes on Separate Machines with Unsynchronized Clocks
As you can ascertain by now, if you run nodes on different machines and the clocks are not synchronized, you can get unexpected results. This is because a timestamp is being used to inform other instances of the last time one node checked in. If that node's clock was set for the future, a running Scheduler might never realize that a node has gone down. On the other hand, if a clock on one node is set in the past, a node might assume that the node has gone down and attempt to take over and rerun its jobs. In either case, it's not the behavior that you want. When you're using different machines in a cluster (which is the normal case), be sure to synchronize the clocks. See the section "Quartz Clustering Cookbook," later in this chapter for details on how to do this.
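For illustration only, the detection rule quoted above amounts to something like the following, sketched in Go with database/sql. The table and column names approximate the standard QRTZ_SCHEDULER_STATE schema, but this is not Quartz's actual implementation, just the shape of the comparison it performs during check-in.

```go
package main

import (
	"database/sql"
	"time"
)

type deadNode struct {
	InstanceName    string
	LastCheckin     time.Time
	CheckinInterval time.Duration
}

// findFailedInstances flags every other scheduler instance whose last
// check-in is older than its check-in interval plus a little slack.
func findFailedInstances(db *sql.DB, self string) ([]deadNode, error) {
	rows, err := db.Query(
		`SELECT INSTANCE_NAME, LAST_CHECKIN_TIME, CHECKIN_INTERVAL
		   FROM QRTZ_SCHEDULER_STATE
		  WHERE INSTANCE_NAME <> ?`, self)
	if err != nil {
		return nil, err
	}
	defer rows.Close()

	var failed []deadNode
	now := time.Now()
	for rows.Next() {
		var name string
		var lastMillis, intervalMillis int64
		if err := rows.Scan(&name, &lastMillis, &intervalMillis); err != nil {
			return nil, err
		}
		last := time.UnixMilli(lastMillis)
		interval := time.Duration(intervalMillis) * time.Millisecond
		// Missed the expected check-in by more than the slack: presumed dead.
		if now.Sub(last) > interval+7500*time.Millisecond {
			failed = append(failed, deadNode{name, last, interval})
		}
	}
	return failed, rows.Err()
}
```

Seen this way, the slow Oracle statement in the question is a plausible culprit: if node 1's check-in update was delayed well past the check-in interval, node 2 could legitimately conclude node 1 had failed and take over its jobs even though node 1 was healthy, which matches the duplicate-execution symptom described.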

How to know which node a job will execute on in Quartz Scheduler with TerracottaJobStore?

I have bound TerracottaJobStore to the Quartz Scheduler.
How does TerracottaJobStore determine which job should run next, and on which node?
Which algorithm does TerracottaJobStore use for node selection? Any ideas?
If Quartz Scheduler is used with TerracottaJobStore and there is a job to execute next, the selection of the node for that job will be random.
Using Quartz Where, it is possible to run a job based on criteria.
That means if you want a job that must run on a node with at least 2 cores, or a job that runs on a node based on its CPU load average (say 70%), or a job that runs on a node with at least 330 MB of free Java heap memory, then Quartz Where is useful.
It is predictable which node a job will execute on only in the case of Quartz Where.
With open source Terracotta's JobStore you don't get to decide which node the job will be executed on. It doesn't really happen randomly either; rather, the scheduler behaves as it does in non-clustered mode. Basically, every node will, at a regular interval and based on when the next trigger is due to fire, try to acquire the next trigger(s). Since all the nodes in the cluster behave the same way, the first to acquire the lock will also acquire the triggers first.
Terracotta EE comes with the Quartz Where feature, which lets you describe where jobs should be fired. You can learn more about Quartz Where by watching this short screencast I did: http://www.codespot.net/blog/2011/03/quartz-where-screencast/
Hope that'll help.