Aerospike's behavior on EC2

In my test setup on EC2 I have done the following:
One Aerospike server is running in ZoneA (say Aerospike-A).
Another node of the same cluster is running in ZoneB (say Aerospike-B).
The application using the above cluster is running in ZoneA.
I am initializing the AerospikeClient like this:
Host[] hosts = new Host[1];
hosts[0] = new Host("PUBLIC_IP_OF_AEROSPIKE-A", 3000);
AerospikeClient client = new AerospikeClient(policy, hosts);
With the above setup I am seeing the following behavior:
Writes are happening on both Aerospike-A and Aerospike-B.
Reads are only happening on Aerospike-A (the data is around 1 million records, occupying 900 MB of memory and 1.3 GB of disk).
Question: Why are reads not going to both nodes?
If I take Aerospike-B down, everything works perfectly. There is no outage.
If I take Aerospike-A down, all the writes and reads start failing. I waited 5 minutes for the other node to take the traffic, but it didn't.
Questions:
a. In the above scenario, I would expect Aerospike-B to take all the traffic, but this is not happening. Am I doing something wrong?
b. Should I be giving both hosts while initializing the client?
c. I had executed "clinfo -v 'config-set:context=service;paxos-recovery-policy=auto-dun-all'" on both nodes. Is that creating a problem?

On EC2 you should place all the nodes of the cluster in the same AZ of the same region. You can use the rack-awareness feature to set up nodes in two separate AZs; however, you will pay a latency penalty on each of your writes.
What you're seeing is likely due to misconfiguration. Each EC2 machine has a public IP and a local IP. Machines sitting on the same subnet can access each other through the local IP, but a machine in a different AZ cannot. If your cluster nodes are spread across two availability zones, you need to make sure access-address is set to the public IP. Otherwise you get clients that can reach only some of the nodes, lots of proxy events as the cluster nodes try to compensate and move your reads and writes to the correct node for you, and weird issues with data when nodes leave or join the cluster.
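On the server side, access-address lives in the service subsection of the network stanza of aerospike.conf. A minimal sketch, assuming a 3.x-era config (the IP placeholder is yours to fill in):
network {
    service {
        address any
        port 3000
        # Assumed placeholder: the public IP this node advertises to clients
        access-address <public-ip-of-this-node>
    }
}
As for question (b): yes, it is good practice to seed the client with every node. The client only needs one reachable seed to discover the rest of the cluster, but with a single seed it cannot connect at all if that node happens to be down at startup. A sketch using the placeholders from the question:
Host[] hosts = new Host[] {
    new Host("PUBLIC_IP_OF_AEROSPIKE-A", 3000),
    new Host("PUBLIC_IP_OF_AEROSPIKE-B", 3000)
};
AerospikeClient client = new AerospikeClient(policy, hosts);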
For more details:
http://www.aerospike.com/docs/operations/configure/network/general/
https://github.com/aerospike/aerospike-client-python/issues/56#issuecomment-117901128

Related

HBase operations hang when nodes go offline

I noticed that operations like Put hang forever if nodes go offline (e.g., a server crash).
Here are the relevant logs from the client:
(AsyncProcess.java:1777) - Left over 1 task(s) are processed on
server(s): [s1.mycompany.com,16020,1519065917510,
s2.mycompany.com,16020,1519065918510,
s3.mycompany.com,16020,1519065917410]
(AsyncProcess.java:1785) - Regions against which left over task(s) are processed: [...]
In my case, s2 and s3 went offline (there are ~50 nodes in the cluster).
Shouldn't this problem be handled by HBase? E.g., if region servers go offline, their regions are reassigned to other servers and Puts change their destination? Since HBase is fault tolerant, this problem should not happen.
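One common mitigation is to bound the client's retries and timeouts so a Put fails fast (and can be handled by the application) while the master reassigns the dead servers' regions, instead of blocking in AsyncProcess. A sketch, assuming an HBase 1.x client; the values are illustrative:
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

Configuration conf = HBaseConfiguration.create();
conf.setInt("hbase.client.retries.number", 3);        // cap retries; the default is much higher
conf.setInt("hbase.rpc.timeout", 10000);              // ms per RPC attempt
conf.setInt("hbase.client.operation.timeout", 30000); // ms for the whole Put, retries included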

Too many fetch failures

I have a 2-node Hadoop cluster on Ubuntu 12.04 with Hadoop 1.2.1.
While trying to run the Hadoop word-count example, I am getting a "Too many fetch failures" error. I have referred to many articles, but I am unable to figure out what the entries in the masters, slaves, and /etc/hosts files should be.
My node names are "master" with IP 10.0.0.1 and "slaveone" with IP 10.0.0.2.
What should the entries in the masters, slaves, and /etc/hosts files be on both the master and the slave node?
If you're unable to upgrade the cluster for whatever reason, you can try the following:
Ensure that your hostname is bound to the network IP and NOT 127.0.0.1 in /etc/hosts (see the example after this list).
Ensure that you're using only hostnames and not IPs to reference services.
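Using the hostnames and IPs from the question, /etc/hosts on both nodes could look like this (note that neither hostname is mapped to 127.0.0.1):
127.0.0.1   localhost
10.0.0.1    master
10.0.0.2    slaveone
The masters file would then contain just master, and the slaves file would list slaveone (plus master as well, if the master should also run a DataNode and TaskTracker).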
If the above are correct, try the following settings:
set mapred.reduce.slowstart.completed.maps=0.80
set tasktracker.http.threads=80
set mapred.reduce.parallel.copies=10 (a value >= 10; 10 should probably be sufficient)
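On Hadoop 1.x these would go in mapred-site.xml. A sketch with the values from the list above:
<configuration>
  <property>
    <name>mapred.reduce.slowstart.completed.maps</name>
    <value>0.80</value>
  </property>
  <property>
    <name>tasktracker.http.threads</name>
    <value>80</value>
  </property>
  <property>
    <name>mapred.reduce.parallel.copies</name>
    <value>10</value>
  </property>
</configuration>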
Also check out this SO post: Why I am getting "Too many fetch-failures" every other day
And this one: Too many fetch failures: Hadoop on cluster (x2)
And also this if the above don't help: http://grokbase.com/t/hadoop/common-user/098k7y5t4n/how-to-deal-with-too-many-fetch-failures
For brevity and in interest of time, I'm putting what I found to be the most pertinent here.
The number one cause of this is something that causes a connection made to fetch a map output to fail. I have seen:
1) a firewall
2) misconfigured IP addresses (i.e., the task tracker attempting the fetch received an incorrect IP address when it looked up the name of the tasktracker with the map segment)
3) rarely, the HTTP server on the serving tasktracker is overloaded due to insufficient threads or listen backlog; this can happen if the number of fetches per reduce is large and the number of reduces or maps is very large.
There are probably other cases; this recently happened to me when I had 6000 maps and 20 reducers on a 10-node cluster, which I believe was case 3 above. Since I didn't actually need the reduce (I got my summary data via counters in the map phase) I never re-tuned the cluster.
EDIT: Original answer said "Ensure that your hostname is bound to the network IP and 127.0.0.1 in /etc/hosts"

How to find the right proportion between Hadoop instance types

I am trying to find out how many MASTER, CORE, and TASK instances are optimal for my jobs. I couldn't find any tutorial that explains how to figure this out.
How do I know if I need more than one core instance? What "symptoms" would I see in EMR's console metrics that would hint I need more than one core? So far, when I tried the same job with 1 core + 7 task instances, it ran pretty much like it did on 8 core instances, which doesn't make much sense to me. Or is it possible that my job is so CPU-bound that the I/O is negligible? (I have a map-only job that parses Apache log files into a CSV file.)
Is there such a thing as having more than one master instance? If yes, when is it needed? I wonder because my master node is pretty much just waiting for the other nodes to do the job (0% CPU) 95% of the time.
Can the master and the core node be identical? I can have a master-only cluster, where the one and only node does everything. It seems logical to be able to have a cluster with one node that is both the master and the core, and the rest task nodes, but it seems impossible to set it up that way with EMR. Why is that?
The master instance acts as a manager and coordinates everything that goes on in the cluster. As such, it has to exist in every job flow you run, but just one instance is all you need. Unless you are deploying a single-node cluster (in which case the master instance is the only node running), it does not do any heavy lifting as far as actual MapReducing is concerned, so the instance does not have to be a powerful machine.
The number of core instances you need really depends on the job and how fast you want it processed, so there is no single correct answer. A good thing is that you can resize the core/task instance groups, so if your job is running slowly, you can add more instances to a running cluster.
One important difference between the core and task instance groups is that core instances store actual data on HDFS whereas task instances do not. As a result, you can only increase the core instance group (removing running instances would lose the data stored on them). On the other hand, you can both increase and decrease the task instance group by adding or removing task instances.
So these two types of instances can be used to adjust the processing power of your job. Typically, you use on-demand instances for core instances, because they must be running all the time and cannot be lost, and spot instances for task instances, because losing a task instance does not kill the entire job (e.g., tasks not finished by task instances will be rerun on core instances). This is one way to run a large cluster cost-effectively using spot instances (see the resize example below).
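As a concrete example, the task instance group of a running cluster can be resized with the AWS CLI. A sketch, assuming a hypothetical instance-group ID:
aws emr modify-instance-groups \
    --instance-groups InstanceGroupId=ig-EXAMPLE,InstanceCount=8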
The general description of each instance type is available here:
http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/InstanceGroups.html
Also, this video may be useful for using EMR effectively:
https://www.youtube.com/watch?v=a5D_bs7E3uc

Redis mget vs get

Setup:
We have a Redis setup with one master and 4 slaves running on the same machine. The reasons to use multiple instances were:
To avoid hot keys
Memory was not a constraint, as the number of keys was small, ~10k (we have an extra-large EC2 machine)
Requests:
We make approximately 60 get requests to Redis per client request. We consolidate the 60 gets into 4 mgets. We make a single connection for all the requests (to one of the slaves, picked at random).
Questions:
Does it make sense to run multiple instances of redis with replicated data in the slaves?
Does making mgets instead of gets help in our case, where we have all the instances on the same machine?
Running multiple redis instances on the same machine can be useful. Redis is single threaded so if your machine has multiple cores, you can get more CPU power by using multiple instances. Craigslist runs in this configuration as documented here: http://blog.zawodny.com/2011/02/26/redis-sharding-at-craigslist/.
mget versus get should help, since you are making only 4 round trips to the Redis server as opposed to 60, which increases throughput; running multiple instances on the same machine shouldn't change that.
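For illustration, here is what the consolidation looks like with the Jedis client; the host, port, and key names are made-up placeholders:
import redis.clients.jedis.Jedis;
import java.util.List;

// One connection to a randomly picked slave, as described in the setup.
Jedis jedis = new Jedis("localhost", 6379);

// One round trip for many keys instead of one round trip per key.
List<String> values = jedis.mget("user:1:name", "user:1:email", "user:2:name");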

How does Quartz detect node failures

My production environment runs a Java scheduled job using Quartz 2.1.4 on a WebLogic cluster with 4 machines. Only one scheduled job executes, on one cluster node (node 1), and it ran normally for a few months; but last night node 2 suddenly decided that node 1 had failed and took over the executing job. In fact, node 1 had no errors (according to the server, network, database, and application logs). This event caused duplicate messages to be created, because two processes executed concurrently.
What mechanism does Quartz use to detect node failure? A ping scan, a heartbeat via UDP broadcast, database response time, or something else? Is there any configuration for it?
I have read the Quartz configuration guide
http://quartz-scheduler.org/documentation/quartz-2.1.x/configuration/ConfigJDBCJobStoreClustering
but there is no answer there.
I am using JDBCJobStore. After detailed checking, we found a database (Oracle) statement that was executing abnormally long (from 5 sec to 30 sec). The incident happened during this period of time. Do you think it is related?
My configuration is:
org.quartz.threadPool.threadCount = 10
org.quartz.threadPool.threadPriority = 5
org.quartz.jobStore.misfireThreshold = 10000
org.quartz.jobStore.class = org.quartz.impl.jdbcjobstore.JobStoreTX
Does anyone have this information? Thanks.
I know the answer is very late, but maybe somebody will still need it.
Short version: it is all handled by the database. The important property is org.quartz.jobStore.clusterCheckinInterval.
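For reference, the clustering-related part of quartz.properties would look something like this. A sketch; a longer check-in interval makes failure detection more tolerant of slow database responses such as the 30-second statement described in the question:
org.quartz.jobStore.isClustered = true
# Assumed illustrative value (ms); how often each node updates LAST_CHECK_TIME
org.quartz.jobStore.clusterCheckinInterval = 20000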
Long version (all credit goes to http://flylib.com/books/en/2.65.1.91/1/):
Detecting Failed Scheduler Nodes
When a Scheduler instance performs the check-in routine, it looks to see if there are other Scheduler instances that didn't check in when they were supposed to. It does this by inspecting the SCHEDULER_STATE table and looking for schedulers that have a value in the LAST_CHECK_TIME column that is older than the property org.quartz.jobStore.clusterCheckinInterval (discussed in the next section). If one or more nodes haven't checked in, the running Scheduler assumes that the other instance(s) have failed.
Additionally, the next paragraph might also be important:
Running Nodes on Separate Machines with Unsynchronized Clocks
As you can ascertain by now, if you run nodes on different machines and the clocks are not synchronized, you can get unexpected results. This is because a timestamp is being used to inform other instances of the last time one node checked in. If that node's clock was set to the future, a running Scheduler might never realize that a node has gone down. On the other hand, if a clock on one node is set to the past, another node might assume that the node has gone down and attempt to take over and rerun its jobs. In either case, it's not the behavior that you want. When you're using different machines in a cluster (which is the normal case), be sure to synchronize the clocks. See the section "Quartz Clustering Cookbook," later in this chapter, for details on how to do this.
