I have an Ignite instance started in 'server mode' on computer A, created a cache in it, and stored 1M key->value pairs in that cache.
Then I started an Ignite instance in 'server mode' on computer B, which joined the instance on computer A, so I now have a cluster of 2 nodes.
Is it possible to move all 1M K->V pairs from computer A to computer B (without any interruption to querying or ingesting data) so that computer A can be shut down for maintenance and everything continues to work from computer B?
If this is possible, what are the steps and code to do that (move data from A -> B)?
Ignite distributes data across server nodes according to the configured cache mode.
In REPLICATED mode each server holds a copy of all data, so you can shut down any node and no data will be lost.
In PARTITIONED mode you can set CacheConfiguration.backups to 1 (or more) so that data is evenly distributed across server nodes, but each server also holds a backup copy of data owned by some other server. In this scenario you can shut down any single node and no data will be lost.
There are also the "backups" and "CacheRebalanceMode" settings for an IgniteCache (configured through CacheConfiguration). I think you can try these.
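For reference, here is a minimal sketch of how a PARTITIONED cache with one backup and synchronous rebalancing could be configured from Java; the cache name and key/value types are just placeholders:

```java
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;
import org.apache.ignite.cache.CacheMode;
import org.apache.ignite.cache.CacheRebalanceMode;
import org.apache.ignite.configuration.CacheConfiguration;

public class PartitionedCacheExample {
    public static void main(String[] args) {
        try (Ignite ignite = Ignition.start()) {
            CacheConfiguration<Integer, String> cfg = new CacheConfiguration<>("myCache");
            cfg.setCacheMode(CacheMode.PARTITIONED);
            // One backup copy: any single node can leave the cluster without data loss.
            cfg.setBackups(1);
            // SYNC rebalance: cache calls wait until rebalancing has completed.
            cfg.setRebalanceMode(CacheRebalanceMode.SYNC);

            IgniteCache<Integer, String> cache = ignite.getOrCreateCache(cfg);
            cache.put(1, "value-1");
        }
    }
}
```

With backups set to at least 1, shutting down computer A only removes one copy of each partition; the remaining node keeps serving reads and writes while the cluster rebalances.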
I want to migrate our Hadoop server, with all the data and components, to new servers (a newer version of Red Hat).
I saw a post on the Cloudera site about how to move the namenode,
but I don't know how to move all the datanodes without data loss.
We have a replication factor of 2.
If I shut down 1 datanode at a time, will HDFS generate new replicas?
Is there a way to migrate all the datanodes at once? What is the correct way to transfer all the datanodes (about 20 servers) to a new cluster?
Also, I wanted to know if HBase will have the same problem, or if I can just delete and add the roles on the new servers.
Update for clarification:
My Hadoop cluster already contains two sets of servers (they are in the same Hadoop cluster; I just split them logically for the example).
The first set consists of servers running the older Linux version.
The second set consists of servers running the newer Linux version.
Both sets already share data and components (the namenode is in the old set of servers).
I want to remove all the old servers so that only the new set of servers remains in the Hadoop cluster.
Should the execution go like this:
shut down one datanode (from the old server set)
run the balancer and wait for it to finish
do the same for the next datanode
Because if so, the balancer operation takes a lot of time and the whole migration will take a lot of time.
The same problem applies to HBase:
right now the HBase region servers and master are only on the old set of servers, and I want to remove them and install them on the new set of servers without data loss.
Thanks
New Datanodes can be freely added without touching the namenode. But you definitely shouldn't shut down more than one at a time.
As an example, if you pick two servers to shut down at random and they hold the only two replicas of a block, that block has no chance of being re-replicated anywhere else. Therefore, upgrade one server at a time if you're reusing the same hardware.
In an ideal scenario, your OS disk is separate from the HDFS disks, in which case you can unmount them, upgrade the OS, reinstall the HDFS service, remount the disks, and everything will work as before. If that isn't how your servers are set up, you should do that before your next upgrade.
In order to get replicas added to any new datanodes, you'll need to either 1) increase the replication factor or 2) run the HDFS balancer so that the replicas are shuffled across the cluster.
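If you go the replication-factor route, here is a rough sketch of option 1 using the HDFS Java API; the path and target factor are just examples, and for whole directories the CLI `hdfs dfs -setrep -R 3 /data` is the usual route since replication is a per-file setting:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class RaiseReplication {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration(); // picks up core-site.xml / hdfs-site.xml
        try (FileSystem fs = FileSystem.get(conf)) {
            // Ask the NameNode to keep 3 replicas of this file, which forces new
            // copies to be placed on other (e.g. newly added) DataNodes.
            fs.setReplication(new Path("/data/part-00000"), (short) 3);
        }
    }
}
```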
I'm not too familiar with HBase, but I know you'll need to flush the region servers before you migrate that service to other servers. If you flush the majority of them without rebalancing the regions, though, you'll end up with one server holding all the data. I'm sure the master server has similar caveats, although hbase backup seems to be a command worth trying.
#guylot - After adding the new nodes and running the balancer process, take the old nodes out of the cluster by going through the decommissioning process. Decommissioning will move the data to other nodes in your cluster. As a precaution, only run it against one node at a time. This will limit the potential for a data-loss incident.
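A hedged sketch of the refresh step from Java, assuming the file pointed to by dfs.hosts.exclude already lists the old DataNode's hostname (the usual route is simply the `hdfs dfsadmin -refreshNodes` command on the NameNode host):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.tools.DFSAdmin;
import org.apache.hadoop.util.ToolRunner;

public class DecommissionOldNode {
    public static void main(String[] args) throws Exception {
        // 1) Add the old DataNode's hostname to the dfs.hosts.exclude file
        //    (done outside this program, on the NameNode host).
        // 2) Tell the NameNode to re-read its include/exclude lists, which starts
        //    decommissioning and re-replicates that node's blocks elsewhere.
        Configuration conf = new Configuration();
        int rc = ToolRunner.run(new DFSAdmin(conf), new String[] {"-refreshNodes"});
        System.exit(rc);
    }
}
```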
I have a scenario where we want to use Redis, but I am not sure how to go about setting it up. Here is what we want to achieve eventually:
A redundant central Redis cluster, where all the writes will occur, with servers in two AWS regions.
Local Redis caches on servers that will hold a replica of the complete central cluster.
The reason for this is that we have many servers which need read access only, and we want them to stay independent even in case of an outage (where a server cannot reach the main cluster).
I know there might be a "stale data" issue within the caches, but we can tolerate that as long as we get eventual consistency.
What is the correct way to achieve something like that using redis?
Thanks!
You need the Redis Replication (Master-Slave) Architecture.
Redis Replication:
Redis replication is a very simple to use and configure master-slave replication that allows slave Redis servers to be exact copies of master servers. The following are some very important facts about Redis replication:
Redis uses asynchronous replication. Starting with Redis 2.8, however, slaves periodically acknowledge the amount of data processed from the replication stream.
A master can have multiple slaves.
Slaves are able to accept connections from other slaves. Aside from connecting a number of slaves to the same master, slaves can also be connected to other slaves in a cascading-like structure.
Redis replication is non-blocking on the master side. This means that the master will continue to handle queries when one or more slaves perform the initial synchronization.
Replication is also non-blocking on the slave side. While the slave is performing the initial synchronization, it can handle queries using the old version of the dataset, assuming you configured Redis to do so in redis.conf. Otherwise, you can configure Redis slaves to return an error to clients if the replication stream is down. However, after the initial sync, the old dataset must be deleted and the new one must be loaded. The slave will block incoming connections during this brief window (that can be as long as many seconds for very large datasets).
Replication can be used both for scalability, in order to have multiple slaves for read-only queries (for example, slow O(N) operations can be offloaded to slaves), or simply for data redundancy.
It is possible to use replication to avoid the cost of having the master write the full dataset to disk: a typical technique involves configuring your master's redis.conf to avoid persisting to disk at all, then connecting a slave configured to save from time to time, or with AOF enabled. However, this setup must be handled with care, since a restarting master will start with an empty dataset: if the slave tries to synchronize with it, the slave will be emptied as well.
Go through the steps in: How to Configure Redis Replication.
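As a rough sketch of how the application side could look with the Jedis client, assuming each read-only server runs a local Redis configured (via replicaof/slaveof in its redis.conf) as a replica of the central master; the hostnames, port, and key below are placeholders:

```java
import redis.clients.jedis.Jedis;

public class LocalReplicaReads {
    public static void main(String[] args) {
        // Writes always go to the central master.
        try (Jedis master = new Jedis("central-master.example.com", 6379)) {
            master.set("feature:flag", "on");
        }

        // Reads go to the Redis running on this box; it keeps serving (possibly
        // stale) data even if the link to the central master is down.
        try (Jedis localReplica = new Jedis("127.0.0.1", 6379)) {
            String value = localReplica.get("feature:flag");
            System.out.println("feature:flag = " + value);
        }
    }
}
```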
So I decided to go with redis-sentinel.
Using redis-sentinel I can set the slave-priority on the cache servers to 0, which will prevent them from ever becoming masters.
I will have one master set up, and a few "backup masters" which will actually be slaves with a non-zero slave-priority, allowing them to take over once the master goes down.
The sentinel will monitor the master, and once the master goes down it will promote one of the "backup masters" to be the new master.
More info can be found here
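A minimal sketch of how a client could discover the current master through Sentinel using Jedis; the sentinel addresses and the master name "mymaster" are assumptions and must match your sentinel.conf:

```java
import java.util.HashSet;
import java.util.Set;
import redis.clients.jedis.Jedis;
import redis.clients.jedis.JedisSentinelPool;

public class SentinelAwareWriter {
    public static void main(String[] args) {
        Set<String> sentinels = new HashSet<>();
        sentinels.add("sentinel1.example.com:26379");
        sentinels.add("sentinel2.example.com:26379");
        sentinels.add("sentinel3.example.com:26379");

        // The pool asks the sentinels who the current master is, so writes keep
        // working after a failover promotes one of the "backup masters".
        try (JedisSentinelPool pool = new JedisSentinelPool("mymaster", sentinels);
             Jedis jedis = pool.getResource()) {
            jedis.set("feature:flag", "on");
        }
    }
}
```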
I'm new to Hadoop and learning to use it by working with a small cluster where each node is an Ubuntu Server VM. The cluster consists of 1 name node and 3 data nodes with a replication factor of 3. After a power loss on the machine hosting the VMs, all files stored in the cluster were corrupted, with the blocks storing those files missing. No queries were running at the time power was lost and no files were being written to or read from the cluster.
If I shut down the VMs correctly (even without first stopping the Hadoop cluster), then the data is preserved and I don't run into any issues with missing or corrupted blocks.
The only information I've been able to find suggested setting dfs.datanode.sync.behind.writes to true, but this did not resolve the issue (killing the VMs from the host causes the same issue as a power failure). The information I found here seems to indicate this property will only have an effect when writing data to the disk.
I also tried running hdfs namenode -recover, but this did not resolve the issue. Ultimately I had to remove the data stored in the dfs.namenode.name.dir directory, reboot each VM in the cluster to remove any Hadoop files in /tmp, and reformat the name node before copying the data back into the cluster from local file storage.
I understand that having all nodes in the cluster running on the same hardware and only 3 data nodes to go with a replication factor of 3 is not an ideal configuration, but I'd like a way to ensure that any data that is already written to disk is not corrupted by a power loss. Is there a property or other configuration I need to implement to avoid this in the future (besides separate hardware, more nodes, power backup, etc.)?
EDIT: To clarify further, the issue I'm trying to resolve is data corruption, not cluster availability. I understand I need to make changes to the overall cluster architecture to improve reliability, but I'd like a way to ensure data is not lost even in the event of a cluster-wide power failure.
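For reference, this is the write path that sync-related settings affect: a hedged sketch of a client explicitly syncing data to the DataNodes' disks through the Java API (the path below is just a placeholder):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class DurableWrite {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        try (FileSystem fs = FileSystem.get(conf);
             FSDataOutputStream out = fs.create(new Path("/tmp/durable.txt"))) {
            out.writeBytes("important record\n");
            // hsync() asks every DataNode in the write pipeline to fsync the data
            // to disk, so a subsequent power loss should not lose what was written.
            out.hsync();
        }
    }
}
```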
Sorry that I don't have the resources to set up a cluster to test it, I'm just wondering:
Can I deploy an HBase region server on a separate machine other than a Hadoop data node machine? I guess the answer is yes, but I'm not sure.
Is it good or bad to deploy the HBase region server and the Hadoop data node on different machines?
When putting data into HBase, where is this data eventually stored, on the data node or the region server? I guess it's the data node, but then what are the StoreFile and HFile in the region server; aren't they the physical files that store our data?
Thank you!
RegionServers should always run alongside DataNodes in distributed clusters if you want decent performance.
Very bad, that would work against the data locality principle (if you want to know a little more about data locality, check this: http://www.larsgeorge.com/2010/05/hbase-file-locality-in-hdfs.html)
Actual data will be stored in HDFS (on the DataNodes); RegionServers are responsible for serving and managing regions.
For more information about HBase architecture please check this excellent post from Lars' blog: http://www.larsgeorge.com/2009/10/hbase-architecture-101-storage.html
BTW, as long as you have a PC with decent RAM you can set up a demo cluster with virtual machines. Do not ever try to set up a production environment without properly testing the platform first in a development environment.
To go into more detail about this answer:
"RegionServers should always run alongside DataNodes in distributed clusters if you want decent performance."
I'm not sure how anyone would interpret the term alongside, so let's try to be even more precise:
What makes any physical server an "XYZ" server is that it's running a program called a daemon (think "eternally-running background event-handling" program);
What makes a "file" server is that it's running a file-serving daemon;
What makes a "web" server is that it's running a web-serving daemon;
AND
What makes a "data node" server is that it's running the HDFS data-serving daemon;
What makes a "region" server then is that it's running the HBase region-serving daemon (program);
So, in all Hadoop distributions (e.g. Cloudera, MapR, Hortonworks, and others), the general best practice for HBase is that the RegionServers are "co-located" with the DataNode servers.
This means that the actual slave (datanode) servers which form the HDFS cluster are each running the HDFS data-serving daemon (program)
and they're also running the HBase region-serving daemon (program) as well!
This way we ensure locality: the concurrent processing and storing of data on all the individual nodes in an HDFS cluster, with no "movement" of gigantic loads of big data from "storage" locations to "processing" locations. Locality is vital to the success of a Hadoop cluster, so HBase region servers (data nodes that also run the HBase daemon) do all their processing (putting/getting/scanning) on the data nodes containing the HFiles, which make up HRegions, which make up HTables, which make up HBase (the Hadoop database).
So, servers (VMs or physical on Windows, Linux, ..) can run multiple daemons concurrently, often, they run dozens of them regularly.
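To tie the pieces together, here is a minimal sketch of a client write using the HBase Java API (table, column family, and qualifier names are placeholders): the Put is handled by the RegionServer that owns the row's region, and the data eventually ends up in HFiles (StoreFiles) stored as blocks on the HDFS DataNodes.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class HBasePutSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create(); // reads hbase-site.xml
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("events"))) {
            Put put = new Put(Bytes.toBytes("row-001"));
            put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("payload"), Bytes.toBytes("hello"));
            // The RegionServer hosting this row buffers the edit in its MemStore and
            // later flushes it to an HFile (StoreFile), which lives on HDFS DataNodes.
            table.put(put);
        }
    }
}
```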
I have a question regarding clustering (session replication/failover) in Tomcat 6 using BackupManager. The reason I chose BackupManager is that it replicates the session to only one other server.
I am going to run through the example below to try and explain my question.
I have 6 nodes set up in a Tomcat 6 cluster with BackupManager. The front end is one Apache server using mod_jk with sticky sessions enabled.
Each node has 1 session each.
node1 has a session from client1
node2 has a session from client2
..
..
Now let's say node1 goes down; assuming node2 is the backup, node2 now has two sessions (for client2 and client1).
The next time client1 makes a request, what exactly happens ?
Does Apache "know" that node1 is down and does it send the request directly to node2 ?
=OR=
does it try each of the 6 instances and find out the hard way which one is the backup?
I'm not too sure about the workings of BackupManager, but my reading of this good URL suggests the replication is intelligent enough to identify the backup.
In-memory session replication is session data replicated across all Tomcat instances within the cluster. Tomcat offers two solutions: replication across all instances within the cluster, or replication to only its backup server; this solution offers guaranteed session data replication ...
SimpleTcpCluster uses Apache Tribes to maintain communication with the group. Group membership is established and maintained by Apache Tribes; it handles server crashes and recovery. Apache Tribes also offers several levels of guaranteed message delivery between group members. This is achieved by updating the in-memory session to reflect any session data changes; the replication is done immediately between members ...
You can reduce the amount of data by using the BackupManager (send only to one node, the backup node).
You'll be able to see this from the logs if notifyListenersOnReplication="true" is set.
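In case it helps, here is a rough sketch of the BackupManager wiring expressed in Java; the same thing is normally declared as a <Cluster> element in server.xml, and the channel/interceptor details are omitted here:

```java
import org.apache.catalina.ha.session.BackupManager;
import org.apache.catalina.ha.tcp.SimpleTcpCluster;

public class BackupManagerClusterSketch {
    public static void main(String[] args) {
        SimpleTcpCluster cluster = new SimpleTcpCluster();
        // BackupManager replicates each session to exactly one backup member,
        // while DeltaManager would replicate it to every member of the group.
        cluster.setManagerTemplate(new BackupManager());
        System.out.println("Manager template: "
                + cluster.getManagerTemplate().getClass().getName());
    }
}
```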
On the other hand, you could still use DeltaManager and split your cluster into 3 domains of 2 servers each.
Say these will be node 1 <-> node 2, 3 <-> 4 and 5 <-> 6.
In such a case, configuring the domain worker attribute will ensure that session replication only happens within the domain.
And mod_jk then definitely knows which server to look on when node1 fails.
http://tomcat.apache.org/tomcat-6.0-doc/cluster-howto.html states
Currently you can use the domain worker attribute (mod_jk > 1.2.8) to build cluster partitions with the potential of having a more scalable cluster solution with the DeltaManager (you'll need to configure the domain interceptor for this).
And a better example is at this link:
http://people.apache.org/~mturk/docs/article/ftwai.html
See the "Domain Clustering model" section.