Arangodump between two different AWS ec2 clusters - amazon-ec2

I have created a Graph database in ArangoDB in a 5 machine AWS cluster. I do not have enough space in the Database AWS cluster to store the dump. So I would like to take a dump of the database in an AWS instance in a different cluster. I have the key files to connect to the machines. How to do it using Arangodump ? Thanks.

I do get that correctly that you're using DC/OS clusters on AWS?
The problem with arangoimp is, that it doesn't know howto authenticate with the DC/OS proxy, and thus can't reach the routes it would require to import to arangodb.
The problem is similar to Running Arango Shell on DC/OS cluster - you want to use sshutle as lalitlogical describes to forward the ArangoDB server port (usually 8529) to your target environment.

Related

Neo4J Causal Cluster using 3 AWS EC2 Instances

We are trying to setup Neo4j Causal Clustering using 3 EC2 instances that we have created in our AWS Account each with Neo4j Enterprise Causal Cluster AMI. We made necessary configurations in the neo4j.template of each Neo4j EC2 Instance to enable Causal Clustering. For example the parameters - causal_clustering.initial_discovery_members and causal_clustering.discovery_listen_address are set to the Public IPs of the EC2.
After making the necessary changes, the Neo4j Server was started using the command ./etc/init.d/neo4j start. When checked the status using systemctl status neo4j, it is still showing neo4j_mode as SINGLE though it was set to CORE. The Neo4j UI is also not accessible from the browser window.
How can we enable Neo4j Causal Clustering using 3 EC2 Instances with Neo4j AMIs.? Is there any specific AMI that we should use with EC2 to enable Causal Clustering.? Is there any documentation for the same (EC2 Neo4j Causal Clustering)? What should be the correct neo4j.conf file configuration for enabling Causal Cluster?

Using the dask labextenstion to connect to a remote cluster

I'm interested in running a Dask cluster on EMR and interacting with it from inside of a Jupyter Lab notebook running on a separate EC2 instance (e.g. an EC2 instance not within the cluster and not managed by EMR).
The Dask documentation points to dask-labextension as the tool of choice for this use case. dask-labextension relies on a YAML config file (and/or some environment vars) to understand how to talk to the cluster. However, as far as I can tell, this configuration can only be set to point to a local Dask cluster. In other words, you must be in a Jupyter Lab notebook running on an instance within the cluster (presumably on the master instance?) in order to use this extension.
Is my read correct? Is it not currently possible to use dask-labextension with an external Dask cluster?
Dask Labextension can talk to any Dask cluster that is visible from where your web client is running. If you can connect to a dashboard in a web browser then you can copy that same address to the Dask-Labextension search bar and it will connect.

configure MSMQ cluster failover on an azure vm (windows server 2016)

I want to create a Failover cluster for MSMQ for two vm's in azure. I created two VM's in azure and have them domain joined. I can create the failover cluster with both nodes. However when i try to add a role for MSMQ i need an cluster shared disk. I tried to create a new managed disk in azure and attach it to the vm's but it still wasn't able to find the disk.
Also tried fileshare-sync, but still not working.
I found out i need iSCSI disk, there was this article https://learn.microsoft.com/en-us/azure/storsimple/storsimple-virtual-array-deploy3-iscsi-setup . But it is end of life next year.
So i am wondering if it is possible to setup a failover cluster for msmq on azure and if so how can i do it?
Kind regards,
You should be able to create a Cluster Shared Volume using Storage Spaces Direct across a cluster of Azure VMs. Here are instructions for a SQL failover cluster. I assume this should work for MSMQ, but I haven't set up MSMQ in over 10 years and I don't' know if requirements are different.

How to deploy a Cassandra cluster on two ec2 machines?

It's a known fact that it is not possible to create a cluster in a single machine by changing ports. The workaround is to add virtual Ethernet devices to our machine and use these to configure the cluster.
I want to deploy a cluster of , let's say 6 nodes, on two ec2 instances. That means, 3 nodes on each machine. Is it possible? What should be the seed nodes address, if it's possible?
Is it a good idea for production?
You can use Datastax AMI on AWS. Datastax Enterprise is a suitable solution for production.
I am not sure about your cluster, because each node need its own config files and it is default. I have no idea how to change it.
There are simple instructions here. When you configure instances settings, you have to write advanced settings for cluster, like --clustername yourCluster --totalnodes 6 --version community etc. You also can install Cassandra manually by installing latest version java and cassandra.
You can build cluster by modifying /etc/cassandra/cassandra.yaml (Ubuntu 12.04) fields like cluster_name, seeds, listener_address, rpc_broadcast and token. Cluster_name have to be same for whole cluster. Seed is master node, which IP you should add for every node. I am confused about tokens

Amazon RDS Master-Slave Relationship between EC2 instances with load balancing activated

We're planning to move our Tomcat/MySQL app onto the Amazon cloud employing 2 EC2 instances (inst_1 & inst_2) running in different availability zones whereby inst_1 will contain the master RDS db and inst_2 the slave RDS db.
If we employ elastic load balancing to balance traffic between the two instances, will traffic directed to inst_2 that includes insert/update/delete db transactions first update the master RDS db in inst_1 followed by a synchronous update of the slave in inst_2; thereby ensuring that the two RDS instances are always synchronized?
Amazon's published info (whitepapers) suggests such, but doesn't explicitly state it. If not, how does one ensure that the two RDS instances remain synchronized?
Additional note: We're planning to employ Amazon's Elastic Beanstalk. Thanks!
You have to take a few things into consideration
AWS RDS instances are simple managed EC2 instances which run a MySQL server.
If you add a slave ( I think Amazon calls them read-replica) this is a read-only slave
Amazon doesn't manage the distribution of writing queries to the master server automatically.
Replication will ensure that your read slave always is up-to-date automatically ( with minimal delay which is increasing with write-load on the master )
This behavior is MySQL-specific
This means that you have to delegate manipulating queries to the master exclusively.
This can either be done by your application or by a MySQL proxy running on a extra machine.
The proxy then is the only interface your application servers will talk to. It is able to manage balancing between your RDS instances and the direction of any manipulation query to the master instance.
When RDS is used in multi-az mode you have no access to the secondary instance. There is only ever one instance that is visible to you, so most if your question doesn't apply. In case of failover the DNS address you are given will start resolving to a different ip. Amazon doesn't disclose how the two instances are kept in sync.
If instead of using a multi-az instance you use a single-az instance + a replica then it is up to you to direct queries appropriately - any attempt to alter data on the replica will fail. Since this is just standard MySQL replication, the replica can lag behind the master (in particular with current versions of MySQL the replica only runs a single thread)

Resources