How to deploy FreeSWITCH on an existing Hadoop cluster

I plan to load-balance FreeSWITCH's REGISTER and INVITE traffic with the help of a Hadoop cluster, so my questions are below:
Can I deploy FreeSWITCH on an existing Hadoop cluster? If not, please jump to the third question.
How should I configure and deploy FreeSWITCH?
Is there any other solution for load-balancing FreeSWITCH?
PS: my Hadoop cluster is built on virtual machines with VMware, and the OS is Ubuntu 14.04, with Hadoop-2.6, HBase-0.98, ZooKeeper-3.6 and FreeSWITCH-1.4.14.

A FreeSWITCH cluster can be built on top of an RDBMS: FreeSWITCH uses ODBC to access the database and stores the call and registration state in it. MySQL and PostgreSQL are well-tested backends for this, and I'm not sure anyone has done it with Hadoop.
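For illustration only, here is a rough sketch of what pointing FreeSWITCH at an ODBC backend can look like; the DSN name, host and credentials below are assumptions, and parameter names can vary between FreeSWITCH versions:
; /etc/odbc.ini - a hypothetical DSN named "freeswitch"
[freeswitch]
Driver     = PostgreSQL
Servername = db.example.com
Database   = freeswitch
<!-- conf/autoload_configs/switch.conf.xml: core call/registration state -->
<param name="core-db-dsn" value="freeswitch:fsuser:fspass"/>
<!-- conf/sip_profiles/internal.xml: per-profile registration storage -->
<param name="odbc-dsn" value="freeswitch:fsuser:fspass"/>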

Related

Does Apache Kylin need Apache Derby or MySQL to run the sample cube?

I installed Java, Hadoop, HBase, Hive, Spark and Kylin:
hadoop-3.0.3
hbase-1.2.6
apache-hive-2.3.3-bin
spark-2.2.2-bin-without-hadoop
apache-kylin-2.3.1-bin
I would be grateful if someone could help me with Kylin's installation and configuration.
http://kylin.apache.org/docs/ may help you. You can send an email to dev@kylin.apache.org, and the question will be discussed and answered on the mailing list. Some tips for the email: 1. provide the Kylin version; 2. provide log information; 3. describe the usage scenario. If you want a quick start, you can run Kylin in a Hadoop sandbox VM or in the cloud, for example by starting a small AWS EMR or Azure HDInsight cluster and installing Kylin on one of the nodes. When you use Kylin-2.3.1, I suggest you use Spark-2.1.2.
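As a rough sketch (the install paths below are assumptions, and 7070 is only Kylin's usual default web port), pointing Kylin at a Spark 2.1.2 build before starting it could look like:
export KYLIN_HOME=/opt/apache-kylin-2.3.1-bin      # assumed install location
export SPARK_HOME=/opt/spark-2.1.2-bin-hadoop2.7   # assumed Spark 2.1.2 build
export HADOOP_CONF_DIR=/etc/hadoop/conf            # assumed Hadoop client configs
$KYLIN_HOME/bin/kylin.sh start                     # web UI usually at http://host:7070/kylin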

Install HUE or Spark on my local PC and ssh in to access my company's Cloudera cluster?

Sorry for my rookie question. I was just wondering if I could install HUE or Spark on my personal PC at home and ssh (through OpenVPN) into my company's Cloudera cluster? I have no computer science background, and I really need to retrieve some data and start my data analysis work. Also, I'm not very comfortable with the terminal view of HIVE or IMPALA, which is already installed on the server.
Thanks in advance! Cheers.
It is not really workable, because Cloudera will warn that the Spark or Hue nodes have lost their heartbeat due to your network latency.
What you want is possible if your PC is added as a node of the same Hadoop cluster.
After that you may install HUE as a service on it.
As stated in the other answer, network latency could create issues.
Alternatively, if your company is using the Cloudera distribution, it should already have Hue installed, which you can use over the network with PuTTY (with X11 forwarding) or any similar tool.
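Since Hue is itself a web application, another option is a plain SSH port forward instead of X11; this is just a sketch, where the gateway host, user and Hue node are placeholders and 8888 is Hue's usual default port:
# forward the remote Hue web UI to your own PC over the VPN/SSH hop
ssh -L 8888:hue-node.internal:8888 youruser@gateway.example.com
# then open http://localhost:8888 in a local browser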

How do I install components such as Apache Drill and Apache Hue in IBM Bluemix BigInsights Apache Hadoop

I am new to the IBM Bluemix platform and am exploring its BigInsights service. I can see pre-configured components such as Pig, Hive, HBase and others, but I want to know how I can install services like Drill or Hue, which are not configured by default. Also, ssh to the cluster nodes allows only restricted access with no sudo rights, in case one needs to run yum commands. Does Bluemix allow root access? I cannot see any. Thanks in advance.
As far as I know, it is not possible.
But you can use http://www.softlayer.com/ to build your own IOP (IBM Open Platform) Cluster in the cloud.
If you are interested in IBM's value-adds and just want to try them out:
https://www.youtube.com/watch?v=4p7LDeu_qQQ is a nice tutorial on setting up your own cluster via Docker.
This tutorial should be still valid for Hue:
https://developer.ibm.com/hadoop/2015/06/02/deploying-hue-on-ibm-biginsights/
Installing Drill doesn't look complicated:
https://drill.apache.org/docs/installing-drill-in-distributed-mode/
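As a sketch of the distributed-mode steps described there (the version, install path and ZooKeeper host are assumptions):
# on each node that should run a Drillbit
tar -xzf apache-drill-1.6.0.tar.gz -C /opt
# edit /opt/apache-drill-1.6.0/conf/drill-override.conf so all nodes share:
#   drill.exec: {
#     cluster-id: "drill-cluster",
#     zk.connect: "zk-host.example.com:2181"
#   }
/opt/apache-drill-1.6.0/bin/drillbit.sh start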
In conclusion: you need to move away from Bluemix if you want a more customised BigInsights. But there are options: SoftLayer, AWS, ... or just your local computer (if you have sufficient resources, since some components like HBase need a minimum number of nodes).

Apache Hive installation in a pseudo-distributed or multi-node cluster environment

I have installed Hadoop in a multi-node environment on my PC as below:
1: 4 VirtualBox instances loaded with Ubuntu (14.04)
2: 1 master node, 2 slave nodes, and the remaining VM instance works as a client
Note: all 4 VMs are running on my PC itself
I was able to complete the Apache Hadoop 2.6 setup successfully on the above-mentioned environment. Now I want to install Hive in order to do some data summarization, querying and analysis.
But I am not sure how to proceed further. I have a few queries mentioned below:
Q1: Do I need to install/set up Apache Hive (0.14) on all nodes (master/name-node and slave/data-nodes), or only on the master node?
Q2: Which mode should be used for the metastore: local mode or remote mode?
Q3: In case I want to use MySQL for the Hive metastore, should I install it on the master/name node itself, or do I need a separate machine for this?
Could someone also please share the steps to be followed to configure the metastore in a multi-node/pseudo-distributed environment?
BR,
San
You need to install the required Hive services (HiveServer2, Metastore, WebHCat) only once. In your lab scenario, you would probably put them on the master. The client can then run Beeline (the HiveServer2 client).
If you configure the Metastore as Local, Hive will use a local Derby database. Again, for your lab setup, this is probably just what you need/want.
In a production scenario, you would
set up a dedicated server for the supporting services, so that they do not fight for resources with the namenode process(es),
and use a dedicated database server for your Metastore database, which will then be remote (see the sketch below).
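For reference, a minimal sketch of the hive-site.xml properties involved when the metastore database is MySQL on a separate host; the hostnames, database name and credentials here are assumptions:
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://dbhost.example.com:3306/hive_metastore</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>hiveuser</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>hivepassword</value>
</property>
<property>
  <name>hive.metastore.uris</name>
  <value>thrift://master-node.example.com:9083</value>
</property>
The JDBC properties belong on the host running the Metastore service; clients normally only need hive.metastore.uris (9083 is the usual default Thrift port).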

Is there an Amazon community AMI for Hadoop/HBase?

I would like to test out Hadoop & HBase on Amazon EC2, but I am not sure how complicated it is. Is there a stable community AMI that has Hadoop & HBase installed? I am thinking of something like the Bioconductor AMI.
Thank you.
I highly recommend using Amazon's Elastic MapReduce service, especially if you already have an AWS/EC2 account. The reasons are:
EMR comes with a working Hadoop/HBase cluster "out of the box" - you don't need to tune anything to get Hadoop/HBase working. It Just Works(TM).
Amazon EC2's networking is quite different from what you are likely used to. It has, AFAIK, a 1-to-1 NAT where the node sees its own private IP address but connects to the outside world on a public IP. When you are manually building a cluster, this causes problems - even with software like Apache Whirr or BigTop that targets EC2 specifically.
An AMI alone is not likely to help you get a Hadoop or HBase cluster up and running - if you want to run a Hadoop/HBase cluster, you will likely have to spend time tweaking the networking settings etc.
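As an illustration only (the release label, instance type and count are assumptions, and the exact flags depend on your AWS CLI version), launching a small EMR cluster with HBase from the command line looks roughly like:
aws emr create-cluster \
  --name "hbase-test" \
  --release-label emr-5.30.0 \
  --applications Name=Hadoop Name=HBase Name=ZooKeeper \
  --instance-type m5.xlarge \
  --instance-count 3 \
  --use-default-roles \
  --ec2-attributes KeyName=my-keypair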
To my knowledge there isn't, but you should be able to deploy on EC2 easily using Apache Whirr, which is a very good alternative.
Here is a good tutorial for doing this with Whirr; as the tutorial says, you should be able to do this in minutes!
The key is creating a recipe like this:
whirr.cluster-name=hbase
whirr.instance-templates=1 zk+nn+jt+hbase-master,5 dn+tt+hbase-regionserver
whirr.provider=ec2
whirr.identity=${env:AWS_ACCESS_KEY_ID}
whirr.credential=${env:AWS_SECRET_ACCESS_KEY}
whirr.hardware-id=c1.xlarge
whirr.image-id=us-east-1/ami-da0cf8b3
whirr.location-id=us-east-1
You will then be able to launch your cluster with:
bin/whirr launch-cluster --config hbase-ec2.properties
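When you are done testing, the same properties file can be used to tear the cluster down so the EC2 instances stop accruing charges (assuming the recipe above was saved as hbase-ec2.properties):
bin/whirr destroy-cluster --config hbase-ec2.properties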
