Connecting Pentaho BI Server to Kylin - Mondrian

Kylin 2.1.0 is up and running on a Hadoop 2.7.0 cluster with HBase
1.2.6 and Hive 2.1.1 installed.
We also have Pentaho BI Server 6.1.0.1.196 (Mondrian 3.11 and Saiku)
installed on another machine.
We want Pentaho to access the cubes created in Kylin and use Saiku Analytics on them.
I did try a few suggestions from the internet, such as
https://github.com/mustangore/kylin-mondrian-interaction, but was unable to achieve my goal.
Any help on this is truly appreciated.

There is a product called KyAnalyzer, which integrates Mondrian and Saiku with Kylin seamlessly.
Here is the documentation for this product: https://kyligence.gitbooks.io/kap-manual/en/kyanalyzer/kyanalyzer.en.html
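Mondrian reaches Kylin through Kylin's JDBC driver, so before touching the Pentaho configuration it is worth verifying the JDBC URL and credentials on their own. Below is a minimal sketch using the third-party jaydebeapi package; the host, project name, jar path, and credentials are placeholders to adapt to your setup.

    # Smoke-test Kylin's JDBC driver, the same driver Mondrian uses.
    # Assumes the kylin-jdbc jar has been downloaded locally and that
    # Kylin listens on its default port 7070 with the default
    # ADMIN/KYLIN credentials.
    import jaydebeapi

    conn = jaydebeapi.connect(
        "org.apache.kylin.jdbc.Driver",             # Kylin JDBC driver class
        "jdbc:kylin://kylin-host:7070/my_project",  # hypothetical host/project
        ["ADMIN", "KYLIN"],                         # default Kylin credentials
        "kylin-jdbc-2.1.0.jar",                     # path to the driver jar
    )
    try:
        cur = conn.cursor()
        cur.execute("SELECT 1")  # trivial query to confirm connectivity
        print(cur.fetchall())
    finally:
        conn.close()

If this succeeds, the same driver class and URL are what go into the Mondrian data source definition on the BI server.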

Related

How to connect to an Oracle database using Databricks

I'm trying to connect to an Oracle database using PySpark in a Databricks notebook. I cannot find any documentation on installing the library for the driver on the cluster.
Many thanks in advance.
If it is an interactive cluster, I'd use Maven for the installation. You can specify the coordinates or search for the package you want to install using the UI.
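For reference, once the Oracle JDBC driver is installed on the cluster (e.g., via the ojdbc Maven artifact), reading a table from the notebook is a standard Spark JDBC call. A minimal sketch; the host, service name, table, and credentials below are placeholders:

    # Read an Oracle table over JDBC in a Databricks notebook.
    # The `spark` session already exists in a notebook.
    df = (
        spark.read.format("jdbc")
        .option("url", "jdbc:oracle:thin:@//dbhost:1521/ORCLPDB1")  # hypothetical
        .option("dbtable", "MYSCHEMA.MY_TABLE")                     # hypothetical
        .option("user", "scott")
        .option("password", "tiger")
        .option("driver", "oracle.jdbc.driver.OracleDriver")
        .load()
    )
    df.show(5)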

Does Apache Kylin need Apache Derby or MySQL to run the sample cube?

I installed Java, Hadoop, HBase, Hive, Spark, and Kylin:
hadoop-3.0.3
hbase-1.2.6
apache-hive-2.3.3-bin
spark-2.2.2-bin-without-hadoop
apache-kylin-2.3.1-bin
I would be grateful if someone could help me with installing and configuring Kylin against them.
http://kylin.apache.org/docs/ may help you. You can also send an email to dev@kylin.apache.org, and the question will be discussed and answered on the mailing list. Some tips for sending the email:
1. Provide the Kylin version.
2. Provide log information.
3. Provide the usage scenario.
If you want a quick start, you can run Kylin in a Hadoop sandbox VM or in the cloud; for example, start a small AWS EMR or Azure HDInsight cluster and install Kylin on one of the nodes. When you use Kylin 2.3.1, I suggest you use Spark 2.1.2.
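On the Derby/MySQL part of the question: Kylin 2.x keeps its own metadata in HBase, so Kylin itself needs neither Derby nor MySQL; it is Hive's metastore that uses Derby by default. Once you have loaded the sample data with $KYLIN_HOME/bin/sample.sh and built the sample cube in the learn_kylin project, you can smoke-test it over Kylin's REST API. A minimal sketch with the requests library, assuming the default host, port, and credentials:

    # Query the built sample cube through Kylin's REST API.
    # Assumes Kylin on localhost:7070 with default ADMIN/KYLIN
    # credentials and the sample project created by sample.sh.
    import requests

    resp = requests.post(
        "http://localhost:7070/kylin/api/query",
        auth=("ADMIN", "KYLIN"),
        json={
            "sql": "SELECT COUNT(*) FROM KYLIN_SALES",  # sample fact table
            "project": "learn_kylin",                   # sample project name
        },
    )
    resp.raise_for_status()
    print(resp.json()["results"])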

Browsing HBase data in Hue through Phoenix

I am using CDH 5.4.4 and installed the Phoenix parcel to be able to run SQL on HBase tables. Has anyone tried to browse that data using Hue?
Since we can connect to Phoenix over JDBC, I assume there must be a way for Hue to connect to it too.
The current status is that we would need to add HUE-2745, and then it would show up in DB Query / Notebook.
The latest https://phoenix.apache.org/server.html is brand new and JDBC only.
If there were a HiveServer2 Thrift API or ODBC for Phoenix, it would work almost out of the box in the SQL or DB Query apps. Hue could work with JDBC, but that would require a JDBC connector that is GPL-licensed (and so has to be installed separately).
The Hue JIRA for integrating Phoenix is https://issues.cloudera.org/browse/HUE-2121.
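For completeness: the Phoenix Query Server speaks the Avatica protocol over HTTP, and the third-party phoenixdb Python package can talk to it directly, which is one way to script against Phoenix while the Hue integration is pending. A minimal sketch, assuming the query server runs on its default port 8765; the table name is a placeholder:

    # Query HBase through the Phoenix Query Server (Avatica over HTTP).
    import phoenixdb

    conn = phoenixdb.connect("http://localhost:8765/", autocommit=True)
    try:
        cur = conn.cursor()
        cur.execute("SELECT * FROM MY_TABLE LIMIT 10")  # hypothetical table
        for row in cur.fetchall():
            print(row)
    finally:
        conn.close()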

What is Hue all about?

I am new to Big Data and want to know about Hue. All I know is that it is a web interface for managing the Hadoop ecosystem. Please let me know if I can install it on my PC (Ubuntu Precise). I am running Apache Hadoop 1.2.1 in pseudo-distributed mode with Pig and Hive.
Thanks in advance.
Hue is a web interface for analyzing data with Apache Hadoop. You can install it on any machine and use it with most Hadoop versions.
Hue is a suite of applications that provide web-based access to CDH components and a platform for building custom applications.
At a high level, the Hue Server is a "container" web application that sits between your CDH installation and the browser. It hosts the Hue applications and communicates with the various servers that interface with CDH components.
All the explanations about Hue, along with the downloads, are here:
http://gethue.com/

Integrate Pentaho Community Edition with Hadoop

I want to integrate Hadoop with Pentaho Data Integration. On the Pentaho site there is a Pentaho for Hadoop offering, but it's commercial. I want my Community Edition of Data Integration to work with Hadoop.
How can I solve this?
Thanks
In the newer version (PDI 4.2.0), you can see Hadoop components in PDI.
Visit: http://sourceforge.net/projects/pentaho/files/Data%20Integration/
Actually, since PDI 4.3.0 (which was released yesterday), all the Hadoop functionality is included in the open-source version! So just go straight to SourceForge and download it. All the docs are on infocenter.pentaho.com.
The most recent work on integrating Kettle (ETL) with Hadoop and various NoSQL data stores can be found in the Pentaho Big Data Plugin. It is a Kettle plugin that provides connectors to HDFS, MapReduce, HBase, Cassandra, MongoDB, and CouchDB, and it works across many Pentaho products: Pentaho Data Integration, Pentaho Reporting, and the Pentaho BA Server. The code is hosted on GitHub: https://github.com/pentaho/big-data-plugin.
There's a community landing page with more information on the Pentaho Wiki. You'll find how-to guides, configuration options, and documentation for Java developers there: http://community.pentaho.com/bigdata
