Unable to use Cassandra from Presto - hadoop

I have set up Presto 0.76 and Cassandra 2.1.2, and created a keyspace, mykeyspace, with a table in it. I started both the Cassandra and Presto daemons. When I query Cassandra from the Presto CLI, it returns:
presto:mykeyspace> select * from userinfo;
Query 20141216_181006_00021_me4u4 failed: replicate_on_write is not a column defined in this metadata
Is there any way to get around this?

Upgrade to version 0.88 (the latest at the time of writing), which includes fixes for Cassandra: http://prestodb.io/docs/current/release/release-0.88.html

Related

Problem connecting Presto to Hive on Hadoop 3

I have Hadoop 3.1.2 and Hive 3.1.2 on a cluster, and I want to connect to Hive with presto-server-0.265.1.
I have just one catalog file in /opt/presto/etc/catalog, hive.properties, which contains:
connector.name=hive-hadoop2
hive.metastore.uri=thrift://192.168.49.13:9083
The Presto service runs, but it cannot connect to Hive because I use Hadoop 3, and when I change hive.properties, the Presto service does not start.
How can I connect to Hadoop 3?
Update:
It wasn't about Hadoop. The Hive metastore was not installed correctly, so Presto could not connect to it.
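
Since the root cause here was the metastore rather than the connector, a quick reachability check can help narrow things down. The following is a minimal sketch in Python, not part of Presto or Hive: the helper names `parse_thrift_uri` and `metastore_reachable` are hypothetical, and a successful TCP connection only shows that something is listening on the port, not that the metastore itself is healthy.

```python
import socket
from typing import Tuple
from urllib.parse import urlsplit


def parse_thrift_uri(uri: str) -> Tuple[str, int]:
    """Split a value such as hive.metastore.uri into (host, port)."""
    parts = urlsplit(uri)
    if parts.scheme != "thrift" or parts.hostname is None or parts.port is None:
        raise ValueError("expected thrift://host:port, got %r" % uri)
    return parts.hostname, parts.port


def metastore_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Calling `metastore_reachable(*parse_thrift_uri("thrift://192.168.49.13:9083"))` from the Presto coordinator would test the URI from the catalog file above. If it returns False, the problem is likely on the metastore or network side rather than in hive.properties.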

Hive Tez query fails with java.io.IOException

A long-running Hive Tez query occasionally fails with:
java.io.IOException: File hdfs://XXX with newer attempt ID 1 is smaller than the file hdfs://YYY with older attempt ID 0
On our 20-node HDP 3.1.5 cluster (Hive 3.1.0 and Tez 0.9.1), it fails about once in every 200 executions.
We were hitting HIVE-23354.
It seems to have no workaround. It is fixed in Hive 4.0.0.
I had the same issue with a query containing many large joins. Decreasing hive.auto.convert.join.noconditionaltask.size, the size limit for tables that fit in memory, from 512 MB to 16 MB in my case, solved the problem for me.
Stack: HDP 3.1.4, Tez 0.9.1, Hive 3.1.0.
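
The property above takes its value in bytes, so the 16 MB figure has to be converted before it is set. A small sketch of that arithmetic follows, assuming "mb" in the answer means mebibytes (1024 * 1024 bytes); the helper name `noconditionaltask_set_statement` is hypothetical and only formats a session-level SET statement as text.

```python
PROPERTY = "hive.auto.convert.join.noconditionaltask.size"


def noconditionaltask_set_statement(size_mib: int) -> str:
    """Format a Hive SET statement for the given size in MiB (assumed unit)."""
    if size_mib <= 0:
        raise ValueError("size must be positive")
    size_bytes = size_mib * 1024 * 1024  # the property is expressed in bytes
    return "SET %s=%d;" % (PROPERTY, size_bytes)


print(noconditionaltask_set_statement(16))
# prints: SET hive.auto.convert.join.noconditionaltask.size=16777216;
```

The resulting statement can be run at the start of the Hive session, before the failing query.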

Presto: building a Presto cluster that will join an existing Hadoop cluster

We have a Hadoop cluster that contains all the relevant components/services:
HDFS
YARN
MapReduce
Hive
Tez
Pig
ZooKeeper
The Hadoop cluster contains 3 master machines, 12 data node machines, and 3 Kafka machines.
Now we want to use Presto to run queries against our data sources (Hadoop cluster / Hive),
so we built a new Presto cluster as follows:
1 presto coordinator
8 presto workers
All Presto cluster machines run Red Hat 7.2.
Now we want to install Presto on all of them,
but we are not sure whether Presto can be installed immediately on a freshly installed Linux OS,
or whether we need to install something else after the OS and before Presto.
The only requirement for Presto is a Java Virtual Machine (JVM). We recommend installing the latest OpenJDK 11 version, currently 11.0.2. After that, follow the Presto deployment instructions.
Python is required for the launcher (the script that starts the JVM), but this is normally available on a typical Linux distribution.
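
As a rough pre-install check of the two prerequisites named above, the sketch below looks for the `java` and `python` executables on the PATH. The function name `missing_prerequisites` is hypothetical, the command names are assumptions (some distributions expose only `python3`), and it does not verify the Java version.

```python
import shutil
from typing import List, Sequence


def missing_prerequisites(commands: Sequence[str] = ("java", "python")) -> List[str]:
    """Return the commands from the list that are not found on the PATH."""
    return [cmd for cmd in commands if shutil.which(cmd) is None]


missing = missing_prerequisites()
if missing:
    print("Not found on PATH: " + ", ".join(missing))
else:
    print("java and python are both on the PATH")
```

Running this on each coordinator and worker node before unpacking Presto shows whether anything needs to be installed first.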

select query errored out in Hive

I am using Hadoop 1.0.4 and Hive 1.2.1.
I am facing an issue with a select query in the Hive CLI. A snippet of the error log is attached. Please help me resolve the issue.
Thanks, Nirmal. It was resolved after upgrading Hadoop to version 2.6.0.

Impala not working on HBase table

Hi, I have an HBase table and I can query it with Hive.
When I try to access it from Impala (either from Hue or from the shell), I get the following error:
Query: select * from clickview
ERROR: RuntimeException: couldn't retrieve HBase table (clickviewtab) info:
Enable/Disable failed
CDH version - cdh5.4.2
Impala version - 2.2.0
HBase version - 1.0.0-cdh5.4.2
HBase, Impala, and Hive are all part of the CDH 5.4.2 release and were installed as packages.
You have to enable the capability for Impala to query HBase tables in the Impala configuration.
In Cloudera Manager, go to the configuration options, search for hbase, and then click the radio button for HBase Service to enable it.
