hive syntax error near unexpected token `(' - hadoop

My Hive query has been throwing an error:
syntax error near unexpected token `('
I am not sure where the error occurs in the query below.
Can you help me out?
select
    A.dataA, B.dataB, count(A.nid), count(B.nid)
from
    (select nid, sum(dataA_count) as dataA
     from table_view
     group by nid) A
LEFT JOIN
    (select nid, sum(dataB_count) as dataB
     from table_others
     group by nid) B
ON A.nid = B.nid
group by A.dataA, B.dataB;

I think you did not close the ) at the end.
Thanks

Sometimes people forget to start the metastore service and to enter the Hive shell first, and instead pass commands straight at the OS prompt the way they would with Sqoop. When I was a newbie I ran into the same thing.
To get past this issue:
go to the Hive directory and run: bin/hive --service metastore & — this starts the Hive metastore server for you; then
open another terminal and run: bin/hive — this puts you inside the Hive shell.
When you forget these steps, you get silly issues like the one this thread is about.
Hope it helps others, thanks.
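To illustrate (a rough sketch; the paths assume you are in the Hive install directory):
$ select count(nid) from table_view
bash: syntax error near unexpected token `('
$ bin/hive --service metastore &
$ bin/hive
hive> select count(nid) from table_view;
The first line was typed at the OS prompt, so bash, not Hive, tried to parse it; the last line runs inside the Hive shell, where HiveQL belongs.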

I went through many posts, but I didn't realize that my Beeline session had been logged off and I was typing the query into a normal terminal.

I faced an issue exactly like this:
First, you have six opening parentheses and six closing ones, so unbalanced parentheses are not your problem.
You are getting this error because your command is being interpreted word by word by the shell. A statement such as a SQL query only means something to a database; only the database (or tool) that speaks that dialect can understand it.
Your "near '(' ..." error means you are using something before the ( that isn't known to the terminal, or to whatever place you are running the command in.
All you have to do to fix it is:
1- Wrap the query in single or double quotes.
2- Add a WHERE clause even if you don't need one (Apache Sqoop, for example, requires it no matter what). Check the documentation for the exact way to do so; usually something like WHERE 1=1 works when you don't need a real condition (for Sqoop it is WHERE $CONDITIONS; see the sketch after this list).
3- Make sure your query runs in the target database first, before asking any third-party app to run it.
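For example, a Sqoop free-form query import looks roughly like this (the connection string, credentials, columns, and paths are placeholders; the single quotes keep the shell from expanding $CONDITIONS):
sqoop import \
  --connect jdbc:mysql://dbhost/sales \
  --username myuser -P \
  --query 'SELECT id, amount FROM orders WHERE $CONDITIONS' \
  --split-by id \
  --target-dir /user/me/orders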

Related

Got wrong result when using `taosRestful` to access TDengine

I'm using the Go connector taosRestful to access TDengine. I want to check databases and tables, so I execute:
db.Exec("show databases")
But I got the error "wrong result".
Can't taosRestful execute this statement, or is there another way to write it?
I googled the question but got no answer.
You should use db.Query() to run a statement that returns rows; db.Exec() is for statements such as inserts that return no result set.
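A minimal sketch using Go's database/sql API (the driver import path and DSN are illustrative; adjust them to your driver version and server):
package main

import (
	"database/sql"
	"fmt"
	"log"

	_ "github.com/taosdata/driver-go/v3/taosRestful" // driver import path for v3; adjust to your driver version
)

func main() {
	// DSN format is illustrative; use your own user, password, host and port.
	db, err := sql.Open("taosRestful", "root:taosdata@http(localhost:6041)/")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Query (not Exec) is the call for statements that return rows.
	rows, err := db.Query("show databases")
	if err != nil {
		log.Fatal(err)
	}
	defer rows.Close()

	// "show databases" returns several columns, so scan generically.
	cols, err := rows.Columns()
	if err != nil {
		log.Fatal(err)
	}
	vals := make([]interface{}, len(cols))
	ptrs := make([]interface{}, len(cols))
	for i := range vals {
		ptrs[i] = &vals[i]
	}
	for rows.Next() {
		if err := rows.Scan(ptrs...); err != nil {
			log.Fatal(err)
		}
		fmt.Println(vals[0]) // the first column is the database name
	}
}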

Hive managed table drop doesn't delete files on HDFS. Any solutions?

When I delete managed tables from Hive, their associated files on HDFS are not removed (this is on Azure Databricks). I am getting the following error:
[Simba]SparkJDBCDriver ERROR processing query/statement. Error Code: 0, SQL state: org.apache.spark.sql.AnalysisException: Can not create the managed table('`schema`.`XXXXX`'). The associated location('dbfs:/user/hive/warehouse/schema.db/XXXXX) already exists
This issue is occurring intermittently. Looking for a solution to this.
I've started hitting this too. It was fine for the last year, then something changed with the storage attachment, I think. Perhaps there are enhancements going on in the background that are causing issues (PaaS!). As a safeguard I'm manually deleting the directory path as well as dropping the table, until I can get a decent explanation of what's going on or get a support call answered.
Use
dbutils.fs.rm("dbfs:/user/hive/warehouse/schema.db/XXXXX", true)
Be careful with that though! Get the path wrong and it could be tragic!
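For example, in a Python notebook cell (the table name and path are the placeholders from the error above; double-check both before deleting anything):
spark.sql("DROP TABLE IF EXISTS schema.XXXXX")                      # drop the table from the metastore if it is still registered
dbutils.fs.rm("dbfs:/user/hive/warehouse/schema.db/XXXXX", True)    # then remove the leftover directory (recursive)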
Sometimes the metadata (the schema info of the Hive table) itself gets corrupted, so whenever we try to drop the table we get errors, because Spark checks for the existence of the table before deleting it.
We can avoid that if we use the Hive client to drop the table, since that skips the check for the table's existence.
Please refer to the Databricks documentation on this.

Oracle Uniface - selected data too large for SQL Workbench

I am currently working on a frustrating system where I have been given no direct DB access, only a weird SQL Workbench that cannot do much beyond the basics. For some reason I need to do a SELECT * on a table that has 174 columns, and whenever I try I get the following error:
"ERROR: Error -27 was encountered whilst running the SQL command. (-3)
Error -3 running SQL : ORACLE Driver Error [-27]: Selected data too
large for SQL Workbench"
Quick googling gave me nothing apart from (in one of the oracle documents):
In the SQL Editor, the maximum length of one row of the formatted
result is 8190 bytes. When this length is exceeded, the ORA connector
generates the above error
Now, I was wondering if anyone could give me a solution; that would be a great help. One solution I am considering is to increase the maximum length for the ORA connector/driver, but I am a novice in Oracle and don't know much beyond querying, so I haven't been able to change the maximum length yet.
So, please if anybody could help me out with this, that would be great.
Thanks a lot guys
Being asked to do database work through the Uniface SQL Workbench is not a good situation. It is only a very simple tool that you can use in an emergency if nothing else is available.
You could run a couple of queries, each time selecting the primary key and a subset of the columns, and stitch the results together in Excel (see the sketch below).
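For example (the table and column names are made up; the idea is to repeat the key in every pass and keep each result row under the 8190-byte limit):
-- pass 1: the primary key plus the first batch of columns
SELECT order_id, col_a, col_b, col_c FROM big_table ORDER BY order_id;
-- pass 2: the primary key plus the next batch
SELECT order_id, col_d, col_e, col_f FROM big_table ORDER BY order_id;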
If you have access to the Uniface Development Environment you can use it to convert your Oracle data to, for example, XML. Instructions are in the Uniface help file ulibrary.chm; see the command-line switch /cpy.
You cannot change the maximum record length of the Uniface Oracle Connector.

Return just the results from %SYS.ProcessQuery using $SYSTEM.SQL.Shell() in Intersystems Caché on a Unix server

Background
Hi,
I work with a Unix-based application that uses an Intersystems Caché database. Since I'm not that familiar with Caché, it wasn't until recently that I found out I could type...
$ cache
...to enter the database. From here, I found out I could access a number of things like the %FREECNT report, the ^DATABASE routine (to view/modify the size and other properties of the database), and $SYSTEM.SQL.Shell().
Since I found the $SYSTEM.SQL.Shell(), I've found a number of things I can use it for to obtain info about the database, specifically running processes using the %SYS.ProcessQuery table.
I'm able to run queries successfully - for example:
USER>ZN "%SYS"
%SYS>D $SYSTEM.SQL.Shell()
SQL Command Line Shell #Comment - Sql Shell Intro text
--------------------------------
Enter q to quit, ? for help.
%SYS>Select PID As Process_ID, State As Process_Status From %SYS.ProcessQuery
The above query will return results in this format:
Process_ID Process_State
--------------------------------
528352 READ
2943582 HANG
707023 RUN
3 Rows(s) Affected
--------------------------------
Question
Considering the background above, I'm looking for a way to return just the results, without the "SQL Command Line Shell" intro text, the column names, or the row-count footer. When I write a .ksh script in Unix to connect to Caché and run a query like the one above, the results come back along with the following text that I don't want included:
SQL Command Line Shell
--------------------------------
Enter q to quit, ? for help.
Process_ID Process_State
--------------------------------
3 Rows(s) Affected
--------------------------------
Additional Info
I realize I could use Unix commands such as awk and sed to filter out some of the text, but I'm looking for a slightly easier/cleaner way that might be built in. Maybe something with a silent or no_column_names flag, like the example in this LINK.
My end game is to have a script run that will obtain info from a query, then use that info to make changes to the database when certain thresholds are met. Ultimately, I want to schedule the script to run at regular intervals, so I need all the processing to occur on the server instead of creating a separate Client app that binds to the database.
You want to create a Cache Routine for this. You can do this in Cache Studio.
http://docs.intersystems.com/ens20131/csp/docbook/DocBook.UI.Page.cls?KEY=GSTD_Routines
In the routine, use either Embedded SQL or Dynamic SQL to run the query, iterate through the results, and print them with WRITE. I would recommend Dynamic SQL, as it will be more flexible in the future; a sketch follows the links below.
Introduction to SQL:
http://docs.intersystems.com/ens20131/csp/docbook/DocBook.UI.Page.cls?KEY=GSQL_intro#GSQL_intro_embeddedsql
Dynamic SQL Information:
http://docs.intersystems.com/ens20131/csp/documatic/%CSP.Documatic.cls?APP=1&LIBRARY=%SYS&CLASSNAME=%SQL.Statement
Embedded SQL Information:
http://docs.intersystems.com/ens20131/csp/docbook/DocBook.UI.Page.cls?KEY=GSQL_esql
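A minimal sketch of such a routine using Dynamic SQL (untested; the routine and label names are made up, and it assumes it runs in the %SYS namespace):
PROCLIST ; write one line per process, with no intro text, headers or footers
    new stmt,sc,rs
    set stmt=##class(%SQL.Statement).%New()
    set sc=stmt.%Prepare("SELECT PID, State FROM %SYS.ProcessQuery")
    if 'sc write "prepare failed",! quit
    set rs=stmt.%Execute()
    while rs.%Next() {
        write rs.%Get("PID"),"  ",rs.%Get("State"),!
    }
    quit
Your .ksh script could then call that entry point non-interactively; the exact cache/csession command line depends on your version, so check its documentation.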
You can create a duplicate of the %SQL.Shell class in your own namespace and edit it. If you want it as a routine, you can call its %Go() method from your routine.

Command to get the list of databases in sqlite

What is the command to get the list of SQLite databases created so far, from the terminal?
I googled it, but all I could find were the .schema and .tables commands, which don't work after the terminal is reopened.
.databases should do what you want.
I guess there is no way to know how many database files you have in some directory...
You can use .databases inside an opened database to list all databases attached to that connection, plus the main one,
so you only get those few DBs.
Regards
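For example (the file names are made up):
$ sqlite3 mydb.sqlite
sqlite> ATTACH 'other.sqlite' AS other;
sqlite> .databases
The .databases output lists the main database plus anything you have attached in that session.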
