Can't subtract two columns involving an alias in Hive query - hadoop

I'm attempting to run the following query, where I use a window function to fetch the next log timestamp and then subtract the current timestamp from it.
SELECT
LEAD(timestamp) OVER (PARTITION BY id ORDER BY timestamp) AS lead_timestamp,
timestamp,
(lead_timestamp - timestamp) as delta
FROM logs;
However, when I do this, I get the following error:
FAILED: SemanticException [Error 10004]: Line 4:1 Invalid table alias or column reference 'lead_timestamp': (possible column names are: logs.timestamp, logs.latitude, logs.longitude, logs.principal_id)
If I drop this subtraction, the rest of the query works, so I'm stumped - am I using the AS syntax wrong above for lead_timestamp?

One of the limitations of Hive is that you can't refer to a column alias at the same query level where you assigned it (the HAVING clause is an exception). This is due to the way the code is structured around aliasing. The alias only becomes visible to an enclosing query, so you'll have to write this using a subquery.
SELECT lead_timestamp, timestamp, (lead_timestamp - timestamp) AS delta
FROM (
SELECT LEAD(timestamp) OVER (PARTITION BY id ORDER BY timestamp) AS lead_timestamp,
timestamp
FROM logs
) a;
It's ugly, but works.
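The subquery pattern above can be tried locally with a minimal sketch like the one below. It is only an illustration under assumptions: SQLite (via Python's `sqlite3`, needing SQLite 3.25 or later for window functions) stands in for Hive, the column is renamed to `ts`, timestamps are plain integers, and the `deltas` helper is hypothetical.

```python
import sqlite3


def deltas(rows):
    """Return (id, ts, lead_ts, delta) per row using the subquery pattern.

    SQLite is used here as a stand-in for Hive; `ts` holds integer timestamps.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE logs (id INTEGER, ts INTEGER)")
    conn.executemany("INSERT INTO logs VALUES (?, ?)", rows)
    query = """
        SELECT id, ts, lead_ts, (lead_ts - ts) AS delta
        FROM (
            -- The alias lead_ts is defined in the inner query ...
            SELECT id, ts,
                   LEAD(ts) OVER (PARTITION BY id ORDER BY ts) AS lead_ts
            FROM logs
        ) a
        -- ... and is an ordinary column by the time the outer query runs.
        ORDER BY id, ts
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in deltas([(1, 10), (1, 25), (2, 5)]):
        print(row)
```

The last row of each partition has no successor, so both `lead_ts` and `delta` come back as NULL (`None` in Python).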

Related

Entity-framework generated query throws ORA-12704 when using TO_CHAR or TO_NUMBER

I'm currently getting an exception when executing a query generated by the Entity Framework.
The query worked until I joined another table to the existing entity using .include(). Now, whenever I execute the query, I get the Oracle error ORA-12704: character set mismatch.
I narrowed the problem down to the following:
Before joining the table, the generated SQL is a simple query with some join statements. After joining another table, the generated SQL contains two subqueries which are combined using UNION ALL. In one of the subqueries, a lot of helper columns are selected.
They look like this:
SELECT
... some other columns...
TO_NUMBER(NULL) AS C1,
TO_CHAR(NULL) AS C2,
...
If I remove those columns and also the corresponding ones in the other subquery, no error is thrown. When I replace the columns with NULL instead of TO_XXX(NULL), the query also works as expected.
Is there any way to force the Entity Framework not to use these problematic casts?
The problem is caused by combining an NVARCHAR2 column with the function TO_CHAR (which returns the VARCHAR2 data type), as illustrated below:
create table tab
(txt nvarchar2(10));
select txt from tab
union all
select to_char(null) from dual;
ORA-12704: character set mismatch
So your goal is to motivate the tool to generate a query that uses either TO_NCHAR(null) or cast(null as nvarchar2(10)) - both will work.
To do this, you need to add the following data-annotation to the property of the corresponding entity:
[Column(TypeName = "NVARCHAR2")]
The given TypeName must match with the type of the column in the database.
After this addition, the Entity Framework will generate the correct casts. In this case, the following cast is generated:
SELECT
...
TO_NCHAR(NULL),
...
You should see no problem with the to_number(null) if the corresponding UNION column is of a number datatype.

getting error while using alias in hive

I am trying to subtract two columns and fetch the result if the difference is greater than 100 in Hive. I have written the following query:
select District.ID,Year,(volume_IN-volume_OUT) as d1 from petrol where d1>100;
but I am getting error.
The table column names are:
District.ID, Distributer name, volume_IN ,volume_OUT, Year
Please help me. Is there an error in the query? I am new to Hive.
One of the limitations of Hive is that you cannot refer to an alias in the WHERE clause of the same query that defines it.
Try writing a subquery, maybe something like the one below:
select * from (select District_ID,year, (volume_IN-volume_OUT) as d1 from petrol) t1 where d1>100;
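As a rough illustration of the answer above, the sketch below filters on the computed difference in two ways: through a subquery, and by repeating the expression in the WHERE clause, which is a standard SQL alternative. It rests on assumptions: SQLite (via Python's `sqlite3`) stands in for Hive, the column names are simplified to lowercase with underscores, and the `districts_over` helper is hypothetical.

```python
import sqlite3


def districts_over(rows, threshold):
    """Return the rows whose (volume_in - volume_out) exceeds threshold,
    computed once via a subquery and once by repeating the expression."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE petrol ("
        "district_id TEXT, year INTEGER, volume_in INTEGER, volume_out INTEGER)"
    )
    conn.executemany("INSERT INTO petrol VALUES (?, ?, ?, ?)", rows)

    # Option 1: define the alias in a subquery, filter on it outside.
    via_subquery = conn.execute(
        """
        SELECT district_id, year, d1
        FROM (SELECT district_id, year, (volume_in - volume_out) AS d1
              FROM petrol) t1
        WHERE d1 > ?
        ORDER BY district_id, year
        """,
        (threshold,),
    ).fetchall()

    # Option 2: skip the alias in WHERE and repeat the expression instead.
    via_repeat = conn.execute(
        """
        SELECT district_id, year, (volume_in - volume_out) AS d1
        FROM petrol
        WHERE (volume_in - volume_out) > ?
        ORDER BY district_id, year
        """,
        (threshold,),
    ).fetchall()

    conn.close()
    return via_subquery, via_repeat
```

Both forms should return the same rows; the subquery keeps the expression in one place, while repeating it avoids the extra nesting.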

Insert timestamp into Hive

Hi, I'm new to Hive and I want to insert the current timestamp into my table along with a row of data.
Here is an example of my team table :
team_id int
fname string
lname string
time timestamp
I have looked at some other examples, How to insert timestamp into a Hive table?, How can I add a timestamp column in hive and can't seem to get it to work.
This is what I am trying:
insert into team values('101','jim','joe',from_unixtime(unix_timestamp()));
The error I get is:
FAILED: SemanticException [Error 10293]: Unable to create temp file for insert values Expression of type TOK_FUNCTION not supported in insert/values
If anyone could help, that would be great, many thanks frostie
This can be achieved with current_timestamp(), but only via a SELECT clause. The SELECT statement doesn't even require a FROM clause.
insert into team select '101','jim','joe',current_timestamp();
Or, if your Hive version doesn't support omitting FROM in a SELECT statement:
insert into team select '101','jim','joe',current_timestamp() from team limit 1;
If you don't already have a table with at least one row, you can accomplish the desired result as follows:
insert into team select '101','jim','joe',current_timestamp() from (select '123') x;
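The INSERT ... SELECT shape used in the answers above can be sketched as follows. This is an assumption-laden illustration: SQLite (via Python's `sqlite3`) stands in for Hive, it does not share Hive's restriction on functions inside VALUES, and `CURRENT_TIMESTAMP` is SQLite's keyword rather than Hive's `current_timestamp()`. The `insert_with_timestamp` helper is hypothetical.

```python
import sqlite3


def insert_with_timestamp():
    """Insert one row whose last column is filled by the database clock."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE team (team_id INTEGER, fname TEXT, lname TEXT, time TEXT)"
    )
    # The literal values and the timestamp come from a SELECT, not from VALUES.
    conn.execute("INSERT INTO team SELECT 101, 'jim', 'joe', CURRENT_TIMESTAMP")
    row = conn.execute("SELECT team_id, fname, lname, time FROM team").fetchone()
    conn.close()
    return row


if __name__ == "__main__":
    print(insert_with_timestamp())
```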

timestamp not working in hive

I have a table with one column of data type 'timestamp'. Whenever I try to run queries on the table, even a simple select statement, I get errors.
An example of a value in that column:
'2014-01-01 05:05:20.664592-08'
The statement I am trying:
'select * from mytable limit 10;'
The error I am getting is:
'Failed with exception java.io.IOException:java.lang.NumberFormatException: For input string: "051-08000"'
Date functions in Hive like TO_DATE are also not working. If I change the data type to string, I am able to extract the date part using substring. But I need to work with timestamps.
Has anyone faced this error before? Please let me know.
Hive is having trouble understanding '2014-01-01 05:05:20.664592-08' as a timestamp because of the "592-08" at the end. You should change the datatype to string, cut off the offending portion with a string function, then cast back to timestamp:
select cast(substring(time_stamp_field,1,23) as timestamp) from mytable
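To see what the first 23 characters actually keep, here is a small sketch of the same truncate-then-parse idea in plain Python rather than Hive. The `parse_truncated` helper is hypothetical, and the format string is an assumption based on the sample value in the question.

```python
from datetime import datetime


def parse_truncated(raw: str) -> datetime:
    """Keep the first 23 characters (date, time, milliseconds) and parse them.

    raw[:23] mirrors Hive's substring(raw, 1, 23), which is 1-indexed.
    """
    trimmed = raw[:23]  # e.g. '2014-01-01 05:05:20.664'
    return datetime.strptime(trimmed, "%Y-%m-%d %H:%M:%S.%f")


if __name__ == "__main__":
    print(parse_truncated("2014-01-01 05:05:20.664592-08"))
```

Note that this approach drops both the sub-millisecond digits and the "-08" offset, so the resulting value carries no time zone information.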

Comparing Date column to sysdate yields: a non-numeric character was found where a numeric was expected

I've been having a strange issue where the comparison of a date column to SYSDATE yields the following error:
01858. 00000 - "a non-numeric character was found where a numeric was expected"
*Cause: The input data to be converted using a date format model was
incorrect. The input data did not contain a number where a number was
required by the format model.
*Action: Fix the input data or the date format model to make sure the
elements match in number and type. Then retry the operation.
I'm re-creating a MATERIALIZED VIEW with some minor changes, and whenever the process aborts it always points to the '>=' in the following derived table query:
SELECT id,
desc,
start_date,
end_date
FROM T_LIPR_POLICY_ROLE TLPR
WHERE end_date >= SYSDATE
Now end_date is of type DATE, and I can actually execute this query by itself, but whenever I try to run it in the materialized view it always aborts with the error above, even though last week I was able to create the view with the same query.
Any ideas?
Thank you,
Hi I'm terribly sorry for the long delay. I just couldn't post the whole statement for security reasons.
Now the issue has been resolved. The problem was that our materialized view script was aggregating data from 17 different places via UNIONs. For some reason the error was pointing to the wrong line of code (see below).
SELECT id,
desc,
start_date,
end_date
FROM T_LIPR_POLICY_ROLE TLPR
WHERE end_date >= SYSDATE <-- ORACLE POINTS TO THIS LINE
Now this was like the tenth statement in the script, but the error really was in the sixth statement in the script; which was obviously misleading. In this statement a particular record (out of millions) was attempting the following operation:
to_date(' / 0/ ') <-- This was the cause of the problem.
Note that this text didn't appear like this in the actual script. The script literally said to_date(<column name of type varchar>), but 2 records out of 15 million had the text specified above.
Now what I don't quite get is why Oracle points to the wrong line of code.
Is it an Oracle issue?
Is it a problem with SQL Developer?
Could it be a conflict with a hint? We use several like this: /*+ PARALLEL (init 4) */
Thank you for all your help.
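One way to hunt for stray values like the one described above is to scan the text column for entries that do not parse as dates. The sketch below shows that idea in plain Python rather than Oracle SQL. The `find_unparseable` helper and the `%m/%d/%Y` format are hypothetical, since the original post does not state the date format in use.

```python
from datetime import datetime


def find_unparseable(values, fmt="%m/%d/%Y"):
    """Return the values that cannot be parsed as dates with the given format.

    fmt is an assumed format; substitute whatever the real column uses.
    """
    bad = []
    for value in values:
        try:
            datetime.strptime(value, fmt)
        except ValueError:
            # Parsing failed: this is a candidate for the offending record.
            bad.append(value)
    return bad


if __name__ == "__main__":
    print(find_unparseable(["01/15/2014", " / 0/ ", "12/31/2013"]))
```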
Is desc a column name? If yes, then you are using the Oracle reserved keyword desc as a column name.
SELECT id,
desc,---- here
start_date,
end_date
FROM T_LIPR_POLICY_ROLE TLPR
WHERE end_date >= SYSDATE
We cannot use Oracle reserved keywords as column names unless they are enclosed in double quotes.
Please change the column name.
