Is there an equivalent/replacement of ANY_VALUE (of SQL) in Apache Pinot Query

I'm looking for a function in Apache Pinot that can replace ANY_VALUE from SQL.
My query in SQL is
SELECT
id,
ANY_VALUE(kind) AS kind
...
GROUP BY id
The closest function I could find in Pinot is FIRSTWITHTIME:
SELECT
id,
FIRSTWITHTIME(kind, date, 'STRING') as NAME_DIFF_KIND
...
GROUP BY id
Disadvantage: the result cannot keep the same name as the existing column.
Is there any function that can do this?

Related

Is it possible to evaluate SQL given as text? (Oracle APEX 21.1)

I have created a classic report region (REGION: REPORT_FILTER_SHOP_TYPE).
Its source is the SQL below.
SELECT
ID, SHOP_NAME, SHOP_TYPE, OPEN_YEAR, CITY
FROM SHOP_LIST;
I want to apply a filter to this table. The filter criteria will be selected from list items, and this page has several lists.
For example, if there is no filter, the SQL is the one right above. But if "SHOP_TYPE" and "OPEN_YEAR" are selected, the SQL below should be executed.
SELECT * FROM (
SELECT
ID, SHOP_NAME, SHOP_TYPE, OPEN_YEAR, CITY
FROM SHOP_LIST
) S
WHERE S.SHOP_TYPE = 'BOOKSTORE' AND S.OPEN_YEAR <2010;
I can already compose the SQL text from the selected list items.
What do I need to set to display this result in REPORT_FILTER_SHOP_TYPE?
Well, most probably not like that. Why use parameters on a page if you hardcode values into the report's query? Use parameters!
Something like this:
SELECT id,
shop_name,
shop_type,
open_year
FROM shop_list
WHERE ( shop_type = :P1_SHOP_TYPE
OR :P1_SHOP_TYPE IS NULL)
AND ( open_year < :P1_OPEN_YEAR
OR :P1_OPEN_YEAR IS NULL);
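To see how the "column = :param OR :param IS NULL" pattern behaves, here is a minimal sketch that runs the same WHERE clause against an in-memory SQLite database from Python. SQLite only stands in for Oracle here, the sample rows are made up, and the bind names :shop_type and :open_year are placeholders for the APEX page items :P1_SHOP_TYPE and :P1_OPEN_YEAR.

```python
import sqlite3

# In-memory stand-in for SHOP_LIST; the rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE shop_list "
    "(id INTEGER, shop_name TEXT, shop_type TEXT, open_year INTEGER, city TEXT)"
)
conn.executemany(
    "INSERT INTO shop_list VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Old Pages", "BOOKSTORE", 2005, "Tokyo"),
        (2, "New Pages", "BOOKSTORE", 2015, "Osaka"),
        (3, "Bean Stop", "CAFE", 2008, "Tokyo"),
    ],
)

# Each filter is skipped when its bind value is NULL (None in Python).
QUERY = """
SELECT id, shop_name, shop_type, open_year
FROM shop_list
WHERE (shop_type = :shop_type OR :shop_type IS NULL)
  AND (open_year < :open_year OR :open_year IS NULL)
ORDER BY id
"""

def filter_shops(shop_type=None, open_year=None):
    """Return the ids of the shops that pass the optional filters."""
    params = {"shop_type": shop_type, "open_year": open_year}
    return [row[0] for row in conn.execute(QUERY, params)]

all_ids = filter_shops()                                  # no filter at all
bookstores_before_2010 = filter_shops("BOOKSTORE", 2010)  # both filters
opened_before_2010 = filter_shops(open_year=2010)         # one filter only
```

The point is that one static query covers every combination of selected list items, so there is no need to assemble SQL text at runtime.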

How to create a subquery that uses listagg in JPA repository?

How can I convert this WHERE clause using JPA Specification classes or a predicate builder?
I am using an Oracle DB.
WHERE (SELECT listagg(reject_cd,':') within group (order by order_no) as rejectList
FROM REJECT_TABLE WHERE ID = transactio0_.id group by id) like '%06%'
The LISTAGG function is highly specific to Oracle, and is not supported by JPQL. However, you can still use a native query here, e.g.
@Query(
value = "SELECT ... WHERE (SELECT LISTAGG(reject_cd,':') WITHIN GROUP (ORDER BY order_no) AS rejectList FROM REJECT_TABLE WHERE ID = transactio0_.id GROUP BY id) LIKE '%06%'",
nativeQuery = true)
Collection<SomeEntity> findAllEntitiesNative();
Another option here might be to find a way to avoid needing to use LISTAGG. But, we would need to see the full query along with sample data to better understand your requirement.

Impala - Getting error with Multiple count of distinct values

I am using CDH-5.4.4 (Cloudera edition) and have a CSV file in an HDFS location. My requirement is to run real-time SQL queries on a Hadoop environment (OLTP).
So I decided to go with Impala. I created a metastore table over the CSV file and am executing the query in the Impala editor (within the HUE application).
When I execute the query below, I get this error:
"AnalysisException: all DISTINCT aggregate functions need to have the
same set of parameters as count(DISTINCT City); deviating function:
count(DISTINCT Country)".
CSV File
OrderID,CustomerID,City,Country
Ord01,Cust01,Aachen,Germany
Ord02,Cust01,Albuquerque,USA
Ord03,Cust01,Aachen,Germany
Ord04,Cust02,Arhus,Denmark
Ord05,Cust02,Arhus,Denmark
Problematic query
Select CustomerID,Count(Distinct City),Count(Distinct Country) From CustomerOrders Group by CustomerID
Problem:
I cannot run an Impala query that contains more than one COUNT(DISTINCT ...) over different columns. The workaround I found online is NDV(), but NDV() only returns an approximate count of distinct values, and I need exact distinct counts for more than one column.
Expectation:
What is the best way to get exact distinct counts for more than one column? Kindly modify the above query so that it works with Impala.
Note: This is not my original table; I have replicated it for this forum question.
I've the same problem in Impala. Here is my workaround:
SELECT CustomerID
,sum(nr_of_cities)
,sum(nr_of_countries)
FROM (
SELECT CustomerID
,Count(DISTINCT City) AS nr_of_cities
,0 AS nr_of_countries
FROM CustomerOrders
GROUP BY CustomerID
UNION ALL
SELECT CustomerID
,0 AS nr_of_cities
,Count(DISTINCT Country) AS nr_of_countries
FROM CustomerOrders
GROUP BY CustomerID
) AS aa
GROUP BY CustomerID
I think this can be done cleaner (untested):
WITH
countries AS
(
SELECT CustomerID
,COUNT(DISTINCT Country) AS nr_of_countries
FROM CustomerOrders
GROUP BY 1
)
,
cities AS
(
SELECT CustomerID
,COUNT(DISTINCT City) AS nr_of_cities
FROM CustomerOrders
GROUP BY 1
)
SELECT CustomerID
,nr_of_cities
,nr_of_countries
FROM cities INNER JOIN countries USING (CustomerID)
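As a quick sanity check of the CTE approach, here is a minimal sketch that loads the five sample rows from the question into an in-memory SQLite database and runs one COUNT(DISTINCT ...) per CTE, counting Country in one and City in the other, then joins them on CustomerID. SQLite is only a stand-in for Impala (it does not have the multiple-DISTINCT restriction), so this checks the logic of the query, not Impala's behaviour.

```python
import sqlite3

# Sample data copied from the CSV in the question.
rows = [
    ("Ord01", "Cust01", "Aachen", "Germany"),
    ("Ord02", "Cust01", "Albuquerque", "USA"),
    ("Ord03", "Cust01", "Aachen", "Germany"),
    ("Ord04", "Cust02", "Arhus", "Denmark"),
    ("Ord05", "Cust02", "Arhus", "Denmark"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE CustomerOrders "
    "(OrderID TEXT, CustomerID TEXT, City TEXT, Country TEXT)"
)
conn.executemany("INSERT INTO CustomerOrders VALUES (?, ?, ?, ?)", rows)

# One DISTINCT aggregate per CTE, joined back together on CustomerID.
QUERY = """
WITH countries AS (
    SELECT CustomerID, COUNT(DISTINCT Country) AS nr_of_countries
    FROM CustomerOrders
    GROUP BY CustomerID
),
cities AS (
    SELECT CustomerID, COUNT(DISTINCT City) AS nr_of_cities
    FROM CustomerOrders
    GROUP BY CustomerID
)
SELECT CustomerID, nr_of_cities, nr_of_countries
FROM cities INNER JOIN countries USING (CustomerID)
ORDER BY CustomerID
"""

result = conn.execute(QUERY).fetchall()
for customer_id, nr_of_cities, nr_of_countries in result:
    print(customer_id, nr_of_cities, nr_of_countries)
```

With the sample data, Cust01 has two distinct cities and two distinct countries, and Cust02 has one of each.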

hive query using regular expression

Hi, I was looking for a way to query a Hive table (user_acc_detl)
where a column (ACC_DETAILS) holds data like the following:
COUNTRY[0]_united staes~DATE[0]_6/10/2014~AMOUNT[0]_200~ID[0]_20140509065052159324~COUNTRY[1]_united kingdom~DATE[1]_6/17/2014~AMOUNT[1]_125~ID[1]_20140516075156389761~COUNTRY[2]_Canada~DATE[2]_6/26/2014~AMOUNT[2]_200~ID[2]_20140515094013444121~COUNTRY[3]_Mexico~DATE[3]_7/3/2014~AMOUNT[3]_1200~ID[3]_20140601000937914898
I can query the Hive table with:
select ACC_DETAILS["COUNTRY[0]"] as COUNTRY, ACC_DETAILS["DATE[0]"] as DATE, ACC_DETAILS["AMOUNT[0]"] as BILLAMOUNT, ACC_DETAILS["ID[0]"] as PAYMENTID
from user_acc_detl
The above query returns the data for COUNTRY[0], DATE[0], AMOUNT[0] and ID[0], which is fine.
Question: I need to query it using just country, date, amount, and so on, without specifying the index as in COUNTRY[0].
Question: Is there a way to modify the query accordingly using a regular expression? Please help.
A simple way to achieve this is to wrap your query in a view:
CREATE VIEW user_acc_detl_simple AS
SELECT ACC_DETAILS["COUNTRY[0]"] as COUNTRY
, ACC_DETAILS["DATE[0]"] as DATE
, ACC_DETAILS["AMOUNT[0]"] as BILLAMOUNT
, ACC_DETAILS["ID[0]"] as PAYMENTID
FROM user_acc_detl;
SELECT country, date, billamount, paymentid FROM user_acc_detl_simple;

creating a hiveQL query that uses UDF function that can return column names

I want to create a Hive UDF that returns specific column names based on some value, say retreivecol(age). If the age is 20, it should return the list of column names to be used in the select query, like 'name,email,fbuserid,friendslist' etc., and if the age is less than 20 it should return 'name' alone. So I want my HiveQL query to look like:
select retreivecol(age) from User_Data;
The above query just prints the names of the columns, like 'name,email,fbuserid,friendslist' etc., as opposed to treating them as column names and selecting those columns. Any pointers are appreciated.
I'm not sure a UDF is the right place to do this, as UDFs simply see the value passed to them; they don't really have access to the whole table structure.
Instead, could you do this in a nested query?
select name,email,id FROM
(
select
name,
if(age >= 20, email, NULL) as email,
if(age >= 20, id, NULL) as id
FROM mytable
) a
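Here is a minimal sketch of that idea, run against an in-memory SQLite database from Python. It assumes the intent stated in the question (all columns at age 20 and above, name alone below 20), uses CASE instead of Hive's if(), and the table contents are made up for illustration.

```python
import sqlite3

# In-memory stand-in for the Hive table; the rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (name TEXT, email TEXT, id TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?)",
    [
        ("Ann", "ann@example.com", "fb1", 25),
        ("Bob", "bob@example.com", "fb2", 17),
    ],
)

# Every row has the same columns; the restricted ones are NULL
# for rows where age is below 20.
QUERY = """
SELECT name, email, id FROM
(
    SELECT
        name,
        CASE WHEN age >= 20 THEN email END AS email,
        CASE WHEN age >= 20 THEN id END AS id
    FROM mytable
) a
ORDER BY name
"""

masked_rows = conn.execute(QUERY).fetchall()
```

A query always returns a fixed set of columns, so the closest you can get per row is to null out the columns that should not be visible.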
