I have started working with the Canvas section in Kibana, which uses Elasticsearch SQL to retrieve data.
What I am trying to do is retrieve the counts of several values, and I need to group certain values together: the ones that start with the same letters.
My SQL query looks like this:
SELECT
(SELECT COUNT(*) FROM logs WHERE status LIKE 'missingValue%'),
(SELECT COUNT(*) FROM logs WHERE status LIKE 'errorValue%'),
(SELECT COUNT(*) FROM logs WHERE status='exactErrorValue'),
(SELECT COUNT(*) FROM logs WHERE status='anotherExactErrorValue')
When I test this query with plain SQL on a small database, it works.
Now I want to make this work inside an element of my canvas. I chose a horizontal bar chart to represent it.
This is my Elasticsearch SQL query:
SELECT
(SELECT COUNT(*) FROM "monitoring-func-*"
WHERE status LIKE 'missingValue%'),
(SELECT COUNT(*) FROM "monitoring-func-*"
WHERE status LIKE 'errorValue%'),
(SELECT COUNT(*) FROM "monitoring-func-*"
WHERE status='exactErrorValue'),
(SELECT COUNT(*) FROM "monitoring-func-*"
WHERE status='anotherExactErrorValue')
And I get this error:
{
"error": {
"message": "[essql] > Unexpected error from Elasticsearch: [unresolved_exception] Invalid call to nullable on an unresolved object ScalarSubquery[With[{}]
\\_Project[[?COUNT(?*)]]
\\_Filter[(status) REGEX (LikePattern)#5139]
\\_UnresolvedRelation[[][index=monitoring-func-*],null,Unknown index [monitoring-func-*]],5142] AS ?"
}
}
Seeing "Unknown index", I first thought that the wildcard was the problem.
But it is not: it works perfectly fine in my other Elasticsearch queries.
Is there something about subqueries, or multiple SELECTs, that Elasticsearch SQL doesn't handle well?
I didn't find any resources or topics on this, but maybe I searched the wrong way.
Depending on your Elasticsearch version, essql either does not support subqueries at all or supports them only in a very limited form; see the Elasticsearch SQL documentation.
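One way to avoid scalar subqueries entirely is conditional aggregation: a single SELECT with one COUNT(CASE ...) per group. The sketch below runs against SQLite through Python's sqlite3 module as a stand-in, with a hypothetical logs table and made-up rows. Whether your Elasticsearch SQL version accepts the same CASE expressions is an assumption you would need to verify.

```python
import sqlite3

# Hypothetical sample data standing in for the "monitoring-func-*" index.
rows = [
    ("missingValueA",), ("missingValueB",), ("errorValue1",),
    ("exactErrorValue",), ("anotherExactErrorValue",), ("ok",),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logs (status TEXT)")
conn.executemany("INSERT INTO logs (status) VALUES (?)", rows)

# A single pass over the table: COUNT ignores the NULLs that CASE
# produces for rows that do not match the condition.
query = """
SELECT
  COUNT(CASE WHEN status LIKE 'missingValue%' THEN 1 END) AS missing,
  COUNT(CASE WHEN status LIKE 'errorValue%' THEN 1 END) AS errors,
  COUNT(CASE WHEN status = 'exactErrorValue' THEN 1 END) AS exact_error,
  COUNT(CASE WHEN status = 'anotherExactErrorValue' THEN 1 END) AS another_exact_error
FROM logs
"""
counts = conn.execute(query).fetchone()
print(counts)
```

The query returns one row with one column per group, which is the same shape the original subquery version was aiming for.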
Related
In Elasticsearch, using aggs, you can easily replicate this SQL query:
SELECT COUNT(id) FROM table;
But I want this:
SELECT COUNT(id), id FROM table;
Or even better I want to do:
SELECT COUNT(price) * price FROM table;
Is this even possible in Elasticsearch?
I have tried "terms" and similar, using pipelines or bucket_script, but the problem is that doc_count is not accessible in buckets_path.
I have the following Hibernate HQL query:
select t from Term t join ApprovedCourse ap on t.id = ap.term.id group by t order by t desc
It's failing with the
ORA-00979: not a GROUP BY expression
error because Oracle insists that all select values be in the group by. Hibernate, of course, is hiding the various fields of the Term object from us, letting us deal with it as a Term and not Term.id. (This query works on Postgres, by the way. Postgres is more liberal about its group by requirements.)
Hibernate is producing the following SQL:
select term0_.id as id1_12_, term0_.semester_id as semester_id2_12_, term0_.year_id as year_id3_12_
from term term0_
inner join approved_course approvedco1_
on (term0_.id=approvedco1_.term_id)
group by term0_.id
order by term0_.id desc
I've tried just removing the select t from the start of the query, but then Hibernate assumes that I'm selecting both the Term and ApprovedCourse objects, and that makes things worse.
So how do I make this work in a Hibernate way?
I found that I could get what I want by replacing the group by clause with a distinct in the select clause. Here's the resulting query:
select distinct(t) from Term t join ApprovedCourse ap on t.id = ap.term.id order by t desc
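The effect of replacing GROUP BY with DISTINCT can be checked at the SQL level. This is a minimal sketch using Python's sqlite3 module, not Oracle or Hibernate; the table and column names follow the SQL that Hibernate generated, and the rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE term (id INTEGER, semester_id INTEGER, year_id INTEGER)")
conn.execute("CREATE TABLE approved_course (id INTEGER, term_id INTEGER)")

# Hypothetical rows: term 1 has two approved courses, term 2 has none, term 3 has one.
conn.executemany("INSERT INTO term VALUES (?, ?, ?)",
                 [(1, 1, 2020), (2, 2, 2020), (3, 1, 2021)])
conn.executemany("INSERT INTO approved_course VALUES (?, ?)",
                 [(10, 1), (11, 1), (12, 3)])

# DISTINCT collapses the duplicate term rows produced by the join,
# so every selected column is covered without a GROUP BY clause.
terms = conn.execute("""
    SELECT DISTINCT t.id, t.semester_id, t.year_id
    FROM term t
    INNER JOIN approved_course ap ON t.id = ap.term_id
    ORDER BY t.id DESC
""").fetchall()
print(terms)
```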
I have created a table (movies) in Hive as below (id, name, year, rating, views):
1,The Nightmare Before Christmas,1993,3.9,4568
2,The Mummy,1932,3.5,4388
3,Orphans of the Storm,1921,3.2,9062
4,The Object of Beauty,1991,2.8,6150
5,Night Tide,1963,2.8,5126
6,One Magic Christmas,1985,3.8,5333
7,Muriel's Wedding,1994,3.5,6323
8,Mother's Boys,1994,3.4,5733
9,Nosferatu: Original Version,1929,3.5,5651
10,Nick of Time,1995,3.4,5333
I want to write a Hive query to get the name of the movie with the highest number of views.
select name,max(views) from movies;
but it gives me an error
FAILED: Error in semantic analysis: Line 1:7 Expression not in GROUP BY key name
but doing a group by with name gives me the complete list (which is expected).
What changes should I make to my query?
It is very possible that there is a simpler way to do this.
select name
from (
    select max(views) as views
         , name
         , row_number() over (order by max(views) desc) as row_num
    from movies
    group by name
) m
where row_num = 1
After a little digging, I found that the answer is not as straightforward as it would be in standard SQL. The query below gives the expected result.
select a.name,a.views from movies a left semi join(select max(views) views from movies)b on (a.views=b.views);
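The idea behind that query is to join each movie against a one-row subquery holding the maximum view count. LEFT SEMI JOIN is Hive syntax, so the sketch below uses an ordinary inner join in SQLite (via Python's sqlite3 module) as a stand-in, loaded with the sample rows from the question. If several movies tie for the maximum, all of them are returned.

```python
import sqlite3

# Sample rows from the question: (id, name, year, rating, views).
movies = [
    (1, "The Nightmare Before Christmas", 1993, 3.9, 4568),
    (2, "The Mummy", 1932, 3.5, 4388),
    (3, "Orphans of the Storm", 1921, 3.2, 9062),
    (4, "The Object of Beauty", 1991, 2.8, 6150),
    (5, "Night Tide", 1963, 2.8, 5126),
    (6, "One Magic Christmas", 1985, 3.8, 5333),
    (7, "Muriel's Wedding", 1994, 3.5, 6323),
    (8, "Mother's Boys", 1994, 3.4, 5733),
    (9, "Nosferatu: Original Version", 1929, 3.5, 5651),
    (10, "Nick of Time", 1995, 3.4, 5333),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE movies (id INTEGER, name TEXT, year INTEGER, rating REAL, views INTEGER)"
)
conn.executemany("INSERT INTO movies VALUES (?, ?, ?, ?, ?)", movies)

# Keep only the rows whose view count equals the overall maximum.
top = conn.execute("""
    SELECT a.name, a.views
    FROM movies a
    JOIN (SELECT MAX(views) AS views FROM movies) b ON a.views = b.views
""").fetchall()
print(top)
```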
I'm looking for a way to optimize my query.
We have a table of events called lea, with a column app_properties that holds tags stored as a comma-separated string.
I would like to select all the events that match the result of a query that selects the desired tags.
My first try:
SELECT uuid, app_properties, tag
FROM events
LATERAL VIEW explode(split(app_properties, '(, |,)')) tag_table AS tag
WHERE tag IN (SELECT source_value FROM mapping WHERE indicator = 'Bandwidth Usage')
But Hive will not allow this...
FAILED: SemanticException [Error 10249]: Line 4:6 Unsupported SubQuery Expression 'tag': Correlating expression cannot contain unqualified column references.
I gave it another try by replacing WHERE tag IN with WHERE tag_table.tag IN, but no luck...
FAILED: SemanticException Line 4:6 Invalid table alias tag_table' in definition of SubQuery sq_1 [tag_table.tag IN (SELECT source_value FROM mapping WHERE indicator = 'Bandwidth Usage')] used as sq_1 at Line 4:20.
In the end, the query below gives the desired result, but I have a feeling that this is not the most efficient way of solving this use case. Has anyone run into the same use case, where you need to select from a LATERAL VIEW using a subquery?
SELECT to_date(substring(events.time, 0, 10)) as date, t2.code, t2.indicator, count(1) as total
FROM events
LEFT JOIN (
    SELECT distinct t.uuid, im.code, im.indicator
    FROM mapping im
    RIGHT JOIN (
        SELECT tag, uuid
        FROM events
        LATERAL VIEW explode(split(app_properties, '(, |,)')) tag_table AS tag
    ) t
    ON im.source_value = t.tag AND im.indicator = 'Bandwidth Usage'
    WHERE im.source_value IS NOT NULL
) t2 ON (events.uuid = t2.uuid)
WHERE t2.code IS NOT NULL
GROUP BY to_date(substring(events.time, 0, 10)), t2.code, t2.indicator;
The Hive subquery in the WHERE clause can be used with IN, NOT IN, EXISTS, or NOT EXISTS as follows. If the alias (see the following example for the employee table) is not specified before columns (name) in the WHERE condition, Hive will report the error Correlating expression cannot contain unqualified column references. This is a limitation of the Hive subquery.
From Apache Hive Essentials.
I guess this problem is also caused by the subquery: events should have an alias.
I have the following tables:
Table1:
user_name  Url
Rahul      www.cric.info.com
ranbir     www.rogby.com
sahil      www.google.com
banit      www.yahoo.com
Table2:
Keyword  category
cric     sports
footbal  sports
google   search
I want to search Table1 by matching the keywords in Table2. I can do this with a CASE statement and the query works, but it is not the right approach, because I have to add another CASE branch every time I add a new search keyword.
select user_name,
case when url like '%cric%' then 'sports'
else 'undefined'
end as category
from table1;
Thanks, I found a solution for this approach. First we need to do the join, and after that we need to filter the records.
select user_name, url, keyword, category from (select table1.user_name, table1.url, table2.keyword, table2.category from table1 cross join table2) a where a.url like concat('%', a.keyword, '%');
Not sure about more current versions, but I've run into a similar problem. The primary issue is that Hive only supports equi-join conditions: when you apply logic to either side of the join, it has difficulty translating it into a MapReduce job.
The alternative method, if you have a reliably structured field, is that you can create a matching key from the larger field. For example, if you know that you're looking for your keyword to exist in the second position of a dot-delimited URI, you could do something like:
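select
  t1.Url
  , t1.matchKey
from
  -- matchKey has to be computed in a subquery before the join condition can use it
  (select Url, split(Url, "\\.")[1] as matchKey from Table1) t1
  join Table2 on Table2.keyword = t1.matchKey
;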
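The matching-key idea can be tried end to end on the sample tables from the question. This is a sketch under stated assumptions: SQLite (via Python's sqlite3 module) stands in for Hive, and because SQLite has no split function, a hypothetical match_key helper is registered as a user-defined function to take the second dot-delimited part of the URL. The LEFT JOIN and the 'undefined' fallback mirror the CASE version in the question and are not part of the answer above.

```python
import sqlite3

def match_key(url):
    """Return the second dot-delimited part of the URL, or None if there is none."""
    parts = url.split(".")
    return parts[1] if len(parts) > 1 else None

conn = sqlite3.connect(":memory:")
# Register the hypothetical helper so it can be called from SQL.
conn.create_function("match_key", 1, match_key)

conn.execute("CREATE TABLE table1 (user_name TEXT, url TEXT)")
conn.execute("CREATE TABLE table2 (keyword TEXT, category TEXT)")
conn.executemany("INSERT INTO table1 VALUES (?, ?)", [
    ("Rahul", "www.cric.info.com"), ("ranbir", "www.rogby.com"),
    ("sahil", "www.google.com"), ("banit", "www.yahoo.com"),
])
conn.executemany("INSERT INTO table2 VALUES (?, ?)", [
    ("cric", "sports"), ("footbal", "sports"), ("google", "search"),
])

# The key is derived in a subquery, so the join itself is a plain equi-join.
categories = dict(conn.execute("""
    SELECT t1.user_name, COALESCE(t2.category, 'undefined')
    FROM (SELECT user_name, match_key(url) AS k FROM table1) t1
    LEFT JOIN table2 t2 ON t2.keyword = t1.k
""").fetchall())
print(categories)
```

New keywords only require a new row in table2; the query itself stays unchanged.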