First, this is what my table looks like:
tbl
-----------------------------------------
| USERID           | requestID          |
| test1#gmail.com  | sunsun#gmail.com   |
| sunsun#gmail.com | test1#gmail.com    |
| test2#gmail.com  | kittyhsk#gmail.com |
| sunsun#gmail.com | test2#gmail.com    |
| test#gmail.com   | sunsun#gmail.com   |
| sunsun#gmail.com | test3#gmail.com    |
I named my columns badly, but
the userIds are the IDs that follow the requestIds,
and the requestIds are the IDs being followed.
What I want to do is find the cases where two IDs follow each other.
For example, when I log in as sunsun#gmail.com (not a real address), I list the IDs I'm following and the IDs that follow me. Under each ID where the follow goes both ways, I want to print some text saying that we follow each other. (So under test1 and test2, I should have that text.)
I found this, but it does not really apply to my situation, because I have to get the results for one logged-in ID.
I have been trying to do this by myself, but I'm all out of ideas. Please help me out. Thanks in advance.
You will have to join the table with itself and compare the two columns crosswise. Something like:
SELECT *
FROM tbl AS t1
JOIN tbl AS t2
ON t1.requestid = t2.userid AND t1.userid = t2.requestid
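To restrict the self-join to one logged-in ID, a WHERE clause on t1.userid is enough. The sketch below is only an illustration: it uses Python's sqlite3 module as a stand-in for the real database, and the helper name mutual_follows is hypothetical. Note that with the sample rows exactly as posted, only test1 comes back, because the test2 row follows kittyhsk rather than sunsun.

```python
import sqlite3


def mutual_follows(rows, me):
    """Return the IDs that `me` follows and that also follow `me` back."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tbl (userid TEXT, requestid TEXT)")
    conn.executemany("INSERT INTO tbl VALUES (?, ?)", rows)
    query = """
        SELECT DISTINCT t1.requestid
        FROM tbl AS t1
        JOIN tbl AS t2
          ON t1.requestid = t2.userid   -- the person I follow...
         AND t1.userid = t2.requestid   -- ...also follows me
        WHERE t1.userid = ?
        ORDER BY t1.requestid
    """
    result = [r[0] for r in conn.execute(query, (me,))]
    conn.close()
    return result


# Sample rows copied from the question (userid follows requestid).
rows = [
    ("test1#gmail.com", "sunsun#gmail.com"),
    ("sunsun#gmail.com", "test1#gmail.com"),
    ("test2#gmail.com", "kittyhsk#gmail.com"),
    ("sunsun#gmail.com", "test2#gmail.com"),
    ("test#gmail.com", "sunsun#gmail.com"),
    ("sunsun#gmail.com", "test3#gmail.com"),
]

print(mutual_follows(rows, "sunsun#gmail.com"))
```

The returned list can then be checked while rendering each followed ID to decide whether to print the "following each other" text.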
I have a section table and a class table.
The class table is designed like this:
(id,class_name,section_id)
One class has many sections, like so:
--------------------------------------------
| SN | ClassName | Section_id |
--------------------------------------------
| 1 | ClassOne | 1 |
| 2 | ClassOne | 2 |
| 3 | ClassOne | 3 |
| 4 | ClassOne | 4 |
--------------------------------------------
Now I want to group by ClassName only and display all the sections of that class:
$data['classes'] = SectionClass::groupBy('class_name')->paginate(10);
I have used groupBy like this, but it only gives me one section ID.
Try this way...
$things = SectionClass::paginate(10);
$data['classes']= $things->groupBy('class_name');
You are getting just one row because that is what GROUP BY does: it groups a set of rows into a set of summary rows and returns one row for each group. In standard SQL, a query that includes a GROUP BY clause cannot refer to nonaggregated columns in the select list that are not named in the GROUP BY clause. For example, in SQL Server, if you try the following statement
SELECT * FROM [Class] GROUP BY [ClassName]
you'll get the following error:
"Column 'SN' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause"
Think about it: you are grouping by ClassName, and with your sample data this returns just one row. Your SELECT clause includes the column ClassName, which is easy to return because it is the same in every row of the group. But when you select another column, which of its values should be returned if only one can be chosen?
Now, things change a little bit in MySQL. MySQL extends the standard SQL use of GROUP BY so that the select list can refer to nonaggregated columns not named in the GROUP BY clause. This means that the preceding query is legal in MySQL (as described in the 5.6 documentation linked below; from MySQL 5.7.5 the default sql_mode includes ONLY_FULL_GROUP_BY, which rejects such queries). However, this is useful primarily when all values in each nonaggregated column not named in the GROUP BY are the same for each group. The server is free to choose any value from each group, so unless they are the same, the values chosen are nondeterministic. You can find a complete explanation of this topic here: https://dev.mysql.com/doc/refman/5.6/en/group-by-handling.html
If you are expecting the result in one row, you can use the GROUP_CONCAT() function to get something like
--------------------------------
| ClassName | Sections |
--------------------------------
| ClassOne | 1,2,3,4 |
--------------------------------
Your query must be something like:
select `ClassName`, group_concat(Section_id) from `class` group by `ClassName`
You can run this as a raw query in Laravel, or it's up to you to find a way to get the same result using the query builder ;)
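As a quick check of what the GROUP_CONCAT query returns, here is a minimal sketch using Python's sqlite3 module as a stand-in for MySQL (SQLite also provides group_concat). The table and column names follow the sample above; the variable sections_by_class is a name made up for this illustration. The order of the concatenated values is not guaranteed in SQLite, so the sketch sorts them after splitting.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE class (SN INTEGER, ClassName TEXT, Section_id INTEGER)")
conn.executemany(
    "INSERT INTO class VALUES (?, ?, ?)",
    [(1, "ClassOne", 1), (2, "ClassOne", 2), (3, "ClassOne", 3), (4, "ClassOne", 4)],
)

# One row per class, with all section ids collapsed into a comma-separated string.
rows = conn.execute(
    "SELECT ClassName, group_concat(Section_id) AS Sections "
    "FROM class GROUP BY ClassName"
).fetchall()

# Split the string back into a sorted list of integers per class.
sections_by_class = {
    name: sorted(int(s) for s in sections.split(","))
    for name, sections in rows
}
print(sections_by_class)
conn.close()
```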
How can I retrieve the distinct values from an internal table?
I am using SORT and DELETE ADJACENT DUPLICATES to get what I need, but I would like to improve this kind of selection.
The point is: imagine you have an internal table holding the information for two purchase orders, each with two items. How can I get the distinct purchase order numbers?
For instance, I've selected the following information from EKPO:
ebeln | ebelp
---------- | -----
1234567890 | 00010
1234567890 | 00020
1234567891 | 00010
1234567891 | 00020
To get distinct ebeln values:
ebeln
----------
1234567890
1234567891
For that, I need to sort the table and apply the DELETE ADJACENT DUPLICATES. I would like to know if there is any trick to replace these commands.
COLLECT also yields distinct values:
DATA: lt_collect like table of lt_source-some_field.
LOOP AT lt_source INTO ls_source.
COLLECT ls_source-some_field INTO lt_collect.
ENDLOOP.
* lt_collect has distinct values of lt_source-some_field
To get distinct EBELN what you need to do is simply
SELECT DISTINCT ebeln
FROM ekpo
INTO TABLE lt_distinct_ebeln
WHERE (your_where_condition).
That's all it takes.
Another option is to loop over the table and act whenever the value changes. For this to work, the table must be sorted by the field you are interested in.
loop at GT_TABLE into WA_TABLE.
on change of WA_TABLE-FIELD.
*Operation
endon.
endloop.
Another option is to do the same with AT NEW. The control break fires when the named field, or any field to its left in the row structure, changes value, so the field should be the leftmost column and the table must be sorted by it.
loop at GT_TABLE into WA_TABLE.
at new WA_TABLE-FIELD.
*Operation
endat.
endloop.
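The control-break idea behind ON CHANGE OF and AT NEW (sort first, then react each time the key changes) can be sketched outside ABAP as well. The following Python snippet is only an analogy, not ABAP: itertools.groupby starts a new group whenever the key differs from the previous row, which is why the input has to be sorted first. The helper name distinct_sorted is hypothetical; the rows are the EKPO sample from the question.

```python
from itertools import groupby
from operator import itemgetter

# Sample rows from the question, deliberately out of order.
ekpo_rows = [
    {"ebeln": "1234567891", "ebelp": "00010"},
    {"ebeln": "1234567890", "ebelp": "00010"},
    {"ebeln": "1234567891", "ebelp": "00020"},
    {"ebeln": "1234567890", "ebelp": "00020"},
]


def distinct_sorted(rows, field):
    """Return the distinct values of `field`, using a sort plus control break."""
    ordered = sorted(rows, key=itemgetter(field))
    # groupby emits one (key, group) pair each time the key value changes.
    return [key for key, _group in groupby(ordered, key=itemgetter(field))]


distinct_ebeln = distinct_sorted(ekpo_rows, "ebeln")
print(distinct_ebeln)
```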
I am new to Cassandra and I am trying to read a row from a table that contains these values:
siteId | country | someMap
1 | US | {a:b, x:z}
2 | PR | {a:b, x:z}
I have also created an index on the table using create index on columnfamily(keys(someMap));
but when I query with select * from table where siteId=1 and someMap contains key 'a'
it still returns the entire map:
1 | US | {a:b, x:z}
Can somebody tell me what I should do to get this result instead?
1 | US | {a:b}
You cannot. Even though each entry of a map, list, or set is stored internally as a column, you can only retrieve the whole collection, not part of it. Your query does not ask Cassandra for the map entry containing X; it asks for the row whose map contains X.
HTH,
Carlo
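Since the whole map comes back with the row, one workaround is to trim it on the client after the query. The sketch below assumes the driver hands back the row as a plain dictionary with the map column as a dict; the helper keep_map_keys and the row literal are made up for illustration and do not use any Cassandra driver API.

```python
def keep_map_keys(row, map_column, wanted_keys):
    """Return a copy of `row` whose map column holds only the wanted keys."""
    filtered = dict(row)  # shallow copy so the original row is left untouched
    filtered[map_column] = {
        key: value
        for key, value in row[map_column].items()
        if key in wanted_keys
    }
    return filtered


# Hypothetical row, shaped like the result in the question.
row = {"siteId": 1, "country": "US", "someMap": {"a": "b", "x": "z"}}
trimmed = keep_map_keys(row, "someMap", {"a"})
print(trimmed)
```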
I am using the following Hive query script with version 0.13.0:
DROP TABLE IF EXISTS movies.movierating;
DROP TABLE IF EXISTS movies.list;
DROP TABLE IF EXISTS movies.rating;
DROP DATABASE IF EXISTS movies;
ADD JAR /usr/local/hadoop/hive/hive/lib/RegexLoader.jar;
CREATE DATABASE IF NOT EXISTS movies;
CREATE EXTERNAL TABLE IF NOT EXISTS movies.list (id STRING, name STRING, genre STRING)
ROW FORMAT SERDE 'com.cisco.hadoop.loaders.RegexSerDe' with SERDEPROPERTIES(
"input.regex"="^(.*)\\:\\:(.*)\\:\\:(.*)$",
"output.format.string"="%1$s %2$s %3$s");
CREATE EXTERNAL TABLE IF NOT EXISTS movies.rating (id STRING, userid STRING, rating STRING, timestamp STRING)
ROW FORMAT SERDE 'com.cisco.hadoop.loaders.RegexSerDe'
with SERDEPROPERTIES(
"input.regex"="^(.*)\\:\\:(.*)\\:\\:(.*)\\:\\:(.*)$",
"output.format.string"="%1$s %2$s %3$s %4$s");
LOAD DATA LOCAL INPATH 'ml-10M100K/movies.dat' into TABLE movies.list;
LOAD DATA LOCAL INPATH 'ml-10M100K/ratings.dat' into TABLE movies.rating;
CREATE TABLE movies.movierating(id STRING, name STRING, genre STRING, rating STRING);
INSERT OVERWRITE TABLE movies.movierating
SELECT list.id, list.name, list.genre, rating.rating from movies.list list LEFT JOIN movies.rating rating ON (list.id=rating.id) GROUP BY list.id;
The issue is when I execute the script without the "GROUP BY" clause it works fine.
But when I execute it with the "GROUP BY" clause, I get the following error
FAILED: SemanticException [Error 10002]: Line 4:21 Invalid column reference 'name'
Any ideas what is happening here?
Appreciate your help
Thanks!
If you group by a column, your select statement can only select a) that column, b) columns derived only from that column, or c) a UDAF applied to other columns.
In this case, you're only grouping by list.id, so when you try to select list.name, that's invalid. Think about it this way: what if your list table contained the following two entries:
id|name |genre
--+-----+------
01|name1|comedy
01|name2|horror
What would you expect this query to return:
select list.id, list.name, list.genre from list group by list.id;
In this case it's nonsensical. I'm guessing that id in reality is a primary key, but note that hive does not know this, so the above data set is perfectly valid.
With all that in mind, it's not clear to me how to fix it because I don't know the desired output. For example, let's say without the group by (just the join), you have as output:
id|name |genre |rating
--+-----+------+-------
01|name1|comedy|'pretty good'
01|name1|comedy|'bad'
02|name2|horror|'9/10'
03|name3|action|NULL
What would you want the output to be with the group by? What are you trying to accomplish by doing the group by?
OK let me see if I can ask this in a better way.
Here are my two tables
Movies list table - Consists of movies information
ID | Movie Name | Genre
1 | Movie 1 | comedy
2 | movie 2 | action
3 | movie 3 | thriller
And I have ratings table
MOVIE_ID | USER ID | RATING on 5 | TIMESTAMP
1 | xyz | 5 | 12345612
1 | abc | 4 | 23232312
2 | zvc | 1 | 12321123
2 | zyx | 2 | 12312312
What I would like to do is get the output in the following way:
Movie ID | Movie Name | Genre | Rating Average
1 | Movie 1 | comedy | 4.5
2 | Movie 2 | action | 1.5
I am not a DB expert, but this is how I understand it: when you group rows together, every selected column has to reduce to a single value per group, either through an aggregate or because the value is the same in every row of the group. Is that right?
For example, in my previous case I was grouping them together as strings. That is fine for list.id, list.name and list.genre, but rating.rating was always going to cause a problem (I just learnt Pig along with Hive, and grouping works differently there).
So to tackle the problem, I cast the rating to a float, averaged it, and stored the result in a FLOAT column. Have a look at my code below:
CREATE TABLE movies.movierating(id STRING, name STRING, genre STRING, rating FLOAT);
INSERT OVERWRITE TABLE movies.movierating
SELECT list.id, list.name, list.genre, AVG(cast(rating.rating as FLOAT)) from movies.list list LEFT JOIN movies.rating rating ON (list.id=rating.id) GROUP BY list.id, list.name,list.genre order by list.id DESC;
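To sanity-check the shape of that query against the sample tables, here is a small sketch using Python's sqlite3 module as a stand-in for Hive (so REAL replaces FLOAT, and the column alias avg_rating and the column name ts are made up for this illustration). It groups by every nonaggregated column and averages the cast rating; the movie without ratings comes back with NULL because of the LEFT JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE list (id TEXT, name TEXT, genre TEXT);
    CREATE TABLE rating (id TEXT, userid TEXT, rating TEXT, ts TEXT);
""")
conn.executemany("INSERT INTO list VALUES (?, ?, ?)", [
    ("1", "Movie 1", "comedy"),
    ("2", "movie 2", "action"),
    ("3", "movie 3", "thriller"),
])
conn.executemany("INSERT INTO rating VALUES (?, ?, ?, ?)", [
    ("1", "xyz", "5", "12345612"),
    ("1", "abc", "4", "23232312"),
    ("2", "zvc", "1", "12321123"),
    ("2", "zyx", "2", "12312312"),
])

# Every selected column is either in the GROUP BY or wrapped in an aggregate.
query = """
    SELECT l.id, l.name, l.genre, AVG(CAST(r.rating AS REAL)) AS avg_rating
    FROM list AS l
    LEFT JOIN rating AS r ON l.id = r.id
    GROUP BY l.id, l.name, l.genre
    ORDER BY l.id
"""
averages = conn.execute(query).fetchall()
for row in averages:
    print(row)
conn.close()
```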
Thank you for your explanation. I might save the following question for the next thread but here is my observation:
The overall job is slower when the grouping and the join are done together than when they are done in two separate queries. For the same job, I changed the code a bit to perform the grouping first and then join the data, and the overall time dropped by 40 seconds: earlier it took 140 seconds and now it takes 100 seconds. Is there a reason for that?
Once again thank you for your explanation.
I came across the same issue:
org.apache.hadoop.hive.ql.parse.SemanticException: Invalid column reference "charge_province"
After I added "charge_province" to the GROUP BY, the issue went away. I don't know why.
I am having some trouble. I am trying to build a SQL query that uses "starts with" logic. A little background first...
In the database that I've been tasked to write reports from, there is a "user" table and a "salesperson" table, with salespersons belonging to a user. In a not-so-brilliant move, the designer of the database decided to associate the salespersons through a substring match to their employee code. For example:
John Smith's "employee_code" would be "JS". But he has multiple "salespersons" to distinguish his different sale types. So he might have "JS1", "JS2", "JS3", etc., as his "salesperson_code".
To illustrate:
user table:
|----------|-----------|----------|---------------|
| username | firstname | lastname | employee_code |
|----------|-----------|----------|---------------|
| JSMITH | John | Smith | JS |
|----------|-----------|----------|---------------|
salesperson table:
|------------------|------------------|
| salesperson_name | salesperson_code |
|------------------|------------------|
| John Smith 1 | JS1 |
| John Smith 2 | JS2 |
| John Smith 3 | JS3 |
|------------------|------------------|
There is no foreign key on the salesperson table linking them to the user table, only the substring from the employee code.
I do not remember where I found this answer, but in my queries I've been doing this:
select user.name
from user user
inner join salesperson spn on spn.salesperson_code like user.employee_code || '%'
This logic successfully does the "starts with" match. However, there are users with blank employee codes and they, also, match this query.
What I am looking for: how do I modify this query so that a blank employee_code does not match? I'm pretty new to Oracle queries. Other DBMSs have a "starts with" clause that will not match blank fields.
Thank you in advance for your help!
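Try this, which substitutes a placeholder ('-') for a blank code so that the pattern becomes '-%' instead of '%':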
select user.name
from user user
inner join salesperson spn
on spn.salesperson_code like nvl(trim(user.employee_code),'-') || '%'
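Try DECODE, which returns NULL as the pattern when the employee code is NULL, so that LIKE matches nothing: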
select user.name
from user user
inner join salesperson spn
on spn.salesperson_code like DECODE (user.employee_code,
NULL, NULL,
user.employee_code || '%')
I would suggest using a regular expression to extract the non-digit parts of the salesperson code and optionally the digits part. Create a view for the table with these added fields or use it as a table expression in the query.
SELECT regexp_substr(salesperson_code,'\D+') AS employee_code,
regexp_substr(salesperson_code,'\d+') AS employee_sales_no,
salesperson_name, salesperson_code
FROM salesperson
Note: the regular expressions match one or more non-digits and one or more digits respectively.
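Add an IS NOT NULL condition. Oracle treats an empty string as NULL, and NULL || '%' evaluates to '%', which matches every salesperson code: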
select *
from user
inner join salesperson spn
on spn.salesperson_code like user.employee_code || '%'
and user.employee_code is not null;
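The effect of guarding the prefix join can be reproduced with a small sketch. It uses Python's sqlite3 module as a stand-in, which differs from Oracle in one relevant way: SQLite keeps '' and NULL distinct and NULL || '%' is NULL there, so the blank code is modelled as an empty string and the guard checks for both. The table name users and the sample users BLANK and NOCODE are made up for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (username TEXT, employee_code TEXT);
    CREATE TABLE salesperson (salesperson_name TEXT, salesperson_code TEXT);
""")
conn.executemany("INSERT INTO users VALUES (?, ?)", [
    ("JSMITH", "JS"),
    ("BLANK", ""),      # hypothetical user with an empty employee code
    ("NOCODE", None),   # hypothetical user with a NULL employee code
])
conn.executemany("INSERT INTO salesperson VALUES (?, ?)", [
    ("John Smith 1", "JS1"),
    ("John Smith 2", "JS2"),
    ("John Smith 3", "JS3"),
])

# Unguarded "starts with" join: the empty code yields the pattern '%'.
unguarded = conn.execute("""
    SELECT u.username, s.salesperson_code
    FROM users AS u
    JOIN salesperson AS s ON s.salesperson_code LIKE u.employee_code || '%'
    ORDER BY u.username, s.salesperson_code
""").fetchall()

# Guarded join: blank and NULL codes are excluded before the LIKE applies.
guarded = conn.execute("""
    SELECT u.username, s.salesperson_code
    FROM users AS u
    JOIN salesperson AS s
      ON s.salesperson_code LIKE u.employee_code || '%'
     AND u.employee_code IS NOT NULL
     AND TRIM(u.employee_code) <> ''
    ORDER BY u.username, s.salesperson_code
""").fetchall()

print(unguarded)
print(guarded)
conn.close()
```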