How to pivot data in Hive? - hadoop

First, I've checked other topics on the subject, like "How to transpose/pivot data in hive?", but they don't match what I want.
This is the table I have:
| ID | Day | Status |
| 1 | 1 | A |
| 2 | 10 | B |
| 3 | 101 | A |
| 3 | 322 | B |
| 3 | 102 | C |
| 3 | 354 | D |
I'd like to concatenate the Status values for each ID, ordered by Day, to get this:
| ID | Status |
| 1 | A |
| 2 | B |
| 3 | A,C,B,D |
The thing is that I don't know in advance how many statuses I can have, so I can't create one column per day/status. That's why I don't know how to adapt the answers from other topics (with group_map and similar) to my problem.
Thanks for helping me ^^

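Use collect_set (for distinct values) or collect_list to aggregate the statuses into an array, and concatenate it with concat_ws. Neither function orders the elements by Day on its own, so the rows need sorting beforehand if the order matters. The basic aggregation is: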
select ID, concat_ws(',',collect_list(Status)) as Status
from table
group by ID;
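To make the expected result concrete, here is a minimal Python sketch of the same "group by ID, concatenate Status ordered by Day" logic, run on the sample rows from the question. It is only an illustration of the target output, not Hive code; the names `ROWS` and `concat_status_by_day` are made up for this example.

```python
from collections import defaultdict

# Sample rows from the question: (ID, Day, Status)
ROWS = [
    (1, 1, "A"),
    (2, 10, "B"),
    (3, 101, "A"),
    (3, 322, "B"),
    (3, 102, "C"),
    (3, 354, "D"),
]


def concat_status_by_day(rows):
    """Return {ID: "S1,S2,..."} with statuses ordered by Day within each ID."""
    grouped = defaultdict(list)
    # Sorting by (ID, Day) first fixes the order in which statuses are appended.
    for row_id, _day, status in sorted(rows, key=lambda r: (r[0], r[1])):
        grouped[row_id].append(status)
    return {row_id: ",".join(statuses) for row_id, statuses in grouped.items()}


if __name__ == "__main__":
    for row_id, statuses in concat_status_by_day(ROWS).items():
        print(row_id, statuses)
```

The explicit sort is the step that the plain collect_list query leaves out.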

Related

Laravel eloquent relationship has many joints

I have 3 tables that I want to join:
users:
| id | name |
|----|------|
| 1 | a |
| 2 | b |
comments:
| id | user_id | post_id | comment |
|----|---------|---------|---------|
| 1 | 1 | 2 | b |
| 2 | 1 | 1 | c |
posts:
| id | text |
|----|------|
| 1 | a |
| 2 | b |
I want to display it like this:
| id | user_id | name | post_id | comment |
I only know how to join the comments and posts tables using a hasMany relationship.
How do I join more tables (in this case the users table) with Eloquent relationships?
You can load the related records with the following code. But I don't think you can display the result directly as | id | user_id | name | post_id | comment |, because a user can have many comments and each comment points to its own post, so a single user does not fit on one row. You can make use of the nested arrays instead.
An example:
$user=User::with(['comments','comments.post'])->get();
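As a rough illustration of how the nested result can still be turned into the flat layout from the question, here is a small Python sketch that emits one row per comment. The dictionaries only imitate the shape of eager-loaded data (users carrying their comments); that shape and the names `USERS` and `flatten` are assumptions made for this example, not Laravel output.

```python
# Nested data shaped like "users with their comments", using the question's rows.
USERS = [
    {"id": 1, "name": "a", "comments": [
        {"id": 1, "user_id": 1, "post_id": 2, "comment": "b"},
        {"id": 2, "user_id": 1, "post_id": 1, "comment": "c"},
    ]},
    {"id": 2, "name": "b", "comments": []},
]


def flatten(users):
    """Return one (id, user_id, name, post_id, comment) tuple per comment."""
    rows = []
    for user in users:
        for comment in user["comments"]:
            rows.append((
                comment["id"],
                comment["user_id"],
                user["name"],
                comment["post_id"],
                comment["comment"],
            ))
    return rows


if __name__ == "__main__":
    for row in flatten(USERS):
        print(row)
```

Users without comments produce no rows here, which matches an inner join rather than a left join.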

DAX Query with multiple filters in powerbi

I have two tables, 'locations' and 'markets', with a many-to-many relationship between them on the column 'market_id'. A report-level filter has been applied on the column 'entity' from the 'locations' table. Now I need a distinct count of 'location_id' from the 'markets' table where 'active=TRUE'. How can I write a DAX query such that the distinct count of location_id changes dynamically with the selection made in the report-level filter?
Below is an example of the tables:
locations:
| location_id | market_id | entity | active |
|-------------|-----------|--------|--------|
| 1 | 10 | nyc | true |
| 2 | 20 | alaska | true |
| 2 | 20 | alaska | true |
| 2 | 30 | miami | false |
| 3 | 40 | dallas | true |
markets:
| location_id | market_id | active |
|-------------|-----------|--------|
| 2 | 20 | true |
| 2 | 20 | true |
| 5 | 20 | true |
| 6 | 20 | false |
I'm fairly new to powerbi. Any help will be appreciated.
Here you go:
DistinctLocations = CALCULATE(DISTINCTCOUNT(markets[location_id]), markets[active] = TRUE())
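To show what this measure is expected to return on the sample data, here is a hedged Python sketch. It assumes the 'entity' selection reaches the markets table through matching market_id values (this depends on how the relationship's filter direction is configured in the model); the names `LOCATIONS`, `MARKETS` and `distinct_active_locations` are invented for the example.

```python
# Sample rows from the question.
LOCATIONS = [
    # (location_id, market_id, entity, active)
    (1, 10, "nyc", True),
    (2, 20, "alaska", True),
    (2, 20, "alaska", True),
    (2, 30, "miami", False),
    (3, 40, "dallas", True),
]
MARKETS = [
    # (location_id, market_id, active)
    (2, 20, True),
    (2, 20, True),
    (5, 20, True),
    (6, 20, False),
]


def distinct_active_locations(selected_entities=None):
    """Count distinct active location_id values in MARKETS.

    selected_entities=None means no filter is applied; otherwise only markets
    whose market_id belongs to a selected entity are considered (assumption).
    """
    if selected_entities is None:
        visible = MARKETS
    else:
        market_ids = {
            market_id
            for _loc, market_id, entity, _active in LOCATIONS
            if entity in selected_entities
        }
        visible = [row for row in MARKETS if row[1] in market_ids]
    return len({loc for loc, _market_id, active in visible if active})


if __name__ == "__main__":
    print(distinct_active_locations())
    print(distinct_active_locations({"alaska"}))
```

Where this sketch returns 0 for a selection with no matching rows, the DAX measure would show a blank instead.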

laravel, group by category and select the record with the minimum price

I have models Book and BookCategory
How do I select the cheapest book in every category?
Book table:
| id | name | price | book_category_id |
| 1 | test | 10 | 1 |
| 2 | test | 15 | 3 |
| 3 | test | 75 | 1 |
| 4 | test | 25 | 2 |
| 5 | test | 19 | 1 |
| 6 | test | 11 | 2 |
| 7 | test | 10 | 1 |
The selection should be :
| id | name | price | book_category_id |
| 1 | test | 10 | 1 |
| 2 | test | 15 | 3 |
| 6 | test | 11 | 2 |
I've tried:
$books = Book::groupBy("book_category_id")->orderBy("price")->get();
But the output is not the row with the minimum price.
Any idea?
EDIT:
I found this page:
https://www.xaprb.com/blog/2006/12/07/how-to-select-the-firstleastmax-row-per-group-in-sql/
it has 90% of the solution in SQL
SELECT *
FROM books
WHERE price = (SELECT MIN(price) FROM books AS u WHERE u.book_category_id= books.book_category_id)
GROUP BY books.book_category_id
How do I convert this to the Laravel query builder?
You need to perform a subquery, as in this post. Try this:
$books = Book::from(DB::raw("(SELECT * FROM books ORDER BY price ASC) AS books"))
->groupBy("book_category_id")->get();
The subquery has to be wrapped in parentheses and given an alias, otherwise the generated SQL is invalid. Please note that this is a MySQL-only solution, because MySQL lets you select non-aggregated columns that are not in the GROUP BY (and only while the ONLY_FULL_GROUP_BY mode is off). If you need to do this for another DB, you need to perform an inner join on the subquery.
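For a variant that does not rely on MySQL's relaxed GROUP BY, here is a minimal sketch that runs a correlated subquery against an in-memory SQLite database from Python. Note that this swaps in a different technique from the answer above (a per-category `ORDER BY price, id LIMIT 1` lookup instead of a grouped subquery or inner join), and the tie-break on the lowest id is an assumption taken from the expected output in the question.

```python
import sqlite3

# Sample rows from the question: (id, name, price, book_category_id)
BOOKS = [
    (1, "test", 10, 1),
    (2, "test", 15, 3),
    (3, "test", 75, 1),
    (4, "test", 25, 2),
    (5, "test", 19, 1),
    (6, "test", 11, 2),
    (7, "test", 10, 1),
]

QUERY = """
    SELECT b.id, b.name, b.price, b.book_category_id
    FROM books AS b
    WHERE b.id = (
        SELECT u.id
        FROM books AS u
        WHERE u.book_category_id = b.book_category_id
        ORDER BY u.price, u.id   -- cheapest first, lowest id wins ties
        LIMIT 1
    )
    ORDER BY b.id
"""


def cheapest_per_category(rows):
    """Return the cheapest book row of every category, ordered by id."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE books ("
            "id INTEGER PRIMARY KEY, name TEXT, price INTEGER, "
            "book_category_id INTEGER)"
        )
        conn.executemany("INSERT INTO books VALUES (?, ?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in cheapest_per_category(BOOKS):
        print(row)
```

Because the subquery returns exactly one id per category, no GROUP BY is needed and ties on price cannot produce duplicate rows.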

Get duplicate rows based on one column using BIRT

I have one table in BIRT Report :
| Name | Amount |
| A | 200 |
| B | 100 |
| A | 150 |
| C | 80 |
| C | 100 |
I need to summarize this table into another table: if the name is the same, add the corresponding values.
Summarized table would be :
| A | 350 |
| B | 100 |
| C | 180 |
Here A = 200 + 150 , B = 100 , C = 80 + 100
How can I summarize a table from another table present in the BIRT report?
That is quite easy. Just add another table to your report and select the same data source as the first table (on the Binding tab).
Go to the Groups tab and add a group on your 'Name' column.
You'll see the table change: it adds a group header row and a group footer row. The header will also contain the element you grouped on (in this case Name).
Now right-click next to Name in the Amount column and select Insert->Aggregation.
Select the function SUM; the expression should be Amount, and Aggregate On should be your newly created group.
Now you can see the results but it will be something like:
| A | 350 |
| A | 200 |
| A | 150 |
| B | 100 |
| B | 100 |
| C | 180 |
| C | 100 |
| C | 80 |
If you delete the detail row from the table, you'll have the result you're after.
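The grouping and SUM aggregation described above amount to the following computation, sketched here in Python on the sample rows from the question. This is only an illustration of what the group header ends up showing, not BIRT scripting; `ROWS` and `sum_by_name` are names made up for the example.

```python
# Sample rows from the question: (Name, Amount)
ROWS = [("A", 200), ("B", 100), ("A", 150), ("C", 80), ("C", 100)]


def sum_by_name(rows):
    """Return {Name: total Amount}, one entry per distinct name."""
    totals = {}
    for name, amount in rows:
        totals[name] = totals.get(name, 0) + amount
    return totals


if __name__ == "__main__":
    for name, total in sum_by_name(ROWS).items():
        print(name, total)
```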
For your information:
Have a play with this, it's good exercise. Move the new aggregation to the group footer, add a top border to that cell, put a label "total" in front of it, and you'll have something like this:
| A | |
| A | 200 |
| A | 150 |
----------
| total | 350 |
| B | |
| B | 100 |
----------
| total | 100 |
| C | |
| C | 100 |
| C | 80 |
----------
| total | 180 |
Also, you don't have to select the data source as the binding; you can select your first table for the bindings instead:
select the table, open the Binding tab, select Report Item and pick your first table from the dropdown.
This can create very complex situations, so I usually try to work from the original dataset.

80% Rule Estimation Value in PL/SQL

Assume a range of values is inserted into a schema table, and at the end of the month I want to apply the following algorithm to these records (e.g. 2500 rows of numeric values): sort the values ascending (from the smallest to the highest value) and then find the 80% value of the sorted column.
In my example, if each row increases by one starting from 1, the 80% value will be the value in row 2000 (= 2500 - 2500*20/100). This algorithm needs to be implemented in a procedure where the number of rows is not constant; for example, it can vary from 2,500 to 1,000,000 per month.
Hint: You can achieve this using Oracle's cumulative aggregate functions. For example, suppose your table looks like this:
MY_TABLE
+-----+----------+
| ID | QUANTITY |
+-----+----------+
| A | 1 |
| B | 2 |
| C | 3 |
| D | 4 |
| E | 5 |
| F | 6 |
| G | 7 |
| H | 8 |
| I | 9 |
| J | 10 |
+-----+----------+
At each row, you can sum the quantities so far using this:
SELECT
    id,
    quantity,
    SUM(quantity)
        OVER (ORDER BY quantity ROWS UNBOUNDED PRECEDING)
        AS cumulative_quantity_so_far
FROM
    MY_TABLE
Giving you:
+-----+----------+----------------------------+
| ID | QUANTITY | CUMULATIVE_QUANTITY_SO_FAR |
+-----+----------+----------------------------+
| A | 1 | 1 |
| B | 2 | 3 |
| C | 3 | 6 |
| D | 4 | 10 |
| E | 5 | 15 |
| F | 6 | 21 |
| G | 7 | 28 |
| H | 8 | 36 |
| I | 9 | 45 |
| J | 10 | 55 |
+-----+----------+----------------------------+
Hopefully this will help in your work.
Write a query using the percentile_disc function to solve your problem. Sounds like it does what you want.
An example would be
select percentile_disc(0.8) within group (order by the_value)
from my_table
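To check the arithmetic from the question against this approach, here is a small Python sketch of a discrete percentile: it returns the smallest value whose position in the sorted list covers at least the requested share of rows. It mirrors the idea behind percentile_disc but is not Oracle's implementation; the function takes the percentage as an integer (an assumption made here to keep the rank computation in exact integer arithmetic).

```python
def percentile_disc(values, percent):
    """Return the smallest value covering at least `percent` % of the rows.

    `percent` is an integer between 0 and 100 (assumption for this sketch).
    """
    if not values:
        raise ValueError("values must not be empty")
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    ordered = sorted(values)
    n = len(ordered)
    # rank = ceil(n * percent / 100), computed with integers, and at least 1
    rank = max(1, (n * percent + 99) // 100)
    return ordered[rank - 1]


if __name__ == "__main__":
    # 2500 rows holding 1..2500, as in the question's example
    print(percentile_disc(range(1, 2501), 80))
```

With 2500 rows the rank is 2500 * 80 / 100 = 2000, which agrees with the row the question computes as 2500 - 2500*20/100.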
