PowerQuery: taking the average of each of many columns - powerquery

I'm new to PowerQuery and I have a table that is essentially a matrix of dates and hours within those days: the first column holds each date and the rest of the columns are labeled 1 through 24. An example is:
Date H1 H2 H3 H4 ...
---- -- -- -- --
Jan 1
Jan 2
Jan 3
...
This is stored in an Excel file that is quite large, so I want to be able to simply query that file and pull subsets of the data. One example is the average hourly number by year. In SQL this would be "SELECT YEAR(Date), AVG(H1), AVG(H2), ... FROM Source Table GROUP BY YEAR(Date)". However, in PowerQuery it seems that each Group By aggregation only generates one new column, so I would have to repeat the operation 24 times in this case, or more if I had data by the second (to be fair, in the SQL query you also have to type out each column unless you script it). Is there a simpler approach to generate my desired table (essentially collapsing each column to its average), or do I need to manually add each column?

You can unpivot your hour columns and then you only need to group by year and the unpivoted attribute column.
I made a sample table of your data and loaded it into Power Query. I converted the Date column to Year only, used Unpivot Other Columns with the Date column selected, then grouped by the Date and Hour columns. The result is one row per year and hour with its average.
You can of course re-pivot the data afterwards if you want, inside or outside of Power Query. This is what the code in Power Query looks like; it was all created with normal menu options, not written by hand.
let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Extracted Year" = Table.TransformColumns(Source,{{"Date", Date.Year, Int64.Type}}),
    #"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Extracted Year", {"Date"}, "Hour", "Value"),
    #"Grouped Rows" = Table.Group(#"Unpivoted Other Columns", {"Date", "Hour"}, {{"Average", each List.Average([Value]), type number}})
in
    #"Grouped Rows"
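For readers who want to see the logic outside Power Query, here is a minimal Python sketch of the same unpivot-then-group idea. It is only an illustration: the function name `yearly_hourly_average` and the sample rows are made up, and blank cells are modelled as `None` and skipped.

```python
from collections import defaultdict
from datetime import date
from statistics import mean


def yearly_hourly_average(rows):
    """Average every non-date column per year.

    rows: list of dicts with a 'Date' key (datetime.date) and one key per hour column.
    Returns {(year, column_name): average}.
    """
    buckets = defaultdict(list)
    for row in rows:
        year = row["Date"].year  # same role as the "Extracted Year" step
        for column, value in row.items():
            if column == "Date" or value is None:
                continue  # skip the key column and blank cells
            # "Unpivot": one (year, hour) bucket entry per cell
            buckets[(year, column)].append(value)
    # "Group By": collapse each bucket to its average
    return {key: mean(values) for key, values in buckets.items()}


# Hypothetical sample data
sample = [
    {"Date": date(2020, 1, 1), "H1": 10, "H2": 20},
    {"Date": date(2020, 1, 2), "H1": 30, "H2": 40},
    {"Date": date(2021, 1, 1), "H1": 5, "H2": 7},
]
averages = yearly_hourly_average(sample)
```

Because the hour columns are discovered from the data, the same code works for 24 columns or for thousands, which is the point of unpivoting first.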

Related

Google Sheets Formula - Get Total from filtered dates per row (undefined number of columns)

I have this data in Google Sheets where I need to get the total of the filtered date columns per row. The date columns are not fixed (they may increase over time; I already know how to handle this undefined number of columns). My current challenge is how to efficiently get a summary of totals per user based on the filtered date columns.
My data is like this:
My expected result is like this:
My current idea is this:
Here is a sample spreadsheet for reference:
https://docs.google.com/spreadsheets/d/1_dByPabStGQvh94TabKxwFeUyVaRFnkBCRf4ioTY5jM/edit?usp=sharing
This is a method to unpivot the data so you can work with it
=ARRAYFORMULA(
QUERY(
IFERROR(
SPLIT(
FLATTEN(
IF(ISBLANK(A2:A),,A2:A&"|"&B1:G1&"|"&B2:G)),
"|")),
"select Col1, Sum(Col3)
where
Col2 >= "&DATE(2022,1,1)&" and
Col2 <= "&DATE(2022,1,15)&"
group by Col1
label
Col1 'Person',
Sum(Col3) 'Total'"))
Basically, it's creating an output like User1|44557|8 for every cell. It then FLATTENs it all and SPLITs by the pipe, which gives you three clean columns.
Run that through a QUERY to SUM by the person between the dates and you get what you're after. If you want to use cell references for the dates, simply replace the DATE(...) calls with the cell references.
To expand the table, change B1:G1 and B2:G to match the width of the range.
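The flatten, split, and query steps can be easier to follow as ordinary code. The following Python sketch mirrors them under some assumptions: the names `totals_in_range`, `users`, `dates`, and `grid` are invented for the example, and blank cells are represented as `None`.

```python
from collections import defaultdict
from datetime import date


def totals_in_range(users, dates, grid, start, end):
    """Sum each user's values for the date columns that fall within [start, end].

    users: row labels; dates: column headers; grid: one list of values per user.
    """
    # FLATTEN + SPLIT equivalent: one (user, date, value) record per cell
    long_rows = [
        (user, day, value)
        for user, row in zip(users, grid)
        for day, value in zip(dates, row)
    ]
    # QUERY equivalent: filter by date, then sum grouped by user
    totals = defaultdict(int)
    for user, day, value in long_rows:
        if value is not None and start <= day <= end:
            totals[user] += value
    return dict(totals)


# Hypothetical sample data
users = ["User1", "User2"]
dates = [date(2022, 1, 1), date(2022, 1, 10), date(2022, 1, 20)]
grid = [[8, 2, 5], [1, None, 4]]
result = totals_in_range(users, dates, grid, date(2022, 1, 1), date(2022, 1, 15))
```

As in the formula, widening the table only means passing more date headers and longer rows; the filtering and summing logic stays the same.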

Power Query - Merge Pivot Combined Columns into rows

Hi, I'm creating a new thread since the problem I'm trying to solve is different from similar ones whose solutions I've tried unsuccessfully.
I have a table with the following structure (see below). Column "City" provides a list of cities A, B, ..., D. Column "Date 1" provides the date of the first event happening at each city. Column "Date 2" provides the date of the second event at each city.
City  Date 1  Date 2
----  ------  ------
A     4/4     5/3
B     4/5     5/4
C     4/6
D     4/7     5/5
I'm trying to bring all the dates for both events into a single column, as shown in the example below (column "Date"). While I'm able to pivot columns into rows using Power Query's Split function, I'm unable to solve this specific problem since the data is spread across two separate columns, "Date 1" and "Date 2".
Any Power Query ideas to solve this would be awesome, thanks in advance everyone!
City  Date
----  ----
A     4/4
B     4/5
C     4/6
D     4/7
A     5/3
B     5/4
D     5/5
Right-click the City column and choose "Unpivot Other Columns".
Then remove the extra column, sort, and rename columns as needed.
Another fairly quick way to do this in the GUI:
Select the date columns and click Merge Columns (under the Transform tab, Text Column section).
Choose a separator, say Semicolon, and click OK to do the merge.
Now choose Split Column > By Delimiter and choose the delimiter you just used (e.g. Semicolon).
IMPORTANT: Under advanced options, choose Split into Rows and click OK.
Filter out any nulls/blanks from the merged column.
@horseyride's suggestion is certainly fewer steps and cleaner code:
let
    Source = <Your Data Source Here>,
    #"Unpivoted Columns" = Table.Unpivot(Source, {"Date 1", "Date 2"}, "Column", "Date"),
    #"Removed Columns" = Table.RemoveColumns(#"Unpivoted Columns", {"Column"})
in
    #"Removed Columns"

Power Query, row by row the sum of the next 3 values

I have a Power Query table with one column of integer values. In another column, the sum of the current row and the next 2 rows should be calculated, row by row. In plain Excel, I calculate it like this:
B1: = SUM(B1:B3)
B2: = SUM(B2:B4)
B3: = SUM(B3:B5)
...
How can I solve this with Power Query? If an error occurs in the last 2 lines, this is negligible.
Thanks and regards
Guenther
Is this what you're looking for?
If you start with this as your Source table:
Then if you add a custom column set up like this:
You'll get this:
Here's the M code, loading it from a spreadsheet's workbook, where the data is in a table named Table1:
let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Added Custom" = Table.AddColumn(Source, "Custom", each List.Sum(List.Range(Source[Column1],[Column1]-1,3)))
in
    #"Added Custom"
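Note that the custom column uses [Column1]-1 as the offset into the list, so it lines up with the row position only when Column1 itself contains 1, 2, 3, and so on. The underlying idea is a forward-looking window over the column, which can be sketched by position in Python (the function name `forward_sums` is made up for this illustration):

```python
def forward_sums(values, window=3):
    """For each position, sum that value and the following window - 1 values.

    Slices that run past the end of the list are simply shorter, so the last
    rows sum fewer than `window` values instead of raising an error.
    """
    return [sum(values[i:i + window]) for i in range(len(values))]


# Hypothetical sample column
column = [1, 2, 3, 4, 5]
sums = forward_sums(column)
```

The last two entries cover fewer than three values, which matches the question's remark that the final two rows do not matter.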

Coldfusion query of queries count by date

I'm trying to get a count based on two dates and I'm not sure how it should look in a query. I have two date fields; I want to get a count based on those dates.
<cfquery>
SELECT COUNT(*)
FROM Table1
Where month of date1 is one month less than month of date2
</cfquery>
Assuming Table1 is your original query, you can accomplish your goal as follows.
Step 1 - Use QueryAddColumn twice to add two empty columns.
Step 2 - Loop through your query and populate these two columns with numbers. One will represent date1 and the other will represent date2. It's not quite as simple as putting in the month numbers because you have to account for the year as well.
Step 3 - Write your Q of Q with a filter resembling this:
where NewColumn1 - NewColumn2 = 1
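One way to build the numbers from Step 2 so that the year is accounted for is to use year * 12 + month, which turns "one month apart" into a difference of exactly 1 even across a year boundary. The Python sketch below illustrates that arithmetic only; the helper names are invented, and it assumes date1 should fall in the month immediately before date2, as the question describes.

```python
from datetime import date


def month_index(d):
    """Map a date to a running month number, e.g. 2021-12 -> 24264, 2022-01 -> 24265."""
    return d.year * 12 + d.month


def count_one_month_apart(pairs):
    """Count (date1, date2) pairs where date1's month is exactly one before date2's."""
    return sum(1 for d1, d2 in pairs if month_index(d2) - month_index(d1) == 1)


# Hypothetical sample rows
pairs = [
    (date(2021, 12, 15), date(2022, 1, 3)),   # December -> January: counts
    (date(2022, 1, 31), date(2022, 2, 1)),    # January -> February: counts
    (date(2022, 1, 10), date(2022, 3, 10)),   # two months apart: ignored
    (date(2021, 1, 5), date(2022, 2, 5)),     # adjacent month numbers, different years: ignored
]
matches = count_one_month_apart(pairs)
```

Comparing plain month numbers would miss the December to January case and wrongly accept the last pair, which is why the year has to be folded into the value.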

How to pull the data from Oracle on quarterly based on EFF_Date

I have data in the table POL_INFO with columns pol_num, pol_sym, pol_mod, and eff_date. I need to pull the data from it on a quarterly basis using EFF_DATE.
I'm not sure what you want to query, so here's an example that will hopefully get you started; it counts rows by quarter based on eff_date:
SELECT TO_CHAR(eff_date, 'YYYYQ'), COUNT(*)
FROM my_table
GROUP BY TO_CHAR(eff_date, 'YYYYQ')
The query relies on the TO_CHAR date format code Q, which returns the calendar quarter (Jan-Mar = quarter 1, Apr-Jun = quarter 2, etc.).
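Finally, be warned that an expression like TO_CHAR(eff_date, 'YYYYQ') cannot use a regular index on eff_date, so filtering on it in a WHERE clause is not optimizable. If you have millions of rows you'll want a different approach.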
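To make the quarter key concrete, here is a small Python sketch that builds the same kind of year-plus-quarter string and counts rows per key. The function names and sample dates are hypothetical; it only illustrates the grouping arithmetic and is not a replacement for the SQL.

```python
from collections import Counter
from datetime import date


def quarter_key(d):
    """Return year followed by calendar quarter, e.g. 2022-05-01 -> '20222'."""
    quarter = (d.month - 1) // 3 + 1  # Jan-Mar = 1, Apr-Jun = 2, Jul-Sep = 3, Oct-Dec = 4
    return f"{d.year}{quarter}"


def count_by_quarter(dates):
    """Count how many dates fall into each year-quarter bucket."""
    return Counter(quarter_key(d) for d in dates)


# Hypothetical eff_date values
eff_dates = [date(2022, 1, 15), date(2022, 3, 31), date(2022, 4, 1), date(2021, 12, 31)]
counts = count_by_quarter(eff_dates)
```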
