I am using Power Query in Excel 2016 to combine the words in one column, grouped by the category stored in another column. I use Group By with Text.Combine inside it.
I expect the order of the words to be preserved, but it seems random.
There are 3 pictures below.
The first is my original table before sorting.
MAPPED WORD is what I need combined, CATEGORY is the bucket for each combination,
and POSITION indicates the position of the MAPPED WORD in the SKU. I sort the words in that order, expecting Text.Combine to retain it in the final strings. I am interested in the red- and blue-highlighted words for this example.
ORIGINAL TABLE
| Market | Tag | SKU | Position | Category | Mapped Word |
|---|---|---|---|---|---|
| ABG | 130 | HELLO DAY CRYSTAL MIDI GRANOLA CHOCOLATE | 11 | BRAND | Crystal |
| ABG | 130 | HELLO DAY CRYSTAL MIDI GRANOLA CHOCOLATE | 7 | BRAND | Day |
| AAI | 30 | FINAX HEALTHY GOOD MUESLI APPLE NUT RAISIN | 1 | BRAND | Finax |
| AAI | 30 | FINAX HEALTHY GOOD MUESLI APPLE NUT RAISIN | 7 | OTHER | Healthy |
| ABG | 130 | HELLO DAY CRYSTAL MIDI GRANOLA CHOCOLATE | 1 | BRAND | Hello |
| ABG | 130 | HELLO DAY CRYSTAL MIDI GRANOLA CHOCOLATE | 19 | BRAND | Midi |
| AAI | 30 | FINAX HEALTHY GOOD MUESLI APPLE NUT RAISIN | 20 | TYPE | Muesli |
| AAI | 30 | FINAX HEALTHY GOOD MUESLI APPLE NUT RAISIN | 33 | FLAVOURS | Nuts |
| AAI | 30 | FINAX HEALTHY GOOD MUESLI APPLE NUT RAISIN | 37 | FLAVOURS | Raisins |
| ABG | 130 | HELLO DAY CRYSTAL MIDI GRANOLA CHOCOLATE | 32 | FLAVOURS | Chocolate |
| AAI | 30 | FINAX HEALTHY GOOD MUESLI APPLE NUT RAISIN | 27 | FLAVOURS | Apple |
| ABG | 130 | HELLO DAY CRYSTAL MIDI GRANOLA CHOCOLATE | 24 | TYPE | Granola |
| AAI | 30 | FINAX HEALTHY GOOD MUESLI APPLE NUT RAISIN | 15 | BRAND | Good |
AFTER SORT
AFTER GROUPBY-COMBINE.
The problem is that the result does not seem to follow any logic: the sort order is ignored. The red words are appended in alphabetical order, whilst the blue ones are in no specific order.
I need the words combined in the order as per POSITION column.
This is the same answer I gave to a different question about operations after sorting, but I tested it, and wrapping your sorting step in Table.Buffer() seems to work here as well:
Table.Buffer(Table.Sort(PROPERCASE_WORDS,{{"TAG",Order.Ascending},{"CATEGORY",Order.Ascending}, {"POSITION",Order.Ascending}}))
AFAIK Table.Buffer loads the table into memory and in doing so resets an internal index used by various PQ operations to match the current sorting of the table. I don't know if there are any downsides to doing this, but it seems to work in a number of cases where you want an operation to proceed in a "top to bottom" manner.
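To make the intended result concrete, here is a minimal sketch in Python (not Power Query) of the "sort by position, then group and combine" logic. The rows are the Tag 130 rows from the question; the function name `combine_in_position_order` is hypothetical and only illustrates the output the query is expected to produce.

```python
from itertools import groupby

# (tag, position, category, mapped_word), copied from the Tag 130 rows above
rows = [
    ("130", 11, "BRAND", "Crystal"),
    ("130", 7, "BRAND", "Day"),
    ("130", 1, "BRAND", "Hello"),
    ("130", 19, "BRAND", "Midi"),
    ("130", 32, "FLAVOURS", "Chocolate"),
    ("130", 24, "TYPE", "Granola"),
]


def combine_in_position_order(rows):
    """Join the words of each (tag, category) group in ascending position."""
    # Sort by tag, then category, then position, like the Table.Sort step.
    ordered = sorted(rows, key=lambda r: (r[0], r[2], r[1]))
    combined = {}
    # groupby keeps the sorted order inside each group, so the join is stable.
    for (tag, category), group in groupby(ordered, key=lambda r: (r[0], r[2])):
        combined[(tag, category)] = " ".join(word for _, _, _, word in group)
    return combined


result = combine_in_position_order(rows)
print(result[("130", "BRAND")])  # Hello Day Crystal Midi
```

The BRAND words come out in SKU order (positions 1, 7, 11, 19), which is the order the grouped Text.Combine step should reproduce once the sort is respected.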
Let's say we are selling 3 different flavors of juice (orange, apple and grape), and customers purchase several bottles of juice for a group of people. For the sake of this question, let's assume they select flavors depending on various input data such as season, weather, temperature, etc. There can be many inputs, but let's limit them to 4 in this example. Here is an example of their purchase history:
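| Order qty | Input_1 (season) | Input_2 (weather) | Input_3 | Input_4 | orange | apple | grape |
|---|---|---|---|---|---|---|---|
| 50 | summer | sunny | 78 | adult | 20 | 0 | 30 |
| 30 | winter | rainy | 35 | children | 20 | 10 | 0 |
| 75 | spring | cloudy | 50 | both | 30 | 30 | 15 |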
What machine learning algorithm can predict how many bottles of each flavor a customer would purchase given the input parameters? Notice that the quantities of the 3 flavors must add up to the order quantity, and no quantity can be less than zero.
It is a regression problem.
You can solve it with deep learning: one-hot encode the categorical features and normalize the numerical values.
X is the first five columns (order quantity and the four inputs) and y is the orange, apple and grape columns.
Then you can train a deep neural network model and use it to predict.
TensorFlow and PyTorch are good examples of deep learning libraries.
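The preprocessing steps above can be sketched in plain Python. The answer does not say how to enforce the constraints from the question, so the `allocate` function below is an assumption on my part: it treats the three raw model outputs as scores, turns them into proportions with a softmax, scales them by the order quantity, and rounds with a largest-remainder rule. All function names are hypothetical, and no real model is trained here.

```python
import math


def one_hot(value, categories):
    """Encode a categorical value as a 0/1 vector over the known categories."""
    return [1.0 if value == c else 0.0 for c in categories]


def min_max(value, lo, hi):
    """Scale a numeric value into [0, 1] given its observed range."""
    return (value - lo) / (hi - lo)


def softmax(scores):
    """Convert arbitrary real scores into non-negative shares summing to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def allocate(order_qty, scores):
    """Turn raw scores into non-negative integers that sum to order_qty."""
    shares = [p * order_qty for p in softmax(scores)]
    counts = [math.floor(s) for s in shares]
    # Hand the leftover bottles to the flavors with the largest fractions.
    leftover = order_qty - sum(counts)
    by_fraction = sorted(range(len(shares)),
                         key=lambda i: shares[i] - counts[i], reverse=True)
    for i in by_fraction[:leftover]:
        counts[i] += 1
    return counts


# Features for the first row of the table (Input_3 range 35..78 is assumed
# from the three example rows only).
features = ([min_max(78, 35, 78)]
            + one_hot("summer", ["spring", "summer", "winter"])
            + one_hot("sunny", ["sunny", "rainy", "cloudy"])
            + one_hot("adult", ["adult", "children", "both"]))

# Made-up scores standing in for a model's output for orange, apple, grape.
counts = allocate(50, [1.0, -2.0, 1.5])
print(features, counts, sum(counts))
```

Predicting proportions and scaling them by the order quantity is only one way to satisfy the "sum to order quantity, never negative" requirement; an unconstrained regression on the three counts would not guarantee it.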
I've just started using Quicksight and I can't for the life of me figure this out.
Let's say I have a table that looks like this:
| date | country | color | nb_sales |
|---|---|---|---|
| 10-11 | USA | Black | 10 |
| 10-11 | USA | Blue | 5 |
| 10-12 | USA | Blue | 20 |
| 10-11 | UK | Black | 10 |
| 10-12 | UK | Black | 15 |
| 10-11 | UK | Blue | 15 |
What I want is the average daily number of sales by country, preferably in a pie chart:
| country | avg_nb_sales |
|---|---|
| USA | 17.5 |
| UK | 20 |
So I first need to group by date and country and sum nb_sales, and then, once this aggregation is done, take the average by country. I thought I should be using avgOver(sum(nb_sales), country) but I can't get it right.
So how do I achieve that?
I thank you for your time.
I have managed to get the expected result, though I'm not sure that it is the best way to do it (especially if there are dates with sales for only one country):
avgOver(sum(sales), [country])/distinct_count(date)
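For reference, the two-step aggregation described in the question (sum per date and country, then average per country) can be checked outside QuickSight with a short Python sketch. The function name `average_daily_sales` is hypothetical; the data is the example table from the question.

```python
from collections import defaultdict

# (date, country, color, nb_sales), copied from the question
sales = [
    ("10-11", "USA", "Black", 10),
    ("10-11", "USA", "Blue", 5),
    ("10-12", "USA", "Blue", 20),
    ("10-11", "UK", "Black", 10),
    ("10-12", "UK", "Black", 15),
    ("10-11", "UK", "Blue", 15),
]


def average_daily_sales(rows):
    """Sum sales per (date, country), then average those sums per country."""
    daily = defaultdict(int)
    for date, country, _color, nb_sales in rows:
        daily[(date, country)] += nb_sales

    per_country = defaultdict(list)
    for (_date, country), total in daily.items():
        per_country[country].append(total)

    return {country: sum(totals) / len(totals)
            for country, totals in per_country.items()}


averages = average_daily_sales(sales)
print(averages)  # {'USA': 17.5, 'UK': 20.0}
```

Note that this sketch averages over the days on which a country actually has sales, so a country with no rows on a given date is not penalised for that date. That is the situation the caveat above is concerned with.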
I'm trying to create an application that forms a team of 4 people in a shooter game.
There are 3 roles for 4 players. We need 2x assault, 1x sniper and 1x medic in a team.
I would be choosing players from 3 arrays; each array contains the signups for that role (player name and priority number). Players can sign up for multiple roles.
Sniper[0] John 100
Sniper[1] Mort 91
Sniper[2] Stef 70
Medic[0] Jerry 92
Medic[1] Mort 91
Medic[2] Jambo 19
Assault[0] Jerry 92
Assault[1] Haler 91
Assault[2] Gowgow 79
Assault[3] Jambo 19
This is what the 3 arrays would look like.
Selection in this case should be:
Sniper - John 100
Medic - Mort 91
Assault1 - Jerry 92
Assault2 - Haler 91
The application should always try to select the people with the highest priority for the available roles.
Could anyone at least point me in the right direction on how to solve this? I'm really stuck, as I have no idea how to do it and I don't know what to search for online to learn it either.
I solved my selection problem with the Hungarian algorithm (also known as the Munkres algorithm).
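The Hungarian algorithm solves this kind of assignment problem efficiently. For a team of 4 the same optimum can also be found by simply trying every assignment, which is what the sketch below does. To be clear, this is an exhaustive search, not an implementation of the Hungarian algorithm, and the function name `best_team` is hypothetical. The data is taken from the question.

```python
from itertools import permutations

# Signups per role: player name -> priority, copied from the question
signups = {
    "Sniper": {"John": 100, "Mort": 91, "Stef": 70},
    "Medic": {"Jerry": 92, "Mort": 91, "Jambo": 19},
    "Assault": {"Jerry": 92, "Haler": 91, "Gowgow": 79, "Jambo": 19},
}
# One entry per seat in the team; Assault appears twice.
slots = ["Sniper", "Medic", "Assault", "Assault"]


def best_team(signups, slots):
    """Try every way of seating distinct players and keep the best total."""
    players = sorted({name for role in signups.values() for name in role})
    best_total, best_pick = -1, None
    for pick in permutations(players, len(slots)):
        # Skip assignments where someone did not sign up for their seat.
        if any(player not in signups[role]
               for role, player in zip(slots, pick)):
            continue
        total = sum(signups[role][player]
                    for role, player in zip(slots, pick))
        if total > best_total:
            best_total, best_pick = total, pick
    if best_pick is None:
        return best_total, []  # no valid team exists
    return best_total, list(zip(slots, best_pick))


total, team = best_team(signups, slots)
print(total, team)
```

With 7 distinct players there are only 840 permutations to check, so brute force is fine here; the Hungarian algorithm becomes worthwhile when the number of players and seats grows.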
I organized my data so that I get an output that gives me counts by modality. For example:
Resident Location Year Modality PGY ModalityCount
47 john smith SMH 2009-2010 CT PGY2 12
48 john smith SMH 2009-2010 MRI PGY2 4
However, I would like the counts to be overall totals, i.e. how many scans each resident did across all modalities.
How would I do this?
Thanks
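As a rough illustration of the desired result, the sketch below sums ModalityCount over all modalities. It is written in Python rather than the tool used to produce the output above, and it assumes that "overall" means one total per resident and year; the function name `total_scans` is hypothetical.

```python
from collections import defaultdict

# (Resident, Location, Year, Modality, PGY, ModalityCount) from the example
rows = [
    ("john smith", "SMH", "2009-2010", "CT", "PGY2", 12),
    ("john smith", "SMH", "2009-2010", "MRI", "PGY2", 4),
]


def total_scans(rows):
    """Add up ModalityCount across modalities for each (resident, year)."""
    totals = defaultdict(int)
    for resident, _location, year, _modality, _pgy, count in rows:
        totals[(resident, year)] += count
    return dict(totals)


totals = total_scans(rows)
print(totals)  # {('john smith', '2009-2010'): 16}
```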
I am trying to set up an Access database and will be importing the data from Excel. We do our analysis in R and the current Excel worksheet we use is formatted and arranged to work well for exporting to R and doing analysis there.
The format is as follows:
The first 12 columns of data describe date, location and other information, which then applies to the following 12 columns. The trouble is that for a single set of observations the information in the first 12 columns doesn't change from row to row, but the values in the second 12 columns do change from row to row.
year mm dd loc start end obs sess test object success
2013 5 15 park 1600 1700 MTM MTM1 1 ball y
2013 5 15 park 1600 1700 MTM MTM1 2 stick y
2013 5 15 park 1600 1700 MTM MTM1 3 rock n
2013 5 15 park 1600 1700 MTM MTM1 4 rock n
2013 5 15 park 1600 1700 MTM MTM1 5 stick y
2013 5 15 park 1600 1700 MTM MTM1 6 stick y
2013 6 24 yard 1500 1530 LFR LFR1 1 ball n
2013 6 24 yard 1500 1530 LFR LFR1 2 stick n
2013 6 24 yard 1500 1530 LFR LFR1 3 stick n
2013 6 24 yard 1500 1530 LFR LFR1 4 stick n
2013 6 24 yard 1500 1530 LFR LFR1 5 stick y
2013 6 24 yard 1500 1530 LFR LFR1 6 rock y
2013 6 24 yard 1500 1530 LFR LFR1 7 ball y
Above is an imaginary dataset which matches the format of the real one (the real one is too wide to fit here).
Notice that the entries for year, mm (month), dd (day), loc (location), start, end, obs (observer), and sess (session) all stay the same but test, object, and success change from row to row for a given set of observations.
In Access I would like to use a unique_ID (primary key) to relate tables so that the information for the first 8 columns need only be entered once and have it relate to each entry for the last 3 columns. In this example then, I have one Excel worksheet that will become two related Access tables (objects).
Before converting to Access though I would like to know that I will be able to export the data back to Excel (and/or directly to a text file) so that it will look just like this again. That is, I do NOT want to export multiple tables to separate Excel worksheets. I want all Access tables within my database to be exported to just one worksheet and in the format shown above. The reason for this is that we run analysis in R based on both the session and the instance levels (called different things in the real data, but that is the idea) so it is important for the location data and result data to be associated with every row in the output file (.xls or .csv).
Is this possible?
I am mostly looking for an outline of how this might be done. Specific code is not necessary, though your personal assessment of the complexity of the potential code (given my complete and absolute ignorance of VBA) that will be required would be appreciated.
The answer appears to be yes. I just need to create a query that joins the tables back into the original form and then export the result of that query.
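The idea can be sketched with Python's built-in sqlite3 module standing in for Access (an assumption made purely for illustration; Access would use its own query designer or SQL dialect). The table names `session` and `trial` are hypothetical, and the lookup assumes that `sess` uniquely identifies a set of observations, as it does in the example data.

```python
import sqlite3

# A few rows in the original flat Excel layout
flat_rows = [
    (2013, 5, 15, "park", 1600, 1700, "MTM", "MTM1", 1, "ball", "y"),
    (2013, 5, 15, "park", 1600, 1700, "MTM", "MTM1", 2, "stick", "y"),
    (2013, 6, 24, "yard", 1500, 1530, "LFR", "LFR1", 1, "ball", "n"),
]

con = sqlite3.connect(":memory:")
# Session-level columns are stored once, keyed by session_id.
con.execute("""
    CREATE TABLE session (
        session_id INTEGER PRIMARY KEY,
        year, mm, dd, loc, "start", "end", obs, sess,
        UNIQUE (year, mm, dd, loc, "start", "end", obs, sess)
    )""")
# Row-level columns point back to their session.
con.execute("""
    CREATE TABLE trial (
        session_id INTEGER REFERENCES session(session_id),
        test, object, success
    )""")

for row in flat_rows:
    header, detail = row[:8], row[8:]
    # Insert the session once; repeats are ignored by the UNIQUE constraint.
    con.execute(
        'INSERT OR IGNORE INTO session '
        '(year, mm, dd, loc, "start", "end", obs, sess) '
        'VALUES (?, ?, ?, ?, ?, ?, ?, ?)', header)
    # Assumption: sess is unique per set of observations.
    (session_id,) = con.execute(
        "SELECT session_id FROM session WHERE sess = ?",
        (header[7],)).fetchone()
    con.execute("INSERT INTO trial VALUES (?, ?, ?, ?)",
                (session_id, *detail))

# The join repeats the session columns on every row, as in the worksheet.
rebuilt = con.execute("""
    SELECT s.year, s.mm, s.dd, s.loc, s."start", s."end", s.obs, s.sess,
           t.test, t.object, t.success
    FROM session AS s
    JOIN trial AS t ON t.session_id = s.session_id
    ORDER BY s.session_id, t.test""").fetchall()

print(rebuilt == flat_rows)
```

The result of the join has the same shape as the original worksheet, so exporting it to a single .csv or .xls file gives R the flat layout it expects.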