Using a filter with a spill range as criteria - filter

I have a sheet that uses a spill range of product groups. However, the groups are relatively detailed, e.g. Pencils Black, Pencils Green, Pens Black, Pens Blue. There are 57 different detailed groups.
I have a table that lists these items and their concise groups, e.g. Pencils Black = Pencils, Pens Black = Pens, Erasers = Misc. There are only 4 final groups.
I am trying to filter the Item Group table using the spill range (it's one column of data) as the criteria, and I get a #N/A error.
Here is the formula: =FILTER(tblItemGroup[Item Group],tblItemGroup[Item Kind & Level 1]=D8#)
If I change the formula to use the single cell reference (D8) instead of the spill range (D8#), it works fine, but only for that row. I can copy that formula down, but then it is not dynamic, and that will affect other calculations.
I need these results to be dynamic because the number of rows will vary based on another selection on the sheet (Salesman).


Power BI - Matrix dimension based on multiple slicer selection

I would like to let the user choose in which column of the matrix an element from the dimension table will appear. That is why I created three slicers, one for each column position of the matrix. If the user selects BMW from Slicer 1, I expect this element to appear in the first column of the matrix. The same applies to Slicer 2 and Slicer 3. The user can select exactly one element from each slicer, and it can be any element, so applying alphabetical order or any other predefined rule is not possible. I assume I need a UNION of all selected values that I can use as the column dimension in the matrix.
I tried to use calculation groups. I am able to define calculation items that take the selected value from each slicer. That works fine, but the calculation item's name (which is shown as the column header) is static. I also tried dynamic text boxes positioned over the column headers. However, they don't move with the matrix's scrollbar if we enlarge the example to more than 3 columns (e.g. 10).
Any ideas? Anything is appreciated.

How to do a new column with string values that come from different segmentations on the same data

I have a database that goes through three different types of segmentation (via slicers/buttons):
The rank selectors in green, which I can choose to apply or not;
The pink "Top N" slicer, which lets me restrict the number of Asset Codes displayed in each of the four white visuals at the top of the page;
The four orange slicers, which let me restrict values if I want to (this also automatically rearranges the rankings in the four white visuals).
My goal is to dynamically generate a table, slicer, or anything else that shows, gathered in one place, the Asset Codes contained in all the white visuals after I make whatever selections I want at any (or all) of the 3 segmentation levels, just like the table in the blue visual (I applied a filter to build an example of the final result).
Example
Here is the file of the pbix: https://drive.google.com/file/d/1N_g1fl5zkXibp6vNwy9s_oBifJCiGu0n/view?usp=sharing
Does anyone have a clue how to solve this issue?
Tks!

How to optimize an algorithm for sorting OCR detected text chunks in the right order?

Consider the following paragraph along with detected word and equation chunks:
The green rectangles highlight a text section, while the blue rectangles highlight individual words and equations. The selections were retrieved using both the Google Vision API and the Mathpix OCR API.
All of the rectangles have been extracted and labeled as either "block", "word" or "math" along with its coordinates on the image ([top-left, top-right, bottom-right, bottom-left] - (X, Y) for each) and its text value (e.g. "Der").
Each rectangle inside a block is saved as a child of said block so one can easily iterate through all the children of a block.
The task now becomes to extract the words inside a block based on their order and line breaks. This example contains printed text, however the OCR algorithm can also detect handwritten text and math - which may lead to more volatility in the top coordinates of words on the same line of text.
The question now becomes what the fastest way to sort these results would be.
First approach
The first approach I've tried is to iterate through the words in a block and keep track of them in a map of y ranges, so that each entry in the map would represent a line.
lines = {
[10, 20]: ["Der", "entsprechende", "Kapitalwert", "der", "Zahlungsreihe", "beträgt" ...],
[25, 35]: ["next", "line"]
}
One would then iterate through each line and sort the words by their x coordinate in ascending order.
This comes with the drawback of having to iterate through all the entries of the map for every single word in order to check that the y value is within the range of the two values.
Is there perhaps a way to represent the individual ranges as a unique value (perhaps using hashes?) and saving each line to the corresponding hash?
Order-preserving hash functions could be the way to go for ordered values, but it's unclear how those could be applied to ranges with a threshold.
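One way to avoid scanning every y range for every word is to sort the words by their vertical position once and then sweep through them, starting a new line whenever the next word sits too far below the current line. The following Python snippet is only a minimal sketch of that idea, not something taken from the Vision or Mathpix APIs: the dictionary keys "text", "x" and "y" (the centre of each word's rectangle) and the y_tolerance parameter are assumptions made for illustration.

```python
def group_into_lines(words, y_tolerance):
    """Group word boxes into text lines, then order each line left to right.

    Each word is a dict with the hypothetical keys "text", "x" and "y",
    where (x, y) is the centre of the word's bounding rectangle.
    """
    lines = []   # each entry is the list of words on one text line
    y_sum = 0.0  # running sum of y centres for the line being built
    for word in sorted(words, key=lambda w: w["y"]):
        # Compare against the mean y of the current line so that a single
        # outlier (e.g. a handwritten word) does not drag the line away.
        if lines and abs(word["y"] - y_sum / len(lines[-1])) <= y_tolerance:
            lines[-1].append(word)
            y_sum += word["y"]
        else:
            lines.append([word])
            y_sum = word["y"]
    # Within each line, order the words by their x coordinate.
    return [[w["text"] for w in sorted(line, key=lambda w: w["x"])]
            for line in lines]


words = [
    {"text": "line", "x": 60, "y": 31},
    {"text": "Der", "x": 10, "y": 15},
    {"text": "next", "x": 12, "y": 29},
    {"text": "Kapitalwert", "x": 120, "y": 14},
    {"text": "entsprechende", "x": 50, "y": 17},
]
result = group_into_lines(words, y_tolerance=5)
```

The cost is dominated by the two sorts, so no per-word lookup over all ranges is needed. The tolerance would have to be tuned to the text size, and probably loosened for handwritten input.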
What Google gives you
As mentioned, the results were obtained using the Google Vision API for document text detection, among other things. Google already provides a "full annotation" of the scanned document, but the results within individual paragraphs do not appear to be ordered. Perhaps there is a way to leverage the results provided by the Vision API.

Merging header in Matrix

I want to merge the blue area of my matrix:
When selecting these 3 cells and right-clicking on them, I don't see the merge option, as described here: https://msdn.microsoft.com/en-us/library/dd207131.aspx
This can't be done as you're trying to merge cells in your Column Group (qwesa) with cells outside that group (gdfr). I assume you need to use a matrix for your report, hence the column grouping.
Please see the MSDN reference on Merging Cells in a Data region, which states
Cells can only be merged within each area of a data region: corner, column headers, group definition (or row headers), and body. You cannot merge cells that cross area boundaries. For example, you cannot merge a cell in the data region corner area with a cell in the row group area.
If you do not need column groups, you should instead use a tablix where the desired merging option is available.

Sort labels of segmented image in kmeans based on cluster mean

I have a simple but interesting question. As you know, k-means can give a different result on each run because the initial cluster centers are chosen randomly. However, assume I know that cluster 1 has a smaller mean value than cluster 2, cluster 2 has a smaller mean value than cluster 3, and so on. I want to write an algorithm that assigns the smallest cluster index to the cluster with the smallest mean value, the next index to the next smallest mean, and so on.
This is my MATLAB code. If you have a shorter or clearer way, please suggest it to me.
%% K-mean
num_cluster=2;
nrows = size(Img_original,1);
ncols = size(Img_original,2);
I_1D = reshape(Img_original,nrows*ncols,1);
[cluster_idx mu]=kmeans(double(I_1D),num_cluster,'distance','sqEuclidean','Replicates',3);
cluster_label = reshape(cluster_idx,nrows,ncols);
%% Sort based on mu
[mu_sort id_sort]=sort(mu);
idx=cell(1,num_cluster);
%% Save index of order of mu
for i=1:num_cluster
idx{i}=find(cluster_label==id_sort(i));
end
%% Sort cluster label based on mu
for i=1:num_cluster
cluster_label(idx{i})=i;
end
It's unclear to me as to why you'd want to relabel the clusters based on the ordering of each centroid. You can simply use the labelling vector that is output from k-means to reference which cluster / centroid each point belongs to.
Nevertheless, the initial idea that you had to sort the centroids is a good one. The last part of your code seems rather inefficient because you're looping over each label and doing the reassignment. One thing I could perhaps suggest is to have a lookup table where the input is the original label and the output is the reordered labels based on the sorted centroids.
If you want to pursue this route, you can use a containers.Map where the keys are the original labels in the sort order that is output from sort, and the values are the reordered labels, namely a vector that goes from 1 up to as many classes as you have. You need to do this because the second output of sort gives, for each position in the sorted result, the index that value had in the original array, so you must invert this ordering to properly perform the relabelling. In addition, I would use the sortrows function in MATLAB, not raw sort. When the centroids have more than one column, raw sort orders each column / variable independently, and that will give the wrong centroids. Raw sort will work for grayscale images, where you only have one feature to consider, namely the intensity, but if you go beyond grayscale into RGB or whatever colour space you desire, it will give you incorrect results. You need to consider each row as a single point, then sort the rows jointly.
Given your code, you'd do something like this:
%% K-mean
num_cluster=2;
nrows = size(Img_original,1);
ncols = size(Img_original,2);
I_1D = reshape(Img_original,nrows*ncols,1);
[cluster_idx mu]=kmeans(double(I_1D),num_cluster,'distance','sqEuclidean','Replicates',3);
%% Sort based on mu
[mu_sort id_sort]=sortrows(mu);
%// New - Create lookup
lookup = containers.Map(id_sort, 1:size(mu_sort,1));
%// Relabel the vector
cluster_idx_sort = lookup.values(num2cell(cluster_idx));
cluster_idx_sort = [cluster_idx_sort{:}];
%// Reshape back to original image dimensions
cluster_label = reshape(cluster_idx_sort,nrows,ncols);
This should hopefully give you some speedup in your code.
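For readers who want to see the lookup-table idea outside MATLAB, here is a minimal Python sketch of the same relabelling. It is an illustration under assumptions rather than a port of the code above: relabel_by_centroid is a hypothetical helper name, the labels are assumed to be 1-based as kmeans returns them, and the short label list stands in for the flattened image.

```python
def relabel_by_centroid(labels, centroids):
    """Return labels renumbered so that label 1 has the smallest centroid.

    labels use 1-based cluster indices; centroids[i] is the centroid of
    cluster i + 1 (a number, or a tuple for multi-feature data).
    """
    # order[k] is the 0-based original index of the k-th smallest centroid,
    # the counterpart of the second output of sort / sortrows.
    order = sorted(range(len(centroids)), key=lambda i: centroids[i])
    # Invert that ordering: lookup[original_label] = new_label.
    lookup = {original + 1: rank + 1 for rank, original in enumerate(order)}
    # One dictionary lookup per pixel replaces the per-cluster find loop.
    return [lookup[label] for label in labels]


centroids = [153.3484, 23.7291]  # cluster 1 is bright, cluster 2 is dark
labels = [1, 2, 2, 1, 1]         # stand-in for the flattened label image
relabelled = relabel_by_centroid(labels, centroids)
```

With these two centroids the labels simply swap, so the dark cluster becomes label 1. If the centroids are given as tuples, Python compares them element by element, which treats each centroid as a single point in the way sortrows does.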
To double check, I tried this on the cameraman.tif image, that's part of the image processing toolbox. Running the code gives me these cluster centres:
>> mu
mu =
153.3484
23.7291
Once I sort the clusters in ascending order, this is what I get for the ordering and for the centroids:
>> mu_sort
mu_sort =
23.7291
153.3484
>> id_sort
id_sort =
2
1
So that works as we expected... now if we display the original cluster label map before sorting on the centroids with:
cluster_label = reshape(cluster_idx, nrows, ncols);
imshow(cluster_label,[]);
... we get this image:
Now, if we run through the sorting logic and display the relabelled cluster map:
imshow(cluster_label, []);
... we get this image:
This works as I expected. Because the centroids flipped, so should the colouring.
