Power BI (DAX): rank values in reverse order

I need to rank values so that the lowest value gets rank 1 and the highest value gets the last rank. For instance, if I have five customers, their sales values can be ranked with the RANKX function (highest as 1 and lowest as 5), but in my case I need the ranks in reverse order: the highest value should be ranked last and the lowest value ranked 1.
I tried using the RANKX and SWITCH functions and tried reversing the values, but when I apply a filter the ranks are no longer correct.
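RANKX has an optional order argument that ranks ascending, so the lowest value gets rank 1. A minimal sketch, assuming a Sales table with Customer and Amount columns (the measure and column names are placeholders, not from the question):

Reverse Rank =
RANKX (
    ALLSELECTED ( Sales[Customer] ),
    CALCULATE ( SUM ( Sales[Amount] ) ),
    ,
    ASC,
    DENSE
)

Using ALLSELECTED rather than ALL means the ranking is recomputed over the currently filtered customers, which may address the filtering issue described above.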

Related

Neo4j: difference between rand() in ORDER BY and rand() in a WITH clause when matching random nodes

I found that I can select random nodes in Neo4j using the following queries:
MATCH (a:Person) RETURN a ORDER BY rand() limit 10
MATCH (a:Person) with a, rand() as rnd RETURN a ORDER BY rnd limit 10
Both queries seem to do the same thing, but when I try to match random nodes that are in a relationship with a given node, I get different results:
The following query always returns the same nodes (the nodes are not randomly selected):
MATCH (p:Person{user_id: '1'})-[r:REVIEW]->(m:Movie)
return m order by rand() limit 10
...but when I use rand() in a WITH clause I do get random nodes:
MATCH (p:Person{user_id: '1'})-[r:REVIEW]->(m:Movie)
with m, rand() as rnd
return m order by rnd limit 10
Any idea why rand() behaves differently in the WITH clause of the second query but not in the first?
It's important to understand that using rand() in the ORDER BY like this isn't doing what you think it's doing. It's not picking a random number per row, it's ordering by a single number.
It's similar to a query like:
MATCH (p:Person)
RETURN p
ORDER BY 5
Feel free to switch up the number; it doesn't change the ordering, because ordering every row by the same constant value leaves the order unchanged.
But when you project out a random number per row in a WITH clause, you're no longer ordering all rows by a single number, but by a variable that differs per row.

How to create an algorithm for mapping two values into one

I have a sorted sequence of about 200 elements. A double-typed "rank" value stores an element's place in this sequence; if I move an element within the collection, its rank changes as well. Each element also has a boolean "priority" field that can change too. I need to map these "rank" and "priority" fields into a single "result" field. It should behave like sorting on two columns: first by priority, then by rank. And I want the result to be a compact value, for example from 1 to 1000.
The issue is the range of rank: it can be almost any double value, so 10 or 1000000000 are both valid values for this field. Therefore, I can't simply use "weight" coefficients to increase "result" for priority items. Also, since I want a short result value, I don't want to end up with values in the hundreds of billions; I want values that are easily comparable by humans.
One more restriction: I want to limit recalculation of the result field as much as possible. Ideally, if I change the rank or priority of one element, only that element's result value changes. I don't want to recalculate the result values of all elements after one change.
These restrictions can be relaxed; for example, it's OK for the result value to range from 1 to 10000, but first I want to get as close to the ideal outcome as possible.
I believe this is somewhat like creating a hash function, but I need a human-readable result from this function, and I doubt that something similar exists.
Example:
At first I have 4 elements with rank and priority, and then I get a sorted collection of these elements with result values.
A=(10, 0), B=(100,1), C=(1000,0), D=(10000, 1) => B=1, D=2, A=3, C=4.
If I change the priority of D, then D=(10000, 0) and the resulting collection will be:
B=1, A=2, C=3, D=4.
So I don't want to use simple indexes like 1, 2, 3, 4, since in that case I would need to recalculate the result value for 3 elements.
If instead I could have results like B=1, D=5, A=10, C=15, then changing D's result to, for example, 20 would require changing the result for only one element.
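A small Python sketch of the gap-based idea from that example (the element data and helper names are purely illustrative):

GAP = 5  # initial spacing between result values

def initial_results(elements):
    # elements: (name, rank, priority); order by priority descending, then rank ascending
    ordered = sorted(elements, key=lambda e: (-e[2], e[1]))
    return {name: (i + 1) * GAP for i, (name, _, _) in enumerate(ordered)}

def move(results, name, prev_name, next_name):
    # Re-place a single element between its new neighbours; only its result changes.
    lo = results[prev_name] if prev_name else 0
    hi = results[next_name] if next_name else max(results.values()) + 2 * GAP
    results[name] = (lo + hi) // 2  # if the gap is exhausted, a local renumbering is needed

elements = [("A", 10, 0), ("B", 100, 1), ("C", 1000, 0), ("D", 10000, 1)]
results = initial_results(elements)   # {'B': 5, 'D': 10, 'A': 15, 'C': 20}
move(results, "D", "C", None)         # D loses priority and moves after C; only D's result changes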

How to offset limit by sorted index with AQL?

I have a document collection of members which have two relevant properties: _key and score. I've also created a persistent index on the score field, as that should make sorting significantly faster. I want to write an AQL query that returns different results based on the sorted index of a specific member (referred to as A):
Always returns at least the top 5 members by score. (LIMIT 5)
If A is in the top 10, return the 6 - 10 ranked members. (LIMIT 5, 5)
Otherwise, return the members directly above and below A in rank. (LIMIT x - 1, 3, x = A's rank)
I was unable to do this in a single query; however, I was able to fetch the rank of a member with something along the lines of
RETURN LENGTH(
    FOR m IN members
        FILTER m.score > DOCUMENT("members", "ID").score
        RETURN 1
) + 1
and then use a second query to fetch the ranked data I wanted, something like
FOR m IN members
    SORT m.score DESC LIMIT 10
    RETURN m
or joining two sub-queries with LIMIT 5 and LIMIT rank - 2, 3 depending on the rank.
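For reference, a sketch of that two-step approach, using bind parameters for the second query's window (the @offset and @count parameter names are assumptions, and "ID" again stands in for the member's key):

// Step 1: compute A's rank by counting members with a strictly higher score
LET a = DOCUMENT("members", "ID")
LET better = (
    FOR m IN members
        FILTER m.score > a.score
        RETURN 1
)
RETURN LENGTH(better) + 1

// Step 2: run with @offset = 5, @count = 5 when the rank is in the top 10,
// otherwise @offset = rank - 2, @count = 3
FOR m IN members
    SORT m.score DESC
    LIMIT @offset, @count
    RETURN m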

Is there a search algorithm for huge two-dimensional arrays?

This is not a real-life question, it is just theory-crafting.
I have a big array which consists of elements like [1,140,245,123443], all integers or floats with low selectivity, and the number of unique values is ten times less than the size of the array. B*tree indexing is not good in this case.
I also tried to implement bitmap indexing, but in Ruby, binary operations are not so fast.
Are there any good algorithms for searching two-dimensional arrays of fixed size vectors?
And the main question is: how do I convert a vector into a value, where the conversion function has to be monotonic, so that I can apply range queries such as:
(v[0]<10, v[2]>100, v[3]=32, 0.67*10^-8<v[4]<1.2154241410*10^-6)
The only idea I have is to create separate sorted indexes for each component of the vector, then binary-search and merge, but that is a bad idea because in the worst case it requires O(N*N) operations.
Assuming that each "column" is roughly evenly distributed over a known range, you can keep a series of buckets for each column, together with the list of rows that fall into each bucket. The number of buckets per column can be the same or different; it's totally arbitrary. More buckets are faster but take slightly more memory.
my table:
range: {1to10} {1to4m} {-2mto2m}
row1: {7 3427438335 420645075}
row2: {5 3862506151 -1555396554}
row3: {1 2793453667 -1743457796}
buckets for column 1:
bucket{1-3} : row3
bucket{4-6} : row2
bucket{7-10} : row1
buckets for column 2:
bucket{1-2m} :
bucket{2m-4m} : row1, row2, row3
buckets for column 3:
bucket{-2m--1m} : row2, row3
bucket{-1m-0} :
bucket{0-1m} :
bucket{1m-2m} : row1
Then, given a series of criteria such as {v[0]<=5, v[1]>3*10^9}, we pull out the buckets that match those criteria:
column 1:
v[0]<=5 matches buckets {1-3} and {4-6}, which is rows 2 and 3.
column 2:
v[1]>3*10^9 matches bucket {2m-4m}, which is rows 1, 2 and 3.
column 3:
"" matches all , which is rows 1, 2 and 3.
Now we know that any row we're looking for must appear in the matching buckets for all three criteria, so we take the rows common to all of them; in this case, rows 2 and 3. At this point the number of remaining rows will be small even for massive amounts of data, depending on the granularity of your buckets. You then simply check each remaining row to see whether it really matches. In this sample we see that row 2 matches, but row 3 doesn't.
This algorithm is technically O(n), but in practice, if you have large numbers of small buckets, this algorithm can be very fast.
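A compact Python sketch of this bucket-per-column scheme (the rows, bucket boundaries, and function names are illustrative assumptions, not taken from the answer):

from bisect import bisect_right

# Rows are fixed-size vectors; one list of bucket lower bounds per column.
rows = [
    (7, 3427438335, 420645075),
    (5, 3862506151, -1555396554),
    (1, 2793453667, -1743457796),
]
boundaries = [
    [1, 4, 7],                                          # column 0
    [1, 2_000_000_000],                                 # column 1
    [-2_000_000_000, -1_000_000_000, 0, 1_000_000_000], # column 2
]

def bucket_of(col, value):
    return bisect_right(boundaries[col], value) - 1

# Build the index: column -> bucket index -> set of row ids.
index = [{} for _ in boundaries]
for rid, row in enumerate(rows):
    for col, value in enumerate(row):
        index[col].setdefault(bucket_of(col, value), set()).add(rid)

def query(criteria):
    """criteria: {column: (lo, hi)} inclusive range per column, None = unbounded."""
    candidates = set(range(len(rows)))
    for col, (lo, hi) in criteria.items():
        lo_b = 0 if lo is None else bucket_of(col, lo)
        hi_b = len(boundaries[col]) - 1 if hi is None else bucket_of(col, hi)
        # Union of all buckets that could contain a value in [lo, hi].
        hit = set()
        for b in range(lo_b, hi_b + 1):
            hit |= index[col].get(b, set())
        candidates &= hit
    # Exact check on the (hopefully small) remaining candidate set.
    return sorted(
        r for r in candidates
        if all((lo is None or rows[r][c] >= lo) and (hi is None or rows[r][c] <= hi)
               for c, (lo, hi) in criteria.items())
    )

# v[0] <= 5 and v[1] > 3*10^9: only row index 1 ("row2" above) survives the exact check.
print(query({0: (None, 5), 1: (3_000_000_001, None)}))  # [1]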
Using an index :)
The basic idea is to turn the 2-dimensional array into a 1-dimensional sorted array (while keeping each element's original position) and apply binary search to the latter.
This method works for any n-dimensional array and is widely used by databases, which can be seen as n-dimensional arrays with variable lengths.
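A tiny Python sketch of that flatten-and-sort idea (the data and helper name are assumptions):

from bisect import bisect_left, bisect_right

# Flatten the 2-D array into (value, row, col) triples, sort by value,
# then answer "which cells hold values in [lo, hi]?" with two binary searches.
array = [
    [1, 140, 245, 123443],
    [7, 300, 245, 98000],
]
flat = sorted((v, r, c) for r, row in enumerate(array) for c, v in enumerate(row))
values = [v for v, _, _ in flat]

def cells_in_range(lo, hi):
    i, j = bisect_left(values, lo), bisect_right(values, hi)
    return [(r, c) for _, r, c in flat[i:j]]

print(cells_in_range(100, 300))  # [(0, 1), (0, 2), (1, 2), (1, 1)]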

Programming algorithm: how to evenly distribute categories across columns

I have a number of categories, and each category has a number of elements. I'm now looking for an algorithm to distribute these categories across a predefined number of columns without breaking up the categories, keeping the category order, and keeping the number of elements in each column as balanced as possible.
For example:
Distribute 5 categories across 3 columns
Data:
category A, 7 elements
category B, 7 elements
category C, 3 elements
category D, 2 elements
category E, 8 elements
Outcome:
Column 1: category A, 7 elements
Column 2: category B and C, 10 elements
Column 3: category D and E, 10 elements
You have the total number of elements, so you can divide that number by the number of columns to get the expected number of elements in each column. Your job is then to minimize the sum of the squares of the differences (so, if you have to store 8 elements and you store 10, you have a squared difference of 2² = 4 for that column).
You can then write a recursive function that, for every category, decides whether to move that category to the next column or keep it in the current column. This is a boolean decision, so you can start with the branch that creates the smallest difference and then try the branch that creates the largest. The function keeps track of the best solution found so far and stops a branch immediately if its current sum of squared differences already exceeds that best total.
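A small Python sketch of that recursive search with pruning (the category sizes come from the example above; function names are assumptions, and it explores both choices in a fixed order rather than using the branch-ordering heuristic mentioned):

# Branch-and-bound sketch: assign ordered categories to columns without
# splitting or reordering, minimizing the sum of squared deviations from
# the ideal column size.
def distribute(sizes, columns):
    target = sum(sizes) / columns
    best = {"cost": float("inf"), "breaks": None}

    def recurse(i, col, current, cost, breaks):
        if cost >= best["cost"]:          # prune: already worse than the best found
            return
        if i == len(sizes):
            # charge the last (partly filled) column and any columns left empty
            total = cost + (current - target) ** 2 + (columns - col - 1) * target ** 2
            if total < best["cost"]:
                best["cost"], best["breaks"] = total, list(breaks)
            return
        # option 1: keep category i in the current column
        recurse(i + 1, col, current + sizes[i], cost, breaks)
        # option 2: close the current column and start category i in a new one
        if col + 1 < columns:
            recurse(i + 1, col + 1, sizes[i],
                    cost + (current - target) ** 2, breaks + [i])

    recurse(0, 0, 0, 0.0, [])
    return best["breaks"]                  # indices where a new column starts

sizes = [7, 7, 3, 2, 8]                    # categories A..E from the example
print(distribute(sizes, 3))                # -> [1, 3], i.e. columns A | B,C | D,E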
