Combining two Dimensions in Tableau - filter

I have two Dimensions I’m trying to combine into one calculated field. Below are their names and the options for each:
1. Agree / Disagree/ Overturn
a. Agree
b. Disagree
c. Overturn
2. Severity
a. Important
b. Informational
c. Minor
d. Moderate
e. Severe
For starters I created
COUNTD(IF ([Agree / Disagree / Overturn] = 'Agree'
OR [Agree / Disagree / Overturn] = 'Overturn')
THEN [Loan Number]
END)
This worked fine, but did not filter out everything I was needing to. I also need to bring in the Severity dimension and keep everything except “Minor”. So I tried the following:
COUNTD(IF ([Agree / Disagree / Overturn] = 'Agree'
OR [Agree / Disagree / Overturn] = 'Overturn'
AND [Severity] = 'Moderate'
OR [Severity] = 'Severe'
OR [Severity] = 'Important'
OR [Severity] = 'Informational')
THEN
[Loan Number]
END)
This worked great as well, but only on one month. For the other two months of the quarter the count is off by several. I don’t have any other filters on that should cause it to work with one month and not with others.
Thoughts?

Related

Separate characters and numbers following specific rules

I am trying to distinguish flight numbers.
Example:
flightno = "FR556"
split_data = flightno.upcase.match(/([A-Za-z]+)(\d+)/)
first = split_data[1] # FR
second = split_data[1] # 556
I then go on to query the database to find an airline based on the FR in this example and apply some logic with the result which is Ryanair.
My problem is when the flight number might be:
flightno = "U21920"
split_data = flightno.upcase.match(/([A-Za-z]+)(\d+)/)
first = split_data[1] # U
second = split_data[1] # 21920
i basically want first to be U2 not just U. This is used to search the database of airlines by their IATA code in this case is U2
****EDIT**
In the interest of clarity i made some mistakes in terminology when asking my question. Due to the complexities of booking reference numbers, the input is taken from whatever the passenger provides. For an easyJet flight for example, the passenger may input EZY1920 or U21920 only the airline provides either so the passenger is ignorant really.
"EZY" = ICAO
"U2" = IATA
I take the input from the user and try to separate the ICAO or IATA from the flight number "1920" but there is no way of determining that without searching the database or separating the input which i feel is cumbersome from a user experience point of view.
Using a regex to separate characters from numbers works until the user inputs an IATA as part of their flight number (the passenger won't know the difference) and as you can see in the example above this confuses the regex.**
The trouble is i cant think of any other pattern with flight numbers. They always have at least two characters made up of just letters or a mixture of a letter and a number and can be 3 characters in length. The numbers part can be as short as 1 but can also be as long as 4 - always numbers.
****edit**
As has been mentioned in the comments, there is no fixed size however one thing that is always true (at least so far) is the first character will always be a letter regardless if it is ICAO or IATA.
After considering every bodies input so far i'm wondering if searching the database and returning airlines with an IATA or ICAO that matches the first two letters provided by the user (U2), (FR), (EZ) might be one way to go, however this is subject to obvious problems should an ICAO or IATA be released that matches another airline, for example "EZY" & "EZT". This is not future proof and i'm looking for better ruby or regex solutions.**
Appreciate your input.
EDIT
I have answered my own question below. While other answers provide a solution for handling some conditions they would fall down if the flight number began with a number so i worked out a crass but to date stable way to analyse the string for digits and then work out if it is an ICAO or IATA from that.
A solution I think of is that you match your given flight number against a complete list of ICAO/IATA codes: https://raw.githubusercontent.com/datasets/airport-codes/master/data/airport-codes.csv
Spending some time with google might give you a more appropriate list.
Then use the first three characters (if that is the maximum) of your flight number to find a match within the icao codes. If you find one, you will know where to seperate your string.
Here a minimal ugly example that should set you on a track. Feel free to update!
ICAOCODES = %w(FR DEU U21) # grab your data here
def retrieve_flight_information(flightnumber)
ICAOCODES.each do |icao|
co = flightnumber.match(icao).to_s
if co.length > 0
# airline
puts co
# flight number
puts flightnumber.gsub(co,'')
end
end
end
retrieve_flight_information("FR556")
#=> FR
#=> 556
retrieve_flight_information("U21214123")
#=> U21
#=> 214123
The biggest flaw lies in using .gsub() as it might mess up your flightnumber in case it looks like this: "FR21413FR2"
However you will find plenty of solutions to this problem on so.
As mentioned in the comments, a list of icao codes is not what you are looking for. But what is relevant here, is that you somehow need a list of strings that you can securely compare against.
I have a fairly crass solution that seems to be working in all scenarios i can throw at it to date. I wanted to make this available to anybody else that might find it useful?
The general rule of thumb for flight codes/numbers seems to be:
IATA: two characters made up of any combination letters and digits
ICAO: three characters made up of letters only (to date)
With that in mind we should be able to work out if we need to search the database by IATA or ICAO depending on the condition of the first three characters.
First we take the flight number and convert to uppercase
string = "U21920".upcase
Next we analyse the first three characters to check for any numbers.
first_three = string[0,3] # => U21
Is there a digit in first_three?
if first_three =~ /\d/ # => true
iata = first_three[0,2] # => If true lets get rid of the last character
# Now we go to the database searching IATA (U2)
search = Airline.where('iata LIKE ?', "#{iata}%") # => Starts with search, just in case
Otherwise if there isnt a digit found in the string
else
icao = string.match(/([A-Za-z]+)(\d+)/)
search = Airline.where('icao LIKE ?', "#{icao[1]}%")
This seems to work for the random flight numbers ive tested it with today from a few of the major airport live departure/arrival boards. Its an interesting problem because some airlines issue tickets with either an ICAO or IATA code as part of the flight number which means passengers won't know any different, not to mention, some airports provide flight information in their own format so assumign there isnt a change to the ICAO and IATA build then the above should work.
Here is an example script you can run
test.rb
puts "What is your flight number?"
string = gets.upcase
first_three = string[0,3]
puts "Taking first three from #{string} is #{first_three}"
if first_three =~ /\d/ # Calling String's =~ method.
puts "The String #{first_three} DOES have a number in it."
iata = first_three[0,2]
search = Airline.where('iata LIKE ?', "#{iata}%")
puts "Searching Airlines starting with IATA #{iata} = #{search.count}"
puts "Found #{search.first.name} from IATA #{iata}"
else
puts "The String #{first_three} does not have a number in it."
icao = string.match(/([A-Za-z]+)(\d+)/)
search = Airline.where('icao LIKE ?', "#{icao[1]}%")
puts "Searching Airlines starting with ICAO #{icao[1]} = #{search.count}"
puts "Found #{search.first.name} from IATA #{icao[1]}"
end
Airline
Airline(id: integer, name: string, iata: string, icao: string, created_at: datetime, updated_at: datetime )
stick this in your lib folder and run
rails runner lib/test.rb
Obviously you can remove all of the puts statements to get straight to the result. I'm using rails runner to include access to my Airline model when running the script.

How does "Assignment Branch Condition size for index is too high" work?

Rubocop is always report the error:
app/controllers/account_controller.rb:5:3: C: Assignment Branch Condition size for index is too high. [30.95/24]
if params[:role]
#users = #search.result.where(:role => params[:role])
elsif params[:q] && params[:q][:s].include?('count')
#users = #search.result.order(params[:q][:s])
else
#users = #search.result
end
How to fix it? Anyone has good idea?
The ABC size [1][2] is
computed by counting the number of assignments, branches and conditions for a section of code. The counting rules in the original C++ Report article were specifically for the C, C++ and Java languages.
The previous links details what counts for A, B, and C. ABC size is a scalar magnitude, reminiscent of a triangulated relationship:
|ABC| = sqrt((A*A)+(B*B)+(C*C))
Actually, a quick google on the error shows that the first indexed page is the Rubocop docs for the method that renders that message.
Your repo or analysis tool will define a threshold amount when the warning is triggered.
Calculating, if you like self-inflicting....
Your code calcs as
(1+1+1)^2 +
(1+1+1+1+1+1+1+1+1+1+1+1+1)^2 +
(1+1+1+1)^2
=> 194
That's a 'blind' calculation with values I've made up (1s). However, you can see that the error states numbers that probably now make sense as your ABC and the threshold:
[30.95/24]
So cop threshold is 24 and your ABC size is 30.95. This tells us that the rubocop engine assign different numbers for A, B, and C. As well, different kinds or Assignments (or B or C) could have different values, too. E.G. a 'normal' assignment x = y is perhaps scored lower than a chained assignment x = y = z = r.
tl;dr answer
At this point, you probably have a fairly clear idea of how to reduce your ABC size. If not:
a simple way it to take the conditional used for your elsif and place it in a helper method.
since you are assigning an # variable, and largely calling from one as well, your code uses no encapsulation of memory. Thus, you can move both if and elsif block actions into each their own load_search_users_by_role and load_search_users_by_order methods.

Algorithm to create unique random concatenation of items

I'm thinking about an algorithm that will create X most unique concatenations of Y parts, where each part can be one of several items. For example 3 parts:
part #1: 0,1,2
part #2: a,b,c
part #3: x,y,z
And the (random, one case of some possibilities) result of 5 concatenations:
0ax
1by
2cz
0bz (note that '0by' would be "less unique " than '0bz' because 'by' already was)
2ay (note that 'a' didn't after '2' jet, and 'y' didn't after 'a' jet)
Simple BAD results for next concatenation:
1cy ('c' wasn't after 1, 'y' wasn't after 'c', BUT '1'-'y' already was as first-last
Simple GOOD next result would be:
0cy ('c' wasn't after '0', 'y' wasn't after 'c', and '0'-'y' wasn't as first-last part)
1az
1cx
I know that this solution limit possible results, but when all full unique possibilities will gone, algorithm should continue and try to keep most avaible uniqueness (repeating as few as possible).
Consider real example:
Boy/Girl/Martin
bought/stole/get
bottle/milk/water
And I want results like:
Boy get milk
Martin stole bottle
Girl bought water
Boy bought bottle (not water, because of 'bought+water' and not milk, because of 'Boy+milk')
Maybe start with a tree of all combinations, but how to select most unique trees first?
Edit: According to this sample data, we can see, that creation of fully unique results for 4 words * 3 possibilities, provide us only 3 results:
Martin stole a bootle
Boy bought an milk
He get hard water
But, there can be more results requested. So, 4. result should be most-available-uniqueness like Martin bought hard milk, not Martin stole a water
Edit: Some start for a solution ?
Imagine each part as a barrel, wich can be rotated, and last item goes as first when rotates down, first goes as last when rotating up. Now, set barells like this:
Martin|stole |a |bootle
Boy |bought|an |milk
He |get |hard|water
Now, write sentences as We see, and rotate first barell UP once, second twice, third three and so on. We get sentences (note that third barell did one full rotation):
Boy |get |a |milk
He |stole |an |water
Martin|bought|hard|bootle
And we get next solutions. We can do process one more time to get more solutions:
He |bought|a |water
Martin|get |an |bootle
Boy |stole |hard|milk
The problem is that first barrel will be connected with last, because rotating parallel.
I'm wondering if that will be more uniqe if i rotate last barrel one more time in last solution (but the i provide other connections like an-water - but this will be repeated only 2 times, not 3 times like now). Don't know that "barrels" are good way ofthinking here.
I think that we should first found a definition for uniqueness
For example, what is changing uniqueness to drop ? If we use word that was already used ? Do repeating 2 words close to each other is less uniqe that repeating a word in some gap of other words ? So, this problem can be subjective.
But I think that in lot of sequences, each word should be used similar times (like selecting word randomly and removing from a set, and after getting all words refresh all options that they can be obtained next time) - this is easy to do.
But, even if we get each words similar number od times, we should do something to do-not-repeat-connections between words. I think, that more uniqe is repeating words far from each other, not next to each other.
Anytime you need a new concatenation, just generate a completely random one, calculate it's fitness, and then either accept that concatenation or reject it (probabilistically, that is).
const C = 1.0
function CreateGoodConcatenation()
{
for (rejectionCount = 0; ; rejectionCount++)
{
candidate = CreateRandomConcatination()
fitness = CalculateFitness(candidate) // returns 0 < fitness <= 1
r = GetRand(zero to one)
adjusted_r = Math.pow(r, C * rejectionCount + 1) // bias toward acceptability as rejectionCount increases
if (adjusted_r < fitness)
{
return candidate
}
}
}
CalculateFitness should never return zero. If it does, you might find yourself in an infinite loop.
As you increase C, less ideal concatenations are accepted more readily.
As you decrease C, you face increased iterations for each call to CreateGoodConcatenation (plus less entropy in the result)

Are there sequence-operator implementations in .NET 4.0?

With that I mean similar to the Linq join, group, distinct, etc. only working on sequences of values, not collections.
The difference between a sequence and a collection is that a sequence might be infinite in length, whereas a collection is finite.
Let me give you an example:
var c1 = new[] { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
var c2 = FunctionThatYieldsFibonacciNumbers();
var c3 = c1.Except(c2);
This does not work. The implementation of Except does not work on the basis that the numbers in either collection will be strictly ascending or descending, so it first tries to gather all the values from the second collection into a set (or similar), and only after that will it start enumerating the first collection.
Assuming that the function above is just a While-loop that doesn't terminate unless you explicitly stop enumerating it, the above code will fail with a out-of-memory exception.
But, given that I have collections that are considered to be strictly ascending, or descending, are there any implementations already in .NET 4.0 that can do:
Give me all values common to both (inner join)
Give me all values of both (union/outer join)
Give me all values in sequence #1 that isn't in sequence #2
I need this type of functionality related to a scheduling system I need to build, where I need to do things like:
c1 = the 1st and 15th of every month from january 2010 and onwards
c2 = weekdays from 2010 and onwards
c3 = all days in 2010-2012
c4 = c1 and c2 and c3
This would basically give me every 1st and 15th of every month in 2010 through 2012, but only when those dates fall on weekdays.
With such functions it would be much easier to generate the values in question without explicitly having to build collections out of them. In the above example, building the first two collections would need to know the constraint of the third collection, and the examples can become much more complex than the above.
I'd say that the LINQ operators already work on general sequences - but they're not designed to work specifically for monotonic sequences, which is what you've got here.
It wouldn't be too hard to write such things, I suspect - but I don't believe anything's built-in; there isn't even anything for this scenario in System.Interactive, as far as I can see.
You may onsider the Seq module of F#, which is automatically invoked by using special F# language constructs like 1 .. 10 which generates a sequence. It supports infinite sequences the way you describe, because it allows for lazy evaluation. Using F# may or may not be trivial in your situation. However, it shouldn't be too hard to use the Seq module directly from C# (but I haven't tried it myself).
Following this Mandelbrot example shows a way to use infinite sequences with C# by hiding yield. Not sure it brings you closer to what you want, but it might help.
EDIT
While you already commented that it isn't worthwhile in your current project and accepted an answer to your question, I was intrigued by the idea and conjured up a little example.
It appeared to be rather trivial and works well in C# with .NET 3.5 and .NET 4.0, by simple including FSharp.Core.dll (download it for .NET 3.5) to your references. Here's an out-of-the box example of an infinite sequence implementing your first use-case:
// place in your using-section:
using Microsoft.FSharp.Collections;
using Microsoft.FSharp.Core;
// [...]
// trivial 1st and 15th of the month filter, starting Jan 1, 2010.
Func<int, DateTime> firstAndFifteenth = (int i) =>
{
int year = i / 24 + 2010;
int day = i % 2 != 0 ? 15 : 1;
int month = ((int)i / 2) % 12 + 1;
return new DateTime(year, month, day);
};
// convert func to keep F# happy
var fsharpFunc = FSharpFunc<int, DateTime>.FromConverter(
new Converter<int, DateTime>(firstAndFifteenth));
// infinite sequence, returns IEnumerable
var infSeq = SeqModule.InitializeInfinite<DateTime>(fsharpFunc);
// first 100 dates
foreach (var dt in infSeq.Take(100))
Debug.WriteLine("Date is now: {0:MM-dd-yyy}", dt);
Output is as can be expected, first few lines like so:
Date is now: 01-01-2010
Date is now: 01-15-2010
Date is now: 02-01-2010
Date is now: 02-15-2010
Date is now: 03-01-2010

Speed dating algorithm

I work in a consulting organization and am most of the time at customer locations. Because of that I rarely meet my colleagues. To get to know each other better we are going to arrange a dinner party. There will be many small tables so people can have a chat. In order to talk to as many different people as possible during the party, everybody has to switch tables at some interval, say every hour.
How do I write a program that creates the table switching schedule? Just to give you some numbers; in this case there will be around 40 people and there can be at most 8 people at each table. But, the algorithm needs to be generic of course
heres an idea
first work from the perspective of the first person .. lets call him X
X has to meet all the other people in the room, so we should divide the remaining people into n groups ( where n = #_of_people/capacity_per_table ) and make him sit with one of these groups per iteration
Now that X has been taken care of, we will consider the next person Y
WLOG Y be a person X had to sit with in the first iteration itself.. so we already know Y's table group for that time-frame.. we should then divide the remaining people into groups such that each group sits with Y for every consecutive iteration.. and for each iteration X's group and Y's group have no person in common
.. I guess, if you keep doing something like this, you will get an optimal solution (if one exists)
Alternatively you could crowd source the problem by giving each person a card where they could write down the names of all the people they got dine with.. and at the end of event, present some kind of prize to the person with the most names in their card
This sounds like an application for genetic algorithm:
Select a random permutation of the 40 guests - this is one seating arrangement
Repeat the random permutation N time (n is how many times you are to switch seats in the night)
Combine the permutations together - this is the chromosome for one organism
Repeat for how ever many organisms you want to breed in one generation
The fitness score is the number of people each person got to see in one night (or alternatively - the inverse of the number of people they did not see)
Breed, mutate and introduce new organisms using the normal method and repeat until you get a satisfactory answer
You can add in any other factors you like into the fitness, such as male/female ratio and so on without greatly changing the underlying method.
Why not imitate real world?
class Person {
void doPeriodically() {
do {
newTable = random (numberOfTables);
} while (tableBusy(newTable))
switchTable (newTable)
}
}
Oh, and note that there is a similar algorithm for finding a mating partner and it's rumored to be effective for those 99% of people who don't spend all of their free time answering programming questions...
Perfect Table Plan
You might want to have a look at combinatorial design theory.
Intuitively I don't think you can do better than a perfect shuffle, but it's beyond my pre-coffee cognition to prove it.
This one was very funny! :D
I tried different method but the logic suggested by adi92 (card + prize) is the one that works better than any other I tried.
It works like this:
a guy arrives and examines all the tables
for each table with free seats he counts how many people he has to meet yet, then choose the one with more unknown people
if two tables have an equal number of unknown people then the guy will choose the one with more free seats, so that there is more probability to meet more new people
at each turn the order of the people taking seats is random (this avoid possible infinite loops), this is a "demo" of the working algorithm in python:
import random
class Person(object):
def __init__(self, name):
self.name = name
self.known_people = dict()
def meets(self, a_guy, propagation = True):
"self meets a_guy, and a_guy meets self"
if a_guy not in self.known_people:
self.known_people[a_guy] = 1
else:
self.known_people[a_guy] += 1
if propagation: a_guy.meets(self, False)
def points(self, table):
"Calculates how many new guys self will meet at table"
return len([p for p in table if p not in self.known_people])
def chooses(self, tables, n_seats):
"Calculate what is the best table to sit at, and return it"
points = 0
free_seats = 0
ret = random.choice([t for t in tables if len(t)<n_seats])
for table in tables:
tmp_p = self.points(table)
tmp_s = n_seats - len(table)
if tmp_s == 0: continue
if tmp_p > points or (tmp_p == points and tmp_s > free_seats):
ret = table
points = tmp_p
free_seats = tmp_s
return ret
def __str__(self):
return self.name
def __repr__(self):
return self.name
def Switcher(n_seats, people):
"""calculate how many tables and what switches you need
assuming each table has n_seats seats"""
n_people = len(people)
n_tables = n_people/n_seats
switches = []
while not all(len(g.known_people) == n_people-1 for g in people):
tables = [[] for t in xrange(n_tables)]
random.shuffle(people) # need to change "starter"
for the_guy in people:
table = the_guy.chooses(tables, n_seats)
tables.remove(table)
for guy in table:
the_guy.meets(guy)
table += [the_guy]
tables += [table]
switches += [tables]
return switches
lst_people = [Person('Hallis'),
Person('adi92'),
Person('ilya n.'),
Person('m_oLogin'),
Person('Andrea'),
Person('1800 INFORMATION'),
Person('starblue'),
Person('regularfry')]
s = Switcher(4, lst_people)
print "You need %d tables and %d turns" % (len(s[0]), len(s))
turn = 1
for tables in s:
print 'Turn #%d' % turn
turn += 1
tbl = 1
for table in tables:
print ' Table #%d - '%tbl, table
tbl += 1
print '\n'
This will output something like:
You need 2 tables and 3 turns
Turn #1
Table #1 - [1800 INFORMATION, Hallis, m_oLogin, Andrea]
Table #2 - [adi92, starblue, ilya n., regularfry]
Turn #2
Table #1 - [regularfry, starblue, Hallis, m_oLogin]
Table #2 - [adi92, 1800 INFORMATION, Andrea, ilya n.]
Turn #3
Table #1 - [m_oLogin, Hallis, adi92, ilya n.]
Table #2 - [Andrea, regularfry, starblue, 1800 INFORMATION]
Because of the random it won't always come with the minimum number of switch, especially with larger sets of people. You should then run it a couple of times and get the result with less turns (so you do not stress all the people at the party :P ), and it is an easy thing to code :P
PS:
Yes, you can save the prize money :P
You can also take look at stable matching problem. The solution to this problem involves using max-flow algorithm. http://en.wikipedia.org/wiki/Stable_marriage_problem
I wouldn't bother with genetic algorithms. Instead, I would do the following, which is a slight refinement on repeated perfect shuffles.
While (there are two people who haven't met):
Consider the graph where each node is a guest and edge (A, B) exists if A and B have NOT sat at the same table. Find all the connected components of this graph. If there are any connected components of size < tablesize, schedule those connected components at tables. Note that even this is actually an instance of a hard problem known as Bin packing, but first fit decreasing will probably be fine, which can be accomplished by sorting the connected components in order of biggest to smallest, and then putting them each of them in turn at the first table where they fit.
Perform a random permutation of the remaining elements. (In other words, seat the remaining people randomly, which at first will be everyone.)
Increment counter indicating number of rounds.
Repeat the above for a while until the number of rounds seems to converge.

Resources