I am trying to get a count of all candidates who passed the manager interview but do not have a grade. The problem is that the grade column contains both nulls and empty strings (''), so with the code below I only get a count of the rows where the grade is null. How can I modify this code to capture both nulls and ''?
FILTER("Fact - # of Applicaitons" USING (IFNULL((case
when "XX"."Job Information"."Job Family Name"='Claims' then "XX"."Application Grade Details"."Final Claims Grade"
when "XX"."Job Information"."Job Family Name"='Soup' then "XX"."Application Grade Details"."Final Soup Grade"
when "XX"."Job Information"."Job Family Name"='Key' then "XX"."Application Grade Details"."Final key Grade"
when "XX"."Job Information"."Job Family Name"='Damage' then "XX"."Application Grade Details"."Final damage Grade"
End), 'Missing Scores') ='Missing Scores' AND "Application Grade Details"."Manager Decision"='Pass'))
Honestly? The best advice is to sort out the data quality issue in the source. You are trying to analyze things, not correct mistakes and inconsistencies! Every correction slows down the analytical system, and things become especially problematic if you do things like the above in the front end. Not only does the "correction logic" execute a thousand times per day, once for each access, rather than being applied once and for all in the source; the logic itself also needs to be reproduced and maintained at each point of usage. Long story short: sorry to say this, but conceptually and approach-wise this is the worst way of handling the issue.
I'm trying to collect a dataset that could be used for automatically generating baseball articles.
I have play-by-play records of MLB games from retrosheet.org that I would like to write out as plain text, in the style of sentences that could appear in a recap news article.
Here are some examples of the play-by-play records:
play,2,0,semim001,32,.CBFFFBBX,9/F
play,2,0,phegj001,01,FX,S7/G
play,2,0,martn003,01,CX,3/G
play,2,1,youne003,00,,NP
The following is what I would like to achieve:
For the first example
play,2,0,semim001,32,.CBFFFBBX,9/F,
I want it to be written out as something like:
"semim001 (Marcus Semien) was on three balls and two strikes in the second inning as the away player. He hit the ball into play after one called strike, one ball, three fouls, and another two balls. The fly ball was caught by the right outfielder."
The plays are formatted in the following way:
The first field is the inning, an integer starting at 1.
The second field is either 0 (for visiting team) or 1 (for home team).
The third field is the Retrosheet player id of the player at the plate.
The fourth field is the count on the batter when this particular event (play) occurred. Most Retrosheet games do not have this information, and in such cases, "??" appears in this field.
The fifth field is of variable length and contains all pitches to this batter in this plate appearance and is described below. If pitches are unknown, this field is left empty, nothing is between the commas.
The sixth field describes the play or event that occurred.
Explanations for all the symbols in the fifth and sixth field can be found on this Retrosheet page.
With Python 3, I've been able to format all the info of invariable length into a formatted sentence, which is all but the last two fields. I'm having difficulty in thinking of an efficient way to unparse (correct me if this is the wrong term to use here) the fifth and sixth fields, the pitches and the events that occurred, due to their variable length and wide variety of things that can occur.
I think I could write out all the rules based on the info on the Retrosheet website, but I'm looking for suggestions for a smarter way to do this. I wrote natural language processing as tags, hoping this could be a trivial problem in that field. Any pointers will be greatly appreciated!
So I'm curious as to what I'm missing here. I have a program for school and part of the program requires that I measure the length of the input string. I have it laid out as "if String==6" which you can see in my code below. My professor would rather it be stored in a variable and that I use the .length method to measure it. His exact words are as follows, "To see if the ticket number is greater than six characters, you need to store it in a variable. Then, on line 19, you can check it by using ticket.length == 6."
I tried using his method and wrote "ticket_number.length==6", but that returns an error. I'm not sure why. Isn't "ticket_number" the variable that needs to be measured? Or do I need to create another variable just for the ticket length? I'm sure there is an easy answer, I just can't seem to find it. Thanks in advance for any and all help!
begin
print "Please enter your six-digit ticket number."
ticket_number=gets.chomp.to_i
ones_digit=ticket_number%10
truncated_number=ticket_number/10.floor
remainder=truncated_number%7
if String=6 and ones_digit==remainder and ticket_number>0
print "Your ticket number is valid."
else
print "Your ticket number is invalid."
end
end while ticket_number>0
There's a couple of problems here but the biggest one is that converting to an integer means you've forfeited your opportunity to test vs. length:
ticket_number = gets.chomp
if (ticket_number.length != 6)
puts "Your ticket number must be six digits"
next
end
You can convert after the fact:
ticket_number = ticket_number.to_i
Then do your math.
Ideally you'd wrap this up in a function that, given a ticket number, will return true or false depending on validity. This de-couples it from your display and looping logic, simplifying things.
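A minimal sketch of such a function, assuming the rule from the question (the last digit must equal the remaining digits modulo 7, and the number must be positive). The name valid_ticket? is made up for illustration:

```ruby
# Hypothetical helper: true when the ticket is exactly six digits, is
# positive, and its last digit equals the other five digits modulo 7.
def valid_ticket?(ticket)
  # Check length (and digits only) while the input is still a String.
  return false unless ticket.match?(/\A\d{6}\z/)

  number = ticket.to_i
  ones_digit = number % 10
  truncated_number = number / 10 # integer division drops the last digit
  number.positive? && ones_digit == truncated_number % 7
end

puts valid_ticket?("123454") # 12345 % 7 is 4, which matches the last digit
puts valid_ticket?("12345")  # only five characters
```

The loop then only has to read the input, call the function, and print the matching message.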
I'm wondering about the performance of Firebase when making n + 1 queries. Let's consider the example in this article https://www.firebase.com/blog/2013-04-12-denormalizing-is-normal.html where a link has many comments. If I want to get all of the comments for a link I have to:
Make 1 query to get the index of comments under the link
For each comment ID, make a query to get that comment.
Here's the sample code from that article that fetches all comments belonging to a link:
var commentsRef = new Firebase("https://awesome.firebaseio-demo.com/comments");
var linkRef = new Firebase("https://awesome.firebaseio-demo.com/links");
var linkCommentsRef = linkRef.child(LINK_ID).child("comments");
linkCommentsRef.on("child_added", function(snap) {
  commentsRef.child(snap.key()).once("value", function() {
    // Render the comment on the link page.
  });
});
I'm wondering if this is a performance concern compared to the equivalent in a SQL database, where I could make a single query on comments: SELECT * FROM comments WHERE link_id = LINK_ID.
Imagine I have a link with 1000 comments. In SQL this would be a single query, but in Firebase this would be 1001 queries. Should I be worried about the performance of this?
One thing to keep in mind is that Firebase works over web sockets (where available), so while there may be 1001 round trips, only one connection needs to be established. Also, a lot of the round trips will happen in parallel. So you might be surprised at how little time this takes.
Should I worry about this?
In general people over-estimate the amount of use they'll get. So (again: in general) I recommend that you don't worry about it until you actually have that many comments. But from day 1, ensure that nothing you do today precludes optimizing later.
One way to optimize is to further denormalize your data. If you already know that you need all comments every time you render an article, you can also consider duplicating the comments into the article.
A fairly common scenario:
/users
  twitter:4784
    name: "Frank van Puffelen"
    otherData: ....
/messages
  -J4377684
    text: "Hello world"
    uid: "twitter:4784"
    name: "Frank van Puffelen"
  -J4377964
    text: "Welcome to StackOverflow"
    uid: "twitter:4784"
    name: "Frank van Puffelen"
So in the above data snippet I store both the user's uid and their name for every message. While I could look up the name from the uid, having the name in the messages means I can display the messages without the lookup. I'm also keeping the uid, so that I can provide a link to the user's profile page (or their other messages).
We recently had a good question about this, where I wrote more about the approaches I consider for keeping the derived data up to date: How to write denormalized data in Firebase
iPhone has a pretty good telephone number splitting function, for example:
Singapore mobile: +65 9852 4135
Singapore resident line: +65 6325 6524
China mobile: +86 135-6952-3685
China resident line: +86 10-65236528
HongKong: +886 956-238-82
USA: +1 (732) 865-3286
Notice the nice features here:
- the splitting of country code, area code, and the rest is automatic;
- the delimiter is also nicely adapted to different countries, e.g. "()", "-" and space.
Note that the parsing logic itself is doable for me; however, I don't know where to find the telephone number formats of most countries.
Where could I find such knowledge, or open source code that implements it?
You can get similar functionality with the libphonenumber code library.
Interestingly enough, you cannot use an NSNumberFormatter for this, but you can write your own custom class for it. Just create a new class, set properties such as countryCode, areaCode and number, and then create a method that formats the number based on the countryCode.
Here's a great example: http://the-lost-beauty.blogspot.com/2010/01/locale-sensitive-phone-number.html
As an aside: a friend told me about a gigantic regular expression he had to maintain that could pick telephone numbers out of intercepted communications from hundreds of countries around the world. It was very non-trivial.
Thankfully your problem is easier, as you can just have a table with the per-country formats:
format[usa] = "+d (ddd) ddd-dddd";
format[hk] = "+ddd ddd-ddd-dd";
format[china_mobile] = "+dd ddd-dddd-dddd";
...
Then when you're printing, you simply output one digit from the phone number string in each d spot as needed. This assumes you know the country, which is a safe enough assumption for telephone devices -- pick "default" formats for the few surrounding countries.
Since some countries have different formats with different lengths you might need to store your table with additional information:
format[germany][10] = "..."
format[germany][11] = "....."
I am new to Ruby and Shoes, and I think I have everything. The program appears to work correctly except for the last step. I enter the loan amount and interest rate into edit_lines, and when I press the calculate button it performs the calculations and stores the calculated numbers in variables. The last step is dividing the total loan (loan plus interest) by the length of the loan in months to get the monthly payment, so I can make a payment table for the entire loan, but I either get incorrect results or no results at all.
I think I converted the integers to floats, etc., but I'm not sure. It appears to add, multiply, and subtract correctly, but it will not divide two objects. If I enter literal numbers it works OK.
What am I doing wrong? It doesn't seem like it should be that difficult. Is there example code for dividing the value in one variable by the value of another variable?
It looks like you're using eval(), which you almost never, ever want to use. You can do the exact same thing in normal Ruby. I'm just guessing right now, since the code I can see in your comment is lacking newlines, but I think this code would work:
@numberbox3.text = @totalinterest + @loadamount
@numberbox5.text = @totalloan / @lengthyears
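Since I can't see what types your variables hold, here is a small sketch of the usual culprit. It assumes the values arrive as Strings from text fields, and the method name and sample values are made up. Note that dividing two Integers in Ruby truncates the result:

```ruby
# Hypothetical helper: takes the raw text of two input fields and returns
# the monthly payment as a Float.
def monthly_payment(total_loan_text, length_years_text)
  total_loan = total_loan_text.to_f           # String -> Float
  length_months = length_years_text.to_i * 12 # String -> Integer months

  # Float / Integer gives a Float, so no precision is lost.
  total_loan / length_months
end

puts monthly_payment("12600", "3").round(2)

# For comparison, Integer / Integer drops the fractional part.
puts 100 / 36
```

Converting once, right after reading the text, keeps the rest of the arithmetic free of surprises.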
Hope this helps!