How can I find the difference in years between two date columns? - lubridate

I'm doing research involving the age of teachers when they take certification tests. I have a column labelled Birth.Date and another labelled Test.Date; both hold dates in year-month-day format (e.g. 1990-12-26). I'm quite new to R and programming, and need to analyze my data.
I tried lubridate, but could only find directions for individual rows, not whole columns. I don't have the remotest clue how to proceed.
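For illustration only, the column-wise calculation can be sketched in Python rather than R. This is a minimal sketch, assuming the dates are ISO-formatted strings as in the question and that "age" means completed years; the sample rows and the helper name `years_between` are made up.

```python
from datetime import date

def years_between(start: date, end: date) -> int:
    """Completed years from start to end (the age at `end` for a birth date `start`)."""
    years = end.year - start.year
    # Subtract one if the anniversary has not yet been reached in the end year.
    if (end.month, end.day) < (start.month, start.day):
        years -= 1
    return years

# Made-up rows standing in for the Birth.Date and Test.Date columns.
birth_dates = ["1990-12-26", "1985-03-14", "1979-07-01"]
test_dates = ["2015-06-01", "2015-03-14", "2014-06-30"]

# Apply the same calculation to every row of the two columns.
ages = [
    years_between(date.fromisoformat(b), date.fromisoformat(t))
    for b, t in zip(birth_dates, test_dates)
]
print(ages)  # [24, 30, 34]
```

The point is that the per-row calculation stays the same; it is simply applied to each pair of values in the two columns.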

Related

PowerBI - Displaying the average of row figures in a matrix

I've been Googling around this problem for hours and haven't found a solution that suits my needs.
I have a large data set with agent activities and the total time in seconds each activity lasts. I'm pulling this together in a matrix, to display agent names on the left and the start date of each week across the top.
This is working as intended (I've used a measure to convert the seconds into hours) but I need the average of the displayed weeks as another column, or to replace the Total column.
I've tried solutions involving DAX measures but none are applicable, likely because I'm using a custom column (WeekStart) to roll up my numbers into weeks. Adding more complexity, I have 2 filters on the matrix: one to exclude any weeks older than 5 weeks in the past and another to exclude any future weeks.
In Excel I'd just add another column next to the table, averaging the 5 cells to the left of it. I could add it to the data table with a SUMIFS checking the Activity date is within the week range and dividing the result by 5. I can't do either of these in PowerBI and I'm new to the software so I'm at a loss as to how to do this.

OBIEE column measures are the same for different time periods

I was creating some analyses of revenue for past years. One thing I noticed is that the revenue measures for each month are the same across years for corresponding months. That is, revenue for April 2015 is the same as revenue for April 2016.
I did some searching to solve this problem. I found that our measure column 'Revenue' is aggregated along the time dimension as 'Last(sum(revenue))'. So the actual revenue value of April 2019 is treated by OBIEE as the last one and copied to April of the other years.
I can understand that the keyword 'last' may be the reason for this, but shouldn't the year, quarter, and month columns pick exactly the numbers that correspond to that date? Can someone explain how this works and suggest solutions, please?
Very simply put: The "LAST" is the reason. It doesn't "copy" the value though. It aggregates the values to the last existing value along the dimensional hierarchy specified.
The question is: What SHOULD that Saldo show? What is the real business rule?
Also, lastly: using technical column names and ALL UPPER CASE COLUMN NAMES in the BMM layer shouldn't be done. The names should be user-focused, readable and pretty. Otherwise everybody has to go and change them 50 times over and over in the front-end.
It's been a year since I posted this question, but a fix for this incorrect representation of data was added today. In the previous version of the rpd, we used an alternative solution: creating two measure columns of saldo (saldo_year and saldo_month), setting their levels at year and month respectively, and using them both in an analysis. This was a temporary solution until we built the second version of our rpd, since we realized that the structure of the old one wasn't completely correct and it was easier and less time-consuming to build a new one from the ground up than to fix the old one.
So, as @Chris mentioned, it was all about a correct time dimension and hierarchies. We thought we had created it with all requirements met, but recently we got the same problem in our analyses. Then we figured out that we hadn't set the id columns as primary keys in the month and quarter logical levels. After fixing that, we got the data we wanted. If anybody faces this kind of problem, the first thing to check in the rpd is how the time dimension and hierarchy are defined, and how logical levels, primary keys, and chronological keys are set in the hierarchy.

Advanced Excel Search and Sorting

I have an incredibly large spreadsheet that lists details for the computers in my company's inventory. We need to know how many systems we have that are x years old. I was able to sort it by model, but because the model names are wildly different it didn't help much. For example, one model name is
13-inch MacBook Pro (2011)
And another is
13-inch Retina MacBook Pro (Mid 2017)
The only constant value in the parentheses is the year at the end. I'm trying to write a formula that will spit out how many of each system there are. We need to know how many are 2011 computers, how many are 2017, etc. We are fine with grouping up "Early, Mid, Late" since we just need a year separation but those terms don't show up in every cell throwing my math off. The rows don't have to be sorted, I just need a count.
My plan of attack would be to first, convert the spreadsheet into a table using Insert > Table... this enables Excel to manage calculating columns for you.
The following assumes that the cell at the top of your list contains the word "Detail".
Second, I would make a new column at the far right with an equation like this:
=MID([@Detail], FIND(")",[@Detail])-4, 4)
...and I would tune the "Find" function and the "mid" function until it gives me just the year.
Third, sort the entire table by this new column. Tada!
Transfer the data to column A. Cells A1 to A1000 in my Example.
Enter the years in column C. Cells C2 to C20 in my example.
In cell D2, enter the following Array Formula, and drag it down.
=SUM(IFERROR(IF(VALUE(LEFT(RIGHT($A$1:$A$1000,5),4))=C2,1,0),"-"))
Array Formulas are entered using Control + Shift + Enter, instead of Enter.
The Formula takes the last 5 characters of all entries in the column A. Then it takes the first 4 characters of this new text (to eliminate the closing bracket) and converts the text entries to numerical values. It matches each entry with the year in column C, and totals the matches.
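The same idea (take the four digits just before the closing parenthesis, then count matches per year) can be sketched outside Excel. This is only an illustration in Python, assuming every dated model name ends with the year followed by ")"; the sample names beyond the two in the question and the helper `year_of` are hypothetical.

```python
import re
from collections import Counter

# Hypothetical sample of model names; the real data would come from the spreadsheet.
models = [
    "13-inch MacBook Pro (2011)",
    "13-inch Retina MacBook Pro (Mid 2017)",
    "15-inch MacBook Pro (Late 2011)",
    "iMac (Early 2017)",
    "Mac mini",  # no year in parentheses, so it is skipped
]

# Four digits immediately before a closing parenthesis at the end of the text.
YEAR_AT_END = re.compile(r"(\d{4})\)\s*$")

def year_of(model):
    """Return the trailing year as an int, or None when the name has no year."""
    match = YEAR_AT_END.search(model)
    return int(match.group(1)) if match else None

# Count how many systems fall in each year, ignoring names without a year.
counts = Counter(y for y in map(year_of, models) if y is not None)
print(sorted(counts.items()))  # [(2011, 2), (2017, 2)]
```

Because only the digits before the final ")" are read, prefixes such as "Early", "Mid" and "Late" do not affect the count.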
I hope this solves your problem.
Regards,
Vijaykumar Shetye,
Spreadsheet Excellence,
Panaji, Goa India

Extract combination of items from database based on column values

I have a database of food items (approx. 2000 items) and their nutritional values (nutrition per 100 grams). I want to create a function that extracts one optimal or multiple combinations of food items that are as close to a certain defined combination of nutritional values as possible.
It's kind of like an extreme version of the knapsack problem. What I'm doing currently is sorting the items by each nutritional value, and manually combing through them to find combinations that work somewhat. But figuring out a way to do this programmatically would be gold.
Not sure if I'm in the right forum or if the question is specific enough. And I'm certainly not expecting code snippets or full solutions. Just trying to figure out if it's even possible to accomplish.
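To show that the task is at least expressible in code, here is a rough Python sketch of an exhaustive search over small combinations. Everything in it is an assumption for illustration: the food names and nutrient values are made up, each food is taken as a fixed 100 g portion, and closeness is measured as squared distance to the target. Exhaustive search only works for a handful of items; with around 2000 items the number of combinations grows far too quickly, so a heuristic or an optimization solver would be needed.

```python
from itertools import combinations

# Made-up nutrition per 100 g: (protein, carbs, fat) in grams.
foods = {
    "oats": (13.0, 68.0, 7.0),
    "chicken": (31.0, 0.0, 4.0),
    "rice": (3.0, 28.0, 0.3),
    "almonds": (21.0, 22.0, 50.0),
    "lentils": (9.0, 20.0, 0.4),
}

def total(names):
    """Sum each nutrient over 100 g of every named food."""
    return tuple(sum(foods[n][i] for n in names) for i in range(3))

def distance(names, target):
    """Squared distance between the combined nutrients and the target."""
    return sum((a - b) ** 2 for a, b in zip(total(names), target))

def best_combinations(target, size, top=3):
    """Return the `top` combinations of `size` foods closest to `target`."""
    ranked = sorted(combinations(foods, size), key=lambda c: distance(c, target))
    return ranked[:top]

target = (40.0, 50.0, 5.0)  # hypothetical desired protein, carbs, fat
print(best_combinations(target, size=2))
```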

Appointments scheduling algorithm [duplicate]

This question already has answers here:
Best Fit Scheduling Algorithm
(4 answers)
Closed 9 years ago.
I have the following problem to solve; perhaps you could give me some ideas regarding the problem-solving approach:
There are:
8 classrooms
16 teachers
201 students
149 parents
241 appointments (most of the parents need to see multiple teachers, either because they have more than one child or because one child is being taught by two or more teachers)
2 days.
For each day:
7 classrooms are available for 20 hours per day.
1 classroom is available for 10 hours per day.
Each teacher occupies one classroom
Each appointment lasts for one hour
Further constraints:
- For each parent, all appointments must be sequential (with a pause of at most 1 hour).
- Each parent should have to visit the school on one day only.
- For each teacher, all appointments within the day must be sequential (with a pause of at most 2 hours).
- Out of the 16 teachers, 3 can only be present on one of the two days.
I'm trying to find an approach to produce the appointments schedule, obviously without having to calculate all possible variations until all the requirements are met. Any ideas?
You need to define your constraints a bit more; i.e., what is the relationship between students and teachers? What is the relationship between students and parents? Do parents have to have individual appointments, or are the parents of a single student allowed to meet with a single teacher together?
I'd approach this with an initial (in testing) naive approach; just pick your highest constrained resource (looks like the teachers that can only be present for one of the two days), schedule those using the first available resources, and then continue through the set of resources, scheduling them using the first available resources that match their constraints and see if you can schedule the entire set. If not, you need to find your limiting resource(s), and apply some heuristics to your matching to find the best way to optimize those limited resources.
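The naive approach described above could look roughly like the following Python sketch. It is deliberately simplified: it only places the most constrained teachers first and gives each appointment the first slot free for both teacher and parent. Classrooms, the "sequential" rules, and the one-day-per-parent rule are not modelled, and the function name, data layout, and toy data are all hypothetical.

```python
def schedule(appointments, teacher_days, hours_per_day):
    """Greedy first-fit: appointments of the most constrained teachers go first.

    appointments: list of (parent, teacher) pairs, one hour each.
    teacher_days: dict mapping teacher -> set of days the teacher is present.
    Returns (placed, unplaced); placed maps (parent, teacher) -> (day, hour).
    """
    # Teachers present on fewer days are the scarcer resource, so place them first.
    order = sorted(appointments, key=lambda a: len(teacher_days[a[1]]))
    busy = set()  # (role, person, day, hour) tuples already taken
    placed, unplaced = {}, []
    for parent, teacher in order:
        # First (day, hour) where both the teacher and the parent are free.
        slot = next(
            (
                (day, hour)
                for day in sorted(teacher_days[teacher])
                for hour in range(hours_per_day)
                if ("T", teacher, day, hour) not in busy
                and ("P", parent, day, hour) not in busy
            ),
            None,
        )
        if slot is None:
            unplaced.append((parent, teacher))
            continue
        day, hour = slot
        busy.add(("T", teacher, day, hour))
        busy.add(("P", parent, day, hour))
        placed[(parent, teacher)] = slot
    return placed, unplaced

# Hypothetical toy data, far smaller than the 241 appointments in the question.
teacher_days = {"T1": {0}, "T2": {0, 1}}
appointments = [("P1", "T2"), ("P1", "T1"), ("P2", "T1")]
placed, unplaced = schedule(appointments, teacher_days, hours_per_day=2)
print(placed, unplaced)
```

Anything left in `unplaced` points at the limiting resource, which is where the heuristics mentioned above would come in.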
It's a bit of a tricky problem; have fun!
Take a look at the curriculum course track of ITC2007 and its implementation in Drools Planner or UniTime. Both of them use metaheuristics such as tabu search and simulated annealing.

Resources