I need help with either an advanced filter or an expression in either SAS EG (4.3) or Excel (2013).
I have a dataset with different breeds, father's country of origin and birth dates for pigs. I need to compare offspring born on the same date, for each breed, but entries must be limited to dates that include one international country and the birth country.
For example: if countries B and C are international countries, then for all pigs born in January 2010, if breed 1 only has offspring from country A and breed 2 has offspring from both country A and C, only breed 2's entries should be shown, for both countries. Also, there must always be one entry from country A.
So if the entries are:
Date Breed Country
Jan 2010 1 A
Jan 2010 1 A
Jan 2010 2 A
Jan 2010 2 C
Feb 2013 1 B
Feb 2013 1 B
I only want to see
Date Breed Country
Jan 2010 2 A
Jan 2010 2 C
Any help will be greatly appreciated!!
Here's a possible Excel approach. Using COUNTIFS is just a way of seeing if there is at least one case in a group with same date and breed where the following conditions are satisfied.
(1) At least one with country A (in E2):-
=COUNTIFS($A$2:$A$7,"="&A2,$B$2:$B$7,"="&B2,$C$2:$C$7,"="&C2,$D$2:$D$7,"=A")
(2) At least one with a different country (in F2):-
=COUNTIFS($A$2:$A$7,"="&A2,$B$2:$B$7,"="&B2,$C$2:$C$7,"="&C2,$D$2:$D$7,"<>"&D2)
Then combining the two conditions (in G2):-
=AND(E2,F2)
then pull the formulae down and filter on column G.
You could also do it with an array formula but I think this is easier to understand.
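For comparison, here is a minimal Python sketch of the same check, assuming the rows are (date, breed, country) tuples and that "A" is the birth country. The function name `filter_rows` and the `home_country` parameter are made up for illustration; they are not part of the question or of the Excel formulas.

```python
from collections import defaultdict

def filter_rows(rows, home_country="A"):
    """Keep rows whose (date, breed) group contains the home country
    and at least one other country (same idea as the two COUNTIFS columns)."""
    countries_by_group = defaultdict(set)
    for date, breed, country in rows:
        countries_by_group[(date, breed)].add(country)

    kept = []
    for date, breed, country in rows:
        countries = countries_by_group[(date, breed)]
        # Condition (1): the group has the home country.
        # Condition (2): the group has more than one distinct country.
        if home_country in countries and len(countries) > 1:
            kept.append((date, breed, country))
    return kept

rows = [
    ("Jan 2010", 1, "A"),
    ("Jan 2010", 1, "A"),
    ("Jan 2010", 2, "A"),
    ("Jan 2010", 2, "C"),
    ("Feb 2013", 1, "B"),
    ("Feb 2013", 1, "B"),
]
print(filter_rows(rows))
```

With the sample data from the question, only the two breed 2 rows for Jan 2010 are kept.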
Given the new legal regulations due to covid-19 the daycare of my child and all involved parents are overwhelmed and we need to come up with a schedule of when which child can be cared for.
Given the demanded care time per child (see below), we need an algorithm that optimizes the following:
1. Minimal total contacts / fixed groups. If children meet, it is best if they stay in that group and do not see children of other groups.
2. While point 1 is more important, the second priority would be to reduce the size per group, or, phrased differently, to minimize the count of different children met per child.
3. Even less important: reduce total contact time.
(Maybe there are other requirements, that I overlooked?)
The demands are of following nature (Timespan and type) :
| Case | Child | Timespan | Type |
|---|---|---|---|
| (1) Fixed time, required | 1 | Monday, 8:30 - 13:00 | Required |
| (2) Fixed time, nice to have | 1 | Tuesday, 8:30 - 13:00 | Nice to have |
| (see (1)) | 1 | Tuesday, 13:00 - 16:00 | Required |
| (see (1)) | 1 | Thursday, 8:30 - 13:00 | Required |
| (see (1)) | 2 | Monday, 8:30 - 13:00 | Required |
| (3) Flexible date, required | 2 | Any 2 other days, 8:30 - 13:00 | Required |
| (4) Flexible date, nice to have | 2 | Any day, 13:00 - 16:00 | Nice to have |
| (5) Flexible datetime, required | 3 | 3 hours | Required |
| (6) Flexible datetime, nice to have | 3 | 3 additional hours | Nice to have |
| ... | ... | ... | |
Required = The child must have daycare
Nice to have = daycare is demanded but not required. E.g. if child 1 meets child 2 and 3 on Monday and Thursday, it would be fine to meet the same children on Tuesday morning as well, but if it is a completely different group of children, then this would not make sense.
All provided timespans must stay in one continuous piece (meaning that 3 hours cannot be split up into multiple slots).
Additional Information
There is only one room available.
There are 15 children in total.
If one solution is much better than another, it is ok-ish to violate "Required" demands in a few cases, since we might be able to find a different arrangement with the parents in a few situations. The algorithm should hence take a parameter like maxAllowedViolations (let's say 3) and compare how much better the solution with violations is than the best one without.
The demand is provided per week and might change from week to week. I only know the demand one week in advance. The ideal grouping hence might change per week, but it might be better to respect the grouping of the last week as a guidance, because corona has about 7 to 10 days of incubation time.
The caregivers are tested for covid-19 twice a week, the children are not.
I do not care in which language or pseudo-code-ish way the algorithm is, but I will try to implement the algorithm in a web-based format so other daycare centers can use it as well.
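To make the objectives above concrete, here is a rough Python sketch of how a candidate schedule could be scored. It is only the scoring part, not an optimizer. The data layout (a dict from child id to a list of (day, start_minute, end_minute) slots) and the names `overlap_minutes` and `score` are assumptions made up for this illustration.

```python
from itertools import combinations

def overlap_minutes(slot_a, slot_b):
    """Minutes two (day, start_minute, end_minute) slots share in the room."""
    day_a, start_a, end_a = slot_a
    day_b, start_b, end_b = slot_b
    if day_a != day_b:
        return 0
    return max(0, min(end_a, end_b) - max(start_a, start_b))

def score(schedule):
    """Return (contacts per child, total pairwise contact minutes).

    There is only one room, so any two children present at the same
    time are counted as being in contact.
    """
    contacts = {child: set() for child in schedule}
    total_minutes = 0
    for c1, c2 in combinations(schedule, 2):
        minutes = sum(
            overlap_minutes(s1, s2)
            for s1 in schedule[c1]
            for s2 in schedule[c2]
        )
        if minutes > 0:
            contacts[c1].add(c2)
            contacts[c2].add(c1)
            total_minutes += minutes
    return contacts, total_minutes

# Hypothetical candidate: children 1 and 2 share Monday morning, child 3 is alone.
schedule = {
    1: [("Mon", 8 * 60 + 30, 13 * 60)],
    2: [("Mon", 8 * 60 + 30, 13 * 60)],
    3: [("Tue", 8 * 60 + 30, 11 * 60 + 30)],
}
contacts, total_minutes = score(schedule)
print(contacts, total_minutes)
```

A search procedure could then compare candidates first by the largest number of distinct contacts per child, and only afterwards by total contact minutes, matching the priority order given above.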
I was wondering if Cognos Framework Manager has the built-in function "Last" like in Dynamic Cubes?
Or does someone know how to model following case:
We have two dimensions: a time dimension with year, half-year, quarter and month, and another dimension that categorises people depending on how long they have been attending a project (1-30 days, 31-60 d, 60-180, 180-365, 1-2 years, +2 years). However, the choice of the time dimension level (year, half-year etc.) influences the categorization in the other dimension.
An example:
A person attends a project starting on 15.11.2018 and ending on 30.06.2020. The Cognos user chooses the year level for the time dimension, thus 2018, 2019 & 2020 will be displayed.
For 2018 the person will be in the category 31-60 days, since 46 days have passed by 31.12.2018. For 2019 the person will be listed in category 1-2 years, as 46 + 365 days will have passed by 31.12.2019. For 2020 the person will also be in that category, as 46 + 365 + 180 days will have gone by.
The categories will change if the user selects another time dimension level e.g. half-years:
2nd HY 2018: 31-60 (46 days passed)
1st HY 2019: 180-365 days (46 + 180 --> End of HY2019)
2nd HY 2019: 1-2 years (46 + 180 + 180)
1st HY 2020: 1-2 years (46 + 180 + 180 + 180)
Does someone know how to model dynamic dimension categories based on selection of another dimension (here time dimension)?
The fact table contains monthly data, and for the person mentioned above there will be 20 separate records (one for each month between November 2018 and June 2020).
For any period, a person may or may not be working on a project.
Without knowing exactly what your data and metadata is it would be somewhat difficult to prescribe an exact solution but the approach would probably be somewhat similar to a degenerate dimension scenario.
You would want to model the project dimension as a fact as well as a dimension. You would have relationships between it and time and whatever other dimensions you need.
Depending on the data and the metadata you might need to do some gymnastics to get there.
If the data were in a form similar to this, it would not be too difficult. This is an example to give you an idea of some ways of approaching the problem. The columns are Date_Key, Person_Key, Project_Key and commitment_status, which would be the measure.
Date_Key Person_Key Project_Key commitment_status
20200101 1 1 1
20200101 1 2 0
20200101 1 3 0
20200102 1 1 1
20200102 1 2 0
20200102 1 3 0
20200103 1 1 0
20200103 1 2 1
20200103 1 3 0
In the above, person 1 was working on project 1 for 2 days and then put onto project 2 for a day. By aggregating the commitment status, which is done by setting the aggregate rule property, you would be able to determine the number of days a person has been working on a project no matter what time period you have set in your query.
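Outside of Cognos, the effect of that sum aggregation can be illustrated with a small Python sketch over the sample rows. This is not Framework Manager syntax; the function name `days_on_project` and the date-key range parameters are invented for the illustration.

```python
from collections import defaultdict

# (Date_Key, Person_Key, Project_Key, commitment_status), as in the sample above.
rows = [
    (20200101, 1, 1, 1),
    (20200101, 1, 2, 0),
    (20200101, 1, 3, 0),
    (20200102, 1, 1, 1),
    (20200102, 1, 2, 0),
    (20200102, 1, 3, 0),
    (20200103, 1, 1, 0),
    (20200103, 1, 2, 1),
    (20200103, 1, 3, 0),
]

def days_on_project(rows, start_key, end_key):
    """Sum commitment_status per (person, project) within a date-key range."""
    totals = defaultdict(int)
    for date_key, person, project, status in rows:
        if start_key <= date_key <= end_key:
            totals[(person, project)] += status
    return dict(totals)

print(days_on_project(rows, 20200101, 20200103))
```

Changing the date range plays the role of picking a different time level in the query: the sum is simply recomputed over whatever rows fall inside the selected period.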
I have a line chart (x represents the date, y represents the number of car rentals on that date) that needs to be connected at all times, since the values are all valid: there is always at least one car rental per date. The only time the line shouldn't be connected, and should instead show a gap between two valid values/points, is when two successive dates are too wide apart. I have to figure out the best algorithm for what this 'too wide apart' means and, based on these dates (or something), set a parameter. I don't know all the possible combinations of dates, but I think they can be anything:
2010 2011 2013 2018 2019
or
1990 2001 2002 2012 2015
or
possibly anything else
Is there any standard way to deal with this kind of problem?
The problem is to characterize what it means to be too wide apart. One solution is to build a histogram (i.e. an empirical approximation of the probability distribution) of the date differences between consecutive x coordinates of the data points, and then to consider as too wide those differences that are in, say, the top 33% (or whatever other proportion you wish).
For example, suppose the x coordinates are the years:
1990 1995 2001 2002 2003 2010 2011 2012 2013 2017 2019
Let's say we calculate date differences in years (we could choose any other duration unit). We calculate the differences between consecutive values above and build the histogram below.
Counts: 5 1 0 1 1 1 1
Diff.: 1 2 3 4 5 6 7
Now, if we choose to disconnect the differences in the top 33%, the histogram shows that differences greater than or equal to 5 years would be disconnected (that is 3 of the 10 differences, or 30%; including the 4-year difference would already give 40%).
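Here is a small Python sketch of that idea, assuming the x values are sorted, non-empty numbers (years here). The function names `gap_threshold` and `split_segments` and the `top_fraction` parameter are my own choices for the illustration, not a standard API.

```python
from collections import Counter

def gap_threshold(xs, top_fraction=0.33):
    """Smallest difference t such that the share of differences >= t
    does not exceed top_fraction. Returns None if no such t exists."""
    diffs = [b - a for a, b in zip(xs, xs[1:])]
    for t in sorted(set(diffs)):
        if sum(d >= t for d in diffs) <= top_fraction * len(diffs):
            return t
    return None

def split_segments(xs, threshold):
    """Split xs into runs; a new run starts where the gap is >= threshold."""
    segments = [[xs[0]]]
    for prev, cur in zip(xs, xs[1:]):
        if threshold is not None and cur - prev >= threshold:
            segments.append([cur])
        else:
            segments[-1].append(cur)
    return segments

years = [1990, 1995, 2001, 2002, 2003, 2010, 2011, 2012, 2013, 2017, 2019]
histogram = Counter(b - a for a, b in zip(years, years[1:]))
threshold = gap_threshold(years)
segments = split_segments(years, threshold)
print(sorted(histogram.items()))
print(threshold)
print(segments)
```

Each inner list in `segments` would be drawn as one connected piece of the line, with a gap between consecutive pieces. Ties are handled conservatively: if including a given difference would push the share above `top_fraction`, all differences of that size stay connected.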
I'm working on a Universal windows store app and I want to add a simple calendar so that users can add birthdays of their friends and save them and some other stuff. Is there any simple way to do this?
PS: Using Visual Studio 2013
What I do understand about calendars is that there are fourteen possible ones in total: seven for each day of the week a normal year can begin on, and seven more for leap years. You may have to create all fourteen and put them in somehow. Alternatively, there is a formula for working out what day of the week anyone was born on from their date of birth, and formulae can be programmed in R or other languages, perhaps using a for loop or an if statement. The formula is:
{4(d+y)+x-4c}/28, where d is the number of the day within the year (so for 1st January d=1, and so on, with leap years differing from 29th February onwards), y is the year in question, x is the closest year before y that is divisible by four, and c is an era constant. For modern dates c=0, so no worries. It changes if you go back before September 1752, when the British switched from the Julian to the Gregorian calendar (in Catholic Europe this was October 1582), but that is only a concern if you are into such historical dates. For those c=1, though it may have different values much earlier. If you want to know more, reply to this and I will get out what I have written on it.
For example, if one was born on 17th May 1979, you would go:
{4(137+1979)+1976-4×0}/28, which equals 372 plus six sevenths. You ignore the whole number and look at how many sevenths there are. Six means Thursday, and thus 17th May 1979 was a Thursday. That is it in short; for more details, if this interests you, feel free to ask.
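As a rough sketch, the formula can be written in Python like this. Two things are assumptions on my part: the table mapping the number of sevenths to a weekday is extrapolated from the single worked example (six means Thursday, so zero is taken to be Friday and so on in order), and x is taken as the closest multiple of four strictly before y. With c=0 this agrees with Python's own `datetime` for 1901 to 2099; other values of c are not checked here.

```python
from datetime import date

# Index = number of sevenths left over; inferred from "six means Thursday".
WEEKDAYS = ["Friday", "Saturday", "Sunday", "Monday",
            "Tuesday", "Wednesday", "Thursday"]

def weekday_from_formula(year, month, day, c=0):
    """Weekday via {4(d+y)+x-4c}/28, reading off the leftover sevenths."""
    d = date(year, month, day).timetuple().tm_yday  # day number within the year
    x = 4 * ((year - 1) // 4)                       # closest year before y divisible by 4
    numerator = 4 * (d + year) + x - 4 * c
    sevenths = (numerator % 28) // 4                # fractional part, in sevenths
    return WEEKDAYS[sevenths]

print(weekday_from_formula(1979, 5, 17))
```

For a calendar control you would normally just call `date(year, month, day).weekday()`; the sketch is only meant to show the formula at work.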
The simple way is to use third-party controls like Telerik Calendar or something free like this one.
I am trying to understand the Time dimension in SSAS.
In SSAS we have an option to create a Time dimension. I have two questions related to it:
1. What difference does it make if I generate a Regular calendar, Fiscal calendar, Manufacturing calendar or ISO 8601 calendar?
2. Once the dimension is created, is it possible to update it? Say I generated it for the range 1 Jan 2012 to 31 Dec 2012 and now I want to extend it to 31 Jan 2013; is that possible?
Thanks in advance.
I don't know the answer to #1, but to #2, yes, you will be able to extend the range of your time dimension after it is created.