Extract which ID appears in consecutive years - time

I am having trouble extracting from my data only the IDs that appear in at least two consecutive years. I couldn't find a similar question with a solution; apologies if this is a duplicate.
I'll create a data example:
ID= c(1, 2, 1, 3, 4, 2, 1, 5, 4, 1, 2, 6, 7, 3, 1, 2,6,9,5)
Year= c(2006, 2006, 2006, 2006, 2006, 2006, 2006,2006, 2007,2007,2007,2007,2007,2007,2007,2007,2008,2008,2008)
DF<- data.frame(ID, Year)
I would like a result that shows which IDs appear in consecutive years, namely IDs 1, 2, 3, 4 and 6. ID 5 also appears twice, but not in consecutive years, and the remaining IDs appear in only one year.

Surely there are several ways to do this. One fairly simple way (which assumes the years are already sorted as in your data example) is to first compute the minimum of the differences of the unique years for each ID:
mdu = aggregate(Year~ID, DF, function(y) suppressWarnings(min(diff(unique(y)))))
This yields (with your data example):
  ID Year
1  1    1
2  2    1
3  3    1
4  4    1
5  5    2
6  6    1
7  7  Inf
8  9  Inf
(The suppressWarnings serves to silence a warning from min for an empty diff list, coming from IDs only appearing in one year.)
Now, the wanted IDs are those where the minimum year difference is 1 (which means the ID appears in consecutive years); they can be easily extracted by:
mdu$ID[mdu$Year==1]
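For readers who want to cross-check the result outside R, here is a minimal Python sketch of the same idea. The function name `ids_in_consecutive_years` is hypothetical, and unlike the R one-liner it does not assume the years are sorted: it collects the set of years per ID and keeps an ID if some year and the following year are both present.

```python
from collections import defaultdict


def ids_in_consecutive_years(ids, years):
    """Return the sorted IDs that occur in at least two consecutive years."""
    years_by_id = defaultdict(set)
    for id_, year in zip(ids, years):
        years_by_id[id_].add(year)
    # An ID qualifies if any of its years is directly followed by another one.
    return sorted(
        id_
        for id_, seen in years_by_id.items()
        if any(year + 1 in seen for year in seen)
    )


# Same data as the R example in the question.
ids = [1, 2, 1, 3, 4, 2, 1, 5, 4, 1, 2, 6, 7, 3, 1, 2, 6, 9, 5]
years = [2006] * 8 + [2007] * 8 + [2008] * 3

print(ids_in_consecutive_years(ids, years))  # [1, 2, 3, 4, 6]
```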

Related

SPSS Randomly select a number from a respondent's input

I'm hoping there is SPSS syntax that I can use to randomly select a number from among several variables. For example, the data lists the ages of respondents' children in four variables: Age1 Age2 Age3 Age4
Resp 1: 3 6 8
Resp 2: 2 10
Resp 3: 4
I want to create a variable that stores a randomly selected age for each respondent - something like:
Resp 1: 6
Resp 2: 2
Resp 3: 4
The code I'm using at the moment:
COUNT kids=age1 to age4 (1 thru 16).
COMPUTE rand=RND(RV.UNIFORM(1, kids),1).
DO REPEAT
x1=age1 to age4
/x2=1 to 4.
IF (rand=x2) random_age=x1.
END REPEAT.
Here is my suggested code for the task.
First creating some sample data to demonstrate on:
data list list/id age1 to age4 (5f2).
begin data
1, 4, 5, 6, 7
2, 4, 5, 6,
3, 6, 7,,
4, 8,,,
5, 5, 6, 7,
6, 10,,,
end data.
Now to randomly select one of the ages:
compute numages=4-nmiss(age1 to age4).
compute SelectThis = rnd(uniform(numages)+.5).
do repeat ag=age1 to age4 /ind=1 to 4.
if SelectThis=ind SelectedRandAge=ag.
end repeat.
exe.
Well, here's my attempt for the time being:
data list list /age1 to age4.
begin data.
10 9 5 8
3
13 15
1 4 5
4 7 8 2
end data.
count valid=age1 to age4 (lo thru hi).
compute s=trunc(1+uniform(valid)).
vector age=age1 to age4.
compute myvar=age(s).
list age1 to age4 myvar.
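The logic shared by both answers (count the non-missing ages, draw a uniform index between 1 and that count, then read the age at that index) can be sketched in Python for comparison. This is only an illustration, not SPSS syntax: the function name `pick_random_age` is hypothetical and missing values are represented as `None`.

```python
import random


def pick_random_age(ages, rng=random):
    """Pick one non-missing age uniformly at random; None if all are missing."""
    valid = [age for age in ages if age is not None]
    if not valid:
        return None
    # rng.choice draws each valid age with equal probability.
    return rng.choice(valid)


respondents = [
    [3, 6, 8, None],
    [2, 10, None, None],
    [4, None, None, None],
]

for ages in respondents:
    print(ages, "->", pick_random_age(ages))
```

Filtering out the missing values first means the sketch does not depend on the valid ages being stored in the leftmost variables, which the index-based SPSS versions do rely on.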

Creating simple repeating number sequence in SPSS

I want to create the following sequence in SPSS syntax. I've tried LOOP and DO REPEAT, but cannot figure out how to re-create this:
1 1 1 2 2 2 3 3 3 4 4 4 5 5 5
Your question is not really clear enough, so I'm just guessing. Please edit your question so we can know whether this is the right solution (and for the benefit of future readers).
If what you want is a variable that has the values 1, 1, 1, 2, 2, 2, 3, 3, 3, etc., here is a way to get that:
compute MyVar=trunc(($casenum-1)/3)+1.
exe.
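To see why the formula works, here is the same arithmetic as a small Python sketch (the function name `repeated_sequence` and the `block` parameter are hypothetical). The case number is shifted to start at 0, integer-divided by the block length, and shifted back to start at 1.

```python
def repeated_sequence(n_cases, block=3):
    """Mirror trunc((casenum - 1) / block) + 1 for case numbers 1..n_cases."""
    return [(case - 1) // block + 1 for case in range(1, n_cases + 1)]


print(repeated_sequence(15))
# [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5]
```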

Getting the combination of facevalues that gives the highest score in a dicegame

I'm working on a dice game for school and I'm having trouble figuring out how to calculate the result automatically. (We don't have to do it automatically, so I could just let the player choose which dice to use and then check that the choices are valid, but now that I have started to think about it I can't stop.)
The problem is as follows:
I have six dice, normal dice with values 1-6.
In this example I have already rolled the dice and they have the following values:
[2, 2, 2, 1, 1, 1]
But I don't know how to work out the combinations so that as many dice combinations as possible whose values add up to 3 (in this example) are used.
The values are added together (for example, a die with value 1 and another die with value 2 together make 3). The game has different rounds, and the aim in each round is to reach a different value, which can be a sum of die values. For example,
dicevalues: [2, 2, 2, 2, 2, 2]
could give the user a total of 12 points if 4 is the goal for the current round:
2 + 2 = 4
2 + 2 = 4
2 + 2 = 4
If the goal of the round were 6 instead, it would be
2 + 2 + 2 = 6
2 + 2 + 2 = 6
instead, which would also give the player 12 points (6 + 6).
[1, 3, 6, 6, 6, 6]
with a goal of 3 would only use the die with value 3 and discard the rest, since there is no way to add them up to get three.
Going back to the first roll, [2, 2, 2, 1, 1, 1], with a goal of 3:
2 + 1 = 3
2 + 1 = 3
2 + 1 = 3
would give the user 9 points.
But if it were calculated the wrong way, with the ones used up together (1 + 1 + 1) instead of each 1 being paired with a 2, the player would only get 3 points, since the twos could no longer be used.
Another example is:
[1, 2, 3, 4, 5, 6]
where every combination that adds up to 6 gives the user points:
[6], [5, 1], [4, 2]
user gets 18 points (3 * 6)
[1, 2, 3], [6]
user gets 12 points (2 * 6) (here the user gets six points fewer by adding up 1 + 2 + 3 instead of grouping the dice as in the line above)
A die can have a value between 1 and 6.
I haven't really done much more than think about it. I'm pretty sure I could write something right now, but it would be a solution that scales really badly if, for example, I wanted to use 8 dice instead, and every time I start programming it I start to think there has to be a better/easier way of doing it. Does anyone have a suggestion on where to start? I tried searching for an answer and I'm sure it's out there, but I have trouble formulating a query that gives me relevant results.
With problems that look confusing like this, it is a really good idea to start by working through some examples. We have 6 dice, each with a value from 1 to 6. The possible combinations we could make are therefore:
target = 2
1-die combination: 2
2-dice combination: 1+1
target = 3
1-die combination: 3
2-dice combination: 2+1
3-dice combination: 1+1+1
target = 4
1-die combination: 4
2-dice combinations: 3+1, 2+2
3-dice combination: 2+1+1
4-dice combination: 1+1+1+1
target = 5
1-die combination: 5
2-dice combinations: 4+1, 3+2
3-dice combinations: 3+1+1, 2+2+1
4-dice combination: 2+1+1+1
5-dice combination: 1+1+1+1+1
See the pattern? Hint: for the first number we go backwards from the target down to 1, and given this first number and the size of the combination, there is a limit to how big the subsequent numbers can be!
There is a finite list of possible combinations. You can start by looking for 1-die combinations and removing those dice from the ones available, then move on to 2-dice combinations, and so on.
If you want to read more about this sub-field of mathematics, the term you need to look for is "Combinatorics". Have fun!
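As a rough illustration of the enumeration described above, here is a Python sketch. The function names `partitions` and `best_score` are hypothetical, and it swaps the "smallest combinations first" heuristic for an exhaustive search with memoisation. That is an assumption on my part, made so that the result does not depend on the order in which combinations are tried; with a handful of dice the search space is tiny.

```python
from functools import lru_cache


def partitions(target, max_face=6):
    """List every non-increasing tuple of die faces that sums to target."""
    def build(remaining, largest):
        if remaining == 0:
            yield ()
            return
        # Go backwards from the largest face still allowed down to 1.
        for face in range(min(remaining, largest), 0, -1):
            for rest in build(remaining - face, face):
                yield (face,) + rest
    return list(build(target, max_face))


def best_score(dice, target):
    """Maximum points: target times the largest number of disjoint groups."""
    combos = partitions(target)

    @lru_cache(maxsize=None)
    def most_groups(counts):
        best = 0
        for combo in combos:
            left = list(counts)
            for face in combo:
                left[face - 1] -= 1
            if min(left) >= 0:  # the combination fits in the remaining dice
                best = max(best, 1 + most_groups(tuple(left)))
        return best

    # counts[i] is how many dice show the face i + 1
    counts = tuple(dice.count(face) for face in range(1, 7))
    return target * most_groups(counts)


print(best_score([2, 2, 2, 1, 1, 1], 3))  # 9
print(best_score([1, 2, 3, 4, 5, 6], 6))  # 18
```

Working on face counts rather than on individual dice keeps the memoisation table small, so the same sketch should remain usable with 8 dice.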

Binary tree in concentric circles

Recently I came across a question in an interview "Print a complete binary tree in concentric circles".
1
2 3
4 5 6 7
8 9 0 1 2 3 4 5
The output should be
1 2 4 8 9 0 1 2 3 4 5 7 3
5 6
Could anyone help me out on how we can solve this problem?
Here is how you can approach the problem. Arrange the tree by levels:
1
2, 3
4, 5, 6, 7
8, 9, 0, 1, 2, 3, 4, 5
So the data you have is k levels L1, L2, ..., Lk. Now answer these questions: after one step is executed, that is, after one circle has been traversed, what would the tree levels look like once the traversed elements have been removed? How should I modify the levels, and which elements should I print, so that it looks as if I've traversed one circle?
In your example, after the first step the levels would be modified to just:
5, 6
So what was the operation that was executed?
After you've answered these questions, just apply the same procedure repeatedly until you've printed all elements.
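One possible answer to those questions, sketched in Python under assumptions of my own: the tree is given as a list of levels, and one circle consists of the first remaining element of every level going down, then the rest of the lowest remaining level, then the last remaining element of every level going back up. The function name `concentric_circles` is hypothetical.

```python
def concentric_circles(levels):
    """Peel the level lists ring by ring and return the rings in order."""
    levels = [list(level) for level in levels]  # work on a copy
    circles = []
    while any(levels):
        live = [level for level in levels if level]
        circle = []
        # Down the left side: first remaining element of each level.
        for level in live:
            circle.append(level.pop(0))
        # Along the bottom: everything left in the lowest level.
        bottom = live[-1]
        circle.extend(bottom)
        bottom.clear()
        # Up the right side: last remaining element of each higher level.
        for level in reversed(live[:-1]):
            if level:
                circle.append(level.pop())
        circles.append(circle)
    return circles


tree = [[1], [2, 3], [4, 5, 6, 7], [8, 9, 0, 1, 2, 3, 4, 5]]
for circle in concentric_circles(tree):
    print(*circle)
# 1 2 4 8 9 0 1 2 3 4 5 7 3
# 5 6
```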

Quicksort algorithm

I used the quicksort algorithm to sort
11 8 9 4 2 5 3 12 6 10 7
and I got the list:
4 3 2 5 9 11 8 12 6 10 7.
5 was used as the pivot. Now I am stuck. How do I proceed to sort the lower sublist and the upper sublist?
pivot = 5                                                        11 8 9 4 2 5 3 12 6 10 7
Move pivot to position 0                                         5 8 9 4 2 11 3 12 6 10 7
i (position 1 = 8), j (position 6 = 3) ⇒ swap 8 and 3            5 3 9 4 2 11 8 12 6 10 7
i (position 2 = 9), j (position 4 = 2) ⇒ swap 9 and 2            5 3 2 4 9 11 8 12 6 10 7
i (position 3 = 4), no smaller elements than 5 ⇒ swap 5 and 4    4 3 2 5 9 11 8 12 6 10 7
The last row is the list after the partition.
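The partition steps traced above could be sketched in Python roughly as follows. This is a reconstruction from the trace, not a known reference implementation: the function name `partition` and its parameters are hypothetical, `i` scans from the left for an element larger than the pivot, `j` scans from the right for an element smaller than the pivot, and the pivot is finally swapped into position `j`.

```python
def partition(a, lo, hi, pivot_index):
    """Partition a[lo..hi] in place; return the pivot's final position."""
    a[lo], a[pivot_index] = a[pivot_index], a[lo]  # move pivot to the front
    pivot = a[lo]
    i, j = lo + 1, hi
    while True:
        while i <= j and a[i] <= pivot:  # skip elements already on the left
            i += 1
        while i <= j and a[j] >= pivot:  # skip elements already on the right
            j -= 1
        if i > j:                        # the scans crossed: nothing to swap
            break
        a[i], a[j] = a[j], a[i]
    a[lo], a[j] = a[j], a[lo]            # put the pivot between the halves
    return j


data = [11, 8, 9, 4, 2, 5, 3, 12, 6, 10, 7]
position = partition(data, 0, len(data) - 1, 5)
print(position, data)
# 3 [4, 3, 2, 5, 9, 11, 8, 12, 6, 10, 7]
```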
Quicksort is a recursive algorithm. Once you have partitioned the elements around the pivot, you get two sets of items: the first with all elements smaller than or equal to the pivot, and the second with all elements larger than the pivot. What you do now is apply quicksort again to each of these sets (with an appropriate pivot).
To do this, you will have to choose a new pivot every time. You can do something like always pick the first element, or draw one at random.
Once you reach a point where a set contains only one element, you stop.
A good way to understand these things is to try to sort a deck of cards using this algorithm. All cards are face down, and you are only allowed to look at two cards at a time, compare these and switch them if necessary. You must pretend to not remember any of the cards that are face down for that to work.
A key component of the algorithm is that the chosen pivot value came from the original list, which means (in your case) the element with the value 5 is now in the correct final position after the first partitioning:
4 3 2 5 9 11 8 12 6 10 7
This should be fairly obvious and follows simple intuition. If every element to the left of an item is smaller than that item and every element to the right is larger, then the item must be in the correct, sorted position.
The insight necessary to understanding the entire Quicksort algorithm is that you can just keep doing this to each of the sublists -- the list of values to the left of the pivot and the list containing all values to the right -- to arrive at the final, sorted list. This is because:
Each partitioning puts one more element in its proper position
Each iteration removes one element -- the pivot -- from the list of elements left to process, which is why we'll eventually reach the base case of zero or one elements (depending on how you do it)
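Let's assume you chose the pivot value of 5 by taking the element at the following index (pseudo-code):
Math.floor(list.length / 2)
For our purposes, the actual choice of a pivot doesn't really matter. This one works for your original choice, so we'll go with it. (One step below, the sublist [9, 11, 8, 6, 10, 7], uses 8 as its pivot rather than the element at that index, 6; any pivot leads to the same sorted result.) Now, let's play this out 'till the end (starting where you left off):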
concat(qs([4 3 2]), 5, qs([9 11 8 12 6 10 7])) =
concat(qs([2]), 3, qs([4]), 5, qs([9, 11, 8, 6, 10, 7]), 12, qs([])) =
concat(2, 3, 4, 5, qs([6, 7]), 8, qs([9, 11, 10]), 12) =
concat(2, 3, 4, 5, qs([6]), 7, qs([]), 8, qs([9, 10]), 11, qs([]), 12) =
concat(2, 3, 4, 5, 6, 7, 8, qs([9]), 10, qs([]), 11, 12) =
concat(2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12)
Note that each call to qs expands into the following pattern:
qs(<some_left_list>), <the_pivot>, qs(<some_right_list>)
So each call of qs on one line results in two more such calls on the following line, one for each new sublist (except that I immediately decompose calls to qs on single-value lists).
It's a good idea to go through this exercise yourself. Yes, with actual pen and paper.
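If you want to check your pen-and-paper result afterwards, here is a minimal Python sketch of the recursive pattern described above. It is a simple out-of-place version that builds new lists instead of swapping in place, and the pivot is always taken at index `len(items) // 2`. Both are assumptions made for brevity, so the order of elements inside intermediate sublists will differ from the trace.

```python
def qs(items):
    """Sort a list with quicksort, using the middle element as the pivot."""
    if len(items) <= 1:          # base case: zero or one element
        return list(items)
    mid = len(items) // 2
    pivot = items[mid]
    rest = items[:mid] + items[mid + 1:]
    left = [x for x in rest if x <= pivot]   # smaller than or equal to pivot
    right = [x for x in rest if x > pivot]   # larger than the pivot
    # Same shape as: qs(<some_left_list>), <the_pivot>, qs(<some_right_list>)
    return qs(left) + [pivot] + qs(right)


print(qs([11, 8, 9, 4, 2, 5, 3, 12, 6, 10, 7]))
# [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
```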
