From monthly to yearly data in TraMineR - time

I have been using TraMineR for a while now and I have a question about changing the time granularity of my sequences. At the moment my sequences are aligned on months, but for several reasons I would like to change this to years. I would like to use the longest spell in each year as the state for that year. In other words, if somebody was cohabiting for 4 months, then got married and stayed married for the other 8 months of the year 2000, I would like to code that person as married in 2000. Is there an easy way to do this with TraMineR?
Thanks in advance,
Tom

The seqgranularity function from the TraMineRextras package aggregates each successive subsequence of length tspan into a single state. The stable version on CRAN offers two aggregation methods, "first" and "last", which replace the sequence over the period with the first or the last state in the period, respectively.
The option you are looking for, i.e., replacing the period with its most frequent state, is currently being tested in the development version of TraMineRextras available from R-Forge. The argument is method="mostfreq".
Here is an example where we aggregate monthly data into yearly data:
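library(TraMineRextras)  # also loads TraMineR (seqdef, seqiplot, mvad)
data(mvad)
## columns 17 to 86 hold the monthly states
mvad.seq <- seqdef(mvad, 17:86)
## aggregate each block of 12 months into its most frequent state
mvad.seq2 <- seqgranularity(mvad.seq, tspan=12, method="mostfreq")
## index plots: monthly sequences on top, yearly sequences below
par(mfrow=c(2,1))
seqiplot(mvad.seq, withlegend=FALSE)
seqiplot(mvad.seq2, withlegend=FALSE)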
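To make the aggregation rule concrete, here is a minimal language-neutral sketch in Python of "most frequent state per block". It is not the TraMineRextras implementation: the function name aggregate_most_frequent, the tie-breaking rule (the state seen first in the block wins) and the handling of a trailing partial block (treated as a block of its own) are assumptions made for illustration. Note that the most frequent state is not always the longest spell: in a year coded A A A B B B B A A A C C, A is the most frequent state while B has the longest uninterrupted spell.

```python
from collections import Counter


def aggregate_most_frequent(sequence, tspan=12):
    """Collapse each successive block of `tspan` states into its most frequent state.

    Assumptions of this sketch: ties go to the state seen first in the block,
    and a shorter trailing block is aggregated like any other block.
    """
    aggregated = []
    for start in range(0, len(sequence), tspan):
        block = sequence[start:start + tspan]
        # most_common(1) returns the (state, count) pair with the highest count;
        # equal counts keep the order in which the states were first encountered
        aggregated.append(Counter(block).most_common(1)[0][0])
    return aggregated


# The example from the question: 4 months cohabiting, then 8 months married
months_2000 = ["cohabiting"] * 4 + ["married"] * 8
print(aggregate_most_frequent(months_2000))
```

For the example from the question, the year 2000 collapses to the single state "married".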

(Google Play) App dropped to rock bottom in multiple countries / top-lists after simple metadata change in a single country

Background info: Our game has been pretty stable for months in the top 100 roleplay category in over 55 countries while we've been hovering in the top 10 in around 5-8 countries.
Problem: After some metadata changes (minor title & short description adjustments) in a few countries, we immediately dropped 100-400 positions in all countries, whether or not the country was involved in the metadata changes. We couldn't believe our eyes and could not wrap our heads around how a ‘short description’ change in Poland can make us drop from #70 to #416 in the US-Roleplay charts. We dug through our data and found another similar occurrence: in June last year we changed the title of our game in France only, and our rank immediately got about three times worse in 20 or more countries as well. In June all positions recovered over a timespan of 8 days; unfortunately, this does not seem to be the case this time.
We’re aware of the importance of keywords and the impact metadata can have. We still rank very well on all our important keywords, and the traffic coming from Play Store searches hasn’t really changed either.
Have any of you ever experienced this? What information are we missing? Looking forward to hearing your thoughts. Needless to say, we appreciate all input :)
We also faced something similar around the time you posted this message. To confirm, is your app title longer than 30 characters (do check all listings)? Google might have started to penalize that. We are trying to figure out the same thing.
No luck with recovering the ranks.

WP7 Default dates

I have released an app which, within its functionality, displays date and time strings.
I am aware of the differing formats across cultures; however, in some cases I had hardcoded values. For example, I had gone with a custom format that used the 12 hr clock and showed AM/PM.
I am now changing to use the standard date time format strings where possible, and so, for my times, I am now using the shortTimePattern.
What has surprised me is that for the US this shows as, say, 3:15PM but in the UK it's 15:15, i.e. the default there is the 24 hr clock.
Similarly, in the US the long date includes the day of the week, whereas in the UK it does not.
I am thinking that these defaults must be right and are what is expected within that country but is this really the case? I had no idea that the UK default would be a 24 hr clock. And, for those users in the UK who have the app, will they be annoyed when the next update shows the time in this format?
Interested in any opinions around this.
thanks
UK users will not care whether it is 24hr or am/pm (we don't talk to each other saying "It's fourteen hundred o'clock" :P).
Dates are also fine unless you're using format of 12/02/12 as in the UK that's considered 12th Feb whereas in the states it's December 2nd.
This is not the place to solicit "opinions" on whether the behaviour provided by the framework is correct.
Assume that the framework is correct, unless you know otherwise.
If you want to know what your users will think you should ask them. (I assume that your users are not typical users of StackOverflow.) If you don't already have beta users in your target markets to ask then make the change to use the "standard" behaviour. If there is a problem your users will tell you.

Barcode Encryption of Personal Identifiers (or alternatives suggested by you)

I am trying to create a health application of a rather sensitive nature which will require some form of cryptography/obfuscation. There is a health study in which, once a year, known individuals with permanent and recognisable identifier numbers (e.g. KIG0005001 as an individual's identifier) walk into the clinic, are identified, and have their blood tested as part of the study. Next year, the same happens again, as this is a longitudinal study. Now, the results of the blood test should NOT be traceable to an actual individual (HIV status, etc. are highly sensitive bits of information that should not be linkable to actual individuals due to their right to privacy), but it is IMPERATIVE that we can identify year on year which blood samples belong to one unique individual (without knowing WHO the individual actually is; the emphasis is on the blood samples being traceable to one individual, not on the individual).
My idea (and here is where I am asking for your expertise in cryptography and obfuscation) is that when the individual visits the clinic they come with an identifying card with their regular ID number KIG0005001. This number is entered into a system which, via an algorithm/encryption, spits out a barcode (based on the original ID KIG0005001, so any future visit should produce the SAME barcode for a particular individual) which can be printed out as stickers. These barcode stickers are the ones used to identify the samples (stick 'em on the samples). The stickers should carry the following information: unique identifier (via barcode?), the round number in which the sample was taken (samples will be taken once a year, so year 1 = round 1) and the date the sample was taken.
Is this possible? What are the alternatives? How should I go about transforming KIG0005001 into an encrypted barcode which is repeatable year on year (so a blood sample can always be traced back to the same source)? I am programming in Java.
Thanks in advance,
Tumaini
To answer this question, I don't think it needs to be in the barcode section.
First of all, there is no way to keep everything 100% secure... but you can make it more complicated to be understood by a human.
It's the same thing as the passport controversy... A biometric passport must be secure: it's not possible to read the information without knowing the "private key". But let's say you read and record everybody's passport that enters your store and save it to a database. You will be able to trace who is coming back and even what they previously bought since you have their passport's ID...
To make life harder for your employees, you need to generate an ID that the system can match to the real person's ID. So if the employee is testing the blood of KIG0005001, they will receive a different unique ID for that day; the computer will know how to link them up. That way your employee has no idea who this number belongs to at that moment...
Cryptography is probably useless here since you work with IDs. Even gibberish data, repeated multiple times, is still an ID.
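For the repeatable code the question asks for, one common option is a keyed hash (HMAC) of the real ID. This is a technique swapped in for illustration, not something proposed in the answer above. The sketch below is in Python for brevity; Java offers the same primitive through javax.crypto.Mac with "HmacSHA256". The function names, the 16-character truncation and the label layout are assumptions, and the caveat above still holds: the resulting code is itself a stable identifier, and anyone holding the secret key and a list of real IDs can recompute the mapping, so the key must stay away from whoever handles the samples.

```python
import hashlib
import hmac


def pseudonym(participant_id: str, secret_key: bytes, length: int = 16) -> str:
    """Derive a repeatable code from a participant ID with HMAC-SHA256.

    The same ID and key always give the same code; without the key the code
    cannot be recomputed from the ID. Truncating to `length` hex characters
    is an assumption made to keep the barcode short.
    """
    digest = hmac.new(secret_key, participant_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:length].upper()


def label_text(participant_id: str, secret_key: bytes,
               round_number: int, sample_date: str) -> str:
    """Build the text to encode on a sticker: code, study round, sample date."""
    return f"{pseudonym(participant_id, secret_key)}|R{round_number}|{sample_date}"


# Hypothetical key; in practice it would be generated randomly and stored securely
key = b"replace-with-a-long-random-secret"
print(label_text("KIG0005001", key, 1, "2011-05-05"))
print(label_text("KIG0005001", key, 2, "2012-05-04"))  # same code, next round
```

Both labels start with the same code, which is what links the samples across rounds without printing KIG0005001 on them.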

Most simple way to do holiday calculation?

I want to make a little free calendar program to help me and others calculate how much time we have got left in a project. I mean real working time, not just time. Time in a raw form is not saying much.
Typically when my boss tells me that I have time until 05-05-2011 it doesn't tell me really how much time I have to do my job.
You know...so many things stop me from work:
A) being at home, not at work (so-called "free time" or "spare time"). In my case I work exactly 8 hours a day, and then the cleaning ladies throw me out of the office with their incredibly loud industrial vacuum cleaners every evening (my boss regularly accepts that as an excuse to go home on time).
B) weekends, or more precisely Saturdays and Sundays
C) official holidays rescuing me from having to go to work.
What I want to do is make a little utility which tells me how many working hours I really have in a given time period.
The first two things A and B are pretty easy to implement. But the last thing C scares my pants off. Holidays. OOOHHH man. You know what that means. Chaos. Pure chaos.
The huge question is: HOW TO CALCULATE HOLIDAYS?!
Since I want my program to be useful for anyone anywhere in the world, I can't just hardcode all holidays for my little town.
So which options do I have?
I) I could hand-craft downloadable lists of holidays. Users search for them within the application and download them from a webserver. Or I ship all of them in the package. But I would get very, very old if I tried to do that by myself for every country, state and town.
II) I make an initial data sheet with holidays for my town, and don't care about the rest. However, I make that sheet public together with a how-to, so that everyone who feels like being very nice can provide holiday data for his country / region / whatever. Those are made public on a webserver and everyone can get the data packages he/she needs for the app.
III) ?
I care a lot about usability. I don't want to make an ugly linux hack style hard to use app that only computer freaks can use.
So you need to tell me more about holiday science. I was never really clever at this. I assume every single country in the world has its own set of holidays. In every country there may be several states. For example, the US has some, and Germany also has some states. Holidays vary from state to state. But a good programmer once told me never to assume anything. So the questions about holiday science are:
Which categories do I need to make holiday data packs searchable? A guy from India should find his holiday data pack quickly, and a guy from Silicon Valley should find his pack equally fast. It makes most sense to me to filter by COUNTRY > STATE > WHATEVER, like a drill-down search. Did I miss something?
What would be the best data format to hold holiday information? A holiday has a start and end date and a name. That should be enough. Would I put all this stuff in thousands of XML files?
How would you go about this? Any hint / help is highly welcome! Thanks to everyone!
We use a table.
It should not be that hard. If you look at your corporate holiday schedule you should be able to calculate the list of around 10 days. The only problem is that many of these are arbitrary, e.g. Christmas falls on a Saturday, so the Friday before is given off.
Have you looked at this site to calculate the list of known Holidays ?
Many organizations post their holiday schedule on the web, it might be possible to read that and get the schedule ?
In this case, I would suggest that you are encountering less of an engineering problem and more of a data collection problem.
Rather than define a "definitive set of holidays" for each possible user, allow the user to easily setup his holidays. By offering a (usable, quick, easy) way for users to select holidays, you do not make any assumptions.
You could even make it "social" by allowing users to upload their selections - imagine your HR department takes 10 minutes to setup and upload a set of holidays for all your company employees. Now you just need to provide a way for everyone to find that set.
On another topic, I would suggest using a common format, like iCal to store your calendar data. Here's a page with some example iCal files.
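Once the holidays are available as a plain set of dates, whoever supplied them, the working-hours count itself is a short loop. The following is a minimal sketch, assuming an 8-hour day and a Saturday/Sunday weekend as described in the question; the function name working_hours and the two holiday dates are hypothetical stand-ins for a user-selected holiday set.

```python
from datetime import date, timedelta


def working_hours(start, end, holidays, hours_per_day=8):
    """Count working hours from `start` to `end` (both inclusive).

    Saturdays, Sundays and any date in `holidays` count as zero hours.
    The weekend definition is an assumption; it differs in some countries.
    """
    total = 0
    day = start
    while day <= end:
        # weekday() is 0 for Monday up to 6 for Sunday
        if day.weekday() < 5 and day not in holidays:
            total += hours_per_day
        day += timedelta(days=1)
    return total


# Hypothetical holiday set selected by the user
holidays = {date(2011, 4, 22), date(2011, 4, 25)}
print(working_hours(date(2011, 4, 18), date(2011, 5, 5), holidays))
```

With these inputs the period contains 14 weekdays, 2 of which are holidays, so the result is 12 days of 8 hours, i.e. 96 working hours. Keeping the holiday data as a bare set of dates means the same function works regardless of whether the dates came from an iCal file, a shared upload or manual entry.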

Home loan calculation formula (algorithm)?

How does a bank calculate home loan payments?
For example,
$1,000,000 at 5.00% over a 25 year period.
Monthly payment: $5,845.90
            Current Payment                  To Date
Payment  ----------------------  -----------------------------
Number     Interest   Principal  Interest Paid  Principal Paid      Balance
1         $4,166.67   $1,679.23      $4,166.67       $1,679.23  $998,320.77
2         $4,159.67   $1,686.23      $8,326.34       $3,365.46  $996,634.54
3         $4,152.64   $1,693.26     $12,478.98       $5,058.72  $994,941.28
4         $4,145.59   $1,700.31     $16,624.57       $6,759.03  $993,240.97
5         $4,138.50   $1,707.40     $20,763.07       $8,466.43  $991,533.57
6         $4,131.39   $1,714.51     $24,894.46      $10,180.94  $989,819.06
7         $4,124.25   $1,721.65     $29,018.71      $11,902.59  $988,097.41
8         $4,117.07   $1,728.83     $33,135.78      $13,631.42  $986,368.58
9         $4,109.87   $1,736.03     $37,245.65      $15,367.45  $984,632.55
10        $4,102.64   $1,743.26     $41,348.29      $17,110.71  $982,889.29
I'm trying to do the same calculations in Excel, but I get different numbers...
The algorithms are well shown and discussed here (in Javascript) -- implement exactly the same algorithms in Excel's VBA, Javascript, Ruby, whatever, and you'll get pretty much the same results!-)
the magic words are amortization schedule.
The difference you see in Excel is probably to do with the way the compound interest is calculated. Most banks add compound interest daily (gets them more money).
The wikipedia article has a nice example of the equation used by US banks. You can code that up.
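As a concrete starting point, here is a minimal sketch of an amortization schedule, assuming a fixed monthly payment and interest compounded monthly at one twelfth of the annual rate (the standard annuity formula, payment = P * r / (1 - (1 + r)^-n)). Under that assumption it reproduces the first rows of the table in the question. The function names and the choice to round every amount to cents at each step are assumptions of this sketch; a particular bank may round or compound differently.

```python
def monthly_payment(principal, annual_rate, years):
    """Fixed monthly payment for a fully amortizing loan (monthly compounding)."""
    r = annual_rate / 12   # periodic (monthly) interest rate
    n = years * 12         # total number of payments
    return principal * r / (1 - (1 + r) ** -n)


def amortization_schedule(principal, annual_rate, years, rows):
    """Return the first `rows` payments as (number, interest, principal, balance)."""
    payment = round(monthly_payment(principal, annual_rate, years), 2)
    r = annual_rate / 12
    balance = principal
    schedule = []
    for number in range(1, rows + 1):
        interest = round(balance * r, 2)              # interest on the open balance
        principal_part = round(payment - interest, 2)  # rest of the payment repays principal
        balance = round(balance - principal_part, 2)
        schedule.append((number, interest, principal_part, balance))
    return schedule


print(f"Monthly payment: {monthly_payment(1_000_000, 0.05, 25):,.2f}")
for number, interest, principal_part, balance in amortization_schedule(1_000_000, 0.05, 25, 3):
    print(f"{number:>2}  {interest:>10,.2f}  {principal_part:>10,.2f}  {balance:>12,.2f}")
```

In Excel the same payment comes from PMT(0.05/12, 300, -1000000); a common source of different numbers is passing the annual rate or the number of years instead of the monthly rate and the number of monthly payments.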

Resources