UI design for time tracking

I'm trying to design an interface for a time-keeping system that will let users see "at a glance" how much time they still need to log.
This is complicated by the fact that there are no fixed hours: employees must work at least 6 hours on any given weekday, and by the end of each month they should have worked 7.5 hours for each weekday that month. Up to 7.5 hours either way can be carried over to the next month. Employees can also take up to one morning and one afternoon of 'flex' time per month. Time needs to be recorded by 10:15 the following day, but this rule is bent during busy times, and end-of-week and particularly end-of-month boundaries become more important.
So what's the most readable format to display the time entered so far and highlight approaching deadlines for uncompleted entries, while still giving a feeling for how far ahead of or behind schedule the employee is (e.g. if you have a week to go in the month and you're 8 hours ahead, you could schedule a little extra duvet time)?
My first take was a bar chart for each day with a red line at 6 hours. But this doesn't give you any idea of how far ahead or behind you are for the day or month, or whether you are close to missing a deadline...
(Please excuse the horrible mockup)
Maybe I'm trying to convey too much info in one place?
--
Here's a mockup of time recorded vs. time required, as suggested by Dickon Reed
--
EDIT: this site works great for stuff like this. I'm going to kick some of these ideas around in the morning and hopefully get something posted back here.

Your first chart could be improved by showing the remaining time, as shown below: the remaining time until the next weekly/monthly deadline is divided evenly over the remaining days before the deadline. The UI then answers the question: "How many hours per day will I need to work to meet my deadline?" If the user wants to take one morning/afternoon off (or did I understand you incorrectly?), you could have a checkbox for that day; when it's checked, the estimates are adjusted accordingly.
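For illustration, a minimal sketch of that calculation (the function and parameter names are made up):
```
// How many hours per day are still needed to meet the next deadline?
// hoursRequired: total hours due by the deadline (e.g. 7.5 * weekdays)
// hoursWorked:   hours already entered
// workdaysLeft:  remaining working days before the deadline
function requiredPerDay(hoursRequired, hoursWorked, workdaysLeft) {
  if (workdaysLeft <= 0) return 0;
  return Math.max(0, (hoursRequired - hoursWorked) / workdaysLeft);
}

// e.g. 8 weekdays at 7.5h due (60h), 39h worked, 3 days left:
// requiredPerDay(60, 39, 3) -> 7 hours/day
```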
You should also show the exact time on top of every vertical bar, so that the user doesn't need to estimate it in their head (I was lazy and added them to only a couple of bars in the mockup). Also highlight the current day; in the picture below this was done by circling "Thursday" and using a bold font for the time the user has worked today. You could also use some visual effect on today's vertical bar.
Improved bar chart: http://img11.imageshack.us/img11/6755/20090216203339em2.png
For your second chart you should also draw in the weekly or monthly deadlines as horizontal bars. In some cases the second chart might be better, but here I would settle on the first chart (with my improvements), because it probably corresponds better to the users' mental model, and that way you can also visualize the daily 6-hour minimum in some way.

How about a "test tube/thermometer" image, like the ones charities use: it fills up as you add time, trying to reach a goal (the red line).
Fill each day with a different color and label accordingly; if the user works overtime Monday through Thursday, they may have reached the line before Friday, etc.
similar to:
Goal thermometer example: http://goyamarketing.com/blog/wp-content/uploads/2008/11/goal_thermometer.gif

The bar chart looks nice visually, but what you really need to know is how far behind or ahead you are on your hours in total.
If I understood correctly, you are interested only in your current balance and your current deadline, so you could display that information in two separate fields (e.g. below the chart). Of course, you could also mark the deadline with a horizontal line in the graph.

How about a pie chart where the total size represents the minimum goal, with a slice for each day: days that met the minimum in one color, days with extra time in another, and the remaining days as gray slices sized by the hours the employee would still have to work, divided evenly among them?

A line plot with "working days elapsed" on the X axis and "fraction of monthly required work done" on the Y axis? With only a few data points, it should be clear from visual extrapolation whether you'll hit the top early or late. The plot would look like an irregular staircase.
P.S. "The Visual Display of Quantitative Information" by Edward Tufte is a great book for ideas on this kind of topic.

To take Dickon Reed's idea one step further, why not plot each point as hours ahead or behind (+/-), with the baseline always at 0? That way you get more detail. I'm not sure I've gotten my point across; it's hard to do without an example, and I'm short on time.
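For example, something like this sketch of the running balance (hypothetical numbers, using the question's 7.5h/weekday target):
```
// Cumulative hours ahead (+) or behind (-) the target, per weekday.
function runningBalance(hoursPerDay, target) {
  var balance = 0;
  return hoursPerDay.map(function (h) {
    balance += h - target; // positive = ahead, negative = behind
    return balance;
  });
}

// runningBalance([8, 6, 9, 7.5], 7.5) -> [0.5, -1, 0.5, 0.5]
// Plot these values against a fixed baseline at 0.
```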

Related

Highcharts: getting runtime durations on the yAxis

I run a job daily. Today, that job took 1:45:09. I have a lot of such durations for that job from the past weeks, and I want to show them graphically in a simple column chart. On the Y axis I want duration ticks from 0:00:00 to 5:00:00 or so, so that I can easily compare the runtimes from the past weeks and see whether the job is gradually taking longer.
I have read and implemented a lot of answers from Stack Overflow and other internet resources, but none of them fit my purpose. When using Unix timestamps (milliseconds since 1970, etc.) I get columns that are all the same height, and Y-axis ticks in years from 1970 to now instead of hours.
Another option was to just plot the minutes or seconds. Then the differences become apparent, but instead of time values on the Y axis and in the tooltips I get plain integers.
Can someone show me how to achieve my goal in a fiddle? The question looks common enough to me for any monitoring software.
-- EDIT --
Here is a Photoshop sample of what I am trying to achieve:
On the Y-axis: a time scale. In the tooltip: date, object name and time taken.
-- END EDIT --
BTW, I have no chart type preference. The usual column charts just seem to fit the purpose.
Thanks for any help!
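One common approach (a sketch with made-up data, assuming a reasonably recent Highcharts): store each duration in milliseconds and use axis and tooltip formatters to render the values as H:MM:SS. This works for runtimes under 24 hours, since dateFormat wraps at a day.
```
Highcharts.chart('container', {
  chart: { type: 'column' },
  title: { text: 'Job runtimes' },
  xAxis: { categories: ['Mon', 'Tue', 'Wed'] }, // run dates (placeholder)
  yAxis: {
    min: 0,
    tickInterval: 30 * 60 * 1000, // one tick every 30 minutes
    title: { text: 'Duration' },
    labels: {
      formatter: function () {
        // Treat the millisecond value as a duration and format it
        return Highcharts.dateFormat('%H:%M:%S', this.value);
      }
    }
  },
  tooltip: {
    formatter: function () {
      return this.x + '<br/>' + this.series.name + ': ' +
             Highcharts.dateFormat('%H:%M:%S', this.y);
    }
  },
  series: [{
    name: 'Nightly job',
    // 1:45:09 = (1*3600 + 45*60 + 9) * 1000 ms, etc.
    data: [6309000, 6480000, 6150000]
  }]
});
```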

Gap in time series not appearing

I am working with time series data that omit data for weekends. When graphing these time series in D3 v4, the graph interpolates across the weekend. See the following URL for an illustration (including code, data, and graph output):
No records for weekend
Instead, I want a gap at the weekend: the graph stopping on Friday and resuming on Monday.
I could fix the problem by creating dummy records for the weekend with the value 'NA' and using D3's line.defined method, as shown in the following:
Data has NA records
However, generating dummy records feels to me like excessively heavy lifting. Is there a simple, natural way to get D3 to leave a gap when time series records are missing?
"Is there a simple, natural way to get D3 to leave a gap when time series records are missing?"
Unfortunately no; that's the normal behaviour of a time scale. According to Mike Bostock, D3's creator:
A d3 time scale should be used when you want to display time as a continuous, quantitative variable, such as when you want to take into account the fact that days can range from 23-25 hours due to daylight savings changes, and years can vary from 365-366 days due to leap years.
So the time scale was created with continuous time in mind.
Your current approach in the line generator...
.defined(function(d) { return !isNaN(d.value); })
... doesn't work because all the dates in your CSV have values, so D3 will connect the dots.
That said, if you want the gap, just use dummy records (null or any non-numeric value) for the weekends together with line.defined, as in your second link.
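A minimal sketch of that workaround (D3 v4; the x and y scales and the {date, value} record shape are assumed):
```
// Pad the series with null-valued dummy records for the missing weekend
// days, then let line.defined() break the path at those points.
function padWeekends(data) {
  var padded = [];
  data.forEach(function (d, i) {
    padded.push(d);
    if (i < data.length - 1) {
      var gapDays = (data[i + 1].date - d.date) / 864e5; // ms per day
      if (gapDays > 1) {
        // One dummy record inside the gap is enough to break the line.
        padded.push({ date: d3.timeDay.offset(d.date, 1), value: null });
      }
    }
  });
  return padded;
}

var line = d3.line()
  .defined(function (d) { return d.value != null; })
  .x(function (d) { return x(d.date); })
  .y(function (d) { return y(d.value); });

// usage: path.datum(padWeekends(data)).attr('d', line);
```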

CartoDB Torque - using a start and end for each point

I am trying to make an animated map using CartoDB's Torque.
I have about 8000 points, which I would like to show and hide according to a 'period of validity'. Say, for instance, the map shows data for one day: some points are valid all day, others from midnight to 6am, others from 4am to 8am, and so on. The animation would start at midnight, showing all the points valid at that time, then move to 1am, removing and adding some, and so on until midnight.
To do that, I would like to set a start and an end time for each point, but I cannot find any option for this. I saw the cumulative option, but then a point stays until the end of the animation, which is not exactly what I want.
Any suggestion would be welcome, thanks!
The only thing that comes to mind is repeating each record as many times as the number of time steps it should be visible on the animated map, and assigning this new time as the Torque animation field.
So if your Torque animation steps every hour and your point is valid for six hours, you'll have one row per hour. To produce that query you'll probably need to use generate_series.
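A sketch of that query in PostgreSQL (the table and validity column names are assumptions):
```
-- Expand each point into one row per hour of its validity window and
-- use the generated timestamp as the Torque time column.
SELECT
  row_number() OVER () AS cartodb_id,  -- keep ids unique across the copies
  p.the_geom_webmercator,
  step_time
FROM points p,
     generate_series(p.valid_from, p.valid_to, interval '1 hour') AS step_time;
```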

Best approach: transfer daily values from one year to another

I will try to explain what I want to accomplish. I am looking for an algorithm or approach, not the actual implementation in my specific system.
I have a table with actuals (incoming customer requests) on a daily basis. These actuals need to be "copied" into the next year, where they will be used as a basis for planning the volume of future requests.
The smallest timespan for planning, on a technical basis, is a "period", which consists of at least one day. A period always ends at a week or month boundary, which means that if a week lies in both May and June, it is split into two periods.
Here's an example:
2010-05-24 - 2010-05-30 Week 21 | Period_Id 123
2010-05-31 - 2010-05-31 Week 22 | Period_Id 124
2010-06-01 - 2010-06-06 Week 22 | Period_Id 125
We did this to reduce the amount of data, because we have a few thousand items, each with 365 daily values. For planning, this is reduced to "a few thousand x 65" (or whatever the period count is per year). I can aggregate a month or a week by combining all the periods that belong to it. The important thing is that I could still use daily values: find the corresponding period and add to it if necessary.
What I need is an approach for aggregating the actuals for every (working) day, week or month into next year's equivalent period. My requirements are not fixed here. The actuals have a certain distribution, because certain deadlines and habits are reflected in the data. I would like to preserve this as far as possible, but planning is never completely accurate, so I can make a compromise here.
I don't know if this is what you're looking for, but here is a strategy for calculating the forecasts using flexible periods:
First define a mapping from each day in next year to the corresponding day in this year. Then, when you need a forecast for period x, take all days in that period and sum the actuals for the matching days.
With this you can precalculate every week/month, but also create new forecasts if the contents of periods change.
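A rough sketch of the mapping idea (the data shapes are made up; shifting by exactly 364 days, i.e. 52 weeks, keeps weekdays aligned):
```
// Map each day in the target year to its "equivalent" day in the source year.
// Real mappings need care around year boundaries and holidays.
function mapToPreviousYear(date) {
  var mapped = new Date(date);
  mapped.setDate(mapped.getDate() - 364); // 52 weeks back, same weekday
  return mapped;
}

// actuals: daily requests from the source year, keyed by 'YYYY-MM-DD' (UTC)
function forecastPeriod(periodDays, actuals) {
  return periodDays.reduce(function (sum, day) {
    var key = mapToPreviousYear(day).toISOString().slice(0, 10);
    return sum + (actuals[key] || 0);
  }, 0);
}
```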
Map weeks to weeks. The first full week of this year to the first full week of the next. Don't worry about "periods" and aggregation; they are irrelevant.
Where a missing holiday leaves a hole in the data, just take the values for the same weekday of the previous or the next week, and do the same at the beginning and end of the year.
Now for each day of the week, combine the results for the year and look for events more than, say, two standard deviations from the mean (if you don't know what that means then skip this step), and look for correlations with known events like holidays. If a holiday doesn't show an effect in this test then ignore it. If you find an effect, shift it to compensate for the different date next year. Don't worry about higher-order effects, you don't have enough data to pin them down.
Now draw in periods wherever you like and aggregate all you want.
Don't make any promises about the accuracy of these predictions; there's no way to know it. Don't worry about whether this is the best possible way; it isn't, but it's as good as any you're likely to find. You can spend as much more time and effort fine-tuning this as you wish; it might raise expectations, but it's about as likely to make the results worse as it is to make them more accurate.
There is no a priori way to answer that question. You have to look at your data and decide which parameters are important (day of week, week number, month, season, temperature outside?) based on the results.
For example, if many of your customers are Jewish or Muslim, then the Gregorian calendar, ISO week numbers and all that won't help you much, because Jewish and Muslim holidays (and thus user behaviour) are determined by other calendars.
Another example: trying to predict iPhone search volume from last year's searches doesn't sound like a good idea. It seems that the important timescales are both much longer than a year (the technology becoming mainstream over the years) and much shorter than a year (specific events that affect us for days to weeks).

How to notice unusual news activity

Suppose you were able to keep track of the news mentions of different entities, say "Steve Jobs" and "Steve Ballmer".
What are ways you could tell whether the number of mentions of an entity in a given time period was unusual relative to its normal frequency of appearance?
I imagine that for a more popular person like Steve Jobs an increase of 50% might be unusual (say from 1000 to 1500), while for a relatively unknown CEO a hundred-fold increase in a given day could be possible (from 2 to 200). If you didn't have a way of scaling this, your "unusualness" index could be dominated by unheard-ofs getting their 15 minutes of fame.
Update: to make it clearer, it's assumed that you are already able to get a continuous news stream, identify the entities in each news item, and store all of this in a relational data store.
You could use a rolling average; this is how a lot of stock trackers work. By tracking the last n data points, you can see whether a change falls outside the usual variance.
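For illustration, a minimal sketch of that check (the window size and threshold k are arbitrary):
```
// Flag a new count that falls outside the rolling mean
// by more than k rolling standard deviations.
function isUnusual(recent, current, k) {
  var n = recent.length;
  var mean = recent.reduce(function (a, b) { return a + b; }, 0) / n;
  var variance = recent.reduce(function (a, b) {
    return a + (b - mean) * (b - mean);
  }, 0) / n;
  return Math.abs(current - mean) > k * Math.sqrt(variance);
}

// isUnusual([1000, 1100, 950, 1050], 1500, 2) -> true
```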
You could also try some normalization. A very simple one: each entity has a total number of mentions (m), a fractional change from the last time period (δ), and a normalized value (z) where z = m * δ. Look at the table below (m0 is the previous value of m, so δ = (m - m0) / m0):
Name           m     m0    δ     z
Steve Jobs     4950  4500  0.10  495
Steve Ballmer  400   300   0.33  132
Larry Ellison  50    10    4.00  200
Andy Nobody    50    40    0.25  12.5
Here, a 400% change for the little-known Larry Ellison gives a z value of 200, a 10% change for the much better known Steve Jobs gives 495, and my spike of 25% is still a low 12.5. You could tweak this algorithm depending on what you feel are good weights, or use the standard deviation or the rolling average to find whether a result is far from the "expected" one.
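For illustration, the same normalization as a tiny function:
```
// z = m * delta, where delta is the fractional change from the previous period.
function normalizedScore(m, m0) {
  var delta = (m - m0) / m0;
  return m * delta;
}

// normalizedScore(4950, 4500) -> 495
// normalizedScore(50, 10)     -> 200
```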
Create a database and keep a history of stories with timestamps. You then have a history over time of each category of news item you're monitoring.
Periodically calculate the number of stories per unit of time (you choose the unit).
Test whether the current value is more than X standard deviations away from the historical mean.
Some data will be more volatile than others, so you may need to adjust X appropriately. X = 1 is a reasonable starting point.
Way oversimplified:
Store people's names and the number of articles created in the past 24 hours that mention them. Compare to historical data.
Real life:
If you're trying to dynamically pick out people's names, how would you go about doing that? How do you grab names while searching through articles? Once you grab a new name, do you then search all articles for it? How do you separate Steve Jobs of Apple from a Steve Jobs who is a new star running back generating a lot of articles?
If you're looking for simplicity, create a table of 50 people's names that you insert yourself. Every day at midnight, have your program run a quick Google query over the past 24 hours and store the number of results. There are a lot of variables here that this doesn't account for, though.
The method you use is going to depend on the distribution of the counts for each person. My hunch is that they are not going to be normally distributed, which means that some of the standard approaches to longitudinal data might not be appropriate - especially for the small-fry, unknown CEOs you mention, who will have data that are very much non-continuous.
I'm really not well-versed enough in longitudinal methods to give you a solid answer here, but here's what I'd probably do if you locked me in a room to implement this right now:
Dig up a bunch of past data. It's hard to say how much you'd need, but I would basically go back until it gets computationally insane or the timeline gets unrealistic (don't expect Steve Jobs references from the 1930s).
In preparation for creating a simulated "probability distribution" of sorts (I'm using terms loosely here), more recent data needs to be weighted more than past data - e.g., a thousand years from now, hearing one mention of (this) Steve Jobs might be considered a noteworthy event, so you wouldn't want to be using expected counts from today (Andy's rolling mean is using this same principle). For each count (day) in your database, create a sampling probability that decays over time. Yesterday is the most relevant datum and should be sampled frequently; 30 years ago should not.
Sample out of that dataset using the weights and with replacement (i.e., same datum can be sampled more than once). How many draws you make depends on the data, how many people you're tracking, how good your hardware is, etc. More is better.
Compare your actual count of stories for the day in question to that distribution. What percent of the simulated counts lie at or above your real count? That's roughly (god, don't let any economists look at this) the probability of your real count or a larger one happening on that day. Now you decide what's relevant: 5% is the norm, but it's an arbitrary, stupid norm. Just browse your results for a while and see what seems relevant to you. The end.
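A condensed sketch of those steps (the decay rate and number of draws are arbitrary choices):
```
// Sample past daily counts with exponentially decaying weights,
// then ask what fraction of sampled counts reach today's count.
function weightedSample(counts, decay) {
  // counts[0] is yesterday, counts[1] the day before, and so on.
  var weights = counts.map(function (c, i) { return Math.pow(decay, i); });
  var total = weights.reduce(function (a, b) { return a + b; }, 0);
  var r = Math.random() * total;
  for (var i = 0; i < counts.length; i++) {
    r -= weights[i];
    if (r <= 0) return counts[i];
  }
  return counts[counts.length - 1];
}

function fractionAtOrAbove(counts, todayCount, draws, decay) {
  var hits = 0;
  for (var i = 0; i < draws; i++) {
    if (weightedSample(counts, decay) >= todayCount) hits++;
  }
  return hits / draws; // small value: today's count is unusually high
}

// e.g. fractionAtOrAbove(history, 1500, 10000, 0.99)
```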
Here's what sucks about this method: there's no trend in it. If Steve Jobs had 15,000 a week ago, 2000 three days ago, and 300 yesterday, there's a clear downward trend. But the method outlined above can only account for that by reducing the weights for the older data; it has no way to project that trend forward. It assumes that the process is basically stationary - that there's no real change going on over time, just more and less probable events from the same random process.
Anyway, if you have the patience and willpower, check into some real statistics. You could look into multilevel models (each day is a repeated measure nested within an individual), for example. Just beware of your parametric assumptions... mention counts, especially on the small end, are not going to be normal. If they fit a parametric distribution at all, it would be in the Poisson family: the Poisson itself (good luck), the overdispersed Poisson (aka negative binomial), or the zero-inflated Poisson (quite likely for your small-fry, no chance for Steve).
Awesome question, at any rate. Lend your support to the statistics StackExchange site, and once it's up you'll be able to get a much better answer than this.
