How to find period in time series – with fuzziness? - algorithm

Given a set of timestamps, I want to find a periodic grid most of them fall into.
Example set that falls into a grid with a period of 30:
10, 40, 72, 99, 164, 172, 190
There are three fuzziness parameters here:
small deviations are acceptable (72 is treated as 70, 164 as 160);
some samples (172 here) fall outside the grid; their acceptable percentage is also set by a parameter;
the sample expected around 130 is missing, which is also fine; the acceptable number of such "holes" can likewise be set somehow.
Looking at intervals between the samples, I can think of searching for the greatest common divisor (GCD), again, with some fuzziness.
Please advise an approach to this problem.
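A brute-force sketch of one possible approach (my own illustration, not a standard named method): treat the period and phase as unknowns and score each candidate grid by how many samples fall within a tolerance of it. The tolerance, the minimum hit fraction, and the search ranges below are illustrative assumptions.

# Score candidate (period, phase) grids by how many samples land near a grid line.
def grid_score(samples, period, phase, tolerance):
    """Count samples within `tolerance` of the grid phase + k*period."""
    hits = 0
    for t in samples:
        offset = (t - phase) % period          # distance to the grid line below
        if min(offset, period - offset) <= tolerance:
            hits += 1
    return hits

def find_period(samples, periods, tolerance=4, min_fraction=0.7):
    """Return the best (period, phase, hits) if enough samples fall on the grid."""
    best = None
    for period in periods:
        for phase in range(period):            # coarse integer phase search
            hits = grid_score(samples, period, phase, tolerance)
            if best is None or hits > best[2]:
                best = (period, phase, hits)
    if best and best[2] >= min_fraction * len(samples):
        return best
    return None

samples = [10, 40, 72, 99, 164, 172, 190]
print(find_period(samples, periods=range(25, 60)))   # a period of about 30 scores best

One caveat: divisors of the true period (its harmonics) score at least as well, so the lower bound of the period search, or a penalty on small periods, matters; a fuzzy-GCD approach runs into the same issue.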

Related

Conflict-free schedule?

I'm trying to figure out how I can use a graph (as in graph theory) to find a few conflict-free schedules for almost anything. For example:
You have 4 courses you need to take and you want to know what section you should sign up for. Your input should be the course subjects and course number (e.g. MATH 100, MATH 101, MATH 200, etc.) and the output should be the sections you should take to avoid a time-conflict with any other class.
What graph theory algorithm should I use for this? How would I use it?
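One common way to set this up as a graph (a sketch of mine under made-up section data, not something stated in the question): each section is a vertex, an edge joins two sections whose meeting times overlap, and a valid schedule is a set of pairwise non-adjacent vertices containing exactly one section per course. A small backtracking search over that conflict graph then enumerates the conflict-free schedules.

from itertools import combinations

# (course, section, start_hour, end_hour) -- made-up sample data
sections = [
    ("MATH 100", "A", 9, 10), ("MATH 100", "B", 13, 14),
    ("MATH 101", "A", 9, 10), ("MATH 101", "B", 11, 12),
    ("MATH 200", "A", 11, 12), ("MATH 200", "B", 13, 14),
]

def conflicts(a, b):
    """Two sections conflict if their time ranges overlap."""
    return a[2] < b[3] and b[2] < a[3]

# edge set of the conflict graph
edges = {(a, b) for a, b in combinations(sections, 2) if conflicts(a, b)}

def schedules(chosen, remaining_courses):
    """Backtracking: extend `chosen` with one conflict-free section per remaining course."""
    if not remaining_courses:
        yield list(chosen)
        return
    course, *rest = remaining_courses
    for s in sections:
        if s[0] == course and all((s, c) not in edges and (c, s) not in edges for c in chosen):
            yield from schedules(chosen + [s], rest)

for plan in schedules([], ["MATH 100", "MATH 101", "MATH 200"]):
    print(plan)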

Algorithm to create numbers that are easy to memorize and difficult to mistake

I want to assign numbers to people as their ID. As these numbers will have to be remembered, entered manually, and written on paper often, I would like to make sure that the numbers are selected to decrease the risk of misremembering or mistyping one person's ID and accidentally producing some other person's valid ID. IDs must be numeric because they also have to be encoded in a barcode.
A smaller example: for 10 people, using the numbers 1 to 10 makes it easy to mistype, say, 5 instead of 4. If instead people were assigned 16, 29, 31, 48, 57, 62, 75, 83, 94, this risk should be reduced: no number has a valid direct neighbor, every digit is unique in its position, and misremembering another valid number is less likely.
In reality I need to assign numbers to 1000 people using numbers with six digits.
I am looking for an algorithm that can select these numbers. Preferably it would take into consideration both memorability and the risk of writing or entering another valid number by mistake. Unfortunately I cannot immediately describe how to measure these two factors. I was hoping that there exists some standard solution that I am just unable to find for lack of the right keywords.
I have also thought about checksums but they do not work on paper. Assigning a second number to participants so a wrong input can be caught by the mismatch is not feasible and faces similar difficulties on paper.
You are in the realm of error correction and error detection codes. Choose your poison.
The simplest way would be to dedicate a digit as a control digit, which lets you easily detect a single typo. For 1,000 people and 6 digits (1,000,000 possible numbers) you can afford 2 such digits and easily detect 2 typos.
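As a concrete illustration of the control-digit idea (the Luhn scheme below is one possible choice, not something the answer prescribes): a 5-digit base number plus one check digit gives a 6-digit ID in which any single-digit typo, and most adjacent transpositions, produce an invalid number.

def luhn_check_digit(base: str) -> str:
    """Compute the Luhn check digit for a numeric string."""
    total = 0
    for i, ch in enumerate(reversed(base)):
        d = int(ch)
        if i % 2 == 0:          # these positions get doubled once the check digit is appended
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return str((10 - total % 10) % 10)

def make_id(base: int) -> str:
    base_str = f"{base:05d}"                      # e.g. 00042
    return base_str + luhn_check_digit(base_str)

ids = [make_id(n) for n in range(1, 1001)]        # 6-digit IDs for 1000 people
print(ids[:5])

Note that this only covers typo detection; memorability would still have to come from how the 5-digit base numbers themselves are chosen.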

Is there an algorithm for detecting odd values relative to a set of data

I want to develop an algorithm that takes an action when it detects that some numbers are anomalous relative to another array of numbers; each number has a date. These numbers can vary across the day, but their rate of change is not necessarily related.
For example,
The data can be
[
{number: 200, date: '12:00'},
{number: 250, date: '12:02'},
{number: 180, date: '12:04'},
{number: 500, date: '12:06'}
]
and the array that I want to test is
[
{number: 400, date: '12:08'},
{number: 50, date: '12:10'}
]
I gather this data at a defined time interval (the interval above is two minutes).
I want to detect whether the data falls as time passes, but this can't be measured directly against the previous data, since the values are not consistent and can both fall and rise; I want to check the long-term trend.
My question is: what approach should I pursue? Do I have to train a model for this task? If so, what approach should I implement?
I was thinking of writing some hard-coded rules that measure the average and compare the data against a threshold, but that wasn't effective on large data sets because, as I stated, the data is not consistent.
If you have any resources that can help, I would be very thankful.
P.S. The above data is not real.
Thanks in advance.
You want outlier detection that only searches for decreases.
I propose creating a kernel which predicts the next value based on the recent ones. See Gaussian process regression tutorial | Jupyter nbviewer for starters. The kernel can give you a prediction as well as a confidence margin. If your actual value is more than a certain distance below the allowed confidence margin of the prediction, you can call that an outlier in the negative direction and react to it:
(Diagram of a confidence margin.)
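A minimal sketch of that idea with scikit-learn's Gaussian process regressor (the kernel choice, the 2-sigma margin, and fitting on such a tiny history are assumptions for illustration):

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# minutes since 12:00 and the observed numbers (history from the question)
t_hist = np.array([[0.0], [2.0], [4.0], [6.0]])
y_hist = np.array([200.0, 250.0, 180.0, 500.0])

gp = GaussianProcessRegressor(kernel=RBF(length_scale=5.0) + WhiteKernel(), normalize_y=True)
gp.fit(t_hist, y_hist)

# new points to test (from the question)
t_new = np.array([[8.0], [10.0]])
y_new = np.array([400.0, 50.0])

mean, std = gp.predict(t_new, return_std=True)
lower = mean - 2.0 * std                    # lower edge of the confidence margin
for t, y, lo in zip(t_new.ravel(), y_new, lower):
    flag = "possible drop" if y < lo else "ok"
    print(f"t={t} observed={y} lower bound={lo:.1f} -> {flag}")

With only four history points the band is very wide, so in practice you would fit on a longer rolling window of recent values.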

Suitable machine learning algorithm for column selection

I am new to machine learning. In my work I need a machine learning algorithm that selects some columns out of many columns in a 2D matrix, depending on the spread of the data. Below is a sample of the 2D matrix:
400 700 4 1400
410 710 4 1500
416 716 4 1811
..............
410 710 4 1300
Previously I used the standard deviation (as a measure of the spread of data in a particular column) to select columns based on some threshold values. Observe that the 3rd column is constant and the last column varies tremendously. The 1st and 2nd columns also vary, but the spread of their data is small. Applying the standard deviation to each of the columns, I get sigma = 10, 10, 0, 200 respectively.
I have considered some experimental threshold values for discarding columns: if a column's sigma falls outside the threshold range, the column gets discarded. I calculated those threshold values manually. Though this method is very simple, dealing with the threshold values is a very tedious task, as there are many columns.
For this reason I want to use a standard machine learning algorithm, or somehow make these threshold values adaptive, so that I don't have to hard-code them. Can anyone suggest an appropriate algorithm for this?
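One way to make the cut-off data-driven instead of hand-tuned (a sketch under my own assumptions, not a standard named method): compute the per-column spread as before, then cluster those spread values and let the clustering decide which columns count as constant, moderately varying, or highly varying.

import numpy as np
from sklearn.cluster import KMeans

X = np.array([[400, 700, 4, 1400],
              [410, 710, 4, 1500],
              [416, 716, 4, 1811],
              [410, 710, 4, 1300]], dtype=float)

sigma = X.std(axis=0)                         # per-column spread
features = np.log1p(sigma).reshape(-1, 1)     # log scale keeps huge sigmas from dominating

km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(features)
order = np.argsort(km.cluster_centers_.ravel())   # order clusters by mean spread
group = np.empty_like(km.labels_)
for rank, cluster in enumerate(order):
    group[km.labels_ == cluster] = rank       # 0 = lowest spread, 2 = highest

print("sigma:", np.round(sigma, 1))
print("group per column:", group)             # keep or discard whichever groups you need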

Millisecond accuracy of ActionScript new Date() or getTimer()

I'd like to measure the reaction time of a user. In this example I'm using ActionScript, but the concept is what is important, so feel free to answer in your language of choice if you want to show any code.
The user sits in front of a screen and will be presented with a red dot. When they see the red dot, they hit the space bar.
My logic is as follows: make red dot visible, create a new date, wait for spacebar, create a new date, find the difference in milliseconds using a TimeSpan object.
//listen for the keystroke
this.systemManager.stage.addEventListener(KeyboardEvent.KEY_DOWN, catchSpace, true, 1);
...
if (e.keyCode == Keyboard.SPACE) {
    e.preventDefault();
    this.dispatchEvent(new PvtEvent(PvtEvent.BTN_CLICK));
}

//show the red dot, making note of the time
redDot.visible = true;
this.startCount = new Date();

//user clicks the space bar
this.endCount = new Date();
var timeSpan:Number = TimeSpan.fromDates(this.startCount, this.endCount).totalMilliseconds;
I feel like this should work, but I'm getting some values that are disconcerting. Here is a typical result set:
[254, 294, 296, 305, 306, 307, 308, 309, 310, 308, 312, 308, 338, 346, 364, 370, 380, 387, 395, 402, 427]
Notice that some of the values are close, and 308 is recorded multiple times. So, my questions are as follows:
Is my code, or the logic I'm using, flawed in some way?
What is the probability that the user is able to produce repeat times?
If the probability is low, then what am I missing here?
I should also note that I have (quite accidentally) recorded a 12 ms response time. I was testing the app and happened to hit the space bar just as the red dot appeared. So I doubt that my code is unable to judge time accurately, at least to an accuracy of ±12 ms :) .
I would suppose that reaction times have a roughly normal distribution, so it might be the case that some results are more likely to occur several times. Your reaction times run from 254 to 427, that is, 174 possible different results. So the question is: in x tests, how likely is it that some of them are the same? Since the times are probably normally distributed, this likelihood increases.
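A quick back-of-the-envelope check of that argument (my own calculation, assuming for simplicity that each result is uniformly distributed over the 174 values; a peaked, roughly normal distribution only makes repeats more likely):

def prob_some_repeat(n_tests: int, n_values: int = 174) -> float:
    """Birthday-problem probability that at least two of n_tests results coincide."""
    p_all_distinct = 1.0
    for k in range(n_tests):
        p_all_distinct *= (n_values - k) / n_values
    return 1.0 - p_all_distinct

for n in (5, 10, 21):
    print(n, round(prob_some_repeat(n), 2))

Even under the uniform assumption, with 21 samples (the size of the result set above) a repeat is already more likely than not.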
If you run it on your computer, remember that other applications/threads compete for the CPU. There is also some latency in the OS, and it matters whether the keyboard is connected via USB or PS/2 (a USB device/hub is polled, while PS/2 goes directly to the IRQ).
No, the logic seems fine. This is a perfectly simple way to measure time to the ms.
Turns out human beings and computers can seldom do anything to millisecond accuracy.
The thing I'm tripping on is Flash!
After a few months of on-and-off testing, we figured out the issue: the language. From the ASDoc on the Flex Timer:
A delay lower than 20 milliseconds is not recommended. Timer frequency is limited to 60 frames per second, meaning a delay lower than 16.6 milliseconds causes runtime problems.
Flash runs with a frame rate of 60 FPS. I guess this means that if you try to measure time and want to be accurate to less than 16 ms, you are out of luck. However, this does explain why I would see repeating values, as anything in this "60 FPS window" was just being measured as the same time.
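A small simulation of that quantization effect (my own illustration, assuming measurements snap to 60 FPS frame boundaries; Flash's exact rounding behaviour may differ):

FRAME_MS = 1000 / 60                # one frame at 60 FPS, about 16.7 ms

true_times = [301, 305, 309, 312, 318, 322, 330]
measured = [round(t / FRAME_MS) * FRAME_MS for t in true_times]
for t, m in zip(true_times, measured):
    print(f"true {t} ms -> measured {m:.1f} ms")
# several distinct true reaction times collapse onto the same measured value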
