Auction model time complexity - algorithm

I have a basic algorithm for a double-auction mechanism, and I am having difficulty finding its Big-O time complexity. What steps should I take to rigorously analyse this algorithm?
I know that Big-O notation describes an upper bound on how an algorithm's running time grows with the size of its input.
However, I am not sure how to find the bounding function that defines this upper bound for my algorithm.
While (r = min(supply, demand)) {
oa = Pul;
ob = Pll;
While (true) {
nondeterministic choice;
If (a buyer submits a bid:) {
If (bid = ob or out of [Pll, Pul]) {
bid is an invalid bid;
} else {
bid updates ob and becomes a new ob;
}
If (ob = oa) {
Pt = oa;
The round is ended;
}
} ElseIf (a seller submits an ask:) {
If (ask = oa or out of [Pll, Pul]) {
ask is an invalid ask;
} else {
ask updates oa and becomes a new oa;
}
If (ob = oa) {
Pt = ob;
The round is ended;
}
} Else (time out:) {
If (no new oa or ob in a pre-specified time period) {
The round is ended with no transaction;
}
}
end nondeterministic choice;
}
r = r + 1;
}

Related

Check if two mathematical expressions are equivalent

I came across a question in an interview. I tried solving it but could not come up with a solution. Question is:
[Edited]
First part: You are given two expressions that use only the "+" operator. Check whether the two expressions are mathematically equivalent.
For example, "A+B+C" is equivalent to "A+(B+C)".
Second part: You are given two expressions that use only the "+" and "-" operators. Check whether the two expressions are mathematically equivalent.
For example, "A+B-C" is equivalent to "A-(-B+C)".
My thought process: I was thinking of building an expression tree out of each expression and looking for some kind of similarity. But I am unable to come up with a good way of checking whether two expression trees are equivalent.
Can someone help me with this :) Thanks in advance!
As long as the operations are commutative, the solution I'd propose is to distribute the signs over the parenthesized sub-expressions, sort the terms by variable, and then run an aggregator across them, which gives you a list of coefficients and variables. Then just compare the two lists.
Aggregate variable counts until encountering an opening brace, treating subtraction as addition of the negated variable. Handle sub-expressions recursively.
The content of sub-expressions can be directly aggregated into the counts, you just need to take the sign into account properly -- there is no need to create an actual expression tree for this task. The TreeMap used in the code is just a sorted map implementation in the JDK.
The code takes advantage of the fact that the current position is part of the Reader state, so we can easily continue parsing after the closing bracket of the recursive call without needing to hand this information back to the caller explicitly somehow.
Implementation in Java (untested):
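import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;
class Expression {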
// Count for each variable name
Map<String, Integer> counts = new TreeMap<>();
Expression(String s) throws IOException {
this(new StringReader(s));
}
Expression(Reader reader) throws IOException {
int sign = 1;
while (true) {
int token = reader.read();
switch (token) {
case -1: // Eof
case ')':
return;
case '(':
add(sign, new Expression(reader));
sign = 1;
break;
case '+':
break;
case '-':
sign = -sign;
break;
default:
add(sign, String.valueOf((char) token));
sign = 1;
break;
}
}
}
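void add(int factor, String variable) {
int count = counts.containsKey(variable) ? counts.get(variable) : 0;
int updated = count + factor;
if (updated == 0) {
counts.remove(variable); // drop cancelled terms so that "A-A+B" equals "B"
} else {
counts.put(variable, updated);
}
}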
void add(int sign, Expression expr) {
for (Map.Entry<String,Integer> entry : expr.counts.entrySet()) {
add(sign * entry.getValue(), entry.getKey());
}
}
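@Override
public boolean equals(Object o) {
return (o instanceof Expression)
&& ((Expression) o).counts.equals(counts);
}
@Override
public int hashCode() {
return counts.hashCode(); // keep hashCode consistent with equals
}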
// Not needed for the task, just added for illustration purposes.
@Override
public String toString() {
StringBuilder sb = new StringBuilder();
for (Map.Entry<String,Integer> entry : counts.entrySet()) {
if (sb.length() > 0) {
sb.append(" + ");
}
sb.append(entry.getValue()); // count
sb.append(entry.getKey()); // variable name
}
return sb.toString();
}
}
Compare with
new Expression("A+B-C").equals(new Expression("A-(-B+C)"))
P.S: Added a toString() method to illustrate the data structure better.
Should print 1A + 1B + -1C for the example.
P.P.P.P.S.: Fixes, simplification, better explanation.
You can parse the expressions from left to right and reduce them to a canonical form for comparison in a straightforward way; the only complication is that when you encounter a closing bracket, you need to know whether its associated opening bracket had a plus or minus in front of it; you can use a stack for that; e.g.:
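function Dictionary() {
this.d = {}; // plain object: an array's length ignores string keys
}
Dictionary.prototype.add = function(key, value) {
if (!this.d.hasOwnProperty(key)) this.d[key] = value;
else this.d[key] += value;
if (this.d[key] === 0) delete this.d[key]; // drop terms that cancel out
}
Dictionary.prototype.compare = function(other) {
for (var key in this.d) {
if (!other.d.hasOwnProperty(key) || other.d[key] != this.d[key]) return false;
}
// Same number of keys, so other has no extra variables
return Object.keys(this.d).length == Object.keys(other.d).length;
}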
function canonize(expression) {
var tokens = expression.split('');
var variables = new Dictionary();
var sign_stack = [];
var total_sign = 1;
var current_sign = 1;
for (var i = 0; i < tokens.length; i++) {
switch(tokens[i]) {
case '(' : {
sign_stack.push(current_sign);
total_sign *= current_sign;
current_sign = 1;
break;
}
case ')' : {
total_sign *= sign_stack.pop();
break;
}
case '+' : {
current_sign = 1;
break;
}
case '-' : {
current_sign = -1;
break;
}
case ' ' : {
break;
}
default : {
variables.add(tokens[i], current_sign * total_sign);
}
}
}
return variables;
}
var a = canonize("A + B + (A - (A + C - B) - B) - C");
var b = canonize("-C - (-A - (B + (-C)))");
document.write(a.compare(b));

Path finding with theta* when the triangle inequality is not fulfilled

I am trying to use the theta* algorithm (aigamedev.com/open/tutorial/lazy-theta-star) to find the fastest path on a rectangular grid. Distances are Euclidean, but the speed between nodes is time dependent and varies between directions. So the triangle inequality, which is a prerequisite for the algorithm, is violated. Still, it works excellently in most cases. How could I modify the code so that it also works well in turbulent areas? I suspect I may have to re-evaluate some closed nodes and put them back into the open list. If so, under what conditions? An extensive web search hasn't helped.
vertex_t *thetastar(vertex_t *startnode, vertex_t *finishnode, double starttime) {
vertex_t *s, *s1;
double gold, gnew;
int dir; //8 directions to search for node neighbours
//Initialize
vertex[0].row = startnode->row;
vertex[0].col = startnode->col;
vertex[0].g = starttime;
vertex[0].h = h(&vertex[0], finishnode);
vertex[0].open = true;
vertex[0].closed = false;
vertex[0].parent = &vertex[0];
openlist[0] = &vertex[0];
openlist[1] = NULL;
//Find path
while ((s = pop(openlist)) != NULL) {
if (s->row == finishnode->row && s->col == finishnode->col) {
return s;
}
s->closed = true;
for (dir = 0; dir < 8; dir++) {
if ((s1 = nghbrvis(s, dir)) != NULL) {
if (!s1->closed) {
if (!s1->open) {
s1->g = inftime;
s1->parent = NULL;
}
gold = s1->g;
//Path 2
if (lineofsight(s->parent, s1)) {
gnew = (s->parent)->g + c(s->parent, s1);
if (gnew < s1->g) {
s1->parent = s->parent;
s1->g = gnew;
} }
//Path 1
gnew = s->g + c(s, s1);
if (gnew < s1->g) {
s1->parent = s;
s1->g = gnew;
}
if (s1->g < gold) {
s1->h = h(s1, finishnode);
if (s1->open)
remove(s1, openlist);
insert(s1, openlist);
} } } } }
return NULL;
}

Algorithm that discovers all the fields on a map in as few turns as possible

Let's say I have such map:
#####
..###
W.###
. is a discovered cell.
# is an undiscovered cell.
W is a worker. There can be many workers. Each of them can move once per turn. In one turn, a worker can move by one cell in one of 4 directions (up, right, down or left). It discovers all 8 cells around it, turning each # into a . cell. At most one worker can occupy a cell at any time.
Maps are not always rectangular. In the beginning all cells are undiscovered, except the neighbours of W.
The goal is to discover all the cells in as few turns as possible.
First approach
Find the nearest # and go towards it. Repeat.
To find the nearest #, I start a BFS from W and stop it when the first # is found.
On the example map it can give a solution like this:
##### ##### ##### ##### ##... #.... .....
..### ...## ....# ..... ...W. ..W.. .W...
W.### .W.## ..W.# ...W. ..... ..... .....
6 turns. Pretty far from optimal:
##### ..### ...## ....# .....
..### W.### .W.## ..W.# ...W.
W.### ..### ...## ....# .....
4 turns.
Question
What is the algorithm that discovers all the cells in as few turns as possible?
Here is a basic idea that uses A*. It is probably quite time- and memory-consuming, but it is guaranteed to return an optimal solution and is definitely better than brute force.
The nodes for A* will be the various states, i.e. where the workers are positioned and the discovery state of all cells. Each unique state represents a different node.
Edges will be all possible transitions. One worker has four possible transitions. For more workers, you will need every possible combination (about 4^n edges). This is the part where you can constrain the workers to remain within the grid and not to overlap.
The cost will be the number of turns. The heuristic to approximate the distance to the goal (all cells discovered) can be developed as follows:
A single worker can discover at most three new cells per turn. Thus, n workers can discover at most 3*n cells per turn. The minimum number of remaining turns is therefore "number of undiscovered cells / (3 * worker count)", rounded up. This is the heuristic to use. It could even be improved by determining the maximum number of cells that each worker can discover in the next turn (at most 3 per worker). The overall heuristic would then be "(undiscovered cells - discoverable cells) / (3 * workers) + 1".
In each step you examine the node with the least overall cost (turns so far + heuristic). For the examined node, you calculate the costs for each surrounding node (possible movements of all workers) and go on.
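A minimal Python sketch of the simpler lower bound described above, rounded up to a whole number of turns. The function name `min_remaining_turns` and its parameters are made up for illustration; the only premise taken from the answer is that a worker uncovers at most three new cells per move.

```python
import math

def min_remaining_turns(undiscovered: int, workers: int) -> int:
    """Lower bound on the turns still needed (at most 3 new cells per worker per move)."""
    if workers <= 0:
        raise ValueError("need at least one worker")
    return math.ceil(undiscovered / (3 * workers))

# The 3x5 example map from the question: 11 undiscovered cells, one worker.
print(min_remaining_turns(11, 1))  # prints 4
```

Under that premise the estimate never exceeds the true number of remaining turns, which is the property A* needs from its heuristic in order to return an optimal solution.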
Strictly speaking, the main part of this answer may be considered as "Not An Answer". So to first cover the actual question:
What is the algorithm that discovers all the cells in as few turns as possible?
Answer: In each step, you can compute all possible successors of the current state. Then the successors of these successors. This can be repeated recursively, until one of the successors contains no more #-fields. The sequence of states through which this successor was reached is optimal regarding the number of moves that have been necessary to reach this state.
So far, this is trivial. But of course, this is not feasible for a "large" map and/or a "large" number of workers.
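For a tiny map, the exhaustive search can be sketched as a breadth-first search over states. The Python below is only an illustration under simplifying assumptions: a single worker marked `W`, spaces for cells that are not part of the map, and a hypothetical function name `fewest_turns`. It is not the approach used in the Java code further down.

```python
from collections import deque

def fewest_turns(grid):
    """Breadth-first search over (worker position, discovered cells) states."""
    rows = [list(line) for line in grid]
    cells = {(r, c) for r, row in enumerate(rows)
             for c, ch in enumerate(row) if ch != ' '}
    start = next(p for p in cells if rows[p[0]][p[1]] == 'W')

    def seen_from(pos):
        # The worker's own cell plus its 8 neighbours, clipped to the map.
        r, c = pos
        return {(r + dr, c + dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)} & cells

    discovered = frozenset(p for p in cells if rows[p[0]][p[1]] != '#') | seen_from(start)
    queue = deque([(start, discovered, 0)])
    visited = {(start, discovered)}
    while queue:
        pos, known, turns = queue.popleft()
        if len(known) == len(cells):
            return turns  # BFS reaches the goal first along a shortest sequence
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nxt = (pos[0] + dr, pos[1] + dc)
            if nxt not in cells:
                continue
            state = (nxt, known | seen_from(nxt))
            if state not in visited:
                visited.add(state)
                queue.append((*state, turns + 1))
    return None  # some cell can never be discovered

print(fewest_turns(["#####", "..###", "W.###"]))  # prints 4
```

On the question's example this reproduces the 4-turn optimum. The number of states grows exponentially with the number of cells, so it is only practical for very small maps.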
As mentioned in the comments: I think that finding the optimal solution may be an NP-complete problem. In any case, it's most likely at least a tremendously complicated optimization problem where you may have to employ some rather sophisticated techniques to find the optimal solution in reasonable time.
So, IMHO, the only feasible approach for tackling this is to use heuristics.
Several approaches can be imagined here. However, I wanted to give it a try, with a very simple approach. The following MCVE accepts the definition of the map as a rectangular string (empty spaces represent "invalid" regions, so it's possible to represent non-rectangular maps with that). The workers are simply enumerated, from 0 to 9 (limited to this number, at the moment). The string is converted into a MapState that consists of the actual map, as well as the paths that the workers have gone through until then.
The actual search here is a "greedy" version of the exhaustive search that I described in the first paragraph: Given an initial state, it computes all successor states. These are the states where each worker has moved in one of the four directions (e.g. 64 states for 3 workers; of course these are "filtered" to make sure that workers don't leave the map or move to the same field).
These successor states are stored in a list. Then it searches the list for the "best" state, and again computes all successors of this "best" state and stores them in the list. Sooner or later, the list contains a state where no fields are missing.
The definition of the "best" state is where the heuristics come into play: A state is "better" than another when there are fewer fields missing (unvisited). When two states have an equal number of missing fields, then the average distance of the workers to the next unvisited fields serves as the criterion to decide which one is "better".
This finds a solution for the example that is contained in the code below rather quickly, and prints it as the lists of positions that each worker has to visit in each turn.
Of course, this will also not be applicable to "really large" maps or "many" workers, because the list of states will grow rather quickly (one could consider dropping the "worst" solutions to speed this up a little, but this may have caveats, like getting stuck in local optima). Additionally, one can easily think of cases where the "greedy" strategy does not give optimal results. But until someone posts an MCVE that always computes the optimal solution in polynomial time, maybe someone finds this interesting or helpful.
import java.awt.Point;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
public class MapExplorerTest
{
public static void main(String[] args)
{
String mapString =
" ### ######"+"\n"+
" ### ###1##"+"\n"+
"###############"+"\n"+
"#0#############"+"\n"+
"###############"+"\n"+
"###############"+"\n"+
"###############"+"\n"+
"###############"+"\n"+
"###############"+"\n"+
"###############"+"\n"+
"##### #######"+"\n"+
"##### #######"+"\n"+
"##### #######"+"\n"+
"###############"+"\n"+
"###############"+"\n"+
"###############"+"\n"+
"### ######2##"+"\n"+
"### #########"+"\n";
MapExplorer m = new MapExplorer(mapString);
MapState solution = m.computeSolutionGreedy();
System.out.println(solution.createString());
}
}
class MapState
{
private int rows;
private int cols;
private char map[][];
List<List<Point>> workerPaths;
private int missingFields = -1;
MapState(String mapString)
{
workerPaths = new ArrayList<List<Point>>();
rows = countLines(mapString);
cols = mapString.indexOf("\n");
map = new char[rows][cols];
String s = mapString.replaceAll("\\n", "");
for (int r=0; r<rows; r++)
{
for (int c=0; c<cols; c++)
{
int i = c+r*cols;
char ch = s.charAt(i);
map[r][c] = ch;
if (Character.isDigit(ch))
{
int workerIndex = ch - '0';
while (workerPaths.size() <= workerIndex)
{
workerPaths.add(new ArrayList<Point>());
}
Point p = new Point(r, c);
workerPaths.get(workerIndex).add(p);
}
}
}
}
MapState(MapState other)
{
this.rows = other.rows;
this.cols = other.cols;
this.map = new char[other.map.length][];
for (int i=0; i<other.map.length; i++)
{
this.map[i] = other.map[i].clone();
}
this.workerPaths = new ArrayList<List<Point>>();
for (List<Point> otherWorkerPath : other.workerPaths)
{
this.workerPaths.add(MapExplorer.copy(otherWorkerPath));
}
}
int distanceToMissing(Point p0)
{
if (getMissingFields() == 0)
{
return -1;
}
List<Point> points = new ArrayList<Point>();
Map<Point, Integer> distances = new HashMap<Point, Integer>();
distances.put(p0, 0);
points.add(p0);
while (!points.isEmpty())
{
Point p = points.remove(0);
List<Point> successors = MapExplorer.computeSuccessors(p);
for (Point s : successors)
{
if (!isValid(s))
{
continue; // skip successors outside the map
}
if (map[s.x][s.y] == '#')
{
return distances.get(p)+1; // s is the nearest undiscovered field
}
if (!distances.containsKey(s))
{
distances.put(s, distances.get(p)+1);
points.add(s);
}
}
}
return -1;
}
double averageDistanceToMissing()
{
double d = 0;
for (List<Point> workerPath : workerPaths)
{
Point p = workerPath.get(workerPath.size()-1);
d += distanceToMissing(p);
}
return d / workerPaths.size();
}
int getMissingFields()
{
if (missingFields == -1)
{
missingFields = countMissingFields();
}
return missingFields;
}
private int countMissingFields()
{
int count = 0;
for (int r=0; r<rows; r++)
{
for (int c=0; c<cols; c++)
{
if (map[r][c] == '#')
{
count++;
}
}
}
return count;
}
void update()
{
for (List<Point> workerPath : workerPaths)
{
Point p = workerPath.get(workerPath.size()-1);
for (int dr=-1; dr<=1; dr++)
{
for (int dc=-1; dc<=1; dc++)
{
if (dr == 0 && dc == 0)
{
continue;
}
int nr = p.x + dr;
int nc = p.y + dc;
if (!isValid(nr, nc))
{
continue;
}
if (map[nr][nc] != '#')
{
continue;
}
map[nr][nc] = '.';
}
}
}
}
public void updateWorkerPosition(int w, Point p)
{
List<Point> workerPath = workerPaths.get(w);
Point old = workerPath.get(workerPath.size()-1);
char oc = map[old.x][old.y];
char nc = map[p.x][p.y];
map[old.x][old.y] = nc;
map[p.x][p.y] = oc;
}
boolean isValid(int r, int c)
{
if (r < 0) return false;
if (r >= rows) return false;
if (c < 0) return false;
if (c >= cols) return false;
if (map[r][c] == ' ')
{
return false;
}
return true;
}
boolean isValid(Point p)
{
return isValid(p.x, p.y);
}
private static int countLines(String s)
{
int count = 0;
while (s.contains("\n"))
{
s = s.replaceFirst("\\\n", "");
count++;
}
return count;
}
public String createMapString()
{
StringBuilder sb = new StringBuilder();
for (int r=0; r<rows; r++)
{
for (int c=0; c<cols; c++)
{
sb.append(map[r][c]);
}
sb.append("\n");
}
return sb.toString();
}
public String createString()
{
StringBuilder sb = new StringBuilder();
for (List<Point> workerPath : workerPaths)
{
Point p = workerPath.get(workerPath.size()-1);
int d = distanceToMissing(p);
sb.append(workerPath).append(", distance: "+d+"\n");
}
sb.append(createMapString());
sb.append("Missing "+getMissingFields());
return sb.toString();
}
}
class MapExplorer
{
MapState mapState;
public MapExplorer(String mapString)
{
mapState = new MapState(mapString);
mapState.update();
computeSuccessors(mapState);
}
static List<Point> copy(List<Point> list)
{
List<Point> result = new ArrayList<Point>();
for (Point p : list)
{
result.add(new Point(p));
}
return result;
}
public MapState computeSolutionGreedy()
{
Comparator<MapState> comparator = new Comparator<MapState>()
{
@Override
public int compare(MapState ms0, MapState ms1)
{
int m0 = ms0.getMissingFields();
int m1 = ms1.getMissingFields();
if (m0 != m1)
{
return m0-m1;
}
double d0 = ms0.averageDistanceToMissing();
double d1 = ms1.averageDistanceToMissing();
return Double.compare(d0, d1);
}
};
Set<MapState> handled = new HashSet<MapState>();
List<MapState> list = new ArrayList<MapState>();
list.add(mapState);
while (true)
{
MapState best = list.get(0);
for (MapState mapState : list)
{
if (!handled.contains(mapState))
{
if (comparator.compare(mapState, best) < 0)
{
best = mapState;
}
}
}
if (best.getMissingFields() == 0)
{
return best;
}
handled.add(best);
list.addAll(computeSuccessors(best));
System.out.println("List size "+list.size()+", handled "+handled.size()+", best\n"+best.createString());
}
}
List<MapState> computeSuccessors(MapState mapState)
{
int numWorkers = mapState.workerPaths.size();
List<Point> oldWorkerPositions = new ArrayList<Point>();
for (int i=0; i<numWorkers; i++)
{
List<Point> workerPath = mapState.workerPaths.get(i);
Point p = workerPath.get(workerPath.size()-1);
oldWorkerPositions.add(p);
}
List<List<Point>> successorPositionsForWorkers = new ArrayList<List<Point>>();
for (int w=0; w<oldWorkerPositions.size(); w++)
{
Point p = oldWorkerPositions.get(w);
List<Point> ps = computeSuccessors(p);
successorPositionsForWorkers.add(ps);
}
List<List<Point>> newWorkerPositionsList = new ArrayList<List<Point>>();
int numSuccessors = (int)Math.pow(4, numWorkers);
for (int i=0; i<numSuccessors; i++)
{
String s = Integer.toString(i, 4);
while (s.length() < numWorkers)
{
s = "0"+s;
}
List<Point> newWorkerPositions = copy(oldWorkerPositions);
for (int w=0; w<numWorkers; w++)
{
int index = s.charAt(w) - '0';
Point newPosition = successorPositionsForWorkers.get(w).get(index);
newWorkerPositions.set(w, newPosition);
}
newWorkerPositionsList.add(newWorkerPositions);
}
List<MapState> successors = new ArrayList<MapState>();
for (int i=0; i<newWorkerPositionsList.size(); i++)
{
List<Point> newWorkerPositions = newWorkerPositionsList.get(i);
if (workerPositionsValid(newWorkerPositions))
{
MapState successor = new MapState(mapState);
for (int w=0; w<numWorkers; w++)
{
Point p = newWorkerPositions.get(w);
successor.updateWorkerPosition(w, p);
successor.workerPaths.get(w).add(p);
}
successor.update();
successors.add(successor);
}
}
return successors;
}
private boolean workerPositionsValid(List<Point> workerPositions)
{
Set<Point> set = new HashSet<Point>();
for (Point p : workerPositions)
{
if (!mapState.isValid(p.x, p.y))
{
return false;
}
set.add(p);
}
return set.size() == workerPositions.size();
}
static List<Point> computeSuccessors(Point p)
{
List<Point> result = new ArrayList<Point>();
result.add(new Point(p.x+0, p.y+1));
result.add(new Point(p.x+0, p.y-1));
result.add(new Point(p.x+1, p.y+0));
result.add(new Point(p.x-1, p.y+0));
return result;
}
}

complexity for recursion function

I have written a function to reverse a stack in place. These two are member functions of the stack class.
void reverse()
{
int first=pop();
if(first!=-1)
{
reverse();
insert(first);
}
}
private:
void insert(int i)
{
int temp=pop();
if(temp==-1)
{
push(i);
}
else
{
/* there is already a element in the stack*/
insert(i);
push(temp);
}
}
Now, how can I analyze my function in terms of Big-O to calculate its complexity?
Your insert() takes O(length of the stack) time because:
T(n) = T(n-1) + O(1)[to push] = O(n)
and your reverse() takes O(square of the length of the stack) time because:
T(n) = T(n-1) + O(n)[for insert] = O(n^2)
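The two recurrences can also be checked empirically. The sketch below is a Python re-creation of the question's routines that counts calls to the insert helper. It rests on two assumptions of its own: a plain list stands in for the stack class, and an empty list replaces the `-1` sentinel returned by `pop()`.

```python
def insert_at_bottom(stack, item, calls):
    """Mirror of insert(): dig down to the empty stack, push item, then restore."""
    calls[0] += 1
    if not stack:
        stack.append(item)
        return
    temp = stack.pop()
    insert_at_bottom(stack, item, calls)
    stack.append(temp)

def reverse(stack, calls):
    """Mirror of reverse(): pop the top, reverse the rest, re-insert at the bottom."""
    if not stack:
        return
    first = stack.pop()
    reverse(stack, calls)
    insert_at_bottom(stack, first, calls)

def count_insert_calls(n):
    """Reverse a stack of n elements and return how often the helper was called."""
    stack, calls = list(range(n)), [0]
    reverse(stack, calls)
    assert stack == list(range(n - 1, -1, -1))  # the stack really is reversed
    return calls[0]

for n in (1, 2, 4, 8):
    print(n, count_insert_calls(n))  # the count equals n * (n + 1) / 2
```

The helper is called 1 + 2 + ... + n = n(n+1)/2 times in total, which matches the O(n^2) bound for reverse().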

Determine Event Recurrence Pattern for a set of dates

I am looking for a pattern, algorithm, or library that will take a set of dates and return a description of the recurrence if one exists, i.e. the set [11-01-2010, 11-08-2010, 11-15-2010, 11-22-2010, 11-29-2010] would yield something like "Every Monday in November".
Has anyone seen anything like this before or have any suggestions on the best way to implement it?
Grammatical Evolution (GE) is suitable for this kind of problem, because you are searching for an answer that adheres to a certain language. Grammatical Evolution is also used for program generation, composing music, designing, etcetera.
I'd approach the task like this:
Structure the problem space with a grammar.
Construct a Context-free Grammar that can represent all desired recurrence patterns. Consider production rules like these:
datepattern -> datepattern 'and' datepattern
datepattern -> frequency bounds
frequency -> 'every' ordinal weekday 'of the month'
frequency -> 'every' weekday
ordinal -> ordinal 'and' ordinal
ordinal -> 'first' | 'second' | 'third'
bounds -> 'in the year' year
An example of a pattern generated by these rules is: 'every second and third wednesday of the month in the year 2010 and every tuesday in the year 2011'
One way to implement such a grammar would be through a class hierarchy that you will later operate on through reflection, as I've done in the example below.
Map this language to a set of dates
You should create a function that takes a clause from your language and recursively returns the set of all dates covered by it. This allows you to compare your answers to the input.
Guided by the grammar, search for potential solutions
You could use a Genetic algorithm or Simulated Annealing to match the dates to the grammar, try your luck with Dynamic Programming or start simple with a brute force enumeration of all possible clauses.
Should you go with a Genetic Algorithm, your mutation concept should consist of substituting an expression for another one based on the application of one of your production rules.
Have a look at the following GE-related sites for code and information:
http://www.bangor.ac.uk/~eep201/jge/
http://nohejl.name/age/
http://www.geneticprogramming.us/Home_Page.html
Evaluate each solution
The fitness function could take into account the textual length of the solution, the number of dates generated more than once, the number of dates missed, as well as the number of wrong dates generated.
Example code
By request, and because it's such an interesting challenge, I've written a rudimentary implementation of the algorithm to get you started. Although it works, it is by no means finished: the design should definitely get some more thought, and once you have gleaned the fundamental take-aways from this example, I recommend you consider using one of the libraries I've mentioned above.
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
/// <summary>
/// This is a very basic example implementation of a grammatical evolution algorithm for formulating a recurrence pattern in a set of dates.
/// It needs significant extensions and optimizations to be useful in a production setting.
/// </summary>
static class Program
{
#region "Class hierarchy that codifies the grammar"
class DatePattern
{
public Frequency frequency;
public Bounds bounds;
public override string ToString() { return "" + frequency + " " + bounds; }
public IEnumerable<DateTime> Dates()
{
return frequency == null ? new DateTime[] { } : frequency.FilterDates(bounds.GetDates());
}
}
abstract class Bounds
{
public abstract IEnumerable<DateTime> GetDates();
}
class YearBounds : Bounds
{
/* in the year .. */
public int year;
public override string ToString() { return "in the year " + year; }
public override IEnumerable<DateTime> GetDates()
{
var firstDayOfYear = new DateTime(year, 1, 1);
return Enumerable.Range(0, new DateTime(year, 12, 31).DayOfYear)
.Select(dayOfYear => firstDayOfYear.AddDays(dayOfYear));
}
}
abstract class Frequency
{
public abstract IEnumerable<DateTime> FilterDates(IEnumerable<DateTime> Dates);
}
class WeeklyFrequency : Frequency
{
/* every .. */
public DayOfWeek dayOfWeek;
public override string ToString() { return "every " + dayOfWeek; }
public override IEnumerable<DateTime> FilterDates(IEnumerable<DateTime> Dates)
{
return Dates.Where(date => (date.DayOfWeek == dayOfWeek));
}
}
class MonthlyFrequency : Frequency
{
/* every .. */
public Ordinal ordinal;
public DayOfWeek dayOfWeek;
/* .. of the month */
public override string ToString() { return "every " + ordinal + " " + dayOfWeek + " of the month"; }
public override IEnumerable<DateTime> FilterDates(IEnumerable<DateTime> Dates)
{
return Dates.Where(date => (date.DayOfWeek == dayOfWeek) && (int)ordinal == (date.Day - 1) / 7);
}
}
enum Ordinal { First, Second, Third, Fourth, Fifth }
#endregion
static Random random = new Random();
const double MUTATION_RATE = 0.3;
static Dictionary<Type, Type[]> subtypes = new Dictionary<Type, Type[]>();
static void Main()
{
// The input signifies the recurrence 'every first thursday of the month in 2010':
var input = new DateTime[] {new DateTime(2010,12,2), new DateTime(2010,11,4),new DateTime(2010,10,7),new DateTime(2010,9,2),
new DateTime(2010,8,5),new DateTime(2010,7,1),new DateTime(2010,6,3),new DateTime(2010,5,6),
new DateTime(2010,4,1),new DateTime(2010,3,4),new DateTime(2010,2,4),new DateTime(2010,1,7) };
for (int cTests = 0; cTests < 20; cTests++)
{
// Initialize with a random population
int treesize = 0;
var population = new DatePattern[] { (DatePattern)Generate(typeof(DatePattern), ref treesize), (DatePattern)Generate(typeof(DatePattern), ref treesize), (DatePattern)Generate(typeof(DatePattern), ref treesize) };
Run(input, new List<DatePattern>(population));
}
}
private static void Run(DateTime[] input, List<DatePattern> population)
{
var strongest = population[0];
int strongestFitness = int.MinValue;
int bestTry = int.MaxValue;
for (int cGenerations = 0; cGenerations < 300 && strongestFitness < -100; cGenerations++)
{
// Select the best individuals to survive:
var survivers = population
.Select(individual => new { Fitness = Fitness(input, individual), individual })
.OrderByDescending(pair => pair.Fitness)
.Take(5)
.Select(pair => pair.individual)
.ToArray();
population.Clear();
// The survivers are the foundation for the next generation:
foreach (var parent in survivers)
{
for (int cChildren = 0; cChildren < 3; cChildren++)
{
int treeSize = 1;
DatePattern child = (DatePattern)Mutate(parent, ref treeSize); // NB: procreation may also be done through crossover.
population.Add((DatePattern)child);
var childFitness = Fitness(input, child);
if (childFitness > strongestFitness)
{
bestTry = cGenerations;
strongestFitness = childFitness;
strongest = child;
}
}
}
}
Trace.WriteLine("Found best match with fitness " + Fitness(input, strongest) + " after " + bestTry + " generations: " + strongest);
}
private static object Mutate(object original, ref int treeSize)
{
treeSize = 0;
object replacement = Construct(original.GetType());
foreach (var field in original.GetType().GetFields())
{
object newFieldValue = field.GetValue(original);
int subtreeSize;
if (field.FieldType.IsEnum)
{
subtreeSize = 1;
if (random.NextDouble() <= MUTATION_RATE)
newFieldValue = ConstructRandomEnumValue(field.FieldType);
}
else if (field.FieldType == typeof(int))
{
subtreeSize = 1;
if (random.NextDouble() <= MUTATION_RATE)
newFieldValue = (random.Next(2) == 0
? Math.Min(int.MaxValue - 1, (int)newFieldValue) + 1
: Math.Max(int.MinValue + 1, (int)newFieldValue) - 1);
}
else
{
subtreeSize = 0;
newFieldValue = Mutate(field.GetValue(original), ref subtreeSize); // mutate pre-maturely to find out subtreeSize
if (random.NextDouble() <= MUTATION_RATE / subtreeSize) // makes high-level nodes mutate less.
{
subtreeSize = 0; // init so we can track the size of the subtree soon to be made.
newFieldValue = Generate(field.FieldType, ref subtreeSize);
}
}
field.SetValue(replacement, newFieldValue);
treeSize += subtreeSize;
}
return replacement;
}
private static object ConstructRandomEnumValue(Type type)
{
var vals = type.GetEnumValues();
return vals.GetValue(random.Next(vals.Length));
}
private static object Construct(Type type)
{
return type.GetConstructor(new Type[] { }).Invoke(new object[] { });
}
private static object Generate(Type type, ref int treesize)
{
if (type.IsEnum)
{
return ConstructRandomEnumValue(type);
}
else if (typeof(int) == type)
{
return random.Next(10) + 2005;
}
else
{
if (type.IsAbstract)
{
// pick one of the concrete subtypes:
var subtypes = GetConcreteSubtypes(type);
type = subtypes[random.Next(subtypes.Length)];
}
object newobj = Construct(type);
foreach (var field in type.GetFields())
{
treesize++;
field.SetValue(newobj, Generate(field.FieldType, ref treesize));
}
return newobj;
}
}
private static int Fitness(DateTime[] input, DatePattern individual)
{
var output = individual.Dates().ToArray();
var avgDateDiff = Math.Abs((output.Average(d => d.Ticks / (24.0 * 60 * 60 * 10000000)) - input.Average(d => d.Ticks / (24.0 * 60 * 60 * 10000000))));
return
-individual.ToString().Length // succinct patterns are preferred.
- input.Except(output).Count() * 300 // Forgetting some of the dates is bad.
- output.Except(input).Count() * 3000 // Spurious dates cause even more confusion to the user.
- (int)(avgDateDiff) * 30000; // The difference in average date is the most important guide.
}
private static Type[] GetConcreteSubtypes(Type supertype)
{
if (subtypes.ContainsKey(supertype))
{
return subtypes[supertype];
}
else
{
var types = AppDomain.CurrentDomain.GetAssemblies().ToList()
.SelectMany(s => s.GetTypes())
.Where(p => supertype.IsAssignableFrom(p) && !p.IsAbstract).ToArray();
subtypes.Add(supertype, types);
return types;
}
}
}
Hope this gets you on track. Be sure to share your actual solution somewhere; I think it will be quite useful in lots of scenarios.
If your purpose is to generate human-readable descriptions of the pattern, as in your "Every Monday in November", then you probably want to start by enumerating the possible descriptions. Descriptions can be broken down into frequency and bounds, for example,
Frequency:
Every day ...
Every other/third/fourth day ...
Weekdays/weekends ...
Every Monday ...
Alternate Mondays ...
The first/second/last Monday ...
...
Bounds:
... in January
... between 25 March and 25 October
...
There won't be all that many of each, and you can check for them one by one.
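Checking the candidates one by one could look roughly like the sketch below. It is only an illustration: the class name `DescriptionMatcher`, the handful of frequencies, and the choice to use the first and last input dates as the bounds are all assumptions made here, not part of the answer above.

```java
import java.time.DayOfWeek;
import java.time.LocalDate;
import java.time.format.TextStyle;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Predicate;

// Hypothetical sketch: try a fixed list of frequency descriptions in order.
final class DescriptionMatcher {

    // Candidate frequencies, in the order they are tried (assumed list).
    static Map<String, Predicate<LocalDate>> frequencies() {
        Map<String, Predicate<LocalDate>> m = new LinkedHashMap<>();
        m.put("Every day", d -> true);
        m.put("Weekdays", d -> d.getDayOfWeek().getValue() <= 5);
        m.put("Weekends", d -> d.getDayOfWeek().getValue() >= 6);
        for (DayOfWeek dow : DayOfWeek.values()) {
            String name = dow.getDisplayName(TextStyle.FULL, Locale.ENGLISH);
            m.put("Every " + name, d -> d.getDayOfWeek() == dow);
        }
        return m;
    }

    // Returns the first description whose expansion between the first and
    // last input date is exactly the input set, or null if none fits.
    static String describe(List<LocalDate> dates) {
        TreeSet<LocalDate> input = new TreeSet<>(dates);
        if (input.isEmpty()) {
            return null;
        }
        for (Map.Entry<String, Predicate<LocalDate>> e : frequencies().entrySet()) {
            Set<LocalDate> expanded = new TreeSet<>();
            for (LocalDate d = input.first(); !d.isAfter(input.last()); d = d.plusDays(1)) {
                if (e.getValue().test(d)) {
                    expanded.add(d);
                }
            }
            if (expanded.equals(input)) {
                return e.getKey() + " between " + input.first() + " and " + input.last();
            }
        }
        return null;
    }
}
```

Calling `DescriptionMatcher.describe(...)` with the five Mondays of November 2015 yields a description starting with "Every Monday". More frequencies (alternate Mondays, first Monday of the month, and so on) would be added to the map in the same way.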
What I would do:
Create samples of the data
Use a clustering algorithm
Generate samples using the algorithm
Create a fitness function to measure how well it correlates to the full data set. The clustering algorithm will come up with either 0 or 1 suggestions, and you can measure each suggestion by how well it fits the full set.
Eliminate or merge the occurrence with the already found sets and rerun this algorithm.
With that in mind, you may want to use either simulated annealing or a genetic algorithm. Also, if you have the descriptions, you may want to compare the descriptions to generate a sample.
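The answer does not say which fitness measure to use. One simple possibility, shown here purely as an assumption, is the Jaccard index between the real dates and the dates a candidate pattern generates; the class name `PatternFitness` is made up for the sketch.

```java
import java.time.LocalDate;
import java.util.HashSet;
import java.util.Set;

// Hypothetical fitness measure for a candidate pattern.
final class PatternFitness {

    // Jaccard index |A n B| / |A u B|: 1.0 means the candidate reproduces
    // the data exactly, 0.0 means the two sets share no dates at all.
    static double jaccard(Set<LocalDate> actual, Set<LocalDate> generated) {
        if (actual.isEmpty() && generated.isEmpty()) {
            return 1.0; // two empty sets are treated as a perfect match
        }
        Set<LocalDate> intersection = new HashSet<>(actual);
        intersection.retainAll(generated);
        Set<LocalDate> union = new HashSet<>(actual);
        union.addAll(generated);
        return (double) intersection.size() / union.size();
    }
}
```

Both missed dates and spurious dates lower the score. If one kind of error should cost more than the other, a weighted count of each would be the natural variation.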
You could read the system date (or date and time) and construct crude calendar points in memory from the date and the day of the week that the call returns. Then sum the number of days in the relevant months and add the day number from the input, and/or look up the calendar point for the relevant week (starting Sunday or Monday) and step the index forward to the correct day. Build the text string from fixed characters and insert the relevant variable, such as the full name of the day of the week, as required. Multiple traversals may be needed to obtain all the events whose occurrences are to be displayed or counted.
First, find a sequence, if it exists:
step = {day,month,year}
period=0
' Measure the gap between consecutive dates in every unit
for each s in step
for d = 1 to dates.count-1
interval(d,s)=datedifference(s,date(d),date(d+1))
next
next
' Find frequency with largest interval
for s = year downto day
found=true
for d = 1 to dates.count-2
if interval(d,s)<>interval(d+1,s) then
found=false
exit for
end if
next
' A gap of zero means the dates do not advance in this unit at all
if found and interval(1,s)>0 then
period=s
frequency=interval(1,s)
exit for
end if
next
if period>0 then
Select case period
case day
if frequency mod 7 = 0 then
say "every" dayname(date(1))
else
say "every" frequency "days"
end if
case month
say "every" frequency "months on day" daynumber(date(1))
case year
say "every" frequency "years on" daynumber(date(1)) monthname(date(1))
end select
end if
Finally, deal with "in November", "from 2007 to 2010", etc., which should be obvious.
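For comparison, here is a rough Java rendering of the same idea using `java.time`. It is a sketch under assumptions rather than a translation: the class `IntervalDetector` and its method names are invented, and it checks every date against `start + i * gap` instead of comparing neighbouring gaps, so that "monthly" only matches when the day of the month really lines up.

```java
import java.time.LocalDate;
import java.time.format.TextStyle;
import java.time.temporal.ChronoUnit;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch of the "constant interval" detection.
final class IntervalDetector {

    // Returns the constant gap in the given unit, or -1 if the dates are
    // not evenly spaced in that unit (or there are fewer than two dates).
    static long period(List<LocalDate> sorted, ChronoUnit unit) {
        if (sorted.size() < 2) {
            return -1;
        }
        LocalDate start = sorted.get(0);
        long gap = unit.between(start, sorted.get(1));
        if (gap <= 0) {
            return -1;
        }
        for (int i = 1; i < sorted.size(); i++) {
            if (!start.plus(i * gap, unit).equals(sorted.get(i))) {
                return -1;
            }
        }
        return gap;
    }

    // Tries the largest unit first, as in the pseudocode above.
    static String describe(List<LocalDate> dates) {
        List<LocalDate> sorted = new ArrayList<>(dates);
        Collections.sort(sorted);
        ChronoUnit[] units = {ChronoUnit.YEARS, ChronoUnit.MONTHS, ChronoUnit.DAYS};
        for (ChronoUnit unit : units) {
            long gap = period(sorted, unit);
            if (gap < 0) {
                continue;
            }
            LocalDate first = sorted.get(0);
            if (unit == ChronoUnit.YEARS) {
                return "every " + gap + " year(s) on " + first.getDayOfMonth() + " "
                        + first.getMonth().getDisplayName(TextStyle.FULL, Locale.ENGLISH);
            } else if (unit == ChronoUnit.MONTHS) {
                return "every " + gap + " month(s) on day " + first.getDayOfMonth();
            } else if (gap == 7) {
                return "every " + first.getDayOfWeek().getDisplayName(TextStyle.FULL, Locale.ENGLISH);
            } else {
                return "every " + gap + " days";
            }
        }
        return null; // no constant interval found
    }
}
```

Note that this sketch only reports "every Monday" for a gap of exactly 7 days; a gap of 14 comes out as "every 14 days" rather than being folded into the weekly case.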
HTH
I like @arjen's answer, but I don't think there is any need for a complex algorithm. This is really simple. If there is a pattern, there is a pattern, so a simple algorithm will work. First we need to think of the types of patterns we are looking for: daily, weekly, monthly and yearly.
How to recognize?
Daily: there is a record every day
Weekly: there is a record every week
Monthly: there is a record every month
Yearly: there is a record every year
Difficult? No. Just count how many repetitions you have and then classify.
Here is my implementation
RecurrencePatternAnalyser.java
import java.util.ArrayList;
import java.util.Calendar;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.Map.Entry;
public class RecurrencePatternAnalyser {
// Local copy of calendars by add() method
private ArrayList<Calendar> mCalendars = new ArrayList<Calendar>();
// Used to count the uniqueness of each year/month/day
private HashMap<Integer, Integer> year_count = new HashMap<Integer,Integer>();
private HashMap<Integer, Integer> month_count = new HashMap<Integer,Integer>();
private HashMap<Integer, Integer> day_count = new HashMap<Integer,Integer>();
private HashMap<Integer, Integer> busday_count = new HashMap<Integer,Integer>();
// Used for counting payments before due date on weekends
private int day_goodpayer_ocurrences = 0;
private int day_goodPayer = 0;
// Add a new calendar to the analysis
public void add(Calendar date)
{
mCalendars.add(date);
addYear( date.get(Calendar.YEAR) );
addMonth( date.get(Calendar.MONTH) );
addDay( date.get(Calendar.DAY_OF_MONTH) );
addWeekendDays( date );
}
public void printCounts()
{
System.out.println("Year: " + getYearCount() +
" month: " + getMonthCount() + " day: " + getDayCount());
}
public RecurrencePattern getPattern()
{
int records = mCalendars.size();
if (records<=1)
return null;
RecurrencePattern rp = null;
if (getYearCount()==records)
{
rp = new RecurrencePatternYearly();
if (records>=3)
rp.setConfidence(1);
else if (records==2)
rp.setConfidence(0.9f);
}
else if (getMonthCount()==records)
{
rp = new RecurrencePatternMonthly();
if (records>=12)
rp.setConfidence(1);
else
rp.setConfidence(1-(-0.0168f * records + 0.2f));
}
else
{
calcDaysRepetitionWithWeekends();
if (day_goodpayer_ocurrences==records)
{
rp = new RecurrencePatternMonthly();
rp.setPattern(RecurrencePattern.PatternType.MONTHLY_GOOD_PAYER);
if (records>=12)
rp.setConfidence(0.95f);
else
rp.setConfidence(1-(-0.0168f * records + 0.25f));
}
}
return rp;
}
// Increment one more year/month/day on each count variable
private void addYear(int key_year) { incrementHash(year_count, key_year); }
private void addMonth(int key_month) { incrementHash(month_count, key_month); }
private void addDay(int key_day) { incrementHash(day_count, key_day); }
// Retrieve number of unique entries for the records
private int getYearCount() { return year_count.size(); }
private int getMonthCount() { return month_count.size(); }
private int getDayCount() { return day_count.size(); }
// Generic function to increment the hash by 1
private void incrementHash(HashMap<Integer, Integer> var, Integer key)
{
Integer oldCount = var.get(key);
Integer newCount = 0;
if ( oldCount != null ) {
newCount = oldCount;
}
newCount++;
var.put(key, newCount);
}
// As banks are closed during weekends, some dates might be brought
// forward to Friday. These will be false positives for the recurrence
// pattern. This function also counts Saturday and Sunday when a date
// falls on a Friday.
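private void addWeekendDays(Calendar date)
{
// Work on a copy so that the caller's Calendar is not shifted below
Calendar c = (Calendar) date.clone();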
int key_day = c.get(Calendar.DAY_OF_MONTH);
incrementHash(busday_count, key_day);
if (c.get(Calendar.DAY_OF_WEEK) == Calendar.FRIDAY)
{
// Adds Saturday
c.add(Calendar.DATE, 1);
key_day = c.get(Calendar.DAY_OF_MONTH);
incrementHash(busday_count, key_day);
// Adds Sunday
c.add(Calendar.DATE, 1);
key_day = c.get(Calendar.DAY_OF_MONTH);
incrementHash(busday_count, key_day);
}
}
private void calcDaysRepetitionWithWeekends()
{
Iterator<Entry<Integer, Integer>> it =
busday_count.entrySet().iterator();
while (it.hasNext()) {
@SuppressWarnings("rawtypes")
Map.Entry pair = (Map.Entry)it.next();
if ((int)pair.getValue() > day_goodpayer_ocurrences)
{
day_goodpayer_ocurrences = (int) pair.getValue();
day_goodPayer = (int) pair.getKey();
}
//it.remove(); // avoids a ConcurrentModificationException
}
}
}
RecurrencePattern.java
public abstract class RecurrencePattern {
public enum PatternType {
YEARLY, MONTHLY, WEEKLY, DAILY, MONTHLY_GOOD_PAYER
}
public enum OrdinalType {
FIRST, SECOND, THIRD, FOURTH, FIFTH
}
protected PatternType pattern;
private float confidence;
private int frequency;
public PatternType getPattern() {
return pattern;
}
public void setPattern(PatternType pattern) {
this.pattern = pattern;
}
public float getConfidence() {
return confidence;
}
public void setConfidence(float confidence) {
this.confidence = confidence;
}
public int getFrequency() {
return frequency;
}
public void setFrequency(int frequency) {
this.frequency = frequency;
}
}
RecurrencePatternMonthly.java
public class RecurrencePatternMonthly extends RecurrencePattern {
private boolean isDayFixed;
private boolean isDayOrdinal;
private OrdinalType ordinaltype;
public RecurrencePatternMonthly()
{
this.pattern = PatternType.MONTHLY;
}
}
RecurrencePatternYearly.java
public class RecurrencePatternYearly extends RecurrencePattern {
private boolean isDayFixed;
private boolean isMonthFixed;
private boolean isDayOrdinal;
private OrdinalType ordinaltype;
public RecurrencePatternYearly()
{
this.pattern = PatternType.YEARLY;
}
}
Main.java
import java.sql.Connection;
import java.util.Calendar;
import java.util.GregorianCalendar;
public class Algofin {
static Connection c = null;
public static void main(String[] args) {
//openConnection();
//readSqlFile();
RecurrencePatternAnalyser r = new RecurrencePatternAnalyser();
//System.out.println(new GregorianCalendar(2015,1,30).get(Calendar.MONTH));
r.add(new GregorianCalendar(2015,0,1));
r.add(new GregorianCalendar(2015,0,30));
r.add(new GregorianCalendar(2015,1,27));
r.add(new GregorianCalendar(2015,3,1));
r.add(new GregorianCalendar(2015,4,1));
r.printCounts();
RecurrencePattern rp;
rp=r.getPattern();
System.out.println("Pattern: " + rp.getPattern() + " confidence: " + rp.getConfidence());
}
}
I think you'll have to build it, and I think it will be a devil-is-in-the-details kind of project. Start by getting much more thorough requirements. Which date patterns do you want to recognize? Come up with a list of examples that you want your algorithm to successfully identify. Write your algorithm to meet your examples. Put your examples in a test suite so that when you get different requirements later you can make sure you didn't break the old ones.
I predict you will write 200 if-then-else statements.
OK, I do have one idea. Get familiar with the concepts of sets, unions, coverage, intersection and so on. Have a list of short patterns that you search for, say, "Every day in October", "Every day in November", and "Every day in December." If these short patterns are contained within the set of dates, then define a union function that can combine shorter patterns in intelligent ways. For example, let's say you matched the three patterns I mention above. If you Union them together you get, "Every day in October through December." You could aim to return the most succinct set of unions that cover your set of dates or something like that.
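To make the union idea concrete, here is a small sketch restricted to the "every day in a month" patterns from the example. Everything named here (`MonthUnion`, `coversMonth`, `label`) is hypothetical, the labels ignore the year, and dates that fall outside fully covered months are simply not reported.

```java
import java.time.LocalDate;
import java.time.YearMonth;
import java.time.format.TextStyle;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: match short monthly patterns, then union neighbours.
final class MonthUnion {

    // Short pattern "every day in <month>": true when no day is missing.
    static boolean coversMonth(Set<LocalDate> dates, YearMonth month) {
        for (int day = 1; day <= month.lengthOfMonth(); day++) {
            if (!dates.contains(month.atDay(day))) {
                return false;
            }
        }
        return true;
    }

    // Union step: merge runs of consecutive covered months into one label.
    static List<String> describe(Set<LocalDate> dates) {
        TreeSet<YearMonth> covered = new TreeSet<>();
        for (LocalDate d : dates) {
            YearMonth m = YearMonth.from(d);
            if (!covered.contains(m) && coversMonth(dates, m)) {
                covered.add(m);
            }
        }
        List<String> out = new ArrayList<>();
        YearMonth runStart = null;
        YearMonth prev = null;
        for (YearMonth m : covered) {
            if (runStart == null) {
                runStart = m;
            } else if (!prev.plusMonths(1).equals(m)) {
                out.add(label(runStart, prev)); // gap found: close the run
                runStart = m;
            }
            prev = m;
        }
        if (runStart != null) {
            out.add(label(runStart, prev));
        }
        return out;
    }

    static String label(YearMonth from, YearMonth to) {
        String first = from.getMonth().getDisplayName(TextStyle.FULL, Locale.ENGLISH);
        if (from.equals(to)) {
            return "Every day in " + first;
        }
        String last = to.getMonth().getDisplayName(TextStyle.FULL, Locale.ENGLISH);
        return "Every day in " + first + " through " + last;
    }
}
```

Given every date from 1 October to 31 December of one year, `describe` returns the single entry "Every day in October through December". Other short patterns (every Monday in a month, say) would each need their own matcher and their own rule for when two matches can be merged.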
Have a look at your favourite calendar program. See what patterns of event recurrence it can generate. Reverse engineer them.
