Algorithm to calculate the price volatility of a commodity - algorithm

I am trying to design an algorithm to calculate how volatile the price fluctuations of a commodity are.
The way I would like this to work is that if the price of a commodity constantly goes up and down, it should have a higher score than if the price of the commodity gradually increases and then falls in price rapidly.
Here is an example of what I mean:
Commodity A: 1 -> 2 -> 3 -> 2 -> 1 -> 3 -> 4 -> 2 -> 1
Commodity B: 1 -> 2 -> 3 -> 4 -> 5 -> 6 -> 7 -> 8 -> 2
Commodity C: 1 -> 2 -> 3 -> 4 -> 5 -> 4 -> 3 -> 2-> 1
Commodity A has a 'wave' like pattern in that its price goes up and falls down on a regular basis.
Commodity B has a 'cliff' like pattern in that the price goes up gradually and then falls steeply.
Commodity C has a 'hill' like pattern in that the price rises gradually and then falls gradually.
A should receive the highest ranking, followed by C, followed by B. The more of a wave pattern the price of the commodity follows, the higher a ranking it should have.
Does anyone have any suggestions for an algorithm that could do this?
Thanks!

My approach looks something like this.
For my algorithm, I am considering the above example.
A: 1 -> 2 -> 3 -> 2 -> 1 -> 3 -> 4 -> 2 -> 1
B: 1 -> 2 -> 3 -> 4 -> 5 -> 6 -> 7 -> 8 -> 2
C: 1 -> 2 -> 3 -> 4 -> 5 -> 4 -> 3 -> 2-> 1
Now I will squash these lists. By squash I mean keeping only the start value and end value of each increasing or decreasing run.
So, after squashing the list will look something like this.
A: 1 -> 3 -> 1 -> 4 -> 1
B: 1 -> 8 -> 2
C: 1 -> 5 -> 1
Once this is done, I take the absolute difference between elements i and i+1, average those differences, and rank the commodities by that average.
So the differences between consecutive elements look like this:
A: 1 -> 3 -> 1 -> 4 -> 1 (differences: 2, 2, 3, 3)
B: 1 -> 8 -> 2 (differences: 7, 6)
C: 1 -> 5 -> 1 (differences: 4, 4)
Now let's sum this difference and take the average.
A: (2+2+3+3)/4 = 2.5
B: (7+6)/2 = 6.5
C: (4+4)/2 = 4
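Now we can assign ranks based on this average value, where a lower average means a more wave-like pattern and therefore a higher rank:
A (2.5) < C (4) < B (6.5), so A ranks highest, followed by C, then B.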
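The squash-and-average steps above can be sketched in Python as follows. This is only a minimal sketch: the function names squash and average_swing are made up for illustration, and it assumes the price list never contains two equal consecutive values (plateaus would need extra handling).

```python
def squash(prices):
    """Keep only the turning points: the ends of each rising or falling run."""
    if len(prices) < 3:
        return list(prices)
    out = [prices[0]]
    for prev, cur, nxt in zip(prices, prices[1:], prices[2:]):
        # cur is a turning point when the direction changes around it.
        # Assumption: no equal consecutive prices (plateaus are not handled).
        if (cur - prev) * (nxt - cur) < 0:
            out.append(cur)
    out.append(prices[-1])
    return out


def average_swing(prices):
    """Average absolute difference between consecutive turning points."""
    s = squash(prices)
    diffs = [abs(b - a) for a, b in zip(s, s[1:])]
    return sum(diffs) / len(diffs) if diffs else 0.0


commodities = {
    "A": [1, 2, 3, 2, 1, 3, 4, 2, 1],
    "B": [1, 2, 3, 4, 5, 6, 7, 8, 2],
    "C": [1, 2, 3, 4, 5, 4, 3, 2, 1],
}
# A lower average swing means a more wave-like pattern, so sort ascending.
ranking = sorted(commodities, key=lambda name: average_swing(commodities[name]))
print(ranking)
```

Sorting ascending by the average puts the most wave-like commodity first, which matches the ranking asked for in the question.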
Hope this helps!

Related

Find recursive, hierarchical structures in a sequence

I am looking for an efficient algorithm to find structures that are both recursive and hierarchical from a sequence. I did come across the sequitur algorithm but it only returns hierarchical structures.
Consider the following sequence as an example: abcbcabc
What sequitur returns:
0 -> 1 2 1
1 -> a 2
2 -> b c
Required output:
0 -> 1 1
1 -> a 2
2 -> b c | b c 2

Algorithm help: given a number, can it finally be turned into 1?

Given a number, you can divide it or any contiguous part of it by 2, or multiply it or any contiguous part of it by 2. Can the number eventually be turned into 1?
For example: 13
3 is a part of 13. First we take 3 * 2 = 6, so the number becomes 16,
then we can operate on the whole number 16: 16 / 2 = 8, so the number is 8 now,
8 / 2 = 4, the number is 4 now,
4 / 2 = 2, the number is 2 now,
2 / 2 = 1, the number is 1 now.
So 13 can be turned into 1, and the path is 13->16->8->4->2->1. We can use a List to store the path.
Example: 27
First we operate on the whole number 27: 27 * 2 = 54;
then we take 4 as the part of 54: 4 / 2 = 2, so the 4 becomes 2 and the number becomes 52;
operate on 52: 52 / 2 = 26, the number is 26 now;
operate on 26: 26 / 2 = 13, the number is 13 now;
we just analyzed 13, so 27 can eventually be turned into 1.
How should such a problem be analyzed? What is the main idea for solving this type of problem?
Sorry about the confusing description, let's take a more complex example: 316
16 is a contiguous part, let 16 / 2 = 8 , so the num is 38 now,
then take 8 / 2 = 4 , the num is 34,
take 4 / 2 = 2, the num is 32,
now take the whole num 32 / 2 = 16,
16 / 2 = 8, num is 8,
8 / 2 = 4, num is 4,
4 / 2 = 2, num is 2,
finally 2 / 2 = 1.
We say, original num 316 can turn into 1 finally after above conversion.
A contiguous part means the following: if the input number is 12345,
then 123, 234, 345, 12, 2345 and so on are all contiguous parts.
Any contiguous run of digits is fine; it does NOT have to include the first or last digit.
The question is :
How to judge such a num? And if the num can turn into 1, print the path.
Can you find the shortest way?
I got some hints from interviewer (The interview is over):
Most numbers are eligible, which means the numbers that are NOT eligible have obvious characteristics.
The brute-force approach has too high a time complexity, so we should prune early. (Sliding window + pruning?)
Here is a simple and unoptimized breadth-first search.
def shortest_digit_path(n):
    path_from = {n: None}
    queue = [n]
    count = 0
    while True:
        m = queue.pop(0)
        count += 1
        if 0 == count % 1000:
            print((count, m))  # Progress report for long searches.
        if m == 1:
            break
        x = str(m)
        for i in range(len(x)):
            for j in range(i+1, len(x) + 1):
                y = x[0:i]
                z = x[i:j]
                w = x[j:]
                if z[0] == '0':
                    continue # The continuous section is not a proper number.
                # Try half of z (z must be even; parts ending in 0 count too, e.g. 10 -> 5)
                if z[-1] in ['0', '2', '4', '6', '8']:
                    next_m = int(y + str(int(z)//2) + w)
                    if next_m not in path_from:
                        path_from[next_m] = m
                        queue.append(next_m)
                # Try doubling z
                next_m = int(y + str(int(z)*2) + w)
                if next_m not in path_from:
                    path_from[next_m] = m
                    queue.append(next_m)
    # Walk back from 1 to n through the recorded predecessors.
    path = []
    while m is not None:
        path.append(m)
        m = path_from[m]
    return list(reversed(path))
After playing around with this for a bit, I came up with the following observations.
1. If the number ends in 0 or 5, there is no path to having any other digit at the end, and therefore you can't get to 1. (The above function will just run forever.)
2. For anything else we can find a path just dealing with 1-2 digits at a time.
Here are the special cases for observation #2. Our first goal is to get to just 0, 1, and 5 as digits.
0: 0
1: 1
2: 2 -> 1
3: 3 -> 6 -> 12 -> 24 -> 28 -> 56 -> 112 -> 16 -> 8 -> 4 -> 2 -> 1
4: 4 -> 2 -> 1
5: 5
6: 6 -> 12 -> 24 -> 28 -> 56 -> 112 -> 16 -> 8 -> 4 -> 2 -> 1
7: 7 -> 14 -> 28 -> 56 -> 112 -> 16 -> 8 -> 4 -> 2 -> 1
8: 8 -> 4 -> 2 -> 1
9: 9 -> 18 -> 28 -> 56 -> 112 -> 16 -> 8 -> 4 -> 2 -> 1
And now from the start of the number we have to deal with the following cases that reduce the number of digits and get back to our desired form.
10: 10 -> 5
11: 11 -> 22 -> 24 -> 28 -> 56 -> 112 -> 16 -> 8 -> 4 -> 2 -> 1
15: 15 -> 110 -> 220 -> 240 -> 280 -> 560 -> 1120 -> 160 -> 80 -> 40 -> 20 -> 10 -> 5
50: 50 -> 25 -> 15 -> 110 -> 220 -> 240 -> 280 -> 560 -> 1120 -> 160 -> 80 -> 40 -> 20 -> 10 -> 5
51: 51 -> 52 -> 26 -> 16 -> 8 -> 4 -> 2 -> 1
55: 55 -> 510 -> 520 -> 260 -> 160 -> 80 -> 40 -> 20 -> 10 -> 5
With this set of rules we can first normalize the number to a standard form, then we can shorten it one digit at a time. This lets us essentially instantly come up with a path. Almost certainly not the shortest one, but definitely a path.
Writing that function is left as an exercise to the reader.
Now back to the shortest path. The breadth-first search can be made much faster if we search breadth-first from both ends and meet in the middle. For this you'd also need a path_to that is initialized with {1: None}, and a queue containing elements of the form (m, is_rising), initialized with [(1, True), (n, False)]. You'd then branch on is_rising, and before entering a value into path_from/path_to, check whether it is already in path_to/path_from. If it is, you've met in the middle. Now work out both halves of the path and join them together.
This approach is trickier. But it will let you find the shortest path in roughly the square root of the number of steps that the current approach takes.
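Here is one possible sketch of the meet-in-the-middle search. It is not the exact bookkeeping described above: instead of one queue with an is_rising flag it keeps two parent maps and expands one full level per side in turn. It relies on halving and doubling a part being inverse moves (assuming parts ending in 0 may be halved, as in 10 -> 5), so the same neighbor function serves both directions. The names neighbors, bidirectional_path and max_nodes are illustrative only.

```python
from collections import deque


def neighbors(m):
    """Numbers reachable from m by halving or doubling one contiguous part."""
    x = str(m)
    result = set()
    for i in range(len(x)):
        if x[i] == '0':
            continue  # A part may not start with 0.
        for j in range(i + 1, len(x) + 1):
            y, z, w = x[:i], x[i:j], x[j:]
            if z[-1] in '02468':
                result.add(int(y + str(int(z) // 2) + w))
            result.add(int(y + str(int(z) * 2) + w))
    return result


def _join(meet, fwd, bwd):
    """Build n -> ... -> meet -> ... -> 1 from the two parent maps."""
    path = []
    m = meet
    while m is not None:
        path.append(m)
        m = fwd[m]
    path.reverse()
    m = bwd[meet]
    while m is not None:
        path.append(m)
        m = bwd[m]
    return path


def bidirectional_path(n, max_nodes=200000):
    """Return a path from n to 1, or None if none is found within max_nodes."""
    if n == 1:
        return [1]
    if str(n)[-1] in '05':
        return None  # Observation 1: such numbers can never reach 1.
    parents = ({n: None}, {1: None})  # Search tree from n, search tree from 1.
    queues = (deque([n]), deque([1]))
    side = 0
    while queues[0] and queues[1] and len(parents[0]) + len(parents[1]) < max_nodes:
        for _ in range(len(queues[side])):  # Expand exactly one BFS level.
            m = queues[side].popleft()
            for nxt in neighbors(m):
                if nxt in parents[side]:
                    continue
                parents[side][nxt] = m
                if nxt in parents[1 - side]:
                    return _join(nxt, parents[0], parents[1])
                queues[side].append(nxt)
        side = 1 - side
    return None


print(bidirectional_path(13))
```

Because each side advances one complete level at a time, the first meeting point should lie on a shortest path; the max_nodes guard is only a safety net for this sketch.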

Counting number of key comparisons in hash table

I have a hash table that looks like this:
0
1 -> 1101 -> 1222 -> 1343 // 3 key comparisons
2
3 -> 2973 -> 2588 // 2 key comparisons
4
4
How many key comparisons are there?
The given answer is 1 + 2 + 1 = 4 but shouldn't it be 3 + 2 = 5?
The given answer is correct. One possible sequence:
At first, you have an empty list -> then add 1101 -> no comparison needed.
Add 1222 -> go to list 1, compare it with 1101 -> add it to the end of the list -> 1 comparison.
Add 1343 -> go to list 1, compare it with 1101 and 1222 -> add it to the end of the list -> 2 comparisons.
Add 2973 -> no comparison.
Add 2588 -> go to list 3, compare it with 2973 -> 1 comparison.
So, in total, the number of comparisons is 0 + 1 + 2 + 0 + 1 = 4.
I don't know where you get 3 + 2 = 5 from. The total number of elements?
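The counting above can be checked with a few lines of Python. This is only a sketch under the same assumption as the answer: every insertion compares the new key with each key already in its chain before appending it. The function name insertion_comparisons and the list-of-chains representation are made up for illustration; the hash function itself is not modelled.

```python
def insertion_comparisons(chains):
    """Total key comparisons made while building the given chains by insertion."""
    total = 0
    for chain in chains:
        for position in range(len(chain)):
            # The key at this position was compared with every key before it.
            total += position
    return total


table = [[], [1101, 1222, 1343], [], [2973, 2588], []]
print(insertion_comparisons(table))
```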

merging linear lists - reconstruct railway network

I need to reconstruct the sequence of stations in a railway network from the sequences of single trips requested from an arbitrary station. There's no direction given in the data, but every request returns a terminal stop. The sequences of single trips can have gaps.
The (end-) result is always a linear list - forking is not allowed.
For example:
Result trips from requested station "4" :
4 - 3 - 2 - 1
4 - 1
4 - 5 - 6
4 - 8 - 9
4 - 6 - 7 - 8 - 9
manually reordered:
1 - 2 - 3 - 4
1 - 4
- 4 - 5 - 6
- 4 - 8 - 9
- 4 - 6 - 7 - 8 - 9
After merging result should be:
1 - 2 - 3 - 4 - 5 - 6 - 7 - 8 - 9
start/stop: 1, 9
Is there an algorithm to calculate the resulting "rope of pearls" list? I tried to figure it out with Perl's Graph module, but had no luck. My books on algorithms don't help either.
I think there are pathological cases where multiple solutions are possible, depending on the input data.
Maybe someone has an idea for how to solve it!
As you see in the answers, there is more than one solution. So here's a real-world dataset:
2204236 -> 2200007 -> 2200001
2204236 -> 2203095 -> 2203976 -> 2200225 -> 2200007 -> 2200001
2204236 -> 2204805 -> 2204813 -> 2204401 -> 2219633 -> 2204476 -> 2202024 -> 2202508 -> 2202110 -> 2202026
2204236 -> 2204813 -> 2204401 -> 2219633 -> 2202508 -> 2202110 -> 2202026 -> 3011047 -> 3011048 -> 3011049
2204236 -> 2204813 -> 2204401 -> 2219633 -> 2204476 -> 2202024 -> 2202508 -> 2202110 -> 2202352 -> 2202026
2204236 -> 2204813 -> 2204401 -> 2219633 -> 2204476 -> 2202024 -> 2202508 -> 2209637 -> 2202110
solution of the example data with perl:
use Graph::Directed;
use Graph::Traversal::DFS;
my $g = Graph::Directed->new;
$g->add_path(1,2,3,4);
$g->add_path(1,4);
$g->add_path(4,5,6);
$g->add_path(4,8,9);
$g->add_path(4,6,7,8,9);
print "The graph is $g\n";
my @topo = $g->toposort;
print "g toposorted = @topo\n";
Output
The graph is 1-2,1-4,2-3,3-4,4-5,4-6,4-8,5-6,6-7,7-8,8-9
g toposorted = 1 2 3 4 5 6 7 8 9
Using the other direction
$g->add_path(4,3,2,1);
$g->add_path(4,1);
$g->add_path(4,5,6);
$g->add_path(4,8,9);
$g->add_path(4,6,7,8,9);
reveals the second solution
The graph is 2-1,3-2,4-1,4-3,4-5,4-6,4-8,5-6,6-7,7-8,8-9
g toposorted = 4 3 2 1 5 6 7 8 9
Treat the lists as node links in a graph. 4-3-2-1 should mean 4 must come before 3, 3 before 2 and 2 before 1. So add arcs from 4 to 3, 3 to 2, 2 to 1.
Once you have all of those, run a topological sort (look it up on Wikipedia) on the resulting graph. This guarantees that the order you get always respects the partial orderings you are given.
The only case where you will not find a solution is when the data contradicts itself (if you have 4-3-2 and 4-2-3, there's no possible ordering).
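A minimal Python sketch of that idea, using Kahn's algorithm for the topological sort. The function name merge_trips is made up for illustration, and the smallest-station-first tie-breaking is an arbitrary choice of this sketch; when several orderings are valid it simply returns one of them.

```python
import heapq
from collections import defaultdict


def merge_trips(trips):
    """Merge ordered trips into one station order, or None if they contradict."""
    successors = defaultdict(set)
    indegree = {}
    for trip in trips:
        for station in trip:
            indegree.setdefault(station, 0)
        for a, b in zip(trip, trip[1:]):
            if b not in successors[a]:  # Count each arc only once.
                successors[a].add(b)
                indegree[b] += 1
    # Kahn's algorithm: repeatedly output a station with no unprocessed predecessor.
    ready = [s for s, d in indegree.items() if d == 0]
    heapq.heapify(ready)
    order = []
    while ready:
        station = heapq.heappop(ready)
        order.append(station)
        for nxt in successors[station]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                heapq.heappush(ready, nxt)
    # If a cycle exists, some stations never become ready.
    return order if len(order) == len(indegree) else None


trips = [[1, 2, 3, 4], [1, 4], [4, 5, 6], [4, 8, 9], [4, 6, 7, 8, 9]]
print(merge_trips(trips))
```

Contradictory input such as 4-3-2 together with 4-2-3 forms a cycle, so the function returns None in that case.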
You are right, there are multiple cases. Another good solution is 4-5-6-7-8-9-3-2-1, for your example.
The terminal stop station is an articulation node that splits the graph into multiple partitions: all nodes inside a partition are reachable from one another, while nodes in different partitions are reachable only via the known terminal stop station. The number of partitions is 2 in your example, but it may be much larger, e.g. consider the star-like structure 1 - 2, 1 - 3, 1 - 4, 1 - 5.
First of all you need to enumerate the partitions. Treat your graph as an undirected graph and run a DFS from the stop station in each direction. The first run discovers partition #1, the second run partition #2, and so on.
Then treat your graph as directed, with the stop station as the root node for all partitions, and run a topological sort (TS) on each partition.
Possible outcomes:
TS for one of the partitions fails. This means there is no solution.
The number of partitions is one and TS for it succeeds. The solution is unique.
The number of partitions is more than one and TS succeeds for all of them. This means there are multiple solutions. To get any single valid result, choose some partition and declare that it contains another terminal station. All other partitions are inserted into the first one between an arbitrary pair of nodes.
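The partition-enumeration step might look roughly like this in Python. It is a sketch under one assumption: the station that every trip starts from (4 in the question's example) is the articulation node, passed here as the hypothetical parameter hub. The function name partitions is also made up for illustration.

```python
from collections import defaultdict


def partitions(trips, hub):
    """Group stations into the parts that are connected without passing through hub."""
    adjacent = defaultdict(set)
    for trip in trips:
        for a, b in zip(trip, trip[1:]):
            adjacent[a].add(b)  # Treat every link as undirected.
            adjacent[b].add(a)
    seen = {hub}
    parts = []
    for start in sorted(adjacent[hub]):
        if start in seen:
            continue  # Already discovered by an earlier run.
        component = set()
        stack = [start]
        seen.add(start)
        while stack:  # Iterative DFS that never walks through hub.
            node = stack.pop()
            component.add(node)
            for nb in adjacent[node]:
                if nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        parts.append(sorted(component))
    return parts


trips = [[4, 3, 2, 1], [4, 1], [4, 5, 6], [4, 8, 9], [4, 6, 7, 8, 9]]
print(partitions(trips, 4))
```

Each returned partition can then be topologically sorted on its own, as described in the outcomes above.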

Finding the root value of a binary tree?

I have an array which stores the relations of values, which makes several trees something like:
So, in this case, my array would be (root, linked to)
(8,3)
(8,10)
(3,1)
(3,6)
(6,4)
(6,7)
(10,14)
(14,13)
And I'd like to set all the root values in the array to the main root of the tree (for all trees):
(8,3)
(8,1)
(8,6)
(8,4)
(8,7)
(8,10)
(8,14)
(8,13)
What algorithm should I investigate?
1) Make a list of all the unique first elements of the tuples.
2) Remove any that also appear as the second element of a tuple.
3) You'll be left with the root (8 here). Replace the first elements of all tuples with this value.
EDIT:
A more complicated approach that will work with multiple trees would be as follows.
First, convert to a parent lookup table:
1 -> 3
3 -> 8
4 -> 6
6 -> 3
7 -> 6
10 -> 8
13 -> 14
14 -> 10
Next, run "find parent with path compression" on each element:
1)
1 -> 3 -> 8
gives
1 -> 8
3 -> 8
4 -> 6
...
3)
3 -> 8
4)
4 -> 6 -> 3 -> 8
gives
1 -> 8
3 -> 8
4 -> 8
6 -> 8
7 -> 6
...
6)
6 -> 8 (already done)
7)
7 -> 6 -> 8
etc.
Result:
1 -> 8
3 -> 8
4 -> 8
6 -> 8
7 -> 8
...
Then convert this back to the tuple list:
(8,1)(8,3)(8,4)...
The find parent with path compression algorithm is as find_set would be for disjoint set forests, e.g.
int find_set(int x) const
{
    Element& element = get_element(x);
    int& parent = element.m_parent;
    if(parent != x)
    {
        parent = find_set(parent);
    }
    return parent;
}
The key point is that path compression helps you avoid a lot of work. In the above, for example, when you do the lookup for 4, you store 6 -> 8, which makes later lookups referencing 6 faster.
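For comparison, here is a small Python sketch of the same parent-lookup idea applied to the tuple list from the question. The function names relabel_with_roots and find_root are made up for illustration, and the sketch assumes every node has at most one parent and the data contains no cycles.

```python
def relabel_with_roots(edges):
    """Replace the first element of each (parent, child) tuple with the tree root."""
    parent = {child: par for par, child in edges}  # Parent lookup table.

    def find_root(x):
        path = []
        while x in parent:  # Climb until reaching a node without a parent.
            path.append(x)
            x = parent[x]
        for node in path:   # Path compression: point every visited node at the root.
            parent[node] = x
        return x

    return [(find_root(child), child) for _, child in edges]


edges = [(8, 3), (8, 10), (3, 1), (3, 6), (6, 4), (6, 7), (10, 14), (14, 13)]
print(relabel_with_roots(edges))
```

Because each lookup starts from a child and climbs to whichever root it belongs to, the same function handles several trees in one list.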
So assume you have a list of (parent, child) tuples representing the edges:
def find_root(ls):
    parents = [node[0] for node in ls]
    children = {node[1] for node in ls}
    # A root appears as a parent but never as a child.
    # Using a set keeps a root from being counted once per edge it starts.
    roots = {p for p in parents if p not in children}
    if len(roots) != 1:
        return -1  # failure, the tree is not formed well
    root = roots.pop()
    for node_index in range(len(ls)):
        ls[node_index] = (root, ls[node_index][1])
    return ls
