//@version=5
indicator('Oversold/overbought', overlay=true)
length12 = input.int(12, minval=1, title='Length 12')
length6 = input.int(6, minval=1, title='Length 6')
k = input.int(3, minval=1)
d = input.int(3, minval=1)
day_stoch6 = ta.stoch(close, length6, k, d)
hour1_stoch12 = request.security('H', 'D', ta.stoch(close, length12, k, d))
day_oversold6 = day_stoch6 < 20
hour1_oversold12 = hour1_stoch12 < 20
day_overbought6 = day_stoch6 > 80
hour1_overbought12 = hour1_stoch12 > 80
plotshape(day_oversold6 and hour1_overbought12, location=location.bottom, color=color.new(color.red, 0), text="s")
plotshape(day_overbought6 and hour1_oversold12, location=location.bottom, color=color.new(color.green, 0), text="l")
This implements a Stochastic oscillator and plots shapes based on overbought/oversold conditions on different timeframes.
The script calculates the Stochastic oscillator with two different lengths (12 and 6), on the daily and the 1-hour timeframes. It then checks the oscillator against fixed thresholds (above 80 is overbought, below 20 is oversold). If the daily length-6 stochastic is oversold while the 1-hour length-12 stochastic is overbought, the script plots a shape labelled "s" at the bottom of the chart; for the reverse condition it plots "l".
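For reference, here is a rough pandas sketch of the stochastic arithmetic involved (this is the textbook %K/%D formula as my own illustration, not Pine's ta.stoch internals; the function and names are mine):

import pandas as pd

def stochastic(close, high, low, length, smooth_k=3, smooth_d=3):
    # %K: position of the close within the high/low range of the last `length` bars
    lowest = low.rolling(length).min()
    highest = high.rolling(length).max()
    raw_k = 100 * (close - lowest) / (highest - lowest)
    k = raw_k.rolling(smooth_k).mean()  # smoothed %K
    d = k.rolling(smooth_d).mean()      # %D: moving average of %K
    return k, d

# hypothetical usage: k6, d6 = stochastic(df['close'], df['high'], df['low'], 6)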
I saved it and no error was detected in the script, then I added it to the chart. I expected the system to plot the mark 's' or 'l' at the bottom of a bar when the condition is met, but the outcome is not what I expected. Kindly assist.
Problem
I have a set of events, some of them connected, and these connections define an order: events must be held in the defined order. A connection may carry a min and/or a max requirement for the distance between the connected events. Let the distance be in days.
I use a directed acyclic graph to represent my model.
I need to place these events on a given number of days, respecting the defined order and the min/max requirements. The distribution should tend to be even, with the events stretched over the whole given range.
What approaches or algorithms would you suggest for solving this problem? I tried topological sorting and constraint ordering, but with little to no result.
Example
We have a set of events a, b, c with the following connections: a -> b, b -> c, a -> c.
The given number of days for the distribution is 7.
a. Without any requirements for the distance between connected events.
Then the best solution would be
1 2 3 4 5 6 7
a     b     c
b. With a requirement that the distance in days between events (a, b) is [1, 2].
Then the best solution would be
1 2 3 4 5 6 7
a   b       c
c. With requirements that the distance between events (a, b) is [1, 2] and between events (a, c) is <= 4.
Then the best solution would be
1 2 3 4 5 6 7
a   b   c
d. With requirements that the distance between events (a, b) is [1, 2], between events (a, c) is <= 4, and between events (b, c) is >= 3.
Then the best solution would be
1 2 3 4 5 6 7
a b     c
EDIT:
There may be multiple events per day:
if the number of events is greater than the number of days for distribution.
if we have a max = 0 requirement.
If there are several suitable solutions, the best one is the one where the distance between each event and its neighbours is approximately the same. We aim for the distance between events to be DAYS_FOR_DISTRIBUTION / NUMBER_OF_EVENTS, where DAYS_FOR_DISTRIBUTION > NUMBER_OF_EVENTS.
If there are several suitable solutions with the same distances between events, then the best one is the left-shifted solution.
Find R = all events that have no in-edges
LOOP while R contains one or more events
    SELECT N from R with the largest number of out edges
    IF first time through
        Place N on day 1
    ELSE
        Place N in the middle of the largest gap between events
    LOOP M over descendants of N in required order
        Place M as far from other events as possible, within M's allowed range
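A rough Python sketch of this heuristic (all names are mine; it assumes the intersected day range is never empty, and it is a sketch of the idea rather than a hardened implementation):

from collections import defaultdict

def greedy_place(edges, days):
    # edges: {(u, v): (min_gap, max_gap)}, with None meaning "no bound"
    succ, preds, nodes = defaultdict(list), defaultdict(list), set()
    for (u, v) in edges:
        succ[u].append(v)
        preds[v].append(u)
        nodes |= {u, v}
    placed = {}
    roots = sorted((n for n in nodes if not preds[n]),
                   key=lambda n: -len(succ[n]))       # most out-edges first
    for idx, root in enumerate(roots):
        if idx == 0:
            placed[root] = 1                          # first root on day 1
        else:
            used = sorted(placed.values())
            bounds = [1] + used + [days]
            lo, hi = max(zip(bounds, bounds[1:]), key=lambda g: g[1] - g[0])
            placed[root] = (lo + hi) // 2             # middle of the largest gap
        ready = [m for m in succ[root]
                 if all(p in placed for p in preds[m])]
        while ready:                                  # descendants in required order
            m = ready.pop(0)
            if m in placed:
                continue
            lo, hi = 1, days                          # intersect predecessor ranges
            for p in preds[m]:
                mn, mx = edges[(p, m)]
                lo = max(lo, placed[p] + (mn if mn is not None else 0))
                hi = min(hi, placed[p] + (mx if mx is not None else days))
            used = list(placed.values())
            placed[m] = max(range(lo, hi + 1),        # farthest from placed events
                            key=lambda d: min(abs(d - u) for u in used))
            ready += [w for w in succ[m]
                      if w not in placed and all(p in placed for p in preds[w])]
    return placed

# e.g. example c from the question: (a, b) in [1, 2], (a, c) <= 4
print(greedy_place({('a', 'b'): (1, 2), ('a', 'c'): (None, 4)}, 7))
# -> {'a': 1, 'b': 3, 'c': 5}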
Method.
Uses a constraint library to generate all event orderings satisfying the event constraints, without regard to (un)evenness, then iteratively finds among those solutions the one with minimal unevenness.
Evenness?
If unevenness is defined by looking at the differences between all the days and the event days and returning the max minus the min difference, then a point needing clarification in your answer of:
1 2 3 4 5 6 7
a b     c
is that it has the values off to the left w.r.t. the days.
A better answer might be:
1 2 3 4 5 6 7
  a b     c
With the one-day shift to the right, there is less of a stretch from day 7 to any event day.
If you think the above is equivalent, then you need to better define evenness - is it evenness over the extent of the event days, perhaps?
STOP PRESS! Evenness has been edited by author and is now better described.
Code
The source is set to run your example if you just hit return on each prompt.
# -*- coding: utf-8 -*-
"""
Even distribution of directed acyclic graph with respect to edge length
https://stackoverflow.com/questions/71532362/even-distribution-of-directed-acyclic-graph-with-respect-to-edge-length
Created on Sat Mar 26 08:53:19 2022

#author: paddy
"""
# https://pypi.org/project/python-constraint/
from constraint import Problem
from itertools import product

#%% Inputs
days = input("Days: int = ")
try:
    days = int(days.strip())
except:
    days = 7
print(f"Using {days=}")

events = input("events: space_separated = ")
events = events.strip().split()
if not events:
    events = list("abc")
print(f"Using {events=}")

constraint_funcs = []
while True:
    constr = input("constraint: string_expression (. to end) = ").strip()
    if not constr:
        constraint_funcs = ["1 <= (b - a) <= 2",
                            "(c - a) <= 4",
                            "(c - b) >= 3"]
        break
    if constr == '.':
        break
    constraint_funcs.append(constr)
print(f"\nUsing {constraint_funcs=}")

#%% Constraint Setup
print()
problem = Problem()
problem.addVariables(events, range(1, days + 1))
for constr in constraint_funcs:
    constr_events = sorted(set(events)
                           & set(compile(constr, '<input>', mode='eval').co_names))
    expr = f"lambda {', '.join(constr_events)}: {constr}"
    print(f"  Add Constraint {expr!r}, On {constr_events}")
    func = eval(expr)
    problem.addConstraint(func, constr_events)

#%% Solution optimisation for "evenness"
print()

def unevenness(event_days: list[int], all_days: int):
    "(Max - min diff between ordered events, leftmost event)"
    maxdiff, mindiff = -1, all_days + 1
    for event, next_event in zip(event_days, event_days[1:]):
        diff = next_event - event
        maxdiff = max(maxdiff, diff)
        mindiff = min(mindiff, diff)
    return maxdiff - mindiff, event_days[0]
def printit(solution, all_days):
    drange = range(1, all_days + 1)
    print(' '.join(str(i)[0] for i in drange))
    for event, day in sorted(solution.items(), key=lambda kv: kv[::-1]):
        print('  ' * (day - 1) + event)  # two spaces per day, aligning names with the day header
    print()
current_best = None, 9e99
for ans in problem.getSolutionIter():
    unev = unevenness(sorted(ans.values()), days)
    if current_best[0] is None or unev < current_best[1]:
        current_best = ans, unev
        print("Best so far:")
        printit(ans, days)
The unevenness function has been updated. A better function might minimise the standard deviation of the gaps between successive events, but for this example this works.
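For instance, a drop-in variant along those lines could be (a sketch; the name unevenness_sd is mine):

from statistics import pstdev

def unevenness_sd(event_days: list[int], all_days: int):
    "(Standard deviation of gaps between successive events, leftmost event)"
    gaps = [b - a for a, b in zip(event_days, event_days[1:])]
    return (pstdev(gaps) if gaps else 0.0), event_days[0]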
Output
Sample run using your constraints, but longer event names.
Days: int = 7
Using days=7
events: space_separated = arch belt card
Using events=['arch', 'belt', 'card']
constraint: string_expression (. to end) = 1 <= (belt - arch) <= 2
constraint: string_expression (. to end) = (card - arch) <= 4
constraint: string_expression (. to end) = (card - belt) >= 3
constraint: string_expression (. to end) = .
Using constraint_funcs=['1 <= (belt - arch) <= 2', '(card - arch) <= 4', '(card - belt) >= 3']
Add Constraint 'lambda arch, belt: 1 <= (belt - arch) <= 2', On ['arch', 'belt']
Add Constraint 'lambda arch, card: (card - arch) <= 4', On ['arch', 'card']
Add Constraint 'lambda belt, card: (card - belt) >= 3', On ['belt', 'card']
Best so far:
1 2 3 4 5 6 7
    arch
      belt
            card

Best so far:
1 2 3 4 5 6 7
  arch
    belt
          card

Best so far:
1 2 3 4 5 6 7
arch
  belt
        card
Note: The first letter of event names align with the day columns.
The second item in the return tuple of the unevenness function is the day of the earliest (left-most) event. If the spread of the events is equal, minimising this will tend to favour the solution further to the left.
Suppose I have a function phi(x1,x2) = k1*x1 + k2*x2 which I have evaluated over a grid, where the grid is a square with boundaries at -100 and 100 on both the x1 and x2 axes, with some step size, say h = 0.1. Now I want to calculate this sum over the grid, which I'm struggling with:

M(x) = 1/(pi*D) * sum_{m in Z^2} phi(m*h) * exp(-|x - m*h|^2 / (h^2*D)), where x = (x1, x2) and m = (m1, m2)
What I was trying:
clear all
close all
clc
D=1; h=0.1;
D1 = -100;
D2 = 100;
X = D1 : h : D2;
Y = D1 : h : D2;
[x1, x2] = meshgrid(X, Y);
k1=2;k2=2;
phi = k1.*x1 + k2.*x2;
figure(1)
surf(X,Y,phi)
m1=-500:500;
m2=-500:500;
[M1,M2,X1,X2]=ndgrid(m1,m2,X,Y)
sys=@(m1,m2,X,Y) (k1*h*m1+k2*h*m2).*exp((-([X Y]-h*[m1 m2]).^2)./(h^2*D))
sum1=sum(sys(M1,M2,X1,X2))
MATLAB gives an error in ndgrid; any idea how I should code this?
MATLAB shows:
Error using repmat
Requested 10001x1001x2001x2001 (298649.5GB) array exceeds maximum array size preference. Creation of arrays greater
than this limit may take a long time and cause MATLAB to become unresponsive. See array size limit or preference
panel for more information.
Error in ndgrid (line 72)
varargout{i} = repmat(x,s);
Error in new_try1 (line 16)
[M1,M2,X1,X2]=ndgrid(m1,m2,X,Y)
Judging by your comments and your code, it appears as though you don't fully understand what the equation is asking you to compute.
To obtain the value M(x1,x2) at some given (x1,x2), you have to compute that sum over Z2. Of course, using a numerical toolbox such as MATLAB, you could only ever hope to compute over some finite range of Z2. In this case, since (x1,x2) covers the range [-100,100] x [-100,100], and h=0.1, it follows that mh covers the range [-1000, 1000] x [-1000, 1000]. Example: m = (-1000, -1000) gives you mh = (-100, -100), which is the bottom-left corner of your domain. So really, phi(mh) is just phi(x1,x2) evaluated on all of your discretised points.
As an aside, since you need to compute |x-hm|^2, you can treat x = x1 + i x2 as a complex number to make use of MATLAB's abs function. If you were strictly working with vectors, you would have to use norm, which is OK too, but a bit more verbose. Thus, for some given x=(x10, x20), you would compute x-hm over the entire discretised plane as (x10 - x1) + i (x20 - x2).
Finally, you can compute one entry of M at a time:
D = 1; h = 0.1;
D1 = -100;
D2 = 100;
X = (D1 : h : D2);  % X is in rows (dim 2)
Y = (D1 : h : D2)'; % Y is in columns (dim 1)
k1 = 2; k2 = 2;
phi = k1*X + k2*Y;
M = zeros(length(Y), length(X));
for j = 1:length(X)
    for i = 1:length(Y)
        % treat (x - hm) as a complex number
        x_hm = (X(j)-X) + 1i*(Y(i)-Y); % this computes x-hm for all m
        M(i,j) = 1/(pi*D) * sum(sum(phi .* exp(-abs(x_hm).^2/(h^2*D)), 1), 2);
    end
end
By the way, this computation takes quite a long time. You can consider increasing h, shrinking the domain [D1, D2], or changing all three.
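If a Python comparison helps, here is my rough NumPy translation of the loop above, on a deliberately smaller domain ([-5, 5]) so it finishes quickly:

import numpy as np

D, h = 1.0, 0.1
k1, k2 = 2.0, 2.0
X = np.arange(-5, 5 + h, h)      # x1 samples (these are also the points m1*h)
Y = np.arange(-5, 5 + h, h)      # x2 samples (also the points m2*h)
x1, x2 = np.meshgrid(X, Y)
phi = k1 * x1 + k2 * x2          # phi evaluated at every grid point m*h

M = np.empty_like(phi)
for i in range(len(Y)):
    for j in range(len(X)):
        # |x - m*h|^2 for all m at once, via the same complex-number trick
        x_hm = (X[j] - x1) + 1j * (Y[i] - x2)
        M[i, j] = np.sum(phi * np.exp(-np.abs(x_hm)**2 / (h**2 * D))) / (np.pi * D)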
I have to keep the RGB of an image in 3 different vectors (one for each colour). Then, for each vector, I count the number of appearances of every pixel value in the [0;255] interval. I have to split this [0;255] (pixel range) interval into N intervals and compute the sum over every interval, using the following formula:
[x * 256/N ; x * 256/N + 256/N)
with x <= N.
N is greater than or equal to 16.
So far my code works for N that divides 256; for the rest I get "Index out of bounds".
I think the formula implementation is somehow wrong, but I don't know how to fix it.
You get 'index out of bounds' because of the rounding in int16(256/N).
N = 16 ... int16(256/N) = 16 (no rounding)
N = 19 ... int16(256/N) = 13 (13.47 is rounded to 13)
N = 20 ... int16(256/N) = 13 (12.80 is rounded to 13)
Sometimes the result is rounded to the next smaller value and sometimes to the next greater value (that is when you get out of bounds).
Solution: use floor(256/N) instead of int16(256/N).
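To see the boundary arithmetic concretely, here is a small Python sketch (my own illustration; Python's round behaves like int16's round-to-nearest for these values):

import math

N = 20
counts = [0] * 256                     # appearance count of each value 0..255

w_round = round(256 / N)               # 13 -- what int16(256/N) gives
w_floor = math.floor(256 / N)          # 12 -- the suggested fix

# The last interval (x = N-1) shows why the rounded width overruns the
# 256-entry histogram, which is the "Index out of bounds" error:
print((N - 1) * w_round, (N - 1) * w_round + w_round)   # 247 260 -> past 256
print((N - 1) * w_floor, (N - 1) * w_floor + w_floor)   # 228 240 -> in range

# Interval sums with the floored width:
sums = [sum(counts[x * w_floor : x * w_floor + w_floor]) for x in range(N)]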
How can I generate a random number between A = 1 and B = 10 where each number has a different probability?
Example: number / probability
1 - 20%
2 - 20%
3 - 10%
4 - 5%
5 - 5%
...and so on.
I'm aware of some hard-coded workarounds which unfortunately are of no use with larger ranges, for example A = 1000 and B = 100000.
Assume we have a Rand() method which returns a random number R, 0 < R < 1. Can anyone post a code sample with a proper way of doing this? Preferably in C# / Java / ActionScript.
Build an array of 100 integers and populate it with 20 1's, 20 2's, 10 3's, 5 4's, 5 5's, etc. Then just randomly pick an item from the array.
int[] numbers = new int[100];
// populate the first 20 with the value '1'
for (int i = 0; i < 20; ++i)
{
    numbers[i] = 1;
}
// populate the rest of the array as desired.

// To get an item:
// Since your Rand() function returns 0 < R < 1
int ix = (int)(Rand() * 100);
int num = numbers[ix];
This works well if the number of items is reasonably small and your precision isn't too strict. That is, if you wanted 4.375% 7's, then you'd need a much larger array.
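To make the precision point concrete, a quick Python check (my own illustration) of how large the table must be:

from fractions import Fraction
from math import lcm

# 4.375% reduces to 7/160, so the table needs a multiple of 160 slots
p = Fraction(4375, 100000)
print(p, p.denominator)                        # 7/160 160
# With several probabilities, the minimal exact table size is the lcm
# of the reduced denominators:
probs = [Fraction(20, 100), Fraction(4375, 100000)]
print(lcm(*[q.denominator for q in probs]))    # lcm(5, 160) = 160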
There is an elegant algorithm attributed by Knuth to A. J. Walker (Electronics Letters 10, 8 (1974), 127-128; ACM Trans. Math Software 3 (1977), 253-256).
The idea is that if you have a total of k * n balls of n different colors, then it is possible to distribute the balls in n containers such that container no. i contains balls of color i and at most one other color. The proof is by induction on n. For the induction step pick the color with the least number of balls.
In your example n = 10. Multiply the probabilities by a suitable m such that they are all integers. So, maybe m = 100, and you have 20 balls of color 0, 20 balls of color 1, 10 balls of color 2, 5 balls of color 3, etc. So, k = 10.
Now generate a table of dimension n, with each entry holding a probability (the ratio of balls of color i vs. the other color) and that other color.
To generate a random ball, generate a random floating-point number r in the range [0, n). Let i be the integer part (floor of r) and x the excess (r – i).
if (x < table[i].probability) output i
else output table[i].other
The algorithm has the advantage that for each random ball you only make a single comparison.
Let me work out an example (same as Knuth).
Consider simulating throwing a pair of dice.
So P(2) = 1/36, P(3) = 2/36, P(4) = 3/36, P(5) = 4/36, P(6) = 5/36, P(7) = 6/36, P(8) = 5/36, P(9) = 4/36, P(10) = 3/36, P(11) = 2/36, P(12) = 1/36.
Multiply by 36 * 11 to get 396 balls: 11 of color 2, 22 of color 3, 33 of color 4, …, 11 of color 12.
We have k = 396 / 11 = 36.
Table[2] = (11/36, color 4)
Table[12] = (11/36, color 10)
Table[3] = (22/36, color 5)
Table[11] = (22/36, color 5)
Table[4] = (8/36, color 9)
Table[10] = (8/36, color 6)
Table[5] = (16/36, color 6)
Table[9] = (16/36, color 8)
Table[6] = (7/36, color 8)
Table[8] = (6/36, color 7)
Table[7] = (36/36, color 7)
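For what it's worth, here is a compact Python sketch of the table construction and the single-comparison draw described above (this follows Vose's formulation of Walker's method; the function names are mine):

import random

def build_alias_table(weights):
    # Scale the weights so the mean is 1 ball per container
    n = len(weights)
    total = float(sum(weights))
    prob = [w * n / total for w in weights]
    threshold = [1.0] * n          # share of container i holding color i
    other = list(range(n))         # the "other color" stored in container i
    small = [i for i, p in enumerate(prob) if p < 1.0]
    large = [i for i, p in enumerate(prob) if p >= 1.0]
    while small and large:
        s, l = small.pop(), large.pop()
        threshold[s] = prob[s]     # container s: prob[s] of color s ...
        other[s] = l               # ... topped up with color l
        prob[l] -= 1.0 - prob[s]   # color l donated that many balls
        (small if prob[l] < 1.0 else large).append(l)
    return threshold, other

def draw(threshold, other):
    r = random.random() * len(threshold)
    i = int(r)                     # container = integer part of r
    return i if (r - i) < threshold[i] else other[i]

# e.g. the question's distribution, indices 0..9 standing for the numbers
# 1..10 (the weights after '5' are made up so the list totals 100):
threshold, other = build_alias_table([20, 20, 10, 5, 5, 10, 10, 10, 5, 5])
value = draw(threshold, other) + 1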
Assuming that you have a function p(n) that gives you the desired probability for a random number:
r = rand() // a random number between 0 and 1
for i in A to B do
    if r < p(i)
        return i
    r = r - p(i)
done
A faster way is to create an array of (B - A) * 100 elements and populate it with numbers from A to B such that the ratio of each number's occurrence count to the size of the array is its probability. You can then generate a uniform random index into the array and read the random number directly.
Map your uniform random results to the required outputs according to the probabilities.
E.g., for your example:
If `0 <= Rand() <= 0.2`: result = 1.
If `0.2 < Rand() <= 0.4`: result = 2.
If `0.4 < Rand() <= 0.5`: result = 3.
If `0.5 < Rand() <= 0.55`: result = 4.
If `0.55 < Rand() <= 0.6`: result = 5.
...
Here's an implementation of Knuth's algorithm. As discussed in some of the other answers, it works by:
1) creating a table of summed frequencies,
2) generating a random integer,
3) rounding it up with the ceiling function,
4) finding the "summed" range within which the random number falls, and outputting the original array entry based on it.
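(The answer's code is not reproduced here; the following is my own minimal Python sketch of those four steps, with a binary search standing in for the range scan and a random integer replacing the float-plus-ceiling of steps 2-3.)

import bisect
import random

def make_sampler(values, freqs):
    # 1) build the table of summed (cumulative) frequencies
    cum, total = [], 0
    for f in freqs:
        total += f
        cum.append(total)
    values = list(values)
    def draw():
        r = random.randint(1, total)    # 2)-3) a random integer in [1, total]
        i = bisect.bisect_left(cum, r)  # 4) first cumulative sum >= r
        return values[i]
    return draw

# e.g. numbers 1..10; the weights after '5' are made up so the list totals 100
draw = make_sampler(range(1, 11), [20, 20, 10, 5, 5, 10, 10, 10, 5, 5])
print(draw())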
Inverse Transform
In probability speak, a cumulative distribution function F(x) returns the probability that any randomly drawn value, call it X, is <= some given value x. For instance, if I did F(4) in this case, I would get .6, because the running sum of probabilities in your example is {.2, .4, .5, .55, .6, .65, ...}. I.e. the probability of randomly getting a value less than or equal to 4 is .6. However, what I actually want to know is the inverse of the cumulative probability function, call it F_inv. I want to know what the x value is, given the cumulative probability. I want to pass in F_inv(.6) and get back 4. That is why this is called the inverse transform method.
So, in the inverse transform method, we are basically trying to find the interval in the cumulative distribution in which a random Uniform (0,1) number falls. This works out to the algorithm that perreal and icepack posted. Here is another way to state it in terms of the cumulative distribution function
Generate a random number U
for x in A .. B
    if U <= F(x) then return x
Note that it might be more efficient to have the loop go from B down to A and check if U >= F(x), if the smaller probabilities come at the beginning of the distribution.
I am having trouble fully understanding the K-Means++ algorithm. I am interested in exactly how the first k centroids are picked, namely the initialization, as the rest is like the original K-Means algorithm.
Is the probability function used based on distance or Gaussian? Or is the point most distant from the other centroids picked as a new centroid?
I would appreciate a step-by-step explanation and an example. The one in Wikipedia is not clear enough. Also, very well commented source code would help. If you are using 6 arrays, then please tell us which one is for what.
Interesting question. Thank you for bringing this paper to my attention - K-Means++: The Advantages of Careful Seeding
In simple terms, cluster centers are initially chosen at random from the set of input observation vectors, where the probability of choosing vector x is high if x is not near any previously chosen centers.
Here is a one-dimensional example. Our observations are [0, 1, 2, 3, 4]. Let the first center, c1, be 0. The probability that the next cluster center, c2, is x is proportional to ||c1-x||^2. So, P(c2 = 1) = 1a, P(c2 = 2) = 4a, P(c2 = 3) = 9a, P(c2 = 4) = 16a, where a = 1/(1+4+9+16).
Suppose c2=4. Then, P(c3 = 1) = 1a, P(c3 = 2) = 4a, P(c3 = 3) = 1a, where a = 1/(1+4+1).
I've coded the initialization procedure in Python; I don't know if this helps you.
import scipy  # older SciPy versions exposed array/inner/rand at the top level, as used below

def initialize(X, K):
    C = [X[0]]
    for k in range(1, K):
        D2 = scipy.array([min([scipy.inner(c - x, c - x) for c in C]) for x in X])
        probs = D2 / D2.sum()
        cumprobs = probs.cumsum()
        r = scipy.rand()
        for j, p in enumerate(cumprobs):
            if r < p:
                i = j
                break
        C.append(X[i])
    return C
EDIT with clarification: The output of cumsum gives us boundaries to partition the interval [0,1]. These partitions have length equal to the probability of the corresponding point being chosen as a center. So then, since r is uniformly chosen between [0,1], it will fall into exactly one of these intervals (because of break). The for loop checks to see which partition r is in.
Example:
probs = [0.1, 0.2, 0.3, 0.4]
cumprobs = [0.1, 0.3, 0.6, 1.0]
if r < cumprobs[0]:
# this event has probability 0.1
i = 0
elif r < cumprobs[1]:
# this event has probability 0.2
i = 1
elif r < cumprobs[2]:
# this event has probability 0.3
i = 2
elif r < cumprobs[3]:
# this event has probability 0.4
i = 3
One Liner.
Say we need to select 2 cluster centers. Instead of selecting them all randomly (as we do in plain k-means), we select the first one randomly, then find the points that are farthest from the first center (these points most probably do not belong to the first cluster center, as they are far from it) and assign the second cluster center near those far points.
I have prepared a full source implementation of k-means++ based on the book "Programming Collective Intelligence" by Toby Segaran and the k-means++ initialization provided here.
Indeed there are two distance functions here. For the initial centroids a standard one based on numpy.inner is used, and then for fixing the centroids the Pearson one is used. Maybe the Pearson one can also be used for the initial centroids; they say it is better.
from __future__ import division

def readfile(filename):
    lines = [line for line in file(filename)]
    rownames = []
    data = []
    for line in lines:
        p = line.strip().split(' ')  # single space as separator
        # First column in each row is the rowname
        rownames.append(p[0])
        # The data for this row is the remainder of the row
        data.append([float(x) for x in p[1:]])
    return rownames, data
from math import sqrt

def pearson(v1, v2):
    # Simple sums
    sum1 = sum(v1)
    sum2 = sum(v2)
    # Sums of the squares
    sum1Sq = sum([pow(v, 2) for v in v1])
    sum2Sq = sum([pow(v, 2) for v in v2])
    # Sum of the products
    pSum = sum([v1[i] * v2[i] for i in range(len(v1))])
    # Calculate r (Pearson score)
    num = pSum - (sum1 * sum2 / len(v1))
    den = sqrt((sum1Sq - pow(sum1, 2) / len(v1)) * (sum2Sq - pow(sum2, 2) / len(v1)))
    if den == 0: return 0
    return 1.0 - num / den
import numpy
from numpy.random import *

def initialize(X, K):
    C = [X[0]]
    for _ in range(1, K):
        #D2 = numpy.array([min([numpy.inner(c-x,c-x) for c in C]) for x in X])
        D2 = numpy.array([min([numpy.inner(numpy.array(c) - numpy.array(x),
                                           numpy.array(c) - numpy.array(x))
                               for c in C]) for x in X])
        probs = D2 / D2.sum()
        cumprobs = probs.cumsum()
        r = rand()
        i = -1
        for j, p in enumerate(cumprobs):
            if r < p:
                i = j
                break
        C.append(X[i])
    return C

def kcluster(rows, distance=pearson, k=4):
    # Centroids come from the k-means++ initialization above.
    # NOTE: the middle of this function was garbled in the original post; the
    # standard k-means assignment/update loop from Segaran's kcluster is
    # restored here around the fragment that survived.
    clusters = initialize(rows, k)
    lastmatches = None
    bestmatches = None
    for t in range(100):
        bestmatches = [[] for i in range(k)]
        # Find the closest centroid for each row
        for j in range(len(rows)):
            row = rows[j]
            bestmatch = 0
            for i in range(k):
                d = distance(clusters[i], row)
                if d < distance(clusters[bestmatch], row):
                    bestmatch = i
            bestmatches[bestmatch].append(j)
        # If the results are the same as last time, this is complete
        if bestmatches == lastmatches:
            break
        lastmatches = bestmatches
        # Move the centroids to the average of their members
        for i in range(k):
            avgs = [0.0] * len(rows[0])
            if len(bestmatches[i]) > 0:
                for rowid in bestmatches[i]:
                    for m in range(len(rows[rowid])):
                        avgs[m] += rows[rowid][m]
                for j in range(len(avgs)):
                    avgs[j] /= len(bestmatches[i])
                clusters[i] = avgs
    return bestmatches
rows, data = readfile('/home/toncho/Desktop/data.txt')
kclust = kcluster(data, k=4)
print "Result:"
for c in kclust:
    out = ""
    for r in c:
        out += rows[r] + ' '
    print "[" + out[:-1] + "]"
print 'done'
data.txt:
p1 1 5 6
p2 9 4 3
p3 2 3 1
p4 4 5 6
p5 7 8 9
p6 4 5 4
p7 2 5 6
p8 3 4 5
p9 6 7 8