What is the meaning of percentage change in the SSML prosody pitch attribute?

The pitch attribute of the SSML prosody element can take a value representing a relative change, which may be a percentage (e.g. +50% or -30%).
What should that be a percentage of? Is it the Hz value of the current pitch (so an octave interval (i.e. +12st) is the same as +100%)? Or is it related to something else, such as the range between x-low and x-high (so x-low +50% is the same as medium, then another +50% is x-high)? Is it simply left up to the implementers to decide?
I understand that SSML is not a system for marking up music, and that this represents the "baseline pitch" of the utterance, rather than the exact pitch at which the whole utterance is to be delivered. I just wish to know whether certain expressions can be considered equivalent.

Yes, my understanding is that the percentage is based on the current pitch, so -50% is an octave down and +100% is an octave up.
The ratio for each semitone is calculated as a power of the 12th root of 2. So the first semitone above is a ratio of 1.0595 or a percent change of 5.95%, the second is 1.0595^2 which results in a percent change of 12.25%, etc. The first semitone below is -5.61% because it decreases as the inverse of the 12th root of 2.
In general, the relative percent change for a shift of n semitones is (2^(n/12) - 1) * 100, or approximately (1.0595^n - 1) * 100, for integer n (negative n for semitones below).
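For a quick sanity check of these equivalences, here is a minimal Python sketch (not tied to any particular SSML engine):
import math

def semitones_to_percent(n):
    # Relative pitch change, in percent, for a shift of n semitones.
    return (2 ** (n / 12) - 1) * 100

def percent_to_semitones(p):
    # Semitone shift corresponding to a relative change of p percent.
    return 12 * math.log2(1 + p / 100)

print(semitones_to_percent(12))    # 100.0 -> +12st is the same as +100%
print(semitones_to_percent(-12))   # -50.0 -> -12st is the same as -50%
print(semitones_to_percent(1))     # ~5.95 -> one semitone up
print(percent_to_semitones(50))    # ~7.02 -> +50% is not half an octave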

Algorithm to find areas of support in a candlestick chart

I am in the process of designing an algorithm that will calculate regions in a candlestick chart where strong areas of support exist. An "area of support" in this case is defined as an area in the chart where the price of a stock rises by a large amount in a short period of time. (Please see the diagram below; the blue dots represent these strong areas of support.)
The data I am working with is a list of over 6000 TOHLC (timestamp, open price, high price, low price, close price) values. For example, the first entry in this list of data is:
[1555286400, 83.7, 84.63, 83.7, 84.27]
The way I have structured the algorithm to work is as follows:
1.) The list of 6000+ TOHLC values is split into sub-lists of 30 TOHLC values (30 is a number that I arbitrarily chose). The lowest low price (LLP) is then obtained from each of these sub-lists. The purpose behind using this method is to find areas in the chart where prices dip.
2.) The next step is to determine how high the price rose from each of these lows. For this, I take the next 30 candlestick values from the low and determine what the highest high price (HHP) is. Then, if HHP / LLP >= 1.03, the low price is accepted, otherwise it is discarded. Again, 1.03 is a value that I arbitrarily chose, by analysing the stock chart manually and determining how much the price rose on average from these lows.
The blue dots in the chart above represent the areas of support accepted by the algorithm. It appears to be working well, in terms of what I am trying to achieve.
So the question I have is: does anyone have any improvements they can suggest for this algorithm, or point out any faults in it?
Thanks!
I may have misunderstood, but from your explanation it seems like you are doing your calculation in separate 30-element sub-lists and then combining the results.
So, what if the LLP is the 30th element of sublist N and the HHP is the 1st element of sublist N+1? If you have taken that into account, then it's fine.
If you haven't taken that into account, I would suggest a moving-window approach to reading the data: start from the 0th element of the 6000+ TOHLC list with a window size of 30 and slide it along 1 element at a time, as in the sketch below. This way, you won't miss any values.
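Something like this minimal Python sketch (the 30-candle window and the 1.03 threshold are the question's values; deduplicating by timestamp is my own choice):
def find_supports(candles, window=30, rise=1.03):
    # candles: list of [timestamp, open, high, low, close] rows.
    supports = set()
    for i in range(len(candles) - window):
        lows = [c[3] for c in candles[i:i + window]]
        llp = min(lows)
        k = i + lows.index(llp)                 # absolute index of the lowest low
        after = candles[k + 1:k + 1 + window]   # the next 30 candles after the low
        if after and max(c[2] for c in after) / llp >= rise:
            supports.add((candles[k][0], llp))  # keyed by timestamp to avoid duplicates
    return sorted(supports)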
Some of the selected blue dots sit in deeper dips than others. Why is that? I would separate those out into another class; if you store the accepted lows as objects, store the size of the dip as well.
Floating-point numbers are not recommended in finance. If possible, I'd take a different approach and work solely with integers (e.g. prices in cents). It may not bother you or your project right now, but it will start to produce false results as the numbers accumulate.
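For illustration, a minimal sketch of the integer-price idea in Python (the cent scale and the string input format are assumptions):
def to_cents(price_str):
    # Represent prices as integer cents to avoid float rounding drift.
    dollars, _, cents = price_str.partition(".")
    return int(dollars) * 100 + int(cents.ljust(2, "0")[:2])

llp = to_cents("83.70")         # 8370
hhp = to_cents("84.63")         # 8463
# Test hhp / llp >= 1.03 without floats: hhp * 100 >= llp * 103
print(hhp * 100 >= llp * 103)   # False: 846300 < 862110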

How can I normalize trending data?

Say I want to calculate the velocity of two datapoints (A and A'), each having a score, and a time published (A' is a future version of A, and has a higher score). This would be
[A'(score) - A(score)] / [A'(time published) - A(time published)]
What I want to capture are trends with high velocities. This means I want a score going from 20 to 200 having higher weight than 8500 to 9000. So I thought I'd normalize this data by dividing the scores by a baseline.
Ex. if A(score) is 2, and A'(score) is 3, the baseline is 2, so in the formula above,
A'(score) - A(score) would be (3/2 - 2/2)
However, this means that when the numbers are this low, the velocities will be very high. On the other hand,
9000/8500 - 8500/8500
produces very low velocities. (The time difference is constant in this example only; normally, time differences are variable.)
Is there any way to reduce the impact of low starting scores WHILE at the same time allowing jumps from, say, 20 to 200 being significant? Thank you.
There are two ways to look at this. Either could give you what you want.
My first thought was that your question came very close to providing your answer. You gave yourself an important hint by calling your first calculation your velocity - your rate of change of a score over time. You could then look at its acceleration - your rate of change of the velocity over time. That's:
(A''(score) - A'(score)) - (A'(score) - A(score))
Note that I'm not dividing by time, because you say the time difference is constant for each measurement; dividing would just scale every value by the same constant, which is unnecessary and probably doesn't add any clarity.
More likely, though, it seems you want to know how significant the change is from one score to the next. I suspect what you want is:
(A'(score) - A(score)) / A(score)
This is (a - b) / b, which reduces down to (a/b) - 1. If you don't care about the -1, the simplest way you can see the relevant change in your score is:
A'(score)/A(score)
This shows the rate of growth of the score from one step to the next.
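For the two jumps from the question, that ratio alone already separates the cases (a quick check):
print(200 / 20)      # 10.0  -> a large, significant jump
print(9000 / 8500)   # ~1.06 -> barely moving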
Edit, after clarification:
Given your comment, a variable rate of time makes the logic more complicated, but still do-able.
In that case, you do want to calculate velocity, as you were doing:
V = (A'(score) - A(score)) / (A'(time) - A(time))
But you want to normalize it based on the previous velocity:
result = V'/V
This then becomes similar to the "acceleration" example - it requires 3 samples to have a good idea of the rate of change of the rate of change. If you spell it out all the way, you get something like:
result = [(A''(score) - A'(score)) / (A''(time) - A'(time))] / [(A'(score) - A(score)) / (A'(time) - A(time))]
You can do some math to shove these numbers around, but there's really no prettier result than that.
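Spelled out as a small Python sketch (the three (score, time) samples are made up for illustration):
def velocity(a, b):
    # Rate of change of score between two (score, time) samples.
    return (b[0] - a[0]) / (b[1] - a[1])

A  = (20, 0.0)     # (score, time published)
A1 = (80, 1.0)
A2 = (200, 1.5)

v1 = velocity(A, A1)    # (80 - 20) / 1.0 = 60.0
v2 = velocity(A1, A2)   # (200 - 80) / 0.5 = 240.0
print(v2 / v1)          # 4.0 -> the trend is accelerating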

suitable formula/algorithm for detecting temperature fluctuations

I'm creating an app to monitor water quality. The temperature data is updated every 2 min to a Firebase real-time database. The app has two requirements:
1) It should alert the user when the temperature exceeds 33 degrees or drops below 23 degrees - this part is done.
2) It should alert the user when there is a big temperature fluctuation, analysing the data every 30 min - this part I'm confused about.
I don't know what algorithm to use to detect a big temperature fluctuation over a period of time and alert the user. Can someone help me with this?
For a period of 30 minutes, your app would give you 15 values.
If you want to figure out a big change in this data, then there is one way to do so.
You can implement the following method:
Calculate the mean and the standard deviation of the values.
Subtract the mean from each value and take the absolute value of the result.
Check whether any absolute value is greater than one standard deviation; if it is, you have a big fluctuation.
See this example for better understanding:
Let's suppose you have these values for 10 minutes:
25,27,24,35,28
First Step:
Mean = 27 (approx.)
One standard deviation = 3.8
Second Step: Absolute(Data - Mean)
abs(25-27) = 2
abs(27-27) = 0
abs(24-27) = 3
abs(35-27) = 8
abs(28-27) = 1
Third Step:
Check whether any of the subtracted results is greater than the standard deviation:
abs(35-27) gives 8 which is greater than 3.8
So, there is a big fluctuation. If all the subtracted results are less than standard deviation, then there is no fluctuation.
You can refine the result by using two or three standard deviations as the threshold instead of one.
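The whole method fits in a few lines of Python (statistics.pstdev, the population standard deviation, roughly reproduces the 3.8 figure above):
import statistics

def big_fluctuations(readings, n_sigma=1.0):
    # Return the readings that deviate from the mean by more than n_sigma std devs.
    mean = statistics.mean(readings)
    sigma = statistics.pstdev(readings)
    return [r for r in readings if abs(r - mean) > n_sigma * sigma]

print(big_fluctuations([25, 27, 24, 35, 28]))   # [35]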
Start by defining what you mean by fluctuation.
You don't say what temperature scale you're using. Fahrenheit, Celsius, Rankine, or Kelvin?
Your sampling rate is a new data value every two minutes. Do you define fluctuation as the absolute value of the difference between the last point and current value? That's defensible.
If the max allowable absolute value is some multiple of your 33 - 23 = 10 degree range, you're in business.
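That last-point-versus-current definition is nearly a one-liner in Python (the 5-degree threshold is only an example):
def fluctuation_alerts(readings, max_jump=5.0):
    # Flag any jump between consecutive samples larger than max_jump degrees.
    return [(prev, cur) for prev, cur in zip(readings, readings[1:])
            if abs(cur - prev) > max_jump]

print(fluctuation_alerts([25, 27, 24, 35, 28]))   # [(24, 35), (35, 28)]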

FMOD frequency analysis/normalisation

I am using the FMOD library to apply FFT to an audio stream, providing me with a constantly updating fixed number of frequency bins. Each bin represents an equal frequency range, with a value between 0 and 1 to represent the intensity of this range from the processed audio. FMOD documentation states that these values can be represented in decibels, where val is the value between 0 and 1:
Decibels = 10.0f * (float)log10(val) * 2.0f
I am attempting to make an automated strobe-like beat detecting visualisation. So far, I test at a constant interval to see whether a particular frequency bin's intensity value surpasses a specified boundary - if this is the case, the strobe flashes. Although a pretty crude way of doing this, it works fairly effectively for my requirements.
However, this specified boundary only works effectively when the system/music player volumes are at maximum. When I reduce either volume, the strobe sensitivity drops and it becomes either very inaccurate or stops flashing completely. I assume that I need to normalise the data in some way so the analysis is performed independently of volume, though if I scale the data by 1 / (value of the largest bin), the largest value is always maxed out. This surpasses the specified boundary permanently, causing the strobe to flash indefinitely. I can't think how else this can be achieved and have been on a mental block for days - any help or a pointer in the right direction would be greatly appreciated!
Normalise over a longer timescale. You'll need something like an envelope follower with a long release time.
If you search for 'compressor' source code or automatic gain control, something will definitely turn up.
But broadly in pseudo C++, and working on your incoming audio (the time-domain signal before the FFT):
auto instant_level = std::abs(signal);    // level of the current input sample
peak_level *= 0.99f;                      // slow exponential decay (the release)
peak_level = peak_level > instant_level ? peak_level : instant_level;   // attack instantly on louder input
Now peak_level decays slowly over time. And you can use this to calculate a gain factor to normalize your incoming audio. Adjust the 0.99f as required for a sensible decay time and for the correct sample rate.
There's also a Signal Processing stack exchange site where you'll get quicker answers to these kinds of questions (although occasionally with an almost incomprehensible piece of algebra attached :) )

Floating point calculations with latitudes and longitudes of varying precisions

Background: I receive a long and lat as parameters to a web service. They are typically up to 6 decimal places. When a new request is received, I calculate the distance between the last recorded loc and the long/lat in the params of the request. If the distance is greater than a certain threshold of miles apart, I update the current loc.
Problem: I use the geokit gem/plugin to calculate the distance between the locs. Very rarely, a bug shows up (the zero distance bug mentioned on the author's site - I'm using 1.4.1 which claims the bug is fixed, but I still see it occurring shrug) that causes the distance calc to return something wildly inaccurate when calculating the distance between two points that are identical (this occurs if the user is not moving). This is causing updates to the user's current loc that should not be happening. You're probably wondering - well if it's just updating the loc to be exactly the same coordinates, who cares? Well, the answer is that a bunch of other crap occurs when the loc is updated that makes it an actual issue.
Attempted Solution: I tried to add logic to manually check whether the two locs are identical before calculating the distance, and to skip the calc and the update if that is the case. The incoming parameters are long/lats with 6-decimal precision, whereas in my database I store the values as floats, which appear to keep only 4 decimal places. This causes my float comparison to always fail, and the inappropriate loc updates continue to occur.
Phew, ok so the actual question is: How should I perform this comparison? Should I truncate 2 of the decimal places from the incoming lat/longs, round up somehow so the fourth digit is correct and then compare? Or, should I do a "within a certain range" sort of comparison (e.g. reported_loc.long > current_loc.long - .0001 && reported_loc.long < current_loc.long + .0001)? Also any recommendations for existing ruby gem/plugins or built in functions to do this sort of thing would be much appreciated.
Here is sample output from the log:
[update_loc] Last location history record at lat: 41.5024, long: -81.6816
[update_loc] Current loc at lat: 41.502467, long: -81.681623
[update_loc] Distance from current loc and last loc history: 5795.10615113555 miles
[update_loc] Locs not identical and distance greater than threshold, inserting new loc history
[update_loc] Location update complete
Thank you
Tom
The usual way to test if two numbers are close is to use abs, i.e.,
(reported_loc.long - current_loc.long).abs <= tol
where tol is some pre-specified tolerance, eg, 0.0001.
A GPS receiver can give you a location with a precision of many decimals, but that doesn't mean that the measurement is actually that accurate.
Usually about 95% of the measurements lie within a circle of a couple of meters, which is about the same accuracy that you can store in a 32-bit float.
However, you will clearly notice rounding errors when you plot a GPS log stored as floats on a map.
Anyway, to do this comparison, I'd use a range instead of rounding the incoming value and comparing that to the database value. You won't be able to detect the smallest movements anymore, but at least you won't get false positives anymore either.
As you're using floats to store stuff, you clearly don't care about millimeters or centimeters anyway.
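A sketch of that range comparison in Python, using the coordinates from the log (the 1e-4 degree tolerance is an assumption; it corresponds to roughly 11 m of latitude):
TOL = 1e-4   # degrees; about 11 m north-south

def same_location(a, b, tol=TOL):
    # True when two (lat, long) pairs differ by less than tol on both axes.
    return abs(a[0] - b[0]) < tol and abs(a[1] - b[1]) < tol

last    = (41.5024, -81.6816)        # the stored 4-decimal float
current = (41.502467, -81.681623)    # the incoming 6-decimal parameters
print(same_location(last, current))  # True -> skip the distance calculation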
