Calculate P&L over an arbitrary period - FIFO

Let's assume I have an array of trades (Buys/Sells), associated with timestamps.
[
  {
    "time": "01-05-2021", // DD-MM-YYYY
    "operation": "BUY",
    "amount": 2,
    "price": 10
  },
  {
    "time": "01-06-2021",
    "operation": "SELL",
    "amount": 1,
    "price": 15
  },
  {
    "time": "01-07-2021",
    "operation": "BUY",
    "amount": 2,
    "price": 20
  },
  {
    "time": "01-08-2021",
    "operation": "SELL",
    "amount": 3,
    "price": 25
  }
]
And I want to calculate P&L on these trades using FIFO, but for an arbitrary time period.
The problem is that the calculated value depends on the time period I choose.
For 08.2021 it'll be 0 (3 items were sold, none was bought).
For 07-08.2021 it'll be 10 (2 items were bought for a total of 40, 2 were sold for a total of 50).
For 06-08.2021 it'll be 0 (SELL 1 on 15 -> BUY 1 on 20 == -5 and BUY 1 on 20 -> SELL 1 on 25 == 5).
And so on.
The only working solution I have right now is to calculate P&L values for each deal, from the beginning of trading activity, and then "cut off" everything outside the required period. But that doesn't scale: even without automated trading there can be thousands of deals each year.
The most obvious thing to do is to add some initial state at the beginning of the given period, which would be the starting point for all further calculations.
Are there any algorithms or tools which I can utilize to perform this task?
I'm a JavaScript developer, and my solution runs in the browser. Maybe I need a backend, maybe I need R with its statistics packages...
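For what it's worth, the "initial state" idea reduces to a single pass: replay all trades once through a FIFO queue of open lots, attribute each sell's realized P&L to the sell's date, and sum over the sells that fall inside the requested period. Below is a minimal JavaScript sketch of that convention; it assumes trades are sorted by time, there is no short selling, and timestamps have been normalized to something directly comparable (e.g. ISO dates or epoch milliseconds, since "DD-MM-YYYY" strings don't compare correctly).

// Sketch: FIFO realized P&L, attributed to the SELL date.
// Assumes: trades sorted ascending by time, no short selling,
// and t.time already comparable (ISO string or timestamp).
function fifoPnL(trades, from, to) {
  const lots = []; // open BUY lots, oldest first: { amount, price }
  let pnl = 0;
  for (const t of trades) {
    if (t.operation === "BUY") {
      lots.push({ amount: t.amount, price: t.price });
    } else {
      // SELL: consume the oldest lots first (FIFO).
      let remaining = t.amount;
      while (remaining > 0 && lots.length > 0) {
        const lot = lots[0];
        const used = Math.min(lot.amount, remaining);
        // Count the realized P&L only if the SELL falls in the period.
        if (t.time >= from && t.time <= to) {
          pnl += used * (t.price - lot.price);
        }
        lot.amount -= used;
        remaining -= used;
        if (lot.amount === 0) lots.shift();
      }
    }
  }
  return pnl;
}

One caveat: the matching convention matters. This sketch carries pre-period buys into the window (for 06-08.2021 it yields +30, because the sell at 15 is matched against the earlier buy at 10), whereas the window-in-isolation reading above yields 0. Which convention is correct depends on the accounting rules you need to follow.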

Related

Elasticsearch, how to calculate cumulative probability of normal distribution?

Suppose each document has the following data:
{
  // some_other_fields,
  "seasonal_data": [
    {
      "day_of_year": 1,
      "sales": 3
    },
    {
      "day_of_year": 2,
      "sales": 5
    }
  ]
}
When ranking documents, I want to consider the seasonal score along with other factors.
For a given day_of_year, I'll consider 7 days (3 days prior, the day itself, and 3 days after).
I'll take the average sales value of those seven days.
We assume sales data follows a normal distribution:
seasonal_score = p-value(avg_sales_7days)
How can this be done in Elasticsearch?
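The score the question asks for, seasonal_score = p-value(avg_sales_7days), boils down to evaluating a normal CDF at the 7-day average. Elasticsearch has no built-in normal CDF, but a script_score (Painless) script can compute it, and the arithmetic is simple enough to sketch first in plain JavaScript; mu and sigma below are assumptions standing in for the precomputed mean and standard deviation of daily sales.

// Normal CDF via the Abramowitz & Stegun 7.1.26 approximation of erf
// (max absolute error about 1.5e-7, plenty for ranking purposes).
function erf(x) {
  const sign = x < 0 ? -1 : 1;
  x = Math.abs(x);
  const t = 1 / (1 + 0.3275911 * x);
  const poly = ((((1.061405429 * t - 1.453152027) * t + 1.421413741) * t
    - 0.284496736) * t + 0.254829592) * t;
  return sign * (1 - poly * Math.exp(-x * x));
}

function normalCdf(x, mu, sigma) {
  return 0.5 * (1 + erf((x - mu) / (sigma * Math.SQRT2)));
}

// seasonal_score for one document, given its 7-day average sales:
function seasonalScore(avgSales7days, mu, sigma) {
  return normalCdf(avgSales7days, mu, sigma);
}

The same few lines of arithmetic can be ported into a Painless script inside a script_score query (Painless exposes Math.exp and Math.abs as well), with mu, sigma, and the current day_of_year passed in as script params.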

Finding largest difference in array of compass headings

I'm trying to get the "range" of compass headings over the last X seconds. Example: over the last minute, my heading has been between 120deg and 140deg on the compass. Easy enough, right? I have an array with the compass headings over the time period, say 1 reading every second.
[ 125, 122, 120, 125, 130, 139, 140, 138 ]
I can take the minimum and maximum values and there you go. My range is from 120 to 140.
Except it's not that simple. Take for example if my heading has shifted from 10 degrees to 350 degrees (i.e. it "passed" through North, changing by -20deg).
Now my array might look something like this:
[ 9, 10, 6, 3, 358, 355, 350, 353 ]
Now the min is 3 and the max is 358, which is not what I need :( I'm looking for the most "right-hand" (clockwise) value and the most "left-hand" (counter-clockwise) value.
The only way I can think of is finding the largest arc along the circle that includes none of the values in my array, but I don't even know how I would do that.
Would really appreciate any help!
Problem Analysis
To summarize the problem, it sounds like you want to find both of the following:
The two readings that are closest together (for simplicity: in a clockwise direction) AND
Contain all of the other readings between them.
So in your second example, 9 and 10 are only 1° apart, but they do not contain all the other readings. Conversely, traveling clockwise from 10 to 9 would contain all of the other readings, but they are 359° apart in that direction, so they are not closest.
In this case, I'm not sure if using the minimum and maximum readings will help. Instead, I'd recommend sorting all of the readings. Then you can more easily check the two criteria specified above.
Here's the second example you provided, sorted in ascending order:
[ 3, 6, 9, 10, 350, 353, 355, 358 ]
If we start from the beginning, we know that traveling from reading 3 to reading 358 will encompass all of the other readings, but they are 358 - 3 = 355° apart. We can continue scanning the results progressively. Note that once we circle around, we have to add 360 to properly calculate the degrees of separation.
[ 3, 6, 9, 10, 350, 353, 355, 358 ]
*--------------------------> 358 - 3 = 355° separation
[ 3, 6, 9, 10, 350, 353, 355, 358 ]
-> *----------------------------- (360 + 3) - 6 = 357° separation
[ 3, 6, 9, 10, 350, 353, 355, 358 ]
----> *-------------------------- (360 + 6) - 9 = 357° separation
[ 3, 6, 9, 10, 350, 353, 355, 358 ]
-------> *----------------------- (360 + 9) - 10 = 359° separation
[ 3, 6, 9, 10, 350, 353, 355, 358 ]
----------> *------------------- (360 + 10) - 350 = 20° separation
[ 3, 6, 9, 10, 350, 353, 355, 358 ]
--------------> *-------------- (360 + 350) - 353 = 357° separation
[ 3, 6, 9, 10, 350, 353, 355, 358 ]
-------------------> *--------- (360 + 353) - 355 = 358° separation
[ 3, 6, 9, 10, 350, 353, 355, 358 ]
------------------------> *---- (360 + 355) - 358 = 357° separation
Pseudocode Solution
Here's a pseudocode algorithm for determining the minimum degree range of reading values. There are definitely ways it could be optimized if performance is a concern.
// Somehow, we need to get our reading data into the program, sorted
// in ascending order.
// If readings are always whole numbers, you can use an int[] array
// instead of a double[] array. If we use an int[] array here, change
// the "minimumInclusiveReadingRange" variable below to be an int too.
double[] readings = populateAndSortReadingsArray();
if (readings.length == 0)
{
// Handle case where no readings are provided. Show a warning,
// throw an error, or whatever the requirement is.
}
else
{
// We want to track the endpoints of the smallest inclusive range.
// These values will be overwritten each time a better range is found.
int minimumInclusiveEndpointIndex1;
int minimumInclusiveEndpointIndex2;
double minimumInclusiveReadingRange; // This is convenient, but not necessary.
// We could determine it using the
// endpoint indices instead.
// Check the range of the greatest and least readings first. Since
// the readings are sorted, the greatest reading is the last element.
// The least reading is the first element.
minimumInclusiveReadingRange = readings[readings.length - 1] - readings[0];
minimumInclusiveEndpointIndex1 = 0;
minimumInclusiveEndpointIndex2 = readings.length - 1;
// Potential to skip some processing. If the ends are 180 or less
// degrees apart, they represent the minimum inclusive reading range.
// The for loop below could be skipped.
for (int i = 1; i < readings.length; i++)
{
if ((360.0 + readings[i-1]) - readings[i] < minimumInclusiveReadingRange)
{
minimumInclusiveReadingRange = (360.0 + readings[i-1]) - readings[i];
minimumInclusiveEndpointIndex1 = i;
minimumInclusiveEndpointIndex2 = i - 1;
}
}
// Most likely, there will be some different readings, but there is an
// edge case of all readings being the same:
if (minimumInclusiveReadingRange == 0.0)
{
print("All readings were the same: " + readings[0]);
}
else
{
print("The range of compass readings was: " + minimumInclusiveReadingRange +
" spanning from " + readings[minimumInclusiveEndpointIndex1] +
" to " + readings[minimumInclusiveEndpointIndex2]);
}
}
There is one additional edge case that this pseudocode algorithm does not cover, and that is the case where there are multiple minimum inclusive ranges...
Example 1: [0, 90, 180, 270] which has a range of 270 (90 to 0/360, 180 to 90, 270 to 180, and 0 to 270).
Example 2: [85, 95, 265, 275] which has a range of 190 (85 to 275 and 265 to 95)
If it's necessary to report each possible pair of endpoints that create the minimum inclusive range, this edge case would increase the complexity of the logic a bit. If all that matters is determining the value of the minimum inclusive range or it is sufficient to report just one pair that represents the minimum inclusive range, the provided algorithm should suffice.
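For reference, the pseudocode translates to a compact JavaScript sketch (it reports a single minimal range and, as noted, ignores the multiple-minima edge case):

function compassRange(readings) {
  if (readings.length === 0) throw new Error("no readings");
  const sorted = [...readings].sort((a, b) => a - b);
  // Non-wrapping candidate: from the least reading clockwise to the greatest.
  let best = {
    range: sorted[sorted.length - 1] - sorted[0],
    from: sorted[0],
    to: sorted[sorted.length - 1],
  };
  // Wrapping candidates: start at sorted[i] and travel clockwise through
  // north to sorted[i - 1]; add 360 to account for the wrap.
  for (let i = 1; i < sorted.length; i++) {
    const range = 360 + sorted[i - 1] - sorted[i];
    if (range < best.range) {
      best = { range, from: sorted[i], to: sorted[i - 1] };
    }
  }
  return best;
}

console.log(compassRange([9, 10, 6, 3, 358, 355, 350, 353]));
// -> { range: 20, from: 350, to: 10 }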

Algorithm to find the best offer combination that gives maximum discount on a given set of items

I have items with IDs (1001, 1002, 1003, 1004, 1005, 1006). Their respective quantities are (2, 5, 1, 1, 5, 2). Now I have data like the following; there is an offerId for each row.
offerId :{[Item_Id, Item_quantity_on_which_offer_Applied, Discount per quantity]}
1 :{[1001, 2, 21]}
4 :{[1002, 5, 5]}
6 :{[1003, 1, 25] [1004, 1, 25]}
5 :{[1004, 1, 20]}
3 :{[1005, 5, 17.5] [1002, 5, 17.5]}
2 :{[1005, 2, 18.33] [1001, 2, 26] [1006, 2, 21.67]}
Explanation: when offer Id 1 is applied, I get 2 quantities of item Id 1001 at 21 rs. discount per quantity, i.e. I am getting 21 rs. off each of the 2 quantities of 1001.
Objective: I want to get the best offer combination. For example, in the above case the best offer combination will be:
OfferId : 2 (discount = 132, i.e. (18.33+26+21.67)*2)
OfferId : 3 (applied to 3 quantities of item 1005 and 3 quantities of item 1002, since 2 quantities of item 1005 are already covered by offer Id 2) (discount = 105, i.e. (17.5+17.5)*3)
Now item 1002 has 2 quantities remaining, so:
OfferId : 4 (applied to 2 quantities of item 1002) (discount = 10, i.e. 5*2)
OfferId : 6 (discount = (25+25)*1 = 50)
So in a nutshell, offer Ids 2, 3, 4, 6 give me the best combination of offers, where offer 4 is applied to 2 quantities of item 1002 and offer 3 to 3 quantities of item 1005 and 3 quantities of item 1002.
This is the result I want: the best offer combination, taking quantities into account.
So far, I have been able to find the best offer combination without considering quantity. But now my requirement is to consider item quantities as well and then find the best offer combination.
It would be really helpful if anyone could provide pseudocode. Any suggestions are greatly appreciated.
P.S. I am writing my code in Golang, but solutions in any language are welcome.
I hope I framed my question correctly; comment below if any more information is required.
Thanks in advance.
Even if there is only a single item of each type and every offer gives the same total discount (say, $1), this is NP-hard, since the NP-hard problem Set Packing can be reduced to it: for each set, make an offer for the same elements with total discount $1. Since all offers provide the same benefit, the optimal solution to this constructed instance of your problem is the one that uses the largest number of offers, and this solution corresponds directly to the optimal solution to the original Set Packing problem.
Thus there's no hope for a polynomial-time solution to your problem.
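That said, real instances are often small enough for exhaustive search. Below is a minimal JavaScript sketch that tries every subset of offers, under the simplifying assumption that an offer is applied in full or not at all; the question's rules also allow partial application (offer 3 is applied to 3 of 5 quantities above), which would add more branches per offer but not change the exponential worst case.

const quantities = { 1001: 2, 1002: 5, 1003: 1, 1004: 1, 1005: 5, 1006: 2 };

// offerId -> list of [itemId, quantity, discountPerQuantity]
const offers = {
  1: [[1001, 2, 21]],
  2: [[1005, 2, 18.33], [1001, 2, 26], [1006, 2, 21.67]],
  3: [[1005, 5, 17.5], [1002, 5, 17.5]],
  4: [[1002, 5, 5]],
  5: [[1004, 1, 20]],
  6: [[1003, 1, 25], [1004, 1, 25]],
};

function bestCombination(offerIds, remaining) {
  if (offerIds.length === 0) return { discount: 0, chosen: [] };
  const [id, ...rest] = offerIds;

  // Branch 1: skip this offer.
  const skip = bestCombination(rest, remaining);

  // Branch 2: apply this offer in full, if the remaining quantities allow it.
  if (!offers[id].every(([item, qty]) => remaining[item] >= qty)) return skip;

  const next = { ...remaining };
  let gain = 0;
  for (const [item, qty, perQty] of offers[id]) {
    next[item] -= qty;
    gain += qty * perQty;
  }
  const take = bestCombination(rest, next);
  return gain + take.discount > skip.discount
    ? { discount: gain + take.discount, chosen: [id, ...take.chosen] }
    : skip;
}

console.log(bestCombination(Object.keys(offers), quantities));
// -> { discount: 267, chosen: ['1', '3', '6'] } under the all-or-nothing rule
// (with partial application allowed, the question's own walk-through
// reaches 297 instead).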

Complexity and binary trees

I have a big problem understanding complexity, especially with binary trees.
For example, I know that some problems are described as having size x = log2(sizeofarray), but I don't understand where this log2 comes from.
Let's take a binary search as the easy example. Say you have a sorted list of 64 elements, and you're searching for a particular one. In each iteration, you halve the dataset. By the time your dataset has 1 element, you have halved it 6 times (count the arrows, not the numbers):
64 -> 32 -> 16 -> 8 -> 4 -> 2 -> 1
The reason for this is the fact that 64 = 2 ^ 6, where 2 is the base (you divide the dataset into 2 parts in each iteration) and the exponent is 6 (you reach the bottom in 6 iterations). There is another way to write this, since exponentiation has the logarithm as its inverse:
64 = 2 ^ 6
6 = log2 64
So we can see that the number of iterations scales with the base-two logarithm of the number of elements.
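To make the halving concrete, here is a small JavaScript sketch: counting halvings gives exactly the log2, and a binary search's loop runs O(log2 n) times for the same reason.

// How many times can a dataset of size n be halved before 1 element remains?
function halvings(n) {
  let count = 0;
  while (n > 1) {
    n = Math.floor(n / 2); // one halving per iteration
    count++;
  }
  return count;
}

console.log(halvings(64)); // 6, because 64 = 2 ** 6, i.e. log2(64) = 6

// Binary search over a sorted array: each iteration discards half of the
// remaining range, so the loop executes O(log2 n) times.
function binarySearch(sorted, target) {
  let lo = 0, hi = sorted.length - 1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    if (sorted[mid] === target) return mid;
    if (sorted[mid] < target) lo = mid + 1;
    else hi = mid - 1;
  }
  return -1; // not found
}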
It's log2 because each level of tree splits your problem into two.
For instance, consider this set of data:
{ 1, 2, 3, 4, 5, 6, 7, 8 }
The first level could be
{ 1, 2, 3, 4 }, { 5, 6, 7, 8 }
the second level:
{ 1, 2 }, { 3, 4 }, { 5, 6 }, { 7, 8 }
the third level:
{ 1 }, { 2 }, { 3 }, { 4 }, { 5 }, { 6 }, { 7 }, { 8 }
Here with 8 values, log2(8) = 3, and there are 3 levels in the tree.
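The same count can be read off in code; this tiny sketch follows one half of each split down to a single element:

// Number of levels produced by repeatedly splitting the array in two.
function levels(arr) {
  if (arr.length <= 1) return 0;
  return 1 + levels(arr.slice(0, Math.floor(arr.length / 2)));
}

console.log(levels([1, 2, 3, 4, 5, 6, 7, 8])); // 3, and Math.log2(8) === 3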
Also see these other StackOverflow questions for more:
"Why is the height of a balanced binary tree log(n)? (Proof)" - the answer follows a similar vein to the answer that Amadan posted on this question.
"Search times for binary search tree" - contains some pretty ASCII art, and examines best/worst case scenarios.
