I am working on a resource allocation problem and am looking for an algorithm that I could use. Here are the data items:
Each timeslot is 15 mins in duration
There are n resources of a given type
A resource can be requested for a number of consecutive timeslots (e.g. 1 hour = 4 timeslots) starting at a certain time (e.g. 10 am)
Input to the algorithm: a request for a resource for m timeslots starting at hour h; the output is whether a resource is available to fulfil the request.
E.g. can I hire a car for 1 hour at 10:00 am from an inventory of 4 cars, when 2 of those cars are already booked between 9:30 and 10:00 am?
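To make the requirement concrete, here is a rough sketch of the kind of check I have in mind (a minimal in-memory version in Python; all names are just for illustration):

    from dataclasses import dataclass

    SLOT_MINUTES = 15  # each timeslot is 15 minutes

    @dataclass
    class Booking:
        resource_id: int
        start_slot: int  # index of the first booked 15-minute slot of the day
        num_slots: int   # number of consecutive booked slots

    def is_available(bookings, total_resources, start_slot, num_slots):
        """Return True if at least one resource is free for every requested slot."""
        for slot in range(start_slot, start_slot + num_slots):
            busy = {b.resource_id for b in bookings
                    if b.start_slot <= slot < b.start_slot + b.num_slots}
            if len(busy) >= total_resources:
                return False  # every resource is taken during this slot
        return True

    # 4 cars; cars 0 and 1 are booked 9:30-10:00 (slots 38-39 of the day).
    # Can I hire one car 10:00-11:00 (slots 40-43)?
    bookings = [Booking(0, 38, 2), Booking(1, 38, 2)]
    print(is_available(bookings, total_resources=4, start_slot=40, num_slots=4))  # True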
Any pointers on how this can be done would be greatly appreciated.
Al
I am working on a project where my task is to find the % price change over a given time frame for 100+ stocks in an efficient way.
The time frames are pre-defined and can only be 5 mins, 10 mins, 30 mins, 1 hour, 4 hours, 12 hours, or 24 hours.
Given a time frame, I need a way to efficiently figure out the % price change of all the stocks that I am tracking.
In the current implementation, I am getting price data for those stocks every second and dumping it into a price table.
Another cron job updates the % change of each stock from the values in the price table every few seconds.
The solution is more or less working, but it's not efficient. Is there any data structure/algorithm that I can use to find the % change efficiently?
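For what it's worth, one structure that is sometimes used for this kind of rolling lookup is a bounded in-memory buffer of recent prices per stock, so each % change becomes a single lookup rather than a scan of the price table. A minimal sketch (all names are hypothetical, assuming roughly one price per stock per second):

    from collections import deque
    import time

    MAX_WINDOW_S = 24 * 3600  # largest timeframe we need to support (24 hours)

    # per-stock buffer of (timestamp, price), oldest first, capped to 24 hours
    history = {}  # symbol -> deque of (ts, price)

    def record_price(symbol, ts, price):
        buf = history.setdefault(symbol, deque())
        buf.append((ts, price))
        while buf and buf[0][0] < ts - MAX_WINDOW_S:  # drop data older than 24 h
            buf.popleft()

    def pct_change(symbol, window_s, now=None):
        """% change of `symbol` over the last `window_s` seconds."""
        now = now if now is not None else time.time()
        buf = history.get(symbol)
        if not buf:
            return None
        # oldest price within the window (linear scan; a binary search over a
        # sorted list would make this faster if the buffers get large)
        old = next((p for ts, p in buf if ts >= now - window_s), None)
        latest = buf[-1][1]
        if old is None or old == 0:
            return None
        return (latest - old) / old * 100.0

Since the set of timeframes is fixed, another option along the same lines is to cache the reference price for each timeframe as new data arrives, instead of recomputing everything in a cron job.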
Suppose there exists an application X.
Is DAU defined as (a) the number of users that log in to X every single day over a specified period, or (b) the average number of users that log in to X each day over that period?
For example:
Specified period = 5 days
The same 50 users log in to X every day. In addition, a varying number of extra users log in to X each day, say 20, 40, 10, 25, 30.
Does DAU = 50, or does DAU = (70+90+60+75+80)/5?
DAU is the number of unique users who visit the site on a given day. In other words, DAU is defined for a specific day only, not for any other period.
In your example, DAU for the first day is 70 users (50 + 20), for the second day 90, for the third day 60, and so on.
DAU = (70+90+60+75+80)/5 is not itself a DAU; it is the average DAU over the 5 days.
Likewise, 50 is only the number of users who log in every single day; it is not the DAU of the whole 5-day period.
If you want to calculate an active-users index for a specific period, you can use Weekly Active Users (WAU) and Monthly Active Users (MAU) or, say, a [5-day] Active Users counter.
To calculate an "[N-day] AU", count the number of unique users during a specific measurement period, such as the previous N days.
So if User1 (and no one else) logs in to the site once every 5 days, you will still have [N-day] AU = 1 for the site, because there is only 1 unique active user during that period.
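To make the counting concrete, a small sketch of both counts, assuming the login events are available as (user_id, date) pairs (names are hypothetical):

    from collections import defaultdict
    from datetime import date

    # login events: (user_id, login_date)
    events = [
        ("u1", date(2023, 1, 1)), ("u2", date(2023, 1, 1)),
        ("u1", date(2023, 1, 2)), ("u3", date(2023, 1, 2)),
        ("u1", date(2023, 1, 2)),  # a second login on the same day counts once
    ]

    # DAU: unique users per individual day
    users_per_day = defaultdict(set)
    for user, day in events:
        users_per_day[day].add(user)
    dau = {day: len(users) for day, users in users_per_day.items()}
    # -> {date(2023, 1, 1): 2, date(2023, 1, 2): 2}

    # [N-day] AU: unique users across the whole window
    n_day_au = len({user for user, _ in events})
    # -> 3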
I have a database with stock values in a table, for example:
id - unique id for this entry
stockId - ticker symbol of the stock
value - price of the stock
timestamp - timestamp of that price
I would like to create separate arrays for timeframes of 24 hours, 7 days and 1 month from my database entries, each array containing data points for a stock chart.
For some stockIds, I have just a few data points per hour, for others it could be hundreds or thousands.
My question:
What is a good algorithm to "aggregate" the possibly many data points into a few? For example, for the 24-hour chart I would like to have at most 10 data points per hour. How do I handle exceptionally high/low values?
What is the common approach for stock charts?
Thank you for reading!
Some options (assuming 10 points per hour, i.e. roughly one every 6 minutes):
For every 6-minute period, pick the data point closest to the centre of the period.
For every 6-minute period, take the average of the points over that period.
For each hour, find the maximum and minimum for every 4-minute period, then pick the 5 largest maxima and the 5 smallest minima from these respective sets (4 minutes is somewhat arbitrarily chosen).
I originally thought to pick the 5 minimum points and the 5 maximum points such that the maximum points are at least 8 minutes apart, and similarly for the minimum points.
The 8 minutes is so that we don't have all the points stacked up on top of each other. Why 8 minutes? With an even distribution, 60/5 = 12 minutes, so moving a bit away from that gives 8 minutes.
But in terms of implementation, the 4-minute approach is much simpler and should give similar results.
You'll have to see which one gives you the most desirable results. The last one is likely to give a decent indication of variation across the period, whereas the second one is likely to have a more stable graph. The first one can be a bit more erratic.
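As an illustration of the second option, a small sketch that buckets the points into 6-minute periods and averages each bucket (the names are hypothetical):

    from collections import defaultdict

    def downsample_average(points, bucket_seconds=6 * 60):
        """points: list of (timestamp_seconds, value) pairs.
        Returns one averaged (timestamp, value) per bucket, oldest first."""
        buckets = defaultdict(list)
        for ts, value in points:
            buckets[ts // bucket_seconds].append((ts, value))
        out = []
        for bucket in sorted(buckets):
            pts = buckets[bucket]
            avg_ts = sum(ts for ts, _ in pts) / len(pts)
            avg_value = sum(v for _, v in pts) / len(pts)
            out.append((avg_ts, avg_value))
        return out

Replacing the average with min() and max() over each bucket's values gives a simple variant of the third option, preserving the extremes instead of smoothing them out.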
I have a dataset containing > 100,000 records where each record has a timestamp.
This dataset has been aggregated from several "controller" nodes, each of which collects its data from a set of child nodes. Each controller collects these records periodically (e.g. once every 5 minutes or once every 10 minutes), and it is the controller that applies the timestamp to the records.
E.g.:
Controller One might have 20 records timestamped at time t, 23 records timestamped at time t + 5 minutes, 33 records at time t + 10 minutes.
Controller Two might have 30 records timestamped at time (t + 2 minutes) + 10 minutes, 32 records timestamped at time (t + 2 minutes) + 20 minutes, 41 records timestamped at time (t + 2 minutes) + 30 minutes, and so on.
Assume now that the only information you have is the set of all timestamps and a count of how many records appeared at each timestamp. That is to say, you don't know i) which sets of records were produced by which controller, ii) the collection interval of each controller, or iii) the total number of controllers.
Is there an algorithm that can decompose the set of all timestamps into individual subsets such that the variance of the differences between consecutive (ordered) elements of each subset is very close to 0, while moving any element from one subset i to another subset j would increase this variance? Keep in mind that, for this dataset, a single controller's "periodicity" could fluctuate by +/- a few seconds because of CPU timing, network latency, etc.
My ultimate objective here is to establish a) how many controllers there are and b) the sampling interval of each controller. So far I've been thinking about the problem in terms of periodic functions, so perhaps there are some decomposition methods from that area that could be useful.
The other point to make is that I don't need to know which controller each record came from, I just need to know the sampling interval of each controller. So e.g. if there were two controllers that both started sampling at time u, and one sampled at 5-minute intervals and the other at 50-minute intervals, it would be hard to separate the two at the 50-minute mark because 5 is a factor of 50. This doesn't matter, so long as I can garner enough information to work out the intervals of each controller despite these occasional overlaps.
One basic approach would be to perform an FFT decomposition (or, if you're feeling fancy, a periodogram) of the dataset and look for peaks in the resulting spectrum. This will give you a crude approximation of the periods of the controllers, and may even give you an estimate of their number (and by looking at the height of the peaks, it can tell you how many records were logged).
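A rough sketch of that idea, assuming the timestamps have first been binned onto a regular grid of per-second record counts (the numpy calls are real; everything else is hypothetical):

    import numpy as np

    def dominant_periods(counts, bin_seconds=1, top_k=5):
        """counts[i] = number of records whose timestamp falls in bin i.
        Returns the strongest candidate periods, in seconds."""
        counts = np.asarray(counts, dtype=float)
        counts = counts - counts.mean()            # remove the DC component
        spectrum = np.abs(np.fft.rfft(counts))
        freqs = np.fft.rfftfreq(len(counts), d=bin_seconds)
        strongest = np.argsort(spectrum)[::-1]     # indices of the biggest peaks
        return [1.0 / freqs[i] for i in strongest if freqs[i] > 0][:top_k]

The periods that come back would then still need to be grouped into one value per controller; the relative peak heights give a rough idea of how many records each period contributed.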
I know that Little's Law states (paraphrased):
the average number of things in a system is the product of the average rate at which things leave the system and the average time each one spends in the system,
or:
n = x * (r + z)
where:
x - throughput
r - response time
z - think time
r + z - total time each request spends in the system (response time plus think time)
Now I have a question about a problem from Programming Pearls:
Suppose that a system makes 100 disk accesses to process a transaction (although some systems require fewer, some will require several hundred disk accesses per transaction). How many transactions per hour per disk can the system handle?
Assumption: disk access takes 20 milliseconds.
Here is the solution to this problem:
Ignoring slowdown due to queuing, 20 milliseconds (of the seek time) per disk operation gives 2 seconds per transaction or 1800 transactions per hour
I am confused because I do not understand the solution to this problem.
Please help.
It will be more intuitive if you forget about that formula and think that the rate at which you can do something is inversely proportional to the time that it takes you to do it. For example, if it takes you 0.5 hour to eat a pizza, you eat pizzas at a rate of 2 pizzas per hour because 1/0.5 = 2.
In this case the rate is the number of transactions per time and the time is how long a transaction takes. According to the problem, a transaction takes 100 disk accesses, and each disk access takes 20 ms. Therefore each transaction takes 2 seconds total. The rate is then 1/2 = 0.5 transactions per second.
Now, more formally:
The rate of transactions per second, R, is the inverse of the transaction time in seconds, TT.
R = 1/TT
The transaction time TT in this case is:
TT = disk access time * number of disk accesses per transaction =
20 milliseconds * 100 = 2000 milliseconds = 2 seconds
R = 1/2 transactions per second
= 3600/2 transactions per hour
= 1800 transactions per hour
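The same arithmetic in a few lines of Python, with the values taken straight from the problem statement:

    DISK_ACCESS_MS = 20            # one disk access takes 20 milliseconds
    ACCESSES_PER_TRANSACTION = 100

    tt_seconds = DISK_ACCESS_MS * ACCESSES_PER_TRANSACTION / 1000  # 2.0 s per transaction
    rate_per_second = 1 / tt_seconds                               # 0.5 transactions/s
    rate_per_hour = rate_per_second * 3600                         # 1800 transactions/h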