Hard Disk scheduling simulator algorithm (track to track timing) Perl - algorithm

I am trying to get to grips with Perl by writing a few scripts that act as a disk scheduling simulator: FCFS, SSTF, SCAN and LOOK.
I have one array with a list of block requests and another to act as the buffer. First I will copy over the first request, then I need to work out the time it takes to get from the first to the second block.
The buffer reads in blocks at 1 per ms. Seek, search and access times are all 1ms to make the calculations a bit easier, and the simulator always starts on block 1, track 1.
http://postimg.org/image/d9osb8tkj/
So if the first block is 5, the search time will be 3ms to traverse to the start of the 5th block, the seek time will be zero as it is on the same track, and the access time to read the block will always be 1ms. This means that the time for this request will be 4ms, so the simulator will read the next 4 requests into the buffer. In first come, first served, this will just be the order in which the requests are served.
So if the next request to serve is 12, the arm is at the end of the 5th block, so it will take 2ms to get to the right track, then 1ms to get to the start of the 12th block, and another 1ms to access it.
I was just wondering if anyone could give me some idea how I could express this as an algorithm. Just some pointers would be much appreciated.

Write a class HardDiskSim::Abstract with three subs: seek_time(), spin_time(), and read_time().
Write a subclass of HardDiskSim::Abstract for each different set of values/logic for the three methods.
For example:
package HardDiskSim::Simple;
use strict;
use warnings;
use base qw(HardDiskSim::Abstract);
our $SECTORS_PER_TRACK   = 5;
our $SEEK_TIME_PER_TRACK = 1;
sub read_time { return 1 }
sub seek_time {
    # Called as a method, so the invocant is the first element of @_.
    my ($self, $block) = @_;
    my $tracks_to_seek = int($block / $SECTORS_PER_TRACK);
    return $tracks_to_seek * $SEEK_TIME_PER_TRACK;
}
sub spin_time {
    # compute head position at end of seek using seek time and RPM of disk
    # compute number of sectors to spin past using computed head position
    # return number_of_sectors_to_spin_past * time_per_sector
}
I had the fun of writing this kind of code in Fortran, for a class, back in 1985.
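To make the arithmetic concrete, here is a minimal FCFS sketch of the same three-part breakdown (seek, spin, read). It is written in Ruby rather than Perl, and the geometry is an assumption on my part: 5 sectors per track, blocks numbered from 1, a platter that keeps rotating one sector per millisecond while the arm moves, and the head starting at the beginning of block 1. The names track_of, sector_of and fcfs are made up for the example. Because the real layout is only shown in the linked image, the numbers will not necessarily match the worked example in the question.

```ruby
# Hypothetical geometry and timings; adjust them to match your own disk layout.
SECTORS_PER_TRACK = 5   # assumption: blocks 1..5 on track 0, 6..10 on track 1, ...
SEEK_MS_PER_TRACK = 1   # time to move the arm by one track
MS_PER_SECTOR     = 1   # time for one sector to pass under the head
READ_MS           = MS_PER_SECTOR # reading a block takes one sector time

def track_of(block)
  (block - 1) / SECTORS_PER_TRACK
end

def sector_of(block)
  (block - 1) % SECTORS_PER_TRACK
end

# Serve requests in arrival order; returns one [block, seek, spin, read] row each.
def fcfs(requests)
  track = 0 # the arm starts on the first track
  angle = 0 # index of the sector whose start is currently under the head
  requests.map do |block|
    seek = (track_of(block) - track).abs * SEEK_MS_PER_TRACK
    # The platter keeps turning while the arm moves.
    angle = (angle + seek / MS_PER_SECTOR) % SECTORS_PER_TRACK
    spin = ((sector_of(block) - angle) % SECTORS_PER_TRACK) * MS_PER_SECTOR
    # After the read, the head sits at the start of the following sector.
    angle = (sector_of(block) + 1) % SECTORS_PER_TRACK
    track = track_of(block)
    [block, seek, spin, READ_MS]
  end
end
```

Under these assumptions, fcfs([5, 12]) gives [[5, 0, 4, 1], [12, 2, 4, 1]]. SSTF, SCAN and LOOK would only change the order in which the requests are picked; the per-request timing stays the same.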

How to measure execution time of Vulkan pipeline

Summary
I wish to be able to measure time elapsed in milliseconds, on the GPU, of running the entire graphics pipeline. The goal: To be able to save benchmarks before/after optimizing the code (next step would be mipmapping textures) to see improvements. This was really simple in OpenGL, but I'm new to Vulkan, and could use some help.
I have browsed related existing answers (here and here), but they aren't really of much help. And I cannot find code samples anywhere, so I dare ask here.
Through documentation pages I have found a couple of functions that I think I should be using, so I have in place something like this:
1: Creating query pool
void CreateQueryPool()
{
    VkQueryPoolCreateInfo createInfo{};
    createInfo.sType = VK_STRUCTURE_TYPE_QUERY_POOL_CREATE_INFO;
    createInfo.pNext = nullptr; // Optional
    createInfo.flags = 0; // Reserved for future use, must be 0!
    createInfo.queryType = VK_QUERY_TYPE_TIMESTAMP;
    createInfo.queryCount = mCommandBuffers.size() * 2; // REVIEW
    VkResult result = vkCreateQueryPool(mDevice, &createInfo, nullptr, &mTimeQueryPool);
    if (result != VK_SUCCESS)
    {
        throw std::runtime_error("Failed to create time query pool!");
    }
}
I had the idea of queryCount = mCommandBuffers.size() * 2 to have space for a separate query timestamp before and after rendering, but I have no clue whether this assumption is correct or not.
2: Recording command buffers
// recording command buffer i:
vkCmdWriteTimestamp(mCommandBuffers[i], VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT, mTimeQueryPool, i);
// render pass ...
vkCmdWriteTimestamp(mCommandBuffers[i], VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT, mTimeQueryPool, i);
vkCmdCopyQueryPoolResults(/* many parameters here */);
I'm looking for a couple of clarifications:
What is the consequence of writing to the same query index? Do I need two separate query pools, one for before render time and one for after render time?
How should I handle synchronization? I assume having a separate query for each command buffer.
For the destination buffer containing the query result, is it good enough to store somewhere with "host visible bit", or do I need staging memory for "device visible only"? I'm a bit lost on this one as well.
I have not been able to find any online examples of how to measure render time, but I just assume it's such a common task that surely there must be an example out there somewhere.
So, thanks to @karlschultz, I managed to get something working. So in case other people will be looking for the same answer, I decided to post my findings here. For the Vulkan experts out there: Please let me know if I make obvious mistakes, and I will correct them here!
Query Pool Creation
I fill out a VkQueryPoolCreateInfo struct as described in my question, and let its queryCount field equal twice the number of command buffers, to reserve space for a query before and after rendering.
Important here is to reset all entries in the query pool before using the queries, and to reset a query after writing to it. This necessitates a few changes:
1) Asking graphics queue if timestamps are supported
When picking the graphics queue family, the struct VkQueueFamilyProperties has a field timestampValidBits which must be greater than 0, otherwise the queue family cannot be used for timestamp queries!
2) Determining the timestamp period
The physical device contains a special value which indicates the number of nanoseconds it takes for a timestamp query to be incremented by 1. This is necessary to interpret the query result as e.g. nanoseconds or milliseconds. That value is a float, and can be retrieved by calling vkGetPhysicalDeviceProperties and looking at the field VkPhysicalDeviceProperties.limits.timestampPeriod.
3) Asking for query reset support
During logical device creation, one must fill out a struct and add it to the pNext chain to enable the host query reset feature:
VkDeviceCreateInfo createInfo{};
VkPhysicalDeviceHostQueryResetFeatures resetFeatures;
resetFeatures.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_HOST_QUERY_RESET_FEATURES;
resetFeatures.pNext = nullptr;
resetFeatures.hostQueryReset = VK_TRUE;
createInfo.pNext = &resetFeatures;
4) Recording command buffers
Timestamp queries should be outside the scope of the render pass, as seen below. It is not possible to measure running time of a single shader (e.g. fragment shader), only the entire pipeline or whatever is outside the scope of the render pass, due to (potential) temporal overlap of pipeline stages.
vkCmdWriteTimestamp(mCommandBuffers[i], VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT, mTimeQueryPool, i * 2);
vkCmdBeginRenderPass(/* ... */);
// render here...
vkCmdEndRenderPass(mCommandBuffers[i]);
vkCmdWriteTimestamp(mCommandBuffers[i], VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT, mTimeQueryPool, i * 2 + 1);
5) Retrieving query result
We have two methods for this: vkCmdCopyQueryPoolResults and vkGetQueryPoolResults. I chose to go with the latter since it greatly simplifies the setup and does not require synchronization with GPU buffers.
Given that I have a swapchain index (in my scenario the same as the command buffer index!), I have a setup like this:
void FetchRenderTimeResults(uint32_t swapchainIndex)
{
    uint64_t buffer[2];
    VkResult result = vkGetQueryPoolResults(mDevice, mTimeQueryPool, swapchainIndex * 2, 2,
                                            sizeof(uint64_t) * 2, buffer, sizeof(uint64_t),
                                            VK_QUERY_RESULT_64_BIT);
    if (result == VK_NOT_READY)
    {
        return;
    }
    else if (result == VK_SUCCESS)
    {
        mTimeQueryResults[swapchainIndex] = buffer[1] - buffer[0];
    }
    else
    {
        throw std::runtime_error("Failed to receive query results!");
    }
    // Queries must be reset after each individual use.
    vkResetQueryPool(mDevice, mTimeQueryPool, swapchainIndex * 2, 2);
}
The variable mTimeQueryResults refers to an std::vector<uint64_t> which contains a result for each swapchain. I use it to calculate an average rendering time each second by using the timestamp period determined in step 2).
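For reference, the conversion from a tick difference to milliseconds looks like this. Since timestampPeriod is given in nanoseconds per tick, the difference is multiplied by it and divided by 10^6. The numbers in the example on the right are made up purely for illustration.

```latex
t_{\mathrm{ms}} = \frac{(T_{\mathrm{after}} - T_{\mathrm{before}}) \cdot \mathtt{timestampPeriod}}{10^{6}}
\qquad \text{e.g.} \qquad
\frac{2\,500\,000\ \text{ticks} \times 1.0\ \text{ns/tick}}{10^{6}\ \text{ns/ms}} = 2.5\ \text{ms}
```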
And one must not forget to clean up the query pool by calling vkDestroyQueryPool.
There are a lot of details omitted, and for a total Vulkan noob like me this setup was frightening and took several days to figure out. Hopefully this will spare someone else the headache.
More info in documentation.
Writing to the same query index is bad because you are overwriting your "before" timestamp with the "after" timestamp at the same query index. You might want to change the last parameter in your write timestamp calls to i * 2 for the "before" call and to i * 2 + 1 for the "after". You are already allocating 2 timestamps for each command buffer, but only using half of them. This scheme ends up producing a pair of before/after timestamps for each command buffer i.
I don't have any experience using vkCmdCopyQueryPoolResults(). If you can idle your queue, then after idle, call vkGetQueryPoolResults() which will probably be much easier for what you are doing here. It copies the query results back into host memory and you don't have to mess with synchronizing writes to another buffer and then mapping/reading it back.

How do I use Ruby to do a certain number of actions per second?

I want to test a rate-limiting app with Ruby where I define different behavior based on the number of requests per second.
For example, if I see 300 requests per second or more, I want it to respond with a block.
But how would I test this by generating 300 requests per second in Ruby? I understand there are hard limitations based on CPU for example, but if I kept the number well below that limitation, how would I still send something that both exceeds the threshold and stays below?
Just looping N-times doesn't guarantee me the throughput.
The quick and dirty way is to spin up 300 threads that each do one request per second. The more elegant way is to use something like Eventmachine to create requests at the required rate. With the right non-blocking HTTP library it can easily generate that level of activity.
You also might try these tools:
ab, the Apache benchmarking tool, common on many systems. It's very good at abusing your system.
Siege for load testing.
How about a minimal homebrew solution:
OPS_PER_SECOND = 300
count = 0
duration = 10
start = Time.now
while true
  elapsed = Time.now - start
  break if elapsed >= duration
  # Request number `count` is due at count / OPS_PER_SECOND seconds after start.
  delay = count.to_f / OPS_PER_SECOND - elapsed
  sleep(delay) if delay > 0
  do_request # placeholder for the code that sends one request
  count += 1
end
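The pacing arithmetic in a loop like this can be isolated in a small function, which makes it easy to check on its own. This is only a sketch: pacing_delay and run_paced are names invented for the example, and it reads the monotonic clock through Process.clock_gettime so that wall-clock adjustments do not disturb the spacing.

```ruby
# Seconds to wait before sending request number `count` (0-based) so that
# requests stay evenly spaced at `rate` per second.
def pacing_delay(count, elapsed, rate)
  due_at = count.to_f / rate   # when this request should ideally go out
  [due_at - elapsed, 0.0].max  # never negative: if we are late, send now
end

# Calls the block at roughly `rate` calls per second for `duration` seconds
# and returns how many calls were made.
def run_paced(rate:, duration:)
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  count = 0
  loop do
    elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
    break if elapsed >= duration
    delay = pacing_delay(count, elapsed, rate)
    sleep(delay) if delay > 0
    yield count
    count += 1
  end
  count
end
```

Something like run_paced(rate: 300, duration: 10) { do_request } would then drive the test. If a single request takes longer than 1/300 of a second, one thread cannot keep up, and the delay simply stays at zero; at that point you need several threads or a non-blocking client.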

Howto know that I do not block Ruby eventmachine with a mongodb operation

I am working on an EventMachine-based application that periodically polls for changes to documents stored in MongoDB.
A simplified code snippet could look like:
require 'rubygems'
require 'eventmachine'
require 'em-mongo'
require 'bson'
EM.run {
  @db = EM::Mongo::Connection.new('localhost').db('foo_development')
  @posts = @db.collection('posts')
  @comments = @db.collection('comments')
  def handle_changed_posts
    EM.next_tick do
      cursor = @posts.find(state: 'changed')
      resp = cursor.defer_as_a
      resp.callback do |documents|
        handle_comments documents.map{|h| h["comment_id"]}.map(&:to_s) unless documents.length == 0
      end
      resp.errback do |err|
        raise *err
      end
    end
  end
  def handle_comments comment_ids
    # Iterate over the ids passed in (the original referenced an undefined meta_product_ids).
    comment_ids.each do |id|
      cursor = @comments.find({_id: BSON::ObjectId(id)})
      resp = cursor.defer_as_a
      resp.callback do |documents|
        magic_value = documents.first['weight'].to_i * documents.first['importance'].to_i
      end
      resp.errback do |err|
        raise *err
      end
    end
  end
  EM.add_periodic_timer(1) do
    puts "alive: #{Time.now.to_i}"
  end
  EM.add_periodic_timer(5) do
    handle_changed_posts
  end
}
So every 5 seconds EM iterates over all posts, and selects the changed ones. For each changed post it stores the comment_id in an array. When done that array is passed to a handle_comments which loads every comment and does some calculation.
Now I have some difficulties in understanding:
I know that this load_posts->load_comments->calculate cycle takes 3 seconds in a Rails console with 20000 posts, so it will not be much faster in EM. I schedule the handle_changed_posts method every 5 seconds, which is fine unless the number of posts rises and the calculation takes longer than the 5 seconds after which the same run is scheduled again. In that case I'd soon have a problem. How can I avoid that?
I trust em-mongo but I do not trust my EM knowledge. To monitor EM is still running I puts a timestamp every second. This seems to be working fine but gets a bit bumpy every 5 seconds when my calculation runs. Is that a sign, that I block the loop?
Is there any general way to find out if I block the loop?
Should I nice my eventmachine process with -19 to give it top OS prio always?
I have been reluctant to answer here since I've got no mongo experience so far, but considering no one is answering and some of the stuff here is general EM stuff I may be able to help:
schedule next scan on first scan's end (resp.callback and resp.errback in handle_changed_posts seem like good candidates to chain next scan), either with add_timer or with next_tick
probably, try handling your mongo trips more often so they handle smaller chunks of data, any cpu cycle hog inside your reactor would make your reactor loop too busy to accept events such as periodic timer ticks
no simple way, no. One idea would be to measure the diff of Time.now to next_tick{Time.now}, benchmark it, and then trace possible culprits when the diff crosses a threshold. Simulating slow queries (see "Simulate slow query in mongodb?") and many parallel connections is a good idea
I honestly don't know, I've never encountered people who do that, I expect it depends on other things running on that server
To expand upon bbozo's answer, specifically in relation to your second question, there is no time when you run code that you do not block the loop. In my experience, when we talk about 'non-blocking' code what we really mean is 'code that doesn't block very long'. Typically, these are very short periods of time (less than a millisecond), but they still block while executing.
Further, the only thing next_tick really does is to say 'do this, but not right now'. What you really want to do, as bbozo mentioned, is split up your processing over multiple ticks such that each iteration blocks for as little time as possible.
To use your own benchmarks, if 20,000 records take about 3 seconds to process, 4,000 records should take about 0.6 seconds. This would be short enough to not usually affect your 1 second heartbeat. You could split it up even further to reduce the amount of blockage and make the reactor run smoother, but it really depends on how much concurrency you need from the reactor.
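Splitting the work over multiple ticks could look roughly like the sketch below. process_in_chunks is a made-up helper, and schedule is assumed to be any callable that accepts a block and runs it later. Under EventMachine that could be EM.method(:next_tick), assuming the reactor is already running. Keeping the scheduler as a parameter also means the helper can be tried out without EventMachine at all.

```ruby
# Process `items` in slices of `chunk_size`, handing each slice to `schedule`
# so that other events (timers, I/O callbacks) can run between slices.
def process_in_chunks(items, chunk_size, schedule, &work)
  slices = items.each_slice(chunk_size).to_a
  step = lambda do
    slice = slices.shift
    return if slice.nil?               # nothing to do (empty input)
    slice.each(&work)                  # this part still blocks, but only briefly
    schedule.call(&step) unless slices.empty?
  end
  schedule.call(&step)                 # the first slice also runs on a later tick
end

# Stand-in scheduler for trying it outside a reactor: it just queues the blocks.
queue = []
fake_schedule = ->(&blk) { queue << blk }
processed = []
process_in_chunks((1..10).to_a, 4, fake_schedule) { |n| processed << n }
queue.shift.call until queue.empty?    # drain the queue, one "tick" at a time
```

With the numbers above, 20,000 records in slices of 4,000 would be spread over five ticks of roughly 0.6 seconds each, with the heartbeat timer free to fire in between.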

Regulating / rate limiting ruby mechanize

I need to regulate how often a Mechanize instance connects with an API (once every 2 seconds, so limit connections to that or more)
So this:
instance.pre_connect_hooks << Proc.new { sleep 2 }
I had thought this would work, and it sort of does, BUT now every method in that class sleeps for 2 seconds, as if the Mechanize instance is touched and told to hold for 2 seconds. I'm going to try a post-connect hook, but it is obvious I need something a bit more elaborate; I just don't know what at this point.
The code explains more than I can here, so if you are interested in following along: https://github.com/blueblank/reddit_modbot. Otherwise, my question is how to efficiently and effectively rate limit a Mechanize instance to a time frame specified by an API (where overstepping that limit results in dropped requests and bans). I'm also guessing I need to integrate the Mechanize instance into my class better, so any pointers on that are appreciated as well.
Pre and post connect hooks are called on every connect, so if there is some redirection it could trigger many times for one request. Try history_added which only gets called once:
instance.history_added = Proc.new {sleep 2}
I use SlowWeb to rate limit calls to a specific URL.
require 'slowweb'
SlowWeb.limit('example.com', 10, 60)
In this case calls to example.com domain are limited to 10 requests every 60 seconds.
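A fixed sleep 2 in a hook always costs the full two seconds, even when the previous request was long ago. One alternative is a small throttle that only waits for whatever part of the interval has not yet passed. The class below is a sketch, not part of Mechanize or SlowWeb: MinIntervalThrottle is an invented name, and the clock and sleeper arguments exist only so it can be exercised without real waiting.

```ruby
# Enforces a minimum gap of `interval` seconds between calls to #wait.
class MinIntervalThrottle
  def initialize(interval,
                 clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) },
                 sleeper: ->(seconds) { sleep(seconds) })
    @interval = interval
    @clock = clock
    @sleeper = sleeper
    @last = nil # time of the previous call, nil before the first one
  end

  # Blocks only for the part of the interval that has not yet elapsed.
  def wait
    now = @clock.call
    if @last
      remaining = @interval - (now - @last)
      if remaining > 0
        @sleeper.call(remaining)
        now = @clock.call # re-read the clock after sleeping
      end
    end
    @last = now
  end
end
```

A throttle created with MinIntervalThrottle.new(2) could then have its wait method called from whichever hook you settle on, in place of the bare sleep 2.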

Limitation in retrieving rows from a mongodb from ruby code

I have a code which gets all the records from a collection of a mongodb and then it performs some computations.
My program takes too much time as the "coll_id.find().each do |eachitem|......." returns only 300 records at an instant.
If I place a counter inside the loop and check it prints 300 records and then sleeps for around 3 to 4 seconds before printing the counter value for next set of 300 records..
coll_id.find().each do |eachcollectionitem|
  puts "counter value for record " + counter.to_s
  counter = counter + 1
  # ---- My computations here -----
end
Is this a limitation of the Ruby MongoDB API, or does some configuration need to be done so that the code can access all the records at once?
How large are your documents? It's possible that the deserialization is taking a long time. Are you using the C extensions (bson_ext)?
You might want to try passing a logger when you connect. That could help sort out what's going on. Alternatively, can you paste in the MongoDB log? What's happening there during the pause?
