How accurate are Ruby's mtime and friends?

In Ruby, how accurate are File.atime / File.ctime / File.mtime? Are they accurate to the nearest second on both Unix and Windows?

It depends on the file system.
FAT has 2-second precision.
NTFS is 100 nanoseconds.
ext3 is 1 second; it looks like there are extensions (ext4, for example) that bring this down to 1 nanosecond.
The 2-second FAT granularity can bite you if you are doing timestamp-based backups between FAT and NTFS or ext3.

I suspect most, if not all, of the "scripting" languages use the underlying OS calls to retrieve this sort of information. If that is the case, 1-second resolution is what you get under Linux and Windows (at least for NTFS; FAT32, if you're still using it, has a 2-second resolution, IIRC).
Ideally, you'd look at the Ruby source code to confirm this (actually, ideally, it would be documented, but that may be too much to hope for). Failing that, you could run a simple test where you continuously "touch" the file (more than once per second) and monitor its mtime values; see the sketch after this answer.
In the absence of a documented statement, you either have to rely on empirical evidence or rely on nothing at all.
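For reference, a minimal sketch of that empirical test, written against the underlying POSIX stat() call rather than Ruby itself (on the assumption, stated above, that Ruby just forwards to the OS). The file name, loop count and delay are arbitrary placeholders; if the printed nanosecond field is always zero, the filesystem is probably storing whole seconds only.
    /* Probe the mtime the kernel reports while rewriting a file faster
     * than once per second. Assumes Linux/POSIX; the file name is a
     * placeholder. */
    #define _POSIX_C_SOURCE 200809L
    #include <stdio.h>
    #include <fcntl.h>
    #include <time.h>
    #include <unistd.h>
    #include <sys/stat.h>

    int main(void) {
        const char *path = "mtime_probe.tmp";             /* hypothetical name */
        int fd = open(path, O_CREAT | O_WRONLY, 0644);
        if (fd < 0) { perror("open"); return 1; }

        struct timespec pause = { 0, 200 * 1000 * 1000 };  /* 200 ms */
        for (int i = 0; i < 5; i++) {
            write(fd, "x", 1);                             /* bump the mtime */
            fsync(fd);
            struct stat sb;
            if (stat(path, &sb) == 0)
                printf("mtime = %ld.%09ld\n",
                       (long)sb.st_mtim.tv_sec, (long)sb.st_mtim.tv_nsec);
            nanosleep(&pause, NULL);
        }
        close(fd);
        unlink(path);
        return 0;
    }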

Presuming you mean precise (since accuracy would depend on the hardware clock, and many other factors), the docs seem to indicate second precision.
EDIT: You're right. Time actually has microsecond precision (again, not accuracy).

Related

Do computers/CPUs really understand (binary)?

I have read and heard from many people, books, and sites that a computer understands nothing but binary. But they don't explain how a computer/CPU understands binary, so I was left wondering how it can. To my limited knowledge, to think or understand something one needs a brain, and of course to be alive, and a CPU lacks both.
Additionally, since a CPU runs on electricity, my guess is that the CPU understands nothing, not even binary. Rather, there are natural rules governing electricity, and we humans (or whoever invented the computer) discovered them (maybe if we send current through a certain combination of circuits we get a particular signal, who knows!) and also found a way to manipulate that current flow to produce what we need, i.e. different letters (for example, with a particular pattern of signals we can represent the letter 'A'). In other words, the computer/CPU doesn't understand anything.
It's just my wild guess. I hope someone can help me get a clear idea of whether a CPU really understands anything (binary), and if so, how. A detailed answer, article, or book would be great. Thanks in advance.
From the HashNode article "How does a computer machine understand 0s and 1s?":
A computer doesn't actually "understand" anything. It merely provides you with a way of information flow — input to output. The decisions to transform a given set of inputs to an output (computations) are made using boolean expressions (expressed using specific arrangements of logic gates).
At the hardware level we have a bunch of elements called transistors (modern computers have billions of them, and we are heading towards an era where they may become obsolete). These transistors are basically switching devices, turning ON and OFF based on the voltage supplied to their input terminals. If you translate the presence of voltage at the input of a transistor as 1 and the absence of voltage as 0 (you can do it the other way around too), there you have it: the digital language.
"understand" no. Computers don't understand anything, they're just machines that operate according to fixed rules for moving from one state to another.
But all these states are encoded in binary.
So if you anthropomorphise the logical (architectural) or physical (out-of-order execution, etc. etc.) operation of a computer, you might use the word "understand" as a metaphor for "process" / "operate in".
Taking this metaphor to the extreme, one toy architecture is called the Little Man Computer (LMC), named for the conceit / joke idea that there is a little man inside the vastly simplified CPU actually carrying out the operations.
The LMC model is based on the concept of a little man shut in a closed mail room (analogous to a computer in this scenario). At one end of the room, there are 100 mailboxes (memory), numbered 0 to 99, that can each contain a 3 digit instruction or data (ranging from 000 to 999).
So actually, LMC is based around a CPU that "understands" decimal, unlike a normal computer.
The LMC toy architecture is terrible to program for, except for the very simplest of programs. It doesn't support left/right bit-shifts or bitwise binary operations, which makes sense because it's based on decimal, not binary. (You can of course double a number, i.e. shift left, by adding it to itself, but shifting right needs other tricks; see the sketch below.)
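To make that parenthetical concrete, here is a tiny sketch (in C rather than LMC assembly, and purely illustrative) of doubling a number without a shift instruction and one crude way of halving it without a right shift; the values are arbitrary.
    /* Doubling and halving without shift instructions. */
    #include <stdio.h>

    int main(void) {
        int x = 21;
        int doubled = x + x;                 /* equivalent to x << 1 */

        /* Halving without a right shift: count how many 2s fit into x. */
        int halved = 0;
        for (int r = x; r >= 2; r -= 2) halved++;

        printf("%d doubled = %d, halved = %d\n", x, doubled, halved);
        return 0;
    }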

Accurate time delta for moderate time intervals: GetTickCount64 vs QueryPerformanceCounter

There are lots of questions (here, here, here) about mechanisms for getting monotonic time on Windows and their various gotchas and pitfalls. I'm particularly interested in the accuracy (not precision) of the main options.
I'm looking to measure elapsed time on a single machine, when the time is on the order of multiple minutes to an hour. What I know so far:
QueryPerformanceCounter is great for short time intervals, but QueryPerformanceFrequency (QPF) can have error on the order of 500 PPM, which translates to an error of about 2 seconds over an hour.
More concerning is that even on fairly recent processors, folks are seeing QPC misbehavior.
Microsoft recommends QPC above all else for short-term duration measurements. But short-term isn't defined in any absolute numbers.
GetTickCount64 is often cited as a nice, reliable, less precise alternative to QPC.
I've not found any good details about the accuracy of GetTickCount64. While it is less precise than QPC, how does its accuracy compare? What kind of error might I expect over an hour?
Some programs play with its resolution by using timeBeginPeriod, although I don't think this affects accuracy?
The docs talk about how GetTickCount64's resolution is not affected by adjustments made by the GetSystemTimeAdjustment function. Hopefully this means GetTickCount64 is monotonic and not adjusted ever? It is unusual wording...
GetSystemTimePreciseAsFileTime is an option for same-machine time deltas if I disable automatic time adjustment via SetSystemTimeAdjustment. It is backed by QPC. Is there any benefit to using this over QPC directly? (Perhaps it does sanitization or thread affinity tricks to avoid some of the issues encountered by direct QPC calls?)
One SO Q&A I found linked to this blog post, which has been particularly useful to read. While it doesn't answer my question directly, it dives into how QPC works on Windows, and how the common Linux monotonic clock basically uses the same thing.
The gist is that both of them use RDTSC when an invariant TSC is available on modern hardware.
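For reference, here is a minimal sketch of the kind of side-by-side check implied above, assuming Windows, a C compiler and windows.h: it times one interval with both QueryPerformanceCounter and GetTickCount64 and prints how far apart the two estimates land. The one-minute Sleep is just a stand-in for the real minutes-to-an-hour workload.
    /* Compare elapsed-time estimates from QPC and GetTickCount64. */
    #include <windows.h>
    #include <stdio.h>

    int main(void) {
        LARGE_INTEGER freq, qpc_start, qpc_end;
        QueryPerformanceFrequency(&freq);

        QueryPerformanceCounter(&qpc_start);
        ULONGLONG tick_start = GetTickCount64();

        Sleep(60 * 1000);   /* stand-in for the real workload */

        QueryPerformanceCounter(&qpc_end);
        ULONGLONG tick_end = GetTickCount64();

        double qpc_seconds  = (double)(qpc_end.QuadPart - qpc_start.QuadPart)
                              / (double)freq.QuadPart;
        double tick_seconds = (double)(tick_end - tick_start) / 1000.0;

        printf("QPC:           %.6f s\n", qpc_seconds);
        printf("GetTickCount64: %.3f s\n", tick_seconds);
        printf("difference:    %.6f s\n", qpc_seconds - tick_seconds);
        return 0;
    }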

Do (sampling) profilers still "lie" these days?

Most of my limited experience with profiling native code is on a GPU rather than on a CPU, but I see some CPU profiling in my future...
Now, I've just read this blog post:
How profilers lie: The case of gprof and KCacheGrind
about how what profilers measure differs from what they show you, which is likely not what you expect if you're interested in discerning between different call paths and the time spent in them.
My question is: Is this still the case today (5 years later)? That is, do sampling profilers (i.e. those that don't slow execution down terribly) still behave the way gprof used to (or callgrind without --separate-callers=N)? Or do profilers nowadays customarily record the entire call stack when sampling?
No, many modern sampling profilers don't exhibit the problem described regarding gprof.
In fact, even when that was written, the specific problem was actually more a quirk of the way gprof uses a mix of instrumentation and sampling and then tries to reconstruct a hypothetical call graph based on limited caller/callee information and combine that with the sampled timing information.
Modern sampling profilers, such as perf, VTune, and various profilers for languages that don't compile to native code, can capture the full call stack with each sample, which provides accurate times with respect to that issue. Alternatively, you might sample without collecting call stacks (which greatly reduces the sampling cost) and then present the information without any caller/callee breakdown, which would still be accurate.
This was largely true even in the past, so I think it's fair to say that sampling profilers never, as a group, really exhibited that problem.
Of course, there are still various ways in which profilers can lie. For example, getting results accurate to the instruction level is a very tricky problem, given modern CPUs with hundreds of instructions in flight at once, possibly across many functions, and complex performance models where instructions may have a very different in-context cost as compared to their nominal latency and throughput values. Even those tricky issues can be helped with "hardware assist", such as on recent x86 chips with PEBS support and later related features that help you pinpoint an instruction in a less biased way.
Regarding gprof, yes, it's still the case today. This is by design, to keep the profiling overhead small. From the up-to-date documentation:
Some of the figures in the call graph are estimates—for example, the children time values and all the time figures in caller and subroutine lines.
There is no direct information about these measurements in the profile data itself. Instead, gprof estimates them by making an assumption about your program that might or might not be true.
The assumption made is that the average time spent in each call to any function foo is not correlated with who called foo. If foo used 5 seconds in all, and 2/5 of the calls to foo came from a, then foo contributes 2 seconds to a’s children time, by assumption.
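To make that assumption concrete, here is a hypothetical sketch (not gprof output, and the function names are invented) in which foo's per-call cost is strongly correlated with its caller, so gprof's proportional split of foo's time by call count (2/5 and 3/5 here) would badly misattribute the children time.
    /* foo's cost depends heavily on its argument, so its per-call time is
     * correlated with the caller, violating the assumption quoted above. */
    #include <stdio.h>

    static double work(long iters) {          /* stand-in for real work */
        double x = 0.0;
        for (long i = 0; i < iters; i++) x += i * 0.5;
        return x;
    }

    static double foo(long iters) { return work(iters); }

    static double a(void) {                   /* 2 cheap calls to foo */
        return foo(1000) + foo(1000);
    }

    static double b(void) {                   /* 3 expensive calls to foo */
        return foo(50000000) + foo(50000000) + foo(50000000);
    }

    int main(void) {
        /* gprof would charge a() 2/5 of foo's total time by call count,
         * even though nearly all of it was spent on behalf of b(). */
        printf("%f\n", a() + b());
        return 0;
    }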
Regarding KCacheGrind, little has changed since the article was written. You can check out the change log and see that the latest version was published on April 5, 2013, and its changes are unrelated to this issue. You can also refer to Josef Weidendorfer's comments under the article (Josef is the author of KCacheGrind).
As you may have noticed, I contributed several comments to the post you referenced. But it's not just that profilers give you bad information; it's that people fool themselves about what performance actually is.
What is your goal? Is it to A) find out how to make the program as fast as possible? Or is it to B) measure time taken by various functions, hoping that will lead to A? (Hint - it doesn't.) Here's a detailed list of the issues.
To illustrate: You could, for example, be calling a teeny innocent-looking little function somewhere that just happens to invoke nine yards of system code including reading a .dll to extract a string resource in order to internationalize it. This could be taking 50% of wall-clock time and therefore be on the stack 50% of wall-clock time. Would a "CPU-profiler" show it to you? No, because practically all of that 50% is doing I/O. Do you need many many stack samples to know to 3 decimal places exactly how much time it's taking? Of course not. If you only got 10 samples it would be on 5 of them, give or take. Once you know that teeny routine is a big problem, does that mean you're out of luck because somebody else wrote it? What if you knew what the string was that it was looking up? Does it really need to be internationalized, so much so that you're willing to pay a factor of two in slowness just for that? Do you see how useless measurements are when your real problem is to understand qualitatively what takes time?
I could go on and on with examples like this...

Estimating compile time?

Is there any rule of thumb for determining a 'reasonable' amount of compilation time for a program? Obviously it's somewhat subjective what 'reasonable' means, but all things being equal, a console-based "Hello, World" shouldn't take 2 hours to compile, for instance. To provide a concrete example --
Given a repository of C-code, X # of lines of code, gcc optimization level Y, ... is there any reasonable way to predict the amount of time for compilation? And any opinions on what's a 'reasonable' amount of time?
Clarification
The parameters of interest here are only code-dependent, NOT CPU-, memory-, or network-dependent.
For most reasonable programs built from reasonable sources on reasonable development machines (2+ GHz, 4+ GiB RAM), the answer for compiling a single source file should be 'a few seconds'. Anything in the minutes range usually indicates a problem, in my experience. The time taken to compile a complete program is then controlled by how many files there are to compile; it takes longer to compile 20,000 files than it does to compile 20 — roughly a thousand times as long, in fact. (Reasonable sources usually have source files under 10k lines and headers under about 1k lines — there are plenty of exceptions to either guideline, and both numbers are fairly generous.)
But it all depends. If your headers are on a network file system, compilation is likely to be slower than if they are on local file system — unless your local drive is slow, the network blazingly fast and backed by SSD, and ... oh, gosh; there are just so many factors that it is nigh-on impossible to give a good answer!
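If you want to put numbers on your own tree, one rough approach is to time individual compiles. A minimal sketch, assuming a POSIX system with gcc on the PATH and using example.c as a placeholder file name (a shell loop around time(1) does the same job with less ceremony):
    /* Time one single-file compile by shelling out to gcc. */
    #define _POSIX_C_SOURCE 199309L
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    int main(void) {
        struct timespec t0, t1;
        clock_gettime(CLOCK_MONOTONIC, &t0);
        int rc = system("gcc -O2 -c example.c -o example.o");  /* placeholder */
        clock_gettime(CLOCK_MONOTONIC, &t1);

        double secs = (double)(t1.tv_sec - t0.tv_sec)
                    + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
        printf("system() returned %d, wall time %.2f s\n", rc, secs);
        return 0;
    }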
Other factors, aside from the ones mentioned in Jonathan's answer, are: programming language, coding style, compiler version, p-code or binary generation, etc.

Is there a way to predict (or measure) how long it will take to execute a single line of code?

My question is this:
Given a single line of isolated code (i.e., it does not parse, read from or write to a file on the HDD, call a subroutine, etc.), is it possible to predict or measure, with any degree of accuracy and/or consistency, how long the code will take to execute on a given system?
For example, say I have the following code (not in any specific language, just a generalization):
1 | If x = 1 then
2 | x = x + 1
3 | End If
how long would it take to execute line 2?
I am not looking for numbers here; I am just wondering if such a thing is possible or practical.
Thanks!
Update:
Now I am looking for some numbers... If I were to execute a simple For loop that simply sleeps 1 second per iteration, after 60 (actual) minutes, how far off would it be? In other words, can the time taken to evaluate a line of isolated code be considered negligible (assuming no errors or interrupts)?
If you look at documentation such as this you will find how many clock cycles the fundamental operations take on a CPU, and how long those clock cycles are. Translating those numbers into the time taken to perform x = x+1 in your program is an imprecise science, but you'll get some kind of clue by examining the assembly code listing that your compiler produces.
The science becomes less and less precise as you move from simple arithmetic statements towards large programs and you start hitting all sorts of issues arising from the complexity of modern CPUs and modern operating systems and memory hierarchies and all the rest.
The simplest way would be to run it a billion times, see how much time it took, and then divide by one billion. This should remove the problems with clock accuracy.
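Here is a minimal sketch of that approach in C, under stated assumptions: the line being measured is the x = x + 1 from the question, volatile keeps the compiler from deleting the loop, and loop/reset overhead is included, so the per-iteration figure is only a rough estimate.
    /* Run the statement a huge number of times, time the whole loop with
     * clock(), and divide. Treat the result as approximate. */
    #include <stdio.h>
    #include <time.h>

    int main(void) {
        const long iterations = 100000000L;   /* 100 million, adjust to taste */
        volatile long x = 1;

        clock_t start = clock();
        for (long i = 0; i < iterations; i++) {
            if (x == 1)
                x = x + 1;                    /* the "line 2" being measured */
            x = 1;                            /* reset so the branch is taken */
        }
        double total = (double)(clock() - start) / CLOCKS_PER_SEC;

        printf("total %.3f s, roughly %.3g ns per iteration\n",
               total, total / (double)iterations * 1e9);
        return 0;
    }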
Sounds like you might be interested in learning about Big O notation. Using Big O, you can figure out how to write the most efficient algorithms regardless of the speed of the machine they run on.
Most computers will usually have some processes, services or other "jobs" running in the background, consuming resources at different times for various reasons, which will tend to make it hard to predict exact execution times. However, if you could break the above code down into assembly language and count the bits, registers and clock ticks used at the machine level, I would assume it would be possible to get an accurate estimate.
The answer is a resounding no. For example consider what would happen if an interrupt was raised during the addition: the next instruction would be executed after the interrupt has been serviced, therefore giving the impression that the addition took an awful lot of time.
If we're talking about an ISR-disabled condition, then the answer is "it depends". For example, a branch misprediction on the if would probably slow down the execution a lot (the sketch after this answer illustrates how branch cost depends on the data). The same could be said about multicore CPUs (often some of the ALUs are shared between cores, so activity on one core could affect the performance of operations on other cores).
If we're talking about an ISR-disabled condition on a CPU with a single core and a single-stage pipeline then I guess you should be pretty much able to predict how long it takes. Problem is, you'd be probably working on a microcontroller (and an old one, at that). :)
Update: on second thought, even in the last case you probably wouldn't be able to predict it exactly: in fact, some operations are data-dependent (the canonical example being floating-point operations on denormal numbers).
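As a concrete illustration of the branch-prediction point above, here is a rough C sketch (array size, data range and pass count are arbitrary) that times the same if line over random data and then over sorted data; on most modern desktop CPUs the sorted run is noticeably faster.
    /* Time the same "if" line over unpredictable vs predictable data. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    #define N (1 << 20)

    static void time_sum(const int *data, const char *label) {
        clock_t start = clock();
        long long sum = 0;
        for (int pass = 0; pass < 100; pass++)
            for (int i = 0; i < N; i++)
                if (data[i] >= 128)           /* the line whose cost varies */
                    sum += data[i];
        double secs = (double)(clock() - start) / CLOCKS_PER_SEC;
        printf("%s: sum=%lld time=%.3f s\n", label, sum, secs);
    }

    static int cmp_int(const void *a, const void *b) {
        return *(const int *)a - *(const int *)b;
    }

    int main(void) {
        static int data[N];
        for (int i = 0; i < N; i++) data[i] = rand() % 256;

        time_sum(data, "random order");       /* branch mispredicts often */
        qsort(data, N, sizeof data[0], cmp_int);
        time_sum(data, "sorted order");       /* branch is predictable */
        return 0;
    }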
A way you can figure it out is to write that line of code 1000 times, time how long it takes to run that code, and then divide the time by 1000. To get a more accurate figure, you can run that test 10 times and take the average across all those runs.
