Can dbms_utility.get_time rollover? - oracle

I'm having problems with a mammoth legacy PL/SQL procedure which has the following logic:
l_elapsed := dbms_utility.get_time - l_timestamp;
where l_elapsed and l_timestamp are of type PLS_INTEGER and l_timestamp holds the result of a previous call to get_time
This line suddenly started failing during a batch run with a ORA-01426: numeric overflow
The documentation on get_time is a bit vague, possibly deliberately so, but it strongly suggests that the return value has no absolute significance, and can be pretty much any numeric value. So I was suspicious to see it being assigned to a PLS_INTEGER, which can only support 32 bit integers. However, the interweb is replete with examples of people doing exactly this kind of thing.
The smoking gun is found when I invoke get_time manually, it is returning a value of -214512572, which is suspiciously close to the min value of a 32 bit signed integer. I'm wondering if during the time elapsed between the first call to get_time and the next, Oracle's internal counter rolled over from its max value and its min value, resulting in an overflow when trying to subtract one from the other.
Is this a likely explanation? If so, is this an inherent flaw in the get_time function? I could just wait and see if the batch fails again tonight, but I'm keen to get an explanation for this behaviour before then.

Maybe late, but this may benefit someone searching on the same question.
The underlying implementation is a simple 32 bit binary counter, which is incremented every 100th of a second, starting from when the database was last started.
This binary counter is is being mapped onto a PL/SQL BINARY_INTEGER type - which is a signed 32-bit integer (there is no sign of it being changed to 64-bit on 64-bit machines).
So, presuming the clock starts at zero it will hit the +ve integer limit after about 248 days, and then flip over to become a -ve value falling back down to zero.
The good news is that provided both numbers are the same sign, you can do a simple subtraction to find duration - otherwise you can use the 32-bit remainder.
IF SIGN(:now) = SIGN(:then) THEN
RETURN :now - :then;
ELSE
RETURN MOD(:now - :then + POWER(2,32),POWER(2,32));
END IF;
Edit : This code will blow the int limit and fail if the gap between the times is too large (248 days) but you shouldn't be using GET_TIME to compare durations measure in days anyway (see below).
Lastly - there's the question of why you would ever use GET_TIME.
Historically, it was the only way to get a sub-second time, but since the introduction of SYSTIMESTAMP, the only reason you would ever use GET_TIME is because it's fast - it is a simple mapping of a 32-bit counter, with no real type conversion, and doesn't make any hit on the underlying OS clock functions (SYSTIMESTAMP seems to).
As it only measures relative time, it's only use is for measuring the duration between two points. For any task that takes a significant amount of time (you know, over 1/1000th of a second or so) the cost of using a timestamp instead is insignificant.
The number of occasions on where it is actually useful is minimal (the only one I've found is checking the age of data in a cache, where doing a clock hit for every access becomes significant).

From the 10g doc:
Numbers are returned in the range -2147483648 to 2147483647 depending on platform and machine, and your application must take the sign of the number into account in determining the interval. For instance, in the case of two negative numbers, application logic must allow that the first (earlier) number will be larger than the second (later) number which is closer to zero. By the same token, your application should also allow that the first (earlier) number be negative and the second (later) number be positive.
So while it is safe to assign the result of dbms_utility.get_time to a PLS_INTEGER it is theoretically possible (however unlikely) to have an overflow during the execution of your batch run. The difference between the two values would then be greater than 2^31.
If your job takes a lot of time (therefore increasing the chance that the overflow will happen), you may want to switch to a TIMESTAMP datatype.

Assigning a negative value to your PLS_INTEGER variable does raise an ORA-01426:
SQL> l
1 declare
2 a pls_integer;
3 begin
4 a := -power(2,33);
5* end;
SQL> /
declare
*
FOUT in regel 1:
.ORA-01426: numeric overflow
ORA-06512: at line 4
However, you seem to suggest that -214512572 is close to -2^31, but it's not, unless you forgot to typ a digit. Are we looking at a smoking gun?
Regards,
Rob.

Related

Best way finding the middle point

In many algorithm i have seen people are using two different way to get middle point.
(LOW+HIGH)/2
LOW+(HIGH-LOW)/2
Mostly i have seen 2nd method for example in QuickSort. What is the best way to find middle between two number and why ?
It entirely depends on the context, but I'll make a case for case number 2 and explain why.
Let's first assume you go for case number 1, which is this:
(LOW + HIGH) / 2
This looks entirely reasonable, and mathematically, it is. Let's plug in two numbers and look at the results:
(12345 + 56789) / 2
The result is 34567. Looks OK doesn't it?
Now, the problem is that in computers it's not as simple as that. You also got something called data types to contend with. Usually this is denoted with such things as number of bits. In other words, you might have a 32-bit number, or a 16-bit number, or a 64-bit number, and so on.
All of these have what is known as a legal range of values, ie. "what kind of values will these types hold". An 8-bit number, unsigned (which means it cannot be negative) can hold 2 to the power of 8 different values, or 256. A 16-bit unsigned value can hold 65536 values, or a range from 0 to 65535. If the values are signed they go from -half to +half-1, meaning it will go from -128 to +127 for a 8-bit signed value, and from -32768 to +32767 for a signed 16-bit value.
So now we go back to the original formula. What if the data type we're using for the calculation isn't enough to hold LOW + HIGH ?
For instance, let's say we used 16-bit signed values for this, and we still got this expression:
(12345 + 56789) / 2
12345 can be held in a 16-bit value (it is less than 65536), same with 56789, but what about the result? The result of adding 12345 and 56789 is 69134 which is more than 65535 (the highest unsigned 16-bit value).
So what will happen to it? There's two outcomes:
It will overflow, which means it will start back at 0 and count upwards, which means it will actually end up with the result of 3598 (123456 + 56789) - 65536
It will throw an exception or similar, crashing the program with an overflow problem.
If we get the first result, then (12345 + 56789)/2 becomes 3598/2 or 1799. Clearly incorrect.
So, then what if we used the other approach:
12345 + (56789-12345)/2
First, let's do the parenthesis: 56789-12345 is equal to 44444, a number that can be held in a 16-bit data type.
Adding 12345 + 44444 gives us 56789, a number that can also be held in a 16-bit data type.
Dividing 56789 by 2 gives us 28934.5. Since we're probably dealing with "integers" here we get 28934 (typically, unless your particular world rounds up).
So the reason the second expression is chosen above the first is that it doesn't have to handle overflow the same way and is more resilient to this kind of problem.
In fact, if you think about it, the maximum second value you can have is the maximum legal value you can have for your data type, so this kind of expression:
X + (Y-X)
... assuming both X and Y are the same data type, can at most be the maximum of that data type. Basically, it will not have to contend with an overflow at all.
2nd method used, for avoid int overflow during computation.
Imagine, you used just 1-byte unsigned integer, so overflow happening if value reached 256. Imagine, we have low=100 and high=200. See computing:
1. (lo + hi) / 2 = (100 + 200) / 2 = 300 / 2; // 300 > 256, int overflow
2. lo + (hi - lo) / 2 = 100 + (200 - 100) / 2 = 150; // no overflow
There isn't a best one, but they are clearly different.
(LOW+HIGH)/2 causes trouble if the addition overflows, both in the signed and unsigned case. Doesn't have to be an issue, if you can assume that it never happens it's a perfectly reasonable thing to do.
LOW+(HIGH-LOW)/2 can still cause trouble with overflows if LOW is allowed to be arbitrarily negative. For non-negative input it's fine though. Costs an extra operation.
(LOW+HIGH) >>> 1 as seen in Java. For non-negative but signed input, overflow is non-destructive. The sign bit is used as an extra bit of space that the addition can carry into, the shift then divides the unsigned result by two. Does not work at all if the result is supposed to negative, because by construction the result is non-negative. For array indexes that isn't really a consideration. Doesn't help if you were working with unsigned indexes though.
MOV MID, LOW \ ADD MID, HIGH \ RCR MID in pseudo x86 asm, explicitly uses an extra bit of space and therefore works for all unsigned input no matter what, but cannot be used in most languages.
They all have their upsides and downsides. There is no winner.
The best way is dependent on what you're trying to accomplish. The first is clearly faster (best for performance), while the second one is used to avoid overflow (best for correctness). So the answer to your question is dependent on your definition of "best".

A clever homebrew modulus implementation

I'm programming a PLC with some legacy software (RSLogix 500, don't ask) and it does not natively support a modulus operation, but I need one. I do not have access to: modulus, integer division, local variables, a truncate operation (though I can hack it with rounding). Furthermore, all variables available to me are laid out in tables sorted by data type. Finally, it should work for floating point decimals, for example 12345.678 MOD 10000 = 2345.678.
If we make our equation:
dividend / divisor = integer quotient, remainder
There are two obvious implementations.
Implementation 1:
Perform floating point division: dividend / divisor = decimal quotient. Then hack together a truncation operation so you find the integer quotient. Multiply it by the divisor and find the difference between the dividend and that, which results in the remainder.
I don't like this because it involves a bunch of variables of different types. I can't 'pass' variables to a subroutine, so I just have to allocate some of the global variables located in multiple different variable tables, and it's difficult to follow. Unfortunately, 'difficult to follow' counts, because it needs to be simple enough for a maintenance worker to mess with.
Implementation 2:
Create a loop such that while dividend > divisor divisor = dividend - divisor. This is very clean, but it violates one of the big rules of PLC programming, which is to never use loops, since if someone inadvertently modifies an index counter you could get stuck in an infinite loop and machinery would go crazy or irrecoverably fault. Plus loops are hard for maintenance to troubleshoot. Plus, I don't even have looping instructions, I have to use labels and jumps. Eww.
So I'm wondering if anyone has any clever math hacks or smarter implementations of modulus than either of these. I have access to + - * /, exponents, sqrt, trig functions, log, abs value, and AND/OR/NOT/XOR.
How many bits are you dealing with? You could do something like:
if dividend > 32 * divisor dividend -= 32 * divisor
if dividend > 16 * divisor dividend -= 16 * divisor
if dividend > 8 * divisor dividend -= 8 * divisor
if dividend > 4 * divisor dividend -= 4 * divisor
if dividend > 2 * divisor dividend -= 2 * divisor
if dividend > 1 * divisor dividend -= 1 * divisor
quotient = dividend
Just unroll as many times as there are bits in dividend. Make sure to be careful about those multiplies overflowing. This is just like your #2 except it takes log(n) instead of n iterations, so it is feasible to unroll completely.
If you don't mind overly complicating things and wasting computer time you can calculate modulus with periodic trig functions:
atan(tan(( 12345.678 -5000)*pi/10000))*10000/pi+5000 = 2345.678
Seriously though, subtracting 10000 once or twice (your "implementation 2") is better. The usual algorithms for general floating point modulus require a number of bit-level manipulations that are probably unfeasible for you. See for example http://www.netlib.org/fdlibm/e_fmod.c (The algorithm is simple but the code is complex because of special cases and because it is written for IEEE 754 double precision numbers assuming there is no 64-bit integer type)
This all seems completely overcomplicated. You have an encoder index that rolls over at 10000 and objects rolling along the line whose positions you are tracking at any given point. If you need to forward project stop points or action points along the line, just add however many inches you need and immediately subtract 10000 if your target result is greater than 10000.
Alternatively, or in addition, you always get a new encoder value every PLC scan. In the case where the difference between the current value and last value is negative you can energize a working contact to flag the wrap event and make appropriate corrections for any calculations on that scan. (**or increment a secondary counter as below)
Without knowing more about the actual problem it is hard to suggest a more specific solution but there are certainly better solutions. I don't see a need for MOD here at all. Furthermore, the guys on the floor will thank you for not filling up the machine with obfuscated wizard stuff.
I quote :
Finally, it has to work for floating point decimals, for example
12345.678 MOD 10000 = 2345.678
There is a brilliant function that exists to do this - it's a subtraction. Why does it need to be more complicated than that? If your conveyor line is actually longer than 833 feet then roll a second counter that increments on a primary index roll-over until you've got enough distance to cover the ground you need.
For example, if you need 100000 inches of conveyor memory you can have a secondary counter that rolls over at 10. Primary encoder rollovers can be easily detected as above and you increment the secondary counter each time. Your working encoder position, then, is 10000 times the counter value plus the current encoder value. Work in the extended units only and make the secondary counter roll over at whatever value you require to not lose any parts. The problem, again, then reduces to a simple subtraction (as above).
I use this technique with a planetary geared rotational part holder, for example. I have an encoder that rolls over once per primary rotation while the planetary geared satellite parts (which themselves rotate around a stator gear) require 43 primary rotations to return to an identical starting orientation. With a simple counter that increments (or decrements, depending on direction) at the primary encoder rollover point it gives you a fully absolute measure of where the parts are at. In this case, the secondary counter rolls over at 43.
This would work identically for a linear conveyor with the only difference being that a linear conveyor can go on for an infinite distance. The problem then only needs to be limited by the longest linear path taken by the worst-case part on the line.
With the caveat that I've never used RSLogix, here is the general idea (I've used generic symbols here and my syntax is probably a bit wrong but you should get the idea)
With the above, you end up with a value ENC_EXT which has essentially transformed your encoder from a 10k inch one to a 100k inch one. I don't know if your conveyor can run in reverse, if it can you would need to handle the down count also. If the entire rest of your program only works with the ENC_EXT value then you don't even have to worry about the fact that your encoder only goes to 10k. It now goes to 100k (or whatever you want) and the wraparound can be handled with a subtraction instead of a modulus.
Afterword :
PLCs are first and foremost state machines. The best solutions for PLC programs are usually those that are in harmony with this idea. If your hardware is not sufficient to fully represent the state of the machine then the PLC program should do its best to fill in the gaps for that missing state information with the information it has. The above solution does this - it takes the insufficient 10000 inches of state information and extends it to suit the requirements of the process.
The benefit of this approach is that you now have preserved absolute state information, not just for the conveyor, but also for any parts on the line. You can track them forward and backward for troubleshooting and debugging and you have a much simpler and clearer coordinate system to work with for future extensions. With a modulus calculation you are throwing away state information and trying to solve individual problems in a functional way - this is often not the best way to work with PLCs. You kind of have to forget what you know from other programming languages and work in a different way. PLCs are a different beast and they work best when treated as such.
You can use a subroutine to do exactly what you are talking about. You can tuck the tricky code away so the maintenance techs will never encounter it. It's almost certainly the easiest for you and your maintenance crew to understand.
It's been a while since I used RSLogix500, so I might get a couple of terms wrong, but you'll get the point.
Define a Data File each for your floating points and integers, and give them symbols something along the lines of MOD_F and MOD_N. If you make these intimidating enough, maintenance techs leave them alone, and all you need them for is passing parameters and workspace during your math.
If you really worried about them messing up the data tables, there are ways to protect them, but I have forgotten what they are on a SLC/500.
Next, defined a subroutine, far away numerically from the ones in use now, if possible. Name it something like MODULUS. Again, maintenance guys almost always stay out of SBRs if they sound like programming names.
In the rungs immediately before your JSR instruction, load the variables you want to process into the MOD_N and MOD_F Data Files. Comment these rungs with instructions that they load data for MODULUS SBR. Make the comments clear to anyone with a programming background.
Call your JSR conditionally, only when you need to. Maintenance techs do not bother troubleshooting non-executing logic, so if your JSR is not active, they will rarely look at it.
Now you have your own little walled garden where you can write your loop without maintenance getting involved with it. Only use those Data Files, and don't assume the state of anything but those files is what you expect. In other words, you cannot trust indirect addressing. Indexed addressing is OK, as long as you define the index within your MODULUS JSR. Do not trust any incoming index. It's pretty easy to write a FOR loop with one word from your MOD_N file, a jump and a label. Your whole Implementation #2 should be less than ten rungs or so. I would consider using an expression instruction or something...the one that lets you just type in an expression. Might need a 504 or 505 for that instruction. Works well for combined float/integer math. Check the results though to make sure the rounding doesn't kill you.
After you are done, validate your code, perfectly if possible. If this code ever causes a math overflow and faults the processor, you will never hear the end of it. Run it on a simulator if you have one, with weird values (in case they somehow mess up the loading of the function inputs), and make sure the PLC does not fault.
If you do all that, no one will ever even realize you used regular programming techniques in the PLC, and you will be fine. AS LONG AS IT WORKS.
This is a loop based on the answer by #Keith Randall, but it also maintains the result of the division by substraction. I kept the printf's for clarity.
#include <stdio.h>
#include <limits.h>
#define NBIT (CHAR_BIT * sizeof (unsigned int))
unsigned modulo(unsigned dividend, unsigned divisor)
{
unsigned quotient, bit;
printf("%u / %u:", dividend, divisor);
for (bit = NBIT, quotient=0; bit-- && dividend >= divisor; ) {
if (dividend < (1ul << bit) * divisor) continue;
dividend -= (1ul << bit) * divisor;
quotient += (1ul << bit);
}
printf("%u, %u\n", quotient, dividend);
return dividend; // the remainder *is* the modulo
}
int main(void)
{
modulo( 13,5);
modulo( 33,11);
return 0;
}

Obtaining the position of the first active bit in a string of bits in only one operation

The first question would be:
Is possible to get the position of the first active bit in a string of bits in only one operation ?
An the second of course: How to do it ?
Thanks in advance.
I have heard there are some processors where getting the highest bit of an integer is a single instruction, but I can not name which ones. Even if there are such processors, you will only be able to get the highest bit of an integer not of an arbitrary binary number, which seems to be the case in your question.
For longer bit sequences I don't think you have any better option than to check for each bit, while for shorter ones you may precompute the highest bit values(for instance have an array storing the highest bit for all numbers up to 32768) and than simply get a value from that array to have the answer you need for all up to 15 bit sequences.

Time as a Signed Integer

I've been reading up on the Y2038 problem and I understand that time_t will eventually revert to the lowest representable negative number because it'll try to "increment" the sign bit.
According to that Wikipedia page, changing time_t to an unsigned integer cannot be done because it would break programs that handle early dates. (Which makes sense.)
However, I don't understand why it wasn't made an unsigned integer in the first place. Why not just store January 1, 1970 as zero rather than some ridiculous negative number?
Because letting it start at signed −2,147,483,648 is equivalent to letting it start at unsigned 0. It doesn't change the range of a values a 32 bit integer can hold - a 32 bit integer can hold 4,294,967,296 different states. The problem isn't the starting point, the problem is the maximum value which can be held by the integer. Only way to mitigate the problem is to upgrade to 64 bit integers.
Also (as I just realized that): 1970 was set as 0, so we could reach back in time as well. (reaching back to 1901 seemed to be sufficient at the time). If they went unsigned, the epoch would've begun at 1901 to be able to reach back from 1970, and we would have the same problem again.
There's a more fundamental problem here than using unsigned values. If we used unsigned values, then we'd get only one more bit of timekeeping. This would have a definitely positive impact - it would double the amount of time we could keep - but then we'd have a problem much later on in the future. More generally, for any fixed-precision integer value, we'd have a problem along these lines.
When UNIX was being developed in the 1970s, having a 60 year clock sounded fine, though clearly a 120-year clock would have been better. If they had used more bits, then we'd have a much longer clock - say 1000 years - but after that much time elapsed we'd be right back in the same bind and would probably think back and say "why didn't they use more bits?"
Because not all systems have to deal purely with "past" and "future" values. Even in the 70s, when Unix was created and the time system defined, they had to deal with dates back in the 60s or earlier. So, a signed integer made sense.
Once everyone switches to 64bit time_t's, we won't have to worry about a y2038k type problem for another 2billion or so 136-year periods.

Determining Millisecond Time Intervals In Cocoa

Just as background, I'm building an application in Cocoa. This application existed originally in C++ in another environment. I'd like to do as much as possible in Objective-C.
My questions are:
1)
How do I compute, as an integer, the number of milliseconds between now and the previous time I remembered as now?
2)
When used in an objective-C program, including time.h, what are the units of
clock()
Thank you for your help.
You can use CFAbsoluteTimeGetCurrent() but bear in mind the clock can change between two calls and can screw you over. If you want to protect against that you should use CACurrentMediaTime().
The return type of these is CFAbsoluteTime and CFTimeInterval respectively, which are both double by default. So they return the number of seconds with double precision. If you really want an integer you can use mach_absolute_time() found in #include <mach/mach_time.h> which returns a 64 bit integer. This needs a bit of unit conversion, so check out this link for example code. This is what CACurrentMediaTime() uses internally so it's probably best to stick with that.
Computing the difference between two calls is obviously just a subtraction, use a variable to remember the last value.
For the clock function see the documentation here: clock(). Basically you need to divide the return value by CLOCKS_PER_SEC to get the actual time.
How do I compute, as an integer, the number of milliseconds between now and the previous time I remembered as now?
Is there any reason you need it as an integral number of milliseconds? Asking NSDate for the time interval since another date will give you a floating-point number of seconds. If you really do need milliseconds, you can simply multiply by that by 1000 to get a floating-point number of milliseconds. If you really do need an integer, you can round or truncate the floating-point value.
If you'd like to do it with integers from start to finish, use either UpTime or mach_absolute_time to get the current time in absolute units, then use AbsoluteToNanoseconds to convert that to a real-world unit. Obviously, you'll have to divide that by 1,000,000 to get milliseconds.
QA1398 suggests mach_absolute_time, but UpTime is easier, since it returns the same type AbsoluteToNanoseconds uses (no “pointer fun” as shown in the technote).
AbsoluteToNanoseconds returns an UnsignedWide, which is a structure. (This stuff dates back to before Mac machines could handle scalar 64-bit values.) Use the UnsignedWideToUInt64 function to convert it to a scalar. That just leaves the subtraction, which you'll do the normal way.

Resources