How does Precision and Scale work for Oracle's NUMBER data type? - oracle

If I have an ID field that is 7 characters long what do I set the scale and precision to? This is rather confusing. Or say a field of 10 characters for a phone number ... what is the S and P then?

For a series of digits such as a phone number or Social Security Number, or Zip Code, or ISBN, consider what operations you will be carrying out on it.
Will you be:
adding them together, taking the average, representing them with variable numbers of decimal places, multiplying them, etc?
Or will you be checking the length, splitting them out into a pattern, needing to preserve any leading zero, extracting a set of characters (first three, last four), comparing them to a regular expression, etc?
If the former, then they are numbers and you should store them as an integer. If the latter then they are strings constrained to be composed only of certain characters, and you should store them as a string with an appropriate check constraint.
I think at the moment you're asking something like "what temperature should my aquarium water be held at to keep cats in it"? The answer, probably, is not to do it at all, as you'll end up bitten.
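To make the distinction concrete, here is a small Python sketch (Python rather than SQL, purely for illustration; the ZIP-code pattern and the sample value are assumptions, not taken from the question). Converting a digit string to an integer drops the leading zero, while keeping it as a string preserves every character and allows pattern checks and slicing.

```python
import re

zip_code = "02134"  # hypothetical sample value

# Treating it as a number silently loses the leading zero.
as_number = int(zip_code)

# Treating it as a constrained string keeps every character; the regex
# plays the role a check constraint would play in the database.
is_valid = re.fullmatch(r"[0-9]{5}", zip_code) is not None
first_three = zip_code[:3]

print(as_number, is_valid, first_three)
```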

For an ID, since you shouldn't need decimals, a precision of 7 and a scale of 0 would work in your case. For a phone number, you would use a varchar2 or something comparable, in which case a scale and precision wouldn't apply.
You mentioned "characters" for the ID, but I'm assuming it's numerical.
Documentation regarding scale and precision in Oracle can be found here... http://docs.oracle.com/cd/B28359_01/server.111/b28318/datatype.htm#CNCPT1832
Specifically:
...you can also specify a precision (total number of digits) and scale (number of digits to the right of the decimal point)
Note though that precision and scale are optional and you may not even need to worry about them for something like an ID.
More examples can be found here... What is the difference between precision and scale?

Related

How NUMBER columns in Oracle deal with numbers exceeding their precision or scale

Assuming I have the following column in my DB:
value NUMBER(12,3)
What will happen if I try to store a decimal that exceeds the defined scale, such as 1234.56789?
Will I get an error that the scale is exceeded, or will the value be rounded to 3 decimals and stored?
Also, is it good practice to let the database do the rounding, or should that rather be done in the code?
It will be rounded. In your particular example it will be 1234.568.
For reference:
When you define a NUMBER variable, you can specify its precision (p)
and scale (s) so that it is sufficiently, but not unnecessarily,
large. Precision is the number of significant digits. Scale can be
positive or negative. Positive scale identifies the number of digits
to the right of the decimal point; negative scale identifies the
number of digits to the left of the decimal point that can be rounded
up or down.
The NUMBER data type is supported by Oracle Database standard
libraries and operates the same way as it does in SQL. It is used for
dimensions and surrogates when a text or INTEGER data type is not
appropriate. It is typically assigned to variables that are not used
for calculations (like forecasts and aggregations), and it is used for
variables that must match the rounding behavior of the database or
require a high degree of precision.
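The rounding described above can be approximated outside the database. The following is only a rough Python sketch, not Oracle's implementation: the function name store_as_number is made up for illustration, and it assumes ties are rounded away from zero and that a value whose integer part needs more than precision - scale digits is rejected (which Oracle reports as ORA-01438).

```python
from decimal import Decimal, ROUND_HALF_UP

def store_as_number(value, precision, scale):
    """Approximate what a NUMBER(precision, scale) column would keep."""
    quantum = Decimal(1).scaleb(-scale)          # e.g. 0.001 for scale 3
    rounded = Decimal(value).quantize(quantum, rounding=ROUND_HALF_UP)
    limit = Decimal(10) ** (precision - scale)   # first magnitude that no longer fits
    if abs(rounded) >= limit:
        raise ValueError("value larger than specified precision allowed")
    return rounded

print(store_as_number("1234.56789", 12, 3))
```

Excess decimals are rounded away silently, whereas too many digits before the decimal point raise an error, so only the second case is visible to the caller.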

Oracle NUMBER data type max limit with & without decimal points

I have a number field in the application which passes a numeric value to my Oracle procedure which is of NUMBER datatype. The numeric value can be positive, negative, with or without decimal points.
I need to restrict it from the application so I need to specify the maximum limit without any round-off. Can I please know what will be the limits for positive, negative, with, and without decimals points.
It is not really a range that is allowed, but a precision. You can pass any number with up to 38 digits, no matter whether or where the decimal separator appears.
Okay:
+12345678901234567890123456789012345678
-12345678901234567890123456789012345678
+1.2345678901234567890123456789012345678
-1234567890123456789012345678901234567.8
May get slightly mutilated:
+123456789012345678901234567890123456789
-123456789012345678901234567890123456789
+1.23456789012345678901234567890123456789
-1234567890123456789012345678901234567.89
Demo with some longer numbers: https://dbfiddle.uk/?rdbms=oracle_18&fiddle=aa74ef09157dcf86daa76507e868fe49
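The effect can be imitated in Python with the decimal module. This is only an analogy under the assumption of exactly 38 significant digits, as stated above; it does not reproduce Oracle's internal storage format.

```python
from decimal import Context, Decimal

# Assumption: 38 significant digits, as stated in the answer.
ctx = Context(prec=38)

fits = Decimal("12345678901234567890123456789012345678")       # 38 digits
too_long = Decimal("123456789012345678901234567890123456789")  # 39 digits

# Context.plus applies the context's precision to the operand.
kept = ctx.plus(fits)
mutilated = ctx.plus(too_long)

print(kept == fits)           # the 38-digit value survives unchanged
print(mutilated == too_long)  # the 39-digit value loses its last digit
```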

Advice on DB design Best Practices/Standard - Oracle

I'm designing the DB for a new app, which is something I've done a thousand times, but on this occasion I suddenly started wondering about some aspects that I had never stopped to consider before. Is there some standard or recommendation for the following things?
What's the recommended data type for storing currencies (no financial operations, just displaying)?
Recommended size for storing phone numbers (internationals)
Recommended minimum size for storing first names / last names (minimum meaning smallest maximum recommended size)
Recommended minimum size for storing comment blocks.(minimum meaning smallest maximum recommended size also)
I'm aware that every application has its own particular requirements to consider, but I feel that there must be something more specific than gut feeling and common sense.
Help, as always, will be deeply appreciated.
What's the recommended data type for storing currencies
This depends on what kind of currency, and to what degree of accuracy.
If it's dollars and cents, rounded to the nearest cent, it's NUMBER(12,2), which allows you to store amounts between -9,999,999,999.99 and 9,999,999,999.99 (12 digits in total, 2 of them after the decimal point), which for most currencies should be enough.
If you need to store intermediate results from, say, interest rate calculations, you may need more precision, e.g. NUMBER(15,5).
If you're talking Zimbabwean dollars, perhaps you should choose the maximum NUMBER instead :)
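The range of a NUMBER(p,s) column follows directly from its precision and scale: p - s digits before the decimal point and s digits after it. A short Python sketch of that arithmetic (the helper name number_range is made up for illustration):

```python
from decimal import Decimal

def number_range(precision, scale):
    """Smallest and largest values a NUMBER(precision, scale) column can hold."""
    step = Decimal(1).scaleb(-scale)                  # smallest increment, e.g. 0.01
    largest = Decimal(10) ** (precision - scale) - step
    return -largest, largest

low, high = number_range(12, 2)
print(f"{low:,} .. {high:,}")
```

Running the same helper for other candidates, such as (15, 2) or (15, 5), is a quick way to check whether a column definition covers the amounts you expect.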
Recommended size for storing phone numbers (internationals)
VARCHAR2(30) should be sufficient. If it's too long your users will enter all sorts of rubbish data in there.
Recommended minimum size for storing first names / last names /
Recommended minimum size for storing comment blocks
These don't apply since you're in Oracle - use VARCHAR2, so you don't have to worry about minimum size. All you need to specify is the maximum size.
Currencies:
NUMBER(15,2), really depends on how big the numbers are that you expect to run into.
Phone numbers:
VARCHAR2(30); please don't hurt me if it should be larger. I can't remember the length per se, just that VARCHAR2 allows flexibility for formatting.
I don't see the point of looking at the minimum size if using VARCHAR2. The concerns for the physical model revolve around how much space the database will consume over time, assuming fields are maxed out.
Comment blocks:
Maximum of VARCHAR2(4000)
EDIFACT generally uses 35 as the size of a Name field and I'd copy that (and document that as a basis). Newer stuff tends to be defined in XML and doesn't normally go into field length definitions.
Alternatively the Canadian post office recommends no more than 40 characters per address line.
Note, that is characters and not bytes. Sizing should take into account multi-byte characters, but obviously not all names will be the maximum length. I've used ten characters per name as a broad approximation for sizing estimates but that could vary a lot between countries, ethnicities etc.
I know you were asking minimum size for comment blocks, but for large free-text areas you ought to consider using a CLOB value. Oracle is pretty smart about how these things are handled, how the data is stored, etc. You NEVER have to worry about size. In addition, you can usually pretend that they are VARCHAR2 columns for easy manipulation.

YouTube URL algorithm?

How would you go about generating the unique video URL's that YouTube uses?
Example:
http://www.youtube.com/watch?v=CvUN8qg9lsk
YouTube uses Base64 encoding to generate IDs for each video. The characters used to generate IDs are
(A-Z) + (a-z) + (0-9) + (-) + (_) (64 characters).
Using Base64 encoding and only 11 characters, they can generate more than 73 quintillion unique IDs. How large a pool of IDs is that?
Well, it's enough for everyone on earth to produce a video every single minute for around 18,000 years.
They achieve such a huge number with only 11 characters (64*64*64*64*64*64*64*64*64*64*64 = 64^11); if they need more IDs, they just have to add one more character to their IDs.
So when a video is uploaded to YouTube, they basically select randomly from the 73+ quintillion possibilities and check whether the ID is already taken. If not, they use it; otherwise they look for another one.
Refer to this video for detailed explanation.
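The scheme described in this answer can be sketched in a few lines of Python. This is an illustration of the idea only, not YouTube's actual code: the names ALPHABET, ID_LENGTH and new_video_id are made up, and an in-memory set stands in for the database lookup.

```python
import secrets
import string

ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "-_"
ID_LENGTH = 11

def new_video_id(taken):
    """Draw random IDs until one is not already in `taken`, then reserve it."""
    while True:
        candidate = "".join(secrets.choice(ALPHABET) for _ in range(ID_LENGTH))
        if candidate not in taken:
            taken.add(candidate)
            return candidate

taken_ids = set()                    # stands in for a database uniqueness lookup
print(len(ALPHABET) ** ID_LENGTH)    # size of the ID space, 64^11
print(new_video_id(taken_ids))
```

In a real system the check-and-reserve step would need to be atomic, for example by relying on a unique constraint and retrying on conflict.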
Using some non-trivial hashing function. The probability of collision is very low, depending on the function, the parameters and the input domain. Keep in mind that cryptographic hashes were specifically designed to have very low collision rates for non-random input (i.e. completely different hashes for two close-but-unequal inputs).
This post by Jeff Atwood is a nice overview of the topic.
And here is an online hash calculator you can play with.
There is no need to use a hash. It is probably just a quasi-random 64 bit value passed through base64 or some equivalent.
By quasi-random, I mean it is just a one-to-one mapping with the counting integers, just shuffled.
For example, you could take a monotonically increasing database id and multiply it by some prime near 2^64 (reducing the product modulo 2^64 so that it still fits in 64 bits), then base64 the result. If you did not want people to be able to guess, you might choose a more complex mapping or just pick a random number that is not in the database yet.
Normal base64 would add an equals at the end, but in this case it is implied because the size is known. The character mapping could easily be something besides the standard.
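A minimal Python sketch of this one-to-one mapping follows. Instead of a prime it uses an arbitrary odd constant, since any odd multiplier is invertible modulo 2^64; the constant and the function names are assumptions chosen for illustration. Note that this only shuffles the IDs; anyone who knows the multiplier can reverse it.

```python
import base64

MASK = 2**64 - 1
MULTIPLIER = 0x9E3779B97F4A7C15          # arbitrary odd constant (assumption)
INVERSE = pow(MULTIPLIER, -1, 2**64)     # modular inverse, needs Python 3.8+

def encode_id(n):
    """Map a 64-bit counter to an 11-character URL-safe string."""
    shuffled = (n * MULTIPLIER) & MASK
    raw = shuffled.to_bytes(8, "big")
    # 8 bytes encode to 11 characters plus one '=', which is dropped.
    return base64.urlsafe_b64encode(raw).decode("ascii").rstrip("=")

def decode_id(text):
    """Invert encode_id by restoring the implied '=' and undoing the multiply."""
    raw = base64.urlsafe_b64decode(text + "=")
    return (int.from_bytes(raw, "big") * INVERSE) & MASK

for counter in (1, 2, 3):
    print(counter, encode_id(counter))
```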
Eli's link to Jeff's article is, in my opinion, irrelevant. URL shortening is not the same thing as presenting an ID to the world. Instead, a nicer way would be to convert your existing integer ID to a different radix.
An example in PHP:
$id = 9999;
//$url_id = base_convert($id, 10, 26+26+10); // PHP doesn't like this
$url_id = base_convert($id, 10, 26+10); // Works, but only digits + lowercase
Sadly, PHP only supports up to base 36 (digits + alphabet). Base 62 would support alphabet in both upper-case and lower-case.
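For comparison, base 62 is easy to write by hand. The sketch below is in Python rather than PHP; the digit order (digits, then lower-case, then upper-case) is an arbitrary choice, and the function names are made up for illustration.

```python
import string

ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase  # 62 symbols

def to_base62(n):
    """Encode a non-negative integer using the 62-symbol alphabet."""
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n > 0:
        n, rem = divmod(n, 62)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))

def from_base62(text):
    """Decode a string produced by to_base62."""
    n = 0
    for ch in text:
        n = n * 62 + ALPHABET.index(ch)
    return n

print(to_base62(9999))  # prints 2Bh
```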
People are talking about these other systems:
Random number/letters - Why? If you want people to not see the next video (id+1), then just make it private. On a website like youtube, where it actively shows any video it has, why bother with random ids?
Hashing an ID - This design concept really stinks. Think about it; so you have an ID guaranteed by your DBM software to be unique, and you hash it (introducing a collision factor)? Give me one reason why to even consider this idea.
Using the ID in URL - To be honest, I don't see any problems with this either, though it will grow to be large when in fact you can express the same number with fewer letters (hence my solution).
Using Base64 - Base64 expects bytes of data, literally anything from nulls to spaces. Why use this function when your data consists of a number (ie, a mix of 10 different characters, instead of 256)?
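You can use any library; some languages, like Python, provide this in the standard library.
Example:
import secrets
# token_urlsafe takes a number of random bytes, not a number of characters;
# 12 bytes encode to a 16-character URL-safe string.
n_bytes = 12
random_video_id = secrets.token_urlsafe(n_bytes)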
You could generate a GUID and have that as the ID for the video.
GUIDs are very unlikely to collide.
Your best bet is probably to simply generate random strings, and keep track (in a DB for example) of which strings you've already used so you don't duplicate. This is very easy to implement and it cannot fail if properly implemented (no duplicates, etc).
I don't think that the URL v parameter has anything to do with the content (video properties, title, description etc).
It's a randomly generated string of fixed length and contains a very specific set of characters. No duplicates are allowed.
I suggest using a perfect hash function:
Perfect Hash Function for Human Readable Order Codes
As the accepted answer indicates, take a number, then apply a sequence of "bijective" (or reversible) operations on the number to get a hashed number.
The input numbers should be in sequence: 0, 1, 2, 3, and so on.
Typically you're hiding a numeric identifier in the form of something that doesn't look numeric. One simple method is something like base-36 encoding the number. You should be able to pull that off with one or another variant of itoa() in the language of your choice.
Just pick random values until you have one never seen before.
Randomly picking and exhausting all values from a set runs in expected time O(n log n): What is O value for naive random selection from finite set?
In your case you wouldn't exhaust the set, so you should get constant time picks. Just use a fast data structure to do the duplication lookups.

Common strategies to deal with rounding errors in currency-intensive soft?

What is your advice on:
compensation of accumulated error in bulk math operations on collections of Money objects. How is this implemented in your production code for your locale?
theory behind rounding in accountancy.
any literature on topic.
I am currently reading Fowler. He mentions the Money type and its typical structure (int, long, BigDecimal), but says nothing on strategies.
Older posts on money rounding (here, and here) do not provide the detail and formality I need.
The thoughts I found on the internet point to "round half even" as the best way to balance error.
Thanks for help.
There are many rounding issues when recording financial data.
First issue is ability to store and retrieve exact decimal numbers
most databases offer decimal data type on which you can specify the number of digits before and after decimal point (currencies vary in number of decimal digits, too, I've dealt with currencies with 0, 2, 3 decimal digits)
when dealing with this data, if you want to avoid any unexpected rounding errors on the application side, you can use BCD as a generic approach, or use integers to represent any fixed decimal notation, or roll your own
If this first issue is sorted out, then no addition (or subtraction) can introduce any rounding errors. The same goes for multiplication by an integer.
The second issue, once you are able to store and retrieve data without loss of information, is the expected rounding error due to division (or multiplication by a non-integer).
For example, if your currency format allows 2 decimals and you want to store a transaction that splits a debit of 10 into 3 equal pieces, you can only store it like
10.00
-3.33
-3.33
-3.33
and
-0.01
(rounding error)
This is an expected problem that will occur regardless of the data type chosen for storage, and it needs to be taken care of if you want your accounts to balance. This situation is mainly introduced by division (or by multiplication by non-integers that have many significant digits).
One way to deal with this is to verify if your data balances after such operations and recognize the allowed rounding difference as opposed to an error situation.
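The 10.00 split above can be reproduced with Python's decimal module. This is a minimal sketch, assuming shares are truncated to the currency's decimal places and the leftover is booked as a separate rounding line; the function name is made up for illustration.

```python
from decimal import Decimal, ROUND_DOWN

def split_with_rounding_line(total, parts, places=2):
    """Split `total` into equal shares and return the leftover rounding difference."""
    quantum = Decimal(1).scaleb(-places)              # 0.01 for two decimals
    share = (total / parts).quantize(quantum, rounding=ROUND_DOWN)
    difference = total - share * parts
    return [share] * parts, difference

shares, difference = split_with_rounding_line(Decimal("10.00"), 3)
print(shares, difference)

# The entries balance only when the rounding line is included.
assert sum(shares) + difference == Decimal("10.00")
```

An alternative policy is to add the leftover pennies to some of the shares instead of recording them separately; either way, the balance check at the end is what distinguishes an allowed rounding difference from a genuine error.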
EDIT:
As for references to literature, this one seems interesting and not too long and concerns quite wide audience with interesting scenarios.
Use banker's rounding (round half to even): a value exactly halfway between two pennies is rounded to the nearest even penny.
http://www.xbeat.net/vbspeed/i_BankersRounding.htm
So 22.5 rounds to 22, but 23.5 rounds to 24, while 23.1 and 22.9 both round to 23 as usual.
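The examples above (22.5 to 22, 23.5 to 24) can be reproduced in Python, whose decimal module offers round-half-even as a rounding mode. The helper name bankers_round is made up for illustration.

```python
from decimal import Decimal, ROUND_HALF_EVEN

def bankers_round(value, places=0):
    """Round to `places` decimals, sending exact ties to the even neighbour."""
    quantum = Decimal(1).scaleb(-places)
    return Decimal(value).quantize(quantum, rounding=ROUND_HALF_EVEN)

for sample in ("22.5", "23.5", "23.1", "22.9"):
    print(sample, "->", bankers_round(sample))
```

Passing the values as strings matters here: a float such as 0.125 would be fine, but many decimal fractions are not exactly representable as floats and would no longer be exact ties.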
Never store money values in a double or float; use an int or long (counting the smallest unit, such as cents), as there is no way to store 0.1 exactly in binary floating point.
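A quick way to see the problem, sketched in Python (any language with IEEE 754 doubles behaves the same way):

```python
# Binary floating point cannot represent 0.1 exactly, so sums drift.
float_total = 0.1 + 0.2
print(float_total == 0.3)   # False

# The same amounts held as integer cents add up exactly.
cents_total = 10 + 20
print(cents_total == 30)    # True
```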
It all depends on the application. Hopefully there aren't too many situations where rounding is required. For example, transferring money from one account to another requires no rounding.
For situations where rounding is required, it doesn't really matter what you do as long as you pick a policy, communicate it, and stick to it. For instance, I believe the interest on my savings account rounds down to the nearest penny.
What you should do may well be informed by the conventions of the market or jurisdiction you are operating in. For example, pricing bonds in the Australian market requires that you round certain intermediate operations to 8 decimal places. The final price is quoted to a specific number of decimals (3 I think off the top of my head).
If you are dealing with an accounting app, I would expect the relevant accounting standards for your legal environment to possibly dictate this.
I've worked a bit (just a bit) with monetary amounts and I was extremely curious as to the strategy used in my company...
It turns out that we use double, but they've thought about it.
The thing is that the amounts we deal with are not that great (say less than 10k) and at most we need 3 digits after the decimal, for a total of 7 significant digits.
Since we are using 64-bit software (and C++), the double type offers enough significant digits for the number of operations we carry out on it :)
If you need more precision, there are algorithms to use (for example, when adding multiple amounts), but personally I think the heart of the issue comes more from:
conversion from one currency to another, where the rate keeps changing of course
printing issues, with some currencies requiring no decimals, others requiring 2 at most, etc.
Perhaps you could expand on the operations you're doing?

Resources