Mapping NUMBER(15) in Hibernate - Oracle

So we have a column of type NUMBER(15) in our data schema. How do we map it in Hibernate without changing the schema? The int data type is too small for it, and long seems too big (ORA-01438 is raised), even when persisting numbers within the DB column's bounds.
Assume that the schema is unmodifiable and that we do not want to use BigInteger. Say we know that the number won't be bigger than 372036854775807, so it fits into both long and NUMBER(15).
Oracle DB.

So actually this was my fault, due to a misreading of the Oracle documentation regarding NUMBER scale and precision. I thought that precision was the part before the '.' and scale the part after it, but no: precision is the total number of digits. That was causing my problem, and on a different column.
To answer my own question: ORA-01438 won't be thrown for numbers that fit into the type used in the DB layer. Therefore, if I know that the values used on the app layer cannot exceed what NUMBER(15) can hold, I can safely use long in Java and NUMBER(15) in the DB. As long as this constraint holds, everything is fine. Adhering to that constraint is what matters in the first place; BigInteger won't help without it, since a BigInteger can be bigger than NUMBER(37).
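For illustration, a minimal JPA/Hibernate sketch of that mapping; the entity, table, and column names here are hypothetical:

import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.Table;

@Entity
@Table(name = "ACCOUNT")
public class Account {

    @Id
    @Column(name = "ACCOUNT_ID")
    private Long id;

    // precision/scale only matter for schema generation; at runtime Hibernate just
    // binds the long value, and Oracle accepts it as long as the value itself fits
    // into 15 digits (otherwise the database raises ORA-01438).
    @Column(name = "EXTERNAL_REF", precision = 15, scale = 0)
    private long externalRef;

    public long getExternalRef() { return externalRef; }

    public void setExternalRef(long value) { this.externalRef = value; }
}

With a mapping like this, any value that exceeds 15 digits is simply rejected by Oracle at insert time, which is exactly the constraint described above.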

Related

SQLite DB Size Column Data Type Considerations

I'm working with an SQLite DB where all columns are of NVARCHAR data type.
Coming from an MS SQL background, I know that NVARCHAR has additional baggage associated with it, so my first impulse is to refactor most column types to have concrete string lengths enforced (most are under 50 chars long).
But at the same time I know that SQLite treats things a bit differently.
So my question is: should I change/refactor the column types? And by doing so, is there anything to gain in terms of disk space or performance in SQLite?
The DB runs on Android/iOS devices.
Thanks!
You should read https://www.sqlite.org/datatype3.html.
CHARACTER(20), VARCHAR(255), VARYING CHARACTER(255), NCHAR(55), NATIVE CHARACTER(70), NVARCHAR(100), TEXT and CLOB are all treated as TEXT.
Also, SQLite does not enforce declared lengths.
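A small sketch of that non-enforcement, assuming the xerial sqlite-jdbc driver is on the classpath (the table and column names are made up):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class SqliteLengthSketch {
    public static void main(String[] args) throws Exception {
        // In-memory database for the demonstration.
        try (Connection con = DriverManager.getConnection("jdbc:sqlite::memory:");
             Statement st = con.createStatement()) {

            st.execute("CREATE TABLE t (name NVARCHAR(10))");
            st.execute("INSERT INTO t (name) VALUES " +
                       "('a value that is clearly longer than ten characters')");

            try (ResultSet rs = st.executeQuery("SELECT name, typeof(name) FROM t")) {
                rs.next();
                System.out.println(rs.getString(1)); // full string, not truncated to 10
                System.out.println(rs.getString(2)); // "text" -- NVARCHAR(10) has TEXT affinity
            }
        }
    }
}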
So instead of digging through cryptic documentation, I did a bit of experimenting with column types.
The database I'm working with has well over 1 million records, with most columns as NVARCHAR, so any change in column data types was easily seen in the file size deltas.
Here are the results I found in the effort to reduce the DB size:
NVARCHAR:
The biggest savings came from switching column types, where possible, from NVARCHAR to plain INT or FLOAT. On an 80 MB DB file the savings were in megabytes, very noticeable. With some additional refactoring I dropped the size down to 47 MB.
NVARCHAR vs. VARCHAR:
Made very little difference, perhaps a few KB on a DB of 80 MB.
NVARCHAR vs. OTHER String Types:
Switching between the various string-based types made almost no difference; as the documentation points out, all string types are stored the same way in SQLite, as TEXT.
INT vs. OTHER Numeric Types:
No difference here; SQLite stores them all with numeric affinity in the end.
Indexes based on NVARCHAR columns also took up more space; once re-indexed on INT columns, I shed a few more MB.

Why no primary key

I have inherited a database with tables that lack primary keys. It's an OLTP database. One of the tables in question has ~300k records and has no primary key implemented, even though examining the rest of the schema tells me one column is used as a primary key, i.e. it is replicated in another table with an identical name, etc. In other words, this is not an 'end of line' table.
This database also does not implement FKs.
My question is - is there ANY valid reason for a table (in Oracle for that matter) NOT to have a primary key?
I think a PK is mandatory in almost all cases. There are lots of reasons, but I'll cover some of them:
it prevents the insertion of duplicate rows
rows will be referenced elsewhere, so the table must have a key they can point to
I have seen very few cases where tables are built without a PK (e.g. tables for logs).
Not specific to Oracle, but I recall reading about one such use case where MySQL was highly customized for a dam (electricity generation) project, I think. The input data from sensors arrived on the order of 100-1000 records per second. They were using timestamps for each record, so they didn't need a primary key (like the logs/logging case mentioned in another answer here).
So good reasons would be:
Overhead, in the case of high-frequency transactions
Necessity, or the lack of it, in that particular case
"Uniqueness" maintained or inferred by the application, not by the DB
In a normalized table where every record needs to be unique and every field is referenced from other tables, a PK additionally adds index overhead, and it might never actually be used in any SQL query (IMHO I disagree with this reasoning, but it's possible). The table should still have a unique index encompassing all the fields, though.
Bad reasons are infinite :-)
The most frequent bad reason, and the one actually responsible for most missing primary keys, is a DB designed by application/code developers with little or no DB experience who want to (or think they should) handle all data constraints in the application.
Any valid reason? I'd say "No"--I'm a database guy--but there are places that insist on using the database as a dumb data store. They usually implement all integrity "constraints" in application code.
Putting integrity constraints into application code isn't usually done to improve performance. In fact, if you built one database that enforces all the known constraints, and you built another with functionally identical constraints only in application code, the first one would almost certainly run rings around the second one.
Instead, application-level constraints usually hope to increase flexibility. (And, in the process, some of the known constraints are usually dropped, which appears to improve performance.) If it becomes inconvenient to enforce certain constraints in order to bulk load some scruffy data, an application programmer can just side-step the application-level constraints for a little while, then clean up the data when it's more convenient.
I'm not a DB expert, but I remember a conversation with a friend who worked in the Oracle apps department who told me that this was done to handle emergencies. If there was a problem in some report being generated that you could fix by putting in a row, DB-level constraints often stand in your way. They generally implemented things like unique primary keys in the application rather than the database. It was inefficient, but it was enough for them and much more manageable in a disaster recovery scenario.
You need a primary key to enforce uniqueness for a subset of its columns (useful if you need to refer to individual rows). It also speeds up certain queries because of the index associated with it.
If you do not need that index, or that uniqueness constraint, then you may not need a primary key (the index does not come free).
An example that comes to mind is logging tables, which just record some data (and are never updated or queried for individual records).
There is a small overhead when inserting into a table with an index, and you need an index if you have a primary key. The downside of skipping the index, of course, is that finding a row becomes very costly.

Do I need to set a length for every POCO property in Entity Framework Code First?

Do I need to set a length for every POCO property in Entity Framework Code First? If I don't set StringLength or MaxLength/MinLength for a property, it will be nvarchar(max). How bad is nvarchar(max)? Should I just leave it alone during the development stage and improve it before production?
You should define a max length for each property where you want to restrict the length. Note that the nvarchar(max) data type is different from the nvarchar(n) data type, where n is a number from 1-4000. The max version, which you get when you define no max length, is meant for large blocks of text, like paragraphs and the like. It can handle extremely large lengths, and so the data is stored separately from the rest of the fields of the record. nvarchar(n), on the other hand, is stored inline with the rest of the row.
It's probably best to go ahead and set those values as you want now, rather than waiting to do so later. Choose values that are as large as you will ever need, so you never have to increase them. nvarchar(n) stores its data efficiently; for example, an nvarchar(200) does not necessarily take up 200 characters of space; it only uses enough space to store what is actually put into it, plus a couple of extra bytes for saving its length.
So whenever possible, you should set a limit on your entity's text fields.
NVARCHAR is a variable-length field, so it consumes only the space you need for it. NCHAR, on the other hand, allocates all the space it requires up front, not on demand as NVARCHAR does.
MSDN advises using nvarchar when the sizes of the column data entries are likely to vary considerably.
For me, it's the way to go in the early stages of a project. You can tune it when needed.
According to the following blog post, nvarchar(max) does not behave like ntext until the actual value reaches about 4000 characters (because the row size limit is 8K, and wide chars use two bytes per char). Once it hits that size, it behaves pretty much the same as ntext. So, for me, I don't see any good reason to avoid using the nvarchar(max) data type.

Best-performing method for associating arbitrary key/value pairs with a table row in a Postgres DB?

I have an otherwise perfectly relational data schema in place for my Postgres 8.4 DB, but I need the ability to associate arbitrary key/value pairs with several of my tables, with the assigned keys varying by row. Key/value pairs are user-generated, so I have no way of predicting them ahead of time or wrangling orderly schema changes.
I have the following requirements:
Key/value pairs will be read often, written occasionally. Reads must be reasonably fast.
No (present) need to query off of the keys or values. (But it might come in handy some day.)
I see the following possible solutions:
The Entity-Attribute-Value pattern/antipattern. Annoying, but the annoyance would be generally offset by my ORM.
Storing key/value pairs as serialized JSON data on a text column. A simple solution, and again the ORM comes in handy, but I can kiss my future self's need for queries good-bye.
Storing key/value pairs in some other NoSQL db--probably a key/value or document store. ORM is no help here. I'll have to manage the separate queries (and looming data integrity issues?) myself.
I'm concerned about query performance, as I hope to have a lot of these some day. I'm also concerned about programmer performance, as I have to build, maintain, and use the darned thing. Is there an obvious best approach here? Or something I've missed?
That's precisely what the hstore datatype is for in PostgreSQL.
http://www.postgresql.org/docs/current/static/hstore.html
It's really fast (you can index it) and quite easy to handle. The only drawback is that you can only store character data, but you'd have that problem with the other solutions as well.
Indexes support the "exists" operator, so you can query quite quickly for rows where a certain key is present, or for rows where a specific attribute has a specific value.
And with 9.0 it got even better because some size restrictions were lifted.
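As a rough sketch of using hstore from Java through plain JDBC (the connection details, table, and column names are made up, and it assumes the hstore module/extension is already installed in the target database):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.Statement;

public class HstoreSketch {
    public static void main(String[] args) throws Exception {
        try (Connection con = DriverManager.getConnection(
                "jdbc:postgresql://localhost:5432/mydb", "user", "secret")) {

            try (Statement st = con.createStatement()) {
                st.execute("CREATE TABLE product (id serial PRIMARY KEY, attrs hstore)");
                // A GIN index supports the hstore containment/key-existence operators.
                st.execute("CREATE INDEX product_attrs_idx ON product USING gin (attrs)");
            }

            // Store arbitrary, user-generated key/value pairs in one column.
            try (PreparedStatement ps = con.prepareStatement(
                    "INSERT INTO product (attrs) VALUES (?::hstore)")) {
                ps.setString(1, "color => red, size => XL");
                ps.executeUpdate();
            }

            // Find rows where a specific attribute has a specific value (@> can use
            // the GIN index), and read a single value back with the -> operator.
            try (PreparedStatement ps = con.prepareStatement(
                    "SELECT attrs -> 'color' FROM product WHERE attrs @> ?::hstore")) {
                ps.setString(1, "size => XL");
                try (ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) {
                        System.out.println(rs.getString(1)); // prints: red
                    }
                }
            }
        }
    }
}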
hstore is generally a good solution for this, but personally I prefer to use plain key:value tables: one table with the definitions, another table with the values and a relation binding each value to its definition, plus a relation binding the value to the particular record in the other table.
Why am I against hstore? Because it's like the registry pattern, often mentioned as an example of an anti-pattern. You can put anything in there, it's hard to validate whether a key is still needed, and when loading a whole row (especially through an ORM) the whole hstore is loaded, which can contain a lot of junk of very little use. Not to mention that the hstore data type has to be converted into your language's types and converted back again when saving, so you get some type-conversion overhead.
So actually I'm trying to convert all the hstores at the company I'm working for into simple key:value tables. It's not that hard a task, though. The structures kept in hstore here are huge (or at least big), and reading/writing such an object creates a huge overhead of function calls, so even a simple task like "select * from base_product where id = 1;" makes the server sweat and hits performance badly. I want to point out that the performance issue is not because of the DB, but because Python has to convert the results received from Postgres several times, while key:value tables don't require such conversion.
As you do not control the data, do not try to overcomplicate this.
create table sometable_attributes (
  sometable_id int not null references sometable(sometable_id),
  attribute_key varchar(50) not null check (length(attribute_key) > 0),
  attribute_value varchar(5000) not null,
  primary key (sometable_id, attribute_key)
);
This is like EAV, but without the attribute_keys table, which adds no value if you do not control what will be stored there.
For speed, you should periodically run "cluster sometable_attributes using sometable_attributes_idx", so all the attributes for one row are physically close together.

Inserting BigDecimal => Varchar2 column vs. BigDecimal => Number column

I was doing some tests where I inserted some Java BigDecimal records into a VARCHAR2 column in Oracle.
What I actually wanted to do was insert the Java BigDecimal into a NUMBER column in Oracle.
I am wondering how the two work differently and what interim conversion steps Oracle takes in these scenarios:
BigDecimal => VARCHAR2 column
BigDecimal => NUMBER column
Can I still use the findings from my previous tests? I am mostly looking at latency, throughput, etc.
Remember the golden rule: you should never, ever, under any circumstances store numbers in varchar columns.
Storing numbers in character columns will give you a lot of trouble in the long run.
Always store numbers as numbers.
To store the numbers, use a PreparedStatement and its setBigDecimal() method to send the number to the database. This takes care of any conversion and guarantees that the correct value is stored in the database, and you don't have to worry about, e.g., different decimal separators in different locales when sending a number as a string to the database.
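A minimal sketch of that approach with plain JDBC; the connection details, table, and column names are placeholders:

import java.math.BigDecimal;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class InsertBigDecimalSketch {
    public static void main(String[] args) throws Exception {
        try (Connection con = DriverManager.getConnection(
                "jdbc:oracle:thin:@//localhost:1521/XEPDB1", "scott", "tiger");
             PreparedStatement ps = con.prepareStatement(
                     "INSERT INTO measurements (amount) VALUES (?)")) {

            // The JDBC driver converts the BigDecimal directly to Oracle's NUMBER
            // representation; no locale-dependent string formatting is involved.
            ps.setBigDecimal(1, new BigDecimal("12345.678901"));
            ps.executeUpdate();
        }
    }
}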
I did not find any measurable performance difference. This was just a test of a prototype, so I can use the results.
