Hibernate Mapping for Oracle RAW column

We're using Hibernate and we're not sure how to map entity properties to RAW columns in an Oracle table (specifically columns that have indexes on them).
It's known that String can't be used as the property type: Hibernate can't prepend Oracle's HEXTORAW function call to make the index on the column usable (without it, Oracle implicitly applies RAWTOHEX to the column value itself, which defeats the index).
However, it's not clear whether using byte[] as the entity property type solves this issue. Since the JDBC driver sends the binary data directly, it is reasonable to assume the index would be used, because neither HEXTORAW nor RAWTOHEX needs to be executed.
However, I'm not sure how to prove it (short of loading millions of records and running benchmarks).
I tried searching for similar questions, but without success.
Does anyone have knowledge about this?
Thanks in advance,

Final answer: yes, mapping byte[] works.
I tested this on a table with millions of records and a primary key of RAW type.
Looking up a record by PK took ~2 minutes when using String.
With byte[] the record was found immediately.
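For illustration, here's a minimal sketch of such a mapping using JPA annotations (the entity, table name and RAW(16) length are just assumptions, not details from the original schema):

import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.Table;

// Illustrative entity: assumes a table MESSAGES with a RAW(16) primary key.
@Entity
@Table(name = "MESSAGES")
public class Message {

    // Mapping the RAW column as byte[] lets the driver bind the binary value
    // directly, so no HEXTORAW/RAWTOHEX conversion gets in the way of the index.
    @Id
    @Column(name = "MESSAGE_ID", columnDefinition = "RAW(16)")
    private byte[] id;

    @Column(name = "PAYLOAD")
    private String payload;

    // getters and setters omitted for brevity
}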

Related

Is there a way around Hibernate calling OracleStatement.getColumnIndex for each row and column?

I am puzzled by Hibernate’s behavior when loading ResultSets with many columns via HQL: it seems like OracleStatement.getColumnIndex(String) is called over and over again, not just once for every column at the beginning of the load but once for every column when retrieving each and every row of data.
In case of a ResultSet with many columns this seems to take a significant amount of time (in our case about 43 % of the total time, see attached screenshot of a VisualVM profile). Our HQL loads a table with many columns, join fetching two other tables with to-many relations (both having lots of columns, too). Unfortunately we cannot restrict the columns to be loaded, because the task is to preload all objects into a Coherence cache on startup of the system, so the objects have to be complete.
As far as I can tell the problem arises because hydrating the mapped result objects of an HQL query from the ResultSet uses nullSafeGet() for each column, which takes String arguments to identify the column and therefore has to call getColumnIndex().
(When loading the data from a ResultSet of an SQL query one can use getString(int), getTimestamp(int) etc. instead of String based versions to avoid this issue.)
We are still using an old version of Hibernate (3.6), but the source on GitHub indicates that the same behavior is still present, as nullSafeGet() is still String-based instead of taking an index (or an object containing the index), which could be precomputed once at the beginning of the load.
Is there something that I am missing?
Is there a reason for calling getColumnIndex() for each column of each row of data over and over again?
Is there a way around this which does not involve rewriting the query into SQL and using the index based accessors to build up the mapped objects manually?
The only similar issue I was able to find on the internet was this question which has no answer.
The query there had many columns, too.
Thanks for any help!
Thorsten
This problem is addressed in Hibernate 6, which switches from reading the JDBC ResultSet by name to reading it by position.
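For reference, the index-based access mentioned in the question looks roughly like this in plain JDBC (query and column names are illustrative); it is what a hand-rolled SQL load would do instead of the name-based lookups:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.Timestamp;

public class PositionalReadSketch {

    // Positional accessors avoid the per-row, per-column name lookup that
    // getString("NAME") and friends trigger on the Oracle driver.
    static void readAll(Connection con) throws Exception {
        String sql = "SELECT id, name, created_at FROM some_table"; // illustrative query
        try (PreparedStatement ps = con.prepareStatement(sql);
             ResultSet rs = ps.executeQuery()) {
            while (rs.next()) {
                long id = rs.getLong(1);                // position 1 = id
                String name = rs.getString(2);          // position 2 = name
                Timestamp created = rs.getTimestamp(3); // position 3 = created_at
                // ... assemble the mapped object by hand ...
            }
        }
    }
}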

Check all table columns for a value

OK, tricky question: I am trying to figure out where a database schema is storing a particular pointer. I know the pointer value is 123123123; I just don't know what table or column it is in. How do I check all table columns to see if any of them contain that value?
Thanks.
In H2 you can use full-text search, but then you would need to add all tables to the search scope and index them.
If you only need to index primary keys that might be better, but you still need to come up with individual FT_CREATE_INDEX() calls for each table. You can automate this with most scripting languages or with an ETL tool (like Scriptella).
If you have enough disk space, you could dump the database to a SQL file and search it with a viewer for big files, like glogg.
The advantage of the first solution is that it needs no external tools, but you have to work out a specific indexing script for every existing or new table. The second solution is a one-time fix.
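If the database happens to be Oracle and external tools are not an option, the scan can also be brute-forced from JDBC. A rough sketch (connection details and the searched value are placeholders, and this will be slow on a large schema):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.Statement;

// Brute-force scan: for every NUMBER/VARCHAR2/CHAR column in the current schema,
// count the rows that match the searched value and print any hits.
public class FindValueEverywhere {
    public static void main(String[] args) throws Exception {
        String searched = "123123123";
        try (Connection con = DriverManager.getConnection(
                "jdbc:oracle:thin:@//dbhost:1521/ORCL", "scott", "tiger")) {
            String colsSql =
                "SELECT table_name, column_name FROM user_tab_columns " +
                "WHERE data_type IN ('NUMBER', 'VARCHAR2', 'CHAR')";
            try (Statement st = con.createStatement();
                 ResultSet cols = st.executeQuery(colsSql)) {
                while (cols.next()) {
                    String table = cols.getString(1);
                    String column = cols.getString(2);
                    String probe = "SELECT COUNT(*) FROM \"" + table + "\" " +
                                   "WHERE TO_CHAR(\"" + column + "\") = ?";
                    try (PreparedStatement ps = con.prepareStatement(probe)) {
                        ps.setString(1, searched);
                        try (ResultSet rs = ps.executeQuery()) {
                            rs.next();
                            if (rs.getLong(1) > 0) {
                                System.out.println(table + "." + column);
                            }
                        }
                    }
                }
            }
        }
    }
}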
I use SQL Search from RedGate. It's free and it helps you find any text anywhere in the database.
https://www.red-gate.com/products/

Default values in target tables

I have some mappings where business entities are populated after transformation logic. The row volumes are on the higher side, and quite a few business attributes are defaulted to certain static values.
Therefore, in order to reduce the data pushed from the mapping, I created DEFAULT clauses on the target table and stopped feeding those columns from the mapping itself. This works just fine when I run the session in "Normal" mode: the target table rows end up with some columns fed by the mapping and the rest taking values from the DEFAULT clauses in the table DDL.
However, since we are dealing with the higher end of volumes, I want to run my session in bulk mode (there are no pre-existing indexes on the target tables).
As soon as I switch the session to bulk mode, the default values stop being applied, and I get NULL in the target columns instead of the defined DEFAULT values.
I wonder:
Is this expected behavior?
If not, am I missing some configuration somewhere?
Should I be raising a ticket with Oracle, or with Informatica?
My configuration: Informatica 9.5.1 (64-bit) with Oracle 11g R2 (11.2.0.3), running on Solaris (SunOS 5.10).
Looking forward to any help here...
Could be expected behavior.
It seems that bulk mode in Informatica uses the "Direct Path" API in Oracle (see for example https://community.informatica.com/thread/23522).
From this document (http://docs.oracle.com/cd/B10500_01/server.920/a96652/ch09.htm, search for "Field Defaults on the Direct Path") I gather that:
Default column specifications defined in the database are not available when you use direct path loading. Fields for which default values are desired must be specified with the DEFAULTIF clause. If a DEFAULTIF clause is not specified and the field is NULL, then a null value is inserted into the database.
This could be the reason for the behaviour you are seeing.
I don't believe that you'll see a great benefit from not including the defaults, particularly in comparison to the benefits of a direct path load. If the data is going to be read-only, consider compression as well.
You should also note that SQL*Net compresses repeated values in the same column, so even with conventional-path inserts the network overhead is not as high as you might think.
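For comparison, here is a minimal sketch of the conventional-path behaviour the mapping relies on, with a purely illustrative table: columns omitted from the INSERT pick up the table DEFAULT, which is exactly what a direct path load skips.

import java.math.BigDecimal;
import java.sql.Connection;
import java.sql.PreparedStatement;

public class ConventionalPathDefaults {

    // Illustrative table:
    //   CREATE TABLE orders (id     NUMBER PRIMARY KEY,
    //                        amount NUMBER,
    //                        status VARCHAR2(10) DEFAULT 'NEW');
    static void insertWithDefault(Connection con) throws Exception {
        // Conventional-path insert: STATUS is omitted, so Oracle applies the
        // DEFAULT and the row ends up with status = 'NEW'.
        try (PreparedStatement ps = con.prepareStatement(
                "INSERT INTO orders (id, amount) VALUES (?, ?)")) {
            ps.setLong(1, 1L);
            ps.setBigDecimal(2, new BigDecimal("42.50"));
            ps.executeUpdate();
        }
        // A direct path load of the same data (which is what bulk mode appears
        // to use) bypasses this, leaving STATUS NULL unless supplied explicitly.
    }
}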

How does SCN_TO_TIMESTAMP work?

Does the SCN itself encode a timestamp, or is it a lookup from some table?
An AskTom post explains that the timestamp (to within +/- 3 seconds) is stored in a RAW field in smon_scn_time. Is that where the function gets the value?
If so, is that table ever purged, and if it is, what triggers the purge?
If it is purged, does that make it impossible to translate old SCNs to timestamps?
If it's impossible, that rules out any long-term uses of SCN-to-timestamp mapping (read: auditing).
If I put that function in a query, would joining to that table be faster?
If so, does anyone know how to convert that RAW column?
The SCN does not encode a time value. I believe it is an autoincrementing number.
I would guess that SMON is inserting a row into SMON_SCN_TIME (or whatever table underlies it) every time it increments the SCN, including the current timestamp.
I queried for the minimum recorded timestamp in several databases and they all go back about 5 days and have a little under 1500 rows in the table. So it is less than the instance lifetime.
I imagine the lower bound on how long the data is kept might be determined by the DB_FLASHBACK_RETENTION_TARGET parameter, which defaults to 1 day.
I would recommend using the function; it has probably been provided so that Oracle can change the internals at will.
No idea what the RAW column TIM_SCN_MAP contains, but the TIME_DP and SCN columns would appear to give you the mapping.
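If you want to poke at this yourself, here is a small JDBC sketch (it assumes SELECT privileges on V$DATABASE and SYS.SMON_SCN_TIME) that converts the current SCN to a timestamp and checks how far back the mapping goes:

import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.Statement;

public class ScnMappingProbe {

    static void inspect(Connection con) throws Exception {
        try (Statement st = con.createStatement()) {
            // Convert the database's current SCN back to a timestamp.
            try (ResultSet rs = st.executeQuery(
                    "SELECT current_scn, SCN_TO_TIMESTAMP(current_scn) FROM v$database")) {
                if (rs.next()) {
                    System.out.println("SCN " + rs.getLong(1) + " ~ " + rs.getTimestamp(2));
                }
            }
            // See how far back the SCN/time mapping actually goes on this instance.
            try (ResultSet rs = st.executeQuery(
                    "SELECT MIN(time_dp), COUNT(*) FROM sys.smon_scn_time")) {
                if (rs.next()) {
                    System.out.println("oldest mapping: " + rs.getTimestamp(1)
                            + " over " + rs.getLong(2) + " rows");
                }
            }
        }
    }
}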

Serializing objects as BLOBs in Oracle

I have a HashMap that I am serializing and deserializing to an Oracle db, in a BLOB data type field.
I want to perform a query using this field.
For example, the application will create a new HashMap with some key-value pairs.
I want to query the db to see if a HashMap with this data already exists in the db.
I do not know how to do this; it seems strange if I have to go to every record in the db, deserialize it, and then compare. Does SQL handle comparing BLOBs, so I could have select * from PROCESSES where foo = ?, where foo is a BLOB type and the ? is an instance of the new HashMap?
Thanks
Here's an article for you to read: Pounding a Nail: Old Shoe or Glass Bottle
I haven't heard much about your application's underlying architecture, but I can tell you immediately that there is never a reason why you should need to use a HashMap in this way. It's a bad technique, plain and simple.
The answer to your question is not a clever Oracle query, it's a redesign of your application's architecture.
For a start, you should not serialize a HashMap to a database (more generally, you shouldn't serialize anything that you need to query against). It's much easier to create a table to represent hashmaps in your application as follows:
HashMaps
--------
MapID (pk int)
Key (pk varchar)
Value
Once you have the content of your hashmaps in your database, it's trivial to query the database to see if the data already exists or to produce any other kind of aggregate data:
SELECT Count(*) FROM HashMaps where MapID = ? AND Key = ?
Storing serialized objects in a database is almost always a bad idea, unless you know ahead of time that you don't need to query against them.
How are you serializing the HashMap? There are lots of ways to serialize data and an object like a HashMap. Comparing two maps, especially in serialized form, is not trivial, unless your serialization technique guarantees that two equivalent maps always serialize the same way.
One way you can get around this mess is to use XML serialization for some objects that rarely need to be queried. For example, where I work we have a log table where a certain log message is stored as an XML file in a CLOB field. This xml data represents a serialized Java object. Normally we query against other columns in the record, and only read/write the blob in single atomic steps. However once or twice it was necessary to do some deep inspection of the blob, and using XML allowed this to happen (Oracle supports querying XML in varchar2 or CLOB fields as well as native XML objects). It's a useful technique if used sparingly.
Look into dbms_crypto.hash to make a hash of your blob. Store the hash alongside the blob and it will give you something to narrow down the search to something manageable. I'm not recommending storing the hash map, but this is a general technique for searching for an exact match between blobs.
See also SQL - How do you compare a CLOB
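Here is a sketch of that hashing idea from the Java side; the FOO_HASH column is hypothetical, the map type is simplified to Map<String, String>, and the TreeMap copy is only a best-effort way to get a stable serialized form, not a guarantee that equal maps hash identically:

import java.io.ByteArrayOutputStream;
import java.io.ObjectOutputStream;
import java.security.MessageDigest;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.util.Map;
import java.util.TreeMap;

public class BlobHashLookup {

    // Serialize the map and hash the bytes. Copying into a TreeMap first gives
    // a stable key order, which makes equal maps more likely to serialize the
    // same way; whether that holds depends on your keys and values.
    static byte[] serialize(Map<String, String> map) throws Exception {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(new TreeMap<>(map));
        }
        return bos.toByteArray();
    }

    static byte[] sha256(byte[] data) throws Exception {
        return MessageDigest.getInstance("SHA-256").digest(data);
    }

    // Assumes a hypothetical RAW(32) column FOO_HASH stored next to the BLOB column foo.
    static boolean exists(Connection con, Map<String, String> map) throws Exception {
        byte[] hash = sha256(serialize(map));
        try (PreparedStatement ps = con.prepareStatement(
                "SELECT COUNT(*) FROM processes WHERE foo_hash = ?")) {
            ps.setBytes(1, hash);
            try (ResultSet rs = ps.executeQuery()) {
                rs.next();
                return rs.getLong(1) > 0;
            }
        }
    }
}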
I cannot disagree, but I'm being told to do so.
I appreciate your solution, and that's sort of what I had previously.
Thanks
I haven't had the need to compare BLOBs, but it appears that it's supported through the dbms_lob package.
See dbms_lob.compare() at http://www.psoug.org/reference/dbms_lob.html
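A hedged sketch of what that comparison could look like from JDBC, reusing the PROCESSES table and foo column from the question; it only matches when the candidate map serializes to exactly the same bytes as the stored one:

import java.io.ByteArrayInputStream;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class BlobCompareExample {

    // DBMS_LOB.COMPARE returns 0 when the two LOBs are byte-for-byte identical.
    static boolean existsExact(Connection con, byte[] serializedMap) throws Exception {
        String sql = "SELECT COUNT(*) FROM processes WHERE dbms_lob.compare(foo, ?) = 0";
        try (PreparedStatement ps = con.prepareStatement(sql)) {
            // Bind the serialized map as a BLOB parameter.
            ps.setBlob(1, new ByteArrayInputStream(serializedMap));
            try (ResultSet rs = ps.executeQuery()) {
                rs.next();
                return rs.getLong(1) > 0;
            }
        }
    }
}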
Oracle lets you define new data types with Java (or .NET on Windows); you could define a data type for your serialized object and define how queries work on it.
Good luck if you try this...
If you serialize your data to XML and store it in an XML column, you can then use XPath expressions within your SQL query. (Sorry, I am more of a SQL Server person, so I don't know the details of how to do this in Oracle.)
If you EVER need to update only part of the serialized data, don't do this.
Likewise, if any of the data is pointed to by other data or points to other data, don't do this.
