We have a use case to store large json strings (about 10 kb +) in Oracle Db. What column data type is the most ideally suited for this? Clob or blob?
For Oracle 12.1 and higher, as Mathguy mentioned, you should follow Oracle's advice and use BLOBs to store JSON data. Recent versions of Oracle have added many SQL/JSON features that seamlessly deal with JSON regardless of the data type, and BLOBs will avoid some character set issues.
For Oracle 11.2 and lower, you should use CLOBs to store JSON data. Since you don't have access to native JSON functionality, you will probably need to rely on regular string processing. And dealing with character data in CLOBs is much easier than dealing with character data in BLOBs. (However, if you use a library like PL/JSON, then BLOBs might still work OK.)
Related
I want to know what Oracle's CLOB has to offer over the BLOB data type.
Both have data storage limits of (4 GB - 1) * DB_BLOCK_SIZE - about 32 TB with an 8 KB block size.
A text string longer than 4000 bytes cannot fit in a VARCHAR2 column. Now, I can use either a CLOB or a BLOB to store this string.
Everyone says that CLOB is meant for character data and BLOB is for binary data such as images or unstructured documents.
But I see I can store character data inside a BLOB as well.
What I want to know:
So, the question is about the basics: why CLOB, and why not BLOB always? Does it have anything to do with encoding?
Maybe the question title should be: how does a CLOB handle character data differently than a BLOB?
I want to know how a BLOB treats character data.
It doesn't treat it as character data; it only sees it as a stream of bytes - it doesn't know or care what the bytes represent.
From the documentation:
The BLOB data type stores unstructured binary large objects. BLOB objects can be thought of as bitstreams with no character set semantics.
Does a CLOB store the encoding information along with it and use it while retrieving the data?
Not explicitly, but the data is stored in the database character set, as with VARCHAR2 data. From the documentation again:
The CLOB data type stores single-byte and multibyte character data. Both fixed-width and variable-width character sets are supported, and both use the database character set.
You might also have noticed that the dbms_lob package has procedures to convert between CLOB and BLOB data types. For both of those you have to specify the character set to use. So if you choose to store character data as a BLOB you have to know the character set when converting it to a BLOB, but perhaps more crucially you have to know the character set to be able convert it back. You can do it, but it doesn't mean you should. You have no way to validate the BLOB data until you come to try to convert it to a string.
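The "you have to know the character set to convert it back" problem can be sketched outside the database entirely. Here is a minimal standalone Java example (not Oracle-specific; the sample text and charset choices are purely for illustration) of the same round-trip hazard:

```java
import java.nio.charset.StandardCharsets;

public class CharsetDemo {
    public static void main(String[] args) {
        // A CLOB-style value: the database knows this is character data.
        String original = "Grüße, 世界";

        // Storing it BLOB-style means reducing it to raw bytes -- and the
        // encoding used here must be remembered somewhere outside the column.
        byte[] asBlob = original.getBytes(StandardCharsets.UTF_8);

        // Decoding with the right charset recovers the text...
        String roundTrip = new String(asBlob, StandardCharsets.UTF_8);
        System.out.println(roundTrip.equals(original)); // true

        // ...but decoding with the wrong one silently corrupts it,
        // with no error to warn you.
        String garbled = new String(asBlob, StandardCharsets.ISO_8859_1);
        System.out.println(garbled.equals(original)); // false
    }
}
```

A CLOB spares you this bookkeeping because the database character set travels with the data.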
As @APC alluded to, this is similar to storing a date as a string - you lose the advantages and type-safety that the correct data type would give you, and instead add extra pain, uncertainty and overhead for no benefit.
The question isn't really what advantages CLOBs have over BLOBs for storing character data; the question is really the reverse: what advantages do BLOBs have over CLOBs for storing character data? And the answer is usually that there are none.
@Boneist mentions the recommendation to store JSON as BLOBs, and there is more about that here.
(The only other reason I can think of off-hand is that you have to store data from multiple source character sets and want to preserve it exactly as you received it. But then either you are only storing it and will never examine or manipulate the data from within the database itself, and will only return it to some external application untouched - in which case you don't care about the character set, so you're handling purely binary data and shouldn't be thinking of it as character data at all, any more than you'd care whether an image you're storing is PNG or JPG or whatever. Or you will need to work with the data, and so will have to record which character set each BLOB object represents, so you can convert as needed.)
I am new to implementing caches.
The key is a simple string (i.e. 10 characters long). No collisions.
The value is a large string. Is storing this in a MySQL database looked down upon or is it fine?
Alternatives: memory, filesystem, NoSQL. What do you think about them?
Thanks!
I don't see any problem in doing this. Older versions of MySQL don't support the JSON data type, so you won't be able to run queries against the value, but in newer MySQL versions you can.
https://dev.mysql.com/doc/refman/8.0/en/json.html
I have a MariaDB database which uses dynamic columns.
There are around 10 such columns, because the data comes from many different devices and each of the devices has different attributes. The devices send some binary data which is converted into CSV and then inserted. I don't have control over this at all.
Now I am planning to migrate to Oracle Database 12.2, but I'm not sure how to migrate the dynamic columns to Oracle. Any ideas, please?
Oracle RDBMS doesn't support this feature natively, so you will have to write some procedures to implement something analogous to the MariaDB calls.
The closest functionality to dynamic columns is JSON. You're moving to Oracle 12.2, which has pretty good JSON support. Unless your data is very complicated with lots of nesting, it should be trivial to turn CSV into JSON. Once you have JSON, it is easy to insert, maintain and retrieve the data using Oracle's functionality.
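The CSV-to-JSON step could be sketched in Java roughly like this (the header names and CSV layout are invented for illustration, since the real device attributes aren't known; a proper JSON library would also handle escaping, quoting and data types):

```java
import java.util.StringJoiner;

public class CsvToJson {
    // Turns one simple CSV line into a flat JSON object by pairing each
    // value with a header name. Assumes plain values with no embedded
    // commas, quotes or backslashes -- a sketch, not production code.
    static String toJson(String[] headers, String csvLine) {
        String[] values = csvLine.split(",", -1);
        StringJoiner json = new StringJoiner(",", "{", "}");
        for (int i = 0; i < headers.length; i++) {
            String value = i < values.length ? values[i] : "";
            json.add("\"" + headers[i] + "\":\"" + value + "\"");
        }
        return json.toString();
    }

    public static void main(String[] args) {
        // Hypothetical device attributes; each device type would supply
        // its own header list.
        String[] headers = {"device_id", "temperature", "battery"};
        System.out.println(toJson(headers, "dev42,21.5,87"));
        // {"device_id":"dev42","temperature":"21.5","battery":"87"}
    }
}
```

The resulting JSON strings can then be inserted into an ordinary column and queried with Oracle's SQL/JSON functions.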
We're in the process of converting our database from Sybase to Oracle and we've hit a performance problem. In Sybase, we had a TEXT field and replaced it with a CLOB in Oracle.
This is how we accessed the data in our java code:
while (rs.next()) {
    String clobValue = rs.getString(1); // This takes 176ms in Oracle!
    // ...
}
The database is across the country, but still, we didn't have any performance problems with Sybase and its retrieval of TEXT data.
Is there something we can do to increase this performance?
By default, LOBs are not fetched along with the table data, and it takes an extra round-trip to the database to fetch them in getString.
If you are using Oracle's .NET provider, you may set InitialLOBFetchSize in the data reader settings to a value large enough to accommodate your large objects in memory, so they can be fetched in their entirety along with the other data.
Some other options:
Are the LOB columns being stored in-line (in the data row) or out-of-line (in a separate place)? If the LOB columns tend to be small (under 4k in size), you can use the ENABLE STORAGE IN ROW clause to tell Oracle to store the data in-line where possible.
If your LOBs are larger and frequently used, are they being stored in the buffer cache? The default in 10g is that LOBs are NOCACHE, meaning each I/O operation against them involves a direct read to the database - a synchronous disk event, which can be slow. A database trace would reveal significant waits on direct path read / direct path write events.
This chapter of the Oracle Application Developer's Guide - Large Objects would be valuable reading.
We decided to take a different approach which will allow us to ignore clob performance.
Our current code (I didn't write it!) queries a table in the database and retrieves all of the information in the table, including the clobs, even though it wasn't quite necessary to retrieve them all the time. Instead, we created another field with the first 4k characters in a varchar and query that instead. Then, when we need the full clob, we query it on an individual basis, rather than all clobs for all records.
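That workaround can be sketched in Java along these lines (a hypothetical helper, not our actual code; the 4000-character cutoff mirrors the varchar field described above):

```java
public class ClobPreview {
    // First-N-characters "preview" kept in a VARCHAR column; the full
    // CLOB is fetched individually only when the text exceeds the limit.
    static final int PREVIEW_LIMIT = 4000;

    static String preview(String fullText) {
        return fullText.length() <= PREVIEW_LIMIT
                ? fullText
                : fullText.substring(0, PREVIEW_LIMIT);
    }

    // True when the preview is truncated, i.e. a follow-up query
    // against the CLOB column is needed for this one record.
    static boolean needsClobLookup(String fullText) {
        return fullText.length() > PREVIEW_LIMIT;
    }

    public static void main(String[] args) {
        String shortText = "fits in the preview column";
        System.out.println(needsClobLookup(shortText)); // false
        System.out.println(preview(shortText)); // prints the full text
    }
}
```

The bulk query then touches only cheap varchar data, and the expensive CLOB round-trip happens once per record that actually needs it.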
I was reading on internet these statements about SQL Server data types:
VARBINARY(MAX) - Binary strings with a variable length can store up to 2^31-1 bytes.
IMAGE - Binary strings with a variable length up to 2^31-1 (2,147,483,647) bytes.
Is there a really big technical difference between VARBINARY(MAX) and IMAGE data types?
If there is a difference: do we have to customize how ADO.NET inserts and updates an image data field in SQL Server?
They store the same data: this is as far as it goes.
"image" is deprecated and has a limited set of features and operations that work with it. varbinary(max) can be operated on like shorter varbinary (ditto for text and varchar(max)).
Do not use image for any new project: just search here for the issues folk have with image and text datatypes because of the limited functionality.
Examples from SO: One, Two
I think that technically they are similar, but it is important to notice the following from the documentation:
ntext, text, and image data types will be removed in a future version of Microsoft SQL Server. Avoid using these data types in new development work, and plan to modify applications that currently use them. Use nvarchar(max), varchar(max), and varbinary(max) instead.
Fixed and variable-length data types for storing large non-Unicode and Unicode character and binary data. Unicode data uses the UNICODE UCS-2 character set.
In fact, VARBINARY can store any data that can be converted into a byte array, such as files, and this is the same process that the IMAGE data type uses. So, from this point of view, both data types can store the same data.
But VARBINARY has a size property, while IMAGE accepts any size up to the data type's limit, so when using the IMAGE data type you will spend more resources to store the same data.
In Microsoft SQL Server the IMAGE data type really is deprecated, so you should bet on the VARBINARY data type.
But be careful: Microsoft SQL Server CE (including the latest 4.0 version) still uses the IMAGE data type, and this data type probably will not disappear so soon, because in the Compact Edition versions it is better than any other for fast file storage.
I inadvertently found one difference between them: you can insert a string into an image column but not into a varbinary one. Maybe that's why MS is deprecating the image type, as it really doesn't make sense to set an image with a string.