Implementing database layer in Spring / Jetspeed

I'm using a Spring/Jetspeed portal architecture and want to add a new portal which uses persistent storage. The Jetspeed database itself is MySQL, so I'm thinking of adding some extra tables to this DB; I don't think I can re-use the existing Jetspeed tables? I could use JDBC to query these tables, but I'd like to abstract this layer. What technologies should I use? I like the idea of NoSQL and this might be a good project to introduce it in. I will just be adding approximately 3 tables.

Related

Best practice for combining database with object store in Spring Boot App

I have an application which stores entities in MariaDB. The entity has some relations to other entities, and one of the child entities has a binary attribute. See the following:
Parent
  \ 1-n ChildA
      - attrA1
      - ...
  \ 1-n ChildB
      - attrB1
      - binaryAttr
Storing the binary files in the DB has an impact on the DB size, of course, and thereby an impact on our backup concept. Besides, the binary data doesn't necessarily have to be in the DB anyway.
I'm thinking about combining MariaDB with an S3 compatible object store, so that the structured data is persisted in DB and the binary files in the object store.
We are using Spring Boot (Data JPA) with Apache Camel.
The naïve approach would be to store the binary file with a UUID key in the object store and afterwards persist the rest of the entity, with the reference (UUID), to the DB.
But there is no easy transaction handling for this, nor transparent handling of the entity (I have to handle persisting the entity and the binary data separately).
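For illustration, a minimal sketch of that naive approach, assuming the AWS SDK v2 S3 client, a Spring Data JPA repository for ChildB, and made-up names for the bucket and the key attribute:

import java.util.UUID;

import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

import software.amazon.awssdk.core.sync.RequestBody;
import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.PutObjectRequest;

@Service
public class ChildBService {

    private final S3Client s3;                 // configured elsewhere
    private final ChildBRepository repository; // assumed: extends JpaRepository<ChildB, Long>

    public ChildBService(S3Client s3, ChildBRepository repository) {
        this.s3 = s3;
        this.repository = repository;
    }

    @Transactional
    public ChildB save(ChildB entity, byte[] binary) {
        // 1. Upload the binary under a random key; this is NOT covered by the DB transaction.
        String key = UUID.randomUUID().toString();
        s3.putObject(PutObjectRequest.builder()
                        .bucket("my-binary-bucket")
                        .key(key)
                        .build(),
                RequestBody.fromBytes(binary));

        // 2. Persist the structured data with only the reference to the object store.
        entity.setBinaryKey(key); // hypothetical attribute holding the S3 key
        return repository.save(entity);

        // If the transaction rolls back after the upload, the S3 object is orphaned,
        // so cleanup/compensation has to be handled manually - exactly the drawback above.
    }
}

Any framework or pattern that addresses the question would have to add something on top of this, e.g. a cleanup job for orphaned objects or an outbox-style compensation step.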
Is there a java / spring based framework to overcome the drawbacks (transaction handling, transparent handling) of two different "datasources"?
What is the best practice to handle such a scenario or is it totally unrecommended?
Is there an extension mechanism for Spring / Hibernate to handle object store persistency within the hibernate orm mapping process?
Is it possible to implement one repository for persisting / loading entity and binary data at once?
Can anyone give me some direction?

Embedded H2 Database for dynamic files

In our application, we need to load large CSV files and fetch some data out of them, for example getting the distinct values from the CSV file. For this, we decided to go with an in-memory DB like H2, as there is no need to store the data in persistent storage.
However, the file is so dynamic that the columns may not be the same. I need to load the file to the H2 database to a table that is temporary for that session.
Tech Stack is Spring boot and H2.
The examples I see on forums use a standard entity that knows what fields the table has. However, in my case the table columns will be dynamic.
I tried the below in Spring Boot:
public interface ImportCSVRepository extends JpaRepository<Object, String>
with
@Query(value = "CREATE TABLE TEST AS SELECT * FROM CSVREAD('test.csv');", nativeQuery = true)
But this gives an unmanaged entity error. I understand why the error is thrown, but I am not sure how to achieve this. Also, please clarify whether I should use Spring Batch.
You can use JdbcTemplate to manually create tables and query/update the data in them.
An example of how to create a table with JdbcTemplate is sketched below.
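A minimal sketch, assuming H2's CSVREAD function from the question and a JdbcTemplate auto-configured by Spring Boot (table names, column names and the file path are placeholders, and concatenating identifiers like this is only acceptable for trusted input):

import java.util.List;
import java.util.Map;

import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.stereotype.Service;

@Service
public class CsvImportService {

    private final JdbcTemplate jdbcTemplate;

    public CsvImportService(JdbcTemplate jdbcTemplate) {
        this.jdbcTemplate = jdbcTemplate;
    }

    public void importCsv(String tableName, String csvPath) {
        // H2 infers the columns from the CSV header, so no @Entity class is needed.
        jdbcTemplate.execute(
                "CREATE TABLE " + tableName + " AS SELECT * FROM CSVREAD('" + csvPath + "')");
    }

    public List<Map<String, Object>> distinctValues(String tableName, String columnName) {
        // Results come back as generic maps because the columns are only known at runtime.
        return jdbcTemplate.queryForList(
                "SELECT DISTINCT " + columnName + " FROM " + tableName);
    }
}

Since this is plain SQL rather than a managed entity, there is no unmanaged-entity problem, and the table only lives as long as the in-memory database does.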
Dynamically creating tables and defining new entities (or modifying existing ones) is hardly possible with Spring Data repositories and @Entity classes. You should probably also check some NoSQL DBs like MongoDB - it's easier to define documents (or key-value objects, as in Redis) with dynamic structures in them.

Get Schema Size using JPA Repository

We are using a multi-tenant application. Each tenant will have its own schema. Now we need to determine the schema size (memory size). Is there a way provided by Spring JPA for this? How can we use a custom JPA repository to get the schema size (memory size) instead of writing our own implementation?
Spring Data JPA can't help you with that.
And as far as I know, neither can any JPA implementation.
The reason is that all of these tools lack the necessary information:
How you store information in the database. Databases offer lots of storage options, like leaving empty space in blocks to allow for updates that grow rows, or compressing data in order to trade storage space for the cost of (de)compression, and many more.
How much data you put in your tables and what it looks like. For example, you probably have many attributes of type String in your model. But how long are they on average? 2 characters? Or 2000?
So for this kind of information, you should look at database-level tools that offer this functionality.
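If it has to be triggered from the application anyway, one workaround is to bypass JPA and ask the database directly. A minimal sketch, assuming MySQL/MariaDB's information_schema (other databases need different queries, and the result is on-disk size, not in-memory size):

import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.stereotype.Repository;

@Repository
public class SchemaSizeDao {

    private final JdbcTemplate jdbcTemplate;

    public SchemaSizeDao(JdbcTemplate jdbcTemplate) {
        this.jdbcTemplate = jdbcTemplate;
    }

    /** Approximate size of a schema in bytes (data + indexes), as reported by the database. */
    public long schemaSizeInBytes(String schemaName) {
        Long size = jdbcTemplate.queryForObject(
                "SELECT COALESCE(SUM(data_length + index_length), 0) "
                        + "FROM information_schema.tables WHERE table_schema = ?",
                Long.class, schemaName);
        return size != null ? size : 0L;
    }
}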

Which is better ORM (Apache Cayenne) , JDBC or SpringJDBC?

I am working on multiple databases, like MSSQL Server and PostgreSQL, with heavy transactions and complex queries. I have read that plain JDBC is faster than an ORM. I was thinking of using an ORM because I do not want to write different queries for different databases for the same work, and also to standardize my DAO layer. I am mapping my database tables without using foreign keys, and for an ORM like Apache Cayenne I have to map tables with foreign key constraints so I can use joins or any other multi-table operations. Is it good to use an ORM, or is plain JDBC fine?
From your problem description, you already have an understanding of the tradeoffs involved. So this is really a decision that you need to make for yourself based on those tradeoffs.
My only advice here would be to take a second look at the performance requirements. While an ORM does introduce the overhead of creating, storing and managing objects, in all but a few cases you can safely ignore this overhead for the sake of a better abstraction. Also, when working with JDBC you very often end up writing your own code to convert ResultSets to objects, which has its own overhead. So you may not end up with faster code, while forfeiting all the benefits of a clean object model and a framework that manages it.
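To make that concrete, this is roughly what the hand-written mapping code looks like with plain JDBC (the Customer table and class are made up for the example; an ORM generates this plumbing for you):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

import javax.sql.DataSource;

public class CustomerDao {

    public record Customer(long id, String name, String city) {}

    private final DataSource dataSource;

    public CustomerDao(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    public List<Customer> findByCity(String city) throws SQLException {
        List<Customer> result = new ArrayList<>();
        try (Connection con = dataSource.getConnection();
             PreparedStatement ps = con.prepareStatement(
                     "SELECT id, name, city FROM customer WHERE city = ?")) {
            ps.setString(1, city);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    // Manual ResultSet-to-object mapping, repeated for every query and table.
                    result.add(new Customer(rs.getLong("id"),
                                            rs.getString("name"),
                                            rs.getString("city")));
                }
            }
        }
        return result;
    }
}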
So my own preference is to go with a better abstraction (ORM in this case), and then use the framework tools for optimizing the performance. E.g. to speed up processing of large ResultSets Cayenne provides a few techniques: result iterators, DataRow queries, paginated queries, etc.
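A rough sketch of two of those options, assuming the Cayenne 4.x ObjectSelect API and a generated Artist entity (treat the exact method names as an assumption and check the Cayenne docs for your version):

import java.util.List;

import org.apache.cayenne.DataRow;
import org.apache.cayenne.ObjectContext;
import org.apache.cayenne.query.ObjectSelect;

public class LargeResultExamples {

    void readLargeResults(ObjectContext context) {
        // Paginated query: only one page of objects is resolved in memory at a time.
        List<Artist> artists = ObjectSelect.query(Artist.class)
                .pageSize(100)
                .select(context);

        // DataRow query: skips full object materialization and returns plain row maps.
        List<DataRow> rows = ObjectSelect.dataRowQuery(Artist.class)
                .select(context);
    }
}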
On the other hand I would use JDBC or something like MyBatis when it is not possible to cleanly model your data as entities. E.g. when there are no natural relationships, all access happens via stored procedures, etc. Doesn't seem like your case though.

How to retrieve the data from the database without using the Apache Jackrabbit datastore?

I have integrated Jackrabbit with an Oracle database and I am storing data using Jackrabbit. If I don't want to retrieve the data using Jackrabbit, in what way can I get the data? In the database, the data is stored as a blob type.
The way Jackrabbit stores the data in the DB is an implementation detail, and it does not magically map this into a "nice" DB schema, if that's what you mean. (The hierarchical nature and all the JCR features make this impossible.) It's a bit like having a Unix file system and then asking how you can read the low-level inodes etc. from the file system implementation - you really should not.
Last but not least, note that while Jackrabbit is running, nothing else (except for a Jackrabbit cluster setup) must write to the DB (the tables used by Jackrabbit), as this will easily lead to data corruption.
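If the underlying goal is just to get the content (including the blobs) back out, the supported path is the JCR API itself rather than the Oracle tables. A minimal sketch, assuming an already-obtained Repository instance and placeholder credentials, node path and property names:

import java.io.InputStream;

import javax.jcr.Binary;
import javax.jcr.Node;
import javax.jcr.Repository;
import javax.jcr.Session;
import javax.jcr.SimpleCredentials;

public class JcrReadExample {

    public void readFile(Repository repository) throws Exception {
        Session session = repository.login(new SimpleCredentials("admin", "admin".toCharArray()));
        try {
            // Standard nt:file layout: the binary lives on the jcr:content child node.
            Node file = session.getNode("/content/documents/report.pdf");
            Node content = file.getNode("jcr:content");
            Binary binary = content.getProperty("jcr:data").getBinary();
            try (InputStream in = binary.getStream()) {
                // consume the stream, e.g. copy it to a file or an HTTP response
            } finally {
                binary.dispose();
            }
        } finally {
            session.logout();
        }
    }
}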
As @TedTrippin already mentioned above, an ORM framework would make things much easier. But if you really want to do it manually in Oracle, the approach would be:
Study the code of the OCM (http://jackrabbit.apache.org/jcr/object-content-mapping.html), then get the content from Oracle according to the logic of the associations and relations, probably not in one but in multiple queries per document; possibly with user-defined functions, which are supported in Oracle and might make things easier.
It would be interesting to know the background of your question. You tagged it with "Spring" and "CMS". I don't see any reason why you would want to access the data directly from Oracle; it's tedious. In case you want to provide an API for the content to an external system, or in case you have lost a CMS that was once in front of the Jackrabbit repo and was just using it as a content store, you could still use such an ORM / OCM framework standalone to make it easier to access the data.
