Data migration among multiple databases with spring-boot

I am trying to build an application that migrates data from one database to another (multiple databases will be used). The user can select a table at runtime and push it to the target database. I am using Spring Boot and Spring Data JPA, and I am experimenting with Flyway.
My issue: how do I read the complete schema from the source database, given that the user can select the source database at runtime?
Sumit

You can obtain a DatabaseMetaData object from a JDBC connection and use it to obtain all kinds of information about the database, e.g. the list of tables.
See the following example, which I took from a tutorial.
// Uses java.sql.DatabaseMetaData and java.sql.ResultSet; "connection" is an open java.sql.Connection.
DatabaseMetaData databaseMetaData = connection.getMetaData();
// "%" matches every table name; the type filter restricts the result to entries of type TABLE.
try (ResultSet resultSet = databaseMetaData.getTables(null, null, "%", new String[]{"TABLE"})) {
    System.out.println("Printing TABLE_TYPE \"TABLE\" ");
    System.out.println("----------------------------------");
    while (resultSet.next()) {
        System.out.println(resultSet.getString("TABLE_NAME"));
    }
}
Note: JPA is most likely not the right tool for the job. Consider using Spring's JdbcTemplate instead.
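Once the user has picked a table, the same metadata object can also describe its columns. The following is a minimal sketch, not a complete schema migrator: TableSchemaReader, Column, and buildCreateTable are hypothetical names, and the rule of appending a length only to types whose name contains CHAR is a simplifying assumption, since type names and sizes differ between database vendors.

```java
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical helper: reads the columns of one table and renders a CREATE TABLE statement. */
final class TableSchemaReader {

    /** Minimal column description; illustrative only, not part of any library. */
    static final class Column {
        final String name;
        final String typeName;
        final int size;
        final boolean nullable;

        Column(String name, String typeName, int size, boolean nullable) {
            this.name = name;
            this.typeName = typeName;
            this.size = size;
            this.nullable = nullable;
        }
    }

    /** Reads the columns of the given table through JDBC metadata. */
    static List<Column> readColumns(Connection connection, String schema, String table)
            throws SQLException {
        DatabaseMetaData metaData = connection.getMetaData();
        List<Column> columns = new ArrayList<>();
        // "%" matches every column; rows arrive ordered by ordinal position.
        try (ResultSet rs = metaData.getColumns(null, schema, table, "%")) {
            while (rs.next()) {
                columns.add(new Column(
                        rs.getString("COLUMN_NAME"),
                        rs.getString("TYPE_NAME"),
                        rs.getInt("COLUMN_SIZE"),
                        rs.getInt("NULLABLE") != DatabaseMetaData.columnNoNulls));
            }
        }
        return columns;
    }

    /** Renders a CREATE TABLE statement; assumes trusted identifiers and compatible type names. */
    static String buildCreateTable(String table, List<Column> columns) {
        StringBuilder sql = new StringBuilder("CREATE TABLE ").append(table).append(" (");
        for (int i = 0; i < columns.size(); i++) {
            Column c = columns.get(i);
            if (i > 0) {
                sql.append(", ");
            }
            sql.append(c.name).append(' ').append(c.typeName);
            // Assumption: only character types carry a length in the generated DDL.
            if (c.typeName.toUpperCase(Locale.ROOT).contains("CHAR") && c.size > 0) {
                sql.append('(').append(c.size).append(')');
            }
            if (!c.nullable) {
                sql.append(" NOT NULL");
            }
        }
        return sql.append(')').toString();
    }
}
```

The generated statement would still need type mapping when source and target are different database products, and keys, indexes, and constraints are not covered here; DatabaseMetaData exposes those through getPrimaryKeys, getImportedKeys, and getIndexInfo.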

Related

Java based ETL Application

I want to build a Spring-based ETL application. It should be able to create an exact copy of any table in a database, so the structure of the table is not known to me beforehand and entities cannot be defined in the application.
The idea is to provide some external configuration to the application for each table. The application should then be able to create an exact copy of the table.
I cannot use Spring Data JPA because it requires entities, so I am planning to use Spring's JdbcTemplate. Is JdbcTemplate the right tool for my application?
I do not want to use Pentaho; rather, I want to build something like it in Java.
You can use Spark.
Here is an example of how you can do it:
import java.util.Properties;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class DemoApp {
    public static void main(String[] args) {
        // Local single-threaded session, as in a quick demo.
        SparkSession spark = SparkSession.builder()
                .master("local[1]")
                .appName(DemoApp.class.getName())
                .getOrCreate();
        // Load the whole table into a DataFrame over JDBC.
        Dataset<Row> table1 = spark.read().jdbc(
                "jdbc:postgresql://127.0.0.1:5432/postgres", "demo.table", getConnectionProperties());
    }

    // Connection settings for the PostgreSQL JDBC driver.
    private static Properties getConnectionProperties() {
        Properties connectionProperties = new Properties();
        connectionProperties.put("user", "postgres");
        connectionProperties.put("password", "password");
        connectionProperties.put("driver", "org.postgresql.Driver");
        connectionProperties.put("stringtype", "unspecified");
        return connectionProperties;
    }
}
You can read several tables and after that join them or do other things you like.
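For the plain JDBC route the question asks about, a table can be copied without entities by reading the column names from the result set metadata at runtime. This is a minimal sketch under several assumptions: TableCopier and its methods are hypothetical names, the target table already exists with compatible columns, and the table name comes from a trusted source (for example the metadata listing), because it is concatenated into the SQL text.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper: copies all rows of one table between two JDBC connections. */
final class TableCopier {

    /** Builds "INSERT INTO t (a, b) VALUES (?, ?)" for the given column names. */
    static String buildInsertSql(String table, List<String> columns) {
        String placeholders = String.join(", ", Collections.nCopies(columns.size(), "?"));
        return "INSERT INTO " + table + " (" + String.join(", ", columns)
                + ") VALUES (" + placeholders + ")";
    }

    /** Copies every row of the table and returns the number of rows read from the source. */
    static long copy(Connection source, Connection target, String table, int batchSize)
            throws SQLException {
        try (Statement select = source.createStatement();
             ResultSet rows = select.executeQuery("SELECT * FROM " + table)) {
            // Discover the column names at runtime instead of mapping an entity.
            ResultSetMetaData meta = rows.getMetaData();
            List<String> columns = new ArrayList<>();
            for (int i = 1; i <= meta.getColumnCount(); i++) {
                columns.add(meta.getColumnName(i));
            }
            long copied = 0;
            try (PreparedStatement insert =
                         target.prepareStatement(buildInsertSql(table, columns))) {
                while (rows.next()) {
                    for (int i = 1; i <= columns.size(); i++) {
                        insert.setObject(i, rows.getObject(i));
                    }
                    insert.addBatch();
                    // Flush periodically so the batch does not grow without bound.
                    if (++copied % batchSize == 0) {
                        insert.executeBatch();
                    }
                }
                insert.executeBatch(); // flush the remaining rows
            }
            return copied;
        }
    }
}
```

Passing values through getObject and setObject works for common column types, but vendor-specific types may need explicit handling when source and target are different products.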

Embedded H2 Database for dynamic files

In our application, we need to load large CSV files and fetch some data out of them, for example the distinct values in a column. For this, we decided to go with an in-memory database like H2, as there is no need to store the data in persistent storage.
However, the files are dynamic: the columns may differ from one file to the next. I need to load each file into an H2 table that is temporary for that session.
The tech stack is Spring Boot and H2.
The examples I see on forums use a standard entity that knows which fields the table has. In my case, however, the table columns are dynamic.
I tried the following in Spring Boot:
public interface ImportCSVRepository extends JpaRepository<Object, String>
with
@Query(value = "CREATE TABLE TEST AS SELECT * FROM CSVREAD('test.csv');", nativeQuery = true)
But this gives an unmanaged entity error. I understand why the error is thrown, but I am not sure how to achieve what I need. Also, please clarify whether I should use Spring Batch.
You can use JdbcTemplate to manually create tables and query/update the data in them.
An example of how to create a table with JdbcTemplate
Dynamically creating tables and defining new entities (or modifying existing ones) is hardly possible with Spring Data repositories and @Entity classes. You should probably also look at some NoSQL databases such as MongoDB - it is easier to define documents (or key-value objects, as in Redis) with dynamic structures in them.
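As a rough illustration of the table-creation approach, the statement from the question can be issued directly instead of through a repository. The sketch below uses plain JDBC so that it stands alone; with Spring, the same SQL string could be passed to JdbcTemplate.execute. CsvTableLoader and its methods are hypothetical names, and the identifier check is a deliberately strict assumption so that user-supplied names cannot alter the statement. When no column list is given, H2's CSVREAD takes the column names from the first line of the file, so no entity or fixed column set is needed.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: loads a CSV file into an H2 table and queries it without entities. */
final class CsvTableLoader {

    /** Accepts only simple identifiers, because names are concatenated into SQL. */
    static String requireIdentifier(String name) {
        if (name == null || !name.matches("[A-Za-z_][A-Za-z0-9_]*")) {
            throw new IllegalArgumentException("Not a simple SQL identifier: " + name);
        }
        return name;
    }

    /** Builds the H2 statement that creates a table from the CSV file. */
    static String buildCsvImportSql(String tableName, String csvPath) {
        String escapedPath = csvPath.replace("'", "''"); // escape quotes in the string literal
        return "CREATE TABLE " + requireIdentifier(tableName)
                + " AS SELECT * FROM CSVREAD('" + escapedPath + "')";
    }

    /** Creates the table on the given H2 connection. */
    static void load(Connection connection, String tableName, String csvPath)
            throws SQLException {
        try (Statement statement = connection.createStatement()) {
            statement.execute(buildCsvImportSql(tableName, csvPath));
        }
    }

    /** Returns the distinct values of one column as strings. */
    static List<String> distinctValues(Connection connection, String tableName, String column)
            throws SQLException {
        String sql = "SELECT DISTINCT " + requireIdentifier(column)
                + " FROM " + requireIdentifier(tableName);
        List<String> values = new ArrayList<>();
        try (Statement statement = connection.createStatement();
             ResultSet rs = statement.executeQuery(sql)) {
            while (rs.next()) {
                values.add(rs.getString(1));
            }
        }
        return values;
    }
}
```

A per-session table name (for example one derived from a session counter) would keep concurrent uploads apart; dropping the table when the session ends is left out of the sketch.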

Role of H2 database in Apache Ignite

I have an Apache Spark job, and one of its components fires queries at an Apache Ignite data grid using Ignite SQL; the query is a SqlFieldsQuery. I was going through a thread dump, and in one of the executor logs I saw the following:
org.h2.mvstore.db.TransactionStore.begin(TransactionStore.java:229)
org.h2.engine.Session.getTransaction(Session.java:1580)
org.h2.engine.Session.getStatementSavepoint(Session.java:1588)
org.h2.engine.Session.setSavepoint(Session.java:793)
org.h2.command.Command.executeUpdate(Command.java:252)
org.h2.jdbc.JdbcStatement.executeUpdateInternal(JdbcStatement.java:130)
org.h2.jdbc.JdbcStatement.executeUpdate(JdbcStatement.java:115)
org.apache.ignite.internal.processors.query.h2.IgniteH2Indexing.connectionForThread(IgniteH2Indexing.java:428)
org.apache.ignite.internal.processors.query.h2.IgniteH2Indexing.connectionForSpace(IgniteH2Indexing.java:360)
org.apache.ignite.internal.processors.query.h2.IgniteH2Indexing.queryLocalSqlFields(IgniteH2Indexing.java:770)
org.apache.ignite.internal.processors.query.GridQueryProcessor$5.applyx(GridQueryProcessor.java:892)
org.apache.ignite.internal.processors.query.GridQueryProcessor$5.applyx(GridQueryProcessor.java:886)
org.apache.ignite.internal.util.lang.IgniteOutClosureX.apply(IgniteOutClosureX.java:36)
org.apache.ignite.internal.processors.query.GridQueryProcessor.executeQuery(GridQueryProcessor.java:1666)
org.apache.ignite.internal.processors.query.GridQueryProcessor.queryLocalFields(GridQueryProcessor.java:886)
org.apache.ignite.internal.processors.cache.IgniteCacheProxy.query(IgniteCacheProxy.java:698)
com.test.ignite.cache.CacheWrapper.queryFields(CacheWrapper.java:1019)
The last frame is in my code, which executes a SQL fields query as follows:
SqlFieldsQuery sql = new SqlFieldsQuery(queryString).setArgs(args);
cache.query(sql);
According to my understanding, Ignite has its own data grid, which it uses to store the cache data and indexes. It only uses the H2 database to parse the SQL query and obtain a query execution plan.
But the thread dump shows that updates are being executed and transactions are involved. I don't understand the need for transactions or updates in a SQL SELECT query.
I want to know the following about the role of the H2 database in Ignite:
I went into the open source code of Apache Ignite (version 1.7.0) and saw that it opens a connection to a specific schema in the H2 database by executing the query SET SCHEMA schema_name (connectionForThread() method of the IgniteH2Indexing class). Is one schema or one table created for every cache? If yes, what information does it contain, since all the data is stored in Ignite's data grid?
I also came across another interesting thing in the source code: Ignite derives the schema name in H2 from the space name (see the queryLocalSqlFields() method of the IgniteH2Indexing class). What does this space name indicate, and is it internal to Ignite or configurable?
Does the setting of the schema and the connection to the H2 database happen for each of my SQL queries? If yes, is there any way to avoid this?
Yes, we call executeUpdate to set the schema. In Ignite 2.x we will be able to switch to Connection.setSchema for that. Right now we create a SQL schema for each cache, and you can create multiple tables in it, but this is going to change in the future. The schema does not actually contain anything; we just use some H2 APIs.
The space name is basically the same thing as the cache name. You can configure the SQL schema name for a cache using CacheConfiguration.setSqlSchema.
If you run queries using the same cache instance, the schema will not change.

Spring Data : relationships between 2 different data sources

In a Spring Boot Application project, I have 2 data sources:
a MySQL database (aka "db1")
a MongoDB database (aka "db2")
I'm using Spring Data JPA and Spring Data MongoDB, and it's working great... one at a time.
Say db1 handles "Players" and db2 handles "Teams" (with a list of player IDs). Is it possible to make the relationship between those 2 heterogeneous entities work (i.e. @ManyToOne, @Transactional, lazy/eager loading, etc.)?
For example, I want to be able to write:
List<Player> fooPlayers = teamDao.findOneById(foo).getPlayers();
EDIT: If possible, I'd like to find a solution working with any spring data project
Unfortunately, your conundrum has no solution in Spring Data.
One possibility is to write a DAO class of your own whose implementation queries both of your databases. A very crude and short example would be:
your DAO
{
    yourFind(id)
    {
        // find the team in db2 and get its list of player IDs
        findOneById(id)
        // for each player ID in that list, query db1
        getPlayer(playerId)
    }
}
I hope this points you in the right direction.
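The crude outline above could be fleshed out roughly as follows. This is only a sketch: TeamLookup, PlayerLookup, and CrossStoreTeamDao are hypothetical names standing in for the MongoDB-backed and JPA-backed repositories, which are kept behind small interfaces so the composition itself does not depend on either Spring Data module.

```java
import java.util.Collections;
import java.util.List;
import java.util.Optional;

/** Hypothetical view of db2 (MongoDB): resolves a team ID to the IDs of its players. */
interface TeamLookup {
    Optional<List<Long>> findPlayerIds(String teamId);
}

/** Hypothetical view of db1 (MySQL): loads players by their IDs. */
interface PlayerLookup<P> {
    List<P> findAllById(List<Long> ids);
}

/** Hand-written DAO that stitches the two stores together in application code. */
final class CrossStoreTeamDao<P> {
    private final TeamLookup teams;
    private final PlayerLookup<P> players;

    CrossStoreTeamDao(TeamLookup teams, PlayerLookup<P> players) {
        this.teams = teams;
        this.players = players;
    }

    /** Finds the team in db2, then fetches its players from db1 in a single call. */
    List<P> findPlayersOfTeam(String teamId) {
        return teams.findPlayerIds(teamId)
                .map(players::findAllById)
                .orElse(Collections.emptyList()); // unknown team: no players
    }
}
```

In a Spring application the two lookups would typically delegate to the existing repositories. The join happens in application code, so lazy loading and a transaction spanning both stores are not provided by this approach.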

Oracle XML DB and the Java Persistence API

I've stumbled upon Oracle's XML DB functionality, but so far, in my reading, I have only seen examples with JDBC implementations.
This is one of the examples:
import oracle.xdb.XMLType;
...
PreparedStatement stmt = conn.prepareStatement(
"SELECT e.poDoc FROM po_xml_tab e" );
ResultSet rset = stmt.executeQuery();
while( rset.next() ) {
// get the XMLType
XMLType poxml = ( XMLType )rset.getObject( 1 );
// get the XML as a string...
String poString = poxml.getStringVal();
}
According to the official XML DB Developer's Guide, there is an option to store data in an object-relational (structured) format. This makes me think that there should be an almost seamless link between XML DB and JPA. Maybe I'm missing something, or maybe it just doesn't exist?
Can they work together? Are there other options than JDBC? Or can I just use JPA for the queries and JDBC for the XML?
Edit: Is Oracle XML DB even worth using? It doesn't look like anyone uses it (judging by the views and responses so far).
You can use Oracle XDB features from JPA. If you have a column of type XMLType, you can map it into a JPA entity using a @Basic mapping to a String.
In EclipseLink you could use a Converter to map it to another data type, or use the DirectToXMLTypeMapping to map the DOM.
If you want to map the XML to objects, you could use a Converter that uses JAXB.
