Spring Batch creates metadata tables in the database automatically. For now I have forced Spring Batch to keep the metadata tables in memory, following one of the Stack Overflow answers. From the documentation I learned that the metadata tables are used for recovery and restart. My batch job does not require restart, because with my functionality I can simply process the file in the next batch run. But I am unable to understand what kind of recovery Spring Batch provides by using the metadata tables. I am asking because I would like to know whether it is advisable to turn off the creation of the metadata tables. Thank you.
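For reference, a minimal sketch of the kind of setup described above, assuming Spring Batch 4.x with spring-boot-starter-batch; the class name is hypothetical. With no DataSource handed to Spring Batch, it falls back to its in-memory, map-based JobRepository, so no metadata tables are created in the business database (and nothing survives a restart):

```java
import javax.sql.DataSource;

import org.springframework.batch.core.configuration.annotation.DefaultBatchConfigurer;
import org.springframework.context.annotation.Configuration;

@Configuration
public class InMemoryBatchMetadataConfig extends DefaultBatchConfigurer {

    @Override
    public void setDataSource(DataSource dataSource) {
        // Intentionally empty: do not pass the application DataSource to Spring Batch,
        // so the job metadata stays in an in-memory map instead of database tables.
    }
}
```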
Related
I am debugging a Spring Batch application that uses Oracle. I am seeing a lot of tables getting modified by the Spring Batch job, and it is very difficult to go through the code base to understand which tables were modified.
Is there an easy way to know which tables and records were modified by the Spring Batch job?
Is there any tool I can use to:
monitor a folder, so that when an XML file is added, it checks the file name (whether it exists in a list of expected names), validates the file against an XSD schema, extracts the data contained in the XML, and loads it into an Oracle database; if any error occurs in that process it rejects the file (writes its name to a file of rejected files)? I don't expect it to fulfill all of these features, but at least help me with monitoring and automating the process.
Thanks in advance,
Yes, any ETL or data integration tool should be able to do that. I’ve implemented a project that had most of those features in the past using Pentaho Data Integration.
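If you end up scripting part of it yourself instead, here is a minimal sketch of just the XSD-validation step using the standard javax.xml.validation API; the class name and file locations are hypothetical:

```java
import java.io.File;

import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;

public class XmlSchemaCheck {

    /** Returns true if the XML file conforms to the given XSD, false otherwise. */
    public static boolean isValid(File xmlFile, File xsdFile) {
        try {
            SchemaFactory factory =
                    SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(xsdFile);
            Validator validator = schema.newValidator();
            validator.validate(new StreamSource(xmlFile));
            return true;
        } catch (Exception e) {
            // Reject the file: the caller can append its name to the rejected-files list.
            return false;
        }
    }
}
```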
We want to move tables from Oracle to Cassandra every day, because the tables are updated in Oracle every day. When I searched for this, I found these options:
Extract the Oracle tables to a file, then write them to Cassandra.
Use Sqoop to get the tables from Oracle, write a MapReduce job, and insert into Cassandra.
I am not sure which way is appropriate. Also, are there other options?
Thank you.
Option 1
Extracting the Oracle tables to a file and then writing to Cassandra manually every day can be a tiresome process unless you are scheduling a cron job. I have tried this before, but if the process fails, logging it might be an issue. If you use this process, export to CSV, and try to write to Cassandra, then I would suggest using the Cassandra bulk loader (https://github.com/brianmhess/cassandra-loader).
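If you script the export half of this option yourself, a minimal sketch over plain JDBC might look like the following; the connection URL, credentials, table, and column names are all hypothetical, and the Oracle JDBC driver is assumed to be on the classpath:

```java
import java.io.PrintWriter;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class OracleTableToCsv {

    public static void main(String[] args) throws Exception {
        String url = "jdbc:oracle:thin:@//db-host:1521/ORCLPDB1"; // hypothetical
        try (Connection conn = DriverManager.getConnection(url, "user", "password");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT id, name, updated_at FROM customers");
             PrintWriter out = new PrintWriter("customers.csv")) {

            // Write one CSV line per row; the resulting file can then be fed to
            // a bulk-loading tool on the Cassandra side.
            while (rs.next()) {
                out.printf("%s,%s,%s%n",
                        rs.getString("id"),
                        rs.getString("name"),
                        rs.getString("updated_at"));
            }
        }
    }
}
```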
Option 2
I haven't worked with this, so can't speak about this.
Option 3 (I use this)
I use an open-source tool, Pentaho Data Integration (Spoon) (https://community.hitachivantara.com/docs/DOC-1009855-data-integration-kettle), to solve this problem. It's a fairly simple process in Spoon. You can automate this process by using a Carte server (Spoon server), which has logging capabilities as well as automatic restarting if the process fails in between.
Let me know if you found any other solution that worked for you.
I am using JPA and MySQL in Spring Boot. How do I create the initial content data for the database?
For example, I need to create the basic sections, a default admin user, etc.
Thanks.
I would recommend that you take a look at Flyway; it is nicely integrated into Spring Boot.
We use it to create the initial database, and for adding new tables or modifying the database when deploying a new version of our application.
I would recommend that you create a script /resources/db/migration/V1__Initial.sql which just has the table layout, and then a V2__data.sql with the initial data.
A script can only be run once, and you can't modify it after it has been run; this information is stored in a table named schema_version, which you will probably have to delete or manipulate during development. Here is a link to how it works. These days I would never do a real-world project without using it.
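As a sketch of what the initial-data step can look like, Flyway also accepts Java-based migrations alongside the plain SQL scripts described above; the table and column names below are hypothetical:

```java
package db.migration;

import java.sql.PreparedStatement;

import org.flywaydb.core.api.migration.BaseJavaMigration;
import org.flywaydb.core.api.migration.Context;

// Plays the role of V2__data.sql: inserts the default admin user once,
// and Flyway records in its schema history that this version has been applied.
public class V2__Initial_data extends BaseJavaMigration {

    @Override
    public void migrate(Context context) throws Exception {
        try (PreparedStatement insert = context.getConnection().prepareStatement(
                "INSERT INTO users (username, role) VALUES (?, ?)")) {
            insert.setString(1, "admin");
            insert.setString(2, "ADMIN");
            insert.executeUpdate();
        }
    }
}
```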
Is there any utility to sync data between an Oracle and a Neo4j database? I want to use Neo4j in read-only mode, and all writes will happen to the Oracle DB.
I think this depends on how often you want to have the data synced. Are you looking for a periodic sync/ETL process (say hourly or daily), or are you looking for live updates into Neo4j?
I'm not aware of tools designed for this, but it's not terribly difficult to script yourself.
A periodic sync is obviously easiest. You can do that directly using the Java API and connecting via JDBC to Oracle. You could also just dump the data from Oracle as a CSV and import it into Neo4j. This would be done similarly to how data is imported from PostgreSQL in this article: http://neo4j.com/developer/guide-importing-data-and-etl/
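A minimal sketch of that periodic JDBC-to-Neo4j approach, assuming the Neo4j Java (Bolt) driver and an Oracle JDBC driver on the classpath; the connection details, table, label, and property names are hypothetical:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

import org.neo4j.driver.AuthTokens;
import org.neo4j.driver.Driver;
import org.neo4j.driver.GraphDatabase;
import org.neo4j.driver.Session;
import org.neo4j.driver.Values;

public class OracleToNeo4jSync {

    public static void main(String[] args) throws Exception {
        try (Connection oracle = DriverManager.getConnection(
                     "jdbc:oracle:thin:@//db-host:1521/ORCLPDB1", "user", "password");
             Driver neo4j = GraphDatabase.driver("bolt://localhost:7687",
                     AuthTokens.basic("neo4j", "password"));
             Session session = neo4j.session();
             Statement stmt = oracle.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT id, name FROM customers")) {

            while (rs.next()) {
                // MERGE keeps the read-only graph copy in step with Oracle:
                // existing nodes are updated, missing ones are created.
                session.run("MERGE (c:Customer {id: $id}) SET c.name = $name",
                        Values.parameters("id", rs.getString("id"),
                                          "name", rs.getString("name")));
            }
        }
    }
}
```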
There is a SO response for exporting data from Oracle using sqlplus/spool:
How do I spool to a CSV formatted file using SQLPLUS?
If you're looking for live syncing, you'd probably do this either through monitoring the transaction log or by adding triggers onto your tables, depending on the complexity of your data.