Integration of multiple databases via Talend Open Studio for Big Data - business-intelligence

I have 11 MySQL databases. Each one contains 81 tables, and each table has the same schema in all 11 databases. My goal is to integrate these databases into a MongoDB database.
I'm working with Talend Open Studio for Big Data.
Can you please suggest an approach that could help me?

Related

I'm currently trying to migrate a large table from Cassandra to Oracle SQL and can't find many solutions

I've been researching and looking for ideas, but the closest thing to a solution I've found is a case where someone used PySpark to move an Oracle table into HDFS and then from HDFS into Cassandra. I was hoping there was another, clearer solution to this data migration.
Title suggests that it is Cassandra > Oracle. Message text says Oracle > HDFS > Cassandra (i.e. the opposite direction). What exactly are you trying to do?
Suppose it is the title that is correct. If there's no tool which would do the migration for you, from my - developer's - point of view, creating a database link in my Oracle schema which points to Cassandra might be a good option. Then I'd just write some SQL code to migrate data I need. Here's how: Access Cassandra Data as a Remote Oracle Database.
In short:
connect to Cassandra as an ODBC data source
set connection properties for compatibility with Oracle
configure the ODBC gateway, Oracle Net and Oracle database
write queries
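As a rough illustration of the last step, here is a minimal sketch of a helper that builds the kind of INSERT ... SELECT statement you would run in Oracle once the link exists. The table names, column names and the link name `cassandra_link` are hypothetical, and the helper only builds the SQL text; it does not connect to anything.

```python
from typing import Sequence


def build_copy_statement(
    target_table: str,
    remote_table: str,
    db_link: str,
    columns: Sequence[str],
) -> str:
    """Build an Oracle INSERT ... SELECT that reads rows over a database link.

    All names passed in are placeholders for illustration only.
    """
    if not columns:
        raise ValueError("at least one column is required")
    # Identifiers on the remote (non-Oracle) side are typically case-sensitive,
    # so they are double-quoted in the SELECT part.
    remote_cols = ", ".join(f'"{c}"' for c in columns)
    target_cols = ", ".join(columns)
    return (
        f"INSERT INTO {target_table} ({target_cols}) "
        f'SELECT {remote_cols} FROM "{remote_table}"@{db_link}'
    )


statement = build_copy_statement(
    "events_ora", "events", "cassandra_link", ["id", "payload"]
)
print(statement)
```

For a large table you would probably run such a statement in batches (for example by key range) rather than in one transaction, but that depends on your data.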

Talend Open Studio embedded DB (H2 or Derby)

I need to use an embedded DB (H2, Apache Derby) with Talend Open Studio. I saw that it's possible with Talend MDM, but couldn't find any tutorial on how to embed one in Talend Open Studio.
I have a large amount of data from different tables that is processed in a first step and must be stored locally before a second step of transformations. However, I can't use cache memory or files (CSV) as intermediate storage.
Any ideas? Help, please.
Talend MDM uses an internal database for the purposes of the product (data repository). Talend Open Studio (DI) does not use one (Talend Open Studio for Data Integration needs no internal information).
If you want to connect and extract data from your DB2 database, you can use DB2 components inside your talend jobs.

Migrate bulk MS Access MDB file data to Oracle 11g

I have 500+ MS Access MDB files with a huge amount of data (400+ GB in total). I have a requirement to migrate it to Oracle. All MDBs have the same table structures.
Any suggestions for the best approach would be helpful.
I don't know if it is the best approach, but Oracle SQL Developer (a free GUI tool) offers the Migration Workbench, which lets you migrate MS Access into Oracle. Here are two pages you'd want to visit:
Migrating a Microsoft Access Database to Oracle Database 11g: https://www.oracle.com/webfolder/technetwork/tutorials/obe/db/hol08/sqldev_migration/msaccess/migrate_microsoft_access_otn.htm
Migrating from Microsoft Access to Oracle: https://www.oracle.com/technetwork/database/migration/access-084991.html

Which open source dashboard/BI tools can work with MonetDB?

I'm trying to create a rich online dashboard to analyze web traffic with MonetDB.
Does anybody know how to integrate it with an open source solution?
I would recommend using:
DWH: MonetDB
ETL: PDI (Pentaho Data Integration)
OLAP: Mondrian OLAP (OLAP schema workbench tool)
Dashboards: Pentaho BI Server CE (CDF: charts portfolio CCC + maps, etc.)
For a quick start:
Fill your DWH (MonetDB) with sample data (one fact table, a few dimensions)
Create an OLAP schema on top of the DWH tables using the "OLAP schema workbench" tool
On Pentaho BI Server CE:
Add data sources - DWH connection, OLAP schema
Create dashboards (according to samples on BI Server)
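To make the "one fact table, a few dimensions" step concrete, here is a minimal sketch of a tiny star schema for web traffic. SQLite is used purely as a stand-in so the example runs anywhere; the table and column names are made up, and in MonetDB you would adapt the types and load real data through your ETL.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the DWH (assumption, not MonetDB).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dim_page (page_id INTEGER PRIMARY KEY, url TEXT NOT NULL);
    CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, day TEXT NOT NULL);
    CREATE TABLE fact_page_views (
        page_id INTEGER NOT NULL REFERENCES dim_page(page_id),
        date_id INTEGER NOT NULL REFERENCES dim_date(date_id),
        views   INTEGER NOT NULL
    );
    INSERT INTO dim_page VALUES (1, '/home'), (2, '/pricing');
    INSERT INTO dim_date VALUES (1, '2024-01-01'), (2, '2024-01-02');
    INSERT INTO fact_page_views VALUES (1, 1, 120), (1, 2, 80), (2, 1, 30);
    """
)


def views_per_page(connection):
    """Aggregate the fact table by the page dimension, as an OLAP measure would."""
    return connection.execute(
        """
        SELECT p.url, SUM(f.views)
        FROM fact_page_views AS f
        JOIN dim_page AS p ON p.page_id = f.page_id
        GROUP BY p.url
        ORDER BY p.url
        """
    ).fetchall()


totals = views_per_page(conn)
print(totals)
```

The OLAP schema you define in the workbench would then map a cube onto the fact table, with the dimension tables as its dimensions and `views` as a measure.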

Applying SCD on Talend MDM Server

I am using Talend Open Studio for MDM and I have a requirement to do version control on customer records.
When using an Oracle database, I can use tOracleSCD to capture the changes. Likewise, for MySQL, I can use tMysqlSCD.
But in Talend Open Studio for MDM, the only supported database is H2 and so I am storing all master records in a H2 database.
In this case, how can I achieve version control, as there is no component available in Talend for the H2 database?
The SCD components just set up triggers on the watched tables and provide an easy interface into reading the trigger output tables.
You could set the triggers up manually on the H2 database. One way to work out what is needed is to recreate the database in MySQL, use the MySQL SCD components to see what they do and how the data is read back in, and then recreate those steps with H2 components as part of a data integration task.
That said, Talend MDM has the concept of a journal which stores all of the changes made to a data record. The Talend Open Studio for MDM documentation has some more detailed information about how to view the journal. All changes made through the MDM interface should make an entry in the journal automatically.
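If you do end up versioning records by hand, the core idea can be sketched as follows. This is a minimal illustration of SCD Type 2 style versioning (close the current row, insert a new version), not a reproduction of what the Talend SCD components do internally. SQLite stands in for H2 so the example is runnable, and the `customer_history` table and its columns are hypothetical.

```python
import sqlite3

# In-memory SQLite database as a stand-in for H2 (assumption for the sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE customer_history (
        customer_id INTEGER NOT NULL,
        name        TEXT    NOT NULL,
        version     INTEGER NOT NULL,
        is_current  INTEGER NOT NULL
    )
    """
)


def save_customer(connection, customer_id, name):
    """Store a new version only when the tracked attribute has changed."""
    row = connection.execute(
        "SELECT name, version FROM customer_history "
        "WHERE customer_id = ? AND is_current = 1",
        (customer_id,),
    ).fetchone()
    if row is not None and row[0] == name:
        return row[1]  # unchanged: keep the current version
    next_version = 1 if row is None else row[1] + 1
    # Close the previous version (if any), then insert the new current one.
    connection.execute(
        "UPDATE customer_history SET is_current = 0 "
        "WHERE customer_id = ? AND is_current = 1",
        (customer_id,),
    )
    connection.execute(
        "INSERT INTO customer_history VALUES (?, ?, ?, 1)",
        (customer_id, name, next_version),
    )
    connection.commit()
    return next_version


save_customer(conn, 1, "Acme")
save_customer(conn, 1, "Acme")       # no change, no new version
save_customer(conn, 1, "Acme Corp")  # change, version 2
history = conn.execute(
    "SELECT name, version, is_current FROM customer_history ORDER BY version"
).fetchall()
print(history)
```

In a Talend job, the same lookup, update and insert steps could be expressed with generic JDBC or H2 components; whether that is worth it compared with relying on the MDM journal depends on how you need to query the history.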
