Creating txt files using Pentaho ETL

I'm currently trying to create txt files from all the tables in the dbo schema.
There are roughly 200-300 tables, so creating the files manually would take far too much time.
I was thinking of creating a loop.
So, as an example (using AdventureWorks2019):
select t.name as table_name
from sys.tables t
where schema_name(t.schema_id) = 'Person'
order by table_name;
This would get all the table names within the Person schema.
Then I would loop over them with:
Table input: select * from ${table_name}
But then I realized that for txt files I need to declare every field and its data type in Pentaho, which becomes a problem.
Any ideas how to produce these "backup" txt files?

Use Metadata Injection plus more queries to the schema catalog tables in SQL Server. You not only need to retrieve the table names; afterwards you also need to retrieve the columns in each table and their data types, and inject that information (metadata) into the Text file output step.
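As a minimal sketch of that second query (assuming the Person schema from your example and the ${table_name} variable from your loop), the column metadata to inject can be read from the standard catalog views:
select column_name, data_type, character_maximum_length, numeric_precision, numeric_scale
from information_schema.columns
where table_schema = 'Person'
  and table_name = '${table_name}'
order by ordinal_position;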
In the samples directory of your Spoon installation there is an example of how to use Metadata Injection; use it, along with the documentation, to build a simple example (the option to generate a transformation with the injected metadata is of great use for debugging).
I have something similar to copy data from one database to another, both in Oracle, and SQL Server has catalog tables similar to Oracle's for retrieving the information you need. I created a simple, almost empty transformation to read one table and write to another. This transformation has almost no information in it, only the source database in the Table Input step and the target database in the Table Output step.
Then I have a second transformation where I fill in all the information (metadata) to inject: the query to perform in the Table Input step, and everything needed in the Table Output step: the target table, whether to truncate before inserting, and the column mapping from (stream field) to (table field).
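As an illustration of what that second transformation reads to build the metadata in my Oracle-to-Oracle case (the schema and table names here are placeholders), a dictionary query like this provides the column list and types:
select column_name, data_type, data_length
from all_tab_columns
where owner = 'MY_SCHEMA'
  and table_name = 'MY_TABLE'
order by column_id;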


How to create a table definition from a csv file and also copy data at the same time

I want to load data from a csv file into Vertica. I don't want to create the table and then copy the data in two separate steps. Instead, I want to create the table, point it at the csv file, let Vertica figure out the column definitions (names, data types) itself, and then load the data.
Something like create table titanic_train () as COPY FROM '/data/train.csv' PARSER fcsvparser() rejected data as table titanic_train_rejected abort on error no commit;
Is it possible?
I guess that if a table has hundreds of columns, automating the create table, column definitions and data copy would be much easier/faster than doing these steps separately.
It's always several steps, no matter what.
Either use the built-in bits of Vertica:
CREATE FLEX TABLE foo();
COPY foo FROM '/data/mycsvs/foo.csv' PARSER fCsvParser();
SELECT COMPUTE_FLEXTABLE_KEYS_AND_BUILD_VIEW('foo');
-- THEN, either:
SELECT * FROM foo_view;
-- OR: create a ROS Table:
CREATE TABLE foo_ros AS SELECT * FROM foo_view;
Or get a CSV-to-DDL parser from the net, such as https://github.com/marco-the-sane/d2l, install it, and then run:
$ d2l -coldelcomma -chardelquote -drp -copy /data/mycsvs/foo.csv | vsql
So, in the second case, it's one step, but it calls both d2l and vsql.

DataSet to SQLite data transfer

During development I needed to create an SQLite database from a DataSet.
First create an empty SQLite database file, then use the DataSet to create the tables and dump the data.
Is this possible using some DataSet method?
I did it one way.
I used this query:
select * from sqlite_master where type='table'
That gives me the schema of the SQLite file I want to copy data from.
On the other end I execute the statements returned by the query above, which creates the tables dynamically.
Then, one by one, I used a DataAdapter to fill each DataSet DataTable with data and pushed it to the new database with the DataAdapter.Update method.
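As a sketch of that first step: the sql column of sqlite_master holds the original CREATE TABLE statement for each table, so a slightly narrower query returns exactly the DDL to replay against the target database:
select name, sql
from sqlite_master
where type = 'table'
  and name not like 'sqlite_%';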

Updating data in an external table

Let's assume the following scenario:
I have several users who will prepare .csv files (not aware of each other, so concurrency is possible).
The .csv files will always be in the same format.
The data in the .csv files will contain a list of ids together with some other columns like update_date.
Based on that data I will create a procedure that updates data in a real DB table.
The idea is to use external tables to make it as simple as possible for the .csv creators: they just put files in a folder and the rest is done for them; everything else is my job.
The questions are:
Can I have several files as the source for one external table, or do I need one external table for each file? (What I mean is: whenever there is a new call to load data from a csv, it should be added to the existing external table, so not all files are loaded at once.)
Can I update records/fields in an external table?
An external table basically allows you to query the data stored in the external file(s), so you can't issue an UPDATE on it.
You can:
1) add new files to the directory and ALTER the table:
ALTER TABLE my_ex LOCATION ('file1.csv','file2.csv');
2) of course also modify the existing files. There is no database state for an external table; each SELECT loads the data from the files, so you will always see the "updated" status.
** UPDATE **
An attempt to modify it (e.g. UPDATE) leads to ORA-30657: operation not supported on external organized table.
To be able to maintain state in the database, the data must first be copied into a regular table (CTAS: create table as select from the external table).
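For illustration only (the directory, file and column names are made up), a minimal sketch of such an external table over the csv files, followed by the CTAS copy into a regular, updatable table:
CREATE TABLE my_ex (
  id          NUMBER,
  update_date VARCHAR2(20)
)
ORGANIZATION EXTERNAL (
  TYPE ORACLE_LOADER
  DEFAULT DIRECTORY csv_dir
  ACCESS PARAMETERS (
    RECORDS DELIMITED BY NEWLINE
    FIELDS TERMINATED BY ','
    MISSING FIELD VALUES ARE NULL
  )
  LOCATION ('file1.csv', 'file2.csv')
)
REJECT LIMIT UNLIMITED;
-- copy into a regular table that can then be UPDATEd
CREATE TABLE my_data AS SELECT * FROM my_ex;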

How to create a table identical to another table in structure and constraints in Oracle?

I want to create a table (let's say table_copy) which has the same columns as another table (let's call it table_original) in an Oracle database, so the query will be like this:
create table table_copy as (select * from table_original where 1=0);
This will create a table, but the constraints of table_original are not copied to table_copy, so what should be done in this case?
Only NOT NULL constraints are copied when using Create Table As Select (CTAS). The others have to be created manually.
You might, however, query the data dictionary views to see the definitions of the constraints and implement them on your new table using PL/SQL.
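A minimal sketch of such a dictionary query (the table name TABLE_ORIGINAL and current-user ownership are assumptions):
select constraint_name, constraint_type, search_condition
from user_constraints
where table_name = 'TABLE_ORIGINAL';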
Another tool that might be helpful is Oracle Data Pump. You could import the table using the REMAP_TABLE option, specifying the name of the new table.
Use a database tool to extract the DDL needed for the constraints (SQL Developer does the job). Edit the resulting script to match the name of the new table.
Execute the script.
If you need to do this programmatically you can use a statement like this:
SELECT DBMS_METADATA.GET_DDL('TABLE','PERSON') FROM DUAL;
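In the same spirit (a sketch, with the table name assumed), the DDL of each individual constraint can be pulled from the dictionary as well:
SELECT DBMS_METADATA.GET_DDL('CONSTRAINT', constraint_name)
FROM user_constraints
WHERE table_name = 'TABLE_ORIGINAL'
  AND constraint_type IN ('C', 'P', 'U');
-- foreign keys need the object type 'REF_CONSTRAINT' instead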

Oracle query with Talend ETL

I am a beginner with Talend ETL. I would like to run a select query against an Oracle database, make some changes, and insert the result into a MySQL table, but I'm stuck. I have not found how to make select queries with Talend and don't know how to get started. Can anyone help me, please?
To run a select on an Oracle schema you need a tOracleInput component instance. You need to specify an Oracle connection (either a built-in connection or a repository-registered connection) and the output schema (the columns you need).
Then click on "Guess Query" to build the select SQL code according to your desired output schema. You can then modify the automatically generated SQL to fit your needs (i.e. add WHERE or ORDER BY clauses). Don't forget that the output schema and the selected columns must match (i.e. if you add a SQL-generated column to the select clause, you must add it to your output schema, too). Any valid SELECT syntax can be used here (including subselects, cursors, window functions over partitions and even weirder Oracle stuff).
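For example, a guessed query for a hypothetical PERSONS table (the table and column names here are made up) might look like the following, and can then be edited to add a WHERE clause:
SELECT PERSONS.ID, PERSONS.NAME, PERSONS.UPDATE_DATE
FROM PERSONS
WHERE PERSONS.UPDATE_DATE >= SYSDATE - 7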
After the input instance, add a tMap where you do all your data manipulation. Finally, close with a tMysqlOutput, specifying the connection and output table details. The flow simply looks like:
tOracleInput ----> tMap ----> tMysqlOutput
Create the MySQL and Oracle connections in Metadata (Db Connections).
Create the following components: tMysqlConnection and tOracleConnection.
Configure the components with the connection parameters (Property Type: Repository).
Extract the data: you can select the table with a tOracleInput component.
Edit the schema of the table in the Component tab.
Create a tMap component (to transform the data).
Create a tMysqlOutput component and configure the schema and columns to insert.
Create a tMysqlCommit component to close the connection.
Done! Run the job! :)
