Retrieve from dynamically named tables - etl

I'm using Talend Open Studio for Data Integration.
I have tables that are generated every day, and the table names are suffixed with the date, like so:
dailystats20220127
dailystats20220126
dailystats20220125
dailystats20220124
I have a two-part question.
I want to query the table whose name contains yesterday's date (sysdate - 1) and fetch yesterday's data from it:
select 'dailystats' ||to_char(sysdate - 1,'YYYYMMDD') TableName
from dual;
How do I retrieve the schema for a dynamic table name?
How do I pull data from that table?
I've worked with static table names, and it's a straightforward process.

If the schema is always the same, you only need to define it once in your input component.
In your input component, set the SQL query to:
"select [fields] from dailystats"+ TalendDate.formatDate("yyyyMMdd", TalendDate.addDate(TalendDate.getCurrentDate(), -1, "dd"))

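The date arithmetic in the Talend expression above can be checked outside the job. Here is a minimal Python sketch of the same idea (build yesterday's table name, then concatenate it into the query). The function names and the field names `id` and `hits` are hypothetical, not taken from the question.

```python
from datetime import date, timedelta
from typing import Optional, Sequence


def yesterday_table(prefix: str = "dailystats", today: Optional[date] = None) -> str:
    """Return the prefix followed by yesterday's date as YYYYMMDD."""
    today = today or date.today()
    return prefix + (today - timedelta(days=1)).strftime("%Y%m%d")


def build_query(fields: Sequence[str], table: str) -> str:
    # A table name cannot be passed as a bind parameter, so it is concatenated.
    # Here it comes from a fixed prefix plus a formatted date, never from user input.
    return "select " + ", ".join(fields) + " from " + table


table = yesterday_table(today=date(2022, 1, 28))
print(table)                               # dailystats20220127
print(build_query(["id", "hits"], table))  # select id, hits from dailystats20220127
```

Subtracting a whole day before formatting handles month and year boundaries, which a plain "day number minus one" would not.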
Related

Creating txt file using Pentaho

I'm currently trying to create txt files from all tables in the dbo schema
I have around 200 to 300 tables there, so it would take too much time to create the files manually.
I was thinking of creating a loop.
so as example (using AdventureWorks2019) :
select t.name as table_name
from sys.tables t
where schema_name(t.schema_id) = 'Person'
order by table_name;
This would get all the table names within the Person schema.
So I would loop:
Table input : select * from ${table_name}
But then I realized that for txt files I need to declare all the fields and their data types in Pentaho, which becomes a problem.
Any ideas on how to create these "backup" txt files?
Use Metadata Injection together with additional queries against the schema catalog tables in SQL Server. Retrieving the table name is not enough: afterwards you also need to retrieve the columns of that table and their data types, and inject that information (metadata) into the text output step.
The samples directory of your Spoon installation contains an example of how to use Metadata Injection. Use it, along with the documentation, to build a simple example. The option to generate a transformation containing the metadata you have injected is very useful for debugging.
I have something similar to copy data from one database to another, both in Oracle, but with SQL Server you have similar catalog tables as in Oracle to retrieve the information you need. I created a simple, almost empty transformation to read one table and write to another. This transformation has almost no information, only the database origin in the Table Input step and the target database in the Table Output step:
And then I have a second transformation where I fill up all the information (metadata) to inject: The query to perform in the Table Input step, and all the data I need in the Table Output: Target table, if I need to truncate before inserting, the columns from (stream field) and to (Table field):
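The idea behind the answer (read the table list from the catalog, then read each table's columns from the catalog, then drive the export from that metadata) can be illustrated outside Pentaho. The sketch below uses Python with an in-memory SQLite database as a stand-in. The tables, the sample rows and the helper names are hypothetical, and on SQL Server the catalog queries would go against `sys.tables` and the column catalog instead of `sqlite_master` and `PRAGMA table_info`.

```python
import csv
import io
import sqlite3


def list_tables(conn):
    # SQLite's catalog; the question's SQL Server query reads sys.tables instead.
    rows = conn.execute(
        "select name from sqlite_master where type = 'table' order by name"
    )
    return [r[0] for r in rows]


def column_names(conn, table):
    # PRAGMA table_info returns one row per column: (cid, name, type, ...).
    return [r[1] for r in conn.execute(f'pragma table_info("{table}")')]


def export_table(conn, table, out):
    cols = column_names(conn, table)
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(cols)  # header comes from the catalog, not typed by hand
    col_list = ", ".join(f'"{c}"' for c in cols)
    writer.writerows(conn.execute(f'select {col_list} from "{table}"'))


conn = sqlite3.connect(":memory:")
conn.execute("create table person (id integer, name text)")
conn.execute("create table phone (person_id integer, number text)")
conn.execute("insert into person values (1, 'Ann')")
conn.execute("insert into phone values (1, '555-0100')")

exports = {}
for t in list_tables(conn):
    buf = io.StringIO()  # stands in for open(f"{t}.txt", "w", newline="")
    export_table(conn, t, buf)
    exports[t] = buf.getvalue()

print(exports["person"])
```

In Pentaho the per-table column list plays the role of the metadata that gets injected into the template transformation.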

Does oracle store transaction date of data?

I'm currently using Oracle 11g. I extracted data from the schema once, at a specific date, for a cleansing process. Suppose I now want to extract again, but only the data that is new or updated since the date of my last extract. Is there any way to get it? Unfortunately, the data does not have any column that stores the last edited date.
I was wondering whether Oracle automatically stores that kind of information somewhere I could check, perhaps in a transaction log?
Thanks,
A Physal
One way would be to enable flashback and then you can do:
SELECT * FROM table1
MINUS
SELECT * FROM table1 AS OF TIMESTAMP TIMESTAMP '2018-01-01 00:00:00.000';
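This returns all rows that were inserted or updated since 2018-01-01. Rows deleted since then are not returned, because they no longer exist in the current table.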
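To see what the MINUS comparison returns, here is a small Python sketch using SQLite, which has no flashback feature. A hypothetical table `table1_snapshot` stands in for the `AS OF TIMESTAMP` view of the data, and `EXCEPT` is SQLite's equivalent of Oracle's `MINUS`. The column names and rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table table1 (id integer, val text);
    create table table1_snapshot (id integer, val text);
    -- state at the time of the first extract
    insert into table1_snapshot values (1, 'a'), (2, 'b'), (3, 'c');
    -- current state: row 2 updated, row 3 deleted, row 4 inserted
    insert into table1 values (1, 'a'), (2, 'B'), (4, 'd');
""")

changed = conn.execute(
    "select * from table1 except select * from table1_snapshot order by id"
).fetchall()
print(changed)  # [(2, 'B'), (4, 'd')]
```

The updated row and the new row come back; the deleted row 3 does not, since the left side of the comparison is the current table.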

Is it possible to use SYSDATE as a column alias?

I want to run a report daily and store the report's date as one of the column headers. Is this possible?
Example output (Counting the activities of employees for that day):
SELECT EMPLOYEE_NAME AS EMPLOYEE, COUNT(ACTIVITY) AS "Activity_On_SYSDATE" FROM EMPLOYEE_ACCESS GROUP BY EMPLOYEE_NAME;
Employee Activity_On_17042016
Jane 5
Martha 8
Sam 11
You are looking to do a reporting job with a data storing tool. The database (and SQL) is for storing and retrieving data, not for creating reports. There are special tools for creating reports.
In database design, it is very unhealthy to encode actual data in a table or column name. Neither a table name nor a column name should contain, as part of the name (and of the way it is used), an employee id, a date, or any other bit of actual data. Actual data should only be in fields, which in turn are in columns in different tables.
From what you describe, your base table should have columns for employee, activity and date. Then on any given day, if you want the count for the "current" day, you can query with
select employee, count(activity) ct
from table_name
where activity_date >= trunc(SYSDATE) and activity_date < trunc(SYSDATE) + 1
group by employee
If you want, you can also include the "activity_date" column in the output, that will show for which date the report was run.
Note that I assumed the column name for "date" is "activity_date", and in the output I used "ct" as a column alias, not "count". DATE and SYSDATE are reserved words, COUNT is the name of a built-in function, and you should NOT use any of them as a table or column name. Even where one of them could be used as an alias, as long as you don't need to refer to the alias anywhere else in SQL, it is still a very bad idea. Imagine you ever need to refer to a column (by name or by alias) and the name or alias is SYSDATE. What would a where clause like this mean?
where sysdate = sysdate
Do you see the problem?
Also, I can't tell from your question: were you thinking of storing these reports back in the database? To what end? It is better to store just one query and run it whenever needed (and make the "activity_date" for which you want the counts an input parameter), so you can run the query for any date, at any time in the future. There is no need to store the actual daily reports in the database, as long as the base table is properly maintained.
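A minimal sketch of the "one stored query with the date as an input parameter" idea, written in Python against an in-memory SQLite table. The table and column names follow the answer's assumptions (`employee`, `activity`, `activity_date`); the sample rows and the function name `activity_counts` are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table employee_access (employee text, activity text, activity_date text);
    insert into employee_access values
        ('Jane', 'login', '2016-04-17 09:00:00'),
        ('Jane', 'print', '2016-04-17 10:30:00'),
        ('Sam',  'login', '2016-04-17 11:00:00'),
        ('Sam',  'login', '2016-04-18 08:00:00');
""")


def activity_counts(conn, day):
    # The report date is a bind parameter and is returned as data in its own
    # column, instead of being encoded in a column header.
    sql = """
        select employee, date(activity_date) as activity_day, count(activity) as ct
        from employee_access
        where date(activity_date) = ?
        group by employee, date(activity_date)
        order by employee
    """
    return conn.execute(sql, (day,)).fetchall()


print(activity_counts(conn, "2016-04-17"))
# [('Jane', '2016-04-17', 2), ('Sam', '2016-04-17', 1)]
```

The same call with a different date reproduces any past day's report, which is why the daily results do not need to be stored.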
Good luck!

SQL Server - Date of newly inserted column in a table

As part of my project, new columns are introduced in many tables. I want to find out the date on which these columns were introduced. Is there a way to query the date of insertion of all the columns in a specific table in Oracle SQL Developer 3.0.04?
You can try the LAST_DDL_TIME column of the DBA_OBJECTS view. Note that it records when the table was last changed by a DDL statement, not when each individual column was added.

Import most recent data from CSV to SQL Server with SSIS

Here's the deal; the issue isn't with getting the CSV into SQL Server, it's getting it to work how I want it... which I guess is always the issue :)
I have a CSV file with columns like DATE, TIME, BARCODE, etc. I use a derived column transformation to concatenate DATE and TIME into a DATETIME for my import into SQL Server, and I import all data into the database. The issue is that we only get a new .CSV file every 12 hours, and for example's sake we will say the .CSV is updated four times in a minute.
With the logic that we will run the job every 15 minutes, we will get a ton of overlapping data. I imagine I will use a variable, say LastCollectedTime which can be pulled from my SQL database using the MAX(READTIME). My problem comes in that I only want to collect rows with a readtime more recent than that variable.
Destination table structure:
ID, ReadTime, SubID, ...datacolumns..., LastModifiedTime where LastModifiedTime has a default value of GETDATE() on the last insert.
Any ideas? Remember, our readtime is a Derived Column, not sure if it matters or not.
Here is one approach that you can make use of:
Let's assume that your destination table in SQL Server is named BarcodeData.
Create a staging table (say BarcodeStaging) in your database, with the same column structure as your destination table BarcodeData, into which the CSV data is imported.
In the SSIS package, add an Execute SQL Task before the Data Flow Task to truncate the staging table BarcodeStaging.
Import the CSV data into the staging table BarcodeStaging and not into the actual destination table.
Use the MERGE statement (I assume that you are using SQL Server 2008 or a higher version) to compare the staging table BarcodeStaging with the actual destination table BarcodeData, using the DateTime column as the join key. If there are unmatched rows, insert those rows from the staging table into the destination table.
Technet link to MERGE statement: http://technet.microsoft.com/en-us/library/bb510625.aspx
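The staging steps above can be sketched as follows, in Python with an in-memory SQLite database. SQLite has no MERGE statement, so an `INSERT ... WHERE NOT EXISTS` stands in for the "when not matched, insert" part. The two-column table layout, the sample rows and the function name `load_batch` are simplifying assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table BarcodeData (ReadTime text, Barcode text);
    create table BarcodeStaging (ReadTime text, Barcode text);
    insert into BarcodeData values ('2024-01-01 08:00:00', 'A1');
""")


def load_batch(conn, rows):
    # Empty the staging table, as the Execute SQL Task would.
    conn.execute("delete from BarcodeStaging")
    # Load the whole CSV content into staging, overlaps included.
    conn.executemany("insert into BarcodeStaging values (?, ?)", rows)
    # Copy only rows whose ReadTime is not yet in the destination.
    conn.execute("""
        insert into BarcodeData (ReadTime, Barcode)
        select s.ReadTime, s.Barcode
        from BarcodeStaging s
        where not exists (
            select 1 from BarcodeData d where d.ReadTime = s.ReadTime
        )
    """)
    return conn.execute("select count(*) from BarcodeData").fetchone()[0]


csv_rows = [("2024-01-01 08:00:00", "A1"), ("2024-01-01 08:05:00", "B2")]
after_first = load_batch(conn, csv_rows)   # 2: one existing row plus one new row
after_second = load_batch(conn, csv_rows)  # 2: re-running the same file adds nothing
print(after_first, after_second)
```

Matching on ReadTime alone follows the answer's suggestion; if two different barcodes can share the same read time, the comparison would probably need to include the barcode as well.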
Hope that helps.
