I have dozens of legacy stored procedures that create temporary tables for collecting results(performance improvement).
I've created a read replica of my Aurora PostgreSQL and tried to execute this procedure, but it failed, as it doesn't allow creating even temporary tables in a read-only transaction.
Are any ways how to solve this issue with minimal effort?
Just converting temp tables to CTE is not effective as they won't give me the performance that temp tables give.
Related
I have a scenario where there are 3 files of 5 million lines each with 3 weeks of data is bulk mapped to a staging table. They run parallelly. If the data transfer for a file fails with a concurrency error, what will be the best way to load the data of the 3 files effectively into the stage table.
(I was asked this in an interview.)
In ODI 12c, Parallel Target Table Load can be achieved by selecting the Use Unique Temporary Object Names checkbox in the Physical tab of a mapping. The work tables for each concurrent session will have a unique name.
I discuss it in more details in the Parallel Target Table Load section of this blog on ODI 12c new features
I'm working on redesigning DWH solution previously based on Teradata with lots of BTEQ scripts performing transformations on mirror tables loaded from source DBs. New solutions will be based on Snowflake and as a transformation tool set of SQL (Snowflake) scripts is being prepared.
Is that a right approach to use in ETL scripts DDL statements which create e.g. temporary table which than et the end of script is dropped?
In my opinion such table should be created before running this script instead of creating it in the script on fly. One argument opts that DDL statements on Snowflake commits transaction and that's why I want to avoid DDL statements in transformation scripts. Please help me finding pros and cons of using DDL statements in ETL process and back me up that I'm right or convince that I'm wrong.
If you are wanting transactions to cover all your SELECT/INSERT/MERGE steps of your transformation step of your ELT, you need to not create/drop any tables, as those will commit your open transactions.
We get around this by have pre-existing worker tables per task/deployment that are create/truncated prior to the transaction section of our ELT process. And our tooling does not allow a task to run at the same time.
Thus we load into a landing table, we transform into temporary tables, then we multi-table merge into the final tables. With only the last steps needing to be in transactions.
I came accross creating the temporary table in oracle. But could not understand the best use of this.
Can someone help me to understand what is the features and benefits of using a temporary table in Oracle (create temporary table temp_table) over an ordinary table (create table temp_table)
)
From the concepts guide:
A temporary table definition persists in the same way as a permanent
table definition, but the data exists only for the duration of a
transaction or session. Temporary tables are useful in applications
where a result set must be held temporarily, perhaps because the
result is constructed by running multiple operations.
And:
Data in a temporary table is private to the session, which means that
each session can only see and modify its own data.
So one aspect is that the data is private to your session. Which is also true of uncommitted data in a permanent table, but with a temporary table the data can persist and yet stay private across a commit (based on the on commit clause on creation).
Another aspect is that they use temporary segments, which means you generate much less redo and undo overhead using a temporary table than you would if you put the same data temporarily into a permanent table, optionally updated it, and then removed it when you'd finished with it. You also avoid contention and locking issues if more than one session needs its own version of the temporary data.
Given below are some points why and when we should temporary table :-
1)Temporary tables are created for storing the data in a tabular form, for easy retrieval of it when needed, with in that particular session.
2)It also add a security purpose of keeping the data available only for that particular session.
3) When a code goes long and a lot of cursors are opened it better to put the data in a temporary table so that it can be easily fetched at the time needed.
Given that
there is a single tablespace pointing to one file, and there are many schemas pointing to that tablespace and there are many simultaneous jobs doing heavy DDL operations (dropping database, dropping indexes, creating a large database from scratch with data import) using these schemas,
is it possible the tablespace somehow locked so that there would be timeouts for some jobs using these schemas?
I need to perform a query 2.5 million times. This query generates some rows which I need to AVG(column) and then use this AVG to filter the table from all values below average. I then need to INSERT these filtered results into a table.
The only way to do such a thing with reasonable efficiency, seems to be by creating a TEMPORARY TABLE for each query-postmaster python-thread. I am just hoping these TEMPORARY TABLEs will not be persisted to hard drive (at all) and will remain in memory (RAM), unless they are out of working memory, of course.
I would like to know if a TEMPORARY TABLE will incur disk writes (which would interfere with the INSERTS, i.e. slow to whole process down)
Please note that, in Postgres, the default behaviour for temporary tables is that they are not automatically dropped, and data is persisted on commit. See ON COMMIT.
Temporary table are, however, dropped at the end of a database session:
Temporary tables are automatically dropped at the end of a session, or
optionally at the end of the current transaction.
There are multiple considerations you have to take into account:
If you do want to explicitly DROP a temporary table at the end of a transaction, create it with the CREATE TEMPORARY TABLE ... ON COMMIT DROP syntax.
In the presence of connection pooling, a database session may span multiple client sessions; to avoid clashes in CREATE, you should drop your temporary tables -- either prior to returning a connection to the pool (e.g. by doing everything inside a transaction and using the ON COMMIT DROP creation syntax), or on an as-needed basis (by preceding any CREATE TEMPORARY TABLE statement with a corresponding DROP TABLE IF EXISTS, which has the advantage of also working outside transactions e.g. if the connection is used in auto-commit mode.)
While the temporary table is in use, how much of it will fit in memory before overflowing on to disk? See the temp_buffers option in postgresql.conf
Anything else I should worry about when working often with temp tables? A vacuum is recommended after you have DROPped temporary tables, to clean up any dead tuples from the catalog. Postgres will automatically vacuum every 3 minutes or so for you when using the default settings (auto_vacuum).
Also, unrelated to your question (but possibly related to your project): keep in mind that, if you have to run queries against a temp table after you have populated it, then it is a good idea to create appropriate indices and issue an ANALYZE on the temp table in question after you're done inserting into it. By default, the cost based optimizer will assume that a newly created the temp table has ~1000 rows and this may result in poor performance should the temp table actually contain millions of rows.
Temporary tables provide only one guarantee - they are dropped at the end of the session. For a small table you'll probably have most of your data in the backing store. For a large table I guarantee that data will be flushed to disk periodically as the database engine needs more working space for other requests.
EDIT:
If you're absolutely in need of RAM-only temporary tables you can create a table space for your database on a RAM disk (/dev/shm works). This reduces the amount of disk IO, but beware that it is currently not possible to do this without a physical disk write; the DB engine will flush the table list to stable storage when you create the temporary table.