Does the SCN itself encode a timestamp, or is it a lookup from some table?
In an AskTom post, Tom Kyte explains that a timestamp accurate to about +/-3 seconds is stored in a RAW field in SMON_SCN_TIME. Is that where the function gets its value?
If so, is that table ever purged, and if it is, what triggers the purge?
If it is purged, does that make it impossible to translate old SCNs to timestamps?
If it's impossible, that rules out any long-term use of the value (read: auditing).
If I put that function in a query, would joining to that table be faster?
If so, does anyone know how to convert that RAW column?
The SCN does not encode a time value. I believe it is an autoincrementing number.
I would guess that SMON periodically inserts a row into SMON_SCN_TIME (or whatever table underlies it), mapping the current SCN to the current timestamp.
I queried for the minimum recorded timestamp in several databases; they all go back about 5 days and the table holds a little under 1,500 rows, so the retention is shorter than the instance lifetime.
I imagine the lower bound on how long the data is kept might be determined by the DB_FLASHBACK_RETENTION_TARGET parameter, which defaults to 1 day.
I would recommend using the function; they've probably provided it so that they can change the internals at will.
I have no idea what the RAW column TIM_SCN_MAP contains, but the TIME_DP and SCN columns appear to give you the mapping.
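For what it's worth, here's a minimal sketch of both approaches, assuming the function in question is SCN_TO_TIMESTAMP and that you can read SYS.SMON_SCN_TIME (the SCN value is a placeholder):

-- Map an SCN to an approximate timestamp; this typically fails with
-- ORA-08181 once the SCN is older than the retained mapping data.
SELECT SCN_TO_TIMESTAMP(1234567) FROM dual;

-- Inspect the mapping table directly.
SELECT scn, time_dp
FROM sys.smon_scn_time
ORDER BY scn;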
I've got a 3GB SQLite database file with a single table with 40 million rows and 14 fields (mostly integers and very short strings and one longer string), no indexes or keys or other constraints -- so really nothing fancy. I want to check if there are entries where a specific integer field has a specific value. So of course I'm using
SELECT EXISTS(SELECT 1 FROM FooTable WHERE barField=?)
I don't have much experience with SQLite or databases in general, and on my first test query I was shocked that this simple query took about 30 seconds. Subsequent tests showed that it is much faster if a matching row occurs near the beginning of the table, which of course makes sense.
Now I'm thinking of running an initial SELECT DISTINCT barField FROM FooTable at application startup and caching the results in software. But I'm sure there must be a cleaner SQLite way to do this; after all, this should be part of a DBMS's job, right?
So far, though, I've only created primary keys to speed up queries, and that doesn't work here because the field values are not unique. So how can I speed up this query so that it runs in (roughly) constant time? It doesn't have to be lightning fast; I'd be completely fine if it took under one second.
Thanks in advance for answering!
P.S. Oh, and there will be about 500K new rows every month for an indefinite period of time, and it would be great if that doesn't significantly increase query time.
Adding an index on barField should speed up the subquery inside the EXISTS clause:
CREATE INDEX barIdx ON FooTable (barField);
To satisfy the query, SQLite only has to seek into the index once to see whether at least one matching value exists.
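You can verify that the index is actually used with EXPLAIN QUERY PLAN (the literal 42 is just a placeholder value):

EXPLAIN QUERY PLAN
SELECT EXISTS(SELECT 1 FROM FooTable WHERE barField = 42);
-- expected output along the lines of:
-- SEARCH FooTable USING COVERING INDEX barIdx (barField=?)

Since the lookup is a single B-tree seek, it should also stay fast as the table grows by 500K rows a month.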
After googling for a while, I am posting this question here since I was not able to find this problem discussed anywhere.
Our application has a table with 274 columns (no LOB or LONG RAW columns), and over a period of 8 years the table has developed chained rows, so any full table scan hurts performance.
When we dug deeper, we found that approximately 50 columns are not used anywhere in the application and could be dropped right away. The challenge is that the application would have to undergo many code changes to achieve this, and we have also exposed the underlying data as a service that other applications consume. So code changes are not an option for now.
Another option we considered is turning these 50 columns into virtual columns that are always NULL; then we would only need to change the table-loading procedures and everything else could stay as is. But I need expert advice on whether adding virtual columns to the table will avoid creating chained rows again. Will this solution work for the given problem?
Thanks
Rammy
Oracle allows at most 255 columns per row piece. For tables with more than 255 columns it splits each row into multiple row pieces, which show up as chained rows.
Your table has 274 columns, so you have chained rows because of the inherent table structure rather than because of the amount of space the data takes up. Making fifty columns always NULL won't change that.
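If you want to measure how much chaining is actually going on, here are a couple of standard checks (the table name is a placeholder; the second one needs the CHAINED_ROWS table created by utlchain.sql):

-- Cumulative chained/migrated row fetches since instance startup
SELECT name, value FROM v$sysstat WHERE name = 'table fetch continued row';

-- List the chained rows of a specific table
ANALYZE TABLE mytable LIST CHAINED ROWS INTO chained_rows;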
So, if you want to eliminate the chained rows, you really need to drop the columns. Of course you don't want to change all that application code, so what you can try instead is the following (a sketch follows the list):
rename the table
drop the columns you don't want any more
create a view using the original table name and include NULL columns in the view's projection to match the original table structure.
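A minimal sketch of that approach, with placeholder table and column names (the CAST types must match the dropped columns' original types):

ALTER TABLE big_table RENAME TO big_table_base;

-- On a table this size, DROP COLUMN rewrites every row; consider
-- ALTER TABLE ... SET UNUSED now and DROP UNUSED COLUMNS in a
-- maintenance window instead.
ALTER TABLE big_table_base DROP (unused_col1, unused_col2);

CREATE VIEW big_table AS
SELECT used_col1, used_col2,
       CAST(NULL AS VARCHAR2(30)) AS unused_col1,
       CAST(NULL AS NUMBER) AS unused_col2
FROM big_table_base;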
I have a table in HBase named 'xyz'. When I put a record that is identical to one that already exists, HBase still stores it as a new version.
How can I prevent this second record from being added?
Eg:
create 'ns:xyz',{NAME=>'cf1',VERSIONS => 5}
put 'ns:xyz','1','cf1:name','NewYork'
put 'ns:xyz','1','cf1:name','NewYork'
The above put statements give me 2 records with different timestamps when I check all versions. I expected the second put not to be stored, because it has the same value.
HBase isn't going to look through the entire row and work out if it's the same as the data you're adding. That would be an expensive operation, and HBase prides itself on its fast insert speeds.
If you're really eager to do this (and I'd ask if you really want to do this), you should perform a GET first to see if the data is already present in the table.
You could also write a Coprocessor to do this every time you PUT data, but again the performance would be undesirable.
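If you do go the check-first route, here is roughly what it looks like in the shell; in application code you would use the Java client's Get, or better, an atomic check-and-mutate:

get 'ns:xyz','1',{COLUMN => 'cf1:name'}
# only issue the put if the returned value differs
put 'ns:xyz','1','cf1:name','NewYork'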
As mentioned by @Ben Watson, HBase is best known for its write performance precisely because it doesn't check for the existence of a value: multiple versions are maintained by default.
One hack you can use is custom versioning. Once a row key already holds versions of a cell, putting the same record with the same explicit timestamp makes HBase overwrite that cell in place rather than add another version.
NOTE: It is left to your application to supply the same timestamp for a particular value.
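For example, pinning the timestamp in the shell (1000 is an arbitrary placeholder timestamp):

put 'ns:xyz','1','cf1:name','NewYork',1000
put 'ns:xyz','1','cf1:name','NewYork',1000
# the second put overwrites the same cell instead of adding a version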
I have configured full-text search on a table in my Postgres database. Pretty simple stuff, with firstname, lastname and email. This works well and is fast.
However, I sometimes experience very long delays when inserting a new entry into the table: the insert keeps running for minutes and also generates huge WAL files (we use the WAL files for replication).
Is there anything I need to be aware of with my full-text index? Does Postgres perhaps restructure it in the background for performance reasons? My index is currently around 400 MB.
Thanks in advance!
Christian
Given the size of the WAL files, I suspect you are right that it is an index update/rebalancing that is causing the issue. However I have to wonder what else is going on.
I would recommend against storing tsvectors in separate columns. A better way is to build an index on to_tsvector()'s output. You can have multiple indexes for multiple languages if you need them. So instead of a trigger that takes, say, a field called description and stores the tsvector in desc_tsvector, I would recommend just doing:
CREATE INDEX mytable_description_tsvector_idx
ON mytable USING gin (to_tsvector('english', description));

(The two-argument form of to_tsvector() is required in an index: the one-argument form depends on the session's default_text_search_config, so it is not immutable and Postgres will reject it.)
Now, if you need a consistent search interface across a whole table, there are more elegant ways of doing this using "table methods."
In general the functional index approach has fewer issues associated with it than anything else.
Now, a second thing you should be aware of is partial indexes. If you need to, you can index only the records of interest. For example, if most of my queries only check the last year, I can:

CREATE INDEX mytable_description_tsvector_recent_idx
ON mytable USING gin (to_tsvector('english', description))
WHERE created_at > '2024-01-01';

(A partial-index predicate must use only immutable expressions, so now() is not allowed there; use a literal cutoff date and recreate the index periodically.)
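Note that a query only uses such a functional index if it repeats the same expression (the search terms below are placeholders):

SELECT *
FROM mytable
WHERE to_tsvector('english', description) @@ to_tsquery('english', 'postgres & index');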
I have a table that is updated automatically from time to time (say, daily). All updated fields are of type TEXT and might hold a lot of data. What I know for certain is that the data will not change much: usually up to 30 characters are added or deleted.
So what would be more efficient: to somehow merge in the changes, or to delete the old data and insert the new version?
And if merging is the way to go, how should I do it? Is there a keyword or something that makes this easier and more efficient?
P.S. I am completely new to databases in general; it's the very first time I've ever created and used a database, so sorry if this is a silly question.
Due to its MVCC model, PostgreSQL always writes a new row version for any set of changes applied in a single UPDATE. It doesn't matter how much you change; there is no "merge way".
It's similar to (but not the same as) deleting the row and inserting a new one.
Since your columns are obviously big, they are going to be TOASTed, meaning they are compressed and stored out of line in a separate table. In an UPDATE, these columns can be preserved as-is if they remain unchanged, so it's considerably cheaper to UPDATE than to DELETE and INSERT. Quoting the manual:
"During an UPDATE operation, values of unchanged fields are normally preserved as-is; so an UPDATE of a row with out-of-line values incurs no TOAST costs if none of the out-of-line values change."
If your rows have lots of columns and only some get updated a lot, it might pay to have two separate tables with a 1:1 relationship. But that's an extreme case.
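A tiny illustration of the difference, with made-up table and column names:

-- big_text is TOASTed once it is large enough
CREATE TABLE docs (
    id       integer PRIMARY KEY,
    title    text,
    big_text text
);

-- Cheap: the unchanged big_text keeps pointing at the same TOAST data
UPDATE docs SET title = 'new title' WHERE id = 1;

-- More expensive: DELETE + INSERT writes (and re-TOASTs) big_text again
DELETE FROM docs WHERE id = 1;
INSERT INTO docs (id, title, big_text) VALUES (1, 'new title', '<the whole text again>');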