I am trying to figure out how to delete all the databases in CockroachDB using commands, without deleting them one by one. If that is not an option, can you point out the directory where CockroachDB stores all the database data, so that I can delete it manually and be done with it?
You can't drop all databases at once; they have to be dropped one by one. See the DROP DATABASE statement.
If you are within a program, you can first fetch the list of databases (SHOW DATABASES, or SELECT datname FROM pg_database); just be sure not to try to drop crdb_internal, information_schema, pg_catalog, or system, as they cannot be dropped.
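For example, a minimal SQL sketch (mydb1 and mydb2 are placeholders for whatever SHOW DATABASES returns in your cluster):

SHOW DATABASES;
-- for every name returned except crdb_internal, information_schema,
-- pg_catalog and system:
DROP DATABASE IF EXISTS mydb1 CASCADE;
DROP DATABASE IF EXISTS mydb2 CASCADE;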
If you want to wipe the cockroach cluster itself, you can kill all nodes and rm -rf <data directory> on each node.
The data directory is whatever path you specified in the --store flag (${PWD}/cockroach-data by default).
We have a materialized view (MV) on Oracle 19c. Its data is updated/refreshed by scheduler job every day.
Because of some maintenance work that may last for more than 6 months, I would like to update some data inside it manually. However, manual updates are forbidden on an MV.
It may sound stupid but I am planning to:
Disable the scheduler job
Turn the MV into a table (DROP MATERIALIZED VIEW .. PRESERVE TABLE;)
Then we can update the table data manually for our maintenance work.
After the maintenance work, I would:
Turn the table back to MV
Re-enable the scheduler job to refresh the data
So the question is... how do I turn the table back into an MV safely in my case? It is easy to turn an MV into a table, but I have never heard of anyone doing it the other way round.
By safely, I mean that the reverted MV behaves as before, without loss of behaviours/properties.
If I turn the MV into a table and then back into an MV, would the indexes still work for both the table and the MV without being affected?
Similarly, if we already have a synonym for the MV, would this synonym still work after converting to a table and back to an MV again?
Do I need to re-grant any user privileges to the table, and later to the MV again?
Note: I am aware that after turning the table back into an MV, the data gets refreshed and our manual changes would be lost. That is acceptable for us because we just want the manual data to stay during the maintenance period.
If there are other suggestions/alternatives, I am happy to hear.
Combining a synonym, a patch table and views might be a good solution for a temporary situation, as suggested.
You can safely recreate the materialized view with
CREATE MATERIALIZED VIEW testmv ON PREBUILT TABLE ...
Indexes can still be used until the refresh. You don't need to re-grant user privileges.
Synonyms that were created previously for the MV still work after creating the MV from the prebuilt table.
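As a rough sketch of the whole round trip (testmv, base_t, the patched column and the REFRESH options are illustrative; adapt them to your existing MV definition):

DROP MATERIALIZED VIEW testmv PRESERVE TABLE;
-- testmv is now a plain table; apply the manual maintenance changes
UPDATE testmv SET amount = 0 WHERE id = 42;
-- after the maintenance period, turn it back into an MV on the same table
CREATE MATERIALIZED VIEW testmv ON PREBUILT TABLE
  REFRESH COMPLETE ON DEMAND
  AS SELECT * FROM base_t;
-- the next refresh overwrites the manual changes with live data
EXEC DBMS_MVIEW.REFRESH('TESTMV');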
What is the reason to update some data inside the mview? Maybe you need a table, not a mview. Just refresh the materialized view to update the data, or change the underlying SQL to reflect the dataset you want; otherwise it would be in an inconsistent state.
Turn the table back to MV
If you understand that after that step the MV is refreshed with the live data and your changes to the underlying table are lost, it's OK.
Another way to do the job is to access the MV through a VIEW and change the definition of the VIEW for the maintenance work (conceptually "select from MV where not exists (select from PATCH) union all select from PATCH"), putting it back to "select from MV" when done.
(And note that truncate PATCH will have the same effect, and you don't have to change the VIEW...)
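A hedged sketch of that patch-table idea (MV_DATA, PATCH, APP_VIEW and the join key ID are illustrative names):

CREATE TABLE patch AS SELECT * FROM mv_data WHERE 1 = 0;  -- empty copy used for overrides

CREATE OR REPLACE VIEW app_view AS
  SELECT * FROM mv_data m
  WHERE NOT EXISTS (SELECT 1 FROM patch p WHERE p.id = m.id)
  UNION ALL
  SELECT * FROM patch;

-- when maintenance is over, either put the view back to a plain
-- SELECT * FROM mv_data, or simply TRUNCATE TABLE patch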
I have got a connection to an Oracle Database set up in Power BI.
This is working fine, apart from the fact that it brings back 9,500+ tables whose names start with "BIN".
Is there some SQL I can put in the SQL statement section when connecting to the Oracle database that limits the tables returned, ignoring any table whose name begins with 'BIN'?
Tables starting with BIN$ are tables that have been dropped but not purged and are in Oracle's "recycle bin".
The simplest method of not showing them, if they are no longer required, is to PURGE (delete) the tables from the recycle bin; then you will not see them because they will no longer exist.
You can use (documentation link):
PURGE TABLE BIN$0+xyzabcdefghi123 to get rid of an individual table with that name (or you can use the original name of the table).
PURGE TABLESPACE tablespace_name USER username; to get rid of all recycled tables in a tablespace belonging to a single user.
PURGE TABLESPACE tablespace_name; to get rid of all recycled tables in a tablespace.
PURGE RECYCLEBIN; to get rid of all of the current user's recycled tables.
PURGE DBA_RECYCLEBIN; to get rid of everything in the recycle bin (assuming you have SYSDBA privileges).
Before purging tables you should make sure that they are really not required as you would need to restore from backups to bring them back.
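For example, to see what is actually sitting in your recycle bin before deciding what to purge (USER_RECYCLEBIN is the standard Oracle view; DBA_RECYCLEBIN shows every user's objects if you have the privileges):

SELECT object_name, original_name, droptime
FROM   user_recyclebin
WHERE  type = 'TABLE'
ORDER  BY droptime;

PURGE RECYCLEBIN;  -- removes only the current user's recycled objects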
Unfortunately, I truncated a table in Hive and the trash got cleaned up. Is there any way to get the data back? Thanks.
There is a way to recover deleted file(s), but it is not recommended and, if not done properly, it can affect the cluster as well. Use this procedure with caution on a production system.
Here's a step-by-step description of how to recover accidentally deleted files:
https://community.hortonworks.com/articles/26181/how-to-recover-accidentally-deleted-file-in-hdfs.html
I have a directory in HDFS. Every day one processed file, with a DateTimeStamp in its name, is placed in that directory. If I create an external table on top of that directory location, does the external table refresh itself when the new file arrives each day?
If you add files into a table directory or partition directory, it does not matter whether the table is external or managed in Hive: the data will be accessible for queries. You do not need any additional steps to make the data available; no refresh is necessary.
A Hive table/partition is metadata (DDL, location, statistics, access permissions, etc.) plus data files in the location. So the data is stored in the table/partition location in HDFS.
Only if you create a new directory for a partition that has not been registered yet will you need to execute ALTER TABLE ... ADD PARTITION ... LOCATION '<new location>' or the MSCK REPAIR TABLE command. The equivalent command on Amazon Elastic MapReduce (EMR)'s version of Hive is ALTER TABLE table_name RECOVER PARTITIONS.
If you add files into already created table/partition locations, no refresh is necessary.
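A minimal HiveQL sketch of both cases (the table name, column and paths are illustrative):

CREATE EXTERNAL TABLE logs (line STRING)
PARTITIONED BY (dt STRING)
LOCATION '/data/logs';

-- files dropped into an existing partition directory are visible immediately;
-- only a brand-new partition directory has to be registered:
ALTER TABLE logs ADD IF NOT EXISTS PARTITION (dt='2023-01-02')
  LOCATION '/data/logs/dt=2023-01-02';

-- or discover all new partition directories at once:
MSCK REPAIR TABLE logs;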
The CBO can use statistics for query calculation without reading the data files. This works for simple queries only, like count(*) and max().
If you are using the CBO with statistics for query calculation, you may need to refresh them using ANALYZE TABLE hive_table PARTITION(partitioned_col) COMPUTE STATISTICS. See this answer for more details: https://stackoverflow.com/a/39914232/2700344
If you do not need statistics and want your table location to be scanned every time you query it, switch it off: set hive.compute.query.using.stats=false;
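For example (table and partition names are illustrative):

ANALYZE TABLE logs PARTITION (dt='2023-01-02') COMPUTE STATISTICS;

-- or make Hive scan the files instead of answering from statistics:
set hive.compute.query.using.stats=false;
SELECT count(*) FROM logs;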
Adding nodes (leaves or aggregators) to a MemSQL cluster is straightforward: I edited memsql_cluster.json and reran memsql-cluster setup. The problem is adding partitions to an existing table. The point here is to scale up: I need to add more rows but have exhausted the available memory in the original cluster.
I tried, for example:
mysql> create partition DMP:32 on 'ec2-X-Y-Z.compute-1.amazonaws.com':3306;
ERROR 1773 (HY000): Partition ordinal 32 is out of bounds. It must be in [0, 32).
mysql>
Reading the MemSQL docs, I could not find any DDL option to change the number of partitions. I would prefer not to drop and recreate these tables. Any ideas on how to do it?
Thanks!
You cannot add more rows to an in-memory database when memory is already exhausted. That said, you can scale out (i.e. add more leaf nodes).
You can add more leaf nodes to your MemSQL cluster, then run REBALANCE PARTITIONS to distribute your existing partitions evenly across the larger cluster. This gives each partition more room to grow, allowing you to scale out.
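For instance, once the new leaves have joined the cluster, something like this on the master aggregator (DMP is the database name from the question):

REBALANCE PARTITIONS ON DMP;
SHOW PARTITIONS ON DMP;  -- confirm the existing partitions are now spread across all leaves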
If you just wanted to add more partitions, you can use mysqldump to export your MemSQL schema and data, recreate the database with more partitions, then load the schema and data back into your database that now has more partitions.
Learn more about rebalance_partitions here:
http://docs.memsql.com/docs/latest/ref/REBALANCE_PARTITIONS.html
In order to re-partition the data, you currently have to (1) export the schema and (2) the data from your database (partitions are set at the database level), (3) recreate the database, and (4) reload everything back in.
You can dump database schema and tables to local files using mysqldump. It's best to run mysqldump twice, once to dump the schema and once to dump the data.
This command will create a file in the local directory containing the schemas for all tables in the DB:
mysqldump -h 127.0.0.1 -u root -B <db_name> --no-data > schema.sql
If you have a large database and not enough local disk to dump all the data at once, you can specify some table(s) per command. For each command use a different filename (e.g. data1.sql, data2.sql, etc.) and pass one or more table names as arguments. I would dump the smaller tables together in one statement and then dump any giant tables separately. Use this command:
mysqldump -h 127.0.0.1 -u root -B <db_name> --no-create-info --tables <table1> <table2> <table3> > data1.sql
Once all tables and the schema have been dumped you can recreate the DB with more partitions. We generally recommend #partitions = #leaves * #cores/leaf. Use this command:
CREATE DATABASE <db_name> PARTITIONS = <#_total_partitions>;
Confirm # of partitions with "SHOW PARTITIONS ON <db_name>;"
Insert the schema and data to the new database with these commands from the shell prompt:
mysql -u root -h 127.0.0.1 < schema.sql
mysql -u root -h 127.0.0.1 < data1.sql
The schema will take a few minutes to load. The time for the data to load will depend on the size of the dataset. After each table completes, its rows will be committed and queries can run against it.