MySQL database backup: performance issues

Folks,
I'm trying to set up a regular backup of a rather large production database (half a gig) that has both InnoDB and MyISAM tables. I've been using mysqldump so far, but I find that it's taking increasingly longer periods of time, and the server is completely unresponsive while mysqldump is running.
I wanted to ask for your advice: how do I either
Make the mysqldump backup non-blocking - assign a low priority to the process or something like that, OR
Find another backup mechanism that will be better/faster/non-blocking.
I know of the existence of MySQL Enterprise Backup product (http://www.mysql.com/products/enterprise/backup.html) - it's expensive and this is not an option for this project.
I've read about setting up a second server as a "replication slave", but that's not an option for me either (this requires hardware, which costs $$).
Thank you!
UPDATE: more info on my environment: Ubuntu, latest LAMPP, Amazon EC2.

If replication to a slave isn't an option, you could leverage the filesystem, depending on the OS you're using:
Consistent backup with Linux Logical Volume Manager (LVM) snapshots.
MySQL backups using ZFS snapshots.
The joys of backing up MySQL with ZFS...
I've used ZFS snapshots on a quite large MySQL database (30GB+) as a backup method and it completes very quickly (never more than a few minutes) and doesn't block. You can then mount the snapshot somewhere else and back it up to tape, etc.
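For the LVM route, a rough shell sketch of the idea (the volume names, size and mount point here are made up; the read lock is only needed to get a consistent copy of the MyISAM tables, and a tool like mylvmbackup wraps the same steps more robustly):

# Hold a global read lock in a short-lived background session while the snapshot is created.
mysql -e "FLUSH TABLES WITH READ LOCK; DO SLEEP(30);" &
sleep 5   # crude: give the lock a moment to be acquired

# Snapshot the volume holding the MySQL datadir, then let the lock expire.
lvcreate --snapshot --size 2G --name mysql_snap /dev/vg0/mysql

# Mount the snapshot read-only, copy it off, then throw the snapshot away.
mkdir -p /mnt/mysql_snap
mount -o ro /dev/vg0/mysql_snap /mnt/mysql_snap
tar czf /backup/mysql-$(date +%F).tar.gz -C /mnt/mysql_snap .
umount /mnt/mysql_snap
lvremove -f /dev/vg0/mysql_snap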

Edit: (My previous answer suggested a slave DB to back up from, but then I noticed Alex ruled that out in his question.)
There's no reason your replication slave can't run on the same hardware, assuming the hardware can keep up. Grab a source tarball, ./configure --prefix=/dbslave; make; make install; and you'll have a second mysql server living completely under /dbslave.
EDIT2: Replication has a bunch of other benefits as well. For instance, with replication running, you may be able to recover the binlog and replay it on top of your last backup to recover the extra data after certain kinds of catastrophes.
EDIT3: You mention you're running on EC2. Another, somewhat contrived idea to keep costs down is to try setting up another instance with an EBS volume. Then use the AWS api to spin this instance up long enough for it to catch up with writes from the binary log, dump/compress/send the snapshot, and then spin it down. Not free, and labor-intensive to set up, but considerably cheaper than running the instance 24x7.
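For the spin-up/spin-down part, a rough sketch with the AWS CLI (the instance ID is a placeholder; the same can be done with the EC2 API tools):

# Bring the backup instance up only for the duration of the catch-up and dump.
aws ec2 start-instances --instance-ids i-0123456789abcdef0
aws ec2 wait instance-running --instance-ids i-0123456789abcdef0

# ... let the slave catch up on the binlog, take/compress/ship the dump ...

# Shut it down again so you only pay for the hours used.
aws ec2 stop-instances --instance-ids i-0123456789abcdef0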

Try the mk-parallel-dump utility from Maatkit (http://www.maatkit.org/).
regards,

Something you might consider is using binary logs here, through a method called 'log shipping'. Just before every backup, issue a command to flush the binary logs, and then you can copy all except the current binary log out via your regular file system operations.
The advantage of this method is that you're not locking up the database at all: when MySQL opens the next binary log in sequence, it releases all the file locks on the prior logs, so processing shouldn't be affected. Tar 'em, zip 'em in place, do as you please, then copy them out as one file to your backup system.
Another advantage of using binary logs is that you can restore to any point in time for which the logs are available. E.g., you have last year's full backup and every log from then to now, but you want to see what the database looked like on Jan 1st, 2011. You can issue a restore 'until 2011-01-01', and when it stops, you're at Jan 1st, 2011 as far as the database is concerned.
I've had to use this once to reverse the damage a hacker caused.
It is definitely worth checking out.
Please note... binary logs are USUALLY used for replication. Nothing says you HAVE to use them for that, though.
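A rough shell sketch of the flush-and-copy step and of a point-in-time replay with mysqlbinlog (the log location, the base name mysql-bin and the cut-off date are illustrative):

# Close out the current binary log so everything before it can be copied safely.
mysql -e "FLUSH BINARY LOGS;"

# Copy the closed logs off-box (in practice, exclude the newest log,
# which SHOW MASTER STATUS reports as still in use).
rsync -av /var/lib/mysql/mysql-bin.[0-9]* /backup/binlogs/

# Point-in-time restore: replay the logs on top of a restored full backup,
# stopping at the moment you care about.
mysqlbinlog --stop-datetime="2011-01-01 00:00:00" /backup/binlogs/mysql-bin.[0-9]* | mysql -u root -p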

Adding to what Rich Adams and timdev have already suggested, write a cron job which gets triggered during a low-usage period to perform the backup/slaving task as suggested, to avoid high CPU utilization.
Check mysql-parallel-dump also.
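Building on the cron idea, a hedged sketch of a nightly low-priority dump (the schedule and paths are placeholders; --single-transaction keeps the dump non-blocking for the InnoDB tables only, while MyISAM tables are still locked as they are dumped):

# /etc/cron.d/mysql-backup -- 03:30, assumed to be a quiet period
30 3 * * * root nice -n 19 ionice -c3 mysqldump --single-transaction --quick --all-databases | gzip > /backup/mysql-$(date +\%F).sql.gz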

Related

How to use a recent Oracle backup file (from yesterday) and only online redo logs to recover the database in another location (disaster recovery)?

I would like to plan and test my database recovery in another site (another instance on another server in disaster recovery site).
I take a monthly RMAN level 0 image copy and daily incremental level 1 backups.
The database is running in noarchivelog mode. The online redo logs are multiplexed to a disk in the disaster recovery site. Also we have a recovery catalog on another server.
I want to test restoring the recent (yesterday's) backup to the database in the disaster recovery site and then recover by applying just the online redo log files. How do I achieve that?
Side question: is it sufficient to recover if we only have yesterday's backup and the online redo logs containing all of today's transactions, none of which have been overwritten, given that the database is in noarchivelog mode?
What is the use of archivelog mode if we have a daily backup and the redo logs are not overwritten during the day until the backup is taken?
What is the use of backing up archive logs?
You are working with a dangerous setup, since you seem to be betting on redo log files that are never filled up between your backups. If your data has no value, go ahead; otherwise switch to archivelog mode.
Archives are created when a redo log group fills up. So, in your case you need to copy the online redo log files manually to the remote site for recovery.
How sure are you about the redo log files not being overwritten?
Be sensible: if this is production, switch to archivelog mode. Otherwise, don't make promises about being able to do point-in-time recoveries.
Another point: if your online redo log files are damaged, your database has a big problem, and in your case you might lose a day's worth of work. Is that OK? If not, reduce the size of the redo log files to a point where a log switch happens every now and then. I am sure your company has an idea of how much transaction loss it can accept. Many companies allow less than one hour of transaction loss.
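If you do decide to switch, a rough sketch of enabling archivelog mode (run as a DBA in a maintenance window; the archive destination path is a placeholder, and the ALTER SYSTEM line assumes the instance uses an spfile):

sqlplus / as sysdba <<'EOF'
-- Optional: point archived logs at an explicit destination (path is illustrative).
ALTER SYSTEM SET log_archive_dest_1='LOCATION=/u01/arch' SCOPE=SPFILE;
SHUTDOWN IMMEDIATE
STARTUP MOUNT
ALTER DATABASE ARCHIVELOG;
ALTER DATABASE OPEN;
ARCHIVE LOG LIST
EOF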

Mark standalone redis as read-only

I want to mark a standalone Redis server (not a Redis Cluster, not Redis Sentinel) as read-only. I have been googling this for quite some time, but I don't seem to find a definitive answer (almost all answers point to clustering or Sentinel). I was looking for some config modification (CONFIG SET something).
NOTE: config set replica-read-only yes does not make the current redis-server read-only, but only its replicas.
My use case basically is that I am doing a migration, and at some point I want to make the redis-server read-only. My application code can handle failures whenever a write call happens, so that's not an issue.
Also, if this is not directly possible from redis server, is there something that I can do in the client code that'll have the same effect (I am using redis-py as the client library)? (Although this is less than ideal)
Things that I've tried
Played around with config set replica-read-only yes and other configs. They don't seem to apply to the current redis-server.
Tried marking a redis-server as a replica of itself (this was illogical, but I just wanted to see if it worked), but it turns out it deleted all the keys in my local Redis, so that's not something I can do.
Once the writes are done and you want to switch the node to read-only, there are a couple of ways to do that:
Modify the redis.conf to have "min-replicas-to-write 3". Since you don't have 3 replicas your node will stop accepting writes but will continue to serve reads, as shown below:
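Roughly like this; the exact error text can vary by Redis version:

# redis.conf, per the suggestion above:
#   min-replicas-to-write 3
# After the restart, writes are refused while reads keep working:
redis-cli SET somekey somevalue
#   (error) NOREPLICAS Not enough good replicas to write.
redis-cli GET somekey
#   (nil)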
However, please note that after modifying redis.conf, you will have to restart your redis node for the changes to take effect.
Another way is, when you want to switch to read-only mode, to create a replica, let it sync with the master, and then kill the master node. The replica will then exist as read-only.
There are several solutions you can try:
You can use the rename-command config to disable write commands. If you only want to disable a small number of commands, that's a good solution. However, since there are a lot of write commands, you might end up with a lot of configuration, and it's easy to miss some of them.
If you're using Redis 6.0, you can use Redis ACL to disable write commands for specific users.
You can set up a read-only Redis replica for your master, and ask clients to read from the replica.
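Rough sketches of the first two options (the user name and password below are placeholders; @read and @write are the built-in ACL command categories in Redis 6+):

# Option 1: rename-command in redis.conf -- renaming a command to "" disables it entirely.
#   rename-command SET ""
#   rename-command DEL ""
#   ... and so on for each write command you care about.

# Option 2: Redis 6 ACLs -- let the application user read but not write.
redis-cli ACL SETUSER appuser on '>s3cret' '~*' +@read -@write
redis-cli --user appuser --pass s3cret GET somekey     # allowed
redis-cli --user appuser --pass s3cret SET somekey v   # rejected with a NOPERM error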

PostgreSQL statistics issue - could not rename temporary statistics file

I am running PostgreSQL 9.4 on Windows, and constantly get the error:
2015-06-15 09:35:36 EDT LOG could not rename temporary statistics file "pg_stat_tmp/global.tmp" to "pg_stat_tmp/global.stat": Permission denied
I also see constant 200-800k writes to global.stat and global.tmp. I have seen other users with the same issue, but no solution.
It is a big database server, with 300 GB of data and 6,000 databases.
I tried setting,
track_activities=off
in the config file, but it did not seem to have any effect.
Any help with the error, or with reducing the writes?
After my initial answer, I decided to research the operation of the stats collector and in particular what it is doing with the files in pg_stat_tmp. I've substantially re-written the answer as a result.
What are the global.stat / global.tmp files used for?
Postgresql contains functionality to collect statistics and status information about its operation. The function is described in Section 27.2 of the manual.
This information is collated by the stats collector process. It is made available to the other postgresql processes via the global.stat file. The first time you run a query that accesses this data within a transaction, the backend which you are connected to will read the global.stat file and cache the result, using it until the end of the transaction.
To keep this file up to date, the stats collector process periodically re-writes it with updated information. It typically does this several times a second. The process is as follows:
Create a new file global.tmp
Write data to this file
Rename global.tmp as global.stat, overwriting the previous global.stat
The global.tmp and global.stat files are written into the directory configured by the stats_temp_directory configuration parameter. Normally this is set to $PGDATA/pg_stat_tmp.
On shutdown, the stats file is written into the file $PGDATA/global/pgstat.stat, and the files in the tmp dir above are removed. This file is then read and removed when the database is started up again.
Why is the stats collector processor creating so much I/O load?
Normally, the amount of data written to global.stat is relatively modest and writing it does not generate that much I/O traffic. However, under some circumstances it does seem to get very bloated. When this happens, the amount of load generated can start to get excessive, as the entire file is rewritten more than once a second.
I have had one experience where it grew by a factor of 10 or more, compared to other similar servers. This machine did have an unusually large number of databases (for our application at least - 30-40 databases - but nothing like the 6,000 you say you have). It is possible that having a large number of databases exacerbates this.
Some of the references below talk about a pattern of creating / dropping lots of tables causing bloat in these files, and that perhaps autovacuum is not running aggressively enough to remove the associated bloat. You may wish to consider your autovac settings.
Why do I get 'Permission Denied' errors on Windows?
After examining the PostgreSQL source code, I think there may be a race condition in accessing the global.stat file which could happen at any time, but is exacerbated by the size of the file.
The default mode of operation in Windows is that it is not possible to rename or remove a file while another process has it open. This is different to Linux (or Unix) where a file can be renamed or removed while other processes are accessing it.
In the sequence above you can see that if one of the backend processes is reading the file at the same time as the stats collector is rewriting it, then the backend process may still have the file open at the time the rename is attempted. That leads to the 'Permission Denied' error you are seeing.
Naturally when the file becomes very large, then the amount of time taken to read it becomes more significant, therefore the probability of the stats collector process attempting a rename while a backend still has it open increases.
However, since the file is frequently being rewritten, the impact of these errors is relatively mild. It just means that this particular update fails, leading to the backends getting slightly out-of-date statistics. The next update will probably succeed.
Note that Windows does offer a file-opening mode which allows files to be deleted or renamed while they are opened by another process; however, as far as I could tell, this mode is not used by PostgreSQL. I could not find any bug report on this - it seems like it should be reported.
In summary, these errors are a side effect of the main problem, which is the excessive size of the global.stat file.
I've turned track_activities off but the file is still being written - Why?
From what I can see, track_activities affects only one of the sets of information that the stats collector is collecting.
In addition, it looks as though the stats collector process is started regardless of these settings, and will continue to re-write the file. The settings appear to control only the collection of fresh data.
My conclusion is that once the file has become bloated, it will remain so and continue to be re-written, even once all of the stats collection options are turned off.
What can I do to avoid this problem?
Once the file has become bloated, it seems that the easiest way to get the database back into a good working state is to remove the file, using the following steps:
Stop the database
When the DB is stopped, the pg_stat_tmp directory is empty and a file $PGDATA/global/pgstat.stat is written. Rename this file to pgstat.stat.old.
Start the database. It creates a fresh set of pgstat files. After confirming the server was operating correctly you can remove the old file you have renamed.
This is the process we used when one of our servers suffered from this problem.
Needless to say be very careful when manually manipulating any files under the Postgresql Data directory.
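On a Unix-style install the steps look roughly like this with pg_ctl (adapt the paths, and use the Windows service manager or the pg_ctl.exe equivalents on Windows):

# Stop the cluster cleanly ($PGDATA is your data directory).
pg_ctl -D "$PGDATA" stop -m fast

# pg_stat_tmp is now empty; move the permanent stats file aside rather than deleting it.
mv "$PGDATA/global/pgstat.stat" "$PGDATA/global/pgstat.stat.old"

# Start again; a fresh, small set of stats files is created.
pg_ctl -D "$PGDATA" start

# Once you're happy the server is behaving, remove the old file.
rm "$PGDATA/global/pgstat.stat.old"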
After this you may want to monitor the server to see if the file becomes bloated again. If it does, then here are some additional ideas to consider:
As mentioned above, I have seen some references to this file becoming bloated if autovacuum is not running aggressively enough. You may wish to tune the autovacuum settings.
Disabling any of the track_xxx options described in Section 18.9.1 of the manual which are not required may help.
It is possible to place the pg_stat_tmp directory in a tmpfs filesystem (or whatever equivalent RAM-based filesystem is available in Windows). Doing so should eliminate I/O as a concern for these files.
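For the tmpfs idea, a Linux-flavoured sketch (the mount point and size are arbitrary; on Windows you would point stats_temp_directory at a RAM disk instead):

# Mount a small RAM-backed filesystem for the temporary stats files.
mkdir -p /var/run/pg_stat_tmp
mount -t tmpfs -o size=64M tmpfs /var/run/pg_stat_tmp
chown postgres:postgres /var/run/pg_stat_tmp

# postgresql.conf (picked up on reload):
#   stats_temp_directory = '/var/run/pg_stat_tmp'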
References:
Postgres stats collector showing high disk I/O
Too much I/O generated by postgres stats collector process
stats collector suddenly causing lots of IO
This might be a solution for your problem: https://wiki.postgresql.org/wiki/May_2015_Fsync_Permissions_Bug
Another possibility could be antivirus settings. Try to turn it off temporarily.
It happened to me a few days ago. I rebooted the machine, but the error did not disappear.
I don't know why, but performing a VACUUM ANALYZE VERBOSE did the trick, and the error has stopped showing up.

Postgres: After importing production database (with replication) to my local machine, I notice network packets being sent and received from macbook

I've been a MySQL guy, and now I'm working with Postgres so I am learning. Wondering if someone can tell me why my postgres process on my macbook is sending and receiving data over my network. I am just noticing this is happening for the first time - so maybe it's been going on before this and I just never noticed postgres does this.
What has me a bit nervous, is that I pulled down a production datadump from our server which is set up with replication and I imported it to my local postgres db. The settings in my postgresql.conf don't indicate replication is turned on. So it shouldn't be streaming out to anything, right?
If someone has some insight into what may be happening, or why postgres is sending/receiving packets, I'd love to hear the easy answer (and the complex one if there's more to what's happening).
This is a postgres install via Homebrew on MacOSX.
Thanks in advance!
Some final thoughts: It's entirely possible, I guess, that Mac's activity monitor also shows local 'network' traffic stats. Maybe this isn't going out to the internets.....
In short, I would not expect replication to be enabled for a DB restored from a dump of a server that had it, as long as the server you restored to has no replication configured at all.
More detail:
Normally, to get a local copy of a database in Postgres, one would do a pg_dump of the remote database (this could be done from your laptop, pointing at your server), followed by a createdb on your laptop to create the database stub, and then a pg_restore pointed at the dump to populate its contents. [Edit: Re-reading your post, it seems like you may have done this, but meant that the dump you used came from a server with replication enabled.]
That would be entirely local (assuming no connections into the DB from off-box), so long as you didn't explicitly setup any replication or anything else that would go off-box. Can you elaborate on what exactly you mean by importing with replication?
Also, if you're concerned about remote traffic coming from Postgres, try running this command a few times over the period of a minute or two (when you are seeing the traffic):
netstat | grep postgres
In general, replication in Postgres is configured at the server level, and has to do with things such as the master server shipping WAL files to the standby server (for streaming replication). You would almost certainly have had to set up entries in postgresql.conf and pg_hba.conf to ensure that the standby server had access (such as a replication entry in the latter conf file). Assuming you didn't do steps such as this, I think it can pretty safely be concluded that there's no replication going on (especially in conjunction with double-checking via netstat).
You might also double-check the Postgres log to see if it's doing anything replication related. In a default install, that'd probably be in /var/log/postgresql (although I'm not 100% sure if Homebrew installs put it somewhere else).
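One quick way to double-check from psql: pg_stat_replication lists any connected standbys (an empty result means nothing is streaming from your server), and wal_level shows whether the server is even configured to keep WAL for replication. The database name below is a placeholder:

# Any connected standby would show up here; zero rows means no streaming replication.
psql -d yourdb -c "SELECT client_addr, state, sync_state FROM pg_stat_replication;"

# A wal_level of 'minimal' means the server isn't keeping WAL for replication at all.
psql -d yourdb -c "SHOW wal_level;"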
If it's UDP traffic, to and from a high port, it's likely to be PostgreSQL's internal statistics collector.
These are pre-bound to prevent interference and should not be accessible outside of PostgreSQL.

OBIEE: how to reload the RPD file quickly?

I'm new to Oracle OBIEE. My development environment is installed, and the project is fairly big. We are using multi-user development now. The problem happens when one developer publishes the RPD to the network and wants to test the data: reloading the RPD file on the server takes too much time, and I can hardly wait! When multiple users want to test the RPD file, it's unbearable... Is there any other way to solve this problem, or how can I make the BI server reload the RPD file more quickly?
It's hard to say specifically without knowing a bit more about your setup, but here are a few general advice pointers:
When stopping the service, OBI will wait for any running queries to complete before stopping, so make sure there's nothing running before you try to do this.
Make sure you're only restarting the BI Server component; you don't need to wait for the other services to restart if you're just changing the RPD (if you're on 11g then deploying through EM should mean this happens anyway, so you don't need to worry). A command-line sketch is shown after this list.
If you're using 11g, you could try incremental updates by creating patches.
Check whether the hardware you're running on is adequate, most importantly that you have enough RAM so it's not having to page out to disk when it loads the RPD.
Remove anything unused from the RPD to make it smaller.
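If you're on 11g and want to restart only the BI Server from the command line, the OPMN control utility does it; the component name coreapplication_obis1 is the usual default but may differ in your install:

# Run from the OBIEE instance's bin directory, e.g. $ORACLE_INSTANCE/bin
./opmnctl status                                           # list components and their state
./opmnctl restartproc ias-component=coreapplication_obis1  # restart only the BI Server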
