This question is going to be a bit specific because I have tried A LOT of things out there and none of it has worked for me. I'm hoping someone out there might have another idea.
I am working with PostgreSQL on a Mac (macOS High Sierra) and I am trying to improve the performance of generating a materialized view, but I can't compare my changes anymore because it seems PostgreSQL has cached the materialized view. It used to take ~12 minutes to generate the materialized view, and now it takes less than 10 seconds (same code; I reverted my changes).
I used EXPLAIN (ANALYZE, BUFFERS) to confirm that almost all of the data getting fetched by the query to generate the materialized view is a hit (cached), and there were almost no disk reads.
I do not know if the information is cached in PostgreSQL's shared buffers or in the OS cache because at this point I've done things that I thought would have cleared both.
Here is what I have tried for emptying the PostgreSQL cache:
Restarted PostgreSQL server using brew services stop postgres, and then brew services start postgres (also tried calling sync && sudo purge in between). I confirmed with top as well as grep that postgres was no longer running.
Used DISCARD ALL, as well as DISCARD with its other options.
Set the shared_buffers setting in postgresql.conf to the minimum (128k).
Installed, compiled, and used pg_dropcache.
I looked at pg_ctl for a bit but I'll admit I couldn't figure out how to use it. I got the error no database directory specified and environment variable PGDATA unset, and I am not sure what to set the -D/pgdata option to for my case.
VACUUM. I know this shouldn't have had an effect, but I tried it anyway.
Here is what I have tried for emptying the operating system's cache:
Restarted computer.
Emptied ~/Library/Caches and /Library/Caches.
sync && sudo purge as well as sync && purge.
Booted up in Safe Mode.
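For completeness, the combined sequence I was running (assuming the Homebrew service is named postgres, as on my machine) looked like this:

```
$ brew services stop postgres     # stop the server
$ sync && sudo purge              # flush writes and evict the macOS file cache
$ brew services start postgres    # start again with cold shared_buffers
```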
I have also tried a few other things that I thought would force PostgreSQL to generate the materialized view from scratch (these would have been fine since I only need to test performance in dev for now):
Cloned the main table used in the materialized view, and generated the materialized view from the clone. It still generated within 10 seconds.
Scrambled some column values (first_name, last_name, mem_id (not the primary key)). It still generated within 10 seconds (and the materialized view was generated correctly with the newly scrambled values).
I am stuck and do not know what to try anymore. Any ideas/help would be appreciated!
Rebooting your computer clears both of the caches (unless you use something like autoprewarm from pg_prewarm, but that code has not been released yet). If the reboot doesn't cause the problem to reappear, then you have either fixed the problem permanently or didn't correctly understand it in the first place.
One possibility is that an ANALYZE (either manual or auto) fixed some outdated statistics that were causing a poor plan to be used by the materialized view refresh. Another possibility is that a VACUUM marked the table pages as all-visible, so index-only scans no longer have to access them. If either of these is the case, and you wanted to recreate the problem for some reason, you would have to restore the database to its state before the VACUUM or ANALYZE ran.
EXPLAIN (ANALYZE, BUFFERS) only knows about shared_buffers. If something is a hit in the OS cache only, it will still be reported as a miss by EXPLAIN (ANALYZE, BUFFERS). If you freshly restarted PostgreSQL and the very first query run shows mostly buffer hits and only a few misses, that indicates your query is hitting the same buffers over and over again. This is common in index-only scans, for example, because for every row it consults one of just a handful of visibility map pages.
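To make that concrete, here is a made-up plan fragment as psql might show it (table name and counts are hypothetical): shared hit counts pages found in shared_buffers, while read counts pages fetched from outside PostgreSQL, which could be either the OS cache or the disk.

```
mydb=# EXPLAIN (ANALYZE, BUFFERS) SELECT count(*) FROM some_table;
 Aggregate  (actual time=42.0..42.0 rows=1 loops=1)
   Buffers: shared hit=2104 read=37
   ->  Seq Scan on some_table  (actual time=0.1..30.5 rows=100000 loops=1)
         Buffers: shared hit=2104 read=37
```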
Related
Ever since I got a new ARM-based M1 MacBook Pro, I've been experiencing severe and consistent PostgreSQL issues (PostgreSQL 13.1). Whether I use a Rails server or Foreman, I receive errors in both my browser and terminal like PG::InternalError: ERROR: could not read block 15 in file "base/147456/148555": Bad address or PG::Error (invalid encoding name: unicode) or Error during failsafe response: PG::UnableToSend: no connection to the server. The strange thing is that I can often refresh the browser repeatedly to get things to work (until they inevitably don't again).
I'm aware of all the configuration challenges related to ARM-based M1 Macs, which is why I've uninstalled and reinstalled everything from Homebrew to Postgres multiple times in numerous ways (with Rosetta, without Rosetta, using arch -x86_64 brew commands, using the Postgres app instead of the Homebrew install). I've encountered a couple other people on random message boards who are experiencing the same issue (also on new Macs) and not having any luck, which is why I'm reluctant to believe that it's a drive corruption issue. (I've also run the Disk Utility FirstAid check multiple times; it says everything's healthy, but I have no idea how reliable that is.)
I'm using thoughtbot parity to sync up my dev environment database with what's currently in production. When I run development restore production, I get hundreds of lines in my terminal that look like the output below (this is immediately after the download completes but before it goes on to create defaults, process data, sequence sets, etc.). I believe it's at the root of the issue, but I'm not sure what the solution would be:
pg_restore: dropping TABLE [table name1]
pg_restore: from TOC entry 442; 1259 15829269 TABLE [table name1] u1oi0d2o8cha8f
pg_restore: error: could not execute query: ERROR: table "[table name1]" does not exist
Command was: DROP TABLE "public"."[table name1]";
pg_restore: dropping TABLE [table name2]
pg_restore: from TOC entry 277; 1259 16955 TABLE [table name2] u1oi0d2o8cha8f
pg_restore: error: could not execute query: ERROR: table "[table name2]" does not exist
Command was: DROP TABLE "public"."[table name2]";
pg_restore: dropping TABLE [table name3]
pg_restore: from TOC entry 463; 1259 15830702 TABLE [table name3] u1oi0d2o8cha8f
pg_restore: error: could not execute query: ERROR: table "[table name3]" does not exist
Command was: DROP TABLE "public"."[table name3]";
pg_restore: dropping TABLE [table name4]
pg_restore: from TOC entry 445; 1259 15830421 TABLE [table name4] u1oi0d2o8cha8f
pg_restore: error: could not execute query: ERROR: table "[table name4]" does not exist
Command was: DROP TABLE "public"."[table name4]";
Has anyone else experienced this? Any solution ideas would be much appreciated. Thanks!
EDIT: I was able to reproduce the same issue on an older MacBook Pro (also running Big Sur), so it seems unrelated to M1 but potentially related to Big Sur.
Definitive workaround for this:
After trying all the workarounds in the other answer, I was STILL getting this error occasionally, even after dumping and restoring the database, switching to M1-native Postgres, running all manner of maintenance scripts, etc.
After much tinkering with postgresql.conf, the only thing that has reliably worked around this issue indefinitely (I have not seen the error since) is this:
In postgresql.conf, change:
max_worker_processes = 8
to
max_worker_processes = 1
After making this change, I have thrown every test at my previously error-ridden database and it hasn't displayed the same error once. Previously an extraction routine I run on a database of about 20M records would give the bad address error after processing 1-2 million records. Now it completes the whole process.
Obviously there is a performance penalty to reducing the number of parallel workers, but this is the only way I've found to reliably and permanently resolve this issue.
UPDATE #2:
The WAL buffer and other adjustments extended the time between errors, but didn't eliminate them completely. I ended up reinstalling a fresh Apple Silicon version of Postgres using Homebrew, then doing a pg_dump of my existing database (the one experiencing the errors) and restoring it to the new installation/cluster.
Here's the interesting bit: pg_restore failed to restore one of the indexes in the database, and noted it during the restore process (which otherwise completed). My hunch is that corruption or another issue with this index was causing the Bad Address errors. As such, my final suggestion on this issue is to perform a pg_dump, then restore it with pg_restore: pg_restore flagged this issue where pg_dump didn't, writing a clean DB sans the faulty index.
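The round trip I mean looks roughly like this (database names are hypothetical; -Fc writes a custom-format archive that pg_restore can read):

```
$ pg_dump -Fc -f mydb.dump mydb       # dump from the old, error-prone cluster
$ createdb mydb_fresh                 # empty target database on the new cluster
$ pg_restore -d mydb_fresh mydb.dump  # restore; watch the output for index errors
```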
UPDATE:
Continued to experience this issue after attempting several workarounds, including a full pg_dump and restore of the affected database. And while some of the fixes seem to extend the time between occurrences (particularly increasing shared buffer memory), none have proven a permanent fix.
That said, some more digging on postgres mailing lists revealed that this "Bad Address" error can occur in conjunction with WAL (write-ahead-log) issues. As such, I've now set the following in my postgresql.conf file, significantly increasing the WAL buffer size:
wal_buffers = 4MB
and have not experienced the issue since (knock on wood, again).
It makes sense that this would have some effect: by default wal_buffers is sized in proportion to shared_buffers (1/32 of it), and as mentioned above, increasing the shared buffer size provided temporary relief. Anyway, something else to try until we get definitive word on what's causing this bug.
Was having this exact issue sporadically on an M1 MacBook Air: ERROR: could not read block and Bad Address in various permutations.
I read on a Postgres forum that this issue can occur in virtual-machine setups. As such, I assume this is somehow caused by Rosetta. Even if you're using the Universal version of Postgres, you're likely still running an x86 binary for some adjunct process (e.g. Python in my case).
Regardless, here's what has solved the issue (so far): reindexing the database
Note: you need to reindex from the command line, not using SQL commands. When I attempted to reindex using SQL, I encountered the same Bad Address error over and over, and the reindexing never completed.
When I reindexed using the command line, the process finished, and the Bad Address error has not recurred (knock on wood).
For me, it was just:
reindexdb name_of_database
It took 20-30 minutes for a 12 GB DB. Not only am I no longer getting these errors, but the database seems snappier to boot. I only hope the issue doesn't return with repeated reads/writes/index creation under Rosetta. I'm not sure why this works... maybe indexes created on M1 Macs are prone to corruption? Maybe the indexes become corrupt when written or accessed through Rosetta?
Is it possible that something in the Big Sur Beta 11.3 fixed this issue?
I've been having the same issues as OP since installing PostgreSQL 13 using MacPorts on my Mac mini M1 (now on PostgreSQL 13.2).
I would see could not read block errors:
Occasionally when running ad hoc queries
Always when compiling a book in R Markdown that makes several queries
Always when running VACUUM FULL on my main database (there's about 620 GB in the instance on this machine and the error would be thrown very quickly relative to how long a VACUUM FULL would take).
(My "fix" so far has been to point my Mac to the Ubuntu server I have running in the corner of my office, so no real problem for me.)
But I've managed to do 2 and 3 without the error since upgrading to Big Sur Beta 11.3 today (both failed immediately prior to upgrading). Is it possible that something in the OS fixed this issue?
I restored postgresql.conf from postgresql.conf.sample (and restarted the DB server), and it has worked fine since then.
To be clear, I had tried both wal_buffers and max_worker_processes here and they didn't help. I discovered this accidentally: I had tried so many things that I just needed to go back. I did not reinitialize the whole database or anything like that, just the config file.
We are upgrading our Progress application from 9.1D to 11.3. Is there any sample document we should look at for our migration?
Currently we have built a new server where we are installing OpenEdge Enterprise RDBMS 11.3.
Can we back up the current database and restore it to the new version?
Any suggestions/documents?
Generally Progress is very "kind" when upgrading, but keep in mind that moving from 9.1D to 11.3 (11.4 is out soon, by the way) is moving from 2002 to 2013. A lot has changed since then.
If you have program logic that relies on disk layout or OS utilities (for example using UNIX, DOS or OS-COMMAND), that might have changed as well. So an upgrade might break things even if the files compile without errors. You need to test everything!
You cannot directly backup and restore from 9.1D to 11.3, you need to dump & load.
What you need to do:
Back everything up! Don't skip this, and make sure you save a copy of the backup. Back up the database, scripts, and program files (.p, .i, .r, .cls, etc.). Everything! This is vital! Make sure you always keep an untouched version of the backup so you can restart if things go bad. Progress has built-in utilities for backing up the database. OS utilities can also be used, but be aware that OS utilities cannot create online backups; the backed-up database will most likely be corrupt. Shut down the database before backing up when using OS utilities.
Dump your current database, data as well as schema. Don't forget to check for sequences etc.
Create a new database on your new server with the schema from the old DB.
If possible, move to Type II Storage Areas when doing an upgrade like this; it will increase performance. Check the documentation and KnowledgeBase for the required settings.
Load the dumped data.
Copy program files from the old server to the new one.
Recompile.
Create startup scripts etc. for starting databases as well as clients. Old parameters might not fit your new server; you most likely have more memory, a faster CPU, larger disks etc.
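For the backup step, Progress's built-in utilities look roughly like this (the database name "sports" is hypothetical; check the documentation for your version's exact options):

```
$ probkup sports /backups/sports.bck       # offline full backup of database "sports"
$ prorest sports /backups/sports.bck -vf   # verify the backup without restoring it
```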
All steps have several substeps. I suggest you dive into the documentation found at community.progress.com. You can also search the KnowledgeBase (knowledgebase.progress.com)
Also if you run into problems you can ask more specific questions here (but tag accordingly for example with openedge).
11.3 Documentation
9.1D Documentation
KnowledgeBase
I upgraded my Postgres database from 9.0 to 9.3 and imported the data from one to the other (about 2 GB).
Since the import, the new database has been going crazy, taking 30% CPU and 4 GB of RAM. I've tried rebooting and leaving it for a day, but it's still going crazy. Any ideas?
UPDATE:
Some more info.
Nothing is running on the database, no queries, etc.
It seems to be related to the vacuum process. If I disable the vacuum, then it is fine.
Perhaps it takes vacuum a while to clean up after a big import (a 2 GB .sql file, ~500 databases), but it is still going crazy after 2 days, which seems like some kind of bug?
OK, I ran:
vacuumdb --all --full -w
after making a pgpass.conf file in C:\Users\postgres\AppData\Roaming\postgresql. This took a while, but it seems somewhat better now.
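For anyone else setting this up: the pgpass.conf format is one line per server, with * allowed as a wildcard in any field (the values below are placeholders):

```
# hostname:port:database:username:password
localhost:5432:*:postgres:mysecretpassword
```

With that file in place, the -w flag lets vacuumdb run without prompting for a password.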
Postgres is still thrashing the disk, but it's down to 10% CPU instead of 30%, and only 30 MB of RAM instead of 4 GB.
I think I might just disable autovacuum and run it manually once a month...
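To do that, the postgresql.conf change would be a one-liner (note this disables the autovacuum daemon entirely, so the manual monthly run becomes mandatory maintenance):

```
# postgresql.conf
autovacuum = off   # stop background vacuuming; run vacuumdb manually instead
```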
I am working with a SuSE machine (cat /etc/issue: SUSE Linux Enterprise Server 11 SP1 (i586)) running Postgresql 8.1.3 and the Slony-I replication system (slon version 1.1.5). We have a working replication setup going between two databases on this server, which is generating log shipping files to be sent to the remote machines we are tasked to maintain. As of this morning, we ran into a problem with this.
For a while now, we've had strange memory problems on this machine - the oom-killer seems to be striking even when there is plenty of free memory left. That has set the stage for our current issue to occur - we ran a massive update on our system last night, while replication was turned off. Now, as things currently stand, we cannot replicate the changes out - slony is attempting to compile all the changes into a single massive log file, and after about half an hour or so of running, it trips over the oom-killer issue, which appears to restart the replication package. Since it is constantly trying to rebuild that same package, it never gets anywhere.
My first question is this: Is there a way to cap the size of Slony log shipping files, so that it writes out no more than 'X' bytes (or K, or Meg, etc.) and after going over that size, closes the current log shipping file and starts a new one? We've been able to hit about four megs in size before the oom-killer hits with fair regularity, so if I could cap it there, I could at least start generating the smaller files and hopefully eventually get through this.
My second question, I guess, is this: Does anyone have a better solution for this issue than the one I'm asking about? It's quite possible I'm getting tunnel vision looking at the problem, and all I really need is -a- solution, not necessarily -my- solution.
My SQL Compact database is very simple, with just three tables and a single index on one of the tables (the table with 200k rows; the other two have less than a hundred each).
The first time the .sdf file is used by my Compact Framework application on the target Windows Mobile device, the system hangs for well over a minute while "something" is done to the database: when deployed, the DB is 17 MB, and after this first usage it balloons to 24 MB.
All subsequent usage is pretty fast, so I'm assuming there's some sort of initialization / index building going on during this first usage. I'd rather not subject the user to this delay, so I'm wondering what this initialization process is and whether it can be performed before deployment.
For now, I've copied the "initialized" database back to my desktop for use in the setup project, but I'd really like to have a better answer / solution. I've tried "full compact / repair" in the VS Database Properties dialog, but this made no difference. Any ideas?
For the record, I should add that the database is only read from by the device application -- no modifications are made by that code.
Yes, it recreates your indexes because the database was created or opened on a desktop computer. Copy your indexed database from the device and into your setup.
more info here:
http://blogs.msdn.com/sqlservercompact/archive/2009/04/01/after-moving-the-database-from-one-platform-to-other-the-first-sqlceconnection-open-takes-more-time.aspx
Since the db is read only, and if the "initialized" db no longer inflates, I would go with simply putting it into the setup. Just confirming that your approach makes sense.