Failed system call was shmget(key=5432001, size=16498688, 03600).
HINT: This error usually means that PostgreSQL's request for a shared memory segment exceeded your kernel's SHMMAX parameter. You can either reduce the request size or reconfigure the kernel with larger SHMMAX. To reduce the request size (currently 16498688 bytes), reduce PostgreSQL's shared_buffers parameter (currently 1536) and/or its max_connections parameter (currently 104).
If the request size is already small, it's possible that it is less than your kernel's SHMMIN parameter, in which case raising the request size or reconfiguring SHMMIN is called for.
The PostgreSQL documentation contains more information about shared memory configuration.
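Before changing anything, you can check what the kernel currently allows. These are the Darwin/macOS sysctl names used below; on Linux the equivalent keys are kernel.shmmax and kernel.shmall:
sysctl kern.sysv.shmmax kern.sysv.shmall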
You can raise the shared memory limits for the running system (they reset at reboot) using the commands:
sudo sysctl -w kern.sysv.shmmax=16777216
sudo sysctl -w kern.sysv.shmall=4096
That will allow Postgres to start.
To get the change to stick across restarts, you need to create or edit the file /etc/sysctl.conf to include:
kern.sysv.shmmax=16777216
kern.sysv.shmall=4096
Editing /etc/sysctl.conf and restarting did the trick for me:
kern.sysv.shmmax=1610612736
kern.sysv.shmmin=1
kern.sysv.shmmni=256
kern.sysv.shmseg=64
kern.sysv.shmall=393216
Oddly enough, the PostgreSQL installer already complained about wrong shared memory settings and proposed changes to sysctl.conf. But apparently, the values it suggested for shmmax and shmall were still too small.
As Ortwin mentions, you need to edit the /etc/sysctl.conf file. This file doesn't actually exist on a clean Mac OS X Lion install, so you'll need to create it. The parameters listed above are reasonable for a large machine: they allocate up to 1.5 GB for shared memory. If you've only got 2 GB, you might want to use less than that.
On my Mac, I allocate 256 MB to shared memory with the following line:
kern.sysv.shmmax=268435456
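One thing to keep in mind: kern.sysv.shmall is counted in 4 kB pages rather than bytes, so it should be kept in step with shmmax (the pairs above, 16777216/4096 and 1610612736/393216, follow the same ratio). A quick sanity check for the 256 MB value:
echo $((268435456 / 4096))   # 65536 -> a matching kern.sysv.shmall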
Two links that I found helpful when researching this:
http://www.spy-hill.net/help/apple/SharedMemory.html discusses shared memory on Darwin, and
http://archives.postgresql.org/pgsql-patches/2006-02/msg00176.php gives some history of this problem with regard to PostgreSQL.
This has to do with the shared memory parameter (shared_buffers) in the PostgreSQL configuration file. Linux also has settings in place to limit the amount of shared memory an application can request. These settings are stored in three files:
/proc/sys/kernel/shmall
/proc/sys/kernel/shmmax
/proc/sys/kernel/shmmni
One or more of these files needs to be amended, either directly or by using the sysctl executable. Ask your system administrator to do this. The error message tells you what the values should be. PostgreSQL should then be able to start up properly.
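For example, on Linux the change could look something like this; the numbers are only placeholders, so use the values from your error message:
# Apply immediately (lost on reboot); shmmax is in bytes, shmall is in pages
sudo sysctl -w kernel.shmmax=17179869184
sudo sysctl -w kernel.shmall=4194304
# To persist across reboots, add the same keys to /etc/sysctl.conf and reload
sudo sysctl -p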
If you are unable to change the values, reduce the shared_buffers parameter to a point where the request is below the threshold.
Related
We had an issue where we were advised by Oracle that our stack size was too small. Running ulimit -Ss as the oracle user showed 10240k (a previously recommended setting). However, when looking at an Oracle process (pmon, for example) and then doing cd /proc/<pid>; cat limits, we would see a max stack size of 2 MB, so it seems the 10 MB setting was not having an effect.
Oracle recommended adding the line "oracle soft stack 16384" to /etc/security/limits.conf, but this line seems to have no effect on my servers (and of course I rebooted after adding it).
I'd be grateful if someone could shed some light on where it is actually being set.
Do you use systemd? If the database is started by a script executed from systemd, then /etc/security/limits.conf is intentionally ignored, and you have to set the limits again in the systemd unit file.
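For example, assuming the database is started by a unit called oracle-db.service (the name here is just a placeholder), a drop-in override could look like this; LimitSTACK takes bytes, and 16384 kB from limits.conf corresponds to 16777216:
# Open (or create) a drop-in override for the unit that starts Oracle
sudo systemctl edit oracle-db.service
# In the editor, add:
#   [Service]
#   LimitSTACK=16777216
# Then restart the unit so the new limit applies
sudo systemctl restart oracle-db.service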
I've tried several things to get to the root of this, but I'm clueless.
Here's the Go program. It's just one file and has a /api/sign endpoint that accepts POST requests. These POST requests have three fields in the body, and they are logged in a sqlite3 database. Pretty basic stuff.
I wrote a simple Dockerfile to containerize it. Uses golang:1.7.4 to build the binary and copies it over to alpine:3.6 for the final image. Once again, nothing fancy.
I use wrk to benchmark performance. With 8 threads and 1k connections for 50 seconds (wrk -t8 -c1000 -d50s -s post.lua http://server.com/api/sign) and a lua script to create the post requests, I measured the number of requests per second between different situations. In all situations, I run wrk from my laptop and the server is in DigitalOcean VPS (2 vCPUs, 2 GB RAM, SSD, Debian 9.4) that's very close to me.
Directly running the binary produced 2979 requests/sec.
Docker (docker run -it -v $(pwd):/data -p 8080:8080 image) produced 179 requests/sec.
As you can see, the Docker version is over 16x slower than running the binary directly. Everything else is the same during both experiments.
I've tried the following things and there is practically no improvement in performance in the Docker version:
Tried using host networking instead of bridge. There was a slight increase to around 190 requests/sec, but it's still miserable.
Tried increasing the limit on the number of file descriptors in the container version with --ulimit nofile=262144:262144. No improvement.
Tried different go versions, nothing.
Tried debian:9.4 for the final image instead of alpine:3.7 in the hope that it's musl that's performing terribly. Nothing here either.
(Edit) Tried running the container without a mounted volume and there's still no performance improvement.
I'm out of ideas at this point. Any help would be much appreciated!
Using an in-memory sqlite3 database completely solved all performance issues!
db, err = sql.Open("sqlite3", "file=dco.sqlite3?mode=memory")
I knew there was a disk I/O penalty associated with Docker's abstractions (even on Linux; I've heard it's worse on macOS), but I didn't know it would be ~16x.
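One way to sanity-check whether the penalty really comes from synchronous writes (sqlite's default rollback journal fsyncs on every transaction) is to compare an fsync-heavy write on the host and inside a container; a rough sketch, with the image and paths as assumptions:
# On the host
dd if=/dev/zero of=./fsync-test bs=4k count=1000 oflag=dsync
# The same write pattern inside a container, against the same bind-mounted directory
docker run --rm -v "$(pwd)":/data debian:9.4 dd if=/dev/zero of=/data/fsync-test bs=4k count=1000 oflag=dsync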
Edit: Using an in-memory database isn't really an option most of the time. So I found another sqlite-specific solution. Before all database operations, do this to switch sqlite to WAL mode instead of the default rollback journal:
PRAGMA journal_mode=WAL;
PRAGMA synchronous=NORMAL;
This dramatically improved the Docker version's performance to over 2.7k requests/sec!
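Note that journal_mode=WAL is stored in the database file itself once set, while synchronous=NORMAL has to be issued on every connection. If it helps, the journal mode can also be flipped once from the shell (the file name is just the one from the snippet above):
# Switches the database file to WAL; prints "wal" on success
sqlite3 dco.sqlite3 'PRAGMA journal_mode=WAL;'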
I am running PostgreSQL 9.4 on Windows, and constantly get the error:
2015-06-15 09:35:36 EDT LOG could not rename temporary statistics file "pg_stat_tmp/global.tmp" to "pg_stat_tmp/global.stat": Permission denied
I also see constant 200-800k writes to global.stat and global.tmp. I have seen other users with the same issue, but no solution.
It is a big database server, with 300 GB of data and 6,000 databases.
I tried setting,
track_activities=off
in the config file, but it did not seem to have any effect.
Any help with the error, or with reducing the writes, would be appreciated.
After my initial answer, I decided to research the operation of the stats collector and in particular what it is doing with the files in pg_stat_tmp. I've substantially re-written the answer as a result.
What are the global.stat / global.tmp files used for?
Postgresql contains functionality to collect statistics and status information about its operation. The function is described in Section 27.2 of the manual.
This information is collated by the stats collector process. It is made available to the other postgresql processes via the global.stat file. The first time you run a query that accesses this data within a transaction, the backend which you are connected to will read the global.stat file and cache the result, using it until the end of the transaction.
To keep this file up to date, the stats collector process periodically re-writes it with updated information. It typically does this several times a second. The process is as follows:
Create a new file global.tmp
Write data to this file
Rename global.tmp as global.stat, overwriting the previous global.stat
The global.tmp and global.stat files are written into the directory configured by the stats_temp_directory configuration parameter. Normally this is set to $PGDATA/pg_stat_tmp.
On shutdown, the stats file is written into the file $PGDATA/global/pgstat.stat, and the files in the tmp dir above are removed. This file is then read and removed when the database is started up again.
Why is the stats collector process creating so much I/O load?
Normally, the amount of data written to global.stat is relatively modest and writing it does not generate much I/O traffic. However, under some circumstances it does seem to get very bloated. When this happens, the amount of load generated can become excessive, as the entire file is rewritten more than once a second.
I have had one experience where it grew by a factor of 10 or more compared to other similar servers. This machine did have an unusually large number of databases (for our application at least: 30-40 databases, but nothing like the 6,000 you say you have). It is possible that having a large number of databases exacerbates this.
Some of the references below talk about a pattern of creating / dropping lots of tables causing bloat in these files, and that perhaps autovacuum is not running aggressively enough to remove the associated bloat. You may wish to consider your autovac settings.
Why do I get 'Permission Denied' errors on Windows?
After examining the postgresql source code I think there may be a race condition in accessing the global.stats file which could happen at any time, but is exacerbated by the size of the file.
The default mode of operation in Windows is that it is not possible to rename or remove a file while another process has it open. This is different to Linux (or Unix) where a file can be renamed or removed while other processes are accessing it.
In the sequence above you can see that if one of the backend processes is reading the file at the same time as the stats collector is rewriting it, then the backend process may still have the file open at the time the rename is attempted. That leads to the 'Permission Denied' error you are seeing.
Naturally when the file becomes very large, then the amount of time taken to read it becomes more significant, therefore the probability of the stats collector process attempting a rename while a backend still has it open increases.
However, since the file is frequently being rewritten, the impact of these errors is relatively mild. It just means that this particular update fails, leaving the backends with slightly out-of-date statistics. The next update will probably succeed.
Note that Windows does offer a file-opening mode which allows files to be deleted or renamed while they are open in another process; however, as far as I could tell, this mode is not used by PostgreSQL. I could not find any bug report on this; it seems like it should be reported.
In summary, these errors are a side effect of the main problem, which is the excessive size of the global.stat file.
I've turned track_activities off but the file is still being written - Why?
From what I can see, track_activities affects only one of the sets of information that the stats collector is collecting.
In addition, it looks as though the stats collector process is started regardless of these settings, and will continue to re-write the file. The settings appear to control only the collection of fresh data.
My conclusion is that once the file has become bloated, it will remain so and continue to be re-written, even once all of the stats collection options are turned off.
What can I do to avoid this problem?
Once the file has become bloated, it seems that the easiest way to get the database back into a good working state is to remove the file, using the following steps:
Stop the database
When the DB is stopped, the pg_stat_tmp directory is empty and a file $PGDATA/global/pgstat.stat is written. We renamed this file to pgstat.stat.old.
Start the database. It creates a fresh set of pgstat files. After confirming the server was operating correctly you can remove the old file you have renamed.
This is the process we used when one of our servers suffered from this problem.
Needless to say be very careful when manually manipulating any files under the Postgresql Data directory.
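A rough shell sketch of those steps (paths and commands are illustrative; on Windows you would stop and start the service via the Services panel or net stop/net start instead):
# Stop the server; PGDATA points at your data directory
pg_ctl stop -D "$PGDATA"
# Keep the old stats file around until the server is confirmed healthy
mv "$PGDATA/global/pgstat.stat" "$PGDATA/global/pgstat.stat.old"
pg_ctl start -D "$PGDATA"
# Later, once everything looks good: rm "$PGDATA/global/pgstat.stat.old"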
After this you may want to monitor the server to see if the file becomes bloated again. If it does, then here are some additional ideas to consider:
As mentioned above, I have seen some references to this file becoming bloated if autovacuum is not running aggressively enough. You may wish to tune the autovacuum settings.
Disabling any of the track_xxx options described in Section 18.9.1 of the manual which are not required may help.
It is possible to place the pg_stat_tmp directory on a tmpfs filesystem (or whatever equivalent RAM-based filesystem is available on Windows). Doing so should eliminate I/O as a concern for these files.
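On Linux that could look something like the following sketch (the mount point and size are assumptions; stats_temp_directory can be changed with a reload, no restart needed):
# Create a RAM-backed directory owned by the postgres user
sudo mkdir -p /var/run/pg_stat_tmp
sudo mount -t tmpfs -o size=256M tmpfs /var/run/pg_stat_tmp
sudo chown postgres:postgres /var/run/pg_stat_tmp
# In postgresql.conf: stats_temp_directory = '/var/run/pg_stat_tmp'
pg_ctl reload -D "$PGDATA"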
References:
Postgres stats collector showing high disk I/O
Too much I/O generated by postgres stats collector process
stats collector suddenly causing lots of IO
This might be a solution for your problem: https://wiki.postgresql.org/wiki/May_2015_Fsync_Permissions_Bug
Another possibility could be antivirus settings. Try to turn it off temporarily.
It happened to me a few days ago. I rebooted the machine, but the error did not disappear.
I don't know why, but performing a VACUUM ANALYZE VERBOSE did the trick, and the error has stopped showing up.
I'm trying to run Magento Community Edition 1.7.0.2 using NGINX and PHP-FPM on a 512 MB RAM VPS running Ubuntu 12.04.3 32-bit.
Whenever I try to change the default template by changing all the settings under System->Configuration->Design->Themes, i.e. setting all of the options (Templates, Skin (Images / CSS), Layout, Default) to the provided modern template (or any other template), I go over the PHP memory limit, even if I set the limit to 256 MB.
I find it strange, because I was able to do this on shared hosting with less RAM, though that was on Apache, I guess.
Each time I attempt this, it fails and it becomes impossible to get to either the admin or the front end; I just get a white screen. I solve it by restoring the machine from a snapshot.
Can anyone help me debug this?
Update:
Actually, I'm not even able to refresh the configuration cache. One of the php-fpm processes increases its memory use until it reaches the maximum RAM...
2014/01/06 16:58:09 [error] 892#0:
*27 FastCGI sent in stderr: "PHP message: PHP Fatal error: Allowed memory size of 268435456 bytes exhausted (tried to allocate 32 bytes)
in
/usr/share/nginx/www/spaparts/app/code/core/Mage/Core/Model/Config.php
on line 622"
while reading response header from upstream, client: 66.249.66.xxx,
server: domain.com,
request: "GET /index.php/apparel/shoes.html?cat=16 HTTP/1.1",
upstream: "fastcgi://unix:/tmp/php5-fpm.sock:", host: "domain.com"
I thought it would be nice to write up the details, in case anyone else is having similar trouble.
So, the over-the-limit PHP memory errors were caused by:
the unsecure base URL being set to "{{unsecure_base_url}}"
the secure base URL being set to "{{secure_base_url}}"
This was suggested somewhere as a way to allow changing the domain of the Magento install, and it does let the site run as usual, but it seems to cause some loops and over-the-limit RAM consumption.
After changing the settings in System->Configuration->Web, everything is back to normal; I was able to clear the cache, change the theme, etc.
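For anyone who hits this and cannot reach the admin at all (white screen), the same values can also be corrected directly in the database; core_config_data and the two paths are the standard Magento 1 locations, while the URL and credentials are placeholders:
mysql -u magento_user -p magento_db -e "UPDATE core_config_data SET value = 'http://www.example.com/' WHERE path IN ('web/unsecure/base_url', 'web/secure/base_url');"
# Clear the config cache afterwards so the new values are picked up
rm -rf /usr/share/nginx/www/spaparts/var/cache/*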
Thanks everyone for all your suggestions!
We are using utl_file in Oracle 10g to copy a BLOB from a table row to a file on the file system, and when we call utl_file.fclose() it takes a long time. It's a 10 MB file, not very big, and it takes just over a minute to complete. Anyone know why this would be so slow?
Thanks
EDIT
Looks like this is related to our file system. When we write to a local drive it works fine.
We have determined that it is our network file system mount causing the issue. When we remove that from the problem and store the file to a local drive it works fine. We were able to test this on another environment with the same configuration and it's fast and works as expected.
Now we need to get our network guys involved and see why transferring data over the NFS mount is so slow in this environment.
EDIT
It was the network speed between the Oracle server and the UNIX server. It was set to half duplex at 10 Mb per second. We bumped it up to full duplex at 100 Mb and it works great now!
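If the setting lives on the host rather than on the switch, the negotiated speed and duplex can be checked and forced with ethtool (the interface name is an assumption):
# Show the current link settings
ethtool eth0
# Force 100 Mb/s full duplex if autonegotiation keeps ending up at half duplex
sudo ethtool -s eth0 speed 100 duplex full autoneg off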
Are you doing an fflush prior to that? If not, then the fclose is performing the fflush for you, and that may be where the time is going. Check it by issuing an fflush prior to the close.