Postgres: After importing production database (with replication) to my local machine, I notice network packets being sent and received from my MacBook - macos

I've been a MySQL guy, and now I'm working with Postgres, so I am learning. I'm wondering if someone can tell me why the postgres process on my MacBook is sending and receiving data over my network. I am just noticing this for the first time - so maybe it's been going on before this and I just never noticed that postgres does this.
What has me a bit nervous is that I pulled down a production data dump from our server, which is set up with replication, and I imported it into my local Postgres DB. The settings in my postgresql.conf don't indicate replication is turned on. So it shouldn't be streaming out to anything, right?
If someone has some insight into what may be happening, or why postgres is sending/receiving packets, I'd love to hear the easy answer (and the complex one if there's more to what's happening).
This is a Postgres install via Homebrew on Mac OS X.
Thanks in advance!
Some final thoughts: it's entirely possible, I guess, that Mac's Activity Monitor also shows local 'network' traffic stats. Maybe this isn't going out to the internets at all...

In short: I would not expect replication to be active for a DB dumped from a server that had it, as long as the server it was restored to has no replication configured at all.
More detail:
Normally, to get a local copy of a database in Postgres, one would do a pg_dump of the remote database (this could be done from your laptop, pointing at your server), followed by a createdb on your laptop to create the database stub, and then a pg_restore pointed at the dump to populate its contents. [Edit: re-reading your post, it seems like you may have done exactly this, but meant that the dump you used came from a server with replication enabled.]
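For reference, that workflow looks roughly like this (host, user, and database names are placeholders):
pg_dump -h your-server.example.com -U youruser -Fc yourdb > yourdb.dump
createdb yourdb
pg_restore -d yourdb yourdb.dump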
That would be entirely local (assuming no connections into the DB from off-box), so long as you didn't explicitly set up any replication or anything else that would go off-box. Can you elaborate on what exactly you mean by importing with replication?
Also, if you're concerned about remote traffic coming from Postgres, try running this command a few times over the period of a minute or two (when you are seeing the traffic):
netstat | grep postgres
In general, replication in Postgres is configured at the server level, and has to do with things such as the master server shipping WAL to the standby server (for streaming replication). You would almost certainly have had to set up entries in postgresql.conf and pg_hba.conf to ensure that the standby server had access (such as a replication entry in the latter file). Assuming you didn't do steps like these, I think it can pretty safely be concluded that there's no replication going on (especially in conjunction with double-checking via netstat).
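For illustration only, a streaming replication setup on the master would typically involve entries along these lines (exact setting names vary by Postgres version, and the standby address is a placeholder):
# postgresql.conf on the master
wal_level = hot_standby
max_wal_senders = 3
# pg_hba.conf on the master
host replication replicator 192.168.1.10/32 md5
If nothing like this exists in your laptop's config files, nothing is being shipped anywhere.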
You might also double-check the Postgres log to see if it's doing anything replication-related. In a default install, that'd probably be in /var/log/postgresql (although I'm not 100% sure whether Homebrew installs put it somewhere else).

If it's UDP traffic, to and from a high port, it's likely to be PostgreSQL's internal statistics collector.
These sockets are pre-bound to the loopback interface to prevent interference, and should not be accessible from outside the machine.
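One way to confirm this is to check which address the sockets are bound to; for the statistics collector it should be the loopback interface:
lsof -nP -i UDP | grep postgres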

Related

Start MySQL server from a Ruby program

I have a Linux server where I start a few Ruby programs during the day. The server is directly connected to the internet (no firewall) at a hosting provider, and I wonder if there is a way to start and stop the MySQL server just before and after I update the DB. The goal is to have the MySQL server running only when it is needed. So I thought there might be a way to activate the port or the service directly from Ruby.
thank you for answering,
Werner
You'd probably have to change the permissions on the database through Ruby, then do whatever you want to do, and change the permissions back.
You could do that using the mysql gem, connecting to the database and running the commands.
Then restart the process and do the same thing, but backwards.
Honestly, I don't know why you would do that, and I wouldn't recommend anyone do it. But that would be my approach.
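If you do just want the server up only around each update, a rough sketch from the shell (which Ruby could invoke via system; the init-script path is an assumption about your distro):
sudo /etc/init.d/mysql start
# ... run your DB updates here ...
sudo /etc/init.d/mysql stop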

Where does Redis store the data

I am using Redis for pub/sub as well as for a server-side cache. I mean, my app server has a Redis server running as one process (functioning as a cache as well). I have several thin clients (running Redis clients) connected to this app server in pub/sub mode. I would like to know where Redis stores the cache data: on the server alone, or will there be a copy on the clients as well? Also, is it a good idea to use Redis in this fashion if there are close to 100 Redis clients connected to the server through pub/sub channels?
Thanks
Redis is a (sort of) in-memory NoSQL database, but I found that my copy (running on Linux) dumps to /var/lib/redis/dump.rdb
Redis can manage a really big number of connections. By default it is an in-memory store (keeping everything in RAM is what makes it so fast),
but at the same time it can be configured as a persistent store, dumping cached data to disk (every x seconds or every x updated keys).
So it can be configured depending on your needs; have a look here.
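For illustration, the relevant redis.conf directives look like this (these are the stock example values; tune them to your needs):
# snapshot to disk if >= 1 key changed in 900s, >= 10 keys in 300s, etc.
save 900 1
save 300 10
save 60 10000
# directory where dump.rdb gets written
dir /var/lib/redis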
All the cache data will be stored in the memory of the server, within whatever limits are set in the running Redis server's config.
The clients do not hold any data; they only access the data stored by the Redis server.
I just installed Redis on Mac via Homebrew. Without any configuration, I found the dump.rdb is in my working directory (where I launched redis-server).
You can figure that out with the config command.
redis-cli config get dir
However, as far as I know, pub/sub data is volatile and not stored or cached in Redis at all. If you need that, you should look for a dedicated message broker, for example RabbitMQ.
On my Ubuntu, it was at /var/lib/redis/dump.rdb. On my macOS (installed via brew), it was at /usr/local/var/db/redis/dump.rdb.
Default location
/var/lib/redis/
Redis keeps all data in the server's memory and only periodically saves it to disk.
For the server <-> client flow, all data travels through the server.
Redis can handle a large number of clients; the default limit is 10,000.
If you need more, you must reconfigure the OS, server settings, etc. - http://redis.io/topics/clients
As I understand the question, your concern is about the Redis server's memory versus the client (application) memory.
I would like to know where redis stores the cache data? On the server alone, or will there be a copy on the clients as well?
Redis 6's client-side caching is what you're actually looking for. With it, the server and the application both store copies and keep them in sync through protocol communication. Although there are a few ways to accomplish it, the following example mechanism (picked from the docs) will help you understand it well.
Client 1 -> Server: CLIENT TRACKING ON
Client 1 -> Server: GET foo
(The server remembers that Client 1 may have the key "foo" cached)
(Client 1 may remember the value of "foo" inside its local memory)
Client 2 -> Server: SET foo SomeOtherValue
Server -> Client 1: INVALIDATE "foo"
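You can try this yourself with redis-cli; a minimal session might look like this (Redis 6+, with the -3 flag for RESP3 so the server can push invalidation messages):
redis-cli -3
127.0.0.1:6379> CLIENT TRACKING ON
OK
127.0.0.1:6379> GET foo
"bar"
Once another client runs SET foo SomeOtherValue, this connection receives an invalidate push message for "foo".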
Hope this helps. See those nice docs for more elaboration.

Connecting to Heroku Postgres Database takes a long time

Connecting to a Heroku Postgres database using the DBeaver client, after enabling SSL and using org.postgresql.ssl.NonValidatingFactory, takes about 4-5 minutes.
Is this behavior normal?
Just uncheck "Show non-default databases".
It crashes/freezes because it tries to load a huge number of databases on that host.
No, this is not normal at all. Connecting should be quite instantaneous. A few seconds is already way too long, and minutes is completely off, so something else is going on.
If you try connecting with some other tool (like psql directly), do you have the same problem? I'd check there to make sure your code or some dependency is not doing something odd.
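For example, with the Heroku CLI (the app name is a placeholder):
heroku pg:psql --app your-app-name
If that connects within a second or two, the problem is on the DBeaver side rather than with Heroku or the database.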

Local mongo server with mongolab mirror & fallback

How to set up a local MongoDB with a mirror on mongolab (propagate all writes from local to mongolab so they are always synchronized - I don't care about atomicity, just that it syncs in a reasonable time frame)?
How to use mongolab as a fallback if the local server stops working (Ruby/Rails, mongo driver and mongoid)?
Background: I used to have a local mongo server but it kept crashing occasionally and all my apps stopped working + I had to "repair" the DB to restart it. Then I switched to mongolab which I am very satisfied with, but it's generating a lot of traffic which I'd like to avoid by having a local "cache", but without having to worry about my local cache crashing causing all my apps to stop working. The DBs are relatively small so size is not an issue. I'm not trying to eliminate the traffic overhead of communicating to mongolab, just lower it a bit.
I'm assuming you don't want to have the mongolab instance just be part of a replica set (or perhaps that is not offered). If it is an option, the easiest way would be to add the remote mongod instance as a hidden member (priority 0) and just have it replicate data from your local instance.
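As a sketch, assuming your local mongod already runs as a replica set (the remote host name and member id are illustrative), adding such a member from the mongo shell would look like:
mongo --eval 'rs.add({_id: 1, host: "example.mongolab.com:27017", priority: 0, hidden: true})'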
An alternative immediate solution is mongooplog, which can be used to poll the oplog on one server and then apply it to another: essentially replication on demand (you would need to seed one instance appropriately, etc., and would need to manage any failures). More information here:
http://docs.mongodb.org/manual/reference/mongooplog/
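Usage is essentially a one-liner; for example (the remote host is a placeholder, and the local mongod needs an oplog, i.e. started with --replSet or --master):
mongooplog --from localhost:27017 --host example.mongolab.com:27017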
The last option would be to write something yourself using a tailable cursor in your language of choice to feed the oplog data into the remote instance.

MySQL database backup: performance issues

Folks,
I'm trying to set up a regular backup of a rather large production database (half a gig) that has both InnoDB and MyISAM tables. I've been using mysqldump so far, but I find that it's taking increasingly longer periods of time, and the server is completely unresponsive while mysqldump is running.
I wanted to ask for your advice: how do I either
Make mysqldump backup non-blocking - assign low priority to the process or something like that, OR
Find another backup mechanism that will be better/faster/non-blocking.
I know of the existence of MySQL Enterprise Backup product (http://www.mysql.com/products/enterprise/backup.html) - it's expensive and this is not an option for this project.
I've read about setting up a second server as a "replication slave", but that's not an option for me either (this requires hardware, which costs $$).
Thank you!
UPDATE: more info on my environment: Ubuntu, latest LAMPP, Amazon EC2.
If replication to a slave isn't an option, you could leverage the filesystem, depending on the OS you're using:
Consistent backup with Linux Logical Volume Manager (LVM) snapshots.
MySQL backups using ZFS snapshots.
The joys of backing up MySQL with ZFS...
I've used ZFS snapshots on a quite large MySQL database (30GB+) as a backup method and it completes very quickly (never more than a few minutes) and doesn't block. You can then mount the snapshot somewhere else and back it up to tape, etc.
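As a very rough sketch (volume and mount-point names are made up for illustration): the read lock must be held by one session while the snapshot is taken, which tools like mylvmbackup automate for LVM:
# 1. in a mysql session that stays open: FLUSH TABLES WITH READ LOCK;
# 2. from the shell, while that lock is held:
lvcreate --snapshot --size=1G --name=mysql_snap /dev/vg0/mysql
# 3. back in the same mysql session: UNLOCK TABLES;
# 4. mount the snapshot and copy it off at your leisure
mount /dev/vg0/mysql_snap /mnt/mysql_snap
tar czf /backup/mysql-$(date +%F).tar.gz -C /mnt/mysql_snap .
umount /mnt/mysql_snap
lvremove -f /dev/vg0/mysql_snap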
Edit: (my previous answer suggested a slave DB to back up from, but then I noticed Alex ruled that out in his question.)
There's no reason your replication slave can't run on the same hardware, assuming the hardware can keep up. Grab a source tarball, ./configure --prefix=/dbslave; make; make install; and you'll have a second mysql server living completely under /dbslave.
EDIT2: Replication has a bunch of other benefits as well. For instance, with replication running, you may be able to recover the binlog and replay it on top of your last backup to recover the extra data after certain kinds of catastrophes.
EDIT3: You mention you're running on EC2. Another, somewhat contrived idea to keep costs down is to try setting up another instance with an EBS volume. Then use the AWS api to spin this instance up long enough for it to catch up with writes from the binary log, dump/compress/send the snapshot, and then spin it down. Not free, and labor-intensive to set up, but considerably cheaper than running the instance 24x7.
Try the mk-parallel-dump utility from Maatkit (http://www.maatkit.org/).
Something you might consider is using binary logs here, through a method called 'log shipping'. Just before every backup, issue a command to flush the binary logs, and then you can copy all except the current binary log out via your regular file system operations.
The advantage of this method is that you're not locking up the database at all: when MySQL opens the next binary log in sequence, it releases all the file locks on the prior logs, so processing shouldn't be affected. Tar 'em, zip 'em in place, do as you please, then copy them out as one file to your backup system.
Another advantage of using binary logs is that you can restore up to any point in time for which the logs are available. E.g., you have last year's full backup and every log from then to now, but you want to see what the database looked like on Jan 1st, 2011. You can issue a restore 'until 2011-01-01', and when it stops, you're at Jan 1st, 2011 as far as the database is concerned.
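As a sketch (paths, names, and dates are illustrative), the shipping step and a later point-in-time replay would look like:
# rotate to a fresh binary log, then copy out everything except the current one
mysqladmin -u root -p flush-logs
# ... tar/copy the closed mysql-bin.* files out of the datadir ...
# later: restore the full backup, then replay the logs up to a point in time
mysqlbinlog --stop-datetime="2011-01-01 00:00:00" mysql-bin.000001 mysql-bin.000002 | mysql -u root -p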
I've had to use this once to reverse the damage a hacker caused.
It is definitely worth checking out.
Please note... binary logs are USUALLY used for replication. Nothing says you HAVE to.
Adding to what Rich Adams and timdev have already suggested, write a cron job that gets triggered during a low-usage period to perform the slaving task as suggested, to avoid high CPU utilization.
Check mk-parallel-dump also.
