Golang file and folder replication / mirroring across multiple servers - go

Consider this scenario. In a load-balanced environment, I have 3 separate instances of a CMS running on 3 different physical servers. These 3 running instances of the application share the same database.
On each server, the CMS has a /media folder where all media subfolders and files reside. My question is how I'd implement a file replication service in Go, so that when a subfolder or file is added, changed, or deleted on one of the servers, it gets copied/replicated/deleted on all the other servers.
What packages would I need to look in to, or perhaps you have a small code snippet to help me get started? That would be awesome.
Edit:
This question has been marked as "duplicate", but it is not. It is, however, an alternative to setting up a shared network file system. I'm thinking that keeping a copy of the same files on all servers, and keeping them synchronized, might be better than sharing them.

You probably shouldn't do this. Use a distributed file system, object storage (à la S3 or GCS), or a syncing program like btsync or syncthing.
If you still want to do this yourself, it will be challenging. You are basically building a distributed database and they are difficult to get right.
At first blush you could check out something like etcd or raft, but unfortunately etcd doesn't work well with large files.
You could, on upload, also copy the file to every other server using ssh. But then what happens when a server goes down? Or what happens when two people update the same file at the same time?
Maybe you could design it such that every file gets a unique id (perhaps based on the hash of its contents so you can safely dedupe) and those files can never be updated or deleted, only added. That would solve the simultaneous update problem, but you'd still have the downtime problem.
One approach would be for each server to maintain an append-only version log when a file is added:
VERSION | FILE HASH
1 | abcd123
2 | efgh456
3 | ijkl789
With that, you can pull every file from a server, and a single number is sufficient to know when a file has been added. (For example, if you think Server A is on version 5 and you get informed it is now on version 7, you know you need to sync 2 files.)
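A rough Go sketch of the content-hash / append-only log idea (the VersionLog type, package name, and on-disk layout here are just assumptions, not a prescribed design) might look like this:

```go
package mediasync

import (
	"crypto/sha256"
	"encoding/hex"
	"os"
	"path/filepath"
	"sync"
)

// VersionLog is an append-only list of content hashes, one entry per added file.
type VersionLog struct {
	mu      sync.Mutex
	entries []string // entries[i] is the hash added at version i+1
}

// Add records a new hash and returns the new version number.
func (l *VersionLog) Add(hash string) int {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.entries = append(l.entries, hash)
	return len(l.entries)
}

// Since returns the hashes added after the given version, i.e. what a peer
// that is still on that version needs to fetch.
func (l *VersionLog) Since(version int) []string {
	l.mu.Lock()
	defer l.mu.Unlock()
	if version >= len(l.entries) {
		return nil
	}
	return append([]string(nil), l.entries[version:]...)
}

// StoreUpload writes an uploaded file into mediaDir under its SHA-256 hash
// and appends it to the log; content-addressed files never change, so there
// is nothing to "update", only new versions to add.
func StoreUpload(log *VersionLog, mediaDir, srcPath string) (string, error) {
	data, err := os.ReadFile(srcPath)
	if err != nil {
		return "", err
	}
	sum := sha256.Sum256(data)
	hash := hex.EncodeToString(sum[:])
	if err := os.WriteFile(filepath.Join(mediaDir, hash), data, 0o644); err != nil {
		return "", err
	}
	log.Add(hash)
	return hash, nil
}
```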
You could do this with a database table:
ID | LOCAL_SERVER_ID | REMOTE_SERVER_ID | VERSION | FILE HASH
Which you could periodically poll and do your syncing via ssh or http between machines. If a server was down you could just retry until it works.
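For the polling side, a minimal sketch along those lines (the /log and /file HTTP endpoints are purely hypothetical, and error handling is kept to a bare minimum):

```go
package mediasync

import (
	"bufio"
	"fmt"
	"io"
	"net/http"
	"os"
	"path/filepath"
	"time"
)

// PollPeer periodically asks a peer which hashes were added after the version
// we last saw, downloads each missing file, and simply retries on the next
// tick if the peer is down.
func PollPeer(peerURL, mediaDir string, lastSeen *int) {
	for range time.Tick(30 * time.Second) {
		resp, err := http.Get(fmt.Sprintf("%s/log?since=%d", peerURL, *lastSeen))
		if err != nil {
			continue // peer unreachable; try again on the next tick
		}
		scanner := bufio.NewScanner(resp.Body)
		for scanner.Scan() {
			hash := scanner.Text()
			if err := fetchFile(peerURL, mediaDir, hash); err != nil {
				break // abort this cycle; we resume from lastSeen next time
			}
			*lastSeen++
		}
		resp.Body.Close()
	}
}

// fetchFile downloads one content-addressed file from the peer.
func fetchFile(peerURL, mediaDir, hash string) error {
	resp, err := http.Get(peerURL + "/file/" + hash)
	if err != nil {
		return err
	}
	defer resp.Body.Close()

	out, err := os.Create(filepath.Join(mediaDir, hash))
	if err != nil {
		return err
	}
	defer out.Close()
	_, err = io.Copy(out, resp.Body)
	return err
}
```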
Or, if you didn't want to have a centralized database for this, you could use a library like memberlist. The local metadata for each node could be its version.
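A minimal memberlist sketch (the peer address and starting version are placeholders; in a real setup you'd call UpdateNode whenever the local version changes) could publish the version as node metadata:

```go
package main

import (
	"encoding/binary"
	"fmt"
	"log"

	"github.com/hashicorp/memberlist"
)

// versionDelegate publishes our local version counter as node metadata,
// so other members learn about new files via gossip.
type versionDelegate struct{ version uint64 }

func (d *versionDelegate) NodeMeta(limit int) []byte {
	buf := make([]byte, 8)
	binary.BigEndian.PutUint64(buf, d.version)
	return buf
}

// The remaining Delegate methods are unused in this sketch.
func (d *versionDelegate) NotifyMsg([]byte)                           {}
func (d *versionDelegate) GetBroadcasts(overhead, limit int) [][]byte { return nil }
func (d *versionDelegate) LocalState(join bool) []byte                { return nil }
func (d *versionDelegate) MergeRemoteState(buf []byte, join bool)     {}

func main() {
	cfg := memberlist.DefaultLANConfig()
	cfg.Delegate = &versionDelegate{version: 7} // e.g. the length of the local version log

	list, err := memberlist.Create(cfg)
	if err != nil {
		log.Fatal(err)
	}
	// Join via any existing member, e.g. one of the other CMS servers.
	if _, err := list.Join([]string{"10.0.0.2"}); err != nil {
		log.Println("join failed, running standalone:", err)
	}

	// Compare each peer's advertised version with our own to decide
	// which files need to be pulled (the pulling itself is up to you).
	for _, member := range list.Members() {
		if len(member.Meta) == 8 {
			remote := binary.BigEndian.Uint64(member.Meta)
			fmt.Printf("%s is at version %d\n", member.Name, remote)
		}
	}
}
```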
Either way there will be some delay between when a file is uploaded to a single server and when it's available on all of them. Handling that well is hard, which is why you probably shouldn't do this.

Related

existdb: identify database server

We have a number of (developer) existDb database servers, and some staging/production servers.
Each has its own configuration, and they are slightly different.
We need to select which configuration to load and use in queries.
The configuration is to be stored in an XML file within the repository.
However, when syncing the content of the servers, a single burnt-in XML file is not sufficient, since it is overwritten during copying from the other server.
For this, we need the physical name of the actual database server.
The only function I found, request:get-server-name, is not quite stable, since a single eXist server can be accessed through a number of different (localhost, intranet, or external) URLs. That, however, leads to unnecessary duplication of the configuration, one for each external URL...
(Accessing some local files in the file system is not secure and not fast.)
How to get the physical name of the existDb server from XQuery?
I'm sorry, but I don't fully understand your question. Are you talking about eXist's default conf.xml or your own configuration file that you need to store in a VCS repo? Should the XQuery be executed on one instance and trigger an event in all others, or just some, or...? Without some code it is difficult to see why and when something gets overwritten.
You could try console:jmx-token, which does not vary depending on the URL (at least it shouldn't).
Also, you might find it much easier to use a Docker-based approach, either with multiple instances coordinated via docker-compose, or to keep the individual configs from interfering with each other when moving from dev to staging to production: https://github.com/duncdrum/exist-docker
If I understand correctly, you basically want to be able to get the hostname or the IP address of a server from XQuery. If the functions in the XQuery Request module are not doing as you wish, then another option would be to set a Java System Property when starting eXist-db. This system property could be the internal DNS name or IP of your server, for example: -Dour-server-name=server1.mydomain.com
From XQuery you could then read that Java System property using util:system-property("our-server-name").

Shell Script for file monitoring

I have 2 AWS EC2 LAMP servers and I want to replicate the data in one of the folders to the other. I know I could try EFS, but for some reason it is not a viable option at this moment. So, here is what I want to request help with:
Server A and Server B have the same file structure, but the files inside are mismatched. I want a script on Server A to look in, for example, the /var/www/html/../file/ folder, compare it with /var/www/html/../file/ on Server B, and dump all new files from Server A to B.
Any help on how to write it?
Well, I used S3FS, which is a lot easier than breaking my head over the script. It readily copies the files from one server to another.
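If a script is still preferred over S3FS, a very small Go sketch (host and paths are placeholders) that shells out to rsync with --ignore-existing would push only the files Server B doesn't have yet:

```go
package main

import (
	"log"
	"os"
	"os/exec"
)

func main() {
	// --ignore-existing skips files already present on the receiver,
	// so only new files from Server A are pushed to Server B.
	// Paths and the server-b hostname are placeholders.
	cmd := exec.Command("rsync", "-av", "--ignore-existing",
		"/var/www/html/file/", "ubuntu@server-b:/var/www/html/file/")
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	if err := cmd.Run(); err != nil {
		log.Fatal(err)
	}
}
```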

Laravel running on a remote host

I am looking at learning Laravel. It looks great, but my one concern is how to get it running on a remote host where I have limited (non-root) access.
Is it just a case of uploading the files via FTP, or are there other tricky config things that need to be done?
Probably your best bet is simply copying all app files, but be aware it may take quite long (many files) if your only access is FTP, with the risk of an incomplete transfer. It may be better (but not necessary) to transfer a single compressed archive file and extract it via the PHP zip extension, or exec() and the tar command if available (you can find many tutorials on the web).
Last but not least, you could try to run composer via PHP script - take a look here for example - but that could be much harder than expected (it didn't work for me some time ago because the hosting service had proc_open disabled).
Also, in your case you most likely have permission to access only your own web root directory and can't change the document root configuration, so you probably won't be able to place "non-public" elements outside the document root as recommended; at the very least, remember to set file permissions properly.
Most importantly, remember to check the requirements first (note that starting from version 4.2 Laravel requires PHP 5.4).

What's the best way to (programmatically) determine a file's network origin?

For an application I'm writing, I want to programmatically find out which computer on the network a file came from. How can I best accomplish this?
Do I need to monitor network transactions or is this data stored somewhere in Windows?
When a file is copied to the local system, Windows does not keep any record of where it was copied from. So unless the application that created it saved such information in the file, it is lost.
With file auditing, file and directory operations can be tracked, but I don't think that will include the source path for file copies (just who created it and when).
Yes, it seems like you would either need to detect the file transfer based on interception of network traffic, or if you have the ability to alter the file in some way, use public key cryptography to sign files using a machine-specific key before they are transferred.
Create a service on either the destination computer, or on the file hosting computers which will add records to an Alternate Data Stream attached to each file, much the way that Windows handles ZoneInfo for files downloaded from the internet.
You can have a background process on machine A which "tags" each file as having been tagged by machine A on such-and-such a date and time. Then when machine B downloads the file, assuming we are using NTFS filesystems, it can see the tag from A. Or, if you can't have a process at the server, you can use NTFS streams on the "client" side via packet sniffing methods as others have described. The bonus here is that future file-copies will retain the data as long as it is between NTFS systems.
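As an illustration of that ADS approach (Windows/NTFS only; the stream name and tag format here are arbitrary), Go can write and read an alternate data stream just by appending ":streamname" to the path:

```go
package main

import (
	"fmt"
	"log"
	"os"
	"time"
)

func main() {
	// Example path on an NTFS volume; Windows-only.
	const file = `C:\share\report.docx`

	// Write an "origin" tag into an alternate data stream. The stream travels
	// with the file for as long as it stays on NTFS volumes.
	tag := []byte("tagged-by=MACHINE-A;at=" + time.Now().Format(time.RFC3339))
	if err := os.WriteFile(file+":fileorigin", tag, 0o644); err != nil {
		log.Fatal(err)
	}

	// Any later reader on an NTFS system can recover the tag.
	got, err := os.ReadFile(file + ":fileorigin")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("file origin:", string(got))
}
```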
Alternative: create a requirement that all file transfers must be done through a Web portal (as opposed to network drag-and-drop). Built in logging. Or other type of file retrieval proxy. Do you have control over procedures such as this?

Encrypted FTP Storage

I guess this is kind of a programming question, because I'm going to write a program if this doesn't exist.
So I found a very cheap web-host (I don't really care about the actual web hosting). They will give me a domain name and ftp server with a ton of storage space. Anyway, I want to backup a few hundred gigs of data (mostly family photos and scans of important documents). I also want to backup any future family photos / documents. I don't care if everything on my local NAS dies in a fire, I just want to have the photos and important documents backed up off-site.
So I want some program that lets me select folders locally and schedules them to be backed up to the FTP server. I'm a bit of a security nut, so I'd like the files to be encrypted locally before being transferred up to the server.
I know I can do this with TrueCrypt volumes, but I don't want to transfer an entire encrypted volume blob up to the server every time I change a file in it. I could do multiple TrueCrypt volumes, but that would be a pain to manage.
Also this must be mac/linux compatible although I'll primarily be on linux.
I basically need rsync + truecrypt + cron + sftp all rolled into a cryptographically secure program.
I've been searching for days with no luck. Any ideas?
mozyBackup does this - it doesn't use FTP, it has a custom uploader.
P.S. Remember that a typical home ADSL connection only does about 1Gb/day upstream.
Linux option:
An out-of-the-box option is probably duplicity (for example, see http://www.howtoforge.com/creating-encrypted-ftp-backups-with-duplicity-and-ftplicity-on-debian-lenny).
Otherwise, if these are basically rarely-changed archive copies of files, I would roll my own: gnupg (or dpad) for individual file encryption, a file-changed script, and ftp or rsync.
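For the "file changed script" part of that roll-your-own route, a small Go sketch (folder and manifest paths are placeholders) can hash every file and report what changed since the last run, leaving the actual encryption and upload to gnupg and rsync/ftp:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
	"io/fs"
	"log"
	"os"
	"path/filepath"
)

func main() {
	const root = "/home/me/photos"            // folder to back up (placeholder)
	const manifest = "/home/me/.backup.json"  // hash state from the last run (placeholder)

	// Load the previous hashes; a missing or corrupt manifest just means "first run".
	old := map[string]string{}
	if data, err := os.ReadFile(manifest); err == nil {
		_ = json.Unmarshal(data, &old)
	}

	current := map[string]string{}
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil || d.IsDir() {
			return err
		}
		data, err := os.ReadFile(path)
		if err != nil {
			return err
		}
		sum := sha256.Sum256(data)
		current[path] = hex.EncodeToString(sum[:])
		// New or modified files are the candidates for encrypt-and-upload.
		if old[path] != current[path] {
			fmt.Println("changed:", path)
		}
		return nil
	})
	if err != nil {
		log.Fatal(err)
	}

	// Persist the new state for the next cron run.
	data, _ := json.MarshalIndent(current, "", "  ")
	if err := os.WriteFile(manifest, data, 0o600); err != nil {
		log.Fatal(err)
	}
}
```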
