I have a worker process in my app running a script every hour. This script writes data to the file system, which the web app uses to update its contents. I've noticed that although the worker runs the process successfully, the data is not being updated. Is this at all related to the fact that Heroku's file system is read-only? If so, how can I write this file without having to get into databases?
Assuming your web and worker processes are declared as separate entries in your Procfile, they run on separate Heroku dynos, each with its own file system.
So your web process does not have access to your worker process's file system, nor vice versa.
Furthermore, even if you ran both processes on the same dyno (which is possible, but not recommended), you still could not use the local dyno file system to reliably pass information from one process to another: Heroku dynos are recycled without prior warning at least once every 24 hours or so, and when that happens any files written to the local file system disappear.
Bottom line: you cannot (and should not) use the local Heroku file system for what you are trying to do. Instead, you need some kind of stateful backing service, such as Heroku Redis.
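For illustration only, here is a minimal sketch of that pattern in Go, using the go-redis client; it assumes the Heroku Redis add-on exposes its connection string as REDIS_URL, and the key name hourly:report and the buildReport helper are made up for the example. The same idea applies in whatever language your app is written in: the worker stores the payload in Redis, and the web process reads it on demand.

// Minimal sketch, not the poster's actual app.
package main

import (
    "context"
    "fmt"
    "os"
    "time"

    "github.com/redis/go-redis/v9"
)

func buildReport() string {
    // Stand-in for whatever the hourly script currently writes to disk.
    return "report contents"
}

func main() {
    opt, err := redis.ParseURL(os.Getenv("REDIS_URL"))
    if err != nil {
        panic(err)
    }
    rdb := redis.NewClient(opt)
    ctx := context.Background()

    // In the worker process: store the generated payload instead of a file.
    if err := rdb.Set(ctx, "hourly:report", buildReport(), 2*time.Hour).Err(); err != nil {
        panic(err)
    }

    // In the web process: read the latest payload whenever it is needed.
    report, err := rdb.Get(ctx, "hourly:report").Result()
    if err != nil {
        panic(err)
    }
    fmt.Println(report)
}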
I have two apps running on the same server.
Now it seems that, with withoutOverlapping() added to the scheduled job and the base cron entry managed by cron itself, these 2 apps are blocking each other's execution.
Could that be?
Yes, withoutOverlapping only works per application.
Laravel creates a file in the storage folder containing a hash of the job; if that file exists, Laravel knows the job is still running. One application cannot possibly know whether the other is currently running a job, because it does not have access to the other application's storage folder.
If your code looks like the following
$schedule->command('process:queue 0')->everyMinute()->withoutOverlapping();
$schedule->command('process:queue 1')->everyMinute()->withoutOverlapping();
That is because the same command with different parameters may be considered overlapping, i.e. the hash of the job takes only the command signature into account, not its parameters.
Consider this scenario: in a load-balanced environment, I have 3 separate instances of a CMS running on 3 different physical servers. These 3 running instances of the application are sharing the same database.
On each server, the CMS has a /media folder where all media subfolders and files reside. My question is how I'd implement a file replication service in Golang, so that when a subfolder or file is added, changed, or deleted on one of the servers, it gets copied/replicated/deleted on all the other servers?
What packages would I need to look in to, or perhaps you have a small code snippet to help me get started? That would be awesome.
Edit:
This question has been marked as a duplicate, but it is not. It is, however, an alternative to setting up a shared network file system. I'm thinking that keeping a copy of the same file on all servers, synchronizing them and keeping them updated, might be better than sharing them.
You probably shouldn't do this. Use a distributed file system, object storage (such as S3 or GCS), or a syncing program like btsync or Syncthing.
If you still want to do this yourself, it will be challenging. You are basically building a distributed database and they are difficult to get right.
At first blush you could check out something like etcd or raft, but unfortunately etcd doesn't work well with large files.
You could, on upload, also copy the file to every other server using ssh. But then what happens when a server goes down? Or what happens when two people update the same file at the same time?
Maybe you could design it such that every file gets a unique id (perhaps based on the hash of its contents so you can safely dedupe) and those files can never be updated or deleted, only added. That would solve the simultaneous update problem, but you'd still have the downtime problem.
One approach would be for each server to maintain an append-only version log, adding an entry whenever a file is added:
VERSION | FILE HASH
1 | abcd123
2 | efgh456
3 | ijkl789
With that, you can pull every file from a server, and a single number is enough to know whether new files need to be synced. (For example, if you think Server A is on version 5 and you are informed it is now on version 7, you know you need to sync 2 files.)
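To make that concrete, here is a rough in-memory sketch in Go of such an append-only version log, using a sha256 content hash as the file ID (so duplicates can be deduped safely). How entries and files are actually exchanged between servers (HTTP, ssh, ...) is left out, and all names are made up for the example.

package main

import (
    "crypto/sha256"
    "encoding/hex"
    "fmt"
    "io"
    "os"
    "sync"
)

// ContentID derives a stable, dedupe-friendly ID from a file's contents.
func ContentID(path string) (string, error) {
    f, err := os.Open(path)
    if err != nil {
        return "", err
    }
    defer f.Close()
    h := sha256.New()
    if _, err := io.Copy(h, f); err != nil {
        return "", err
    }
    return hex.EncodeToString(h.Sum(nil)), nil
}

// VersionLog is an in-memory, append-only log: entry i is version i+1.
type VersionLog struct {
    mu      sync.Mutex
    entries []string // file hashes, in the order they were added
}

// Append records a new file and returns the version it was added at.
func (l *VersionLog) Append(fileHash string) int {
    l.mu.Lock()
    defer l.mu.Unlock()
    l.entries = append(l.entries, fileHash)
    return len(l.entries)
}

// Since returns the hashes added after the given version, i.e. what a
// peer that is behind still needs to fetch.
func (l *VersionLog) Since(version int) []string {
    l.mu.Lock()
    defer l.mu.Unlock()
    if version >= len(l.entries) {
        return nil
    }
    return append([]string(nil), l.entries[version:]...)
}

func main() {
    var vlog VersionLog
    vlog.Append("abcd123")
    vlog.Append("efgh456")
    vlog.Append("ijkl789")
    // A peer that thinks we are at version 1 needs the last two files.
    fmt.Println(vlog.Since(1)) // [efgh456 ijkl789]
}

Each server would persist its own log and expose something like Since() to its peers, so a peer that last saw version N can ask for everything added after N.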
You could do this with a database table:
ID | LOCAL_SERVER_ID | REMOTE_SERVER_ID | VERSION | FILE HASH
You could poll this periodically and do your syncing via ssh or HTTP between machines. If a server was down, you could just retry until it comes back.
Or, if you didn't want a centralized database for this, you could use a library like memberlist; each node's local metadata could then be its current version.
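If you went the memberlist route, a hedged sketch of that idea follows: hashicorp/memberlist lets each node gossip a small metadata blob via its Delegate interface, and here that blob carries the node's current version. The peer address 10.0.0.2 and the fixed version number are placeholders.

package main

import (
    "encoding/binary"
    "fmt"
    "log"

    "github.com/hashicorp/memberlist"
)

// versionDelegate publishes the local version log position as node metadata.
type versionDelegate struct{ version uint64 }

func (d *versionDelegate) NodeMeta(limit int) []byte {
    buf := make([]byte, 8)
    binary.BigEndian.PutUint64(buf, d.version)
    return buf
}
func (d *versionDelegate) NotifyMsg([]byte)                           {}
func (d *versionDelegate) GetBroadcasts(overhead, limit int) [][]byte { return nil }
func (d *versionDelegate) LocalState(join bool) []byte                { return nil }
func (d *versionDelegate) MergeRemoteState(buf []byte, join bool)     {}

func main() {
    cfg := memberlist.DefaultLANConfig()
    cfg.Delegate = &versionDelegate{version: 7}

    list, err := memberlist.Create(cfg)
    if err != nil {
        log.Fatal(err)
    }
    // Join any one known peer to discover the rest of the cluster.
    if _, err := list.Join([]string{"10.0.0.2"}); err != nil {
        log.Printf("join failed (perhaps this is the first node): %v", err)
    }

    // Compare each peer's advertised version with the local one and
    // fetch whatever is missing (the fetching itself is not shown).
    for _, node := range list.Members() {
        if len(node.Meta) < 8 {
            continue
        }
        fmt.Printf("%s is at version %d\n", node.Name, binary.BigEndian.Uint64(node.Meta))
    }
}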
Either way, there will be some delay between when a file is uploaded to a single server and when it is available on all of them. Handling that well is hard, which is why you probably shouldn't do this.
Our application builds an Access database (.mdb) and then uses the Shell command to start a different application, which needs read/write access to this very database. The problem is that on some systems our application erratically seems to retain an exclusive lock on the database, preventing the other application from accessing it. Only after closing down the first application can the other application proceed.
The specific error that is raised is error 3028, which seems to be specific to DAO 3.51 (Access '97), which we indeed use. I cannot understand why some systems are affected (and then not consistently) and others never. I thought it might be a timing issue and built in a sleep period between building the database and launching the other application, but that does not help.
What is going on?
EDIT:
I now created a workaround by creating the database in a separate file and then copying it. Now the second program should always be able to access it and any remaining lock problems will surface in the first program, which I maintain. I will follow up later when our users have been able to test this.
Are you closing the connection to the DB before passing control to another EXE?
I had a similar issue previously. It wasn't quite the same, but from what you have described, this is the approach I would try before launching the secondary application with the Shell command:
Alongside the sleep period you have already employed, you will also need to close the original program that generated the .mdb file.
I achieved this by shelling a Windows batch file and then immediately exiting the original program.
Batch file makeup as follows:
ping -n 5 localhost >NUL
start MSAccess.exe "C:\DB.mdb"
exit
This allows 5 seconds for the .mdb file to be freed up before launching; you could replace my MS Access call with your secondary program.
I have a daemon that forks the process.
This daemon access a database using mysql connector library.
When I do not fork, I am able to open and read the database fine. However, when I fork, I get
MySQL server has gone away
errors consistently on the first query...
Anyone know what could be causing this?
Edit: Oh, my apologies for misinterpreting.
Still, the problems with differences between daemonized and non-daemonized runs roughly fall into the following classes:
environment variables
LIBPATH
PATH
HOME, UID, EUID (HOME surprisingly enough gets (ab)used way too often)
mysql specific variables
permissions
what user is the daemon running as? elevated or privilege separation?
current working directory (traditionally / for daemons, where / might be a chroot jail instead of 'real' /)
Starting with kernel 2.4.19, Linux provides per-process mount namespaces. A mount namespace is the set of file system mounts that are visible to a process. Mount-point namespaces can be (and usually are) shared between multiple processes, and changes to the namespace (i.e., mounts and unmounts) by one process are visible to all other processes sharing the same namespace. (The pre-2.4.19 Linux situation can be considered as one in which a single namespace was shared by every process on the system.)
detached stdin/stdout causing trouble (IMO that would mean badly designed library, but who am I)
watch out that specific resources (file locks, socket connections, threads (!)) are NOT inherited across fork/execve. I recommend reading the linked article on daemonization (below), especially, for example, the section on 'Mutual Exclusion and Running a Single Copy [open, lockf, getpid]'
I'm sure I'm forgetting stuff
Ermm... what are you starting a MySQL server process for? MySQL has plenty of sound init scripts that do work.
On the subject of proper daemonization: http://www.enderunix.org/docs/eng/daemon.php
Pay attention to the effects of sharing resources with fork children (e.g. file descriptors).
Besides that, you could just be missing basic environment settings. Peruse the official init scripts for mysql to find out which you need.
I have a decently large DB that I'm trying to pull down locally from heroku via db:pull.
I can never stick around my machine long enough to keep it from going to sleep, which effectively kills the connection and terminates the process. GOTO 1.
I know I could change my system settings to stop my computer from sleeping, which would keep the connection alive, but is there a way to continue a previous pull?
Or maybe the solution is just not to use db:pull for a large db.
heroku db:pull supports resuming. When you start a pull it will create a .dat file in your project (and get rid of it when it's completed). You can do:
heroku db:pull --resume FILE # resume transfer described by a .dat file
to start the pull from the previous location.
Heroku pgbackups may be a better option for grabbing a large DB - http://devcenter.heroku.com/articles/pgbackups.
Although I'd be more inclined to prevent your computer from sleeping - just disable the sleep functionality while the download runs, via Settings/Control Panel depending on your OS.