CakePHP: Run shell job from controller

Is it possible to use dispatchShell from a Controller?
My mission is to start a shell job when the user has signed up.
I'm using CakePHP 2.0

If you can't avoid the need to do this, as dogmatic suggests, then read on.
So you have a (potentially) long-running job you want to perform and you don't want the user to wait.
As the PHP code your user triggers runs during a request handled by Apache, any code that executes will stall that request until it completes (unless you hit Apache's request timeout).
If that isn't acceptable for your application, then you will need to trigger PHP outside the Apache request (i.e. from the command line).
Usability-wise, at this point it would make sense to notify your user that you are processing data in the background: anything from a message telling them they can check back later, to a spinning progress bar that polls your application over AJAX to detect job completion.
The simplest approach is to have a cronjob execute a PHP script (i.e. a CakePHP shell) on some interval (at minimum, once per minute), and perform such tasks in the background there.
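To make that concrete, here is a minimal sketch of such a shell plus the cron entry that would drive it; the shell name, the welcomed flag and all paths are illustrative assumptions, not part of the original answer:

<?php
// app/Console/Command/SignupJobsShell.php (hypothetical shell)
// Driven by a cron entry along these lines (paths assumed):
//   * * * * * cd /var/www/app && Console/cake signup_jobs >> /tmp/signup_jobs.log 2>&1
class SignupJobsShell extends AppShell {

    public $uses = array('User');

    public function main() {
        // Pick up users who signed up but have not been processed yet
        $users = $this->User->find('all', array(
            'conditions' => array('User.welcomed' => false),
        ));
        foreach ($users as $user) {
            // ... do the slow work here (send welcome mail, provision data, etc.) ...
            $this->User->id = $user['User']['id'];
            $this->User->saveField('welcomed', true);
        }
    }
}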
Some issues arise with background jobs, however. How do you know when they failed? How do you know when you need to retry? What if a job doesn't complete within the cron interval: will a race condition occur?
The proper, but more complicated, setup would be to use a work/message queue system. These let you handle the above issues more gracefully, but generally require you to run a background daemon on a server to catch and handle incoming jobs.
The way this works is, in your code (when a user registers) you insert a job into the queue. The queue daemon picks up the job instantly (it doesn't run on an interval so it's always waiting) and hands it to a worker process (a CakePHP shell for example). It's instant and - if you tell it - it knows if it worked, it knows if it failed, it can retry if you want and it doesn't accidentally handle the same job twice.
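For concreteness, the producer half of that flow could look like the following with Beanstalkd and its PHP client Pheanstalk (both come up just below); the 'signup' tube name and the payload shape are my own assumptions:

// In the controller, right after the new user record is saved:
$pheanstalk = new Pheanstalk\Pheanstalk('127.0.0.1');
$pheanstalk->useTube('signup')->put(json_encode(array('user_id' => $userId)));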
There are a number of these available, such as Beanstalkd, dropr, Gearman, RabbitMQ, etc. There are also a number of CakePHP plugins (of varying age) that can help:
cakephp-queue (MySQL)
CakePHP-Queue-Plugin (MySQL)
CakeResque (Redis)
cakephp-gearman (Gearman)
and others.
I have had experience using CakePHP with both Beanstalkd (plus the PHP Pheanstalk library) and the CakePHP Queue plugin (the first one above). I have to credit Beanstalkd (written in C) with being very lightweight, simple and fast. However, with regard to CakePHP development, I found the plugin faster to get up and running because:
The plugin comes with all the PHP code you need to get started. With Beanstalkd, you need to write more code, such as a PHP daemon that polls the queue looking for jobs (a sketch follows this list).
The Beanstalkd server infrastructure becomes more complex: I had to install multiple instances of beanstalkd for dev/test/prod, and install supervisord to look after the processes.
Developing/testing is a bit easier since it's a self-contained CakePHP + MySQL solution. You simply need to type cake queue add user signup and cake queue runworker.
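For reference, the polling daemon that Beanstalkd requires might look roughly like this with the Pheanstalk 3.x API; this is a sketch under those assumptions, not a production worker:

<?php
// worker.php -- started from the command line and kept alive by supervisord
require 'vendor/autoload.php';

use Pheanstalk\Pheanstalk;

$pheanstalk = new Pheanstalk('127.0.0.1');
$pheanstalk->watch('signup')->ignore('default');

while (true) {
    $job = $pheanstalk->reserve();  // blocks until a job arrives
    try {
        $data = json_decode($job->getData(), true);
        // ... hand the data to a CakePHP shell, or do the work inline ...
        $pheanstalk->delete($job);  // success: remove the job from the queue
    } catch (Exception $e) {
        $pheanstalk->bury($job);    // failure: set the job aside for inspection
    }
}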

I was able to run a console shell from a controller action; see the example below.
App::uses('ShellDispatcher', 'Console');
...
public function aco_sync() {
    // Build the argv-style argument list the dispatcher expects
    $command = '-app ' . APP . ' AclExtras.AclExtras aco_sync -r adminControllers -p UserAdmin';
    $args = explode(' ', $command);

    // The second argument (false) skips the dispatcher's own bootstrapping
    $dispatcher = new ShellDispatcher($args, false);
    if ($dispatcher->dispatch()) {
        $this->Session->setFlash('OK');
    } else {
        $this->Session->setFlash('Error');
    }
    return $this->redirect(array('action' => 'index'));
}

In CakePHP 3 you can dispatch shells from a controller almost the same way as in CakePHP 2. The documentation does not mention this.
// in your controller:
$shell = new \Cake\Console\Shell();
$shell->dispatchShell('shell_class param1 param2');
// or, as the docs suggest:
$shell->dispatchShell('shell_class', 'param1', 'param2');
Beware of stdout and stderr in unit tests: dispatching a shell turns on stdout and stderr logging with ConsoleLogger, so if the code you are testing from PHPUnit contains something like the snippet above, all of that logging will land in your console.

function getEbayOrder() {
    $this->autoRender = false;

    // Load the shell classes (CakePHP 2)
    App::import('Console/Command', 'AppShell');
    App::import('Console/Command', 'EbayShell');

    $job = new EbayShell();
    $job->dispatchMethod('get_orders');

    echo "RESPONSE";
}

Anything is possible, but why would you want to? If you find you need to do something both in a shell and in the actual application, look at using libs:
you stick the code in a lib and then call that lib from both your app and the shell.
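A minimal sketch of that pattern in CakePHP 2 terms (SignupLib and its method are hypothetical names):

<?php
// app/Lib/SignupLib.php -- a plain class with no controller/shell dependencies
class SignupLib {
    public function process($userId) {
        // ... the actual work lives here, independent of any request context ...
    }
}

// From a controller action or a shell command alike:
App::uses('SignupLib', 'Lib');
$lib = new SignupLib();
$lib->process($userId);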

If this is to initialize AclExtras, the best way is:

App::import('Console/Command', 'AppShell');
App::import('Plugin/AclExtras/Console/Command', 'AclExtrasShell');

$job = new AclExtrasShell();
$job->startup();
$job->dispatchMethod('aco_sync');

But avoid this unless you have no way to run the console script.

Related

Laravel Scheduling in clustered environment

I am working with scheduling in Laravel 5.3. Previously, I was using one server to host the laravel application. Now that I am using two servers to run the Laravel App, how do I ensure that both servers are not running the same jobs at the same time?
Recently, I saw an Event method called "withoutOverlapping()". See https://laravel.com/docs/5.3/scheduling#preventing-task-overlaps
In my case, withoutOverlapping() cannot help me as I am working in a clustered environment.
Are there any workarounds or suggestions regarding this?
First of all, define whether it is critical or not to avoid running a task multiple times.
For example, if your app uses a task for some sort of cleanup, there is almost no drawback to running it on every server (who cares if you try to delete messages older than 10 minutes twice?).
If it is absolutely critical to run every task only once, you'll need to designate a "main server" that executes the tasks, and slave servers that just answer requests but never perform scheduled tasks. This is quite trivial, as you just have to give every server a different name in its .env and test against that when you define the scheduler tasks (a sketch follows).
This is the easiest way; seriously, don't bother building a database locking mechanism or the like to synchronise tasks across servers. Even OSes struggle to properly synchronise threads on the same machine; why would you want to implement the same across different machines?
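A minimal sketch of that guard, assuming each server's .env sets a hypothetical APP_ROLE variable:

// app/Console/Kernel.php
protected function schedule(Schedule $schedule)
{
    // Only the designated server registers scheduled tasks;
    // every other server falls through and schedules nothing.
    if (env('APP_ROLE') === 'scheduler') {
        $schedule->command('emails:send')->daily();
    }
}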
Here's what I've done when I ran into the same problems with load balancing:
use Illuminate\Console\Command;
use Illuminate\Support\Facades\Redis;

abstract class MutexCommand extends Command {
    private $hash = null;

    public function cleanup() {
        if (is_string($this->hash)) {
            Redis::del($this->hash);
            $this->hash = null;
        }
    }

    abstract protected function generateHash();
    abstract protected function handleInternal();

    final public function handle() {
        register_shutdown_function([$this, "cleanup"]);
        try {
            $this->hash = $this->generateHash();
            // Set a value only if it does not exist, atomically; fails if it exists.
            // Essentially SETNX is the mechanism for acquiring the lock.
            if (!Redis::setnx($this->hash, true)) {
                $this->hash = null; // prevent cleanup() from releasing the other process's lock
                throw new Exception("Already running");
            }
            $this->handleInternal();
        } finally {
            $this->cleanup();
        }
    }
}
Then you can write your commands:
class ThisShouldNotOverlap extends MutexCommand {
    public function generateHash() {
        // A unique key for the mutex; the class name is enough:
        return static::class;
    }

    public function handleInternal() { /* do stuff */ }
}
Then whenever you try to run the same command on multiple instances, one will successfully acquire the "lock" and the others will fail.
Of course this assumes that you are using a non-clustered Redis cache.
If you are not using Redis, there are probably similar locking mechanisms you can implement in other caches; if you are using clustered Redis, you may need the Redlock locking mechanism.
Essentially no: there's no natural way in Laravel to know whether another Laravel app has the same job on the job dispatcher.
We have some options for finding a solution:
Create an intermediate app that manages the jobs from the other apps.
Allow only one app to dispatch jobs.
Use worker queues; there are some packages for this. I would recommend Laravel 5 with WebSockets and asynchronous queues.
First of all, the Laravel scheduler isn't designed to work in a clustered environment; it was never intended to be that way.
I would suggest having a dedicated cron instance that manages your Laravel scheduler jobs.

How to run Laravel Queue without using artisan or shell?

I tried php artisan queue:work and it runs great; the website doesn't freeze. But how do I run it programmatically via a controller? I ran Artisan::call('queue:work'), but it freezes (waiting for the queue to finish) and ends in a gateway timeout, even though the queue runs successfully.
Any suggestions?
Queues allow you to defer the processing of a time-consuming task, such as sending an email, until a later time.
So executing the queue worker from a controller actually negates the purpose of queues (a sketch of the intended split follows). Explain your exact use case in the question to provide more details.
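As a sketch of that split: the controller only enqueues a job and returns immediately, while a separate queue:work process executes it later. SendWelcomeEmail is a hypothetical job class, and dispatch() assumes a recent Laravel 5 release (older 5.x uses $this->dispatch() via the DispatchesJobs trait):

// In the controller: dispatch() returns as soon as the job is queued.
public function store(Request $request)
{
    $user = User::create($request->all());
    dispatch(new SendWelcomeEmail($user)); // executed later by `php artisan queue:work`
    return redirect('/home');
}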
Try this in your controller function
use Symfony\Component\Process\Process;
use Symfony\Component\Process\Exception\ProcessTimedOutException;

try {
    // Process($command, $cwd, $env, $input, $timeout) -- Symfony 3.x signature
    $process = new Process(
        'php artisan queue:work',  // your artisan command
        null,                      // working directory
        null,                      // your environment variables, if any
        null,                      // stdin
        60                         // timeout in seconds, e.g. 60
    );
    $process->run();
} catch (ProcessTimedOutException $e) {
    // you can show some flash message
    // OR
    return Response::json(['message' => 'some message'], 500); // your desired response code
}
In your case, you would also need a small front-end hack: call this controller function via AJAX and show a loading indicator, then stop the loading once your desired time has passed.
This is not the best approach: you should not run the queue process from a controller. You should let your server handle it by using Supervisor, etc. (a sample config sketch follows):
https://laravel.com/docs/5.1/queues#supervisor-configuration
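Following the linked docs, the Supervisor side is a short program entry along these lines (a sketch; the program name and paths are assumptions):

[program:laravel-worker]
command=php /var/www/app/artisan queue:work --sleep=3 --tries=3
autostart=true
autorestart=true
numprocs=1
redirect_stderr=true
stdout_logfile=/var/log/laravel-worker.log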

Preferred way to fork / start subprocesses in Cucumber

Let's say I have this scenario:
Scenario: Test LDAP access
Given that the LDAP dummy server is started
And the LDAP query is executed
...
I wish to start an LDAP server in that step. In my case, I use ruby-ldapserver, so I could, in theory, do this in my step:
args = { ... }
@ldap_pid = fork do
  redirect_stdout_stderr_to_logfile()
  wait_for_ldap_requests(args)
  exit # avoid messing with Cucumber/web driver cleanup
end
...
After do
  if @ldap_pid
    Process.kill("HUP", @ldap_pid)
    Process.wait @ldap_pid
  end
end
A totally different approach:
system("some_script_that_starts_ldap_dummy < #{input} >#{tmpfile} 2>&1 &")
This certainly works but is rather inelegant (starting a Ruby program from inside Ruby means unnecessary process creation, and I need to set up the input parameters for that subprogram as well).
All that said, I'm not altogether happy with either approach (the "warm fuzzy feeling" is not there).
What is your standard approach to these things? Is there one to speak of? Does Cucumber bring something to the table that could support me here? Should I run something to tell Cucumber that it has forked and should handle itself like a child process?
Edit: actually, when playing around with the fork approach, I did not notice any problems with the DB at all. I did notice that if I kill the child with SIGINT, it breaks the web driver (Poltergeist / PhantomJS in my case). A working workaround is to send a SIGHUP, handle it in the child by shutting down gracefully (if needed) but not calling exit, and then after a few seconds send a SIGKILL (which denies the child any chance to close down any protocols and just rips it away). Not nice... and not free of race conditions, say if the CI server is under load.

Run when you can

In my sinatra web application, I have a route:
get "/" do
temp = MyClass.new("hello",1)
redirect "/home"
end
Where MyClass is:
class MyClass
  @@instancesArray = []

  attr_reader :string, :id

  def initialize(string, id)
    @string = string
    @id = id
    @@instancesArray[id] = self
  end

  def self.run(id)
    puts @@instancesArray[id].string
  end
end
At some point I would want to run MyClass.run(1), but I wouldn't want it to execute immediately, because that would slow the server's response to some clients. I would want the server to wait for a time with a lighter load to run MyClass.run(temp). How could I tell it to wait until there is an empty/light load, then run MyClass.run(temp)? Can I do that?
Addendum
Here is some sample code for what I would want to do:
$var = 0
get "/" do
  $var = $var + 1 # each time a request is received, it increments
end
After that I would have a loop that counts requests per minute (so after a minute it would reset $var to 0), and if $var was less than some number, it would run tasks until the load increased.
As Andrew mentioned (correctly—not sure why he was voted down), Sinatra stops processing a route when it sees a redirect, so any subsequent statements will never execute. As you stated, you don't want to put those statements before the redirect because that will block the request until they complete. You could potentially send the redirect status and header to the client without using the redirect method and then call MyClass#run. This will have the desired effect (from the client's perspective), but the server process (or thread) will block until it completes. This is undesirable because that process (or thread) will not be able to serve any new requests until it unblocks.
You could fork a new process (or spawn a new thread) to handle this background task asynchronously from the main process associated with the request. Unfortunately, this approach has the potential to get messy. You would have to code around different situations like the background task failing, or the fork/spawn failing, or the main request process not ending if it owns a running thread or other process. (Disclaimer: I don't really know enough about IPC in Ruby and Rack under different application servers to understand all of the different scenarios, but I'm confident that here there be dragons.)
The most common solution pattern for this type of problem is to push the task into some kind of work queue to be serviced later by another process. Pushing a task onto the queue is ideally a very quick operation, and won't block the main process for more than a few milliseconds. This introduces a few new challenges (where is the queue? how is the task described so that it can be facilitated at a later time without any context? how do we maintain the worker processes?) but fortunately a lot of the leg work has already been done by other people. :-)
There is the delayed_job gem, which seems to provide a nice all-in-one solution. Unfortunately, it's mostly geared towards Rails and ActiveRecord, and the efforts people have made in the past to make it work with Sinatra look to be unmaintained. The contemporary, framework-agnostic solutions are Resque and Sidekiq. It might take some effort to get up and running with either option, but it would be well worth it if you have several "run when you can" type functions in your application.
MyClass.run(temp) is never actually executing. In your current request to / path you instantiate a new instance of MyClass then it will immediately do a get request to /home. I'm not entirely sure what the question is though. If you want something to execute after the redirect, that functionality needs to exist within the /home route.
get '/home' do
  # some code like MyClass.run(some_arg)
end

Basic Sidekiq Questions about Idempotency and functions

I'm using Sidekiq to perform some heavy processing in the background. I looked online but couldn't find the answers to the following questions. I am using:
Class.delay.use_method(listing_id)
And then, inside the class, I have a
def self.use_method(listing_id)
  listing = Listing.find_by_id listing_id
  UserMailer.send_mail(listing)
  Class.call_example_function()
end
Two questions:
How do I make this function idempotent for the UserMailer sendmail? In other words, in case the delayed method runs twice, how do I make sure that it only sends the mail once? Would wrapping it in something like this work?
mail_sent = false
if !mail_sent
  UserMailer.send_mail(listing)
  mail_sent = true
end
I'm guessing not, since the function is tried again and mail_sent is reset to false on the second run-through. So how do I make it so that UserMailer is only run once?
Are functions called within the delayed async method also asynchronous? In other words, is Class.call_example_function() executed asynchronously (not part of the request/response cycle)? If not, should I use Class.delay.call_example_function()?
Overall, just getting familiar with Sidekiq so any thoughts would be appreciated.
Thanks
I'm coming into this late, but having been around the loop and had this StackOverflow entry appearing prominently via Google, it needs clarification.
The issue of idempotency and the issue of unique jobs are not the same thing. The 'unique' gems look at the parameters of a job at the point it is about to be processed. If they find that another job with the same parameters was submitted within some expiry time window, then the job is not actually processed.
The gems are literally what they say they are: they consider whether an enqueued job is unique or not within a certain time window. They do not interfere with the retry mechanism. In the case of the O.P.'s question, the e-mail would still get sent twice if Class.call_example_function() threw an error, causing a job retry after the previous line of code had already sent the e-mail.
Aside: The sidekiq-unique-jobs gem mentioned in another answer has not been updated for Sidekiq 3 at the time of writing. An alternative is sidekiq-middleware which does much the same thing, but has been updated.
https://github.com/krasnoukhov/sidekiq-middleware
https://github.com/mhenrixon/sidekiq-unique-jobs (as previously mentioned)
There are numerous possible solutions to the O.P.'s email problem and the correct one is something that only the O.P. can assess in the context of their application and execution environment. One would be: If the e-mail is only going to be sent once ("Congratulations, you've signed up!") then a simple flag on the User model wrapped in a transaction should do the trick. Assuming a class User accessible as an association through the Listing via listing.user, and adding in a boolean flag mail_sent to the User model (with migration), then:
listing = Listing.find_by_id(listing_id)

unless listing.user.mail_sent?
  User.transaction do
    listing.user.mail_sent = true
    listing.user.save!
    UserMailer.send_mail(listing)
  end
end

Class.call_example_function()
...so that if the user mailer throws an exception, the transaction is rolled back and the change to the user's flag setting is undone. If the "call_example_function" code throws an exception, then the job fails and will be retried later, but the user's "e-mail sent" flag was successfully saved on the first try so the e-mail won't be resent.
Regarding idempotency, you can use the https://github.com/mhenrixon/sidekiq-unique-jobs gem:
All that is required is that you specifically set the sidekiq option for unique to true, like below:

sidekiq_options unique: true

For jobs scheduled in the future it is possible to set for how long the job should be unique. The job will be unique for the number of seconds configured, or until the job has been completed.

If you want the unique job to stick around even after it has been successfully processed, then just set unique_unlock_order to anything except :before_yield or :after_yield (unique_unlock_order = :never).
I'm not sure I understand the second part of the question. When you delay a method call, the whole method call is deferred to the sidekiq process. If by 'response/request cycle' you mean that you are running a web server and call delay from there, then all the calls within use_method are made from the sidekiq process, and hence outside of that cycle. They are called synchronously relative to each other, though...
