How to run cleanup code at the end of Presto/Trino custom UDF? - user-defined-functions

I have created a custom scalar PrestoSQL/Trino UDF. Is it possible to detect the end of UDF execution (the end of a task, or after the last row of a split) and run a method at that point?
I want to run cleanup and IPC resource deallocation after the UDF finishes executing on a split (something like a close() method).

Related

Ensure code runs after a Rails `delayed job` fails or succeeds

Is there any way to ensure that certain code runs even after a delayed job fails or succeeds, just as we can write an ensure block in exception handling?
What's wrong with the following approach?
def delayed_job_method
  do_the_job
ensure
  something
end
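A minimal sketch of what the `ensure` approach above does in plain Ruby, outside Delayed Job. The method name `run_with_cleanup`, the `should_fail` flag, and the log symbols are hypothetical stand-ins for `do_the_job` and `something`; the point is only that the `ensure` block runs on both the success path and the failure path.

```ruby
# Hypothetical stand-in for delayed_job_method: records each step in `log`.
def run_with_cleanup(log, should_fail: false)
  log << :job_started
  raise ArgumentError, "job failed" if should_fail
  log << :job_finished
ensure
  # Runs whether the body returned normally or raised.
  log << :cleanup
end

ok_log = []
run_with_cleanup(ok_log)

failed_log = []
begin
  run_with_cleanup(failed_log, should_fail: true)
rescue ArgumentError
  # The exception still propagates to the caller after ensure has run.
  failed_log << :rescued_by_caller
end
```

Note that `ensure` only covers exits Ruby can observe; if the worker process is killed outright, the block does not run.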

Possible to run ActiveRecord "each" block inside EventMachine periodic timer without blocking?

I'd like to perform a batch database query for a large data set using ActiveRecord inside EventMachine. I'd like each call of the block passed to find_each to be called within an EventMachine periodic timer.
With the following, the find_each simply runs, and the add_periodic_timer block only runs once until the find_each is completely finished (i.e. the periodic timer block doesn't run every 0.001 seconds):
EventMachine.add_periodic_timer(0.001) do
  TradingCore::Quote.by_date(@date).by_symbols(@symbols).order(:created_at).find_each do |quote|
    ...
    sleep(delay)
  end
end
Is there any way to make the find_each block execute for each record without blocking the event loop?
I think I solved this using a fiber. I did this:
quotes_fiber = Fiber.new {
  TradingCore::Quote.by_date(@date).by_symbols(@symbols).order(:created_at).find_each do |quote|
    ...
    sleep(delay)
    Fiber.yield
  end
}
EventMachine.add_periodic_timer(0.001) do
  quotes_fiber.resume
end
and things seem to work fine :)
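The fiber pattern above can be tried without EventMachine or a database. This is a sketch under assumptions: `build_record_fiber` is a hypothetical helper, a plain array stands in for `find_each`, and a simple loop stands in for the periodic timer, with each iteration playing the role of one tick.

```ruby
# Each resume processes exactly one record, then hands control back.
def build_record_fiber(records, processed)
  Fiber.new do
    records.each do |record|
      processed << record
      Fiber.yield :more   # pause here until the next tick resumes the fiber
    end
    :done                 # value returned by the final resume
  end
end

processed = []
fiber = build_record_fiber(%w[q1 q2 q3], processed)

ticks = 0
# Stand-in for the periodic timer: one resume per tick.
loop do
  ticks += 1
  break if fiber.resume == :done
end
```

Two things to watch in the EventMachine version: resuming a fiber that has already finished raises `FiberError`, so the timer likely needs to be cancelled once the work is done, and a `sleep(delay)` inside the fiber still blocks the reactor thread for that long.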

Running Plone subscriber events asynchronously

In using Plone 4, I have successfully created a subscriber event to do extra processing when a custom content type is saved. This I accomplished by using the Products.Archetypes.interfaces.IObjectInitializedEvent interface.
configure.zcml
<subscriber
    for="mycustom.product.interfaces.IRepositoryItem
         Products.Archetypes.interfaces.IObjectInitializedEvent"
    handler=".subscribers.notifyCreatedRepositoryItem"
    />
subscribers.py
def notifyCreatedRepositoryItem(repositoryitem, event):
    """
    This gets called on IObjectInitializedEvent, which occurs when a new object is created.
    """
    # my custom processing goes here; it should be asynchronous
However, the extra processing can sometimes take too long, and I was wondering if there is a way to run it in the background i.e. asynchronously.
Is it possible to run subscriber events asynchronously for example when one is saving an object?
Not out of the box. You'd need to add async support to your environment.
Take a look at plone.app.async; you'll need a ZEO environment and at least one extra instance. The latter will run async jobs you push into the queue from your site.
You can then define methods to be executed asynchronously and push tasks into the queue to execute such a method asynchronously.
Example code, push a task into the queue:
from zope.component import getUtility
from plone.app.async.interfaces import IAsyncService
async = getUtility(IAsyncService)
async.queueJob(an_async_task, someobject, arg1_value, arg2_value)
and the task itself:
def an_async_task(someobject, arg1, arg2):
    # do something with someobject
    pass
where someobject is a persistent object in your ZODB. The IAsyncService.queueJob takes at least a function and a context object, but you can add as many further arguments as you need to execute your task. The arguments must be pickleable.
The task will then be executed by an async worker instance when it can, outside of the context of the current request.
Just to give more options: you could try collective.taskqueue for this. It is simple and powerful, and it avoids some of the drawbacks of plone.app.async.
The description on PyPI has enough to get you up to speed quickly, and you can use Redis for queue management, which is a big plus.

Ruby - Error Handling - Good Practices

This is more of an opinion-oriented question about handling exceptions in nested code, such as the following:
Assume you have a class that initializes another class to run a job. The job returns a value, which is then processed by the class that initially called it.
Where would you put the exception handling and error logging? Would you define it around the initialization of the job class in the calling class, which would then handle exceptions from the job execution, or on both levels?
If the job handles its own exceptions, then you don't need to wrap the call to the job in a begin/rescue block.
But the class that initializes and runs the job could itself raise exceptions, so you should handle exceptions at that level as well.
Here is an example:
def some_job
  begin
    # a bunch of logic
  rescue
    # handle exception
    # log it
  end
end
It wouldn't make sense then to do this:
def some_manager
  begin
    some_job
  rescue
    # log
  end
end
But something like this makes more sense:
def some_manager
  begin
    # a bunch of logic
    some_job
    # some more logic
  rescue
    # handle exception
    # log
  end
end
And of course you would want to catch specific exceptions.
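As a hedged illustration of catching specific exceptions at each level, here is a small sketch. The method bodies, the `log` array, and the choice of `ArgumentError` and `ZeroDivisionError` are assumptions made for the example; the job rescues only the error it knows how to handle, and the manager rescues only the error its own logic can raise.

```ruby
# The job handles the one failure it expects: unparseable input.
def some_job(input, log)
  Integer(input)
rescue ArgumentError => e
  log << "job: #{e.message}"
  nil
end

# The manager does not re-rescue the job's errors; it only guards its own logic.
def some_manager(input, log)
  value = some_job(input, log)
  100 / value.to_i   # the manager's own logic; may raise ZeroDivisionError
rescue ZeroDivisionError => e
  log << "manager: #{e.message}"
  nil
end
```

Calling `some_manager("4", [])` returns 25, while an input of `"0"` is logged by the manager and an input of `"abc"` is logged first by the job and then, because the job returned `nil`, by the manager.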
Probably the best answer, in general, for handling Exceptions in Ruby is reading Exceptional Ruby. It may change your perspective on error handling.
Having said that, on to your specific case. When I hear "job" I hear "background process", so I'll base my answer on that.
Your job will want to report status while it's doing its thing. This could be states like "in queue", "running", "finished", but it could also be more informative (user-facing) information: "processing first 100 out of 1000 records".
So, if an error happens in your background process, my suggestion is two-fold:
Make sure you catch exceptions before you exit the job. Your background job processor might not like a random exception coming from your code. I, personally, like the idea of catching the exception and saving it to the database, for easy retrieval later. Then again, depending on your background job processor, maybe it handles error reporting for you. (I think Resque does, for example.)
On the front end, use AJAX (or something similar) to occasionally check in on how the job is doing, say every 10 seconds. In addition to getting the status of the job, also make sure you return this additional information to the user (if appropriate).
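The first suggestion can be sketched roughly as follows. Everything here is hypothetical: `run_tracked_job` is an illustrative name, and a plain hash stands in for the database table that a front end would poll for status.

```ruby
# `store` is a hypothetical status table keyed by job id (a Hash here,
# a database table in a real application).
def run_tracked_job(store, job_id, records)
  store[job_id] = { state: "running", progress: "0 of #{records.size}" }
  records.each_with_index do |record, index|
    yield record
    store[job_id][:progress] = "#{index + 1} of #{records.size}"
  end
  store[job_id][:state] = "finished"
rescue StandardError => e
  # Record the failure instead of letting it escape to the job processor.
  store[job_id][:state] = "failed"
  store[job_id][:error] = "#{e.class}: #{e.message}"
end
```

A status endpoint polled by the front end could then return `store[job_id]` as is, which gives the user both the state and the progress message.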

Program structure regarding NSTask

I want to run a number of NSTasks that is unknown at compile time, with some number of them (also unknown at compile time, at most 8) running simultaneously. So basically I loop through a list of files, generate an NSTask for each, and launch tasks until the maximum number of simultaneous tasks is running; whenever one finishes, another NSTask starts, until all of them are done.
My approach would be to create a class that generates an NSTask and subclass it to change parameters here and there when there's a different input (changes that are made from the interface). The superclass will then run the NSTask and will have an @synthesize method returning its progress. Those objects will be generated in the loop described above, and the progress will be displayed.
Is this a good way to go? If so, can someone give me a quick example of what the repeat loop would look like? I don't know how I would reference all the objects once they're running.
for (; !done ;) {
    if (maxValue >= currentValue) {
        // Run object with next file.
        // Set currentValue.
    }
    // Display progress, set done to YES if needed, and decrement currentValue if needed.
}
Thanks in advance.
There's no loop exactly.
Create an array for tasks not yet started, another with tasks that are running, and another with tasks that have finished. Have a method that pulls one task from the pending-tasks array, starts (launches) it, and adds it to the running-tasks array. After creating the arrays and filling out the pending-tasks array, call that method eight times.
When a task finishes, remove the task from the running-tasks array and add it to the finished-tasks array, then check whether there are any tasks yet to run. If there's at least one, call the run-another-one method again. Otherwise, check whether there are any still running: If not, all tasks have finished, and you can assemble the results now (if you haven't been displaying them live).
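The bookkeeping described above is independent of NSTask itself, so here is a language-neutral sketch of it written in Ruby rather than Objective-C. `TaskPool` and its method names are hypothetical; tasks are plain values, and `task_finished` stands in for whatever callback fires on completion (with NSTask that would presumably be its termination handler or termination notification).

```ruby
class TaskPool
  attr_reader :pending, :running, :finished

  def initialize(tasks, max_running)
    @pending = tasks.dup
    @running = []
    @finished = []
    @max_running = max_running
  end

  # Fill the pool up to the limit (the "call that method eight times" step).
  def start
    @max_running.times { launch_next }
  end

  # Pull one task from pending and move it to running; no-op when none are left.
  def launch_next
    task = @pending.shift
    return if task.nil?
    @running << task   # a real implementation would launch the task here
    task
  end

  # Completion callback: move the task to finished, then start another if any remain.
  def task_finished(task)
    @running.delete(task)
    @finished << task
    launch_next
  end

  def all_done?
    @pending.empty? && @running.empty?
  end
end
```

With five tasks and a limit of two, `start` puts the first two in `running`; each call to `task_finished` then pulls in the next pending task until `all_done?` becomes true.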
