How to run `run_in_executor` in different threads for asyncio? - python-asyncio

So, let us say, we have a sync method as shown below:
def sync_method(param1, param2):
    # Complex method logic
    return "Completed"
I want to run the above method from an async method, using run_in_executor on the current event loop. For example:
async def run_sync_in_executor(param1, param2, pool=None):
    loop = asyncio.get_event_loop()
    value = loop.run_in_executor(pool, sync_method, param1, param2)
    # Some further changes to the variable `value`
    return value
Now I want to run the above method while looping through a list of params, and eventually modify the final output.
One approach, which I thought would work but doesn't, is using asyncio.gather:
def main():
    params_list = [[1, 2], [2, 3], [3, 4], [4, 5], [5, 6], [6, 7], [7, 8], [8, 9], [9, 10]]
    output = await asyncio.gather(*[run_sync_in_executor(v[0], v[1]) for v in params_list])
As I understood from the docs, the reason this didn't work is that run_sync_in_executor tries to access the current event loop, which is shared by all the executions started by gather. Since there can be only a single thread per event loop, and because of the way gather works, the next call tries to access the event loop before the first one has finished, which causes an error.
As a solution, I thought of using ThreadPoolExecutor, which presumably creates as many threads as the num_workers argument specifies, so that the pool can be used by each call. I was expecting something like this:
with ThreadPoolExecutor(num_workers=8) as executor:
    for param in params_list:
        future = executor.submit(run_sync_in_executor, param[0], param[1], executor)
        print(future.result())
But the above method doesn't work.
It would be great if someone could suggest the best way to achieve this.

There are several mistakes in your code: you did not await run_in_executor, and main should be an async function. Working solution:
import asyncio
import time
def sync_method(param1, param2):
    """Some sync function"""
    time.sleep(5)
    return param1 + param2 + 10000
async def ticker():
    """Just to show that the sync method does not block the event loop"""
    while True:
        await asyncio.sleep(1)
        print("Working...")
async def run_sync_in_executor(param1, param2, pool=None):
    """Wrapper around run_in_executor"""
    loop = asyncio.get_running_loop()
    # run_in_executor must be awaited; otherwise you only get
    # the Future it returns (not its result!)
    value = await loop.run_in_executor(pool, sync_method, param1, param2)
    return value
async def amain():
    """Main should be an async function!"""
    params_list = [[1, 2], [2, 3], [3, 4], [4, 5], [5, 6], [6, 7], [7, 8], [8, 9], [9, 10]]
    ticker_task = asyncio.create_task(ticker())  # runs concurrently, never awaited!
    output = await asyncio.gather(*[run_sync_in_executor(v[0], v[1]) for v in params_list])
    print(output)
if __name__ == '__main__':
    asyncio.run(amain())

Related

Reassign entire array to the same reference

I've searched extensively but sadly couldn't find a solution to this surely often-asked question.
In Perl I can reassign an entire array within a function and have my changes reflected outside the function:
#!/usr/bin/perl -w
use v5.20;
use Data::Dumper;
sub foo {
    my ($ref) = @_;
    @$ref = (3, 4, 5);
}
my $ref = [1, 2];
foo($ref);
say Dumper $ref; # prints [3, 4, 5]
Now I'm trying to learn Ruby and have written a function where I'd like to change an array items in-place by filtering out elements matching a condition and returning the removed items:
def filterItems(items)
  removed, items = items.partition { ... }
After running the function, items returns to its state before calling the function. How should I approach this please?
I'd like to change an array items in-place by filtering out elements matching a condition and returning the removed items [...] How should I approach this please?
You could replace the array content within your method:
def filter_items(items)
  removed, kept = items.partition { |i| i.odd? }
  items.replace(kept)
  removed
end
ary = [1, 2, 3, 4, 5]
filter_items(ary)
#=> [1, 3, 5]
ary
#=> [2, 4]
I would search for pass by value/reference in Ruby. Here is the first one I found: https://mixandgo.com/learn/is-ruby-pass-by-reference-or-pass-by-value.
You pass the reference value of items to the method, not a reference to the variable items. The variable items is defined outside the method's scope and always refers to the same object unless you reassign it in that outer scope.
Also, filterItems is not Ruby style; see https://rubystyle.guide/
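A minimal sketch of the distinction described above. The helper names `reassign` and `mutate` are made up for illustration and are not from the original post. Rebinding the parameter leaves the caller's array alone, while mutating the object it refers to is visible outside the method:

```ruby
# Rebinding the parameter only changes what the local name points to.
def reassign(items)
  items = [3, 4, 5]
  items
end

# Array#replace swaps the contents of the very object the caller holds.
def mutate(items)
  items.replace([3, 4, 5])
end

ary = [1, 2]
reassign(ary)
after_reassign = ary.dup # caller's array is unchanged
mutate(ary)
after_mutate = ary.dup   # caller's array now has the new contents
```

This is why `removed, items = items.partition { ... }` inside a method has no effect on the caller: it is the `reassign` case, not the `mutate` case.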
TL;DR
To access or modify an outer variable within a block, declare the variable outside the block. To access a variable outside of a method, store it in an instance or class variable. There's a lot more to it than that, but this covers the use case in your original post.
Explanation and Examples
In Ruby, you have scope gates and closures. In particular, methods and blocks represent scope gates, but there are certainly ways (both routine and meta) for accessing variables outside of your local scope.
In a class, this is usually handled by instance variables. So, as a simple example of String#partition (because it's easier to explain than Enumerable#partition on an Array):
def filter items, separator
  head, sep, tail = items.partition separator
  @items = tail
end
filter "foobarbaz", "bar"
#=> "baz"
@items
#=> "baz"
Inside a class or within irb, this will modify whatever's passed and then assign it to the instance variable outside the method.
Partitioning Arrays Instead of Strings
If you really don't want to pass things as arguments, or if @items should be an Array, then you can certainly do that too. However, Arrays behave differently, so I'm not sure what you really expect Array#partition (which is inherited from Enumerable) to yield. This works, using Enumerable#slice_after:
class Filter
  def initialize
    @items = []
  end
  def filter_array items, separator
    @items = items.slice_after { |i| i == separator }.to_a.pop
  end
end
f = Filter.new
f.filter_array [3, 4, 5], 4
#=> [5]
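For the Array case from the question, the instance-variable idea could look roughly like the sketch below. The class name `ItemFilter` and the method `remove!` are hypothetical, chosen only for illustration. It uses Enumerable#partition to split the stored array and returns the removed elements:

```ruby
# Hypothetical wrapper that owns the array in an instance variable.
class ItemFilter
  attr_reader :items

  def initialize(items)
    @items = items
  end

  # Removes every element for which the block is truthy and returns them;
  # the elements that remain are stored back in @items.
  def remove!
    removed, @items = @items.partition { |i| yield i }
    removed
  end
end

filter = ItemFilter.new([1, 2, 3, 4, 5])
removed = filter.remove!(&:odd?)
kept = filter.items
```

Note that this rebinds `@items` to a new array, so an array the caller passed in is not itself changed; use `Array#replace` if the original object must be updated.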
Look into the Array class for any method which mutates the object, for example all the methods with a bang or methods that insert elements.
Here is an Array#push:
ary = [1,2,3,4,5]
def foo(ary)
  ary.push(*[6, 7])
end
foo(ary)
ary
#=> [1, 2, 3, 4, 5, 6, 7]
Here is an Array#insert:
ary = [1,2,3,4,5]
def baz(ary)
  ary.insert(2, 10, 20)
end
baz(ary)
ary
#=> [1, 2, 10, 20, 3, 4, 5]
Here is an example with a bang Array#reject!:
ary = [1,2,3,4,5]
def zoo(ary)
  ary.reject!(&:even?)
end
zoo(ary)
ary
#=> [1, 3, 5]
Another with a bang Array#map!:
ary = [1,2,3,4,5]
def bar(ary)
  ary.map! { |e| e**2 }
end
bar(ary)
ary
#=> [1, 4, 9, 16, 25]

Use ruby to compress a stream of time series data

I have a stream that I would like to read from a sensor. The stream never ends. Most of the time the values repeat over time. So I would like to identify runs of values and just keep the first and last of each run, and keep their timestamps too.
Here is an example of 10 minutes of data:
[['8:00', 4],['8:01', 4],['8:02', 4],['8:03', 7],['8:04', 7],['8:05', 8],['8:06', 9],['8:07', 13],['8:08', 13],['8:09', 13]].lazy
I want to compress this data to this:
[['8:00', 4],['8:02', 4],['8:03', 7],['8:04', 7],['8:05', 8],['8:06', 9],['8:07', 13],['8:09', 13]]
I've been trying to accomplish this through enumerable functions such as chunk, each_cons, each_with_object. This problem, though, seems inherently functional. Can I accomplish this using lazy enumerator in ruby?
data.reduce([data.first]) do |result, item|
  result.last.last == item.last ? result : result + [item]
end
This doesn't produce exactly your desired output - it skips the last item of the run. But the good news is you don't need the last item, because you know its value is the same as your first item, and you know its timestamp is one less than the next item. (If your timestamps aren't consecutive, then this is no good). If the last entry also isn't at Time.now, the simplest thing to do is just manually tack it on at the end.
What it does:
Initializes the result with the first value. This is simply to avoid a nil case at the beginning.
For each item in data
If the value in item.last is the same as the last entry currently in result, do nothing
If the value in item.last is different, append it to result
I've written it so that each iteration produces a new result array with result + [item], which is the functional style and preferred way to use reduce, but that produces a lot of unnecessary intermediate arrays. You can create just one new array by actually appending (<<) instead.
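The appending variant mentioned above might look like this sketch, which uses `each_with_object` so that a single result array is built with `<<`. Like the `reduce` version, it keeps only the first sample of each run:

```ruby
data = [['8:00', 4], ['8:01', 4], ['8:02', 4], ['8:03', 7], ['8:04', 7],
        ['8:05', 8], ['8:06', 9], ['8:07', 13], ['8:08', 13], ['8:09', 13]]

# Append in place with << so no intermediate arrays are created.
first_of_each_run = data.each_with_object([]) do |item, result|
  # Keep the item when it starts a new run (or is the very first sample).
  result << item if result.empty? || result.last.last != item.last
end
```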
I am posting a solution to my own question. I started with Kristján's solution which used reduce. Note that my solution fails to produce the final sample time, but I am choosing to accept this behavior because my example was just meant to be a simulated stream. So that 8:09 sample is not meant to be the final value. The next incoming sample will determine whether that 8:09 value gets stored. So that detail of my original post could have been better explained.
samples = [['8:00', 4],['8:01', 4],['8:02', 4],['8:03', 7],['8:04', 7],['8:05', 8],['8:06', 9],['8:07', 13],['8:08', 13],['8:09', 13]].lazy
prev = []
compressed = samples.reduce([samples.first]) do |keepers, sample|
  keepers << prev << sample if keepers.last.last != sample.last
  prev = sample
  keepers
end
puts compressed.inspect
# => [["8:00", 4], ["8:02", 4], ["8:03", 7], ["8:04", 7], ["8:05", 8], ["8:05", 8], ["8:06", 9], ["8:06", 9], ["8:07", 13]]
That's not an elegant solution, but it works.
data = ['8:00', 4],['8:01', 4],['8:02', 4],['8:03', 7],['8:04', 7],['8:05', 8],['8:06', 9],['8:07', 13],['8:08', 13],['8:09', 13]
def clean_array(data)
  item_to_delete = []
  (0..(data.count-3)).each do |i|
    if data[i][1].eql?(data[i+2][1])
      item_to_delete << data[i+1]
    end
  end
  data - item_to_delete
end
new_data = clean_array(data)
The output, as expected is
=> [["8:00", 4], ["8:02", 4], ["8:03", 7], ["8:04", 7], ["8:05", 8], ["8:06", 9], ["8:07", 13], ["8:09", 13]]
Edit
Another approach
data = ['8:00', 4],['8:01', 4],['8:02', 4],['8:03', 7],['8:04', 7],['8:05', 8],['8:06', 9],['8:07', 13],['8:08', 13],['8:09', 13]
new_data = []
data.each { |item| (new_data[-2] and item[1].eql?(new_data[-2][1])) ? new_data[-1] = item : new_data << item }
new_data
# => [["8:00", 4], ["8:02", 4], ["8:03", 7], ["8:04", 7], ["8:05", 8], ["8:06", 9], ["8:07", 13], ["8:09", 13]]
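As a further option, the "first and last of each run" rule from the question can be sketched with Enumerable#chunk_while (available since Ruby 2.3). The helper name `endpoints` is made up for this example. Because chunk_while and flat_map are also defined on lazy enumerators, the same chain can be applied to a lazy stream, where a run is only emitted once the next run has started:

```ruby
data = [['8:00', 4], ['8:01', 4], ['8:02', 4], ['8:03', 7], ['8:04', 7],
        ['8:05', 8], ['8:06', 9], ['8:07', 13], ['8:08', 13], ['8:09', 13]]

# Hypothetical helper: first and last sample of a run; a run of one is kept once.
def endpoints(run)
  run.size > 1 ? [run.first, run.last] : run
end

# Group consecutive samples that share a value, then keep each run's endpoints.
compressed = data.chunk_while { |a, b| a.last == b.last }
                 .flat_map { |run| endpoints(run) }

# The same chain on a lazy enumerator consumes only as much input as needed.
first_three = data.lazy
                  .chunk_while { |a, b| a.last == b.last }
                  .flat_map { |run| endpoints(run) }
                  .first(3)
```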

Ruby print function [duplicate]

This question already has answers here:
Why am I getting objects printed twice?
(4 answers)
Closed 6 years ago.
# Call the each method of each collection in turn.
# This is not a parallel iteration and does not require enumerators.
def sequence(*enumerables, &block)
  enumerables.each do |enumerable|
    enumerable.each(&block)
  end
end
# Examples of how these iterator methods work
a,b,c = [1,2,3],4..6,'a'..'e'
print "#{sequence(a,b,c) {|x| print x}}\n"
Why is the result:
123456abcde[[1, 2, 3], 4..6, "a".."e"]
Could anyone tell me why [[1, 2, 3], 4..6, "a".."e"] is getting printed?
Or why the return value of the 'sequence' method is [[1, 2, 3], 4..6, "a".."e"]?
Many thanks
sequence(a,b,c) { |x| print x }
prints 123456abcde and
print "#{some_code}\n"
will print the return value of some_code. In your example the outer each loop returns [[1, 2, 3], 4..6, "a".."e"], because the return value of each is self (see: http://apidock.com/ruby/v1_9_3_392/Enumerator/each)
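A small sketch of that behaviour. The method name `sequence_quiet` is hypothetical and not part of the original code. It shows that `each` hands back its receiver, and that returning nil explicitly makes the interpolation add nothing:

```ruby
receiver = [1, 2, 3]
returned = receiver.each { |x| x * 2 }   # block results are discarded
same_object = returned.equal?(receiver)  # each returned the receiver itself

# Hypothetical variant of sequence that returns nil instead of the array.
def sequence_quiet(*enumerables, &block)
  enumerables.each { |enumerable| enumerable.each(&block) }
  nil
end

seen = []
interpolated = "#{sequence_quiet([1, 2], 3..4) { |x| seen << x }}"
```

Calling `sequence(a, b, c) { |x| print x }` on its own line, without wrapping it in `print "#{...}"`, avoids the extra output as well.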

Ruby Regex: Get Index of Capture

I've seen this question asked and answered for javascript regex, and the answer was long and very ugly. Curious if anyone has a cleaner way to implement in ruby.
Here's what I'm trying to achieve:
Test String: "foo bar baz"
Regex: /.*(foo).*(bar).*/
Expected Return: [[0,2],[4,6]]
So my goal is to be able to run a method, passing in the test string and regex, that will return the indices where each capture group matched. I have included both the starting and ending indices of the capture groups in the expected return. I'll be working on this and adding my own potential solutions here along the way too. And of course, if there's a way other than regex that would be cleaner/easier to achieve this, that's a good answer too.
Something like this should work for a general amount of matches.
def match_indexes(string, regex)
  matches = string.match(regex)
  (1...matches.length).map do |index|
    [matches.begin(index), matches.end(index) - 1]
  end
end
string = "foo bar baz"
match_indexes(string, /.*(foo).*/)
match_indexes(string, /.*(foo).*(bar).*/)
match_indexes(string, /.*(foo).*(bar).*(baz).*/)
# => [[0, 2]]
# => [[0, 2], [4, 6]]
# => [[0, 2], [4, 6], [8, 10]]
You can have a look at the (kind of strange) MatchData class for how this works. http://www.ruby-doc.org/core-1.9.3/MatchData.html
m = "foo bar baz".match(/.*(foo).*(bar).*/)
[1, 2].map{|i| [m.begin(i), m.end(i) - 1]}
# => [[0, 2], [4, 6]]
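A slightly more defensive sketch of the same idea, assuming you want an empty array when the regex does not match at all and nil for a group that did not take part in the match. The method name `capture_ranges` is made up for this example. It relies on MatchData#offset, which returns the start and the exclusive end of a group:

```ruby
def capture_ranges(string, regex)
  m = string.match(regex)
  return [] unless m                 # no match at all

  (1...m.size).map do |i|
    start, stop = m.offset(i)        # stop is exclusive
    start && [start, stop - 1]       # nil when the group did not participate
  end
end

both    = capture_ranges("foo bar baz", /.*(foo).*(bar).*/)
none    = capture_ranges("xyz", /(foo)/)
partial = capture_ranges("foo", /(foo)|(bar)/)
```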

Elegantly implementing 'map (+1) list' in ruby

The short code in the title is Haskell; it does the same thing as
list.map {|x| x + 1}
in Ruby.
I know that way, but what I want to know is whether there is a more elegant way to do the same thing in Ruby, as in Haskell.
I really love the to_proc shortcut in ruby, like this form:
[1,2,3,4].map(&:to_s)
[1,2,3,4].inject(&:+)
But this only works when the number of arguments the Proc takes exactly matches the method's.
I'm looking for a way to pass one or more extra arguments into the Proc without using a useless temporary block/variable, as the first example does.
I want to do something like this:
[1,2,3,4].map(&:+(1))
Does Ruby have a similar way to do this?
If you just want to add one then you can use the succ method:
>> [1,2,3,4].map(&:succ)
=> [2, 3, 4, 5]
If you wanted to add two, you could use a lambda:
>> add_2 = ->(i) { i + 2 }
>> [1,2,3,4].map(&add_2)
=> [3, 4, 5, 6]
For arbitrary values, you could use a lambda that builds lambdas:
>> add_n = ->(n) { ->(i) { i + n } }
>> [1,2,3,4].map(&add_n[3])
=> [4, 5, 6, 7]
You could also use a lambda generating method:
>> def add_n(n) ->(i) { i + n } end
>> [1,2,3,4].map(&add_n(3))
=> [4, 5, 6, 7]
Use the ampex gem, which lets you use methods of X to build up any proc of one variable. Here’s an example from its spec:
["a", "b", "c"].map(&X * 2).should == ["aa", "bb", "cc"]
You can't do it directly with the default map. However it's quite easy to implement a version that supports this type of functionality. As an example Ruby Facets includes just such a method:
require 'facets/enumerable'
[1, 2, 3, 4].map_send(:+, 10)
=> [11, 12, 13, 14]
The implementation looks like this:
def map_send(meth, *args, &block)
  map { |e| e.send(meth, *args, &block) }
end
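If you would rather not depend on Facets or reopen Enumerable, the same idea can be sketched as a standalone helper. This is an illustrative variant rather than the Facets implementation: it takes the collection as its first argument and uses `public_send` so that private methods are not called by accident:

```ruby
# Standalone take on map_send: calls meth with args on every element.
def map_send(enumerable, meth, *args, &block)
  enumerable.map { |e| e.public_send(meth, *args, &block) }
end

plus_ten = map_send([1, 2, 3, 4], :+, 10)
padded   = map_send(%w[a bb], :ljust, 3, ".")
```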
In this particular case, you can use the following:
[1, 2, 3, 4].map(&1.method(:+))
However, this only works because + is commutative: 1.method(:+) computes 1 + x rather than x + 1. It wouldn't work for -, for example.
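A quick sketch of why the operand order matters here: the bound method object has the literal as its receiver, so each element becomes the right-hand operand.

```ruby
# 1.method(:+) is bound to 1, so each element x is evaluated as 1 + x.
plus_one = [1, 2, 3, 4].map(&1.method(:+))

# 10.method(:-) evaluates 10 - x, which is not the same as x - 10.
ten_minus = [1, 2, 3, 4].map(&10.method(:-))
minus_ten = [1, 2, 3, 4].map { |x| x - 10 }
```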
Ruby has no built-in support for this feature, but you can create your own extension or use the small gem 'ampex'. It defines a global variable X with extended 'to_proc' functionality.
It lets you do this:
[1,2,3].map(&X.+(1))
Or even that:
"alpha\nbeta\ngamma\n".lines.map(&X.strip.upcase)
If you just want to add 1, you can use next or succ:
[1,2,3,4].map(&:next)
[1,2,3,4].map(&:succ)
