How do you view the details of a submitted task with IPython Parallel?

I'm submitting tasks using a Load Balanced View.
I would like to be able to connect from a different client and view the remaining tasks by the function and parameters that were submitted.
def someFunc(parm1, parm2):
    return parm1 + parm2
lbv = client.load_balanced_view()
async_results = []
for parm1 in [0,1,2]:
    for parm2 in [0,1,2]:
        ar = lbv.apply_async(someFunc, parm1, parm2)
        async_results.append(ar)
From the client I submitted this from I can figure out which result went with which function call based on their order in the async_results array.
What I would like to know is this: how can I figure out the function and parameters associated with a msg_id when I retrieve the results from a different client, using the queue_status or history commands to get the msg_ids and client.get_result to retrieve the results?

These things are pickled, and stored in the 'buffers' in the hub's database. If you want to look at them, you have to fetch those buffers from the database, and unpack them.
Assuming you have a list of msg_ids, here is a way that you can reconstruct the f, args, and kwargs for all of those requests:
# msg_ids is a list of msg_id, however you decide to get that
from IPython.zmq.serialize import unpack_apply_message
# load the buffers from the hub's database:
query = rc.db_query({'msg_id' : {'$in' : msg_ids } }, keys=['msg_id', 'buffers'])
# query is now a list of dicts with two keys - msg_id and buffers
# now we can generate a dict by msg_id of the original function, args, and kwargs:
requests = {}
for q in query:
    msg_id = q['msg_id']
    f, args, kwargs = unpack_apply_message(q['buffers'])
    requests[msg_id] = (f, args, kwargs)
From this, you should be able to associate tasks based on their function and args.
One caveat: since f has been through pickling, the comparison f is original_f will often be False, so you have to do looser comparisons, such as comparing f.__module__ + f.__name__ or similar.
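The caveat above can be illustrated without a cluster. This is only a sketch: it rebuilds a function from its code object with types.FunctionType, which is an assumption standing in for whatever the serializer actually does, and func_key is a hypothetical helper name.

```python
import types

def some_func(parm1, parm2):
    return parm1 + parm2

# Rebuild a new function object from the original's code object.
# This only imitates a function that was shipped by value and reconstructed.
rebuilt = types.FunctionType(some_func.__code__, globals(), some_func.__name__)

def func_key(f):
    """Hypothetical helper: a loose identity based on module and name."""
    return (f.__module__, f.__name__)

print(rebuilt is some_func)                       # identity comparison fails
print(func_key(rebuilt) == func_key(some_func))   # loose comparison matches
```

Matching on module and name alone will treat two different functions with the same name in the same module as equal, so it may be worth comparing the arguments as well.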
For a bit more detail, here is an example that generates some requests, then reconstructs and associates them based on the function and arguments, given some prior knowledge of what the original requests may have looked like.


how to iterate over a list of values returning from ops to jobs in dagster

I am new to the Dagster world and working on the ops and jobs concepts.
My requirement is to read a list of data from config_schema, pass it to an @op function, and return the same list to the job.
The code is shown below:
def read_tableNames(context):
return lst
def write_db():
When the @op function receives the list, it shows up as a frozenlist type, but when I try to return it to the job, it is converted into the <class 'dagster._core.definitions.composition.InvokedNodeOutputHandle'> data type.
My requirement is to fetch the list of data, iterate over it, and perform some operations on the individual items of the list using @ops.
Please help to understand this
Thanks in advance !!!
When using ops / graphs / jobs in Dagster, it's very important to understand that the code defined within a @graph or @job definition is only executed when your code is loaded by Dagster, NOT when the graph is actually executing. The code within a @graph or @job definition is essentially a compilation step that only serves to define the dependencies between ops; there shouldn't be any general-purpose Python code within those definitions. Whatever operations you want to perform on the data flowing through your job should take place within the @op definitions. So if you wanted to print the values of the list that is input via a config schema, you might do something like:
def read_tableNames(context):
Here's an example using two ops to do this data flow:
def read_tableNames(context):
return lst
def print_tableNames(context, table_names):
    print('-------------->', type(table_names))
def simple_flow():
Have a look at some of the Dagster tutorials for more examples

How to parse two elements from a list to make a new one

I have this input repeated in 1850 files:
And I wanted to make a list in a way that by looking for the login I can retrieve the ID using a syntax like:
This is my desired output:
{"XXX"=>"66570", "XXX"=>"66570", "XXX"=>"66570", "XXX"=>"66570", ... }
My code is:
i2 = 1
while i2 != users_list_raw.parsed.count
  temp_user = users_list_raw.parsed[i2]
  temp_user_login = temp_user['login']
  temp_user_id = temp_user['id']
  user = {
    temp_user_login => temp_user_id
  }
  users_list << user
  i2 += 1
end
My output is:
[{"XXX":66570},{"XXX":66569},{"XXX":66568},{"XXX":66567},{"XXX":66566}, ... {}]
but this is not what I want.
What's wrong with my code?
Use hash[key] = value to add an entry to a hash. So in your case, I guess that would be users_list[temp_user_login] = temp_user_id, with users_list initialized as a hash ({}) rather than an array.
But I'm unsure why you'd want to do that. I think you could look up the id of a user by having the login with a statement like:
login = XXX
user = users_list_raw.parsed.select { |user| user["login"] == login }.first
id = user["id"]
and maybe put that in a function get_id(login) which takes the login as its parameter?
Also, you might want to look into databases if you're going to manipulate large amounts of data like this. ORMs (Object Relational Mappers) are available in Ruby such as Data Mapper and Active Record (which comes bundled with Rails), they allow you to "model" the data and create Ruby objects from data stored in a database, without writing SQL queries manually.
If your goal is to lookup users_list[XXX] then a Hash would work well. We can construct that quite simply:
users_list = users_list_raw.parsed.each.with_object({}) do |user, list|
  list[user['login']] = user['id']
end
Any time you find yourself writing a while loop in Ruby, there might be a more idiomatic solution.
If you want to keep track of a mapping from keys to values, the best data structure is a hash. Be aware that assignment via the array operator will replace existing values in the hash.
login_to_id = {}
Dir.glob("*.txt") { |filename| # Use Dir.glob to find all files that you want to process
  data = eval(File.read(filename)) # Your data seems to be Ruby encoded hash/arrays. Eval is unsafe, I hope you know what you are doing.
  data.each { |hash|
    login_to_id[hash["login"]] = hash["id"]
  }
}
puts login_to_id["XXX"] # => 66939

python multiprocessing - process results as they are returned

I am using python multiprocessing library to hash a bunch of files asynchronously (among other things) like so:
mt_pool = multiprocessing.Pool(MP_SIZE)
pool_handler = mt_pool.starmap_async(hash_job, [(root, entry.path) for entry in scantree(root)])
The hash_job function returns a dictionary (file, result, status).
I would like to begin processing the results as they come in, something similar to:
good_results = []
if status == 0:
logging.error("file {0} returned error: {1}".format(job_result['file'], job_result['file']))
I am assuming I would need to place the above code inside a while block of some sort.
Can anyone help?
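One possible approach, sketched under assumptions: imap_unordered yields each result as soon as it is finished, so a plain for loop can replace the while block. The sketch uses multiprocessing.pool.ThreadPool so that it runs without a __main__ guard; multiprocessing.Pool has the same imap_unordered method. The hash_job body, the hash_job_star wrapper, the sample jobs, and the convention that status 0 means success are all made up for illustration.

```python
import hashlib
import logging
from multiprocessing.pool import ThreadPool

def hash_job(root, path):
    """Hypothetical stand-in for the real hash_job: hashes the path string."""
    if not path:
        return {"file": path, "result": "empty path", "status": 1}
    digest = hashlib.sha256(f"{root}/{path}".encode()).hexdigest()
    return {"file": path, "result": digest, "status": 0}

def hash_job_star(args):
    # imap_unordered passes a single argument, so unpack the tuple here.
    return hash_job(*args)

jobs = [("/data", "a.txt"), ("/data", ""), ("/data", "b.txt")]
good_results = []

with ThreadPool(2) as pool:
    # Results arrive in completion order, not submission order.
    for job_result in pool.imap_unordered(hash_job_star, jobs):
        if job_result["status"] == 0:  # assumption: 0 means success
            good_results.append(job_result)
        else:
            logging.error("file %r returned error: %s",
                          job_result["file"], job_result["result"])
```

If the original submission order matters, imap has the same interface but yields results in order, at the cost of waiting on slow jobs.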

How do I create a compound multi-index in rethinkdb?

I am using Rethinkdb 1.10.1 with the official python driver. I have a table of tagged things which are associated to one user:
{
    "id": "PK",
    "user_id": "USER_PK",
    "tags": ["list", "of", "strings"],
    // Other fields...
}
I want to query by user_id and tag (say, to find all the things by user "tawmas" with tag "tag"). Starting with Rethinkdb 1.10 I can create a multi-index like this:
r.table('things').index_create('tags', multi=True).run(conn)
My query would then be:
res = (r.table('things')
.get_all('TAG', index='tags')
.filter(r.row['user_id'] == 'USER_PK').run(conn))
However, this query still needs to scan all the documents with the given tag, so I would like to create a compound index based on the user_id and tags fields. Such an index would allow me to query with:
res = r.table('things').get_all(['USER_PK', 'TAG'], index='user_tags').run(conn)
There is nothing in the documentation about compound multi-indexes. However, I
tried to use a custom index function combining the requirements for compound
indexes and multi-indexes by returning a list of ["USER_PK", "tag"] pairs.
My first attempt was in python:
r.table('things').index_create(
    'user_tags',
    lambda each: [[each['user_id'], tag] for tag in each['tags']],
    multi=True).run(conn)
This makes the python driver choke with a MemoryError trying to parse the index function (I guess list comprehensions aren't really supported by the driver).
So, I turned to my (admittedly, rusty) javascript and came up with this:
"""(function (each) {
    var result = [];
    var user_id = each["user_id"];
    var tags = each["tags"];
    for (var i = 0; i < tags.length; i++) {
        result.push([user_id, tags[i]]);
    }
    return result;
})"""
This is rejected by the server with a curious exception: rethinkdb.errors.RqlRuntimeError: Could not prove function deterministic. Index functions must be deterministic.
So, what is the correct way to define a compound multi-index? Or is it something
which is not supported at this time?
Short answer:
List comprehensions don't work in ReQL functions. You need to use map instead like so:
r.table('things').index_create(
    'user_tags',
    lambda each: each["tags"].map(lambda tag: [each['user_id'], tag]),
    multi=True).run(conn)
Long answer:
This is actually a somewhat subtle aspect of how RethinkDB drivers work. So the reason this doesn't work is that your python code doesn't actually see real copies of the each document. So in the expression:
lambda each: [[each['user_id'], tag] for tag in each['tags']]
each isn't ever bound to an actual document from your database, it's bound to a special python variable which represents the document. I'd actually try running the following just to demonstrate it:
q = r.table('things').index_create(
lambda each: print(each)) #only works in python 3
And it will print out something like:
<RqlQuery instance: var_1 >
The driver only knows that this is a variable from the function; in particular, it has no idea whether each["tags"] is an array or not (it's actually just another, very similar abstract object). So Python doesn't know how to iterate over that field. Basically, exactly the same problem exists in JavaScript.
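The behaviour described above can be imitated in plain Python. This is only a sketch: Term is a hypothetical stand-in, not the real driver class, and the endless iteration it shows is just one plausible explanation for the MemoryError in the question.

```python
from itertools import islice

class Term:
    """Hypothetical stand-in for a driver's abstract query term."""
    def __init__(self, expr):
        self.expr = expr

    def __getitem__(self, key):
        # Indexing never touches real data; it only builds a larger expression.
        return Term(f"{self.expr}[{key!r}]")

    def __repr__(self):
        return f"<Term {self.expr}>"

each = Term("var_1")
tags = each["tags"]

# With no __iter__ defined, Python falls back to calling __getitem__(0),
# __getitem__(1), ... until IndexError. A Term never raises, so a list
# comprehension over it would never finish; islice stops after three items.
first_three = list(islice(iter(tags), 3))
print(first_three)
```

A method like map avoids the problem because the loop is described to the server as part of the query instead of being run by Python on the client.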

Django models are not ajax serializable

I have a simple view that I'm using to experiment with AJAX.
def get_shifts_for_day(request, year, month, day):
    data = dict()
    data['d'] = year
    data['e'] = month
    data['x'] = User.objects.all()[2]
    return HttpResponse(simplejson.dumps(data), mimetype='application/javascript')
This returns the following:
TypeError at /sched/shifts/2009/11/9/
<User: someguy> is not JSON serializable
If I take out the data['x'] line so that I'm not referencing any models it works and returns this:
{"e": "11", "d": "2009"}
Why can't simplejson serialize one of the default Django models? I get the same behavior with any model I use.
You just need to add, in your .dumps call, a default=encode_myway argument to let simplejson know what to do when you pass it data whose types it does not know -- the answer to your "why" question is of course that you haven't told poor simplejson what to DO with one of your models' instances.
And of course you need to write encode_myway to provide JSON-encodable data, e.g.:
def encode_myway(obj):
    if isinstance(obj, User):
        return [obj.username,
                # and/or whatever else
                ]
    elif isinstance(obj, OtherModel):
        return []  # whatever
    # elif ...: handle any further types here
    raise TypeError(repr(obj) + " is not JSON serializable")
Basically, JSON knows about VERY elementary data types (strings, ints and floats, grouped into dicts and lists) -- it's YOUR responsibility as an application programmer to match everything else into/from such elementary data types, and in simplejson that's typically done through a function passed to default= at dump or dumps time.
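As a self-contained sketch of the default= hook, here is the same idea using the standard library's json module, which accepts the same argument. The User class is a hypothetical stand-in for a Django model, and the chosen fields are only an example.

```python
import json

class User:
    """Hypothetical stand-in for a Django model instance."""
    def __init__(self, username, email):
        self.username = username
        self.email = email

def encode_myway(obj):
    # Called only for objects the encoder cannot serialize by itself.
    if isinstance(obj, User):
        return {"username": obj.username, "email": obj.email}
    raise TypeError(repr(obj) + " is not JSON serializable")

data = {"d": "2009", "e": "11", "x": User("someguy", "someguy@example.com")}
payload = json.dumps(data, default=encode_myway, sort_keys=True)
print(payload)
```

Raising TypeError for unknown types keeps the usual error for anything the function does not handle.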
Alternatively, you can use the json serializer that's part of Django, see the docs.
