Subclassing pandas.Index

I would like to write a subclass of pandas.core.index.Index. I'm following the guide to subclassing ndarrays which can be found in the numpy documentation. Here's my code:
import numpy as np
import pandas as pd

class InfoIndex(pd.core.index.Index):

    def __new__(subtype, data, info=None):
        # Create the ndarray instance of our type, given the usual
        # ndarray input arguments. This will call the standard
        # ndarray constructor, but return an object of our type.
        # It also triggers a call to InfoArray.__array_finalize__
        obj = pd.core.index.Index.__new__(subtype, data)
        # set the new 'info' attribute to the value passed
        obj.info = info
        # Finally, we must return the newly created object:
        return obj
However, it doesn't work; I only get an Index object:
In [2]: I = InfoIndex((3,))
In [3]: I
Out[3]: Int64Index([3])
What am I doing wrong?

The Index constructor tries to be clever when the inputs are special (all ints or datetimes, for example) and skips the call to view at the end. So you need to put that in explicitly:
In [150]: class InfoIndex(pd.Index):
   .....:     def __new__(cls, data, info=None):
   .....:         obj = pd.Index.__new__(cls, data)
   .....:         obj.info = info
   .....:         obj = obj.view(cls)
   .....:         return obj
   .....:
In [151]: I = InfoIndex((3,))
In [152]: I
Out[152]: InfoIndex([3])
Caveat emptor: be careful subclassing pandas objects as many methods will explicitly return Index as opposed to the subclass. And there are also features in sub-classes of Index that you'll lose if you're not careful.

If you implement the __array_finalize__ method, you can ensure that metadata is preserved in many operations. For some Index methods you'll need to provide implementations in your subclass. See http://docs.scipy.org/doc/numpy/user/basics.subclassing.html for a bit more help.
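A minimal sketch of that idea, assuming a pandas version old enough that Index is still an ndarray subclass (so numpy's __array_finalize__ hook actually gets called):

class InfoIndex(pd.Index):

    def __new__(cls, data, info=None):
        obj = pd.Index.__new__(cls, data).view(cls)
        obj.info = info
        return obj

    def __array_finalize__(self, obj):
        # Called by numpy when views or slices are created;
        # copy the metadata across so it survives those operations.
        self.info = getattr(obj, 'info', None)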

To expand on the previous answers: you can also preserve most Index methods if you use the _constructor property and set _infer_as_myclass = True.
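A rough, untested sketch of what that could look like; _constructor and _infer_as_myclass are internal pandas hooks, so the exact names and behaviour depend on your pandas version:

class InfoIndex(pd.Index):
    # ask pandas to re-infer this class for the results of index operations
    _infer_as_myclass = True

    def __new__(cls, data, info=None):
        obj = pd.Index.__new__(cls, data).view(cls)
        obj.info = info
        return obj

    @property
    def _constructor(self):
        # used internally by pandas when it builds a new index from results
        return InfoIndex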

Why isn't Python NewType compatible with isinstance and type?

This doesn't seem to work:
from typing import NewType

MyStr = NewType("MyStr", str)
x = MyStr("Hello World")
isinstance(x, MyStr)
I don't even get False, but TypeError: isinstance() arg 2 must be a type or tuple of types, because MyStr is a function and isinstance wants one or more types.
Even assert type(x) == MyStr or type(x) is MyStr fails.
What am I doing wrong?
Cross-reference: inheritance from str or int. There is an even more detailed answer in the same question: https://stackoverflow.com/a/2673802/1091677
If you would like to subclass Python's str, you need to do it the following way:

class MyStr(str):
    # Class instance construction in Python follows a two-step call:
    # first __new__, which allocates the immutable structure,
    # then __init__, to set up the mutable part.
    # Since str in Python is immutable, it has no __init__ method.
    # All data for str must be set at __new__ execution, so instead
    # of overriding __init__, we override __new__:
    def __new__(cls, *args, **kwargs):
        return str.__new__(cls, *args, **kwargs)
Then:
x = MyStr("Hello World")
isinstance(x, MyStr)
returns True as expected
As of Python 3.10...
I'd speculate that the answer to
"Why isn't Python NewType compatible with isinstance and type?"
... is "It is a limitation of NewType".
I'd speculate that the answer to
"What am I doing wrong?"
... is "nothing". You are assuming NewType makes a new runtime type, it appears that it doesn't.
And for what it's worth, I wish it did work.
Maybe you want a type that has methods like str does, but is not a str?
A simple way to get that effect is just:
class MyStr:
    value: str

    def __init__(self, value: str):
        self.value = value
... but that means using all the string methods is "manual", e.g.

x = MyStr('fred')
x.value.startswith('fr')

... you could use @dataclass to add compare etc.
This is not a one-size-fits-all answer, but it might suit your application.
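For illustration, here is a dataclass-based version of that wrapper (my own sketch, not from the original answer); frozen=True and order=True are optional flags that make dataclasses generate equality and comparison methods:

from dataclasses import dataclass

@dataclass(frozen=True, order=True)
class MyStr:
    value: str

x = MyStr('fred')
y = MyStr('fred')
x == y                     # True: __eq__ is generated by the dataclass
x < MyStr('george')        # True: comparisons work because order=True
x.value.startswith('fr')   # string methods still go through .value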
Then make that simple class generic, like Str in (incomplete) https://github.com/urnest/urnest/blob/master/xju/newtype.py
... and you can write:

class MyStrType: pass
class MyOtherStrType: pass

class MyStr(Str[MyStrType]):
    pass

class MyOtherStr(Str[MyOtherStrType]):
    pass

x = MyStr('fred')
y = MyOtherStr('fred')
x < y  # ! mypy error: distinct types MyStr and MyOtherStr
That's what I was after, which might be what you were after? I have to provide Int,Str separately but in my experience distinct int, str, float, bool, bytes types give a lot of readability and error-rejection leverage. I will add Float, Bool, Bytes to xju.newtype soon and give them all the full set of methods.
It looks like this might have been "fixed" in Python 3.10:
https://docs.python.org/3/library/typing.html?highlight=typing#newtype
says:
Changed in version 3.10: NewType is now a class rather than a function. There is some additional runtime cost when calling NewType over a regular function. However, this cost will be reduced in 3.11.0.
I don't have 3.10 handy as I write this to try your example.

Sympy: Custom subclass of Matrix not working

In sympy, I have written the following code to create a custom subclass for the class Matrix:
from sympy import Matrix, symbols

class alsoMatrix(Matrix):

    def __init__(self, name):
        v0, v1, v2 = symbols(f'{name}[0:3]')
        super().__init__([v0, v1, v2])

my_matrix = alsoMatrix('v')
But all I get is this error:
Data type not understood; expecting list of lists or lists of values.
And yet, I did put a list of values. In fact, even if I get rid of the 'symbols' and enter [0,0,0] into the super initializing function instead, I get the exact same error. What seems to be the problem here?
I think I figured out the problem. Sympy doesn't seem to use the __init__ method to create new instances, but rather the __new__ method, the syntax for which is slightly different. So I rewrote my code as the following:
from sympy import Matrix, symbols

class alsoMatrix(Matrix):

    def __new__(cls, name):
        v0, v1, v2 = symbols(f'{name}[0:3]')
        return super(alsoMatrix, cls).__new__(cls, [v0, v1, v2])

my_matrix = alsoMatrix('v')
This seems to have done the trick.
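As a quick sanity check (exact output formatting may differ between SymPy versions):

my_matrix = alsoMatrix('v')
print(my_matrix)        # expected: a 3x1 column, Matrix([[v[0]], [v[1]], [v[2]]])
print(type(my_matrix))  # expected: <class '__main__.alsoMatrix'>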

Is it possible to customize the response to nested object attributes?

I'm trying to figure out if there's a way to return one attribute of a nested object when the attribute is addressed using the 'dot' notation, but to return different attributes of that object when subsequent attributes are requested.
ex)
from datetime import datetime

class MyAttributeClass:
    def __init__(self, value):
        self.value = value
        self.timestamp = datetime.now()

class MyOuterClass:
    def __init__(self, value):
        self._value = MyAttributeClass(value)

test = MyOuterClass(5)
test.value            # should return test._value.value
test.value.timestamp  # should return test._value.timestamp
Is there any way to accomplish this? I imagine, if there is one, it involves defining the __getattr__ method of MyOuterClass, but I've been searching around and I haven't found any way to do this. Is it just impossible? It's not a big deal if it can't be done, but I've wanted to do this many times and I'd just like to know if there's a way.
It seems obvious now, but inheritance was the answer: define attributes as instances of a custom class that inherits from the datatype I wanted for ordinary attribute access (i.e. object.attr), and give this subclass the desired attributes for subsequent attribute requests (i.e. object.attr.subattr).
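A minimal sketch of that approach; the class and attribute names below are made up for illustration, since the original post doesn't give the final code:

from datetime import datetime

class TimestampedInt(int):
    # int is immutable, so the extra attribute is attached in __new__
    def __new__(cls, value):
        obj = super().__new__(cls, value)
        obj.timestamp = datetime.now()   # available as object.attr.subattr
        return obj

class MyOuterClass:
    def __init__(self, value):
        self.value = TimestampedInt(value)

test = MyOuterClass(5)
print(test.value)            # 5 -- behaves like the plain value
print(test.value.timestamp)  # creation time -- the nested attribute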

ipython parallel push custom object

I am unable to send object to direct view workers.
Here is what I want to do:
class Test:
    def __init__(self):
        self.id = 'ddfdf'

from IPython.parallel import Client

rc = Client()
dv = rc[:]
t = Test()
dv['t'] = t
print dv['t']
NameError: name 't' is not defined
This would work if I tried to push a pandas object or any of the built-in objects.
What is the way to do it with a custom object?
I tried:
dv['Test'] = Test
dv['t'] = t
print dv['t']
UnpicklingError: NEWOBJ class argument isn't a type object
For interactively defined classes (in __main__), you do need to push the class definition, or use dill. But even this doesn't appear to work for old-style classes, due to a bug in IPython's handling of them. This code works fine if you use a new-style class:

class Test(object):
    ...

instead of an old-style class. Note that old-style classes are not available in Python 3.
It's generally a good idea to always use new-style classes anyway.
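Putting it together, a sketch of the working version (written against the Python 2 era IPython.parallel API used in the question):

from IPython.parallel import Client

class Test(object):              # new-style class
    def __init__(self):
        self.id = 'ddfdf'

rc = Client()
dv = rc[:]
dv['Test'] = Test                # push the interactively defined class first
t = Test()
dv['t'] = t                      # instances can now be unpickled on the engines
print dv['t']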

Is it ok for a Django mixin to inherit another mixin?

I'm pretty sure the answer to this question is obviously "NO", since Django mixins are supposed to inherit from object, but I can't find an alternative solution to my problem :(
To make the question as simple as possible...
views.py
class JSONResponseMixin(object):
    def render_to_response(self, context):
        "Returns a JSON response containing 'context' as payload"
        return self.get_json_response(self.convert_context_to_json(context))

    def get_json_response(self, content, **httpresponse_kwargs):
        "Construct an `HttpResponse` object."
        return http.HttpResponse(content,
                                 content_type='application/json',
                                 **httpresponse_kwargs)

    def convert_context_to_json(self, context):
        "Convert the context dictionary into a JSON object"
        # Note: This is *EXTREMELY* naive; in reality, you'll need
        # to do much more complex handling to ensure that arbitrary
        # objects -- such as Django model instances or querysets
        # -- can be serialized as JSON.
        return json.dumps(context)


class HandlingAJAXPostMixin(JSONResponseMixin):
    def post(self, request, *args, **kwargs):
        .....
        data = {'somedata': somedata}
        return JSONResponseMixin.render_json_response(data)


class UserDetailView(HandlingAJAXPostMixin, DetailView):
    model = MyUser
    .....
So the problem I have is that, for multiple views, I want to respond to their "post" request with the same JSON response. That is why I defined the HandlingAJAXPostMixin, so that I could reuse it for other views. Since the HandlingAJAXPostMixin returns a JSON response, it requires a render_json_response method, which is defined in the JSONResponseMixin. This is the reason why I am making my HandlingAJAXPostMixin inherit the JSONResponseMixin, but this obviously seems wrong :(..
Any suggestions..?
Thanks!!!
It's perfectly valid for a mixin to inherit from another mixin - in fact, this is how most of Django's more advanced mixins are made.
However, the idea of mixins is that they are reusable parts that, together with other classes, build a complete, usable class. Right now, your JSONResponseMixin might as well be a separate class that you don't inherit from, or its methods might just be module-level functions. It definitely works and there's nothing wrong with it, but that's not the idea of a mixin.
If you look at Django's BaseDetailView, you see the following get() method:
def get(self, request, *args, **kwargs):
    self.object = self.get_object()
    context = self.get_context_data(object=self.object)
    return self.render_to_response(context)
get_object() and get_context_data() are defined in the superclasses of BaseDetailView, but render_to_response() isn't. It's okay for a mixin to rely on methods that its superclasses don't define; this allows different classes that inherit from BaseDetailView to supply their own implementation of render_to_response(). Right now, in Django, there's only one such subclass, though.
However, logic is delegated as much as possible to those small, reusable methods that the mixins supply. That's what you want to aim for. If/else logic is avoided as much as possible - the most advanced logic in Django's default views is:
if form.is_valid():
    return self.form_valid(form)
else:
    return self.form_invalid(form)
That's why very similar views, like CreateView and UpdateView, are in fact two separate views, while they could easily be a single view with some additional if/else logic. The only difference is that CreateView does self.object = None, while UpdateView does self.object = self.get_object().
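A simplified sketch of that difference, condensed from Django's generic editing views (the real classes set self.object in both get() and post()):

from django.views.generic.edit import ModelFormMixin, ProcessFormView

class BaseCreateView(ModelFormMixin, ProcessFormView):
    def post(self, request, *args, **kwargs):
        self.object = None                 # no existing object yet
        return super().post(request, *args, **kwargs)

class BaseUpdateView(ModelFormMixin, ProcessFormView):
    def post(self, request, *args, **kwargs):
        self.object = self.get_object()    # edit an existing object
        return super().post(request, *args, **kwargs)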
Right now you are using a DetailView that defines a get() method that returns the result of self.render_to_response(). However, you override render_to_response() to return a JSON response instead of a template-based HTML response. You're using a mixin that you don't want to use (SingleObjectTemplateResponseMixin) and then overriding its behavior to do something you don't want it to do either, just to get the view doing what you want it to do. A better idea would be to write an alternative for DetailView whose only job is to supply a JSON response based on a single object. To do this, I would create a SingleObjectJSONResponseMixin, similar to the SingleObjectTemplateResponseMixin, and create a class JSONDetailView that combines all needed mixins into a single object:
import json

from django.http import HttpResponse
from django.views.generic import View
from django.views.generic.detail import SingleObjectMixin


class SingleObjectJSONResponseMixin(object):
    def to_json(self, context):
        return json.dumps(context)

    def render_to_response(self, context, **httpresponse_kwargs):
        return HttpResponse(self.to_json(context),
                            content_type='application/json',
                            **httpresponse_kwargs)


class BaseJSONDetailView(SingleObjectMixin, View):
    # if you want to do the same for get, inherit from BaseDetailView instead
    def post(self, request, *args, **kwargs):
        self.object = self.get_object()
        context = self.get_context_data(object=self.object)
        return self.render_to_response(context)


class JSONDetailView(SingleObjectJSONResponseMixin, BaseJSONDetailView):
    """
    Return JSON detail data of a single object.
    """
Notice that this is almost exactly the same as the BaseDetailView and the SingleObjectTemplateResponseMixin provided by Django. The difference is that you define a post() method, and that the rendering is much simpler: just a conversion of the context data to JSON rather than a complete template rendering. However, logic is deliberately kept as simple as possible, and methods that don't depend on each other are separated as much as possible. This way, SingleObjectJSONResponseMixin can e.g. be mixed with BaseUpdateView to easily create an AJAX/JSON-based UpdateView. Subclasses can easily override the different parts of the mixins, like overriding to_json() to supply a certain data structure. Rendering logic is where it belongs (in render_to_response()).
Now all you need to do to create a specific JSONDetailView is to subclass and define which model to use:
class UserJSONDetailView(JSONDetailView):
    model = MyUser
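For completeness, a hypothetical way to wire that view up (the URL pattern and name are illustrative, and this uses the modern django.urls.path syntax rather than whatever URL style the original project used):

from django.urls import path

from .views import UserJSONDetailView

urlpatterns = [
    path('users/<int:pk>/json/', UserJSONDetailView.as_view(), name='user-json-detail'),
]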
