Django REST Framework (DRF): ModelSerializer does not validate models on serialization - django-rest-framework

How do I use Django REST Framework (DRF) ModelSerializers correctly when serializing from a model?
I have a Django model with two required fields:
class Book(models.Model):
    title = models.CharField()
    desc = models.CharField()
I have a DRF ModelSerializer:
class BookSerializer(serializers.ModelSerializer):
    class Meta:
        model = Book
        fields = ['title', 'desc']
I can deserialize and validate an incoming request using:
serializer = BookSerializer(data=request.data)
serializer.is_valid(raise_exception=True)
But how do I serialize and send a response? DRF lets me break the contract defined by the ModelSerializer. If I forget to set one of the mandatory Book fields, the instance still passes through BookSerializer!
invalid_book = Book(title="Foo")  # forgot to set "desc"
serializer = BookSerializer(instance=invalid_book)
serializer.data  # contains the book without the required "desc"
A serializer created with the instance parameter raises an error if I call is_valid().
Why can ModelSerializer validate incoming data but not outgoing data?

Validation is only performed when deserializing. As per the documentation:
Serializers also provide deserialization, allowing parsed data to be converted back into complex types, after first validating the incoming data.
That makes sense given how Django REST Framework seems to be designed. It isn't the role of the serializer to make sure that the complex data you are going to serialize, such as querysets and model instances (e.g. your Book instance), is constructed legitimately, so serializers don't validate while serializing.
Note that saving the instance with invalid_book.save() does not validate it either: Django only runs model validation when full_clean() is called, and an unset CharField is saved as an empty string. Calling invalid_book.full_clean() is what raises a ValidationError for the missing field.
Edit
After a comment that this is 'a point of view' and thus opinionated, I want to stress that this seems to be the way Django REST Framework (DRF) is designed. After digging deeper on SO, I link this answer in support.
Also, if you read the DRF documentation, it is somewhat implied that serialization and validation are two separate concepts.
Furthermore, reading serializers.py makes clear that validation only runs when is_valid() is called, and only on the provided data argument. In fact, it can't even be run when only an instance is provided:
def __init__(self, instance=None, data=empty, **kwargs):
    self.instance = instance
    if data is not empty:
        self.initial_data = data
    self.partial = kwargs.pop('partial', False)
    self._context = kwargs.pop('context', {})
    kwargs.pop('many', None)
    super().__init__(**kwargs)
...
def is_valid(self, raise_exception=False):
    assert hasattr(self, 'initial_data'), (
        'Cannot call `.is_valid()` as no `data=` keyword argument was '
        'passed when instantiating the serializer instance.'
    )
    if not hasattr(self, '_validated_data'):
        try:
            self._validated_data = self.run_validation(self.initial_data)
        except ValidationError as exc:
            self._validated_data = {}
            self._errors = exc.detail
        else:
            self._errors = {}
    if self._errors and raise_exception:
        raise ValidationError(self.errors)
    return not bool(self._errors)
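To make the split between the two paths concrete, here is a framework-free sketch. ToyBookSerializer and serialize_checked are hypothetical names, not DRF code; they only mimic the behaviour described above, where is_valid() needs data= and .data copies attributes without checking them. The helper shows one possible way to check outgoing data: serialize first, then feed the result back through the input path. With real DRF a similar round trip should be expressible by passing serializer.data as data= to a second serializer, but treat that as an assumption to verify for your own fields.

```python
from types import SimpleNamespace

_EMPTY = object()  # sentinel meaning "no data= argument was given"


class ToyBookSerializer:
    """Hypothetical stand-in for BookSerializer; not DRF code."""

    required_fields = ("title", "desc")

    def __init__(self, instance=None, data=_EMPTY):
        self.instance = instance
        if data is not _EMPTY:
            self.initial_data = data

    def is_valid(self):
        # Input path: validation needs initial_data, so it cannot run
        # when only an instance was provided.
        assert hasattr(self, "initial_data"), "no data= argument was passed"
        self.errors = {
            name: ["This field is required."]
            for name in self.required_fields
            if self.initial_data.get(name) in (None, "")
        }
        return not self.errors

    @property
    def data(self):
        # Output path: attributes are copied without any checks.
        return {name: getattr(self.instance, name) for name in self.required_fields}


def serialize_checked(instance):
    """Serialize, then validate the result by feeding it back in as data."""
    payload = ToyBookSerializer(instance=instance).data
    checker = ToyBookSerializer(data=payload)
    if not checker.is_valid():
        raise ValueError(checker.errors)
    return payload


# Django leaves an unset CharField as an empty string, which is mirrored here.
invalid_book = SimpleNamespace(title="Foo", desc="")
unchecked = ToyBookSerializer(instance=invalid_book).data
```

Accessing .data on the invalid book succeeds, while serialize_checked(invalid_book) raises because the round trip goes through the validating path.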

You are working from a wrong assumption. A serializer (as opposed to a deserializer) does one thing: it converts an object to JSON. Here, you are creating an object, Book(name='sad book'). This is just a regular Python object. DRF serializers will attempt to serialize any object that is passed to them.
What you might be wondering is why the serializer doesn't validate when the field is required in the model. The reason is the way DRF handles serialization. Here are some excerpts from the DRF source code.
This is how the data property is calculated.
class BaseSerializer():
    ...
    ...
    @property
    def data(self):
        if hasattr(self, 'initial_data') and not hasattr(self, '_validated_data'):
            msg = (
                'When a serializer is passed a `data` keyword argument you '
                'must call `.is_valid()` before attempting to access the '
                'serialized `.data` representation.\n'
                'You should either call `.is_valid()` first, '
                'or access `.initial_data` instead.'
            )
            raise AssertionError(msg)
        if not hasattr(self, '_data'):
            if self.instance is not None and not getattr(self, '_errors', None):
                # THIS IS WHERE WE GO. THE to_representation() CAN BE FOUND IN THE IMPLEMENTATION
                # OF ModelSerializer() which inherits from this class BaseSerializer
                self._data = self.to_representation(self.instance)
            elif hasattr(self, '_validated_data') and not getattr(self, '_errors', None):
                self._data = self.to_representation(self.validated_data)
            else:
                self._data = self.get_initial()
        return self._data
What happens in ModelSerializer.to_representation()?
class ModelSerializer(BaseSerializer):
    ...
    ...
    def to_representation(self, instance):
        """
        Object instance -> Dict of primitive datatypes.
        """
        ret = OrderedDict()
        fields = self._readable_fields
        for field in fields:
            try:
                attribute = field.get_attribute(instance)
            except SkipField:
                continue
            # We skip `to_representation` for `None` values so that fields do
            # not have to explicitly deal with that case.
            #
            # For related fields with `use_pk_only_optimization` we need to
            # resolve the pk value.
            check_for_none = attribute.pk if isinstance(attribute, PKOnlyObject) else attribute
            if check_for_none is None:
                ret[field.field_name] = None
            else:
                ret[field.field_name] = field.to_representation(attribute)
        return ret
As you can see, in this case the serializer only maps the fields from the object being passed, so there is no validation during serialization. For more info, check the DRF source code. It's pretty easy to navigate if you use PyCharm Pro.
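The mapping loop can be reduced to a few lines of plain Python. This is only a sketch of the idea, not DRF's implementation: the to_representation function and the BOOK_FIELDS table below are hypothetical, and real DRF fields carry far more logic than a single converter function.

```python
from datetime import date
from types import SimpleNamespace


def to_representation(instance, fields):
    """Map each readable field to a primitive value; nothing is validated."""
    ret = {}
    for name, convert in fields.items():
        attribute = getattr(instance, name)
        # None is passed through untouched, so converters never see it.
        ret[name] = None if attribute is None else convert(attribute)
    return ret


# Hypothetical field table: field name -> converter to a primitive type.
BOOK_FIELDS = {"title": str, "desc": str, "published": date.isoformat}

book = SimpleNamespace(title="Foo", desc=None, published=date(2020, 1, 2))
result = to_representation(book, BOOK_FIELDS)
```

A missing desc simply comes out as None; at no point does the loop ask whether the field was required.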

Related

How to pass custom error message in django_filters fields?

I am using the django_filters package for custom filtering in my Django REST Framework API.
Below is the code:
import django_filters
from src.core.models.rough_management import DocumentDetails
class DocumentDetailsFilter(django_filters.FilterSet):
    my_date = django_filters.DateFromToRangeFilter()
    class Meta:
        model = DocumentDetails
        fields = ['my_date']
Here I am getting "Enter a valid date" as the error message when I enter an invalid date range. How can I change that message to "No records found"?
django_filters uses native Django form fields to validate data.
DateFromToRangeFilter uses django.forms.fields.DateField.
(Django DateField code)
You need to implement your own custom filter with a few small changes.
Note that, from my perspective, this is bad practice, because it overrides the DateField error, which should stay the default "Enter a valid date"; that message is useful information from an API when an invalid date value such as the text "abcd" is entered.
Anyway, for now I don't see a better place to implement this, and it is simpler than interfering with the FilterSet class.
You can also look at the original code here to see how it's designed.
from datetime import datetime, time
from django import forms
from django_filters import RangeFilter
from django_filters.fields import RangeField
from django_filters.utils import handle_timezone
from django_filters.widgets import DateRangeWidget
class CustomDateField(forms.DateField):
    # Replace only the "invalid" message; other messages keep their defaults
    default_error_messages = {
        "invalid": "No records found",
    }
class CustomDateRangeField(RangeField):
    widget = DateRangeWidget
    def __init__(self, *args, **kwargs):
        fields = (CustomDateField(), CustomDateField())  # Important change here only
        super().__init__(fields, *args, **kwargs)
    def compress(self, data_list):
        # The compress method stays the same as in the original code
        if data_list:
            start_date, stop_date = data_list
            if start_date:
                start_date = handle_timezone(
                    datetime.combine(start_date, time.min), False
                )
            if stop_date:
                stop_date = handle_timezone(
                    datetime.combine(stop_date, time.max), False
                )
            return slice(start_date, stop_date)
        return None
class CustomDateFromToRangeFilter(RangeFilter):
    field_class = CustomDateRangeField
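The override works because Django form fields build their error messages by merging default_error_messages from every class in the inheritance chain, with subclasses winning. The following framework-free sketch mimics that merging; the Field, DateField and CustomDateField classes here are simplified stand-ins written for illustration, not the Django classes themselves.

```python
class Field:
    """Simplified stand-in for a form field base class."""

    default_error_messages = {"required": "This field is required."}

    def __init__(self, error_messages=None):
        messages = {}
        # Walk from the most generic class to the most specific one so that
        # subclasses override the messages of their parents.
        for cls in reversed(type(self).__mro__):
            messages.update(getattr(cls, "default_error_messages", {}))
        # Per-instance messages take precedence over class-level defaults.
        messages.update(error_messages or {})
        self.error_messages = messages


class DateField(Field):
    default_error_messages = {"invalid": "Enter a valid date."}


class CustomDateField(DateField):
    default_error_messages = {"invalid": "No records found"}


stock = DateField().error_messages
custom = CustomDateField().error_messages
per_instance = DateField(error_messages={"invalid": "Bad date"}).error_messages
```

Only the "invalid" key changes in the subclass; the inherited "required" message is untouched. The per-instance variant reflects the error_messages argument that Django form fields accept, which may be enough when subclassing feels too heavy.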

Django Rest Framework ModelSerializer get field data depending on query param

I have a ModelSerializer with several fields, many of which are StringRelatedField, so their default representation is given by the __str__ method on the model; in my case this is mostly the name field of each model. In other cases I need to retrieve the id instead. How can I do this, for example depending on a query param?
I'm solving it this way for now; please share if someone has a better solution:
class ExampleSerializer(serializers.ModelSerializer):
    ...
    def __init__(self, *args, **kwargs):
        super(ExampleSerializer, self).__init__(*args, **kwargs)
        v = self.context['request'].query_params.get('v', None)  # using v query param for a "variant" handling
        if v == '1':  # one of my variants
            self.fields['examplefield'] = serializers.StringRelatedField(many=False, allow_null=True)
Note that there is no else or additional if statement because I only needed two variants: one that retrieves the name field and one that retrieves the id. By default DRF uses PrimaryKeyRelatedField (which is the id) for ModelSerializer related fields; see the docs.
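The field-swapping idea can be tried out without DRF. In the sketch below, Author and ToyExampleSerializer are hypothetical stand-ins, and the request is faked with a simple object that exposes query_params. One deliberate difference from the snippet above is that the request is looked up defensively, on the assumption that the serializer may sometimes be built without a request in its context.

```python
from types import SimpleNamespace


class Author:
    """Hypothetical related object with a primary key and a __str__."""

    def __init__(self, pk, name):
        self.pk = pk
        self.name = name

    def __str__(self):
        return self.name


class ToyExampleSerializer:
    """Framework-free stand-in; only the field-swapping idea is modelled."""

    def __init__(self, instance, context=None):
        self.instance = instance
        # Default: represent the related object by its primary key.
        self.fields = {"examplefield": lambda related: related.pk}
        # Look the request up defensively so a missing context does not raise.
        request = (context or {}).get("request")
        variant = request.query_params.get("v") if request is not None else None
        if variant == "1":
            # Variant 1: represent the related object by its string form.
            self.fields["examplefield"] = str

    @property
    def data(self):
        return {
            name: represent(getattr(self.instance, name))
            for name, represent in self.fields.items()
        }


book = SimpleNamespace(examplefield=Author(pk=7, name="Ann"))
by_id = ToyExampleSerializer(book).data
by_name = ToyExampleSerializer(
    book, context={"request": SimpleNamespace(query_params={"v": "1"})}
).data
```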

Django REST - separating valid data from non-valid and serializing the former with many=True

I am using Django REST framework to build a REST API for my app.
In one of my views, I am trying to use many=True when initializing my serializer object in order to bulk_insert multiple rows at once. The problem is that if one of the records in the dataset is invalid, the serializer's is_valid() method returns False, thus rejecting the entire dataset. The desired behavior would be to insert the valid records and ignore the invalid ones.
I have succeeded in achieving this using the following code, but I have a terrible feeling that this is junk code and that the REST framework has a native way to do this.
My code is below (I consider it junk code :)):
serializers.py
class MySerializer(serializers.ModelSerializer):
    class Meta:
        model = CalendarEventAttendee
        fields = '__all__'
view.py
def my_view(request):
    validated_data = []
    # Separate valid data from invalid
    for record in request.data:
        if MySerializer(data=record).is_valid():
            validated_data.append(record)
    # bulk_insert valid data
    serializer = MySerializer(data=validated_data, many=True)
    if serializer.is_valid():
        serializer.save()
Can anyone suggest a better approach?
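Independently of DRF, the separation step is a partition: run a validator once per record, keep the records that pass, and remember which ones failed and why. The sketch below shows that shape in plain Python. partition_records and validate_attendee are hypothetical names, and the "email" rule is an invented example; in DRF terms the validator would presumably wrap a per-record serializer and return its errors.

```python
def partition_records(records, validate):
    """Split records into valid ones and (index, errors) pairs for the rest."""
    valid, rejected = [], []
    for index, record in enumerate(records):
        errors = validate(record)
        if errors:
            rejected.append((index, errors))
        else:
            valid.append(record)
    return valid, rejected


def validate_attendee(record):
    """Hypothetical validator: returns a dict of errors, empty when valid."""
    errors = {}
    if not record.get("email"):
        errors["email"] = ["This field is required."]
    return errors


incoming = [
    {"email": "a@example.com"},
    {"name": "no email here"},
    {"email": "b@example.com"},
]
valid, rejected = partition_records(incoming, validate_attendee)
```

Keeping the index of each rejected record makes it possible to report back which inputs were skipped instead of dropping them silently.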

Django Rest Framework "This field is required" only when POSTing JSON, not when POSTing form content

I'm getting a strange result whereby POSTing JSON to a DRF endpoint returns:
{"photos":["This field is required."],"tags":["This field is required."]}
Whereas when POSTing form data DRF doesn't mind that the fields are empty.
My model is:
class Story(CommonInfo):
    user = models.ForeignKey(User)
    text = models.TextField(max_length=5000,blank=True)
    feature = models.ForeignKey("Feature", blank=True, null=True)
    tags = models.ManyToManyField("Tag")
My serializer is:
class StorySerializer(serializers.HyperlinkedModelSerializer):
    user = serializers.CharField(read_only=True)
    def get_fields(self, *args, **kwargs):
        user = self.context['request'].user
        fields = super(StorySerializer, self).get_fields(*args, **kwargs)
        fields['feature'].queryset = fields['feature'].queryset.filter(user=user)
        fields['photos'].child_relation.queryset = fields['photos'].child_relation.queryset.filter(user=user)
        return fields
    class Meta:
        model = Story
        fields = ('url', 'user', 'text', 'photos', 'feature', 'tags')
And my api.py is:
class StoryViewSet(viewsets.ModelViewSet):
    serializer_class = StorySerializer
    def get_queryset(self):
        return self.request.user.story_set.all()
    def perform_create(self, serializer):
        serializer.save(user=self.request.user)
The results:
# JSON request doesn't work
IN: requests.post("http://localhost:8001/api/stories/",
        auth=("user", "password",),
        data=json.dumps({'text': 'NEW ONE!'}),
        headers={'Content-type': 'application/json'}
    ).content
OUT: '{"photos":["This field is required."],"tags":["This field is required."]}'
# Form data request does work
IN: requests.post("http://localhost:8001/api/stories/",
        auth=("user", "password",),
        data={'text': 'NEW ONE!'},
    ).content
OUT: '{"url":"http://localhost:8001/api/stories/277/","user":"user","text":"NEW ONE!","photos":[],"feature":null,"tags":[]}'
The issue here isn't obvious at first, but it has to do with a shortcoming in form-data and how partial data is handled.
Form data has two special cases that Django REST framework has to handle:
There is no concept of "null" or "empty" data for some inputs, including checkboxes and other inputs that allow for multiple selections.
There is no input type that supports multiple values for a single field, checkboxes being the one exception.
Both of these combine together to make it difficult to handle accepting form data within Django REST framework, so it has to handle a few things differently from most parsers.
If a field is not passed in, it is assumed to be None or the default value for the field. This is because inputs with no values are not passed along in the form data, so their key is missing.
If a single value is passed in for a multiple-value field, it will be treated like the one selected value. This is because there is no difference between a single checkbox selected out of many and a single checkbox at all in form data. Both of them are passed in as a single key.
But the same doesn't apply to JSON. Because you are not passing an empty list in for the photos and tags keys, DRF does not know what to give it for a default value and does not pass it along to the serializer. Because of this, the serializer sees that there is nothing passed in and triggers the validation error because the required field was not provided.
So the solution is to always provide all keys when using JSON (not including PATCH requests, which can be partial), even if they contain no data.

Exposing "virtual" field in a tastypie view?

I want to create a view using tastypie to expose certain objects of the same type, but with the following three twists:
I need to get the objects using three separate queries;
I need to add a field which doesn't exist in the underlying model, and the value of that field depends on which of the queries it came from; and
The data will be per-user (so I need to hook in to one of the methods that gets a request).
I'm not clear on how to hook into the tastypie lifecycle to accomplish this. The recommended way for adding a "virtual" field is in the dehydrate method, which only knows about the bundle it's operating on.
Even worse, there's no official way to join querysets.
My problem would go away if I could get tastypie to accept something other than a queryset. In that case I could pass it a list of subclasses of my object, with the additional field added.
I'm open to any other sensible solution.
Edit: Added twist 3 - per-user data.
In the latest version you should override the dehydrate method, e.g.
def dehydrate(self, bundle):
    bundle.data['full_name'] = bundle.obj.get_full_name()
    return bundle
I stumbled over a similar problem. In my case, items in the list could be "checked" by the user.
When an item is retrieved by AJAX, its checked status is returned with the resource as a normal field.
When an item is saved to the server, the "checked" field from the resource is stored in the user's session.
At first I thought the hydrate() and dehydrate() methods were the best match for this job, but it turned out there are problems with accessing the request object in them. So I went with alter_data_to_serialize() and obj_update(). I think there's no need to override obj_create(), since an item can't be checked when it's first created.
Here is the code, but note that it hasn't been properly tested yet.
class ItemResource(ModelResource):
    def get_object_checked_status(self, obj, request):
        if hasattr(request, 'session'):
            session = request.session
            session_data = session.get(get_item_session_key(obj), dict())
            return session_data.get('checked', False)
        return False
    def save_object_checked_status(self, obj, data, request):
        if hasattr(request, 'session'):
            session_key = get_item_session_key(obj)
            session_data = request.session.get(session_key, dict())
            session_data['checked'] = data.pop('checked', False)
            request.session[session_key] = session_data
    # Overridden methods
    def alter_detail_data_to_serialize(self, request, bundle):
        # object > resource
        bundle.data['checked'] = \
            self.get_object_checked_status(bundle.obj, request)
        return bundle
    def alter_list_data_to_serialize(self, request, to_be_serialized):
        # objects > resource
        for bundle in to_be_serialized['objects']:
            bundle.data['checked'] = \
                self.get_object_checked_status(bundle.obj, request)
        return to_be_serialized
    def obj_update(self, bundle, request=None, **kwargs):
        # resource > object
        # Call the helper through self; it is a method, not a module-level function
        self.save_object_checked_status(bundle.obj, bundle.data, request)
        return super(ItemResource, self)\
            .obj_update(bundle, request, **kwargs)
def get_item_session_key(obj):
    return 'item-%s' % obj.id
OK, so this is my solution. Code is below.
Points to note:
The work is basically all done in obj_get_list. That's where I run my queries, having access to the request.
I can return a list from obj_get_list.
I would probably have to override all of the other obj_* methods corresponding to the other operations (like obj_get, obj_create, etc) if I wanted them to be available.
Because I don't have a queryset in Meta, I need to provide an object_class to tell tastypie's introspection what fields to offer.
To expose my "virtual" attribute (which I create in obj_get_list), I need to add a field declaration for it.
I've commented out the filters and authorisation limits because I don't need them right now. I'd need to implement them myself if I needed them.
Code:
from tastypie.resources import ModelResource
from tastypie.exceptions import BadRequest
from tastypie import fields
from models import *
import logging
logger = logging.getLogger(__name__)
class CompanyResource(ModelResource):
    role = fields.CharField(attribute='role')
    class Meta:
        allowed_methods = ['get']
        resource_name = 'companies'
        object_class = CompanyUK
        # should probably have some sort of authentication here quite soon
    # filters do nothing. If it matters, hook them up
    def obj_get_list(self, request=None, **kwargs):
        # filters = {}
        # if hasattr(request, 'GET'):
        #     # Grab a mutable copy.
        #     filters = request.GET.copy()
        # # Update with the provided kwargs.
        # filters.update(kwargs)
        # applicable_filters = self.build_filters(filters=filters)
        try:
            # base_object_list = self.get_object_list(request).filter(**applicable_filters)
            def add_role(role):
                def add_role_company(link):
                    company = link.company
                    company.role = role
                    return company
                return add_role_company
            # list() keeps the concatenation below working when map() returns an iterator
            director_of = list(map(add_role('director'), DirectorsIndividual.objects.filter(individual__user=request.user)))
            member_of = list(map(add_role('member'), MembersIndividual.objects.filter(individual__user=request.user)))
            manager_of = list(map(add_role('manager'), CompanyManager.objects.filter(user=request.user)))
            base_object_list = director_of + member_of + manager_of
            return base_object_list  # self.apply_authorization_limits(request, base_object_list)
        except ValueError:
            raise BadRequest("Invalid resource lookup data provided (mismatched type).")
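The core of this approach, tagging each object with the query it came from and then concatenating the results, can be shown without tastypie or a database. In the sketch below, with_role and companies_for_user are hypothetical helpers and the link objects are plain stand-ins for the rows returned by the three queries.

```python
from itertools import chain
from types import SimpleNamespace


def with_role(role, links):
    """Yield each link's company after attaching the given role to it."""
    for link in links:
        company = link.company
        company.role = role  # the "virtual" attribute, absent from the model
        yield company


def companies_for_user(director_links, member_links, manager_links):
    """Combine three result sets into one list, tagged by origin."""
    return list(
        chain(
            with_role("director", director_links),
            with_role("member", member_links),
            with_role("manager", manager_links),
        )
    )


acme = SimpleNamespace(name="Acme")
globex = SimpleNamespace(name="Globex")
initech = SimpleNamespace(name="Initech")
companies = companies_for_user(
    director_links=[SimpleNamespace(company=acme)],
    member_links=[SimpleNamespace(company=globex)],
    manager_links=[SimpleNamespace(company=initech)],
)
```

One caveat of mutating the company in place: if the very same object is reachable through two of the queries, it will appear twice in the list carrying whichever role was assigned last.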
You can do something like this (not tested):
def alter_list_data_to_serialize(self, request, data):
    for index, row in enumerate(data['objects']):
        foo = Foo.objects.filter(baz=row.data['foo']).values()
        bar = Bar.objects.all().values()
        data['objects'][index].data['virtual_field'] = bar
    return data

Resources