Django Rest Framework "This field is required" only when POSTing JSON, not when POSTing form content - django-rest-framework

I'm getting a strange result whereby POSTing JSON to a DRF endpoint returns:
{"photos":["This field is required."],"tags":["This field is required."]}
Whereas when POSTing form data, DRF doesn't mind that the fields are empty.
My model is:
class Story(CommonInfo):
    user = models.ForeignKey(User)
    text = models.TextField(max_length=5000, blank=True)
    feature = models.ForeignKey("Feature", blank=True, null=True)
    tags = models.ManyToManyField("Tag")
My serializer is:
class StorySerializer(serializers.HyperlinkedModelSerializer):
    user = serializers.CharField(read_only=True)
    def get_fields(self, *args, **kwargs):
        user = self.context['request'].user
        fields = super(StorySerializer, self).get_fields(*args, **kwargs)
        fields['feature'].queryset = fields['feature'].queryset.filter(user=user)
        fields['photos'].child_relation.queryset = fields['photos'].child_relation.queryset.filter(user=user)
        return fields
    class Meta:
        model = Story
        fields = ('url', 'user', 'text', 'photos', 'feature', 'tags')
And my api.py is:
class StoryViewSet(viewsets.ModelViewSet):
    serializer_class = StorySerializer
    def get_queryset(self):
        return self.request.user.story_set.all()
    def perform_create(self, serializer):
        serializer.save(user=self.request.user)
The results:
# JSON request doesn't work
IN: requests.post("http://localhost:8001/api/stories/",
        auth=("user", "password",),
        data=json.dumps({'text': 'NEW ONE!'}),
        headers={'Content-type': 'application/json'}
    ).content
OUT: '{"photos":["This field is required."],"tags":["This field is required."]}'
# Form data request does work
IN: requests.post("http://localhost:8001/api/stories/",
        auth=("user", "password",),
        data={'text': 'NEW ONE!'},
    ).content
OUT: '{"url":"http://localhost:8001/api/stories/277/","user":"user","text":"NEW ONE!","photos":[],"feature":null,"tags":[]}'

The issue here isn't obvious at first, but it comes down to a limitation of form data and how missing fields are handled.
Form data has two special cases that Django REST framework has to handle:
1. There is no concept of "null" or "empty" data for some inputs, including checkboxes and other inputs that allow multiple selections.
2. There is no input type that supports multiple values for a single field, checkboxes being the one exception.
Together, these make form data difficult to accept, so Django REST framework handles it differently from the output of most other parsers:
1. If a field is not passed in, it is assumed to be None or the default value for the field. This is because inputs with no values are not passed along in the form data, so their key is missing.
2. If a single value is passed in for a multiple-value field, it is treated as the one selected value. This is because form data cannot distinguish a single checkbox selected out of many from a lone checkbox. Both are passed in as a single key.
But the same doesn't apply to JSON. Because you are not passing an empty list for the photos and tags keys, DRF has no default value to fall back on and passes nothing along to the serializer. The serializer then sees that nothing was provided for a required field and raises the validation error.
So the solution is to always provide all keys when using JSON (except for PATCH requests, which can be partial), even if they contain no data.
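A minimal sketch of the "always send every key" advice on the client side, assuming the list-valued fields are the photos and tags keys from the question. The helper name build_story_payload is made up for illustration:

```python
import json

# Assumption: these are the list-valued fields from the serializer in the question.
LIST_FIELDS = ("photos", "tags")


def build_story_payload(text, **extra):
    """Return a JSON body that always carries the list-valued keys."""
    payload = {"text": text, **extra}
    for key in LIST_FIELDS:
        # An explicit empty list says "no items" instead of leaving the key missing.
        payload.setdefault(key, [])
    return json.dumps(payload)


body = build_story_payload("NEW ONE!")
print(body)
```

The resulting string would be passed as data= to requests.post together with the application/json header, as in the question. Whether an empty list is then accepted may depend on the field's allow_empty setting in your DRF version.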

Related

Django Rest Framework ModelSerializer get field data depending on query param

I have a ModelSerializer with several fields, many of them StringRelatedField, so the default representation of these fields is given by the __str__ method on the model; in my case this is mostly the name field of each model. In other cases I need to retrieve the id instead. How can I do this, for example depending on a query param?
Solving this way for now, please share if someone has a better solution:
class ExampleSerializer(serializers.ModelSerializer):
    ...
    def __init__(self, *args, **kwargs):
        super(ExampleSerializer, self).__init__(*args, **kwargs)
        v = self.context['request'].query_params.get('v', None)  # using v query param for a "variant" handling
        if v == '1':  # one of my variants
            self.fields['examplefield'] = serializers.StringRelatedField(many=False, allow_null=True)
Note that there is no else or second if statement because I only needed two variants: one retrieving the name field, the other the id. By default, DRF uses PrimaryKeyRelatedField (which is the id) for related fields on a ModelSerializer; see the docs.

Django REST Framework (DRF): ModelSerializer does not validate models on serialization

I want to ask how to use Django REST Framework (DRF) ModelSerializers correctly when serializing from a model.
I have a Django model with two required fields:
class Book(models.Model):
    title = models.CharField()
    desc = models.CharField()
I have a DRF ModelSerializer:
class BookSerializer(serializers.ModelSerializer):
    class Meta:
        model = Book
        fields = ['title', 'desc']
I can deserialize and validate an incoming request using:
serializer = BookSerializer(data=request.data)
serializer.is_valid(raise_exception=True)
But how do I serialize and send a response? DRF allows me to break the contract defined by the ModelSerializer. If I forget to set one of the mandatory Book fields, the instance will still pass through BookSerializer!
invalid_book = Book(title="Foo")  # but forgotten to set "desc"
serializer = BookSerializer(instance=invalid_book)
serializer.data  # it contains book without required "desc"
A serializer created with the instance parameter raises an error if I call is_valid().
Why can ModelSerializer validate incoming data, but not outgoing data?
Validation is only performed when deserializing. As per the documentation:
Serializers also provide deserialization, allowing parsed data to be converted back into complex types, after first validating the incoming data.
That makes sense (edit: given the way Django REST Framework seems to be designed), because it isn't the role of the serializer to make sure that the complex data you are going to serialize, such as querysets and model instances (e.g. your Book instance), is constructed legitimately. Thus serializers don't validate while serializing.
If you want the instance itself checked, call invalid_book.full_clean(), which raises a ValidationError for the blank desc field. Note that invalid_book.save() on its own does not run model validation.
Edit
After a comment saying this was 'a point of view' and thus opinionated, I want to stress that this seems to be the way Django Rest Framework (DRF) is designed. After digging deeper on SO I link this answer in support.
Also, if you read the DRF documentation, it is somewhat implied that serialization and validation are two separate concepts.
Furthermore, reading serializers.py makes clear that validation only runs when is_valid() is called, and only on what was passed as the data argument. In fact, it can't even be run when only an instance is provided:
def __init__(self, instance=None, data=empty, **kwargs):
    self.instance = instance
    if data is not empty:
        self.initial_data = data
    self.partial = kwargs.pop('partial', False)
    self._context = kwargs.pop('context', {})
    kwargs.pop('many', None)
    super().__init__(**kwargs)
...
def is_valid(self, raise_exception=False):
    assert hasattr(self, 'initial_data'), (
        'Cannot call `.is_valid()` as no `data=` keyword argument was '
        'passed when instantiating the serializer instance.'
    )
    if not hasattr(self, '_validated_data'):
        try:
            self._validated_data = self.run_validation(self.initial_data)
        except ValidationError as exc:
            self._validated_data = {}
            self._errors = exc.detail
        else:
            self._errors = {}
    if self._errors and raise_exception:
        raise ValidationError(self.errors)
    return not bool(self._errors)
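The guard in the excerpt above can be mimicked without DRF installed. This is only a toy stand-in, not DRF's class: ToySerializer, its required tuple, and the error text are assumptions made for illustration.

```python
_empty = object()  # sentinel: distinguishes "no data passed" from data=None


class ToySerializer:
    """Toy stand-in for illustration only; this is not DRF's serializer."""

    required = ("title", "desc")  # hypothetical required keys

    def __init__(self, instance=None, data=_empty):
        self.instance = instance
        if data is not _empty:
            self.initial_data = data

    def is_valid(self):
        # Same idea as the excerpt: validation needs data= to work on.
        assert hasattr(self, "initial_data"), "Cannot call .is_valid() without data="
        self.errors = {
            key: ["This field is required."]
            for key in self.required
            if key not in self.initial_data
        }
        return not self.errors


incoming = ToySerializer(data={"title": "Foo"})
print(incoming.is_valid(), incoming.errors)

outgoing = ToySerializer(instance={"title": "Foo"})
try:
    outgoing.is_valid()
except AssertionError as exc:
    print("AssertionError:", exc)
```

The first object was given data=, so it can be validated and reports the missing key. The second was given only an instance, so there is nothing for is_valid() to validate and the assertion fires.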
You are under a very wrong assumption. A serializer (as opposed to a deserializer) does one thing: it converts an object into primitive data that can be rendered as JSON. Here, you are creating an object such as Book(title="Foo"). This is just a regular Python object. DRF serializers will attempt to serialize any object that is passed to them.
What you might be wondering is: the field is required in the model, so why doesn't the serializer validate it? Because of the way DRF handles serialization. Here are some excerpts from the DRF source code.
This is how the data property is calculated.
class BaseSerializer():
    ...
    ...
    @property
    def data(self):
        if hasattr(self, 'initial_data') and not hasattr(self, '_validated_data'):
            msg = (
                'When a serializer is passed a `data` keyword argument you '
                'must call `.is_valid()` before attempting to access the '
                'serialized `.data` representation.\n'
                'You should either call `.is_valid()` first, '
                'or access `.initial_data` instead.'
            )
            raise AssertionError(msg)
        if not hasattr(self, '_data'):
            if self.instance is not None and not getattr(self, '_errors', None):
                # THIS IS WHERE WE GO. to_representation() IS IMPLEMENTED IN Serializer,
                # WHICH SUBCLASSES BaseSerializer AND IS THE PARENT OF ModelSerializer
                self._data = self.to_representation(self.instance)
            elif hasattr(self, '_validated_data') and not getattr(self, '_errors', None):
                self._data = self.to_representation(self.validated_data)
            else:
                self._data = self.get_initial()
        return self._data
What happens in to_representation()? It is defined on Serializer, which ModelSerializer inherits from:
class Serializer(BaseSerializer):
    ...
    ...
    def to_representation(self, instance):
        """
        Object instance -> Dict of primitive datatypes.
        """
        ret = OrderedDict()
        fields = self._readable_fields
        for field in fields:
            try:
                attribute = field.get_attribute(instance)
            except SkipField:
                continue
            # We skip `to_representation` for `None` values so that fields do
            # not have to explicitly deal with that case.
            #
            # For related fields with `use_pk_only_optimization` we need to
            # resolve the pk value.
            check_for_none = attribute.pk if isinstance(attribute, PKOnlyObject) else attribute
            if check_for_none is None:
                ret[field.field_name] = None
            else:
                ret[field.field_name] = field.to_representation(attribute)
        return ret
As you can see, in this case the serializer only maps the fields from the object being passed, so there is no validation during serialization. For more info, check the source code of DRF. It's pretty easy to navigate if you use PyCharm Pro.

Django REST Framework - access verbose_name of fields in ModelSerializer

Say I have the following Model:
class Book(Model):
    title = CharField(verbose_name="Book title")
and a ModelSerializer:
class BookSerializer(ModelSerializer):
    class Meta:
        model = Book
        fields = "__all__"
I would like to have a function get_verbose_names which returns verbose names of the fields in the model. This is what I have so far:
def get_verbose_names(serializer):
    return [field.label for field in serializer.get_fields().values()]
It seems to work fine but problems occur when I use this for the builtin User model. The only fields which work are ID, E-mail, Active, Superuser status and Staff status. The special thing about those fields is that their verbose name differs from their name. Django REST Framework is probably hiding a super-smart logic which checks this and refuses to set the field label to its verbose name in such cases.
Do Django REST Framework's fields have the verbose names hidden somewhere, or they don't copy them from the original Django model fields at all and I am screwed? Or will the trick be to override this logic? I tried and could not find it.
Django REST Framework really has the mentioned "super-smart logic". It is the function needs_label in utils.field_mapping:
def needs_label(model_field, field_name):
    """
    Returns `True` if the label based on the model's verbose name
    is not equal to the default label it would have based on it's field name.
    """
    default_label = field_name.replace('_', ' ').capitalize()
    return capfirst(model_field.verbose_name) != default_label
Probably the easiest way to bypass this annoying feature is to do this:
def get_verbose_names(serializer):
    return [field.label or name.replace("_", " ").capitalize()
            for name, field in serializer.get_fields().items()]
In words: use the field's label if one was set; otherwise derive it from the field name the same way needs_label computes the default label.
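The label logic can be tried out without Django installed. In this sketch capfirst is a stand-in written to behave like django.utils.text.capfirst (upper-case the first character only), needs_label takes the verbose name as a plain string rather than a model field, and effective_label is a hypothetical helper name:

```python
def capfirst(text):
    """Stand-in for django.utils.text.capfirst: upper-case only the first character."""
    return text[:1].upper() + text[1:]


def default_label(field_name):
    """The label a field would get from its name alone."""
    return field_name.replace("_", " ").capitalize()


def needs_label(verbose_name, field_name):
    """True when the verbose name differs from the name-derived default."""
    return capfirst(verbose_name) != default_label(field_name)


def effective_label(label, field_name):
    """Use the explicit label if present, else fall back to the default."""
    return label or default_label(field_name)


print(needs_label("first name", "first_name"))
print(needs_label("email address", "email"))
print(effective_label(None, "first_name"))
```

A field such as first_name, whose verbose name matches its name, gets no explicit label, which is why the fallback is needed.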

Django REST - separating valid data from non-valid and serializing the former with many=True

I am using Django REST framework to come up with a REST api for my app.
In one of my views, I am trying to use many=True when initializing my serializer object in order to bulk-insert multiple rows at once. The problem is that if one of the records in the dataset is invalid, the serializer's is_valid() method returns False, thus rejecting the entire dataset, whereas the desired behavior would be to insert the valid records and ignore the invalid ones.
I have succeeded in achieving this using the following code, but I have a terrible feeling that this is junk code and that the REST framework has a native way to do this.
My code below (that I consider junk code :)):
serializers.py
class MySerializer(serializers.ModelSerializer):
    class Meta:
        model = CalendarEventAttendee
        fields = '__all__'
view.py
def my_view(request):
    validated_data = []
    # Separate valid data from invalid
    for record in request.data:
        if MySerializer(data=record).is_valid():
            validated_data.append(record)
    # bulk_insert valid data
    serializer = MySerializer(data=validated_data, many=True)
    if serializer.is_valid():
        serializer.save()
Can anyone suggest a better approach ?
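For what it's worth, the separation step in the view above can be written as a small framework-free helper that also keeps the rejected records, so they could be reported back to the caller. This is only a sketch of the same loop and does not claim to be a native DRF feature; split_records and has_email are hypothetical names, and with DRF the predicate could be lambda record: MySerializer(data=record).is_valid().

```python
def split_records(records, is_valid):
    """Partition records into (valid, invalid) using the given predicate."""
    valid, invalid = [], []
    for record in records:
        (valid if is_valid(record) else invalid).append(record)
    return valid, invalid


# Stand-in predicate used only to make the example runnable.
def has_email(record):
    return bool(record.get("email"))


rows = [{"email": "a@example.com"}, {"email": ""}, {"email": "b@example.com"}]
valid, invalid = split_records(rows, has_email)
print(len(valid), len(invalid))
```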

django-REST: Nested relationships vs PrimaryKeyRelatedField

Is it better to use nested relationships or PrimaryKeyRelatedField if you have lots of data?
I have a model with deep relationships.
For simplicity I did not add the columns.
Model:
Usecase:
User creates 1 Workoutplan with 2 Workouts and 3 WorkoutExercises.
User creates 6 Sets for each WorkoutExercise/Exercise.
User starts workout > new FinishedWorkout is created
User does first exercise and enters the used weights > new FinishedWorkoutExercise with FinishedSet is created
Question:
I want to track the progression for each workoutplan > workout > exercise.
So over time the user may have finished dozens of workouts, and therefore hundreds of sets are already in the database.
If I now use nested Relationships I may load a lot of data I don't need.
But if I use PrimaryKeyRelatedFields I have to load all the data I need separately which means more effort in my frontend.
Which method is preferred in such a situation?
Edit:
If I use PrimaryKeyRelatedFields how do I distinguish if e.g. Workouts in Workoutplan is an array with primary keys or an array with the loaded objects?
If you use PrimaryKeyRelatedField, you'll have a lot of overhead requesting the necessary data from the frontend.
In your case, I would create specific serializers with only the fields you want (using the Meta.fields attribute). That way you won't load unnecessary data and the frontend won't need to request more data from the backend.
I can write a sample code, if you need more details.
I'll get to the question regarding serializers in a second, but first, for clarification: what is the purpose of having duplicate models such as Workout/FinishedWorkout, Set/FinishedSet, ...?
Why not...
class Workout(models.Model):
    #...stuff...
    finished = models.DateTimeField(null=True, blank=True)
    #...more stuff...
Then you can just set a finished date on a workout when it's done.
Now, regarding the question. I would suggest you think about user interactions. What parts of the front-end are you trying to populate? How is the data related and how would the user access it?
You should think about what parameters you're querying DRF with. You can send a date and expect workouts finished on a specific day:
// This example is done in Angular, but you get the point...
var date = {
    'day': '24',
    'month': '10',
    'year': '2015'
};
API.finishedWorkout.query(date).$promise
    .then(function(workouts){
        //...workouts is an array of workout objects...
    });
Viewset...
class FinishedWorkoutViewset(viewsets.GenericViewSet, mixins.ListModelMixin):
    serializer_class = FinishedWorkoutSerializer
    queryset = Workout.objects.all()
    def list(self, request):
        # Query-string values arrive as strings, so convert them before building the date.
        params = request.query_params
        finished_on = datetime.date(int(params['year']), int(params['month']), int(params['day']))
        queryset = self.filter_queryset(
            self.get_queryset().filter(finished__date=finished_on, user=request.user)
        )
        page = self.paginate_queryset(queryset)
        serializer = self.get_serializer(queryset, many=True)
        return response.Response(serializer.data)
And then your FinishedWorkoutSerializer can just have whatever fields you want for that specific type of query.
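One detail worth spelling out for the date lookup: values sent in a query string arrive as strings, while datetime.date expects integers. A minimal sketch of the conversion, assuming the three keys from the Angular example (parse_day is a hypothetical helper name):

```python
import datetime


def parse_day(params):
    """Build a date from string-valued 'year', 'month' and 'day' entries.

    Raises KeyError if a part is missing and ValueError if the parts
    do not form a valid date.
    """
    return datetime.date(int(params["year"]), int(params["month"]), int(params["day"]))


print(parse_day({"day": "24", "month": "10", "year": "2015"}))
```

In a view, catching KeyError and ValueError around this call would allow a 400 response to be returned for a malformed date instead of a server error.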
This leaves you with a bunch of very specific URLs, which isn't all that great, but you can use specific serializers for those interactions and you're also free to change the filter dynamically, depending on which parameters are sent with the request.
There is also a chance that you may want to filter differently depending on which method is being called. Say you want to list only active exercises, but if a user queries a specific exercise, you want them to have access to it (note that the Exercise object should have a models.BooleanField attribute called "active").
class ExerciseViewset(viewsets.GenericViewSet, mixins.RetrieveModelMixin, mixins.ListModelMixin):
    serializer_class = ExerciseSerializer
    queryset = Exercise.objects.all()
    def list(self, request):
        queryset = self.filter_queryset(self.get_queryset().filter(active=True))
        page = self.paginate_queryset(queryset)
        serializer = self.get_serializer(queryset, many=True)
        return response.Response(serializer.data)
Now you have different objects show up on the same URL, depending on the action. It's a bit closer to what you need, but you're still using the same serializer, so if you need a huge nested object on retrieve(), you're also gonna get a bunch of them when you list().
In order to keep lists short and details nested, you need to use different serializers.
Let's say you want to send only the exercises' pk and name attributes when they are listed, but whenever a single exercise is queried, you want to send along all related "Set" objects ordered inside an array of "WorkoutSets"...
# Taken from an SO answer on an old question...
class MultiSerializerViewSet(viewsets.GenericViewSet):
    serializers = {
        'default': None,
    }
    def get_serializer_class(self):
        return self.serializers.get(self.action, self.serializers['default'])
class ExerciseViewset(MultiSerializerViewSet, mixins.RetrieveModelMixin, mixins.ListModelMixin):
    queryset = Exercise.objects.all()
    serializers = {
        'default': SimpleExerciseSerializer,
        'retrieve': DetailedExerciseSerializer
    }
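The lookup in get_serializer_class can be checked without DRF. In this sketch the two serializer classes are empty placeholders, MultiSerializerMixin and ExerciseView are illustrative names, and action is set by hand, whereas a DRF viewset sets self.action for each request:

```python
class MultiSerializerMixin:
    """Framework-free copy of the lookup, for illustration only."""

    serializers = {"default": None}
    action = None  # set by hand here; DRF viewsets set it per request

    def get_serializer_class(self):
        # Fall back to the 'default' entry when the action has no entry of its own.
        return self.serializers.get(self.action, self.serializers["default"])


# Placeholder classes standing in for real serializers.
class SimpleExerciseSerializer: ...
class DetailedExerciseSerializer: ...


class ExerciseView(MultiSerializerMixin):
    serializers = {
        "default": SimpleExerciseSerializer,
        "retrieve": DetailedExerciseSerializer,
    }


view = ExerciseView()
view.action = "list"
print(view.get_serializer_class().__name__)
view.action = "retrieve"
print(view.get_serializer_class().__name__)
```

Any action without its own entry, such as list, falls back to the default serializer, so only the detail view pays for the nested data.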
Then your serializers.py could look a bit like...
#------------------Exercise
#--------------------------Simple List
class SimpleExerciseSerializer(serializers.ModelSerializer):
    class Meta:
        model = Exercise
        fields = ('pk', 'name')
#--------------------------Detailed Retrieve
class ExerciseWorkoutExerciseSetSerializer(serializers.ModelSerializer):
    class Meta:
        model = Set
        fields = ('pk', 'name', 'description')
class ExerciseWorkoutExerciseSerializer(serializers.ModelSerializer):
    set_set = ExerciseWorkoutExerciseSetSerializer(many=True)
    class Meta:
        model = WorkoutExercise
        fields = ('pk', 'set_set')
class DetailedExerciseSerializer(serializers.ModelSerializer):
    workoutExercise_set = ExerciseWorkoutExerciseSerializer(many=True)
    class Meta:
        model = Exercise
        fields = ('pk', 'name', 'workoutExercise_set')
I'm just throwing around use cases and attributes that probably make no sense in your model, but I hope this is helpful.
P.S.; Check out how Java I got in the end there :p "ExcerciseServiceExcersiceBeanWorkoutFactoryFactoryFactory"
