YARN REST API: XML as default - hadoop

Is there a way to change the default response format for GET requests like /ws/v1/cluster/info to be XML?
I know that I can specify the Accept: application/xml header with my request. However, I want to change the default value so that I can omit the header.

From my reading of the Yarn source code, the distinction between JSON and XML is completely delegated to the underlying JAX-RS infrastructure, with annotations like
@Produces({ MediaType.APPLICATION_JSON, MediaType.APPLICATION_XML })
all over the code. This mechanism (called "Static Content Negotiation") specifies that the first entry in the list is the default, which is consistent with the behavior you're seeing. One could use the javax.ws.rs.core.Variant class (and a technique called "Runtime Content Negotiation") to override this, but I can't find any use of it in the codebase.
If you're willing to make a small modification to the source and rebuild it, you could simply find all of these @Produces declarations and swap the order. If you decide to do this, you'll want to be mindful of the apparent bug described here. If it turns out to still be relevant (and it was recent), you may find you have to tackle all the complexity of runtime content negotiation anyway.
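For what it's worth, a swapped declaration would look roughly like this (the resource and method names below are only illustrative, not the actual YARN classes):

import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.Produces;
import javax.ws.rs.core.MediaType;
import javax.xml.bind.annotation.XmlRootElement;

// Illustrative resource only; the real YARN resource classes are named differently.
@Path("/ws/v1/cluster")
public class ClusterInfoResource {

    // XML listed first now wins static content negotiation, so a request
    // without an Accept header gets XML back by default.
    @GET
    @Path("/info")
    @Produces({ MediaType.APPLICATION_XML, MediaType.APPLICATION_JSON })
    public ClusterInfo getClusterInfo() {
        return new ClusterInfo();
    }

    @XmlRootElement
    public static class ClusterInfo {
        public String state = "STARTED";
    }
}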
It should be simple enough to try it out, but if you don't have any other reason to build from sources it's probably overkill.

Related

Why does Elasticsearch use PUT instead of POST for creating an index?

As far as I know, POST is usually used for changing the state of the server, and PUT usually for updating the information. If I am creating a new index, should it not be POST instead of PUT? PUT does make sense when creating a document as it changes the state of data.
Your statement
As far as I know, POST is usually used for changing the state of the server, and PUT usually for updating the information.
does conform to the conventional HTTP vs CRUD semantics:
HTTP method | CRUD equivalent | Description
POST        | Create          | Let the target resource process the representation enclosed in the request.
PUT         | Update          | Set the target resource's state to the state defined by the representation enclosed in the request.
However, the PUT spec also stipulates that:
The PUT method requests that the state of the target resource be created or replaced with the state defined by the representation enclosed in the request message payload.
As such, PUT can be (and is) used in Elasticsearch both to create an index AND to update its settings and mappings.
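To make that concrete, here is a minimal sketch using Java's built-in HttpClient against a hypothetical local node (localhost:9200, index name my-index); the same PUT verb first creates the index and later updates its settings:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class PutIndexExample {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();

        // PUT creates the index if it does not exist yet.
        HttpRequest createIndex = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:9200/my-index"))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(
                        "{\"settings\":{\"number_of_shards\":1}}"))
                .build();
        System.out.println(client.send(createIndex,
                HttpResponse.BodyHandlers.ofString()).body());

        // The same verb later updates the index settings.
        HttpRequest updateSettings = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:9200/my-index/_settings"))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(
                        "{\"index\":{\"number_of_replicas\":2}}"))
                .build();
        System.out.println(client.send(updateSettings,
                HttpResponse.BodyHandlers.ofString()).body());
    }
}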
Also, keep in mind that it's rarely just a matter of strict adherence to the semantics. One of the creators of ES put it this way:
It's all about REST semantics.
And our understanding of the semantics at the time when we made the APIs. And backwards compatibility constraints. And whatever "feels" natural to the person who implemented the API.
Where it makes a lot of sense Elasticsearch maps the HTTP verbs to useful things. But when it doesn't make a ton of sense we just go with whatever verb feels good rather than trying to be super strict about REST. Also, we don't do linked data, instead relying on you to build links from context. I'm told that is particularly non-REST. But it is what we do.

What is the point of having PATCH, POST, PUT types when we use repository save methods for all?

As a newcomer to Spring, I would like to know the actual difference between:
@PostMapping
@PutMapping
@PatchMapping
My understanding is that PUT is for updates, but then we have to fetch the element by its id and then save() it. Similarly, the same save() method is used for POST, which automatically replaces the entity by its identifier (primary key). In my application I am able to use all three of these methods interchangeably.
What is the point of having PATCH, POST, PUT types when we use repository save methods for all?
HTTP method tokens are used to define request semantics in such a way that general purpose components (browsers, reverse proxies, etc) can exploit the information to do intelligent things.
The easiest of these is that PUT has idempotent semantics; if an http response is lost, a general purpose component knows that it may autonomously retry sending the request. This in turn gives you a bit of extra reliability over an unreliable network, "for free".
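As a small client-side illustration (plain java.net.http; the URL and payload below are made up), replaying a PUT leaves the resource in the same final state, so retrying after a lost response is safe in a way it would not be for POST:

import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class IdempotentPutRetry {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest put = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:8080/items/42"))   // made-up resource
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString("{\"name\":\"widget\"}"))
                .build();

        for (int attempt = 1; attempt <= 3; attempt++) {
            try {
                HttpResponse<String> resp =
                        client.send(put, HttpResponse.BodyHandlers.ofString());
                System.out.println("Status: " + resp.statusCode());
                break;                             // success, stop retrying
            } catch (IOException lostResponse) {
                // Safe only because PUT is idempotent: replaying it yields
                // the same final state; a replayed POST could create duplicates.
                System.out.println("Attempt " + attempt + " failed, retrying");
            }
        }
    }
}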
The fact that your origin server uses the same persistence mechanism for each is an implementation detail, something deliberately hidden behind the "uniform interface".
The difference between PATCH and POST is subtle; PATCH gives you an unambiguous way to designate that the enclosed entity is a patch document, and offers a mechanism for discovering which patch document formats are understood by the origin server, neither of which you get from POST alone.
What's less clear, at least to me, is whether PATCH semantics allow an intermediate component to do something intelligent with a request - in other words, do the additional constraints (relative to POST) allow intermediaries to do anything interesting?
As best I can tell, the semantics of a PATCH request are more specific, but not actionably more specific -- certainly not as obviously as we have in the case of safe or idempotent request semantics.
POST is for creating a brand new object.
PUT will replace all of an object's properties in one go.
Leaving a property empty will empty the value in the datastore.
PATCH does a partial update of an object.
You can send it just the properties which should be updated.
A PATCH request with all object properties included will have the same effect as a PUT request. But they are not the same.
HTTP methods are a convention that is not specific to Spring; they are a main pillar of REST API design.
They make sure the intent of a request is clear and both the provider and consumer are in agreement of the end result.
Kind of like the pedals or gear shift in our cars. It's a lot easier when they all work the same.
Switching them up could lead to a lot of accidents.
For us as developers, it means we can expect most REST APIs to behave in a similar way, assuming an API is implemented according to or reasonably close to the specification.
POST/PUT/PATCH may look alike but there are subtle differences.
As you mention, the PUT and PATCH methods require some kind of ID of the object to be updated.
Consider a combined POST/PUT/PATCH endpoint that receives a request with an object omitting some of its properties. How does the API react? It could:
Update only the received properties.
Update the entire object, emptying the omitted properties.
Attempt to create a new object.
How is the consumer of the endpoint to know which of the three actions the server took?
This is where the HTTP method and specification/convention help determine the appropriate course of action.
Spring may provide a save() method that can handle creation, full updates, and partial updates alike. But this is not necessarily the case for other frameworks in Java or other languages.
Also, your application may be simple enough to handle POST/PUT/PATCH in the same controller method right now.
But over time as your application grows more complex, the separation of concerns makes your code a lot cleaner, more readable and maintainable.
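As a rough sketch of that separation (the Product entity and ProductRepository below are hypothetical, not taken from the question), the three verbs can map to three handler methods that all end in the same save() call:

import org.springframework.web.bind.annotation.PatchMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.PutMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

// Assumes a Spring Data style repository and a simple entity, e.g.:
//   interface ProductRepository extends CrudRepository<Product, Long> {}
//   class Product { Long id; String name; Double price; /* getters/setters */ }
@RestController
@RequestMapping("/products")
public class ProductController {

    private final ProductRepository repository;

    public ProductController(ProductRepository repository) {
        this.repository = repository;
    }

    // POST: create a brand new product; the server assigns the id.
    @PostMapping
    public Product create(@RequestBody Product product) {
        return repository.save(product);
    }

    // PUT: full replacement; fields missing from the request end up empty.
    @PutMapping("/{id}")
    public Product replace(@PathVariable Long id, @RequestBody Product product) {
        product.setId(id);
        return repository.save(product);
    }

    // PATCH: partial update; only fields present in the request change.
    @PatchMapping("/{id}")
    public Product patch(@PathVariable Long id, @RequestBody Product changes) {
        Product existing = repository.findById(id).orElseThrow();
        if (changes.getName() != null) {
            existing.setName(changes.getName());
        }
        if (changes.getPrice() != null) {
            existing.setPrice(changes.getPrice());
        }
        return repository.save(existing);   // same save(), different semantics
    }
}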

Ignore an Empty Request Body?

After writing tests for my REST API (built using Spring Boot), I realised that, even when the request body is not used (see below), calling the endpoint with a request body succeeds; effectively, Spring ignores the body.
This is not a huge issue, but I was wondering what philosophy I should approach this with:
Should I fail when passed an unexpected body (when I do not expect any)?
If so, is this configurable in Spring so that there is strict checking of body/parameters?
Finally, beyond personal preference and/or experience, is there a good way of deciding this, or should I simply use 'if it isn't broken, don't fix it' as my mantra?
@PatchMapping(value = "/products/{pid}/sell")
public TxDTO sell(@NotBlank @PathVariable("pid") String pid,
                  @NotNull @RequestParam Float price)
I don't think you should read too much into this. Technically, ignoring an unexpected body doesn't violate any software development principles. Even though it might make you personally uncomfortable in the context of your project, consider other scenarios where there's a filter or servlet sitting in front of your @RestController doing additional things you're not aware of.
The point is that this is not a feature you should turn off globally, nor is it worth spending time implementing custom code to turn it off locally for a single endpoint :).
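For completeness, if you did decide to enforce this for one endpoint anyway, a sketch along these lines (a plain Spring HandlerInterceptor; the path check is purely hypothetical) is roughly what it would take, which also shows why it's rarely worth the effort:

import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import org.springframework.web.servlet.HandlerInterceptor;

// Hypothetical interceptor (register it via a WebMvcConfigurer) that rejects
// any request carrying a body on the /sell endpoint.
public class RejectUnexpectedBodyInterceptor implements HandlerInterceptor {

    @Override
    public boolean preHandle(HttpServletRequest request,
                             HttpServletResponse response,
                             Object handler) throws Exception {
        boolean hasBody = request.getContentLengthLong() > 0;
        if (hasBody && request.getRequestURI().endsWith("/sell")) {
            response.sendError(HttpServletResponse.SC_BAD_REQUEST,
                    "This endpoint does not accept a request body");
            return false;   // stop processing the request
        }
        return true;        // otherwise continue as normal
    }
}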

Constants class or properties file for declaring URL mappings?

Is there any strong reasons to choose one over the other when declaring the mappings for url resources?
@RequestMapping(Mappings.USER)
vs
@RequestMapping("${mappings.user}")
I understand that property files can be modified after deployment, and that might be a reason to keep the mappings in properties if you want them to be changed easily, right? But changing them easily could also be undesirable. So, for those with experience, which do you prefer, and why? I think a constants class might be easier to refactor: if I wanted to change the name of a resource I would only have to refactor inside the constants class, whereas with properties I would have to change both the properties file and everywhere that uses the mapping (I'm using Eclipse and as far as I know it doesn't have property-name refactoring like that). Or is there a third option of neither, declaring them all as literals inside the controllers?
It all depends on your use case. If you need to change URIs without recompiling, property files are the way to go. Otherwise, constants provide type safety and ease of unit testing that SpEL placeholders don't. If you're not going to change or reuse them (reusing the same URI for GET and POST is very common, for example), I don't see much need for constants at all.
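To make the two styles concrete, here is a small sketch; the paths, class names, and the mappings.user property are made up for illustration:

import java.util.Collections;
import java.util.List;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

// Constants approach: the path lives in one class, references are
// compile-time checked, and renaming a resource is a single refactor.
final class Mappings {
    static final String USER = "/api/users";   // hypothetical path
    private Mappings() {}
}

@RestController
class UserConstantsController {

    @RequestMapping(Mappings.USER)
    public List<String> listUsers() {
        return Collections.emptyList();
    }
}

// Properties approach: the placeholder is resolved from configuration
// (e.g. mappings.user=/api/users in application.properties), so the URI
// can change per environment without recompiling.
@RestController
class UserPropertiesController {

    @RequestMapping("${mappings.user}")
    public List<String> listUsers() {
        return Collections.emptyList();
    }
}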

Store result into cache in Play 2.2

In Play Framework 2.2 it is very simple to create a result for the current request. We just type:
Ok(views.html.default.render())
Then, to make it work, it is enough to wrap it in an Action, so the final code looks like:
def index = Action {
  Ok(views.html.default.render())
}
That is fine. But now I want to store the response in a cache to make it more scalable. I use EHCache. The issue is that when I store it in the cache, it throws
NotSerializableException: play.api.mvc.ActionBuilder$$anon$1
I tried to cache at least the result itself, but it throws
ERROR net.sf.ehcache.store.disk.DiskStorageFactory Disk Write of result-key failed:
java.io.NotSerializableException: play.api.libs.iteratee.Enumerator$$anon$18
I know that the values are stored in the cache, but only in memory, which might be insufficient under really high load with many distinct responses.
Question:
So my question is whether there is any way to fully cache Play actions/results, including proper serialization.
Edit:
How I try to use the cache: I do not use Cached {} because it doesn't behave exactly how I need, so I tried to design it my own way. Just for testing purposes, I currently use it more verbosely:
Cache.set("myaction", Action {
  Ok(views.html.default.render())
})
or
Cache.set("myresponse", Ok(views.html.default.render()))
But both of these produce the exceptions mentioned above.
About the cache: the Play cache API is not sufficient for me, so I extended it with a couple of extra methods together with a new plugin implementation. At first I tried to just copy the default plugin and implement those extensions, but there were some issues, so I fixed them as recommended here (the plugin fix). Since then it seems that it actually uses EHCache (judging from those exceptions).
It seems to me that you are not storing the result in the cache but the action, which contains a closure that cannot be serialized. I guess this is not what you want to do anyway; is it because you are using EHCache directly?
If you use the Play cache API it should help you do the right thing. You can find the docs for it here: http://www.playframework.com/documentation/2.2.x/ScalaCache
The response may still not be serializable though. If you really want a cache that serializes to disk, you should be able to cache the HTML generated by the template (it is basically a string) and then reuse that, creating a new response for every request.
(My gut feeling is that you would probably get better performance from rendering the template every time than from having the cache read it from disk, unless you have some really crazy complex templates.)
Not sure whether it applies to 2.2, but according to this issue I reported:
if you're calling the set method directly on an implementation of CacheApi and the implementation expects a serializable object, use the wrapper which is also used by the @cached helper.
