Can Sling mappings be restricted to requests with host header - url-rewriting

I would like to selectively apply Sling mappings defined in sling:Mapping nodes under /etc/map.publish and can't get the behaviour I would like.
Essentially, I would like a mapping rule to apply only when the request's host header matches the rule.
I am currently using sling:Mapping nodes under /etc/map.publish to map resource paths to short URLs in the response.
So under /etc/map.publish/http/myapp I would have the following node:
<jcr:root ...>
jcr:primaryType="sling:Mapping"
sling:internalRedirect="/content/company/app/en"
sling:match="app.company.com"
</jcr:root>
What I would like is that when a user requests:
http://app.company.com/content/company/app/en/page.html
The URLs in the response (when mapped) are returned in the form:
http://app.company.com/page.html
The inbound and outbound URLs differ because I have Apache rewriting URLs for different device types.
However, when a request with a different host header arrives, such as:
http://localhost:4502/content/company/app/en/page.html
I do not want the URLs to be mapped according to that rule. Right now, it is being mapped to
http://app.company.com/page.html
It seems as though resolve() strictly considers the host/port when resolving the resource, whereas when URLs are mapped during output a "best match" is found and used. I would like map() to behave like resolve() if possible.

There are two mechanisms based on /etc/map:
URL resolver, using resolver.resolve(), which is responsible for transforming URLs like http://app.company.com/page.html into a content path, e.g. /content/company/app/en/page.html.
Link rewriter, using the resolver.map() method, which transforms the content and shortens all links of the form /content/company/app/en/page.html in <a>, <img>, etc. to the full URL. It works only if the appropriate sling:match property contains no regular expressions.
You can use the domain name to map/resolve content and, e.g., create a multidomain environment, so that http://app.company.com/page.html hits one resource and http://app.company2.com/page.html hits another.
However, you can't disable or enable the link rewriter depending on the current request host. E.g. if you configure the mappings as above, the /content/company/app/en/page.html content path will always be shortened to http://app.company.com/page.html, no matter what host header the request has.
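The asymmetry described above can be illustrated with a toy model. This is not Sling's implementation or API: MAPPINGS, resolve and map_path are hypothetical names, and the single entry stands in for the sling:Mapping node from the question. The sketch only shows that resolving looks at the request host while mapping looks at the content path alone.

```python
# Toy model (an assumption, not Sling code): one entry per sling:Mapping node.
MAPPINGS = [
    # (scheme, host, internal content path)
    ("http", "app.company.com", "/content/company/app/en"),
]


def resolve(scheme, host, path):
    """Inbound: the rule applies only if scheme and host match the request."""
    for rule_scheme, rule_host, internal in MAPPINGS:
        if (rule_scheme, rule_host) == (scheme, host):
            return internal + path
    return path


def map_path(resource_path):
    """Outbound: the first rule whose content path matches wins; the
    host of the current request is never consulted."""
    for rule_scheme, rule_host, internal in MAPPINGS:
        if resource_path.startswith(internal + "/"):
            return f"{rule_scheme}://{rule_host}{resource_path[len(internal):]}"
    return resource_path
```

In this model a request for localhost:4502 is not resolved through the rule, yet map_path still shortens the content path to the app.company.com URL, which is the behaviour reported in the question.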

If you want to make sure your inbound request is resolved, just add a second mapping to it.
Your mapping would look like this:
<jcr:root ...>
jcr:primaryType="sling:Mapping"
sling:internalRedirect="[/content/company/app/en,/content,/]"
sling:match="app.company.com"
</jcr:root>
Outbound mappings, such as resolver.map(), will use the first rule that applies.

QueryString Structure of a Conditional Retrieve in OneM2M?

This is an example resource tree.
I need to retrieve the latest 48 hours of data from cnt-2 and cnt-0 together. What kind of query string should I put in the request?
/in-cse
/in-cse/ae-123
/in-cse/cnt-2
/in-cse/cin-21
/in-cse/cin-22
/in-cse/cin-23
/in-cse/ae-124
/in-cse/cnt-0
/in-cse/cin-01
/in-cse/cin-02
/in-cse/cin-03
/in-cse/cnt-1
/in-cse/cin-11
/in-cse/cin-22
/in-cse/cin-33
Where should I put the ids of cnt-0 and cnt-2 in the query string?
/onem2m/api/v1/~/in-cse?fu=2&crb=20190808T000000&cra=20190806T000000&ty=4
Also, should I use only the query string to do discovery, or is it valid to make a POST request?
With the example request in your question you will also get all the matching <contentInstance> resources of cnt-1, because you do the discovery on the level of the IN-CSE. Unfortunately, you cannot have multiple targets in a single request, but I see at least two solutions that could work for your use case:
You can add labels to the <contentInstance> resources and add the label to your search.
/onem2m/api/v1/~/in-cse?fu=2&crb=20190808T000000&cra=20190806T000000&label=myLabel&ty=4
You can add a <group> that contains the <container> resources that are important to your use case (i.e. cnt-0 and cnt-2) and make the <group>'s fanoutPoint the target of your discovery request. The CSE is then responsible for redirecting the discovery to each member of the <group>.
/onem2m/api/v1/~/in-cse/aGroup/fopt?fu=2&crb=20190808T000000&cra=20190806T000000&ty=4
In my opinion the second method is the more "elegant" one because it makes the (application) relationship of the two <container> resources clearer, but the first one might also be feasible if your <contentInstance> resources are tagged with labels anyway.
Regarding the POST request: for the HTTP binding, query parameters are only allowed for filtering and discovery. Please have a look at TS-0009, section 6.2.2.2 Query component.
Btw, there are currently ongoing discussions in oneM2M to describe the differences between retrieval and discovery a bit better.
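As a rough sketch of how the 48-hour window in the query strings above could be computed rather than hard-coded: discovery_url is a hypothetical helper, the parameter names (fu, crb, cra, ty) and the aGroup/fopt target are copied from the URLs in this thread, and the timestamp format is assumed from the values shown in the question.

```python
from datetime import datetime, timedelta
from urllib.parse import urlencode

# Timestamp layout assumed from the question's example (20190808T000000).
TS_FORMAT = "%Y%m%dT%H%M%S"


def discovery_url(target, now, hours=48, resource_type=4):
    """Build a discovery query for resources created in the last `hours`."""
    params = {
        "fu": 2,                                                   # value copied from the question
        "crb": now.strftime(TS_FORMAT),                            # created before: now
        "cra": (now - timedelta(hours=hours)).strftime(TS_FORMAT), # created after: now - window
        "ty": resource_type,                                       # 4, as in the question
    }
    return f"/onem2m/api/v1/~/{target}?{urlencode(params)}"


print(discovery_url("in-cse/aGroup/fopt", datetime(2019, 8, 8)))
```

Passing now explicitly keeps the helper deterministic; a caller would normally supply the current time in whatever time zone the CSE expects.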

Ajax results filtering and URL parameters

I am building a results filtering page using AJAX requests. I would like to reflect the filters in the URL. For example: for price_from I want to add ?price_from=VAL to the URL.
I have a backend that is capable of rendering the page with URL parameters.
After some googling I found a Backbone.Router solution, which has a hash fallback for IE versions that do not support the HTML5 History API.
I am having trouble coming up with a good scheme for the routes. I have a set of filtering parameters (price_from, price_to, color, ...) and I would like to attach each parameter to one route.
Is it possible to chain the routes to match, for example, ?price_from=0&price_to=1&color=red? (The order of the items can change.)
In other words: can I trigger all the matching routes at once and keep IE backwards compatibility?
Your best bet would be to have a query portion of the URL rather than using GET parameters to denote the search criteria. For example:
Push state: /search/query/price_from=0&price_to=1&color=red
Hash based: #search/query/price_from=0&price_to=1&color=red
Your backend would of course need to change a bit to be able to parse the new URL structure.
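A minimal sketch of what that backend parsing might look like, assuming the search/query/ prefix from the two example URLs above; parse_filters is a hypothetical helper name. Because the result is a dictionary, the order of the parameters in the URL does not matter.

```python
from urllib.parse import parse_qs

# Route prefix taken from the examples above, without leading "/" or "#".
PREFIX = "search/query/"


def parse_filters(route):
    """Return the filter parameters from a push-state path or hash fragment."""
    route = route.lstrip("#/")          # accept "/search/..." and "#search/..."
    if not route.startswith(PREFIX):
        return {}
    pairs = parse_qs(route[len(PREFIX):])
    # Keep the first value of each parameter; repeated keys are ignored here.
    return {name: values[0] for name, values in pairs.items()}
```

The same function handles both the push-state and the hash-based form, so the server and any fallback code can share one parsing rule.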

RESTful api for dynamic showform on top of spring mvc

I want to build a typical MVC app for CRUD of simple items, and the APIs should be RESTful. The catch is that I have a large palette of items that need to be initialized. On the server side those items are defined as Java beans, and the corresponding create form for each item is generated dynamically from the field information (data type, validation constraints, etc.) harvested from the bean.
I am new to REST and have just read that URLs should be nouns identifying the resource, with the action specified by the HTTP verb. From that perspective, how do I turn something like
.../client/showForm?type=xyz into a RESTful URL? My intention is to tell the server to dynamically construct and send back a CREATE form for a client of type xyz. The obvious problem with the URL above is that it specifies an action, which, from what I have read, makes it non-RESTful.
When I think of REST, I think of resources. I think of data. In other words, I don't think of REST as being something that I would typically use to retrieve a form, which is a user interface component.
REST is an architectural style that is used to identify a resource on a server using a uniform resource identifier, or URI. Additionally, actions performed on those resources identified by the URI are determined based on the specific HTTP Method used in the request: GET, POST, PUT, DELETE, etc.
Thus, let's say you have a Client object. That client object might have the following properties:
Name
Location
AccountNumber
If I wanted to retrieve the data for a single client, I might use the following URI:
GET /client/xyz/ # xyz is the account number used to identify the client.
I would use a GET method, since REST describes GET as being the method to use when retrieving data from the server.
The data could theoretically be returned in HTML, since REST is not a standard but more like a series of flexible guidelines; however, to really decouple my data from my user interface, I would choose to use something platform independent like JSON or XML to represent the data.
Next, when adding a client to the collection on the server, I would use the /client/ URI pattern, but I would use the HTTP Method POST, which is used when adding a resource to a collection on the server.
# Pass the data as JSON to the server and tell the server to add the client to the
# collection
POST /client/ {"accountnumber":"abc" , "Name" : "Jones" , "Location" : "Florida"}
If I were to modify an existing record on the server or replace it, I would most likely use the HTTP Method PUT, since PUT is meant to be idempotent: repeating the same request leaves the server in the same state as sending it once.
# Replace the client abc with a new resource
PUT /client/abc/ {"accountnumber":"abc" , "Name" : "Bob Jones" , "Location" : "Florida"}
The general idea behind REST is that it is used to identify a resource and then take action on that resource based on what HTTP Method is used.
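To make the method-plus-URI idea concrete, here is a small in-memory sketch. It is not Spring MVC code: handle and clients are hypothetical names, the dictionary stands in for the server-side collection, and the status codes are the conventional ones rather than anything prescribed above.

```python
import json

clients = {}  # in-memory stand-in for the collection behind /client/


def handle(method, path, body=None):
    """Pick the action from the HTTP method and the resource from the path.

    Returns a (status, payload) tuple.
    """
    parts = [p for p in path.split("/") if p]
    if parts[:1] != ["client"]:
        return 404, None
    if method == "POST" and len(parts) == 1:
        record = json.loads(body)
        clients[record["accountnumber"]] = record   # add to the collection
        return 201, record
    if len(parts) == 2:
        key = parts[1]
        if method == "GET":
            return (200, clients[key]) if key in clients else (404, None)
        if method == "PUT":
            clients[key] = json.loads(body)         # replace; safe to repeat
            return 200, clients[key]
    return 405, None
```

Sending the same PUT twice leaves clients in the same state as sending it once, whereas the URI never names an action such as "showForm" or "update".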
If you insist on coupling your data with your view, one way to accomplish this and retrieve the actual form, with the client data, could be to represent the form as a resource itself:
GET /client/abc/htmlform/
This URL would of course return your client data for client abc, but in an HTML form that would be rendered by the browser.
While my style of coding utilizes data transports such as JSON or XML to abstract and separate my data from my view, you could very well transport that data as HTML. However, the advantage of using JSON or XML is that your RESTful API becomes platform independent. If you ever expand your API to where other developers wish to consume it, they can do so, regardless of what specific platform or programming language they are using. In other words, the API could be used by PHP, Java, C#, Python, Ruby, or Perl developers.
Any language or platform that can make HTTP requests and send GET, POST, PUT, and DELETE requests can be used to extend or build upon your API. This is the true advantage of REST.
For more information on setting up your controllers to use REST with Spring MVC, see this question. Additionally, check out the Spring MVC Documentation for more information.
Also, if you haven't checked out the Wikipedia article on REST, I strongly encourage you to do so. Finally, another good, classic read on REST is How I Explained REST To My Wife. Enjoy.

Detecting URL rewrites (SEO urls)

How could a client detect whether a server is using search engine optimization techniques, such as using mod_rewrite to implement "SEO friendly URLs"?
For example:
Normal url:
http://somedomain.com/index.php?type=pic&id=1
SEO friendly URL:
http://somedomain.com/pic/1
Since mod_rewrite runs server side, there is no way a client can detect it for sure.
The only thing you can do client side is to look for some clues:
Is the generated HTML dynamic, i.e. does it change between calls? Then /pic/1 must be handled by some script and is most likely not the real URL.
As already mentioned: are there <link rel="canonical"> tags? If so, the website is telling search engines which of several URLs with the same content they should use.
Modify parts of the URL and see if you get a 404. In /pic/1 I would modify "1".
If there is no mod_rewrite, the server will return a 404. If there is, the error is handled by the server-side scripting language, which can return a 404 but in most cases returns a 200 page that prints an error.
You can use a <link rel="canonical" href="..." /> tag.
The SEO aspect is usually in the words in the URL, so you can probably ignore any parts that are numeric. Usually SEO is applied over a group of like content that has a common base URL, for example:
Base www.domain.ext/article, with full URL examples being:
www.domain.ext/article/2011/06/15/man-bites-dog
www.domain.ext/article/2010/12/01/beauty-not-just-skin-deep
The SEO aspect of the URL is then the suffix. The algorithm is to typify each "folder" after the common base, assigning it a "datatype" (numeric, text, alphanumeric), and then to score one point for each of the following:
HTTP Response Code is 200: should be obvious, but you can get a 404 www.domain.ext/errors/file-not-found that would pass the other checks listed.
Non Numeric, with Separators, Spell Checked: separators are usually dashes, underscores or spaces. Take each word and perform a spell check; score the point if the words are valid, including proper names.
Spell Checked URL Text on Page: if the text passes a spell check, analyze the page content to see if it appears there.
Spell Checked URL Text on Page Inside a Tag: if prior is true, mark again if text in its entirety is inside an HTML tag.
Tag is Important: if prior is true and tag is <title> or <h#> tag.
Usually with this approach you'll have a max of 5 points, unless multiple folders in the URL meet the criteria, with higher values being better. Now you can probably improve this by using a Bayesian probability approach that uses the above to featurize (i.e. detects the occurrence of some phenomenon) URLs, plus come up with some other clever featurizations. But, then you've got to train the algorithm, which may not be worth it.
Now based on your example, you also want to capture situations where the URL has been designed such that a crawler will index because query parameters are now part of the URL instead. In that case you can still typify suffixes' folders to arrive at patterns of data types - in your example's case that a common prefix is always trailed by an integer - and score those URLs as being SEO friendly as well.
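The "typify each folder" step could be sketched as follows. This covers only the datatype assignment, not the spell checking or page analysis; typify and path_pattern are hypothetical names, and the rule that "text" means letters joined by dashes, underscores or spaces is an assumption based on the separators listed above.

```python
import re

# Words made of letters, joined by the separators named in the answer.
TEXT = re.compile(r"[A-Za-z]+(?:[-_ ][A-Za-z]+)*")


def typify(segment):
    """Assign a coarse datatype to one folder of the URL path."""
    if segment.isdigit():
        return "numeric"
    if TEXT.fullmatch(segment):
        return "text"
    return "alphanumeric"


def path_pattern(path, base):
    """Datatypes of the folders that follow the common base path."""
    suffix = path[len(base):] if path.startswith(base) else path
    return [typify(s) for s in suffix.split("/") if s]
```

For /article/2011/06/15/man-bites-dog with base /article this gives three numeric folders followed by a text folder, while /pic/1 with base /pic gives a single numeric folder, the "common prefix always trailed by an integer" pattern.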
I presume you would be using one of the curl variants.
You could try sending the same request but with different "user agent" values.
I.e. send the request once using the user agent "Mozilla/5.0" and a second time using the user agent "Googlebot". If the server is doing something special for web crawlers, then there should be a different response.
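A sketch of that comparison using Python's standard library instead of curl. fetch and responses_differ are hypothetical helper names, the two user-agent strings are the ones suggested above, and comparing a hash of the body is only one possible notion of "different".

```python
import hashlib
import urllib.request

USER_AGENTS = ["Mozilla/5.0", "Googlebot"]


def fetch(url, user_agent):
    """Fetch url with the given User-Agent; returns (status, body bytes).

    Note: urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    """
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.status, resp.read()


def responses_differ(url, fetcher=fetch):
    """True if the status or body differs between the two user agents."""
    seen = set()
    for user_agent in USER_AGENTS:
        status, body = fetcher(url, user_agent)
        seen.add((status, hashlib.sha256(body).hexdigest()))
    return len(seen) > 1
```

A page whose content changes on every request will also show up as different, so it is worth fetching twice with the same user agent first to establish a baseline.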
With today's frameworks and the URL routing they provide, I don't even need mod_rewrite to create friendly URLs such as http://somedomain.com/pic/1, so I doubt you can detect anything. I would create such URLs for all visitors, crawlers or not. Maybe you can spoof some bot headers to pretend you're a known crawler and see if there's any change. Dunno how legal that is tbh.
For dynamic URL patterns, it's better to use a <link rel="canonical" href="..." /> tag for the other duplicates.

how does URL rewrite work in plain english

I have read a lot about URL rewriting but I still don't get it.
I understand that a URL like
http://www.example.com/Blog/Posts.php?Year=2006&Month=12&Day=19
can be replaced with a friendlier one like
http://www.example.com/Blog/2006/12/19/
and the server code can remain unchanged because there is some filter which transforms the new URL and sends it to the old, but does it replace the URLs in the HTML of the response too?
If the server code remains unchanged then it is possible that in my returned HTML code I have links like:
http://www.example.com/Blog/Posts.php?Year=2006&Month=12&Day=20
http://www.example.com/Blog/Posts.php?Year=2006&Month=12&Day=21
http://www.example.com/Blog/Posts.php?Year=2006&Month=12&Day=22
This defeats the purpose of having the nice URLs if in my page I still have the old ones.
Does URL rewriting (with a filter or something) also replace this content in the HTML?
Put another way... do the rewrite rules apply for the incoming request as well as the HTML content of the response?
Thank you!
The URL rewriter simply takes the incoming URL and if it matches a certain pattern it converts it to a URL that the server understands (assuming your rewrite rules are correct).
It does mean that a specific resource can be accessed multiple ways, but this does not "defeat the point", as the point is to have nice looking URLs, which you still do.
The rewrite rules do not rewrite the outgoing content, only the incoming URL.
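In code terms, a rewrite rule is just a pattern applied to the incoming path. The sketch below uses the blog URLs from the question; RULE and rewrite are hypothetical names, and a real server would express the same pattern in its own rewrite configuration rather than in Python.

```python
import re

# Friendly form: /Blog/<year>/<month>/<day>/
RULE = re.compile(r"^/Blog/(\d{4})/(\d{2})/(\d{2})/?$")


def rewrite(path):
    """Translate a friendly incoming path into the one the server code knows.

    Paths that do not match the rule are passed through unchanged.
    """
    match = RULE.match(path)
    if not match:
        return path
    year, month, day = match.groups()
    return f"/Blog/Posts.php?Year={year}&Month={month}&Day={day}"


print(rewrite("/Blog/2006/12/19/"))
```

Nothing here touches the HTML that Posts.php returns, so any links in the page have to be generated in the friendly form by the application itself.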

Resources