At the moment I have the following understanding (which, I assume, is incomplete and probably even wrong).
A web server receives request from a client. The requests are coming to a particular "path" ("address", "URL") and have a particular type (GET, POST and probably something else?). The GET and POST requests can also come with variables and their values (which can be though as a "dictionary" or "associate array"). The parameters of GET requests are set in the address line (for example: http://example.com?x=1&y=2) while parameters of POST requests are set by the client (user) via web forms (in other words, a user fills in a form and press "Submit" button).
In addition to that we have what is called SESSION (also known as COOKIES). This works the following way. When a web-server gets a request (of GET or POST type) it (web server) checks the values of the sent parameters and based on that it generates and sends back to the client HTML code that is displayed in a browser (and is seen by the user). In addition to that the web servers sends some parameters (which again can be imagined as "dictionary" or "associative arrays"). These parameters are saved by the browser somewhere on the client side and when a client sends a new request, he/she also sends back the session parameters received earlier from the web server. In fact server says: you get this from me, memorize it and next time when you speak to me, give it back (so, I can recognize you).
So, what I do not know is if client can see what exactly is in the session (what parameters are there and what values they have) and if client is able to modify the values of these parameters (or add or remove parameters). But what user can do, he/she might decide not to accept any cookies (or session).
There is also something called "local storage" (it is available in HTML5). It works as follows. Like SESSION it is some information sent by web-server to the client and is also memorized (saved) by the client (if client wants to). In contrast to the session it is not sent back b the client to the server. Instead, JavaScripts running on the client side (and send by web-servers as part of the HTML code) can access information from the local storage.
What I am still missing, is how AJAX is working. It is like by clicking something in the browser users sends (via Browser) a request to the web-server and waits for a response. Then the browser receives some response and use it to modify (but not to replace) the page observed by the user. What I am missing is how the browser knows how to use the response from the web-server. Is it written in the HTML code (something like: if this is clicked, send this request to the web server, and use its answer (provided content) to modify this part of the page).
I am going to answer to your questions on AJAX and LocalStorage, also on a very high level, since your definition strike me as such on a high level.
AJAX stands for Asynchronous JavaScript and XML.
Your browser uses an object called XMLHTTPRequest in order to establish an HTTP request with a remote resource.
The client, being a client, is oblivious of what the remote server entails on. All it has to do is provide the request with a URL, a method and optionally the request's payload. The payload is most commonly a parameter or a group of parameters that are received by the remote server.
The request object has several methods and properties, and it also has its ways of handling the response.
What I am missing is how the browser knows how to use the response
from the web-server.
You simply tell it what to do with the reponse. As mentioned above, the request object can also be told what to do with a response. It will listen to a response, and when such arrives, you tell the client what to do with it.
Is it (the response) written in the HTML code?
No. The response is written in whatever the server served it. Most commonly, it's Unicode. A common way to serve a response is a JSON (JavaScript Object Notation) object.
Whatever happens afterwards is a pure matter of implementation.
LocalStorage
There is also something called "local storage" (it is available in
HTML5). It works as follows. Like SESSION it is some information sent
by web-server to the client and is also memorized (saved) by the
client (if client wants to)
Not entirely accurate. Local Storage is indeed a new feature, introduced with HTML5. It is a new way of storing data in the client, and is unique to an origin. By origin, we refer to a unique protocol and a domain.
The life time of a Local Storage object on a client (again, per unique origin), is entirely up to the user. That said, of course a client application can manipulate the data and decide what's inside a local storage object. You are right about the fact that it is stored and can be used in the client through JavaScript.
Example: some web tracking tools want to have some sort of a back up plan, in case the server that collects user data is unreachable for some reason. The web tracker, sometimes introduced as a JavaScript plugin, can write any event to the local storage first, and release it only when the remote server confirmed that it received the event successfully, even if the user closed the browser.
First of all, this is just a simple explanation to clarify your mind. To explain this stuff in more detail we would need to write a book. This been said, I'll go step by step.
Request
A request is a client asking for / sending data to a server.
This request has the following parts:
An URL (Protocol + hostname/IP + path)
A Method (GET, POST, PUT, DELETE, PATCH, and so on)
Some optional parameters (the way they are sent depends on the method)
Some headers (metadata sent to the server)
Some optional cookies
An optional SESSION ID
Some explanations about this:
Cookies can be set by the client or by the server, but they are always stored by the client's browser. Therefore, the browser can decide whether to accept them or not, or to delete or modify them
Session is stored in the server. The server sends the client a session ID to be able to recongnize him in any future request.
Session and cookies are two different things. One is server side, and the other is client side.
AJAX
I'll ommit the meaning of the acronym as you can easily google it.
The great thing about AJAX is the very first A, that stands for asynchronous, what means that the JS engine (in this case built in the browser) won't block until the response gets back.
To understand how AJAX works, you have to know that it's very much alike a common request, but with the difference that it can be triggered without reloading the web page.
The content of the response it whatever you want it to be. From some HTML code, to a JSON string. Even some plain text.
The way the response is treated depends on the implementation and programming. As an example, you could simply alert() the result of an AJAX call, or you and append it to a DOM element.
Local Storage
This doesn't have much to do with anything.
Local storage is just some disk space offered by the browser so you can save data in the browser that persists even if the page or the browser is closed.
An example
Chrome offers a javascript API to manage local storage. It's client side, and you can programmatically access to this storage and make CRUD opperations. It's just like a non-sql non-relational DB in the browser.
I wil summarize your main questions along with a brief answer right below them:
Q1:
Can the client see what exactly is in the session?
A: No. The client only knows the "SessionID", which is meta-data (all other session-data is stored on server only, and client can't see or alter it). The SessionID is used by the server only to identify the client and to map the application process to it's previous state.
HTTP is a stateless protocol, and this classic technique enables it as stateful.
There are very rare cases when the complete session data is stored on client-side (but in such cases, the server should also encrypt the session data so that the client can't see/alter it).
On the other hand, there are web clients that don't have the capability to store cookies at all, or they have features that prevent storing cookie data (e.g. the ability of the user to reject cookies from domains). In such cases, the workaround is to inject the SessionID into URL parameters, by using HTTP redirects.
Q2:
What's the difference between HTML5 LocalStorage and Session?
LocalStorage can be viewed as the client's own 'session' data, or better said a local data store where the client can save/persist data. Only the client (mainly from javascript) can access and alter the data. Think of it as javascript-controlled persistent storage (with the advantage over cookies that you can control what data, it's structure and the format you want to store it). It's also more advantageous than storing data to cookies - which have their own limitations such as data size and structure.
Q3:
How AJAX works?
In very simple words, AJAX means loading on-demand data on top of an already loaded (HTML) page. A typical http request would load the whole data of a page, while an ajax request would load and update just a portion of the (already-loaded) page.
This being said, an AJAX request is very similar to a standard HTTP Request.
Ajax requests are controlled by the javascript code and it can enrich the interaction with the page. You can request specific segments of data and update sections of the page.
Now, if we remember the old days when any interaction with a website (eg. signing in, navigating to other pages etc.) required a complete page reload? Back then, a lot of unnecessary traffic occurred just to perform any simple action. This in turn impacted site responsiveness, user experience, network traffic etc.
This happened due to browsers incapability (at that time) to [a.] perform a parallel HTTP request to the server and [b]render a partial HTML view.
Modern browsers come with these two features that enables AJAX technology - that is, invoking asynchronous(parallel) HTTP Requests (Ajax HTTP Requests) and they also provide on-the-fly DOM alteration mechanism via javascript (real-time HTML Document Object Model manipulation).
Please let me know if you need more info on these topics, or if there's anything else I can help with.
For a more profound understanding, I also recommend this nice web history article as it explains how everything started from when HTML was created and what was it's purpose (to define [at the time] rich documents), and then how HTTP was initially created and what problem it solved (at the time - to "transfer" static HTML). That explains why it is a stateless protocol.
Later on, as HTML and the WEB evolved, other needs emerged (such as the need to interact with the end-user) - and then the Cookie mechanism enhanced the protocol to enable stateful client-server communication by using session cookies. Then Ajax followed. Nowadays, the cookies come with their own limitations too and we have LocalStorage. Did I also mention WebSockets?
1. Establishing a Connection
The most common way web servers and clients communicate is through a connection which follows Transmission Control Protocol, or TCP. Basically, when using TCP, a connection is established between client and server machines through a series of back-and-forth checks. Once the connection is established and open, data can be sent between client and server. This connection can also be termed a Session.
There is also UDP, or User Datagram Protocol which has a slightly different way of communicating and comes with its own set of pros and cons. I believe recently some browsers may have begun to use a combination of the two in order to get the best results.
There is a ton more to be said here, but unless you are going to be writing a browser (or become a hacker) this should not concern you too much beyond the basics.
2. Sending Packets
Once the client-server connection is established, packets of data can be sent between the two. TCP packets contain various bits of information to assist in communication between the two ports. For web programmers, the most important part of the packet will be the section which includes the HTTP request.
HTTP, Hypertext Transfer Protocol is another protocol which describes what the makeup/format of these client-server communications should be.
A most basic example of the relevant portion of a packet sent from a client to a server is as follows:
GET /index.html HTTP/1.1
Host: www.example.com
The first line here is called the Response line. GET describes the method to be used, (others include POST, HEAD, PUT, DELETE, etc.) /index.html describes the resource requested. Finally, HTTP/1.11 describes the protocol being used.
The second line is in this case the only header field in the request, and in this case it is the HOST field which is sort of an alias for the IP address of the server, given by the DNS.
[Since you mentioned it, the difference between a GET request and a POST request is simply that in a GET request the parameters (ex: form data) is included as part of the Response Line, whereas in a POST request the parameters will be included as part of the Message Body (see below).]
3. Receiving Packets
Depending on the request sent to the server, the server will scratch its head, think about what you asked it, and respond accordingly (aka whatever you program it to do).
Here is an example of a response packet send from the server:
HTTP/1.1 200 OK
Content-Type: text/html; charset=UTF-8
...
<html>
<head>
<title>A response from a server</title>
</head>
<body>
<h1>Hello World!</h1>
</body>
</html>
The first line here is the Status Line which includes a numerical code along with a brief text description. 200 OK obviously means success. Most people are probably also familiar with 404 Not Found, for example.
The second line is the first of the Response Header Fields. Other fields often added include date, Content-Length, and other useful metadata.
Below the headers and the necessary empty line is finally the (optional) Message Body. Of course this is usually the most exciting part of the response, as it will contain things like HTML for our browsers to display for us, JSON data, or pretty much anything you can code in a return statement.
4. AJAX, Asynchronous JavaScript and XML
Based off all of that, AJAX is fairly simple to understand. In fact, the packets sent and received can look identical to non-ajax requests.
The only difference is how and when the browser decides to send a request packet. Normally, upon page refresh a browser will send a request to the server. However, when issuing an AJAX request, the programmer simply tells the browser to please send a packet to the server NOW as opposed to on page refresh.
However, given the nature of AJAX requests, usually the Message Body won't contain an entire HTML document, but will request smaller, more specific bits of data, such as a query from a database.
Then, your JavaScript which calls the Ajax can also act based off the response. Any JavaScript method is available as making an Ajax call is just another JavaScript function. Thus, you can do things like innerHTML to add/replace content on your page with some HTML returned by the Ajax call. Alternatively though, you could also do something like make an Ajax call which simply should return True or False, and then you could call some JavaScript function with an if else statement. As you can hopefully see, Ajax has nothing to do with HTML per say, it is just a JavaScript function which makes a request from the server and returns the response, whatever it may be.
5. Cookies
HTTP Protocol is an example of a Stateless Protocol. Basically, this means that each pair of Request and Response (like we described) is treated independently of other requests and responses. Thus, the server does not have to keep track of all the thousands of users who are currently demanding attention. Instead, it can just respond to each request individually.
However, sometimes we wish the server would remember us. How annoying would it be if every time I waned to check my Gmail I had to log in all over again because the server forgot about me?
To solve this problem a server can send Cookies to be stored on the client's machine. The server can send a response which tells the client to store a cookie and what exactly it should contain. The client's browser is in charge of storing these cookies on the client's system, thus the location of these cookies will vary depending on your browser and OS. It is important to realize though that these are just small files stored on the client machine which are in fact readable and writable by anyone who knows how to locate and understand them. As you can imagine, this poses a few different potential security threats. One solution is to encrypt the data stored inside these cookies so that a malicious user won't be able to take advantage of the information you made available. (Since your browser is setting these cookies, there is usually a setting within your browser which you can modify to either accept, reject, or perhaps set a new location for cookies.
This way, when the client makes a request from the server, it can include the Cookie within one of the Request Header Fields which will tell the server, "Hey I am an authenticated user, my name is Bob, and I was just in the middle of writing an extremely captivating blog post before my laptop died," or, "I have 3 designer suits picked out in my shopping cart but I am still planning on searching your site tomorrow for a 4th," for example.
6. Local Storage
HTML5 introduced Local Storage as a more secure alternative to Cookies. Unlike cookies, with local storage data is not actually sent to the server. Instead, the browser itself keeps track of State.
This alternative also allows much larger amounts of data to be stored, as there is no requirement for it to be passed across the internet between client and server.
7. Keep Researching
That should cover the basics and give a pretty clear picture as to what is going on between clients and servers. There is more to be said on each of these points, and you can find plenty of information with a simple Google search.
I'm debugging some tricky Ajax code without a server side, in fact I have no domain etc to even put any server-side code on.
I would like to find some very minimal Ajax JSON(P) testing or debugging webservice or web API that just sends something back. Something like a ping or noop or ack.
I would prefer something small, fast, and reliable, preferably provided by major Internet companies such as Google, Microsoft, or Yahoo.
Ideally it would support these features, though none are totally essential:
Support JSONP.
Parametrically return success or various kinds of failure.
Parametrically delay a specified time before responding.
Parametrically support or reject CORS requests.
Parametric control over HTTP headers.
Parametrically describe the object returned.
In fact the most basic form of such a service from a major provider would also be useful for determining that Internet access is available, at least to a degree beyond navigator.onLine.
jsFiddle actually supports some of the features I'm looking for with their “Echo Javascript file and XHR requests”.
To improve user experience “echo” features has been created. This allows to test XHR requests, add javascript files, create workers - all from one fiddle, so it is more transparent for the user reading the code. XHR requests are split to HTML, JSON, JSONP and XML. Gist and github responses are similar to the echo feature and go nicely in pair with storing fiddles in gist and github.
They're not really intended for use outside jsFiddle though. Some limitations:
HTTP POST must be used for everything except JSONP and JavaScript.
CORS is not supported.
Of the desired features I listed, it supports at least:
specifying a delay
describing an object to return
I'm not sure I understand what types of vulnerabilities this causes.
When I need to access data from an API I have to use ajax to request a PHP file on my own server, and that PHP file accesses the API. What makes this more secure than simply allowing me to hit the API directly with ajax?
For that matter, it looks like using JSONP http://en.wikipedia.org/wiki/JSONP you can do everything that cross-domain ajax would let you do.
Could someone enlighten me?
I think you're misunderstanding the problem that the same-origin policy is trying to solve.
Imagine that I'm logged into Gmail, and that Gmail has a JSON resource, http://mail.google.com/information-about-current-user.js, with information about the logged-in user. This resource is presumably intended to be used by the Gmail user interface, but, if not for the same-origin policy, any site that I visited, and that suspected that I might be a Gmail user, could run an AJAX request to get that resource as me, and retrieve information about me, without Gmail being able to do very much about it.
So the same-origin policy is not to protect your PHP page from the third-party site; and it's not to protect someone visiting your PHP page from the third-party site; rather, it's to protect someone visiting your PHP page, and any third-party sites to which they have special access, from your PHP page. (The "special access" can be because of cookies, or HTTP AUTH, or an IP address whitelist, or simply being on the right network — perhaps someone works at the NSA and is visiting your site, that doesn't mean you should be able to trigger a data-dump from an NSA internal page.)
JSONP circumvents this in a safe way, by introducing a different limitation: it only works if the resource is JSONP. So if Gmail wants a given JSON resource to be usable by third parties, it can support JSONP for that resource, but if it only wants that resource to be usable by its own user interface, it can support only plain JSON.
Many web services are not built to resist XSRF, so if a web-site can programmatically load user data via a request that carries cross-domain cookies just by virtue of the user having visited the site, anyone with the ability to run javascript can steal user data.
CORS is a planned secure alternative to XHR that solves the problem by not carrying credentials by default. The CORS spec explains the problem:
User agents commonly apply same-origin restrictions to network requests. These restrictions prevent a client-side Web application running from one origin from obtaining data retrieved from another origin, and also limit unsafe HTTP requests that can be automatically launched toward destinations that differ from the running application's origin.
In user agents that follow this pattern, network requests typically use ambient authentication and session management information, including HTTP authentication and cookie information.
EDIT:
The problem with just making XHR work cross-domain is that many web services expose ambient authority. Normally that authority is only available to code from the same origin.
This means that a user that trusts a web-site is trusting all the code from that website with their private data. The user trusts the server they send the data to, and any code loaded by pages served by that server. When the people behind a website and the libraries it loads are trustworthy, the user's trust is well-placed.
If XHR worked cross-origin, and carried cookies, that ambient authority would be available to code to anyone that can serve code to the user. The trust decisions that the user previously made may no longer be well-placed.
CORS doesn't inherit these problems because existing services don't expose ambient authority to CORS.
The pattern of JS->Server(PHP)->API makes it possible and not only best, but essential practice to sanity-check what you get while it passes through the server. In addition to that, things like poisened local resolvers (aka DNS Worms) etc. are much less likely on a server, than on some random client.
As for JSONP: This is not a walking stick, but a crutch. IMHO it could be seen as an exploit against a misfeature of the HTML/JS combo, that can't be removed without breaking existing code. Others might think different of this.
While JSONP allows you to unreflectedly execute code from somwhere in the bad wide world, nobody forces you to do so. Sane implementations of JSONP allways use some sort of hashing etc to verify, that the provider of that code is trustwirthy. Again others might think different.
With cross site scripting you would then have a web page that would be able to pull data from anywhere and then be able to run in the same context as your other data on the page and in theory have access to the cookie and other security information that you would not want access to be given too. Cross site scripting would be very insecure in this respect since you would be able to go to any page and if allowed the script on that page could just load data from anywhere and then start executing bad code hence the reason that it is not allowed.
JSONP on the otherhand allows you to get data in JSON format because you provide the necessary callback that the data is passed into hence it gives you the measure of control in that the data will not be executed by the browser unless the callback function does and exec or tries to execute it. The data will be in a JSON format that you can then do whatever you wish with, however it will not be executed hence it is safer and hence the reason it is allowed.
The original XHR was never designed to allow cross-origin requests. The reason was a tangible security vulnerability that is primarily known by CSRF attacks.
In this attack scenario, a third party site can force a victim’s user agent to send forged but valid and legitimate requests to the origin site. From the origin server perspective, such a forged request is not indiscernible from other requests by that user which were initiated by the origin server’s web pages. The reason for that is because it’s actually the user agent that sends these requests and it would also automatically include any credentials such as cookies, HTTP authentication, and even client-side SSL certificates.
Now such requests can be easily forged: Starting with simple GET requests by using <img src="…"> through to POST requests by using forms and submitting them automatically. This works as long as it’s predictable how to forge such valid requests.
But this is not the main reason to forbid cross-origin requests for XHR. Because, as shown above, there are ways to forge requests even without XHR and even without JavaScript. No, the main reason that XHR did not allow cross-origin requests is because it would be the JavaScript in the web page of the third party the response would be sent to. So it would not just be possible to send cross-origin requests but also to receive the response that can contain sensitive information that would then be accessible by the JavaScript.
That’s why the original XHR specification did not allow cross-origin requests. But as technology advances, there were reasonable requests for supporting cross-origin requests. That’s why the original XHR specification was extended to XHR level 2 (XHR and XHR level 2 are now merged) where the main extension is to support cross-origin requests under particular requirements that are specified as CORS. Now the server has the ability to check the origin of a request and is also able to restrict the set of allowed origins as well as the set of allowed HTTP methods and header fields.
Now to JSONP: To get the JSON response of a request in JavaScript and be able to process it, it would either need to be a same-origin request or, in case of a cross-origin request, your server and the user agent would need to support CORS (of which the latter is only supported by modern browsers). But to be able to work with any browser, JSONP was invented that is simply a valid JavaScript function call with the JSON as a parameter that can be loaded as an external JavaScript via <script> that, similar to <img>, is not restricted to same-origin requests. But as well as any other request, a JSONP request is also vulnerable to CSRF.
So to conclude it from the security point of view:
XHR is required to make requests for JSON resources to get their responses in JavaScript
XHR2/CORS is required to make cross-origin requests for JSON resources to get their responses in JavaScript
JSONP is a workaround to circumvent cross-origin requests with XHR
But also:
Forging requests is laughable easy, although forging valid and legitimate requests is harder (but often quite easy as well)
CSRF attacks are a not be underestimated threat, so learn how to protect against CSRF
I've created a RESTful API that supports GET/POST/PUT/DELETE requests. Now I want my API to have a Javascript client library, and I thought to use JSONP to bypass the cross-domain policy. That works, but of course only for GET requests.
So I started thinking how to implement such a thing and at the same time trying to make it painless to use.
I thought to edit my API implementation and check every HTTP request. If it's a JSONP requests (it has a "callback" parameter in the querystring) I force every API method to be executed by a GET request, even if it should be called by other methods like POST or DELETE.
This is not a RESTful approach to the problem, but it works. What do you think?
Maybe another solution could be to dynamically generate an IFrame to send non-GET requests. Any tips?
There's some relevant points on a pretty similar question here...
JSONP Implications with true REST
The cross-domain restrictions are there for a reason ;-)
Jsonp allows you to expose a limited, safe, read-only view of the API to cross domain access - if you subvert that then you're potentially opening up a huge security hole - malicious websites can make destructive calls to your API simply by including an image with an href pointing to the right part of the API
Having your webapp expose certain functionality accessed through iframes, where all the ajax occurs within the context of your webapp's domain is definitely the safer choice. Even then you still need to take CSRF into consideration. (Take a look at Django's latest security announcement on the Django blog for a prime example - as of a release this week all javascript calls to a Django webapp must be CSRF validated by default)
The Iframe hack is not working anymore on recent browsers, do not use it anymore (source : http://jquery-howto.blogspot.de/2013/09/jquery-cross-domain-ajax-request.html)
What are the benefits of a XML HTTP request? A given server could send data (e.g. some JSON serialization) for a normal request (non-XHR) as it would send data for a XHR request. And that data could be processed asynchronously (by a browser for example) as well. So why was the XMLHttpRequest invented?
Some things I can think of:
To use the same URL for HTML and a web service
To let the server know that this must be processed fast.
As far as I recall, one of the first uses of XmlHttpRequest was for OWA, which used WebDAV on the wire. So show me how to do methods other than GET/POST without it.
One important thing about XHR is that it's asynchronous and you can have several concurrently running XHR requests. For example you can have several informers on your web page, all updating independently and concurrently.
XMLHttpRequest (or ActiveXObject in IE) is what allows Javascript to make HTTP requests. It was created to be able to retrieve data in Javascript without having to change the page/refresh the browser.
There are non-javascript ways of retrieving data without refreshing the page, but if you are using Javascript XMLHttpRequest is the way to go. Many libraries have simplified the use of this call by implementing ajax functions in their libraries (jQuery.ajax() for example) which causes most people to not even realize that XMLHttpRequest is the underlying call behind it.
I think the biggest reason it exists is that it predates an Ajax JSON request. It was originally the only way to do AJAX based things. It's still useful when requesting an HTML page and populating an HTML element with the information requested. It's much simpler to use XHR in that instance instead of parsing the JSON and reading out a variable.
I guess the simple answer is that if you're looking for a single piece of data it would be a simpler request to process.