Difference between WebDAV and socket.io (which one is better?) - websocket

Here is my story:
I have a busy server (too many queries every minute), and I'm trying to upload images to that server. The problem is that I don't know what to use to get good performance.
Can someone tell me the differences in performance between:
uploading a file using the WebDAV protocol, and
uploading a file using the WebSocket protocol (socket.io)?
Which method is faster, consumes fewer resources, and is better overall?
Can you tell me, for example, what method YouTube uses for uploads?
Thanks all.

WebSockets can save resources if the socket is re-used for multiple communication requests. But if you used one to facilitate a single upload, there would be no difference between it and a normal HTTP PUT.
That's because creating a WebSocket connection goes through the same steps as initiating a normal HTTP connection, but then changes mode at the end so the connection is ready and waiting for transfers. So if a user were to do 10 uploads over one WebSocket connection, there is potentially an advantage, because there would be only one handshake, while 10 normal HTTP PUTs would need 10 handshakes.
But to do WebSocket uploads you will need to "roll your own", because there are no standard file-upload semantics in the WebSocket standard, as there are with HTTP. And that means the potential for bugs and inefficiencies.
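For illustration, here is a minimal sketch of what "rolling your own" might look like with socket.io on Node.js. The 'upload' event name, the ack callback, and the file-handling policy are all conventions you would have to invent yourself; none of them come from any standard:

```
// Server-side sketch (assumes the socket.io package).
const { Server } = require('socket.io');
const fs = require('fs');
const path = require('path');

const io = new Server(3000);

io.on('connection', (socket) => {
  // 'upload' is a made-up event: you must define the naming, size limits,
  // overwrite rules and error reporting that HTTP already gives you.
  socket.on('upload', (name, data, ack) => {
    const safeName = path.basename(name); // naive guard against path traversal
    fs.writeFile(path.join('/tmp', safeName), Buffer.from(data), (err) => {
      ack(err ? 'failure' : 'success');
    });
  });
});

// Browser side would be something like:
//   socket.emit('upload', file.name, await file.arrayBuffer(),
//               (status) => console.log(status));
```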
So, to answer which approach is "better": I think the performance gain from using WebSockets for file upload would be small, perhaps not even measurable, but doing so would introduce the risk of bugs and inefficiencies.

Related

Is this a correct scenario to use WebSocket?

I have a browser plugin which will be installed on 40,000 desktops.
This plugin will connect to a backend configuration file available via HTTPS, e.g. http://somesite/config_file.js.
The plugin is configured to poll this backend resource once a day.
But there is only one backend server, so if all 40,000 endpoints start polling together, the server might crash.
I could randomize the polling frequency from the desktop plugins, but randomization still does not guarantee that there will not be an overload at the server.
Would using WebSockets in this scenario solve the scalability issue?
Polling once a day is very little.
I don't see any upside to WebSockets unless you switch to push and have more notifications.
However, staggering the polling does make a lot of sense, since syncing all the requests to the same time is like writing a DoS attack against your own server.
Staggering doesn't necessarily have to be random and, IMHO, it probably shouldn't be.
You could start with a fixed time and add a second per client ID, allowing for ~86K connections in 24 hours, which should be easy for any server to handle.
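A sketch of that deterministic stagger, assuming each installation has a stable integer clientId (my assumption; any stable per-client number works):

```
// Compute this client's daily poll time: a fixed base time plus
// one second per client ID, spreading ~86,400 clients across 24 hours.
function nextPollTime(clientId) {
  const base = new Date();
  base.setHours(3, 0, 0, 0);                   // fixed daily start, e.g. 03:00
  const offsetMs = (clientId % 86400) * 1000;  // one second per client ID
  const next = new Date(base.getTime() + offsetMs);
  if (next <= new Date()) next.setDate(next.getDate() + 1); // slot already passed today
  return next;
}

// setTimeout(poll, nextPollTime(clientId) - Date.now());
```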
As a side note, 40K concurrent connections might not be as hard to achieve as you imagine.
EDIT (relating to the comments)
Websockets vs. Server Sent Events:
IMHO, when pushing data (vs. polling), I would prefer Websockets over Server Sent Events (SSE).
WebSockets have a few advantages, such as client-side communication, which allows clients to ping the server and confirm that the connection is still alive.
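For example, a client-side heartbeat over the browser WebSocket API might look like the sketch below; the 'ping'/'pong' message pair and the URL are my own conventions, not part of the protocol:

```
const ws = new WebSocket('wss://example.com/updates'); // placeholder URL
let alive = true;

ws.addEventListener('message', (e) => {
  if (e.data === 'pong') alive = true; // server echoed our last ping
});

setInterval(() => {
  if (!alive) ws.close(); // no pong since the last ping: assume dead (reconnect logic omitted)
  alive = false;
  if (ws.readyState === WebSocket.OPEN) ws.send('ping');
}, 30000);
```

SSE, being server-to-client only, has no equivalent client-initiated check.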
The Specific Use-Case:
From the description in the question and the comments it seems that you're using browser clients with a custom plugin and that the updates you wish to install daily might require the browser to be active.
This raises different questions that affect the implementation (are the client browsers open all day? do you have any control over the client browsers and their environment? can you guarantee installation while the browser is closed?).
...
IMHO, you might consider having the client plugins test for an update each morning as they load for the first time during that day (first access).
People arrive at work at different times and open their browsers for the first time on different schedules, so the 40K requests you're expecting will be naturally scattered across that timeline (probably a 20-30 minute timespan).
This approach makes sure that the browsers and computers are actually open (making the update possible) and that the update requests are staggered over a period of time (about 33.3 requests per second, if my assumption is correct).
If you're serving a pre-written static configuration file (perhaps updated by the server daily), avoiding dynamic content and minimizing any database calls, then 33 req/sec should be very easy to manage.
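As a sketch of how little that takes (express, the route, and the paths are my assumptions, not from the question):

```
const express = require('express');
const app = express();

// One static file, cacheable for an hour -- no database, no dynamic work.
app.get('/config_file.js', (req, res) => {
  res.sendFile('/var/www/config_file.js', {
    headers: { 'Cache-Control': 'public, max-age=3600' },
  });
});

app.listen(8080);
```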

Are there any significant performance benefits to HTTP/2 multiplexing as compared with HTTP/2 Server Push?

HTTP/2 multiplexing uses the same TCP connection, thereby removing the connection setup time to the same host.
But with HTTP/2 Server Push, is there any significant performance benefit beyond saving the round-trip time that HTTP/2 multiplexing would spend requesting each resource?
I gave a presentation about this, that you can find here.
In particular, the demo (starting at 36:37) shows the benefits that you can have with multiplexing alone, and then by adding HTTP/2 Push.
Spoiler: the combination of HTTP/2 multiplexing and Push yields astonishingly better results with respect to HTTP/1.1.
Then again, every case is different, so you have to actually measure your case.
But the potential of HTTP/2 to yield better performance than HTTP/1.1 is really large, and many (most?) cases will benefit from this.
I'm not sure exactly what you're asking here, or whether it's a good fit for StackOverflow, but I will attempt to answer nonetheless. If this is not the answer you are looking for, then please rephrase the question so we can understand exactly what you are looking for.
You are right in that HTTP/2 uses multiplexing, which does negate the need for multiple connections (and the time and resources needed to set them up and manage them). However, it's much more than that: it's not limited (browsers will typically limit connections to 4-6 per host) and it also allows "similar" connections (same IP and same certificate but different hostname) to share a connection. Basically it solves the queuing of resources that HTTP/1's request/response model causes, and it reduces the need for the limited multiple connections that HTTP/1 requires as a workaround, which in turn reduces the need for other workarounds like sharding, sprite files, concatenation, etc.
And yes, HTTP/2 Server Push saves one round trip. So when you request a webpage, the server sends both the HTML and the CSS needed to draw the page, since it knows you will need the CSS; it would be pointless to send you just the HTML, wait for your web browser to receive it, parse it, see that it needs CSS, request the CSS file, and then wait for that to download.
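As a sketch, this is roughly what that looks like with Node's built-in http2 module (the file names and paths are placeholders):

```
const http2 = require('http2');
const fs = require('fs');

const server = http2.createSecureServer({
  key: fs.readFileSync('server.key'),
  cert: fs.readFileSync('server.crt'),
});

server.on('stream', (stream, headers) => {
  if (headers[':path'] === '/') {
    // Push style.css alongside index.html so the browser never has to ask.
    stream.pushStream({ ':path': '/style.css' }, (err, pushStream) => {
      if (!err) pushStream.respondWithFile('style.css',
                                           { 'content-type': 'text/css' });
    });
    stream.respondWithFile('index.html', { 'content-type': 'text/html' });
  }
});

server.listen(8443);
```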
I'm not sure if you're implying that a round trip is so quick that there is little gain in HTTP/2 Server Push because, thanks to multiplexing, there is no longer any delay in requesting a file? If so, that is not the case: there are significant gains to be made in pushing resources, particularly blocking resources like CSS, which the browser will wait for before drawing a single thing on screen. While multiplexing reduces the delay in sending a request, it does not reduce the latency of the request travelling to the server, nor of the server responding to it and sending it back. While these delays sound small, they are noticeable and make a website feel slow.
So yes, at present, the primary gain for HTTP/2 Server Push is in reducing that round trip time (basically to zero for key resources).
However, we are at the infancy of this, and there are potential other uses, for performance or other reasons. For example, you could use it as a way of prioritising content, so an important image could be pushed early when, without this, a browser would likely request the CSS and JavaScript first and leave images until later. Server Push could also negate the need for inlining CSS (which bloats pages with copies of style sheets and may require JavaScript to then load the proper CSS file), another HTTP/1.1 performance workaround. I think it will be very interesting to watch what happens with HTTP/2 Server Push over the coming years.
Saying that, there are still some significant challenges with HTTP/2 Server Push. Most importantly, how do you prevent wasting bandwidth by pushing resources that the browser already has cached? It's likely a digest HTTP header will be added for this, but it is still under discussion. That leads on to the question of how best to implement HTTP/2 Server Push, for web browsers, web servers and web developers alike. The HTTP/2 spec is a bit vague on how this should be implemented, which leaves different web servers providing different methods of signalling the server to push a resource.
As I say, I think this is one of the parts of HTTP/2 that could lead to some very interesting applications. We live in interesting times...

Should I be using AJAX or WebSockets?

Oh, the joyous question of HTTP vs. WebSockets is at it again. However, even after quite a bit of reading of the hundreds of versus blog posts, SO questions, etc., I'm still at a complete loss as to what I should be working towards for our application. In this post I will supply information on the application's functionality and the types of requests/responses it currently uses.
Currently our application is a sloppy piece of work, thrown together using AngularJS and AJAX requests to an Apache server running PHP, namely XAMPP. With the launch of our application I've noticed that we're having problems with response times when the server is under any kind of load. This probably has something to do with the sloppy architecture of our server, the hardware, and the fact that our MySQL database isn't exactly optimized.
However, with such a loyal fanbase, and investors seeing potential in our application and giving us a chance to roll out a 2.0, I've been studying hard into how to turn this application into a powerhouse of low-latency scalability. Honestly, the best option would be to hire someone with experience, but unfortunately I'm a hobbyist and a one-man army without much experience.
After some extensive research, I've decided on writing the backend using NodeJS this time. However, I'm having a hard time deciding between HTTP and WebSockets. Here are the types of transactions that are done between the server and client.
Client sends a request to the server in JSON format. The request has a few different things.
A request ID (for processing logic based on the request)
The data associated with the request ID.
The server receives the request, queries the database (if necessary) and then responds to the client in JSON format. Sometimes the server serves files to the client, namely images in Base64 format.
Currently the application (when being used) sends a request to the server every time an interface is changed, which on average for our application is once every few seconds. Every action in our interfaces sends another request to the server. The application also sends requests to check for notifications/messages every 8 seconds (or every two seconds when the user is on the messaging interface).
Currently, here are the benefits I see of a stateful connection over a stateless connection for our application.
If the connection is stateful, I can eliminate the requests for notifications and messages, as the server can just tell the client whenever one becomes available. This alone can eliminate x(n)/4 requests per second to the server.
Handling something like a disconnection from the server is as simple as attempting to reconnect, as opposed to handling timeouts/errors per request; it would only need to be handled on the socket.
Additional security can be obtained by removing the security keys for database interaction; this should prevent the possibility of hijacking a session_key and using it to manipulate or access another user's data. The session_key is only needed because there is no state in the AJAX setup.
However, I'm someone who started learning programming through TCP game-server emulation, so I understand some benefits of a STATEFUL connection, while I hardly understand the benefits of a STATELESS connection at all. I know they both have their benefits and quirks, but I'm curious what would be the best approach for us.
We're mainly looking for Scalability, as we had a local application launch and managed to bottleneck at nearly 10,000 users in under 48 hours. Luckily I announced this as a BETA and the users are cutting me a lot of slack after learning that I did it all on my own as a learning project. I've disabled registrations while looking into improving the application's front and backend.
IMPORTANT:
If using WebSockets, would we be able to download pictures from the server asynchronously, like we can with AJAX? For example, I can make 5 requests to the server using AJAX for 5 different images, and they will all start downloading immediately. Using a stateful connection, would I have to wait for each photo to be streamed before moving to the next request? Would this bottleneck only a single user, or every user that is waiting on a request to be completed?
It all boils down to how your application works and how it needs to scale. I would use bare WebSockets rather than any wrapper, since the API is already easy to use and your hands won't be tied when you need to scale out.
Here are some links that will give you insight, though not concrete answers to your questions, because as I said, it depends on your expectations.
Hard downsides of long polling?
WebSocket/REST: Client connections?
Websockets, and identifying unique peers[PHP]
How HTML5 Web Sockets Interact With Proxy Servers
If your question is "Should I use HTTP over WebSockets?", the response is: you should not.
Even if it is faster because you don't lose time opening a connection, you also lose the whole HTTP specification: verbs (GET, POST, PATCH, PUT, ...), paths, bodies, and also responses and status codes. This seems simple, but you would have to re-implement all or part of these protocol features yourself.
So you should use Ajax as long as it is a one-off request.
When you need to make an Ajax request every 2 seconds, what you actually need is for the server to send you data, not for YOU to poll the server to check whether the API has changed. That is a sign that you should implement a WebSocket server.
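A minimal sketch of that push model, assuming the 'ws' package on the NodeJS backend (any WebSocket library, including socket.io, would work the same way):

```
const { WebSocketServer, WebSocket } = require('ws');
const wss = new WebSocketServer({ port: 8080 });

// Instead of clients polling every 2-8 seconds, call this whenever a
// notification or message is created on the server:
function broadcast(notification) {
  const payload = JSON.stringify(notification);
  for (const client of wss.clients) {
    if (client.readyState === WebSocket.OPEN) client.send(payload);
  }
}
```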

Is using SSL session cache a bad idea for a websocket server?

When a client reconnects to an SSL server, SSL session caches remove the need to recompute the same cryptographic agreement previously used by that client (while also reducing the communication round trips needed from 2 to 1).
However, the WebSocket protocol states that clients should never disconnect from a WebSocket server without a good reason (e.g. because an error occurred or the user closed the browser/application tab/window) (?). So when WebSockets are established on top of an SSL layer, can the server simply assume any WebSocket connection is alive unless notified otherwise, and can the underlying SSL session of any connection safely be assumed to remain valid during that time?
Furthermore, a WebSocket server needs to be able to handle many concurrent long-lived connections, and because SSL session caches need to be stored for every single connection (?), implementing these caches would in this case probably be detrimental to performance because of the large memory overhead, right?
Sorry; this might be more than one simple question, but I wanted to verify that my understanding of these issues is adequate.
Well,
I think it might depend on the software architecture. Let's say you have a site with 100 pages and the user often navigates between them. Many of them have a WebSocket for some special purpose, but it is only kept alive while that page is displayed. Then caching would make sense, as you're closing and opening WebSockets often.
On the other hand, you might have a site with only one page, where you open a WebSocket. Content is managed via WebSocket and Ajax requests, but the WebSocket is kept alive during the whole session. In this case, caching SSL sessions for the WebSocket doesn't make much sense.
So, in the end, I would say it depends on the implementation. If you already have a site, you should analyze how it behaves and tune your caching accordingly. On the other hand, if you're starting to design a new site, knowing the pros and cons of the different scenarios might help you build a better and more efficient design.
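For what it's worth, one way to add an explicit session cache in Node is via the tls server's 'newSession'/'resumeSession' events, which makes it easy to enable caching for the many-short-connections case and skip it for the single long-lived socket. A minimal sketch (the key/cert paths are placeholders, and the unbounded Map would need eviction in real use):

```
const tls = require('tls');
const fs = require('fs');

const sessions = new Map(); // naive in-memory cache; no eviction here

const server = tls.createServer({
  key: fs.readFileSync('server.key'),
  cert: fs.readFileSync('server.crt'),
});

// Store each new session so a reconnecting client can resume it.
server.on('newSession', (id, data, done) => {
  sessions.set(id.toString('hex'), data);
  done();
});

// Hand back the cached session data, or null to force a full handshake.
server.on('resumeSession', (id, done) => {
  done(null, sessions.get(id.toString('hex')) || null);
});

server.listen(8443);
```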
Regards

Handle Unstable Internet Connection in Server-Client App

What technology can I use to manage an unstable internet connection in a server-client app? I know mainly PHP (+ Zend Framework) and am learning C# & ASP.NET MVC. I heard WCF/MSMQ is something that can help, but how? Is there something PHP (which I am more familiar with) can do? It is also good to know a .NET alternative if it's better.
the background:
Clients will connect to the server DB to do CRUD operations, but if the internet connection fails this will not be possible. So how do I fix this?
The solution used now is to have local DBs on each client. At the end of the day, all clients upload to the server, and in the morning they download a "consolidated" DB from the server. This is not foolproof, as the upload/download may still fail, and considering the large amounts of data transferred, that actually increases the chances of failure.
UPDATE: is there a PHP/Zend Framework/MySQL replacement for MSMQ/WCF?
WCF can help, because it supports various technologies for reliable message transfer.
One thing that might help is to have the clients make their data changes locally, then upload those changes to a reliable message queue. You would not upload all changes in a single transaction; you might upload 10 at a time, possibly one at a time. As the uploaded messages are processed on the server, the server would write the transaction results to another queue, unique to each client. After the upload (or maybe at the same time), the client would check that queue to see what the result of each upload was. If the result was success, then the client can remove that change from its local database. If the result was a failure, then the client should try uploading it again.
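A minimal sketch of that flow, assuming an HTTP endpoint in front of the queue (the URL, the JSON shape, and the removeFromLocalStore helper are all invented for illustration):

```
// Upload queued local changes one at a time; only delete a change
// locally after the server confirms it was applied.
async function syncQueue(localChanges) {
  for (const change of localChanges.slice()) {
    try {
      const res = await fetch('https://server.example/queue', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify(change),
      });
      if (res.ok) {
        removeFromLocalStore(change.id); // hypothetical helper: confirmed, safe to delete
      }
      // On a failure response, leave the change queued for the next sync.
    } catch (err) {
      break; // link is down: stop now and retry the whole queue later
    }
  }
}
```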
Of course, you should always be careful that your attempts at error recovery don't make things worse. Too much retry traffic on a bad link may very well cause more traffic, which may itself need recovery, etc.
And, of course, the ultimate solution is to move towards links that are more reliable. Not necessarily faster, but just more reliable.
