How do most popular media-rich websites implement their media library? - performance

I'm wondering how popular media-rich websites implement their media libraries. Do they store all the media files in the database? What kind of database do they use? Do they employ other mechanisms to boost performance?
Thanks for any response.

"Popular Media Sites" is pretty broad, but high-volume rich media sites typically use content delivery networks (CDNs) such as Akamai, or cloud-based storage such as Amazon S3.

You are asking a very difficult question to answer.
As an introductory read, I recommend the YouTube Architecture article on High Scalability. YouTube is a very good real-life example of how a media-centric website works.
Surprising as it may be, serving the actual media files is not the bottleneck. The harder part is keeping all the media metadata in sync, generating thumbnails, and so on. Media files can always be served from a cluster, or from a CDN in the case of an extremely popular video.
Read the link for more in-depth info.
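To make the split between metadata and media files concrete, here is a minimal sketch, not a description of how YouTube or any particular site does it. It assumes the bytes go to some blob store (a plain dict stands in for S3 or a file cluster), only the metadata goes into the database (SQLite stands in for the real one), and pages link to a CDN URL. The hostname cdn.example.com, the table layout, and all function names are hypothetical.

```python
import hashlib
import sqlite3

# Hypothetical CDN hostname; a real deployment would use its own.
CDN_BASE_URL = "https://cdn.example.com"


def open_library() -> sqlite3.Connection:
    """Create an in-memory metadata table (a stand-in for the real database)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE media ("
        " storage_key TEXT PRIMARY KEY,"
        " title TEXT NOT NULL,"
        " content_type TEXT NOT NULL,"
        " size_bytes INTEGER NOT NULL)"
    )
    return conn


def register_media(conn: sqlite3.Connection, blob_store: dict,
                   title: str, content_type: str, data: bytes) -> str:
    """Put the bytes in the blob store and only the metadata in the database."""
    # Content-addressed key: identical files map to the same stored object.
    storage_key = hashlib.sha256(data).hexdigest()
    blob_store[storage_key] = data  # stand-in for an upload to S3 or similar
    conn.execute(
        "INSERT OR IGNORE INTO media VALUES (?, ?, ?, ?)",
        (storage_key, title, content_type, len(data)),
    )
    return storage_key


def media_url(conn: sqlite3.Connection, storage_key: str) -> str:
    """Return the CDN URL a page would embed; the database never serves bytes."""
    row = conn.execute(
        "SELECT 1 FROM media WHERE storage_key = ?", (storage_key,)
    ).fetchone()
    if row is None:
        raise KeyError(storage_key)
    return f"{CDN_BASE_URL}/media/{storage_key}"
```

With this layout the database only answers small metadata queries, while the heavy transfers are handled by the storage layer and the CDN in front of it.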

Speaking as a dev on a popular media website, we offload the serving of video to third parties (YouTube and Brightcove). Depending on the situation, we then host this video within a custom player to layer in ads and other features. Pumping down video streams is best handled by someone else who has invested a lot of energy and money into their architecture.
As with everything, you need to decide whether your needs are specific enough to warrant developing in-house rather than spending the effort on integrating with other tools.

Related

Video (mp4) loading optimization for small web project

I'm currently developing my portfolio website using Nuxt3 in the frontend and Netlify for hosting. The site contains a fair number of videos, and although most mp4 files are not excessively large (1.2 to 1.4 MB), requesting them directly from my server has put a strain on the loading times of my site.
Aside from lazy-loading and compressing, what further steps could I take to optimize the loading speed of my videos? I am aware of CDNs such as Amazon Cloudfront and Cloudinary, but uncertain as to which would be most suitable for a small portfolio project.
Since this is quite a general question, any pointers to other techniques and best practices are much appreciated. Thank you for the help!
Like images, video has a billion things you can optimize and fine-tune.
If it's a small portfolio project, just use Cloudinary. It will be super simple, highly optimized for you, will probably fall under a free tier, and won't require reading a 400-page book on how to work with various codecs, containers, buffering, etc.
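As a rough illustration of why this is simple: Cloudinary applies optimizations through parameters in the delivery URL. The sketch below builds such a URL, assuming the commonly documented URL layout and the q_auto (automatic quality) and f_auto (automatic format) parameters; the helper name, the cloud name "demo", and the public ID are placeholders, and the exact parameters available for your account should be checked against Cloudinary's documentation.

```python
from urllib.parse import quote


def cloudinary_video_url(cloud_name: str, public_id: str,
                         transformations=("q_auto", "f_auto")) -> str:
    """Build a delivery URL for an uploaded video (hypothetical helper).

    The transformations are joined with commas so they apply as one step.
    """
    chain = ",".join(transformations)
    return (
        f"https://res.cloudinary.com/{quote(cloud_name)}"
        f"/video/upload/{chain}/{quote(public_id)}"
    )


# Example with placeholder values.
print(cloudinary_video_url("demo", "portfolio/intro.mp4"))
```

The resulting URL can be used directly as the src of a video element, so the Nuxt side needs no extra library for basic delivery.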

Is browser caching of images good enough to invalidate the need for server side storing?

I had an architecture question, and I had to rewrite the question title multiple times, since SO asked me to. So please feel free to correct it, if you feel so. I am not an expert in cache related things so I would very much appreciate some insights about my architecture related question.
So the situation is like this. We have a web-based design app (frontend JavaScript, backend PHP) which presents lots of clipart images to our customers, who use them to create online artwork. Earlier, our app ran on an AWS machine, and the clipart images were stored locally on the same server so that no network transfer was required to load the clipart, which made the design app load faster. The customer-created designs were also saved to a backend MySQL server connected directly to the web-based design app (in JSON and a relational model).
A while ago, a new team joined to make a mobile version of this app, and they insisted that the cliparts should be loaded from a "central location" for both our web app and the mobile app they are creating. They also said that the designs should be stored in a "central database" accessible by the web and mobile apps (and there was some major re-architecting of the JSON structure as well).
So finally, the architecture changed such that the cliparts now reside in a centralized location (an S3 server), and there is an "Asset Delivery and Storage (ADS) System" to which our design app makes requests for clipart images and gets served. (Please note that the clipart repository is very large and only a subset of clipart images is served, based on various parameters such as the style of the design, the account type of the customer, etc.) So this task is now done by the ADS system (written in Python).
And since our web design app no longer has any local storage of cliparts or any clipart filtering logic (which got delegated to ADS, so no more server-side PHP), it has also become a purely web-based (frontend JavaScript) app without any server requirements, and was subsequently moved to S3.
Now the real matter is that our web app seems much slower on initial load than when we had our own stash of cliparts stored on the server. I read that if an app requests images, those images are cached in the browser, and if the customer, for example, loads the same order before that cache has expired, then no repeat request needs to be sent to the server (in this case ADS).
If that is true, is there any case I can really make to state that moving the clipart images from the design app server to the ADS system and having to send a request and load them every time a design is loaded has contributed in part to the recent slowness of the design app?
Also, most times I hear the answer that the "mobile app also does the same and is faster". I am not a mobile developer. Could there be some mobile cache tricks that help the mobile app to be much more "cache-efficient" than the purely web-based design app, such that even though the architecture is the same for both (sending requests to ADS for cliparts), the mobile app does it in a better and more efficient manner?
End note: I realise I am not asking a specific programming question. But from some of the notes I have read here, SO is a community for programmers, and I do not know of any other community that answers programming-related questions so well. The architecture question I have is a genuine programming-related question I face at work, and sadly I am not skilled enough to understand whether the recent architectural changes have any drawbacks that are causing our web app performance to degrade noticeably.
Thanks for reading, and I would really appreciate any pointers or even links to reading for better understanding this.
In Chrome, open the developer tools and click on the Network tab. 90% of the time you can identify the slow resource from there.
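On the caching part of the question: whether the browser reuses a clipart image without asking the server again depends on the response headers the asset server sends, which the Network tab also shows. The following is a minimal sketch of the two standard HTTP mechanisms, Cache-Control and ETag revalidation. It is not how the ADS system actually behaves (the question does not say); the function names and the "versioned URL" flag are assumptions made for illustration.

```python
import hashlib

ONE_YEAR_SECONDS = 31536000


def caching_headers(body: bytes, versioned_url: bool) -> dict:
    """Choose cache headers for a static asset.

    versioned_url means the URL changes whenever the content changes
    (for example, it contains a content hash), so the browser may keep
    the response for a long time without asking again.
    """
    etag = '"' + hashlib.sha256(body).hexdigest()[:16] + '"'
    if versioned_url:
        cache_control = f"public, max-age={ONE_YEAR_SECONDS}, immutable"
    else:
        # no-cache: the browser may store the response but must revalidate it.
        cache_control = "no-cache"
    return {"ETag": etag, "Cache-Control": cache_control}


def respond(body: bytes, request_headers: dict, versioned_url: bool):
    """Return (status, headers, payload), answering 304 when the ETag matches."""
    headers = caching_headers(body, versioned_url)
    if request_headers.get("If-None-Match") == headers["ETag"]:
        return 304, headers, b""  # the browser reuses its cached copy
    return 200, headers, body
```

If the clipart responses carry no such headers, or the URLs differ on every load (for example, signed URLs with changing query strings), the browser would download the images again each time, which would be consistent with the slowdown described.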

What kind of messaging architectures are used in huge, scalable sites today?

Sites like Twitter and Facebook scale to hundreds of thousands of users. Most of their architectural overviews are available online as talks and slideshows. However, my question is more oriented towards any messaging middleware/layer that these sites use. I understand that it would be different for different sites - but are there any common characteristics when using messaging technologies (e.g. JMS) on highly scaled sites? More specifically, are there use cases that cannot be handled by traditional messaging solutions?
Twitter switched to Scala for their middleware and got a huge performance boost. High Scalability is the best authority on the topic of scaling web applications.

Launching an online app: do I use CDN, Amazon Services or a dedicated server?

My web application requires as little lag as possible. I have tried hosting it on a dedicated server, but users on the other side of world have complained about latency issues.
So I am considering using a CDN or Amazon services. Would either help resolve this?
The application uses a lot of AJAX, so latency can be an issue.
Amazon's CloudFront, part of the Amazon Web Services (AWS) that you can purchase, is a CDN (Content Delivery Network) -- so asking whether to use Amazon or "a CDN" strikes me as a weird question, akin to asking whether you should drink Coke or "a soda" (given that Coke is "a soda"). Rather, you should ask "should I use Amazon or another CDN?" just like you'd ask "should I drink Coke or another soda?".
Your decision among CDNs must be based on many parameters - cost, reliability, convenience, speed, and so forth. Unfortunately I have no first-hand experience of CloudFront; however, on paper, it seems particularly simple to use (especially if you're already using other AWS components, since getting data e.g. from S3 to CloudFront is fast and cheap indeed;-), and reasonably priced (based on usage). But I have no experience about its uptime record or delivery speed.
A content delivery network is a great idea to speed up delivery of static content (images, JavaScript, etc.). You could even use this in combination with a dedicated server if you want.
You may also consider using a tool such as YSlow to analyze what may be causing your latency issues.
A CDN will only improve the performance of your static content -- if your Ajax code requires active content, then it won't help for that.
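To see why dynamic Ajax calls stay slow for distant users regardless of the CDN, a back-of-the-envelope estimate helps. The sketch below assumes signals travel through fiber at roughly 200,000 km/s (about two thirds of the speed of light in vacuum) and ignores routing detours, queuing, and server time, so real round trips are longer. The function names are made up for illustration.

```python
# Approximate signal speed in optical fiber (an assumption, not a measurement).
FIBER_KM_PER_SECOND = 200_000.0


def min_round_trip_ms(distance_km: float) -> float:
    """Lower bound on the round-trip time to a server distance_km away."""
    return 2 * distance_km / FIBER_KM_PER_SECOND * 1000


def ajax_latency_floor_ms(distance_km: float, sequential_requests: int) -> float:
    """Lower bound when each Ajax call must wait for the previous one."""
    return min_round_trip_ms(distance_km) * sequential_requests


# A user roughly half the Earth's circumference away (about 20,000 km).
print(min_round_trip_ms(20_000))
print(ajax_latency_floor_ms(20_000, 5))
```

For a user about 20,000 km from the server, the floor works out to roughly 200 ms per round trip, so a page that chains five dependent Ajax calls cannot finish in much under a second. Only moving the dynamic backend closer to users, or reducing the number of sequential calls, lowers that floor.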
Amazon AWS might help, but it depends on the details of your application. Amazon isn't particularly well-known for delivering a low-latency solution.
Most apps that require low latency end up addressing the issue from many directions. A combination of a CDN and dedicated servers is certainly one approach. One key there is choosing the right data center for your servers (a low-latency hub).
In case it might help, I wrote a book about this subject: Ultra-Fast ASP.NET, which includes a discussion of client-side issues, hardware infrastructure, CDNs, caching, and many other issues that can impact latency.

Does streaming media in SharePoint pose a performance risk?

Can anyone comment on the performance implications of storing streaming media in a SharePoint 2007 document library? I’ve heard this can be detrimental to the performance of the farm due to the media being streamed from storage in a SQL DB.
Has anyone had any firsthand experience with this and, if so, what alternatives have you used to give users the ability to publish and manage their own video content? Assume a secure internal environment, so external services like YouTube are not viable in this scenario.
I have tried this on a test deploy and it had very poor performance. Not only did the SharePoint server struggle, but the video the client was trying to stream was very laggy. Granted, we did not have a state of the art server set up, but I was the only one accessing the server and it couldn't even handle that. Given my experience, I would advise against it.
I can stream FLVs for Flash movies from a document library with reasonable performance. I still opted for deploying them to a separate non-SharePoint website because the videos were fairly static and take up a lot of SQL space.
You might consider activating BLOB caching to avoid streaming from the database; see: http://msdn.microsoft.com/en-us/library/aa604896.aspx

Resources