In my company, we have a report generation team that maintains a local web application which is horribly slow. These reports get generated weekly. The data for these reports resides in a database that gets queried through this report portal. I cannot suggest that they change the application in any way (adding memcached, etc.); the only option I have is to somehow save these pages locally and serve them from there.
As these are not static pages (they fetch their data from the database), I want to know whether there is any way I can store these pages locally by running a cron job, and then have super-fast access for me and my team.
PS: This application doesn't have any authentication; the pages are plain diffs of two files stored in the database.
There are a lot of options, but the following one may be the easiest:
Generate the HTML pages regularly and update the cache (cache each entire generated HTML page, keyed by something that uniquely identifies its dynamic content) with some kind of cron job, as you mentioned. This job refreshes all the modified dynamic content at regular intervals.
Have a wrapper around every dynamic page that looks up the cache first. On a hit, simply return the already generated HTML page; otherwise, go through the regular flow.
You can also choose to cache the newly generated page at that point.
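As a rough illustration, here is a minimal sketch of the cron-driven snapshot step in TypeScript (Node). The report URLs, output directory, and cron schedule are assumptions; in practice the URL list would come from wherever your portal enumerates its reports, and the saved files can then be served by any static web server.

    // snapshot-reports.ts -- fetch each report page and save it as a static HTML file.
    // Run it from cron, e.g.:  0 6 * * 1  node snapshot-reports.js
    import { mkdir, writeFile } from "node:fs/promises";

    // Hypothetical report URLs -- replace with however your portal lists them.
    const reportUrls = [
      "http://report-portal.local/reports/weekly-diff-1",
      "http://report-portal.local/reports/weekly-diff-2",
    ];

    const outDir = "./report-cache";

    async function snapshot(): Promise<void> {
      await mkdir(outDir, { recursive: true });
      for (const url of reportUrls) {
        const res = await fetch(url);            // slow once, during the cron run
        const html = await res.text();
        const file = `${outDir}/${encodeURIComponent(url)}.html`;
        await writeFile(file, html, "utf8");     // serve this file statically afterwards
        console.log(`cached ${url} -> ${file}`);
      }
    }

    snapshot().catch((err) => { console.error(err); process.exit(1); });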
Hope it helps!
I created this report and published it to the web.
However, the filters that I enabled on three columns do not appear - they only work on my computer, in the desktop version. Does publishing to the web in Power BI not allow you to include filters?
Filters should work the same with Publish to Web.
One thing you need to keep in mind is that there can be a significant lag between publishing the file and when the public link is updated.
From Microsoft documentation:
How it works (technical details)
When you create an embed code using Publish to web, the report is made visible to Internet users. It's publicly available, so you can expect viewers to easily share the report through social media in the future. Users view the report either by opening the direct public URL or viewing it embedded in a web page or blog. As they do, Power BI caches the report definition and the results of the queries required to view the report. This caching ensures that thousands of concurrent users can view the report without impacting performance.
The data is cached for one hour from the time it is retrieved. If you update the report definition (for example, if you change its View mode) or refresh the report data, it can take some time before changes are reflected in the version of the report that your users view. When a data refresh occurs for an import data model, the service clears the cached data and retrieves new data. In most cases, the data is updated nearly simultaneous with the import of the data. However, for reports with many distinct queries, it may take some time to update. Since each element and data value is cached independently, when a data update occurs, a user may see a mix of current and previous values. Therefore, we recommend staging your work ahead of time, and creating the Publish to web embed code only when you're satisfied with the settings. If your data will refresh, minimize the number of refreshes and perform the refreshes at off hours. We don’t recommend using Publish to web for data that needs to refresh frequently.
Make sure you have filtering enabled in the 'Options and settings' section of Power BI Desktop before publishing.
Screenshot Attached
I have a website with millions of pages. The content of each page is stored in a database, but the data does not change very frequently. So, for the sake of improving the performance of the website and reducing the cost of deploying the web application, I want to generate static pages for the dynamic content and refresh them when the content changes. But I am very concerned about how to manage this large number of pages. How should I store these pages? Is it possible that this will cause I/O problems when the web server handles many requests? Is there a better solution for this? Should I use Varnish to handle it?
That looks like a very good use case for Varnish. Basically, you wouldn't be generating the full site statically up front, but incrementally, every time content is requested that Varnish hasn't cached yet.
EDIT to cover the comments:
If all the Varnish nodes are down, you can't get your content, the same as if the database is down or your load balancers are down. Just run two load-balanced Varnish instances for high availability, with keepalived for example.
If Varnish is restarted, the cache gets cleared, unless you are using Varnish Plus/Enterprise with MSE. This may not be an issue if you don't restart often (configuration changes don't require restarts), since the database still has the data to repopulate the cache.
Varnish has a ton of options to invalidate content: purges for just one object, revalidation, bans to target entire sub-domains or sub-trees, xkeys for tag-based invalidation.
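If you go the purge route, the invalidation itself is just an HTTP request. A minimal sketch in TypeScript (Node), assuming your VCL has already been set up to accept PURGE requests from trusted clients (that part is not shown here); the hostname and path in the example are placeholders:

    // purge.ts -- ask Varnish to drop a single cached object.
    // Assumes vcl_recv handles the PURGE method for allowed client IPs;
    // otherwise Varnish will refuse or ignore the request.
    async function purge(url: string): Promise<void> {
      const res = await fetch(url, { method: "PURGE" });
      if (!res.ok) {
        throw new Error(`purge of ${url} failed: ${res.status} ${res.statusText}`);
      }
    }

    // Example: invalidate one page right after its row changes in the database.
    purge("http://varnish.internal/pages/12345").catch(console.error);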
Based on the description, your architecture looks like Webpages --> Services --> Database. The pages are generated dynamically based on the data in the database.
For example, when you search for an employee's details, the service hits the database, gets the employee's details, and renders them in the UI.
Now, if you create and store a web page for every employee, this solution will not scale. Also, if employee information is changed in the database, you will have stale data unless you recreate the page.
My recommendation is to add a cache server, so the architecture becomes Webpages --> Services --> Cache server --> Database. The services should query the database, create the page, and store it in the cache. The cache key should be the page URL and the value should be the page content. Now, when a URL hits the services, they get the page from the cache rather than going to the database. If the key is not in the cache, the services query the database and fill the cache with the key and value.
"Key is Url of the page. Value is the content of the page which has hidden updated date."
You can have a back-end job or a separate service refresh the cache when data is updated in the database. The job can compare the updated date in the database with the date in the cached value and flush the entry if the dates don't match. Since this refresh job runs in the background, it will not impact user or UI performance.
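A minimal sketch of that cache-aside flow in TypeScript. The in-memory Map stands in for a real cache server (Redis, Memcached, etc.), and renderFromDatabase plus the updatedAt field are assumptions about how your services layer might expose the data:

    interface CachedPage {
      html: string;       // the rendered page content
      updatedAt: string;  // the hidden updated date used by the refresh job
    }

    const cache = new Map<string, CachedPage>();  // stand-in for Redis/Memcached

    // Placeholder for the real Services --> Database query and page rendering.
    async function renderFromDatabase(url: string): Promise<CachedPage> {
      return {
        html: `<html><body>page for ${url}</body></html>`,
        updatedAt: new Date().toISOString(),
      };
    }

    // Services call this instead of hitting the database directly.
    async function getPage(url: string): Promise<string> {
      const hit = cache.get(url);
      if (hit) return hit.html;                    // cache hit: no database work
      const page = await renderFromDatabase(url);  // cache miss: regular flow
      cache.set(url, page);                        // fill the cache for next time
      return page.html;
    }

    // Back-end refresh job: flush an entry whose data changed in the database.
    function refreshIfStale(url: string, dbUpdatedAt: string): void {
      const hit = cache.get(url);
      if (hit && hit.updatedAt !== dbUpdatedAt) cache.delete(url);
    }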
In our web app, we have a web page made up of many components, each rendered with data from the database. A server-side cache is used to store the generated components for future requests. We also maintain a global 'last-modified' timestamp for the whole page, which is the last time any of its data in the database changed, and we return a 304 HTTP response if the browser cache has a fresh version.
In short, we use both a server-side cache and the client-side cache to improve performance.
This all works well until we consider deploying new code. When new code (say, HTML templates) is deployed, not only is the client-side cache invalid, the server-side cache has to be purged too. We have to set the last-modified time to the code deployment time and purge everything in the server-side cache.
This is not desirable if we deploy code regularly. Because the page's data in the database does not change often, we expect the caches to remain valid over long periods of time, but deploying new code defeats that purpose.
What should we do in this case? Is there any 'industry best practice' here?
For my projects, when I change a file such as a CSS file, I add a parameter where the file is included. For example,
<link href='default.css?1' type='text/css' rel='stylesheet'>
Then change the number each time you want the file to be reloaded rather than served from the cache.
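A common refinement is to derive that parameter from the file's contents at build or deploy time, so it changes only when the file actually changes. A rough sketch in TypeScript (Node); the file name is just an example:

    // cachebust.ts -- print a versioned <link> tag for a stylesheet.
    import { readFile } from "node:fs/promises";
    import { createHash } from "node:crypto";

    async function versionedLink(cssPath: string): Promise<string> {
      const contents = await readFile(cssPath);
      // Short content hash: changes only when the file's contents change.
      const hash = createHash("sha1").update(contents).digest("hex").slice(0, 8);
      return `<link href='${cssPath}?${hash}' type='text/css' rel='stylesheet'>`;
    }

    // Prints something like: <link href='default.css?<8-char hash>' type='text/css' rel='stylesheet'>
    versionedLink("default.css").then(console.log);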
What would be a good approach in general to caching a web page where most of the content stored in the database almost never changes (e.g. descriptions) but a small part changes very frequently (e.g. stock levels)?
I want to keep the web page cached for as long as possible. Would it be an option to get the dynamic content via an AJAX request? Do better approaches exist?
You could request the stock data from a separate URL and use JavaScript to insert it into the document. That way, the HTML/CSS/JS remains the same and can be cached. The stock information is loaded with JavaScript rather than being inserted into the HTML by the server.
You could create a URL that returns JSON for this purpose (and similarly for other information that you wish to include using JavaScript).
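For example, a small browser-side sketch in TypeScript; the /api/stock endpoint, the response shape, and the element id are assumptions, so your JSON URL and markup will differ:

    // Fetch the frequently changing stock data and insert it into the cached page.
    async function loadStock(itemId: string): Promise<void> {
      const res = await fetch(`/api/stock/${itemId}`);    // hypothetical JSON endpoint
      const data: { inStock: number } = await res.json();
      const el = document.getElementById("stock-count");  // placeholder element id
      if (el) el.textContent = `${data.inStock} in stock`;
    }

    loadStock("12345");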
I need a way to cache images and HTML files from my site in PhoneGap. My plan is for users to see the site without an internet connection just as they would with one. But I can only find information about storing SQL data; how can I store images (and use them later)?
To cache images, check out imgcache.js, a library of which I'm the creator. It's designed for the very purpose of caching images using the local filesystem. If you check out the examples, you will see that it can also detect when an image fails to load (because you're offline or have a very bad connection) and then automatically replaces it with the cached image. The user of the web app doesn't even notice it's offline.
As for HTML pages: if they're static HTML files, they can be stored locally in the web app (file:// in PhoneGap).
If they're dynamically generated pages, check the localStorage API if you have a small amount of data, otherwise the filesystem API.
For my web app I retrieve only JSON data from my server (and process/render it using Backbone + Underscore). The JSON payload is stored in localStorage. If the application goes offline, it fetches the JSON data from localStorage instead of the server (using a home-baked fork of Backbone.dualStorage).
You then get the full offline experience: pages+images.
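A rough sketch of that offline fallback in TypeScript (browser/WebView). The endpoint and storage key are made-up examples, and the original setup used a fork of Backbone.dualStorage rather than hand-rolled code like this:

    // Fetch JSON from the server when online; fall back to the last copy in localStorage.
    async function fetchWithOfflineFallback<T>(url: string, key: string): Promise<T> {
      try {
        const res = await fetch(url);
        if (!res.ok) throw new Error(`HTTP ${res.status}`);
        const text = await res.text();
        localStorage.setItem(key, text);   // keep the latest payload for offline use
        return JSON.parse(text) as T;
      } catch {
        const cached = localStorage.getItem(key);
        if (cached === null) throw new Error("offline and nothing cached yet");
        return JSON.parse(cached) as T;
      }
    }

    // Usage (hypothetical endpoint): fetchWithOfflineFallback("/api/articles", "articles").then(console.log);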
Caching of the kind you need for simple offline operation is not exactly easy.
Your first option is the cache manifest. It has some limitations (like the size of the cache) but might work for you since it was designed to do what you want.
Another option is to store content on the device's disk using the filesystem APIs. This has some drawbacks, like security and the fact that you have to load the file from a path/URL that is different from the one you would normally load it from on the web. Check out the hydra plugin for an example of this.
One final option might be to store things in localStorage (which has the benefit of being private on all platforms) and then pull them out when needed. That means base64-encoding all your images, though, so it is a pretty big departure from standard caching.
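If you do go the localStorage route, here is a minimal sketch of that base64 step in TypeScript (browser); the URL and key are placeholders, and note that large images will hit localStorage size limits quickly:

    // Store an image in localStorage as a base64 data URL (simple, but bloats storage).
    async function cacheImage(url: string, key: string): Promise<void> {
      const blob = await (await fetch(url)).blob();
      const dataUrl = await new Promise<string>((resolve, reject) => {
        const reader = new FileReader();
        reader.onload = () => resolve(reader.result as string);
        reader.onerror = () => reject(reader.error);
        reader.readAsDataURL(blob);           // base64-encodes the image
      });
      localStorage.setItem(key, dataUrl);     // later: img.src = localStorage.getItem(key)!
    }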
Caching is very much possible on Android, but on iOS, as stated above, there are limitations on image size, cache size, etc.
If you are willing to enable caching on iOS, you can use the cache manifest to do so, but keep the drawbacks and limitations in mind.
Also
If you save files to the Documents folder of your app, Apple will reject the app. The reason is that the system backs up all data under the Documents folder to iCloud since iOS 6, so Apple does not allow large data, such as images or JSON files that could simply be re-synced from your server, to be kept in this folder.
There is another workaround: use LocalFileSystem.TEMPORARY instead. It does not save the data to Library/Cache; it saves it to the app's temp folder, which is not automatically backed up to iCloud and is not automatically deleted either.
Regards
Rajeev