Using varnish to cache batched backend operations - caching

I'm using Mapnik to generate map tiles (PNG). I have a url where tiles can be generated on-the-fly individually:
http://tiles.example.com/dynamic/MAPID/ZOOM/X/Y.png
Each map tile is 256x256 pixels.
However, generating tiles individually is expensive. It's much more efficient to generate them batched (i.e. generate one large PNG, and split it into smaller files). I have a URL that can do that too:
http://tiles.example.com/dynamic/MAPID
which batch generates all the tiles for a map and returns "OK" when complete, saves them to disk, from where they are available statically at:
http://tiles.example.com/static/MAPID/ZOOM/X/Y.png
which is NGINX serving raw files.
Is it possible to configure Varnish to trigger a batch generation, wait for it to complete, then cache and serve individual tiles until they expire (in my case, 5 minutes)?

Currently varnish3 doesn't support backend fetching, this feature should be implemented in varnish4, Instead I would suggest to trigger those as cron jobs and varnish would fetch them when the first user hits the image.
I would also recommend that the generation would be done on a separate folder/file location and just move it when they are ready, would spare you the hassle of people hitting your server during the generation.

Related

Images storage performance react native (base64 vs uri path)

I have an app to create reports with some data and images (min 1 img, max 6). This reports keeps saved on my app, until user sent it to API (which can be done at the same day that he registered a report, or a week later).
But my question is: What's the proper way to store this images (I'm using Realm), is it saving the path (uri) or a base64 string? My current version keeps the base64 for this images (500 ~~ 800 kb img size), and then after my users send his reports to API, I deleted this base64 hash.
I was developing a way to save the path to the image, and then I display it. But image-picker uri returned is temporary. So to do this, I need to copy this file to another place, then save the path. But doing it, I got (for kind of 2 or 3 days) 2x images stored on phone (using memory).
So before I develop all this stuff, I was wondering, will it (copy image to another path then save path) be more performant that save base64 hash (to store at phone), or it shouldn't make much difference?
I try to avoid text only answers; including code is best practice but the question about storing images comes up frequently and it's not really covered in the documentation so I thought it should be addressed at a high level.
Generally speaking, Realm is not a solution for storing blob type data - images, pdf's etc. There are a number of technical reasons for that but most importantly, an image can go well beyond the capacity of a Realm field. Additionally it can significantly impact performance (especially in a sync'ing use case)
If this is a local only app, storing the images on disk in the device and keep a reference to where they are (their path) stored in Realm. That will enable the app to be fast and responsive with a minimal footprint.
If this is a sync'd solution where you want to share images across devices or with other users, there are several cloud based solutions to accommodate image storage and then store a URL to the image in Realm.
One option is part of the MongoDB family of products (which also includes MongoDB Realm) called GridFS. Another option is a solid product we've leveraged for years is called Firebase Cloud Storage.
Now that I've made those statements, I'll backtrack just a bit and refer you to this article Realm Data and Partitioning Strategy Behind the WildAid O-FISH Mobile Apps which is a fantastic article about implementing Realm in a real-world use application and in particular how to deal with images.
In that article, note they do store the images in Realm for a short time. However, one thing they left out of that (which was revealed in a forum post) is that the images are compressed to ensure they don't go above the Realm field size limit.
I am not totally on board with general use of that technique but it works for that specific use case.
One more note: the image sizes mentioned in the question are pretty small (500 ~~ 800 kb img size) and that's a tiny amount of data which would really not have an impact, so storing them in realm as a data object would work fine. The caveat to that is future expansion; if you decide to later store larger images, it would require a complete re-write of the code; so why not plan for that up front.

Conditional Incremental builds in Nextjs

Context
I am learning Nextjs which is a framework for developing react applications quickly by providing many functionalities out of the box such as Server Side Rendering, Fast Refresh and many others out of the box without any configuration. It also provides a functionality to optionally generate some web pages statically which are pre rendered at build time instead of rendering on demand. It achieves it by querying the data required for the page at build time. Nextjs also provides an optional argument expressed in seconds after which the data is re queried and the page re rendered. All of it happens on page level rather than rebuilding the entire website.
Problems
We cannot know in advance how frequently data would change, the data may change after 1 second or 10 minutes and it is impossible to know in advance and extremely hard to predict. However, it is most certainly not a constant number of seconds. With this approach, I might show outdated information due to higher time limit or I might end up querying the database unnecessarily even if data hasn't changed.
Suppose I have implemented some sort of pagination and I want to exploit the fact that most users would only visit first few pages before going to a different link. I could statically pre render first 1000 pages, so the most visited pages are served statically without going to the database whereas the rest are server side rendered. Now, if my data might change frequently, I would have to re render the first 1000 pages after regular intervals and each page would issue a separate query against the same database or external API which would cause too many round trips. I am not aware of the details of Nextjs but I suspect this would be true because Nextjs does not assume anything about the function which pulls the data and a generic implementation would necessitate it.
Attempted Solution
Both problems can be solved by client or server side rendering because the data would be fetched on demand but we lose the benefits of static generation specifically serving static assets compared to querying the database. I believe static generation would be useful if mutations to my data happen infrequently most of the time but we still want to show the updated information as fast as we can when it becomes available.
If I forget about Nextjs for a a while, both problems can be solved by spawning a new process which listens for mutations to the relevant data and only rebuilds those static assets which needs to be updated; kind of like React updates components but on server side. However Nextjs offers a lot of functionalities which would be difficult to replicate, so I cannot use this approach.
If I want to use Nextjs, problem (1) seems impossible to solve due to (perceived?) limitation of Nextjs which only offers one way to rebuild static pages, periodically re render them after a predetermined time. However, (2) can be solved by using some sort of in memory cache which pulls all the required data from the data store in one round trip and structures it up for every page. Then every page will pull data from this cache instead of the database. However, it looks like a hack to me.
Questions
Are there other ways to deal with the problem I might have have missed?
Is there a built-in way to deal with problem (1) and (2) in Nextjs?
Is my assessment of attempted solutions and their viability correct?

Use Custom Image Recognition Collection locally

I have created a Custom Image Recognition collection on IBM Cloud and am using it in my Django website to do the processing. However, I noticed that the response time ranges from 6 to 14 seconds.
I want to reduce this turnaround time. I am already zipping the image file that I sent. So when going through the API reference document here on IBM Cloud I noticed that there is a method called "get_model_file" which download the collection file to a local space.
But no documentation on how this can be used. Anyone who has successfully implemented this? Or am i missing something here?
However, I noticed that the response time ranges from 6 to 14 seconds.
I want to reduce this turnaround time. I am already zipping the image file that I sent.
How many images at at time are you sending in the zip file to the /analyze endpoint? If you are just sending one image at a time, you should not bother zipping it. Also, if you can, you should parallelize your code so that you make 1 request per image, rather than sending, say 6 images in a single zip file. This will reduce the latency.
Using the v4 API, by the way, you should resize your images to no more than 300 pixels in either width or height. In fact, you can "squash" the aspect ratio to square and it will not affect the outcome. The service will do this resizing internally anyhow, but if you do it on the client side, you save network transmission and decoding time.
With a single image at a time, if your resolution is under 300x300 pixels, you should have latency under 1.5 seconds on a typical call, including your network transmission time.
As the documentation states
Currently, the model format is specific to Android apps.
So unless you are creating an Android App then this is not going to work for you.
You probably have two areas of latency. First will be from the browser to your Django app. Second will be from your Django app to the Visual Recognition service. I am not sure where you have hosted the Django app, but if you locate it in the same region (data centre would be even better) you might be able to reduce part of the latency.

Fastest? ClientBundle vs plain URL images

Right now a large application I'm working on downloads all small images separately and usually on demand. About 1000 images ranging from 20 bytes to 40kbytes. I'm trying to figure out if there will be any client performance improvements by using a ClientBundle for the smaller most used ones.
I'm putting the 'many connections high latency' issue for the side now and just concentrate on javascript/css/browser performance.
Some of the images are used directly within CSS. Are there any performance improvements by "spriting" them vs using as usual?
Some images are created as new Image(url). Is it better to leave them this way, move them into CSS and apply styles dinamically or load from a ClientBundle?
Some actions have a result a setURL on an image. I've seen that the same code can be done with ClientBundle and it will probably set the dataURI for that image. Will doing improve performance or is it faster this way?
I'm specifically talking about runtime more than startup time, since this is an application which sees long usage times and all images will probably be cached in the first 10 minutes, so round-trip is not an issue (for now).
Short answer is not really (for FF, chrome, safari, opera) BUT sometimes for IE (<9)!!!
Lets look at what client bundle does
Client bundle packages every image into one ...bundle... so that all you need is one http connection to get all of them... and it requires only one freshness lookup the next time you load your application. (rather than n times, n being the number of your tiny images.. really wasteful.)
So its clear that client bundle greatly improves your apps load time.
Runtime Performance
There maybe times when one particular image fails to get downloaded or gets lost over the internet. If you make 1000 connections, the probability of something going wrong increases (however little). FF, Chrome, Safari, Opera simply put the image not found logo and move on with the running. IE <9 however, will keep trying to get those particular images, using up one connection of the two its allowed. That really impacts performance in IE.
Other than that, there will be some performance improvement if you keep loading new widgets asynchronously and they end up downloading images at a later stage.
Jai

Need Recommendation on impression tracking

I'm doing a research for my work which needs to track impression of a little web app sitting in 3rd party (authorized) websites. I need to analyze the impression close to real time.
I know there are at least two ways
1) use image, and parse the server log for reporting.
2) js sends ajax, and save the request in DB. (either mysql or mongo or other noSQL).
so, which way is the faster way and can handle tones of traffic?
I suspect that server log is slower because it has to append to a file. But I'm not sure if it is really slower, or it is not.
So, what is the pros and cons of each approach? Thanks. :)
P.S. I can't use Google Analytics because there is a limit on Data Export..and also other limitations. :-)
Both options are valid, the image and server logs are simple and work as long as the visitor loads images. This is faster in most cases, since there is no extra processing.
If using JavaScript, I would do what the web-analytics companies do and create a image call with JS and at the other end have either a image file with server logs or a script reading data in to a DB and returning a 1x1 pixel transparent GIF.
If all you need is impressions, I would go with the simpler solution, less to go wrong.

Resources