Can't Optimize Huge Sprite Sheet in Unity - image

I was shocked to find that a game I had just created takes up a whopping 330 megabytes. According to the Editor Log, my textures are to blame:
From the list I started at the top with the Chieftain Walk animation spritesheet. The file was huge, so I opened it in Photoshop and decreased the image resolution dramatically.
However, even after saving in Photoshop, the Editor Log claims that the texture takes up the same amount of memory. What am I doing wrong, and also, when does the Editor Log update? Is it upon building the game? Many thanks.

First of all, you don't need to reduce the resolution of the PNG file itself. When Unity builds the player, it does not ship the PNG; it stores the imported texture in the Data folder next to the executable. The size of that texture is determined by your importer settings. The default Max Size is 2048, if I remember correctly. If you change the importer settings for your texture, the PNG file (which is only used in the editor) stays the same, but the texture object (which is what the standalone build actually uses) becomes much smaller.
Also, is there any particular reason why you didn't make it square? Like 512x512. Always make it square and a power of 2. If not, Unity may be unable to apply some optimizations to your sprites.
EDIT:
These are the texture import settings. Set Max Size to a lower value and your game will take less memory (both on disk and in RAM/GPU memory when the game is running). You can also set a compression level; the build will be even smaller, but the texture will take longer to load in game. Once loaded, it will take the same amount of RAM/GPU memory as the non-compressed version. A win on app size, a loss on load performance. (Test it out and choose what works better for you.)
Why power of 2 and square? Well:
By ensuring the texture dimensions are a power of two, the graphics pipeline can take advantage of optimizations related to working with powers of two. For example, it can be faster to divide and multiply by powers of two. It will also be easier for Unity to create mipmaps (they might take more memory if the texture is not square). There are many sources on the internet about mipmapping.
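To put rough numbers on the Max Size advice, here is a minimal Python sketch that estimates how much memory an uncompressed texture takes. It assumes 4 bytes per pixel (RGBA) and, optionally, a full mip chain; the function names are made up for illustration and are not part of Unity's API.

```python
def next_power_of_two(n: int) -> int:
    """Return the smallest power of two that is >= n (n must be positive)."""
    if n < 1:
        raise ValueError("n must be positive")
    return 1 << (n - 1).bit_length()


def texture_bytes(width: int, height: int, bytes_per_pixel: int = 4,
                  mipmaps: bool = False) -> int:
    """Approximate size in bytes of an uncompressed texture.

    With mipmaps=True, every level down to 1x1 is added, each level
    halving both dimensions (rounded down, but never below 1).
    """
    total = 0
    w, h = width, height
    while True:
        total += w * h * bytes_per_pixel
        if not mipmaps or (w == 1 and h == 1):
            break
        w, h = max(1, w // 2), max(1, h // 2)
    return total


if __name__ == "__main__":
    for size in (2048, 1024, 512):
        mib = texture_bytes(size, size) / 2**20
        print(f"{size}x{size}: {mib:.1f} MiB uncompressed")
```

Under these assumptions a 2048x2048 texture takes 16 MiB, 1024x1024 takes 4 MiB and 512x512 takes 1 MiB, so each halving of Max Size cuts the footprint to a quarter.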

Related

How to get good performance on the gfx card with images larger than the max texture size?

At work, I work with very large images.
I currently do my rendering via SDL2.
The max texture size on the graphics card my machine uses is 8192x8192.
Because my data sets are larger than what will fit in a single texture, I split my image into multiple textures after it is loaded, and tile them.
However, I have found that this comes at a very steep cost. Rendering only 4 textures around 5K by 5K (pixels) each completely tanks the framerate!
Conventional wisdom tells me that the fewer texture swaps the better, but with such large images I've found myself between a rock and a hard place.
One thing I've considered is that perhaps if I were to chunk the images up into many small textures, I could take advantage of culling, which would hopefully be a net win. But there's a big problem with that approach: I need to be able to zoom out.
Another option would be to downscale the images. This seems promising, as the analysis I am doing on the images does not require the high resolution that the images provide.
I know that OpenGL has mipmapping, but I am inexperienced with OpenGL and am wary of diving into it for a work project. I am not aware of a good way to downscale the images within the confines of SDL2, and for reasons specific to the work I am doing, scaling the images down offline (before I load them) is not appealing.
What is the best approach for me to get the highest framerate in this situation?

XNA Texture loading speed (for extra large Texture sizes)

[Skip to the bottom for the question only]
While developing my XNA game I came to another horrible XNA limitation: Texture2D-s (at least on my PC) can't have dimensions higher than 2048*2048. No problem, I quickly wrote my custom texture class, which uses a [System.Drawing.] Bitmap by default and splits the texture into smaller Texture2D-s eventually and displays them as appropriate.
When I made this change I also had to update the method that loads the textures. In the old version I loaded the Texture2D-s with Texture2D.FromStream(), which worked pretty well, but XNA can't even seem to store/load textures larger than the limit, so if I tried to load/store, say, a 4092*2048 PNG file, I ended up with a 2048*2048 Texture2D in my app. Therefore I switched to loading the images with [System.Drawing.] Image.FromFile and then casting the result to a Bitmap, as it doesn't seem to have any limitation. (Later this Bitmap is converted to a Texture2D list.)
The problem is that loading the textures this way is noticeably slower because now even those images that are under the 2048*2048 limit will be loaded as a Bitmap then converted to a Texture2D. So I am actually looking for a way to analyze an image file and check its dimensions (width;height) before even loading it into my application. Because if it is under the texture limit I can load it straight into a Texture2D without the need of loading it into a Bitmap then converting it into a single element Texture2D list.
Is there any (clean and possibly very quick) way to get the dimensions of an image file without loading the whole file into the application? And if there is, is it even worth using? I guess that the slowest step here is opening/seeking the file (probably hardware-bound, when it comes to HDDs) and not streaming the contents into the application.
Do you need to support arbitrarily large textures? If not, switching to the HiDef profile will get you support for textures as large as 4096x4096.
If you do need to stick with your current technique, you might want to check out this answer regarding how to read image sizes without loading the entire file.
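As an illustration of reading the size without decoding the image, here is a minimal Python sketch for PNG files only. It relies on the PNG layout: an 8-byte signature followed by the IHDR chunk, whose first two fields are the width and height as big-endian 32-bit integers. This is not the code from the linked answer, the function names are hypothetical, and other formats (JPEG, BMP, ...) would need their own header parsing.

```python
import struct
from typing import Tuple

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_dimensions_from_bytes(header: bytes) -> Tuple[int, int]:
    """Parse (width, height) from the first 24 bytes of a PNG file."""
    if len(header) < 24:
        raise ValueError("need at least 24 bytes")
    if header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        raise ValueError("not a PNG header")
    # Bytes 16..23 hold width and height as big-endian unsigned 32-bit ints.
    width, height = struct.unpack(">II", header[16:24])
    return width, height


def png_dimensions(path: str) -> Tuple[int, int]:
    """Read only the first 24 bytes of the file and return (width, height)."""
    with open(path, "rb") as f:
        return png_dimensions_from_bytes(f.read(24))
```

With the dimensions known up front, the loader can choose between the direct Texture2D path and the Bitmap-splitting path before reading any pixel data.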

How to efficiently display a large number of moving points

I have a large array of points, which updates dynamically. For the most part, only certain (relatively small) parts of the array get updated. The goal of my program is to build and display a picture using these points.
If I build a picture directly from the points it would be 8192 x 8192 pixels in size. I believe an optimization would be to reduce the array in size. My application has two screen areas (the one is a magnification/zooming in of the other). Additionally I will need to pan this picture in either of screen areas.
My approach for optimization is as follows.
Take the source array of points and reduce it by a scaling factor for the first screen area.
Do the same for the second area, but with a larger scaling factor.
Render these two arrays into two FBOs.
Use the FBOs as textures (to make it possible to pan the picture).
When updating the picture, re-render only the changed area.
Please suggest ways to speed this up, as my current implementation runs extremely slowly.
You will hardly be able to optimize this a lot if you don't have the hardware to run it at an adequate rate. Even if you render in different threads to FBOs and then compose the result, your bottleneck is likely to remain. 67 million data points is nothing to sneeze at, even for modern GPUs.
Try not to update unnecessarily, update only what changes, render only what's updated and visible, try to minimize the size of your components, e.g. use a shorter data type if possible.
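To make the two suggestions concrete (shorter data types, and updating only what changed), here is a small Python sketch. It assumes the points form an 8192 x 8192 grid and that changed regions are tracked as half-open rectangles (x0, y0, x1, y1); both helper names are hypothetical.

```python
def point_buffer_bytes(width: int, height: int, bytes_per_point: int) -> int:
    """Memory needed to hold one value per point."""
    return width * height * bytes_per_point


def reduce_rect(x0: int, y0: int, x1: int, y1: int, factor: int):
    """Map a changed rectangle in the source array to the reduced array.

    Rectangles are half-open. The start is rounded down and the end is
    rounded up, so every reduced cell touched by the change is included.
    """
    if factor < 1:
        raise ValueError("factor must be >= 1")
    return (x0 // factor, y0 // factor,
            -(-x1 // factor), -(-y1 // factor))  # -(-a // b) is ceil(a / b)


if __name__ == "__main__":
    for name, size in (("float32", 4), ("uint16", 2), ("uint8", 1)):
        mib = point_buffer_bytes(8192, 8192, size) / 2**20
        print(f"{name}: {mib:.0f} MiB")
    print(reduce_rect(100, 200, 131, 260, 8))
```

Only the reduced rectangle returned by reduce_rect would then need to be re-rendered into the corresponding FBO.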

Windows Phone 7 memory management

I'd like to know if there are any specific strategies for handling memory, especially with respect to image caching, on Windows Phone. I have a very graphics-intensive Silverlight app which needs to keep the graphics that it retrieves from the internet, and it needs to be able to roam about freely, but the memory requirement becomes quite huge after using the app for a couple of minutes.
I have tried setting the image's UriSource to null but I need to maintain the image backgrounds when I come back to the page. I'm at a loss because there isn't much information on the internet. The inbuilt profiling showed me "Texture Memory Dominant" and asked me to Analyze Heap Memory to resolve the issue, but I'm still clueless about these.
Any pointers to move forward?
My answer will be general - similarly to your question. I presume that you know for sure that the problem is in images. (Because a simple ListBox with a few hundred text items can cost you many MB.)
If you search the web you'll find plenty of links such as this one. But a general analysis is easy to do.
Take an image of the WP7 screen size, i.e. 480x800. As a 32-bit bitmap (I suppose this is what WP7 uses when the image is opened) it takes roughly 1.5 MB (480 x 800 pixels x 4 bytes).
The same jpg file can have 10x smaller size (for high quality compression) or even less.
Now what's done behind the scenes when you use the construction
<image source="http://..."/>.
(In the absence of any information from you, this is what I suppose you use.)
WP7 downloads the image and adds it to the cache. The cache apparently traces the use of the Uri pointing to the image.
Next, the image gets opened, i.e. converted to a bitmap of the native image size. The image gets downsampled in this process if it would exceed the max. WP7 texture size.
You can customize the bitmap size as described here. If you care about quality, then you should use a downscale factor of 2, 4, or 8. In the case of JPEG these factors are by far the fastest option. (Well, I have no idea if you know the image resolution before the image gets loaded into the Image control. It is not too difficult to get this info from a JPG file, but right now I have no idea how it can be easily done on WP7.)
The bitmap gets freed (my speculation) if the control's source is set to null. The downloaded image is purged from the cache when the Uri is set to null. (This is reported on the web plenty of times.)
If you take all this info, it should be possible to (kind of) control your use of the image cache. You can roughly estimate the image size and decide which images remain in the cache. Maybe it will need some tricks such as storing Uri objects in your own structures and releasing them as needed. I am not saying this is easy to do, but it is certainly possible.
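The size estimate above can be sketched in a few lines of Python. This assumes a 32-bit (4 bytes per pixel) decoded bitmap, as guessed in the answer, and a made-up memory budget; the helper names are hypothetical and nothing here is a WP7 API.

```python
def decoded_bytes(width: int, height: int, downscale: int = 1,
                  bytes_per_pixel: int = 4) -> int:
    """Estimated size of a decoded bitmap after downscaling by an integer factor."""
    if downscale < 1:
        raise ValueError("downscale must be >= 1")
    w = -(-width // downscale)   # ceiling division
    h = -(-height // downscale)
    return w * h * bytes_per_pixel


def images_that_fit(budget_bytes: int, width: int, height: int,
                    downscale: int = 1) -> int:
    """How many decoded images of this size fit into a given budget."""
    return budget_bytes // decoded_bytes(width, height, downscale)


if __name__ == "__main__":
    for factor in (1, 2, 4, 8):
        size = decoded_bytes(480, 800, factor)
        print(f"downscale {factor}: {size} bytes")
```

A full-screen 480x800 image comes to 1,536,000 bytes at factor 1 and 384,000 bytes at factor 2, which gives a rough basis for deciding how many images to keep alive at once.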

graphics: best performance with floating point accumulation images

I need to speed up some particle system eye candy I'm working on. The eye candy involves additive blending, accumulation, and trails and glow on the particles. At the moment I'm rendering by hand into a floating point image buffer, converting to unsigned chars at the last minute then uploading to an OpenGL texture. To simulate glow I'm rendering the same texture multiple times at different resolutions and different offsets. This is proving to be too slow, so I'm looking at changing something. The problem is, my dev hardware is an Intel GMA950, but the target machine has an Nvidia GeForce 8800, so it is difficult to profile OpenGL stuff at this stage.
I did some very unscientific profiling and found that most of the slow down is coming from dealing with the float image: scaling all the pixels by a constant to fade them out, and converting the float image to unsigned chars and uploading to the graphics hardware. So, I'm looking at the following options for optimization:
Replace floats with uint32's in a fixed point 16.16 configuration
Optimize float operations using SSE2 assembly (image buffer is a 1024*768*3 array of floats)
Use OpenGL Accumulation Buffer instead of float array
Use OpenGL floating-point FBO's instead of float array
Use OpenGL pixel/vertex shaders
Have you any experience with any of these possibilities? Any thoughts, advice? Something else I haven't thought of?
The problem is simply the sheer amount of data you have to process.
Your float buffer is 9 megabytes in size, and you touch the data more than once. Most likely your rendering loop looks somewhat like this:
Clear the buffer
Render something on it (uses reads and writes)
Convert to unsigned bytes
Upload to OpenGL
That's a lot of data to move around, and the cache can't help you much because the image is much larger than your cache. Let's assume you touch every pixel five times. If so, you move 45 MB of data in and out of slow main memory. 45 MB does not sound like much, but consider that almost every memory access will be a cache miss. The CPU will spend most of its time waiting for the data to arrive.
If you want to stay on the CPU to do the rendering there's not much you can do. Some ideas:
Using SSE non-temporal loads and stores may help, but they will complicate your task quite a bit (you have to align your reads and writes).
Try breaking up your rendering into tiles, i.e. do everything on smaller rectangles (256*256 or so). The idea is that you actually get a benefit from the cache. After you've cleared your rectangle, for example, the entire rectangle will be in the cache. Rendering and converting to bytes will then be a lot faster because there is no need to fetch the data from the relatively slow main memory anymore.
Last resort: Reduce the resolution of your particle effect. This will give you a good bang for the buck at the cost of visual quality.
The best solution is to move the rendering onto the graphics card. Render-to-texture functionality is standard these days. It's a bit tricky to get it working with OpenGL because you have to decide which extension to use, but once you have it working, performance is not an issue anymore.
Btw, do you really need floating-point render targets? If you can get away with 3 bytes per pixel you will see a nice performance improvement.
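The tiling idea can be sketched as follows. This is a plain Python illustration of the loop structure only; Python itself will not show the cache benefit, and a real implementation would be in C/C++ (possibly with SSE). It assumes the buffer is a flat list of floats in the range 0..1 with 3 channels per pixel, and the function names are hypothetical.

```python
from typing import Iterator, List, Tuple


def iter_tiles(width: int, height: int,
               tile: int = 256) -> Iterator[Tuple[int, int, int, int]]:
    """Yield half-open rectangles (x0, y0, x1, y1) that cover the image."""
    for y0 in range(0, height, tile):
        for x0 in range(0, width, tile):
            yield x0, y0, min(x0 + tile, width), min(y0 + tile, height)


def fade_and_convert(pixels: List[float], width: int, height: int,
                     fade: float, tile: int = 256,
                     channels: int = 3) -> bytearray:
    """Fade the float buffer in place and convert it to bytes, tile by tile.

    Both passes run on one tile before moving to the next, so the data
    of that tile is touched while it is still likely to be in the cache.
    """
    out = bytearray(width * height * channels)
    for x0, y0, x1, y1 in iter_tiles(width, height, tile):
        for y in range(y0, y1):
            start = (y * width + x0) * channels
            end = (y * width + x1) * channels
            for i in range(start, end):
                value = pixels[i] * fade
                pixels[i] = value
                clamped = min(max(value, 0.0), 1.0)
                out[i] = int(clamped * 255.0 + 0.5)
    return out
```

For the 1024*768*3 float buffer from the question, 256x256 tiles give 12 tiles of 768 KiB each instead of one 9 MiB buffer; the tile size is worth tuning against the cache of the target CPU.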
It's best to move the rendering calculation for massive particle systems like this over to the GPU, which has hardware optimized to do exactly this job as fast as possible.
Aaron is right: represent each individual particle with a sprite. You can calculate the movement of the sprites in space (e.g. accumulate their position per frame) on the CPU using SSE2, but do all the additive blending and accumulation on the GPU via OpenGL. (Drawing sprites additively is easy enough.) You can handle your trails and blur either by doing it in shaders (the "pro" way), by rendering to an accumulation buffer and back, or by simply generating a bunch of additional sprites on the CPU representing the trail and throwing them at the rasterizer.
Try to replace the manual code with sprites: An OpenGL texture with an alpha of, say, 10%. Then draw lots of them on the screen (ten of them in the same place to get the full glow).
If by "manual" you mean that you are using the CPU to poke pixels, I think pretty much anything you do that draws textured polygons using OpenGL instead will represent a huge speedup.
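The "ten sprites at 10% alpha" suggestion is just additive blending arithmetic. Here is a tiny Python model of what the GPU would do per colour channel, assuming the usual additive setup (in OpenGL, glBlendFunc(GL_SRC_ALPHA, GL_ONE)) and a framebuffer that clamps at 1.0; the function name is hypothetical.

```python
def additive_blend(dst: float, src: float, alpha: float) -> float:
    """One additive blend step for a single channel: dst + src * alpha, clamped to 1.0."""
    return min(1.0, dst + src * alpha)


def stack_sprites(src: float, alpha: float, count: int) -> float:
    """Draw the same sprite `count` times on a black background."""
    value = 0.0
    for _ in range(count):
        value = additive_blend(value, src, alpha)
    return value


if __name__ == "__main__":
    for n in (1, 5, 10, 20):
        print(n, round(stack_sprites(1.0, 0.1, n), 3))
```

Brightness grows linearly with the number of overlapping sprites until it saturates, which is why roughly ten overlapping sprites at 10% alpha give the full glow.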

Resources