Hi everyone!
I am new to programming, but very interested in Android app development, and have read many books, articles, and tutorials, and have watched many videos on the topic.
I have been trying to create a personal app that displays artworks drawn on the Android canvas, and it is working, but the problem is that it consumes too much memory. I have recently integrated code to take a heap dump at 3 different points in the normal use of the app, and found that 2 copies of the gallery Activity sometimes exist at the same time.
However, the problem seems really complicated...
The app has 3 Activities:
Activity A ~ shows two buttons. Clicking the first button will start Activity B.
Activity B ~ shows a list of 60 items, each of which has a unique data set. Clicking any item on the list will put the item's data set as intent extras, and then start Activity C.
Activity C ~ initially shows the first digital art. Swiping will show the succeeding digital arts.
The entire app is always in portrait mode ~ the Manifest includes
android:screenOrientation="portrait"
Each of the activities uses static inner classes to define threads, and uses recycle() on bitmaps in the onDestroy method. The onDestroy method also nulls out variables.
Drawing on the Android Canvas always happens in a background thread.
None of the activities uses static variables to reference views or other objects.
Activity C does not use any onClickListener; it uses onTouchEvent instead.
The following is the test that I use in generating heap dumps.
1. Launch the app. Activity A starts.
2. Click on button B. Activity B starts.
3. Inside Activity B, I click on the first item. Activity C starts.
4. Inside Activity C, I swipe to see all ten digital arts. Then, I tap on the Android Back Button. This destroys Activity C, and restarts Activity B.
5. I repeat Steps 4 and 5, going from the second item on Activity B's list to the ninth item. Each of these items puts unique intent extras before starting Activity C.
6. For the tenth item, I repeat Step 4, while code inside Activity C tests for a particular intent extra. I swipe to see 6 more digital arts, and when the code tests True, it produces the first heap dump.
7. I click on the Android Back Button, destroying Activity C, and restarting Activity B.
8. Inside Activity B, I click on the Android Back Button, destroying Activity B, and restarting Activity A.
9. Code inside Activity A detects a restart, and produces the second heap dump.
10. I repeat Steps 2, 3, and 4.
11. I repeat Step 8.
12. Code inside Activity A detects a second restart, and produces the third heap dump.
On one test run, MAT reported that while I was inside Activity C, two Activity C objects were present. In the Leak Suspects section, MAT displayed the following:
The thread java.lang.Thread # 0x41e1c6c8 FinalizerWatchdogDaemon keeps local variables with total size 15,429,232 (32.64%) bytes.
The memory is accumulated in one instance of "android.view.View[]" loaded by "<system class loader>".
Of course, this is rare.
On most tests, MAT reports only one Activity C object. HOWEVER, and this is the big problem, I think, when I am inside Activity C, and then press the back button to go back to Activity B, and then press the back button to go back to Activity A, the heap dump taken inside Activity A leads to the following MAT report:
Problem Suspect 1
10 instances of "android.widget.ImageView", loaded by "<system class loader>" occupy 15,367,680 (54.10%) bytes.
Biggest instances:
android.widget.ImageView # 0x41eef9c0 - 1,536,768 (5.41%) bytes.
android.widget.ImageView # 0x41ef3d28 - 1,536,768 (5.41%) bytes.
android.widget.ImageView # 0x42118088 - 1,536,768 (5.41%) bytes.
android.widget.ImageView # 0x4213fae0 - 1,536,768 (5.41%) bytes.
android.widget.ImageView # 0x4382e838 - 1,536,768 (5.41%) bytes.
android.widget.ImageView # 0x4382f6f8 - 1,536,768 (5.41%) bytes.
android.widget.ImageView # 0x4382f920 - 1,536,768 (5.41%) bytes.
android.widget.ImageView # 0x43834f70 - 1,536,768 (5.41%) bytes.
android.widget.ImageView # 0x438442f8 - 1,536,768 (5.41%) bytes.
android.widget.ImageView # 0x438852d0 - 1,536,768 (5.41%) bytes.
Problem Suspect 2
69 instances of "android.graphics.Bitmap", loaded by "<system class loader>" occupy 10,786,184 (37.97%) bytes.
Biggest instances:
android.graphics.Bitmap # 0x41ed0b98 - 635,856 (2.24%) bytes.
Problem Suspect 1 are ImageViews from Activity C, but Activity C should already be destroyed. In the histogram section, MAT reports that one Activity C object is still present.
I am thinking that if I were really leaking memory (Activity C), then going A-B-C-B-C-B-C-B-C-...-B-C-B-A and checking out the bitmap galleries should produce many Activity C objects, maybe ten objects.
Problem Suspect 2 is the same issue. Most of the bitmaps in that problem suspect are from Activity B, which should already be destroyed. One Activity B object also appears in the histogram, when, in fact, the heap dump is taken inside Activity A (after pressing the back button inside Activity B).
Please help.
Notes:
I test on a real Android device running Jelly Bean (Android 4.2.2). The device screen resolution is 480×800 (width × height). RAM is 1 GB, and I usually see 400 MB free.
One issue that I have already resolved is a bitmap consuming 3,456,000 bytes. It turned out to be a PNG image set as a background for Activity A. However, it is still a puzzle ~ the PNG is 480×800. So, at ARGB_8888, shouldn't this PNG take only 480×800×4 = 1,536,000 bytes? 3,456,000 is actually 9 × (480×800).
The app works really fast, and does not report any ANR or Out Of Memory error. In addition, it is rare for the app to be janky. (When many other apps are open, and free RAM goes down to around 190MB, that is the situation when the UI seems laggy.)
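The arithmetic behind the 3,456,000-byte puzzle in the notes above can be checked with a short sketch. The 1.5 scale factor is an assumption, not something confirmed by the heap dump: it is the factor Android would apply if the PNG sat in an mdpi (or unqualified) drawable folder and were loaded on an hdpi screen. The helper name `bitmap_bytes` is hypothetical.

```python
BYTES_PER_PIXEL_ARGB_8888 = 4

def bitmap_bytes(width, height, scale=1.0):
    """Bytes for an ARGB_8888 bitmap after scaling both dimensions by `scale`."""
    scaled_width = int(width * scale)
    scaled_height = int(height * scale)
    return scaled_width * scaled_height * BYTES_PER_PIXEL_ARGB_8888

# Unscaled 480x800 image: the size the question expects.
unscaled = bitmap_bytes(480, 800)

# Assumed 1.5x density scaling: 720x1200 pixels.
scaled = bitmap_bytes(480, 800, scale=1.5)

print(unscaled, scaled, scaled / unscaled)
```

Under that assumption the scaled size matches the observed bitmap exactly, since 1.5 × 1.5 = 2.25 and 2.25 × 4 bytes per pixel = 9 bytes per original pixel. If that is the cause, it would also be worth checking which drawable folder the image lives in.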
I would appreciate any idea or insight or opinion. Thanks!
Related
I use the combination of DXGI_SWAP_CHAIN_FLAG_FRAME_LATENCY_WAITABLE_OBJECT, GetFrameLatencyWaitableObject() and SetMaximumFrameLatency(UINT MaxLatency) to control the input lag vs. smoothness of my application as explained at https://learn.microsoft.com/en-us/windows/uwp/gaming/reduce-latency-with-dxgi-1-3-swap-chains. A value of 1 gives the lowest input lag, but sometimes I need a higher value to reduce the jitter, stutter, or slowdown caused by the CPU and GPU not really being able to work in parallel when the value is 1.
I want to be able to dynamically change this value based on the required input lag vs smoothness trade-off.
The problem I have noticed is that while it's possible to, between frames, increase this value by calling SetMaximumFrameLatency with a higher value than set before, I see no effect when decreasing this value by calling the function again with a lower value than the maximum value ever set for this swap chain by a previous call to the same function. So if I ever set it to 2, it is not possible to set it to 1 later. Is this a bug or undocumented "feature"? Or did I do something wrong?
The API itself does not return any error or similar; from the API point of view it appears to apply the new lower value correctly.
To test this, I set BufferCount = 16 and then adjust the max latency value between 1 and 16, which makes the current latency obvious to the eye. It's therefore apparent that DXGI does not apply new lower values.
I've tried to call functions in different orders, close the handle for the waitable object and recreate a new one when modifying the latency, but nothing works. The only workaround so far I'm aware of is to fully recreate the swap chain, which is annoying due to the requirement to unbind all context objects etc.
When initializing the game, I create the swap chain and set an initial latency using SetMaximumFrameLatency.
The game loop is then basically this:
1. Call WaitForSingleObject on the waitable object handle.
2. Process inputs.
3. Render and present a frame.
4. If it's decided that the latency should change at this point, call SetMaximumFrameLatency with the new value.
Other info:
Renderer: Direct3D 11
OS: Windows 11 21H2 version 22000.675
Graphics card: Intel UHD Graphics 620 / Nvidia GeForce MX150 (tried with both cards) with latest drivers, supporting WDDM 3.0
App type: Win32 desktop application
I am learning about memory utilization using MAT in Eclipse, but I have run into a strange problem. Leaving the heavy apps aside, I began with the most benign one, the "Hello World" app. These are the heap stats I get on a Nexus 5 with the ART runtime on Lollipop 5.0.1.
ID: 1
Heap Size: 25.429 MB
Allocated: 15.257 MB
Free: 10.172 MB
% Used: 60%
# Objects: 43487
My Heap dump gives me 3 Memory Leak suspects:
Overview
"Can't post the Pie Chart because of low reputation."
Problem Suspect 1
The class "android.content.res.Resources", loaded by "<system class loader>", occupies 10,166,936 (38.00%) bytes. The memory is accumulated in one instance of "android.util.LongSparseArray[]" loaded by "<system class loader>".
Keywords: android.util.LongSparseArray[], android.content.res.Resources
Problem Suspect 2
209 instances of "android.graphics.NinePatch", loaded by "<system class loader>" occupy 5,679,088 (21.22%) bytes. These instances are referenced from one instance of "java.lang.Object[]", loaded by "<system class loader>".
Keywords: java.lang.Object[], android.graphics.NinePatch
Problem Suspect 3
8 instances of "java.lang.reflect.ArtMethod[]", loaded by "<system class loader>" occupy 3,630,376 (13.57%) bytes.
Biggest instances:
• java.lang.reflect.ArtMethod[62114] # 0x70b19178 - 1,888,776 (7.06%) bytes.
• java.lang.reflect.ArtMethod[21798] # 0x706f5a78 - 782,800 (2.93%) bytes.
• java.lang.reflect.ArtMethod[24079] # 0x70a9db88 - 546,976 (2.04%) bytes.
Keywords: java.lang.reflect.ArtMethod[]
All of this comes from this simple code:
import android.app.Activity;
import android.os.Bundle;

public class MainActivity extends Activity {
    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_main);
    }
}
Questions
Why are the heap numbers so big?
Also, as a side note, the app was consuming 52 MB of RAM in the system.
Where are these 209 instances of NinePatch coming from? I merely created the project with "Create a new Project" in Eclipse.
The first leak suspect, Resources, comes up all the time in my analysis of apps. Is it really a suspect?
What is ArtMethod? Does it have something to do with the ART runtime?
In Lollipop, the default runtime is ART (Android Runtime), which replaces the old Dalvik runtime (DRT) used in older Android versions.
In KitKat, Google released an experimental version of ART to get feedback from users.
Dalvik uses JIT (just-in-time) compilation, which means the DEX code is converted to object code only when you open the application.
In ART, however, the DEX code is converted to object code during installation itself (AOT, ahead-of-time compilation). This object code is larger than the DEX code, so ART needs more RAM than DRT. The advantage of ART is that ART apps have better response times than DRT apps.
I faced this problem yesterday too. The keyword in your log is "NinePatch". In my case, the cause was a "fake" shadow: a tiny picture with an alpha channel that triggered a resource leak. It cost me about 60 MB of leaked memory.
I have a Delphi 6 application that sends bitmaps to a DirectShow DLL in real-time, 25 frames a second. The DirectShow DLL is my code too and is also written in Delphi 6 using the DSPACK DirectShow component suite. I have a simple block of code that goes through each pixel in the bitmap modifying the brightness and contrast of the image, if a certain flag is set, otherwise the bitmap is pushed out the DirectShow DLL unmodified (push source video filter). The code used to be in the main application and then I just moved it into the DirectShow DLL. When it was in the main application it ran fine. I could see the changes in the bitmap as expected. However, now that the code resides in the DirectShow DLL it has the following problems:
When the code block below is active the DirectShow DLL is really slow. I have a quad core i5 and it's really slow. I can also see a big spike in the CPU consumption. In contrast, the very same code running in the main application ran fine on an old single core P4. It did hit the CPU noticeably on that old machine but the video was smooth and there were no problems. The images are only 352 x 288 pixels in size.
I don't see the expected changes to the visible bitmap. I can trace the code in the DirectShow DLL and see the numerical values of each pixel properly altered by the code, but the viewable image in the Graph Edit ActiveMovie window looks completely unchanged.
If I deactivate the code, which I can do in real-time, the ActiveMovie window shows video that is as smooth as glass, perfectly rendered with the CPU barely touched. If I reactivate the code the video is now really choppy, probably showing only 1 to 2 frames a second with a long delay before the first frame is shown, and the CPU spikes. Not completely, but a lot more than I would expect.
I tried compiling the DirectShow DLL with everything on including range checking, overflow checking, etc. and there were no warnings or errors during run-time. I then tried compiling for fastest speed and it still had the exact same problems listed above. Something is really wrong and I can't figure out what. Note, I do indeed lock the canvas before modifying the bitmap and unlock it after I'm done. If it weren't for the "everything on" compilation run I noted above I'd say it felt like an FPU Exception was being raised and silently swallowed with every pixel computation, but as I said, no errors or Exceptions are occurring.
UPDATE: I am putting this here so that the solution, which is embedded in one of Roman R's comments, is plainly visible. The problem was that I was not setting the PixelFormat property to pf24Bit before accessing the ScanLine property. As Roman suggested, not doing this must make the TBitmap code create a temporary copy of the bitmap. As soon as I added the line of code below, the problems went away, both the changes not being visible and the soft page faults. It's an insidious problem because the only object that is affected is the pointer you use to access the ScanLine property, since (assumption) it contains a pointer to a temporary copy of the bitmap. That must be why the subsequent TextOut() call still worked, since it worked on the original copy of the bitmap.
clip.PixelFormat := pf24bit; // The missing code line that fixes the problem.
Here's the code block I've been referring to:
function IntToByte(i: Integer): Byte;
begin
  if i > 255 then
    Result := 255
  else if i < 0 then
    Result := 0
  else
    Result := i;
end;
// ---------------------------------------------------------------
procedure brightnessTurboBoost(var clip: TBitmap; rangeExpansionPowerOf2: integer; shiftValue: Byte);
var
  p0: PByte;
  x,y: Integer;
begin
  if (rangeExpansionPowerOf2 = 0) and (shiftValue = 0) then
    exit; // These parameter settings will not change the pixel values.
  for y := 0 to clip.Height-1 do
  begin
    p0 := clip.scanline[y];
    // Can't just do the whole buffer as a big block of bytes since the
    // individual scan lines may be padded for CPU alignment.
    for x := 0 to (clip.Width - 1) * 3 do
    begin
      if rangeExpansionPowerOf2 >= 1 then
        p0^ := IntToByte((p0^ shl rangeExpansionPowerOf2) + shiftValue)
      else
        p0^ := IntToByte(p0^ + shiftValue);
      Inc(p0);
    end;
  end;
end;
There are a few things to say about this code snippet.
1. First of all, you are using the Scanline property of the TBitmap class. I have not worked with Delphi for many years, so I might be wrong about this, but I am under the impression that Scanline is not actually a thin accessor, is it? It might be internally hiding things that can dramatically affect performance, such as "if he wants to access the bits of the image, then we first have to convert it to a DIB before returning pointers". So something that looks so simple might turn out to be a killer.
2. "if rangeExpansionPowerOf2 >= 1 then" in the inner loop body? You don't really want to make this comparison on every iteration. Either make two separate functions, or duplicate the whole loop in two versions, one for zero and one for non-zero rangeExpansionPowerOf2, and do this if only once.
3. "for ... to (clip.Width - 1) * 3 do": I am not really sure that Delphi optimizes the upper boundary evaluation so that it happens only once. You might be doing that multiplication three times for every pixel, while you could do it only once for the whole image.
4. For top performance, IntToByte should definitely be implemented in MMX to avoid the ifs and process multiple bytes at once.
Still, since you say the images are only 352x288, I would suspect that #1 is what is ruining the performance.
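The per-byte branch and the clamping in IntToByte can both be moved out of the inner loop. The sketch below uses a precomputed 256-entry lookup table, which is a different technique from the MMX suggestion above and is written in Python rather than Delphi, purely to illustrate the idea. The function names `build_table` and `adjust_row` are hypothetical; the row is assumed to be the raw bytes of one 24-bit scan line.

```python
def build_table(range_expansion_power_of_2, shift_value):
    """Precompute the clamped result for every possible byte value (0..255)."""
    return bytes(
        min(255, max(0, (value << range_expansion_power_of_2) + shift_value))
        for value in range(256)
    )

def adjust_row(row, table):
    """Map every byte of one scan line through the table, with no per-byte branch."""
    return row.translate(table)

# Build the table once per parameter change, then reuse it for every row of every frame.
table = build_table(1, 10)
row = bytes([0, 100, 200])
print(list(adjust_row(row, table)))
```

Because a shift by 0 leaves the value unchanged, the same table covers both the "range expansion" and the "shift only" cases, so the loop body needs no `if` at all.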
I have written an app and was testing it for memory leaks when I noticed that the "All Allocations" category in the leaks simulator keeps growing whenever I open and close a sub-view.
I initially thought it was a memory leak, but it does not show up as a leak in the Leaks tab.
Is this normal?
It depends which column of the table you are looking at.
The 'Overall' and 'Overall Bytes' figures will always go up, since they are a running count of allocations made with no account of deallocations.
However, the 'Live Bytes' and '# Living' figures should go up when an object or block of memory has been allocated, but should go down when they are deallocated.
Repeatedly opening and closing a sub-view should (subject to image or data caching) hover around a fixed number of live bytes and living objects / memory blocks.
Instruments sometimes gets a bit confused, however, as you can see from the screenshot. The whole '# Transitory' column is showing '0', which is obviously incorrect. A transitory object is just one that has been allocated and subsequently deallocated, i.e. it's a non-living object.
(# Living + # Transitory == # Overall)
Whenever Instruments gives me that column of zeros, I quit the current run and start a new one.
As for the Leaks Instrument, it will only show those objects or memory blocks that no longer have any pointers pointing to them. If a program continually allocates more and more objects / memory blocks but retains pointers to them, the Leaks Instrument won't show them.
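The relationship between the columns can be modelled with a few counters. This is only a toy model of the bookkeeping described above, not how Instruments is implemented; the class name `AllocationStats` is hypothetical.

```python
class AllocationStats:
    """Toy model of the 'Overall', '# Living' and '# Transitory' counters."""

    def __init__(self):
        self.overall = 0   # running count, never decreases
        self.living = 0    # goes up on allocation, down on deallocation

    @property
    def transitory(self):
        # Objects that were allocated and have since been deallocated.
        return self.overall - self.living

    def allocate(self):
        self.overall += 1
        self.living += 1

    def deallocate(self):
        self.living -= 1

# Opening and closing a sub-view three times, assuming nothing is cached or leaked.
stats = AllocationStats()
for _ in range(3):
    stats.allocate()    # sub-view opened
    stats.deallocate()  # sub-view closed
print(stats.overall, stats.living, stats.transitory)
```

In this model the overall count keeps climbing while the living count returns to its starting value, which is the pattern to look for when deciding whether the growth is a real leak.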
That would make sense, would it not? Every time you do something in the app, something is probably allocated, such as your different subviews. Therefore total allocations will increase. It's just a record of the total allocations.
I am pretty sure I am suffering from a memory leak, but I haven't 100% nailed down how it's happening.
The application I've written downloads 2 images from a URL and puts each set of images, called a transaction, into a queue to be popped off by the user interface and displayed. The images are pretty big, averaging about 2.5 MB. So, as a way of speeding up the user interface and making it more responsive, I pre-load each transaction's images into wxImage objects and store them.
When the user pops off another transaction, I feed the preloaded image into a window object that then converts the wxImage into a bitmap and DC blits to the window. The window object is then displayed on a panel.
When the transaction is finished by the user, I destroy the window object (presumably the window goes away, as does the bitmap) and the transaction data structure is overwritten with 'None'.
However, depending on how many images I've preloaded, whether the queue size is set large and it's done all at once, or whether I let a small queue size sit over time, it eventually crashes. I really can't let this happen .. :)
Does anyone see any obvious logical errors in what I'm doing? Does Python garbage collect? I don't have much experience with having to deal with memory issues.
[edit] Here is the code ;) This is the code related to the thread that downloads the images. It is instantiated in the main thread that runs the GUI, and the download thread's main function is the 'fill_queue' function:
def fill_queue(self):
    while True:
        if (self.len() < self.maxqueuesize):
            try:
                trx_data = self.download_transaction_data(self.get_url)
                for trx in trx_data:
                    self.download_transaction_images(trx)
                    if self.valid_images([trx['image_name_1'], trx['image_name_2']]):
                        trx = self.pre_load_images(trx)
                        self.append(trx)
            except IOError, error:
                print "Received IOError while trying to download transactions or images"
                print "Error Received: ", error
            except Exception, ex:
                print "Caught general exception while trying to download transactions or images"
                print "Error Received: ", ex
        else:
            time.sleep(1)
def download_transaction_images(self, data):
    """ Method will download all the available images for the provided transaction """
    for (a, b) in data.items():
        if (b) and (a == "image_name_1" or a == "image_name_2"):
            modified_url = self.images_url + self.path_from_filename(b)
            download_url = modified_url + b
            local_filepath = self.cache_dir + b
            urllib.urlretrieve(download_url, local_filepath)
            urllib.urlcleanup()
def download_transaction_data(self, trx_location):
    """ Method will download transaction data and return a parsed list of hash structures """
    page = urllib.urlopen(trx_location)
    data = page.readlines()
    page.close()
    trx_list = []
    trx_data = {}
    for line in data:
        line = line.rstrip('|!\n')
        if re.search('id=', line):
            fields = re.split('\|', line)
            for jnd in fields:
                pairs = jnd.split('=')
                trx_data[pairs[0]] = pairs[1]
            trx_list.append(trx_data)
    return trx_list
def pre_load_images(self, trx):
    """ Method will create a wxImage and load it into memory to speed the image display """
    path1 = self.cache_dir + trx['image_name_1']
    path2 = self.cache_dir + trx['image_name_2']
    image1 = wx.Image(path1)
    image2 = wx.Image(path2)
    trx['loaded_image_1'] = image1
    trx['loaded_image_2'] = image2
    return trx
def valid_images(self, images):
    """ Method verifies that the image path is valid and image is readable """
    retval = True
    for i in images:
        if re.search('jpg', i) or re.search('jpeg', i):
            imagepath = self.cache_dir + i
            if not os.path.exists(imagepath) or not wx.Image.CanRead(imagepath):
                retval = False
        else:
            retval = False
    return retval
Also, I'd like to add that sometimes, just before the crash I get peculiar errors in my console, they look like corrupt image errors but the images are not corrupted, the error has happened at all stages on all images.
Application transferred too few scanlines
[2009-09-08 11:12:03] Error: JPEG: Couldn't load - file is probably corrupted.
[2009-09-08 11:12:11] Debug: ....\src\msw\dib.cpp(134): 'CreateDIBSection' failed with error 0x00000000 (the operation completed successfully.).
These errors can happen a la carte, or all together. What I think is happening is that at some point the memory becomes corrupted and anything that happens next, if I load a new transaction, or image, or do a cropping operation - it takes a dive.
So, unfortunately, after trying the suggestion of moving the pre-loading call to wxImage into the main GUI thread, I am still getting the error. Again, it occurs after too many images have been loaded into memory, or if they sit in memory for too long. Then, when I attempt to crop an image, I get a memory error. Something is being corrupted: either, in the former case, I am using too much memory or don't have enough (which makes no sense, because I've increased my paging file size to astronomical proportions), or, in the latter case, the length of time is causing a leak or corruption.
The only way I think I can go at this point is to use a debugger - are there any easy ways to debug a wxPython application? I would like to see the memory usage in particular.
The main reason I think I need to preload the images is that if I call wxImage on each image (I show two at a time) each time I load a 'transaction', the transition from one transaction to the next is very slow and clunky. If I load them into memory it's very fast, but then I get my memory error.
Two thoughts:
You do not mention whether the downloading is running in a separate thread (actually, now I see that it is; I should read more closely). I'm pretty sure that wx.Image is not thread-safe, so if you are instantiating wx.Images in a non-GUI thread, that could lead to trouble like this. (This is almost certainly the issue; most wx classes, objects, and functions are not thread-safe.)
I've been bitten by nasty IncRef/DecRef bugs in wxPython (due to the underlying C++ bindings) before (mostly associated with wx.Grid and associated classes). While I don't know of any with wx.Image, it wouldn't surprise me to find out you may be required to manually manage memory like you have to in wx.Grid sometimes.
Edit
You need to instantiate the wx.Image in the GUI thread, not the downloading thread (your code above looks like it currently instantiates it in the non-GUI thread). In general, this is almost always going to cause lots of problems in any GUI toolkit. You can search the wxPython mailing list for lots of emails where this is the case. Personally, I would do this:
Queue for download urls.
Thread to download images.
Have the downloading thread place the disk location (watch out for race conditions!) in a separate queue and post a custom wx.Event to the App thread (posting with the wx.PostEvent function is thread-safe).
Have the GUI thread pop the file locations and instantiate wx.Image ----> wx.Bitmap (maybe with wx.CallAfter to process when App is idle)
Display (Blit) as needed.
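The steps above can be sketched roughly as follows. This is a minimal outline in Python 3 using only the standard library so it can run without wxPython. The `fetch`, `notify`, and `load_image` callables are hypothetical stand-ins: in a real application `notify` might be a call such as `wx.CallAfter(handler)` and `load_image` might be `wx.Image`, both invoked as described in the list.

```python
import queue
import threading

def download_worker(urls, paths, fetch, notify):
    """Background thread: download files and enqueue their disk locations only."""
    for url in urls:
        path = fetch(url)   # stand-in for the real download; returns a local path
        paths.put(path)     # queue.Queue is safe to share between threads
        notify()            # stand-in for waking up the GUI thread

def drain_paths(paths, load_image):
    """GUI thread: pop the queued paths and create the image objects here."""
    images = []
    while True:
        try:
            path = paths.get_nowait()
        except queue.Empty:
            break
        images.append(load_image(path))  # image objects are created only on this thread
    return images

# Example wiring with dummy callables in place of the network and wx.
paths = queue.Queue()
worker = threading.Thread(
    target=download_worker,
    args=(["a", "b"], paths, lambda url: "/tmp/" + url + ".jpg", lambda: None),
)
worker.start()
worker.join()
print(drain_paths(paths, lambda path: ("image", path)))
```

The point of the split is that the background thread never touches an image object; it only passes strings through the queue, and everything toolkit-related happens on the thread that calls `drain_paths`.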