LabVIEW - Store very large 3D array (array of images) - memory-management

I am working on a LabVIEW project in which I have to process some video (for example 5000 images of 640*480 pixels, so a lot of data to process). Using a for loop I am processing one image at a time, so on that side everything is okay. On the other side, though, I have to store the results so I can visualise the result of any desired image after the processing. Until now I have always worked with arrays, but here LabVIEW does not have enough memory to do the job (which is quite normal).
Is there a better way to deal with the data, using another solution such as a cluster, saving the images to the local disk, etc.?
For information, the processing is quite long (several minutes for a single image) and I don't have to save the results until the user asks for them, so I am anticipating the case where the whole video is processed without saving the results.
Thank you in advance.

How much RAM do you have? Assuming 4 bytes per pixel, 5000 640 x 480 images would take about 6 GB, so if you have 16 GB RAM or more then you might be able to handle this data in RAM as long as you're using 64-bit LabVIEW and you're careful about how memory is allocated - read through VI Memory Usage from the help, for a start.
Alternatively you can look at storing the data on disk in a format where you can access an arbitrary chunk from the file. I haven't used it much myself but HDF5 seems to be the obvious choice - if you're on Windows you can install the LiveHDF5 library from the VI package manager.
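LabVIEW block diagrams can't be shown in text, but the underlying idea of a random-access frame store on disk is easy to sketch; here is a rough illustration in Java (not LabVIEW - the class and field names are made up). Each processed frame is written as a fixed-size record, so frame i can be read back later by seeking to i * frameBytes; the same pattern can be built with LabVIEW's binary file functions.

    import java.io.IOException;
    import java.nio.ByteBuffer;
    import java.nio.channels.FileChannel;
    import java.nio.file.Path;
    import java.nio.file.StandardOpenOption;

    // Minimal sketch: a flat binary file holding fixed-size frames, so any processed
    // image can be read back by index without keeping all 5000 of them in RAM.
    public class FrameStore implements AutoCloseable {
        private final FileChannel channel;
        private final int frameBytes;   // e.g. 640 * 480 * 4 for a 4-byte-per-pixel result

        public FrameStore(Path file, int frameBytes) throws IOException {
            this.channel = FileChannel.open(file,
                    StandardOpenOption.CREATE, StandardOpenOption.READ, StandardOpenOption.WRITE);
            this.frameBytes = frameBytes;
        }

        public void writeFrame(int index, ByteBuffer frame) throws IOException {
            channel.write(frame, (long) index * frameBytes);   // positional write, no seek state
        }

        public ByteBuffer readFrame(int index) throws IOException {
            ByteBuffer buf = ByteBuffer.allocate(frameBytes);
            channel.read(buf, (long) index * frameBytes);
            buf.flip();
            return buf;
        }

        @Override
        public void close() throws IOException { channel.close(); }
    }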

Did you consider storing the images as files in the system temporary directory and deleting them afterwards? Since the processing takes a long time per image, it should be easy to keep an "image queue" of 5 images loaded into memory at all times (to avoid any performance drop from loading a file right before processing) while the rest sit on disk.
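A minimal sketch of such a prefetch queue, again in Java rather than LabVIEW purely to illustrate the idea (the file list and the queue depth of 5 are taken from the suggestion above):

    import java.nio.file.*;
    import java.util.List;
    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;

    // Sketch only: a bounded queue that keeps at most 5 images in RAM while a
    // background thread prefetches the rest from disk.
    public class PrefetchQueue {
        public static BlockingQueue<byte[]> start(List<Path> imageFiles) {
            BlockingQueue<byte[]> queue = new ArrayBlockingQueue<>(5);   // at most 5 images in memory
            Thread loader = new Thread(() -> {
                try {
                    for (Path p : imageFiles) {
                        queue.put(Files.readAllBytes(p));   // blocks while the queue is full
                    }
                } catch (Exception e) {
                    Thread.currentThread().interrupt();
                }
            });
            loader.setDaemon(true);
            loader.start();
            return queue;   // the processing loop calls queue.take() for the next image
        }
    }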

Related

What is the most efficient way to transfer pictures over wifi from FlashAir in Java8?

I'm creating a program that reads pictures (JPG, max size about 10 MB per file) from a FlashAir card as soon as they're taken, displays them on a bigger screen for review and saves them to a local folder. It is paramount to reduce the time from the moment the picture is taken until it is displayed to the user and to prevent loss of quality (they are macro pictures). Now, the camera works with JPG, so changing that is not an option for the moment. All the pictures must be saved locally in the maximum possible quality.
I was wondering what would be the best way to achieve this. Since the FlashAir card is in the camera and moves around, the bottleneck will probably be in the wireless transfer (max speed is 54 Mb/s).
The picture could be displayed within the Java app or sent to a different app for editing, but I want to reduce I/O operations (I don't want to have to re-read the picture once it is saved locally just to display it).
What is the best way to achieve this using pure Java 8 classes?
My test implementation uses the ImageIO.read() and ImageIO.write() methods. The problems I have with this approach are that it takes a long time for the picture to be displayed (it is actually read back from the saved folder) and that the image is re-encoded and compressed, losing quality compared to the original file on the SD card.
I feel there should be a way to transfer the bytes efficiently over the network first and then run two parallel tasks: one saving the untouched bytes to disk, and one decoding and displaying the image (which could later be edited and saved to a different location).
I don't need a fully working example. My main concern is which Java 8 I/O classes are best suited for this job and whether my approach is the right one.
Edit
After some research I was thinking of using a ReadableByteChannel to store the picture bytes in a ByteBuffer and then pass copies of it to two jobs that will run in parallel: the one saving the bytes would use a FileChannel and the one displaying the image would use them to create an ImageIcon.
I don't know if there is a better/recommended approach.
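A minimal sketch of that idea, using plain streams and CompletableFuture instead of channels (a channel-based version would be analogous); the FlashAir URL and file names below are hypothetical placeholders:

    import javax.imageio.ImageIO;
    import javax.swing.ImageIcon;
    import java.awt.image.BufferedImage;
    import java.io.ByteArrayInputStream;
    import java.io.ByteArrayOutputStream;
    import java.io.InputStream;
    import java.net.URL;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;
    import java.util.concurrent.CompletableFuture;

    // Sketch only: fetch the raw JPEG bytes once, then save and decode them in parallel.
    public class FlashAirFetch {
        public static void main(String[] args) throws Exception {
            URL source = new URL("http://flashair/DCIM/100__TSB/IMG_0001.JPG");   // placeholder URL
            Path target = Paths.get("IMG_0001.JPG");                              // placeholder target

            // Read the bytes exactly once over the wireless link (Java 8 has no readAllBytes yet).
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            byte[] chunk = new byte[64 * 1024];
            try (InputStream in = source.openStream()) {
                for (int n; (n = in.read(chunk)) != -1; ) {
                    bos.write(chunk, 0, n);
                }
            }
            byte[] raw = bos.toByteArray();

            // Task 1: write the untouched bytes to disk - no re-encoding, no quality loss.
            CompletableFuture<Void> save = CompletableFuture.runAsync(() -> {
                try {
                    Files.write(target, raw);
                } catch (Exception e) {
                    throw new RuntimeException(e);
                }
            });

            // Task 2: decode the same in-memory bytes for immediate display.
            CompletableFuture<ImageIcon> show = CompletableFuture.supplyAsync(() -> {
                try {
                    BufferedImage img = ImageIO.read(new ByteArrayInputStream(raw));
                    return new ImageIcon(img);
                } catch (Exception e) {
                    throw new RuntimeException(e);
                }
            });

            save.join();
            ImageIcon icon = show.join();   // hand this to the Swing event thread for display
        }
    }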

Large image sequences in Flash - How to reduce memory usage?

So I'm trying to publish a looping animation in Flash with a large number of image sequences in it (around 3000 frames all told), and I'm having problems with the SWF using too much memory when it's played.
The SWF is about 80 MB, but it uses in excess of 2 GB of RAM when played. I don't know why that would be. A memory leak?
My understanding is that Flash will just load all the images in a SWF into memory unless you dump the memory somehow. Can anyone explain how to do this? Is it possible? I can't seem to find a solution online.
Thanks
A SWF file keeps all images in a compressed format. They may compress very well, but when played they must be decompressed, so they take up much more memory.
How to optimize that is up to you. There is also the possibility that it cannot be optimized - for example, if the SWF automatically decompresses all images on load. You can test this by checking the free memory after the SWF is loaded but before the animation is accessed. If the standard timeline animation takes too much memory, you can try your own custom animation: for example, in an ENTER_FRAME loop you can keep only one image instance for the current frame, so past instances are removed and garbage collected. Hard to say without testing.
Thanks for your responses. As I couldn't find an answer to this, I've resorted to using external video. Cheers for your time.

Constant Write Speed to Disk

I'm writing real-time data to an empty spinning disk sequentially. (EDIT: It doesn't have to be sequential, as long as I can read it back as if it was sequential.) The data arrives at a rate of 100 MB/s and the disks have an average write speed of 120 MB/s.
Sometimes (especially as free space starts to decrease) the disk speed goes under 100 MB/s depending on where on the platter the disk is writing, and I have to drop vital data.
Is there any way to write to disk in a pattern (or some other way) to ensure a constant write speed close to the average rate? Regardless of how much data there currently is on the disk.
EDIT:
Some notes on why I think this should be possible.
When writing to the disk normally, it starts in the fast portion of the platter and then writes towards the slower parts. However, if I could write half the data to the fast part and half to the slow part (i.e. in each second write 50 MB to the fast part and 50 MB to the slow part), the two write positions should meet in the middle and I could possibly achieve a constant rate?
As a programmer, I am not sure how I can decide where on the platter the data is written or even if the OS can achieve something similar.
If I had to do this on a regular Windows system, I would use a device with a higher average write speed to give me more headroom. Expecting 100MB/s average write speed over the entire disk that is rated for 120MB/s is going to cause you trouble. Spinning hard disks don't have a constant write speed over the whole disk.
The usual solution to this problem is to buffer in RAM to cover up infrequent slow downs. The more RAM you use as a buffer, the longer the span of slowness you can handle. These are tradeoffs you have to make. If your problem is the known slowdown on the inside sectors of a rotating disk, then your device just isn't fast enough.
Another thing that might help is to access the disk as directly as possible and ensure it isn't being shared by other parts of the system. Use a separate physical device, don't format it with a filesystem, write directly to the partitioned space. Yes, you'll have to deal with some of the issues a filesystem solves for you, but you also skip a bunch of code you can't control. Even then, your app could run into scheduling issues with Windows. Windows is not an RTOS; there are no guarantees as far as timing. Again this would help more with temporary slowdowns from filesystem cleanup, flushing dirty pages, etc. It probably won't help much with the "last 100GB writes at 80MB/s" problem.
If you really are stuck with a disk that goes from 120MB/s -> 80MB/s outside-to-inside (you should test with your own code and not trust the specs from the manufacturer so you know what you're dealing with), then you're going to have to play partitioning games like others have suggested. On a mechanical disk, that will introduce some serious head seeking, which may eat up your improvement. To minimize seeks, it would be even more important to ensure it's a dedicated disk the OS isn't using for anything else. Also, use large buffers and write many megabytes at a time before seeking to the other end of the disk. Instead of partitioning, you could write directly to the block device and control which blocks you write to. I don't know how to do this in Windows.
To solve this on Linux, I would be tempted to test mdadm's raid0 across two partitions on the same drive and see if that works. If so, then the work is done and you don't have to write and test some complicated write mechanism.
Partition the disk into two equally sized partitions. Write a few seconds' worth of data alternating between the partitions. That way you get almost all of the usual sequential speed, nicely averaged. One disk seek every few seconds eats up almost no time. One seek per second reduces the usable time from 1000 ms to ~990 ms, which is a ~1% reduction in throughput. The more RAM you can dedicate to buffering, the less you have to seek.
Use more partitions to increase the averaging effect.
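A minimal sketch of that alternation, assuming the partitions are reachable as ordinary file paths (the paths and the chunk source below are placeholders): buffer a few seconds of data in RAM, then append it to each partition in turn so the fast and slow regions average out.

    import java.io.IOException;
    import java.nio.ByteBuffer;
    import java.nio.channels.FileChannel;
    import java.nio.file.Path;
    import java.nio.file.Paths;
    import java.nio.file.StandardOpenOption;

    // Sketch only: alternate large appends between two partitions so throughput is
    // limited by the average of the fast and slow regions, not by the worst case.
    public class AlternatingWriter {
        public static void main(String[] args) throws IOException {
            // Placeholder paths - in practice, files on two partitions of the same drive.
            Path[] targets = { Paths.get("D:/stream.bin"), Paths.get("E:/stream.bin") };
            FileChannel[] channels = new FileChannel[targets.length];
            for (int i = 0; i < targets.length; i++) {
                channels[i] = FileChannel.open(targets[i],
                        StandardOpenOption.CREATE, StandardOpenOption.WRITE, StandardOpenOption.APPEND);
            }

            int turn = 0;
            while (true) {
                ByteBuffer chunk = acquireNextChunk();   // a few seconds of buffered data
                if (chunk == null) break;
                channels[turn].write(chunk);             // one seek per chunk, amortized away
                turn = (turn + 1) % channels.length;
            }
            for (FileChannel c : channels) c.close();
        }

        // Stand-in for the real RAM buffer / data source; returns null when the stream ends.
        private static ByteBuffer acquireNextChunk() { return null; }
    }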
I fear this may be more difficult than you realize:
If your average 120 MB/s write speed is the manufacturer's value then it is most likely "optimistic" at best.
Even a benchmarked write speed is usually done on a non-partitioned/formatted drive and will be higher than what you'd typically see in actual use (how much higher is a good question).
A more important value is the drive's minimum write speed. For example, from Tom's Hardware 2013 HDD Benchmarks a drive with a 120 MB/s average has a 76 MB/s minimum.
A drive that is being used by other applications at the same time (e.g., Windows) will have a much lower write speed.
An even more important value is the drive's actual measured performance. I would make a simple application similar to your use case that writes data to the drive as fast as possible until it fills the drive. Do this a few (dozen) times to get a more realistic average/minimum/maximum write speed value... it will likely be lower than you'd expect.
As you noted, even if your "real" average write speed is higher than 100 MB/s you run into issues if you run into slow write speeds just before the disk fills up, assuming you don't have somewhere else to write the data to. Using a buffer doesn't help in this case.
I'm not sure if you can actually specify a physical location to write to on the hard drive these days without getting into the drive's firmware. Even if you could this would be my last choice for a solution.
A few specific things I would look at to solve your problem:
Measure the "real" write performance of the drive to see if it's fast enough. This gives you an idea of how far behind you actually are.
Put the OS on a separate drive to ensure the data drive is not being used by anything other than your application.
Get faster drives (either HDD or SSD). It is fine to use the manufacturer's write speeds as an initial guide, but test them thoroughly as well.
Get more drives and put them into a RAID0 (or similar) configuration for faster write access. You'll again want to actually test this to confirm it works for you.
You could implement the strategy of alternating writes between the inside and the outside by directly controlling the disk write locations. Under Windows you can open a disk like "\\.\PhysicalDriveX" and control where it writes. For more info see
http://msdn.microsoft.com/en-us/library/windows/desktop/aa363858(v=vs.85).aspx
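A rough sketch of that idea in Java (not a definitive recipe): it assumes the process runs with administrator rights, that any volumes on the drive are not in use, and that every offset and length is a multiple of the sector size, since raw device handles reject unaligned I/O. The drive number, sector size and capacity below are placeholders.

    import java.io.IOException;
    import java.io.RandomAccessFile;

    // Sketch only: open the raw physical drive on Windows and write at chosen byte offsets.
    public class RawDiskWrite {
        private static final long SECTOR = 4096;                   // assumed physical sector size
        private static final long DISK_BYTES = 4_000_000_000_000L; // placeholder total capacity

        public static void main(String[] args) throws IOException {
            byte[] block = new byte[(int) SECTOR * 256];            // 1 MiB, sector-aligned length

            try (RandomAccessFile disk = new RandomAccessFile("\\\\.\\PhysicalDrive1", "rw")) {
                disk.seek(0);                                       // outer (fast) edge of the platter
                disk.write(block);

                long inner = (DISK_BYTES - block.length) / SECTOR * SECTOR;
                disk.seek(inner);                                   // inner (slow) edge of the platter
                disk.write(block);
            }
        }
    }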
First of all, I hope you are using raw disks and not a filesystem. If you're using a filesystem, you must:
Create an empty, non-sparse file that's as large as the filesystem will fit.
Obtain a mapping from the logical file positions to disk blocks.
Reverse this mapping, so that you can map from disk blocks to logical file positions. Of course some blocks are unavailable due to the filesystem's own use.
At this point, the disk looks like a raw disk that you access by disk block. It's a valid assumption that this block addressing is mostly monotonic with respect to the physical cylinder number. IOW, if you increase the disk block number, the cylinder number will never decrease (or never increase -- depending on the drive's LBA-to-physical mapping order).
Also, note that a disk's average write speed may be given per cylinder or per unit of storage. How would you know? You need the latter number, and the only sure way to get it is to benchmark it yourself. You need to fill the entire disk with data, by repeatedly writing a zero page to the disk, going block by block, and then divide the total amount of data written by the time it took. You need to be accessing the disk or the file in direct mode. This should disable the OS buffering for the file data, but not for the filesystem metadata (if not using a raw disk).
At this point, all you need to do is to write data blocks of sensible sizes at the two extremes of the block numbers: you need to fill the disk from both ends inwards. The size of the data blocks depends on the bandwidth wastage you can allow for seeks. You should also assume that the hard drive might seek once in a while to update its housekeeping data. Assuming a worst-case seek taking 15 ms, you waste 1.5% of per-second bandwidth for each seek. Assuming you can spare no more than 5% of bandwidth, with 1 seek/s on average for the drive itself, you can seek twice per second. Thus your block size needs to be your_bandwidth_per_second/2. This bandwidth is not the disk bandwidth, but the bandwidth of your data source.
Alas, if only things were this easy. It generally turns out that the bandwidth at the middle of the disk is not the average bandwidth. During your benchmark you must also take note of the write speed over smaller sections of the disk, say every 1% of the disk. This way, when writing into each section of the disk, you can figure out how to split the data between the "low" and the "high" section that you're writing to. Suppose that you're starting out at the 0% and 99% positions on the disk, and the low position has a bandwidth of mean*1.5, and the high position has a bandwidth of mean*0.8, where mean is your desired mean bandwidth. You'll then need to write 100% * 1.5/(0.8+1.5) of the data into the low position, and the remainder (100% * 0.8/(0.8+1.5)) into the slower high position.
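A tiny worked example of that split, with the per-section bandwidths as hypothetical measurements from the per-1%-section benchmark described above:

    // Worked example of the proportional split. The section bandwidths are assumptions.
    public class SplitRatio {
        public static void main(String[] args) {
            double meanMBs = 110.0;                 // measured mean bandwidth of the whole disk
            double lowMBs  = meanMBs * 1.5;         // current outer ("low" block number) section
            double highMBs = meanMBs * 0.8;         // current inner ("high" block number) section

            double toLow  = lowMBs  / (lowMBs + highMBs);   // fraction of incoming data -> low section
            double toHigh = highMBs / (lowMBs + highMBs);   // remainder -> high section

            // With a 100 MB/s source this prints roughly 65.2 MB/s and 34.8 MB/s.
            System.out.printf("low: %.1f MB/s, high: %.1f MB/s%n", 100 * toLow, 100 * toHigh);
        }
    }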
The size of your buffer needs to be larger than just the block size, since you must assume some worst-case latency for the hard drive if it hits bad blocks and needs to relocate data, etc. I'd say a 3-second buffer may be reasonable. Optionally it can grow by itself if the latencies you measure while your software runs turn out to be higher. This buffer must be locked ("pinned") to physical memory so that it's not subject to swapping.
Another possible option is to destroke (or short stroke) a hard drive. If you start with a 4 TB or larger drive and destroke it to 2 TB, only the outer portions of the platters will be used, resulting in a faster throughput rate. The issue would be getting hold of software that issues the vendor-unique commands to the hard drive to destroke it.

Writing multiple files Vs. writing one big file [in a solid state drive]

(I was not able to find a clear answer to my question, maybe I used the wrong search term)
I want to record many images from a camera, with no compression or lossless compression, on a not-so-powerful device with a single solid state drive.
After investigating, I have decided that, if any, the compression will simply be PNG, image by image (this is not part of the discussion).
Given these constraints, I want to be able to record at maximum possible frequency from the camera. The bottleneck is the (only one) hard drive speed. I want to use the RAM for queuing, and the few available cores for compressing the images in parallel, so that there's less data to write.
Once the data is compressed, do I get any gain in writing speed if I stream all the bytes into one single file, or, considering that I am working with a solid state drive, can I just write one file (let's say about 1 or 2 MB) per image while still working at the maximum disk bandwidth (or very close to it, like >90%)?
I don't know if it matters, but this will be done using C++ and its libraries.
My question is "simply" whether, by writing my output to a single file instead of many 2 MB files, I can expect a significant benefit when working with a solid state drive.
There's a benefit, but not a significant one. A file system driver for a solid state drive already knows how to distribute the data of a file across many non-adjacent clusters, so doing it yourself doesn't help; that is necessary anyway to fit a large file on a drive that already contains files. By breaking the data up, you force extra writes to add the directory entries for those segments.
The type of solid state drive matters, but this scattering is in general already done by the drive's controller to implement "wear-leveling". In other words, the data is intentionally scattered across the drive. This avoids wearing out flash memory cells; they have a limited number of times you can write them before they physically wear out and fail. Traditionally only guaranteed for 10,000 writes, they've gotten better. You'll exercise this, of course. Notable as well is that flash drives are fast to read but slow to write, which matters in your case.
There's one notable advantage to breaking up the image data into separate files: it is easier to recover from a drive error, either from a disastrous failure or from the drive just filling up to capacity without you stopping in time. You don't lose the entire shot. But it is inconvenient for whatever program reads the images off the drive, since it has to glue them back together. That is an important design goal as well: if you make it too impractical with a non-standard uncompressed file format, or just too slow to transfer, or just too inconvenient in general, then it will simply not get used very often.
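The OP mentions C++, but the comparison itself is easy to sketch; here is a rough, hypothetical Java benchmark that writes the same amount of data once as a single file and once as many 2 MB files, including the flush cost of the extra metadata (file names and sizes are arbitrary).

    import java.io.IOException;
    import java.nio.ByteBuffer;
    import java.nio.channels.FileChannel;
    import java.nio.file.Path;
    import java.nio.file.Paths;
    import java.nio.file.StandardOpenOption;

    // Quick-and-dirty throughput comparison, sketch only.
    public class OneBigVsManySmall {
        static final int IMAGE_BYTES = 2 * 1024 * 1024;   // stand-in for a compressed PNG
        static final int IMAGES = 500;

        public static void main(String[] args) throws IOException {
            ByteBuffer image = ByteBuffer.allocate(IMAGE_BYTES);

            long t0 = System.nanoTime();
            try (FileChannel big = FileChannel.open(Paths.get("big.bin"),
                    StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
                for (int i = 0; i < IMAGES; i++) {
                    image.rewind();
                    big.write(image);
                }
                big.force(true);                            // include flush time
            }
            long oneFileNanos = System.nanoTime() - t0;

            long t1 = System.nanoTime();
            for (int i = 0; i < IMAGES; i++) {
                Path p = Paths.get(String.format("img_%04d.bin", i));
                try (FileChannel c = FileChannel.open(p,
                        StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
                    image.rewind();
                    c.write(image);
                    c.force(true);                          // per-file flush, including metadata
                }
            }
            long manyFilesNanos = System.nanoTime() - t1;

            System.out.printf("one file: %d ms, many files: %d ms%n",
                    oneFileNanos / 1_000_000, manyFilesNanos / 1_000_000);
        }
    }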

Comparing 2 kernel images and flashing the diff to FLASH memory

I have existing old-version images (kernel image, filesystem image, application images) in my NAND flash.
I want to put the new modified kernel or application image onto the NAND flash, replacing the older one.
But 90% of each new image is common with the old image,
so I don't want to transfer the entire new image.
Instead, I am thinking of some kind of comparison between the old and new images, so that I only send the difference to the flash memory and avoid transferring a large amount of data.
Is this possible? I need some guidance on how to do this.
It's certainly possible; however, with flash you'll have to take into account the difference between the erase sector size and the write sector size (typically an erase block is several write sectors in size).
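A minimal sketch of the comparison step, assuming a 128 KiB erase block (take the real value from the NAND datasheet) and hypothetical image file names: diff the two images erase block by erase block and collect only the blocks that need to be erased and rewritten.

    import java.nio.file.Files;
    import java.nio.file.Paths;
    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.List;

    // Sketch only: find which erase blocks differ between the old and new image.
    public class ImageDiff {
        static final int ERASE_BLOCK = 128 * 1024;   // assumed erase-block size

        public static void main(String[] args) throws Exception {
            byte[] oldImg = Files.readAllBytes(Paths.get("uImage.old"));   // placeholder names
            byte[] newImg = Files.readAllBytes(Paths.get("uImage.new"));

            List<Integer> dirtyBlocks = new ArrayList<>();
            int blocks = (Math.max(oldImg.length, newImg.length) + ERASE_BLOCK - 1) / ERASE_BLOCK;
            for (int b = 0; b < blocks; b++) {
                byte[] a = slice(oldImg, b * ERASE_BLOCK);
                byte[] c = slice(newImg, b * ERASE_BLOCK);
                if (!Arrays.equals(a, c)) dirtyBlocks.add(b);   // only these need erase + rewrite
            }
            System.out.println("blocks to reflash: " + dirtyBlocks.size() + " of " + blocks);
        }

        // Returns one erase-block-sized slice, zero-padded past the end of the image.
        static byte[] slice(byte[] data, int offset) {
            byte[] out = new byte[ERASE_BLOCK];
            int n = Math.max(0, Math.min(ERASE_BLOCK, data.length - offset));
            if (n > 0) System.arraycopy(data, offset, out, 0, n);
            return out;
        }
    }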
This would be very difficult, for two reasons.
The Linux kernel is stored compressed, so a small change can cause all the compression output following that point to be different.
If a modification changes the size of some code, everything stored after that will have to shift forward or back.
In theory, you could create your own way of linking and/or compressing the kernel so that code stays in one place and compression happens in a block-aware way, but that would be a lot of work -- probably not worth it just to save a few minutes of erase/write time during kernel upgrades.