How to fix low FBER in MRI volumes - metrics

I’m using the MRQy tool (https://github.com/ccipd/MRQy). It essentially calculates a bunch of quality metrics across patient volumes.
I have 34 patient volumes, of which 33 have FBER = 0 and 1 has FBER > 0. FBER is the Foreground-Background Energy Ratio; it is usually associated with Gibbs ringing effects, and a higher FBER means better quality.
The only preprocessing applied to the volumes is:
Isotropic resampling
Background noise reduction using a foreground mask (Otsu threshold and histogram equalization).
I'm not sure what a low FBER indicates or how to fix it, and I can't see any clear visual distinction between the volumes with different FBER values either.
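I'm not certain of MRQy's exact formula, but FBER is commonly defined as the ratio of the energy (mean squared intensity) of the foreground voxels to that of the background voxels. A minimal sketch, assuming a NumPy volume and an Otsu-derived foreground mask (the function name and edge-case handling here are mine, not MRQy's):

```python
# Sketch of a typical FBER computation (foreground/background energy ratio).
# Assumption: MRQy's exact formula may differ (e.g. median vs. mean energy);
# this follows the common definition used by MRI QC tools.
import numpy as np
from skimage.filters import threshold_otsu

def fber(volume: np.ndarray) -> float:
    """Ratio of mean squared intensity inside vs. outside an Otsu foreground mask."""
    mask = volume > threshold_otsu(volume)
    fg_energy = np.mean(volume[mask].astype(np.float64) ** 2)
    bg_energy = np.mean(volume[~mask].astype(np.float64) ** 2)
    if bg_energy == 0:        # background already (near) zero after preprocessing
        return 0.0            # implementations handle this degenerate case differently
    return float(fg_energy / bg_energy)
```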

Related

Smooth Lower Resolution Voxel Noise

After reading the blog post at n0tch.tumblr.com/post/4231184692/terrain-generation-part-1, I was interested in Notch's solution of sampling at lower resolutions. I implemented it in my engine, but quickly noticed that he doesn't go into detail about what he interpolates between to smooth out the noise.
From the blog:
Unfortunately, I immediately ran into both performance issues and
playability issues. Performance issues because of the huge amount of
sampling needed to be done, and playability issues because there were
no flat areas or smooth hills. The solution to both problems turned
out to be just sampling at a lower resolution (scaled 8x along the
horizontals, 4x along the vertical) and doing a linear interpolation.
This is the result of the low-res method without smoothing:
[image: low-res voxel]
I attempted to smooth out the noise in the chunk noise array and instantly noticed a problem:
[image: attempt at smoothing]
The noise also looks less random now.
As you can see, there is an obvious transition between chunks. How exactly do I use interpolation to smooth out the low-resolution noise map so that the borders between chunks connect smoothly while still appearing random?
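For reference, here is one way this is commonly done (a Python sketch, not Notch's actual code): sample the noise on a coarse lattice whose points sit at fixed world coordinates, so neighbouring chunks share the samples on their common border, then trilinearly interpolate inside each coarse cell. The chunk dimensions and the `noise3` function below are placeholders.

```python
# Sketch: build one chunk's density values by sampling noise only on a coarse
# lattice and trilinearly interpolating between those samples.
# Assumptions: chunk size and noise3() are stand-ins; the 8x horizontal /
# 4x vertical spacing follows the quoted post.
import numpy as np

CHUNK_X, CHUNK_Y, CHUNK_Z = 16, 128, 16   # assumed chunk dimensions
SX, SY, SZ = 8, 4, 8                      # coarse sample spacing

def noise3(x, y, z):
    # stand-in for your real 3D noise; it must depend only on world coordinates
    return np.sin(x * 0.05) * np.cos(z * 0.05) + np.sin(y * 0.02)

def fill_chunk(origin):
    ox, oy, oz = origin                   # world position of the chunk corner
    nx, ny, nz = CHUNK_X // SX, CHUNK_Y // SY, CHUNK_Z // SZ

    # 1. Sample the noise only at coarse lattice points, in *world* coordinates.
    #    Lattice points on a chunk border have identical world coordinates for
    #    both neighbouring chunks, so the interpolated values match at the seam.
    coarse = np.empty((nx + 1, ny + 1, nz + 1))
    for i in range(nx + 1):
        for j in range(ny + 1):
            for k in range(nz + 1):
                coarse[i, j, k] = noise3(ox + i * SX, oy + j * SY, oz + k * SZ)

    # 2. Trilinearly interpolate the coarse samples up to full resolution.
    def lerp(a, b, t):
        return a + (b - a) * t

    out = np.empty((CHUNK_X, CHUNK_Y, CHUNK_Z))
    for x in range(CHUNK_X):
        i, rx = divmod(x, SX)
        for y in range(CHUNK_Y):
            j, ry = divmod(y, SY)
            for z in range(CHUNK_Z):
                k, rz = divmod(z, SZ)
                fx, fy, fz = rx / SX, ry / SY, rz / SZ
                c = coarse[i:i + 2, j:j + 2, k:k + 2]
                out[x, y, z] = lerp(
                    lerp(lerp(c[0, 0, 0], c[1, 0, 0], fx),
                         lerp(c[0, 1, 0], c[1, 1, 0], fx), fy),
                    lerp(lerp(c[0, 0, 1], c[1, 0, 1], fx),
                         lerp(c[0, 1, 1], c[1, 1, 1], fx), fy),
                    fz)
    return out
```

The important detail is that the coarse lattice lives in world space rather than chunk-local space: adjacent chunks then interpolate from identical border samples and the seams disappear, while larger spacings give the flatter, smoother areas the post mentions.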

Trying to come up with a feature to improve emotion classifier based on facial movement using facial landmarks

I managed to create an emotion recognition system that uses dense optical flow on each entire frame. While the accuracy range is within 80-90% with cross-validation, I am aiming to improve the accuracy of the program.
There are four emotions: neutral, happy, surprised, and angry. So far my classifier works pretty well, though it tends to over-predict 'neutral' when the answer is 'happy' or 'surprised'. This tends to happen when the mouth is only slightly open: the subject is visibly smiling or has their mouth open in shock, but the classifier still thinks the mouth is closed.
Confusion Matrix for Dense Optical Flow:
[[27 22 0 0]
[ 0 57 1 0]
[ 0 12 60 0]
[ 0 9 3 68]]
Accuracy: 80-90% range
There is something I want to try in order to solve this though.
I can get the positions of facial landmarks, but I don't know how to turn this information into an effective additional feature that would increase the accuracy. I was thinking of simply taking the facial landmark coordinates at the end of each video, but I don't think that would be enough to differentiate between a closed mouth and a slightly open one (the difference in coordinate values would be small, and I'm guessing the classifier won't notice it).
I considered taking a still image of the subject's mouth and analyzing that, but rejected it because it is vulnerable to factors like lighting, differences in people's appearance, and inconsistent matrix sizes. Besides, I want the additional feature to take advantage of facial movement tracking.
Is there a smart way to turn facial landmark tracking into a feature that would increase the accuracy of my classifier by addressing its tendency to over-predict 'neutral'? How could I accomplish that?
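One common approach (a sketch, not a drop-in solution) is to derive scale-invariant geometric features from the landmarks and summarise how they change over the clip, rather than looking only at the final frame: for example a normalised mouth opening and mouth width. The indices below assume dlib's 68-point annotation; adjust them for your landmark model.

```python
# Sketch: turn per-frame facial landmarks into a small, scale-invariant feature
# vector describing mouth movement over a clip. Assumes dlib-style 68-point
# landmarks as (68, 2) arrays of (x, y); indices may differ for other models.
import numpy as np

def mouth_openness(landmarks):
    """Vertical inner-lip gap normalised by inner mouth width, for one frame."""
    pts = np.asarray(landmarks, dtype=float)
    gap = np.linalg.norm(pts[62] - pts[66])      # top vs. bottom inner lip
    width = np.linalg.norm(pts[60] - pts[64])    # inner mouth corners
    return gap / (width + 1e-8)

def mouth_width_ratio(landmarks):
    """Outer mouth width normalised by inter-ocular distance (smiles widen the mouth)."""
    pts = np.asarray(landmarks, dtype=float)
    mouth_w = np.linalg.norm(pts[48] - pts[54])   # outer mouth corners
    eye_dist = np.linalg.norm(pts[36] - pts[45])  # outer eye corners
    return mouth_w / (eye_dist + 1e-8)

def clip_features(landmark_seq):
    """Summarise a whole clip: range and peak of mouth motion, not just the last frame."""
    openness = np.array([mouth_openness(l) for l in landmark_seq])
    widths = np.array([mouth_width_ratio(l) for l in landmark_seq])
    return np.array([
        openness.max() - openness[0],   # how far the mouth opened relative to the start
        openness.max(),                 # peak opening (shock / surprise)
        widths.max() - widths[0],       # how much the mouth widened (smile)
        openness.std(),                 # overall amount of mouth movement
    ])
```

These values can be concatenated with the optical-flow feature vector; because they are ratios they are fairly robust to lighting and face size, and the temporal deltas keep the "movement" character you want.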

C#: Fast and Smart algorithm to Compare two byte array (image)

I'm running a process on a webcam image. I'd like to wake up that process only if there are major changes, e.g.:
Something moving in the image
Lights turn on
...
So I'm looking for a fast, efficient algorithm in C# to compare two byte[] images (Kinect frames) of the same size.
I just need some kind of "diff size" with a threshold.
I found some motion detection algorithms, but they are "too much".
I also found XOR-based approaches, but they might be too simple. It would be great if I could ignore small changes like sunlight, vibration, etc.
Mark all pixels that differ from the previous image as 'changed', based on a per-pixel threshold (i.e. if a pixel has changed only slightly, ignore it as noise).
Filter out noise pixels: if a pixel was marked as changed but none of its neighbours were, consider it noise and unmark it.
Count how many pixels are changed in the image and compare that against a global threshold (which you need to calibrate manually).
Make sure you are operating on greyscale images (not RGB), i.e. convert to the YUV colour space and compare only the Y channel.
This is the simplest and fastest algorithm; you just need to tune these two thresholds.
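A sketch of those steps (in Python/NumPy for brevity; the same loops translate directly to C# over the byte[] buffers), with the two thresholds you would calibrate marked:

```python
# Sketch of the threshold-diff algorithm above.
# Assumptions: both frames are already greyscale (e.g. the Y plane) and the same size.
import numpy as np
from scipy.ndimage import convolve

PIXEL_THRESHOLD = 25      # ignore per-pixel changes smaller than this (noise)
CHANGED_FRACTION = 0.02   # wake up if more than 2% of pixels really changed

def major_change(prev: np.ndarray, curr: np.ndarray) -> bool:
    # 1. Mark pixels whose brightness changed by more than the per-pixel threshold.
    diff = np.abs(curr.astype(np.int16) - prev.astype(np.int16))
    changed = diff > PIXEL_THRESHOLD

    # 2. Drop isolated "changed" pixels: keep a pixel only if at least one of
    #    its 8 neighbours also changed.
    kernel = np.array([[1, 1, 1], [1, 0, 1], [1, 1, 1]])
    neighbour_count = convolve(changed.astype(np.uint8), kernel, mode="constant")
    changed &= neighbour_count > 0

    # 3. Compare the fraction of changed pixels against the global threshold.
    return changed.mean() > CHANGED_FRACTION
```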
Another idea: MPEG standards involve motion detection, so you could monitor the bandwidth of an MPEG stream. If there's no motion, the bandwidth is very low (except during key frames, i.e. I-frames). If something changes or any movement is going on, the bandwidth increases.
So you can grab the JPEG frames and feed them into an MPEG encoder, then just look at the encoded stream. You can tune the frame rate and the bandwidth within a range, and you decide what output bitrate from the codec counts as "motion".
Advantage: very generic, and there are libraries available that often offer hardware acceleration (GPUs help with JPEG encoding/decoding and, to varying degrees, MPEG). It's also pretty standard.
Disadvantage: more computationally demanding than an XOR.

OpenGL performance on rendering "virtual gallery" (textures)

I have a considerable number (120-240) of 640x480 images that will be displayed as textured flat surfaces (4-vertex polygons) in a 3D environment. About 30-50% of them will be visible in any given frame, and they may overlap. Nothing else will be present in the environment.
The question is: will a modern and/or a few-years-old GPU (let's say a Radeon 9550) cope with that, and what frame rate can I expect? I'm aiming for 20 FPS, but 30-40 would be nice. Would changing the resolution to 320x240 make that more likely?
I don't have any previous experience with 3D graphics performance on modern GPUs, and unfortunately I have to make a design choice. I don't want to waste time on something that could never have worked :-)
Assuming RGB textures, that would be 640*480*3*120 bytes ≈ 105 MB of texture data at minimum, which should fit in the VRAM of more recent graphics cards without swapping, so this won't be an issue. Texture lookups might get a bit problematic, but that's hard for me to judge without trying. Given that you only need to read about 50% of those 105 MB per frame, that's roughly 50 MB (a very rough estimate), and targeting 20 FPS means 20 * 50 MB/s ≈ 1 GB/s. That level of throughput should be achievable even on older hardware.
Reading the specs of an older Radeon 9600 XT, it lists a peak fill rate of 2000 Mpixels/s, and if I'm not mistaken you require far less than 100 Mpixels/s. Peak memory bandwidth is specified as 9.6 GB/s, while you'd need about 1 GB/s (as explained above).
I would argue that this should be possible if done correctly; current hardware in particular should have no problem at all.
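If it helps, the back-of-the-envelope numbers above are easy to reproduce; a tiny sketch using the same assumptions (120 uncompressed RGB textures, roughly half visible per frame, 20 FPS):

```python
# Rough texture-memory and bandwidth budget, matching the estimate above.
# Assumptions: 120 textures, 640x480, 3 bytes/pixel (RGB), 50% visible, 20 FPS.
textures, w, h, bpp = 120, 640, 480, 3
total_mb = textures * w * h * bpp / 2**20
per_frame_mb = total_mb * 0.5                # ~50% of textures read per frame
bandwidth_mb_s = per_frame_mb * 20           # target frame rate

print(f"total texture data : {total_mb:.0f} MB")       # ~105 MB
print(f"worst-case reads   : {bandwidth_mb_s:.0f} MB/s")  # ~1 GB/s
```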
Anyway, you should simply try it out: loading 120 random textures and displaying them on 120 quads can be done in very few lines of code with hardly any effort.
First of all, you should realize that texture dimensions should normally be powers of two, so if you can change them, something like 512x256 (for example) would be a better starting point.
From that, you can create MIPmaps of the original, which are simply versions of the original scaled down by powers of two, so if you started with 512x256, you'd then create versions at 256x128, 128x64, 64x32, 32x16, 16x8, 8x4, 4x2, 2x1 and 1x1. When you've done this, OpenGL can/will select the "right" one for the size it'll show up at in the final display. This generally reduces the work (and improves quality) in scaling the texture to the desired size.
The obvious sticking point with that would be running out of texture memory. If memory serves, in the 9550 timeframe you could probably expect 256 MB of on-board memory, which would be about sufficient, but chances are pretty good that some of the textures would be in system RAM. That overflow would probably be fairly small though, so it probably won't be terribly difficult to maintain the kind of framerate you're hoping for. If you were to add a lot more textures, however, it would eventually become a problem. In that case, reducing the original size by 2 in each dimension (for example) would reduce your memory requirement by a factor of 4, which would make fitting them into memory a lot easier.
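On the memory side, the cost of a full mipmap chain is easy to estimate: the chain adds roughly a third on top of the base level. A small sketch for the 512x256 example above, assuming 3 bytes per pixel:

```python
# Memory for a full mipmap chain of a 512x256 RGB texture (3 bytes/pixel).
w, h, bpp = 512, 256, 3
total = 0
while True:
    total += w * h * bpp
    if w == 1 and h == 1:
        break
    w, h = max(w // 2, 1), max(h // 2, 1)

print(f"{total / 1024:.0f} KiB")   # base level alone is 384 KiB; the chain adds ~1/3 more
```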

How does MPEG4 compression work?

Can anyone explain in a simple, clear way how MPEG-4 compresses data? I'm mostly interested in video. I know there are different standards, or parts, within it. I'm just looking for the predominant overall compression method, if MPEG-4 has one.
MPEG-4 is a huge standard, and employs many techniques to achieve the high compression rates that it is capable of.
In general, video compression is concerned with throwing away as much information as possible whilst having a minimal effect on the viewing experience for an end user. For example, using subsampled YUV instead of RGB cuts the video size in half straight away. This is possible as the human eye is less sensitive to colour than it is to brightness. In YUV, the Y value is brightness, and the U and V values represent colour. Therefore, you can throw away some of the colour information which reduces the file size, without the viewer noticing any difference.
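For example, with the common 4:2:0 subsampling (an assumption here; the paragraph just says "subsampled YUV"), every 2x2 block of pixels shares a single U and a single V sample, which is where the factor of two comes from:

```python
# Bytes per frame, RGB vs. YUV 4:2:0, for a 1920x1080 frame at 8 bits per sample.
w, h = 1920, 1080
rgb = w * h * 3                            # 3 samples per pixel
yuv420 = w * h + 2 * (w // 2) * (h // 2)   # full-res Y + quarter-res U and V
print(rgb, yuv420, rgb / yuv420)           # ratio is exactly 2.0
```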
After that, most compression techniques take advantage of 2 redundancies in particular. The first is temporal redundancy and the second is spatial redundancy.
Temporal redundancy reflects the fact that successive frames in a video sequence are very similar. Typically a video runs at around 20-30 frames per second, and not much changes in 1/30 of a second. Take any DVD, pause it, then advance it one frame and note how similar the two images are. So, instead of encoding each frame independently, MPEG-4 (and other compression standards) encodes only the difference between successive frames (using motion estimation to find that difference).
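As a toy illustration of the idea (not the actual MPEG-4 algorithm, which uses sub-pixel search, several block sizes and rate-distortion optimisation), block-matching motion estimation looks roughly like this: for each block of the current frame, find the best-matching block in a small window of the previous frame and encode only the motion vector plus the residual.

```python
# Toy block-matching motion estimation (illustrative only).
import numpy as np

BLOCK, SEARCH = 16, 8   # 16x16 macroblocks, +/-8 pixel search window

def best_match(prev, curr, by, bx):
    """Find the motion vector minimising the sum of absolute differences (SAD)."""
    block = curr[by:by + BLOCK, bx:bx + BLOCK].astype(np.int16)
    best, best_sad = (0, 0), np.inf
    for dy in range(-SEARCH, SEARCH + 1):
        for dx in range(-SEARCH, SEARCH + 1):
            y, x = by + dy, bx + dx
            if y < 0 or x < 0 or y + BLOCK > prev.shape[0] or x + BLOCK > prev.shape[1]:
                continue
            candidate = prev[y:y + BLOCK, x:x + BLOCK].astype(np.int16)
            sad = np.abs(block - candidate).sum()
            if sad < best_sad:
                best_sad, best = sad, (dy, dx)
    # Encode this vector plus the (small) residual instead of the whole block.
    return best
```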
Spatial redundancy takes advantage of the fact that in general the colour spread across images tends to be quite low frequency. By this I mean that neighbouring pixels tend to have similar colours. For example, in an image of you wearing a red jumper, all of the pixels that represent your jumper would have very similar colour. It is possible to use the DCT to transform the pixel values into the frequency space, where some high frequency information can be thrown away. Then, when the reverse DCT is performed (during decoding), the image is now without the thrown away high-frequency information.
To view the effects of throwing away high-frequency information, open MS Paint and draw a series of overlapping horizontal and vertical black lines. Save the image as a JPEG (which also uses the DCT for compression). Now zoom in on the pattern and notice how the edges of the lines are no longer as sharp and look a bit blurry. This is because some high-frequency information (the transition from black to white) has been thrown away during compression. Read this for an explanation with nice pictures.
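You can also reproduce the effect directly: take an 8x8 block containing a sharp edge, transform it with the DCT, zero the high-frequency coefficients and transform back. A small sketch using an orthonormal DCT-II matrix built by hand, so it only needs NumPy:

```python
# Sketch: 8x8 DCT, discard high frequencies, inverse DCT - the edge gets blurry.
import numpy as np

N = 8
# Orthonormal DCT-II matrix: C @ C.T == identity.
n = np.arange(N)
C = np.sqrt(2.0 / N) * np.cos(np.pi * (2 * n[None, :] + 1) * n[:, None] / (2 * N))
C[0, :] /= np.sqrt(2.0)

block = np.zeros((N, N))
block[:, :4] = 255.0                   # a sharp vertical black/white edge

coeffs = C @ block @ C.T               # forward 2D DCT
keep = 3                               # keep only the lowest 3x3 frequencies
mask = np.zeros_like(coeffs)
mask[:keep, :keep] = 1.0
restored = C.T @ (coeffs * mask) @ C   # inverse 2D DCT

print(np.round(restored[0]))           # the hard 255 -> 0 step is now a smooth ramp
```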
For further reading, this book is quite good, if a little heavy on the maths.
Like any other popular video codec, MPEG4 uses a variation of discrete cosine transform and a variety of motion-compensation techniques (which you can think of as motion-prediction if that helps) that reduce the amount of data needed for subsequent frames. This page has an overview of what is done by plain MPEG4.
It's not totally dissimilar to the techniques used by JPEG.
MPEG4 uses a variety of techniques to compress video.
If you haven't already looked at wikipedia, this would be a good starting point.
There is also this article from the IEEE which explains these techniques in more detail.
Sharp edges certainly DO contain high frequencies. Reducing or eliminating high frequencies reduces the sharpness of edges. Fine detail, including sharp edges, is removed along with the high frequencies; the ability to resolve two small nearby objects is lost as well, so you see just one.
