Structured Light - How to do when the projector's resolution is lower than patterns? - algorithm

I try to build a structured light environment to do 3D scanning.
As far as I know, if I choose to use gray code to reconstruct a 3D model, I have to implement specific patterns that were encode in power 2(2^x, x = 0 ~ 10).
That is said, the patterns must be at least 1024 x 1024 in resolution.
What if my DLP projector only support resolution up to 800 x 480? It projects Moire pattern when the gray code pattern resolution becomes too high(I tried). What should I do?
My friends suggest that I create 1024 x 1024 patterns, and "crop" them into 800 x 480,
but I thought the gray code should follow specific sequence and patterns, my friends suggestion will create several image that is not symmetry.
Does anyone have the same experience like me?
----------2015.8.4 Update Question----------
I was thinking that if my projector can't perfectly projects high resolution patterns, can I just let it projects the patterns with low resolution, for instance, from 2^0 to 2^6?
Or the gray code strictly demands patterns from 2^0 to 2^10? Otherwise gray code is not available?

you can not directly scale down to your resolution
because it would distort the pattern make it useless
instead you can:
crop it to your resolution
but you need to handle that in the scanning part too because you would not have the full pattern available
use nearest usable power of 2 resolution
like 512x256 and create pattern for it. The rest of the space is unused (wasting pixels)
use bullet #2 + scale up to fit your resolution better
so create pattern 512x256 and linearly scale to fit to 800x480 as much as you can so:
800/512 = 1.5625
480/256 = 1.8750
use the smaller scale (512x256 * 1.5625 -> 800x400) so scale the pattern by 1.5625 and use that as a pattern image
this is scaled by nearest neighbor to avoid subpixel grayscale colors which are harder to detect. This will waste less pixels but it can lower the precision of 3D scan !!!
This is how I generate my pattern in C++ and VCL:
// [generate pattern xs*ys power of 2 resolution]
// clear buffer
bmp->Canvas->Brush->Color=clBlack;
bmp->Canvas->FillRect(TRect(0,0,xs,ys));
int x,y,a,da;
for (da=0;1<<da<xs;da++); // number of bits per x resolution
for (a=0,y=0;y<ys;y++,a=(y*da)/ys)
for (x=0;x<xs;x++)
if (int((x>>a)&1)==0) pyx[ys-1-y][x]=0x00FFFFFF;
bmp->SaveToFile("3D_scann_pattern0.bmp");
bmp is VCL bitmap
xs,ys is the resolution of a bitmap
p[ys][xs] is direct 32bit pixel access to bitmap
This is slight differently encoded then your pattern !!!
[Notes]
If you need precision use bullet #2
If you need to cover bigger area use bullet #3
You can also scale in y axis differently then in x axis as this is just 1D encoding

Related

Change training dataset aspect ratio and size

I have a training dataset of 640x512 images that I would like to use with a 320x240 camera.
Is it ok to change the aspect ratio and the size of the training images to that of the camera?
Would it be better to upscale the camera frames?
It is better if you keep the aspect ratio of the images because you will be artificially modifying the composition of the objects in the image. What you can do is downscale the image by a factor of 2, so it's 320 x 256, then crop from the center so you have a 320 x 240 image. You can do this by simply removing the first 8 and last 8 columns of the image to get it to 320 x 240. Removing the first 8 and last 8 columns should be safe because it is very unlikely you will see meaningful information within an 8 pixel band on either side of the image.
If you are using a deep learning framework such as Tensorflow or PyTorch, there are pre-processing methods to automatically allow you to crop from the center as well as downscale the image by a factor of 2 for you. You just need to set up a pre-processing pipeline and have these two things in place. You don't have any code established so I can't help you with implementation details, but hopefully what I've said is enough to get you started.
Finally, do not upsample the images. There will be no benefit because you will be using existing information to interpolate to a larger space which is inaccurate. You can scale down, but never scale up. The only situation where this could be useful is if you use superresolution, but that would be for specific cases and it highly depends on what images you use. In general, I do not recommend upscaling. Take your training set and downscale to the resolution of the camera as the images from the camera would be what is used at inference and at that resolution.

It´s necessary to create different screens sizes and density xmls for an app? Best approach for this

Just a straight forward question. I´m trying to make the best possible choice here and there is too much information for a "semi-beginner" like me.
Well, at this point, I´m trying with screen size values for my layout (activity_main.xml (normal, large, small)) and with different densities (xhdpi, xxhdpi, mhdpi) and, if a can say so myself, it is a mess. Do I have to create every possible option to support all screen sizes and densities? Or am I doing something really wrong here? what is the best approach for this?
My layouts are now activity_main(normal_land_xxhdpi) and I have serious doubts about it.
I´m using last version of android studio of course. My app is a single activity with buttons, textview and others. Does not have any fragments or intents whatsoever, and for that reason I think this has to be an easy task, but not for me.
Hope you guys can help. I don't think i need to put any code here, but if needed, i can add it.
If you want to make a responsive UI for every device you need to learn about some things first:
-Difference between PX, DP:
https://developer.android.com/training/multiscreen/screendensities
Here you can understand that dp is a standard measure which android uses to calculate how much pixels, lets say a line, should have to keep the proportions between different screensizes with different densities.
-Resolution, Density and Ratio:
The resolution is how much pixels a screen has over height and width. This pixels can be smaller or bigger, so for instance, if you have a screen A with 10x10 px whose pixels are two times smaller than other screen B with 10 x 10 pixels too, A is two times smaller than B even when both have 10 x 10 px.
For that reason exists the meaning Density, which is how much pixels your screen has for every inch, so you can measure the quality of a screen where most pixels per inch (ppi) is better.
Ratio tells you how much pixels are for height as well as width, for example the ratio of a screen of 1000 x 2000 px is 1:2, a full hd screen of 1920 x 1080 is 16:9 (16 pixels height for every 9 pixels width). A 1:1 ratio is a square screen.
-Standard device's resolutions
You can find the most common measurements on...
https://material.io/resources/devices/
When making a UI, you use the DP measurements. You will realize that even when resolution between screens are different, the DP is the same cause they have different densities.
Now, the right way is going with constraint layout using dp measures to put your views on screen, with correct constraints the content will adapt to other screen sizes
Anyway, you will need to make additional XML for some cases:
-Different orientation
-Different ratio
-Different DP resolution (not px)
For every activity, you need to provide a portrait and landscape design. If other device has different ratio, maybe you will need to adjust the height or width due to the proportions of the screens aren't the same. Finally, even if the ratio is the same, the DP resolution could be different, maybe you designed an activity for a 640x360dp and other device has 853x480dp, which means you will have more vertical space.
You can read more here:
https://developer.android.com/training/multiscreen/screensizes
And learn how to use constraintLayout correctly:
https://developer.android.com/training/constraint-layout?hl=es-419
Note:
Maybe it seems to be so much work for every activity, but you make the first design and then you just need to copy the design to other xml with some qualifiers and change the dp values to adjust the views as you wants (without making from scratch) which is really faster.

Starting point for image recognition?

I have a set of 274 color images (each one is 200x150 pixels). Each image is visually distinct. I would like to build an app which accepts an up/down-scaled version of one of the base set of images and determines the closest match.
I'm a senior software engineer but am totally new to image recognition. I'd really appreciate any recommendations as to where to start.
If you're comparing extremely similar images, it's in theory sufficient to calculate the Euclidean distance between the 2 images. The images must be the same size to do so, so it is often necessary to rescale an image to do so (generally the larger image is scaled down). Note that aliasing issues can happen here, so pay some attention to your downsampling algorithm. There's also an issue if your images don't have the same aspect ratio.
However, this is almost never done in practice since it's extremely slow. For N images of size WxH and 3 color channels, it requires N x W x H x 3 comparisons, which quickly gets unworkable (consider that many users can have over 1000 images of size >1000x1000).
Generally we attempt to reduce the image to a smaller array that captures the image information much more briefly, called a visual descriptor. For example taking a 1024x1024x3 image and reducing it to a 128 length vector. This needs only be calculated once for the reference images, and then stored in an appropriate data structure. Then we can compare the descriptor for the query image against the descriptor for the reference images.
The cost of calculating the distance for our dataset of N images for a descriptor of length L is then N x L instead of the original N x W x H x 3
So the issue is to find efficient descriptors that are (a) cheap to compute and (b) capture the image accurately. This is still an active area of research, but I can suggest some:
Histograms are probably the simplest way to do this, although they do very poorly with any illumination change and incorporate only color information, no spatial information. Make sure you normalise your histogram before doing any comparison
Perceptual hashing works well with very similar images or slightly cropped images. See here
GIST descriptors are powerful, but more complex, see here

Detecting individual images in an array of images

I'm building a photographic film scanner. The electronic hardware is done now I have to finish the mechanical advance mechanism then I'm almost done.
I'm using a line scan sensor so it's one pixel width by 2000 height. The data stream I will be sending to the PC over USB with a FTDI FIFO bridge will be just 1 byte values of the pixels. The scanner will pull through an entire strip of 36 frames so I will end up scanning the entire strip. For the beginning I'm willing to manually split them up in Photoshop but I would like to implement something in my program to do this for me. I'm using C++ in VS. So, basically I need to find a way for the PC to detect the near black strips in between the images on the film, isolate the images and save them as individual files.
Could someone give me some advice for this?
That sounds pretty simple compared to the things you've already implemented; you could
calculate an average pixel value per row, and call the resulting signal s(n) (n being the row number).
set a threshold for s(n), setting everything below that threshold to 0 and everything above to 1
Assuming you don't know the exact pixel height of the black bars and the negatives, search for periodicities in s(n). What I describe in the following is total overkill, but that's how I roll:
use FFTw to calculate a discrete fourier transform of s(n), call it S(f) (f being the frequency, i.e. 1/period).
find argmax(abs(S(f))); that f represents the distance between two black bars: number of rows / f is the bar distance.
S(f) is complex, and thus has an argument; arctan(imag(S(f_max))/real(S(f_max)))*number of rows will give you the position of the bars.
To calculate the width of the bars, you could do the same with the second highest peak of abs(S(f)), but it'll probably be easier to just count the average length of 0 around the calculated center positions of the black bars.
To get the exact width of the image strip, only take the pixels in which the image border may lie: r_left(x) would be the signal representing the few pixels in which the actual image might border to the filmstrip material, x being the coordinate along that row). Now, use a simplistic high pass filter (e.g. f(x):= r_left(x)-r_left(x-1)) to find the sharpest edge in that region (argmax(abs(f(x)))). Use the average of these edges as the border location.
By the way, if you want to write a source block that takes your scanned image as input and outputs a stream of pixel row vectors, using GNU Radio would offer you a nice method of having a flow graph of connected signal processing blocks that does exactly what you want, without you having to care about getting data from A to B.
I forgot to add: Use the resulting coordinates with something like openCV, or any other library capable of reading images and specifying sub-images by coordinates as well as saving to new images.

Does scaling of barcode image damages it?

I have a barcode image. I have to make it smaller.
Can that damage the barcode?
Proportional scaling
Not proportional scaling (only height changes)
Barcodes are: Type UPC-A / EAN-13 "vertical lines". Sorry not an expert in barcodes, thought the type of barcode would not be important. Scaling is moderate, the image does not lose relevant data.
Regular barcode (=vertical stripes) is recognized by the relative width of the lines. Thus, the horizontal height only matters for robustness against diagonal scanning. If the codes are scanned with a hand scanner, I'd just scale the height (or crop the image). In any case, the different widths of the lines should still be clearly visible. There may be compliance rules suggesting minimum proportions for a given barcode standard.
For regular linear product barcodes, the simple answer is yes, you can scale it (both case are safe).
However, if you scale too far and the bars end up too close together, you will start to get a high level of read errors.
You'll need to test it with an appropriate barcode reader to make sure you haven't scaled too much.
When scaling a barcode, there are several things you must keep in mind.
1) You get the absolute sharpest edges in a barcode if each module (the narrowest bar) is a whole number of pixels wide.
2) If the module width is not a whole number of pixels, produce a barcode where the width of each module is the truncated whole number and use bilinear interpolation to scale up. This will give you at most one pixel of gradient at the edges.
3) Be careful when buying a barcode library, choose one that includes built-in scaling that preserves the barcode, such as this one or this one. Barcodes have special demands that image processing normally does not have, such as pixel-perfection. Using e.g. Gimp might damage the barcode.

Resources