Would it be effective to crop images in YOLO v4?

Example Image
For example, in the image above, the area I need is just the red box; the rest of the image has no labels for classification or object detection.
My thinking is that cropping the image to the red box should give better results, because the original image contains too much useless area without labels. When mosaic augmentation is used in YOLO v4, it stitches several images together into one. Because so much of each image has no labels, the data after mosaic may be even less useful than before.
But this is just my guess. I need a test to confirm it, and a lack of computing power is keeping me from running one. So the question is: can cropping the original image to the red box actually improve performance? Is my guess correct?
Also, my partner said that cropping is not a good choice in YOLO because it can ruin the proportion of the object, but I don't understand what the proportion of the object means in YOLO. Why would the proportions of objects in YOLO not be suited to cropping?
Thanks for reading, and have a nice day.

Simply put, you shouldn't resize the images. However, if the widths and heights in the training/testing data set differ considerably, use data augmentation methods. Please follow the link for more information:
https://github.com/pjreddie/darknet/issues/800
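If you do get to run the cropping experiment, keep in mind that the label files have to be remapped as well, because darknet-style YOLO labels store each box as a class id plus centre x, centre y, width and height, all normalized to the image size. Here is a minimal sketch of that remapping, assuming a pixel crop window `(x0, y0, x1, y1)`. The function name and the choice to clip boxes at the crop border are my own assumptions for illustration, not something darknet provides:

```python
def crop_yolo_labels(labels, img_w, img_h, crop):
    """Remap YOLO labels (class, cx, cy, w, h; all normalized) into a crop.

    crop is (x0, y0, x1, y1) in pixels of the original image.
    Boxes are clipped to the crop; boxes left with no area are dropped.
    """
    x0, y0, x1, y1 = crop
    crop_w, crop_h = x1 - x0, y1 - y0
    out = []
    for cls, cx, cy, w, h in labels:
        # Normalized centre/size -> pixel corners in the original image.
        left = (cx - w / 2) * img_w
        right = (cx + w / 2) * img_w
        top = (cy - h / 2) * img_h
        bottom = (cy + h / 2) * img_h
        # Clip the box to the crop window.
        left, right = max(left, x0), min(right, x1)
        top, bottom = max(top, y0), min(bottom, y1)
        if right <= left or bottom <= top:
            continue  # box lies entirely outside the crop
        # Pixel corners -> centre/size normalized to the crop.
        out.append((cls,
                    ((left + right) / 2 - x0) / crop_w,
                    ((top + bottom) / 2 - y0) / crop_h,
                    (right - left) / crop_w,
                    (bottom - top) / crop_h))
    return out
```

Unless the crop has the same proportions as the original image, it also changes the aspect ratio of what the network sees after it resizes the input to its fixed size, which may be what "proportion of the object" refers to.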

Related

Latent space image interpolation

Can someone tell me how (or the name of it, so that I could look it up) I can implement this interpolation effect? https://www.youtube.com/watch?v=36lE9tV9vm0&t=3010s&frags=pl%2Cwn
I tried to use r = r+dr, g = g+dg and b = b+db for the RGB values in each iteration, but it looks way too simple compared to the effect in the video.
"Can someone tell me how I can implement this interpolation effect?
(or the name of it, so that I could look it up)..."
It's not actually a named interpolation effect. It appears to interpolate but really it's just realtime updated variations of some fictional facial "features" (the hair, eyes, nose, etc are synthesized pixels taking hints from a library/database of possible matching feature types).
For this technique they used neural networks to do a process similar to DFT image reconstruction. You'll be modifying the image data in the frequency domain (with u,v), not the spatial domain (using x,y).
You can read about it at this PDF: https://research.nvidia.com/sites/default/files/pubs/2017-10_Progressive-Growing-of/karras2018iclr-paper.pdf
The (Python) source code:
https://github.com/tkarras/progressive_growing_of_gans
For ideas, on Youtube you can look up:
DFT image reconstruction (there's a good example with a b/w Nicolas Cage photo reconstructed in stages. Loud music warning).
Image Synthesis with neural networks (one clip had alternative shoe and handbag designs (item photos) being "synthesized" by an N.N. after it analyzed features from other existing catalogue photos as "inspiration").
Image Enhancement Super Resolution using neural networks. This method is closest to answering your question. One example has a very low-res, blurry, pixelated b/w image where you cannot tell if it shows a boy or a girl. During a test, the network synthesizes various higher-quality face images that it thinks are the correct match for the test input.
After understanding what they achieve and how, you could think of shortcuts to get a similar effect without needing networks, e.g. only using regular pixel editing functions.
Found it in another video: it is called "latent space interpolation", and it has to be applied to the compressed (encoded) images. Given image A and the next image B, I first have to encode A and B, interpolate between the encoded data, and finally decode the result.
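The encode, interpolate, decode loop can be sketched like this. The `encode` and `decode` functions here are hypothetical stand-ins (a fixed orthonormal linear projection) for a trained network such as the one in the linked paper, so only the structure of the loop is meant to carry over:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stand-in for a trained encoder/decoder pair: an orthonormal
# linear projection from 64 "pixels" down to an 8-dimensional latent code.
W, _ = np.linalg.qr(rng.normal(size=(64, 8)))

def encode(image):
    """Map a flattened image to its latent code."""
    return W.T @ image

def decode(code):
    """Map a latent code back to image space."""
    return W @ code

def interpolate(image_a, image_b, steps):
    """Blend the two latent codes linearly and decode each blend."""
    z_a, z_b = encode(image_a), encode(image_b)
    frames = []
    for t in np.linspace(0.0, 1.0, steps):
        z = (1.0 - t) * z_a + t * z_b
        frames.append(decode(z))
    return frames
```

With this toy linear decoder the result is the same as cross-fading the two reconstructions. The effect in the video comes from the decoder being a nonlinear network, so intermediate codes decode to plausible new faces rather than to a blend of pixels.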
As of today, I found out that this kind of interpolation effect can be easily implemented for 3D image data, provided the data is normalized and centred at the 3D origin, for example with the data of each face image lying inside a unit sphere around the origin. With two images stored this way, the interpolation can be calculated by taking the differences along rays that pass through the origin and through each area of the sphere at some desired resolution.

Using a TreeMap with images

For representing most popular artists from EchoNest API, I've been trying to set-up Silverlight Toolkit's TreeMap using images, their TreeItemDefinition.ValueBinding being defined as the area of the image.
It mostly fills up the space when the image stretch is set to 'Fill':
When the image stretch is set to 'Uniform', a lot of blank space remains:
In this post, image carving is suggested: Treemapping with a given aspect ratio
How can I know which images should be carved, and to what dimensions, if that is possible at all?
Is this problem solvable without human intervention for a good result?
I don't think there is a way to know which images should be carved and at what dimensions they should be carved. An OK-ish heuristic might be to check whether the mean energy of an image is above a certain threshold (this can be refined to check only blocks of every image and combine the results later: if the image has blocks without details/energy, it can be carved, at least in that section).
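A rough sketch of the block-wise check, assuming a grayscale image with values in [0, 1] and using the sum of absolute gradients as the energy (the usual seam carving choice). The block size and threshold are placeholder values you would have to tune:

```python
import numpy as np

def energy_map(gray):
    """Per-pixel energy: sum of absolute vertical and horizontal gradients."""
    gy, gx = np.gradient(gray.astype(float))
    return np.abs(gx) + np.abs(gy)

def low_energy_blocks(gray, block=8, threshold=0.05):
    """Return a boolean grid, True where a block's mean energy is below threshold."""
    energy = energy_map(gray)
    rows, cols = gray.shape[0] // block, gray.shape[1] // block
    # Drop any partial blocks at the right/bottom edge, then average per block.
    trimmed = energy[:rows * block, :cols * block]
    means = trimmed.reshape(rows, block, cols, block).mean(axis=(1, 3))
    return means < threshold
```

Blocks flagged `True` have little detail and are candidates for carving; an image with no such blocks is probably better left alone.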
What I think would be better is to apply seam carving to the already composed image: that will try to carve out the white outlines (adding "artificial" energy to the image patches might lead to even better results, better preserving the shape of each image). This paper might be of use to check out other image resizing methods too.

How to make a charcoal drawing filter

I'm interested in charcoal-style filters like Photoshop's Photocopy filter or Note Paper filter.
Does someone have a paper or some instructions on how these filters work?
In the best case I want to create the following:
input:
Output:
greetings
I think it's a process akin to pan-sharpening. I could get a quite similar image in GIMP by:
Converting to gray
Duplicating into two layers
Lightly blurring one layer
Edge-detecting in the other layer with a DoG (difference of Gaussians) filter with a large radius
Compositing the two layers, playing a bit with the transparency.
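The steps above can be approximated in code. This is only a sketch of the same idea, not what GIMP or Photoshop do internally: it assumes an RGB array with values in [0, 1], and the sigmas, the 1.6 ratio between the two Gaussians and the blend weight are guesses to play with:

```python
import numpy as np

def gaussian_blur(img, sigma):
    """Separable Gaussian blur with edge padding."""
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-x**2 / (2 * sigma**2))
    kernel /= kernel.sum()
    padded = np.pad(img, radius, mode="edge")
    # Blur along rows, then along columns.
    rows = np.apply_along_axis(np.convolve, 1, padded, kernel, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, kernel, mode="valid")

def charcoal_sketch(rgb, blur_sigma=1.0, dog_sigma=4.0, edge_weight=0.6):
    # Convert to gray (standard luma weights).
    gray = rgb @ np.array([0.299, 0.587, 0.114])
    # Layer 1: lightly blurred copy.
    soft = gaussian_blur(gray, blur_sigma)
    # Layer 2: difference-of-Gaussians edge detection with a large radius.
    dog = gaussian_blur(gray, dog_sigma) - gaussian_blur(gray, 1.6 * dog_sigma)
    strength = np.abs(dog) / (np.abs(dog).max() + 1e-9)
    edges = 1.0 - np.clip(strength, 0.0, 1.0)  # dark lines on white
    # Composite the two layers.
    return np.clip((1 - edge_weight) * soft + edge_weight * edges, 0.0, 1.0)
```

Changing `edge_weight` corresponds to playing with the layer transparency in the last step.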
What this is doing is converting the color picture into a 0-1 bitmap picture.
They typically use a threshold function which returns 1 (white) for some values and 0 (black) for some other.
One simple function would be to transform the image from color to gray-scale, and then select a shade of gray above which everything is white and below which everything is black. The actual threshold you use could be made adaptive depending on the brightness of the picture (you want a certain percentage of pixels to be white).
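For instance, picking the threshold so that a chosen fraction of pixels ends up white might look like this (the function name and the default fraction are just for illustration):

```python
import numpy as np

def threshold_by_white_fraction(gray, white_fraction=0.5):
    """Binarize so that roughly `white_fraction` of the pixels become white (1)."""
    # The cutoff is the gray level below which (1 - white_fraction) of pixels fall.
    cutoff = np.quantile(gray, 1.0 - white_fraction)
    return (gray > cutoff).astype(np.uint8)
```

A bright photo and a dark photo then both come out with about the same amount of white, which a fixed gray level would not give you.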
It can also be adaptive based on the context within the picture (i.e. a dark area may still have some white pixels to show local contrast). The trees behind the house are not all black because the filtering is sensitive to the average darkness of the region.
Also note that the area close to the light gap in the tree has a cluster of dark pixels, because of its relative darkness. The edges of the house and the bench are also highlighted. There is an edge detection element at play.
I do not know exactly which effect your example shows, but there are a variety of effects similar to it. As VSOverFlow pointed out, thresholding an image would result in something very similar, though I do not think that is what is being used. OpenCV has a function for this; its documentation can be found here. You may also want to look into Otsu's method for thresholding.
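In case it helps to see what Otsu's method computes, here is a plain NumPy sketch for an 8-bit grayscale image. It picks the gray level that maximizes the between-class variance of the two resulting groups of pixels. This is written from the textbook formula rather than copied from OpenCV, so treat it as illustrative:

```python
import numpy as np

def otsu_threshold(gray_u8):
    """Return the threshold t maximizing between-class variance (pixels > t are white)."""
    hist = np.bincount(gray_u8.ravel(), minlength=256).astype(float)
    prob = hist / hist.sum()
    levels = np.arange(256)
    w0 = np.cumsum(prob)            # fraction of pixels with level <= t
    mu = np.cumsum(prob * levels)   # cumulative mean up to level t
    mu_total = mu[-1]
    denom = w0 * (1.0 - w0)
    between = np.zeros(256)
    valid = denom > 0               # skip thresholds that leave one class empty
    between[valid] = (mu_total * w0[valid] - mu[valid]) ** 2 / denom[valid]
    return int(np.argmax(between))
```

Usage would be `binary = gray_u8 > otsu_threshold(gray_u8)`.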
Again, as VSOverFlow pointed out, there is an edge detection element at play as well. You may want to investigate the Sobel and Prewitt filters. Those are 3 simple options that will give you something similar to the image you provided. Perhaps you could threshold the result from the Prewitt filter? I have no knowledge of how Photoshop implements its filters. If none of these options is close enough to what you are looking for, I would recommend looking for information on the specific implementations of those filters in Photoshop.
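Thresholding the Prewitt result could be sketched as follows, assuming a grayscale array. The threshold value is arbitrary and depends on the range of your image:

```python
import numpy as np

def prewitt_edges(gray, threshold=0.5):
    """Boolean edge mask: Prewitt gradient magnitude above `threshold`."""
    p = np.pad(gray.astype(float), 1, mode="edge")
    # Horizontal gradient: right column minus left column, summed over 3 rows.
    gx = (p[:-2, 2:] + p[1:-1, 2:] + p[2:, 2:]) - (p[:-2, :-2] + p[1:-1, :-2] + p[2:, :-2])
    # Vertical gradient: bottom row minus top row, summed over 3 columns.
    gy = (p[2:, :-2] + p[2:, 1:-1] + p[2:, 2:]) - (p[:-2, :-2] + p[:-2, 1:-1] + p[:-2, 2:])
    return np.hypot(gx, gy) > threshold
```

Inverting the mask (`~prewitt_edges(gray)`) gives dark lines on a white background, closer to the look of the example.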

Restoring an old manuscript with image processing

Say I have this old manuscript. What I am trying to do is process the manuscript so that all the characters in it can be recognized perfectly. What are the things I should keep in mind?
Are there any methods for approaching such a problem?
Please help, thank you.
Some graphics applications have macro recorders (e.g. Paint Shop Pro). They can record a sequence of operations applied to an image and store them as a macro script. You can then run the macro in a batch process to process all the images in a folder automatically. This might be a better option than reinventing the wheel.
I would start by playing around with the different functions manually to see what they do to your image. There are an awful lot of things you can try: sharpening, smoothing, and removing noise, each with many different methods and options. You can work on the contrast in many different ways (stretch, gamma correction, expand, and so on).
In addition, if your image has a yellowish background, then working on the red or green channel alone would probably lead to better results, because the blue channel has poor contrast in that case.
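A quick way to check this on your own scan is to compare the channels numerically. The sketch below assumes an RGB array and uses the standard deviation of each channel as a crude contrast measure, which is my own simplification:

```python
import numpy as np

def best_contrast_channel(rgb):
    """Return the name and data of the channel with the largest standard deviation."""
    contrasts = rgb.reshape(-1, 3).std(axis=0)
    idx = int(np.argmax(contrasts))
    return "RGB"[idx], rgb[..., idx]
```

On a yellowish page with dark ink, blue is low for both paper and ink, so the red or green channel should come out on top.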
Do you mean that you want to make it easier for people to read the characters, or are you trying to improve image quality so that optical character recognition (OCR) software can read them?
I'd recommend that you select a specific goal for readability. For example, you might want readers to be able to read the text 20% faster if the image has been processed. If you're using OCR software to read the text, set a read rate you'd like to achieve. Having a concrete goal makes it easier to keep track of your progress.
The image processing book Digital Image Processing by Gonzalez and Woods (3rd edition) has a nice example showing how to convert an image like this to a black-on-white representation. Once you have black text on a white background, you can perform a few additional image processing steps to "clean up" the image and make it a little more readable.
Sample steps:
Convert the image to black and white (grayscale)
Apply a moving average threshold to the image. If the characters are usually about the same size in an image, then you shouldn't have much trouble selecting values for the two parameters of the moving average threshold algorithm.
Once the image has been converted to just black characters on a white background, try simple operations such as morphological "close" to fill in small gaps.
Present the original image and the cleaned image to adult readers, and time how long it takes for them to read each sample. This will give you some indication of the improvement in image quality.
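The thresholding and clean-up steps could be sketched like this. It is a simplified take on a moving average threshold (a running mean over the last `n` pixels of a zigzag scan, with a pixel kept white if it is brighter than `b` times that mean) followed by a 3x3 morphological close. It is not the book's exact implementation, and the default `n` and `b` are placeholders:

```python
import numpy as np

def moving_average_threshold(gray, n=20, b=0.5):
    """True (white) where a pixel is brighter than b times the mean of the
    last n pixels along a zigzag scan of the image."""
    h, w = gray.shape
    scan = gray.astype(float).copy()
    scan[1::2] = scan[1::2, ::-1].copy()   # reverse odd rows: zigzag scan
    flat = scan.ravel()
    csum = np.concatenate(([0.0], np.cumsum(flat)))
    idx = np.arange(1, flat.size + 1)
    start = np.maximum(idx - n, 0)
    mean = (csum[idx] - csum[start]) / (idx - start)
    out = (flat > b * mean).reshape(h, w)
    out[1::2] = out[1::2, ::-1].copy()     # undo the zigzag
    return out

def _neighbourhood(mask, reduce_fn):
    """Apply np.any / np.all over each pixel's 3x3 neighbourhood."""
    p = np.pad(mask, 1, mode="constant", constant_values=False)
    h, w = mask.shape
    stack = [p[dy:dy + h, dx:dx + w] for dy in range(3) for dx in range(3)]
    return reduce_fn(stack, axis=0)

def binary_close(mask):
    """Morphological close (dilate, then erode) with a 3x3 square."""
    padded = np.pad(mask, 1, mode="constant", constant_values=False)
    dilated = _neighbourhood(padded, np.any)
    eroded = _neighbourhood(dilated, np.all)
    return eroded[1:-1, 1:-1]
```

Since the close should fill gaps in the characters, apply it to the ink mask: `ink = ~moving_average_threshold(gray)` followed by `cleaned = binary_close(ink)`.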
A technique called Stroke Width Transform has been discussed on SO previously. It can be used to extract character strokes from even very complex backgrounds. The SWT would be harder to implement, but could work for quite a wide variety of images:
Stroke Width Transform (SWT) implementation (Java, C#...)
The texture in the paper could present a problem for many algorithms. However, there are techniques for denoising images based on the Fast Fourier Transform (FFT), an algorithm that you can use to find 1D or 2D sinusoidal patterns in an image (e.g. grid patterns). About halfway down the following page you can see examples of FFT-based techniques for removing periodic noise:
http://www.fmwconcepts.com/misc_tests/FFT_tests/index.html
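To give an idea of how such FFT-based removal works, here is a crude sketch: strong isolated peaks away from the centre of the spectrum are treated as periodic noise and zeroed. The `keep_radius` and `peak_factor` parameters are assumptions of mine, and real paper texture will need more careful masking than this:

```python
import numpy as np

def remove_periodic_noise(gray, keep_radius=3, peak_factor=10.0):
    """Zero unusually strong frequency components outside the low-frequency core."""
    spectrum = np.fft.fftshift(np.fft.fft2(gray))
    magnitude = np.abs(spectrum)
    h, w = gray.shape
    yy, xx = np.ogrid[:h, :w]
    dist = np.hypot(yy - h // 2, xx - w // 2)
    # Never touch the low frequencies near the centre (overall brightness, shading).
    outside = dist > keep_radius
    # A component counts as a peak if it is far above the typical magnitude.
    peaks = outside & (magnitude > peak_factor * np.median(magnitude[outside]))
    spectrum[peaks] = 0
    return np.real(np.fft.ifft2(np.fft.ifftshift(spectrum)))
```

It is worth displaying `np.log1p(magnitude)` first: a grid or weave pattern shows up as a few bright dots, and those are what the mask should cover.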
If you find a technique that works for the images you're testing, I'm sure a number of people would be interested to see the unprocessed and processed images.

How can I deblur an image in matlab?

I need to remove the blur this image:
Image source: http://www.flickr.com/photos/63036721@N02/5733034767/
Any Ideas?
Although previous answers are right when they say that you can't recover lost information, you could investigate a little and make a few guesses.
I downloaded your image in what seems to be the original size (75x75) and you can see here a zoomed segment (one little square = one pixel)
It seems a pretty linear grayscale! Let's verify it by plotting the intensities of the central row. In Mathematica:
ListLinePlot[First /@ ImageData[i][[38]][[1 ;; 15]]]
So, it is effectively linear, starting at zero and ending at one.
So you may guess it was originally a B&W image, linearly blurred.
The easiest way to deblur that (not always giving good results, but enough in your case) is to binarize the image with a 0.5 threshold. Like this:
And this is a possible way. Just remember we are guessing a lot here!
HTH!
You cannot generally retrieve missing information.
If you know what it is an image of (in this case a Gaussian or Airy profile, so it's probably an out-of-focus image of a point source), you can determine the characteristics of the point.
Another technique is to try to determine the characteristics of the blurring, especially if you have many images from the same blurred system. Then iteratively create a possible source image, blur it by that convolution, and compare it to the blurred image.
This is the general technique used to make radio astronomy source maps (images), and it was used for the flawed Hubble Space Telescope images.
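One simple member of this "guess, blur, compare, correct" family is the Landweber iteration, sketched below in 1D for brevity. I picked it because it is short, not because it is what the astronomy pipelines use. It assumes the blur kernel is known, symmetric, odd-length, non-negative and sums to 1:

```python
import numpy as np

def landweber_deblur(blurred, kernel, iterations=200, step=1.0):
    """Iteratively refine an estimate so that blurring it reproduces `blurred`."""
    def blur(x):
        return np.convolve(x, kernel, mode="same")

    estimate = blurred.copy()
    for _ in range(iterations):
        # Compare: how far is the re-blurred estimate from the observation?
        residual = blurred - blur(estimate)
        # Correct: for a symmetric kernel, blurring the residual applies the
        # transpose of the blur, which is the gradient direction.
        estimate = estimate + step * blur(residual)
    return estimate
```

With noisy data you would stop after fewer iterations, since running to convergence amplifies the noise.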
When working with images one of the most common things is to use a convolution filter. There is a "sharpen" filter that does what it can to remove blur from an image. An example of a sharpen filter can be found here:
http://www.panoramafactory.com/sharpness/sharpness.html
Some programs like MATLAB make convolution really easy: conv2(A,B)
And most good photo editing programs have these filters under one name or another (usually "sharpen").
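For reference, this is roughly what such a sharpen filter does, written out in NumPy. The 3x3 kernel is a common textbook sharpen kernel, not necessarily the one from the linked page, and unlike MATLAB's conv2 this sketch pads the border by repeating edge pixels:

```python
import numpy as np

# Common sharpen kernel: boosts the centre pixel and subtracts its 4 neighbours.
SHARPEN = np.array([[0, -1, 0],
                    [-1, 5, -1],
                    [0, -1, 0]], dtype=float)

def filter3x3(img, kernel):
    """2D convolution with a 3x3 kernel; output has the same size as the input."""
    p = np.pad(img.astype(float), 1, mode="edge")
    h, w = img.shape
    out = np.zeros((h, w))
    flipped = kernel[::-1, ::-1]  # true convolution flips the kernel
    for dy in range(3):
        for dx in range(3):
            out += flipped[dy, dx] * p[dy:dy + h, dx:dx + w]
    return out
```

Because the kernel sums to 1, flat areas are left unchanged and only the transitions are exaggerated.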
But keep in mind that filters can only do so much. In theory, the actual information has been lost by the blurring process and it is impossible to perfectly reconstruct the initial image (no matter what TV will lead you to believe).
In this case it seems like you have a very simple image with only black and white. Knowing this about your image you could always use a simple threshold. Set everything above a certain threshold to white, and everything below to black. Once again most photo editing software makes this really easy.
You cannot retrieve missing information, but under certain assumptions you can sharpen.
Try unsharp masking.
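Unsharp masking adds back the difference between the image and a blurred copy of itself. A minimal sketch, using a box blur for simplicity (a Gaussian blur is more usual) and with `amount` and `radius` as tunable assumptions:

```python
import numpy as np

def box_blur(img, radius=1):
    """Mean filter over a (2*radius+1) square window, with edge padding."""
    p = np.pad(img.astype(float), radius, mode="edge")
    h, w = img.shape
    size = 2 * radius + 1
    out = np.zeros((h, w))
    for dy in range(size):
        for dx in range(size):
            out += p[dy:dy + h, dx:dx + w]
    return out / size**2

def unsharp_mask(img, amount=1.0, radius=1):
    """Sharpen by adding `amount` times the detail removed by blurring."""
    blurred = box_blur(img, radius)
    return img + amount * (img - blurred)
```

The result can overshoot the original value range at edges, so clip it before saving.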

Resources