Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 3 years ago.
I have questions about both SIFT and pHash.
First of all, I'm using SIFT to identify similar images in a real-time service.
The images are photos taken with a phone camera, so they may have a small amount of rotation and blur.
Then I found pHash, so I tested it on its demo page. The result was disappointing.
This is the result of that test:
In this test, the two images are aligned on the x-axis, so there is no rotation. But in the right image the logo was removed and the person was moved to the left. To my eye, the two images are 'very similar', and pHash did not treat them as such. SIFT, on the other hand, matches them completely.
Now, my questions:
Is pHash faster than SIFT?
Is pHash's accuracy reliable?
SIFT's output is too big to use in a real-time service, so I need to reduce it to something smaller, for example with LSH (locality-sensitive hashing). Is there any other approach I could try?
OK, I got it.
pHash does not recognize a rotated image, or one whose content has moved significantly, as the same image.
In terms of storage, pHash is dramatically better. It is very small: one image maps to one hash. SIFT, however, needs a 128-dimensional descriptor (128 bytes at one byte per dimension) for each feature point, and there are many feature points in one image.
Overall, SIFT identifies similar images better than pHash, but it needs far more storage.
I haven't been able to benchmark speed yet, but I expect pHash to be faster than SIFT, because SIFT has to process many features per image.
If you have other answers to the questions above, please tell me.
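To make the size and rotation points concrete, here is a minimal sketch of a DCT-based perceptual hash in Python with NumPy. It is a simplified illustration of the general idea, not the pHash library's actual implementation: the function names, the 32x32 working size, the 8x8 low-frequency block, and the crude block-average downscale are all my own assumptions. The hash is 64 bits (8 bytes) per image, whereas a hypothetical 500 SIFT keypoints at 128 bytes each would take 64,000 bytes.

```python
import numpy as np

def dct_matrix(n):
    # Orthonormal DCT-II basis matrix (rows are frequencies).
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    m = np.cos(np.pi * (2 * i + 1) * k / (2 * n)) * np.sqrt(2.0 / n)
    m[0, :] = np.sqrt(1.0 / n)
    return m

def block_mean_resize(gray, size=32):
    # Crude downscale by averaging blocks (assumption: image is at least
    # size x size; any remainder rows/columns are cropped away).
    h, w = gray.shape
    bh, bw = h // size, w // size
    cropped = gray[: bh * size, : bw * size]
    return cropped.reshape(size, bh, size, bw).mean(axis=(1, 3))

def dct_hash(gray, size=32, keep=8):
    # 64-bit hash: low-frequency DCT coefficients compared to their median.
    small = block_mean_resize(np.asarray(gray, dtype=np.float64), size)
    d = dct_matrix(size)
    coeffs = d @ small @ d.T
    low = coeffs[:keep, :keep].flatten()
    median = np.median(low[1:])  # ignore the DC term when picking the median
    return low > median

def hamming(a, b):
    # Number of differing bits between two hashes.
    return int(np.count_nonzero(a != b))

# Usage: compare an image with a rotated copy of itself.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(64, 64))
print(hamming(dct_hash(image), dct_hash(np.rot90(image))))
```

Two images are then considered similar when the Hamming distance is below some threshold you choose. Because the hash depends on where the low-frequency content sits in the image, rotating or rearranging the content changes many bits, which matches what I saw on the demo page.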
Closed 2 years ago.
I have recently taken up film photography. Part of the workflow is to scan the images using a flatbed scanner. Unfortunately this process is very slow. Using some software (Silverfast), you make a prescan, zoom in, make a more detailed prescan, click and drag a rectangle around the frame, repeat this for 12 frames, and then set the software to do the full-resolution scans.
I want to automate this process. Rather than lay out where each frame is by hand, I want to scan the whole film strip and then use ML.Net to find each frame (the X,Y coordinates of its top-left corner), which I will then pass to ImageMagick to extract the actual image.
I want to use ML.Net because I am a .Net developer and may have the opportunity to use this experience later. So although an example using OpenCV would be welcome, ML.Net would be preferable.
I am a bit of a noob when it comes to ML. My first thought is to train a neural net, inputting the scanned image and outputting the X and Y values. However, that seems naive (the image is hundreds of MB in size). I imagine there are better tools than just a raw neural net.
My searching on 'ML object recognition' didn't help, as the examples I found were about finding the dog or person in an image, not a 'frame', which could itself contain a dog or a person.
Even a pointer in the right direction, or the correct name for this problem, would be a great help.
So, what type of tools/functions should I be using to solve this type of problem with ML.Net?
This is not so much a machine learning problem as it is an image processing problem. I would think ML.Net is quite overkill.
What you probably want is an image processing library, using some form of edge detection or "region of interest" detection.
For example, look at this question:
Detect display corners with Emgu
Maybe I misunderstand what you want to do and you actually would benefit from machine learning; in that case you should probably preprocess your images with an image processing library before feeding them to your model.
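To illustrate the image-processing route, here is a rough sketch in Python with NumPy (the idea ports to any image library, including .NET ones). It uses a projection profile rather than full edge detection, and it rests on assumptions that are mine, not the question's: the strip is scanned horizontally as a grayscale array, the gaps between frames are brighter than the frames themselves, and `gap_threshold` is a hypothetical value you would tune for your scans.

```python
import numpy as np

def find_frame_spans(strip, gap_threshold):
    # Average brightness of each column of the scanned strip.
    col_mean = strip.mean(axis=0)
    # Columns darker than the threshold are assumed to belong to a frame.
    is_frame = col_mean < gap_threshold
    # Pad with False so frames touching the border still produce an edge.
    padded = np.concatenate(([False], is_frame, [False]))
    changes = np.flatnonzero(padded[1:] != padded[:-1])
    starts, ends = changes[0::2], changes[1::2]
    # Each span is (first column of frame, one past last column of frame).
    return [(int(s), int(e)) for s, e in zip(starts, ends)]

# Usage on a synthetic strip: bright base with two darker "frames".
strip = np.full((10, 60), 255.0)
strip[:, 5:20] = 100.0
strip[:, 30:50] = 100.0
print(find_frame_spans(strip, gap_threshold=200.0))
```

Running the same function on the transposed image (rows instead of columns) gives the vertical extent, and the start of each span gives the X (or Y) coordinate of the frame's top-left corner.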
Hope it helps.
Closed 6 years ago.
Please help me.
I'm looking for a simple algorithm whose only input is a single image. The output should be a depth map of the image, with each pixel colored according to whether it is near to or far from the camera.
I am looking for a simple solution without machine learning, a 3D model, stereoscopic input, or user input. Only a single image.
Thank you.
What you are asking is, in general, an ill-posed problem.
However, recent work with deep networks has shown that a depth map can be predicted from a single image.
Here's one such paper: Depth Map Prediction from a Single Image using a Multi-Scale Deep Network.
From the abstract:
Predicting depth is an essential component in understanding the 3D
geometry of a scene. While for stereo images local correspondence
suffices for estimation, finding depth relations from a single image
is less straightforward, requiring integration of both global and
local information from various cues. Moreover, the task is inherently
ambiguous, with a large source of uncertainty coming from the overall
scale. In this paper, we present a new method that addresses this task
by employing two deep network stacks: one that makes a coarse global
prediction based on the entire image, and another that refines this
prediction locally. We also apply a scale-invariant error to help
measure depth relations rather than scale. By leveraging the raw
datasets as large sources of training data, our method achieves
state-of-the-art results on both NYU Depth and KITTI, and matches
detailed depth boundaries without the need for superpixelation.
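The "scale-invariant error" mentioned in the abstract can be sketched in a few lines of Python. As far as I understand the paper, it amounts to the variance of the per-pixel log-depth differences, so a prediction that is off by a constant scale factor is not penalized. The function name and the NumPy formulation below are my own; treat this as an illustration, not the authors' code.

```python
import numpy as np

def scale_invariant_log_error(pred, target):
    # Per-pixel difference of log depths (both inputs must be positive).
    d = np.log(pred) - np.log(target)
    # Mean squared difference minus squared mean difference: the variance
    # of d, which is zero whenever pred is a constant multiple of target.
    return float(np.mean(d ** 2) - np.mean(d) ** 2)

# Usage: a prediction that is uniformly 3x too large vs. one that is warped.
rng = np.random.default_rng(0)
depth = rng.uniform(1.0, 10.0, size=(4, 5))
print(scale_invariant_log_error(3.0 * depth, depth))
print(scale_invariant_log_error(depth ** 1.5, depth))
```

The first call should print a value that is zero up to rounding, and the second a clearly positive value, which shows why the measure captures depth relations rather than absolute scale.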
Closed 8 years ago.
I'm working on a project which involves detecting the red blood cells (RBCs) in blood. RBCs are never perfectly circular (usually almost elliptical) and they often overlap.
I've searched and found a number of algorithms, but most work for circles only. However, in my case it needs to work for blood from patients with sickle cell disease, where the RBCs are elongated or sickle-shaped. For reference here is an example source image.
Can you suggest an algorithm or approach to solve this problem?
Any help would be greatly appreciated.
As mentioned in the comments, this question is really too broad to answer completely. However, I can give you some pointers on how to address this.
For starters, get yourself the MATLAB Image Processing toolbox.
"Identify red blood cells" is a deceptively simple-sounding task. The first step with any project like this is to figure out what exactly you want to achieve, then start breaking it down into steps of how you will achieve that. Finally, there is the experimental-developmental stage where you try and implement your plan (realise what is wrong with it, then try again).
Cell counting normally uses circularity to identify cells, but that's not possible here because you state you want to identify sickle cells. The other main characteristics distinguishing RBCs from other cells are colour and size. The colour is more absolute, so start with that, then think about size. This is a good tutorial on the process of identifying cells; although it is in Python, the principle is the same.
So we have:
Apply a filter to your image, either isolating the red channel (RGB) or something more complex. Make it monochrome (we don't need colour data).
Smooth the image (e.g. Gaussian filter) to reduce the noise and artefacts.
Find regional maxima, which are (hopefully!) in the center of cells.
Label the regional maxima (this should give you the number of cells).
Watershed to find the whole cells and measure their size.
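The first four steps can be sketched in Python with SciPy's `ndimage` module (the MATLAB toolbox has equivalents). This is only a rough illustration under my own assumptions: "redness" is taken as the red channel minus the average of green and blue, and `sigma`, `window` and `min_strength` are hypothetical values you would have to tune on real images. The watershed step is left out of the sketch.

```python
import numpy as np
from scipy import ndimage

def count_red_blobs(rgb, sigma=2.0, window=9, min_strength=0.2):
    rgb = np.asarray(rgb, dtype=np.float64)
    # Step 1: monochrome "redness" image (assumption: R minus mean of G, B).
    redness = np.clip(rgb[..., 0] - 0.5 * (rgb[..., 1] + rgb[..., 2]), 0, None)
    peak = redness.max()
    if peak > 0:
        redness = redness / peak
    # Step 2: Gaussian smoothing to suppress noise and artefacts.
    smooth = ndimage.gaussian_filter(redness, sigma=sigma)
    # Step 3: regional maxima = pixels equal to the maximum of their window,
    # ignoring weak responses in the background.
    is_peak = (ndimage.maximum_filter(smooth, size=window) == smooth)
    is_peak &= smooth > min_strength
    # Step 4: label connected groups of maxima; the count estimates the cells.
    labels, count = ndimage.label(is_peak)
    return labels, count

# Usage on a synthetic image: grey background with two reddish disks.
yy, xx = np.mgrid[0:60, 0:60]
image = np.full((60, 60, 3), 200.0)
for cy, cx in [(15, 15), (40, 42)]:
    inside = (yy - cy) ** 2 + (xx - cx) ** 2 <= 6 ** 2
    image[inside] = (200.0, 60.0, 60.0)
labels, count = count_red_blobs(image)
print(count)
```

The labelled maxima can then serve as markers for a watershed on the smoothed image, which is where overlapping cells get separated and sizes can be measured.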
Hopefully that is enough to get you started!
Closed 8 years ago.
I am working on detecting 2D Data Matrix barcodes, but detection is a problem because the barcode's pattern is different on each product. How can I detect it? Can anybody help me?
The Data Matrix specification is designed so that the code can be identified. You need to look at the code the way it is intended to be looked at. Where I'd start is that the code has a quiet zone and an "L" pattern (two solid adjacent borders). That is what you are looking for.
How you go about doing this depends a lot on the general parameters of the image.
The first consideration is lighting and contrast. Can you depend on having a fixed midpoint, where everything lighter is called white and everything darker black? Or will a simple histogram give a usable midpoint? Or do shadows and uneven lighting cause a value to be called black on the sunny side of the image and the same tone white on the shadow side of an image? On a flatbed scanner it is easy to depend on good contrast, but camera phone photos are more problematic.
The next consideration is size and resolution. For a camera phone application, it is expected that in a low resolution image, a high percentage of the image will contain the barcode, while a scanner may produce a large image with only a small barcode in it, which needs to be searched for.
Finally comes presentation. Will the barcode appear in 360 degrees of rotation? Will it be flat and level or can it be skewed, curled and angled? Is there any concern about lens distortion?
Once you can answer these considerations, they should point to what you need to do to identify the barcode. Datamatrix has clocking marks which enable distorted codes to be read, but it is a lot more work to model distortion, so if it is not needed, you wouldn't do it.
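As an example of the "simple histogram" option, here is a sketch of Otsu's method in Python with NumPy, which picks the midpoint that best separates the histogram into a dark and a light class. I'm naming Otsu's method myself as one common way to do this; it assumes an 8-bit grayscale image and reasonably even lighting, so it will not fix the shadow case.

```python
import numpy as np

def otsu_threshold(gray):
    # Histogram of an 8-bit grayscale image (values 0..255).
    hist = np.bincount(np.asarray(gray, dtype=np.uint8).ravel(), minlength=256)
    prob = hist.astype(np.float64) / hist.sum()
    levels = np.arange(256)
    w0 = np.cumsum(prob)               # fraction of pixels with value <= t
    w1 = 1.0 - w0                      # fraction of pixels with value > t
    cum_mean = np.cumsum(prob * levels)
    total_mean = cum_mean[-1]
    # Between-class variance for every candidate threshold t.
    with np.errstate(divide="ignore", invalid="ignore"):
        between = (total_mean * w0 - cum_mean) ** 2 / (w0 * w1)
    between = np.nan_to_num(between, nan=0.0, posinf=0.0, neginf=0.0)
    # Pixels with value > t are treated as white, the rest as black.
    return int(np.argmax(between))

# Usage: a synthetic image that is half dark (50) and half light (200).
gray = np.full((10, 10), 50, dtype=np.uint8)
gray[:, 5:] = 200
t = otsu_threshold(gray)
binary = gray > t
print(t, int(binary.sum()))
```

If shadows make the same tone black in one place and white in another, a single global value like this is not enough and you would threshold per region instead.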
Closed 3 years ago.
Are there any algorithms or tools that can increase the resolution of an image - besides just a simple zoom that makes each individual pixel in the image a little larger?
I realize that such an algorithm would have to invent pixels that don't really exist in the original image, but I figured there might be some algorithm that could intelligently figure out what pixels to add to the image to increase its resolution.
Interpolation: Image Scaling
For actual algorithms check out image interpolation.
The simple answer to your question is, "Yes there are algorithms, but none of them are very good." As you mentioned in the question, the limiting factor is the need to invent pixels in order to increase resolution beyond a small amount. (That's why you can't really read a license plate number from the reflection in someone's glasses off of a photo taken from a CCTV security camera, like they do in CSI: Miami.)
If all you want to do is create a larger image (for a wall hanging, or such like) then you can use a plugin for Photoshop that will smooth transitions between pixels using existing information. It can't create new pixels, but it can get rid of that boxy, pixelated look.
Addendum to the previous answers: Please note that the answer to your question depends heavily on what exactly you mean by resolution - of the display device, of the capture device, or of the viewing device (i.e., the human eye.) I assume you're talking about raster images (the problem wouldn't exist for vector images.)
You must accept that a picture taken at a higher resolution will contain more image information (i.e. details) than a picture of the same scene taken at a lower resolution. There is no way to add this information out of thin air. Scaling algorithms synthesize some information based on the assumption of continuity between the discrete raster image elements. That "new" information is not actually new but derived from the pre-existing picture information, hence it cannot be considered to have a 100% probability of matching the original scene. Better algorithms might yield better probabilities, but their results will always have a match probability of less than 1.
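To show what "assumption of continuity" means in practice, here is a minimal bilinear upscaling sketch in Python with NumPy. The function name and the corner-aligned sampling are my own choices for illustration. Every new pixel is just a weighted average of its four nearest original pixels, so nothing in the output is information that was not already in the input.

```python
import numpy as np

def bilinear_upscale(img, factor):
    img = np.asarray(img, dtype=np.float64)
    h, w = img.shape
    # Positions of the output pixels in source coordinates (corners aligned).
    ys = np.linspace(0, h - 1, h * factor)
    xs = np.linspace(0, w - 1, w * factor)
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, h - 1)
    x1 = np.minimum(x0 + 1, w - 1)
    wy = (ys - y0)[:, None]   # vertical blend weights
    wx = (xs - x0)[None, :]   # horizontal blend weights
    # Blend horizontally on the two neighbouring rows, then vertically.
    top = img[y0][:, x0] * (1 - wx) + img[y0][:, x1] * wx
    bottom = img[y1][:, x0] * (1 - wx) + img[y1][:, x1] * wx
    return top * (1 - wy) + bottom * wy

# Usage: upscale a 2x2 image to 4x4.
small = np.array([[0.0, 10.0], [20.0, 30.0]])
print(bilinear_upscale(small, 2))
```

Better filters (bicubic, Lanczos) use more neighbours and different weights, but they are still derived only from the existing pixels.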
Enlarging images is risky. Beyond a certain point, enlarging images is a fool's errand; you can't magically synthesize an infinite number of new pixels out of thin air. And interpolated pixels are never as good as real pixels. That's why it's more than a little artificial to upsize the 512x512 Lena image by 500%. It'd be smarter to find a higher resolution scan or picture of whatever you need than it would be to upsize it in software.
From Jeff Atwood
One way to increase resolution is to take multiple exposures, upsize them to 4x areal (2x linear both ways) and use stacking software to merge the images. The final image will be better than any of the originals.
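A small sketch of the stacking part, in Python with NumPy. It only shows why merging several exposures helps: averaging N frames reduces random noise by roughly a factor of the square root of N. It assumes the frames have already been upsized and aligned (real stacking software does the alignment for you), and the synthetic scene and noise level are made up for the demonstration.

```python
import numpy as np

def stack_mean(frames):
    # Average a list of equally sized, already aligned frames.
    return np.mean(np.stack(frames, axis=0), axis=0)

def rms_error(image, reference):
    # Root-mean-square difference between an image and the true scene.
    return float(np.sqrt(np.mean((image - reference) ** 2)))

# Usage: 16 noisy exposures of the same synthetic scene.
rng = np.random.default_rng(0)
scene = np.tile(np.linspace(0.0, 255.0, 64), (64, 1))
frames = [scene + rng.normal(0.0, 10.0, scene.shape) for _ in range(16)]
print(rms_error(frames[0], scene), rms_error(stack_mean(frames), scene))
```

The extra resolution comes from the small shifts between real exposures, which this sketch does not model.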
You can try vectorizing the image with tools like autotrace or potrace and then render it at whatever resolution you like. But it is computationally very costly, so you end up with an image with few colors/features, and even fewer if you need to process the whole image quickly.
Super-resolution algorithms might help in some cases.
I don’t know everything that’s involved (software/hardware and which initial images are necessary), but if you’re interested, here are some links:
http://almalence.com/doc/superresolution-comparison/
(Seems like Almalence’s PhotoAcute fares the best of the ones tested in this article - $30 or $150). They are at: www. photoacute dot com
Markov Random Fields for SR – a free software package (MIT & Microsoft project)
http://people.csail.mit.edu/billf/project%20pages/sresCode/Markov%20Random%20Fields%20for%20Super-Resolution.html
Most decent image editors have smoothing/interpolating filters to do this kind of resizing/resampling, e.g. IrfanView, which gives you several options for interpolation filters. See Lanczos resampling. ImageMagick's convert program also allows you to do this, after specifying a filter.
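For the curious, this is roughly what Lanczos resampling does, sketched in one dimension in Python with NumPy. The function names, the edge handling (repeating the border sample) and the weight normalization are my own simplifications; image editors apply the same kernel along both axes.

```python
import numpy as np

def lanczos_kernel(x, a=3):
    # Windowed sinc: sinc(x) * sinc(x / a) for |x| < a, zero elsewhere.
    # np.sinc is the normalized sinc, sin(pi x) / (pi x).
    x = np.asarray(x, dtype=np.float64)
    return np.where(np.abs(x) < a, np.sinc(x) * np.sinc(x / a), 0.0)

def lanczos_resample_1d(samples, positions, a=3):
    # Evaluate the signal at arbitrary (fractional) positions.
    samples = np.asarray(samples, dtype=np.float64)
    n = len(samples)
    out = np.empty(len(positions))
    for k, p in enumerate(positions):
        base = int(np.floor(p))
        idx = np.arange(base - a + 1, base + a + 1)  # the 2a nearest samples
        weights = lanczos_kernel(p - idx, a)
        idx = np.clip(idx, 0, n - 1)                 # repeat border samples
        out[k] = np.dot(weights, samples[idx]) / weights.sum()
    return out

# Usage: double the sampling density of a short signal.
signal = [1.0, 4.0, 2.0, 8.0, 5.0, 7.0]
print(lanczos_resample_1d(signal, np.arange(0, 5.5, 0.5)))
```

At the original sample positions the output reproduces the input, and in between it is a weighted blend of the neighbouring samples.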
If you need to do this algorithmically, check out the Image Scaling link suggested by Draemon. What platform will you be doing these interpolations on? Most graphics libraries will have a variety of approaches implemented, allowing you to balance speed against quality.
If you just need to resize some images, I recommend GIMP. It can resize images in a variety of ways, at least one of which should produce excellent results in any situation.
As others are pointing out, you can't expect a scaling method to invent information that isn't in the original image. So you can't expect it to be like the moments in CSI where they "zoom and enhance" to see the number on a license plate that was hopelessly blurred in the original image.