Is there any "step by step guide" on how to do face recognition and identification without any libraries?
The goal is to receive 60 images and identify who is in each one, so first we will have to gather information about every person and then identify them image by image.
It's for academic research, and we are not supposed to use any kind of "external help" for our algorithm. Any programming language will do; we just need to keep everything simple. What should I research and use to build this kind of software?
This article provides basic techniques for image recognition. It uses neural nets to recognize handwritten digits.
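For the library-free constraint in the question, a minimal baseline (not the article's neural-net approach) is nearest-neighbour matching on raw pixels: enroll one reference image per person, then label each of the 60 query images with the closest reference. The sketch below assumes plain-text (P2) PGM grayscale files of identical size, and the file names are hypothetical; real face identification needs alignment and better features (eigenfaces, for instance), but the skeleton looks like this:

```python
# Library-free nearest-neighbour sketch: identify each query image by
# the closest reference image in raw pixel space. Assumes plain-text
# (P2) PGM files of identical dimensions; file names are placeholders.

def read_pgm(path):
    """Read an ASCII (P2) PGM file into a flat list of pixel values."""
    with open(path) as f:
        tokens = [t for line in f
                  for t in line.split('#')[0].split()]  # drop comments
    assert tokens[0] == "P2", "only plain-text PGM handled here"
    width, height, maxval = map(int, tokens[1:4])
    return [int(t) for t in tokens[4:4 + width * height]]

def distance(a, b):
    """Sum of squared pixel differences between two images."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def identify(query, references):
    """Return the name whose reference image is nearest to the query."""
    return min(references, key=lambda name: distance(query, references[name]))

# Hypothetical setup: one enrolled image per person, then the queries.
references = {name: read_pgm(name + ".pgm") for name in ["alice", "bob"]}
print(identify(read_pgm("unknown01.pgm"), references))
```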
Related
I can't find any useful documentation on what the Apple Neural Engine does specifically beyond "accelerate machine learning tasks across the Mac for things like video analysis, voice recognition, image processing, and more", which doesn't tell me much. Is it just another TPU? What architecture does it use? Does it have accelerators for any specific linear algebra operations? Do they have a CUDA-type language to actually use it?
I am a total beginner with Xcode and Objective-C, but I have some experience with OOP in C++. I bought this book, read about how to make a simple app, and skimmed the rest. What I want to do is make an iPhone app people can use to look up math equations such as the quadratic equation, the Pythagorean identity, etc. I plan to include a lot of material, and to do a lot of things better than other apps I have seen. However, before I pay Apple $99 to be a full-fledged iOS developer, I want to know that it isn't too hard to produce the Greek letters and math notation that we see in math books. So, for example, what code is needed to make an iPhone app that displays such an equation? Of course I want to use features that I understand are included in Xcode for doing this sort of thing, rather than making a graphic with another program that my app would use when needed. Besides that specific example, where is the Apple documentation for making other math symbols and notation that my iPhone app will display? If this is the wrong place to ask, it would be great if you could tell me of a better place to post my question.
It's going to require a lot of work to get good layouts using the system frameworks. All the building blocks are there, but your program would need significant rendering customization to get the layouts you expect. In detail, the characters you need are there, but you will need to write a fair amount of supporting code to resize, position, and lay out those characters correctly.
You may want to look for a suitably licensed library you can use which specializes in this purpose. Perhaps a LaTeX renderer would offer some good leads.
Use Core Animation layers to construct the elements of a parsed equation. Use Quartz to draw the lines and symbols needed to render the visual elements of the equation. Also look at Core Plot. Then, once the equation is parsed into a hierarchical data structure, you can eventually output to LaTeX. Also check out Graham Cox's GCMathParser.
Similar question: Drawing formulas with Quartz 2d
I want to extract an object such as a man or a car from an image. The image is just an ordinary image, not a medical image or another type intended for a specific purpose.
I have searched for a long time and found that automatic image segmentation algorithms just segment the image into a set of regions or give out the contours in the image, not a semantic object. So I turned to interactive image segmentation algorithms and found some popular ones like interactive graph cuts, SIOX, and so on. I think these algorithms meet my demand.
Furthermore, I also downloaded two interactive image segmentation tools: the first one is the interactive segmentation tool, the second one is the interactive segmentation tool-box.
So my questions are:
1. Is an interactive image segmentation algorithm the right solution for my task, given that performance is the most important thing?
2. If I want to use an automatic image segmentation algorithm instead, what should I do next?
Any suggestion will be appreciated.
If you want to pick out an object from a single static image with just a few scribbles, I recommend you have a read of
'Closed-Form Solution to Image Matting'
or 'Spectral Matting',
or 'Lazy Snapping',
but in my tests, the last doesn't perform as well as the first two methods when dealing with subtle objects like hair.
However, you can find their Matlab source code very easily via Google.
That said, the first two methods aren't so pleasant to use in practice; I think you'll need to do lots of modification to make them easy to use. Their main problem, IMHO, is that they require very precise scribbles on the image: if you draw some extra scribbles or place them at the wrong positions, you'll ruin your object cutout.
Apart from these, you may try Bayesian matting, Poisson matting, etc., which all require a helper image called a trimap, and that is genuinely hard to draw.
Extracting objects from an image, especially an ordinary photograph, is not as easy as you might think. You may want to take a look at the OpenCV project.
OpenCV
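As a concrete starting point, here is a hedged sketch of interactive foreground extraction with OpenCV's GrabCut, an iterated graph-cut method in the same family as the interactive graph cuts mentioned in the question; the image path and the rectangle around the object are placeholders:

```python
# Minimal GrabCut sketch with OpenCV (cv2). The image path and the
# rectangle around the object are placeholders you must supply.
import cv2
import numpy as np

img = cv2.imread("photo.jpg")             # hypothetical input image
mask = np.zeros(img.shape[:2], np.uint8)  # per-pixel labels, filled by grabCut
bgd_model = np.zeros((1, 65), np.float64)  # internal GMM state (background)
fgd_model = np.zeros((1, 65), np.float64)  # internal GMM state (foreground)

rect = (50, 50, 300, 400)  # user-drawn box around the object: x, y, w, h

# Run 5 iterations of graph-cut optimization seeded by the rectangle.
cv2.grabCut(img, mask, rect, bgd_model, fgd_model, 5, cv2.GC_INIT_WITH_RECT)

# Pixels marked definite or probable foreground form the cutout.
fg = np.where((mask == cv2.GC_FGD) | (mask == cv2.GC_PR_FGD), 1, 0)
cutout = img * fg.astype("uint8")[:, :, None]
cv2.imwrite("cutout.png", cutout)
```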
Other than OpenCV, I would suggest looking at ITK. It is very popular in medical image analysis projects because, in that field, it is well known that semi-automatic segmentation tools provide the best results. I think the methods apply to natural images as well.
Try looking at tools like livewire segmentation and level-set-based image segmentation. ITK has several demos that let you play with these tools on your own images. A demo application such as this is part of the open-source distribution, and it can be downloaded directly from the ITK servers (look around for instructions).
If this is a business case, you'd better look for companies specialized in "video content analysis". I mean it: reliable people and vehicle detection isn't a single man's project.
General-purpose segmentation tools won't do the trick because they have no notion of what a man or a car looks like. All they are designed to do is find uniform regions in an image.
It is quite late, but there is an algorithm called connected component labeling which you may find useful.
Here is the Wikipedia link for the algorithm.
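To illustrate the idea, here is a minimal, library-free sketch of connected component labeling with 4-connectivity, using a breadth-first flood fill over a binary image; it is illustrative rather than optimized (production code usually prefers the two-pass union-find variant):

```python
# Minimal connected component labeling (4-connectivity) on a binary
# image stored as a list of lists of 0/1, via BFS flood fill.
from collections import deque

def label_components(binary):
    h, w = len(binary), len(binary[0])
    labels = [[0] * w for _ in range(h)]
    next_label = 0
    for y in range(h):
        for x in range(w):
            if binary[y][x] and not labels[y][x]:
                next_label += 1               # found a new component
                labels[y][x] = next_label
                queue = deque([(y, x)])
                while queue:                  # flood-fill its pixels
                    cy, cx = queue.popleft()
                    for ny, nx in ((cy-1, cx), (cy+1, cx),
                                   (cy, cx-1), (cy, cx+1)):
                        if 0 <= ny < h and 0 <= nx < w \
                           and binary[ny][nx] and not labels[ny][nx]:
                            labels[ny][nx] = next_label
                            queue.append((ny, nx))
    return labels, next_label

# Example: two separate blobs get labels 1 and 2.
img = [[1, 1, 0, 0],
       [0, 1, 0, 1],
       [0, 0, 0, 1]]
labels, count = label_components(img)
```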
I am a final-year computer engineering student doing a project on sketch generation from an image. The generated sketch will then be given the feature of lip sync.
I wanted to know what type of programming languages would be suitable for this kind of application?
Thanks in advance.
Since nobody has the answer for this, after a few months of searching I found out myself. The languages that can be used are: pure Matlab for the image processing and lip sync (the lip sync is difficult in Matlab due to the absence of multithreading, but it can still be done using a TimerFcn); Java, although image processing and sound processing are a bit complicated in Java; and Flash, which may be the best fit for this type of processing, though I doubt it satisfies the criterion of a "programming language". Other readily available tools exist that could be used for the implementation of the project.
For my job I've been using a Java version of ARToolKit (NyARToolkit). So far it has proven good enough for our needs, but my boss is starting to want the framework ported to other platforms such as the web (Flash, etc.) and mobile. While I suppose I could use other ports, I'm increasingly annoyed by not knowing how the kit works and, beyond that, by some of its limitations. Later I'll also need to extend the kit's abilities to add things like interaction (virtual buttons on cards, etc.), which as far as I've seen aren't supported in NyARToolkit.
So basically, I need to replace ARToolKit with a custom marker detector (and, in the case of NyARToolkit, try to get rid of JMF and use a better solution via JNI). However, I don't know how these detectors work. I know about 3D graphics and I've built a nice framework around it, but I need to know how to build the underlying tech :-).
Does anyone know any sources on how to implement a marker-based augmented reality application from scratch? When searching on Google I only find "applications" of AR, not the underlying algorithms :-/.
'From scratch' is a relative term. Truly doing it from scratch, without using any pre-existing vision code, would be very painful and you wouldn't do a better job of it than the entire computer vision community.
However, if you want to do AR with existing vision code, this is more reasonable. The essential sub-tasks are:
Find the markers in your image or video.
Make sure they are the ones you want.
Figure out how they are oriented relative to the camera.
The first task is keypoint localization. Techniques for this include SIFT keypoint detection, the Harris corner detector, and others. Some of these have open-source implementations; I think OpenCV has the Harris corner detector in the function GoodFeaturesToTrack.
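As a hedged illustration of that first step, corner detection with OpenCV's Python bindings might look like this; the file name is a placeholder, and useHarrisDetector=True selects the classic Harris response (the default is the Shi-Tomasi variant):

```python
# Minimal corner detection sketch with OpenCV. "marker.png" is a
# placeholder; goodFeaturesToTrack expects a single-channel image.
import cv2

gray = cv2.imread("marker.png", cv2.IMREAD_GRAYSCALE)

corners = cv2.goodFeaturesToTrack(
    gray,
    maxCorners=100,          # keep at most the 100 strongest corners
    qualityLevel=0.01,       # reject corners weaker than 1% of the best
    minDistance=10,          # enforce 10 px spacing between corners
    useHarrisDetector=True,  # Harris response instead of Shi-Tomasi
)
print(corners.reshape(-1, 2))  # (x, y) coordinates of detected corners
```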
The second task is making region descriptors. Techniques for this include SIFT descriptors, HOG descriptors, and many, many others. There should be an open-source implementation of one of these somewhere.
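For instance, the descriptor step could look like the sketch below; ORB is used purely as a freely available stand-in for SIFT, the image names are placeholders, and brute-force matching against the reference marker doubles as the "make sure they are the ones you want" check:

```python
# Sketch: describe keypoints on a reference marker and a camera frame,
# then match them. ORB stands in for SIFT; file names are placeholders.
import cv2

marker = cv2.imread("marker.png", cv2.IMREAD_GRAYSCALE)
frame = cv2.imread("frame.png", cv2.IMREAD_GRAYSCALE)

orb = cv2.ORB_create()
kp1, desc1 = orb.detectAndCompute(marker, None)
kp2, desc2 = orb.detectAndCompute(frame, None)

# Hamming distance suits ORB's binary descriptors; crossCheck keeps
# only mutually-best matches, a cheap way to verify marker identity.
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(desc1, desc2), key=lambda m: m.distance)
print(len(matches), "matches; best distance:", matches[0].distance)
```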
The third task is also done by keypoint localizers. Ideally you want an affine transformation, since this will tell you how the marker is sitting in 3-space. The Harris affine detector should work for this. For more details go here: http://en.wikipedia.org/wiki/Harris_affine_region_detector
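The answer above points to affine region detectors; a common alternative in practice for recovering the camera-relative pose is solving a PnP problem from the marker's known geometry, as in this hedged sketch (the corner coordinates and camera intrinsics are placeholder assumptions, not calibrated values):

```python
# Sketch: estimate the marker's pose from 2D-3D correspondences with
# solvePnP. Corner coordinates and camera intrinsics are placeholder
# assumptions; a real system would calibrate the camera first.
import cv2
import numpy as np

# The marker's four corners in its own coordinate frame (an 8 cm square).
object_pts = np.array([[0, 0, 0], [0.08, 0, 0],
                       [0.08, 0.08, 0], [0, 0.08, 0]], dtype=np.float64)

# Where those corners were found in the image (placeholder pixel values).
image_pts = np.array([[320, 240], [420, 242],
                      [418, 338], [322, 336]], dtype=np.float64)

# Pinhole camera intrinsics (fx, fy, cx, cy); assumed, not calibrated.
K = np.array([[800, 0, 320],
              [0, 800, 240],
              [0, 0, 1]], dtype=np.float64)

ok, rvec, tvec = cv2.solvePnP(object_pts, image_pts, K, None)
print("rotation (Rodrigues):", rvec.ravel(), "translation:", tvec.ravel())
```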