Hello, I'm a Java developer on a video-on-demand website team.
I'm currently researching how to implement a back-end component we are planning to build: it should automatically generate a meaningful thumbnail representing a video's content, similar to the algorithm YouTube uses to generate default thumbnails.
However, I can't seem to find any good open-source or paid implementation, and building the algorithm from scratch is very complicated and would take more time than I think the company is willing to invest at this stage (maybe in the future, though).
I would appreciate it if someone could point me to any implementation that could help, or even to vendors that sell a product serving my component's objective.
Thanks!
As explained on the Google Research blog:
https://research.googleblog.com/2015/10/improving-youtube-video-thumbnails-with.html
The key component is a convolutional neural network that predicts a quality score for each sampled frame.
There are many open-source CNN frameworks, such as Caffe or TensorFlow; the main effort is in preparing the training data.
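For illustration, here is a minimal sketch of the sampling-and-scoring loop such a component might run. The frame sampling uses OpenCV; score_frame is a hypothetical stand-in for whatever CNN you actually train (in TensorFlow, Caffe, or similar):

    import cv2

    def score_frame(frame):
        # Hypothetical stand-in: run your trained CNN on the frame and
        # return a thumbnail-quality score.
        raise NotImplementedError

    def pick_thumbnail(video_path, sample_every_n=30):
        cap = cv2.VideoCapture(video_path)
        best_score, best_frame = float("-inf"), None
        idx = 0
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            if idx % sample_every_n == 0:  # e.g. one frame per second at 30 fps
                score = score_frame(frame)
                if score > best_score:
                    best_score, best_frame = score, frame
            idx += 1
        cap.release()
        return best_frame  # the highest-scoring sampled frame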
Related
I have started a social networking app, and there is one user who won't stop uploading images of women engaged in sexual activity. He also adds offensive captions to them.
My question: how can I detect adult content in images and text and block it from my app? I think this is a problem faced by most people building any kind of open networking app. It would be great if the solution were as fast and inexpensive as possible.
Implement a system that essentially stores {SHA-256 image hash, human rating, computer rating} in a database.
Create an interface for the human rating and the computer rating that can judge and categorize images, as well as an interface in your software that uses that information to decide how to handle such images.
Choose a tool, likely a convolutional-neural-network-based algorithm, with an easy-to-use API. Here's a random result from searching: https://imagga.com/solutions/adult-content-moderation.html
Put everything together and you should have a system that can automatically guess how to handle images, while also letting you iterate through them, which both corrects the database and further trains the rating algorithm on the existing human-produced data.
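As a minimal sketch of that storage layer, assuming sqlite3 and SHA-256 via hashlib (the schema and helper names are illustrative, not prescriptive):

    import hashlib
    import sqlite3

    db = sqlite3.connect("moderation.db")
    db.execute("""CREATE TABLE IF NOT EXISTS image_ratings (
        sha256 TEXT PRIMARY KEY,
        human_rating TEXT,     -- NULL until a human has judged the image
        computer_rating TEXT,  -- latest guess from the trained model
        rated_at REAL          -- timestamp, for the re-rating time buffer
    )""")

    def image_hash(image_bytes):
        return hashlib.sha256(image_bytes).hexdigest()

    def lookup(image_bytes):
        # Returns (human_rating, computer_rating), or None for unseen images.
        return db.execute(
            "SELECT human_rating, computer_rating FROM image_ratings"
            " WHERE sha256 = ?", (image_hash(image_bytes),)).fetchone()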
Note: the software never treats an image's status as permanent unless a human has rated it. Whenever an image is accessed, the latest state of the detection model decides how it is handled. If this happens too frequently to support, associate a time buffer with the image so that it isn't re-rated too often.
Update: the advantage of this custom solution is that you control how things work. You can define the rating system and how situations are handled, and you retain governance over whatever set of trained algorithms you use. You always have the final say and can see what is going on at all times. The catch is that you would need to implement this software as an extension to your project.
Not easily; it would require machine-learning techniques and a ton of training data. Not to mention that all modern techniques can easily be tricked.
There are a few moderation solutions, but they aren't ideal.
First, you could ban them. Not the best option, as they could make another account, but it does mean they have to create another email address for it.
Second, you could isolate them (a shadow ban): they still think they are posting on your app, but none of their content is propagated to other people.
I don't know the legality of either of these; it all comes down to your terms and conditions. But AI is not really a good option, especially if your app needs to scale.
I'm trying to choose an API to match object photos taken with a cell phone against a list of images in a file system. The thing is, I'm afraid that I won't get reliable results and that it won't be worth losing time on this feature.
I would really appreciate some advice regarding this topic.
I am new to scikit-learn. I have viewed the online tutorials, but they all seem to leverage existing data (e.g., digits, iris, etc.). I need information on how to process images so that they can be used by scikit-learn.
Details of my study: I have a webcam set up outside my office. It captures all of the traffic on my street that passes through the field of view. I have cropped several hundred images of sedans, trucks, and SUVs. The goal is to predict which of these categories a vehicle belongs to. I have applied Histogram of Oriented Gradients (HOG) to these images so that you can see the differences between the categories. This site will not allow me to post images, but you can see them here: https://stats.stackexchange.com/questions/149421/obtaining-a-hog-feature-vector-for-implementation-in-svm-in-python. I posted the same question there but got no response. The closest answer I have found is this post: Resize HOG feature for Scikit-Learn classifier
I wish to train an SVM classifier based on these images. I understand that there are algorithms in scikit-image that prepare HOG features for use in scikit-learn. Can someone help me understand this process? I would also be grateful for any thoughts, based on your experience, about the probability of success of this classification study. I also understand that I need to train the model using negative images (ones with no vehicles). How is this done?
I know I am asking a lot, but I am surprised that no one I am aware of has done a tutorial on these early steps. It seems like a fairly elementary study.
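For what it's worth, a minimal sketch of this pipeline with scikit-image and scikit-learn; the file names, image size, and labels below are placeholders, and negatives (images with no vehicle) are simply treated as one more class:

    import numpy as np
    from skimage.io import imread
    from skimage.transform import resize
    from skimage.feature import hog
    from sklearn.svm import LinearSVC

    def hog_vector(path, shape=(128, 128)):
        # Resizing first guarantees every image yields a vector of the
        # same length, which scikit-learn requires.
        img = resize(imread(path, as_gray=True), shape)
        return hog(img, orientations=9, pixels_per_cell=(8, 8),
                   cells_per_block=(2, 2))

    # Labelled examples; "none" is the negative (no-vehicle) class.
    samples = [("sedan1.png", "sedan"), ("truck1.png", "truck"),
               ("suv1.png", "suv"), ("empty_street1.png", "none")]

    X = np.array([hog_vector(path) for path, _ in samples])
    y = [label for _, label in samples]

    clf = LinearSVC().fit(X, y)
    print(clf.predict([hog_vector("unknown.png")]))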
I am working on a feature-matching project, and I am using OpenCV Python as the tool for developing the application.
According to the project requirements, my database has images of objects such as glasses, balls, etc., along with their descriptions. Users can send images to the back end of the application, and the back end is responsible for matching the sent image against the images in the database and sending the matching image's description back to the user.
I have done some research on the above scenario. Unfortunately, I still could not find an algorithm for matching two images and identifying whether or not they match.
If anybody has that kind of algorithm, please share it. (I have to use OpenCV Python or JavaCV.)
Thank you
This is a very common problem in computer vision nowadays. A simple solution is really simple, but there are many, many variants for more sophisticated solutions.
Simple Solution
Feature Detector and Descriptor based.
The idea here is that you get a bunch of keypoints and their descriptors (search for SIFT/SURF/ORB). You can then find matches easily with tools provided in OpenCV: you match the keypoints in your query image against all keypoints in the training dataset. Because of typical outliers, you should add a robust matching technique, like RANSAC. All of this is part of OpenCV.
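A minimal sketch of that idea against a single database image, using ORB keypoints, brute-force matching, and RANSAC via a homography fit (file names are placeholders):

    import cv2
    import numpy as np

    query = cv2.imread("query.jpg", cv2.IMREAD_GRAYSCALE)
    train = cv2.imread("db_image.jpg", cv2.IMREAD_GRAYSCALE)

    orb = cv2.ORB_create()
    kp1, des1 = orb.detectAndCompute(query, None)
    kp2, des2 = orb.detectAndCompute(train, None)

    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(des1, des2)

    # RANSAC step: keep only matches consistent with a single homography.
    src = np.float32([kp1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([kp2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, inlier_mask = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    inliers = int(inlier_mask.sum()) if inlier_mask is not None else 0
    print("inlier matches:", inliers)  # a high count suggests a real match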
Bag-of-Words model
If you just want the image that is most similar to your query image, you can use nearest-neighbour search. Be aware that OpenCV comes with the much faster approximate-nearest-neighbour (ANN) search via its FLANN matcher. Alternatively, you can use the BruteForceMatcher.
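A sketch of the FLANN route with SIFT descriptors (KD-tree index plus Lowe's ratio test; file names are placeholders):

    import cv2

    sift = cv2.SIFT_create()
    img1 = cv2.imread("query.jpg", cv2.IMREAD_GRAYSCALE)
    img2 = cv2.imread("candidate.jpg", cv2.IMREAD_GRAYSCALE)
    _, des1 = sift.detectAndCompute(img1, None)
    _, des2 = sift.detectAndCompute(img2, None)

    FLANN_INDEX_KDTREE = 1
    flann = cv2.FlannBasedMatcher(dict(algorithm=FLANN_INDEX_KDTREE, trees=5),
                                  dict(checks=50))
    matches = flann.knnMatch(des1, des2, k=2)

    # Lowe's ratio test filters out ambiguous matches.
    good = [m for m, n in (pair for pair in matches if len(pair) == 2)
            if m.distance < 0.7 * n.distance]
    print(len(good), "good matches")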
Advanced Solution
If you have many images (many == 1 million), you can look at locality-sensitive hashing (see Dean et al., 100,000 Object Categories).
If you do use Bag-of-Visual-Words, then you should probably build an inverted index (see the sketch after this list).
Have a look at Fisher Vectors for improved accuracy as compared to BOW.
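As for the inverted index mentioned above, a toy sketch in plain Python: map each visual word to the set of images containing it, so a query only needs to score images sharing at least one word with it.

    from collections import defaultdict

    inverted = defaultdict(set)  # visual word id -> ids of images containing it

    def index_image(image_id, bow_histogram):
        for word, count in enumerate(bow_histogram):
            if count > 0:
                inverted[word].add(image_id)

    def candidate_images(query_histogram):
        ids = set()
        for word, count in enumerate(query_histogram):
            if count > 0:
                ids |= inverted[word]
        return ids  # score only these instead of the whole database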
Suggestion
Start by using Bag-of-Visual-Words. There are tutorials on how to train the dictionary for this model; a minimal end-to-end sketch also follows the querying steps below.
Training:
Extract local features (just pick SIFT; you can easily change this later, as OpenCV is very modular) from a subset of your training images. First detect keypoints, then extract descriptors for them. There are many tutorials on the web about this.
Train the dictionary. There is helpful documentation with a reference to a sample implementation in Python (opencv_source_code/samples/python2/find_obj.py)!
Compute a histogram for each training image. (Also covered in the BOW documentation from the previous step.)
Put the image descriptors from the step above into a FLANN-based matcher.
Querying:
Compute features on your query image.
Use the dictionary from training to build a BOW histogram for your query image.
Use that histogram to find the nearest neighbour(s).
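Putting the training and querying steps together, a minimal sketch using OpenCV's built-in BOW helpers with SIFT; the file names and the vocabulary size are placeholders:

    import cv2
    import numpy as np

    sift = cv2.SIFT_create()
    training_images = ["train1.jpg", "train2.jpg"]  # placeholder paths

    # Training steps 1-2: extract local features, cluster into a dictionary.
    bow_trainer = cv2.BOWKMeansTrainer(100)  # 100 visual words
    for path in training_images:
        img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
        _, des = sift.detectAndCompute(img, None)
        bow_trainer.add(np.float32(des))
    vocabulary = bow_trainer.cluster()

    # Training step 3: one BOW histogram per training image.
    bow_extract = cv2.BOWImgDescriptorExtractor(sift, cv2.FlannBasedMatcher())
    bow_extract.setVocabulary(vocabulary)

    def bow_histogram(path):
        img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
        return bow_extract.compute(img, sift.detect(img, None))

    train_hists = np.vstack([bow_histogram(p) for p in training_images])

    # Querying: histogram of the query image, then nearest neighbour.
    q = bow_histogram("query.jpg")
    nearest = int(np.argmin(np.linalg.norm(train_hists - q, axis=1)))
    print("best match:", training_images[nearest])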
I think you are talking about Content-Based Image Retrieval (CBIR).
There are many research papers available on the Internet. Get one of them and implement the approach that best suits your needs. Select criteria according to your application, such as texture-based, colour-based, or shape-based image retrieval (shape-based is best when you are doing image retrieval on the Internet, for speed).
Since you need a Python implementation, I would suggest you go through Chapters 7 and 8 of the Computer Vision book. They contain working examples, with code, of what you are looking for.
One question you may find useful: Are there any APIs that'll let me search by image?
I need to implement an algorithm that takes a picture (JPEG) as input and creates a new picture as output containing only the bodies (with the background removed completely). The input is a vacation photo with people in it, and I need to recognize the human bodies and remove the background. Can someone suggest what algorithm to use, or what book to buy to learn such algorithms?
Check this link; it will answer your question about removing the background and performing further processing.
Neural networks are particularly useful for this kind of task, but the theory is a universe of its own; if you're doing it from scratch, that's a lot of work.
This is a segmentation problem. In the general case, segmenting images is a hard research problem (I just spent five years doing a doctorate on segmenting greyscale medical images, for example), and the way you go about it is strongly tied to the type of images you have to deal with. The best advice I can give is to go and read the appropriate literature on segmenting colour images (e.g., use Google Scholar). In terms of books, this one is a good general-purpose introduction to image processing:
http://www.amazon.co.uk/Digital-Image-Processing-Rafael-Gonzalez/dp/0130946508/ref=sr_1_7?ie=UTF8&qid=1326236038&sr=8-7
Searching for "segmenting people in colour images" on Google seems to turn up some good links, incidentally.
I have a question for you: do you want to implement this with an algorithm? If so, it might require a lot of work (especially if you are new to the field of image processing).
Otherwise, you may try masking techniques in image-editing software like Adobe Photoshop (that would hardly take 15 minutes, depending on how well you know it).
A good book to start with for image-processing techniques is "Digital Image Processing" by Gonzalez and Woods; it starts from the basics and explains things in depth.
Still, it may take a lot of time to develop an algorithm for this job, so I recommend you use a library instead. OpenCV (Open Source Computer Vision) is an excellent choice. The library comes with demos, including programs for face detection, and its built-in functions provide a variety of features (edge detection, feature identification and extraction; you may have to use these). Here's the link:
http://opencv.willowgarage.com/wiki/
The link provides a lot of reference material that you can make use of! :)
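One concrete OpenCV function worth trying for exactly this background-removal job is GrabCut; a minimal sketch, where the rectangle around the person is a placeholder you would get from a detector or from user input:

    import cv2
    import numpy as np

    img = cv2.imread("vacation.jpg")
    mask = np.zeros(img.shape[:2], np.uint8)
    bgd_model = np.zeros((1, 65), np.float64)
    fgd_model = np.zeros((1, 65), np.float64)

    rect = (50, 50, 400, 500)  # (x, y, w, h) roughly around the person
    cv2.grabCut(img, mask, rect, bgd_model, fgd_model, 5,
                cv2.GC_INIT_WITH_RECT)

    # Keep definite/probable foreground pixels, zero out the background.
    fg = np.where((mask == cv2.GC_FGD) | (mask == cv2.GC_PR_FGD),
                  1, 0).astype("uint8")
    cv2.imwrite("bodies_only.png", img * fg[:, :, None])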
Start with facial-recognition software and algorithms; they have been the most refined over the years. As long as all of your bodies have heads, you can use EXIF data to figure out the image-capture orientation (of course you can't rely on it completely), sample the facial skin to get skin-tone ranges, and then find the attached body. Anything that is not head or body gets deleted. This process assumes that a person's face has roughly the same skin tone as their body and that the camera flash isn't washing it out. You could grab the flash duration and some other attributes from EXIF and adjust your ranges accordingly.
A lot of software out there can recognize faces (look at iPhoto, for example), so you'll have to use the face as a reference point, along with skin tone, to find your body edges. Your result isn't going to be perfect, but as long as your approach is sound, you'll end up with something useful.
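A rough sketch of that face-as-reference-point idea, assuming OpenCV: detect a face with a Haar cascade, sample its skin tone in HSV, and keep only pixels inside a tolerance band around it. The percentile-based thresholds are my own crude, lighting-dependent heuristic, and the file name is a placeholder:

    import cv2
    import numpy as np

    img = cv2.imread("vacation.jpg")
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    x, y, w, h = cascade.detectMultiScale(gray, 1.1, 5)[0]  # assume a face

    hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
    face_pixels = hsv[y:y + h, x:x + w].reshape(-1, 3)

    # Tolerance band around the sampled facial skin tone.
    lo = np.clip(np.percentile(face_pixels, 5, axis=0) - 10, 0, 255)
    hi = np.clip(np.percentile(face_pixels, 95, axis=0) + 10, 0, 255)
    skin = cv2.inRange(hsv, lo.astype("uint8"), hi.astype("uint8"))

    cv2.imwrite("skin_mask.png", cv2.bitwise_and(img, img, mask=skin))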
And release your software as open source when you're done so I can use it... :)
You can download a free PDF of the book Computer Vision by Richard Szeliski from the author's website. Not only do you have a free book on algorithms, but it's a book that addresses this specific problem.
http://szeliski.org/Book/
Used copies of the hardcover are available for about $62 if you check addall.com. If you spend much time doing image processing, you'll appreciate having a paper copy of at least one good general reference book.
It's tough but not impossible. I can't give you any code, but Peter Norvig gave a great talk on the value of data, in which he shows how he was able to take a picture of a lake, remove all the houses blocking the view, and have the lake filled in, boats and all.
The computer basically learned what lakes look like and that boats go on lakes, then removed the houses and filled the area in accordingly. He explains his process (but provides no code or anything).
Here it is:
Peter Norvig - The Unreasonable Effectiveness of Data
http://www.youtube.com/watch?v=yvDCzhbjYWs