Apologies if this is a very specific, out-of-the-blue question. Tekken 3 was one of the games that left a lasting impression on me growing up, especially its cinematic intro. For me it's still one of the best and coolest game cinematic intros ever, even by today's standards (given that it's been over 20 years since the game's release). Even though I'm a software developer, I've always been intrigued by how such amazing cinematic intros are created. I've searched everywhere on Google, but unfortunately couldn't find a source that discloses this information. I know the game is quite old, but I still find it strange that there's no discussion anywhere online about how its intro was created (what software was used, how the cinematic effect was achieved, etc.). The best resources I could find simply talk about the game's characters, moves, etc., and say nothing about the opening cinematic. For those who need a reference as to what I'm talking about, here's the video on YouTube: https://www.youtube.com/watch?v=IsvtUxEFQaU.
I'm aware it's a very complicated process that requires a highly experienced team, but I simply want to know what software and what kinds of processes/effects were used (even guesses from experienced animators/developers would suffice); as much information about the process as possible would be really appreciated. There are numerous tutorials online about making computer animations in 3ds Max, Maya, Unity and Unreal, but the results all look like children's Disney animations or like the in-game graphics themselves, not the cinematic rendering you see in the Tekken 3 intro. If you watch the intro you will know what I mean. I would really appreciate any help and would love to learn the process if somebody could point me in a direction and give some information on the software and processes used, even from a very high-level view.
Thank you so much in advance for the help. You will literally be answering one of my life's main mysteries :)
"but they all look like children Disney animations or animation from the actual game graphic itself"
I'm sorry, but did we watch the same video? It's entertaining for sure but visually nothing impressive.
This doesn't look like an in-game cinematic, so it was probably made in a 3D animation package like 3ds Max, Maya, Softimage, etc. These packages are also used for feature films, so I'm not sure where you're getting the idea that everything looks like children's Disney animation. They can also be used for live-action work with photorealistic renders. These days 3ds Max is mostly used for games, while Maya is what's mainly used for films since it's what the studios use (for the most part; there are exceptions). These packages also include solutions for particle effects and cloth simulation, though right now Houdini is king for those two.
Of course, what matters most is the skill of the team in making the most of the software.
The process generally goes like this:
Create a script to define the story
Create concept art for characters and locations
Create assets (3d models/textures/shaders)
Rig the assets so they can deform
Animate the assets into shots (modeling/rigging/anim can work in parallel)
Do character effects on your finished shots (cloth, hair, rigid dynamics. This depends on your budget and is optional)
If you're using any other software for lighting/effects, you probably need to export your scene to cache data (Alembic is a popular format)
Render everything out
Use a compositing software to add all different elements together and make final tweaks to the shot (optional)
I'm still missing a few things, especially for bigger productions, but that's the general idea.
I was wondering if there is an AForge.NET algorithm that is intended for human activity recognition?
For example, I would like to recognize drowning while capturing frames from a surveillance camera on the beach.
I saw there are algorithms for motion detection, but what I need is motion detection plus logic to process that motion, so that the computer can decide whether the motion fits into the drowning category or any other category I define.
Comments would be appreciated.
You might need to develop your own algorithms; I do that too with AForge.
AForge basically gives me simple video acquisition, while my own math does the interesting stuff.
In your case..
Detect spots with people
Zoom in to them ??
Then it becomes tricky: how do you distinguish someone who dives from someone who sinks ?..
Also there are waves that can get in front of the person you're trying to follow..
Usually this kind of recognition comes down to simple observations, like someone throwing their hands up not looking like the circle of a swimmer's head..
You have to think about how a beach guard sees the difference, what the main visual clues are, and how you can convert them to bitmap math.
Consider using Accord.NET - it's a library based on AForge.NET that contains many machine learning algorithms. However you must write all the logic, as you call it, by yourself.
Another possibility is to use Emgu CV which has some motion detection algorithms.
What would be the best way to detect a fast moving object using OpenCV?
Say, I have 5 random video files:
1) Video of a crowd, people walking, static camera.
2) Video of a cat playing with a ball, shaky iPhone camera.
3) Video of a person being interviewed. Static camera.
4) Animation (3D) of a fast moving car, background is blurred etc. etc.
5) A blurred out video shot with iPhone camera (just camera waved around, nothing is visible).
So I would like to isolate video5 and detect that there is a lot of movement in video4 and video2.
What would be the best approach to do that? I'm thinking of using OpenCV 2, but if there is a better solution for that, I'd be happy to learn about it.
Any input greatly appreciated. Pseudo-code or just recommendations of specific algorithms.
Thank you
Optical flow. This is one of many ways of detecting motion.
I don't know if you are still working on this, but I found it interesting to answer.
Approach 1-
As suggested by user349026, one of the most intuitive ways is to work with optical flow. It will give you the dominant motion, but optical flow always comes with noise, so you will have to apply some filtering before using it.
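As a rough illustration of scoring how much motion a clip contains, here is a minimal sketch. It deliberately uses plain frame differencing rather than optical flow so that it stays self-contained. Frames are assumed to be already decoded into grayscale lists of rows (values 0-255), and the function name `motion_score` is made up for this example. With a shaky camera the whole frame changes, so the score mixes camera motion and object motion together.

```python
def motion_score(frames):
    """Mean absolute per-pixel change between consecutive grayscale frames."""
    total = 0
    count = 0
    for prev, curr in zip(frames, frames[1:]):
        for row_prev, row_curr in zip(prev, curr):
            for a, b in zip(row_prev, row_curr):
                total += abs(a - b)
                count += 1
    # Fewer than two frames (or empty frames) means no measurable motion.
    return total / count if count else 0.0


# Two tiny synthetic 2x2 "clips": one static, one flickering.
static_clip = [[[10, 10], [10, 10]]] * 3
moving_clip = [[[0, 0], [0, 0]], [[50, 50], [50, 50]], [[0, 0], [0, 0]]]
print(motion_score(static_clip), motion_score(moving_clip))
```

Ranking the videos by such a score would separate the static interview from the fast-moving clips. It does not by itself tell a blurred, waved-around clip from a sharp fast one.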
Approach 2-
This one is difficult but gives good results.
It is from a CVPR 2013 paper, link: http://www.irisa.fr/texmex/people/jain/w-Flow/motion_cvpr13.pdf
I think just the introduction of this paper will solve your problem.
In a multi-touch environment, how does gesture recognition work? What mathematical methods or algorithms are utilized to recognize or reject data for possible gestures?
I've created some retro-reflective gloves and an IR LED array, coupled with a Wii remote. The Wii remote does internal blob detection and tracks 4 points of IR light and transmits this information to my computer via a bluetooth dongle.
This is based off Johnny Chung Lee's Wii research. My setup is exactly like that of the graduate students from the Netherlands displayed here. I can easily track the positions of 4 points in 2D space, and I've written some basic software to receive and visualize these points.
The Netherlands students have gotten a lot of functionality out of their basic pinch-click recognition. I'd like to take it a step further if I could, and implement some other gestures.
How is gesture recognition usually implemented? Beyond anything trivial, how could I write software to recognize and identify a variety of gestures: various swipes, circular movements, letter tracing, etc.
Gesture recognition, as I've seen it anyway, is usually implemented using machine learning techniques similar to image recognition software. Here's a cool project on CodeProject about doing mouse gesture recognition in C#. I'm sure the concepts are quite similar since you can likely reduce the problem down to 2D space. If you get something working with this, I'd love to see it. Great project idea!
One way to look at it is as a compression / recognition problem. Basically, you want to take a whole bunch of data, throw out most of it, and categorize the remainder. If I were doing this (from scratch) I'd probably proceed as follows:
work with a rolling history window
take the center of gravity of the four points in the start frame, save it, and subtract it out of all the positions in all frames.
factor each frame into two components: the shape of the constellation and the movement of its CofG relative to the last frame's.
save the absolute CofG for the last frame too
the series of CofG changes gives you swipes, waves, etc.
the series of constellation morphing gives you pinches, etc.
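The factoring steps above could be sketched roughly as follows. This is only an illustration under some assumptions: each frame is a list of (x, y) tuples, the function names are made up, and the shape is taken relative to each frame's own CofG (a simplification of subtracting the start frame's CofG first).

```python
def centroid(points):
    """Center of gravity of a list of (x, y) points."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def factor_frames(frames):
    """Split each frame into (CofG movement since last frame, constellation shape).

    The shape is the point set expressed relative to that frame's own CofG,
    so it does not change when the whole constellation is merely translated.
    """
    moves, shapes = [], []
    prev = None
    for points in frames:
        cx, cy = centroid(points)
        shapes.append([(x - cx, y - cy) for x, y in points])
        moves.append((0.0, 0.0) if prev is None else (cx - prev[0], cy - prev[1]))
        prev = (cx, cy)
    return moves, shapes
```

A swipe recognizer would then look only at `moves`, and a pinch recognizer only at how `shapes` changes from frame to frame.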
After seeing your photo (two points on each hand, not four points on one, doh!) I'd modify the above as follows:
Do the CofG calculation on pairs, with the caveats that:
If there are four points visible, pairs are chosen to minimize the product of the intrapair distances
If there are three points visible, the closest two are one pair, the other one is the other
Use prior / following frames to override when needed
Instead of a constellation, you've got a nested structure of distance / orientation pairs (i.e., one D/O between the hands, and one more for each hand).
Pass the full reduced data to recognizers for each gesture, and let them sort out what they care about.
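The pairing rules for three or four visible points could look something like this sketch. The function name `pair_points` is hypothetical, points are assumed to be (x, y) tuples, and the "use prior / following frames to override" step is left out.

```python
from itertools import combinations
from math import dist


def pair_points(points):
    """Group 3 or 4 tracked points into two hands."""
    if len(points) == 4:
        a, b, c, d = points
        # The three possible ways to split four points into two pairs.
        splits = [((a, b), (c, d)), ((a, c), (b, d)), ((a, d), (b, c))]
        # Keep the split with the smallest product of intrapair distances.
        return min(splits, key=lambda s: dist(*s[0]) * dist(*s[1]))
    if len(points) == 3:
        # The closest two points form one pair; the remaining one stands alone.
        i, j = min(combinations(range(3), 2),
                   key=lambda ij: dist(points[ij[0]], points[ij[1]]))
        k = 3 - i - j
        return ((points[i], points[j]), (points[k],))
    raise ValueError("expected 3 or 4 points")
```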
If you want to get cute, do a little DSL to recognize the patterns, and write things like:
fire when
in frame.final: rectangle(points)
and
over frames.final(5): points.all (p => p.jerk)
or
fire when
over frames.final(3): hands.all (h => h.click)
A video of what has been done with this sort of technology, if anyone is interested?
Pattie Maes demos the Sixth Sense - TED 2009
Most simple gesture-recognition tools I've looked at use a vector-based template to recognize them. For example, you can define right-swipe as "0", a checkmark as "-45, 45, 45", a clockwise circle as "0, -45, -90, -135, 180, 135, 90, 45, 0", and so on.
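A minimal sketch of that template idea, using the three templates quoted above. It rests on a few assumptions: the stroke has already been reduced to a handful of key points (one segment per template entry), the y axis points up, and headings are snapped to multiples of 45 degrees. The function names and the `tolerance` parameter are made up for illustration.

```python
from math import atan2, degrees

TEMPLATES = {
    "right-swipe": [0],
    "checkmark": [-45, 45, 45],
    "clockwise-circle": [0, -45, -90, -135, 180, 135, 90, 45, 0],
}


def directions(points, step=45):
    """Quantized heading (degrees, y pointing up) of each segment of a stroke."""
    result = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        angle = degrees(atan2(y1 - y0, x1 - x0))
        snapped = round(angle / step) * step
        # Treat -180 and 180 as the same heading.
        result.append(180 if snapped == -180 else snapped)
    return result


def angle_gap(a, b):
    """Smallest absolute difference between two headings, in degrees."""
    d = abs(a - b) % 360
    return min(d, 360 - d)


def recognize(points, templates=TEMPLATES, tolerance=0):
    """Return the best-matching template name, or None if nothing fits.

    `tolerance` is the allowed average heading error per segment, in degrees.
    """
    seen = directions(points)
    best, best_cost = None, None
    for name, template in templates.items():
        if len(template) != len(seen):
            continue
        cost = sum(angle_gap(a, b) for a, b in zip(seen, template))
        if cost <= tolerance * len(template) and (best_cost is None or cost < best_cost):
            best, best_cost = name, cost
    return best
```

For example, `recognize([(0, 0), (1, -1), (2, 0), (3, 1)])` walks down-right and then twice up-right, which matches the checkmark template.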
Err.. I've been working on gesture recognition for the past year or so now, but I don't want to say too much because I'm trying to patent my technology :) But... we've had some luck with adaptive boosting, although what you're doing looks fundamentally different. You only have 4 points of data to process, so I don't think you really need to "reduce" anything.
What I would investigate is how programs like Flash turn a freehand-drawn circle into an actual circle. It seems like you could track the points for a duration of about a second, "smooth" the path in some fashion, and then probably get away with hardcoding your gestures (if you make them simple enough). Otherwise, yes, you're going to want to use a learning algorithm. Neural nets might work... I don't know. Just tossing out ideas :) Maybe look at how OCR is done too... or even Hough transforms. It looks to me like this is a problem of recognizing shapes more than it is of recognizing gestures.
I'm not very well versed in this type of mathematics, but I have read somewhere that people sometimes use Markov Chains or Hidden Markov Models to do Gesture Recognition.
Perhaps someone with a little more background in this side of Computer Science can illuminate it further and provide some more details.
I am building a web application using .NET 3.5 (ASP.NET, SQL Server, C#, WCF, WF, etc) and I have run into a major design dilemma. This is a uni project btw, but it is 100% up to me what I develop.
I need to design a system whereby I can take an image and automatically crop a certain object within it, without user input. So for example, cut out the car in a picture of a road. I've given this a lot of thought, and I can't see any feasible method. I guess this thread is to discuss the issues and feasibility of achieving this goal. Eventually, I would get the dimensions of a car (or whatever it may be), and then pass these into a (custom) 3D modelling app as parameters, to render a 3D model. This last step is a lot more feasible; it's the cropping that is the issue. I have thought of all sorts of ideas, like getting the colour of the car and then the outline around that colour. So if the car (for example) is yellow, then when there is a yellow pixel in the image, trace around it. But this would fail if there are two yellow cars in a photo.
Ideally, I would like the system to be completely automated. But I guess I can't have everything my way. Also, my skills are in what I mentioned above (.NET 3.5, SQL Server, AJAX, web design) as opposed to C++ but I would be open to any solution just to see the feasibility.
I also found this patent: US Patent 7034848 - System and method for automatically cropping graphical images
Thanks
This is one of the problems that needed to be solved to finish the DARPA Grand Challenge. Google video has a great presentation by the project lead from the winning team, where he talks about how they went about their solution, and how some of the other teams approached it. The relevant portion starts around 19:30 of the video, but it's a great talk, and the whole thing is worth a watch. Hopefully it gives you a good starting point for solving your problem.
What you are talking about is an open research problem, or even several research problems. One way to tackle this is by image segmentation. If you can safely assume that there is one object of interest in the image, you can try a figure-ground segmentation algorithm. There are many such algorithms, and none of them are perfect. They usually output a segmentation mask: a binary image where the figure is white and the background is black. You would then find the bounding box of the figure, and use it to crop. The thing to remember is that none of the existing segmentation algorithms will give you what you want 100% of the time.
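The last step, going from a segmentation mask to a crop, is the easy part. Here is a minimal sketch, assuming the mask and the image are plain lists of rows of equal size and that nonzero mask values mark the figure; the function names are made up for illustration, and producing the mask itself is the hard problem that this does not address.

```python
def bounding_box(mask):
    """Return (top, left, bottom, right), inclusive, of the nonzero pixels, or None."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    if not rows:
        return None  # empty mask: no figure was found
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    return rows[0], cols[0], rows[-1], cols[-1]


def crop(image, box):
    """Cut the inclusive box (top, left, bottom, right) out of the image."""
    top, left, bottom, right = box
    return [row[left:right + 1] for row in image[top:bottom + 1]]
```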
Alternatively, if you know ahead of time what specific type of object you need to crop (car, person, motorcycle), then you can try an object detection algorithm. Once again, there are many, and none of them are perfect either. On the other hand, some of them may work better than segmentation if your object of interest is on a very cluttered background.
To summarize, if you wish to pursue this, you would have to read a fair number of computer vision papers, and try a fair number of different algorithms. You will also increase your chances of success if you constrain your problem domain as much as possible: for example, restrict yourself to a small number of object categories, assume there is only one object of interest in an image, or restrict yourself to a certain type of scene (nature, sea, etc.). Also keep in mind that even the accuracy of state-of-the-art approaches to this type of problem has a lot of room for improvement.
And by the way, the choice of language or platform for this project is by far the least difficult part.
A method often used for face detection in images is through the use of a Haar classifier cascade. A classifier cascade can be trained to detect any objects, not just faces, but the ability of the classifier is highly dependent on the quality of the training data.
This paper by Viola and Jones explains how it works and how it can be optimised.
Although it is C++ you might want to take a look at the image processing libraries provided by the OpenCV project which include code to both train and use Haar cascades. You will need a set of car and non-car images to train a system!
Some of the best attempts at this that I've seen use a large database of images to help understand the image you have. These days you have Flickr, which is not only a giant corpus of images, but one that is also tagged with meta-information about what each image is.
Some projects that do this are documented here:
http://blogs.zdnet.com/emergingtech/?p=629
Start with analyzing the images yourself. That way you can formulate the criteria on which to match the car. And you get to define what you cannot match.
If all cars have the same background, for example, it need not be that complex. But your example states a car on a street. There may be parked cars. Should they be recognized?
If you have access to MATLAB, you could test your pattern recognition filters with specialized software like PRTools.
When I was studying (a long time ago :) I used Khoros Cantata and found that an edge filter can simplify the image greatly.
But again, first define the conditions on the input. If you don't do that you will not succeed, because pattern recognition is really hard (think about how long it took to crack CAPTCHAs).
I did say photo, so this could be a black car against a black background. I did think of specifying the colour of the object, and then when that colour is found, tracing around it (high-level explanation). But with a black object on a black background (no contrast, in other words), it would be a very difficult task.
Better still, I've come across several sites with 3d models of cars. I could always use this, stick it into a 3d model, and render it.
A 3D model would be easier to work with, a real world photo much harder. It does suck :(
If I'm reading this right... This is where AI shines.
I think the "simplest" solution would be to use a neural-network based image recognition algorithm. Unless you know that the car will look the exact same in each picture, then that's pretty much the only way.
If it IS the exact same, then you can just search for the pixel pattern, and get the bounding rectangle, and just set the image border to the inner boundary of the rectangle.
I think that you will never get good results without a real user telling the program what to do. Think of it this way: how should your program decide when there is more than 1 interesting object present (for example: 2 cars)? what if the object you want is actually the mountain in the background? what if nothing of interest is inside the picture, thus nothing to select as the object to crop out? etc, etc...
With that said, if you can make assumptions like: only 1 object will be present, then you can have a go with using image recognition algorithms.
Now that I think of it, I recently attended a lecture about artificial intelligence in robots and robotics research techniques. Their research was about language interaction, evolution, and language recognition, but in order to do that they also needed some simple image recognition algorithms to process the perceived environment. One of the tricks they used was to make a 3D plot of the image where x and y were the normal x and y axes and the z axis was the brightness of that particular point; they then used the same technique for red-green values and for blue-yellow values. And lo and behold, they had something (relatively) easy they could use to pick out objects from the perceived environment.
(I'm terribly sorry, but I can't find a link to the nice charts they had that showed how it all worked).
Anyway, the point is that they were not (that) interested in image recognition itself, so they built something less advanced, and thus less time-consuming, that worked well enough. So it is possible to create something simple for this complex task.
Also, any good image editing program has some kind of magic wand that will select, with the right amount of tweaking, the object of interest you point it at. Maybe it's worth your time to look into that as well.
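To give an idea of what such a magic wand does, here is a generic tolerance-based flood fill. It is a sketch only, not the actual algorithm of any particular editor. It assumes a grayscale image stored as a list of rows, a seed given as (row, column), and 4-connectivity; the function name is made up.

```python
from collections import deque


def magic_wand(image, seed, tolerance):
    """Select the 4-connected region around `seed` whose gray values
    lie within `tolerance` of the seed pixel's value."""
    height, width = len(image), len(image[0])
    target = image[seed[0]][seed[1]]
    selected = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if (0 <= nr < height and 0 <= nc < width
                    and (nr, nc) not in selected
                    and abs(image[nr][nc] - target) <= tolerance):
                selected.add((nr, nc))
                queue.append((nr, nc))
    return selected
```

The "right amount of tweaking" corresponds to the `tolerance` value: too low and only part of the object is selected, too high and the selection leaks into the background.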
So, it basically will mean that you:
have to make some assumptions, otherwise it will fail terribly
will probably best be served with techniques from AI, and more specifically image recognition
can take a look at Paint.NET and the algorithm behind its magic wand
try to use the fact that a good photo will have the object of interest somewhere in the middle of the image
.. but I'm not saying that this is the solution for your problem; maybe something simpler can be used.
Oh, and I will continue to look for those links, they hold some really valuable information about this topic, but I can't promise anything.