Dynamic image creation/manipulation in AppEngine?

I have not been able to track down an answer on this. I'd like to be able to manipulate or create images and then compile them into a video. I'm starting to think this is just not a good fit for GAE. I wanted to do this in Python, but it doesn't look like that is possible without C support. Even with Java I'm seeing conflicting information about what is possible.
Does anyone know for sure if there are any fully supported image libraries for Python or Java?

You're right - anything that involves heavy image manipulation isn't a good fit for App Engine - especially video encoding. Consider writing a service that does this on something such as EC2, and calling it from App Engine when needed.

Related

How to detect location/place type from the image?

I have a web application where users upload images of their locations. I want to write a program that detects the type of location and a list of objects from each image. I wrote a program in C# using Alturos YOLO to detect objects in the image. The results are fine for me, but the problem is that I also want to detect the place type from the image. For example, if you upload an image that has snow in it, it should detect the keyword "Snow". If you upload an image of a lake, it should show keywords like "Lake, water, river" etc. I am a web developer and have never done any machine learning or image processing, but I am keen to learn. Is there any way to do this, or can anyone point me to the right path?
I found this "https://www.clarifai.com/" but I want to write my own code because I have a large number of images.
All in all, I'm pretty sure that there's no single correct answer to this. You could implement image recognition in a hundred different equally correct ways using different tools. So here's my opinionated perspective. Anyone and everyone is free to agree/disagree with what I'm saying.
I've worked a bit with OpenCV (Python) in the past. There are a great number of libraries available based on it, so you can probably find a working base to build off of. I think that it should be capable of doing the task you specify, although I'm not quite sure how it would be done.
The other framework for machine learning and object recognition that I have seen is Apple's Create ML/ Core ML system (Swift or Objective-C). My experience with that one is as limited as cloning a git repo and poking around inside, but it looks pretty powerful.

interactive Augmented Reality 3D drawer

I'm planning on doing an interactive AR application that will use a laser sensor (for distances), GPS technology to get a location, and a compass/gyroscope for tracking 6DOF viewfinder movements. The user can choose from a number of ready-made 3D models, and should be able to place them by selecting the desired location on the screen.
My target platform will be an 8" handheld device running Windows 8.
Any hints what would be the best AR-SDK or 3D-viewer to work with?
thanks in advance!
There are quite a few 3D viewers that work in the browser. The most recent and most notable is the va3C viewer.
It is a WebGL-based app and doesn't require a server, so if your handheld device supports WebGL, then you are good to go. However, whether it works on IE or not is questionable ;).
Based on my experience and your use case, though, I believe client-side JS libraries do not provide enough access to the device's hardware. So you might have to serve information like GPS and gyroscope readings from the server side, gather it on the client using something like socket.io, and then mash it up alongside the geometry.
I am trying to do something similar, although I haven't quite done it yet. Will keep you posted.
Another approach I am exploring is X3DOM, which gives you the ability to write 3D data like XML alongside HTML, which is quite declarative and simple to pick up. X3DOM derives from X3D.
Tell me if you need more info.
Also, worth exploring for its motion abilities, is Robot Studio, which is a desktop app with SDK.

server-side fallback rendering

Is there any way to have three.js running server-side on a headless server (standalone server, Amazon AWS or similar)?
Currently I fall back to canvas rendering (wireframe only for performance reasons) when user's browser does not support WebGL. This is good enough for realtime interaction, but for the app to make sense, users would really need to somehow be able to see a properly rendered version with lights, shadows, post processing etc. even if it comes with great latency.
So... would it be possible to create a server-side service with a functional three.js instance? The client would still use three.js canvas wireframe rendering, but after say... a second of inactivity, it would request a full render from the server-side service via AJAX, and simply overlay it as an image.
Are there currently any applications, libraries or anything that would allow such a thing (functional javascript+webgl+three.js on a headless, preferably linux server, and GPU-less at that)?
PhantomJS comes to mind, but apparently it does not yet support WebGL: http://code.google.com/p/phantomjs/issues/detail?id=273
Or any alternative approaches to the problem? Going the route of programmatically controlling a full desktop machine with a GPU and standard chrome/firefox instance feels possible, while fragile, and I really really wouldn't want to go there if there are any software-only solutions.
In its QA infrastructure, Google can run Chromium testing using Mesa (see issue 97675, via the switch --use-gl=osmesa). The software rasterizer in the latest edition of Mesa is pretty advanced, involving the use of LLVM to convert the shaders and emulate the execution on the CPU. Your first adventure could be building Mesa, building Chromium, and then trying to tie them together.
As a side note, this is also what I plan (in the near future) for PhantomJS itself, in particular since Qt is also moving in that direction, i.e. using Mesa/LLVMpipe instead of its own raster engine only. The numbers actually look good. Even better, for an offline, non-animated single-shot capture, the performance would be more than satisfactory.
There is some input in this thread: https://github.com/mrdoob/three.js/issues/2182
In particular, this demo shows how to generate some images on the server side using Node.js.
Thanks,
Nico
The links below will not resolve your problem with AWS but will give you a hint.
I am working on an application with a similar architecture and came across these examples:
Multiplayer game with realtime socket.io
My original question on similar architecture

Is there any image comparison server software out there, made from something like OpenCV (Windows or Mac)?

Is there any image comparison server software out there, made from something like OpenCV (Windows or Mac)? I'm looking to make an in-house image recognition server for an internal project and I need to know if there are any options out there.
Most of the ones I see available are web-based APIs that cost monthly fees. I'd like to set something up internally instead, both for quicker speeds and lower costs.
If not, what is recommended as the best way to set something like this up?
Check this out
http://www.abbyy.com/recognition_server/product_overview/ - Product Overview
http://www.cvisiontech.com/products/general/maestro-recognition-server.html
Also this article might be helpful
https://www.google.com/enterprise/marketplace/viewListing?productListingId=6096210+10692120271328191677&pli=1
We have included a binary image classifier with the open library SimpleCV:
http://www.simplecv.org
Here is a video of what I'm talking about:
http://www.youtube.com/watch?v=cH5e-ZkJa0U
You could use that example to start to build something up. It doesn't work as an image search out of the box, but you could probably easily modify it to do what you want.

How to implement a voice changer?

I want to write an app which changes the voice from the microphone input and makes it sound like a robot or some funny man's voice. It must support sending the changed voice to all applications, like IM software or a game client. Which technology should I pick up? The Windows Waveform API? DirectX? An audio driver?
Thank you very much!
There's an MSDN Coding4Fun article that explains how to create a voice changer that operates over Skype, in C# (.NET). The full source code is also hosted as a project on CodePlex. In addition, it should be fairly easy to do something else with the audio (as opposed to streaming it via Skype), since the project is based around the NAudio framework, which contains a good level of abstraction. Anyway, it is a reasonably complete (and stable) example - definitely worth checking out in my opinion.
If you want/need to use C++ or some other language for development, then this project should at least give you some ideas about how to go about it. Still, if you can use .NET, then you're in luck I think.
Robot voice is often done with a ring modulator effect, which multiplies the voice signal by a sine wave - this is the easier option. Or use a vocoder effect, modulating the voice onto some other waveform, such as a rectangle wave - this might be a bit more tricky. Go read up on how the effects work, and get a program with which you can check out how they sound (Audacity works for the ring modulator; finding and using a vocoder may be a bit harder). Then read how it's done or get a library which will do the processing for you.
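To make the ring modulator idea concrete, here is a minimal sketch in plain Python. It assumes the voice is already available as a list of float samples; the function name `ring_modulate` and the default carrier frequency of 30 Hz are illustrative choices of mine, not something taken from a particular library.

```python
import math


def ring_modulate(samples, sample_rate, carrier_hz=30.0):
    """Multiply each input sample by a sine-wave carrier (ring modulation).

    samples     -- sequence of numeric audio samples (hypothetical input)
    sample_rate -- samples per second of the input signal
    carrier_hz  -- carrier frequency; an assumed default, tune it by ear
    """
    out = []
    for n, sample in enumerate(samples):
        # Value of the sine carrier at the time of sample n.
        carrier = math.sin(2.0 * math.pi * carrier_hz * n / sample_rate)
        out.append(sample * carrier)
    return out
```

The carrier frequency largely determines the character of the effect, so it is worth trying several values on a recorded clip before committing to one.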
You are looking to support VSTi or DXi plugins.
There are tons that also act as vocoders, even for free.
You just need to write the host application.
Take a look here :)
Now that's a neat idea, especially for a mobile app.
I'd probably start off-line by using a .wav file as input to get the effects working the way I wanted. You can use any high level language for this, but you probably want something that will map reasonably well into C/C++.
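As a rough sketch of that offline workflow, the following uses Python's standard `wave` and `struct` modules to read a WAV file, run an effect over the samples, and write the result. It assumes 16-bit mono PCM input; `process_wav` and the placeholder effect `half_volume` are hypothetical names used only for illustration.

```python
import struct
import wave


def process_wav(in_path, out_path, effect):
    """Read a 16-bit mono WAV, apply effect(samples, sample_rate), write it out."""
    with wave.open(in_path, "rb") as src:
        params = src.getparams()
        if params.sampwidth != 2 or params.nchannels != 1:
            raise ValueError("this sketch only handles 16-bit mono WAV files")
        raw = src.readframes(params.nframes)

    # Unpack little-endian signed 16-bit samples into Python ints.
    samples = list(struct.unpack("<%dh" % (len(raw) // 2), raw))
    processed = effect(samples, params.framerate)

    # Round and clamp back into the 16-bit range before writing.
    clamped = [max(-32768, min(32767, int(round(v)))) for v in processed]
    with wave.open(out_path, "wb") as dst:
        dst.setnchannels(1)
        dst.setsampwidth(2)
        dst.setframerate(params.framerate)
        dst.writeframes(struct.pack("<%dh" % len(clamped), *clamped))


def half_volume(samples, sample_rate):
    """Placeholder effect: halve the amplitude of every sample."""
    return [s * 0.5 for s in samples]
```

Calling `process_wav("in.wav", "out.wav", half_volume)` (file names are placeholders) would write a quieter copy; swapping in a different effect function keeps the file handling unchanged, which also makes a later port to C or C++ easier to compare against.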
In terms of a production version, I'd go native and do this in C or C++. You want something fast for real time audio processing & I like to avoid dependencies on things like .net for distribution. (Not that I have anything against .net, it's great for servers and distribution within a company but I'm not so keen on having it as a dependency for shrink wrap software.)
Windows DirectShow would be a tempting option - you could do some interesting effects with multimedia as well if you had the voice morpher implemented as a DirectShow filter.
What you're looking for is a vocoder. I don't know if any of the technologies listed above has a vocoder effect, but the best chance would be with DirectX.
Try this sample app. I think it's useful to you. Link
