I'm working on a raytracing algorithm and I also want to make changes to a sound file.
Can you help me find a library or class (for C++, alongside OpenGL) for processing sound in WAV files, or other formats (MP3, ...)?
Thanks...
OpenGL is for graphics rendering. It can be used for certain types of generic processing, but sound processing is done on the CPU for various reasons, including the floating-point capabilities required.
For sound processing, you can use Audacity (http://audacity.sourceforge.net/)
I'm working on a project which needs to convert sports commentary to text. For that I have already used the Microsoft System Speech library. It works fine when there is no background noise. Can anyone tell me a way of removing this background noise from the given audio file using an ffmpeg-like tool or in some other programmatic way?
For better accuracy in a case like this, it is better to use a more specialized solution like CMUSphinx.
It helps in several ways: you can configure the decoder vocabulary so it correctly recognizes sports terms and expressions.
You can also rely on noise-robust speech recognition to deal with background noise. External noise cleanup is actually quite harmful to speech recognition accuracy and is not recommended. Even a simple processing algorithm like vuvuzela denoising in Matlab is better applied inside the decoder, not as a separate preprocessing step.
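To give a rough idea of what that looks like in code, here is a minimal sketch of feeding raw audio to pocketsphinx (CMUSphinx's C decoder) from C++. The model and file paths are placeholders, and the exact function signatures differ slightly between pocketsphinx releases, so treat this as an outline rather than a drop-in implementation.

    #include <pocketsphinx.h>
    #include <cstdio>

    int main()
    {
        // Placeholder model paths: point these at your acoustic model, a language
        // model (possibly built from your sports vocabulary) and a dictionary.
        cmd_ln_t *config = cmd_ln_init(NULL, ps_args(), TRUE,
            "-hmm",  "model/en-us/en-us",
            "-lm",   "model/en-us/en-us.lm.bin",
            "-dict", "model/en-us/cmudict-en-us.dict",
            NULL);
        ps_decoder_t *ps = ps_init(config);

        // 16 kHz, 16-bit mono PCM commentary audio (placeholder file name).
        FILE *fh = fopen("commentary.raw", "rb");
        int16 buf[512];
        size_t nread;
        ps_start_utt(ps);
        while ((nread = fread(buf, sizeof(int16), 512, fh)) > 0)
            ps_process_raw(ps, buf, nread, FALSE, FALSE);
        ps_end_utt(ps);

        int32 score;
        const char *hyp = ps_get_hyp(ps, &score);
        printf("Recognized: %s\n", hyp ? hyp : "(nothing)");

        fclose(fh);
        ps_free(ps);
        cmd_ln_free_r(config);
        return 0;
    }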
Is there some way to store a "scene" in Direct2D on the GPU?
I'm looking for something like ID2D1Mesh (i.e. storing the resource in vector format, not as a bitmap) but where I can configure if the mesh/scene/resource should be rendered with anti-aliasing or not.
Rick is correct in that you can apply antialiasing at two different levels. Either through Direct2D or through Direct3D. You can do both but that’s pointless and would only waste resources and lead to poor results. Direct2D antialiasing is suitable if you want per-primitive geometry-aware antialiasing. Direct3D antialiasing is useful if you want to sacrifice a bit of quality for better overall performance in some scenarios.
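As a rough illustration of where the two knobs live (the render target and swap chain description variables here are placeholders, not your code): Direct2D's per-primitive antialiasing is toggled on the render target, while Direct3D multisampling is requested when the surface or swap chain is created.

    // Direct2D: per-primitive, geometry-aware antialiasing on the render target.
    d2dRenderTarget->SetAntialiasMode(D2D1_ANTIALIAS_MODE_PER_PRIMITIVE);
    // ...or turn it off entirely:
    d2dRenderTarget->SetAntialiasMode(D2D1_ANTIALIAS_MODE_ALIASED);

    // Direct3D: multisample antialiasing is a property of the surface itself,
    // chosen when the swap chain is described.
    DXGI_SWAP_CHAIN_DESC scd = {};
    scd.SampleDesc.Count   = 4;   // 4x MSAA
    scd.SampleDesc.Quality = 0;
    // ...fill in the rest of the description and create the swap chain.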
The Direct2D 1.1 command list literally stores/records a list of drawing commands that can be played back against different targets. This may be what you’re after as it’s not rasterized. Conceptually it’s like storing a vector image in device memory. Command lists are somewhat limited in that you cannot modify the command list once created and resources being drawn may also not be changed, but it’s still quite handy nonetheless.
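Here is a minimal sketch of the record/playback pattern, assuming a Direct2D 1.1 device context (deviceContext), a brush, and a target bitmap (backBufferBitmap) have already been created; error handling is omitted.

    #include <d2d1_1.h>
    #include <wrl/client.h>
    using Microsoft::WRL::ComPtr;

    // Record: point the context at a command list and draw into it.
    ComPtr<ID2D1CommandList> commandList;
    deviceContext->CreateCommandList(&commandList);
    deviceContext->SetTarget(commandList.Get());
    deviceContext->BeginDraw();
    deviceContext->DrawRectangle(D2D1::RectF(10.f, 10.f, 200.f, 120.f), brush.Get(), 2.f);
    deviceContext->EndDraw();
    commandList->Close();   // the list is now immutable

    // Play back: a command list is an ID2D1Image, so it can be drawn into any
    // compatible target, repeatedly, without having been rasterized up front.
    deviceContext->SetTarget(backBufferBitmap.Get());
    deviceContext->BeginDraw();
    deviceContext->DrawImage(commandList.Get());
    deviceContext->EndDraw();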
There is a way to get antialiasing with ID2D1Mesh, but it's non-trivial. You have to create the Direct3D device yourself and then use ID2D1Factory::CreateDxgiSurfaceRenderTarget(). This allows you to configure the multisampling/antialiasing settings of the D3D device directly, and then meshes play along just fine (in fact I think you'd just always tell Direct2D to use aliased rendering). I haven't done this myself, but there is an MSDN sample that shows how to do this. It's not for the faint of heart ... and in order to do software rendering you have to initialize a WARP device. It does work, however.
Also, in Direct2D 1.1 (Windows 8, or Windows 7 + Platform Update), you can use the ID2D1CommandList interface for record/playback stuff. I'm not sure if that's implemented as "compile to GPU" (ala mesh), or if it's just macros (record/playback of commands).
In Windows 8.1, Direct2D introduced geometry realizations, which lets you store a tessellated version of the geometry and later render it back with or without anti-aliasing, just like you asked. These are highly recommended over the use of meshes. Command lists, while convenient, don't have the same caching abilities as creating and storing the geometry realizations yourself.
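A sketch of what that looks like with the Windows 8.1 API, assuming an ID2D1DeviceContext1 (deviceContext), a geometry, and a brush already exist; as I understand it, the aliased/antialiased choice follows the context's antialias mode at draw time.

    #include <d2d1_2.h>
    #include <d2d1_2helper.h>
    #include <wrl/client.h>
    using Microsoft::WRL::ComPtr;

    // Create the realization once; the flattening tolerance controls how finely
    // the geometry is tessellated for the intended transform and DPI.
    ComPtr<ID2D1GeometryRealization> realization;
    FLOAT tolerance = D2D1::ComputeFlatteningTolerance(D2D1::Matrix3x2F::Identity());
    deviceContext->CreateFilledGeometryRealization(geometry.Get(), tolerance, &realization);

    // Draw it back later, with or without antialiasing.
    deviceContext->BeginDraw();
    deviceContext->SetAntialiasMode(D2D1_ANTIALIAS_MODE_ALIASED);   // or PER_PRIMITIVE
    deviceContext->DrawGeometryRealization(realization.Get(), brush.Get());
    deviceContext->EndDraw();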
First let me explain the application a little bit. This is video security software that can display up to 48 cameras at once. Each video stream gets its own Windows HDC but they all use a shared OpenGL context. I get pretty good performance with OpenGL and it runs on Windows/Linux/Mac. Under the hood the contexts are created using wxWidgets 2.8 wxGLCanvas, but I don't think that has anything to do with the issue.
Now here's the issue. Say I take the same camera and display it in all 48 of my windows. This basically means I'm only decoding 30fps (which is done on a different thread anyway) but displaying up to 1440fps, which takes decoding out of the picture. I'm using PBOs to transfer the images over; depending on whether pixel shaders and multitexturing are supported, I may use those to do YUV->RGB conversion on the GPU. Then I use a quad to position the texture and call SwapBuffers. All the OpenGL calls come from the UI thread. I've also tried doing the YUV->RGB conversion on the CPU and messed with using GL_RGBA and GL_BGRA textures, but all formats still yield roughly the same performance.
Now the problem is I'm only getting around 1000fps out of the possible 1440fps (I know I shouldn't be measuring in fps, but it's easier in this scenario). The above scenario is using 320x240 (YUV420) video, which is roughly only 110MB/sec. If I use a 1280x720 camera then I get roughly the same framerate, which is nearly 1.3GB/sec. This tells me that it certainly isn't the texture upload speed. If I do the YUV->RGB conversion and scaling on the CPU and paint using a Windows DC, then I can easily get the full 1440fps.
The other thing to mention is that I've disabled vsync both on my video card and through OpenGL using wglSwapIntervalEXT. Also, there are no OpenGL errors being reported. However, profiling the application with Very Sleepy shows it spending most of its time in SwapBuffers. I'm assuming the issue is somehow related to my use of multiple HDCs or to SwapBuffers somewhere; however, I'm not sure how else to do what I'm doing.
I'm no expert on OpenGL, so if anyone has any suggestions I would love to hear them, especially if anything I'm doing sounds wrong or there's a way to achieve the same thing more efficiently.
Here are some links to glIntercept logs for a better understanding of all the OpenGL calls being made:
Simple RGB: https://docs.google.com/open?id=0BzGMib6CGH4TdUdlcTBYMHNTRnM
Shaders YUV: https://docs.google.com/open?id=0BzGMib6CGH4TSDJTZGxDanBwS2M
Profiling Information:
So after profiling, it reported several redundant state changes, which I'm not surprised by. I eliminated all of them and saw no noticeable performance difference, which I kind of expected. I have 34 state changes per render loop and am using several deprecated functions. I'll look into using vertex arrays, which would solve these. However, I'm only drawing one quad per render loop, so I don't really expect much performance impact from this. Also keep in mind I don't want to rip everything out and go all VBOs, because I still need to support some fairly old Intel chipset drivers that I believe are only OpenGL 1.4.
The thing that really interested me, and that hadn't occurred to me before, is that each context has its own front and back buffer. Since I'm only using one context, the previous HDC's render call must finish writing to the back buffer before the swap can occur and the next one can start writing to the back buffer again. Would it really be more efficient to use more than one context? Or should I look into rendering to textures (FBOs, I think) instead and continue using one context?
EDIT: The original description mentioned using multiple OpenGL contexts, but I was wrong; I'm only using one OpenGL context and multiple HDCs.
EDIT2: Added some information after profiling with gDEBugger.
Here's what I would try to make your application faster. I would use one OpenGL render thread (or more if you have two or more video cards). The video card cannot process several contexts at the same time, so multiple OpenGL contexts end up waiting on each other. This thread would do only OpenGL work, like the YUV->RGB conversion (using an FBO to render to texture). The camera threads send images to this thread, and the UI thread can pick the results up to show in the window.
You would have a queue of work to process in the OpenGL context, and you can combine several frames into one texture so they are converted in a single pass. That may be useful, because you have up to 48 cameras. As another variant, if the OpenGL thread is busy at the moment, you can convert some frames on the CPU.
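Not your code, obviously, but here is a rough sketch of the render-to-texture part, assuming OpenGL 3.0 or ARB_framebuffer_object is available (the older Intel chipsets mentioned in the question may only expose EXT_framebuffer_object or nothing at all); width and height are example values.

    const int width = 1280, height = 720;   // size of the converted frame (example values)

    // Create a texture to receive the converted RGB frame.
    GLuint colorTex = 0;
    glGenTextures(1, &colorTex);
    glBindTexture(GL_TEXTURE_2D, colorTex);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, width, height, 0,
                 GL_RGBA, GL_UNSIGNED_BYTE, NULL);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);

    // Attach it to an FBO; everything rendered while the FBO is bound
    // (e.g. the YUV->RGB fullscreen quad) lands in the texture.
    GLuint fbo = 0;
    glGenFramebuffers(1, &fbo);
    glBindFramebuffer(GL_FRAMEBUFFER, fbo);
    glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
                           GL_TEXTURE_2D, colorTex, 0);
    if (glCheckFramebufferStatus(GL_FRAMEBUFFER) != GL_FRAMEBUFFER_COMPLETE) {
        // fall back to CPU conversion
    }

    // ... draw the conversion quad here ...

    glBindFramebuffer(GL_FRAMEBUFFER, 0);   // back to the default framebuffer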
From the log I see you often call the same methods:
glEnable(GL_TEXTURE_2D)
glMatrixMode(GL_TEXTURE)
glLoadIdentity()
glColor4f(1.000000,1.000000,1.000000,1.000000)
You can call these once per context instead of on every render.
If I understand correctly, you use three textures, one for each plane of the YUV image:
glTexSubImage2D(GL_TEXTURE_2D,0,0,0,352,240,GL_LUMINANCE,GL_UNSIGNED_BYTE,00000000)
glTexSubImage2D(GL_TEXTURE_2D,0,0,0,176,120,GL_LUMINANCE,GL_UNSIGNED_BYTE,000000)
glTexSubImage2D(GL_TEXTURE_2D,0,0,0,176,120,GL_LUMINANCE,GL_UNSIGNED_BYTE,00000000)
Try using a single texture and doing the calculation in the shader to fetch the correct YUV values for each pixel. It is possible; I did it in my application.
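For reference, here is a minimal GLSL sketch of the conversion itself, written against the three per-plane luminance textures visible in your log; packing the planes into a single texture, as suggested above, mostly changes the texture-coordinate math, not the color conversion. The coefficients assume full-range BT.601 video.

    // Fragment shader source, kept at GLSL 1.10 for older hardware.
    const char* kYuvToRgbFrag =
        "#version 110\n"
        "uniform sampler2D texY;   // full-resolution luma plane\n"
        "uniform sampler2D texU;   // quarter-resolution chroma planes\n"
        "uniform sampler2D texV;\n"
        "void main()\n"
        "{\n"
        "    float y = texture2D(texY, gl_TexCoord[0].st).r;\n"
        "    float u = texture2D(texU, gl_TexCoord[0].st).r - 0.5;\n"
        "    float v = texture2D(texV, gl_TexCoord[0].st).r - 0.5;\n"
        "    // BT.601 full-range conversion; limited-range sources also need\n"
        "    // the 1.164 * (y - 16.0/255.0) luma scaling.\n"
        "    gl_FragColor = vec4(y + 1.402 * v,\n"
        "                        y - 0.344 * u - 0.714 * v,\n"
        "                        y + 1.772 * u,\n"
        "                        1.0);\n"
        "}\n";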
I need to take a streaming image from a USB / UVC compliant web camera and digitally magnify it. The application is for people with low vision to be able to read.
The application must run on a Mac under OS X.
I am trying to figure out which framework to use. I did find that I can use a CALayer and apply an affine transformation; however, the image is grainy, as you would expect. I need to smooth it out with anti-aliasing or some other method.
I know this is a very general question, but I need to know how to focus my efforts. At the moment I am chasing my tail reading docs, etc.
Does anybody have a suggestion on which OS X frameworks to use, and what algorithms or methods can smooth out grainy magnified images?
Thanks!
You're basically talking about interpolation--resizing images with some intelligent filling in of information rather than just chopping the pixels. There are various methods you could look into, such as "bilinear", "bicubic", etc. In the Apple API you might want to look at CGContextSetInterpolationQuality, which could help with some of the noise you're seeing. OpenCV is a very extensive image processing library which is readily available on OSX and has a C++ API.
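For example, a bicubic (or Lanczos) upscale with OpenCV looks roughly like this; the capture-device index and magnification factor are placeholders.

    #include <opencv2/opencv.hpp>

    int main()
    {
        cv::VideoCapture cam(0);              // first UVC camera on the system
        const double zoom = 4.0;              // magnification factor for the reader

        cv::Mat frame, magnified;
        while (cam.read(frame)) {
            // Bicubic interpolation smooths the blockiness of a plain pixel zoom;
            // cv::INTER_LANCZOS4 is sharper but more expensive.
            cv::resize(frame, magnified, cv::Size(), zoom, zoom, cv::INTER_CUBIC);
            cv::imshow("magnified", magnified);
            if (cv::waitKey(1) == 27) break;  // Esc to quit
        }
        return 0;
    }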
Keep in mind, though, that you will ultimately be limited by the quality of the web cam, and the environment (especially lighting). There are "super resolution" techniques to stitch multiple images together, but they may not be applicable if this is a live-video application.
I can't overemphasize the importance of good lighting for applications like this. If you think about doing an iOS version of this, look at what apps like Turboscan do using the flash and a "best-of-three" technique. It's amazing what high-quality images you can get of printed text that way.
I'm about to start a project that will record and edit audio files, and I'm looking for a good library (preferably Ruby, but will consider anything other than Java or .NET) for on-the-fly visualization of waveforms.
Does anybody know where I should start my search?
That's a lot of data to be streaming into a browser. Flash or Flex charts are probably the only solution that will be memory efficient. JavaScript charting tends to break down for large data sets.
When displaying an audio waveform, you will want to do some sort of data reduction on the original data, because there is usually more data available in an audio file than pixels on the screen. Most audio editors build a separate file (called a peak file or overview file) which stores a subset of the audio data (usually the peaks and valleys of a waveform) for use at different zoom levels. Then as you zoom in past a certain point you start referencing the raw audio data itself.
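A minimal sketch of that reduction step is below (in C++ for illustration, since the library choice is open); the block size is arbitrary here, whereas real editors pick it per zoom level and usually cache the result in a peak file.

    #include <algorithm>
    #include <cstddef>
    #include <vector>

    struct Peak { float min; float max; };

    // Reduce raw samples to one min/max pair per fixed-size block, so that
    // one Peak can be drawn per horizontal pixel at a given zoom level.
    std::vector<Peak> buildPeaks(const std::vector<float>& samples, std::size_t samplesPerPeak)
    {
        std::vector<Peak> peaks;
        for (std::size_t i = 0; i < samples.size(); i += samplesPerPeak) {
            std::size_t end = std::min(samples.size(), i + samplesPerPeak);
            Peak p = { samples[i], samples[i] };
            for (std::size_t j = i; j < end; ++j) {
                p.min = std::min(p.min, samples[j]);
                p.max = std::max(p.max, samples[j]);
            }
            peaks.push_back(p);
        }
        return peaks;
    }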
Here are some good articles on this:
Waveform Display
Build an Audio Waveform Display
As far as source code goes, I would recommend looking through the Audacity source code. Audacity's waveform display is pretty good and most likely does a similar sort of data reduction when rendering the waveforms.
I wrote one:
http://github.com/pangdudu/rude/tree/master/lib/waveform_narray_testing.rb
The other option is generating the waveforms on the server-side with GD or RMagick. But good luck getting RubyGD to compile.
Processing is often used for visualization, and it has a Ruby port:
https://github.com/jashkenas/ruby-processing/wiki