I am trying to perform operations like rgb2gray(img) on live video read using vid = videoinput(), i.e. calling rgb2gray(vid).
This gives a type-mismatch error and I am stuck here. Should I convert vid to some image format and store it in a matrix, or is there another way to apply rgb2gray? I don't want to use vid.ReturnedColorSpace = 'grayscale', because I need to convert the video into images or a matrix and perform the rgb2gray operation myself.
In your code, vid is a videoinput object that lets you capture frames from a camera; you cannot pass it to rgb2gray directly. What you can do is grab the frames one at a time in a loop (for example with getsnapshot) and pass each frame to rgb2gray individually.
I'm trying to use FFmpeg (3.4) with the d3d11va hwaccel to decode multiple RTSP streams for video surveillance. I want to decode the frames into an existing ID3D11Texture2D texture directly, without ID3D11DeviceContext->Map/Unmap operations.
I have looked at examples/hw_decode.c; that sample uses av_hwframe_transfer_data to copy data from the GPU to the CPU, but I want to decode frames into an existing ID3D11Texture2D texture directly, or at least copy them into an existing ID3D11Texture2D texture.
How can I do that? Thanks.
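One possible direction, sketched below under the assumption that the decoder was opened with a D3D11VA hardware device context so decoded frames come back as AV_PIX_FMT_D3D11: the frame's data[0]/data[1] fields reference the decoder's texture array, which can be copied GPU-to-GPU into your own texture with CopySubresourceRegion. The function name and surrounding setup are hypothetical, and both textures must belong to the same D3D11 device.

```cpp
// Hypothetical sketch: copy a decoded AV_PIX_FMT_D3D11 frame into an
// existing ID3D11Texture2D with CopySubresourceRegion (no Map/Unmap).
// `frame`, `device_ctx`, and `dst_texture` are assumed to already exist.
#include <cstdint>
#include <d3d11.h>
extern "C" {
#include <libavutil/frame.h>
#include <libavutil/pixfmt.h>
}

void copy_decoded_frame(AVFrame* frame,
                        ID3D11DeviceContext* device_ctx,
                        ID3D11Texture2D* dst_texture)
{
    if (frame->format != AV_PIX_FMT_D3D11)
        return; // not a GPU frame; av_hwframe_transfer_data would be needed instead

    // For AV_PIX_FMT_D3D11, data[0] is the decoder's texture (array) and
    // data[1] is the array slice index of this frame within that texture.
    ID3D11Texture2D* src_texture =
        reinterpret_cast<ID3D11Texture2D*>(frame->data[0]);
    UINT src_slice =
        static_cast<UINT>(reinterpret_cast<intptr_t>(frame->data[1]));

    // GPU-to-GPU copy into the destination texture (subresource 0 here;
    // with one mip level the subresource index equals the array slice).
    device_ctx->CopySubresourceRegion(dst_texture, 0, 0, 0, 0,
                                      src_texture, src_slice, nullptr);
}
```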
In trying to understand how to convert Media Foundation RGB32 data into bitmap data that can be loaded into image/bitmap widgets or saved as a bitmap file, I am wondering what the RGB32 data actually is compared to the data a BMP holds.
Is it simply missing the header information, or key information a bitmap file has such as width, height, etc.?
What does RGB32 actually mean compared to BMP data in a bitmap file or memory stream?
You normally have 32-bit RGB as an IMFMediaBuffer attached to an IMFSample. This is just the bitmap bits, without format-specific metadata. You can access the data by obtaining the media buffer pointer, for example by calling IMFSample::ConvertToContiguousBuffer and then IMFMediaBuffer::Lock to get a pointer to the pixel data.
The obtained buffer is compatible with the data in a standard .BMP file (except that the rows may sometimes be in reverse order); a .BMP file simply has a header before this data. A .BMP file normally contains a BITMAPFILEHEADER structure, then a BITMAPINFOHEADER, and then the buffer in question. If you write them one after another, initialized appropriately, you get a valid picture file. This and other questions here show how to create a .BMP file from bitmap bits.
See this GitHub code snippet, which is really close to the requested task and might be a good starting point.
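For illustration, here is a minimal sketch of the idea described above, assuming an RGB32 IMFSample whose width and height are already known from the media type; WriteSampleAsBmp is a made-up helper name and error handling is kept to a minimum.

```cpp
// Minimal sketch: prepend BITMAPFILEHEADER/BITMAPINFOHEADER to the raw
// 32-bit RGB bits of an IMFSample and write them out as a .BMP file.
#include <windows.h>
#include <mfapi.h>
#include <mfidl.h>
#include <cstdio>

HRESULT WriteSampleAsBmp(IMFSample* sample, LONG width, LONG height,
                         const wchar_t* path)
{
    IMFMediaBuffer* buffer = nullptr;
    HRESULT hr = sample->ConvertToContiguousBuffer(&buffer);
    if (FAILED(hr)) return hr;

    BYTE* data = nullptr;
    DWORD length = 0;
    hr = buffer->Lock(&data, nullptr, &length);
    if (FAILED(hr)) { buffer->Release(); return hr; }

    BITMAPINFOHEADER bih = {};
    bih.biSize = sizeof(bih);
    bih.biWidth = width;
    bih.biHeight = height;       // positive = bottom-up rows; if the frame
                                 // is top-down, use -height or flip the rows
    bih.biPlanes = 1;
    bih.biBitCount = 32;
    bih.biCompression = BI_RGB;

    BITMAPFILEHEADER bfh = {};
    bfh.bfType = 0x4D42;         // 'BM'
    bfh.bfOffBits = sizeof(bfh) + sizeof(bih);
    bfh.bfSize = bfh.bfOffBits + length;

    FILE* f = _wfopen(path, L"wb");
    if (f) {
        fwrite(&bfh, sizeof(bfh), 1, f);
        fwrite(&bih, sizeof(bih), 1, f);
        fwrite(data, 1, length, f);   // the raw RGB32 bits from the sample
        fclose(f);
    }

    buffer->Unlock();
    buffer->Release();
    return f ? S_OK : E_FAIL;
}
```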
I am trying to test a variety of camera inputs against an application, but since it would be prohibitive to get the webcam to do exactly the same thing every time, and to change lenses, I would like to just shoot videos and use them as the input.
I can see how to query OS X for AVCapture devices, but is it possible to create one and register it with the system while feeding it frames from a saved video file?
I captured a raw audio data stream together with its WAVEFORMATEXTENSIBLE struct.
The WAVEFORMATEXTENSIBLE is shown in the figure below:
Following the standard WAV file layout, I tried to write the raw bits into a WAV file.
What I do is (a rough sketch of these steps follows the list):
write "RIFF".
write a DWORD. (filesize - sizeof("RIFF") - sizeof(DWORD)).
=== WaveFormat Chunk ===
write "WAVEfmt "
write a DWORD. (size of the WAVEFORMATEXTENSIBLE struct)
write the WAVEFORMATEXTENSIBLE struct.
=== Fact Chunk ===
write "fact"
write a DWORD. ( 4 )
write a DWORD. ( num of samples in the stream, which should be sizeof(rawdata)*8/wBitsPerSample ).
=== Data Chunk ===
write "data"
write a DWORD (size of rawdata)
write the raw data.
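For reference, here is a rough C++ sketch that mirrors the steps listed above, assuming the captured bytes are in rawData and the captured format is in wfx; it reproduces the layout as described rather than being a verified implementation.

```cpp
// Rough sketch of the chunk layout described above: RIFF header, fmt chunk
// carrying the captured WAVEFORMATEXTENSIBLE, fact chunk, then the data chunk.
#include <windows.h>
#include <mmreg.h>
#include <cstdio>
#include <vector>

void WriteWav(const wchar_t* path, const WAVEFORMATEXTENSIBLE& wfx,
              const std::vector<BYTE>& rawData)
{
    FILE* f = _wfopen(path, L"wb");
    if (!f) return;

    DWORD fmtSize  = sizeof(WAVEFORMATEXTENSIBLE);
    // Number of samples as described in the fact-chunk step above.
    DWORD factVal  = static_cast<DWORD>(rawData.size()) * 8 / wfx.Format.wBitsPerSample;
    DWORD dataSize = static_cast<DWORD>(rawData.size());
    // RIFF chunk size = everything after "RIFF" and the size field itself.
    DWORD riffSize = 4 /*"WAVE"*/ + 8 + fmtSize + 8 + 4 /*fact*/ + 8 + dataSize;

    fwrite("RIFF", 1, 4, f); fwrite(&riffSize, 4, 1, f);
    fwrite("WAVE", 1, 4, f);

    fwrite("fmt ", 1, 4, f); fwrite(&fmtSize, 4, 1, f);
    fwrite(&wfx, fmtSize, 1, f);          // the captured WAVEFORMATEXTENSIBLE

    fwrite("fact", 1, 4, f);
    DWORD four = 4;
    fwrite(&four, 4, 1, f); fwrite(&factVal, 4, 1, f);

    fwrite("data", 1, 4, f); fwrite(&dataSize, 4, 1, f);
    fwrite(rawData.data(), 1, dataSize, f);

    fclose(f);
}
```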
After producing the WAV file with the above steps, playing it with Media Player gives no sound, and playing it with Audacity gives a distorted sound: I can hear that it is the correct audio, but it is buried in noise.
The raw data can be found here.
The WAV file I generate is here.
It is very confusing to me, because when I use the same method to convert IEEE-float data to a WAV file, it works just fine.
I figured this out: it seems the GetBuffer/ReleaseBuffer cycle on IAudioRenderClient delivers raw data in the same format as the one passed into the Initialize method of the IAudioClient.
GetMixFormat on the IAudioClient in my case returns a different format from the one passed into Initialize; I think GetMixFormat returns the format that the device itself supports.
IAudioClient must therefore be converting from the initialized format to the mix format. I intercepted the Initialize method, read the format passed to it, and it works like a charm.
I'm intercepting WASAPI to access the audio data and faced the exact same issue: the audio file generated from the data sounds like the correct content but is very noisy, even though the frame rate, sample width, number of channels, etc. are set properly.
The SubFormat field of WAVEFORMATEXTENSIBLE shows that the data is actually KSDATAFORMAT_SUBTYPE_IEEE_FLOAT, while I had originally treated it as integers. According to this page, KSDATAFORMAT_SUBTYPE_IEEE_FLOAT is equivalent to WAVE_FORMAT_IEEE_FLOAT in WAVEFORMATEX. Hence, setting the "audio format" field in the WAV file's fmt chunk (normally starting at byte offset 20) to WAVE_FORMAT_IEEE_FLOAT (which is 3) solved the problem. Remember to write it in little-endian order.
Original value of audio format
After modification
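A minimal sketch of that check, assuming the captured WAVEFORMATEXTENSIBLE is available as wfx (FormatTagFor is a made-up helper name):

```cpp
// If the captured WAVEFORMATEXTENSIBLE carries IEEE-float samples, the fmt
// chunk's format tag (2 bytes at offset 20 of the file) must be 3
// (WAVE_FORMAT_IEEE_FLOAT) rather than 1 (WAVE_FORMAT_PCM).
#include <windows.h>
#include <mmreg.h>
#include <ks.h>
#include <ksmedia.h>   // KSDATAFORMAT_SUBTYPE_IEEE_FLOAT / _PCM

WORD FormatTagFor(const WAVEFORMATEXTENSIBLE& wfx)
{
    if (IsEqualGUID(wfx.SubFormat, KSDATAFORMAT_SUBTYPE_IEEE_FLOAT))
        return WAVE_FORMAT_IEEE_FLOAT;   // 3: floating-point samples
    if (IsEqualGUID(wfx.SubFormat, KSDATAFORMAT_SUBTYPE_PCM))
        return WAVE_FORMAT_PCM;          // 1: integer PCM
    return wfx.Format.wFormatTag;        // leave anything else untouched
}
```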
As you may know, when you record a video on a Windows Phone it is saved as an .mp4. I want to be able to access the video file (even if it's only stored in the app's isolated storage) and manipulate the pixel values of each frame.
I can't find anything that allows me to load an .mp4 into an app and then access its frames. I also want to be able to save the manipulated video as an .mp4 file, or to share it.
Has anyone figured out a good set of steps to do this?
My guess was to first load the .mp4 file into a Stream object. From there I don't know exactly what I can do, but I want to get it into a form where I can iterate through the frames, manipulate the pixels, and then create an .mp4 with the audio again once the manipulation is complete.
I tried to do exactly the same thing once. Unfortunately, there are no publicly available libraries that will help you with this; you will have to write your own code.
The way to go about it is to first read up on the mp4 storage format and figure out how the frames are stored. You can then read the mp4, extract the frames, modify them, and stitch them back together in the original format.
My biggest concern is that the hardware might not be powerful enough to do this in a sufficiently short amount of time.