I am looking for a way to programmatically analyze a video feed from an external USB webcam under OS X.
Since I haven't done any low-level programming like this before, I am currently kind of lost on where to start.
How can I access a webcam feed and grab the image data to then process further?
At this point I am just trying to understand the basic concept and am not looking for language-specific solutions.
Any sample code would be highly appreciated.
I'd appreciate it very much if someone could point me in the right direction and help me get started.
Thank you very much in advance!
Use OpenCV.
And check my previous answer on this subject if you are looking for a code example to display the webcam images. It converts the video feed to grayscale and displays the frames in a window:
OpenCV 2.1: Runtime error
If you just want to display the frames, then replace the else block by this:
cvShowImage("Colored video", color_frame);
In case you are wondering how to manipulate the pixels of the frame:
int width = color_frame->width;
int height = color_frame->height;
int bpp = color_frame->nChannels;
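for (int i=0; i < width*height*bpp; i+=bpp)
{
    if (!(i % (width*bpp))) // print empty line for better readability
        std::cout << std::endl;
    // OpenCV stores color frames in B, G, R order, and imageData is a
    // (signed) char*, so cast through unsigned char to get 0..255.
    std::cout << std::dec << "B:" << (int)(unsigned char) color_frame->imageData[i] <<
        " G:" << (int)(unsigned char) color_frame->imageData[i+1] <<
        " R:" << (int)(unsigned char) color_frame->imageData[i+2] << " ";
}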
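Once you have the raw bytes of a frame, processing it is ordinary buffer arithmetic. As a minimal sketch (not OpenCV code; the function name bgrToGray and its parameters are made up for illustration), this converts an interleaved 8-bit BGR buffer to grayscale. It assumes three bytes per pixel and a row stride in bytes that may be larger than width * 3 when rows are padded, which is what IplImage reports as widthStep:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Convert an interleaved 8-bit BGR image to one gray byte per pixel.
// `stride` is the distance in bytes between the starts of two rows.
std::vector<std::uint8_t> bgrToGray(const std::uint8_t* data,
                                    int width, int height,
                                    std::size_t stride) {
    std::vector<std::uint8_t> gray(static_cast<std::size_t>(width) * height);
    for (int y = 0; y < height; ++y) {
        const std::uint8_t* row = data + y * stride;
        for (int x = 0; x < width; ++x) {
            const int b = row[3 * x + 0];
            const int g = row[3 * x + 1];
            const int r = row[3 * x + 2];
            // Integer approximation of 0.299 R + 0.587 G + 0.114 B.
            gray[static_cast<std::size_t>(y) * width + x] =
                static_cast<std::uint8_t>((299 * r + 587 * g + 114 * b) / 1000);
        }
    }
    return gray;
}
```

With an IplImage, the call would presumably look like bgrToGray((const std::uint8_t*) color_frame->imageData, width, height, color_frame->widthStep).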
For quick access to a webcam and for manipulation of pixel data, you can use Processing with the Video library - the easiest way to start is to check out the examples bundled with the IDE.
Processing is a Java-based visualisation language which is easy to learn and use and works on Windows, Mac OS X and Linux. I found the webcam stuff worked out of the box on my MacBook.
Here is an example script (based on an example bundled in the IDE) which loads a webcam feed and renders the pixels in greyscale.
import processing.video.*;
int numPixels;
Capture video;
void setup() {
  // Change size to 320 x 240 if too slow at 640 x 480
  size(640, 480, P2D);
  video = new Capture(this, width, height, 24);
  numPixels = video.width * video.height;
  // Make the pixels[] array available for direct manipulation
  loadPixels();
}

void draw() {
  if (video.available()) {
    video.read(); // Read a new video frame
    video.loadPixels(); // Make the pixels of video available
    for (int i = 0; i < numPixels; i++) { // For each pixel in the video frame...
      // Make all the pixels grey if mouse is pressed
      if (mousePressed) {
        float greyVal = brightness(video.pixels[i]);
        pixels[i] = color(greyVal);
      } else {
        // If mouse not pressed, show normal video
        pixels[i] = video.pixels[i];
      }
    }
    updatePixels(); // Notify that the pixels[] array has changed
  }
}
Moreover, there is a great interface to OpenCV which can be used for edge detection etc.
When I capture video from the camera on an Intel Mac and use VideoToolbox to hardware-encode the raw pixel buffers into H.264 slices, the encoded I-frames are not clear, so the picture looks blurry every few seconds. These are the properties I set:
self.bitrate = 1000000;
self.frameRate = 20;
int interval_second = 2;
NSDictionary *compressionProperties = @{
    (id)kVTCompressionPropertyKey_ProfileLevel: (id)kVTProfileLevel_H264_High_AutoLevel,
    (id)kVTCompressionPropertyKey_RealTime: @YES,
    (id)kVTCompressionPropertyKey_AllowFrameReordering: @NO,
    (id)kVTCompressionPropertyKey_H264EntropyMode: (id)kVTH264EntropyMode_CABAC,
    (id)kVTCompressionPropertyKey_PixelTransferProperties: @{
        (id)kVTPixelTransferPropertyKey_ScalingMode: (id)kVTScalingMode_Trim,
    },
    (id)kVTCompressionPropertyKey_AverageBitRate: @(self.bitrate),
    (id)kVTCompressionPropertyKey_ExpectedFrameRate: @(self.frameRate),
    (id)kVTCompressionPropertyKey_MaxKeyFrameInterval: @(self.frameRate * interval_second),
    (id)kVTCompressionPropertyKey_MaxKeyFrameIntervalDuration: @(interval_second),
    (id)kVTCompressionPropertyKey_DataRateLimits: @[@(self.bitrate / 8), @1.0],
};
result = VTSessionSetProperties(self.compressionSession, (CFDictionaryRef)compressionProperties);
if (result != noErr) {
    NSLog(@"VTSessionSetProperties failed: %d", (int)result);
} else {
    NSLog(@"VTSessionSetProperties succeeded");
}
These are very strange compression settings. Do you really need short GOP and very strict data rate limits?
I very much suspect you just copied some code off the internet without having any idea what it does. If that's the case, just set interval_second = 300 and remove kVTCompressionPropertyKey_DataRateLimits completely.
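One way to see why these settings are so strict is to work out the byte budget they imply. The sketch below is plain arithmetic, not VideoToolbox code, and both function names are made up for illustration. It assumes, as the question's settings suggest, that the data rate limit is a hard cap given as a byte count per window of seconds:

```cpp
#include <cstdint>

// Average number of bytes available per frame at a given bitrate.
std::int64_t averageBytesPerFrame(std::int64_t bitsPerSecond,
                                  std::int64_t framesPerSecond) {
    return bitsPerSecond / 8 / framesPerSecond;
}

// Bytes left over for bursts (such as keyframes) when a hard cap of
// `limitBytes` applies to every window of `windowSeconds`.
// Zero means the cap equals the average rate, so there is no headroom.
std::int64_t burstHeadroomBytes(std::int64_t bitsPerSecond,
                                std::int64_t limitBytes,
                                std::int64_t windowSeconds) {
    return limitBytes - (bitsPerSecond / 8) * windowSeconds;
}
```

With the question's values (1,000,000 bit/s, 20 fps, a limit of bitrate / 8 bytes per second), the average budget is 6250 bytes per frame and the headroom is 0 bytes. A keyframe every 40 frames then has to fit into roughly the same budget as the frames around it, which would be consistent with the periodic blur described.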
I'm currently doing some tests to see if my app runs correctly on Retina Macs. I have installed Quartz Debug for this purpose and I'm currently running a scaled mode. My screen mode is now 960x540 but of course the physical size of the monitor is still Full HD, i.e. 1920x1080 pixels.
When querying the monitor database using CGGetActiveDisplayList() and then using CGDisplayBounds() on the individual monitors in the list, the returned monitor size is 960x540. This is what I expected because CGDisplayBounds() is said to use the global display coordinate space, not pixels.
To my surprise, however, CGDisplayPixelsWide() and CGDisplayPixelsHigh() also return 960x540, although they're explicitly said to return pixels so I'd expect them to return 1920x1080 instead. But they don't.
This leaves me wondering how I can retrieve the real physical resolution of the monitor instead of the scaled mode using the CGDisplay APIs, i.e. 1920x1080 instead of 960x540 in my case. Is there any way to get the scaling coefficient for a CGDisplay so that I can compute the real physical resolution on my own?
I know I can get this scaling coefficient using the backingScaleFactor method, but that is only available for NSScreen. How can I get the scaling coefficient for a CGDisplay?
You need to examine the mode of the display, not just the display itself. Use CGDisplayCopyDisplayMode() and then CGDisplayModeGetPixelWidth() and CGDisplayModeGetPixelHeight(). These last two are relatively new functions, and their documentation exists primarily in the headers.
And, of course, don't forget to CGDisplayModeRelease() the mode object.
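A minimal sketch of that sequence follows. The names PixelSize, backingScale and currentModePixelSize are made up for illustration, and the CoreGraphics part is guarded so that it only compiles on Apple platforms. The scale factor is derived here by dividing the mode's pixel width by its point width from CGDisplayModeGetWidth(), which is an assumption about how you want to define it:

```cpp
#include <cstddef>
#ifdef __APPLE__
#include <CoreGraphics/CoreGraphics.h>
#endif

struct PixelSize {
    std::size_t width;
    std::size_t height;
};

// Ratio of physical pixels to points; returns 0 if the point width is 0.
double backingScale(std::size_t pixelWidth, std::size_t pointWidth) {
    if (pointWidth == 0) return 0.0;
    return static_cast<double>(pixelWidth) / static_cast<double>(pointWidth);
}

#ifdef __APPLE__
// Pixel dimensions of the display's current mode, or {0, 0} on failure.
PixelSize currentModePixelSize(CGDirectDisplayID display) {
    PixelSize size = {0, 0};
    CGDisplayModeRef mode = CGDisplayCopyDisplayMode(display);
    if (mode != NULL) {
        size.width = CGDisplayModeGetPixelWidth(mode);
        size.height = CGDisplayModeGetPixelHeight(mode);
        CGDisplayModeRelease(mode);  // we own the copied mode
    }
    return size;
}
#endif
```

For the scaled mode in the question, backingScale(1920, 960) gives 2.0.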
From Ken's answer it is not obvious how you find the native mode(s). To do this, call CGDisplayModeGetIOFlags and choose from the modes that have the kDisplayModeNativeFlag set (see IOKit/IOGraphicsTypes.h, the value is 0x02000000).
const int kFlagNativeMode = 0x2000000; // see IOGraphicsTypes.h
const CGFloat kNoSize = 100000.0;
NSScreen *screen = NSScreen.mainScreen;
NSDictionary *desc = screen.deviceDescription;
unsigned int displayID = [[desc objectForKey:@"NSScreenNumber"] unsignedIntValue];
CGSize displaySizeMM = CGDisplayScreenSize(displayID);
CGSize nativeSize = CGSizeMake(kNoSize, kNoSize);
CFStringRef keys[1] = { kCGDisplayShowDuplicateLowResolutionModes };
CFBooleanRef values[1] = { kCFBooleanTrue };
CFDictionaryRef options = CFDictionaryCreate(kCFAllocatorDefault, (const void**)keys, (const void**)values, 1, &kCFTypeDictionaryKeyCallBacks, &kCFTypeDictionaryValueCallBacks );
CFArrayRef modes = CGDisplayCopyAllDisplayModes(displayID, options);
int n = (int)CFArrayGetCount(modes);
for (int i = 0; i < n; i++) {
    CGDisplayModeRef mode = (CGDisplayModeRef) CFArrayGetValueAtIndex(modes, i);
    if (CGDisplayModeGetIOFlags(mode) & kFlagNativeMode) {
        int w = (int)CGDisplayModeGetWidth(mode);
        // We get both high resolution (screen.backingScaleFactor > 1)
        // and the "low" resolution, in CGFloat units. Since screen.frame
        // is CGFloat units, we want the lowest native resolution.
        if (w < nativeSize.width) {
            nativeSize.width = w;
            nativeSize.height = CGDisplayModeGetHeight(mode);
        }
    }
    // printf("mode: %dx%d %f dpi 0x%x\n", (int)CGDisplayModeGetWidth(mode), (int)CGDisplayModeGetHeight(mode), CGDisplayModeGetWidth(mode) / displaySizeMM.width * 25.4, CGDisplayModeGetIOFlags(mode));
}
// Release the objects obtained from the Create/Copy calls above.
CFRelease(modes);
CFRelease(options);
if (nativeSize.width == kNoSize) {
    nativeSize = screen.frame.size;
}
float scaleFactor = screen.frame.size.width / nativeSize.width;
I use the new 3D reconstruction APIs (MIRA release). I have a problem when I call the Tango3DR_update function. It returns the TANGO_3DR_INVALID code when I set the parameters associated with a camera image (const Tango3DR_ImageBuffer* image, const Tango3DR_Pose* image_pose, const Tango3DR_CameraCalibration* calibration). I have checked my parameters and they seem to be correct. When I call this function without the image parameters, it works properly. Is this a known bug?
Thank you in advance for your answers.
TL;DR: The support library ImageBufferManager has a bug with strides. Set color_image.stride = image_buffer->width; when creating your Tango3DR_ImageBuffer.
I think there are two things:
Image Format
First, you have to make sure to use the TANGO_HAL_PIXEL_FORMAT_YCrCb_420_SP format. You can do that by using the ImageBufferManager from the support library.
ImageBufferManager and strides
Second, there is a catch if you use the support library ImageBufferManager. TangoSupport_getLatestImageBuffer seems to fail to initialize the stride of the returned image (I got 0 and some other very large values), which the 3DR library doesn't like. The original TangoImageBuffer from OnColorAvailable has stride=1280 (=image_width), and forcing that value on the TangoImageBuffer returned from the ImageBufferManager seems to fix the issue. I believe this is a bug in ImageBufferManager.
This means doing
color_image.stride = image_buffer->width;
instead of
color_image.stride = image_buffer->stride;
when creating the Tango3DR_ImageBuffer.
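To illustrate why a bad stride matters, here is a small sketch of how a consumer typically addresses the luma plane of a YCrCb 4:2:0 semi-planar image. This is not Tango code; the helper names are made up, and the sketch assumes one byte per luma sample. With a stride of 0, every row maps to the same bytes, so the image is unusable:

```cpp
#include <cstddef>
#include <cstdint>

// Byte offset of luma sample (x, y) when rows are `stride` bytes apart.
std::size_t lumaOffset(std::uint32_t x, std::uint32_t y, std::uint32_t stride) {
    return static_cast<std::size_t>(y) * stride + x;
}

// A stride can only be right if each row has room for `width` samples.
// This catches 0 and other too-small values, not arbitrarily large ones.
bool strideLooksValid(std::uint32_t width, std::uint32_t stride) {
    return stride >= width;
}

// Fall back to the image width when the reported stride is too small.
std::uint32_t effectiveStride(std::uint32_t width, std::uint32_t stride) {
    return strideLooksValid(width, stride) ? stride : width;
}
```

For a 1280-pixel-wide image, effectiveStride(1280, 0) returns 1280. Because the answer also saw very large garbage values, unconditionally using the width, as done above, is the more robust workaround.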
Full code example
I got it working with the following code in my Render method :
TangoImageBuffer* image_buffer;
ret = TangoSupport_getLatestImageBuffer(
image_buffer_manager_, &image_buffer);
if (ret != TANGO_SUCCESS) {
  LOG(ERROR) << "Error in TangoSupport_getLatestImageBuffer";
}
Tango3DR_ImageBuffer color_image;
color_image.width = image_buffer->width;
color_image.height = image_buffer->height;
// VERY Important - The support library ImageBufferManager seems to have
// a bug where it will always put the stride of the returned buffer
// at 0, which causes 3DR to fail
color_image.stride = image_buffer->width;
color_image.timestamp = image_buffer->timestamp;
color_image.format = (Tango3DR_ImageFormatType)image_buffer->format;
color_image.data = image_buffer->data;
ret = Tango3DR_update(
I am using the ImageBufferManager from the support library, so my OnColorAvailable looks like this:
void SynchronizationApplication::OnColorAvailable(
    const TangoImageBuffer* buffer) {
  if (tango_3dr_enabled_ && tango_3dr_use_color_) {
    TangoErrorType ret = TangoSupport_updateImageBuffer(
        image_buffer_manager_, buffer);
    if (ret != TANGO_SUCCESS) {
      LOG(ERROR) << "Error in TangoSupport_updateImageBuffer";
    }
  }
}
And the image_buffer_manager_ is initialized as follows (the pixel format might be important).
I am copying the calibration as follows:
void CopyCalibrationTangoTo3DR(const TangoCameraIntrinsics& tango,
Tango3DR_CameraCalibration* out) {
out->calibration_type =
out->cx = tango.cx;
out->cy = tango.cy;
memcpy(out->distortion, tango.distortion, sizeof(double) * 5);
out->fx = tango.fx;
out->fy = tango.fy;
out->height = tango.height;
out->width = tango.width;
}
Intro: I am trying to write a program that connects to a FLIR AX5 (GigE Vision) camera and saves images at regular intervals to a pre-specified location on my PC. These images must be 14-bit, because they contain the temperature information. Later I need to process these images using OpenCV to get some meaningful results from the temperature data.
Current position: I can save an image at regular intervals, but the image I get contains 8-bit data instead of 14-bit data. This happens even after I change the PixelFormat to 14 bit and the CMOS and LVDT bit depths to 14 bit. I checked the resulting .bin file in MATLAB and found that the maximum pixel value is 255, which means the image is being stored in 8-bit format. I am using the sample code provided with the eBUS SDK for this job, with some changes for my requirements.
Please help me save the image in a raw format from which I can read the temperature data.
P.S. Relevant code is here.
// If the buffer contains an image, display width and height.
uint32_t lWidth = 0, lHeight = 0;
lType = lBuffer->GetPayloadType();
cout << fixed << setprecision( 1 );
cout << lDoodle[ lDoodleIndex ];
cout << " BlockID: " << uppercase << hex << setfill( '0' ) << setw( 16 ) << lBuffer->GetBlockID();
if (lType == PvPayloadTypeImage)
// Get image specific buffer interface.
PvImage *lImage = lBuffer->GetImage();
// Read width, height.
lWidth = lImage->GetWidth();
lHeight = lImage->GetHeight();
cout << " W: " << dec << lWidth << " H: " << lHeight;
lBuffer->GetImage()->Alloc(lWidth, lHeight, lBuffer->GetImage()->GetPixelType());
if (lBuffer->GetBlockID()%50==0) {
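// Build the name in a std::string: strcat() onto char filename[] = IMAGE_SAVE_LOC
// writes past the end of the array, which is only as large as the literal.
std::string filename = std::string(IMAGE_SAVE_LOC) + std::to_string(lBuffer->GetBlockID());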
Make sure the stream is configured for 14-bit data.
Before creating the PvStream, you have to set PixelFormat to 14 bits. If your PvDevice object is called _pvDevice:
_pvDevice->GetParameters()->SetEnumValue("PixelFormat", PvPixelMono14);
_pvDevice->GetParameters()->SetEnumValue("DigitalOutput", 3);
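To check whether a saved dump really holds 14-bit data, you can look at its size and its largest sample instead of relying on the camera settings. The sketch below is independent of the eBUS SDK; the function names are made up. It assumes that a 14-bit mono pixel is stored unpacked in two bytes, least significant byte first, which you should verify for your setup:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Expected payload size when each pixel occupies whole bytes
// (1 byte for up to 8 bits, 2 bytes for 9 to 16 bits).
std::size_t expectedBytes(std::size_t width, std::size_t height,
                          unsigned bitsPerPixel) {
    const std::size_t bytesPerPixel = (bitsPerPixel + 7) / 8;
    return width * height * bytesPerPixel;
}

// Largest sample in a dump read as little-endian 16-bit values.
// A trailing odd byte, if any, is ignored.
std::uint16_t maxSample16(const std::vector<std::uint8_t>& raw) {
    std::uint16_t maxValue = 0;
    for (std::size_t i = 0; i + 1 < raw.size(); i += 2) {
        const std::uint16_t value =
            static_cast<std::uint16_t>(raw[i] | (raw[i + 1] << 8));
        if (value > maxValue) maxValue = value;
    }
    return maxValue;
}
```

If the file is only width * height bytes long, or the maximum never exceeds 255, the stream was still 8-bit when the image was saved. A genuine 14-bit image can reach values up to 16383.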
I've searched the net and I've searched here. I've found code that I could compile and that works fine, but for some reason my own code won't produce any sound. I'm porting an old game to the PC (Windows), and I'm trying to make it as authentic as possible, so I want to use generated waveforms. I've pretty much copied and pasted the working code (only adding multiple voices), and it still won't work, even though the exact same code for a single voice works fine. I know I'm missing something obvious, but I just cannot figure out what. Any help would be appreciated, thank you.
First, some notes. I was looking for something that would allow me to use the original methodology. The original system used paired bytes for music (the sound effects, only 2 of them, were handled in code): a time byte that counted down every time the routine was called, and a note byte that was played until the time reached zero. This was done by patching into the interrupt vector. Windows doesn't allow that, so I set up a timer routine that accomplishes the same thing. The timer kicks in, updates the display, and then runs the music sequence. I set this up with a defined time so that I only have one place to adjust the timing (to get it as close as possible to the original sequence).
The music is a generated waveform. I've double-checked the math and even examined the generated data in debug mode, and it looks good. The sequence looks good too, but it doesn't actually produce sound.
I tried SDL2 first, but its method of only playing one sound doesn't work for me. Also, unless I make the sample duration extremely short (and the sound produced this way is awful), I can't match the timing: it plays the entire sample through its own interrupt without letting me make adjustments. Blending the 3 voices together when they all run with different timings is also a mess. Most of the other engines I examined work in much the same way: they want to use their own callback interrupt and won't allow me to tweak it appropriately.
This is why I started working with OpenAL. It allows multiple voices (sources) and allows me to set the timings myself. On advice from several forums, I set it up so that the sample lengths are all multiples of full cycles.
Anyway, here's the code.
int main(int argc, char* argv[])
{
    FreeConsole(); //Get rid of the DOS console, don't need it
    if (InitLog() < 0) return -1; //Start logging
    SDL_Event event;
    InitVideo(false); //Set to window for now, will put options in later
    curmusic = 5;
    while (event.type != SDL_KEYDOWN) SDL_PollEvent(&event);
    return 0;
}
void CALLBACK TimerProc(HWND hWind, UINT Msg, UINT_PTR idEvent, DWORD dwTime)
void InitAudio(void)
{
    ALCdevice *dev;
    ALCcontext *cxt;
    Log("Initializing OpenAL Audio\r\n");
    dev = alcOpenDevice(NULL);
    if (!dev) {
        Log("Failed to open an audio device\r\n");
    }
    cxt = alcCreateContext(dev, NULL);
    if(!cxt) {
        Log("Failed to create audio context\r\n");
    }
    if (alGetError() != AL_NO_ERROR) {
        Log("Error during buffer creation\r\n");
    }
    alGenSources(4, Source);
    if (alGetError() != AL_NO_ERROR) {
        Log("Error during source creation\r\n");
    }
}
void PlayMusic()
{
    static int oldsong, ofset, mtime[4];
    double freq;
    ALuint srate = 44100;
    ALuint voice, i, note, len, hold;
    short buf[4][_BUFFSIZE_];
    bool test[4] = {false, false, false, false};
    if (curmusic != oldsong) {
        oldsong = (int)curmusic;
        if (curmusic > 0)
            ofset = moffset[(curmusic - 1)];
        for (voice = 1; voice < 4; voice++)
            mtime[voice] = 0;
    }
    if (curmusic == 0) return;
    //Only 3 voices for music, but have
    for (voice = 0; voice < 3; voice ++) { // 4 set aside for eventual sound effects
        if (mtime[voice] == 0) { //is note finished
            alSourceStop(Source[voice]); //It is, so stop the channel (source)
            mtime[voice] = music[ofset++]; //Get the next duration
            if (mtime[voice] == 0) {oldsong = 0; return;} //zero marks end, so restart
            note = music[ofset++]; //Get the next note
            if (note > 127) { //Old HW data was designed for could only
                if (note == 255) note = 127; //use values 128 - 255 (255 = 127)
                freq = (15980 / (voice + (int)(voice / 3))) / (256 - note); //freq of note
                len = (ALuint)(srate / freq); //A single cycle of that freq.
                hold = len;
                while (len < (srate / (1000 / _FREQ_))) len += hold; //Multiply till 1 interrupt cycle
                while (len > _BUFFSIZE_) len -= hold; //Don't overload buffer
                if (len == 0) len = _BUFFSIZE_; //Just to be safe
                for (i = 0; i < len; i++) //calculate sine wave and put in buffer
                    buf[voice][i] = (short)((32760 * sin((2 * M_PI * i * freq) / srate)));
                alBufferData(Buffer[voice], AL_FORMAT_MONO16, buf[voice], len, srate);
                alSourcei(openAL.Source[i], AL_LOOPING, AL_TRUE);
                alSourcei(Source[i], AL_BUFFER, Buffer[i]);
            }
        } else --mtime[voice];
    }
}
Well, it turns out there were three problems with my code. First, you have to load the generated wave data into the AL buffer before you attach that buffer to the source:
alBufferData(buffer, AL_FORMAT_MONO16, &wave_sample, sample_length * sizeof(short), frequency);
Second, as the call above shows, the size argument is in bytes, so I multiplied sample_length by the number of bytes in each sample (in this case sizeof(short)).
The final problem was that you need to detach a buffer from its source before you change the buffer data.
The music would play, but not correctly, until I added that line to the note change code.
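The size calculation can be sketched without OpenAL. The helpers below (makeSineLoop and sizeInBytes are made-up names) build a 16-bit mono sine buffer covering roughly a whole number of cycles and report its size in bytes, which is the unit alBufferData expects:

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Build a 16-bit mono sine buffer spanning about `cycles` full cycles.
// The sample count is rounded, so the loop point is only exact when
// sampleRate * cycles is a multiple of freqHz.
std::vector<std::int16_t> makeSineLoop(double freqHz, unsigned sampleRate,
                                       unsigned cycles) {
    const double kPi = 3.14159265358979323846;
    const std::size_t count = static_cast<std::size_t>(
        std::lround(cycles * static_cast<double>(sampleRate) / freqHz));
    std::vector<std::int16_t> samples(count);
    for (std::size_t i = 0; i < count; ++i) {
        samples[i] = static_cast<std::int16_t>(
            32760.0 * std::sin(2.0 * kPi * freqHz * i / sampleRate));
    }
    return samples;
}

// Size of the sample data in bytes, not in samples.
std::size_t sizeInBytes(const std::vector<std::int16_t>& samples) {
    return samples.size() * sizeof(std::int16_t);
}
```

On a note change, the order would then presumably be: alSourceStop(source), alSourcei(source, AL_BUFFER, 0) to detach the old buffer, alBufferData(buffer, AL_FORMAT_MONO16, samples.data(), (ALsizei) sizeInBytes(samples), sampleRate), alSourcei(source, AL_BUFFER, buffer), and finally alSourcePlay(source).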