I'm trying to play music when I move my hand forward and pause it when I move my hand backward, as in this video.
I can play the music when I move forward, but I can't figure out how to pause the player when I move backward.
import ddf.minim.*;
import SimpleOpenNI.*;
SimpleOpenNI kinect;
int closestValue;
int closestX;
int closestY;
PImage img;
AudioPlayer player;
Minim minim;
void setup() {
  size(640, 480);
  img = loadImage("background.jpg");
  kinect = new SimpleOpenNI(this);
  kinect.enableDepth();
  image(img, 0, 0);
  minim = new Minim(this);
  player = minim.loadFile("music.mp3");
}

void draw() {
  kinect.update();
  closestValue = 600;
  int[] depthValues = kinect.depthMap();
  // this breaks our array down into rows
  for (int y = 0; y < 480; y++) {
    // this breaks our array down into specific pixels in each row
    for (int x = 0; x < 640; x++) {
      // this pulls out the specific array position
      int i = x + y * 640;
      int current = depthValues[i];
      // now we're on to comparing them!
      if (current > 0 && current < closestValue) {
        closestValue = current;
        closestX = x;
        closestY = y;
        player.play();
      }
      /// I made this else if in order to pause when the current value is greater than or equal to the closest value ///
      /// but then the player never plays; it seems to be paused straight away. ///
      /*else if (current > 0 && current >= closestValue) {
        player.pause();
      }*/
    }
  }
  // draw the depth image on the screen
  image(kinect.depthImage(), 0, 0);
  // draw that swanky red circle identifying it
  fill(255, 0, 0); // this sets the colour to red
  ellipse(closestX, closestY, 25, 25);
}
Try putting player.pause(); in the play code segment and see what happens. If it pauses, then the problem is the expression you are using in the else if statement.
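One likely reason the commented-out else if pauses straight away is that it runs once per pixel, and almost every pixel in a frame is farther away than the closest one. A minimal sketch of an alternative, assuming you decide once per frame after the scan has finished. The class name DepthGesture, the method names and the 600 threshold are hypothetical, and the Minim and SimpleOpenNI calls are left out so that the logic runs as plain Java:

```java
public class DepthGesture {
    // Smallest non-zero depth reading in the frame (0 means "no reading").
    // Returns Integer.MAX_VALUE when the frame has no valid reading.
    static int closestDepth(int[] depthValues) {
        int closest = Integer.MAX_VALUE;
        for (int d : depthValues) {
            if (d > 0 && d < closest) {
                closest = d;
            }
        }
        return closest;
    }

    // Hypothetical rule: play while something is nearer than the threshold.
    static boolean shouldPlay(int[] depthValues, int threshold) {
        return closestDepth(depthValues) < threshold;
    }

    public static void main(String[] args) {
        int[] handForward = {0, 900, 550, 1200};
        int[] handBack = {0, 900, 750, 1200};
        System.out.println(shouldPlay(handForward, 600)); // true
        System.out.println(shouldPlay(handBack, 600));    // false
    }
}
```

In the sketch, draw() would call player.play() when shouldPlay(...) is true and the player is not already playing, and player.pause() otherwise.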
I'm trying to create a music visualizer using Processing. My idea is to have a line, across the entire screen, that moves accordingly with a song.
This is what I have right now:
import ddf.minim.analysis.*;
import ddf.minim.*;
Minim minim;
AudioPlayer track;
AudioInput input;
FFT fft;
//AudioIn in;
String audioFileName = "dune.mp3";
void setup()
{
  //size(480, 320);
  minim = new Minim(this);
  track = minim.loadFile(audioFileName, 2048);
  input = minim.getLineIn();
  fft = new FFT(input.bufferSize(), input.sampleRate());
}

void draw()
{
  // analyze the current microphone buffer
  fft.forward(input.mix);
  for (int i = 0; i < fft.specSize(); i++)
  {
    ellipse(i, 200, 7, fft.getBand(i)*10);
    //line(0, 200, 200, fft.getBand(i));
  }
}
So, my problems are:
The music that I'm loading (String audioFileName) is not playing. Only the input works, i.e. when I talk into the microphone.
I can't get a line, only the ellipses shown in the code.
Does anyone know how to solve this, or know of a good tutorial that I can follow?
Thank you
Your demo crashes on my Mac and will take me a while to debug. Perhaps the following demo which I copied from the web will get you started (see file header): http://code.compartmental.net/minim/minim_class_minim.html . It is heavily commented with several 'println' calls to hopefully give you some idea of how it works. You will need to create a 'data' folder in your Processing sketch folder and place a file named 'groove.mp3' inside the data folder in order to run it. I've tested it on my system and it seems to work ok. Does not use FFT.
/**
 * This sketch demonstrates how to play a file with Minim using an AudioPlayer.
 * It's also a good example of how to draw the waveform of the audio. Full documentation
 * for AudioPlayer can be found at http://code.compartmental.net/minim/audioplayer_class_audioplayer.html
 * For more information about Minim and additional features, visit http://code.compartmental.net/minim/
 */
import ddf.minim.*;
Minim minim;
AudioPlayer player;

void setup() {
  size(512, 200);
  // we pass this to Minim so that it can load files from the data directory
  minim = new Minim(this);
  println("minim = ", minim);
  // loadFile will look in all the same places as loadImage does.
  // this means you can find files that are in the data folder and the
  // sketch folder. you can also pass an absolute path, or a URL.
  player = minim.loadFile("groove.mp3");
  println("player = ", player);
}

void draw() {
  background(0);
  stroke(255);
  // draw the waveforms
  // the values returned by left.get() and right.get() will be between -1 and 1,
  // so we need to scale them up to see the waveform
  // note that if the file is MONO, left.get() and right.get() will return the same value
  for (int i = 0; i < player.bufferSize() - 1; i++) {
    float x1 = map( i, 0, player.bufferSize(), 0, width );
    float x2 = map( i+1, 0, player.bufferSize(), 0, width );
    line( x1, 50 + player.left.get(i)*50, x2, 50 + player.left.get(i+1)*50 );
    line( x1, 150 + player.right.get(i)*50, x2, 150 + player.right.get(i+1)*50 );
  }
  // draw a line to show where in the song playback is currently located
  float posx = map(player.position(), 0, player.length(), 0, width);
  stroke(0, 200, 0);
  line(posx, 0, posx, height);
  if ( player.isPlaying() ) {
    text("Press any key to pause playback.", 10, 20 );
  } else {
    text("Press any key to start playback.", 10, 20 );
  }
}

void keyPressed() {
  if ( player.isPlaying() ) {
    player.pause();
  }
  // if the player is at the end of the file,
  // we have to rewind it before telling it to play again
  else if ( player.position() == player.length() ) {
    player.rewind();
    player.play();
  } else {
    player.play();
  }
}
The following is a revision of your initial post along with the reference that I used to debug it. 'input' is the microphone and 'track' is the file; you had it set up to use the microphone and I switched it to the file. Just like the example above, you will need to create a data folder in the Processing sketch folder and insert a file named 'groove.mp3'.
Reference: https://github.com/ddf/Minim/blob/v2.2.2/examples/Analysis/SoundSpectrum/SoundSpectrum.pde
import ddf.minim.analysis.*;
import ddf.minim.*;
Minim minim;
AudioPlayer track;
AudioInput input;
FFT fft;
String audioFileName = "groove.mp3";
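void setup() {
  size(480, 320);
  minim = new Minim(this);
  track = minim.loadFile(audioFileName, 1024);
  track.loop(); // start playback; the FFT below analyzes what is playing
  input = minim.getLineIn();
  fft = new FFT(track.bufferSize(), track.sampleRate());
}

void draw() {
  background(0);
  // analyze the file's current buffer instead of the microphone
  fft.forward(track.mix);
  for (int i = 0; i < fft.specSize(); i++) {
    ellipse(i, 200, 7, fft.getBand(i)*10);
    // line(0, 200, 200, fft.getBand(i)); //Unable to get this to work
  }
}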
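The commented-out line() call always starts at (0, 200), which is probably why it never looks like a curve. A hedged sketch of one way to get a connected line, assuming you join each band's amplitude to the next one. SpectrumLine, toSegments, baseline and scale are hypothetical names, and the band values stand in for fft.getBand(i):

```java
import java.util.Arrays;

public class SpectrumLine {
    // Returns one segment {x1, y1, x2, y2} per pair of neighbouring bands.
    // Larger amplitudes are drawn higher up, so they are subtracted from the baseline.
    static float[][] toSegments(float[] bands, float baseline, float scale) {
        float[][] segments = new float[Math.max(0, bands.length - 1)][4];
        for (int i = 0; i < segments.length; i++) {
            segments[i][0] = i;
            segments[i][1] = baseline - bands[i] * scale;
            segments[i][2] = i + 1;
            segments[i][3] = baseline - bands[i + 1] * scale;
        }
        return segments;
    }

    public static void main(String[] args) {
        float[] bands = {0f, 1f, 0.5f}; // made-up amplitudes for illustration
        for (float[] s : toSegments(bands, 200f, 10f)) {
            System.out.println(Arrays.toString(s));
        }
        // [0.0, 200.0, 1.0, 190.0]
        // [1.0, 190.0, 2.0, 195.0]
    }
}
```

In draw(), each segment s would then be drawn with line(s[0], s[1], s[2], s[3]) after a stroke() call, since lines are invisible without a stroke colour.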
I wrote a waveform renderer that takes an audio file and creates something like this:
The logic is pretty simple. I calculate the number of audio samples required for each pixel, read those samples, average them and draw a column of pixels according to the resulting value.
Typically, I will render a whole song on around 600-800 pixels, so the wave is pretty compressed. Unfortunately this usually results in unappealing visuals as almost the entire song is just rendered at almost the same heights. There is no variation.
Interestingly, if you look at the waveforms on SoundCloud almost none of them are as boring as my results. They all have some variation. What could be the trick here? I don't think they just add random noise.
I don't think SoundCloud is doing anything particularly special. There are plenty of songs I see on their front page that are very flat. It has more to do with the way detail is perceived and what the overall dynamics of the song are like. The main difference is that SoundCloud is drawing absolute value. (The negative side of the image is just a mirror.)
For demonstration, here is a basic white noise plot with straight lines:
Now, typically a fill is used to make the overall outline easier to see. This already does a lot for the appearance:
Larger waveforms ("zoomed out" in particular) typically use a mirror effect because the dynamics become more pronounced:
Bars are another way to visualize and can give an illusion of detail:
A pseudo routine for a typical waveform graphic (average of abs and mirror) might look like this:
for (each pixel in width of image) {
    var sum = 0
    for (each sample in subset contained within pixel) {
        sum = sum + abs(sample)
    }
    var avg = sum / length of subset
    draw line(avg to -avg)
}
This is effectively like compressing the time axis as RMS of the window. (RMS could also be used but they are almost the same.) Now the waveform shows overall dynamics.
That is not too different from what you are already doing, just abs, mirror and fill. For boxes like SoundCloud uses, you would be drawing rectangles.
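To see how close the average of abs and RMS are for a given window, here is a small sketch that computes both. WindowLevels and its method names are hypothetical, and the sample values are made up. RMS is never smaller than the average of abs; the two agree when every sample has the same magnitude and drift apart for spiky windows:

```java
public class WindowLevels {
    // Mean of absolute sample values over samples[start .. start + length - 1].
    static double meanAbs(float[] samples, int start, int length) {
        double sum = 0;
        for (int i = start; i < start + length; i++) {
            sum += Math.abs(samples[i]);
        }
        return sum / length;
    }

    // Root mean square over the same window.
    static double rms(float[] samples, int start, int length) {
        double sum = 0;
        for (int i = start; i < start + length; i++) {
            sum += (double) samples[i] * samples[i];
        }
        return Math.sqrt(sum / length);
    }

    public static void main(String[] args) {
        float[] square = {0.5f, -0.5f, 0.5f, -0.5f};
        float[] spike = {1f, 0f, 0f, 0f};
        System.out.println(meanAbs(square, 0, 4) + " " + rms(square, 0, 4)); // 0.5 0.5
        System.out.println(meanAbs(spike, 0, 4) + " " + rms(spike, 0, 4));   // 0.25 0.5
    }
}
```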
Just as a bonus, here is an MCVE written in Java to generate a waveform with boxes as described. (Sorry if Java is not your language.) The actual drawing code is near the top. This program also normalizes, i.e., the waveform is "stretched" to the height of the image.
This simple output is the same as the above pseudo routine:
This output with boxes is very similar to SoundCloud:
import javax.swing.*;
import java.awt.*;
import java.awt.event.*;
import java.awt.image.*;
import java.io.*;
import javax.sound.sampled.*;
public class BoxWaveform {
    static int boxWidth = 4;
    static Dimension size = new Dimension(boxWidth == 1 ? 512 : 513, 97);

    static BufferedImage img;
    static JPanel view;

    // draw the image
    static void drawImage(float[] samples) {
        Graphics2D g2d = img.createGraphics();

        int numSubsets = size.width / boxWidth;
        int subsetLength = samples.length / numSubsets;

        float[] subsets = new float[numSubsets];

        // find average(abs) of each box subset
        int s = 0;
        for(int i = 0; i < subsets.length; i++) {
            double sum = 0;
            for(int k = 0; k < subsetLength; k++) {
                sum += Math.abs(samples[s++]);
            }
            subsets[i] = (float)(sum / subsetLength);
        }

        // find the peak so the waveform can be normalized
        // to the height of the image
        float normal = 0;
        for(float sample : subsets) {
            if(sample > normal)
                normal = sample;
        }

        // normalize and scale
        normal = 32768.0f / normal;
        for(int i = 0; i < subsets.length; i++) {
            subsets[i] *= normal;
            subsets[i] = (subsets[i] / 32768.0f) * (size.height / 2);
        }

        g2d.setColor(Color.GRAY);

        // convert to image coords and do actual drawing
        for(int i = 0; i < subsets.length; i++) {
            int sample = (int)subsets[i];

            int posY = (size.height / 2) - sample;
            int negY = (size.height / 2) + sample;

            int x = i * boxWidth;

            if(boxWidth == 1) {
                g2d.drawLine(x, posY, x, negY);
            } else {
                g2d.setColor(Color.GRAY);
                g2d.fillRect(x + 1, posY + 1, boxWidth - 1, negY - posY - 1);
                g2d.setColor(Color.DARK_GRAY);
                g2d.drawRect(x, posY, boxWidth, negY - posY);
            }
        }

        g2d.dispose();
        view.repaint();
        view.requestFocus();
    }

    // handle most WAV and AIFF files
    static void loadImage() {
        JFileChooser chooser = new JFileChooser();
        int val = chooser.showOpenDialog(null);
        if(val != JFileChooser.APPROVE_OPTION) {
            return;
        }

        File file = chooser.getSelectedFile();
        float[] samples;

        try {
            AudioInputStream in = AudioSystem.getAudioInputStream(file);
            AudioFormat fmt = in.getFormat();

            if(fmt.getEncoding() != AudioFormat.Encoding.PCM_SIGNED) {
                throw new UnsupportedAudioFileException("unsigned");
            }

            boolean big = fmt.isBigEndian();
            int chans = fmt.getChannels();
            int bits = fmt.getSampleSizeInBits();
            int bytes = bits + 7 >> 3;

            int frameLength = (int)in.getFrameLength();
            int bufferLength = chans * bytes * 1024;

            samples = new float[frameLength];
            byte[] buf = new byte[bufferLength];

            int i = 0;
            int bRead;
            while((bRead = in.read(buf)) > -1) {
                for(int b = 0; b < bRead;) {
                    double sum = 0;

                    // (sums to mono if multiple channels)
                    for(int c = 0; c < chans; c++) {
                        if(bytes == 1) {
                            sum += buf[b++] << 8;
                        } else {
                            int sample = 0;

                            // (quantizes to 16-bit)
                            if(big) {
                                sample |= (buf[b++] & 0xFF) << 8;
                                sample |= (buf[b++] & 0xFF);
                                b += bytes - 2;
                            } else {
                                b += bytes - 2;
                                sample |= (buf[b++] & 0xFF);
                                sample |= (buf[b++] & 0xFF) << 8;
                            }

                            // sign-extend the 16-bit value to a full int
                            final int sign = 1 << 15;
                            final int mask = -1 << 16;
                            if((sample & sign) == sign) {
                                sample |= mask;
                            }

                            sum += sample;
                        }
                    }

                    samples[i++] = (float)(sum / chans);
                }
            }
        } catch(Exception e) {
            problem(e);
            return;
        }

        if(img == null) {
            img = new BufferedImage(size.width, size.height, BufferedImage.TYPE_INT_ARGB);
        }

        drawImage(samples);
    }

    static void problem(Object msg) {
        JOptionPane.showMessageDialog(null, String.valueOf(msg));
    }

    public static void main(String[] args) {
        SwingUtilities.invokeLater(new Runnable() {
            @Override
            public void run() {
                JFrame frame = new JFrame("Box Waveform");
                JPanel content = new JPanel(new BorderLayout());
                frame.setContentPane(content);

                JButton load = new JButton("Load");
                load.addActionListener(new ActionListener() {
                    @Override
                    public void actionPerformed(ActionEvent ae) {
                        loadImage();
                    }
                });

                view = new JPanel() {
                    @Override
                    protected void paintComponent(Graphics g) {
                        super.paintComponent(g);
                        if(img != null) {
                            g.drawImage(img, 1, 1, img.getWidth(), img.getHeight(), null);
                        }
                    }
                };

                view.setBackground(Color.WHITE);
                view.setPreferredSize(new Dimension(size.width + 2, size.height + 2));

                content.add(view, BorderLayout.CENTER);
                content.add(load, BorderLayout.SOUTH);

                frame.pack();
                frame.setResizable(false);
                frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
                frame.setLocationRelativeTo(null);
                frame.setVisible(true);
            }
        });
    }
}
Note: for the sake of simplicity, this program loads the entire audio file into memory. Some JVMs may throw OutOfMemoryError. To correct this, run with increased heap size as described here.
I am trying to generate a binary image from the depthMap() function in OpenNI, which provides an array of type int. I want to use that image for blob tracking.
The problem is that I am not able to generate a clean binary image from the depth map. As I understand it, the depth image shows a bright pixel for everything close to the sensor, and pixels get darker the farther away they are. So I check every pixel in the (one-dimensional) array to see whether it lies above my minimum and below my maximum threshold, which together define the range I want data from.
Here is my code:
// import library
import SimpleOpenNI.*;
import processing.opengl.*; // opengl
import blobDetection.*; // blobs
// declare SimpleOpenNI object
SimpleOpenNI context;
BlobDetection theBlobDetection;
BlobBall blobBalls;
PrintWriter output;
// threshold for binaryImage
int minThreshold, maxThreshold;
// Size of the kinect Image
int kinectWidth = 640;
int kinectHeight = 480;
float globalX, globalY;
// Colors
color bgColor = color(0, 0, 123);
color white = color(255,255,255);
color black = color(0,0,0);
// PImage to hold incoming imagery
int[] distanceArray;
PImage cam, forBlobDetect;
void setup() {
  output = createWriter("positions.txt");
  // init threshold
  minThreshold = 960;
  maxThreshold = 2500;
  // same as Kinect dimensions
  size(kinectWidth, kinectHeight);
  // initialize SimpleOpenNI object
  context = new SimpleOpenNI(this);
  if (context.isInit() == false) {
    println("Can't init SimpleOpenNI, maybe the camera is not connected!");
  }
  else {
    // mirror the image to be more intuitive
    context.setMirror(true);
    context.enableDepth();
    // context.enableScene();
    distanceArray = context.depthMap();
    forBlobDetect = new PImage(width, height);
    theBlobDetection = new BlobDetection(forBlobDetect.width, forBlobDetect.height);
  }
}

void draw() {
  // update the SimpleOpenNI object
  context.update();
  // put the image into a PImage
  cam = context.depthImage();
  // copy the image into the smaller blob image
  // forBlobDetect.copy(cam, 0, 0, cam.width, cam.height, 0, 0, forBlobDetect.width, forBlobDetect.height);
  // blur the blob image
  forBlobDetect.filter(BLUR, 2);
  int pos = 0;
  int currentDepthValue = 0;
  distanceArray = context.depthMap();
  for(int x = 0; x < cam.width; x++) {
    for(int y = 0; y < cam.height; y++) {
      pos = y*cam.width+x;
      currentDepthValue = distanceArray[pos];
      // println(currentDepthValue);
      if((currentDepthValue > minThreshold) && (currentDepthValue < maxThreshold)) {
        forBlobDetect.pixels[pos] = black;
      } else {
        forBlobDetect.pixels[pos] = white;
      }
    }
  }
  // for(int i=0; i < distanceArray.length; i++) {
  //   currentDepthValue = distanceArray[i];
  //   // println(currentDepthValue);
  //   if(currentDepthValue > minThreshold) /*&& (currentDepthValue < maxThreshold)*/) {
  //     forBlobDetect.pixels[pos] = white;
  //   } else {
  //     forBlobDetect.pixels[pos] = black;
  //   }
  // }
  // detect the blobs
  theBlobDetection.computeBlobs(forBlobDetect.pixels);
  // display the image
  image(cam, 0, 0);
  image(forBlobDetect, 0, 0, width/2, height/2);
  // image(context.sceneImage(), context.depthWidth(), 0);
}
A really stupid mistake on my part: I misunderstood the 11-bit array.
I solved it thanks to the "Making Things See" examples.
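The fix itself is not shown above. For reference, here is a minimal sketch of the thresholding step from the question, assuming depthMap() yields one distance value per pixel (millimetres in SimpleOpenNI, as far as I know) with 0 meaning no reading. DepthThreshold and toBinary are hypothetical names, and the colours are plain ARGB ints so that it runs without Processing:

```java
public class DepthThreshold {
    static final int WHITE = 0xFFFFFFFF;
    static final int BLACK = 0xFF000000;

    // Returns one ARGB pixel per depth value:
    // black strictly inside (min, max), white everywhere else.
    static int[] toBinary(int[] depth, int min, int max) {
        int[] pixels = new int[depth.length];
        for (int i = 0; i < depth.length; i++) {
            boolean inRange = depth[i] > min && depth[i] < max;
            pixels[i] = inRange ? BLACK : WHITE;
        }
        return pixels;
    }

    public static void main(String[] args) {
        int[] depth = {0, 500, 1000, 2400, 3000}; // made-up readings
        for (int p : toBinary(depth, 960, 2500)) {
            System.out.println(p == BLACK ? "black" : "white");
        }
    }
}
```

In a Processing sketch the returned values would be written into forBlobDetect.pixels, followed by forBlobDetect.updatePixels() so that the image reflects the change.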
I'm having a ton of trouble making a simple video delay in Processing. I looked around on the internet and I keep finding the same bit of code, and I can't get it to work at all. When I first tried it, it did nothing (at all). Here's my modified version (which at least seems to load frames into the buffer). I really have no idea why it doesn't work and I'm getting really tired of pulling out my hair. Please... please, for the love of god, please somebody point out the stupid mistake I'm making here.
And now, without further delay (hah, get it?), the code:
import processing.video.*;
VideoBuffer vb;
Movie myMovie;
Capture cam;
float seconds = 1;
void setup() {
  size(320,240, P3D);
  String[] cameras = Capture.list();
  if (cameras.length == 0) {
    println("There are no cameras available for capture.");
    exit();
  } else {
    println("Available cameras:");
    for (int i = 0; i < cameras.length; i++) {
      println(cameras[i]);
    }
    cam = new Capture(this, cameras[3]);
    cam.start();
  }
  vb = new VideoBuffer(90, width, height);
}

void draw() {
  if (cam.available() == true) {
    cam.read();
    vb.addFrame(cam);
  }
  image(cam, 0, 0);
  image( vb.getFrame(), 150, 0 );
}

class VideoBuffer
{
  PImage[] buffer;
  int inputFrame = 0;
  int outputFrame = 0;
  int frameWidth = 0;
  int frameHeight = 0;

  VideoBuffer( int frames, int vWidth, int vHeight )
  {
    buffer = new PImage[frames];
    for(int i = 0; i < frames; i++)
    {
      this.buffer[i] = new PImage(vWidth, vHeight);
    }
    this.inputFrame = 0;
    this.outputFrame = 1;
    this.frameWidth = vWidth;
    this.frameHeight = vHeight;
  }

  // return the current "playback" frame.
  PImage getFrame()
  {
    return this.buffer[this.outputFrame];
  }

  // Add a new frame to the buffer.
  void addFrame( PImage frame )
  {
    // copy the new frame into the buffer.
    this.buffer[this.inputFrame] = frame;
    // advance the input and output indexes
    this.inputFrame++;
    this.outputFrame++;
    println(this.inputFrame + " " + this.outputFrame);
    // wrap the values..
    if(this.inputFrame >= this.buffer.length)
    {
      this.inputFrame = 0;
    }
    if(this.outputFrame >= this.buffer.length)
    {
      this.outputFrame = 0;
    }
  }
}
This works in Processing 2.0.1.
import processing.video.*;
Capture cam;
PImage[] buffer;
int w = 640;
int h = 360;
int nFrames = 60;
int iWrite = 0, iRead = 1;
void setup(){
  size(w, h);
  cam = new Capture(this, w, h);
  cam.start();
  buffer = new PImage[nFrames];
}

void draw() {
  if(cam.available()) {
    cam.read();
    // store a copy of the current frame
    buffer[iWrite] = cam.get();
    if(buffer[iRead] != null){
      image(buffer[iRead], 0, 0);
    }
    iWrite++;
    iRead++;
    if(iRead >= nFrames-1){
      iRead = 0;
    }
    if(iWrite >= nFrames-1){
      iWrite = 0;
    }
  }
}
There is a problem inside your addFrame method. You just store a reference to the PImage object, whose pixels get overwritten all the time. You have to use buffer[inputFrame] = frame.get() instead of buffer[inputFrame] = frame. The get() method returns a copy of the image.
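The difference between storing a reference and storing a copy can be shown without a camera. This is only a sketch under stated assumptions: FrameDelay and push are hypothetical names, and an int array stands in for the PImage whose pixels the capture device keeps overwriting:

```java
import java.util.Arrays;

public class FrameDelay {
    private final int[][] buffer;
    private int writeIndex = 0;

    FrameDelay(int frames) {
        buffer = new int[frames][];
    }

    // Stores a copy of the frame and returns the oldest stored frame,
    // or null while the buffer is still filling up.
    int[] push(int[] frame) {
        int[] oldest = buffer[writeIndex];
        buffer[writeIndex] = Arrays.copyOf(frame, frame.length); // copy, like frame.get()
        writeIndex = (writeIndex + 1) % buffer.length;
        return oldest;
    }

    public static void main(String[] args) {
        FrameDelay delay = new FrameDelay(2);
        int[] live = {1};   // stands in for the camera's reused pixel array
        delay.push(live);
        live[0] = 2;        // the "camera" overwrites its pixels in place
        delay.push(live);
        live[0] = 3;
        int[] delayed = delay.push(live);
        System.out.println(delayed[0]); // 1, the frame from two pushes earlier
    }
}
```

If push stored frame itself instead of a copy, every slot would point at the same array and the delayed frame would always show the latest pixels.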
I can't get around a peculiar problem with SimpleOpenNI for Processing, so I'm asking for your help.
I'd like to store snapshots of pixel depth data (returned by the .depthMapRealWorld() method as PVector arrays) at discrete time intervals, then process them further for a presentation. I tried adding them to an ArrayList, but it seems that the depthMapRealWorld() method returns only a reference to the current depth data, not a separate array. I tried the following, in this sequence:
First, I just got the data and added it to an ArrayList. On every call of the update() method the whole ArrayList contained the same PVector array, even if the array at position zero had been added many iterations earlier!
Then I made the PVector array, along with its creation time, part of a class. I rewrote the sketch a little, but it didn't help. All of the arrays in the ArrayList were still the same.
Finally, in the constructor of the class, I "manually" copied the xyz coordinates of every vector from the PVector array into an int array. That seemed to solve the problem: the int arrays in the ArrayList are now different from each other. But this solution introduced serious performance problems.
The question is: is there a more efficient way of storing these PVector arrays and retaining their values?
import processing.opengl.*;
import SimpleOpenNI.*;
SimpleOpenNI kinect;
float rotation = 0;
int time = 0;
ArrayList dissolver;
ArrayList<Integer> timer;
int pSize = 10;
Past past;
void setup() {
  dissolver = new ArrayList();
  timer = new ArrayList();
  size(1024, 768, OPENGL);
  kinect = new SimpleOpenNI(this);
  kinect.enableDepth();
  translate(width/2, height/2, -100);
}

void draw() {
  kinect.update();
  translate(width/2, height/2, 500);
  stroke (255, 255, 255);
  past = new Past (kinect.depthMapRealWorld(), time);
  if (dissolver.size() == pSize) { //remove the oldest arraylist element when the list gets full
    dissolver.remove(0);
  }
  if (time % 20 == 0) {
    dissolver.add (past);
    Past p1 = (Past) dissolver.get (0);
    float [][] o2 = p1.getVector();
    println ("x coord of a random point at arraylist position 0: " + o2[50000][0]); //for testing
  }
  if (dissolver.size() == pSize-1) {
    //dissolve ();
  }
  time ++;
}

void dissolve () { //from the previous nonworking version; ignore
  for (int offset = 0; offset < pSize-1; offset ++) {
    PVector[] offPoints = (PVector[]) dissolver.get (offset);
    int offTime = timer.get(offset);
    for (int i = 0; i < offPoints.length; i+=10) {
      int col = (time-offTime)*2; //why??
      stroke (255, 0, col);
      PVector currentPoint = offPoints[i];
      if (currentPoint.z <1500) {
        point(currentPoint.x, currentPoint.y, currentPoint.z); // - 2*(time-offTime) + random(0, 100)
      }
    }
  }
}

class Past {
  private PVector [] depth; //should contain this, not int
  private float [][] depth1;
  private int time;

  Past (PVector [] now, int t) {
    //should be like this: depth = now;
    //clumsy and performancewise catastrophic solution below
    depth1 = new float [now.length][3];
    for (int i = 0; i< now.length; i+=10) {
      PVector temp = now[i];
      depth1 [i][0] = temp.x;
      depth1 [i][1] = temp.y;
      depth1 [i][2] = temp.z;
    }
    //arrayCopy(now, depth); this didn't work either
    time = t;
  }

  float [][] getVector () {
    return depth1;
  }

  int getTime () {
    return time;
  }
}
If I understood correctly, you want to store the 3D positions (an ArrayList of PVectors) for each frame, right?
If so, you should be able to simply store PVectors and reference them later.
Here's a basic sketch to illustrate this:
import processing.opengl.*;
import SimpleOpenNI.*;
SimpleOpenNI kinect;
ArrayList<ArrayList<PVector>> frames = new ArrayList<ArrayList<PVector>>();
ArrayList<PVector> frame;
boolean isRecording = true;
boolean isRecFrame;
void setup() {
  size(1024, 768, OPENGL);
  kinect = new SimpleOpenNI(this);
  kinect.enableDepth();
  stroke(255);
}

void draw() {
  background(0);
  translate(width/2, height/2, 500);
  if(isRecording){
    kinect.update();
    isRecFrame = (frameCount % 20 == 0);//record every 20 frames
    int[] depthMap = kinect.depthMap();
    int steps = 5; // to speed up the drawing, draw every Nth point
    int index;
    PVector realWorldPoint;
    if(isRecFrame) frame = new ArrayList<PVector>();
    for(int y=0;y < kinect.depthHeight();y+=steps)
    {
      for(int x=0;x < kinect.depthWidth();x+=steps)
      {
        index = x + y * kinect.depthWidth();
        if(depthMap[index] > 0)
        {
          realWorldPoint = kinect.depthMapRealWorld()[index];
          point(realWorldPoint.x,realWorldPoint.y,realWorldPoint.z);
          // store a copy, not the reference SimpleOpenNI keeps reusing
          if(isRecFrame) frame.add(realWorldPoint.get());
        }
      }
    }
    if(isRecFrame) frames.add(frame);
  }else{
    ArrayList<PVector> currentFrame = frames.get(frameCount%frames.size());//playback is faster than recording now for testing purposes - add a decent frame counter here at some point
    for(PVector p : currentFrame) point(p.x,p.y,p.z);
  }
}

void keyPressed(){
  if(key == ' ') isRecording = !isRecording;
}
Use the SPACE key to toggle between recording and playback.
The main thing to note is that I'm storing a copy of the real-world position for each depth pixel (frame.add(realWorldPoint.get());). Another thing to keep in mind is that you're currently storing these coordinates in memory, which will fill up at some point. If you only store a limited number of frames, that should be fine; if not, you might want to save the points to disk. This way you can reuse recordings with other sketches. A basic way would be to store them in a CSV file:
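void saveCSV(ArrayList<PVector> pts){
  String csv = "x,y,z\n";
  for(PVector p : pts) csv += p.x + "," + p.y + "," + p.z + "\n";
  // the file name is only an example; one file per frame
  saveStrings("frame_"+frameCount+".csv", csv.split("\n"));
}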
Another would be to use a more suitable format for point clouds, like PLY.
Saving an ASCII PLY is fairly straightforward:
void savePLY(ArrayList<PVector> pts){
  String ply = "ply\n";
  ply += "format ascii 1.0\n";
  ply += "element vertex " + pts.size() + "\n";
  ply += "property float x\n";
  ply += "property float y\n";
  ply += "property float z\n";
  ply += "end_header\n";
  for(PVector p : pts) ply += p.x + " " + p.y + " " + p.z + "\n";
  // the file name is only an example; one file per frame
  saveStrings("frame_"+frameCount+".ply", ply.split("\n"));
}
You can later open/explore/process these files with tools like MeshLab.
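Concatenating with += inside the loop copies the whole string for every point, which may get slow with tens of thousands of points per frame. Here is a sketch of the same PLY output built with a StringBuilder, assuming the points arrive as float triples instead of PVectors so that it runs as plain Java. PlyWriter and toPly are hypothetical names:

```java
public class PlyWriter {
    // Builds an ASCII PLY document from points given as {x, y, z} triples.
    static String toPly(float[][] points) {
        StringBuilder ply = new StringBuilder();
        ply.append("ply\n");
        ply.append("format ascii 1.0\n");
        ply.append("element vertex ").append(points.length).append('\n');
        ply.append("property float x\n");
        ply.append("property float y\n");
        ply.append("property float z\n");
        ply.append("end_header\n");
        for (float[] p : points) {
            ply.append(p[0]).append(' ').append(p[1]).append(' ').append(p[2]).append('\n');
        }
        return ply.toString();
    }

    public static void main(String[] args) {
        float[][] points = {{1f, 2f, 3f}, {4f, 5f, 6f}}; // made-up coordinates
        System.out.print(toPly(points));
    }
}
```

In a Processing sketch the result could be written out with saveStrings(), as in savePLY above.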