Algorithm to detect overlapping rows of two images

Let's say I have 2 images A and B as below.
Notice that the bottom of A overlaps with the top of B for n rows of pixels, denoted by the two red rectangles. A and B have the same number of columns but may have different numbers of rows.
Two questions:
Given A and B, how to determine n efficiently?
If B is somehow changed so that 30%-50% of its pixels are completely replaced (for example, imagine the top-left area showing the number of votes/answers/views is replaced with an ad banner), how can n be determined?
If anyone can point to an algorithm or, better yet, an implementation in any language (preferably C/C++, C#, Java or JavaScript), it would be much appreciated.

If I understood correctly, you probably want to look at the normalized cross-correlation of greyscale versions of the two images. Where you have large images, or large overlapping regions, this is done most efficiently in the frequency domain using the FFTs of the images (or overlap areas), and is then called phase correlation.
The basic steps I would take in your situation are as follows:
Extract the bottom half of the first image and the top half of the second image.
Convert both image patches to greyscale.
Perform FFT on each image patch (there are some details here relating to windowing and padding).
Multiply one FFT by the complex conjugate of the other (equivalent to correlation in the spatial domain).
Do an inverse FFT on the product.
Find the peak in the result to get the XY shift that best aligns the two images.
Having found the relative offset between the top and bottom image patches, you can easily calculate n as you required.
If you want to experiment without having to code the above from scratch, OpenCV has a number of functions for template matching, which you can easily try. See here for details.
If part of either image has been changed - e.g. by a banner ad - the above procedure still gives the best match, and the magnitude of the peak you find in step 6 gives an indication of the match "confidence" - so you can get a rough idea of how similar the two areas are.
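If you'd like to see what that peak search amounts to before writing any FFT code, below is a minimal brute-force sketch in Java (the method name and the image representation - same-width greyscale images as int[row][col] arrays with values 0-255 - are my assumptions, not part of the question). It scores every candidate overlap n by the mean absolute difference between the bottom n rows of A and the top n rows of B and keeps the best n - the spatial-domain equivalent of locating the correlation peak, minus the FFT speed-up.
// Brute-force alternative to the FFT approach: score every candidate
// overlap n and keep the one with the smallest mean absolute difference.
static int overlapRows(int[][] a, int[][] b) {
    int width = a[0].length;
    int bestN = 0;
    double bestScore = Double.MAX_VALUE;
    for (int n = Math.min(a.length, b.length); n >= 1; n--) { // large n first, so ties favour longer matches
        long diff = 0;
        for (int r = 0; r < n; r++) {
            int[] rowA = a[a.length - n + r]; // row r of the candidate overlap in A
            int[] rowB = b[r];                // the corresponding row in B
            for (int c = 0; c < width; c++) {
                diff += Math.abs(rowA[c] - rowB[c]);
            }
        }
        double score = (double) diff / ((long) n * width); // mean abs. difference per pixel
        if (score < bestScore) {
            bestScore = score;
            bestN = n;
        }
    }
    return bestN; // sanity-check bestScore against a threshold before trusting this
}
For tall images this brute force is quadratic in the number of rows, which is exactly why the frequency-domain route above pays off.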

I had a little play at doing this with ImageMagick. Here is the animation of what I did, and the explanation and code follow.
First I grabbed a couple of StackOverflow pages, using webkit2png, calling them a.png and b.png.
Then I cropped a rectangle out of the top-left of b.png, and a column of the same width, but the full height, out of a.png.
That gave me a full-height strip from the first page, and a small rectangle from the top-left of the second page.
I now overlay the smaller rectangle from the second page onto the bottom of the strip from the first page and slide it upwards, calculating the difference between the two images by subtracting one from the other as I go. When the difference is zero, the pictures must be the same and the output image will be black, so I have found the point at which they overlap.
Here is the code:
#!/bin/bash
# Grab page 2 as "A" and page 3 as "B"
# webkit2png -F -o A http://stackoverflow.com/questions?page=2&sort=newest
# webkit2png -F -o B http://stackoverflow.com/questions?page=3&sort=newest
BLOBH=256 # blob height
BLOBW=256 # blob width
# Crop a column 256 pixels wide out of a.png that doesn't contain adverts or junk, into x.png
convert a.png -crop ${BLOBW}x+0+0 x.png
# Get height of x.png - note this must happen after x.png has been created
XHEIGHT=$(identify -format "%h" x.png)
# Crop a rectangle 256x256 pixels out of top left corner of b.png, into y.png
convert b.png -crop ${BLOBW}x${BLOBH}+0+0 y.png
# Now slide y.png up across x.png, starting at the bottom of x.png
# ... differencing the two images as we go
# ... stop when the difference is nothing, i.e. they are the same and the difference is a black image
lines=0
while :; do
   OFFSET=$((XHEIGHT-BLOBH-1-lines))
   if [ $OFFSET -lt 0 ]; then exit; fi   # ran off the top of x.png without a match
   FN=$(printf "out-%04d.png" $lines)
   diff=$(convert x.png -crop ${BLOBW}x${BLOBH}+0+${OFFSET} +repage \
      y.png \
      -fuzz 5% -compose difference -composite +write $FN \
      \( +clone -evaluate set 0 \) -metric AE -compare -format "%[distortion]" info:)
   echo $diff:$lines
   if [ "$diff" -eq 0 ]; then break; fi  # difference is black => overlap found
   ((lines++))
done
n=$((BLOBH+lines))

The FFT solution might be more complex than you were hoping for.
For a general problem, that might be the only robust way.
For a simple solution, you need to start making assumptions.
For example, can you guarantee that the columns of the images line up (barring the noted changes)? This allows you to go down the path suggested by @n.m.
Can you cut the image into vertical strips, and consider that a row matches if a sufficient proportion of the strips match?
[This could be redone to use a few passes with different column offsets if we need to be robust to that.]
This gives something like:
#include <algorithm>
#include <map>
#include <unordered_map>

class Image
{
public:
    virtual ~Image() {}
    typedef int Pixel;

    virtual Pixel* getRow(int rowId) const = 0;
    virtual int getWidth() const = 0;
    virtual int getHeight() const = 0;
};

class Analyser
{
public:
    Analyser(const Image& a, const Image& b)
    : a_(a), b_(b) {}

    typedef Image::Pixel* Section;
    static const int numStrips = 16;

    struct StripId
    {
        StripId(int r = 0, int c = 0)
        : row_(r), strip_(c)
        {}
        int row_;
        int strip_;
    };

    typedef std::unordered_map<unsigned, StripId> StripTable;

    int numberOfOverlappingRows()
    {
        int commonWidth = std::min(a_.getWidth(), b_.getWidth());
        int stripWidth = commonWidth / numStrips;

        StripTable aHash;
        createStripTable(aHash, a_, stripWidth);
        StripTable bHash;
        createStripTable(bHash, b_, stripWidth);

        // This is the position at which the bottom row of A appears in B.
        int bottomOfA = 0;
        bool canFindBottomOfAInB = canFindLine(a_.getRow(a_.getHeight() - 1), bHash, stripWidth, bottomOfA);

        int topOfB = 0;
        bool canFindTopOfBInA = canFindLine(b_.getRow(0), aHash, stripWidth, topOfB);

        int topOfBFromBottomOfA = a_.getHeight() - topOfB;
        // Expect topOfBFromBottomOfA == bottomOfA
        return bottomOfA;
    }

private:
    bool canFindLine(Image::Pixel* source, StripTable& target, int stripWidth, int& matchingRow)
    {
        Image::Pixel* strip = source;
        std::map<int, int> matchedRows;
        for (int index = 0; index < numStrips; ++index) // one hash per strip, not per pixel
        {
            Image::Pixel hashValue = getHashOfStrip(strip, stripWidth);
            bool match = target.count(hashValue) > 0;
            if (match)
            {
                ++matchedRows[target[hashValue].row_];
            }
            strip += stripWidth;
        }
        // Could set a threshold requiring more matches than 0
        if (matchedRows.empty())
            return false;
        // FIXME: return the most-matched row rather than the first.
        matchingRow = matchedRows.begin()->first;
        return true;
    }

    Image::Pixel* getStrip(const Image& im, int row, int stripId, int stripWidth)
    {
        return im.getRow(row) + stripId * stripWidth;
    }

    static Image::Pixel getHashOfStrip(Image::Pixel* strip, unsigned width)
    {
        Image::Pixel hashValue = 0;
        for (unsigned col = 0; col < width; ++col)
        {
            hashValue |= *(strip + col); // placeholder "hash"; see warning below
        }
        return hashValue;
    }

    void createStripTable(StripTable& hash, const Image& image, int stripWidth)
    {
        for (int row = 0; row < image.getHeight(); ++row)
        {
            for (int index = 0; index < numStrips; ++index)
            {
                // Warning: Not this simple!
                // If the images come through a lossy intermediate, pixels won't be _exactly_
                // the same, and you need some kind of fuzzy equality here.
                // Details depend on the image format etc., but this is the gist.
                Image::Pixel* strip = getStrip(image, row, index, stripWidth);
                Image::Pixel hashValue = getHashOfStrip(strip, stripWidth);
                hash[hashValue] = StripId(row, index);
            }
        }
    }

    const Image& a_;
    const Image& b_;
};

If rows match exactly, then sort rows in both images and merge. Your duplicates are right there. Then go to the original images and find the longest contiguous streak of duplicates in A, such that the corresponding rows in B are also contiguous. Or just look near the top and the bottom of corresponding images.
If there are banner ads, the first thing that comes to mind is breaking the images into several vertical strips and doing that with each pair of strips separately.
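If you want to play with the exact-match case, here is a hedged Java sketch in the same spirit (rows are assumed to be int[] pixel arrays; the method name is mine). Rather than sorting, it exploits the fact that an overlap of n rows means the last n rows of A equal the first n rows of B, and uses per-row hashes so that failed candidates are rejected cheaply:
import java.util.Arrays;

// Exact-row-match overlap: test candidates from largest to smallest,
// comparing cheap row hashes first and confirming hits with a real comparison.
static int exactOverlap(int[][] a, int[][] b) {
    int[] hashA = new int[a.length];
    int[] hashB = new int[b.length];
    for (int i = 0; i < a.length; i++) hashA[i] = Arrays.hashCode(a[i]);
    for (int i = 0; i < b.length; i++) hashB[i] = Arrays.hashCode(b[i]);
    for (int n = Math.min(a.length, b.length); n > 0; n--) {
        boolean match = true;
        for (int r = 0; r < n && match; r++) {
            int rowInA = a.length - n + r;
            match = hashA[rowInA] == hashB[r] && Arrays.equals(a[rowInA], b[r]);
        }
        if (match) return n;
    }
    return 0; // no overlap found
}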

Something like this will probably help:
First, traverse image A from the bottom upwards and search for rows with significant information in them. The "information" of a row can be measured, for example, as the total color shift across the row: if two adjacent pixels have colors #ffffff and #ff0000, add 2.0 to the total. Keep a series of thresholds ready and lock onto the first row that reaches each threshold; the series could be "10.0, 0.1*row length, 0.15*row length, ..." up to some reasonable limit.

Then traverse this array from the topmost discovered row downwards: take the corresponding row and search for its match in B from the top down. If a match is found and the threshold is big enough, take the next row in the array, calculate the position of its match, and compare. If that succeeds, you have locked onto the correct offset of B over A, and it equals height_of_A - first_row_index + first_row_match_index. If it fails, continue searching with the next row. If all matches fail, search for the very last row of A starting from the very first row of B, up to the offset of the first row found from the bottom of A. If that also fails, the answer is 0.

Of course, if you are using JPEG images, use a threshold match, as pixels might not be exactly equal in A and B, perhaps with a tolerance for some unmatched pixels as well.
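As a concrete sketch of that information measure (the helper name is mine; pixels are assumed to be packed RGB ints, and the normalisation is chosen so the #ffffff/#ff0000 example scores 2.0):
// Total color shift across one row of packed 0xRRGGBB pixels.
static double rowInformation(int[] row) {
    double total = 0;
    for (int i = 1; i < row.length; i++) {
        int p = row[i], q = row[i - 1];
        int dr = ((p >> 16) & 0xFF) - ((q >> 16) & 0xFF);
        int dg = ((p >> 8) & 0xFF) - ((q >> 8) & 0xFF);
        int db = (p & 0xFF) - (q & 0xFF);
        // #ffffff -> #ff0000 gives (0 + 255 + 255) / 255 = 2.0, as in the example
        total += (Math.abs(dr) + Math.abs(dg) + Math.abs(db)) / 255.0;
    }
    return total;
}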

Related

Storing motion vectors from calculated optical flow in a practical way which enables reconstruction of subsequent frames from initial keyframes

I am trying to store the motion detected by optical flow for the frames of a video sequence, and then use these stored motion vectors to predict the already-known frames, using just the first frame as a reference. I am currently using two Processing sketches: the first sketch draws a motion vector for every pixel grid (each 10 pixels in width and height). This is done for every frame in the video sequence. A vector is only drawn in a grid if sufficient motion is detected. The second sketch aims to reconstruct the video frames crudely, using just the initial frame of the sequence combined with the motion-vector information obtained from the first sketch.
My approach so far is as follows: I am able to determine the size, position and direction of each motion vector drawn in the first sketch from four variables. By creating four arrays (two for the motion vector's x and y coordinates and another two for its length in the x and y directions), every time a motion vector is drawn I can append each of the four variables to these arrays. This is done for each pixel grid throughout an entire frame where a vector is drawn, and for each frame in the sequence, via for loops. Once the arrays are full, I save them to a text file as a list of strings.

I then load these strings from the text file into the second sketch, along with the first frame of the video sequence. I load the strings into variables within a while loop in the draw function and convert them back into floats. I increment a variable by one each time the draw function is called; this moves on to the next frame. (I used a specific number as a separator in my text files, which appears at the end of every frame; the loop searches for this number and then increments the variable by one, breaking the while loop, and the draw function is called again for the subsequent frame.) For each frame, I can draw 10 by 10 pixel boxes and move them by the parameters read from the text files produced by the first sketch.

My problem is simply this: how do I draw the motion of a particular frame without letting what I have blitted to the screen in the previous frame affect what will be drawn for the next frame? My only way of getting my 10 by 10 pixel box is by using the get() function, which gets pixels that are already drawn to the screen.
Apologies for the length and complexity of my question. Any tips would be very much appreciated! I will add the code for the second sketch. I can also add the first sketch if required, but it's rather long and a lot of it is not my own. Here is the second sketch:
import processing.video.*;

Movie video;
PImage[] naturalMovie = new PImage[0];
String[] xlengths;
String[] ylengths;
String[] xpositions;
String[] ypositions;
int a = 0;
int c = 0;
int d = 0;
int p;
int gs = 10;

void setup() {
  size(640, 480, JAVA2D);
  xlengths = loadStrings("xlengths.txt");
  ylengths = loadStrings("ylengths.txt");
  xpositions = loadStrings("xpositions.txt");
  ypositions = loadStrings("ypositions.txt");
  video = new Movie(this, "sample1.mov");
  video.play();
  rectMode(CENTER);
}

void movieEvent(Movie m) {
  m.read();
  PImage f = createImage(m.width, m.height, ARGB);
  f.set(0, 0, m);
  f.resize(width, height);
  naturalMovie = (PImage[]) append(naturalMovie, f);
  println("naturalMovie length: " + naturalMovie.length);
  p = naturalMovie.length - 1;
}

void draw() {
  if (naturalMovie.length >= p && p > 0) {
    if (c == 0) {
      image(naturalMovie[0], 0, 0);
    }
    d = c;
    while (c == d && c < xlengths.length) {
      float u, v, x0, y0;
      u = float(xlengths[a]);
      v = float(ylengths[a]);
      x0 = float(xpositions[a]);
      y0 = float(ypositions[a]);
      if (u != 1.0E-19) {
        //stroke(255,255,255);
        //line(x0,y0,x0+u,y0+v);
        PImage box;
        box = get(int(x0 - gs/2), int(y0 - gs/2), gs, gs);
        image(box, x0 - gs/2 + u, y0 - gs/2 + v, gs, gs);
        if (a < xlengths.length - 1) {
          a += 1;
        }
      } else if (u == 1.0E-19) {
        if (a < xlengths.length - 1) {
          c += 1;
          a += 1;
        }
      }
    }
  }
}
Word to the wise: most people aren't going to read that wall of text. Try to "dumb down" your posts so they get to the details right away, without any extra information. You'll also be better off if you post an MCVE instead of only giving us half your code. Note that this does not mean posting your entire project. Instead, start over with a blank sketch and only create the most basic code required to show the problem. Don't include any of your movie logic, and hardcode as much as possible. We should be able to copy and paste your code onto our own machines to run it and see the problem.
All of that being said, I think I understand what you're asking.
How do I draw the motion of a particular frame without letting what I have blitted to the screen in the previous frame affect what will be drawn for the next frame? My only way of getting my 10 by 10 pixel box is by using the get() function which gets pixels that are already drawn to the screen.
Separate your program into a view and a model. Right now you're using the screen (the view) to store all of your information, which is going to cause you headaches. Instead, store the state of your program into a set of variables (the model). For you, this might just be a bunch of PVector instances.
Let's say I have an ArrayList<PVector> that holds the current position of all of my vectors:
ArrayList<PVector> currentPositions = new ArrayList<PVector>();

void setup() {
  size(500, 500);
  for (int i = 0; i < 100; i++) {
    currentPositions.add(new PVector(random(width), random(height)));
  }
}

void draw() {
  background(0);
  for (PVector vector : currentPositions) {
    ellipse(vector.x, vector.y, 10, 10);
  }
}
Notice that I'm just hardcoding their positions to be random. This is what your MCVE should do as well. And then in the draw() function, I'm simply drawing each vector. This is like drawing a single frame for you.
Now that we have that, we can create a nextFrame() function that moves the vectors based on the ArrayList (our model) and not what's drawn on the screen!
void nextFrame() {
  for (PVector vector : currentPositions) {
    vector.x += random(-2, 2);
    vector.y += random(-2, 2);
  }
}
Again, I'm just hardcoding a random movement, but you would be reading these from your file. Then we just call the nextFrame() function as the last line in the draw() function:
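Putting the two snippets together, the end of the sketch would look something like this:
void draw() {
  background(0);
  for (PVector vector : currentPositions) {
    ellipse(vector.x, vector.y, 10, 10);
  }
  nextFrame(); // advance the model only after the current frame has been drawn
}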
If you're still having trouble, I highly recommend posting an MCVE similar to mine and posting a new question. Good luck.

efficiently calculate locations for rectangles in a unit grid

I'm working on a specific layout algorithm to display photos in a unit based grid. The desired behaviour is to have every photo placed in the next available space line by line.
Since there could easily be a thousand photos whose positions need to be calculated at once, efficiency is very important.
Has this problem maybe been solved with an existing algorithm already?
If not, how can I approach it to be as efficient as possible?
Edit
Regarding the positioning:
What I'm basically doing right now is iterating every line of the grid cell by cell until I find room to fit the element. That's why 4 is placed next to 2.
How about keeping a list of next available row by width? Initially the next-available-row list looks like:
(0,0,0,0,0)
When you've added the first photo, it looks like
(0,0,0,0,1)
Then
(0,0,0,2,2)
Then
(0,0,0,3,3)
Then
(1,1,1,4,4)
And the final photo doesn't change the list.
This could be efficient because you're only maintaining a small list, updating a little bit at each iteration (versus searching the entire space every time). It gets a little complicated - there could be a situation (with a tall photo) where the nominal next available row doesn't work, and then you could default to the existing approach. But overall I think this should save a fair amount of time, at the cost of a little added complexity.
Update
In response to @matteok's request for a coordinateForPhoto(width, height) method:
Let's say I called that array "nextAvailableRowByWidth".
public Coordinate coordinateForPhoto(int width, int height) {
    int rowIndex = nextAvailableRowByWidth[width - 1]; // because arrays are zero-indexed
    int[] row = space[rowIndex];
    int column = findConsecutiveEmptySpace(width, row);
    for (int i = 1; i < height; i++) {
        if (!consecutiveEmptySpaceExists(width, space[rowIndex + i], column)) {
            return null;
            // return and fall back on the slow method, starting at rowIndex
        }
    }
    // now either you broke out and are solving some other way,
    // or your starting point is rowIndex, column. Done.
    return new Coordinate(rowIndex, column);
}
Update #2
In response to @matteok's request for how to update the nextAvailableRowByWidth array:
OK, so you've just placed a new photo of height H and width W at row R. Any elements in the array which are less than R don't change (because this change didn't affect their row, so if there were 3 consecutive spaces available in the row before placing the photo, there are still 3 consecutive spaces available in it after). Every element which is in the range (R, R+H) needs to be checked, because it might have been affected. Let's postulate a method maxConsecutiveBlocksInRow() - because that's easy to write, right?
public void updateAvailableAfterPlacing(int W, int H, int R) {
    for (int i = 0; i < nextAvailableRowByWidth.length; i++) {
        if (nextAvailableRowByWidth[i] < R) {
            continue;
        }
        int r = R;
        while (maxConsecutiveBlocksInRow(r) < i + 1) {
            r++;
        }
        nextAvailableRowByWidth[i] = r;
    }
}
I think that should do it.
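In case it helps, here is one possible shape for that postulated helper, assuming (as in the snippets above) a 2D occupancy grid called space where 0 marks an empty cell:
int maxConsecutiveBlocksInRow(int row) {
    int best = 0, run = 0;
    for (int col = 0; col < space[row].length; col++) {
        if (space[row][col] == 0) {   // empty cell extends the current run
            run++;
            best = Math.max(best, run);
        } else {                      // occupied cell breaks the run
            run = 0;
        }
    }
    return best;
}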
How about a matrix (your example would be 5x9) where each cell has a value representing its distance from the top left corner (for instance (row+1)*(column+1); the +1 is only necessary if your first row and column are 0)? In this matrix you look for the area which has the lowest value (when summing up the values of empty cells).
A 2nd matrix (or a 3rd dimension of the first matrix) stores the status of each cell.
edit:
int[][] grid = new int[9][5];
int[] filledRows = new int[9];
int photoWidth = 2;
int photoHeight = 1;
int emptyRowCounter = 0;
boolean photoFits = true;

for (int i = 0; i < grid.length; i++) {
    for (int m = 0; m < filledRows.length; m++) {
        if (filledRows[m] - (photoHeight - 1) > i || filledRows[m] + (photoHeight - 1) < i) {
            for (int j = 0; j < grid[i].length; j++) {
                if (grid[i][j] == 0) {
                    for (int k = 0; k < photoWidth; k++) {
                        for (int l = 0; l < photoHeight; l++) {
                            if (grid[i + l][j + k] != 0) {
                                photoFits = false;
                            }
                        }
                    }
                } else {
                    emptyRowCounter++;
                }
            }
            if (photoFits) {
                // place photo at i,j
            }
            if (emptyRowCounter == 5) {
                filledRows[i] = 1;
            }
        }
    }
}
In the gif you have above, it turned out nicely that there was a photo (5) that could fit into the gap under (1) and to the left of (2). My intuition suggests we want to avoid creating gaps like that. Here is an idea that should avoid these gaps.
Maintain a list of "open regions", where an open region has a int leftBoundary, an int topBoundary, and an optional int bottomBoundary. The first open region is just the whole grid (leftBoundary:0, topBoundary: 0, bottom: null).
Sort the photos by height, breaking ties by width.
Until you have placed all photos:
Choose the tallest photo (in case of ties, choose the widest of the tallest photos). Find the first open region it can fit in (such that grid.Width - region.leftBoundary >= photo.Width). Place the photo at the top left of this region. When you place this photo, it may span the entire width or height of the region.
If it spans both the width and the height of the region, the region is filled! Remove this region from the list of open regions.
If it spans the width, but not the height, add the photo's height to the topBoundary of the region.
If it spans the height, but not the width, add the photo's width to the leftBoundary of the region.
If it does not span the height or width of the boundary, we are going to conceptually divide this region into two: one region will cover the space directly to the right of this photo (call it rightRegion), and the other region will cover the space below this region (call it belowRegion).
rightRegion = {
    leftBoundary = parentRegion.leftBoundary + photo.width,
    topBoundary = parentRegion.topBoundary,
    bottomBoundary = parentRegion.topBoundary + photo.height
}
belowRegion = {
    leftBoundary = 0,
    topBoundary = parentRegion.topBoundary + photo.height,
    bottomBoundary = parentRegion.bottomBoundary
}
Replace the current region in the list of open regions with rightRegion, and insert belowRegion directly after rightRegion.
You can visualize how this algorithm would work on your example: First, it would sort the photos: (2,3,4,1,5).
It considers 2, which fits into the first region (the whole grid). When it places 2 at the top left, it splits that region into the space directly to the right of 2, and the space below 2.
Then, it considers 3. It considers the open regions in turn. The first open region is to the right of 2. 3 fits there, so that's where it goes. It spans the width of the region, so the region's topBoundary gets adjusted downward.
Then, it considers 4. It again fits in the first open region, so it places 4 there. 4 spans the height of the region, so the region's leftBoundary gets adjusted rightward.
Then, 1 gets put in the 1x1 gap to the right of 4, filling its region. Finally, 5 gets put just below 2.
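To make the bookkeeping concrete, here is a compact Java sketch of the region list described above (class and member names are mine; the caller is responsible for offering photos tallest-first, ties broken by width):
import java.util.ArrayList;
import java.util.List;

class RegionPacker {
    static class Region {
        int left, top;
        Integer bottom; // null = open all the way down
        Region(int left, int top, Integer bottom) {
            this.left = left; this.top = top; this.bottom = bottom;
        }
        int height() { return bottom == null ? Integer.MAX_VALUE : bottom - top; }
    }

    final int gridWidth;
    final List<Region> open = new ArrayList<>();

    RegionPacker(int gridWidth) {
        this.gridWidth = gridWidth;
        open.add(new Region(0, 0, null)); // the whole grid is open initially
    }

    // Place one photo (w x h), offered tallest-first. Returns {x, y} or null.
    int[] place(int w, int h) {
        for (int i = 0; i < open.size(); i++) {
            Region r = open.get(i);
            if (gridWidth - r.left < w || r.height() < h) continue;
            int x = r.left, y = r.top;
            boolean spansWidth = (w == gridWidth - r.left);
            boolean spansHeight = (r.bottom != null && h == r.height());
            if (spansWidth && spansHeight) {
                open.remove(i);  // region completely filled
            } else if (spansWidth) {
                r.top += h;      // push the region's top boundary down
            } else if (spansHeight) {
                r.left += w;     // push the region's left boundary right
            } else {
                // split: the space to the right, then the space below
                open.set(i, new Region(r.left + w, r.top, r.top + h));
                open.add(i + 1, new Region(0, r.top + h, r.bottom));
            }
            return new int[] { x, y };
        }
        return null; // nothing fits
    }
}
Note that belowRegion deliberately starts at leftBoundary = 0, as in the description above, so later short photos can slide back under earlier tall ones.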

OpenCV : Transparent area of imported .png file is now white

I'm trying to develop a small and simplistic webcam-controlled game, where the user moves a figure on the x-axis by tracking a lighting source with the webcam (e.g. a flashlight).
So far my code generates a target object every couple of seconds at a random location in the picture.
That object is stored as a Mat via
Mat target = imread("target.png");
In order to paint the object onto the background image, I'm using
bgClear.copyTo(temp);

for (int i = targetX; i < target.cols + targetX; i++) {
    for (int j = targetY; j < target.rows + targetY; j++) {
        temp.at<Vec3b>(j,i) = target.at<Vec3b>(j-targetY, i-targetX);
    }
}

temp.copyTo(bg);
where bgClear represents the clean background, temp the background copy that is being edited, and bg the final background that is being shown, including the object.
targetX and targetY are the starting coordinates of the object (targetX is randomly generated beforehand so that the object spawns at a random location in the upper half of the image), relative to the background, so I'm not iterating through the whole background, only over the range of the object.
It works so far, but I have a problem:
The transparent area of the imported image is now white, and I don't seem to be able to fix it by checking the pixel values with something like
if(target.at<Vec3b>(Point(j-targetY,i-targetX))[0] != 255 &&
target.at<Vec3b>(Point(j-targetY,i-targetX))[1] != 255 &&
target.at<Vec3b>(Point(j-targetY,i-targetX))[2] != 255)
before I am actually replacing the pixel.
I've also tried loading the .png file by adding the -1 flag (alpha channel), but then the image just seems ghosty and can barely be seen.
In case you have problems imagining what I'm talking about, here's a partial screenshot of it: Screenshot
Any advice on how I might fix this?
Regards,
Daniel
You need to handle transparency manually. The general idea is: while copying to temp, only copy pixels that are opaque, i.e. whose alpha value is high.
use CV_LOAD_IMAGE_UNCHANGED (= -1) in imread.
split target to four single channel image using split.
merge first three channels to form a BGR image using merge.
in the paint loop, use newly formed BGR image as source and the unmerged fourth channel (alpha) as mask.
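For reference, this is roughly what those four steps look like with OpenCV's Java bindings (a sketch, not tested against the asker's code; bg, targetX and targetY are the question's variables, and IMREAD_UNCHANGED is the modern name for the old CV_LOAD_IMAGE_UNCHANGED = -1):
import org.opencv.core.*;
import org.opencv.imgcodecs.Imgcodecs;
import java.util.ArrayList;
import java.util.List;

// 1. load with the alpha channel intact
Mat target = Imgcodecs.imread("target.png", Imgcodecs.IMREAD_UNCHANGED);

// 2. split BGRA into four single-channel images
List<Mat> channels = new ArrayList<>();
Core.split(target, channels);

// 3. merge the first three channels back into a BGR image
Mat targetBGR = new Mat();
Core.merge(channels.subList(0, 3), targetBGR);

// 4. use the fourth channel (alpha) as the mask when painting
Mat alpha = channels.get(3);
Mat roi = bg.submat(new Rect(targetX, targetY, target.cols(), target.rows()));
targetBGR.copyTo(roi, alpha); // only pixels with non-zero alpha are copied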
...as I was mentioning in my comment to asif's helpful answer:
Mat target = imread("target.png", CV_LOAD_IMAGE_UNCHANGED); // load image
Mat targetBGR(target.rows, target.cols, CV_8UC3);           // create BGR mat
Mat targetAlpha(target.rows, target.cols, CV_8UC1);         // create alpha mat

Mat out[] = {targetBGR, targetAlpha};                       // create array of matrices
int from_to[] = { 0,0, 1,1, 2,2, 3,3 };                     // create array of index pairs

// finally split target into 3-channel BGR plus 1-channel alpha
mixChannels(&target, 1, out, 2, from_to, 4);
...as described in this example. (minus the R-B-channel-swapping).
...later in the pixel-processing loop:
if (targetAlpha.at<uchar>(j-targetY, i-targetX) > 0)
    temp.at<Vec3b>(j,i) = targetBGR.at<Vec3b>(j-targetY, i-targetX);
Working like a charm!

Algorithm to Generate All Possible Black and White Pixel Images in 640 x 360 Dimensions?

I have very minimal programming experience.
I would like to write a program that will generate and save as a gif image every possible image that can be created using only black and white pixels in 640 by 360 px dimensions.
In other words, each pixel can be either black or white. 640 x 360 = 230,400 pixels. So I believe total of 460,800 images are possible to be generated (230,400 x 2 for black/white).
I would like a program to do this automatically.
Please help!
First, to answer your questions: yes, there will be writing on "some" pictures. In fact every text ever written by a human that fits in 640x360 pixels will show up, as will every text not yet written, and every text that never will be written. You will also see a picture of every human who is, was, or will be alive. See the Infinite Monkey Theorem for further information.
The code to create your wanted gif is fairly easy. I used Java for this. Note that you need an extra class: AnimatedGifEncoder. The code is not memory-bound because the AnimatedGifEncoder writes each image to disk as soon as it is computed. But make sure that you have enough disk space available.
import java.awt.Color;
import java.awt.image.BufferedImage;

public class BigPicture {
    private final int width;
    private final int height;

    private final int WHITE = Color.WHITE.getRGB();
    private final int BLACK = Color.BLACK.getRGB();

    public BigPicture(int width, int height) {
        this.width = width;
        this.height = height;
    }

    public void process(String outFile) {
        AnimatedGifEncoder gif = new AnimatedGifEncoder();
        gif.setSize(width, height);
        gif.setTransparent(null); // no transparency
        gif.setRepeat(-1);        // play only once
        gif.setDelay(0);          // 0 ms delay between images,
                                  // 'cause ain't nobody got time for that!
        gif.start(outFile);

        BufferedImage bufferedImage = new BufferedImage(width, height, BufferedImage.TYPE_BYTE_BINARY);

        // set the image to all white
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                bufferedImage.setRGB(x, y, WHITE);
            }
        }

        // add white image
        gif.addFrame(bufferedImage);

        // add all other combinations
        while (increase(bufferedImage)) {
            gif.addFrame(bufferedImage);
        }

        gif.finish();
    }

    /**
     * @param bufferedImage
     *            the image to increase
     * @return false if last pixel set to black => image is complete black
     */
    private boolean increase(BufferedImage bufferedImage) {
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                if (bufferedImage.getRGB(x, y) == WHITE) {
                    bufferedImage.setRGB(x, y, BLACK);
                    return true;
                }
                bufferedImage.setRGB(x, y, WHITE);
            }
        }
        return false;
    }

    public static void main(String[] args) {
        new BigPicture(640, 360).process("C:\\temp\\bigpicture.gif");
        System.out.println("finished.");
    }
}
Please be aware that this will take some time. So don't bother waiting and enjoy your life instead! ;)
EDIT: Since my solution is a bit unclear, I will explain the algorithm.
I have defined a method called increase. This method takes the BufferedImage and changes its bit pattern so that the next bit pattern appears. The method is essentially a binary increment by one. It returns false once the image reaches the last bit pattern (all pixels set to black).
As long as it is possible to increase the bit pattern (i.e. increase() returns true) we will save the image as new frame and increase the image again.
How the increase() method works: The method runs over the image first in x-direction then in y-direction. I assume that white pixels are 0 and black pixels are 1. So, we want to take the bit pattern of the image and add 1. We inspect the first pixel: if it is white (0) we can add 1 without an overflow so we turn the pixel to black (0 + 1 = 1 => black pixel). After that we return from the method because we want to increase only one position. It returns true because an increase was possible. If we encounter a black pixel we have an overflow (1 + 1 = 2 or in binary 10). So we have to set the current pixel to white and add the 1 to the next pixel. This will continue until we find the first white pixel.
example:
First we create a print method: this method prints the image as a binary number. Note that the number is reversed, so the most significant bit is the bit on the right side.
public void print(BufferedImage bufferedImage) {
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++) {
            if (bufferedImage.getRGB(x, y) == WHITE) {
                System.out.print(0); // white pixel
            } else {
                System.out.print(1); // black pixel
            }
        }
    }
    System.out.println();
}
now we modify our main-while loop:
print(bufferedImage); // this one prints the empty image
while (increase(bufferedImage)) {
    print(bufferedImage);
}
and now run a short example to test:
new BigPicture(1, 5).process("C:\\temp\\bigpicture.gif");
and finally the output:
00000 // 0 this is the first print before the loop -> "white image"
10000 // 1 the first white pixel is set to black
01000 // 2 the first overflow, so the second pixel is set to black "2"
11000 // 3
00100 // 4
10100 // 5
01100
11100
00010 // 8
10010
01010
11010
00110
10110
01110
11110
00001 // 16
10001
01001
11001
00101
10101
01101
11101
00011
10011
01011
11011
00111
10111
01111
11111 // 31 == 2^5 - 1
finished.
In other words, each pixel can be either black or white. 640 x 360 = 230,400 pixels. So I believe total of 460,800 images are possible to be generated (230,400 x 2 for black/white).
There is a little flaw in your belief. You are right about the number of pixels: 230,400. Unfortunately, this means there are not 2 * 230,400, but 2 ^ 230,400 possible pictures, which is a number with more than 60,000 digits (longer than the allowed answer size, I am afraid). For comparison, a number with just 29 digits already measures the diameter of the observable universe in centimeters (a centimeter being roughly the width of a pinkie).
In order to understand why your computation of the number of pictures is wrong consider this example: if your pictures contained only three pixels, you could have 8 different pictures (2 ^ 3), rather than 6 (2 * 3). Here are all of them: BBB, BBW, BWB, BWW, WBB, WBW, WWB, WWW. Adding another pixel doubles the size of possible pictures because you can have it white for all the 3-pixel cases, or black for all the 3-pixel cases. Doubling 1 (which is the amount of pictures you can have with 0 pixels) 230,400 times gives you 2 ^ 230,400.
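If you want to see that number for yourself, a few lines of Java will count its digits (plain arithmetic, unrelated to the generator above):
import java.math.BigInteger;

public class CountImages {
    public static void main(String[] args) {
        // 2 ^ (640 * 360) distinct black-and-white images
        BigInteger count = BigInteger.valueOf(2).pow(640 * 360);
        // prints 69358 -- comfortably "more than 60,000 digits"
        System.out.println(count.toString().length());
    }
}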
It's great that there is a bounty on the question, but it is rather distracting and counter-productive if it was just an April Fools' joke.
I'm going to go ahead and pinch some code from a related question, just for fun.
from itertools import product

# each `matrix` is one candidate image: a tuple of 640 * 360 zeros and ones
for matrix in product([0, 1], repeat=640 * 360):
    # render and save your .gif
    ...
As all the comments have already stated, good luck!
On a more serious note, if you didn't want to be absolutely sure that you had all permutations, you could generate a random 640x360 matrix and store it as an image.
Perform this action say 100k times, and you'll have at least an interesting set of pictures to look at, but it's infeasible to get every possible permutation.
You could then delete all identical files to reduce the set to just the unique images.
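A hedged sketch of that sampling idea (the frame count and file names are arbitrary choices of mine):
import java.awt.image.BufferedImage;
import java.io.File;
import java.util.Random;
import javax.imageio.ImageIO;

public class RandomFrames {
    public static void main(String[] args) throws Exception {
        Random rng = new Random();
        for (int i = 0; i < 100_000; i++) {
            BufferedImage img = new BufferedImage(640, 360, BufferedImage.TYPE_BYTE_BINARY);
            for (int y = 0; y < 360; y++) {
                for (int x = 0; x < 640; x++) {
                    // each pixel is independently black or white
                    img.setRGB(x, y, rng.nextBoolean() ? 0xFFFFFFFF : 0xFF000000);
                }
            }
            ImageIO.write(img, "png", new File(String.format("frame-%06d.png", i)));
        }
    }
}
Duplicates are so unlikely at this image size that the de-duplication pass is really just a formality.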

Retrieve color information from images

I need to determine the amount/quality of color in an image in order to compare it with other images and to recommend to the user (the owner of the image) that maybe he needs to print it in black and white rather than in color.
So far I'm analyzing the image and extracting some data of it:
The number of different colors I find in the image
The percentage of color in the whole page (color pixels / total pixels)
For further analysis I may need other characteristic of these images. Do you know what else is important (or I'm missing here) in image analysis?
After some time I found a missing (and very important) characteristic which helped me a lot with the analysis of the images. I don't know if there is a name for it, but I called it the average color of the image:
While looping over all the pixels of the image and counting each color, I also retrieved the RGB values and summed up all the reds, greens and blues of all the pixels, just to come up with this average color which, again, saved my life when I wanted to compare certain kinds of images.
The code is something like this:
import java.awt.Color;
import java.awt.image.BufferedImage;
import java.io.File;
import javax.imageio.ImageIO;

File f = new File("image.jpg");
BufferedImage im = ImageIO.read(f);
int tot = 0;
long red = 0;   // long, to avoid overflow on large images
long green = 0;
long blue = 0;
int w = im.getWidth();
int h = im.getHeight();

// Going over all the pixels
for (int i = 0; i < w; i++) {
    for (int j = 0; j < h; j++) {
        Color pix = new Color(im.getRGB(i, j));
        if (!sameARGB(pix)) { // sameARGB: my helper that compares the RGB values,
                              // i.e. reports whether the pixel is just a grey tone
            tot += 1;
            red += pix.getRed();
            green += pix.getGreen();
            blue += pix.getBlue();
        }
    }
}
And you should get the results like this:
// Percentage of color in the image
double per = (double) tot / (h * w);
// Average color <-------------
Color c = new Color((int) (red / tot), (int) (green / tot), (int) (blue / tot));
