How do ASCII art image conversion algorithms work? [closed] - algorithm

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
There are some nice free "image to ASCII art" conversion sites like this one: ASCII-art.org
How does such an image conversion algorithm work?
,
. W ,
W W #
W ,W W
, W, :W* .W .
# WW #WW WW #
W WW.WWW WW: W
W. WW*WWW# WW# W
* :WW.WWWWWWW#WWW#W #
+* #WW#WWWWWWWWWWWWW# W
W# #WWWWWWWWWWWWWWWWW W
WW WWWWWWWWWWWWWWWWWW W
WW WWWWWWWWWWWWWWWWWW#W#
,WW.WWWWWWWWWWWWWWWWWWWWW
WW#WWWWWWWWWWWWWWWWWWWWW
: WWWWWWWWWWWWWWWWWWWWWWWW :
# WWWWWWWW#WWWWWWW##WWWWWW.
W*WWWWWW::::#WWW:::::#WWWWW
WWWWWW#:: :+*:. ::#WWWW
WWWWW#:*:.:: .,.:.:WWWW
#WWWW#:.:::. .:: #:#WWW
:WWW#:#. :: :WWWW:#WWWW
WWW#*:W#*#W . W:#WWW
#WWWW:# :: :: *WWWW
W#WW*W .::,.::::,:+ ##WW#,
WWWW## ,,.: .:::.: . .WWW:,
#WWW#: W..::::: #. :WWWW
WWWW:: *..:. ::.,. :WWWW
WWWW:: :.:.: : :: ,#WW#
WWWW: .:, : ,, :WW,
.: # : , : *
W + ., ::: ., : #
W :: .: W
#,,,W:. ,, ::*#*:, . :#W.,,#
+.....*: : : .#WWWWW: : .#:....+,
#...:::*:,, : :WWWWWWW, , *::::..,#
:...::::::W:, #W::::*W. :W:::::...#
###########W#####W######W#####W##########:

The big-picture-level concept is simple:
1. Each printable character can be assigned an approximate gray-scale value; the "hash" sign # is obviously visually darker than the "plus" sign +, for example. The effect will vary depending on the font and spacing actually used.
2. Based on the proportions of the chosen font, group the input image into rectangular pixel blocks with constant width and height (e.g. a rectangle 4 pixels wide and 5 pixels high). Each such block will become one character in the output. (Using the pixel blocks just mentioned, a 240w-x-320h image would become 64 lines of 60 characters.)
3. Compute the average gray-scale value of each pixel block.
4. For each pixel block, select a character whose gray-scale value (from step 1) is a good approximation of the pixel block average (from step 3).
That's the simplest form of the exercise. A more sophisticated version will also take the actual shapes of the characters into account when breaking ties among candidates for a pixel block. For example, a "slash" (/) would be a better choice than a "backward slash" (\) for a pixel block that appears to have a bottom-left-to-upper-right contrast feature.
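A minimal sketch of that simplest form in C# (my own illustration, not code from any of the sites mentioned; the character ramp and the 4 x 5 block size are arbitrary choices):

using System;
using System.Drawing;
using System.Text;

class AsciiSketch
{
    // characters ordered roughly from darkest to lightest; the exact ramp is a free choice
    const string Ramp = "#W@%*+:,. ";

    static string ToAscii(Bitmap bmp, int blockW = 4, int blockH = 5)
    {
        var sb = new StringBuilder();
        for (int y = 0; y + blockH <= bmp.Height; y += blockH)
        {
            for (int x = 0; x + blockW <= bmp.Width; x += blockW)
            {
                // average gray-scale value of the block (0 = black, 255 = white)
                double sum = 0;
                for (int dy = 0; dy < blockH; dy++)
                    for (int dx = 0; dx < blockW; dx++)
                    {
                        Color c = bmp.GetPixel(x + dx, y + dy);
                        sum += 0.299 * c.R + 0.587 * c.G + 0.114 * c.B;
                    }
                double avg = sum / (blockW * blockH);
                // map the average to a character of roughly matching darkness
                int idx = (int)(avg / 256.0 * Ramp.Length);
                sb.Append(Ramp[Math.Min(idx, Ramp.Length - 1)]);
            }
            sb.AppendLine();
        }
        return sb.ToString();
    }
}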

aalib (last release in 2001) is an open source ASCII art library that's used in applications like mplayer. You may want to check out its source code to see how it does it. Beyond that, this page describes in more detail how such algorithms work.

Also, you can take a look at libcaca (latest release 2014), which according to its website has the following improvements over aalib:
Unicode support
2048 available colours (some devices can only handle 16)
dithering of colour images
advanced text canvas operations (blitting, rotations)

I found this CodeProject article written by Daniel Fisher containing a simple C# implementation of an image-to-ASCII-art conversion algorithm.
These are the steps the program/library performs:
Load the Image stream to a bitmap object
Grayscale the bitmap using a Graphics object
Loop through the image's pixels (we don't want one ASCII character per pixel, so we take one per 10 x 5 block)
To let every pixel influence the resulting ASCII character, we loop over them and calculate the average brightness of the current 10 x 5 block.
Finally, append an ASCII character for the current block based on the calculated brightness.
Quite easy, isn't it?
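For the grayscaling step, a common way to do it with a Graphics object is a ColorMatrix that applies the usual luminance weights; this is a generic sketch of that technique, not the article's exact code:

using System.Drawing;
using System.Drawing.Imaging;

static Bitmap ToGrayscale(Bitmap source)
{
    var gray = new Bitmap(source.Width, source.Height);
    using (Graphics g = Graphics.FromImage(gray))
    {
        // every output R, G and B channel becomes the same weighted sum of the input channels
        var cm = new ColorMatrix(new float[][]
        {
            new float[] { 0.299f, 0.299f, 0.299f, 0, 0 },
            new float[] { 0.587f, 0.587f, 0.587f, 0, 0 },
            new float[] { 0.114f, 0.114f, 0.114f, 0, 0 },
            new float[] { 0,      0,      0,      1, 0 },
            new float[] { 0,      0,      0,      0, 1 }
        });
        var attrs = new ImageAttributes();
        attrs.SetColorMatrix(cm);
        g.DrawImage(source,
            new Rectangle(0, 0, source.Width, source.Height),
            0, 0, source.Width, source.Height, GraphicsUnit.Pixel, attrs);
    }
    return gray;
}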
BTW: In the comments to the article I found this cool AJAX implementation: Gaia Ajax ASCII Art Generator:
[...] I felt compelled to demonstrate it could easily be done in a standardized set of web technologies. I set out to see if I could find some libraries to use, and I found Sau Fan Lee's CodeProject article about his ASCII-fying .NET library.
P.S.: Lucas (see comments) found another CodeProject article.

Related

ORB Feature Descriptor Official Paper Explanation

I was just reading the official ORB paper by Ethan Rublee (Official Paper) and I find the section "4.3 Learning Good Binary Features" somewhat hard to understand.
I was searching the Internet to dig deeper into it and found the paragraph below. I still haven't found a practical explanation of it. Can any of you explain this to me in simple terms?
"Given a local image patch in size of m × m, and suppose the local window
(i.e., the box filter used in BRIEF) used for intensity test is of size r × r , there are N = (m − r )2 such local windows.
Each two of them can define an intensity test, so we have C2N bit features. In the original implementation of ORB, m is set to 31, generating 228,150 binary tests. After removing tests that overlap, we finally have a set of 205,590 candidate bit features. Based on a training set, ORB selects at most 256 bits according to Greedy algorithm."
What I am getting from the official paper and from the above paragraph is this:
We have a patch of size 31x31 and select a window of size 5x5. We will have N = (31-5)^2 = 676 possible sub-windows. I am not getting the lines marked in bold. What does it mean that by removing tests that overlap, we get 205,590 bit features?
Imagine a small image with size 31x31 (the patch) and a small 5x5 window. How many different positions can this window be placed in within the image? If you slide it 1 pixel at a time then it can be placed in (31-5)^2 = 676 different positions, right? Combining the central pixels of the 676 windows two at a time you have 676!/(2!*(676-2)!) = 228,150 combinations. In the case of the ORB descriptor they were not interested in sliding the window 1 pixel at a time; it could be quite noisy because of the overlap between windows that are very near each other. So they removed overlapping windows by sliding 5 pixels at a time and used their central pixels to create the binary tests, which reduced the total number of combinations to 205,590.
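As a worked version of those counts (the 205,590 figure after overlap removal is quoted from the paper and not re-derived here), the number of windows and the number of pairwise tests follow from the standard "N choose 2" formula:

N = (m - r)^2 = (31 - 5)^2 = 676,
\qquad
\binom{N}{2} = \frac{676 \cdot 675}{2} = 228{,}150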

YUV to XYZ conversion

I would like to convert a 10-bit, BT.2020 YUV color image to XYZ color components. Can anybody help me with this?
Also, are the Y component in YUV and the L component in Lab the same?
According to this document, Y in YUV is the same as Y in the CIE XYZ space. However, L in the CIE LAB space has a nonlinear relation with Y. You can check the relation in the same document, equation 19.
So the short answer to your question is no. Also, for colorspace conversion, I prefer this library.
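For reference, the CIE 1976 relation between lightness L* and relative luminance Y/Y_n, which is the nonlinearity referred to above (this is the standard formula, stated here for convenience rather than copied from the linked document):

L^* = 116\, f\!\left(\frac{Y}{Y_n}\right) - 16,
\qquad
f(t) = \begin{cases} t^{1/3}, & t > \left(\frac{6}{29}\right)^3 \\ \frac{1}{3}\left(\frac{29}{6}\right)^2 t + \frac{4}{29}, & \text{otherwise} \end{cases}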
I know this is an old question but I have recently come across the same problem and only found a hint of what needs to be done by accident (a reference in a Google Books search sample, p. 251 of "Digital Video and HD: Algorithms and Interfaces").
This is my understanding of what needs to be done, so it could be incorrect and/or incomplete:
Calculate the Normalised Primary Matrix (NPM) using SMPTE RP-177 section 3.3 in combination with the colour primaries (CIE 1931 chromaticity coordinates) of the colour space (e.g. Rec.709). There is a worked example in Annex B.1.
Convert the YUV (YCbCr) values to RGB using the matrix coefficients of the colour space.
Then apply the NPM to the RGB values to calculate the XYZ coordinates.
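A minimal sketch of that pipeline, assuming full-range 10-bit, non-constant-luminance BT.2020 Y'CbCr. The 1.4746 / 1.8814 / 0.16455 / 0.57135 factors are derived from the published BT.2020 luma coefficients (Kr = 0.2627, Kb = 0.0593); the NPM is left as an input to be computed per SMPTE RP 177 from the Rec.2020 primaries, and linearisation (the transfer function) is glossed over, as in the steps above:

// Sketch: 10-bit full-range BT.2020 Y'CbCr -> R'G'B' -> XYZ.
static double[] YCbCrToXyz(int y10, int cb10, int cr10, double[,] npm)
{
    // normalise 10-bit code values: Y' to [0, 1], Cb/Cr to [-0.5, 0.5]
    double y  = y10 / 1023.0;
    double cb = cb10 / 1023.0 - 0.5;
    double cr = cr10 / 1023.0 - 0.5;

    // BT.2020 (non-constant-luminance) Y'CbCr -> R'G'B'
    double r = y + 1.4746 * cr;
    double g = y - 0.16455 * cb - 0.57135 * cr;
    double b = y + 1.8814 * cb;

    // apply the normalised primary matrix: XYZ = NPM * RGB
    double[] rgb = { r, g, b };
    double[] xyz = new double[3];
    for (int i = 0; i < 3; i++)
        for (int j = 0; j < 3; j++)
            xyz[i] += npm[i, j] * rgb[j];
    return xyz;
}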

Comment image's "source code"

When you open an image in a text editor you get some characters which don't really make sense (at least not to me). Is there a way to add comments to that text, so that the file would not appear damaged when opened with an image viewer?
So, something like this:
Would turn into this:
The way to go is setting metadata fields if your image format supports any.
For example, for a PNG you can set a comment field when exporting the file or with a separate tool like exiftool:
exiftool -comment="One does not simply put text into the image data" test.png
If the purpose of the text is to ensure ownership then take a look at digital watermarking.
If you are looking to actually encode information in your image you should use steganography (https://en.wikipedia.org/wiki/Steganography).
The wiki article runs you through the basics and shows a picture of a cat hidden in a picture of trees as an example of how you can hide information. In the case of hiding text you can do the following:
Encoding
Come up with your phrase: for argument's sake I'll use the word Hidden
Convert that text to a numeric representation - for simplicity I'll assume ASCII conversion of characters, but you don't have to
"Hidden" = 72 105 100 100 101 110
Convert the numeric representation to Binary
72 = 01001000 / 105 = 01101001 / 100 = 01100100 / 101 = 01100101 / 110 = 01101110
For each letter, convert the 8-bit binary representation into four 2-bit binary values that we shall call mA, mR, mG, mB for reasons that will become clear shortly
72 = 01 00 10 00 => 1 0 2 0 = mA mR mG mB
Open an image file for editing: I would suggest using C# to load the image and then use Get/Set Pixels to edit them (How to manipulate images at the pixel level in C# )
Use the last 2 bits of each color channel of each pixel to encode your message. For example, to encode H in the first pixel of an image you can use the C# code at the end of the instructions.
Once all letters of the Word - one per pixel - have been encoded in the image you are done.
Decoding
Use the same basic process in reverse.
You walk through the image one pixel at a time
You take the 2 least significant bits of each color channel in the pixel
You concatenate the LSBs together in alpha, red, green, blue order.
You convert the concatenated bits into an 8 bit representation and then convert that binary form to base 10. Finally, you perform a look up on the base 10 number in an ASCII chart, or just cast the number to a char.
You repeat for the next pixel
The thing to remember is that the technique I described allows you to encode information in the image without a human observer noticing, because it only manipulates the last 2 bits of each color channel in a pixel, and human eyes cannot really tell the difference between colors in the range of [(252,252,252,252) => (255,255,255,255)].
But as food for thought, I will mention that a computer can, with the right algorithms, and there is active research into improving a computer's ability to pick this sort of thing out.
So if you only want to put in a watermark then this should work, but if you want to actually hide something you have to encrypt the message and then perform the steganography on the encrypted binary. Since encrypted data is MUCH larger than plain text data, it will require an image with far more pixels.
Here is the code to encode H into the first pixel of your image in C#.
// H = 72 = 01 00 10 00, split into the message Alpha, Red, Green, Blue components
int mA = 1;
int mR = 0;
int mG = 2;
int mB = 0;
Bitmap myBitmap = new Bitmap("YourImage.bmp");
// pixel (0, 0) is the first pixel
Color pixelColor = myBitmap.GetPixel(0, 0);
// 252 = 11111100 keeps the 6 bits we aren't manipulating and clears the last 2;
// OR-ing then writes the 2 message bits into each channel
pixelColor = Color.FromArgb(
    (pixelColor.A & 252) | mA,
    (pixelColor.R & 252) | mR,
    (pixelColor.G & 252) | mG,
    (pixelColor.B & 252) | mB);
myBitmap.SetPixel(0, 0, pixelColor);
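A matching decode sketch (my own addition, not from the article), assuming the same 2-bits-per-channel layout in alpha, red, green, blue order and reading from the image that carries the message:

// read back one character from pixel (0, 0)
Bitmap stegoBitmap = new Bitmap("ImageWithMessage.bmp");
Color p = stegoBitmap.GetPixel(0, 0);
// take the 2 least significant bits of each channel
int a2 = p.A & 3;
int r2 = p.R & 3;
int g2 = p.G & 3;
int b2 = p.B & 3;
// concatenate in alpha, red, green, blue order into one 8-bit value
int value = (a2 << 6) | (r2 << 4) | (g2 << 2) | b2;
char decoded = (char)value; // 72 -> 'H'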

Detect black dots from color background

My short question
How to detect the black dots in the following images? (I paste only one test image to make the question look compact. More images can be found →here←).
My long question
As shown above, the background color is roughly blue, and the dots' color is "black". If I pick one black pixel and measure its color in RGB, the value can be (0, 44, 65) or (14, 69, 89).... Therefore, we cannot simply set a range to tell whether a pixel is part of a black dot or the background.
I tested 10 images of different colors, but I hope I can find a method to detect the black dots against a more complicated background which may be made up of three or more colors, as long as human eyes can identify the black dots easily. Some extremely small or blurry dots can be omitted.
Previous work
Last month I asked a similar question on Stack Overflow, but did not get a perfect solution, though there were some excellent answers. Find more details about my work if you are interested.
Here are the methods I have tried:
Converting to grayscale or to the brightness of the image. The difficulty is that I cannot find an adaptive threshold for binarization. Obviously, turning a color image to grayscale or using the brightness (HSV) loses much useful information. Otsu's algorithm, which calculates an adaptive threshold, does not work either.
Calculating the RGB histogram. In my last question, natan's method was to estimate the black color from the histogram. It is time-saving, but the adaptive threshold is also a problem.
Clustering. I have tried k-means clustering and found it quite effective for backgrounds that have only one color. The shortcoming (see my own answer) is that I need to set the number of cluster centers in advance, but I don't know what the background will be. What's more, it is too slow! My application is for real-time capture on iPhone and it can currently process 7~8 frames per second using k-means (20 FPS would be good, I think).
Summary
I think not only similar colors but also adjacent pixels should be "clustered" or "merged" in order to extract the black dots. Please point me toward a proper way to solve my problem. Any advice or algorithm will be appreciated. There is no free lunch, but I hope for a better trade-off between cost and accuracy.
I was able to get some pretty nice first pass results by converting to HSV color space with rgb2hsv, then using the Image Processing Toolbox functions imopen and imregionalmin on the value channel:
rgb = imread('6abIc.jpg');
hsv = rgb2hsv(rgb);
openimg = imopen(hsv(:, :, 3), strel('disk', 11));
mask = imregionalmin(openimg);
imshow(rgb);
hold on;
[r, c] = find(mask);
plot(c, r, 'r.');
And the resulting images (for the image in the question and one chosen from your link):
You can see a few false positives and missed dots, as well as some dots that are labeled with multiple points, but a few refinements (such as modifying the structure element used in the opening step) could clean these up some.
I was curious to test my old 2D peak finder code on the images without any threshold or any color considerations; really crude, don't you think?
im0=imread('Snap10.jpg');
im=(abs(255-im0));
d=rgb2gray(im);
filter=fspecial('gaussian',16,3.5);
p=FastPeakFind(d,0,filter);
imagesc(im0); hold on
plot(p(1:2:end),p(2:2:end),'r.')
The code I'm using is a simple 2D local maxima finder; there are some false positives, but all in all this captures most of the points with no duplication. The filter I used was a 2D Gaussian with width and std similar to a typical blob (the best would have been to use a matched filter for your problem).
A more sophisticated version that does treat the colors (rgb2hsv?) could improve this further...
Here is an extraordinarily simplified version that can be extended to full RGB, and it also does not use the Image Processing Toolbox. Basically you do a 2-D convolution with a filter image (which is an example of the dot you are looking for), and the points where the convolution returns the highest values are the best matches for the dots. You can then of course threshold that. Here is a simple binary image example of just that.
%creating a dummy image with a bunch of small white crosses
im = zeros(100,100);
numPoints = 10;
% randomly chose the location to put those crosses
points = randperm(numel(im));
% keep only certain number of points
points = points(1:numPoints);
% get the row and columns (x,y)
[xVals,yVals] = ind2sub(size(im),points);
for ii = 1:numel(points)
x = xVals(ii);
y = yVals(ii);
try
% create the crosses, try statement is here to prevent index out of bounds
% not necessarily the best practice but whatever, it is only for demonstration
im(x,y) = 1;
im(x+1,y) = 1;
im(x-1,y) = 1;
im(x,y+1) = 1;
im(x,y-1) = 1;
catch err
end
end
% display the randomly generated image
imshow(im)
% create a simple cross filter
filter = [0,1,0;1,1,1;0,1,0];
figure; imshow(filter)
% perform convolution of the random image with the cross template
result = conv2(im,filter,'same');
% get the number of white pixels in filter
filSum = sum(filter(:));
% look for all points in the convolution results that matched identically to the filter
matches = find(result == filSum);
%validate all points found
sort(matches(:)) == sort(points(:))
% get x and y coordinate matches
[xMatch,yMatch] = ind2sub(size(im),matches);
I would highly suggest looking at the conv2 documentation on MATLAB's website.

How to detect boundaries of a pattern [duplicate]

Closed 10 years ago.
Possible Duplicate:
Detecting thin lines in blurry image
So as the title says, I am trying to detect boundaries of patterns. In the images attached, you can basically see three different patterns.
Close stripe lines
One thick L shaped line
The area between 1 & 2
I am trying to separate these three into, say, 3 separate images. Depending on where the answers go, I will upload more images if needed. Either ideas or code will be helpful.
You can solve (for some values of "solve") this problem using morphology. First, to make the image more uniform, remove irrelevant minima. One way to do this is using the h-dome transform for regional minima, which suppresses minima of height < h. Now, we want to join the thin lines. That is accomplished by a morphological opening with a horizontal line of length l. If the lines were merged, then the regional minima of the current image are the background. So we can fill holes to obtain the relevant components. The following code summarizes these tasks:
f = rgb2gray(imread('http://i.stack.imgur.com/02X9Z.jpg'));
hm = imhmin(f, h);
o = imopen(hm, strel('line', l, 0));
result = imfill(~imregionalmin(o), 'holes');
Now, you need to determine h and l. The parameter h is expected to be easier since it is not related to the scale of the input, and in your example, values in the range [10, 30] work fine. To determine l maybe a granulometry analysis could help. Another way is to check if the result contains two significant connected components, corresponding to the bigger L shape and the region of the thin lines. There is no need to increase l one by one, you could perform something that resembles a binary search.
Here are the hm, o and result images with h = 30 and l = 15 (l in [13, 19] works equally well here). This approach gives flexibility in choosing the parameters, making it easier to pick/find good values.
To calculate the area in the space between the two largest components, we could merge them and simply count the black pixels inside the new connected component.
You can pass a window (10x10 pixels?) over the image and collect features for each window. The features could be something as simple as the cumulative gradients (edges) within that window. This would distinguish the various areas as long as the window is big enough.
Then, using each window as a data point, you can do some clustering, or if the patterns don't vary that much you can use some simple thresholds to determine which data points belong to which pattern (the largest gradient sums belong to the small stripe lines: more edges; the smallest gradient sums belong to the thickest lines: only one edge; and those in between belong to the other "in-between" pattern).
Once you have this classification, you can create separate images if need be.
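A rough sketch of that windowed-gradient idea in C#, assuming a grayscale Bitmap and a 10x10 window; the "gradient" here is just a sum of absolute horizontal and vertical pixel differences, which is one of many possible feature choices:

using System;
using System.Drawing;

static double[,] WindowGradientSums(Bitmap gray, int win = 10)
{
    int rows = gray.Height / win, cols = gray.Width / win;
    var sums = new double[rows, cols];
    for (int by = 0; by < rows; by++)
        for (int bx = 0; bx < cols; bx++)
        {
            double sum = 0;
            // cumulative absolute gradient within the window
            for (int y = by * win; y < (by + 1) * win - 1; y++)
                for (int x = bx * win; x < (bx + 1) * win - 1; x++)
                {
                    int c = gray.GetPixel(x, y).R;
                    sum += Math.Abs(gray.GetPixel(x + 1, y).R - c)
                         + Math.Abs(gray.GetPixel(x, y + 1).R - c);
                }
            sums[by, bx] = sum;
        }
    return sums; // threshold or cluster these values to separate the three patterns
}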
Just throwing out ideas. You can binarize the image and do connected component labelling. Then perform some analysis on the connected components such as width to discriminate between the regions.
