While reading about the resolution of a digital image at the following link, http://www.rideau-info.com/photos/whatis.html, I was confused by this paragraph:
If the field of view is 20 feet across, a 3 megapixel camera will be resolving that view at 102 pixels per foot. If that same shot was taken with an 18 Mp camera it would be resolving that view at 259 pixels per foot, 2.5 times more resolution than a 3 Mp camera.
How does the author arrive at the figures of 102 pixels per foot and 259 pixels per foot?
A 3 MP camera, in that article, is 2048 pixels wide x 1536 high. Think of the 2048 pixels across as 2048 boxes laid out in a straight line. Now, if you were to divide these equally amongst 20 sections (the 20 feet of field of view), you would get ~102 boxes per section. Hence the 102 pixels per foot. The same reasoning applies to the 18 MP camera, which is 5184 W x 3456 H: 5184 divided by 20 is ~259.
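The arithmetic is just a division; here is a quick sanity check in Python (pixel widths are the ones from the article):

```python
# Pixels per foot = horizontal pixel count / field of view in feet.
sensors = {
    "3 MP (2048 x 1536)": 2048,
    "18 MP (5184 x 3456)": 5184,
}
field_of_view_ft = 20

for name, width_px in sensors.items():
    print(f"{name}: {width_px / field_of_view_ft:.0f} pixels per foot")

# 3 MP:  2048 / 20 = 102.4 -> ~102 px/ft
# 18 MP: 5184 / 20 = 259.2 -> ~259 px/ft, about 2.5x the linear resolution
```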
I currently display 115 (!) different sponsor icons at the bottom of many web pages on my website. They're lazy-loaded, but even so, that's quite a lot.
At present, these icons are loaded separately, and are sized 75x50 (or x2 or x3, depending on the screen of the device).
I'm toying with the idea of making them all into one sprite, rather than 115 separate files. That would mean, instead of lots of tiny little files, I'd have one large PNG or WEBP file instead. The way I'm considering doing it, the smallest file would be 8,625 pixels across, and the x3 version would be 25,875 pixels across, which seems like a very large image (albeit only 150 px high).
Will an image of this pixel size cause a browser to choke?
Is a sprite the right way to achieve a faster-loading page here, or is there something else I should be considering?
115 icons at 75 pixels wide does indeed work out to a very wide 8,625-pixel image that is only 50 px high...
But you don't have to use a low-height (50 px), very wide (8,625 px) image.
You can make a sensibly sized rectangular image with a grid of icons, say 12 rows of 10 icons each:
10 icons x 75 px = 750 px, plus 9 gaps of 5 px between icons, for a total of about 800 pixels wide;
12 rows x 50 px = 600 px, plus 11 gaps of 5 px between rows, for a total of about 660 pixels tall.
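If it helps, here is a minimal Python sketch of that grid arithmetic (icon size, gap, and column count are the assumptions used above):

```python
# Compute the dimensions of a sprite sheet laid out as a grid of icons.
import math

icon_w, icon_h = 75, 50   # icon size in px
gap = 5                   # spacing between icons in px
count = 115               # number of icons
cols = 10                 # icons per row

rows = math.ceil(count / cols)               # 12 rows for 115 icons
sheet_w = cols * icon_w + (cols - 1) * gap   # 10*75 + 9*5  = 795 px
sheet_h = rows * icon_h + (rows - 1) * gap   # 12*50 + 11*5 = 655 px
print(f"{cols} x {rows} grid -> {sheet_w} x {sheet_h} px")
```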
If my first convolution has 64 filters and my second has 32 filters, will I have:
1 image -> Conv(64 filters) -> 64 filtered images -> Conv(32 filters) -> 64 x 32 = 2048 filtered images
or:
1 image -> Conv(64 filters) -> 64 filtered images -> Conv(32 filters) -> 32 filtered images?
If it is the second: what goes on between the 64 filtered images and the second Conv?
Thanks for your answer; I can't find a good tutorial that explains this clearly, it's always rushed...
Your first option is correct. Convolutions are essentially ways of altering and extracting features from data. We do this by creating m images, each looking at a certain frame of the original image. On the second convolutional layer, we then make n images for each convolved image from the first layer.
So m x n would be the total number of images.
To further this point,
A convolution works by making feature maps of an image. When you have successive convolutional layers, you are making feature maps of feature maps. That is, if I start with 1 image, and my first convolutional layer is of size 20, then I have 20 images (more specifically, feature maps) at the end of convolution 1. Then let's say I add a second convolution of size 10. I am then making 10 feature maps for every 1 image, so it would be 20 * 10 = 200 feature maps.
Let's say, for example, you have a 50x50-pixel image and a convolutional layer with a filter of size 5x5. What happens (if you don't have padding or anything else) is that you "slide" the filter across the image and take a weighted sum of the pixels at each position of the slide. With no padding, that gives an output feature map of size 46x46 (50 - 5 + 1 = 46). If you do this 20 times (i.e. a 5x5x20 convolution), you get 20 feature maps of size 46x46 as output. In the diagram of the VGG network mentioned below, the numbers only show how many feature maps are made from the incoming feature maps, NOT a running total of feature maps.
I hope this explanation was thorough!
Here we have the architecture of VGG-16.
In VGG-16 we have convolutions with 64, 128, 256, and 512 filters,
and in the architecture we see that we don't have 64 images, then 64*128 images, etc.,
but just 64 images, 128 images, etc.
So the correct answer was not the first but the second, which brings back my second question:
"What goes on between the 64 filtered images and the second Conv?"
I think that between a 64-filter conv and a 32-filter conv there is in the end only 1 filter, but applied over two layers of pixels, so it divides the thickness of the conv by 2.
And between a 64-filter conv and a 128-filter conv there are 2 filters over one layer of pixels, so it multiplies the thickness of the conv by 2.
Am I right?
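For what it's worth, what happens between the 64 maps and the second conv in a standard convolution is this: each of the 32 filters in the second layer has a kernel that is 64 channels deep, so it reads all 64 incoming maps at once and sums over them, producing a single output map. A minimal PyTorch sketch (layer sizes taken from the question; kernel size and image size are arbitrary assumptions):

```python
import torch
import torch.nn as nn

conv1 = nn.Conv2d(in_channels=1,  out_channels=64, kernel_size=3, padding=1)
conv2 = nn.Conv2d(in_channels=64, out_channels=32, kernel_size=3, padding=1)

x = torch.randn(1, 1, 28, 28)      # one grayscale image
h = conv1(x)                       # -> (1, 64, 28, 28): 64 feature maps
y = conv2(h)                       # -> (1, 32, 28, 28): 32 maps, not 64*32

# Each filter in conv2 has shape (64, 3, 3): it spans all 64 input maps
# and sums over them, so 32 filters give exactly 32 output maps.
print(conv2.weight.shape)          # torch.Size([32, 64, 3, 3])
```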
I have a character in my scene that has fewer than 6k vertices and fewer than 2k faces. The face limit where the stats panel turns red seems to be 1k.
I have tried it on my desktop, an iPhone 7, and a Pixel 1. It runs at 60 fps on all three. Why is that limit so low?
There's an issue filed for this. https://github.com/aframevr/aframe/issues/3024
The answer is that it shouldn't be.
I have read the original Viola Jones article, the Wikipedia article, the openCV manual and these SO answers:
How does the Viola-Jones face detection method work?
Defining an (initial) set of Haar Like Features
I am trying to implement my own version of the two detectors in the original article (the Adaboost-200-features version and the final cascade version), but something is missing in my understanding, specifically with how the 24 x 24 detector works on the entire image and (maybe) its sub-images on (maybe) different scales. To my understanding, during detection:
(1) The integral image is computed twice, for image variance normalization: once as is, and once with the pixel values squared (see the sketch after this list):
The variance of an image sub-window can be computed quickly using a
pair of integral images.
(2) The 24 x 24 square detector is moved across the normalized image in steps of 1 pixel, deciding for each square whether it is a face (or a different object) or not:
The final detector is scanned across the image at multiple scales and
locations.
(3) Then the image is scaled down by a factor of 1.25, and we go back to (1).
This is done 12 times, until the smaller side of the image rectangle is 24 pixels long (288 in the original image, divided by 1.25^(12-1), is just over 24):
a 384 by 288 pixel image is scanned at 12 scales each a factor of 1.25
larger than the last.
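As a side note for step (1), here is a minimal NumPy sketch of how a pair of integral images gives the per-window mean and variance (window position and image size are illustrative assumptions):

```python
import numpy as np

def integral(img):
    # Summed-area table, padded so that ii[y, x] is the sum over img[:y, :x].
    return np.pad(img.cumsum(0).cumsum(1), ((1, 0), (1, 0)))

def window_sum(ii, y, x, h, w):
    # Sum over img[y:y+h, x:x+w] in four lookups, regardless of h and w.
    return ii[y + h, x + w] - ii[y, x + w] - ii[y + h, x] + ii[y, x]

img = np.random.rand(288, 384)
ii, ii2 = integral(img), integral(img ** 2)

y, x, n = 10, 20, 24                                  # a 24 x 24 sub-window
mean = window_sum(ii, y, x, n, n) / n**2
var = window_sum(ii2, y, x, n, n) / n**2 - mean**2    # E[X^2] - E[X]^2
print(mean, var)
```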
But then I see this quote in the article:
Scaling is achieved by scaling the detector itself, rather
than scaling the image. This process makes sense because the features
can be evaluated at any scale with the same cost. Good detection
results were obtained using scales which are a factor of
1.25 apart.
And I see this quote in the Wikipedia article:
Instead of scaling the image itself (e.g. pyramid-filters), we scale the features.
And now I'm thrown off as to what exactly is going on. How can the 24 x 24 detector be scaled? The entire set of features calculated on this area (whether in the 200-feature version or the ~6K-feature cascade version) is based on originally exploring the 162K possible features of this 24 x 24 rectangle. And why did I get the impression that the image-pyramid paradigm still holds for this algorithm?
What should change in the above algorithm? Is it the image which is scaled or the 24 x 24 detector, and if it is the detector how is it exactly done?
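My best reading of the quoted passage, as an illustrative sketch rather than the paper's actual code: each Haar feature is stored as rectangles in base 24 x 24 coordinates, and at scale s you multiply the rectangle coordinates by s and evaluate against the same integral image, so every feature still costs the same handful of lookups:

```python
import numpy as np

def integral(img):
    # Summed-area table padded with a zero row/column on top/left.
    return np.pad(img.cumsum(0).cumsum(1), ((1, 0), (1, 0)))

def window_sum(ii, y, x, h, w):
    # Four lookups, whatever the rectangle size: scaling is free.
    return ii[y + h, x + w] - ii[y, x + w] - ii[y + h, x] + ii[y, x]

img = np.random.rand(288, 384)
ii = integral(img)

# A two-rectangle (edge) Haar feature in base 24 x 24 coordinates,
# as (y, x, h, w) rectangles; purely a made-up example feature.
base_top, base_bottom = (0, 0, 12, 24), (12, 0, 12, 24)

def eval_feature(ii, wy, wx, s):
    # Scale the rectangle coordinates by s, then evaluate at window (wy, wx).
    (ty, tx, th, tw), (by, bx, bh, bw) = [
        tuple(int(round(v * s)) for v in r) for r in (base_top, base_bottom)
    ]
    return (window_sum(ii, wy + ty, wx + tx, th, tw)
            - window_sum(ii, wy + by, wx + bx, bh, bw))

# Same feature evaluated at scale 1.0 (24 x 24) and scale 1.25 (30 x 30):
print(eval_feature(ii, 10, 20, 1.0), eval_feature(ii, 10, 20, 1.25))
```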
or "How I learned to stop worrying and learned to love measurement systems"
I wanted a central spot that I can refer to later to give me a quick low-down on various units of measurement used in programming. SO seemed the best place to put it, and while I could go ahead and answer the question myself, y'all are a much smarter bunch than I, so I might as well let you do it.
Please pick one unit that you're familiar with, use "#name" in the first line to give it as the heading (making it easy to find), and define it within your answer. Please do not duplicate: add comments or edit existing answers rather than adding a new answer. Similar units are still separate, so please don't define em and en in the same answer. If a unit is exactly the same as another unit, add a line for "aliases" below the heading.
If it's a particularly obscure measurement type, please link to a second reference so people don't downvote you because they've never heard of it.
Point
Pica
Twips
Pixel
Em
En
CPI
DPI
I'm seeing a lot of downvoting - I suppose people believe this doesn't add value to StackOverflow's community. Please consider commenting below if you feel this doesn't add to the community, or if you think this is a bad question. I'm interested in improving it if you have any suggestions.
The great thing about standards is there are so many to choose from!
-Adam
I recommend amending the above answers with the following descriptions.
PICA
A typographic unit of measurement in the Anglo-American point system. One pica is 1/6 inch (4.233 mm) and equals 12 pica points. The didot equivalent of a pica is a cicero. A standard unit of measure in newspapers. There are 6 picas in one inch, 12 points in one pica.
PICA POINT
1/12 of a pica.
POINT
996 points (ATA) are equivalent to 35 centimeters; one point is equal to 0.013837 inches, which comes to about 72.27 points to the inch. In electronic printing we use 72 points per inch.
1 point (Truchet) = 0.188 mm (obsolete today)
1 point (Didot) = 0.376 mm = 1/72 of a French royal inch (27.07 mm)
1 point (ATA) = 0.3514598 mm = 0.013837 inch
1 point (TeX) = 0.3514598035 mm = 1/72.27 inch
1 point (Postscript) = 0.3527777778 mm = 1/72 inch
1 point (l’Imprimerie nationale, IN) = 0.4 mm
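Since these variants are easy to mix up, here is the same table as a small Python helper (values copied from the list above):

```python
# Millimetres per point for the point variants listed above.
MM_PER_POINT = {
    "truchet":    0.188,          # obsolete today
    "didot":      0.376,          # 1/72 French royal inch
    "ata":        0.3514598,      # traditional American printer's point
    "tex":        0.3514598035,   # 1/72.27 inch
    "postscript": 0.3527777778,   # 1/72 inch
    "in":         0.4,            # l'Imprimerie nationale
}

def points_to_mm(value, variant="postscript"):
    return value * MM_PER_POINT[variant]

print(points_to_mm(12))           # one PostScript pica = ~4.233 mm
```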
EM
An old printing term for a square-shaped blank space that’s as wide as the type is high; in other words, a 10-point em space will be 10 points wide.
EN
Half an em space; a 10-point en space will be 5 points wide.
DPI
The number of dots per inch a printer prints. The higher the dpi, the finer the resolution of the output.
PIXEL
The smallest dot you can draw on a computer screen
CPI
Counts per inch (for mouse properties), or the number of horizontal characters that will fit in one inch (for printer properties).
PITCH Alias CPI
Pitch describes the width of a character. Pitch equals the number of characters that can fit side-by-side in 1 inch; for example, 10 pitch equals 10 characters-per-inch or 10 CPI. Pitch is a term generally used with non-proportional (fixed-width) fonts.
TWIPS
A twip (derived from TWentieth of an Imperial Point) is a typographical measurement, defined as 1/20 of a typographical point. One twip is 1/1440 inch or 17.639 µm when derived from the PostScript point at 72 to the inch, and 1/1445.4 inch or 17.573 µm based on the printer's point at 72.27 to the inch
Additional Units:
LPI
The number of vertical lines of text that will fit in one inch
PPI
Pages per inch: the thickness of paper, expressed as the number of pages per inch (or in thousandths of an inch per page).
Sometimes also pixels per inch: the number of horizontal pixels closely printed or displayed per inch.
FONT SIZE
Font size or Type size is the baseline distance for which the font was designed. A font should normally be identified and selected by this size, because the intended baseline distance is much more relevant for practical layout work than the actual dimensions of certain characters.
FONT HEIGHT
Font height is the height in mm of letters such as k or H. Typically, the font height is around 72% of the font size, but this is of course at the discretion of the font designer.
X-HEIGHT
x-height indicates the size of the lower-case letters, excluding ascenders and descenders (taken from the height of the lower-case x).
H-HEIGHT
h-height or cap height refers to the height of a capital letter above the baseline for a particular typeface. It specifically refers to the height of capital letters that are flat—such as H or I—as opposed to round letters such as O.
Pixel
One of the little colored squares on your screen.
Pica
A typographical measure of 12 points, sometimes (incorrectly) called an em (in fact, an em is a horizontal distance equal to the point size of the type).
Twips
'Twentieth of an Imperial Point'. A measure used for marking up positions of widgets in Visual Basic user interfaces. It was used this way so that positions could be specified precisely using integers. One twip = 1/20 point = 1/1440 inch.
DPI
Dots per inch. A dimensionless number used to measure the resolution of something in space, i.e. with respect to real occupied physical size.
It adds complexity and headaches, since the standard/default DPI of a computer screen varies with the operating system: Macintosh screens generally assume 72 DPI, while Windows favors 96. If you don't compensate for this when displaying images (and text), you will get unexpected variations.
It's always amusing when people start talking about "the DPI of this image" for digital images such as PNG or JPEG. To me, they only have absolute pixels in them, unconnected to any physical size. If you want to print the image on a (for instance) 300 DPI printer, then you need to adapt and scale to get it right, but the image itself only has pixels.
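To make that concrete, a tiny sketch of the only arithmetic involved (the 3000 x 2000 px image is a made-up example):

```python
# Physical print size is pixels / DPI; the image file itself has no size.
width_px, height_px = 3000, 2000
dpi = 300

print(f"{width_px / dpi:.1f} x {height_px / dpi:.1f} inches at {dpi} DPI")
# -> 10.0 x 6.7 inches; at 150 DPI the same pixels print 20.0 x 13.3 inches
```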
PostScript Point
1/72 of an inch.