I'm new to Go and trying to improve my skills. I'm currently working with images and need the red value of every pixel in an image. I know I can use the code below to achieve this, but it seems slow to me (~485 ms):
pixList := make([]uint8, width*height)
for y := 0; y < height; y++ {
for x := 0; x < width; x++ {
r, _, _, _ := img.At(x, y).RGBA()
var rNew uint8 = uint8(float32(r)*(255.0/65535.0))
pixList[(x*height)+y] = rNew
}
}
Is there any faster way to do this? Any built-in functions to get all pixel values at once?
Edit: I'm now using the Pix to get all pixel data but still my Pix list is not giving what I'm looking for.
new code:
pixList := img.(*image.Paletted).Pix
newPixList := make([]uint8, width*height)
fmt.Println(len(pixList))//gives width*height, shouldn't it be width*height*4?
for index := 0; index < width*height; index++ {
newPixList[index] = pixList[index*4]//this part gives index out of range error, because the pixList is length of width*height, i dunno why
}
I think it isn't treating my image as an RGBA image; maybe a conversion would work. Any ideas?
Thanks.
You can't make this pattern performant, because this requires an interface method call for every pixel. For fast access to the image data, you access the image's data directly. Take the image.RGBA type for example:
type RGBA struct {
// Pix holds the image's pixels, in R, G, B, A order. The pixel at
// (x, y) starts at Pix[(y-Rect.Min.Y)*Stride + (x-Rect.Min.X)*4].
Pix []uint8
// Stride is the Pix stride (in bytes) between vertically adjacent pixels.
Stride int
// Rect is the image's bounds.
Rect Rectangle
}
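The docs for each image type include the data layout and indexing formula. For this type you can extract the red value of every pixel from the Pix slice with the loop below (it assumes Stride == 4*w, which holds for a newly created or freshly decoded image but not for a sub-image):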
w, h := img.Rect.Dx(), img.Rect.Dy()
pixList := make([]uint8, w*h)
for i := 0; i < w*h; i++ {
pixList[i] = img.Pix[i*4]
}
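For sub-images, or any image whose bounds do not start at (0, 0), the indexing formula from the struct comment has to be applied per row. The following is a minimal sketch of that; `redValues` and `demoRedValues` are names made up for this illustration, not part of the standard library.

```go
package main

import (
	"fmt"
	"image"
	"image/color"
)

// redValues returns the red channel of every pixel in img, row by row.
// It locates each row with PixOffset, so it also works for sub-images
// whose Stride is larger than 4*width or whose bounds do not start at (0, 0).
func redValues(img *image.RGBA) []uint8 {
	b := img.Bounds()
	out := make([]uint8, 0, b.Dx()*b.Dy())
	for y := b.Min.Y; y < b.Max.Y; y++ {
		off := img.PixOffset(b.Min.X, y)
		for x := 0; x < b.Dx(); x++ {
			out = append(out, img.Pix[off+x*4])
		}
	}
	return out
}

// demoRedValues builds a 4x4 image whose red value encodes the position
// (10*y + x) and extracts the reds of the 2x2 sub-image at (1,1)-(3,3).
func demoRedValues() []uint8 {
	img := image.NewRGBA(image.Rect(0, 0, 4, 4))
	for y := 0; y < 4; y++ {
		for x := 0; x < 4; x++ {
			img.SetRGBA(x, y, color.RGBA{R: uint8(10*y + x), A: 255})
		}
	}
	sub := img.SubImage(image.Rect(1, 1, 3, 3)).(*image.RGBA)
	return redValues(sub)
}

func main() {
	fmt.Println(demoRedValues()) // [11 12 21 22]
}
```

Calling PixOffset once per row keeps the inner loop to a plain slice read, which is the point of going through Pix in the first place.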
If you need to convert other image types, you can use the existing methods to do the color conversion, but first assert the correct image type and use the type's native *At method to avoid the interface call. Extracting the approximate red values from a YCbCr image:
w, h := img.Rect.Dx(), img.Rect.Dy()
pixList := make([]uint8, w*h)
for x := 0; x < w; x++ {
	for y := 0; y < h; y++ {
		// YCbCrAt takes absolute coordinates, so offset by Rect.Min.
		r, _, _, _ := img.YCbCrAt(img.Rect.Min.X+x, img.Rect.Min.Y+y).RGBA()
		// Column-major index, matching the layout used in the question.
		pixList[(x*h)+y] = uint8(r >> 8)
	}
}
Just as the YCbCr image above has no "red" pixels (the value has to be computed for each individual pixel), a paletted image has no individual RGBA values per pixel: Pix holds one palette index per pixel (which is why its length is width*height), and the color has to be looked up in the image's palette. You can take this one step further and convert the palette colors to a concrete type up front, which removes the Color.RGBA() interface call from the per-pixel loop and speeds it up even more:
// Convert every palette entry to color.RGBA once, up front.
palette := make([]color.RGBA, len(img.Palette))
for i, c := range img.Palette {
	palette[i] = color.RGBAModel.Convert(c).(color.RGBA)
}
// Pix holds one palette index per pixel.
pixList := make([]uint8, len(img.Pix))
for i, p := range img.Pix {
	pixList[i] = palette[p].R
}
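Putting the pieces together, "assert the correct image type" can be written as a type switch with a slow generic fallback. This is only a sketch under a few assumptions: `redChannel` and `demoRedChannel` are hypothetical names, the output is row-major (unlike the column-major layout in the question), and every palette index in Pix is assumed to be valid for the palette.

```go
package main

import (
	"fmt"
	"image"
	"image/color"
)

// redChannel returns 8-bit red values in row-major order, choosing a
// fast path based on the concrete image type.
func redChannel(img image.Image) []uint8 {
	b := img.Bounds()
	out := make([]uint8, 0, b.Dx()*b.Dy())
	switch im := img.(type) {
	case *image.RGBA:
		for y := b.Min.Y; y < b.Max.Y; y++ {
			off := im.PixOffset(b.Min.X, y)
			for x := 0; x < b.Dx(); x++ {
				out = append(out, im.Pix[off+x*4])
			}
		}
	case *image.Paletted:
		// Resolve each palette entry once; Pix holds one index per pixel.
		reds := make([]uint8, len(im.Palette))
		for i, c := range im.Palette {
			reds[i] = color.RGBAModel.Convert(c).(color.RGBA).R
		}
		for y := b.Min.Y; y < b.Max.Y; y++ {
			off := im.PixOffset(b.Min.X, y)
			for _, p := range im.Pix[off : off+b.Dx()] {
				out = append(out, reds[p])
			}
		}
	default:
		// Generic path: one interface call per pixel.
		for y := b.Min.Y; y < b.Max.Y; y++ {
			for x := b.Min.X; x < b.Max.X; x++ {
				r, _, _, _ := img.At(x, y).RGBA()
				out = append(out, uint8(r>>8))
			}
		}
	}
	return out
}

// demoRedChannel runs redChannel on a 2x2 paletted image with two colors.
func demoRedChannel() []uint8 {
	pal := color.Palette{
		color.RGBA{R: 200, A: 255},
		color.NRGBA{R: 50, G: 60, B: 70, A: 255},
	}
	img := image.NewPaletted(image.Rect(0, 0, 2, 2), pal)
	img.SetColorIndex(1, 0, 1)
	img.SetColorIndex(0, 1, 1)
	return redChannel(img)
}

func main() {
	fmt.Println(demoRedChannel()) // [200 50 50 200]
}
```

Further cases (for example *image.NRGBA or *image.YCbCr) can be added to the switch as needed; anything not listed still works through the default branch, just more slowly.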
Related
I have a C library and function that expects a pointer to byte array that contains a 24 bit bitmap in RGB format. Alpha channel is not important and can be truncated. I've tried something like this:
func load(filePath string) *image.RGBA {
imgFile, err := os.Open(filePath)
if err != nil {
fmt.Printf("Cannot read file %v\n", err)
}
defer imgFile.Close()
img, _, err := image.Decode(imgFile)
if err != nil {
fmt.Printf("Cannot decode file %v\n", err)
}
return img.(*image.RGBA)
}
img := load("myimg.png")
bounds := img.Bounds()
width, height := bounds.Max.X, bounds.Max.Y
// Convert to RGB? Probably not...
newImg := image.NewNRGBA(image.Rect(0, 0, width, height))
draw.Draw(newImg, newImg.Bounds(), img, bounds.Min, draw.Src)
// Pass image pointer to C function.
C.PaintOnImage(unsafe.Pointer(&newImg.Pix[0]), C.int(newImg.Bounds().Dy()), C.int(newImg.Bounds().Dx()))
However, it seems that NRGBA is also built on 4 bytes per pixel. I could probably solve this by using GoCV, but that seems like overkill for such a simple task. Is there a way to do this in a simple and efficient manner in Go?
There is no RGB image type in the standard library, but you can assemble your RGB array pretty easily:
bounds := img.Bounds()
rgb := make([]byte, bounds.Dx()*bounds.Dy()*3)
idx := 0
for y := bounds.Min.Y; y < bounds.Max.Y; y++ {
for x := bounds.Min.X; x < bounds.Max.X; x++ {
offs := img.PixOffset(x, y)
copy(rgb[idx:], img.Pix[offs:offs+3])
idx += 3
}
}
The img.Pix data holds the 4-byte RGBA values. The code above just copies the leading 3-byte RGB values of all pixels.
Since the pixels of a line are contiguous in the Pix array, you can improve the above code by calling PixOffset only once per line and advancing by 4 bytes for every pixel. Manually copying 3 bytes may also be faster than calling copy() (benchmark if it matters to you):
bounds := img.Bounds()
rgb := make([]byte, bounds.Dx()*bounds.Dy()*3)
idx := 0
for y := bounds.Min.Y; y < bounds.Max.Y; y++ {
offs := img.PixOffset(bounds.Min.X, y)
for x := bounds.Min.X; x < bounds.Max.X; x++ {
rgb[idx+0] = img.Pix[offs+0]
rgb[idx+1] = img.Pix[offs+1]
rgb[idx+2] = img.Pix[offs+2]
idx += 3
offs += 4
}
}
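The loops above need an *image.RGBA, while image.Decode often returns another concrete type (PNG files commonly decode to *image.NRGBA, for instance), so the type assertion in the question's load function can panic. Below is a rough sketch that converts with draw.Draw when necessary and then packs the RGB bytes; `toRGB` and `demoToRGB` are names invented for this example. Note that *image.RGBA stores alpha-premultiplied values, so the output only equals the original colors for opaque pixels.

```go
package main

import (
	"fmt"
	"image"
	"image/color"
	"image/draw"
)

// toRGB returns the pixels of img as tightly packed 3-byte RGB triples.
// Images that are not already *image.RGBA are converted with draw.Draw first.
func toRGB(img image.Image) []byte {
	b := img.Bounds()
	rgba, ok := img.(*image.RGBA)
	if !ok {
		rgba = image.NewRGBA(b)
		draw.Draw(rgba, b, img, b.Min, draw.Src)
	}
	rgb := make([]byte, 0, b.Dx()*b.Dy()*3)
	for y := b.Min.Y; y < b.Max.Y; y++ {
		off := rgba.PixOffset(b.Min.X, y)
		for x := 0; x < b.Dx(); x++ {
			rgb = append(rgb, rgba.Pix[off:off+3]...) // drop the alpha byte
			off += 4
		}
	}
	return rgb
}

// demoToRGB packs a 2x1 opaque NRGBA image.
func demoToRGB() []byte {
	src := image.NewNRGBA(image.Rect(0, 0, 2, 1))
	src.SetNRGBA(0, 0, color.NRGBA{R: 1, G: 2, B: 3, A: 255})
	src.SetNRGBA(1, 0, color.NRGBA{R: 4, G: 5, B: 6, A: 255})
	return toRGB(src)
}

func main() {
	fmt.Println(demoToRGB()) // [1 2 3 4 5 6]
}
```

The resulting slice is contiguous, so a pointer to its first element (&rgb[0]) is what would be handed to the C function, with the usual cgo caveat that the C side must not keep the pointer after the call returns.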
I'm attempting to resize and convert an image into a grayscale slice of float64 (so I can do some transformations on the floats) and then back into an image, but I'm not sure how to convert a []float64 back to RGB or something I can turn into an image.
So far I have:
// colorToGrayScaleFloat64 reduces rgb
// to a grayscale approximation.
func colorToGrayScaleFloat64(c color.Color) float64 {
r, g, b, _ := c.RGBA()
return 0.299*float64(r) +
0.587*float64(g) +
0.114*float64(b)
}
func getFloatPixels(filePath string) {
size := 32
f, err := os.Open(filePath)
if err != nil {
log.Fatalf("FAILED TO OPEN FILE: %s", err.Error())
}
defer f.Close()
image, _, err := image.Decode(f)
if err != nil {
log.Fatalf("FAILED TO DECODE FILE: %s", err.Error())
}
// Resize image to 32X32
im := imaging.Resize(image, size, size, imaging.Lanczos)
// Convert image to grayscale float array
vals := make([]float64, size*size)
for i := 0; i < size; i++ {
for j := 0; j < size; j++ {
vals[size*i+j] = colorToGrayScaleFloat64(im.At(i, j))
}
}
fmt.Printf("pixel vals %+v\n", vals)
}
Which produces the output:
pixel vals [40315.076 48372.797 48812.780999999995 47005.557 ... 25129.973999999995 24719.287999999997]
How can I convert this pixel []float64 back to an image?
So basically you have the luminosity of the gray pixel, and you want to have a color.Color value representing it.
This is quite simple: there is a color.Gray type, and a higher precision color.Gray16 which model the color with its luminosity, so simply create a value of those. They implement color.Color, so you can use them to set pixels of an image.
col := color.Gray{uint8(lum / 256)}
Also note that your colorToGrayScaleFloat64() function is already present in the standard lib. There are several converters in the image/color package as implementations of color.Model. Use the color.GrayModel or color.Gray16Model to convert a color.Color to a value of type color.Gray or color.Gray16 which directly store the luminosity of the gray color.
For example:
gray := color.Gray16Model.Convert(img.At(x, y))
lum := float64(gray.(color.Gray16).Y)
Testing it:
c := color.RGBA{111, 111, 111, 255}
fmt.Println("original:", c)
gray := color.Gray16Model.Convert(c)
lum := float64(gray.(color.Gray16).Y)
fmt.Println("lum:", lum)
col := color.Gray{uint8(lum / 256)}
r, g, b, a := col.RGBA()
a >>= 8
fmt.Println("lum to col:", r/a, g/a, b/a, a)
fmt.Println()
This will output (try it on the Go Playground):
original: {111 111 111 255}
lum: 28527
lum to col: 111 111 111 255
Also note that if you want to create an image full of gray pixels, you may use the image.Gray and image.Gray16 types so when drawing these colors on them, no color conversion will be needed. They also have designated Gray.SetGray() and Gray16.SetGray16() methods that directly take colors of these types.
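To make the conversion back to an image concrete, here is a small sketch that fills an image.Gray from a []float64. It assumes the luminosities are in the 16-bit range produced by the question's function (0 to 65535) and stored row by row (index y*w + x); the question's own loop fills the slice as size*i+j with i passed as the x coordinate, so the index would need to be swapped for that data. `floatsToGray` and `demoFloatsToGray` are illustrative names only.

```go
package main

import (
	"fmt"
	"image"
	"image/color"
	"math"
)

// floatsToGray builds an 8-bit grayscale image from luminosities in the
// 16-bit range [0, 65535], stored row by row (index = y*w + x).
func floatsToGray(vals []float64, w, h int) *image.Gray {
	img := image.NewGray(image.Rect(0, 0, w, h))
	for y := 0; y < h; y++ {
		for x := 0; x < w; x++ {
			// Clamp first: transformed values may leave the valid range.
			v := math.Max(0, math.Min(65535, vals[y*w+x]))
			img.SetGray(x, y, color.Gray{Y: uint8(v / 256)})
		}
	}
	return img
}

// demoFloatsToGray converts four sample values and returns the raw pixels.
func demoFloatsToGray() []uint8 {
	vals := []float64{0, 65535, 28527, 70000}
	return floatsToGray(vals, 2, 2).Pix
}

func main() {
	fmt.Println(demoFloatsToGray()) // [0 255 111 255]
}
```

The returned *image.Gray implements image.Image, so it can be passed straight to an encoder such as png.Encode.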
See related:
Converting RGBA image to Grayscale Golang
Reading the answers from Get a pixel array from from golang image.Image, I saw that there were two methods of pixel RGBA retrieval, via img.At() and rgba.Pix().
Which is better to use? Should one always be used, or are there cases where one should be used over the other and vice versa?
If your program needs most or all of the pixel data, rgba.Pix() (shorthand here for reading the Pix slice of an *image.RGBA) significantly outperforms img.At(). If you only need one or a few pixels of an image, use img.At(): converting the image to *image.RGBA first, which is the prerequisite for reading Pix, costs more than it saves in such cases.
Here are the results of various test loads, the durations of which are averaged over 10 samples each.
| Method | 1x1 | 1000x667 | 3840x2160 | 1000x667 + computation | 1000x667 only 5x5 accessed |
|---|---|---|---|---|---|
| img.At() | 195ns | 30.211071ms | 294.885396ms | 853.345043ms | 42.431μs |
| rgba.Pix() | 719ns | 7.786029ms | 77.700552ms | 836.480063ms | 6.791461ms |
We can see how for the tiny 1x1 image and the image where we limit our for loops to upper bounds of 5, using img.At() results in faster execution time. However, for use cases where every pixel is fetched, rgba.Pix() results in better performance. This improvement in performance is less evident the more computation we do with every pixel, as the total time increases and the difference between img.At() and rgba.Pix() becomes less obvious, as seen in "1000x667 + computation" in the table above.
Here is the test code used:
func main() {
resp, err := http.Get("IMAGE URL GOES HERE")
if err != nil {
panic(err)
}
defer resp.Body.Close()
img, _, err := image.Decode(resp.Body)
if err != nil {
panic(err)
}
var start time.Time
var duration time.Duration
samples := 10
var sum time.Duration
fmt.Println("Samples: ", samples)
sum = time.Duration(0)
for i := 0; i < samples; i++ {
start = time.Now()
usingAt(img)
duration = time.Since(start)
sum += duration
}
fmt.Println("*** At avg: ", sum/time.Duration(samples))
sum = time.Duration(0)
for i := 0; i < samples; i++ {
start = time.Now()
usingPix(img)
duration = time.Since(start)
sum += duration
}
fmt.Println("*** Pix avg: ", sum/time.Duration(samples))
}
func usingAt(img image.Image) {
bounds := img.Bounds()
width, height := bounds.Max.X, bounds.Max.Y
for y := 0; y < height; y++ {
for x := 0; x < width; x++ {
r, g, b, _ := img.At(x, y).RGBA()
_ = uint8(r >> 8)
_ = uint8(g >> 8)
_ = uint8(b >> 8)
}
}
}
func usingPix(img image.Image) {
bounds := img.Bounds()
width, height := bounds.Max.X, bounds.Max.Y
rgba := image.NewRGBA(bounds)
draw.Draw(rgba, bounds, img, bounds.Min, draw.Src)
for y := 0; y < height; y++ {
for x := 0; x < width; x++ {
index := (y*width + x) * 4
pix := rgba.Pix[index : index+4]
_ = pix[0]
_ = pix[1]
_ = pix[2]
}
}
}
1000x667 only 5x5 accessed replaced the height and width in the for loops with 5 and 5, limiting the number of pixels accessed.
1000x667 + computation actually used the RGB values by comparing each pixel's color distance from a target color with go-colorful's DE2000 calculation.
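For anyone who wants to repeat the comparison without downloading an image, here is a rough sketch using testing.Benchmark from the standard library instead of hand-rolled timing. The synthetic image, the function names (`sumAt`, `sumPix`, `testImage`) and the checksum approach are assumptions of this example, not the setup used for the table above; both functions return a sum so the pixel reads are actually used.

```go
package main

import (
	"fmt"
	"image"
	"image/color"
	"image/draw"
	"testing"
)

// sumAt adds up the 8-bit R, G and B values through the image.Image interface.
func sumAt(img image.Image) (sum uint64) {
	b := img.Bounds()
	for y := b.Min.Y; y < b.Max.Y; y++ {
		for x := b.Min.X; x < b.Max.X; x++ {
			r, g, bl, _ := img.At(x, y).RGBA()
			sum += uint64(r>>8) + uint64(g>>8) + uint64(bl>>8)
		}
	}
	return sum
}

// sumPix converts to *image.RGBA first, then reads the Pix slice directly.
func sumPix(img image.Image) (sum uint64) {
	b := img.Bounds()
	rgba := image.NewRGBA(b)
	draw.Draw(rgba, b, img, b.Min, draw.Src)
	for y := b.Min.Y; y < b.Max.Y; y++ {
		off := rgba.PixOffset(b.Min.X, y)
		row := rgba.Pix[off : off+4*b.Dx()]
		for i := 0; i < len(row); i += 4 {
			sum += uint64(row[i]) + uint64(row[i+1]) + uint64(row[i+2])
		}
	}
	return sum
}

// testImage returns a synthetic opaque image (a stand-in for a decoded file).
func testImage(w, h int) image.Image {
	img := image.NewNRGBA(image.Rect(0, 0, w, h))
	for y := 0; y < h; y++ {
		for x := 0; x < w; x++ {
			img.SetNRGBA(x, y, color.NRGBA{R: uint8(x), G: uint8(y), B: uint8(x + y), A: 255})
		}
	}
	return img
}

func main() {
	img := testImage(1000, 667)
	fmt.Println("checksums equal:", sumAt(img) == sumPix(img))
	at := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			sumAt(img)
		}
	})
	pix := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			sumPix(img)
		}
	})
	fmt.Println("At: ", at)
	fmt.Println("Pix:", pix)
}
```

Absolute numbers will differ by machine, Go version and source image type, so treat any output as relative rather than comparable to the table.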
How do you take the FFT of an image in Google Go?
The Go DSP library (github.com/mjibson/go-dsp/fft) has a function for a 2D FFT with the following signature:
func FFT2Real(x [][]float64) [][]complex128
How do I convert an image from the standard go image types to float64? Is this the right approach?
Here is a link to the documentation.
You have two options, and both involve copying the pixels. You can either use the methods provided by the Image interface, namely At(x, y), or you can assert the image to one of the concrete image types provided by the image package and access its Pix field directly.
Since you will most likely be using a Gray image, you could easily assert your image to type *image.Gray and access the pixels directly but for the sake of abstraction I did not in my example:
inImage, _, err := image.Decode(inFile)
// error checking
bounds := inImage.Bounds()
realPixels := make([][]float64, bounds.Dy())
for y := 0; y < bounds.Dy(); y++ {
realPixels[y] = make([]float64, bounds.Dx())
for x := 0; x < bounds.Dx(); x++ {
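r, _, _, _ := inImage.At(x, y).RGBA()
// RGBA() returns 16-bit values; keep the top 8 bits so the result fits a uint8 when written back.
realPixels[y][x] = float64(r >> 8)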
}
}
This way you read all the pixels of your image inImage and store them as float64 values in a two-dimensional slice, ready to be processed by fft.FFT2Real:
// apply discrete fourier transform on realPixels.
coeffs := fft.FFT2Real(realPixels)
// use inverse fourier transform to transform fft
// values back to the original image.
coeffs = fft.IFFT2(coeffs)
// write everything to a new image
outImage := image.NewGray(bounds)
for y := 0; y < bounds.Dy(); y++ {
for x := 0; x < bounds.Dx(); x++ {
px := uint8(cmplx.Abs(coeffs[y][x]))
outImage.SetGray(x, y, color.Gray{px})
}
}
err = png.Encode(outFile, outImage)
In the code above I applied FFT on the pixels stored in realPixels and then, to see whether it worked, used inverse FFT on the result. The expected result is the original image.
A full example can be found here.
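The direct-access variant mentioned above (asserting to *image.Gray and reading Pix) could look roughly like the sketch below. `asGray`, `grayToFloats` and `demoGrayToFloats` are hypothetical helper names; the result is indexed [y][x] and holds 8-bit values, which is the shape fft.FFT2Real expects as input.

```go
package main

import (
	"fmt"
	"image"
	"image/color"
	"image/draw"
)

// asGray returns img itself if it is already *image.Gray, otherwise an
// 8-bit grayscale copy made with draw.Draw.
func asGray(img image.Image) *image.Gray {
	if g, ok := img.(*image.Gray); ok {
		return g
	}
	b := img.Bounds()
	g := image.NewGray(b)
	draw.Draw(g, b, img, b.Min, draw.Src)
	return g
}

// grayToFloats copies the pixels into a [][]float64 indexed as [y][x],
// reading the Pix slice directly instead of calling At for every pixel.
func grayToFloats(img *image.Gray) [][]float64 {
	b := img.Bounds()
	out := make([][]float64, b.Dy())
	for y := range out {
		off := img.PixOffset(b.Min.X, b.Min.Y+y)
		row := make([]float64, b.Dx())
		for x, p := range img.Pix[off : off+b.Dx()] {
			row[x] = float64(p)
		}
		out[y] = row
	}
	return out
}

// demoGrayToFloats converts a 2x1 RGBA image (mid gray, white).
func demoGrayToFloats() [][]float64 {
	src := image.NewRGBA(image.Rect(0, 0, 2, 1))
	src.SetRGBA(0, 0, color.RGBA{R: 111, G: 111, B: 111, A: 255})
	src.SetRGBA(1, 0, color.RGBA{R: 255, G: 255, B: 255, A: 255})
	return grayToFloats(asGray(src))
}

func main() {
	fmt.Println(demoGrayToFloats()) // [[111 255]]
}
```

Converting once with draw.Draw and then reading Pix keeps the per-pixel work to a slice read, at the cost of one extra copy of the image.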
I have some code which performs the following logical operations:
1. Read in and decode a gif image to a *GIF using gif.DecodeAll
2. Modify some pixels in each frame of the *GIF using image.Set
3. Write out the resulting modified image using gif.EncodeAll
Here's some code snippets to help demonstrate what the code is doing (error handling, file closing, etc removed for brevity):
f, err := os.Open(filename)
reader := bufio.NewReader(f)
g, err := gif.DecodeAll(reader)
err = modify_image(g)
of, err := os.Create("out.gif")
writer := bufio.NewWriter(of)
err = gif.EncodeAll(writer, g)
Here's the modify_image function:
func modify_image(img *gif.GIF) error {
	for i := 0; i < len(img.Image); i++ {
		// Stop at the first frame that fails instead of dropping the error.
		if err := modify_frame(img.Image[i]); err != nil {
			return err
		}
	}
	return nil
}
And modify_frame:
func modify_frame(frame *image.Paletted) error {
xmin := frame.Rect.Min.X
ymin := frame.Rect.Min.Y
xmax := frame.Rect.Max.X
ymax := frame.Rect.Max.Y
for y := ymin; y < ymax; y++ {
for x := xmin; x < xmax; x++ {
if should_turn_pixel_transparent(frame, x, y) {
frame.Set(x, y, color.RGBA64{0, 0, 0, 0})
}
}
}
return nil
}
The out.gif that this code produces has the correct pixels turned transparent, but as the animation proceeds, the pixels which I turned transparent are not "clearing"; i.e. as these transparent pixels are written over non-transparent pixels, the non-transparent pixels underneath are still displayed.
My (brief) understanding is that there are two different methods for representing transparency in gifs. I don't know if I need to use index transparency versus alpha transparency, or if I'm just doing things entirely wrong. Any advice would be appreciated.
This is often omitted or not covered in Go tutorials on generating gifs, but along with setting an entry in the Delay slice for each frame in the Image slice, you can optionally set an entry in the Disposal slice for each frame of the gif. DisposalNone is used if the slice does not have a member corresponding to the current frame index.
Disposal options are:
const (
	DisposalNone       = 0x01 // leave the frame in place; the next frame is drawn on top of it
	DisposalBackground = 0x02 // restore the frame's area to the background color before drawing the next frame
	DisposalPrevious   = 0x03 // restore the frame's area to what was there before this frame was drawn
)
[Images: the resulting gif for each disposal type (DisposalNone, DisposalBackground, DisposalPrevious)]
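As an illustration of setting Disposal for every frame, here is a minimal in-memory sketch. `setDisposal` and `demoDisposal` are made-up names, the two-color palette and 4x4 frames are placeholders, and DisposalBackground is used only as an example; which method suits a given animation depends on how its frames were authored.

```go
package main

import (
	"bytes"
	"fmt"
	"image"
	"image/color"
	"image/gif"
)

// setDisposal assigns the same disposal method to every frame of g.
func setDisposal(g *gif.GIF, method byte) {
	g.Disposal = make([]byte, len(g.Image))
	for i := range g.Disposal {
		g.Disposal[i] = method
	}
}

// demoDisposal builds a two-frame GIF in memory, sets DisposalBackground on
// each frame, encodes it, decodes it again and returns the stored methods.
func demoDisposal() ([]byte, error) {
	// Palette index 0 is fully transparent, index 1 is opaque red.
	pal := color.Palette{color.RGBA{}, color.RGBA{R: 255, A: 255}}
	g := &gif.GIF{}
	for i := 0; i < 2; i++ {
		frame := image.NewPaletted(image.Rect(0, 0, 4, 4), pal)
		frame.SetColorIndex(i, i, 1)
		g.Image = append(g.Image, frame)
		g.Delay = append(g.Delay, 10) // in hundredths of a second
	}
	setDisposal(g, gif.DisposalBackground)

	var buf bytes.Buffer
	if err := gif.EncodeAll(&buf, g); err != nil {
		return nil, err
	}
	out, err := gif.DecodeAll(&buf)
	if err != nil {
		return nil, err
	}
	return out.Disposal, nil
}

func main() {
	d, err := demoDisposal()
	if err != nil {
		panic(err)
	}
	fmt.Println(d) // [2 2]
}
```

In the question's program the equivalent step would be a call like setDisposal(g, ...) between modify_image and gif.EncodeAll.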