How do you take the FFT of an image in Google Go?
The Go DSP library (github.com/mjibson/go-dsp/fft) has a function for a 2D FFT with the following signature:
func FFT2Real(x [][]float64) [][]complex128
How do I convert an image from the standard Go image types to float64? Is this the right approach?
Here is a link to the documentation.
You have two options, and both involve copying the pixels. You can either use the methods provided by the Image interface, namely At(x, y), or you can assert the image to one of the concrete image types provided by the image package and access its Pix field directly.
Since you will most likely be using a Gray image, you could easily assert your image to type *image.Gray and access the pixels directly, but for the sake of abstraction I did not do so in my example:
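inImage, _, err := image.Decode(inFile)
// error checking
bounds := inImage.Bounds()
realPixels := make([][]float64, bounds.Dy())
for y := 0; y < bounds.Dy(); y++ {
	realPixels[y] = make([]float64, bounds.Dx())
	for x := 0; x < bounds.Dx(); x++ {
		// Bounds need not start at (0, 0), so offset by bounds.Min.
		r, _, _, _ := inImage.At(bounds.Min.X+x, bounds.Min.Y+y).RGBA()
		// RGBA returns 16-bit channel values; shift down to 8 bits
		// so they fit into a color.Gray when written back later.
		realPixels[y][x] = float64(r >> 8)
	}
}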
This way you read all the pixels of your image inImage and store them as float64 values in a two-dimensional slice, ready to be processed by fft.FFT2Real:
// apply discrete fourier transform on realPixels.
coeffs := fft.FFT2Real(realPixels)
// use inverse fourier transform to transform fft
// values back to the original image.
coeffs = fft.IFFT2(coeffs)
// write everything to a new image
outImage := image.NewGray(bounds)
for y := 0; y < bounds.Dy(); y++ {
	for x := 0; x < bounds.Dx(); x++ {
		// Add 0.5 so small floating-point errors round to the
		// nearest value instead of being truncated.
		px := uint8(cmplx.Abs(coeffs[y][x]) + 0.5)
		// Offset by bounds.Min in case the bounds do not start at (0, 0).
		outImage.SetGray(bounds.Min.X+x, bounds.Min.Y+y, color.Gray{Y: px})
	}
}
err = png.Encode(outFile, outImage)
In the code above I applied FFT on the pixels stored in realPixels and then, to see whether it worked, used inverse FFT on the result. The expected result is the original image.
A full example can be found here.
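The other option mentioned above, asserting to *image.Gray and reading Pix directly, could look roughly like the sketch below. The names grayToFloats and sampleGray are made up for illustration, and the sketch assumes the image really is an *image.Gray (a decoded image may well be another type, in which case the assertion fails and At is the fallback).

```go
package main

import (
	"fmt"
	"image"
	"image/color"
)

// grayToFloats copies the pixels of a *image.Gray into a [][]float64,
// indexed as out[y][x], by reading the Pix slice directly.
func grayToFloats(img *image.Gray) [][]float64 {
	b := img.Bounds()
	out := make([][]float64, b.Dy())
	for y := 0; y < b.Dy(); y++ {
		out[y] = make([]float64, b.Dx())
		// Pix[0] is the pixel at b.Min; each row starts Stride bytes
		// after the previous one.
		row := img.Pix[y*img.Stride : y*img.Stride+b.Dx()]
		for x, v := range row {
			out[y][x] = float64(v)
		}
	}
	return out
}

// sampleGray builds a tiny 3x2 image to try the function on
// (hypothetical input, not from the question).
func sampleGray() *image.Gray {
	img := image.NewGray(image.Rect(0, 0, 3, 2))
	img.SetGray(2, 1, color.Gray{Y: 200})
	return img
}

func main() {
	fmt.Println(grayToFloats(sampleGray())) // [[0 0 0] [0 0 200]]
}
```

The resulting slice has the same [y][x] layout as realPixels, so it could be passed to fft.FFT2Real in the same way.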
I'm attempting to resize and convert an image into a grayscale slice of float64 (so I can do some transformations on the floats) and then back into an image, but I'm not sure how to convert a []float64 back to RGB or something I can turn into an image.
So far I have:
// colorToGrayScaleFloat64 reduces rgb
// to a grayscale approximation.
func colorToGrayScaleFloat64(c color.Color) float64 {
	r, g, b, _ := c.RGBA()
	return 0.299*float64(r) +
		0.587*float64(g) +
		0.114*float64(b)
}
func getFloatPixels(filePath string) {
	size := 32
	f, err := os.Open(filePath)
	if err != nil {
		log.Fatalf("FAILED TO OPEN FILE: %s", err.Error())
	}
	defer f.Close()
	image, _, err := image.Decode(f)
	if err != nil {
		log.Fatalf("FAILED TO DECODE FILE: %s", err.Error())
	}
	// Resize image to 32X32
	im := imaging.Resize(image, size, size, imaging.Lanczos)
	// Convert image to grayscale float array
	vals := make([]float64, size*size)
	for i := 0; i < size; i++ {
		for j := 0; j < size; j++ {
			vals[size*i+j] = colorToGrayScaleFloat64(im.At(i, j))
		}
	}
	fmt.Printf("pixel vals %+v\n", vals)
}
Which produces the output:
pixel vals [40315.076 48372.797 48812.780999999995 47005.557 ... 25129.973999999995 24719.287999999997]
How can I convert this pixel []float64 back to an image?
So basically you have the luminosity of the gray pixel, and you want to have a color.Color value representing it.
This is quite simple: there is a color.Gray type, and a higher precision color.Gray16 which model the color with its luminosity, so simply create a value of those. They implement color.Color, so you can use them to set pixels of an image.
col := color.Gray{uint8(lum / 256)}
Also note that your colorToGrayScaleFloat64() function is already present in the standard lib. There are several converters in the image/color package as implementations of color.Model. Use the color.GrayModel or color.Gray16Model to convert a color.Color to a value of type color.Gray or color.Gray16 which directly store the luminosity of the gray color.
For example:
gray := color.Gray16Model.Convert(img.At(x, y))
lum := float64(gray.(color.Gray16).Y)
Testing it:
c := color.RGBA{111, 111, 111, 255}
fmt.Println("original:", c)
gray := color.Gray16Model.Convert(c)
lum := float64(gray.(color.Gray16).Y)
fmt.Println("lum:", lum)
col := color.Gray{uint8(lum / 256)}
r, g, b, a := col.RGBA()
a >>= 8
fmt.Println("lum to col:", r/a, g/a, b/a, a)
fmt.Println()
This will output (try it on the Go Playground):
original: {111 111 111 255}
lum: 28527
lum to col: 111 111 111 255
Also note that if you want to create an image full of gray pixels, you may use the image.Gray and image.Gray16 types so when drawing these colors on them, no color conversion will be needed. They also have designated Gray.SetGray() and Gray16.SetGray16() methods that directly take colors of these types.
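Putting this together for a whole slice, a minimal sketch could look like the following. The function name floatsToGray is made up, and it assumes the values are 16-bit luminosities in [0, 65535] stored row by row as vals[w*y+x]. Note that the question fills its slice as vals[size*i+j] with i as the x coordinate, so the indices would need to be swapped to match.

```go
package main

import (
	"fmt"
	"image"
	"image/color"
)

// floatsToGray builds a w x h image.Gray16 from luminosity values in the
// 16-bit range [0, 65535], laid out as vals[w*y+x] (an assumed layout).
// Out-of-range values are clamped.
func floatsToGray(vals []float64, w, h int) *image.Gray16 {
	img := image.NewGray16(image.Rect(0, 0, w, h))
	for y := 0; y < h; y++ {
		for x := 0; x < w; x++ {
			v := vals[w*y+x]
			if v < 0 {
				v = 0
			} else if v > 65535 {
				v = 65535
			}
			img.SetGray16(x, y, color.Gray16{Y: uint16(v)})
		}
	}
	return img
}

func main() {
	img := floatsToGray([]float64{0, 28527, 65535, 70000}, 2, 2)
	fmt.Println(img.Gray16At(1, 0).Y, img.Gray16At(1, 1).Y) // 28527 65535
}
```

The returned image implements image.Image, so it can be handed to png.Encode or any other encoder as usual.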
See related:
Converting RGBA image to Grayscale Golang
I want to set the blue value of every pixel to 255 if it is equal to 20.
I read the source image and draw.Draw it onto a new image.RGBA so that I can modify the pixels.
But when I take the output image (after running the program), feed it back in as the input, set a breakpoint inside the if block, and run the program in debug mode, the debugger still stops there at multiple points. That means I am not modifying the image correctly.
Can anyone tell me how to modify the pixels and save the image correctly? Thanks a lot.
func changeOnePixelInImage() {
	imgPath := "./source.png"
	f, err := os.Open(imgPath)
	check(err)
	defer f.Close()
	sourceImage, _, err := image.Decode(f)
	check(err)
	size := sourceImage.Bounds().Size()
	destImage := image.NewRGBA(sourceImage.Bounds())
	draw.Draw(destImage, sourceImage.Bounds(), sourceImage, image.Point{}, draw.Over)
	for x := 0; x < size.X; x++ {
		for y := 0; y < size.Y; y++ {
			pixel := sourceImage.At(x, y)
			originalColor := color.RGBAModel.Convert(pixel).(color.RGBA)
			b := originalColor.B
			if b == 20 {
				b = 255 // <--- then i swap source and destination paths, and debug this line
			}
			c := color.RGBA{
				R: originalColor.R,
				G: originalColor.G,
				B: b,
				A: originalColor.A,
			}
			destImage.SetRGBA(x, y, c)
		}
	}
	ext := filepath.Ext(imgPath)
	newImagePath := fmt.Sprintf("%s/dest%s", filepath.Dir(imgPath), ext)
	fg, err := os.Create(newImagePath)
	check(err)
	defer fg.Close()
	err = jpeg.Encode(fg, destImage, &jpeg.Options{100})
	check(err)
}
I found the answer to my question.
The problem was that I was working with a JPEG image, and JPEG is lossy, so pixel values are modified when the image is encoded again, even at quality 100. I found this out from this Stack Overflow question: Is JPEG lossless when quality is set to 100?
So I should have used PNG images. (Even though my source image was named source.png, it was actually a JPEG. :/)
So I changed the last lines to:
if ext != ".png" {
	panic("cannot do my thing with jpg images, since they get compressed")
}
err = png.Encode(fg, destImage)
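To convince yourself that PNG preserves the modified value, a small in-memory round trip can help. This is only a sketch: roundTripPNG and blueSurvives are made-up names, and the 4x4 test image is invented for the check.

```go
package main

import (
	"bytes"
	"fmt"
	"image"
	"image/color"
	"image/png"
)

// roundTripPNG encodes img as PNG into memory and decodes it again.
func roundTripPNG(img image.Image) (image.Image, error) {
	var buf bytes.Buffer
	if err := png.Encode(&buf, img); err != nil {
		return nil, err
	}
	return png.Decode(&buf)
}

// blueSurvives sets one pixel's blue channel to 255, sends the image
// through a PNG round trip and reports whether the pixel is unchanged.
func blueSurvives() bool {
	want := color.RGBA{R: 10, G: 20, B: 255, A: 255}
	src := image.NewRGBA(image.Rect(0, 0, 4, 4))
	src.SetRGBA(1, 2, want)
	out, err := roundTripPNG(src)
	if err != nil {
		return false
	}
	got := color.RGBAModel.Convert(out.At(1, 2)).(color.RGBA)
	return got == want
}

func main() {
	fmt.Println(blueSurvives()) // true
}
```

Running the same kind of check with jpeg.Encode and jpeg.Decode is an easy way to see the values drift.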
I'm new to Go and trying to improve my skills. Currently I'm working with images and I need the red value of every pixel of an image. I know I can use the code below to achieve this, but it seems slow to me (~485 ms):
pixList := make([]uint8, width*height)
for y := 0; y < height; y++ {
	for x := 0; x < width; x++ {
		r, _, _, _ := img.At(x, y).RGBA()
		var rNew uint8 = uint8(float32(r) * (255.0 / 65535.0))
		pixList[(x*height)+y] = rNew
	}
}
Is there any faster way to do this? Any built-in functions to get all pixel values at once?
Edit: I'm now using Pix to get all the pixel data, but my Pix slice still doesn't give me what I'm looking for.
New code:
pixList := img.(*image.Paletted).Pix
newPixList := make([]uint8, width*height)
fmt.Println(len(pixList)) // gives width*height, shouldn't it be width*height*4?
for index := 0; index < width*height; index++ {
	newPixList[index] = pixList[index*4] // this part gives an index out of range error, because pixList has length width*height, I don't know why
}
I think it's not treating my image as an RGBA image; maybe a conversion could work. Any ideas?
Thanks.
You can't make this pattern performant, because it requires an interface method call for every pixel. For fast access, read the image's data directly. Take the image.RGBA type, for example:
type RGBA struct {
	// Pix holds the image's pixels, in R, G, B, A order. The pixel at
	// (x, y) starts at Pix[(y-Rect.Min.Y)*Stride + (x-Rect.Min.X)*4].
	Pix []uint8
	// Stride is the Pix stride (in bytes) between vertically adjacent pixels.
	Stride int
	// Rect is the image's bounds.
	Rect Rectangle
}
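The docs for each image type include the data layout and indexing formula. For this type you could extract the red value of every pixel from the Pix slice with:
w, h := img.Rect.Dx(), img.Rect.Dy()
pixList := make([]uint8, w*h)
// This simple indexing assumes img.Stride == 4*w, which holds for an
// image created with image.NewRGBA but not necessarily for a sub-image.
for i := 0; i < w*h; i++ {
	pixList[i] = img.Pix[i*4]
}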
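If the image might be a sub-image, where rows are not packed back to back, the loop has to step by Stride instead. A minimal sketch of that variant follows; redChannel and subImageReds are made-up names, and the 4x4 test image exists only to exercise the function.

```go
package main

import (
	"fmt"
	"image"
	"image/color"
)

// redChannel returns the red value of every pixel in row-major order.
// It walks the rows using Stride, so it also works for sub-images.
func redChannel(img *image.RGBA) []uint8 {
	w, h := img.Rect.Dx(), img.Rect.Dy()
	out := make([]uint8, 0, w*h)
	for y := 0; y < h; y++ {
		// Pix[0] is the pixel at Rect.Min; a row holds w*4 bytes.
		row := img.Pix[y*img.Stride : y*img.Stride+w*4]
		for x := 0; x < w; x++ {
			out = append(out, row[x*4])
		}
	}
	return out
}

// subImageReds builds a 4x4 image whose red value encodes the position
// (10*y + x) and extracts the reds of its central 2x2 sub-image.
func subImageReds() []uint8 {
	full := image.NewRGBA(image.Rect(0, 0, 4, 4))
	for y := 0; y < 4; y++ {
		for x := 0; x < 4; x++ {
			full.SetRGBA(x, y, color.RGBA{R: uint8(10*y + x), A: 255})
		}
	}
	sub := full.SubImage(image.Rect(1, 1, 3, 3)).(*image.RGBA)
	return redChannel(sub)
}

func main() {
	fmt.Println(subImageReds()) // [11 12 21 22]
}
```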
If you need to handle other image types, you can use the existing methods to do the color conversion, but first assert the concrete image type and use its native *At method to avoid the interface call. For example, extracting the approximate red values from a YCbCr image:
w, h := img.Rect.Dx(), img.Rect.Dy()
pixList := make([]uint8, w*h)
for x := 0; x < w; x++ {
	for y := 0; y < h; y++ {
		r, _, _, _ := img.YCbCrAt(x, y).RGBA()
		pixList[(x*h)+y] = uint8(r >> 8)
	}
}
return pixList
Just as the YCbCr image above has no "red" pixels (the value has to be computed for each individual pixel), a paletted image stores no RGBA values per pixel; each pixel is an index that has to be looked up in the image's palette. You can take this one step further and convert the palette colors to a concrete type up front, which removes the Color.RGBA() interface call from the per-pixel loop and speeds things up even more:
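// Convert each palette entry once. Going through color.RGBAModel works
// for whatever concrete color type the palette holds, whereas asserting
// to *color.RGBA would panic on the usual color.RGBA values.
palette := make([]color.RGBA, len(img.Palette))
for i, c := range img.Palette {
	palette[i] = color.RGBAModel.Convert(c).(color.RGBA)
}
pixList := make([]uint8, len(img.Pix))
for i, p := range img.Pix {
	pixList[i] = palette[p].R
}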
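If you would rather not special-case every image type, one alternative is to convert the image to *image.RGBA once with draw.Draw and then read Pix as shown earlier. This trades an extra copy for simpler code. The sketch below is only an illustration; toRGBA and palettedReds are made-up names and the two-color palette is invented.

```go
package main

import (
	"fmt"
	"image"
	"image/color"
	"image/draw"
)

// toRGBA returns img as *image.RGBA, copying only if it has another type.
func toRGBA(img image.Image) *image.RGBA {
	if rgba, ok := img.(*image.RGBA); ok {
		return rgba
	}
	b := img.Bounds()
	dst := image.NewRGBA(b)
	draw.Draw(dst, b, img, b.Min, draw.Src)
	return dst
}

// palettedReds converts a 2x1 paletted image and returns the red value
// of its two pixels, read straight from Pix.
func palettedReds() []uint8 {
	pal := color.Palette{
		color.RGBA{R: 200, A: 255},
		color.RGBA{R: 50, G: 60, B: 70, A: 255},
	}
	p := image.NewPaletted(image.Rect(0, 0, 2, 1), pal)
	p.SetColorIndex(1, 0, 1) // pixel (1, 0) uses palette entry 1
	rgba := toRGBA(p)
	return []uint8{rgba.Pix[0], rgba.Pix[4]}
}

func main() {
	fmt.Println(palettedReds()) // [200 50]
}
```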
I can't seem to figure out what to do next. My goal is to create an array of all the sub images from the original image using the SubImage function from the image package. I am able to partition an image in the imageSplit() function and pass to imageReceiver() function via a channel.
I actually receive the data in function imageReceiver(), but I don't know how to append to an array and use it after receiving all the images from imageSplit() function.
// Partitions Image
func Partition(src image.Image) []image.Image {
	newImg := image.NewNRGBA64(src.Bounds())
	r := newImg.Rect
	dx, dy := r.Dx(), r.Dy()
	// partitionNum
	pNum := 3
	// partition x
	px, py := (dx / pNum), (dy / pNum)
	imgChan := make(chan image.Image)
	imgStorage := make([]image.Image, 0)
	for i := 1; i < pNum; i++ {
		for j := 1; j < pNum; j++ {
			startX, startY := ((px * i) - px), ((py * j) - py)
			endX, endY := (px * i), (py * j)
			go imageSplit(imgChan, newImg, startX, startY, endX, endY)
			go imageReceiver(imgChan)
		}
	}
	return imgStorage
}
// Creates sub-images of img
func imageSplit(imgChan chan image.Image, img *image.NRGBA64, startX, startY, endX, endY int) {
	r := image.Rect(startX, startY, endX, endY)
	subImg := img.SubImage(r)
	imgChan <- subImg
}
// Receive sub-image from channel
func imageReceiver(imgChan chan image.Image) {
	img := <-imgChan
	spew.Dump(img.Bounds())
}
I thought of creating a global array of image.Image but I'm unsure if this is the correct way to "save" all the sub images.
I guess the reason this is a bit confusing is because this is the first time I'm working with concurrency in Go.
Thanks for any help :)
There are a few ways to do this, but your basic problem is that your receiver doesn't aggregate anything, and if you changed it so that it did, it would not be thread safe.
The simple way to make your receiver aggregate would be to allocate an image slice before the loop and pass a pointer to it into the receiver, which would then append each image it reads off the channel. But then you would have a bunch of goroutines fighting over the same slice. So really, you don't want the aggregation to be concurrent; if it is, you need a locking mechanism to write to the collection.
Instead, you want to block after the loop. The simplest way to do that is to put the body of your receiver inline after the loop (once for each sub-image you expect), like this:
imgs := []image.Image{}
img := <-imgChan
imgs = append(imgs, img)
spew.Dump(img.Bounds())
The problem is that in the real world your software would then block on that line and be unresponsive (with no way of exiting), so you'd typically use a select with at least two cases: one that receives from imgChan, and one on an abort channel that the caller of Partition can use to stop it if it needs to exit. That would look more like this:
imgs := []image.Image{}
select {
case img := <-imgChan:
	imgs = append(imgs, img)
	spew.Dump(img.Bounds())
case <-abortChan:
	return MyCustomError()
}
This makes the aggregation itself non-concurrent; only the work that produces the results runs concurrently, which I personally think is the better design. I could explain how to lock in your receiver as well, but I'm sure you can find plenty of examples of mutexes, etc.
I have some code which performs the following logical operations:
Read in and decode a gif image to a *GIF using gif.DecodeAll
Modify some pixels in each frame of the *GIF using image.Set
Write out the resulting modified image using gif.EncodeAll
Here are some code snippets to help demonstrate what the code is doing (error handling, file closing, etc. removed for brevity):
f, err := os.Open(filename)
reader := bufio.NewReader(f)
g, err := gif.DecodeAll(reader)
err = modify_image(g)
of, err := os.Create("out.gif")
writer := bufio.NewWriter(of)
err = gif.EncodeAll(writer, g)
Here's the modify_image function:
func modify_image(img *gif.GIF) error {
	for i := 0; i < len(img.Image); i++ {
		if err := modify_frame(img.Image[i]); err != nil {
			return err
		}
	}
	return nil
}
And modify_frame:
func modify_frame(frame *image.Paletted) error {
	xmin := frame.Rect.Min.X
	ymin := frame.Rect.Min.Y
	xmax := frame.Rect.Max.X
	ymax := frame.Rect.Max.Y
	for y := ymin; y < ymax; y++ {
		for x := xmin; x < xmax; x++ {
			if should_turn_pixel_transparent(frame, x, y) {
				frame.Set(x, y, color.RGBA64{0, 0, 0, 0})
			}
		}
	}
	return nil
}
The out.gif that this code produces has the correct pixels turned transparent, but as the animation proceeds, the pixels which I turned transparent are not "clearing"; i.e. as these transparent pixels are written over non-transparent pixels, the non-transparent pixels underneath are still displayed.
My (brief) understanding is that there are two different methods for representing transparency in gifs. I don't know if I need to use index transparency versus alpha transparency, or if I'm just doing things entirely wrong. Any advice would be appreciated.
This is often omitted in Go tutorials on generating GIFs, but along with setting the Delay slice for each frame in the Image slice, you can optionally set Disposal for each frame of the GIF. If the Disposal slice is nil, EncodeAll writes disposal method 0 (no disposal specified) for every frame, which in practice behaves like DisposalNone.
Disposal options are:
const (
	DisposalNone       = 0x01 // leave the frame in place; the next frame is drawn over it
	DisposalBackground = 0x02 // restore the area covered by the frame to the background color before the next frame
	DisposalPrevious   = 0x03 // restore the area to what it was before the frame was drawn
)
[Figures: the resulting GIF for each type of disposal: DisposalNone, DisposalBackground, DisposalPrevious.]
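As a rough sketch of how the Disposal slice could be filled in before calling gif.EncodeAll: the helper names setDisposal and demoDisposal are made up, and picking DisposalBackground assumes you want each frame cleared before the next one is drawn, which may or may not be the right choice for your animation.

```go
package main

import (
	"fmt"
	"image"
	"image/color"
	"image/gif"
)

// setDisposal gives every frame of g the same disposal method.
// The Disposal slice must have exactly one entry per frame.
func setDisposal(g *gif.GIF, method byte) {
	g.Disposal = make([]byte, len(g.Image))
	for i := range g.Disposal {
		g.Disposal[i] = method
	}
}

// demoDisposal builds a three-frame GIF in memory (invented data) and
// returns the Disposal slice after setDisposal has run.
func demoDisposal() []byte {
	// Palette entry 0 is fully transparent, entry 1 is opaque red.
	pal := color.Palette{color.RGBA{}, color.RGBA{R: 255, A: 255}}
	g := &gif.GIF{}
	for i := 0; i < 3; i++ {
		g.Image = append(g.Image, image.NewPaletted(image.Rect(0, 0, 2, 2), pal))
		g.Delay = append(g.Delay, 10)
	}
	setDisposal(g, gif.DisposalBackground)
	return g.Disposal
}

func main() {
	fmt.Println(demoDisposal()) // [2 2 2]
}
```

In the question's code, the call would go between modify_image(g) and gif.EncodeAll(writer, g).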