Repeated calls to image/png's Decode() lead to out of memory errors

I am attempting to do what I originally thought would be pretty simple. To wit:
For every file in a list of input files:
open the file with png.Decode()
scan every pixel in the file and test to see if it is "grey".
Return the percentage of "grey" pixels in the image.
This is the function I am calling:
func greyLevel(fname string) (float64, string) {
	f, err := os.Open(fname)
	if err != nil {
		return -1.0, "can't open file"
	}
	defer f.Close()

	i, err := png.Decode(f)
	if err != nil {
		return -1.0, "unable to decode"
	}

	bounds := i.Bounds()
	var lo uint32 = 122 // Low grey RGB value.
	var hi uint32 = 134 // High grey RGB value.
	var gpix float64    // Grey pixel count.
	var opix float64    // Other (non-grey) pixel count.
	var tpix float64    // Total pixels.

	for x := bounds.Min.X; x < bounds.Max.X; x++ {
		for y := bounds.Min.Y; y < bounds.Max.Y; y++ {
			r, g, b, _ := i.At(x, y).RGBA()
			if ((r/255)-1 > lo && (r/255)-1 < hi) &&
				((g/255)-1 > lo && (g/255)-1 < hi) &&
				((b/255)-1 > lo && (b/255)-1 < hi) {
				gpix++
			} else {
				opix++
			}
			tpix++
		}
	}
	return (gpix / tpix) * 100, ""
}

func main() {
	srcDir := flag.String("s", "", "Directory containing image files.")
	threshold := flag.Float64("t", 65.0, "Threshold (in percent) of grey pixels.")
	flag.Parse()

	dirlist, direrr := ioutil.ReadDir(*srcDir)
	if direrr != nil {
		log.Fatalf("Error reading %s: %s\n", *srcDir, direrr)
	}

	for f := range dirlist {
		src := path.Join(*srcDir, dirlist[f].Name())
		level, msg := greyLevel(src)
		if msg != "" {
			log.Printf("error processing %s: %s\n", src, msg)
			continue
		}
		if level >= *threshold {
			log.Printf("%s is grey (%2.2f%%)\n", src, level)
		} else {
			log.Printf("%s is not grey (%2.2f%%)\n", src, level)
		}
	}
}
The files are relatively small (960x720, 8-bit RGB).
I am calling ioutil.ReadDir() to generate a list of files, looping over the slice and calling greyLevel().
After about 155 files (out of a list of >4000) the script panics with:
runtime: memory allocated by OS not in usable range
runtime: out of memory: cannot allocate 2818048-byte block (534708224 in use)
throw: out of memory
I figure there is something simple I am missing. I thought that Go would de-allocate the memory allocated in greyLevel(), but I guess not?
Follow up:
After inserting runtime.GC() after every call to greyLevel(), the memory usage evens out. Last night I tested up to about 800 images and then stopped. Today I let it run over the entire input set, approximately 6800 images.
After 1500 images, top looks like this:
top - 10:30:11 up 41 days, 11:47, 2 users, load average: 1.46, 1.25, 0.88
Tasks: 135 total, 2 running, 131 sleeping, 1 stopped, 1 zombie
Cpu(s): 49.8%us, 5.1%sy, 0.2%ni, 29.6%id, 15.0%wa, 0.0%hi, 0.3%si, 0.0%st
Mem: 3090304k total, 2921108k used, 169196k free, 2840k buffers
Swap: 3135484k total, 31500k used, 3103984k free, 640676k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
28474 mtw 20 0 2311m 1.8g 412 R 99 60.5 16:48.52 8.out
And remained steady after processing another 5000 images.

It appears that you are using a 32-bit machine. The program likely runs out of memory because Go's garbage collector is conservative: a conservative garbage collector may fail to detect that some region of memory is no longer in use. There is currently no workaround for this in Go programs other than avoiding data structures that the garbage collector cannot handle (such as struct{ ...; binaryData [256]byte }).
Try calling runtime.GC() in each iteration of the loop that calls greyLevel; it may help the program process more images.
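For reference, a minimal sketch of that suggestion applied to the loop in your main (this assumes the rest of the posted program plus an added "runtime" import):

for f := range dirlist {
	src := path.Join(*srcDir, dirlist[f].Name())
	level, msg := greyLevel(src)
	runtime.GC() // force a collection after each image is processed
	if msg != "" {
		log.Printf("error processing %s: %s\n", src, msg)
		continue
	}
	if level >= *threshold {
		log.Printf("%s is grey (%2.2f%%)\n", src, level)
	} else {
		log.Printf("%s is not grey (%2.2f%%)\n", src, level)
	}
}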
If calling runtime.GC() fails to improve the situation you may want to change your strategy so that the program processes a smaller number of PNG files per run.

This looks like issue 3173, which was recently fixed. Could you please retry with the latest weekly? (Assuming you are currently on some pre-2012-03-07 version.)

How to use channels efficiently [closed]

I read in Uber's Go style guide that one should use channels with a capacity of at most 1.
Although it's clear to me that a channel size of 100 or 1000 is bad practice, I'm wondering why a channel size of 10 isn't considered a valid option. I'm missing some piece of the reasoning to reach the right conclusion.
Below you can follow my arguments (and counter-arguments), backed by some benchmark tests.
I understand that if both goroutines responsible for writing to and reading from the channel get interrupted between consecutive writes or reads by some other IO action, a larger channel buffer brings no gain, and I agree that 1 is the best option in that case.
But let's say there is no significant goroutine switching needed apart from the implicit locking and unlocking caused by writing to and reading from the channel. Then I would conclude the following:
Consider the number of context switches when processing 100 values on a channel with a buffer of size 1 versus size 10 (GR = goroutine):
Buffer=1: (GR1 inserts 1 value, GR2 reads 1 value) X 100 ~ 200 go-routine switches
Buffer=10: (GR1 inserts 10 values, GR2 reads 10 values) X 10 ~ 20 go-routine switches
I did some benchmarking to prove that this actually goes faster:
package main

import (
	"testing"
)

type a struct {
	b [100]int64
}

func BenchmarkBuffer1(b *testing.B) {
	count := 0
	c := make(chan a, 1)
	go func() {
		for i := 0; i < b.N; i++ {
			c <- a{}
		}
		close(c)
	}()
	for v := range c {
		for i := range v.b {
			count += i
		}
	}
}

func BenchmarkBuffer10(b *testing.B) {
	count := 0
	c := make(chan a, 10)
	go func() {
		for i := 0; i < b.N; i++ {
			c <- a{}
		}
		close(c)
	}()
	for v := range c {
		for i := range v.b {
			count += i
		}
	}
}
Results when comparing simple reading & writing + non-blocking processing:
BenchmarkBuffer1-12 5072902 266 ns/op
BenchmarkBuffer10-12 6029602 179 ns/op
PASS
BenchmarkBuffer1-12 5228782 256 ns/op
BenchmarkBuffer10-12 5392410 216 ns/op
PASS
BenchmarkBuffer1-12 4806208 287 ns/op
BenchmarkBuffer10-12 4637842 233 ns/op
PASS
However, if I add a sleep every 10 reads, it doesn't yield any better results.
import (
	"testing"
	"time"
)

func BenchmarkBuffer1WithSleep(b *testing.B) {
	count := 0
	c := make(chan int, 1)
	go func() {
		for i := 0; i < b.N; i++ {
			c <- i
		}
		close(c)
	}()
	for a := range c {
		count++
		if count%10 == 0 {
			time.Sleep(time.Duration(a) * time.Nanosecond)
		}
	}
}

func BenchmarkBuffer10WithSleep(b *testing.B) {
	count := 0
	c := make(chan int, 10)
	go func() {
		for i := 0; i < b.N; i++ {
			c <- i
		}
		close(c)
	}()
	for a := range c {
		count++
		if count%10 == 0 {
			time.Sleep(time.Duration(a) * time.Nanosecond)
		}
	}
}
Results when adding a sleep every 10 reads:
BenchmarkBuffer1WithSleep-12 856886 53219 ns/op
BenchmarkBuffer10WithSleep-12 929113 56939 ns/op
FYI: I also did the test again with only one CPU and got the following results:
BenchmarkBuffer1 5831193 207 ns/op
BenchmarkBuffer10 6226983 180 ns/op
BenchmarkBuffer1WithSleep 556635 35510 ns/op
BenchmarkBuffer10WithSleep 984472 61434 ns/op
Absolutely nothing is wrong with a channel of cap 500 if, for example, that channel is used as a semaphore.
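For instance, a minimal sketch of a buffered channel used as a counting semaphore to bound concurrency (the names and the limit of 3 are made up for illustration):

package main

import (
	"fmt"
	"sync"
)

func main() {
	sem := make(chan struct{}, 3) // at most 3 workers run at once
	var wg sync.WaitGroup

	for i := 0; i < 10; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			sem <- struct{}{}        // acquire a slot
			defer func() { <-sem }() // release it when done
			fmt.Println("working:", id)
		}(i)
	}
	wg.Wait()
}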
The style guide you read recommends not using buffered channels with a capacity of, say, 64 "because this looks like a nice number". But that recommendation is not about performance! (Btw: your microbenchmarks are useless; they do not measure anything relevant.)
An unbuffered channel is a synchronisation primitive and as such is very useful.
A buffered channel, on the other hand, may buffer between sender and receiver, and this buffering can be problematic for observing, tuning and debugging the code (because creation and consumption are further decoupled). That's why the style guide recommends unbuffered channels (or at most a cap of 1, as this is sometimes needed for correctness!).
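One common pattern where a cap of exactly 1 matters for correctness is a coalescing, non-blocking notification; a small sketch (not taken from the style guide, names made up):

notify := make(chan struct{}, 1) // cap 1: one pending signal is enough

// sender: never blocks; repeated signals collapse into one
select {
case notify <- struct{}{}:
default: // a signal is already pending
}

// receiver: wakes up once per batch of signals
<-notify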
The guide also doesn't prohibit larger buffer caps:
Any other [than 0 or 1] size must be subject to a high level of scrutiny. Consider how the size is determined, what prevents the channel from filling up under load and blocking writers, and what happens when this occurs. [emph. mine]
You may use a cap of 27 if you can explain why 27 (and not 22 or 31) and how this will influence program behaviour (not only performance!) if the buffer is filled.
Most people overrate performance. Correctness, operational stability and maintainability come first. And this is what this style guide is about here.

Changing a pixel value, saving and reading it again returns the original color

I want to change the blue value of every pixel to 255 if it is currently equal to 20.
I read the source image and draw.Draw it to a new image.RGBA so that I can modify pixels.
But when I take the output image (after executing the program), feed it back in as the input, put a breakpoint inside the if block, and run the program in debug mode, I see the debugger stop there at multiple points. Which means I am not modifying the image correctly.
Can anyone tell me how to modify pixels and save them correctly? Thanks a lot.
func changeOnePixelInImage() {
	imgPath := "./source.png"
	f, err := os.Open(imgPath)
	check(err)
	defer f.Close()

	sourceImage, _, err := image.Decode(f)
	size := sourceImage.Bounds().Size()
	destImage := image.NewRGBA(sourceImage.Bounds())
	draw.Draw(destImage, sourceImage.Bounds(), sourceImage, image.Point{}, draw.Over)

	for x := 0; x < size.X; x++ {
		for y := 0; y < size.Y; y++ {
			pixel := sourceImage.At(x, y)
			originalColor := color.RGBAModel.Convert(pixel).(color.RGBA)
			b := originalColor.B
			if b == 20 {
				b = 255 // <--- then I swap source and destination paths, and debug this line
			}
			c := color.RGBA{
				R: originalColor.R,
				G: originalColor.G,
				B: b,
				A: originalColor.A,
			}
			destImage.SetRGBA(x, y, c)
		}
	}

	ext := filepath.Ext(imgPath)
	newImagePath := fmt.Sprintf("%s/dest%s", filepath.Dir(imgPath), ext)
	fg, err := os.Create(newImagePath)
	check(err)
	defer fg.Close()
	err = jpeg.Encode(fg, destImage, &jpeg.Options{100})
	check(err)
}
I found the answer to my question.
The thing is, I was decoding a JPEG image, and I found out from this Stack Overflow question that JPEG images lose quality (so pixel values are modified in the process): Is JPEG lossless when quality is set to 100?
So I should have used PNG images (and even though I was using source.png as the source image, it was actually a JPEG :/).
So I changed the last lines to:
if ext != ".png" {
	panic("cannot do my thing with jpg images, since they get compressed")
}
err = png.Encode(fg, destImage)
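As a side note, image.Decode also returns the name of the format it actually detected ("png", "jpeg", ...), which would catch a mislabelled file regardless of its extension. A small sketch using the question's check helper:

sourceImage, format, err := image.Decode(f)
check(err)
if format != "png" {
	panic("expected a png, decoded a " + format)
}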

How to allocate empty CString?

The cFunctionCall populates b and I am able to get the contents into a Go string. However, I think that my memory allocation (the first line below) is not efficient.
b := C.CString(strings.Repeat(" ", 50))
defer C.free(unsafe.Pointer(b))
C.cFunctionCall(b, 50)
rs := C.GoString(b)
log.Printf("rs: '%v'\n", rs)
If you want it to be initialized without the extra allocation and copy from Go, you would need to implement the strings.Repeat function over a C string:
func emptyString(size int) *C.char {
	p := C.malloc(C.size_t(size + 1))
	pp := (*[1 << 30]byte)(p)
	bp := copy(pp[:], " ")
	for bp < size {
		copy(pp[bp:], pp[:bp])
		bp *= 2
	}
	pp[size] = 0
	return (*C.char)(p)
}
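On newer Go versions (1.17+), the fixed-size array cast can be replaced by unsafe.Slice; a sketch of the same idea (the question's code already imports unsafe):

func emptyString(size int) *C.char {
	p := C.malloc(C.size_t(size + 1))
	pp := unsafe.Slice((*byte)(p), size+1) // byte view over the C buffer
	for i := 0; i < size; i++ {
		pp[i] = ' '
	}
	pp[size] = 0 // NUL terminator
	return (*C.char)(p)
}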
If it doesn't need to be initialized, you can simply malloc/calloc the pointer yourself and pass it to your function.
b := C.malloc(50) // or 51, depending on what size your function is expecting
defer C.free(unsafe.Pointer(b))
C.cFunctionCall((*C.char)(b), 50)
Unless this is being called many times and actually poses a performance problem, use what you already have and reduce the amount of C code you have to deal with.
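If zero-initialized memory is good enough, C.calloc gives you that directly without the Go-side strings.Repeat (same call pattern as above; cFunctionCall is the function from the question):

b := C.calloc(51, 1) // 51 zeroed bytes: room for 50 characters plus a NUL terminator
defer C.free(b)
C.cFunctionCall((*C.char)(b), 50)
rs := C.GoString((*C.char)(b))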

Mysterious and excessive memory allocation in a function in a Go program

I have the following code, which uses tons of memory, way more than expected.
I used the pprof tool and it shows that the function NewEdge is responsible for more than 94% of all the memory allocated by the program.
My question is: what is wrong with this code that it uses so much memory?
type Vertex struct {
	Id                        string              `json:"id"`         // must be unique
	Properties                map[string]string   `json:"properties"` // to be implemented soon
	verticesThisIsConnectedTo map[string][]string `json:"-"`          // ids of the edges (*Edge); keys are Vertex ids, each pair of vertices can be connected to each other with multiple edges
	verticesConnectedToThis   map[string][]string `json:"_"`          // ids of the edges (*Edge); keys are Vertex ids
}

type Edge struct {
	id         string            `json:"-"` // for internal use, unique
	Label      string            `json:"label"`
	SourceId   string            `json:"source-id"`
	TargetId   string            `json:"terget-id"`
	Type       string            `json:"type"`
	Properties map[string]string `json:"properties"` // to be implemented soon
}

func (v *Vertex) isPartof(g *Graph) bool {
	_, b := g.Vertices[v.Id]
	return b
}

func (g *Graph) NewEdge(source, target *Vertex, label, edgeType string) (Edge, error) {
	if source.Id == target.Id {
		return Edge{}, ERROR_NO_EDGE_TO_SELF_ALLOWED
	}
	if !source.isPartof(g) || !target.isPartof(g) {
		return Edge{}, errors.New("InvalidEdge, source or target not in this graph")
	}
	e := Edge{id: <-nextId, Label: label, SourceId: source.Id, TargetId: target.Id, Type: edgeType}
	g.Edges[e.id] = &e
	source.verticesThisIsConnectedTo[target.Id] = append(source.verticesThisIsConnectedTo[target.Id], e.id)
	target.verticesConnectedToThis[source.Id] = append(target.verticesConnectedToThis[source.Id], e.id)
	return e, nil
}
The allocation happens via a call like this: fakeGraph(Aragog, 2000, 1), where:
func fakeGraph(g Graph, nodesCount, followratio int) error {
	var err error
	// create the vertices
	for i := 0; i < nodesCount; i++ {
		v := NewVertex("") //FH.RandStr(10))
		g.AddVertex(v)
	}
	// create some "follow edges"
	followcount := followratio * nodesCount / 100
	vkeys := []string{}
	for pk := range g.Vertices {
		vkeys = append(vkeys, pk)
	}
	for ki := range g.Vertices {
		pidx := rand.Perm(nodesCount)
		followcounter := followcount
		for j := 0; j < followcounter; j++ {
			_, err := g.NewEdge(g.Vertices[ki], g.Vertices[vkeys[pidx[j]]], <-nextId, EDGE_TYPE_FOLLOW)
			if err != nil {
				followcounter++ // to compensate for references to self
			}
		}
	}
	return err
}
Question / mystery:
I can create thousands of Vertex values and the memory usage is very reasonable, but calls to NewEdge are very memory intensive. I first noticed that the code was using tons of memory, so I ran it with -memprofile and then used go tool pprof and got this:
(pprof) top10
Total: 9.9 MB
8.9 89.9% 89.9% 8.9 89.9% main.(*Graph).NewEdge
0.5 5.0% 95.0% 0.5 5.0% allocg
0.5 5.0% 100.0% 0.5 5.0% fmt.Sprintf
0.0 0.0% 100.0% 0.5 5.0% _rt0_go
0.0 0.0% 100.0% 8.9 89.9% main.fakeGraph
0.0 0.0% 100.0% 0.5 5.0% main.func·003
0.0 0.0% 100.0% 8.9 89.9% main.main
0.0 0.0% 100.0% 0.5 5.0% mcommoninit
(pprof)
Any help is very much appreciated.
@ali I think there is no mystery in this memory profile.
First of all, if you check the sizes of your structs you will see that the Edge struct is about twice as big as the Vertex struct. (You can check struct sizes with unsafe.Sizeof().)
So, if you call fakeGraph(Aragog, 2000, 1), Go will allocate:
2000 Vertex structs
at least 2000 * 20 = 40,000 Edge structs
As you can see, NewEdge() will allocate at least 40 times more memory than the vertex creation in fakeGraph().
Also, every time you try to create a new edge, a new Edge struct is allocated, even if NewEdge() returns an error.
Another factor: you return the struct itself, not a pointer to it. In Go, structs are value types, so the entire struct is copied when NewEdge() returns, and that can also cause new allocations.
Yes, I see that you never use the returned struct, but I'm not sure whether the Go compiler will look at the caller's context and skip copying the Edge.
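To see the size difference mentioned above, here is a quick standalone check with unsafe.Sizeof (the struct bodies are copied from the question minus the tags; the byte counts in the comments assume a typical 64-bit build and cover only the struct headers, not the map contents):

package main

import (
	"fmt"
	"unsafe"
)

type Vertex struct {
	Id                        string
	Properties                map[string]string
	verticesThisIsConnectedTo map[string][]string
	verticesConnectedToThis   map[string][]string
}

type Edge struct {
	id         string
	Label      string
	SourceId   string
	TargetId   string
	Type       string
	Properties map[string]string
}

func main() {
	fmt.Println("Vertex:", unsafe.Sizeof(Vertex{})) // 40 bytes: one string header + three map pointers
	fmt.Println("Edge:  ", unsafe.Sizeof(Edge{}))   // 88 bytes: five string headers + one map pointer
}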

Newbie: Properly sizing a []byte in Go (chunking)

Go Newbie alert!
Not quite sure how to do this: as a learning project, I want to make a "file chunker" that grabs fixed-size slices out of a binary file for later upload.
I currently have this:
type (
	fileChunk  []byte
	fileChunks []fileChunk
)

func NumChunks(fi os.FileInfo, chunkSize int) int {
	chunks := fi.Size() / int64(chunkSize)
	if rem := fi.Size()%int64(chunkSize) != 0; rem {
		chunks++
	}
	return int(chunks)
}

// left out err checks for brevity
func chunker(filePtr *string) fileChunks {
	f, err := os.Open(*filePtr)
	defer f.Close()
	// create the initial container to hold the slices
	file_chunks := make(fileChunks, 0)
	fi, err := f.Stat()
	// show me how big the original file is
	fmt.Printf("File Name: %s, Size: %d\n", fi.Name(), fi.Size())
	// let's partition it into 10000 byte pieces
	chunkSize := 10000
	chunks := NumChunks(fi, chunkSize)
	fmt.Printf("Need %d chunks for this file", chunks)
	for i := 0; i < chunks; i++ {
		b := make(fileChunk, chunkSize) // allocate a chunk, 10000 bytes
		n1, err := f.Read(b)
		fmt.Printf("Chunk: %d, %d bytes read\n", i, n1)
		// add chunk to "container"
		file_chunks = append(file_chunks, b)
	}
	fmt.Println(len(file_chunks))
	return file_chunks
}
This all works mostly fine, but here's what happens: if my file size is 31234 bytes, I end up with three slices holding the first 30000 bytes from the file, and the final "chunk" consists of 1234 "file bytes" followed by "padding" out to the 10000-byte chunk size. I'd like the "remainder" fileChunk ([]byte) to be sized to 1234, not to the full capacity. What would the proper way to do this be? On the receiving side I would then "stitch" together all the pieces to recreate the original file.
You need to re-slice the remainder chunk to be just the length of the last chunk read:
n1, err := f.Read(b)
fmt.Printf("Chunk: %d, %d bytes read\n", i, n1)
b = b[:n1]
This does the re-slicing for all chunks. Normally, n1 will be 10000 for all the non-remainder chunks, but there is no guarantee. The docs say "Read reads up to len(b) bytes from the File." So it's good to pay attention to n1 all the time.
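Putting that together, a sketch of the chunking loop with the re-slice plus the error handling the question leaves out (this assumes the surrounding chunker function and added "io" and "log" imports):

for i := 0; i < chunks; i++ {
	b := make(fileChunk, chunkSize)
	n1, err := f.Read(b)
	if n1 > 0 {
		// keep only the bytes actually read; for a 31234-byte file the
		// last chunk ends up with len 1234 instead of 10000
		file_chunks = append(file_chunks, b[:n1])
	}
	if err == io.EOF {
		break
	}
	if err != nil {
		log.Fatal(err)
	}
}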
