I'm currently learning Go and have started to re-write a test-data-generation program I originally wrote in Java. I've been intrigued by Go's channels / threading possibilities, as many of the programs I've written have been focused around load testing a system / recording various metrics.
Here, I am creating some data to be written out to a CSV file. I started out by generating all of the data, then passing that off to be written to a file. I then thought I'd try and implement a channel, so data could be written while it's still being generated.
It worked - it almost eliminated the overhead of generating the data first and then writing it. However, I found that this only worked if I had a channel with a buffer big enough to cope with all of the test data being generated: c := make(chan string, count), where count is the same size as the number of test data lines I am generating.
So, to my question: I'm regularly generating millions of records of test data (to load test applications) - should I be using a channel with a buffer that large? I can't find much about restrictions on the size of the buffer.
Running the code below with a count of 10M completes in ~59.5 s; generating the data up front and writing it all to a file takes ~62 s; using a buffer length of 1-100 takes ~80 s.
const externalRefPrefix = "Ref"
const fileName = "citizens.csv"

var counter int32 = 0

func WriteCitizensForApplication(applicationId string, count int) {
    file, err := os.Create(fileName)
    if err != nil {
        panic(err)
    }
    defer file.Close()

    c := make(chan string, count)
    go generateCitizens(applicationId, count, c)
    for line := range c {
        file.WriteString(line)
    }
}

func generateCitizens(applicationId string, count int, c chan string) {
    for i := 0; i < count; i++ {
        c <- fmt.Sprintf("%v%v\n", applicationId, generateExternalRef())
    }
    close(c)
}

func generateExternalRef() string {
    ref := atomic.AddInt32(&counter, 1)
    return fmt.Sprintf("%v%08d", externalRefPrefix, ref)
}
When appending to a [][]string, profiling shows the app uses around 145 MiB of memory.
defer profile.Start(profile.MemProfile).Stop()
f, _ := os.Open("test.csv") // 100 MiB File
r := csv.NewReader(f)
var records [][]string
for {
    values, err := r.Read()
    if err == io.EOF {
        break
    }
    records = append(records, values)
}
When storing the slice in a struct and appending that struct, the app uses around 260 MiB of memory.
defer profile.Start(profile.MemProfile).Stop()
type record struct {
    values []string
}
f, _ := os.Open("test.csv") // 100 MiB File
r := csv.NewReader(f)
var records []record
for {
    values, err := r.Read()
    if err == io.EOF {
        break
    }
    r := record{values: values}
    records = append(records, r)
}
It feels like the second example uses roughly double the memory. Can someone explain why?
For those of you using Go 1.12 through 1.15, debug.FreeOSMemory() will not return freed memory to the OS right away, so htop/top will show a misleading number, and if you rely on RSS to monitor your app it will be wrong.
This is because the runtime in Go 1.12 - 1.15 uses MADV_FREE rather than MADV_DONTNEED. Go 1.11 and earlier used MADV_DONTNEED, and Go 1.16 (released a couple of days ago) reverted back to MADV_DONTNEED; see the Go 1.16 release notes for the change.
Please upgrade to get predictable memory-usage numbers. If you want to stay on Go 1.12 - 1.15 and still use MADV_DONTNEED, run your binaries with GODEBUG=madvdontneed=1 ./main.
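If you need numbers you can trust regardless of how the OS accounts for MADV_FREE pages, you can also read the runtime's own statistics from inside the process. A minimal sketch:

package main

import (
    "fmt"
    "runtime"
)

func main() {
    var m runtime.MemStats
    runtime.ReadMemStats(&m)
    // HeapAlloc is live heap memory; HeapReleased is what has already been
    // returned to the OS (whether or not RSS reflects it yet).
    fmt.Printf("HeapAlloc=%d MiB HeapReleased=%d MiB\n",
        m.HeapAlloc>>20, m.HeapReleased>>20)
}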
I am using a third-party library to generate PDFs. In order to write the PDF at the end (after all of the content has been added using the lib's API), the pdfWriter type has a Write function that expects an io.WriteSeeker.
This is OK if I want to work with files, but I need to work in-memory. Trouble is, I can't find any way to do this - the only standard type I found that implements io.WriteSeeker is File.
This is the part that works, using a File as the io.WriteSeeker in the Write function of the pdfWriter:
fWrite, err := os.Create(outputPath)
if err != nil {
    return err
}
defer fWrite.Close()

err = pdfWriter.Write(fWrite)
Is there a way to do this without an actual File? Like getting a []byte or something?
Unfortunately the standard library has no ready-made in-memory io.WriteSeeker implementation.
But as always, you can implement your own. It's not that hard.
An io.WriteSeeker is an io.Writer and an io.Seeker, so basically you only need to implement 2 methods:
Write(p []byte) (n int, err error)
Seek(offset int64, whence int) (int64, error)
Read the general contract of these methods in their documentation to see how they should behave.
Here's a simple implementation which uses an in-memory byte slice ([]byte). It's not optimized for speed; this is just a "demo" implementation.
type mywriter struct {
    buf []byte
    pos int
}

func (m *mywriter) Write(p []byte) (n int, err error) {
    minCap := m.pos + len(p)
    if minCap > cap(m.buf) { // Make sure buf has enough capacity:
        buf2 := make([]byte, len(m.buf), minCap+len(p)) // add some extra
        copy(buf2, m.buf)
        m.buf = buf2
    }
    if minCap > len(m.buf) {
        m.buf = m.buf[:minCap]
    }
    copy(m.buf[m.pos:], p)
    m.pos += len(p)
    return len(p), nil
}

func (m *mywriter) Seek(offset int64, whence int) (int64, error) {
    newPos, offs := 0, int(offset)
    switch whence {
    case io.SeekStart:
        newPos = offs
    case io.SeekCurrent:
        newPos = m.pos + offs
    case io.SeekEnd:
        newPos = len(m.buf) + offs
    }
    if newPos < 0 {
        return 0, errors.New("negative result pos")
    }
    m.pos = newPos
    return int64(newPos), nil
}
Yes, and that's it.
Testing it:
my := &mywriter{}
var ws io.WriteSeeker = my
ws.Write([]byte("hello"))
fmt.Println(string(my.buf))
ws.Write([]byte(" world"))
fmt.Println(string(my.buf))
ws.Seek(-2, io.SeekEnd)
ws.Write([]byte("k!"))
fmt.Println(string(my.buf))
ws.Seek(6, io.SeekStart)
ws.Write([]byte("gopher"))
fmt.Println(string(my.buf))
Output (try it on the Go Playground):
hello
hello world
hello work!
hello gopher
Things that can be improved:
Create a mywriter value with an initial empty buf slice, but with a capacity that will most likely cover the size of the result PDF document. E.g. if you estimate the result PDFs are around 1 MB, create a buffer with capacity for 2 MB like this:
my := &mywriter{buf: make([]byte, 0, 2<<20)}
Inside mywriter.Write(), when capacity needs to be increased (and existing content copied over), it may be profitable to use a bigger increment, e.g. double the current capacity up to a certain limit, which reserves room for future appends and minimizes reallocations.
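Once the document has been written, the finished bytes are simply my.buf. A hedged usage sketch, assuming pdfWriter's Write takes an io.WriteSeeker and returns an error as in the question:

func writePDFToMemory(pdfWriter interface{ Write(ws io.WriteSeeker) error }) ([]byte, error) {
    my := &mywriter{buf: make([]byte, 0, 2<<20)} // pre-size for ~2 MB documents
    if err := pdfWriter.Write(my); err != nil {
        return nil, err
    }
    return my.buf, nil // the finished PDF as a plain []byte
}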
I am new to Go and I might be missing the point, but why are Go channels limited in the maximum buffer size buffered channels can have? For example, if I make a channel like so
channel := make(chan int, 100)
I cannot add more than 100 elements to the channel without blocking. Is there a reason for this? Furthermore, they cannot be dynamically resized, because the channel API does not support that.
This seems to limit the language's support for universal synchronization with a single mechanism, since it lacks the convenience of an unbounded semaphore: a generalized semaphore's value, for example, can be increased without bound.
If one component of a program can't keep up with its input, it needs to put back-pressure on the rest of the system, rather than letting it run ahead and generate gigabytes of data that will never get processed because the system ran out of memory and crashed.
There is really no such thing as an unlimited buffer, because machines have limits on what they can handle. Go requires you to specify a size for buffered channels so that you will think about what size buffer your program actually needs and can handle. If it really needs a billion items, and can handle them, you can create a channel that big. But in most cases a buffer size of 0 or 1 is actually what is needed.
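To make the back-pressure point concrete, here is a minimal sketch: the producer naturally slows to the consumer's pace because the small buffer fills up.

package main

import (
    "fmt"
    "time"
)

func main() {
    work := make(chan int, 1) // tiny buffer: the producer can only run 1 item ahead

    go func() {
        for i := 0; i < 5; i++ {
            work <- i // blocks whenever the consumer falls behind
            fmt.Println("produced", i)
        }
        close(work)
    }()

    for item := range work {
        time.Sleep(100 * time.Millisecond) // slow consumer
        fmt.Println("consumed", item)
    }
}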
This is because channels are designed for efficient communication between concurrent goroutines, but what you need is something different: the fact that you are blocking indicates that the recipient is not keeping up with the work queue, and "dynamic" is rarely free.
There are a variety of patterns and algorithms you can use to solve the problem you have: you could change your channel to accept slices of ints, or you could add additional goroutines to better balance or filter work. Or you could implement your own dynamic channel. Doing so is certainly a useful exercise for seeing why dynamic channels aren't a great way to build concurrency.
package main

import "fmt"

func dynamicChannel(initial int) (chan<- interface{}, <-chan interface{}) {
    in := make(chan interface{}, initial)
    out := make(chan interface{}, initial)
    go func() {
        defer close(out)
        buffer := make([]interface{}, 0, initial)
    loop:
        for {
            packet, ok := <-in
            if !ok {
                break loop
            }
            select {
            case out <- packet:
                continue
            default:
            }
            buffer = append(buffer, packet)
            for len(buffer) > 0 {
                select {
                case packet, ok := <-in:
                    if !ok {
                        break loop
                    }
                    buffer = append(buffer, packet)
                case out <- buffer[0]:
                    buffer = buffer[1:]
                }
            }
        }
        for len(buffer) > 0 {
            out <- buffer[0]
            buffer = buffer[1:]
        }
    }()
    return in, out
}

func main() {
    in, out := dynamicChannel(4)
    in <- 10
    fmt.Println(<-out)
    in <- 20
    in <- 30
    fmt.Println(<-out)
    fmt.Println(<-out)

    for i := 100; i < 120; i++ {
        in <- i
    }
    fmt.Println("queued 100-120")
    fmt.Println(<-out)
    close(in)
    fmt.Println("in closed")

    for i := range out {
        fmt.Println(i)
    }
}
Generally, if you are blocking, it indicates your concurrency is not well balanced. Consider a different strategy. For example, a simple tool that looks for files with a matching .checksum file and then checks the hashes:
func crawl(toplevel string, workQ chan<- string) {
    defer close(workQ)
    filepath.Walk(toplevel, func(path string, info os.FileInfo, err error) error {
        if err == nil && info.Mode().IsRegular() {
            workQ <- path
        }
        return nil
    })
}

// if a file has a .checksum file, compare it with the file's checksum.
func validateWorker(workQ <-chan string) {
    for path := range workQ {
        // If there's a .checksum file, read the expected digest from it.
        expectedSum, err := os.ReadFile(path + ".checksum")
        if err != nil { // no .checksum file: ignore
            continue
        }
        if len(expectedSum) > 256 { // sanity limit on the expected digest
            expectedSum = expectedSum[:256]
        }
        file, err := os.Open(path)
        if err != nil {
            continue
        }
        hash := sha256.New()
        _, err = io.Copy(hash, file)
        file.Close()
        if err != nil {
            log.Printf("couldn't hash %s: %v", path, err)
            continue
        }
        actualSum := fmt.Sprintf("%x", hash.Sum(nil))
        if actualSum != strings.TrimSpace(string(expectedSum)) {
            log.Printf("%s: mismatch: expected %s, got %s", path, expectedSum, actualSum)
        }
    }
}
Even without any .checksum files, the crawl function will tend to outpace the worker queue. When .checksum files are encountered, especially if the files are large, the worker can take much, much longer to perform a single checksum.
A better aim here would be to achieve more consistent throughput by reducing the number of things validateWorker does. Right now it is sometimes fast, because it only checks for the checksum file, and other times it is slow, because it also has to read and checksum the files.
type ChecksumQuery struct {
    Filepath  string
    ExpectSum string
}

func crawl(toplevel string, workQ chan<- ChecksumQuery) {
    // Have a worker filter out files which don't have .checksums, and allow it
    // to get a little ahead of the crawl function.
    checkupQ := make(chan string, 4)

    go func() {
        defer close(workQ)
        for path := range checkupQ {
            expected, err := os.ReadFile(path + ".checksum")
            if err == nil && len(expected) > 0 {
                workQ <- ChecksumQuery{path, string(expected)}
            }
        }
    }()

    go func() {
        defer close(checkupQ)
        filepath.Walk(toplevel, func(path string, info os.FileInfo, err error) error {
            if err == nil && info.Mode().IsRegular() {
                checkupQ <- path
            }
            return nil
        })
    }()
}
Run a suitable number of validate workers and give workQ a suitable size, but if the crawler or validate functions block, it is because validate is doing useful work.
If your validate workers are all busy, they are all reading large files from disk and hashing them. Having other workers interrupt this by crawling for more filenames, allocating and passing strings, isn't advantageous.
Other scenarios might involve passing large lists to workers, in which case pass the slices over channels (it's cheap), or dynamically sized groups of things, in which case consider passing channels or captures over channels, as in the sketch below.
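A minimal sketch of that last idea - passing a reply channel over a request channel, with hypothetical names - so a worker can stream a dynamically sized result set back to its caller:

package main

import "fmt"

// A request carries its own reply channel, so a worker can send back any
// number of results and signal completion by closing that channel.
type request struct {
    query   string
    results chan<- string
}

func worker(requests <-chan request) {
    for req := range requests {
        // Pretend we found a variable number of matches.
        for i := 0; i < 3; i++ {
            req.results <- fmt.Sprintf("%s: match %d", req.query, i)
        }
        close(req.results)
    }
}

func main() {
    requests := make(chan request)
    go worker(requests)

    res := make(chan string)
    requests <- request{query: "needle", results: res}
    for r := range res {
        fmt.Println(r)
    }
    close(requests)
}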
The buffer size is the number of elements that can be sent to the channel without the send blocking. By default, a channel has a buffer size of 0 (you get this with make(chan int)). This means that every single send will block until another goroutine receives from the channel. A channel of buffer size 1 can hold 1 element until sending blocks, so you'd get
c := make(chan int, 1)
c <- 1 // doesn't block
c <- 2 // blocks until another goroutine receives from the channel
I suggest you look at these for more clarification:
https://rogpeppe.wordpress.com/2010/02/10/unlimited-buffering-with-low-overhead/
http://openmymind.net/Introduction-To-Go-Buffered-Channels/
I need to search a huge slice of map[string]string. My thought was that this is a good chance for Go's channels and goroutines.
The plan was to divide the slice into parts and search them in parallel.
But I was kind of shocked that my parallel version timed out while the search of the whole slice did the trick.
I am not sure what I am doing wrong. Below is the code I used to test the concept. The real code would involve more complexity.
// Search searches for a given term.
// This function gets passed the data which needs to be searched
// and the search term, and it returns the matched maps.
// The data is pretty simple: each map contains { key: someText }.
func Search(data []map[string]string, term string) []map[string]string {
    set := []map[string]string{}
    for _, v := range data {
        if v["key"] == term {
            set = append(set, v)
        }
    }
    return set
}
So this works pretty well to search the slice of maps for a given search term.
Now I thought that if my slice had around 20K entries, I would like to do the search in parallel.
// All searches all records concurrently.
// It has the same function signature as the Search function,
// but the main task is to fan out the slice in 5 parts and search
// in parallel.
func All(data []map[string]string, term string) []map[string]string {
    countOfSlices := 5
    part := len(data) / countOfSlices
    fmt.Printf("Size of the data:%v\n", len(data))
    fmt.Printf("Fragment Size:%v\n", part)
    timeout := time.After(60000 * time.Millisecond)
    c := make(chan []map[string]string)
    for i := 0; i < countOfSlices; i++ {
        // Fragments of the array passed on to the search method
        go func() { c <- Search(data[(part*i):(part*(i+1))], term) }()
    }
    result := []map[string]string{}
    for i := 0; i < part-1; i++ {
        select {
        case records := <-c:
            result = append(result, records...)
        case <-timeout:
            fmt.Println("timed out!")
            return result
        }
    }
    return result
}
Here are my tests:
I have a function to generate my test data and 2 tests.
func GenerateTestData(search string) ([]map[string]string, int) {
    rand.Seed(time.Now().UTC().UnixNano())
    strin := []string{"String One", "This", "String Two", "String Three", "String Four", "String Five"}
    var matchCount int
    numOfRecords := 20000
    set := []map[string]string{}
    for i := 0; i < numOfRecords; i++ {
        p := rand.Intn(len(strin))
        s := strin[p]
        if s == search {
            matchCount++
        }
        set = append(set, map[string]string{"key": s})
    }
    return set, matchCount
}
The 2 tests: The first just traverses the slice and the second searches in parallel
func TestSearchItem(t *testing.T) {
    tests := []struct {
        InSearchTerm string
        Fn           func(data []map[string]string, term string) []map[string]string
    }{
        {
            InSearchTerm: "This",
            Fn:           Search,
        },
        {
            InSearchTerm: "This",
            Fn:           All,
        },
    }

    for i, test := range tests {
        startTime := time.Now()
        data, expectedMatchCount := GenerateTestData(test.InSearchTerm)
        result := test.Fn(data, test.InSearchTerm)
        fmt.Printf("Test: [%v]:\nTime: %v \n\n", i+1, time.Since(startTime))
        assert.Equal(t, len(result), expectedMatchCount, "expected: %v to be: %v", len(result), expectedMatchCount)
    }
}
It would be great if someone could explain to me why my parallel code is so slow. What is wrong with the code, what am I missing here, and what would be the recommended way to search huge slices in memory (50K+ entries)?
This looks like just a simple typo. The problem is that you divide your original big slice into 5 pieces (countOfSlices), and you properly launch 5 goroutines to search each part:
for i := 0; i < countOfSlices; i++ {
    // Fragments of the array passed on to the search method
    go func() { c <- Search(data[(part*i):(part*(i+1))], term) }()
}
This means you should expect 5 results, but you don't: you wait for part-1 = 3999 results:
for i := 0; i < part-1; i++ {
    select {
    case records := <-c:
        result = append(result, records...)
    case <-timeout:
        fmt.Println("timed out!")
        return result
    }
}
Obviously if you only launched 5 goroutines, each of which delivers 1 single result, you can only expect as many (5). And since your loop waits for a lot more results (which will never come), it times out as expected.
Change the condition to this:
for i := 0; i < countOfSlices; i++ {
    // ...
}
Concurrency is not parallelism. Go is a massively concurrent language, not a parallel one. Even on a multicore machine you will pay for data exchange between CPUs when accessing your shared slice from the computation threads. You can take advantage of concurrency by, for example, searching for just the first match, or by doing something with the results (printing them, writing them to some Writer, sorting them) while you continue to search.
func Search(data []map[string]string, term string, ch chan map[string]string) {
    for _, v := range data {
        if v["key"] == term {
            ch <- v
        }
    }
}

func main() {
    ...
    go Search(datapart1, term, ch)
    go Search(datapart2, term, ch)
    go Search(datapart3, term, ch)
    ...
    for vv := range ch {
        fmt.Println(vv) // do something with each match concurrently
    }
    ...
}
The recommended way to search a huge slice would be to keep it sorted, or to build a binary tree; see the sketch below. There is no magic.
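For illustration, a minimal sketch of the "keep it sorted" idea, using a sorted slice of keys and sort.SearchStrings (the index-building step is my own assumption, and it only answers whether a key exists):

package main

import (
    "fmt"
    "sort"
)

func main() {
    data := []map[string]string{
        {"key": "String Two"}, {"key": "This"}, {"key": "String One"},
    }

    // Build a sorted index of the keys once (O(n log n)); each lookup is
    // then a binary search (O(log n)).
    keys := make([]string, len(data))
    for i, m := range data {
        keys[i] = m["key"]
    }
    sort.Strings(keys)

    term := "This"
    i := sort.SearchStrings(keys, term)
    fmt.Println(i < len(keys) && keys[i] == term) // true
}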
There are two problems: as icza notes, you never finish the select because you need to loop countOfSlices times, and your call to Search will not get the data you want, because the slice bounds need to be evaluated before calling the go func(), so compute the sub-slice outside the closure and pass it in.
You might find it still isn't faster to do this particular work in parallel with such simple data (perhaps with more complex data on a machine with lots of cores it would be worthwhile).
Be sure when testing that you try swapping the order of your test runs - you might be surprised by the results! Also try the benchmarking tools available in the testing package, which run your code lots of times and average the results; this might help you get a better idea of whether the fan-out speeds things up. A sketch follows below.
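A minimal benchmark sketch (in a _test.go file, assuming the question's Search, All and GenerateTestData):

func BenchmarkSearch(b *testing.B) {
    data, _ := GenerateTestData("This")
    b.ResetTimer()
    for i := 0; i < b.N; i++ {
        Search(data, "This")
    }
}

func BenchmarkAll(b *testing.B) {
    data, _ := GenerateTestData("This")
    b.ResetTimer()
    for i := 0; i < b.N; i++ {
        All(data, "This")
    }
}

Run both with go test -bench=. and compare the ns/op numbers.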
For an assignment we are using Go, and one of the things we are going to do is to parse a UniProt database file line-by-line to collect UniProt records.
I prefer not to share too much code, but I have a working code snippet that parses such a file (2.5 GB) correctly in 48 s (measured using Go's time package). It parses the file iteratively, adding lines to a record until a record end signal is reached (a full record), at which point metadata for the record is created. Then the record string is nulled, and a new record is collected line-by-line. After that I thought I would try to use goroutines.
I had got some tips before from Stack Overflow, so to the original code I simply added a function to handle everything concerning the metadata creation.
So, the code is doing:
create an empty record,
iterate the file and add lines to the record,
if a record stop signal is found (now we have a full record) - give it to a goroutine to create the metadata,
null the record string and continue from 2).
I also added a sync.WaitGroup to make sure that I waited (at the end) for each routine to finish. I thought that this would actually lower the time spent on parsing the database file, as it continued to parse while the goroutines acted on each record. However, the code seems to run for more than 20 minutes, indicating that something is wrong or the overhead has gone crazy. Any suggestions?
package main

import (
    "bufio"
    "crypto/sha1"
    "fmt"
    "io"
    "log"
    "os"
    "strings"
    "sync"
    "time"
)

type producer struct {
    parser uniprot
}

type unit struct {
    tag string
}

type uniprot struct {
    filenames     []string
    recordUnits   chan unit
    recordStrings map[string]string
}

func main() {
    p := producer{parser: uniprot{}}
    p.parser.recordUnits = make(chan unit, 1000000)
    p.parser.recordStrings = make(map[string]string)
    p.parser.collectRecords(os.Args[1])
}

func (u *uniprot) collectRecords(name string) {
    fmt.Println("file to open ", name)
    t0 := time.Now()
    wg := new(sync.WaitGroup)
    record := []string{}
    file, err := os.Open(name)
    errorCheck(err)
    scanner := bufio.NewScanner(file)
    for scanner.Scan() { // Scan the file
        retText := scanner.Text()
        if strings.HasPrefix(retText, "//") {
            wg.Add(1)
            go u.handleRecord(record, wg)
            record = []string{}
        } else {
            record = append(record, retText)
        }
    }
    file.Close()
    wg.Wait()
    t1 := time.Now()
    fmt.Println(t1.Sub(t0))
}

func (u *uniprot) handleRecord(record []string, wg *sync.WaitGroup) {
    defer wg.Done()
    recString := strings.Join(record, "\n")
    t := hashfunc(recString)
    u.recordUnits <- unit{tag: t}
    u.recordStrings[t] = recString
}

func hashfunc(record string) (hashtag string) {
    hash := sha1.New()
    io.WriteString(hash, record)
    hashtag = string(hash.Sum(nil))
    return
}

func errorCheck(err error) {
    if err != nil {
        log.Fatal(err)
    }
}
First of all: your code is not thread-safe, mainly because you're accessing a hashmap concurrently. Maps are not safe for concurrent use in Go and need to be locked. The faulty line in your code:
u.recordStrings[t] = recString
As this will blow up when you're running Go with GOMAXPROCS > 1, I'm assuming that you're not doing that. Make sure you're running your application with GOMAXPROCS=2 or higher to achieve parallelism.
The default value is 1, therefore your code runs on a single OS thread which, of course, can't be scheduled on two CPUs or CPU cores simultaneously. (Note that since Go 1.5 the default is the number of available CPUs.) Example:
$ GOMAXPROCS=2 go run udb.go uniprot_sprot_viruses.dat
Lastly: pull the values from the channel, or otherwise your program will not terminate. You're creating a deadlock if the number of goroutines exceeds your buffer limit. I tested with a 76 MiB file of data; you said your file was about 2.5 GB. I have 16347 entries. Assuming linear growth, your file will exceed 1e6 records, so there are not enough slots in the channel and your program will deadlock, giving no result while accumulating goroutines which never run, only to fail at the end (miserably).
So the solution should be to add a goroutine which pulls the values from the channel and does something with them.
As a side note: If you're worried about performance, do not use strings as they're always copied. Use []byte instead.
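As a rough sketch of both fixes (reusing the question's uniprot, unit and hashfunc, so this is an assumption about how you would wire it in, not the only way): protect the map with a mutex and drain the channel in a dedicated consumer so the producers never block once the buffer fills.

type safeUniprot struct {
    uniprot
    mu sync.Mutex
}

func (u *safeUniprot) handleRecord(record []string, wg *sync.WaitGroup) {
    defer wg.Done()
    recString := strings.Join(record, "\n")
    t := hashfunc(recString)
    u.mu.Lock()
    u.recordStrings[t] = recString // map writes are now serialized
    u.mu.Unlock()
    u.recordUnits <- unit{tag: t}
}

// consume keeps recordUnits drained; start it before scanning, close
// recordUnits after wg.Wait(), then wait on done.
func (u *safeUniprot) consume(done chan<- struct{}) {
    defer close(done)
    for rec := range u.recordUnits {
        _ = rec // metadata processing would go here
    }
}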