Why do you need `Flush` at all if `Close` is enough? - go

This is how I am using the gzip writer:
var b bytes.Buffer
gz := gzip.NewWriter(&b)
if _, err := gz.Write([]byte(data)); err != nil {
panic(err)
}
/*
if err := gz.Flush(); err != nil {
panic(err)
}
*/
if err := gz.Close(); err != nil {
panic(err)
}
playground link https://play.golang.org/p/oafHItGOlDN
Clearly, Flush + Close and just Close are giving different results.
The docs for the compress/gzip package say:
func (z *Writer) Close() error
Close closes the Writer by flushing any unwritten data to the underlying io.Writer and writing the GZIP footer. It does not close the underlying io.Writer.
What flushing is this doc talking about? Why do you need the Flush function at all if Close is enough? Why doesn't Close call Flush?

Closing does cause a flush. When you call Flush and then Close, the stream is flushed twice, which causes an additional chunk to be output, which uses 5 bytes to code 0 bytes of data. Both streams encode the same data, but one of them is wasteful.
As for why you would use Flush, the explanation is right there in the documentation for Flush. Sometimes you're not done writing, but you need to ensure that all of the data that you've written up to this point is readable by the client, before additional data is available. At those points, you flush the stream. You only close when there will be no more data.
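A minimal sketch of the difference, assuming a short in-memory payload (the `compress` and `decompress` helpers and the `flushFirst` flag are hypothetical names used only for illustration). It compresses the same input with and without an explicit Flush before Close, prints both sizes, and checks that both streams decode to the same bytes:

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
)

// compress gzips data into memory. When flushFirst is true it calls
// Flush before Close, mirroring the commented-out code in the question.
func compress(data []byte, flushFirst bool) ([]byte, error) {
	var b bytes.Buffer
	gz := gzip.NewWriter(&b)
	if _, err := gz.Write(data); err != nil {
		return nil, err
	}
	if flushFirst {
		if err := gz.Flush(); err != nil {
			return nil, err
		}
	}
	if err := gz.Close(); err != nil {
		return nil, err
	}
	return b.Bytes(), nil
}

// decompress reads a complete gzip stream back into memory.
func decompress(compressed []byte) ([]byte, error) {
	zr, err := gzip.NewReader(bytes.NewReader(compressed))
	if err != nil {
		return nil, err
	}
	defer zr.Close()
	return io.ReadAll(zr)
}

func main() {
	data := []byte("hello, gzip")

	closeOnly, err := compress(data, false)
	if err != nil {
		panic(err)
	}
	withFlush, err := compress(data, true)
	if err != nil {
		panic(err)
	}

	a, err := decompress(closeOnly)
	if err != nil {
		panic(err)
	}
	b, err := decompress(withFlush)
	if err != nil {
		panic(err)
	}

	fmt.Println("Close only:   ", len(closeOnly), "bytes")
	fmt.Println("Flush + Close:", len(withFlush), "bytes")
	fmt.Println("same payload: ", bytes.Equal(a, b))
}
```

The compressed bytes differ, but a reader cannot tell the two apart once decoded, which is why comparing the raw output is misleading.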

Related

why *(*string)(unsafe.Pointer(&b)) doesn't work with bufio.Reader

I have a file containing some IP ranges:
1.1.1.0/24
1.1.2.0/24
2.2.1.0/24
2.2.2.0/24
I read this file into a slice and used *(*string)(unsafe.Pointer(&b)) to convert each []byte to a string, but it doesn't work:
func TestInitIpRangeFromFile(t *testing.T) {
filepath := "/tmp/test"
file, err := os.Open(filepath)
if err != nil {
t.Fatalf("failed to open ip range file:%s, err:%s", filepath, err)
}
reader := bufio.NewReader(file)
ranges := make([]string, 0)
for {
ip, _, err := reader.ReadLine()
if err != nil {
if err == io.EOF {
break
}
t.Fatalf("failed to read ip range file, err:%s", err)
}
t.Logf("ip:%s", *(*string)(unsafe.Pointer(&ip)))
ranges = append(ranges, *(*string)(unsafe.Pointer(&ip)))
}
t.Logf("%v", ranges)
}
result:
task_test.go:71: ip:1.1.1.0/24
task_test.go:71: ip:1.1.2.0/24
task_test.go:71: ip:2.2.1.0/24
task_test.go:71: ip:2.2.2.0/24
task_test.go:75: [2.2.2.0/24 1.1.2.0/24 2.2.1.0/24 2.2.2.0/24]
Why did 1.1.1.0/24 change to 2.2.2.0/24?
If I change
*(*string)(unsafe.Pointer(&ip))
to string(ip), it works.
So, while reinterpreting a slice-header as a string-header the way you did is absolutely bonkers and has no guarantee whatsoever of working correctly, it's only indirectly the cause of your problem.
The real problem is that you're retaining a pointer to the return value of bufio.Reader.ReadLine(), but the docs for that method say "The returned buffer is only valid until the next call to ReadLine." That means the reader is free to reuse that memory later on, and that's what's happening.
When you do the cast in the proper way, string(ip), Go copies the contents of the buffer into the newly-created string, which remains valid in the future. But when you type-pun the slice into a string, you keep the exact same pointer, which stops working as soon as the reader refills its buffer.
If you decided to do the pointer trickery as a performance hack to avoid copying and allocation... too bad. The reader interface is going to force you to copy the data out anyway, and since it does, you should just use string().

When I io.Copy a file, it doesn't block and trying to use it might fail until it finishes

I have this code (more or less):
resp, err := http.Get(url)
if err != nil {
// handle error
}
if resp.StatusCode != http.StatusOK {
// handle error
}
out, err := os.Create(filepath)
if err != nil {
return err
}
// Write the body to file
_, err = io.Copy(out, resp.Body)
resp.Body.Close()
out.Close()
My issue is that if I immediately try to do something (e.g. take the hash of this file), then I see that it is still copying for a while.
At first I was deferring the out.Close(), and then I thought that I needed to call out.Close() right after the io.Copy, which would block until it was done with it. Or so I thought.
This didn't work and I still have the same issue.
How do I block or wait for the io.Copy operation to finish?
Thanks!
Likely you are hitting some disk buffer/cache, where your OS or disk device keeps some data in memory before actually persisting the write to the disk.
Calling
out.Sync()
forces an fsync syscall, which instructs the OS to flush its buffers and write the data to disk. I suggest calling out.Sync() after the io.Copy call returns.
Related docs you may find interesting:
https://pkg.go.dev/os#File.Sync
https://man7.org/linux/man-pages/man2/fdatasync.2.html

How can I process a large http response body from a Druid query in Go

I am currently querying Druid and getting a large dataset back (roughly 4-5 GB). I would like to process this response and decode the JSON into a list of structs. It works fine when I change the query to return a smaller dataset, but as soon as the response gets too large I get an unexpected EOF error.
I have tried reading the entire response body
bytes, err := ioutil.ReadAll(resp.Body)
Decoding the response body directly
var object []NewObject
err = json.NewDecoder(resp.Body).Decode(&object)
Creating a buffer and writing to a file
f, err := os.OpenFile("/tmp/test.txt", os.O_APPEND|os.O_WRONLY, 0600)
if err != nil {
panic(err)
}
defer f.Close()
const oneMB = 1024 * 1024
bytesRead := 0
respBuf := make([]byte, oneMB)
// Read the response body
for {
n, err := resp.Body.Read(respBuf)
bytesRead += n
// Write only the n bytes actually read; Read may return data together with io.EOF
if n > 0 {
if _, werr := f.Write(respBuf[:n]); werr != nil {
panic(werr)
}
}
if err == io.EOF {
break
}
if err != nil {
fmt.Println("Error reading HTTP response: ", err.Error())
break
}
}
All of these have ended with me getting an unexpected EOF error. I am using the default net/http and encoding/json packages, which look like they should work fine. Is there anything else I can try?
I've figured out the problem. It appears that Druid has a hard query timeout (this link explains more about these configuration settings). Interestingly, if you request the response as an object and it is too large, Druid simply cuts off mid byte stream. If you request the result as an array and it is too large, Druid sends the bytes needed to finish the current element and then a closing ], so the result parses without errors even though it may have been cut off midstream.

Golang reading from serial

I'm trying to read from a serial port (a GPS device on a Raspberry Pi).
Following the instructions from http://www.modmypi.com/blog/raspberry-pi-gps-hat-and-python
I can read from shell using
stty -F /dev/ttyAMA0 raw 9600 cs8 clocal -cstopb
cat /dev/ttyAMA0
I get well formatted output
$GNGLL,5133.35213,N,00108.27278,W,160345.00,A,A*65
$GNRMC,160346.00,A,5153.35209,N,00108.27286,W,0.237,,290418,,,A*75
$GNVTG,,T,,M,0.237,N,0.439,K,A*35
$GNGGA,160346.00,5153.35209,N,00108.27286,W,1,12,0.67,81.5,M,46.9,M,,*6C
$GNGSA,A,3,29,25,31,20,26,23,21,16,05,27,,,1.11,0.67,0.89*10
$GNGSA,A,3,68,73,83,74,84,75,85,67,,,,,1.11,0.67,0.89*1D
$GPGSV,4,1,15,04,,,34,05,14,040,21,09,07,330,,16,45,298,34*40
$GPGSV,4,2,15,20,14,127,18,21,59,154,30,23,07,295,26,25,13,123,22*74
$GPGSV,4,3,15,26,76,281,40,27,15,255,20,29,40,068,19,31,34,199,33*7C
$GPGSV,4,4,15,33,29,198,,36,23,141,,49,30,172,*4C
$GLGSV,3,1,11,66,00,325,,67,13,011,20,68,09,062,16,73,12,156,21*60
$GLGSV,3,2,11,74,62,177,20,75,53,312,36,76,08,328,,83,17,046,25*69
$GLGSV,3,3,11,84,75,032,22,85,44,233,32,,,,35*62
$GNGLL,5153.35209,N,00108.27286,W,160346.00,A,A*6C
$GNRMC,160347.00,A,5153.35205,N,00108.27292,W,0.216,,290418,,,A*7E
$GNVTG,,T,,M,0.216,N,0.401,K,A*3D
$GNGGA,160347.00,5153.35205,N,00108.27292,W,1,12,0.67,81.7,M,46.9,M,,*66
$GNGSA,A,3,29,25,31,20,26,23,21,16,05,27,,,1.11,0.67,0.89*10
$GNGSA,A,3,68,73,83,74,84,75,85,67,,,,,1.11,0.67,0.89*1D
$GPGSV,4,1,15,04,,,34,05,14,040,21,09,07,330,,16,45,298,34*40
(I've put some random data in)
I'm trying to read this in Go. Currently, I have
package main
import "fmt"
import "log"
import "github.com/tarm/serial"
func main() {
config := &serial.Config{
Name: "/dev/ttyAMA0",
Baud: 9600,
ReadTimeout: 1,
Size: 8,
}
stream, err := serial.OpenPort(config)
if err != nil {
log.Fatal(err)
}
buf := make([]byte, 1024)
for {
n, err := stream.Read(buf)
if err != nil {
log.Fatal(err)
}
s := string(buf[:n])
fmt.Println(s)
}
}
But this prints malformed data. I suspect that this is due to the buffer size or the value of Size in the config struct being wrong, but I'm not sure how to get those values from the stty settings.
Looking back, I think the issue is that I'm getting a stream and I want to be able to iterate over lines of the stream, rather than chunks. This is how the stream is output:
$GLGSV,3
,1,09,69
,10,017,
,70,43,0
69,,71,3
2,135,27
,76,23,2
32,22*6F
$GLGSV
,3,2,09,
77,35,30
0,21,78,
11,347,,
85,31,08
1,30,86,
72,355,3
6*6C
$G
LGSV,3,3
,09,87,2
4,285,30
*59
$GN
GLL,5153
.34919,N
,00108.2
7603,W,1
92901.00
,A,A*6A
The struct you get back from serial.OpenPort() contains a pointer to an open os.File corresponding to the opened serial port connection. When you Read() from this, the library calls Read() on the underlying os.File.
The documentation for this function call is:
Read reads up to len(b) bytes from the File. It returns the number of bytes read and any error encountered. At end of file, Read returns 0, io.EOF.
This means you have to keep track of how much data was read. You also have to keep track of whether there were newlines, if this is important to you. Unfortunately, the underlying *os.File is not exported, so you'll find it difficult to use tricks like bufio.Reader.ReadLine() on it directly. It may be worth modifying the library and sending a pull request.
As Matthew Rankin noted in a comment, Port implements io.ReadWriter so you can simply use bufio to read by lines.
stream, err := serial.OpenPort(config)
if err != nil {
log.Fatal(err)
}
scanner := bufio.NewScanner(stream)
for scanner.Scan() {
fmt.Println(scanner.Text()) // Println will add back the final '\n'
}
if err := scanner.Err(); err != nil {
log.Fatal(err)
}
Change
fmt.Println(s)
to
fmt.Print(s)
and you will probably get what you want.
Or did I misunderstand the question?
Two additions to Michael Hamptom's answer which can be useful:
line endings
You might receive data that is not newline-separated text. bufio.Scanner uses ScanLines by default to split the received data into lines - but you can also write your own line splitter based on the default function's signature and set it for the scanner:
scanner := bufio.NewScanner(stream)
scanner.Split(ownLineSplitter) // set custom line splitter function
reader shutdown
You might not receive a constant stream but only some packets of bytes from time to time. If no bytes arrive at the port, the scanner will block and you can't just kill it. You'll have to close the stream to do so, effectively raising an error. To not block any outer loops and handle errors appropriately, you can wrap the scanner in a goroutine that takes a context. If the context was cancelled, ignore the error, otherwise forward the error. In principle, this can look like
var errChan = make(chan error)
var dataChan = make(chan []byte)
ctx, cancelPortScanner := context.WithCancel(context.Background())
go func(ctx context.Context) {
scanner := bufio.NewScanner(stream)
for scanner.Scan() { // will terminate if connection is closed
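// copy the token: the slice returned by scanner.Bytes() may be overwritten by the next Scan
dataChan <- append([]byte(nil), scanner.Bytes()...)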
}
// if execution reaches this point, something went wrong or stream was closed
select {
case <-ctx.Done():
return // ctx was cancelled, just return without error
default:
errChan <- scanner.Err() // ctx wasn't cancelled, forward error
}
}(ctx)
// handle data from dataChan, error from errChan
To stop the scanner, you would cancel the context and close the connection:
cancelPortScanner()
stream.Close()

Gzip uncompressed http.Response.Body

I am building a Go application that takes an http.Response object and saves it (response headers and body) to a redis hash. When the application receives an http.Response.Body that is not gzipped, I want to gzip it before saving it to the cache.
My confusion stems from my inability to make clear sense of Go's io interfaces, and how to negotiate between http.Response.Body's io.ReadCloser and the gzip Writer. I imagine there is an elegant, streaming solution here, but I can't quite get it to work.
If you've already determined that the body is uncompressed, and you need a []byte of the compressed data (as opposed to already having an io.Writer to write to; for example, if you wanted to save the body to a file, you'd stream into the file rather than into a buffer), then something like this should work:
func getCompressedBody(r *http.Response) ([]byte, error) {
var buf bytes.Buffer
gz := gzip.NewWriter(&buf)
if _, err := io.Copy(gz, r.Body); err != nil {
return nil, err
}
err := gz.Close()
return buf.Bytes(), err
}
(This is just an example and would probably be inline rather than a function; if you wanted it as a function, it should probably take an io.Reader instead of an *http.Response.)
