Using an io.WriteSeeker without a File in Go

I am using a third party library to generate PDFs. In order to write the PDF at the end (after all of the content has been added using the lib's API), the pdfWriter type has a Write function that expects an io.WriteSeeker.
This is OK if I want to work with files, but I need to work in-memory. Trouble is, I can't find any way to do this - the only native type I found that implements io.WriteSeeker is File.
This is the part that works by using File for the io.Writer in the Write function of the pdfWriter:
fWrite, err := os.Create(outputPath)
if err != nil {
	return err
}
defer fWrite.Close()
err = pdfWriter.Write(fWrite)
Is there a way to do this without an actual File, like getting a []byte or something?

Unfortunately the standard library offers no ready-made in-memory io.WriteSeeker implementation.
But as always, you can implement your own. It's not that hard.
An io.WriteSeeker is an io.Writer and an io.Seeker, so basically you only need to implement 2 methods:
Write(p []byte) (n int, err error)
Seek(offset int64, whence int) (int64, error)
Read the general contract of these methods in their documentation to see how they should behave.
Here's a simple implementation which uses an in-memory byte slice ([]byte). It's not optimized for speed, this is just a "demo" implementation.
type mywriter struct {
	buf []byte
	pos int
}

func (m *mywriter) Write(p []byte) (n int, err error) {
	minCap := m.pos + len(p)
	if minCap > cap(m.buf) { // Make sure buf has enough capacity:
		buf2 := make([]byte, len(m.buf), minCap+len(p)) // add some extra
		copy(buf2, m.buf)
		m.buf = buf2
	}
	if minCap > len(m.buf) {
		m.buf = m.buf[:minCap]
	}
	copy(m.buf[m.pos:], p)
	m.pos += len(p)
	return len(p), nil
}

func (m *mywriter) Seek(offset int64, whence int) (int64, error) {
	newPos, offs := 0, int(offset)
	switch whence {
	case io.SeekStart:
		newPos = offs
	case io.SeekCurrent:
		newPos = m.pos + offs
	case io.SeekEnd:
		newPos = len(m.buf) + offs
	}
	if newPos < 0 {
		return 0, errors.New("negative result pos")
	}
	m.pos = newPos
	return int64(newPos), nil
}
Yes, and that's it.
Testing it:
my := &mywriter{}
var ws io.WriteSeeker = my
ws.Write([]byte("hello"))
fmt.Println(string(my.buf))
ws.Write([]byte(" world"))
fmt.Println(string(my.buf))
ws.Seek(-2, io.SeekEnd)
ws.Write([]byte("k!"))
fmt.Println(string(my.buf))
ws.Seek(6, io.SeekStart)
ws.Write([]byte("gopher"))
fmt.Println(string(my.buf))
Output (try it on the Go Playground):
hello
hello world
hello work!
hello gopher
Things that can be improved:
Create a mywriter value with an initial empty buf slice, but with a capacity that will most likely cover the size of the result PDF document. E.g. if you estimate the result PDFs are around 1 MB, create a buffer with capacity for 2 MB like this:
my := &mywriter{buf: make([]byte, 0, 2<<20)}
Inside mywriter.Write(), when capacity needs to be increased (and existing content copied over), it may be profitable to use a bigger increment, e.g. doubling the current capacity up to a certain extent, which reserves room for future appends and minimizes reallocations.

Related

Is there a way to get transfer speed from io.Copy?

I am copying a network stream to a file using io.Copy. I would like to extract the current speed, preferably in bytes per second, at which the transfer is operating.
res, err := http.Get(url)
if err != nil {
	panic(err)
}
// Open output file
out, err := os.OpenFile("output", os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0644)
if err != nil {
	panic(err)
}
// Close output file as well as body
defer out.Close()
defer func(Body io.ReadCloser) {
	err := Body.Close()
	if err != nil {
		panic(err)
	}
}(res.Body)
_, err = io.Copy(out, res.Body)
As noted in the comments - the entire transfer rate is easily computed after the fact - especially when using io.Copy. If you want to track "live" transfer rates - and poll the results over a long file transfer - then a little more work is involved.
Below I've outlined a simple io.Reader wrapper to track the overall transfer rate. For brevity, it is not goroutine safe, but it would be trivial to make it so. One could then poll the progress from another goroutine while the main goroutine does the reading.
You can create an io.Reader wrapper - and use that to track the moment of first read - and then track future read byte counts. The final result may look like this:
r := NewRater(res.Body) // io.Reader wrapper
n, err := io.Copy(out, r)
log.Print(r) // Stringer method shows human readable "b/s" output
To implement this, one approach is:
type rate struct {
	r          io.Reader
	count      int64 // may have large (2GB+) files - so don't use int
	start, end time.Time
}

func NewRater(r io.Reader) *rate { return &rate{r: r} }
then we need the wrapper Read to track the underlying io.Reader's progress:
func (r *rate) Read(b []byte) (n int, err error) {
	if r.start.IsZero() {
		r.start = time.Now()
	}
	n, err = r.r.Read(b) // underlying io.Reader read
	r.count += int64(n)
	if err == io.EOF {
		r.end = time.Now()
	}
	return
}
the rate at any time can be polled like so - even before EOF:
func (r *rate) Rate() (n int64, d time.Duration) {
	end := r.end
	if end.IsZero() {
		end = time.Now()
	}
	return r.count, end.Sub(r.start)
}
and a simple Stringer method to show b/s:
func (r *rate) String() string {
	n, d := r.Rate()
	return fmt.Sprintf("%.0f b/s", float64(n)/d.Seconds())
}
Note: the above io.Reader wrapper has no locking in place, so operations must be from the same goroutine. Since the question relates to io.Copy - then this is a safe assumption to make.

Serialize struct fields to a pre-existing slice of bytes

I have a setup where I receive data over the network and deserialize it into my struct. That works fine, but now I need to serialize the struct into a slice buffer to send it across the network.
I am trying to avoid allocating more than needed, so I have already set up a buffer which I'd like to use for all my serializing, but I am not sure how to do this.
My setup is like this:
receiveBuffer := make([]byte, 1500)
header := receiveBuffer[0:1]
message := receiveBuffer[1:]
So I am trying to write fields from a struct to message and the total number of bytes for all the fields as a value for header.
This was how I deserialized to the struct:
// Deserialize ...
func (userSession *UserSession) Deserialize(message []byte) {
	userSession.UID = int64(binary.LittleEndian.Uint32(message[0:4]))
	userSession.UUID = string(message[4:40])
	userSession.Username = string(message[40:])
}
I don't really know how to do the reverse of this, however. Is it possible without creating buffers for each field I want to serialize before copying to message?
Given the preallocated buffer buf, you can reverse the process like this:
buf[0] = byte(40+len(userSession.Username))
binary.LittleEndian.PutUint32(buf[1:], uint32(userSession.UID))
copy(buf[5:41], userSession.UUID)
copy(buf[41:], userSession.Username)
An alternative approach uses two helper functions.
One to encode a primitive to a byte slice:
func EncodeNumber2NetworkOrder(v interface{}) ([]byte, error) {
	switch v.(type) {
	case int: // int is at least 32 bits
		b := make([]byte, 4)
		binary.BigEndian.PutUint32(b, uint32(v.(int)))
		return b, nil
	case int8:
		b := []byte{byte(v.(int8))}
		return b, nil
	// ... truncated
and one to convert primitive, non-byte slices to a byte slice:
func EncodeBigEndian(in []float64) []byte {
	out := make([]byte, len(in)*8)
	var wg sync.WaitGroup
	wg.Add(len(in))
	for i := 0; i < len(in); i++ {
		go func(out *[]byte, i int, f float64) {
			defer wg.Done()
			binary.BigEndian.PutUint64((*out)[(i<<3):], math.Float64bits(f))
		}(&out, i, in[i])
	}
	wg.Wait()
	return out
}
Your binary serialization might then look like this for a bogus struct like:
type Foo struct {
	time int64
	data []float64
}

func Encode(f *Foo) []byte {
	da := encoder.EncodeBigEndian(f.data)
	bytes := make([]byte, 0)
	bytes = append(bytes, da...)
	return bytes
}

What is wrong with my solution to task 23 of the Go tour?

There is a Go tour. I've solved https://tour.golang.org/methods/23 like this:
func (old_reader rot13Reader) Read(b []byte) (int, error) {
	const LEN int = 1024
	tmp_bytes := make([]byte, LEN)
	old_len, err := old_reader.r.Read(tmp_bytes)
	if err == nil {
		tmp_bytes = tmp_bytes[:old_len]
		rot13(tmp_bytes)
		return len(tmp_bytes), nil
	} else {
		return 0, err
	}
}

func main() {
	s := strings.NewReader("Lbh penpxrq gur pbqr!")
	r := rot13Reader{s}
	io.Copy(os.Stdout, &r)
}
rot13 is correct, and debug output right before the return shows the correct string. So why is there no output to the console?
The Read method for an io.Reader needs to operate on the byte slice provided to it. You're reading into a new slice, and never modifying the original.
Just use b throughout the Read method:
func (old_reader rot13Reader) Read(b []byte) (int, error) {
	n, err := old_reader.r.Read(b)
	rot13(b[:n])
	return n, err
}
You're never modifying b in your reader. The semantic of io.Reader's Read function is that you put the data into b's underlying array directly.
Assuming the rot13() function also modifies in place, this will work (edit: I've tried to keep this code close to your version so you can see what's changed more easily. JimB's solution is a more idiomatic solution to this problem):
func (old_reader rot13Reader) Read(b []byte) (int, error) {
	tmp_bytes := make([]byte, len(b))
	old_len, err := old_reader.r.Read(tmp_bytes)
	tmp_bytes = tmp_bytes[:old_len]
	rot13(tmp_bytes)
	for i := range tmp_bytes {
		b[i] = tmp_bytes[i]
	}
	return old_len, err
}
Example (with stubbed rot13()): https://play.golang.org/p/vlbra-46zk
On a side note, from an idiomatic perspective, old_reader isn't a proper receiver name (nor is old_len a proper variable name). Go prefers short receiver names (like r or rdr in this case), and also prefers camelCase to underscores (underscores will actually fire a golint warning).
Edit2: A more idiomatic version of your code. Kept the same mechanism of action, just cleaned it up a bit.
func (rdr rot13Reader) Read(b []byte) (int, error) {
	tmp := make([]byte, len(b))
	n, err := rdr.r.Read(tmp)
	tmp = tmp[:n]
	rot13(tmp)
	for i := range tmp {
		b[i] = tmp[i]
	}
	return n, err
}
From this, removing the tmp byte slice and using the destination b directly results in JimB's idiomatic solution to the problem.
Edit3: Updated to fix the issue Paul pointed out in comments.

Tour of Go exercise #22: Reader, what does the question mean?

Exercise: Readers
Implement a Reader type that emits an infinite stream of the ASCII character 'A'.
I don't understand the question. How do I emit the character 'A'? Into which variable should I set that character?
Here's what I tried:
package main

import "golang.org/x/tour/reader"

type MyReader struct{}

// TODO: Add a Read([]byte) (int, error) method to MyReader.
func main() {
	reader.Validate(MyReader{}) // what did this function expect?
}

func (m MyReader) Read(b []byte) (i int, e error) {
	b = append(b, 'A') // this is wrong..
	return 1, nil      // this is also wrong..
}
Ah I understand XD
I think it would be better to say: "rewrite all values in []byte into 'A's"
package main
import "golang.org/x/tour/reader"
type MyReader struct{}
// TODO: Add a Read([]byte) (int, error) method to MyReader.
func (m MyReader) Read(b []byte) (i int, e error) {
	for x := range b {
		b[x] = 'A'
	}
	return len(b), nil
}

func main() {
	reader.Validate(MyReader{})
}
An io.Reader's Read role is to write a given memory location with data read from its source.
To implement a stream of 'A's, the function must fill the given memory location with 'A' values.
It is not required to fill the entire slice provided as input; the reader can decide how many bytes of the input slice are written (Read reads up to len(p) bytes into p), and it must return that number to indicate to the consumer the length of data to process.
By convention, an io.Reader indicates its end by returning an io.EOF error. If the reader never returns an error, it behaves as an infinite source of data to its consumer, which can never detect an exit condition.
Note that a call to Read that returns 0 bytes read can happen and does not indicate anything in particular: callers should treat a return of 0 and nil as indicating that nothing happened. That is why this non-solution https://play.golang.org/p/aiUyc4UDYi2 fails with a timeout.
In that regard, the solution provided at https://stackoverflow.com/a/68077578/4466350 - return copy(b, "A"), nil - is really just right: it writes the minimum required, with an elegant use of built-ins and syntax facilities, and it never returns an error.
The suggested answer didn't work for me, even without the typos.
Try as I might, that string would not go into b.
func (r MyReader) Read(b []byte) (int, error) {
	return copy(b, "A"), nil
}
My solution: just add one byte at a time, storing the index i via a closure.
package main
import (
"golang.org/x/tour/reader"
)
type MyReader struct{}
func (mr MyReader) Read(b []byte) (int, error) {
	i := 0
	p := func() int {
		b[i] = 'A'
		i += 1
		return i
	}
	return p(), nil
}
func main() {
reader.Validate(MyReader{})
}
Simplest one:
func (s MyReader) Read(b []byte) (int, error) {
	b[0] = byte('A')
	return 1, nil
}
You can generalize the idea to create an eternal reader, alwaysReader, from which you always read the same byte value over and over (it never results in EOF):
package readers
type alwaysReader struct {
	value byte
}

func (r alwaysReader) Read(p []byte) (n int, err error) {
	for i := range p {
		p[i] = r.value
	}
	return len(p), nil
}

func NewAlwaysReader(value byte) alwaysReader {
	return alwaysReader{value}
}
NewAlwaysReader() is the constructor for alwaysReader (which isn't exported). The result of NewAlwaysReader('A') is a reader from which you will always read 'A'.
A clarifying unit test for alwaysReader:
package readers_test
import (
	"bytes"
	"io"
	"readers"
	"testing"
)

func TestEmptyReader(t *testing.T) {
	const numBytes = 128
	const value = 'A'
	buf := bytes.NewBuffer(make([]byte, 0, numBytes))
	reader := io.LimitReader(readers.NewAlwaysReader(value), numBytes)
	n, err := io.Copy(buf, reader)
	if err != nil {
		t.Fatalf("copy failed: %v", err)
	}
	if n != numBytes {
		t.Errorf("%d bytes read but %d expected", n, numBytes)
	}
	for i, elem := range buf.Bytes() {
		if elem != value {
			t.Errorf("byte at position %d has not the value %v but %v", i, value, elem)
		}
	}
}
Since we can read from the alwaysReader forever, we need to decorate it with an io.LimitReader so that we end up reading at most numBytes from it. Otherwise, the bytes.Buffer would eventually run out of memory reallocating its internal buffer as io.Copy() keeps going.
Note that the following implementation of Read() for alwaysReader is also valid:
func (r alwaysReader) Read(p []byte) (n int, err error) {
	if len(p) > 0 {
		p[0] = r.value
		return 1, nil
	}
	return 0, nil
}
The former Read() implementation fills the whole byte slice with the byte value, whereas the latter writes a single byte.

Consuming all elements of a channel into a slice

How can I construct a slice out of all of elements consumed from a channel (like Python's list does)? I can use this helper function:
func ToSlice(c chan int) []int {
	s := make([]int, 0)
	for i := range c {
		s = append(s, i)
	}
	return s
}
but due to the lack of generics, I'll have to write that for every type, won't I? Is there a builtin function that implements this? If not, how can I avoid copying and pasting the above code for every single type I'm using?
If there's only a few instances in your code where the conversion is needed, then there's absolutely nothing wrong with copying the 7 lines of code a few times (or even inlining it where it's used, which reduces it to 4 lines of code and is probably the most readable solution).
If you've really got conversions between lots and lots of types of channels and slices and want something generic, then you can do this with reflection at the cost of ugliness and lack of static typing at the callsite of ChanToSlice.
Here's complete example code for how you can use reflect to solve this problem with a demonstration of it working for an int channel.
package main

import (
	"fmt"
	"reflect"
)

// ChanToSlice reads all data from ch (which must be a chan), returning a
// slice of the data. If ch is a 'chan T' then the return value is of type
// []T inside the returned interface.
// A typical call would be sl := ChanToSlice(ch).([]int)
func ChanToSlice(ch interface{}) interface{} {
	chv := reflect.ValueOf(ch)
	slv := reflect.MakeSlice(reflect.SliceOf(reflect.TypeOf(ch).Elem()), 0, 0)
	for {
		v, ok := chv.Recv()
		if !ok {
			return slv.Interface()
		}
		slv = reflect.Append(slv, v)
	}
}

func main() {
	ch := make(chan int)
	go func() {
		for i := 0; i < 10; i++ {
			ch <- i
		}
		close(ch)
	}()
	sl := ChanToSlice(ch).([]int)
	fmt.Println(sl)
}
You could make ToSlice() just work on interface{}'s, but the amount of code you save here will likely cost you in complexity elsewhere.
func ToSlice(c chan interface{}) []interface{} {
	s := make([]interface{}, 0)
	for i := range c {
		s = append(s, i)
	}
	return s
}
Full example at http://play.golang.org/p/wxx-Yf5ESN
That being said: as @Volker said in the comments, from the slice (haha) of code you showed, it seems it'd be saner to either process the results in a streaming fashion or "buffer them up" at the generator and just send the slice down the channel.
