Size control on logging an unknown length of parameters - go

The Problem:
Right now I'm logging my SQL query and the args related to that query, but what happens if my args weigh a lot, say 100MB?
The Solution:
I want to iterate over the args and, once they exceed 0.5MB, take the args up to that point and log only those (of course I'll use the entire args set in the actual SQL query).
Where I am stuck:
I find it hard to find the size on disk of an interface{}.
How can I print it? (Is there a nicer way to do it than %v?)
My main concern is the first point: how can I find the size? Do I need to know the type, whether it's an array, whether it lives on the stack or the heap, etc.?
If code helps, here is my code structure (everything sits in dal pkg in util file):
package dal
import (
"fmt"
)
const limitedLogArgsSizeB = 100000 // ~ 0.1MB
func parsedArgs(args ...interface{}) string {
currentSize := 0
var res string
for i := 0; i < len(args); i++ {
currentEleSize := getSizeOfElement(args[i])
if currentSize+currentEleSize > limitedLogArgsSizeB {
break
}
currentSize += currentEleSize
if i > 0 {
res += ", " // separator only between elements, so the result has no leading comma
}
res += fmt.Sprint(args[i])
}
return "[" + res + "]"
}
func getSizeOfElement(interface{}) (sizeInBytes int) {
}
So as you can see I expect to get back from parsedArgs() a string that looks like:
"[4378233, 33, true]"
for completeness, the query that goes with it:
INSERT INTO Person (id,age,is_healthy) VALUES ($0,$1,$2)
So, to demonstrate the point of all this:
Let's say the first two args together equal exactly the size limit that I want to log. I will then get back from parsedArgs() only the first two args, as a string like this:
"[4378233, 33]"
I can provide further details upon request, Thanks :)

Getting the memory size of arbitrary values (arbitrary data structures) is not impossible but "hard" in Go. For details, see How to get memory size of variable in Go?
The easiest solution could be to produce the data to be logged in memory, and you can simply truncate it before logging (e.g. if it's a string or a byte slice, simply slice it). This is however not the gentlest solution (slower and requires more memory).
Instead I would achieve what you want differently. I would try to assemble the data to be logged, but I would use a special io.Writer as the target (which may be targeted at your disk or at an in-memory buffer) which keeps track of the bytes written to it, and once a limit is reached, it could discard further data (or report an error, whatever suits you).
You can see a counting io.Writer implementation here: Size in bits of object encoded to JSON?
type CounterWr struct {
io.Writer
Count int
}
func (cw *CounterWr) Write(p []byte) (n int, err error) {
n, err = cw.Writer.Write(p)
cw.Count += n
return
}
We can easily change it to become a functional limited-writer:
type LimitWriter struct {
io.Writer
Remaining int
}
func (lw *LimitWriter) Write(p []byte) (n int, err error) {
if lw.Remaining == 0 {
return 0, io.EOF
}
if lw.Remaining < len(p) {
p = p[:lw.Remaining]
}
n, err = lw.Writer.Write(p)
lw.Remaining -= n
return
}
And you can use the fmt.FprintXXX() functions to write into a value of this LimitWriter.
An example writing to an in-memory buffer:
buf := &bytes.Buffer{}
lw := &LimitWriter{
Writer: buf,
Remaining: 20,
}
args := []interface{}{1, 2, "Looooooooooooong"}
fmt.Fprint(lw, args)
fmt.Printf("%d %q", buf.Len(), buf)
This will output (try it on the Go Playground):
20 "[1 2 Looooooooooooon"
As you can see, our LimitWriter only allowed 20 bytes to be written (LimitWriter.Remaining), and the rest was discarded.
Note that in this example I assembled the data in an in-memory buffer, but in your logging system you can write directly to your logging stream, just wrap it in LimitWriter (so you can completely omit the in-memory buffer).
Optimization tip: if you have the arguments as a slice, you may optimize the truncated rendering by using a loop, and stop printing arguments once the limit is reached.
An example doing this:
buf := &bytes.Buffer{}
lw := &LimitWriter{
Writer: buf,
Remaining: 20,
}
args := []interface{}{1, 2, "Loooooooooooooooong", 3, 4, 5}
io.WriteString(lw, "[")
for i, v := range args {
if _, err := fmt.Fprint(lw, v, " "); err != nil {
fmt.Printf("Breaking at argument %d, err: %v\n", i, err)
break
}
}
io.WriteString(lw, "]")
fmt.Printf("%d %q", buf.Len(), buf)
Output (try it on the Go Playground):
Breaking at argument 3, err: EOF
20 "[1 2 Loooooooooooooo"
The good thing about this is that once we reach the limit, we don't have to produce the string representation of the remaining arguments that would be discarded anyway, saving some CPU (and memory) resources.
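To tie this back to the question, here is a minimal sketch of how parsedArgs() could be built on top of the LimitWriter above. The extra limit parameter and the choice not to count the surrounding brackets against the limit are assumptions made for illustration, not part of the original code.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// LimitWriter forwards at most Remaining bytes to the wrapped writer
// (same type as in the answer above).
type LimitWriter struct {
	io.Writer
	Remaining int
}

func (lw *LimitWriter) Write(p []byte) (n int, err error) {
	if lw.Remaining == 0 {
		return 0, io.EOF
	}
	if lw.Remaining < len(p) {
		p = p[:lw.Remaining]
	}
	n, err = lw.Writer.Write(p)
	lw.Remaining -= n
	return
}

// parsedArgs renders args as "[a, b, c]", spending at most limit bytes on
// the rendered arguments. The limit parameter is an assumption of this
// sketch; the brackets are not counted against it.
func parsedArgs(limit int, args ...interface{}) string {
	buf := &bytes.Buffer{}
	lw := &LimitWriter{Writer: buf, Remaining: limit}
	for i, v := range args {
		if i > 0 {
			if _, err := io.WriteString(lw, ", "); err != nil {
				break // limit reached, skip the remaining arguments
			}
		}
		if _, err := fmt.Fprint(lw, v); err != nil {
			break
		}
	}
	return "[" + buf.String() + "]"
}

func main() {
	fmt.Println(parsedArgs(100, 4378233, 33, true)) // [4378233, 33, true]
	fmt.Println(parsedArgs(10, 4378233, 33, true))  // [4378233, 3]
}
```

Note that this cuts at a byte boundary, so the last logged argument may be truncated in the middle (33 becomes 3 in the second call) rather than dropped as a whole. Also, when it truncates, this LimitWriter reports fewer bytes than requested together with a nil error, which the io.Writer documentation asks implementations not to do; that is acceptable for a private logging helper but worth knowing before reusing the type elsewhere.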

Related

How is this code generating memory aligned slices?

I'm trying to do direct i/o on linux, so I need to create memory aligned buffers. I copied some code to do it, but I don't understand how it works:
package main
import (
"fmt"
"golang.org/x/sys/unix"
"unsafe"
"yottaStore/yottaStore-go/src/yfs/test/utils"
)
const (
AlignSize = 4096
BlockSize = 4096
)
// Looks like dark magic
func Alignment(block []byte, AlignSize int) int {
return int(uintptr(unsafe.Pointer(&block[0])) & uintptr(AlignSize-1))
}
func main() {
path := "/path/to/file.txt"
fd, err := unix.Open(path, unix.O_RDONLY|unix.O_DIRECT, 0666)
if err != nil {
panic(err)
}
defer unix.Close(fd) // only close the descriptor once we know Open succeeded
file := make([]byte, 4096*2)
a := Alignment(file, AlignSize)
offset := 0
if a != 0 {
offset = AlignSize - a
}
file = file[offset : offset+BlockSize]
_, readErr := unix.Pread(fd, file, 0) // byte count unused here; an unused variable would not compile
if readErr != nil {
panic(readErr)
}
fmt.Println(a, offset, offset+utils.BlockSize, len(file))
fmt.Println("Content is: ", string(file))
}
I understand that I'm generating a slice twice as big as what I need and then extracting a memory-aligned block from it, but the Alignment function doesn't make sense to me.
How does the Alignment function work?
If I try to fmt.Println the intermediate steps of that function I get different results. Why? I guess because observing it changes its memory alignment (like in quantum physics :D).
Edit:
Example with fmt.Println, where I don't need any more alignment:
package main
import (
"fmt"
"golang.org/x/sys/unix"
"unsafe"
)
func main() {
path := "/path/to/file.txt"
fd, err := unix.Open(path, unix.O_RDONLY|unix.O_DIRECT, 0666)
if err != nil {
panic(err)
}
defer unix.Close(fd) // only close the descriptor once we know Open succeeded
file := make([]byte, 4096)
fmt.Println("Pointer: ", &file[0])
n, readErr := unix.Pread(fd, file, 0)
fmt.Println("Return is: ", n)
if readErr != nil {
panic(readErr)
}
fmt.Println("Content is: ", string(file))
}
Your AlignSize is a power of 2. In binary representation it consists of a single 1 bit followed only by zeros:
fmt.Printf("%b", AlignSize) // 1000000000000
A slice allocated by make() (or more precisely, its backing array) may start at a memory address that is more or less random: in binary, an arbitrary sequence of ones and zeros.
Since you allocate twice the required size, that's a guarantee that the backing array will cover an address space that has an address in the middle somewhere that ends with as many zeros as the AlignSize's binary representation, and has BlockSize room in the array starting at this. We want to find this address.
This is what the Alignment() function does. It gets the starting address of the backing array with &block[0]. In Go there's no pointer arithmetic, so in order to do something like that, we have to convert the pointer to an integer (there is integer arithmetic of course). In order to do that, we have to convert the pointer to unsafe.Pointer: all pointers are convertible to this type, and unsafe.Pointer can be converted to uintptr (which is an unsigned integer large enough to store the uninterpreted bits of a pointer value), on which–being an integer–we can perform integer arithmetic.
We use bitwise AND with the value uintptr(AlignSize-1). Since AlignSize is a power of 2 (contains a single 1 bit followed by zeros), the number one less is a number whose binary representation is full of ones, as many as trailing zeros AlignSize has. See this example:
x := 0b1010101110101010101
fmt.Printf("AlignSize   : %22b\n", AlignSize)
fmt.Printf("AlignSize-1 : %22b\n", AlignSize-1)
fmt.Printf("x           : %22b\n", x)
fmt.Printf("result of & : %22b\n", x&(AlignSize-1))
Output:
AlignSize   :          1000000000000
AlignSize-1 :           111111111111
x           :    1010101110101010101
result of & :           110101010101
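So the result of & tells you how far the start address lies past the previous multiple of AlignSize. If you subtract it from AlignSize, you get the offset that leads to the next address with as many trailing zeros as AlignSize itself: that address is "aligned" to a multiple of AlignSize. (If the result of & is already 0, the start address is aligned and the offset stays 0.)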
So we will use the part of the file slice starting at offset, and we only need BlockSize:
file = file[offset : offset+BlockSize]
Edit:
Looking at your modified code trying to print the steps: I get an output like:
Pointer: 0xc0000b6000
Unsafe pointer: 0xc0000b6000
Unsafe pointer, uintptr: 824634466304
Unpersand: 0
Cast to int: 0
Return is: 0
Content is:
Note nothing is changed here. Simply the fmt package prints pointer values using hexadecimal representation, prefixed by 0x. uintptr values are printed as integers, using decimal representation. Those values are equal:
fmt.Println(0xc0000b6000, 824634466304) // output: 824634466304 824634466304
Also note the rest is 0 because in my case 0xc0000b6000 is already a multiple of 4096, in binary it is 1100000000000000000010110110000000000000.
Edit #2:
When you use fmt.Println() to debug parts of the calculation, that may change escape analysis and may change the allocation of the slice (from stack to heap). This depends on the used Go version too. Do not rely on your slice being allocated at an address that is (already) aligned to AlignSize.
See related questions for more details:
Mix print and fmt.Println and stack growing
why struct arrays comparing has different result
Addresses of slices of empty structs

Why copyBuffer implements while loop

I am trying to understand how copyBuffer works under the hood, but what is not clear to me is the use of while loop
for {
nr, er := src.Read(buf)
//...
}
Full code below:
// copyBuffer is the actual implementation of Copy and CopyBuffer.
// if buf is nil, one is allocated.
func copyBuffer(dst Writer, src Reader, buf []byte) (written int64, err error) {
// If the reader has a WriteTo method, use it to do the copy.
// Avoids an allocation and a copy.
if wt, ok := src.(WriterTo); ok {
return wt.WriteTo(dst)
}
// Similarly, if the writer has a ReadFrom method, use it to do the copy.
if rt, ok := dst.(ReaderFrom); ok {
return rt.ReadFrom(src)
}
size := 32 * 1024
if l, ok := src.(*LimitedReader); ok && int64(size) > l.N {
if l.N < 1 {
size = 1
} else {
size = int(l.N)
}
}
if buf == nil {
buf = make([]byte, size)
}
for {
nr, er := src.Read(buf)
if nr > 0 {
nw, ew := dst.Write(buf[0:nr])
if nw > 0 {
written += int64(nw)
}
if ew != nil {
err = ew
break
}
if nr != nw {
err = ErrShortWrite
break
}
}
if er != nil {
if er != EOF {
err = er
}
break
}
}
return written, err
}
It writes with nw, ew := dst.Write(buf[0:nr]), where nr is the number of bytes read, so why is the while loop necessary?
Let's assume that src does not implement WriterTo and dst does not implement ReaderFrom, since otherwise we would not get down to the for loop at all.
Let's further assume, for simplicity, that src is not a *LimitedReader, so that size is 32 * 1024: 32 kBytes. (There is no real loss of generality here, as a LimitedReader just allows the source to pick an even smaller number, at least in this case.)
Finally, let's assume buf is nil. (Or, if it's not nil, let's assume it has a capacity of 32768 bytes. If it has a larger capacity, we can just change the rest of the assumptions below, so that src has more bytes than there are in the buffer.)
So: we enter the loop with size holding the size of the temporary buffer buf, which is 32k. Now suppose the source is a file that holds 64k. It will take at least two src.Read() calls to read it! Clearly we need an outer loop. That's the overall for here.
Now suppose that src.Read() really does read the full 32k, so that nr is also 32 * 1024. The code will now call dst.Write(), passing the full 32k of data. Unlike src.Read()—which is allowed to only read, say, 1k instead of the full 32k—the next chunk of code requires that dst.Write() write all 32k. If it doesn't, the loop will break with err set to ErrShortWrite.
(An alternative would have been to keep calling dst.Write() with the remaining bytes, so that dst.Write() could write only 1k of the 32k, requiring 32 calls to get it all written.)
Note that src.Read() can choose to read only, say, 1k instead of 32k. If the actual file is 64k, it will then take 64 trips, rather than 2, through the outer loop. (An alternative choice would have been to force such a reader to be a LimitedReader. That's not as flexible, though, and is not what LimitedReader is intended for.)
func copyBuffer(dst Writer, src Reader, buf []byte) (written int64, err error)
When the total amount of data to copy is larger than len(buf), nr, er := src.Read(buf) will read at most len(buf) bytes each time.
That's how copyBuffer works:
for {
copy `len(buf)` data from `src` to `dst`;
if EOF {
//done
break;
}
if other Errors {
return Error
}
}
In the normal case, you would just call Copy rather than CopyBuffer.
func Copy(dst Writer, src Reader) (written int64, err error) {
return copyBuffer(dst, src, nil)
}
The option to have a user-supplied buffer is, I think, just for extreme optimization scenarios. The use of the word "Buffer" in the name is possibly a source of confusion since the function is not copying the buffer -- just using it internally.
There are two reasons for the looping...
The buffer might not be large enough to copy all of the data (the size of which is not necessarily known in advance) in one pass.
A Reader, unlike a Writer, may return partial results when it makes sense to do so.
Regarding the second item, consider that the Reader does not necessarily represent a fixed file or data buffer. It could, instead, be a live stream from some other thread or process. As such, there are many valid scenarios for stream data to be read and processed on an as-available basis. Although CopyBuffer doesn't do this, it still has to work with such behaviors from any Reader.

How can I log the value of passed parameters to a function?

My aim is to create a logging function that lists the name of a function and the list of passed parameters.
An example would be the following:
func MyFunc(a string, b int){
... some code ...
if err != nil{
errorDescription := MyLoggingFunction(err)
fmt.Println(errorDescription)
}
}
func main(){
MyFunc("hello", 42)
}
// where MyLoggingFunction should return something like:
// "MyFunc: a = hello, b = 42, receivedError = dummy error description"
So far it seems that in Go there is no way to get the name of the parameters of a function at runtime, as answered in this question, but I could give up this feature.
I've managed to get the function name and the memory address of the passed parameters by analysing the stack trace, but I'm hitting a wall when it comes to printing the parameters starting from their addresses (I understand that it might not be trivial depending on the type of the parameters, but even something very simple will do for now).
This is an implementation of the logging function I'm building (you can test it on this playground). Is there a way to print the parameter values?
func MyLoggingFunction(err error) string {
callersPCs := make([]uintptr, 10)
n := runtime.Callers(2, callersPCs) //skip first 2 entries, (Callers, GetStackTrace)
callersPCs = callersPCs[:n]
b := make([]byte, 1000)
runtime.Stack(b, false)
stackString := string(b)
frames := runtime.CallersFrames(callersPCs)
frame, _ := frames.Next()
trimmedString := strings.Split(strings.Split(stackString, "(")[2], ")")[0]
trimmedString = strings.Replace(trimmedString, " ", "", -1)
parametersPointers := strings.Split(trimmedString, ",")
return fmt.Sprintf("Name: %s \nParameters: %s \nReceived Error: %s", frame.Function, parametersPointers, err.Error())
}
If there are other ideas for building such logging function without analysing the stack trace, except the one that consists in passing a map[string]interface{} containing all the passed parameter names as keys and their values as values (that is my current implementation and is tedious since I'd like to log errors very often), I'd be glad to read them.

Pass slice as function argument, and modify the original slice

I know everything is passed by value in Go, meaning if I give a slice to a function and that function appends to the slice using the builtin append function, then the original slice will not have the values that were appended in the scope of the function.
For instance:
nums := []int{1, 2, 3}
func addToNumbs(nums []int) {
nums = append(nums, 4)
fmt.Println(nums) // [1 2 3 4]
}
addToNumbs(nums)
fmt.Println(nums) // [1 2 3]
This causes a problem for me, because I am trying to do recursion on an accumulated slice, basically a reduce type function except the reducer calls itself.
Here is an example:
func Validate(obj Validatable) ([]ValidationMessage, error) {
messages := make([]ValidationMessage, 0)
if err := validate(obj, messages); err != nil {
return messages, err
}
return messages, nil
}
func validate(obj Validatable, accumulator []ValidationMessage) error {
// If something is true, recurse
if something {
if err := validate(obj, accumulator); err != nil {
return err
}
}
// Append to the accumulator passed in
accumulator = append(accumulator, message)
return nil
}
The code above gives me the same error as the first example, in that the accumulator does not get all the appended values because they only exist within the scope of the function.
To solve this, I pass in a pointer struct into the function, and that struct contains the accumulator. That solution works nicely.
My question is, is there a better way to do this, and is my approach idiomatic to Go?
Updated solution (thanks to icza):
I just return the slice in the recursed function. Such a facepalm, should have thought of that.
func Validate(obj Validatable) ([]ValidationMessage, error) {
messages := make([]ValidationMessage, 0)
return validate(obj, messages)
}
func validate(obj Validatable, messages []ValidationMessage) ([]ValidationMessage, error) {
err := v.Struct(obj)
if _, ok := err.(*validator.InvalidValidationError); ok {
return []ValidationMessage{}, errors.New(err.Error())
}
if _, ok := err.(validator.ValidationErrors); ok {
messageMap := obj.Validate()
for _, err := range err.(validator.ValidationErrors) {
f := err.StructField()
t := err.Tag()
if v, ok := err.Value().(Validatable); ok {
return validate(v, messages)
} else if _, ok := messageMap[f]; ok {
if _, ok := messageMap[f][t]; ok {
messages = append(messages, ValidationMessage(messageMap[f][t]))
}
}
}
}
return messages, nil
}
If you want to pass a slice as a parameter to a function, and have that function modify the original slice, then you have to pass a pointer to the slice:
func myAppend(list *[]string, value string) {
*list = append(*list, value)
}
I have no idea if the Go compiler is naive or smart about this; performance is left as an exercise for the comment section.
For junior coders out there, please note that this code is provided without error checking. For example, this code will panic if list is nil.
A slice grows dynamically: if its current capacity is not sufficient to append a new value, append allocates a new underlying array. If the resulting slice is not returned, your appended values will not be visible to the caller.
Example:
package main
import (
"fmt"
)
func noReturn(a []int, data ...int) {
a = append(a, data...)
}
func returnS(a []int, data ...int) []int {
return append(a, data...)
}
func main() {
a := make([]int, 1)
noReturn(a, 1, 2, 3)
fmt.Println(a) // appended values are not visible: the slice grew on demand, which changed the underlying array
a = returnS(a, 1, 2, 3)
fmt.Println(a) // appended values are visible here since you are returning the updated slice
}
Result:
[0]
[0 1 2 3]
Note:
You don't have to return the slice if you are only updating items in it without adding new ones.
The slice you pass is a reference to an underlying array whose size is fixed. If you only modify the stored values, that's fine: the values will be updated outside the called function as well.
But if you add a new element to the slice, it is resliced to accommodate the new element. In other words, a new slice is created and the old slice is not overwritten.
In summary, if you need to extend or cut the slice, pass a pointer to the slice. Otherwise, using the slice itself is good enough.
Update
I need to explain some important facts. When new elements are added to a slice that was passed by value to a function, there are 2 cases:
A
The underlying array has reached its capacity, so a new array is allocated and a new slice replaces the original one inside the function. Obviously the original slice is not modified.
B
The underlying array has not reached its capacity and is modified. BUT the len field of the slice is not overwritten, because the slice was passed by value. As a result, the original slice is not aware that elements were appended, so it appears unmodified.
When appending data to a slice, if the underlying array doesn't have enough space, a new array is allocated. The elements of the old array are then copied into this new memory, followed by the new data.
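Case B is the less obvious one, so here is a short sketch of it. appendOne and appendPtr are hypothetical helper names used only for this illustration.

```go
package main

import "fmt"

// appendOne appends to its own copy of the slice header.
func appendOne(s []int) {
	s = append(s, 99)
}

// appendPtr appends through a pointer, so the caller's header is updated.
func appendPtr(s *[]int) {
	*s = append(*s, 99)
}

func main() {
	a := make([]int, 2, 4) // len 2, cap 4: there is room to append in place

	appendOne(a)
	fmt.Println(a, len(a)) // [0 0] 2
	fmt.Println(a[:3])     // [0 0 99] (the shared backing array was written)

	appendPtr(&a)
	fmt.Println(a, len(a)) // [0 0 99] 3
}
```

After appendOne the value 99 already sits in the shared backing array, but the caller's length is still 2, so it only becomes visible by reslicing or by appending through the pointer.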

Writing a struct's fields and values of different types to a file in Go

I'm writing a simple program that takes input from a form, populates an instance of a struct with the received data and then writes this received data to a file.
I'm a bit stuck at the moment with figuring out the best way to iterate over the populated struct and write its contents to the file.
The struct in question contains 3 different types of fields (ints, strings, []strings).
I can iterate over them but I am unable to get their actual type.
Inspecting my posted code below with print statements reveals that each of their types is coming back as structs rather than the aforementioned string, int etc.
The desired output format is plain text.
For example:
field_1="value_1"
field_2=10
field_3=["a", "b", "c"]
Anyone have any ideas? Perhaps I'm going about this the wrong way entirely?
func (c *Config) writeConfigToFile(file *os.File) {
listVal := reflect.ValueOf(c)
element := listVal.Elem()
for i := 0; i < element.NumField(); i++ {
field := element.Field(i)
myType := reflect.TypeOf(field)
if myType.Kind() == reflect.Int {
file.Write(field.Bytes())
} else {
file.WriteString(field.String())
}
}
}
Instead of using the Bytes method on reflect.Value, which does not work as you initially intended, you can use either the strconv package or the fmt package to format your fields.
Here's an example using fmt:
var s string
switch fi.Kind() {
case reflect.String:
s = fmt.Sprintf("%q", fi.String())
case reflect.Int:
s = fmt.Sprintf("%d", fi.Int())
case reflect.Slice:
if fi.Type().Elem().Kind() != reflect.String {
continue
}
s = "["
for j := 0; j < fi.Len(); j++ {
s = fmt.Sprintf("%s%q, ", s, fi.Index(j).String()) // index with the inner loop variable j
}
s = strings.TrimRight(s, ", ") + "]"
default:
continue
}
sf := rv.Type().Field(i)
if _, err := fmt.Fprintf(file, "%s=%s\n", sf.Name, s); err != nil {
panic(err)
}
Playground: https://play.golang.org/p/KQF3CicVzA
Why not use the built-in gob package to store your struct values?
I use it to store different structures, one per line, in files. During decoding, you can test the type conversion or provide a hint in a wrapper - whichever is faster for your given use case.
You'd treat each line as a buffer when Encoding and Decoding when reading back the line. You can even gzip/zlib/compress, encrypt/decrypt, etc the stream in real-time.
No point in re-inventing the wheel when you have a polished and armorall'd wheel already at your disposal.
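A minimal sketch of the gob approach, assuming a hypothetical Config struct with exported fields. Keep in mind that gob produces a binary encoding, so it fits when the file only needs to be read back by a Go program, not when the plain-text format from the question is required.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
)

// Config is a hypothetical struct; gob only encodes exported fields.
type Config struct {
	Name  string
	Port  int
	Hosts []string
}

// roundTrip encodes a Config into an in-memory buffer and decodes it again.
// A file opened with os.Create could take the place of the buffer.
func roundTrip(in Config) (Config, error) {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(in); err != nil {
		return Config{}, err
	}
	var out Config
	err := gob.NewDecoder(&buf).Decode(&out)
	return out, err
}

func main() {
	out, err := roundTrip(Config{Name: "value_1", Port: 10, Hosts: []string{"a", "b", "c"}})
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", out) // {Name:value_1 Port:10 Hosts:[a b c]}
}
```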

Resources