I have a file that contains some IP ranges:
1.1.1.0/24
1.1.2.0/24
2.2.1.0/24
2.2.2.0/24
I read this file into a slice and used *(*string)(unsafe.Pointer(&b)) to convert []byte to string, but it doesn't work:
func TestInitIpRangeFromFile(t *testing.T) {
filepath := "/tmp/test"
file, err := os.Open(filepath)
if err != nil {
t.Errorf("failed to open ip range file:%s, err:%s", filepath, err)
}
reader := bufio.NewReader(file)
ranges := make([]string, 0)
for {
ip, _, err := reader.ReadLine()
if err != nil {
if err == io.EOF {
break
}
logger.Fatalf("failed to read ip range file, err:%s", err)
}
t.Logf("ip:%s", *(*string)(unsafe.Pointer(&ip)))
ranges = append(ranges, *(*string)(unsafe.Pointer(&ip)))
}
t.Logf("%v", ranges)
}
result:
task_test.go:71: ip:1.1.1.0/24
task_test.go:71: ip:1.1.2.0/24
task_test.go:71: ip:2.2.1.0/24
task_test.go:71: ip:2.2.2.0/24
task_test.go:75: [2.2.2.0/24 1.1.2.0/24 2.2.1.0/24 2.2.2.0/24]
Why did 1.1.1.0/24 change to 2.2.2.0/24?
If I change
*(*string)(unsafe.Pointer(&ip))
to string(ip), it works.
So, while reinterpreting a slice-header as a string-header the way you did is absolutely bonkers and has no guarantee whatsoever of working correctly, it's only indirectly the cause of your problem.
The real problem is that you're retaining a pointer to the return value of bufio.Reader.ReadLine(), but the docs for that method say "The returned buffer is only valid until the next call to ReadLine." This means that the reader is free to reuse that memory later on, and that's what's happening.
When you do the cast in the proper way, string(ip), Go copies the contents of the buffer into the newly-created string, which remains valid in the future. But when you type-pun the slice into a string, you keep the exact same pointer, which stops working as soon as the reader refills its buffer.
If you decided to do the pointer trickery as a performance hack to avoid copying and allocation... too bad. The reader interface is going to force you to copy the data out anyway, and since it does, you should just use string().
I started a new job and we've been instructed to use Uber's Go coding standards. I'm not sure about one of their guidelines entitled "Exit Once":
If possible, prefer to call os.Exit or log.Fatal at most once in your main(). If there are multiple error scenarios that halt program execution, put that logic under a separate function and return errors from it.
Wouldn't this just mean offloading main() into another function (run())? This seems a little superfluous to me. What benefits does Uber's approach have?
I'm not familiar with Uber's entire Go coding standards, but that particular piece of advice is sound. One issue with os.Exit is that it puts an end to the programme very brutally, without honouring any deferred function calls pending:
Exit causes the current program to exit with the given status code. Conventionally, code zero indicates success, non-zero an error.
The program terminates immediately; deferred functions are not run.
(my emphasis)
However, those deferred function calls may be responsible for important cleanup tasks. Consider Uber's example code snippet:
package main
func main() {
args := os.Args[1:]
if len(args) != 1 {
log.Fatal("missing file")
}
name := args[0]
f, err := os.Open(name)
if err != nil {
log.Fatal(err)
}
defer f.Close()
// If we call log.Fatal after this line,
// f.Close will not be called.
b, err := ioutil.ReadAll(f)
if err != nil {
log.Fatal(err)
}
// ...
}
If ioutil.ReadAll returns a non-nil error, log.Fatal is called; and because log.Fatal calls os.Exit under the hood, the deferred call to f.Close will not be run. In this particular case, it's not that serious, but imagine a situation where deferred calls involved some cleanup, like removing files; you'd leave your disk in an unclean state. For more on that topic, see episode #112 of the Go Time podcast, in which these considerations were discussed.
Therefore, it's a good idea to eschew os.Exit, log.Fatal, etc. "deep" in your programme. A run function as described in Uber's Go coding standards allows deferred calls to be run as they should before programme execution ends (potentially with a non-zero status code):
package main
func main() {
if err := run(); err != nil {
log.Fatal(err)
}
}
func run() error {
args := os.Args[1:]
if len(args) != 1 {
return errors.New("missing file")
}
name := args[0]
f, err := os.Open(name)
if err != nil {
return err
}
defer f.Close()
b, err := ioutil.ReadAll(f)
if err != nil {
return err
}
// ...
return nil
}
An additional benefit of this approach is that, although the main function itself isn't readily testable, you can design such a run function with testability in mind; see Mat Ryer's blog post on that topic.
Go 1.12 on Linux 4.19.93 armv6l.
Hardware is a Raspberry Pi Zero W (BCM2835) running a Yocto Linux image.
I've got a GPIO-driven SRF04 proximity sensor handled by the srf04 Linux driver.
It works great over sysfs and the busybox shell.
# cat /sys/bus/iio/devices/iio:device0/in_distance_raw
1646
I've used Go before with IIO devices that support triggers and buffered output at high sample rates on this hardware platform. However for this application the srf04 driver doesn't implement those IIO features. Drat. I don't really feel like adding buffer / trigger support to the driver myself (at this time) since I do not have a need for a 'high' sample rate. A handful of pings per second should suffice for my purpose. I figure I'll calculate mean & std. dev. for a rolling window of data points and 'divine' the signal out of the noise.
So with that - I'd be perfectly happy to Read the bytes from the published sysfs file with Go.
Which brings me to the point of this post.
When I open the file for reading, and try to Read() any number of bytes, I always get a generic -EIO error.
func (s *Srf04) Read() (int, error) {
samp := make([]byte, 16)
f, err := os.OpenFile(s.readPath, os.O_RDONLY, os.ModeDevice)
if err != nil {
return 0, err
}
defer f.Close()
n, err := f.Read(samp)
if err != nil {
// This block is always executed.
// The error is never a timeout, and always 'input/output error' (-EIO aka -5)
log.Fatal(err)
}
...
}
This seems like strange behavior to me.
So I decided to mess with using io.ReadFull. This yielded unreliable results.
func (s *Srf04) Read() (int, error) {
samp := make([]byte, 16)
f, err := os.OpenFile(s.readPath, os.O_RDONLY, os.ModeDevice)
if err != nil {
return 0, err
}
defer f.Close()
for {
n, err := io.ReadFull(f, samp)
log.Println("ReadFull ", n, " bytes.")
if err == io.EOF {
break
}
if err != nil {
log.Println(err)
}
}
...
}
I ended up adding it to a loop, as I found behavior changes from 'one-off' reads to multiple read calls subsequent to one another. I have it exiting if it gets an EOF, and repeatedly trying to read otherwise.
The results are straight-up crazy unreliable, seemingly random. Sometimes I get the -5; other times I read between 2 and 5 bytes from the device. Sometimes I get bytes without an error before the EOF. The bytes appear to be character data for numbers (each byte is an ASCII digit in [0-9]), which is what I'd expect.
Aside: I expect this is related to file polling and the go blocking IO implementation, but I have no way to really tell.
As a temporary workaround, I decided to try os/exec, and now I get the results I'd expect to see.
func (s *Srf04)Read() (int, error) {
out, err := exec.Command("cat", s.readPath).Output()
if err != nil {
return 0, err
}
return strconv.Atoi(string(out))
}
But yick. os/exec. Yuck.
I'd try to run that cat incantation under strace and then peer at what read(2) calls cat actually manages to do (including the number of bytes actually read), and then I'd try to re-create that behaviour in Go.
My own sheer guess at the problem's cause is that the driver (or the sysfs layer) is not too well prepared to deal with certain access patterns.
For a start, consider that GNU cat is not a simple-minded byte shoveler but is rather a reasonably tricky piece of software, which, among other things, considers optimal I/O block sizes for both input and output devices (if available), calls fadvise(2) etc. It's not that any of that gets actually used when you run it on your sysfs-exported file, but it may influence how the full stack (starting with the sysfs layer) performs in the case of using cat and with your code, respectively.
Hence my advice: start with strace-ing the cat and then try to re-create its usage pattern in your Go code; then try to come up with a minimal subset of that, which works; then profoundly comment your code ;-)
I'm sure I've been looking at this too long tonight, and this code is probably terrible. That said, here's the snippet of what I came up with that works just as reliably as the busybox cat, but in Go.
The Srf04 struct carries a few things, the important bits are included below:
type Srf04 struct {
readBuf []byte `json:"-"`
readFile *os.File `json:"-"`
samples *ring.Ring `json:"-"`
}
func (s *Srf04) Read() (int, error) {
/** Reliable, but really really slow.
out, err := exec.Command("cat", s.readPath).Output()
if err != nil {
log.Fatal(err)
}
val, err := strconv.Atoi(string(out[:len(out) - 2]))
if err == nil {
s.samples.Value = val
s.samples = s.samples.Next()
}
*/
// Seek should tell us the new offset (0) and no err.
bytesRead := 0
_, err := s.readFile.Seek(0, 0)
// Loop until N > 0 AND err != EOF && err != timeout.
if err == nil {
n := 0
for {
n, err = s.readFile.Read(s.readBuf)
bytesRead += n
if os.IsTimeout(err) {
// bail out.
bytesRead = 0
break
}
if err == io.EOF {
// Success!
break
}
// Any other err means 'keep trying to read.'
}
}
if bytesRead > 0 {
val, err := strconv.Atoi(string(s.readBuf[:bytesRead-1]))
if err == nil {
fmt.Println(val)
s.samples.Value = val
s.samples = s.samples.Next()
}
return val, err
}
return 0, err
}
I'm currently writing a small program which converts CSV files into structs to be used for further processing. The CSV lines look like this:
20140102,09:30,38.88,38.88,38.82,38.85,67004
I have 500 files, each about 20-30 MB.
My code works just fine, but I can't help wondering if there isn't a better way to convert these files than what I'm doing now.
First reading the file and converting to csv records (pseudo code)
data, err := ioutil.ReadFile(path)
if err != nil {
...
}
r := csv.NewReader(bytes.NewReader(data))
records, err := r.ReadAll()
if err != nil {
...
}
Then looping over all the records and doing
parsedTime, err := time.Parse("2006010215:04", record[0]+record[1])
if err != nil {
return model.ZorroT6{}, time.Time{}, err
}
t6.Date = ConvertToOle(parsedTime)
if open, err := strconv.ParseFloat(record[2], 32); err == nil {
t6.Open = float32(open)
}
if high, err := strconv.ParseFloat(record[3], 32); err == nil {
t6.High = float32(high)
}
if low, err := strconv.ParseFloat(record[4], 32); err == nil {
t6.Low = float32(low)
}
if close, err := strconv.ParseFloat(record[5], 32); err == nil {
t6.Close = float32(close)
}
if vol, err := strconv.ParseInt(record[6], 10,32); err == nil {
t6.Vol = int32(vol)
}
For example I have to go through []byte -> string -> float64 -> float32 to get my float values. What could I do to improve this code?
EDIT: Just to be clear, I don't really need to improve the performance; I'm just trying to better understand Go and which performance optimizations could be applied to a problem like this. For example, it seems like a lot of overhead to create loads of strings and float64 values when I have a byte slice and want a float32.
There is only one problem I see that needs fixing:
Do not use ioutil.ReadFile together with bytes.NewReader. It reads the entire file into memory, which is inefficient when the file is large.
Instead, use os.Open(file); the returned *os.File is an io.Reader that csv.NewReader can use directly. Do not forget to close the file and handle errors.
If you still want to improve performance:
Since your CSV file has a fixed format, you can work on the raw bytes provided by bufio instead of going through csv.
You can copy and adapt the underlying code in strconv and time to drop the generic code paths you don't need.
But I don't think either is worth the trouble.
Is there a way to clean up this (IMO) horrific-looking code?
aJson, err1 := json.Marshal(a)
bJson, err2 := json.Marshal(b)
cJson, err3 := json.Marshal(c)
dJson, err4 := json.Marshal(d)
eJson, err5 := json.Marshal(e)
fJson, err6 := json.Marshal(f)
gJson, err7 := json.Marshal(g)
if err1 != nil {
	return err1
} else if err2 != nil {
	return err2
} else if err3 != nil {
	return err3
} else if err4 != nil {
	return err4
} else if err5 != nil {
	return err5
} else if err6 != nil {
	return err6
} else if err7 != nil {
	return err7
}
Specifically, I'm talking about the error handling. It would be nice to be able to handle all the errors in one go.
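var err error
var aJson, bJson, cJson, dJson, eJson, fJson, gJson []byte
// marshal stores the encoding in *dest and reports whether it succeeded.
marshal := func(dest *[]byte, src interface{}) bool {
	*dest, err = json.Marshal(src)
	return err == nil
}
// && short-circuits, so marshaling stops at the first failure.
_ = marshal(&aJson, a) &&
	marshal(&bJson, b) &&
	marshal(&cJson, c) &&
	marshal(&dJson, d) &&
	marshal(&eJson, e) &&
	marshal(&fJson, f) &&
	marshal(&gJson, g)
return err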
Put the results in a slice instead of separate variables, put the initial values in another slice to iterate over, and return from inside the loop if there's an error.
var result [][]byte
for _, item := range []interface{}{a, b, c, d, e, f, g} {
res, err := json.Marshal(item)
if err != nil {
return err
}
result = append(result, res)
}
You could even reuse an array instead of having two slices.
var values, err = [...]interface{}{a, b, c, d, e, f, g}, error(nil)
for i, item := range values {
if values[i], err = json.Marshal(item); err != nil {
return err
}
}
Of course, this'll require a type assertion to use the results.
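A short sketch of that type assertion, using three illustrative values rather than a through g. The helper name marshalInPlace is hypothetical, and it takes a slice instead of an array so it can be reused.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// marshalInPlace overwrites each element of values with its JSON
// encoding (a []byte stored behind interface{}).
func marshalInPlace(values []interface{}) error {
	for i, item := range values {
		b, err := json.Marshal(item)
		if err != nil {
			return err
		}
		values[i] = b
	}
	return nil
}

func main() {
	values := []interface{}{1, "two", []int{3}}
	if err := marshalInPlace(values); err != nil {
		fmt.Println(err)
		return
	}
	// Each element now holds a []byte, so it has to be asserted back
	// to that type before use.
	first := values[0].([]byte)
	fmt.Println(string(first))
}
```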
Define a function.
func marshalMany(vals ...interface{}) ([][]byte, error) {
out := make([][]byte, 0, len(vals))
for i := range vals {
b, err := json.Marshal(vals[i])
if err != nil {
return nil, err
}
out = append(out, b)
}
return out, nil
}
You didn't say anything about how you'd like your error handling to work. Fail one, fail all? First to fail? Collect successes or toss them?
I believe the other answers here are correct for your specific problem, but more generally, panic can be used to shorten error handling while still being a well-behaving library (i.e., not panicking across package boundaries).
Consider:
func mustMarshal(v interface{}) []byte {
bs, err := json.Marshal(v)
if err != nil {
panic(err)
}
return bs
}
func encodeAll() (err error) {
defer func() {
if r := recover(); r != nil {
var ok bool
if err, ok = r.(error); ok {
return
}
panic(r)
}
}()
ea := mustMarshal(a)
eb := mustMarshal(b)
ec := mustMarshal(c)
return nil
}
This code uses mustMarshal to panic whenever there is a problem marshaling a value. But the encodeAll function will recover from the panic and return it as a normal error value. The client in this case is never exposed to the panic.
But this comes with a warning: using this approach everywhere is not idiomatic. It can also be worse since it doesn't lend itself well to handling each individual error specially, but more or less treating each error the same. But it has its uses when there are tons of errors to handle. As an example, I use this kind of approach in a web application, where a top-level handler can catch different kinds of errors and display them appropriately to the user (or a log file) depending on the kind of error.
It makes for terser code when there is a lot of error handling, but at the loss of idiomatic Go and handling each error specially. Another downside is that it could prevent something that should panic from actually panicking. (But this can be trivially solved by using your own error type.)
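One way the "own error type" idea might look, as a sketch: panics are wrapped in a private type (marshalPanic, a hypothetical name), and the deferred function converts only that type back into an error. Any other panic value is re-raised. Here encodeAll takes its inputs as parameters, which is an assumption made to keep the example self-contained.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// marshalPanic is a hypothetical private wrapper type: only panics
// carrying it are turned back into errors.
type marshalPanic struct{ err error }

func mustMarshal(v interface{}) []byte {
	bs, err := json.Marshal(v)
	if err != nil {
		panic(marshalPanic{err})
	}
	return bs
}

func encodeAll(vals ...interface{}) (out [][]byte, err error) {
	defer func() {
		if r := recover(); r != nil {
			mp, ok := r.(marshalPanic)
			if !ok {
				panic(r) // not ours: let it keep propagating
			}
			out, err = nil, mp.err
		}
	}()
	for _, v := range vals {
		out = append(out, mustMarshal(v))
	}
	return out, nil
}

func main() {
	_, err := encodeAll(1, "two", make(chan int))
	fmt.Println(err)
}
```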
You can use go-multierror by Hashicorp.
var merr error
if err := step1(); err != nil {
merr = multierror.Append(merr, err)
}
if err := step2(); err != nil {
merr = multierror.Append(merr, err)
}
return merr
You can create a reusable function to handle multiple errors. This implementation returns only the first non-nil error, but you could combine every error message by modifying the following code:
func hasError(errs ...error) error {
	for _, err := range errs {
		if err != nil {
			return err
		}
	}
	return nil
}
aJson, err := json.Marshal(a)
bJson, err1 := json.Marshal(b)
cJson, err2 := json.Marshal(c)
if firstErr := hasError(err, err1, err2); firstErr != nil {
	return firstErr
}
Another perspective on this is, instead of asking "how" to handle the abhorrent verbosity, whether we actually "should". This advice is heavily dependent on context, so be careful.
In order to decide whether handling the json.Marshal error is worth it, we can inspect its implementation to see when errors are returned. In order to return errors to the caller and preserve code terseness, json.Marshal uses panic and recover internally in a manner akin to exceptions. It defines an internal helper method which, when called, panics with the given error value. By looking at each call of this function, we learn that json.Marshal returns an error in the following scenarios:
calling MarshalJSON or MarshalText on a value/field of a type which implements json.Marshaler or encoding.TextMarshaler returns an error—in other words, a custom marshaling method fails;
the input is/contains a cyclic (self-referencing) structure;
the input is/contains a value of an unsupported type (complex, chan, func);
the input is/contains a floating-point number which is NaN or Infinity (these are not allowed by the spec, see section 2.4);
the input is/contains a json.Number string that is an incorrect number representation (for example, "foo" instead of "123").
Now, a usual scenario for marshaling data is creating an API response, for example. In that case, you will 100% have data types that satisfy all of the marshaler's constraints and valid values, given that the server itself generates them. When user-provided input is used, the data should be validated beforehand anyway, so it should still not cause issues with the marshaler. Furthermore, we can see that, apart from the custom marshaler errors, all the other errors occur at runtime because Go's type system cannot enforce the required conditions by itself. With all these points given, here comes the question: given our control over the data types and values, do we need to handle json.Marshal's error at all?
Probably no. For a type like
type Person struct {
Name string
Age int
}
it is now obvious that json.Marshal cannot fail. It is trickier when the type looks like
type Foo struct {
Data any
}
(any is a new Go 1.18 alias for interface{}) because there is no compile-time guarantee that Foo.Data will hold a value of a valid type—but I'd still argue that if Foo is meant to be serialized as a response, Foo.Data will also be serializable. Infinity or NaN floats remain an issue, but, given the JSON standard limitation, if you want to serialize these two special values you cannot use JSON numbers anyway, so you'll have to look for another solution, which means that you'll end up avoiding the error anyway.
To conclude, my point is that you can probably do:
aJson, _ := json.Marshal(a)
bJson, _ := json.Marshal(b)
cJson, _ := json.Marshal(c)
dJson, _ := json.Marshal(d)
eJson, _ := json.Marshal(e)
fJson, _ := json.Marshal(f)
gJson, _ := json.Marshal(g)
and live fine with it. If you want to be pedantic, you can use a helper such as:
func must[T any](v T, err error) T {
if err != nil {
panic(err)
}
return v
}
(note the Go 1.18 generics usage) and do
aJson := must(json.Marshal(a))
bJson := must(json.Marshal(b))
cJson := must(json.Marshal(c))
dJson := must(json.Marshal(d))
eJson := must(json.Marshal(e))
fJson := must(json.Marshal(f))
gJson := must(json.Marshal(g))
This will work nicely when you have something like an HTTP server, where each request is wrapped in a middleware that recovers from panics and responds to the client with status 500. It's also where you would care about these unexpected errors: when you don't want the program/service to crash at all. For one-time scripts you'll probably want to have the operation halted and a stack trace dumped.
If you're unsure of how your types will be changed in the future, you don't trust your tests, data may not be in your full control, the codebase is too big to trace the data or whatever other reason which causes uncertainty over the correctness of your data, it is better to handle the error. Pay attention to the context you're in!
P.S.: Pragmatically ignoring errors should generally be sought after. For example, the Write* methods on bytes.Buffer and strings.Builder never return errors; fmt.Fprintf, with a valid format string and a writer that doesn't return errors, also returns no errors; neither does bufio.Writer, if the underlying writer doesn't return any. You will find some types implement interfaces with methods that return errors but don't actually return any. In these cases, if you know the concrete type, handling errors is unnecessarily verbose and redundant. What do you prefer,
var sb strings.Builder
if _, err := sb.WriteString("hello "); err != nil {
return err
}
if _, err := sb.WriteString("world!"); err != nil {
return err
}
or
var sb strings.Builder
sb.WriteString("hello ")
sb.WriteString("world!")
(of course, ignoring that it could be a single WriteString call)?
The given examples write to an in-memory buffer, which can never fail unless the machine runs out of memory, a condition you cannot handle in Go anyway. Other such situations will surface in your code, and blindly handling errors there adds little to no value! Caution is key: if an implementation changes and does return errors, you may be in trouble. Standard library or well-established packages are good candidates for eliding error checking, if possible.