I am new to Go. For my first program I have to download a bunch of CSVs from AWS. I don't understand why it gives me an error when the file is opened with os.O_APPEND. If I remove os.O_APPEND, I only get the data of the last file, which is not the objective.
The objective is to download all CSV files into one local file. I'd like to understand what I'm doing incorrectly.
package main
import (
"fmt"
"os"
"path/filepath"
"github.com/aws/aws-sdk-go/aws"
"github.com/aws/aws-sdk-go/aws/credentials"
"github.com/aws/aws-sdk-go/aws/session"
"github.com/aws/aws-sdk-go/service/s3"
"github.com/aws/aws-sdk-go/service/s3/s3manager"
)
const (
AccessKeyId = "xxxxxxxxx"
SecretAccessKey = "xxxxxxxxxxxxxxxxxxxx"
Region = "eu-central-1"
Bucket = "dexter-reports"
bucketKey = "Jenkins/pluginVersions/"
)
func main() {
// Load the Shared AWS Configuration
os.Setenv("AWS_ACCESS_KEY_ID", AccessKeyId)
os.Setenv("AWS_SECRET_ACCESS_KEY", SecretAccessKey)
filename := "JenkinsPluginDetais.txt"
cred := credentials.NewStaticCredentials(AccessKeyId, SecretAccessKey, "")
config := aws.Config{Credentials: cred, Region: aws.String(Region), Endpoint: aws.String("s3.amazonaws.com")}
file, err := os.OpenFile(filename, os.O_APPEND|os.O_WRONLY|os.O_CREATE, 0666)
if err != nil {
panic(err)
}
defer file.Close()
sess, err := session.NewSession(&config)
if err != nil {
fmt.Println(err)
}
// list the objects under the bucket prefix
ObjectList := listBucketObjects(sess)
// loop over the object list. First initialize the s3 downloader via s3manager
downloader := s3manager.NewDownloader(sess)
for _, item := range ObjectList.Contents {
csvFile := filepath.Base(*item.Key)
if csvFile != "pluginVersions" {
downloadBucketObjects(downloader, file, csvFile)
}
}
}
func listBucketObjects(sess *session.Session) *s3.ListObjectsV2Output {
//create a new s3 client
svc := s3.New(sess)
resp, err := svc.ListObjectsV2(&s3.ListObjectsV2Input{
Bucket: aws.String(Bucket),
Prefix: aws.String(bucketKey),
})
if err != nil {
panic(err)
}
return resp
}
func downloadBucketObjects(downloader *s3manager.Downloader, file *os.File, keyobj string) {
fileToDownload := bucketKey + keyobj
numBytes, err := downloader.Download(file,
&s3.GetObjectInput{
Bucket: aws.String(Bucket),
Key: aws.String(fileToDownload),
})
if err != nil {
panic(err)
}
fmt.Println("Downloaded", file.Name(), numBytes, "bytes")
}
Firstly, I don't see why you need the os.O_APPEND flag in the first place. As far as I understand, you can omit os.O_APPEND.
Now, let's come to the actual problem of why it's happening:
Doc for O_APPEND (Ref: https://man7.org/linux/man-pages/man2/open.2.html):
O_APPEND
The file is opened in append mode. Before each write(2),
the file offset is positioned at the end of the file, as
if with lseek(2). The modification of the file offset and
the write operation are performed as a single atomic step.
So for every call to write, the file offset is positioned at the end of the file.
But (*s3manager.Downloader).Download appears to use the WriteAt method, i.e.,
Doc for WriteAt:
$ go doc os WriteAt
package os // import "os"
func (f *File) WriteAt(b []byte, off int64) (n int, err error)
WriteAt writes len(b) bytes to the File starting at byte offset off. It
returns the number of bytes written and an error, if any. WriteAt returns a
non-nil error when n != len(b).
If file was opened with the O_APPEND flag, WriteAt returns an error.
Note the last line: if the file was opened with the O_APPEND flag, WriteAt returns an error. That is reasonable, because WriteAt's second argument is an explicit offset, whereas O_APPEND forces every write to the end of the file. Mixing the two could produce unexpected results, so the call errors out instead.
Consider the definition of the Download method of s3manager.Downloader:
func (d Downloader) Download(w io.WriterAt, input *s3.GetObjectInput, options ...func(*Downloader)) (n int64, err error)
The first argument is an io.WriterAt; this interface is:
type WriterAt interface {
WriteAt(p []byte, off int64) (n int, err error)
}
This means that the Download function is going to call the WriteAt method on the File you are passing it. As per the documentation for File.WriteAt:
If file was opened with the O_APPEND flag, WriteAt returns an error.
So this explains why you are getting the error but raises the question "why is Download using WriteAt and not accepting an io.Writer (and calling Write)?"; the answer can be found in the documentation:
The w io.WriterAt can be satisfied by an os.File to do multipart concurrent downloads, or in memory []byte wrapper using aws.WriteAtBuffer
So, to increase performance, Downloader might make multiple simultaneous requests for parts of the file and then write these out as they are received (meaning it may not write the data in order). This also explains why calling the function multiple times with the same File results in overwritten data: when Downloader retrieves each chunk of the file, it writes it out at the appropriate position in the output file, which overwrites any data already there.
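The overwriting effect can be reproduced locally. This is only a sketch: fakeDownload is a hypothetical stand-in for a downloader that, like the behaviour described above, writes its payload through WriteAt starting at offset 0. It is not the SDK's implementation.

```go
package main

import (
	"fmt"
	"io"
	"os"
)

// fakeDownload (hypothetical) mimics a downloader that places its data
// at absolute offsets in the destination, starting at offset 0.
func fakeDownload(w io.WriterAt, payload string) error {
	_, err := w.WriteAt([]byte(payload), 0)
	return err
}

// overwriteDemo runs two "downloads" into the same file and returns
// the final file contents.
func overwriteDemo() (string, error) {
	f, err := os.CreateTemp("", "overwrite-demo")
	if err != nil {
		return "", err
	}
	defer os.Remove(f.Name())
	defer f.Close()

	if err := fakeDownload(f, "AAAAAA"); err != nil {
		return "", err
	}
	// The second payload also starts at offset 0, so it replaces the
	// first three bytes of the earlier data instead of following it.
	if err := fakeDownload(f, "BBB"); err != nil {
		return "", err
	}
	data, err := os.ReadFile(f.Name())
	return string(data), err
}

func main() {
	got, err := overwriteDemo()
	if err != nil {
		panic(err)
	}
	fmt.Println(got) // prints "BBBAAA"
}
```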
The above quote from the documentation also points to a possible solution; use an aws.WriteAtBuffer and, once the download is finished, write the data to your file (which could then be opened with O_APPEND) - something like this:
buf := aws.NewWriteAtBuffer([]byte{})
numBytes, err := downloader.Download(buf,
&s3.GetObjectInput{
Bucket: aws.String(Bucket),
Key: aws.String(fileToDownload),
})
if err != nil {
panic(err)
}
_, err = file.Write(buf.Bytes())
if err != nil {
panic(err)
}
An alternative would be to download into a temporary file and then append that to your output file (you may need to do this if the files are large).
Go 1.12 on Linux 4.19.93 armv6l.
Hardware is a Raspberry Pi Zero W (BCM2835) running a Yocto Linux image.
I've got a GPIO-driven SRF04 proximity sensor handled by the srf04 Linux driver.
It works great over sysfs and the busybox shell.
# cat /sys/bus/iio/devices/iio:device0/in_distance_raw
1646
I've used Go before with IIO devices that support triggers and buffered output at high sample rates on this hardware platform. However for this application the srf04 driver doesn't implement those IIO features. Drat. I don't really feel like adding buffer / trigger support to the driver myself (at this time) since I do not have a need for a 'high' sample rate. A handful of pings per second should suffice for my purpose. I figure I'll calculate mean & std. dev. for a rolling window of data points and 'divine' the signal out of the noise.
So with that - I'd be perfectly happy to Read the bytes from the published sysfs file with Go.
Which brings me to the point of this post.
When I open the file for reading, and try to Read() any number of bytes, I always get a generic -EIO error.
func (s *Srf04) Read() (int, error) {
samp := make([]byte, 16)
f, err := os.OpenFile(s.readPath, os.O_RDONLY, os.ModeDevice)
if err != nil {
return 0, err
}
defer f.Close()
n, err := f.Read(samp)
if err != nil {
// This block is always executed.
// The error is never a timeout, and always 'input/output error' (-EIO aka -5)
log.Fatal(err)
}
...
}
This seems like strange behavior to me.
So I decided to mess with using io.ReadFull. This yielded unreliable results.
func (s *Srf04) Read() (int, error) {
samp := make([]byte, 16)
f, err := os.OpenFile(s.readPath, os.O_RDONLY, os.ModeDevice)
if err != nil {
return 0, err
}
defer f.Close()
for {
n, err := io.ReadFull(f, samp)
log.Println("ReadFull ", n, " bytes.")
if err == io.EOF {
break
}
if err != nil {
log.Println(err)
}
}
...
}
I ended up putting it in a loop, because I found that the behavior changes between one-off reads and several consecutive read calls. It exits if it gets an EOF and keeps trying to read otherwise.
The results are wildly unreliable and seemingly random. Sometimes I get the -5; other times I read between 2 and 5 bytes from the device. Sometimes I get bytes without an error before the EOF. The bytes appear to be character data for numbers (each byte is a character in [0-9]), which is what I'd expect.
Aside: I expect this is related to file polling and the Go blocking I/O implementation, but I have no way to really tell.
As a temporary workaround, I decided to try os/exec, and now I get the results I'd expect to see.
func (s *Srf04)Read() (int, error) {
out, err := exec.Command("cat", s.readPath).Output()
if err != nil {
return 0, err
}
return strconv.Atoi(string(out))
}
But yick. os/exec. Yuck.
I'd run that cat invocation under strace and look at which read(2) calls cat actually makes (including the number of bytes actually read), and then I'd try to re-create that behaviour in Go.
My own guess at the problem's cause is that the driver (or the sysfs layer) is not well prepared to deal with certain access patterns.
For a start, consider that GNU cat is not a simple-minded byte shoveler but is rather a reasonably tricky piece of software, which, among other things, considers optimal I/O block sizes for both input and output devices (if available), calls fadvise(2) etc. It's not that any of that gets actually used when you run it on your sysfs-exported file, but it may influence how the full stack (starting with the sysfs layer) performs in the case of using cat and with your code, respectively.
Hence my advice: start with strace-ing the cat and then try to re-create its usage pattern in your Go code; then try to come up with a minimal subset of that, which works; then profoundly comment your code ;-)
I'm sure I've been looking at this too long tonight, and this code is probably terrible. That said, here's the snippet of what I came up with that works just as reliably as the busybox cat, but in Go.
The Srf04 struct carries a few things, the important bits are included below:
type Srf04 struct {
readBuf []byte `json:"-"`
readFile *os.File `json:"-"`
samples *ring.Ring `json:"-"`
}
func (s *Srf04) Read() (int, error) {
/** Reliable, but really really slow.
out, err := exec.Command("cat", s.readPath).Output()
if err != nil {
log.Fatal(err)
}
val, err := strconv.Atoi(string(out[:len(out) - 2]))
if err == nil {
s.samples.Value = val
s.samples = s.samples.Next()
}
*/
// Seek should tell us the new offset (0) and no err.
bytesRead := 0
_, err := s.readFile.Seek(0, 0)
// Loop until N > 0 AND err != EOF && err != timeout.
if err == nil {
n := 0
for {
n, err = s.readFile.Read(s.readBuf)
bytesRead += n
if os.IsTimeout(err) {
// bail out.
bytesRead = 0
break
}
if err == io.EOF {
// Success!
break
}
// Any other err means 'keep trying to read.'
}
}
if bytesRead > 0 {
val, err := strconv.Atoi(string(s.readBuf[:bytesRead-1]))
if err == nil {
fmt.Println(val)
s.samples.Value = val
s.samples = s.samples.Next()
}
return val, err
}
return 0, err
}
I'm trying to read from a serial port (a GPS device on a Raspberry Pi).
Following the instructions from http://www.modmypi.com/blog/raspberry-pi-gps-hat-and-python
I can read from shell using
stty -F /dev/ttyAMA0 raw 9600 cs8 clocal -cstopb
cat /dev/ttyAMA0
I get well formatted output
$GNGLL,5133.35213,N,00108.27278,W,160345.00,A,A*65
$GNRMC,160346.00,A,5153.35209,N,00108.27286,W,0.237,,290418,,,A*75
$GNVTG,,T,,M,0.237,N,0.439,K,A*35
$GNGGA,160346.00,5153.35209,N,00108.27286,W,1,12,0.67,81.5,M,46.9,M,,*6C
$GNGSA,A,3,29,25,31,20,26,23,21,16,05,27,,,1.11,0.67,0.89*10
$GNGSA,A,3,68,73,83,74,84,75,85,67,,,,,1.11,0.67,0.89*1D
$GPGSV,4,1,15,04,,,34,05,14,040,21,09,07,330,,16,45,298,34*40
$GPGSV,4,2,15,20,14,127,18,21,59,154,30,23,07,295,26,25,13,123,22*74
$GPGSV,4,3,15,26,76,281,40,27,15,255,20,29,40,068,19,31,34,199,33*7C
$GPGSV,4,4,15,33,29,198,,36,23,141,,49,30,172,*4C
$GLGSV,3,1,11,66,00,325,,67,13,011,20,68,09,062,16,73,12,156,21*60
$GLGSV,3,2,11,74,62,177,20,75,53,312,36,76,08,328,,83,17,046,25*69
$GLGSV,3,3,11,84,75,032,22,85,44,233,32,,,,35*62
$GNGLL,5153.35209,N,00108.27286,W,160346.00,A,A*6C
$GNRMC,160347.00,A,5153.35205,N,00108.27292,W,0.216,,290418,,,A*7E
$GNVTG,,T,,M,0.216,N,0.401,K,A*3D
$GNGGA,160347.00,5153.35205,N,00108.27292,W,1,12,0.67,81.7,M,46.9,M,,*66
$GNGSA,A,3,29,25,31,20,26,23,21,16,05,27,,,1.11,0.67,0.89*10
$GNGSA,A,3,68,73,83,74,84,75,85,67,,,,,1.11,0.67,0.89*1D
$GPGSV,4,1,15,04,,,34,05,14,040,21,09,07,330,,16,45,298,34*40
(I've put some random data in)
I'm trying to read this in Go. Currently, I have
package main
import "fmt"
import "log"
import "github.com/tarm/serial"
func main() {
config := &serial.Config{
Name: "/dev/ttyAMA0",
Baud: 9600,
ReadTimeout: 1,
Size: 8,
}
stream, err := serial.OpenPort(config)
if err != nil {
log.Fatal(err)
}
buf := make([]byte, 1024)
for {
n, err := stream.Read(buf)
if err != nil {
log.Fatal(err)
}
s := string(buf[:n])
fmt.Println(s)
}
}
But this prints malformed data. I suspect that this is due to the buffer size or the value of Size in the config struct being wrong, but I'm not sure how to get those values from the stty settings.
Looking back, I think the issue is that I'm getting a stream and I want to iterate over lines from the tty rather than arbitrary chunks. This is what the output looks like:
$GLGSV,3
,1,09,69
,10,017,
,70,43,0
69,,71,3
2,135,27
,76,23,2
32,22*6F
$GLGSV
,3,2,09,
77,35,30
0,21,78,
11,347,,
85,31,08
1,30,86,
72,355,3
6*6C
$G
LGSV,3,3
,09,87,2
4,285,30
*59
$GN
GLL,5153
.34919,N
,00108.2
7603,W,1
92901.00
,A,A*6A
The struct you get back from serial.OpenPort() contains a pointer to an open os.File corresponding to the opened serial port connection. When you Read() from this, the library calls Read() on the underlying os.File.
The documentation for this function call is:
Read reads up to len(b) bytes from the File. It returns the number of bytes read and any error encountered. At end of file, Read returns 0, io.EOF.
This means you have to keep track of how much data was read. You also have to keep track of whether there were newlines, if this is important to you. Unfortunately, the underlying *os.File is not exported, so you'll find it difficult to use tricks like (*bufio.Reader).ReadLine() on it directly. It may be worth modifying the library and sending a pull request.
As Matthew Rankin noted in a comment, Port implements io.ReadWriter so you can simply use bufio to read by lines.
stream, err := serial.OpenPort(config)
if err != nil {
log.Fatal(err)
}
scanner := bufio.NewScanner(stream)
for scanner.Scan() {
fmt.Println(scanner.Text()) // Println will add back the final '\n'
}
if err := scanner.Err(); err != nil {
log.Fatal(err)
}
Change
fmt.Println(s)
to
fmt.Print(s)
and you will probably get what you want.
Or did I misunderstand the question?
Two additions to Michael Hamptom's answer which can be useful:
line endings
You might receive data that is not newline-separated text. bufio.Scanner uses ScanLines by default to split the received data into lines - but you can also write your own line splitter based on the default function's signature and set it for the scanner:
scanner := bufio.NewScanner(stream)
scanner.Split(ownLineSplitter) // set custom line splitter function
reader shutdown
You might not receive a constant stream but only some packets of bytes from time to time. If no bytes arrive at the port, the scanner will block and you can't just kill it. You'll have to close the stream to do so, effectively raising an error. To not block any outer loops and handle errors appropriately, you can wrap the scanner in a goroutine that takes a context. If the context was cancelled, ignore the error, otherwise forward the error. In principle, this can look like
var errChan = make(chan error)
var dataChan = make(chan []byte)
ctx, cancelPortScanner := context.WithCancel(context.Background())
go func(ctx context.Context) {
scanner := bufio.NewScanner(stream)
for scanner.Scan() { // will terminate if connection is closed
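// copy the token: the slice returned by Bytes may be overwritten by the next Scan
dataChan <- append([]byte(nil), scanner.Bytes()...)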
}
// if execution reaches this point, something went wrong or stream was closed
select {
case <-ctx.Done():
return // ctx was cancelled, just return without error
default:
errChan <- scanner.Err() // ctx wasn't cancelled, forward error
}
}(ctx)
// handle data from dataChan, error from errChan
To stop the scanner, you would cancel the context and close the connection:
cancelPortScanner()
stream.Close()