Does closing io.PipeWriter close the underlying file? - go

I am using logrus for logging and have a few custom format loggers. Each is initialized to write to a different file like:
fp, _ := os.OpenFile(path, os.O_APPEND|os.O_WRONLY|os.O_CREATE, 0755)
// error handling left out for brevity
log.Out = fp
Later in the application, I need to change the file the logger is writing to (for log rotation). What I want to achieve is to properly close the current file before changing the logger's output file. But the closest thing to the file handle logrus provides me is a Writer() method that returns an io.PipeWriter pointer. So would calling Close() on the PipeWriter also close the underlying file?
If not, what are my options to do this, other than keeping the file pointer stored somewhere?

For the record, twelve-factor tells us that applications should not concern themselves with log rotation. If and how logs are handled best depends on how the application is deployed. Systemd has its own logging system, for instance. Writing to files when deployed in (Docker) containers is annoying. Rotating files are annoying during development.
Now, pipes don't have an "underlying file". There's a Reader end and a Writer end, and that's it. From the docs for PipeWriter:
Close closes the writer; subsequent reads from the read half of the pipe will return no bytes and EOF.
So what happens when you close the writer depends on how Logrus handles EOF on the Reader end. Since Logger.Out is an io.Writer, Logrus cannot possibly call Close on your file.
Your best bet would be to wrap *os.File, perhaps like so:
package main

import "os"

// RotatingFile wraps *os.File and rotates the file on demand:
// the rotation happens on the next Write after Rotate is called.
type RotatingFile struct {
	*os.File
	rotate chan struct{}
}

func NewRotatingFile(f *os.File) RotatingFile {
	return RotatingFile{
		File:   f,
		rotate: make(chan struct{}, 1),
	}
}

// Rotate signals that the file should be rotated before the next write.
func (r RotatingFile) Rotate() {
	r.rotate <- struct{}{}
}

func (r RotatingFile) doRotate() error {
	// file rotation logic here
	return nil
}

func (r RotatingFile) Write(b []byte) (int, error) {
	select {
	case <-r.rotate:
		if err := r.doRotate(); err != nil {
			return 0, err
		}
	default:
	}
	return r.File.Write(b)
}
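A hypothetical usage sketch (assuming log is your logrus logger and path holds the current log file name, as in the question):

f, err := os.OpenFile(path, os.O_APPEND|os.O_WRONLY|os.O_CREATE, 0755)
if err != nil {
	// handle error
}
rf := NewRotatingFile(f)
log.Out = rf

// Later, e.g. from a SIGHUP handler:
rf.Rotate() // the next Write will call doRotate first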
Implementing log file rotation in a robust way is surprisingly tricky. For instance, closing the old file before creating the new one is not a good idea. What if the log directory permissions changed? What if you run out of inodes? If you can't create a new log file you may want to keep writing to the current file. Are you okay with ripping lines apart, or do you only want to rotate after a newline? Do you want to rotate empty files? How do you reliably remove old logs if someone deletes the N-1th file? Will you notice the Nth file or stop looking at the N-2nd?
The best advice I can give you is to leave log rotation to the pros. I like svlogd (part of runit) as a standalone log rotation tool.

Closing the io.PipeWriter will not affect the actual writer behind it. The chain of close execution is:
PipeWriter.Close() -> PipeWriter.CloseWithError(err error) ->
pipe.CloseWrite(err error)
and it does not influence the underlying io.Writer.
To close the actual writer, you just need to close Logger.Out, which is an exported field.
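A minimal sketch of that, assuming log is a *logrus.Logger whose Out was previously set to an *os.File:

// Logger.Out is declared as io.Writer, so type-assert to io.Closer
// before swapping in the new file.
if c, ok := log.Out.(io.Closer); ok {
	if err := c.Close(); err != nil {
		// handle close error
	}
}
log.Out = newFile // newFile: the freshly opened *os.File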

Related

Write fixed length padded lines to file Go

Printing justified, fixed-length values seems to be what everyone asks about, and there are many examples that I have found, like...
package main

import "fmt"

func main() {
	values := []string{"Mustang", "10", "car"}
	for i := range values {
		fmt.Printf("%10v...\n", values[i])
	}
	for i := range values {
		fmt.Printf("|%-10v|\n", values[i])
	}
}
Situation
But what if I need to WRITE fixed-length lines to a file?
For example, what if I have a requirement that states: write this line to a file so that it is exactly 32 bytes, left justified and padded on the right with 0's?
Question
So, how do you accomplish this when writing to a file?
The fmt.PrintXX() functions have analogous variants that start with an F and take the form fmt.FprintXX(). These variants write the result to an io.Writer, which may be an os.File as well.
So if you have fmt.Printf() statements whose output you want to direct to a file, just change them to call fmt.Fprintf() instead, passing the file as the first argument:
var f *os.File = ... // Initialize / open file
fmt.Fprintf(f, "%10v...\n", values[i])
If you look into the implementation of fmt.Printf():
func Printf(format string, a ...interface{}) (n int, err error) {
	return Fprintf(os.Stdout, format, a...)
}
It does exactly this: it calls fmt.Fprintf(), passing os.Stdout as the output to write to.
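To cover the exact requirement from the question (32 bytes, left justified, right-padded with 0's): fmt's 0 flag only zero-pads numbers, so for strings you can pad manually. A minimal sketch, with out.txt as a placeholder file name:

package main

import (
	"fmt"
	"os"
	"strings"
)

func main() {
	f, err := os.Create("out.txt")
	if err != nil {
		panic(err)
	}
	defer f.Close()

	s := "Mustang"
	if len(s) > 32 {
		s = s[:32] // truncate over-long values
	}
	// left justified, right-padded with '0' to exactly 32 bytes
	fmt.Fprintf(f, "%s%s\n", s, strings.Repeat("0", 32-len(s)))
}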
For how to open a file, see How to read/write from/to file using Go?
See related question: Format a Go string without printing?

How and when is the Go sdk logger flushed?

I'm trying to determine if the default/SDK logger's log.PrintYYY() functions are flushed at some point in time: on exit, on panic, etc. I'm unsure if I need to find a way to flush the writer that the logger is hooked up to, especially when setting the output writer with SetOutput(...). Of course, the io.Writer interface doesn't have a Flush() method, so I'm not really sure how that might get done.
How and when is the Go sdk logger flushed?
The log package is not responsible for flushing the underlying io.Writer. It would be possible for the log package to perform a type-assertion to see if the current io.Writer has a Flush() method, and if so then call it, but there is no guarantee that if multiple io.Writers are "chained", the data will eventually be flushed to the bottommost layer.
And the primary reason the log package doesn't flush, in my opinion, is performance. We use buffered writers so that we don't have to reach the underlying layer each time a single byte (or byte slice) is written; instead we cache the recently written data, and when it reaches a certain size (or age), we write the whole "batch" at once, efficiently.
If the log package flushed after each log statement, that would render buffered IO useless. It might not matter for small apps, but if you have a high-traffic web server, issuing a flush after each log statement (of which there may be many in each request handling) would cause a serious performance drawback.
Then yes, there is the issue that if the app is terminated, the last log statements might not make it to the underlying layer. The proper solution is a graceful shutdown: implement signal handling, and when your app is about to terminate, properly flush and close the underlying io.Writer of the logger you use (see the sketch after the links below). For details, see:
Is it possible to capture a Ctrl+C signal and run a cleanup function, in a "defer" fashion?
Is there something like finally() in Go just opposite to what init()?
Are deferred functions called when SIGINT is received in Go?
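A minimal sketch of that graceful shutdown, assuming the logger writes to a *bufio.Writer bw backed by an *os.File f (both hypothetical names):

c := make(chan os.Signal, 1)
signal.Notify(c, os.Interrupt, syscall.SIGTERM) // needs os/signal and syscall
go func() {
	<-c
	bw.Flush() // push buffered log data to the file
	f.Close()  // close the underlying file
	os.Exit(0)
}()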
If, for simplicity's sake, you still want a logger that flushes after each log statement, you can achieve that easily. This is because the log.Logger type guarantees that each log message is delivered to the destination io.Writer with a single Writer.Write() call:
Each logging operation makes a single call to the Writer's Write method. A Logger can be used simultaneously from multiple goroutines; it guarantees to serialize access to the Writer.
So basically all you need to do is create a "wrapper" io.Writer whose Write() method does a flush after "forwarding" the Write() call to its underlying writer.
This is how it could look:

type myWriter struct {
	io.Writer
}

func (m *myWriter) Write(p []byte) (n int, err error) {
	n, err = m.Writer.Write(p)
	if flusher, ok := m.Writer.(interface{ Flush() }); ok {
		flusher.Flush()
	} else if syncer, ok := m.Writer.(interface{ Sync() error }); ok {
		// Preserve the original write error, if any
		if err2 := syncer.Sync(); err2 != nil && err == nil {
			err = err2
		}
	}
	return
}
This implementation checks for both a parameterless Flush() method and os.File's Sync() method, and calls them if present. (Note that bufio.Writer's Flush() returns an error, so a writer like that would need the check to be against interface{ Flush() error } instead.)
This is how it can be used so that log statements always flush:

f, err := os.Create("log.txt")
if err != nil {
	panic(err)
}
defer f.Close()

log.SetOutput(&myWriter{Writer: f})
log.Println("hi")
See related questions:
Go: Create io.Writer interface for logging to mongodb database
net/http set custom logger
The logger shouldn't know how to flush data. You have to flush the output writer that you specified at logger creation (if it has such a capability).
See this example from the GitHub discussion:
package main

import (
	"bufio"
	"flag"
	"log"
	"os"
	"strings"
)

func main() {
	var flush bool
	flag.BoolVar(&flush, "flush", false, "if set, will flush the buffered io before exiting")
	flag.Parse()

	br := bufio.NewWriter(os.Stdout)
	logger := log.New(br, "", log.Ldate)
	logger.Printf("%s\n", strings.Repeat("This is a test\n", 5))
	if flush {
		br.Flush()
	}
	logger.Fatalf("exiting now!")
}
You can read the entire discussion related to your issue on GitHub.
Alternatively, you can look at third-party loggers that can flush. Check out the zap logger, which has a logger.Sync() method.
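A minimal sketch with zap (assuming import "go.uber.org/zap"):

logger, err := zap.NewProduction()
if err != nil {
	panic(err)
}
defer logger.Sync() // flushes any buffered log entries before exit
logger.Info("this entry is flushed before the program exits")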

How to save data streams in S3? aws-sdk-go example not working?

I am trying to persist a given stream of data to an S3 compatible storage.
The size is not known before the stream ends and can vary from 5MB to ~500GB.
I tried different possibilities but did not find a better solution than implementing sharding myself. My best guess is to make a fixed-size buffer, fill it from my stream, and write it to S3.
Is there a better solution? Maybe a way where this is transparent to me, without writing the whole stream to memory?
The aws-sdk-go readme has an example program that takes data from stdin and writes it to S3: https://github.com/aws/aws-sdk-go#using-the-go-sdk
When I try to pipe data in with a shell pipe (|), I get the following error:
failed to upload object, SerializationError: failed to compute request body size
caused by: seek /dev/stdin: illegal seek
Am I doing something wrong or is the example not working as I expect it to?
I also tried minio-go, with PutObject() or client.PutObjectStreaming().
This works, but consumes as much memory as the data being stored.
Is there a better solution?
Is there a small example program that can pipe arbitrary data into S3?
You can use the SDK's Uploader to handle uploads of unknown size, but you'll need to make os.Stdin "unseekable" by wrapping it in a plain io.Reader. This is because although the Uploader requires only an io.Reader as the input body, under the hood it checks whether the input body is also a Seeker, and if it is, it calls Seek on it. Since os.Stdin is just an *os.File, which implements the Seeker interface, by default you would get the same error you got from PutObjectWithContext.
The Uploader also allows you to upload the data in chunks, whose size you can configure, and you can also configure how many of those chunks should be uploaded concurrently.
Here's a modified version of the linked example, stripped of the code that can remain unchanged.
package main

import (
	// ...
	"io"

	"github.com/aws/aws-sdk-go/service/s3/s3manager"
)

// reader wraps an io.Reader to hide any Seek method the
// underlying value (e.g. *os.File) might have.
type reader struct {
	r io.Reader
}

func (r *reader) Read(p []byte) (int, error) {
	return r.r.Read(p)
}

func main() {
	// ... parse flags

	sess := session.Must(session.NewSession())

	uploader := s3manager.NewUploader(sess, func(u *s3manager.Uploader) {
		u.PartSize = 20 << 20 // 20MB
		// ... more configuration
	})

	// ... context stuff

	_, err := uploader.UploadWithContext(ctx, &s3manager.UploadInput{
		Bucket: aws.String(bucket),
		Key:    aws.String(key),
		Body:   &reader{os.Stdin},
	})

	// ... handle error
}
As to whether this is a better solution than minio-go, I do not know; you'll have to test that yourself.

Goroutine thread safety of Go logging struct instantiation utility method

I'm working on a new Go service and I have a SetupLogger utility function that creates a new instance of go-kit's log.Logger.
Is this method safe to invoke from code that's handling requests inside separate goroutines?
package utils

import (
	"fmt"
	"io"
	"os"
	"path/filepath"

	"github.com/go-kit/kit/log"
)

// If the environment-specified directory for writing log files exists, open the existing log file
// if it already exists or create a log file if no log file exists.
// If the environment-specified directory for writing log files does not exist, configure the logger
// to log to process stdout.
// Returns an instance of go-kit logger.
func SetupLogger() log.Logger {
	var logWriter io.Writer
	var err error

	LOG_FILE_DIR := os.Getenv("CRAFT_API_LOG_FILE_DIR")
	LOG_FILE_NAME := os.Getenv("CRAFT_API_LOG_FILE_NAME")
	fullLogFilePath := filepath.Join(
		LOG_FILE_DIR,
		LOG_FILE_NAME,
	)

	if dirExists, _ := Exists(&ExistsOsCheckerStruct{}, LOG_FILE_DIR); dirExists {
		if logFileExists, _ := Exists(&ExistsOsCheckerStruct{}, fullLogFilePath); !logFileExists {
			os.Create(fullLogFilePath)
		}
		logWriter, err = os.OpenFile(fullLogFilePath, os.O_RDWR|os.O_CREATE|os.O_APPEND, 0666)
		if err != nil {
			fmt.Println("Could not open log file. ", err)
		}
	} else {
		logWriter = os.Stdout
	}

	return log.NewContext(log.NewJSONLogger(logWriter)).With(
		"timestamp", log.DefaultTimestampUTC,
		"caller", log.DefaultCaller,
	)
}
First recommendation: use the -race flag with go build and go test. It will almost always be able to tell you if you have a race condition, although it might not in this case, since you could end up calling your os.Create() and your os.OpenFile() simultaneously without an actual data race.
So the second recommendation is to avoid, if at all possible, the "if it exists/matches/has permissions, then open/delete/whatever" pattern.
That pattern leads to the TOCTTOU (Time Of Check To Time Of Use) bug, which is often a security bug and can at the very least lead to data loss.
To avoid it, either wrap the check and the use in the same mutex, or use atomic operations, such as an OpenFile call that creates the file or returns an error if it already exists (although, to be technical, it is locked inside the OS kernel, just like atomic CPU ops are locked on the hardware bus).
In your case I am not quite sure why you have two open calls, since it looks like just one would do the job.
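A minimal sketch of that, reusing the question's own flags: os.O_CREATE already creates the file when it does not exist, so the separate Exists check and os.Create call can be dropped.

logWriter, err := os.OpenFile(fullLogFilePath, os.O_RDWR|os.O_CREATE|os.O_APPEND, 0666)
if err != nil {
	fmt.Println("Could not open log file. ", err)
	logWriter = os.Stdout // fall back to stdout, as in the original else branch
}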
Since setting up your logger only involves library instantiation, creating the log file if it doesn't exist, and opening the log file, with no writing involved, there will be no problem calling it from different goroutines, since the shared data is not being tampered with.
Side note:
Design-wise it makes sense (assuming the Logger writes to the same file) to pass around a single instantiated Logger instance for logging, which would also prevent two goroutines from calling your setup function at the same time.

Is there a built-in go logger that can roll

Is there a built-in Go logger that can roll a log file when it reaches a file size limit?
Thanks.
No, there is no built-in logger that currently has this feature.
log4go, which you will find recommended when searching, currently has issues that lead to messages getting lost: if the main program exits while some messages are still in the channel buffer, they are never written. This problem is present in most, if not all, of the examples.
See also this question.
Beyond the syslog package, which is probably not what you really want, there is no such thing in the standard library.
Among third-party packages, log4go, for example, claims to have this feature.
package main

import (
	"log"
	"os"

	"github.com/nikandfor/tlog"
	"github.com/nikandfor/tlog/rotated"
)

func main() {
	f, err := rotated.Create("logfile_template_#.log") // # will be substituted by time of file creation
	if err != nil {
		panic(err)
	}
	defer f.Close()

	f.MaxSize = 1 << 30    // 1GiB
	f.Fallback = os.Stderr // in case of failure to write to file, last chance to save the log message

	tlog.DefaultLogger = tlog.New(tlog.NewConsoleWriter(f, tlog.LstdFlags))

	tlog.Printf("now use it much like %v", "log.Logger")

	log.SetOutput(f) // also works for any logger or whatever needs an io.Writer
	log.Printf("also appears in the log")
}
logger https://github.com/nikandfor/tlog
rotated file https://godoc.org/github.com/nikandfor/tlog/rotated
