Write fixed-length padded lines to a file in Go

Printing justified, fixed-length output seems to be what everyone asks about, and there are many examples to be found, like:
package main

import "fmt"

func main() {
    values := []string{"Mustang", "10", "car"}
    for i := range values {
        fmt.Printf("%10v...\n", values[i])
    }
    for i := range values {
        fmt.Printf("|%-10v|\n", values[i])
    }
}
Situation
But what if I need to WRITE fixed-length lines to a file?
For example, what if a requirement states that each line written to the file must be exactly 32 bytes, left-justified and padded to the right with 0's?
Question
So, how do you accomplish this when writing to a file?

The fmt.PrintXX() functions have analogous variants that start with an F and take the form fmt.FprintXX(). These variants write the result to an io.Writer, which may be an os.File as well.
So if you have fmt.Printf() statements that you want to direct to a file, just change them to call fmt.Fprintf() instead, passing the file as the first argument:
var f *os.File = ... // Initialize / open file
fmt.Fprintf(f, "%10v...\n", values[i])
If you look into the implementation of fmt.Printf():
func Printf(format string, a ...interface{}) (n int, err error) {
    return Fprintf(os.Stdout, format, a...)
}
It does exactly this: it calls fmt.Fprintf(), passing os.Stdout as the output to write to.
For how to open a file, see How to read/write from/to file using Go?
See related question: Format a Go string without printing?
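To meet the exact requirement above (32 bytes, left-justified, right-padded with 0's), note that fmt's %-32s verb pads with spaces, not zeros, so the padding has to be done by hand. A minimal sketch; the padLine helper and the out.txt file name are illustrations, not from the question:
package main

import (
    "fmt"
    "os"
    "strings"
)

// padLine left-justifies s in a 32-byte field, padding the right with '0' bytes.
func padLine(s string) string {
    if len(s) >= 32 {
        return s[:32] // truncate over-long input; adjust if truncation is not desired
    }
    return s + strings.Repeat("0", 32-len(s))
}

func main() {
    f, err := os.Create("out.txt")
    if err != nil {
        panic(err)
    }
    defer f.Close()
    fmt.Fprintln(f, padLine("Mustang")) // 7 content bytes + 25 '0' bytes = 32 bytes
}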

Unmarshalling in-place into a slice type in Go

Often when using Go (not sure why) I get the urge to write something like
type data []event
especially when I know I'm going to be passing the slice around without thinking too much about its contents for much of the program. Sooner or later it's going to be time to unpack some data into that slice of events and I end up writing something like:
func (d *data) Unmarshal(b []byte) {
    // ... lots of sad code that never works
}
No matter what I do I can never quite figure out how to bless my slice type with an unmarshal method that turns some bytes into the data type in-place.
When I give up, I either write a simpler function like func UnmarshalData(b []byte) data which feels like a retreat and makes it hard to write interfaces, or change the type in the first place and make a struct like
type data struct {
    actuallyTheData []event
}
which feels like boilerplate purely to compensate for my lack of understanding.
So my question is: is it possible to write a function with a pointer receiver where the receiver is a slice type and that allows me to e.g. Unmarshal in-place?
The closest I can get, though it still doesn't work (and, let's face it, is pretty ugly), is something like:
type foo []int

func (f *foo) Unmarshal(s string) {
    numbers := strings.Split(s, ",")
    integers := make([]int, len(numbers))
    for i, n := range numbers {
        integer, err := strconv.Atoi(n)
        if err != nil {
            log.Fatal(err)
        }
        integers[i] = integer
    }
    my_f := foo(integers)
    f = &my_f
}
Here's the full example: https://go.dev/play/p/3q7qehoW9tm. Why doesn't it work? What am I misunderstanding?
The last line in your Unmarshal function is overwriting the receiver itself, i.e. its address:
f = &my_f // changing the value of the pointer
The updated value won't be propagated to callers. From Declarations and Scope:
The scope of an identifier denoting a method receiver, function parameter, or result variable is the function body.
You must mutate the value that is being pointed to, then callers will see it upon dereference. (As a matter of fact, you don't have to convert to the defined slice type)
func (f *foo) Unmarshal(s string) {
    // ...
    integers := make([]int, len(numbers))
    *f = integers
}
Fixed playground: https://go.dev/play/p/3JayxQMClt-
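Putting it together, a complete version of the corrected method plus a call site might look like this (the parsing details are taken from the question's own attempt):
package main

import (
    "fmt"
    "log"
    "strconv"
    "strings"
)

type foo []int

func (f *foo) Unmarshal(s string) {
    numbers := strings.Split(s, ",")
    integers := make([]int, len(numbers))
    for i, n := range numbers {
        integer, err := strconv.Atoi(n)
        if err != nil {
            log.Fatal(err)
        }
        integers[i] = integer
    }
    *f = integers // mutate the pointed-to value; callers observe this
}

func main() {
    var f foo
    f.Unmarshal("1,2,3")
    fmt.Println(f) // [1 2 3]
}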

Issue with sys package related to I/O operation

I want to access a lower-level API to do I/O operations using the CreateFile function:
syscall.CreateFile(name *uint16, …)
While doing so I face the problem that the name parameter is of type *uint16, but it should effectively be an array ([]uint16) so that it can hold the string in UTF-16 format, as we can see in the example provided by Microsoft -> link, where the TEXT macro converts the string into a wchar_t array, or we could say a []uint16.
Thanks in advance and sorry if I said anything wrong as I’m just a toddler in this field.
(Solution 1)
func UTF16PtrFromString(s string) (*uint16, error)
A built-in encoder that returns a pointer to the UTF-16 encoding of the string s, with a terminating NUL added.
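A minimal sketch of how Solution 1 plugs into syscall.CreateFile (Windows-only; the access mode and flags here are illustrative choices, not from the original):
//go:build windows

package main

import (
    "fmt"
    "syscall"
)

func main() {
    // UTF16PtrFromString appends the terminating NUL and returns *uint16.
    name, err := syscall.UTF16PtrFromString("file.txt")
    if err != nil {
        panic(err)
    }
    h, err := syscall.CreateFile(
        name,
        syscall.GENERIC_READ|syscall.GENERIC_WRITE,
        0,   // no sharing
        nil, // default security attributes
        syscall.CREATE_ALWAYS,
        syscall.FILE_ATTRIBUTE_NORMAL,
        0, // no template file
    )
    if err != nil {
        panic(err)
    }
    defer syscall.CloseHandle(h)
    fmt.Println(h)
}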
(Solution 2)
Since I was previously unaware of Solution 1, I wrote my own code that does the same work, so you can skip this solution.
To pass the file name (a string) to the sys package, we first convert the string to a slice of UTF-16 code units and pass a pointer to its first element:
var srcUTF16 []uint16 = utf16.Encode([]rune(src + "\x00"))
syscall.CreateFile(&srcUTF16[0], ...)
Edit: added the correct solution (Solution 1) and the terminating NUL in Solution 2.
I don't really care for Windows API function signatures Go has made, and I have written about this. So if you want, you can write your own. Make a file like this:
//go:generate mkwinsyscall -output zfile.go file.go
//sys createFile(name string, access int, mode int, sec *windows.SecurityAttributes, disp int, flag int, template int) (hand int, err error) = kernel32.CreateFileW

package main

import "golang.org/x/sys/windows"

func main() {
    n, e := createFile(
        "file.txt",
        windows.GENERIC_READ,
        0,
        nil,
        windows.CREATE_NEW,
        windows.FILE_ATTRIBUTE_NORMAL,
        0,
    )
    if e != nil {
        panic(e)
    }
    println(n)
}
Then build:
go mod init file
go generate
go mod tidy
go build
I know the result works, because it returns a valid handle the first time and an invalid handle the second time (and also because a file is created, of course):
PS C:\> .\file.exe
336
PS C:\> .\file.exe
-1
If you want, you can edit the signature line I put above, to suit your needs.

How to transform HTML entities via io.Reader

My Go program makes HTTP requests whose response bodies are large JSON documents whose strings encode the ampersand character & as &amp; (presumably due to some Microsoft platform quirk?). My program needs to convert those entities back to the ampersand character in a way that is compatible with json.Decoder.
An example response might look like the following:
{"name":"A&B","comment":"foo&bar"}
Whose corresponding object would be as below:
pkg.Object{Name:"A&B", Comment:"foo&bar"}
The documents come in various shapes so it's not feasible to convert the HTML entities after decoding. Ideally it would be done by wrapping the response body reader in another reader that performs the transformation.
Is there an easy way to wrap the http.Response.Body in some io.ReadCloser which replaces all instances of &amp; with & (or in the general case, replaces any string X with string Y)?
I suspect this is possible with x/text/transform but don't immediately see how. In particular, I'm concerned about edge cases wherein an entity spans batches of bytes. That is, one batch ends with &am and the next batch starts with p;, for example. Is there some library or idiom that gracefully handles that situation?
If you don't want to rely on an external package like transform.Reader, you can write a custom io.Reader wrapper.
The following will handle the edge case where the find element may span two Read() calls:
type fixer struct {
    r        io.Reader // source reader
    fnd, rpl []byte    // find & replace sequences
    partial  int       // track partial find matches from previous Read()
}

// Read satisfies io.Reader interface
func (f *fixer) Read(b []byte) (int, error) {
    off := f.partial
    if off > 0 {
        copy(b, f.fnd[:off]) // copy any partial match from previous `Read`
    }
    n, err := f.r.Read(b[off:])
    n += off
    if err != io.EOF {
        // no need to check for partial match, if EOF, as that is the last Read!
        f.partial = partialFind(b[:n], f.fnd)
        n -= f.partial // lop off any partial bytes
    }
    fixb := bytes.ReplaceAll(b[:n], f.fnd, f.rpl)
    return copy(b, fixb), err // preserve err as it may be io.EOF etc.
}
Along with this helper (which could probably use some optimization):
// returns number of matched bytes, if byte-slice ends in a partial-match
func partialFind(b, find []byte) int {
    for n := len(find) - 1; n > 0; n-- {
        if bytes.HasSuffix(b, find[:n]) {
            return n
        }
    }
    return 0 // no match
}
Working playground example.
Note: to test the edge-case logic, one could use a narrowReader to ensure short Reads and force a match to be split across Reads, like this: validation playground example
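For the question's HTTP scenario, wiring the wrapper up might look like this (a sketch only: the URL and target struct are placeholders, the snippet assumes the fixer type above is in the same package, and imports of net/http, encoding/json, log, and fmt are assumed):
resp, err := http.Get("https://example.invalid/api") // hypothetical endpoint
if err != nil {
    log.Fatal(err)
}
defer resp.Body.Close()

// wrap the body so the decoder sees "&" wherever the payload had "&amp;"
fixed := &fixer{r: resp.Body, fnd: []byte("&amp;"), rpl: []byte("&")}

var doc struct {
    Name    string `json:"name"`
    Comment string `json:"comment"`
}
if err := json.NewDecoder(fixed).Decode(&doc); err != nil {
    log.Fatal(err)
}
fmt.Println(doc.Name, doc.Comment) // A&B foo&bar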
You need to create a transform.Transformer that replaces your characters.
So we need one that transforms an old []byte to a new []byte while preserving all other data. An implementation could look like this:
type simpleTransformer struct {
    Old, New []byte
}

// Transform transforms `t.Old` bytes to `t.New` bytes.
// The current implementation assumes that len(t.Old) >= len(t.New), but it also
// seems to work when len(t.Old) < len(t.New) (this has not been tested extensively).
func (t *simpleTransformer) Transform(dst, src []byte, atEOF bool) (nDst, nSrc int, err error) {
    // Get the position of the first occurrence of `t.Old` so we can replace it
    var ci = bytes.Index(src[nSrc:], t.Old)
    // Loop over the slice until we can't find any occurrences of `t.Old`,
    // also make sure we don't run into index out of range panics
    for ci != -1 && nSrc < len(src) {
        // Copy source data before `nSrc+ci` that doesn't need transformation
        copied := copy(dst[nDst:nDst+ci], src[nSrc:nSrc+ci])
        nDst += copied
        nSrc += copied
        // Copy new data with transformation to `dst`
        nDst += copy(dst[nDst:nDst+len(t.New)], t.New)
        // Skip the rest of old bytes in the next iteration
        nSrc += len(t.Old)
        // Search for the next occurrence of `t.Old`
        ci = bytes.Index(src[nSrc:], t.Old)
    }
    // Mark the rest of the data as not completely processed if it contains a start element of `t.Old`
    // (e.g. if the end is `&amp` and we're looking for `&amp;`).
    // This data will not yet be copied to `dst`, so we can work with it again.
    // If we are at the end (`atEOF`), we don't need this check anymore, as the string might just end with `&amp`.
    if bytes.Contains(src[nSrc:], t.Old[0:1]) && !atEOF {
        err = transform.ErrShortSrc
        return
    }
    // Copy the rest of the data that doesn't need any transformations;
    // the for loop processed everything except this last chunk
    copied := copy(dst[nDst:], src[nSrc:])
    nDst += copied
    nSrc += copied
    return nDst, nSrc, err
}

// To satisfy the transform.Transformer interface
func (t *simpleTransformer) Reset() {}
The implementation has to make sure that it deals with characters that are split between multiple calls of the Transform method, which is why it returns transform.ErrShortSrc to tell the transform.Reader that it needs more information about the next bytes.
This can now be used to replace characters in a stream:
var input = strings.NewReader(`{"name":"A&amp;B","comment":"foo&amp;bar"}`)
r := transform.NewReader(input, &simpleTransformer{[]byte(`&amp;`), []byte(`&`)})
io.Copy(os.Stdout, r) // Instead of io.Copy, use the JSON decoder to read from `r`
Output:
{"name":"A&B","comment":"foo&bar"}
You can also see this in action on the Go Playground.

Reading from a file from bufio with a semi complex sequencing through file

There may be questions like this already, but it's not a super easy thing to google. Basically I have a file that's a set of protobufs encoded and sequenced as they normally are per the protobuf spec.
So think of the bytes values being chunked something like this throughout the file:
[EncodeVarInt(size of protobuf struct)] [protobuf struct bytes]
So you read a few bytes one at a time for the varint, which then tells you how large a jump to read for the protobuf structure.
My implementation, using the os.File ReadAt method, currently looks something like this:
// getting the next value in a file context feature
func (geobuf *Geobuf_Reader) Next() bool {
    if geobuf.EndPos <= geobuf.Pos {
        return false
    }
    startpos := int64(geobuf.Pos)
    for int(geobuf.Get_Byte(geobuf.Pos)) > 127 {
        geobuf.Pos += 1
    }
    geobuf.Pos += 1
    sizebytes := make([]byte, geobuf.Pos-int(startpos))
    geobuf.File.ReadAt(sizebytes, startpos)
    size, _ := DecodeVarint(sizebytes)
    geobuf.Feat_Pos = [2]int{int(size), geobuf.Pos}
    geobuf.Pos = geobuf.Pos + int(size)
    return true
}

// reads a geobuf feature as geojson
func (geobuf *Geobuf_Reader) Feature() *geojson.Feature {
    // getting raw bytes
    a := make([]byte, geobuf.Feat_Pos[0])
    geobuf.File.ReadAt(a, int64(geobuf.Feat_Pos[1]))
    return Read_Feature(a)
}
How can I implement something like bufio or another chunked-reading mechanism to speed up all of these file ReadAt's? Most bufio examples I've seen are for a specific delimiter. Thanks in advance; hopefully this wasn't a horrible question.
Package bufio
import "bufio"
type SplitFunc
SplitFunc is the signature of the split function used to tokenize the
input. The arguments are an initial substring of the remaining
unprocessed data and a flag, atEOF, that reports whether the Reader
has no more data to give. The return values are the number of bytes to
advance the input and the next token to return to the user, plus an
error, if any. If the data does not yet hold a complete token, for
instance if it has no newline while scanning lines, SplitFunc can
return (0, nil, nil) to signal the Scanner to read more data into the
slice and try again with a longer slice starting at the same point in
the input.
If the returned error is non-nil, scanning stops and the error is
returned to the client.
The function is never called with an empty data slice unless atEOF is
true. If atEOF is true, however, data may be non-empty and, as always,
holds unprocessed text.
type SplitFunc func(data []byte, atEOF bool) (advance int, token []byte, err error)
Use bufio.Scanner and write a custom SplitFunc for the varint-prefixed protobuf structs.
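A sketch of what such a SplitFunc could look like, using encoding/binary's Uvarint to decode the length prefix (the names are illustrative, not from the question; imports of bufio, encoding/binary, errors, io, and log are assumed):
// splitVarintPrefixed tokenizes a stream of [uvarint length][message bytes] records.
func splitVarintPrefixed(data []byte, atEOF bool) (advance int, token []byte, err error) {
    if atEOF && len(data) == 0 {
        return 0, nil, nil // clean end of input
    }
    size, n := binary.Uvarint(data) // decode the varint length prefix
    if n < 0 {
        return 0, nil, errors.New("invalid varint length prefix")
    }
    if n == 0 {
        if atEOF {
            return 0, nil, io.ErrUnexpectedEOF // stream ends mid-varint
        }
        return 0, nil, nil // ask the Scanner for more bytes
    }
    if len(data) < n+int(size) {
        if atEOF {
            return 0, nil, io.ErrUnexpectedEOF // stream ends mid-message
        }
        return 0, nil, nil // ask the Scanner for more bytes
    }
    // advance past the prefix and the message; the token is the message alone
    return n + int(size), data[n : n+int(size)], nil
}
Hooking it up then replaces the manual ReadAt bookkeeping (Read_Feature is from the question's code; the buffer sizes are arbitrary):
scanner := bufio.NewScanner(geobuf.File)
scanner.Buffer(make([]byte, 0, 64*1024), 1<<24) // raise the token limit if features can be large
scanner.Split(splitVarintPrefixed)
for scanner.Scan() {
    feature := Read_Feature(scanner.Bytes())
    _ = feature // use the feature
}
if err := scanner.Err(); err != nil {
    log.Fatal(err)
}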

How to copy array into part of another in Go?

I am new to Go, and would like to copy an array (slice) into part of another. For example, I have a largeArray [1000]byte or something and a smallArray [10]byte and I want the first 10 bytes of largeArray to be equal to the contents of smallArray. I have tried:
largeArray[0:10] = smallArray[:]
But that doesn't seem to work. Is there a built-in memcpy-like function, or will I just have to write one myself?
Thanks!
Use the copy built-in function.
package main

func main() {
    largeArray := make([]byte, 1000)
    smallArray := make([]byte, 10)
    copy(largeArray[0:10], smallArray[:])
}
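One detail worth noting (not stated in the original answer): copy returns the number of elements copied, which is the minimum of len(dst) and len(src), so the destination slice bounds the copy:
n := copy(largeArray[0:10], smallArray[:]) // n == 10; never writes past largeArray[9]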
