Go byte to integer encoding with RPMs

I am trying to create a Go program that can read and create RPM files without needing librpm and rpmbuild. The main reason is to get a better understanding of programming in Go.
I am parsing an RPM based on the following: https://github.com/jordansissel/fpm/wiki/rpm-internals
I am looking at the header and trying to parse the number of tags and the length, and I have the following code:
fi, err := os.Open("golang-1.1-2.fc19.i686.rpm")
...
// header
head := make([]byte, 16)
// read a chunk
_, err = fi.Read(head)
if err != nil && err != io.EOF { panic(err) }
fmt.Printf("Magic number %s\n", head[:8])
tags, read := binary.Varint(head[8:12])
fmt.Printf("Tag Count: %d\n", tags)
fmt.Printf("Read %d\n", read)
length, read := binary.Varint(head[12:16])
fmt.Printf("Length : %d\n", length)
fmt.Printf("Read %d\n", read)
I get back the following:
Magic number ���
Tag Count: 0
Read 1
Length : 0
Read 1
I printed out the slice and I see this:
Tag bytes: [0 0 0 7]
Length bytes: [0 0 4 132]
I then tried just doing this:
length, read = binary.Varint([]byte{4, 132})
which returns length as 2 and read 1.
Based off what I am reading, the tag and length should be "4 byte 'tag count'", so how would I get the four bytes as one number?
EDIT:
Based on the feedback from @nick-craig-wood and @james-henstridge, below is my prototype code that does what I'm looking for:
package main
import (
"io"
"os"
"fmt"
"encoding/binary"
"bytes"
)
type Header struct {
// begin with the 8-byte header magic value: 8D AD E8 01 00 00 00 00
Magic uint64
// 4 byte 'tag count'
Count uint32
// 4 byte 'data length'
Length uint32
}
func main() {
// open input file
fi, err := os.Open("golang-1.1-2.fc19.i686.rpm")
if err != nil { panic(err) }
// close fi on exit and check for its returned error
defer func() {
if err := fi.Close(); err != nil {
panic(err)
}
}()
// skip the 96-byte lead and check the error from Seek
if _, err := fi.Seek(96, io.SeekStart); err != nil { panic(err) }
// header
head := make([]byte, 16)
// read a chunk
_, err = fi.Read(head)
if err != nil && err != io.EOF { panic(err) }
fmt.Printf("Magic number %s\n", head[:8])
tags := binary.BigEndian.Uint32(head[8:12])
fmt.Printf("Tag Count: %d\n", tags)
length := binary.BigEndian.Uint32(head[12:16])
fmt.Printf("Length : %d\n", length)
// read it as a struct
buf := bytes.NewBuffer(head)
header := Header{}
err = binary.Read(buf, binary.BigEndian, &header)
if err != nil {
fmt.Println("binary.Read failed:", err)
}
fmt.Printf("header = %#v\n", header)
fmt.Printf("Count bytes: %d\n", header.Count)
fmt.Printf("Length bytes: %d\n", header.Length)
}

First, don't use Varint; it doesn't do what you think it does! It decodes Go's variable-length integer encoding, not fixed-width integers.
Decoding into a Go struct like this is the most convenient way:
package main
import (
"bytes"
"encoding/binary"
"fmt"
)
type Header struct {
// begin with the 8-byte header magic value: 8D AD E8 01 00 00 00 00
Magic uint64
// 4 byte 'tag count'
Count uint32
// 4 byte 'data length'
Length uint32
}
var data = []byte{0x8D, 0xAD, 0xE8, 0x01, 0, 0, 0, 0, 0, 0, 0, 7, 0, 0, 4, 132}
func main() {
buf := bytes.NewBuffer(data)
header := Header{}
err := binary.Read(buf, binary.BigEndian, &header)
if err != nil {
fmt.Println("binary.Read failed:", err)
}
fmt.Printf("header = %#v\n", header)
}
Prints
header = main.Header{Magic:0x8dade80100000000, Count:0x7, Length:0x484}
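The same decoding can be done directly from any io.Reader, such as the opened file, without first copying into a byte slice. The following is a minimal sketch under that assumption; readHeader, parseHeader, and headerMagic are hypothetical names introduced here, and the magic check is an illustrative addition rather than part of the answer.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"io"
)

// Header mirrors the 16-byte layout from the answer above.
type Header struct {
	Magic  uint64 // 8D AD E8 01 00 00 00 00
	Count  uint32 // tag count
	Length uint32 // data length
}

// headerMagic is the magic value interpreted as a big-endian uint64.
const headerMagic = 0x8dade80100000000

// readHeader (hypothetical name) decodes one header directly from r and
// rejects input whose magic value does not match.
func readHeader(r io.Reader) (Header, error) {
	var h Header
	if err := binary.Read(r, binary.BigEndian, &h); err != nil {
		return Header{}, fmt.Errorf("reading header: %w", err)
	}
	if h.Magic != headerMagic {
		return Header{}, fmt.Errorf("unexpected magic %#x", h.Magic)
	}
	return h, nil
}

// parseHeader is a convenience wrapper for data that is already in memory.
func parseHeader(b []byte) (Header, error) {
	return readHeader(bytes.NewReader(b))
}

func main() {
	data := []byte{0x8D, 0xAD, 0xE8, 0x01, 0, 0, 0, 0, 0, 0, 0, 7, 0, 0, 4, 132}
	h, err := parseHeader(data)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("count=%d length=%d\n", h.Count, h.Length)
}
```

With a file, the reader returned by os.Open can be passed to readHeader once the lead has been skipped. binary.Read reports an error if fewer than 16 bytes are available, so a short read does not go unnoticed.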

The data you are reading doesn't look like it is in Go's variable length integer encoding.
Instead, you probably want binary.BigEndian.Uint32():
tags := binary.BigEndian.Uint32(head[8:12])
length := binary.BigEndian.Uint32(head[12:16])
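To see why Varint produced the values in the question, the short sketch below decodes the same bytes both ways. decodeBoth is a hypothetical helper written only for this comparison.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// decodeBoth (hypothetical helper) decodes the same four bytes two ways.
func decodeBoth(b []byte) (fixed uint32, varint int64, n int) {
	// Fixed width: all four bytes, most significant byte first.
	fixed = binary.BigEndian.Uint32(b)
	// Varint: stops at the first byte whose high bit is clear, then
	// undoes the zig-zag encoding used for signed values.
	varint, n = binary.Varint(b)
	return fixed, varint, n
}

func main() {
	fixed, varint, n := decodeBoth([]byte{0, 0, 4, 132})
	fmt.Println(fixed, varint, n) // 1156 0 1

	// The first byte, 4, has its high bit clear, so decoding stops there.
	// Zig-zag decoding maps 4 to 2, and the second byte is never read.
	v, n := binary.Varint([]byte{4, 132})
	fmt.Println(v, n) // 2 1
}
```

This matches the question's observation of length 2 with 1 byte read, and shows that the intended value is 1156 (0x484).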

Related

Binary Encoding/Decoding File in Golang Gives Different Checksum

I'm working on encoding and decoding files in Go. I do specifically need the 2D array that I'm using; this is just test code to show the point. I'm not entirely sure what I'm doing wrong. I'm attempting to convert the file into a list of uint32 numbers and then convert those numbers back into a file. The problem is that the resulting file looks fine, but the checksum doesn't match. I suspect that I'm doing something wrong in the conversion to uint32. I have to use the switch/case because I can't know for sure how many bytes I'll read at the end of a given file.
package main
import (
"bufio"
"encoding/binary"
"fmt"
"io"
"os"
)
const (
headerSeq = 8
body = 24
)
type part struct {
Seq int
Data uint32
}
func main() {
f, err := os.Open("speech.pdf")
if err != nil {
panic(err)
}
defer f.Close()
reader := bufio.NewReader(f)
b := make([]byte, 4)
o := make([][]byte, 0)
var value uint32
for {
n, err := reader.Read(b)
if err != nil {
if err != io.EOF {
panic(err)
}
}
if n == 0 {
break
}
fmt.Printf("len array %d\n", len(b))
fmt.Printf("len n %d\n", n)
switch n {
case 1:
value = uint32(b[0])
case 2:
value = uint32(uint32(b[1]) | uint32(b[0])<<8)
case 3:
value = uint32(uint32(b[2]) | uint32(b[1])<<8 | uint32(b[0])<<16)
case 4:
value = uint32(uint32(b[3]) | uint32(b[2])<<8 | uint32(b[1])<<16 | uint32(b[0])<<24)
}
fmt.Println(value)
bs := make([]byte, 4)
binary.BigEndian.PutUint32(bs, value)
o = append(o, bs)
}
fo, err := os.OpenFile("test.pdf", os.O_APPEND|os.O_WRONLY|os.O_CREATE, 0600)
if err != nil {
panic(err)
}
defer fo.Close()
for _, ba := range o {
_, err := fo.Write(ba)
if err != nil {
panic(err)
}
}
}
So, you want to write and read arrays of varying length in a file.
import (
"encoding/binary"
"io"
)
// You need a consistent byte order for reading and writing multi-byte data types.
// binary.LittleEndian is a variable, so it cannot be assigned to a const.
var order = binary.LittleEndian
var dataToWrite = []byte{ ... ... ... }
var err error
// To write a recoverable array of varying length
var w io.Writer
// First, encode the length of data that will be written
err = binary.Write(w, order, int64(len(dataToWrite)))
// Check error
err = binary.Write(w, order, dataToWrite)
// Check error
// To read a variable length array
var r io.Reader
var dataLen int64
// First, we need to know the length of data to be read
err = binary.Read(r, order, &dataLen)
// Check error
// Allocate a slice to hold the expected amount of data
dataReadIn := make([]byte, dataLen)
err = binary.Read(r, order, dataReadIn)
// Check error
This pattern works not just with byte, but any other fixed size data type. See binary.Write for specifics about the encoding.
If the size of the encoded data is a big concern, you can save some bytes by storing the array length as a varint with binary.PutVarint and binary.ReadVarint.
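The length-prefix pattern above can be sketched as a runnable round trip through an in-memory buffer. writeChunk, readChunk, and roundTrip are hypothetical names chosen for this illustration. In the question's code, a final read of fewer than four bytes is still written back as four bytes, which is one likely reason the checksum differs; storing the real length avoids that padding.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"io"
)

var order = binary.LittleEndian

// writeChunk writes len(data) as an int64, followed by the raw bytes.
func writeChunk(w io.Writer, data []byte) error {
	if err := binary.Write(w, order, int64(len(data))); err != nil {
		return err
	}
	_, err := w.Write(data)
	return err
}

// readChunk reads one chunk previously written by writeChunk.
func readChunk(r io.Reader) ([]byte, error) {
	var n int64
	if err := binary.Read(r, order, &n); err != nil {
		return nil, err
	}
	if n < 0 {
		return nil, fmt.Errorf("negative chunk length %d", n)
	}
	data := make([]byte, n)
	// io.ReadFull keeps reading until data is full or an error occurs.
	if _, err := io.ReadFull(r, data); err != nil {
		return nil, err
	}
	return data, nil
}

// roundTrip writes every chunk to a buffer and reads them all back.
func roundTrip(chunks ...[]byte) ([][]byte, error) {
	var buf bytes.Buffer
	for _, c := range chunks {
		if err := writeChunk(&buf, c); err != nil {
			return nil, err
		}
	}
	out := make([][]byte, 0, len(chunks))
	for range chunks {
		c, err := readChunk(&buf)
		if err != nil {
			return nil, err
		}
		out = append(out, c)
	}
	return out, nil
}

func main() {
	chunks, err := roundTrip([]byte("abc"), []byte("z"))
	if err != nil {
		fmt.Println(err)
		return
	}
	for _, c := range chunks {
		fmt.Printf("%d bytes: %q\n", len(c), c)
	}
}
```

A real reader would probably also cap the length it is willing to allocate, since the prefix comes from the file and may be corrupt.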

What does the error "binary.Write: invalid type" mean?

In the code shown below, I create a struct type and want to encode it to binary.
But it fails with the error binary.Write: invalid type main.Stu. I have read some similar code, but I can't find why my code doesn't work.
type Stu struct {
Name string
Age int
Id int
}
func main() {
s := &Stu{
Name: "Leo",
Age: 21,
Id: 1,
}
buf := new(bytes.Buffer)
err := binary.Write(buf, binary.BigEndian, s)
if err != nil{
fmt.Println(err)
}
fmt.Printf("%q\n", buf)
}
In short: encoding/binary cannot encode values that do not have a fixed size; int and string are examples of such types. Quoting from binary.Write():
Write writes the binary representation of data into w. Data must be a fixed-size value or a slice of fixed-size values, or a pointer to such data.
Note that if you remove the string field and change int fields to int32, it'll work:
type Stu struct {
Age int32
Id int32
}
func main() {
s := &Stu{
Age: 21,
Id: 1,
}
buf := new(bytes.Buffer)
err := binary.Write(buf, binary.BigEndian, s)
if err != nil {
fmt.Println(err)
}
fmt.Printf("%q\n", buf)
}
Which outputs (try it on the Go Playground):
"\x00\x00\x00\x15\x00\x00\x00\x01"
As the doc suggests, to encode complex structures, use encoding/gob.
Example of encoding and decoding using encoding/gob:
buf := new(bytes.Buffer)
enc := gob.NewEncoder(buf)
if err := enc.Encode(s); err != nil {
fmt.Println(err)
}
fmt.Printf("%v\n", buf.Bytes())
dec := gob.NewDecoder(buf)
var s2 *Stu
if err := dec.Decode(&s2); err != nil {
fmt.Println(err)
}
fmt.Printf("%+v\n", s2)
Which outputs (try it on the Go Playground):
[41 255 129 3 1 1 3 83 116 117 1 255 130 0 1 3 1 4 78 97 109 101 1 12 0 1 3 65 103 101 1 4 0 1 2 73 100 1 4 0 0 0 12 255 130 1 3 76 101 111 1 42 1 2 0]
&{Name:Leo Age:21 Id:1}
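A quick way to check whether a value is acceptable to encoding/binary before calling binary.Write is binary.Size, which returns -1 for values that are not fixed-size. The sketch below assumes a second type, FixedStu, and a helper, encodable, both of which are hypothetical names introduced for the comparison.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// Stu is the struct from the question; string and int have no fixed size.
type Stu struct {
	Name string
	Age  int
	Id   int
}

// FixedStu (hypothetical) uses only fixed-size fields.
type FixedStu struct {
	Age int32
	Id  int32
}

// encodable reports whether encoding/binary can encode v.
func encodable(v interface{}) bool {
	return binary.Size(v) >= 0
}

func main() {
	fmt.Println(binary.Size(Stu{Name: "Leo", Age: 21, Id: 1})) // -1
	fmt.Println(binary.Size(FixedStu{Age: 21, Id: 1}))         // 8
	fmt.Println(encodable(Stu{}), encodable(FixedStu{}))       // false true
}
```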

What does crypto/cipher-XORKeyStream do to the src []byte?

I'm doing AES encryption using Go, and I found that the source bytes changed after encryption. It seems that the XORKeyStream function makes the change if cap(source) > len(source). What exactly does it do to the src []byte?
go version go1.12.5 darwin/amd64
func main() {
byte1 := []byte("123abc")
fmt.Println("content1:", byte1, "len1:", len(byte1), "cap1:", cap(byte1)) // content1: [49 50 51 97 98 99] len1: 6 cap1: 6
buf := bytes.NewBuffer([]byte("123abc"))
byte2, _ := ioutil.ReadAll(buf)
fmt.Println("content2:", byte2, "len2:", len(byte2), "cap2:", cap(byte2)) // content2: [49 50 51 97 98 99] len2: 6 cap2: 1536
_, _, _, err := crypt.AESEnc(byte1)
if err != nil {
log.Fatal(err)
}
fmt.Println("content1:", byte1, "len1:", len(byte1), "cap1:", cap(byte1)) // content1: [49 50 51 97 98 99] len1: 6 cap1: 6
_, _, _, err = crypt.AESEnc(byte2)
if err != nil {
log.Fatal(err)
}
fmt.Println("content2:", byte2, "len2:", len(byte2), "cap2:", cap(byte2)) // content2: [132 200 7 200 195 8] len2: 6 cap2: 1536
}
func AESEnc(data []byte) ([]byte, []byte, string, error) {
key := make([]byte, 16)
iv := make([]byte, 16)
_, err := io.ReadFull(rand.Reader, key)
if err != nil {
return nil, nil, "", err
}
_, err = io.ReadFull(rand.Reader, iv)
if err != nil {
return nil, nil, "", err
}
block, err := aes.NewCipher(key)
if err != nil {
return nil, nil, "", err
}
pdata := pckspadding(data, block.BlockSize())
stream := cipher.NewCFBEncrypter(block, iv)
stream.XORKeyStream(pdata, pdata)
return key, iv, base64.StdEncoding.EncodeToString(pdata), nil
}
func pckspadding(ciphertext []byte, blockSize int) []byte {
padding := blockSize - len(ciphertext)%blockSize
padtext := bytes.Repeat([]byte{byte(padding)}, padding)
return append(ciphertext, padtext...)
}
byte2 changes after encryption, what happened?
I'm not familiar with crypto/cipher's XORKeyStream, but I can tell you what XOR does to bits, if that is helpful. I have some electronics experience, and here is the truth table for an XOR gate:
X Y Z
0 0 0
0 1 1
1 0 1
1 1 0
Inputs X and Y represent two bits. The output Z is the result of XOR-ing X and Y.
In English you would say to yourself "Inputs, either one or the other but not both" results in an output of "True".
I don't know how much help this will be, or how to apply it to more than two input bits with crypto/cipher's XORKeyStream.
But here would be an example:
X = 00110001010
Y = 11111111111
Z = 11001110101
Good Luck!
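Separately from what XOR does to individual bits, the behaviour in the question can be reproduced without any cryptography. The sketch below is an illustration based on the code in the question, not on the answer above: append reuses the backing array whenever cap > len, so the padded slice shares memory with the caller's slice, and an in-place transform then overwrites the caller's bytes. pad mimics pckspadding, scramble stands in for stream.XORKeyStream(pdata, pdata), and padCopy is a hypothetical alternative that copies first.

```go
package main

import "fmt"

// pad mimics pckspadding from the question. When cap(data) > len(data),
// append writes into data's existing backing array.
func pad(data []byte, blockSize int) []byte {
	padding := blockSize - len(data)%blockSize
	for i := 0; i < padding; i++ {
		data = append(data, byte(padding))
	}
	return data
}

// padCopy (hypothetical) pads into a fresh slice, so the caller's bytes
// are never shared with the result.
func padCopy(data []byte, blockSize int) []byte {
	padding := blockSize - len(data)%blockSize
	out := make([]byte, len(data), len(data)+padding)
	copy(out, data)
	for i := 0; i < padding; i++ {
		out = append(out, byte(padding))
	}
	return out
}

// scramble stands in for an in-place XORKeyStream(p, p).
func scramble(p []byte) {
	for i := range p {
		p[i] ^= 0xFF
	}
}

func main() {
	tight := make([]byte, 6) // len 6, cap 6: append must reallocate
	copy(tight, "123abc")
	roomy := make([]byte, 6, 64) // len 6, cap 64: append reuses the array
	copy(roomy, "123abc")

	scramble(pad(tight, 16))
	scramble(pad(roomy, 16))

	fmt.Println(string(tight) == "123abc") // true: untouched
	fmt.Println(string(roomy) == "123abc") // false: overwritten in place
}
```

Under this reading, XORKeyStream does nothing unusual to src; it writes to dst, and in the question dst and the original data happen to share a backing array for byte2 but not for byte1.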

Go: deep copy slices

I would like to read a slice of strings representing hexadecimal numbers, and decode them to a slice of byte slices ([]string --> [][]byte). This is my code so far:
func (self *algo_t) decode_args(args []string) ([][]byte, error) {
var data [][]byte
for i := uint32(0); i < self.num_args; i++ {
data = make([][]byte, self.num_args)
tmp, err := hex.DecodeString(args[i])
fmt.Printf("i = %d\ttmp = %x\n", i, tmp)
data[i] = make([]byte, len(tmp))
copy(data[i], tmp)
if err != nil {
fmt.Fprintf(os.Stderr, "Error decoding hex string %s: %s\n", args[i], err.Error())
return nil, err
}
}
fmt.Printf("line 69\tdata[0] = %x\tdata[1] = %x\tdata[2] = %x\n",data[0], data[1], data[2])
return data, nil
}
Calling this code and passing args = []string{"010203","040506","070809"} yields the following output:
i = 0 tmp = 010203
i = 1 tmp = 040506
i = 2 tmp = 070809
line 69 data[0] = data[1] = data[2] = 070809
Presumably the function returns [][]byte{[]byte{}, []byte{}, []byte{0x07, 0x08, 0x09}}.
I understand that this is because of the pointer behavior of Go; what is the best practice for doing a deep copy of this kind?
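The problem is not pointer behavior: data is reallocated by make on every loop iteration, which discards the entries stored earlier. Allocate it once, before the loop. hex.DecodeString already returns a new slice on each call, so no extra copy is needed. For example,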
package main
import (
"encoding/hex"
"fmt"
)
// Decode hex []string to [][]byte
func decode(s []string) ([][]byte, error) {
b := make([][]byte, len(s))
for i, ss := range s {
h, err := hex.DecodeString(ss)
if err != nil {
err = fmt.Errorf(
"Error decoding hex string %s: %s\n",
ss, err.Error(),
)
return nil, err
}
b[i] = h
}
return b, nil
}
func main() {
s := []string{"010203", "040506", "070809"}
fmt.Println(s)
b, err := decode(s)
if err != nil {
fmt.Println(err)
} else {
fmt.Println(b)
}
s = []string{"ABCDEF", "012345", "09AF"}
fmt.Println(s)
b, err = decode(s)
if err != nil {
fmt.Println(err)
} else {
fmt.Println(b)
}
s = []string{"01", "123XYZ"}
fmt.Println(s)
b, err = decode(s)
if err != nil {
fmt.Println(err)
} else {
fmt.Println(b)
}
}
Output:
[010203 040506 070809]
[[1 2 3] [4 5 6] [7 8 9]]
[ABCDEF 012345 09AF]
[[171 205 239] [1 35 69] [9 175]]
[01 123XYZ]
Error decoding hex string 123XYZ: encoding/hex: invalid byte: U+0058 'X'
There is a package built specifically to handle deep copy: http://godoc.org/code.google.com/p/rog-go/exp/deepcopy
You can look at the source here: https://code.google.com/p/rog-go/source/browse/exp/deepcopy/deepcopy.go. It covers copying slices and pointers, so it should cover your case.
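For the specific case of [][]byte, a deep copy can also be written by hand in a few lines. This is a minimal sketch; deepCopy is a hypothetical name and is not taken from the package linked above.

```go
package main

import "fmt"

// deepCopy returns a copy of src whose rows share no memory with src.
func deepCopy(src [][]byte) [][]byte {
	dst := make([][]byte, len(src))
	for i, row := range src {
		dst[i] = make([]byte, len(row))
		copy(dst[i], row)
	}
	return dst
}

func main() {
	orig := [][]byte{{1, 2, 3}, {4, 5, 6}}
	dup := deepCopy(orig)
	dup[0][0] = 99 // changes the copy only
	fmt.Println(orig[0][0], dup[0][0]) // 1 99
}
```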

limitation on bytes.Buffer?

I am trying to gzip a slice of bytes using the package "compress/gzip". I am writing 45976 bytes to a bytes.Buffer. When I try to uncompress the content using a gzip.Reader and its Read function, I find that not all of the content is recovered. Are there limitations to bytes.Buffer? Is there a way to bypass or alter this? Here is my code (edit):
func compress_and_uncompress() {
var buf bytes.Buffer
w := gzip.NewWriter(&buf)
i,err := w.Write([]byte(long_string))
if(err!=nil){
log.Fatal(err)
}
w.Close()
b2 := make([]byte, 80000)
r, _ := gzip.NewReader(&buf)
j, err := r.Read(b2)
if(err!=nil){
log.Fatal(err)
}
r.Close()
fmt.Println("Wrote:", i, "Read:", j)
}
Output from testing (with a chosen string as long_string) would give
Wrote: 45976 Read: 32768
Continue reading to get the remaining 13208 bytes. The first read returns 32768 bytes, the second read returns 13208 bytes, and the third read returns zero bytes and EOF.
For example,
package main
import (
"bytes"
"compress/gzip"
"fmt"
"io"
"log"
)
func compress_and_uncompress() {
var buf bytes.Buffer
w := gzip.NewWriter(&buf)
i, err := w.Write([]byte(long_string))
if err != nil {
log.Fatal(err)
}
w.Close()
b2 := make([]byte, 80000)
r, _ := gzip.NewReader(&buf)
j := 0
for {
n, err := r.Read(b2[:cap(b2)])
b2 = b2[:n]
j += n
if err != nil {
if err != io.EOF {
log.Fatal(err)
}
if n == 0 {
break
}
}
fmt.Println(len(b2))
}
r.Close()
fmt.Println("Wrote:", i, "Read:", j)
}
var long_string string
func main() {
long_string = string(make([]byte, 45976))
compress_and_uncompress()
}
Output:
32768
13208
Wrote: 45976 Read: 45976
Use ioutil.ReadAll. The contract for io.Reader says that Read doesn't have to return all the data, and there is a good reason for that, to do with the sizes of internal buffers. ioutil.ReadAll takes an io.Reader and reads from it until EOF.
E.g. (untested):
import "io/ioutil"
func compress_and_uncompress() {
var buf bytes.Buffer
w := gzip.NewWriter(&buf)
i,err := w.Write([]byte(long_string))
if err!=nil {
log.Fatal(err)
}
w.Close()
r, _ := gzip.NewReader(&buf)
b2, err := ioutil.ReadAll(r)
if err!=nil {
log.Fatal(err)
}
r.Close()
fmt.Println("Wrote:", i, "Read:", len(b2))
}
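A self-contained variant of the same idea is sketched below. It assumes Go 1.16 or later, where io.ReadAll is available and ioutil.ReadAll is deprecated in its favor; on older versions ioutil.ReadAll can be substituted. gzipBytes and gunzipBytes are hypothetical helper names.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
	"log"
)

// gzipBytes compresses data and returns the compressed bytes.
func gzipBytes(data []byte) ([]byte, error) {
	var buf bytes.Buffer
	w := gzip.NewWriter(&buf)
	if _, err := w.Write(data); err != nil {
		return nil, err
	}
	// Close flushes the remaining compressed data and the gzip trailer.
	if err := w.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// gunzipBytes decompresses data, reading until EOF.
func gunzipBytes(data []byte) ([]byte, error) {
	r, err := gzip.NewReader(bytes.NewReader(data))
	if err != nil {
		return nil, err
	}
	defer r.Close()
	return io.ReadAll(r)
}

func main() {
	in := make([]byte, 45976)
	packed, err := gzipBytes(in)
	if err != nil {
		log.Fatal(err)
	}
	out, err := gunzipBytes(packed)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("Wrote:", len(in), "Read:", len(out))
}
```

Unlike the snippets above, this version also checks the errors returned by gzip.NewReader and the writer's Close.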
If a read from the gzip reader does not return the whole expected slice, you can keep reading until you have received all of the data.
Regarding your problem where subsequent reads did not append to the end of the slice but instead overwrote the beginning: Read always fills the slice p that it is given starting at index 0, as can be seen in the implementation of gzip's Read function, which includes
z.digest.Write(p[0:n])
So calling r.Read(b2) again writes over the start of b2 rather than appending to it.
This can be solved by passing the unread tail of the buffer, b2[j:], on each call:
func compress_and_uncompress(long_string string) {
// Writer
var buf bytes.Buffer
w := gzip.NewWriter(&buf)
i,err := w.Write([]byte(long_string))
if(err!=nil){
log.Fatal(err)
}
w.Close()
// Reader
var j, k int
b2 := make([]byte, 80000)
r, _ := gzip.NewReader(&buf)
for {
k, err = r.Read(b2[j:]) // Add the offset here
j += k // count the bytes even when they arrive together with io.EOF
if err != nil {
if err != io.EOF {
log.Fatal(err)
}
break
}
}
r.Close()
fmt.Println("Wrote:", i, "Read:", j)
}
The result will be:
Wrote: 45976 Read: 45976
Also, after testing with a string of 45976 characters, I can confirm that the output exactly matches the input, with the second part correctly appended after the first part.
Source for gzip.Read: http://golang.org/src/pkg/compress/gzip/gunzip.go?s=4633:4683#L189
