Golang copy function understanding

Hey guys, I was playing around with some buffers and wrote some code to understand how Read() works:
package main

import (
    "bytes"
    "fmt"
    "io"
)

func main() {
    tmp := make([]byte, 2)
    data := []byte("HEL")
    dataReader := bytes.NewReader(data)
    dest := make([]byte, len(data))
    for {
        n, err := dataReader.Read(tmp)
        fmt.Println(n)
        fmt.Println(string(tmp))
        dest = append(dest, tmp[:]...)
        if err == io.EOF {
            break
        }
    }
    fmt.Println(string(dest))
}
output:
2 -> n
HE -> tmp[:]
1 -> n
LE -> tmp[:]
0 -> n
LE -> tmp[:]
HELELE -> dest
So I know the output is wrong and I should actually be appending tmp[:n] to copy only the bytes that were read, but looking at the output I realised that the tmp buffer does not get cleared on every iteration. Also, when n is 1, shouldn't the contents of the buffer be EL? I mean, L is getting prepended to tmp, not appended. I took a look at the Read function but couldn't understand it. Can someone explain it to me?

In the first iteration, Read reads two bytes, and your program produces the HE output. In the second iteration, Read reads one byte into tmp. Read always fills the buffer starting at index 0, so tmp[0] now contains that byte, but tmp[1] still contains the E read during the first iteration. However, you append all of tmp to dest, getting HELE. The third time around, Read reads 0 bytes, but you still append the LE in tmp to dest.
The correct version of your program would be:
for {
    n, err := dataReader.Read(tmp)
    fmt.Println(n)
    fmt.Println(string(tmp))
    dest = append(dest, tmp[:n]...)
    if err == io.EOF {
        break
    }
}
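Putting it together, a minimal full version might look like the sketch below. Note that dest is also created with length 0 here: the original make([]byte, len(data)) would leave three zero bytes at the front of dest even once the loop is fixed.
package main

import (
    "bytes"
    "fmt"
    "io"
)

func main() {
    tmp := make([]byte, 2)
    data := []byte("HEL")
    dataReader := bytes.NewReader(data)
    // Length 0, capacity len(data): nothing sits in front of the appended bytes.
    dest := make([]byte, 0, len(data))
    for {
        n, err := dataReader.Read(tmp)
        dest = append(dest, tmp[:n]...) // only the n bytes actually read
        if err == io.EOF {
            break
        }
    }
    fmt.Println(string(dest)) // HEL
}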


Reading a random line from a file in constant time in Go

I have the following code to choose 2 random lines from a file containing lines of the form ip:port:
package main

import (
    "os"
    "fmt"
    "math/rand"
    "log"
    "time"
    "unicode/utf8"
    //"bufio"
)

func main() {
    fmt.Println("num bytes in line is: \n", utf8.RuneCountInString("10.244.1.8:8080"))
    file_pods_array, err_file_pods_array := os.Open("pods_array.txt")
    if err_file_pods_array != nil {
        log.Fatalf("failed opening file: %s", err_file_pods_array)
    }
    //16 = num of bytes in ip:port pair
    randsource := rand.NewSource(time.Now().UnixNano())
    randgenerator := rand.New(randsource)
    firstLoc := randgenerator.Intn(10)
    secondLoc := randgenerator.Intn(10)
    candidate1 := ""
    candidate2 := ""
    num_bytes_from_start_first := 16 * (firstLoc + 1)
    num_bytes_from_start_second := 16 * (secondLoc + 1)
    buf_ipport_first := make([]byte, int64(15))
    buf_ipport_second := make([]byte, int64(15))
    start_first := int64(num_bytes_from_start_first)
    start_second := int64(num_bytes_from_start_second)
    _, err_first := file_pods_array.ReadAt(buf_ipport_first, start_first)
    first_ipport_ep := buf_ipport_first
    if err_first == nil {
        candidate1 = string(first_ipport_ep)
    }
    _, err_second := file_pods_array.ReadAt(buf_ipport_second, start_second)
    second_ipport_ep := buf_ipport_second
    if err_second == nil {
        candidate2 = string(second_ipport_ep)
    }
    fmt.Println("first is: ", candidate1)
    fmt.Println("sec is: ", candidate2)
}
This sometimes prints empty or partial lines.
Why does this happen and how can I fix it?
Output example:
num bytes in line is:
15
first is: 10.244.1.17:808
sec is:
10.244.1.11:80
Thank you.
If your lines were of a fixed length you could do this in constant time. With L as the length of each line:
1. Check the size of the file, S.
2. Divide S/L to get the number of lines N.
3. Pick a random number R from 0 to N-1.
4. Seek to R*L in the file.
5. Read L bytes.
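A sketch of those steps, assuming every line is exactly 16 bytes (15 characters of ip:port plus a newline, as the question's comment suggests):
package main

import (
    "fmt"
    "log"
    "math/rand"
    "os"
    "time"
)

func main() {
    const recordLen = 16 // assumed fixed layout: 15 bytes of "ip:port" plus a newline

    f, err := os.Open("pods_array.txt")
    if err != nil {
        log.Fatal(err)
    }
    defer f.Close()

    info, err := f.Stat()
    if err != nil {
        log.Fatal(err)
    }
    numLines := info.Size() / recordLen // S / L = N

    r := rand.New(rand.NewSource(time.Now().UnixNano()))
    line := r.Int63n(numLines) // R in [0, N)

    buf := make([]byte, recordLen-1) // read the record without its trailing newline
    if _, err := f.ReadAt(buf, line*recordLen); err != nil {
        log.Fatal(err)
    }
    fmt.Println(string(buf))
}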
But you don't have fixed length lines. We can't do constant time, but we can do it in constant memory and O(n) time using the technique from The Art of Computer Programming, Volume 2, Section 3.4.2, by Donald E. Knuth.
1. Read a line. Remember its line number M.
2. Pick a random number from 1 to M.
3. If it's 1, remember this line.
That is, as you read each line you have a 1/M chance of picking it. Cumulatively this adds up to 1/N for every line.
If we have three lines, the first line has a 1/1 chance of being picked. Then a 1/2 chance of remaining. Then a 2/3 chance of remaining. Total chance: 1 * 1/2 * 2/3 = 1/3.
The second line has a 1/2 chance of being picked and a 2/3 chance of remaining. Total chance: 1/2 * 2/3 = 1/3.
The third line has a 1/3 chance of being picked.
package main

import (
    "bufio"
    "fmt"
    "os"
    "log"
    "math/rand"
    "time"
)

func main() {
    file, err := os.Open("pods_array.txt")
    if err != nil {
        log.Fatal(err)
    }
    defer file.Close()

    scanner := bufio.NewScanner(file)
    randsource := rand.NewSource(time.Now().UnixNano())
    randgenerator := rand.New(randsource)

    lineNum := 1
    var pick string
    for scanner.Scan() {
        line := scanner.Text()
        fmt.Printf("Considering %v at 1/%v.\n", scanner.Text(), lineNum)

        // Instead of 1 to N it's 0 to N-1.
        roll := randgenerator.Intn(lineNum)
        fmt.Printf("We rolled a %v.\n", roll)

        if roll == 0 {
            fmt.Printf("Picking line.\n")
            pick = line
        }

        lineNum += 1
    }
    fmt.Printf("Picked: %v\n", pick)
}
Because rand.Intn(n) returns [0,n), that is from 0 to n-1, we check for 0, not 1.
Maybe you're thinking "what if I seek to a random point in the file and then read the next full line?" That wouldn't quite be constant time, it would be O(longest line), and it wouldn't be truly random: longer lines would get picked more frequently.
Note that since these are (I assume) all IP addresses and ports, you could have constant record lengths. Store the IPv4 address as 32 bits and the port as 16 bits: 48 bits per line.
However, this will break on IPv6. For forward compatibility store everything as IPv6: 128 bits for the IP and 16 bits for the port. 144 bits per line. Convert IPv4 addresses to IPv6 for storage.
This will allow you to pick random addresses in constant time, and it will save disk space.
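A rough sketch of that fixed-record layout (my own illustration, not part of the original answer): 16 bytes for the address stored in IPv6 form plus 2 bytes for the port, 18 bytes (144 bits) per record, so a random record is just one ReadAt at offset R*18.
package main

import (
    "encoding/binary"
    "fmt"
    "net"
)

// encodeRecord packs an ip:port pair into a fixed 18-byte record:
// 16 bytes for the address (IPv4 becomes IPv4-mapped IPv6), 2 bytes for the port.
func encodeRecord(host string, port uint16) ([]byte, error) {
    ip := net.ParseIP(host)
    if ip == nil {
        return nil, fmt.Errorf("bad IP: %q", host)
    }
    rec := make([]byte, 18)
    copy(rec, ip.To16())
    binary.BigEndian.PutUint16(rec[16:], port)
    return rec, nil
}

// decodeRecord reverses encodeRecord.
func decodeRecord(rec []byte) (string, uint16) {
    ip := net.IP(rec[:16])
    port := binary.BigEndian.Uint16(rec[16:18])
    return ip.String(), port
}

func main() {
    rec, _ := encodeRecord("10.244.1.8", 8080)
    fmt.Println(len(rec), rec) // always 18 bytes
    fmt.Println(decodeRecord(rec))
}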
Alternatively, store them in SQLite.
Found a solution using ioutil and strings:
package main

import (
    "fmt"
    "io/ioutil"
    "math/rand"
    "strings"
    "time"
)

func main() {
    randsource := rand.NewSource(time.Now().UnixNano())
    randgenerator := rand.New(randsource)
    firstLoc := randgenerator.Intn(10)
    secondLoc := randgenerator.Intn(10)
    candidate1 := ""
    candidate2 := ""
    dat, err := ioutil.ReadFile("pods_array.txt")
    if err == nil {
        ascii := string(dat)
        splt := strings.Split(ascii, "\n")
        candidate1 = splt[firstLoc]
        candidate2 = splt[secondLoc]
    }
    fmt.Println(candidate1)
    fmt.Println(candidate2)
}
Output
10.244.1.3:8080
10.244.1.11:8080

How to read inputs recursively in golang

In the following code, after one recursion the inputs are not read (from stdin). The output is incorrect if N is greater than 1.
X is read as 0 after one recursive call and hence the array is not read after that.
The program is supposed to print the sum of squares of the positive numbers in the array. P.S. It has to be done using only recursion.
package main

// Imports
import (
    "fmt"
    "bufio"
    "os"
    "strings"
    "strconv"
)

// Global Variables
var N int = 0
var X int = 0
var err error
var out int = 0
var T string = "0" // All set to 0 just in case there is no input, so we don't crash with nil values.

func main() {
    // Let's grab our input.
    fmt.Print("Enter N: ")
    fmt.Scanln(&N)
    // Make our own recursion.
    loop()
}

func loop() {
    if N == 0 {
        return
    }
    // Grab our array length.
    fmt.Scanln(&X)
    tNum := make([]string, X)
    // Grab our values and put them into an array.
    in := bufio.NewReader(os.Stdin)
    T, err = in.ReadString('\n')
    tNum = strings.Fields(T)
    // Parse the numbers, square, and add.
    add(tNum)
    // Output and reset.
    fmt.Print(out)
    out = 0
    N--
    loop()
}

// Another loop, until X is 0.
func add(tNum []string) {
    if X == 0 {
        return
    }
    // Parse a string to an integer.
    i, err := strconv.Atoi(tNum[X-1])
    if err != nil {
    }
    // If a number is negative, make it 0, so when we add its square, it does nothing.
    if i < 0 {
        i = 0
    }
    // Add to our total!
    out = out + i*i
    X--
    add(tNum)
}
Input:
2
4
2 4 6 8
3
1 3 9
Output:
1200
Expected output:
120
91
bufio.Reader, as the name suggests, uses a buffer to store what it reads from the underlying reader (os.Stdin here). That means each time you create a bufio.Reader and read from it once, more than what you consumed may be sitting in its buffer, and so the next time you read from the underlying reader (os.Stdin), you do not start from where you left off.
You should only have one bufio.Reader for os.Stdin. Make it global (if that is a requirement) or make it an argument. In fact, the bufio package has a Scanner type that can split on spaces and newlines, so you don't need to call strings.Fields.
I think you should practise doing this yourself, but here is a playground link: https://play.golang.org/p/7zBDYwqWEZ0
Here is an example that illustrates the general principles.
// Print the sum of the squares of positive numbers in the input.
package main

import (
    "bufio"
    "fmt"
    "io"
    "os"
    "strconv"
    "strings"
)

func sumOfSquares(sum int, s *bufio.Scanner, err error) (int, *bufio.Scanner, error) {
    if err != nil {
        return sum, s, err
    }
    if !s.Scan() {
        err = s.Err()
        if err == nil {
            err = io.EOF
        }
        return sum, s, err
    }
    for _, f := range strings.Fields(s.Text()) {
        i, err := strconv.Atoi(f)
        if err != nil || i <= 0 {
            continue
        }
        sum += i * i
    }
    return sumOfSquares(sum, s, nil)
}

func main() {
    sum := 0
    s := bufio.NewScanner(os.Stdin)
    sum, s, err := sumOfSquares(sum, s, nil)
    if err != nil && err != io.EOF {
        fmt.Fprintln(os.Stderr, err)
        os.Exit(1)
    }
    fmt.Println(sum)
}
Input:
2
4
2 4 6 8
3
1 3 9
Output:
240

Invalid indirect of type func (int) string

I'm getting stuck on the following error:
./main.go:76: invalid indirect of Fizzbuzz (type func(int) string)
I understand that the Fizzbuzz function does not satisfy WriteString. My intuition is telling me that this is probably because I should be using an interface to Fizzbuzz? Can someone please give me some direction on how to execute this? What can I do to make this code idiomatic Go?
// -------------------------------INPUT--------------------------------------
// Your program should read an input file (provided on the command line),
// which contains multiple newline separated lines.
// Each line will contain 3 numbers which are space delimited.
// The first number is first number to divide by ('A' in this example),
// the second number is the second number to divide by ('B' in this example)
// and the third number is where you should count till ('N' in this example).
// You may assume that the input file is formatted correctly and the
// numbers are valid positive integers. E.g.
// 3 5 10
// 2 7 15
// -------------------------------OUTPUT------------------------------------
// Print out the series 1 through N replacing numbers divisible by 'A' by F,
// numbers divisible by 'B' by B and numbers divisible by both as 'FB'.
// Since the input file contains multiple sets of values, your output will
// print out one line per set. Ensure that there are no trailing empty spaces
// on each line you print. E.g.
// 1 2 F 4 B F 7 8 F B
// 1 F 3 F 5 F B F 9 F 11 F 13 FB 15
// ---------------------------PROPOSED SOLUTION-----------------------------
package main

import (
    "bufio"
    "fmt"
    "log"
    "os"
)

func Fizzbuzz(N int) (output string) {
    var (
        A = N%3 == 0
        B = N%5 == 0
    )
    switch {
    case A && B:
        output = "FB"
    case A:
        output = "F"
    case B:
        output = "B"
    default:
        output = fmt.Sprintf("%v", N)
    }
    return
}

func openFile(name string) *os.File {
    file, err := os.Open(name)
    if err != nil {
        log.Fatalf("failed opening %s for writing: %s", name, err)
    }
    return file
}

func Readln(r *bufio.Reader) {
    line, prefix, err := r.ReadLine()
    if err != nil {
        log.Fatalf("failed reading a line: %v", err)
    }
    if prefix {
        log.Printf("Line is too big for buffer, only first %d bytes returned", len(line))
    }
}

func WriteString(w *bufio.Writer) {
    if fizzbuzz, err := w.WriteString(*Fizzbuzz); err != nil {
        log.Fatalf("failed writing string: %s", err)
    } else {
        log.Printf("Wrote string in %d bytes", fizzbuzz)
    }
}

func main() {
    file := openFile(os.Args[1])
    defer file.Close()
    fi := bufio.NewReader(file)
    Readln(fi)
    fo := bufio.NewWriter(file)
    defer fo.Flush()
    WriteString(fo)
}
Go-Playground
* as a unary operator is used to dereference (or "indirect") a pointer. Fizzbuzz is a function, not a pointer. That is why the compiler says:
Invalid indirect of type func (int) string
What you really want to do is call the function. Keep in mind that Fizzbuzz takes an int, so it needs an argument; below, i stands for whatever number you are currently formatting.
So the line:
if fizzbuzz, err := w.WriteString(*Fizzbuzz); err != nil {
should be:
if fizzbuzz, err := w.WriteString(Fizzbuzz(i)); err != nil {
It is not very idiomatic to call the first return value of WriteString something like fizzbuzz. Normally we name it n:
if n, err := w.WriteString(Fizzbuzz(i)); err != nil {
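For the broader question of how to wire this up, here is a rough sketch. It is not the original code's structure: this version gives Fizzbuzz the divisors read from each input line as parameters and prints the series to stdout instead of writing back into the input file.
package main

import (
    "bufio"
    "fmt"
    "log"
    "os"
    "strings"
)

// fizzbuzz takes the two divisors from the input line instead of hard-coding 3 and 5.
func fizzbuzz(a, b, n int) string {
    switch {
    case n%a == 0 && n%b == 0:
        return "FB"
    case n%a == 0:
        return "F"
    case n%b == 0:
        return "B"
    }
    return fmt.Sprintf("%d", n)
}

func main() {
    file, err := os.Open(os.Args[1])
    if err != nil {
        log.Fatalf("failed opening %s: %s", os.Args[1], err)
    }
    defer file.Close()

    scanner := bufio.NewScanner(file)
    for scanner.Scan() {
        var a, b, n int
        // Each line holds "A B N", space delimited.
        if _, err := fmt.Sscan(scanner.Text(), &a, &b, &n); err != nil {
            log.Fatalf("bad input line %q: %s", scanner.Text(), err)
        }
        parts := make([]string, 0, n)
        for i := 1; i <= n; i++ {
            parts = append(parts, fizzbuzz(a, b, i))
        }
        // Join avoids a trailing space at the end of the line.
        fmt.Println(strings.Join(parts, " "))
    }
    if err := scanner.Err(); err != nil {
        log.Fatal(err)
    }
}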

Convert []string to []byte

I am looking to convert a string array to a byte array in Go so I can write it down to disk. What is an optimal solution to encode and decode a string array ([]string) to a byte array ([]byte)?
I was thinking of iterating the string array twice, the first time to get the actual size needed for the byte array and then a second time to write the length and the actual string ([]byte(str)) for each element.
The solution must be able to convert the other way as well: from a []byte back to a []string.
Let's ignore the fact that this is Go for a second. The first thing you need is a serialization format to marshal the []string into.
There are many options here. You could build your own or use a library. I am going to assume you don't want to build your own and jump to the serialization formats Go supports.
In all examples, data is the []string and fp is the file you are reading/writing to. Errors are being ignored; check the return values of functions to handle errors.
Gob
Gob is a go only binary format. It should be relatively space efficient as the number of strings increases.
enc := gob.NewEncoder(fp)
enc.Encode(data)
Reading is also simple
var data []string
dec := gob.NewDecoder(fp)
dec.Decode(&data)
Gob is simple and to the point. However, the format is only readable with other Go code.
JSON
Next is JSON, a format used just about everywhere. This format is just as easy to use.
enc := json.NewEncoder(fp)
enc.Encode(data)
And for reading:
var data []string
dec := json.NewDecoder(fp)
dec.Decode(&data)
XML
XML is another common format. However, it has pretty high overhead and is not as easy to use. While you could just do the same as you did for gob and json, proper XML requires a root tag. In this case, we are using the root tag "Strings" and each string is wrapped in an "S" tag.
type Strings struct {
    S []string
}

enc := xml.NewEncoder(fp)
enc.Encode(Strings{data})

var x Strings
dec := xml.NewDecoder(fp)
dec.Decode(&x)
data := x.S
CSV
CSV is different from the others. You have two options: use one record with n fields, or n records with one field each. The following example uses n records. It would be boring if I used one record; it would look too much like the others. CSV can ONLY hold strings.
enc := csv.NewWriter(fp)
for _, v := range data {
    enc.Write([]string{v})
}
enc.Flush()
To read:
var err error
var data []string
dec := csv.NewReader(fp)
for err == nil { // reading ends when an error is reached (perhaps io.EOF)
    var s []string
    s, err = dec.Read()
    if len(s) > 0 {
        data = append(data, s[0])
    }
}
Which format you use is a matter of preference. There are many other possible encodings that I have not mentioned. For example, there is an external library called bencode. I don't personally like bencode, but it works. It is the same encoding used by bittorrent metadata files.
If you want to make your own encoding, encoding/binary is a good place to start. That would allow you to make the most compact file possible, but I hardly think it is worth the effort.
The gob package will do this for you http://godoc.org/encoding/gob
Example to play with http://play.golang.org/p/e0FEZm-qiS
The same source code is below.
package main

import (
    "bytes"
    "encoding/gob"
    "fmt"
)

func main() {
    // store to byte array
    strs := []string{"foo", "bar"}
    buf := &bytes.Buffer{}
    gob.NewEncoder(buf).Encode(strs)
    bs := buf.Bytes()
    fmt.Printf("%q", bs)

    // Decode it back
    strs2 := []string{}
    gob.NewDecoder(buf).Decode(&strs2)
    fmt.Printf("%v", strs2)
}
To convert []string to []byte:
var str = []string{"str1", "str2"}
var x = []byte{}
for i := 0; i < len(str); i++ {
    b := []byte(str[i])
    for j := 0; j < len(b); j++ {
        x = append(x, b[j])
    }
}
To convert []byte to a string:
str := ""
var x = []byte{'c', 'a', 't'}
for i := 0; i < len(x); i++ {
    str += string(x[i])
}
To illustrate the problem (converting []string to []byte and then []byte back to []string), here's a simple solution:
package main

import (
    "encoding/binary"
    "fmt"
)

const maxInt32 = 1<<(32-1) - 1

func writeLen(b []byte, l int) []byte {
    if 0 > l || l > maxInt32 {
        panic("writeLen: invalid length")
    }
    var lb [4]byte
    binary.BigEndian.PutUint32(lb[:], uint32(l))
    return append(b, lb[:]...)
}

func readLen(b []byte) ([]byte, int) {
    if len(b) < 4 {
        panic("readLen: invalid length")
    }
    l := binary.BigEndian.Uint32(b)
    if l > maxInt32 {
        panic("readLen: invalid length")
    }
    return b[4:], int(l)
}

func Decode(b []byte) []string {
    b, ls := readLen(b)
    s := make([]string, ls)
    for i := range s {
        b, ls = readLen(b)
        s[i] = string(b[:ls])
        b = b[ls:]
    }
    return s
}

func Encode(s []string) []byte {
    var b []byte
    b = writeLen(b, len(s))
    for _, ss := range s {
        b = writeLen(b, len(ss))
        b = append(b, ss...)
    }
    return b
}

func codecEqual(s []string) bool {
    return fmt.Sprint(s) == fmt.Sprint(Decode(Encode(s)))
}

func main() {
    var s []string
    fmt.Println("equal", codecEqual(s))
    s = []string{"", "a", "bc"}
    e := Encode(s)
    d := Decode(e)
    fmt.Println("s", len(s), s)
    fmt.Println("e", len(e), e)
    fmt.Println("d", len(d), d)
    fmt.Println("equal", codecEqual(s))
}
Output:
equal true
s 3 [ a bc]
e 19 [0 0 0 3 0 0 0 0 0 0 0 1 97 0 0 0 2 98 99]
d 3 [ a bc]
equal true
I would suggest to use PutUvarint and Uvarint for storing/retrieving len(s) and using []byte(str) to pass str to some io.Writer. With a string length known from Uvarint, one can buf := make([]byte, n) and pass the buf to some io.Reader.
Prepend the whole thing with length of the string array and repeat the above for all of its items. Reading the whole thing back is again reading first the outer length and repeating n-times the item read.
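A minimal sketch of that varint-based layout (my own illustration, assuming the element count and each string length are written with binary.PutUvarint and read back with binary.Uvarint):
package main

import (
    "encoding/binary"
    "fmt"
)

// Encode writes the slice length, then (length, bytes) for each string.
func Encode(s []string) []byte {
    tmp := make([]byte, binary.MaxVarintLen64)
    var b []byte
    n := binary.PutUvarint(tmp, uint64(len(s)))
    b = append(b, tmp[:n]...)
    for _, ss := range s {
        n = binary.PutUvarint(tmp, uint64(len(ss)))
        b = append(b, tmp[:n]...)
        b = append(b, ss...)
    }
    return b
}

// Decode reverses Encode.
func Decode(b []byte) ([]string, error) {
    count, n := binary.Uvarint(b)
    if n <= 0 {
        return nil, fmt.Errorf("decode: bad count")
    }
    b = b[n:]
    out := make([]string, 0, count)
    for i := uint64(0); i < count; i++ {
        l, n := binary.Uvarint(b)
        if n <= 0 || uint64(len(b[n:])) < l {
            return nil, fmt.Errorf("decode: bad length")
        }
        out = append(out, string(b[n:n+int(l)]))
        b = b[n+int(l):]
    }
    return out, nil
}

func main() {
    b := Encode([]string{"", "a", "bc"})
    fmt.Println(len(b), b)
    s, err := Decode(b)
    fmt.Println(s, err)
}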
You can do something like this (note that this simply concatenates the bytes, so the individual string boundaries are lost):
var lines []string
var ctx = []byte{}
for _, s := range lines {
    ctx = append(ctx, []byte(s)...)
}
It can be done easily using the strings package. First you need to convert the slice of strings to a single string:
func Join(elems []string, sep string) string
You pass the slice of strings and the separator you want between the elements (for example a space or a comma).
Then you can easily convert the string to a slice of bytes with a type conversion.
package main

import (
    "fmt"
    "strings"
)

func main() {
    // Slice of strings
    sliceStr := []string{"a", "b", "c", "d"}
    fmt.Println(sliceStr) // prints [a b c d]

    // Converting a slice of strings to a string
    str := strings.Join(sliceStr, "")
    fmt.Println(str) // prints abcd

    // Converting a string to a slice of bytes
    sliceByte := []byte(str)
    fmt.Println(sliceByte) // prints [97 98 99 100]

    // Converting a slice of bytes to a string
    str2 := string(sliceByte)
    fmt.Println(str2) // prints abcd

    // Converting a string to a slice of strings
    sliceStr2 := strings.Split(str2, "")
    fmt.Println(sliceStr2) // prints [a b c d]
}

Which scan to use to read floats from a string?

This seems almost right but it chokes on the newline.
What's the best way to do this?
package main

import (
    "fmt"
    "strings"
)

func main() {
    var z float64
    var a []float64

    // \n gives an error for Fscanf
    s := "3.25 -12.6 33.7 \n 3.47"
    in := strings.NewReader(s)
    for {
        n, err := fmt.Fscanf(in, "%f", &z)
        fmt.Println("n", n)
        if err != nil {
            break
        }
        a = append(a, z)
    }
    fmt.Println(a)
}
Output:
n 1
n 1
n 1
n 0
[3.25 -12.6 33.7]
Update:
See the answer from @Atom below. I found another way, which is to break if the error is EOF and otherwise just ignore it. It's just a hack, I know, but I control the source.
_, err := fmt.Fscanf(in, "%f", &z)
if err == io.EOF { break }
if err != nil { continue }
If you are parsing floats only, you can use fmt.Fscan(r io.Reader, a ...interface{}) instead of fmt.Fscanf(r io.Reader, format string, a ...interface{}):
var z float64
...
n, err := fmt.Fscan(in, &z)
The difference between fmt.Fscan and fmt.Fscanf is that in the case of fmt.Fscan newlines count as space. The latter function (with a format string) does not treat newlines as spaces and requires newlines in the input to match newlines in the format string.
The functions with a format string give more control over the form of input, such as when you need to scan %5f or %10s. In this case, if the input contains newlines and it implements the interface io.RuneScanner you can use the method ReadRune to peek the next character and optionally unread it with UnreadRune if it isn't a space or a newline.
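Applied to the program in the question, the fmt.Fscan version might look like this sketch:
package main

import (
    "fmt"
    "strings"
)

func main() {
    var z float64
    var a []float64

    // Fscan treats the \n like any other space, so the whole string parses.
    s := "3.25 -12.6 33.7 \n 3.47"
    in := strings.NewReader(s)
    for {
        _, err := fmt.Fscan(in, &z)
        if err != nil {
            break // io.EOF once the input is exhausted
        }
        a = append(a, z)
    }
    fmt.Println(a) // [3.25 -12.6 33.7 3.47]
}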
If your input is just a bunch of lines with floats separated by white space on each line, it might be easier to read one line at a time from the file and run Sscanf on that line (assuming the number of floats on each line is fixed). But here's something that works for your example; there may be a way to make it more efficient.
package main

import (
    "fmt"
    "strings"
)

func main() {
    var z float64
    var a []float64

    // \n gives an error for Fscanf
    s := "3.25 -12.6 33.7 \n 3.47"
    for _, line := range strings.Split(s, "\n") {
        in := strings.NewReader(line)
        for {
            n, err := fmt.Fscanf(in, "%f", &z)
            fmt.Println("n", n)
            if err != nil {
                fmt.Printf("ERROR: %v\n", err)
                break
            }
            a = append(a, z)
        }
    }
    fmt.Println(a)
}
