How to use bufio.ScanWords - go

How do I use bufio.ScanWords and bufio.ScanLines functions to count words and lines?
I tried:
fmt.Println(bufio.ScanWords([]byte("Good day everyone"), false))
Prints:
5 [71 111 111 100] <nil>
Not sure what that means?

To count words:
input := "Spicy jalapeno pastrami ut ham turducken.\n Lorem sed ullamco, leberkas sint short loin strip steak ut shoulder shankle porchetta venison prosciutto turducken swine.\n Deserunt kevin frankfurter tongue aliqua incididunt tri-tip shank nostrud.\n"
scanner := bufio.NewScanner(strings.NewReader(input))
// Set the split function for the scanning operation.
scanner.Split(bufio.ScanWords)
// Count the words.
count := 0
for scanner.Scan() {
count++
}
if err := scanner.Err(); err != nil {
fmt.Fprintln(os.Stderr, "reading input:", err)
}
fmt.Printf("%d\n", count)
To count lines:
input := "Spicy jalapeno pastrami ut ham turducken.\n Lorem sed ullamco, leberkas sint short loin strip steak ut shoulder shankle porchetta venison prosciutto turducken swine.\n Deserunt kevin frankfurter tongue aliqua incididunt tri-tip shank nostrud.\n"
scanner := bufio.NewScanner(strings.NewReader(input))
// Set the split function for the scanning operation.
scanner.Split(bufio.ScanLines)
// Count the lines.
count := 0
for scanner.Scan() {
count++
}
if err := scanner.Err(); err != nil {
fmt.Fprintln(os.Stderr, "reading input:", err)
}
fmt.Printf("%d\n", count)

This is Exercise 7.1 from the book The Go Programming Language.
This extends #repler's solution:
package main
import (
"bufio"
"fmt"
"os"
"strings"
)
type byteCounter int
type wordCounter int
type lineCounter int
func main() {
var c byteCounter
c.Write([]byte("Hello This is a line"))
fmt.Println("Byte Counter ", c)
var w wordCounter
w.Write([]byte("Hello This is a line"))
fmt.Println("Word Counter ", w)
var l lineCounter
l.Write([]byte("Hello \nThis \n is \na line\n.\n.\n"))
fmt.Println("Length ", l)
}
func (c *byteCounter) Write(p []byte) (int, error) {
*c += byteCounter(len(p))
return len(p), nil
}
func (w *wordCounter) Write(p []byte) (int, error) {
count := retCount(p, bufio.ScanWords)
*w += wordCounter(count)
// io.Writer must report the number of bytes consumed, not the word count
return len(p), nil
}
func (l *lineCounter) Write(p []byte) (int, error) {
count := retCount(p, bufio.ScanLines)
*l += lineCounter(count)
// io.Writer must report the number of bytes consumed, not the line count
return len(p), nil
}
func retCount(p []byte, fn bufio.SplitFunc) (count int) {
s := string(p)
scanner := bufio.NewScanner(strings.NewReader(s))
scanner.Split(fn)
count = 0
for scanner.Scan() {
count++
}
if err := scanner.Err(); err != nil {
fmt.Fprintln(os.Stderr, "reading input:", err)
}
return
}

This is Exercise 7.1 from the book The Go Programming Language.
This is my solution:
package main
import (
"bufio"
"fmt"
)
// WordCounter counts words
type WordCounter int
// LineCounter counts lines
type LineCounter int
type scanFunc func(p []byte, EOF bool) (advance int, token []byte, err error)
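func scanBytes(p []byte, fn scanFunc) (cnt int) {
for len(p) > 0 {
advance, token, err := fn(p, true)
// stop on an error or when the split function makes no progress
if err != nil || advance == 0 {
break
}
// a non-nil token counts even when it is empty (e.g. a blank line)
if token != nil {
cnt++
}
p = p[advance:]
}
return cnt
}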
func (c *WordCounter) Write(p []byte) (int, error) {
cnt := scanBytes(p, bufio.ScanWords)
*c += WordCounter(cnt)
// io.Writer must report the number of bytes consumed, not the word count
return len(p), nil
}
func (c WordCounter) String() string {
return fmt.Sprintf("contains %d words", c)
}
func (c *LineCounter) Write(p []byte) (int, error) {
cnt := scanBytes(p, bufio.ScanLines)
*c += LineCounter(cnt)
// io.Writer must report the number of bytes consumed, not the line count
return len(p), nil
}
func (c LineCounter) String() string {
return fmt.Sprintf("contains %d lines", c)
}
func main() {
var c WordCounter
fmt.Println(c)
fmt.Fprintf(&c, "This is an sentence.")
fmt.Println(c)
c = 0
fmt.Fprintf(&c, "This")
fmt.Println(c)
var l LineCounter
fmt.Println(l)
fmt.Fprintf(&l, `This is another
line`)
fmt.Println(l)
l = 0
fmt.Fprintf(&l, "This is another\nline")
fmt.Println(l)
fmt.Fprintf(&l, "This is one line")
fmt.Println(l)
}

bufio.ScanWords and bufio.ScanLines (as well as bufio.ScanBytes and bufio.ScanRunes) are split functions: they provide a bufio.Scanner with the strategy to tokenize its input data – how the process of scanning should split the data. The split function for a bufio.Scanner is bufio.ScanLines by default but can be changed through the method bufio.Scanner.Split.
These split functions are of type SplitFunc:
type SplitFunc func(data []byte, atEOF bool) (advance int, token []byte, err error)
Usually, you won't need to call any of these functions directly; instead, bufio.Scanner will. However, you might need to create your own split function for implementing a custom tokenization strategy. So, let's have a look at its parameters:
data: remaining data not processed yet.
atEOF: whether or not the caller has reached EOF and therefore has no more new data to provide in the next call.
advance: number of bytes the caller must advance the input data for the next call.
token: the token to return to the caller as a result of the splitting performed.
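As a rough illustration of these return values, the sketch below calls bufio.ScanWords once by hand on the string from the question. The helper name firstWord is made up for this example; only bufio.ScanWords comes from the standard library.

```go
package main

import (
	"bufio"
	"fmt"
)

// firstWord is a hypothetical helper: it calls the split function a single
// time and returns how far to advance plus the token converted to a string.
func firstWord(s string) (int, string) {
	advance, token, _ := bufio.ScanWords([]byte(s), false)
	return advance, string(token)
}

func main() {
	advance, word := firstWord("Good day everyone")
	// advance covers the word plus the space that terminated it.
	fmt.Println(advance, word)
}
```

The three values printed in the question are therefore advance, token (as a byte slice), and err. A bufio.Scanner repeats this call for you, dropping the first advance bytes each time.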
To gain further understanding, let's see bufio.ScanBytes implementation:
func ScanBytes(data []byte, atEOF bool) (advance int, token []byte, err error) {
if atEOF && len(data) == 0 {
return 0, nil, nil
}
return 1, data[0:1], nil
}
As long as data isn't empty, it returns a token byte to the caller (data[0:1]) and tells the caller to advance the input data by one byte.
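To show what a custom tokenization strategy might look like, here is a minimal sketch of a split function that yields comma-separated fields. The names scanCommas and splitFields are invented for this example, and the handling of empty fields is just one possible choice.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"strings"
)

// scanCommas is a hypothetical split function that yields the text between commas.
func scanCommas(data []byte, atEOF bool) (advance int, token []byte, err error) {
	if atEOF && len(data) == 0 {
		return 0, nil, nil // nothing left to tokenize
	}
	if i := bytes.IndexByte(data, ','); i >= 0 {
		// Return the field before the comma and skip past the comma itself.
		return i + 1, data[:i], nil
	}
	if atEOF {
		// Final field with no trailing comma.
		return len(data), data, nil
	}
	// No comma yet and more input may follow: ask the scanner for more data.
	return 0, nil, nil
}

// splitFields collects every token that scanCommas produces for s.
func splitFields(s string) []string {
	sc := bufio.NewScanner(strings.NewReader(s))
	sc.Split(scanCommas)
	var out []string
	for sc.Scan() {
		out = append(out, sc.Text())
	}
	return out
}

func main() {
	fmt.Println(splitFields("ham,pastrami,turducken"))
}
```

Returning 0, nil, nil when atEOF is false is how a split function tells the scanner that it needs a larger buffer before it can produce a token.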

Related

Read line of numbers in Go

I have the following input, where on the first line is N - count of numbers, and on the second line N numbers, separated by space:
5
2 1 0 3 4
In Python I can read the numbers without specifying their count (N):
_ = input()
numbers = list(map(int, input().split()))
How can I do the same in Go? Or do I have to know exactly how many numbers there are?
You can iterate through a file line by line using bufio, and the strings package can split a string into a slice. So that gets us something like:
package main
import (
"bufio"
"fmt"
"os"
"strconv"
"strings"
)
func main() {
readFile, err := os.Open("data.txt")
if err != nil {
fmt.Println(err)
return
}
// defer the Close only after Open has succeeded
defer readFile.Close()
fileScanner := bufio.NewScanner(readFile)
fileScanner.Split(bufio.ScanLines)
for fileScanner.Scan() {
// get next line from the file
line := fileScanner.Text()
// split it into whitespace-delimited tokens; strings.Fields
// tolerates repeated spaces, unlike strings.Split(line, " ")
chars := strings.Fields(line)
// create a slice of ints the same length as
// the chars slice
ints := make([]int, len(chars))
for i, s := range chars {
// convert string to int
val, err := strconv.Atoi(s)
if err != nil {
panic(err)
}
// update the corresponding position in the
// ints slice
ints[i] = val
}
fmt.Printf("%v\n", ints)
}
}
Which given your sample data will output:
[5]
[2 1 0 3 4]
Since you know the delimiter and you only have 2 lines, this is also a more compact solution:
package main
import (
"fmt"
"os"
"regexp"
"strconv"
"strings"
)
func main() {
parts, err := readRaw("data.txt")
if err != nil {
panic(err)
}
n, nums, err := toNumbers(parts)
if err != nil {
panic(err)
}
fmt.Printf("%d: %v\n", n, nums)
}
// readRaw reads the file in input and returns the numbers inside as a slice of strings
func readRaw(fn string) ([]string, error) {
b, err := os.ReadFile(fn)
if err != nil {
return nil, err
}
return regexp.MustCompile(`\s+`).Split(strings.TrimSpace(string(b)), -1), nil
}
// toNumbers plays with the input string to return the data as a slice of int
func toNumbers(parts []string) (int, []int, error) {
n, err := strconv.Atoi(parts[0])
if err != nil {
return 0, nil, err
}
nums := make([]int, 0)
for _, p := range parts[1:] {
num, err := strconv.Atoi(p)
if err != nil {
return n, nums, err
}
nums = append(nums, num)
}
return n, nums, nil
}
The output will be:
5: [2 1 0 3 4]
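Closer to the Python snippet, a bufio.Scanner with bufio.ScanWords can read the numbers without using N at all. The sketch below is one way to do it; parseInts is a made-up helper name, and it reads from a string only so that it runs on its own. Passing os.Stdin to bufio.NewScanner instead would read real input.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// parseInts is a hypothetical helper: it discards the first token (the count N)
// and converts every remaining whitespace-separated token to an int.
func parseInts(input string) ([]int, error) {
	sc := bufio.NewScanner(strings.NewReader(input))
	sc.Split(bufio.ScanWords)
	sc.Scan() // discard the count, like `_ = input()` in Python
	var nums []int
	for sc.Scan() {
		v, err := strconv.Atoi(sc.Text())
		if err != nil {
			return nil, err
		}
		nums = append(nums, v)
	}
	return nums, sc.Err()
}

func main() {
	nums, err := parseInts("5\n2 1 0 3 4\n")
	fmt.Println(nums, err)
}
```

Because the slice grows with append, the program never needs to know how many numbers follow.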

A Tour of Go Exercise: rot13Reader, why is the reader called twice?

import (
"io"
"os"
"strings"
"fmt"
)
type rot13Reader struct {
r io.Reader
}
func (rot13 rot13Reader) Read(b []byte) (int, error){
n,err := rot13.r.Read(b)
fmt.Printf("\n%v %v\n",n, err)
const A byte ='A'
const a byte ='a'
for i,x := range b{
if x==0 {
n=i
break
}
switch {
case A<=x && x<a:
tmp := x-A
b[i] = A+((tmp+13)%26)
case a<=x && x<a+26 :
tmp := x-a
b[i] = a+((tmp+13)%26)
}
}
return n, err
}
func main() {
s := strings.NewReader("Lbh penpxrq gur pbqr!")
r := rot13Reader{s}
io.Copy(os.Stdout, &r)
}
The code above outputs
21 <nil>
You cracked the code!
0 EOF
Lbh penpxrq gur pbqr!
I have two questions:
Why is the reader called twice (the second call returning 0 and EOF)?
Why, on the second call, is b back to the original cipher? If it reads 0 characters, shouldn't it still be "You cracked the code!"?
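One plausible reading of that output: io.Copy keeps calling Read until it sees io.EOF, so a final call that returns 0 and EOF is expected. The loop in the question ranges over the whole buffer b rather than b[:n], so on the second call it rotates the stale bytes left over from the first call back into the cipher text, then sets n to the index of the first zero byte and reports 21 bytes again. A minimal sketch of a reader that only touches b[:n] follows; the helpers rot13 and decode are invented for this example.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

type rot13Reader struct {
	r io.Reader
}

// rot13 rotates ASCII letters by 13 places and leaves every other byte alone.
func rot13(c byte) byte {
	switch {
	case c >= 'A' && c <= 'Z':
		return 'A' + (c-'A'+13)%26
	case c >= 'a' && c <= 'z':
		return 'a' + (c-'a'+13)%26
	}
	return c
}

// Read transforms only b[:n], the bytes the wrapped reader actually filled in.
func (rr rot13Reader) Read(b []byte) (int, error) {
	n, err := rr.r.Read(b)
	for i := 0; i < n; i++ {
		b[i] = rot13(b[i])
	}
	return n, err
}

// decode is a hypothetical helper that drains the reader into a string.
func decode(s string) string {
	out, _ := io.ReadAll(rot13Reader{strings.NewReader(s)})
	return string(out)
}

func main() {
	fmt.Println(decode("Lbh penpxrq gur pbqr!"))
}
```

With this version the final Read still returns 0 and EOF, but no bytes are reported, so nothing extra is written.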

Go Test Input and Output from File

I was trying to determine if there's a way to take a given input and expected output from a file for use in go test.
main.go:
package main
import (
"fmt"
"math"
)
func main() {
var n, m, a float64
fmt.Scanln(&n, &m, &a)
a_in_n_ceil := uint64(math.Ceil(n / a))
a_in_m_ceil := uint64(math.Ceil(m / a))
a_in_n_and_m := a_in_n_ceil * a_in_m_ceil
fmt.Println(a_in_n_and_m)
}
examples:
6 6 4
4
Would it be io.readfile or something similar to grab the first line of input from the examples file, and then again for the second line of expected output, in main_test.go? Guidance is appreciated.
Use the os package to read and write files.
To read a file: os.ReadFile(path_to_file)
To write a file: os.WriteFile("output.txt", data_in_byte_array, file_permission)
package main
import (
"fmt"
"math"
"os"
"strconv"
"strings"
)
func check(e error) {
if e != nil {
panic(e)
}
}
func ReadFromFile(path string) []float64 {
dat, err := os.ReadFile(path) // read file contents
check(err)
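// strings.Fields splits on any whitespace, so a trailing newline
// does not get glued to the last number
stringArr := strings.Fields(string(dat))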
var numbers []float64
for _, arg := range stringArr {
if n, err := strconv.ParseFloat(arg, 64); err == nil {
numbers = append(numbers, n)
}
}
return numbers // return file contents in required format (in this case []float64)
}
func WriteFile(data string) error{
err := os.WriteFile("output.txt", []byte(data), 0644)
if err != nil {
return err
}
return nil
}
func main(){
var n, m, a float64
numbers := ReadFromFile("input.txt")
fmt.Println(numbers)
n = numbers[0]
m = numbers[1]
a = numbers[2]
a_in_n_ceil := uint64(math.Ceil(n / a))
a_in_m_ceil := uint64(math.Ceil(m / a))
a_in_n_and_m := a_in_n_ceil * a_in_m_ceil
fmt.Println(a_in_n_and_m) // print to console
err := WriteFile(fmt.Sprint(a_in_n_and_m)) // write output to a file
if err != nil {
fmt.Println("Error in file write : ", err)
}
}
input.txt
6 6 4
output.txt
4
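For use with go test, it may help to move the computation out of main into a function that takes the input as text and returns the output as text, so a test can compare it against the expected file contents. The sketch below assumes that layout; solve and matches are made-up names, and the two strings are inlined here where a real main_test.go would load them with os.ReadFile and report a mismatch with t.Errorf.

```go
package main

import (
	"fmt"
	"math"
	"strings"
)

// solve reads "n m a" from input and returns the answer formatted as text.
func solve(input string) (string, error) {
	var n, m, a float64
	if _, err := fmt.Sscan(input, &n, &m, &a); err != nil {
		return "", err
	}
	tiles := uint64(math.Ceil(n/a)) * uint64(math.Ceil(m/a))
	return fmt.Sprint(tiles), nil
}

// matches reports whether solve(input) equals the expected text,
// ignoring surrounding whitespace such as a trailing newline.
func matches(input, expected string) bool {
	got, err := solve(input)
	return err == nil && got == strings.TrimSpace(expected)
}

func main() {
	// Stand-ins for the contents of the input and expected-output files.
	fmt.Println(matches("6 6 4\n", "4\n"))
}
```

Keeping the file reading in the test and the arithmetic in solve means the same function serves both the program and its tests.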

How to split buf into two slices in one line of code?

I want to split buf into two slices: buf[:n] and buf[n:].
n may be larger than len(buf).
Can this be done in a single line of code? Is there an elegant way?
This is not elegant, nor practical, but the evaluation is on one line...
package main
import (
"fmt"
)
func main() {
buf := "abcdefg"
n := 8
// fugly one-liner
a, b, err := func() (string, string, error) {if n > len(buf) {return "", "", fmt.Errorf("out of bounds")} else {return buf[:n], buf[n:], nil}}()
if err != nil {
fmt.Println(err.Error())
} else {
fmt.Print(a + ":" + b)
}
}
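If clamping n is acceptable instead of returning an error, a small helper keeps the call site to one line. This is only a sketch under that assumption; splitAt is a made-up name.

```go
package main

import "fmt"

// splitAt is a hypothetical helper: it clamps n to the range [0, len(buf)]
// and returns buf[:n] and buf[n:].
func splitAt(buf string, n int) (string, string) {
	if n > len(buf) {
		n = len(buf)
	}
	if n < 0 {
		n = 0
	}
	return buf[:n], buf[n:]
}

func main() {
	// The call site is a single line even when n exceeds len(buf).
	a, b := splitAt("abcdefg", 8)
	fmt.Printf("%q %q\n", a, b)
}
```

On Go 1.21 or later, the built-in min should allow the same clamping inline, for example buf[:min(n, len(buf))], at the cost of repeating the expression for the second slice.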

Looking for Go equivalent of scanf

I'm looking for the Go equivalent of scanf().
I tried with following code:
1 package main
2
3 import (
4 "scanner"
5 "os"
6 "fmt"
7 )
8
9 func main() {
10 var s scanner.Scanner
11 s.Init(os.Stdin)
12 s.Mode = scanner.ScanInts
13 tok := s.Scan()
14 for tok != scanner.EOF {
15 fmt.Printf("%d ", tok)
16 tok = s.Scan()
17 }
18 fmt.Println()
19 }
I ran it with text input containing a line of integers, but it always outputs -3 -3 ...
How do I scan a line composed of a string and some integers? By changing the mode whenever a new data type is encountered?
The package documentation says:
Package scanner: A general-purpose scanner for UTF-8 encoded text.
But it seems that the scanner is not for general use.
Updated code:
func main() {
n := scanf()
fmt.Println(n)
fmt.Println(len(n))
}
func scanf() []int {
nums := new(vector.IntVector)
reader := bufio.NewReader(os.Stdin)
str, err := reader.ReadString('\n')
for err != os.EOF {
fields := strings.Fields(str)
for _, f := range fields {
i, _ := strconv.Atoi(f)
nums.Push(i)
}
str, err = reader.ReadString('\n')
}
r := make([]int, nums.Len())
for i := 0; i < nums.Len(); i++ {
r[i] = nums.At(i)
}
return r
}
Improved version:
package main
import (
"bufio"
"os"
"io"
"fmt"
"strings"
"strconv"
"container/vector"
)
func main() {
n := fscanf(os.Stdin)
fmt.Println(len(n), n)
}
func fscanf(in io.Reader) []int {
var nums vector.IntVector
reader := bufio.NewReader(in)
str, err := reader.ReadString('\n')
for err != os.EOF {
fields := strings.Fields(str)
for _, f := range fields {
if i, err := strconv.Atoi(f); err == nil {
nums.Push(i)
}
}
str, err = reader.ReadString('\n')
}
return nums
}
Your updated code was much easier to compile without the line numbers, but it was missing the package and import statements.
Looking at your code, I noticed a few things. Here's my revised version of your code.
package main
import (
"bufio"
"fmt"
"io"
"os"
"strconv"
"strings"
"container/vector"
)
func main() {
n := scanf(os.Stdin)
fmt.Println()
fmt.Println(len(n), n)
}
func scanf(in io.Reader) []int {
var nums vector.IntVector
rd := bufio.NewReader(in)
str, err := rd.ReadString('\n')
for err != os.EOF {
fields := strings.Fields(str)
for _, f := range fields {
if i, err := strconv.Atoi(f); err == nil {
nums.Push(i)
}
}
str, err = rd.ReadString('\n')
}
return nums
}
I might want to use any input file for scanf(), not just Stdin; scanf() takes an io.Reader as a parameter.
You wrote nums := new(vector.IntVector), where type IntVector []int. That declares nums as a pointer, has new() allocate a zero-valued (nil) slice, and stores the address of that slice in nums. I wrote var nums vector.IntVector, which skips the extra allocation and indirection by declaring nums directly as a slice initialized to its zero value.
You didn't check the err value for strconv.Atoi(), which meant invalid input was converted to a zero value; I skip it.
To copy from the vector to a new slice and return the slice, you wrote:
r := make([]int, nums.Len())
for i := 0; i < nums.Len(); i++ {
r[i] = nums.At(i)
}
return r
First, I simply replaced that with an equivalent, the IntVector.Data() method: return nums.Data(). Then, I took advantage of the fact that type IntVector []int and avoided the allocation and copy by replacing that by: return nums.
Although it can be used for other things, the scanner package is designed to scan Go program text. Ints (-123), Chars('c'), Strings("str"), etc. are Go language token types.
package main
import (
"fmt"
"os"
"scanner"
"strconv"
)
func main() {
var s scanner.Scanner
s.Init(os.Stdin)
s.Error = func(s *scanner.Scanner, msg string) { fmt.Println("scan error", msg) }
s.Mode = scanner.ScanInts | scanner.ScanStrings | scanner.ScanRawStrings
for tok := s.Scan(); tok != scanner.EOF; tok = s.Scan() {
txt := s.TokenText()
fmt.Print("token:", tok, "text:", txt)
switch tok {
case scanner.Int:
si, err := strconv.Atoi64(txt)
if err == nil {
fmt.Print(" integer: ", si)
}
case scanner.String, scanner.RawString:
fmt.Print(" string: ", txt)
default:
if tok >= 0 {
fmt.Print(" unicode: ", "rune = ", tok)
} else {
fmt.Print(" ERROR")
}
}
fmt.Println()
}
}
This example reads one line at a time and returns the entire line as a string; you can then parse specific values out of it.
package main
import (
"fmt"
"bufio"
"os"
"strings"
)
func main() {
value := Input("Please enter a value: ")
trimmed := strings.TrimSpace(value)
fmt.Printf("Hello %s!\n", trimmed)
}
func Input(str string) string {
fmt.Print(str) // the built-in print writes to stderr; fmt.Print writes to stdout
reader := bufio.NewReader(os.Stdin)
input, _ := reader.ReadString('\n')
return input
}
In a comment to one of my answers, you said:
From the Language Specification: "When memory is allocated to store a value, either through a declaration or make() or new() call, and no explicit initialization is provided, the memory is given a default initialization".
Then what's the point of new()?
If we run:
package main
import ("fmt")
func main() {
var i int
var j *int
fmt.Println("i (a value) = ", i, "; j (a pointer) = ", j)
j = new(int)
fmt.Println("i (a value) = ", i, "; j (a pointer) = ", j, "; *j (a value) = ", *j)
}
The declaration var i int allocates memory to store an integer value and initializes the value to zero. The declaration var j *int allocates memory to store a pointer to an integer value and initializes the pointer to zero (a nil pointer); no memory is allocated to store an integer value. We see program output similar to:
i (a value) = 0 ; j (a pointer) = <nil>
The built-in function new takes a type T and returns a value of type *T. The memory is initialized to zero values. The statement j = new(int) allocates memory to store an integer value and initializes the value to zero, then it stores a pointer to this integer value in j. We see program output similar to:
i (a value) = 0 ; j (a pointer) = 0x7fcf913a90f0 ; *j (a value) = 0
The latest release of Go (2010-05-27) added two functions to the fmt package: Scan() and Scanln(). Unlike scanf in C, they don't take a format string; they check the types of their arguments instead.
package main
import (
"fmt"
"os"
"container/vector"
)
func main() {
numbers := new(vector.IntVector)
var number int
n, err := fmt.Scan(os.Stdin, &number)
for n == 1 && err == nil {
numbers.Push(number)
n, err = fmt.Scan(os.Stdin, &number)
}
fmt.Printf("%v\n", numbers.Data())
}
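In current Go releases the fmt package also has reader-taking and format-taking variants, such as fmt.Fscan and fmt.Sscanf. The sketch below shows both under that assumption: scanInts and parseRecord are made-up helper names, and the input comes from strings so the example runs without a terminal. Passing os.Stdin to fmt.Fscan would read from standard input instead.

```go
package main

import (
	"fmt"
	"strings"
)

// scanInts is a hypothetical helper: it reads whitespace-separated integers
// (newlines count as whitespace) until the first scan error or end of input.
func scanInts(input string) []int {
	r := strings.NewReader(input)
	var nums []int
	var v int
	for {
		if _, err := fmt.Fscan(r, &v); err != nil {
			break
		}
		nums = append(nums, v)
	}
	return nums
}

// parseRecord is a hypothetical helper that parses a line made of a word
// followed by two integers, similar to sscanf with "%s %d %d" in C.
func parseRecord(line string) (name string, x, y int, err error) {
	_, err = fmt.Sscanf(line, "%s %d %d", &name, &x, &y)
	return name, x, y, err
}

func main() {
	fmt.Println(scanInts("1 2 3\n4 5\n"))
	fmt.Println(parseRecord("point 10 20"))
}
```

A plain []int with append takes the place of the vector type used in the older examples.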

Resources