Read file and display its contents in Go - go

I'm new to Go. I want to write a simple program that reads a filename from the user and displays its contents back to the user. This is what I have so far:
fname := "D:\myfolder\file.txt"
f, err := os.Open(fname)
if err != nil {
fmt.Println(err)
}
var buff []byte
defer f.Close()
buff = make([]byte, 1024)
for {
n, err := f.Read(buff)
if n > 0 {
fmt.Println(string(buff[:n]))
}
if err == io.EOF {
break
}
}
but I get error:
The filename, directory name, or volume label syntax is incorrect.

I suspect the backslashes in fname are the reason. Try double backslashes (\\).

Put the filename in backquotes. This makes it a raw string literal. With raw string literals, no escape sequences such as \f will be processed.
fname := `D:\myfolder\file.txt`

You can also use Unix-style '/' path separators instead. That does the job.
fname := "D:/myfolder/file.txt"

Congrats on learning Go! Though the question was about a specific error in the example, let's break it down line by line and learn a bit about some of the other issues that may be encountered:
fname := "D:\myfolder\file.txt"
Like C and many other languages, Go uses the backslash character for an "escape sequence". That is, certain characters that start with a backslash get translated into other characters that would be hard to see otherwise (eg. \t becomes a tab character, which may otherwise be indistinguishable from a space).
The fix is to use a raw string literal (use backticks instead of quotes) where no escape sequences are processed:
fname := `D:\myfolder\file.txt`
This fixes the initial error you were seeing, because the string no longer contains the \m and \f sequences (\m is not a valid escape sequence at all, and \f is the escape for a form feed character). A full list of escape sequences and more explanation can be found in the String Literals section of the Go spec.
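As a quick check of the two spellings discussed above, here is a minimal sketch comparing an interpreted literal with doubled backslashes against a raw literal. The names interpreted, raw and samePath are made up for this illustration.

```go
package main

import "fmt"

const (
	// Interpreted literal: each backslash must be doubled.
	interpreted = "D:\\myfolder\\file.txt"
	// Raw literal: backslashes are taken literally.
	raw = `D:\myfolder\file.txt`
)

// samePath reports whether both spellings produce the same string.
func samePath() bool {
	return interpreted == raw
}

func main() {
	fmt.Println(samePath())
	// \t is a single tab character in an interpreted literal,
	// but two separate characters in a raw literal.
	fmt.Println(len("a\tb"), len(`a\tb`))
}
```

Either spelling should work for the path; the raw form is usually easier to read for Windows paths.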
f, err := os.Open(fname)
if err != nil {
fmt.Println(err)
}
The first line of this chunk is good, but it can be improved. If an error occurs, there is no reason for our program to continue executing since we couldn't even open the file, so we should both print it (probably to standard error) and exit, preferably with a non-zero exit status to indicate that something bad happened. Also, as a matter of good habit we probably want to close the file at the end of the function if opening it was successful. Putting it right below the Open call is conventional and makes it easier when someone else is reading your code. I would rewrite this as:
f, err := os.Open(fname)
if err != nil {
fmt.Fprintln(os.Stderr, err)
os.Exit(2)
// It is also common to replace these two lines with a call to log.Fatal
}
defer f.Close()
The last chunk is a bit complicated, and we could rewrite it in multiple ways. Right now it looks like this:
var buff []byte
defer f.Close()
buff = make([]byte, 1024)
for {
n, err := f.Read(buff)
if n > 0 {
fmt.Println(string(buff[:n]))
}
if err == io.EOF {
break
}
}
But we don't need to define our own buffering, because the standard library provides us with the bufio and bytes packages which can do this for us. In this case though, we probably don't need them because we can also replace the iteration with a call to io.Copy which does its own internal buffering. We could also use one of the other copy variants such as io.CopyBuffer if we wanted to use our own buffer. It's also missing some error handling, so we'll add that. Now this entire chunk becomes:
if _, err := io.Copy(os.Stdout, f); err != nil {
fmt.Fprintf(os.Stderr, "Error reading from file: %v\n", err)
os.Exit(2)
}
// We're done!
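Putting the pieces together, one possible shape for the whole program is sketched below. The helper names printFile and run are hypothetical, and the temporary file exists only so the sketch runs without any setup; in a real program the name would come from the user. Returning errors up to main, rather than calling os.Exit next to the Open call, is a design choice made here so that deferred calls get a chance to run, since os.Exit does not run them.

```go
package main

import (
	"fmt"
	"io"
	"os"
)

// printFile copies the named file to standard output.
// It returns the error instead of exiting, so the deferred Close
// runs before the caller decides how to terminate.
func printFile(name string) error {
	f, err := os.Open(name)
	if err != nil {
		return err
	}
	defer f.Close()

	if _, err := io.Copy(os.Stdout, f); err != nil {
		return fmt.Errorf("reading %s: %w", name, err)
	}
	return nil
}

// run writes a small throwaway file and prints it back (demo input only).
func run() error {
	tmp, err := os.CreateTemp("", "demo-*.txt")
	if err != nil {
		return err
	}
	defer os.Remove(tmp.Name())

	if _, err := fmt.Fprintln(tmp, "hello from a file"); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	return printFile(tmp.Name())
}

func main() {
	if err := run(); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```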

Related

I want to read a file by taking the file name as a user input

I have a JSON file called example.json. I need to read this file by taking its name as user input. I have tried with the below code.
func main() {
reader := bufio.NewReader(os.Stdin)
fmt.Print("Enter text: ")
text,_ := reader.ReadString('\n')
fmt.Println(text)
file,_ := ioutil.ReadFile(text)
// os.Exit()
fmt.Print(file)
}
But it's not working properly. I want to take the JSON file name as a command line input and read the JSON file.
I checked the question below, but it doesn't match my case:
reader.ReadString does not strip out the first occurrence of delim
First things first. You can figure out why your code doesn't work if you simply deal with the errors properly. You're ignoring the error returned when you call ioutil.ReadFile(text).
Just add proper error handling and you will have a good clue why it isn't working:
file, err := ioutil.ReadFile(text)
if err != nil {
log.Fatal(err)
}
: no such file or directory
The reason why your program does not work is likely that there's a line break character in your text variable.
From Go Documentation
"ReadString reads until the first occurrence of delim in the input,
returning a string containing the data up to and including the
delimiter."
Remove the line break character from the variable that holds your user's input and it should work, assuming the input actually matches an existing file, including its correct path.
func main() {
reader := bufio.NewReader(os.Stdin)
fmt.Print("Enter text: ")
text, _ := reader.ReadString('\n')
text = strings.TrimRight(text, "\r\n") // also strips the \r that Windows adds before \n
//Add the file path
//or else the user will be required to enter the entire file location
f := "path_to_the_file" + text
file, err := ioutil.ReadFile(f)
if err != nil {
log.Fatal(err)
}
fmt.Println(string(file))
}
In this case, I think you are better off with a simple scan:
var namefile string
fmt.Scan(&namefile)
content, err := ioutil.ReadFile(namefile)
if err != nil {
log.Fatal(err)
}
That way you don't read a \n and then have to remove it.
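For reference, a minimal sketch of the ReadString approach with the line ending removed. The helper name cleanName is made up for this example, and os.ReadFile is used as the current replacement for ioutil.ReadFile (available since Go 1.16). Unlike fmt.Scan, this keeps file names that contain spaces intact.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

// cleanName removes the line ending that ReadString('\n') leaves on the input.
// Trimming "\r\n" covers both Unix (\n) and Windows (\r\n) endings.
func cleanName(line string) string {
	return strings.TrimRight(line, "\r\n")
}

func main() {
	reader := bufio.NewReader(os.Stdin)
	fmt.Print("Enter file name: ")
	// On EOF, line still holds whatever was read before the input ended.
	line, _ := reader.ReadString('\n')
	name := cleanName(line)
	if name == "" {
		fmt.Println("no file name given")
		return
	}

	data, err := os.ReadFile(name)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Print(string(data))
}
```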

io.Reader and Line Break issue involving a CSV file

I have an application which deals with CSV's being delivered via RabbitMQ from many different upstream applications - typically 5000-15,000 rows per file. Most of the time it works great. However a couple of these upstream applications are old (12-15 years) and the people who wrote them are long gone.
I'm unable to read CSV files from these older applications due to the line breaks. I'm finding this a bit weird, as the line breaks seem to map to UTF-8 Carriage Returns (http://www.fileformat.info/info/unicode/char/000d/index.htm). Typically the app reads in only the headers from those older files and nothing else.
If I open one of these files in a text editor and save it with UTF-8 encoding, overwriting the existing file, then it works with no issues at all.
Things I've tried I expected to work:
-Using a Reader:
ba := make([]byte, 262144000)
if _, err := file.Read(ba); err != nil {
return nil, err
}
ba = bytes.Trim(ba, "\x00")
bb := bytes.NewBuffer(ba)
reader := csv.NewReader(bb)
records, err := reader.ReadAll()
if err != nil {
return nil, err
}
-Using the Scanner to read line by line (get a bufio.Scanner: token too long)
scanner := bufio.NewScanner(file)
var bb bytes.Buffer
for scanner.Scan() {
bb.WriteString(fmt.Sprintf("%s\n", scanner.Text()))
}
// check for errors
if err = scanner.Err(); err != nil {
return nil, err
}
reader := csv.NewReader(&bb)
records, err := reader.ReadAll()
if err != nil {
return nil, err
}
Things I tried that I expected not to work (and they didn't):
Writing file contents to a new file (.txt) and reading the file back in (including running dos2unix against the created txt file)
Reading file into a standard string (hoping Go's UTF-8 encoding would magically kick in which of course it doesn't)
Reading file to Rune slice, then transforming to a string via byte slice
I'm aware of the https://godoc.org/golang.org/x/text/transform package but not too sure of a viable approach - it looks like the src encoding needs to be known to transform.
Am I stupidly overlooking something? Are there any suggestions how to transform these files into UTF-8 or update the line endings without knowing the file encoding whilst keeping the application working for all the other valid CSV files being delivered? Are there any options that don't involve me going byte to byte and doing a bytes.Replace I've not considered?
I'm hoping there's something really obvious I've overlooked.
Apologies - I can't share the CSV files for obvious reasons.
For anyone who's stumbled on this and wants an answer that doesn't involve strings.Replace, here's a method that wraps an io.Reader to replace solo carriage returns. It could probably be more efficient, but works better with huge files than a strings.Replace-based solution.
https://gist.github.com/b5/78edaae9e6a4248ea06b45d089c277d6
// ReplaceSoloCarriageReturns wraps an io.Reader. On every call of Read it looks
// for instances of lonely \r, replacing them with \r\n before returning to the end customer.
// Lots of files in the wild come without "proper" line breaks, which irritates Go's
// standard csv package. This fixes that by wrapping the reader passed to csv.NewReader:
// rdr := csv.NewReader(ReplaceSoloCarriageReturns(r))
//
func ReplaceSoloCarriageReturns(data io.Reader) io.Reader {
return crlfReplaceReader{
rdr: bufio.NewReader(data),
}
}
// crlfReplaceReader wraps a reader
type crlfReplaceReader struct {
rdr *bufio.Reader
}
// Read implements io.Reader for crlfReplaceReader
func (c crlfReplaceReader) Read(p []byte) (n int, err error) {
if len(p) == 0 {
return
}
for {
if n == len(p) {
return
}
p[n], err = c.rdr.ReadByte()
if err != nil {
return
}
// any time we encounter \r & still have space, check to see if \n follows
// if next char is not \n, add it in manually
// (a lone \r that lands in the last slot of p has no room for the extra \n
// and is passed through unchanged)
if p[n] == '\r' && n+1 < len(p) {
if pk, err := c.rdr.Peek(1); (err == nil && pk[0] != '\n') || err == io.EOF {
n++
p[n] = '\n'
}
}
n++
}
}
Have you tried to replace all line endings from \r\n or \r to \n ?
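A variation on the wrapper idea, as a rough sketch rather than a drop-in replacement: instead of inserting a \n after a lone \r, it overwrites the lone \r with \n in place, so the output is never longer than the input and the buffer-space question does not arise. The names crToLF, newCRToLF and normalize are invented for this example, and it assumes the only problem with the files is their line endings.

```go
package main

import (
	"bufio"
	"encoding/csv"
	"fmt"
	"io"
	"strings"
)

// crToLF wraps a reader and rewrites every \r that is not followed by \n as \n.
// Existing \r\n pairs are left untouched.
type crToLF struct {
	rdr *bufio.Reader
}

// newCRToLF is a hypothetical constructor for the wrapper.
func newCRToLF(r io.Reader) io.Reader {
	return &crToLF{rdr: bufio.NewReader(r)}
}

// Read implements io.Reader.
func (c *crToLF) Read(p []byte) (int, error) {
	n := 0
	for n < len(p) {
		b, err := c.rdr.ReadByte()
		if err != nil {
			return n, err
		}
		if b == '\r' {
			next, perr := c.rdr.Peek(1)
			if (perr == nil && next[0] != '\n') || perr == io.EOF {
				b = '\n' // lone \r, or \r at the very end of the input
			}
		}
		p[n] = b
		n++
	}
	return n, nil
}

// normalize runs a string through the wrapper (a strings.Reader only fails with io.EOF).
func normalize(s string) string {
	out, _ := io.ReadAll(newCRToLF(strings.NewReader(s)))
	return string(out)
}

func main() {
	// CR-only line endings, as an old exporter might produce.
	input := "id,name\r1,ann\r2,bob"
	records, err := csv.NewReader(newCRToLF(strings.NewReader(input))).ReadAll()
	if err != nil {
		fmt.Println("csv error:", err)
		return
	}
	fmt.Println(records) // prints [[id name] [1 ann] [2 bob]]
}
```

Because the wrapper works byte by byte on the stream, it does not need to know the file's encoding, as long as the encoding is ASCII-compatible (UTF-8, Latin-1 and similar).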

Read lines from text file in go, with bufio.reader

for {
v, err = nextNum(reader, ' ')
if err != nil {
break
}
w, err = nextNum(reader, ' ')
if err != nil {
break
}
cost, err = nextNum(reader, '\n')
if err != nil {
break
}
fmt.Println(v, w, cost)
}
My text file consists of three columns and n rows. The first time nextNum is called, the number in the first row and first column is returned, the next time the number in the first row and second column, and so on. My problem is that when I get to the end and call nextNum for the last time, I receive an EOF error and the last line never gets printed, because break is called first. Any suggestions on how to solve the problem?
Cheers
I guess there is no newline after the last row in your file and it simply ends with EOF. Is this correct? As a result, the very last column is not being parsed correctly, as it doesn't end with the expected character (\n).
You didn't show us exactly how you're using bufio.Reader, but either way you will need to account for a missing newline at the end of the file (it's up to you whether to treat it as an error or not). Methods like bufio.Reader.ReadString with a \n delimiter won't treat EOF as the end of a line automatically, but will return the valid content along with EOF (i.e. you can get both data and an error from the same call; note that this is different behaviour from bufio.Reader.Read).
Saying this, it might be beneficial for you to use the csv package instead. It will solve the EOF problem and you could also benefit from some nicer error messages on unexpected number of columns. The additional features like comments or quotes might be good or bad for your purposes.
Example:
// No line break at the end, pure EOF (still works)
data := "one 1\ntwo 2\nthree 3\nfour 4"
// You can wrap your file reader with bufio.Reader here
cr := csv.NewReader(bytes.NewReader([]byte(data)))
cr.Comma = ' '
cr.FieldsPerRecord = 2
var err error
for err == nil {
var columns []string
if columns, err = cr.Read(); err == nil {
fmt.Println(columns)
// err = processRow(columns)
}
}
if err != io.EOF {
// Parse error
panic(err)
}
From the bufio docs:
At EOF, the count will be zero and err will be io.EOF
So you can simply test for that, for example by changing your if err != nil to if err != nil && err != io.EOF
or
if err == io.EOF {
fmt.Println(v, w, cost)
break
}
if err != nil {
break
}
fmt.Println(v, w, cost)
Though you really should do something with the error and not just ignore it.
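To illustrate the point about ReadString returning data together with io.EOF, here is a small sketch that keeps a final line even when the file does not end in a newline. The function names readLines and countLines are hypothetical, and the question's nextNum is not reproduced since its source was not shown.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// readLines returns every line of r, including a last line with no trailing newline.
func readLines(r io.Reader) ([]string, error) {
	br := bufio.NewReader(r)
	var lines []string
	for {
		line, err := br.ReadString('\n')
		// Handle the data first: at EOF, line may still hold the final row.
		if line != "" {
			lines = append(lines, strings.TrimRight(line, "\r\n"))
		}
		if err == io.EOF {
			return lines, nil
		}
		if err != nil {
			return lines, err
		}
	}
}

// countLines is a small helper for the demo below.
func countLines(s string) int {
	lines, _ := readLines(strings.NewReader(s))
	return len(lines)
}

func main() {
	// No newline after the last row.
	lines, err := readLines(strings.NewReader("1 2 10\n2 3 20\n3 4 30"))
	if err != nil {
		fmt.Println("read error:", err)
		return
	}
	for _, l := range lines {
		fmt.Println(strings.Fields(l))
	}
}
```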

How to read a text file line-by-line in Go when some lines are long enough to cause "bufio.Scanner: token too long" errors?

I have a text file where each line represents a JSON object. I am processing this file in Go with a simple for loop like this:
scanner := bufio.NewScanner(file)
for scanner.Scan() {
jsonBytes := scanner.Bytes()
var jsonObject interface{}
if err := json.Unmarshal(jsonBytes, &jsonObject); err != nil {
log.Fatal(err)
}
// do stuff with "jsonObject"...
}
if err := scanner.Err(); err != nil {
log.Fatal(err)
}
When this code reaches a line with a particularly large JSON string (~67kb), I get the error message, "bufio.Scanner: token too long".
Is there an easy way to increase the max line size readable by NewScanner? Or is there another approach you can take altogether, when needing to read lines that are too large for NewScanner but are known to not be of unsafe size generally?
You can also do:
scanner := bufio.NewScanner(file)
buf := make([]byte, 0, 64*1024)
scanner.Buffer(buf, 1024*1024)
for scanner.Scan() {
// do your stuff
}
The second argument to scanner.Buffer() sets the maximum token size. In the above example you will be able to scan the file as long as none of the lines is larger than 1MB.
From the package docs:
Programs that need more control over error handling or large tokens,
or must run sequential scans on a reader, should use bufio.Reader
instead.
It looks like the preferred solution is bufio.Reader.ReadLine.
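If you go the bufio.Reader route, note that the ReadLine documentation itself describes it as a low-level primitive and points most callers to ReadBytes('\n') or ReadString('\n'). A rough sketch with ReadBytes follows; the helper names longestLine, demo and defaultScannerFails are invented for the illustration, and the 200 KB line is just an arbitrary size above the Scanner's default 64 KB limit.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"io"
	"strings"
)

// longestLine returns the length of the longest line in r, excluding line endings.
// ReadBytes grows its result as needed, so there is no maximum line size.
func longestLine(r io.Reader) (int, error) {
	br := bufio.NewReader(r)
	longest := 0
	for {
		line, err := br.ReadBytes('\n')
		if n := len(bytes.TrimRight(line, "\r\n")); n > longest {
			longest = n
		}
		if err == io.EOF {
			return longest, nil
		}
		if err != nil {
			return longest, err
		}
	}
}

// demo builds an input whose second line is n bytes long and measures it.
func demo(n int) int {
	input := "short\n" + strings.Repeat("x", n) + "\n"
	longest, _ := longestLine(strings.NewReader(input))
	return longest
}

// defaultScannerFails reports whether a Scanner with its default buffer
// gives up on a line of n bytes.
func defaultScannerFails(n int) bool {
	sc := bufio.NewScanner(strings.NewReader(strings.Repeat("x", n) + "\n"))
	for sc.Scan() {
	}
	return sc.Err() == bufio.ErrTooLong
}

func main() {
	const size = 200 * 1024
	fmt.Println(demo(size) == size)
	fmt.Println(defaultScannerFails(size))
}
```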
You surely don't want to be reading line-by-line in the first place. Why don't you just do this:
d := json.NewDecoder(file)
for {
var ob whateverType
err := d.Decode(&ob)
if err == io.EOF {
break
}
if err != nil {
log.Fatalf("Error decoding: %v", err)
}
// do stuff with "ob"...
}
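A self-contained version of the decoder loop, as a sketch: the record type and its fields are placeholders standing in for whateverType, and the helper names decodeAll and countRecords are made up. json.Decoder reads one value at a time from the stream, so line length does not matter.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"strings"
)

// record is a hypothetical shape for the JSON object on each line.
type record struct {
	ID   int    `json:"id"`
	Name string `json:"name"`
}

// decodeAll reads consecutive JSON values from r until EOF.
func decodeAll(r io.Reader) ([]record, error) {
	dec := json.NewDecoder(r)
	var out []record
	for {
		var rec record
		err := dec.Decode(&rec)
		if err == io.EOF {
			return out, nil
		}
		if err != nil {
			return out, err
		}
		out = append(out, rec)
	}
}

// countRecords is a convenience wrapper for quick checks.
func countRecords(s string) (int, error) {
	recs, err := decodeAll(strings.NewReader(s))
	return len(recs), err
}

func main() {
	input := `{"id":1,"name":"ann"}` + "\n" + `{"id":2,"name":"bob"}` + "\n"
	recs, err := decodeAll(strings.NewReader(input))
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	for _, r := range recs {
		fmt.Println(r.ID, r.Name)
	}
}
```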

How to find EOF while reading from a file

I am using the following code to read a file in Go:
spoon, err := ioutil.ReadFile(os.Args[1])
if err != nil {
panic("File reading error")
}
Now I check for every byte I pick for what character it is. For example:
spoon[i]==' ' //for checking space
Likewise I read the whole file (I know there may be other ways of reading it),
but keeping this way intact, how can I know that I have reached EOF of the file and I should stop reading it further?
Please don't suggest to find the length of spoon and start a loop. I want a sure shot way of finding EOF.
Use io.EOF to test for end-of-file. For example, to count spaces in a file:
package main
import (
"fmt"
"io"
"os"
)
func main() {
if len(os.Args) <= 1 {
fmt.Println("Missing file name argument")
return
}
f, err := os.Open(os.Args[1])
if err != nil {
fmt.Println(err)
return
}
defer f.Close()
data := make([]byte, 100)
spaces := 0
for {
data = data[:cap(data)]
n, err := f.Read(data)
if err != nil {
if err == io.EOF {
break
}
fmt.Println(err)
return
}
data = data[:n]
for _, b := range data {
if b == ' ' {
spaces++
}
}
}
fmt.Println(spaces)
}
ioutil.ReadFile() reads the entire contents of the file into a byte slice. You don't need to be concerned with EOF. EOF is a construct that is needed when you read a file one chunk at a time. You need to know which chunk has reached the end of the file when you're reading one chunk at a time.
The length of the byte slice returned by ioutil.ReadFile() is all you need.
data, err := ioutil.ReadFile(os.Args[1])
if err != nil {
log.Fatal(err)
}
// Do we need to know the data size?
sliceSize := len(data)
// Do we need to look at each byte?
for _, b := range data {
// do something with each byte b
}
This is what you need to look for to find out about End Of File(EOF)
if err != nil {
if errors.Is(err, io.EOF) { // preferred way according to the Go docs
fmt.Println("Reading file finished...")
}
break
}
When you use ioutil.ReadFile(), you don't ever see io.EOF, by design, because ReadFile will read the whole file until EOF is reached. So the slice it returns is the whole file. From the doc:
ReadFile reads the file named by filename and returns the contents. A successful call returns err == nil, not err == EOF. Because ReadFile reads the whole file, it does not treat an EOF from Read as an error to be reported.
From your question, you explicitly mention that you are aware there are other ways to read the file, and some of those ways require you to test the error for io.EOF, but not ReadFile.
Then, with the slice you have, you can read the file using the for...range construct, as others have mentioned. This is a sure way to read the whole file and nothing more (again, ReadFile takes care of that). Or iterating from 0 to len(spoon) - 1 would work too, but range is more idiomatic and basically does the same.
In other words: when you reach the end of the slice, you reach the end of the file (provided ReadFile did not return an error).
A slice has no concept of end of file. The slice returned by ioutil.ReadFile has a specific length, which reflects the size of the file it was read from. A common idiom, though only one of several possible in this case, is to range over the slice, effectively "consuming" all of the bytes that originally sat in the file:
for i, b := range spoon {
// At index 'i' is byte 'b'
// At file's offset 'i', 'b' was read
// ... do something useful here
}
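As a concrete instance of ranging over the slice, here is a short sketch that counts spaces. The function name countSpaces and the built-in demo data are made up for the example, and os.ReadFile stands in for ioutil.ReadFile (it is the current equivalent since Go 1.16). The loop ends exactly at the last byte, so no EOF check is involved.

```go
package main

import (
	"fmt"
	"os"
)

// countSpaces walks the whole slice; the range loop stops at the last byte,
// which is also the last byte of the file the slice was read from.
func countSpaces(data []byte) int {
	spaces := 0
	for _, b := range data {
		if b == ' ' {
			spaces++
		}
	}
	return spaces
}

func main() {
	// Demo data is used when no file name is given on the command line.
	data := []byte("a b  c")
	if len(os.Args) > 1 {
		var err error
		data, err = os.ReadFile(os.Args[1])
		if err != nil {
			fmt.Fprintln(os.Stderr, err)
			os.Exit(1)
		}
	}
	fmt.Println(countSpaces(data))
}
```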

Resources