Replacing a line within a file with Golang - go

I'm new to Golang, starting out with some examples. Currently, what I'm trying to do is reading a file line by line and replace it with another string in case it meets a certain condition.
The file is use for testing purposes contains four lines:
one
two
three
four
The code working on that file looks like this:
func main() {
file, err := os.OpenFile("test.txt", os.O_RDWR, 0666)
if err != nil {
panic(err)
}
reader := bufio.NewReader(file)
for {
fmt.Print("Try to read ...\n")
pos,_ := file.Seek(0, 1)
log.Printf("Position in file is: %d", pos)
bytes, _, _ := reader.ReadLine()
if (len(bytes) == 0) {
break
}
lineString := string(bytes)
if(lineString == "two") {
file.Seek(int64(-(len(lineString))), 1)
file.WriteString("This is a test.")
}
fmt.Printf(lineString + "\n")
}
file.Close()
}
As you can see in the code snippet, I want to replace the string "two" with "This is a test" as soon as this string is read from the file.
In order to get the current position within the file I use Go's Seek method.
However, what happens is that always the last line gets replaced by This is a test, making the file looking like this:
one
two
three
This is a test
Examining the output of the print statement which writes the current file position to the terminal, I get that kind of output after the first line has been read:
2016/12/28 21:10:31 Try to read ...
2016/12/28 21:10:31 Position in file is: 19
So after the first read, the position cursor already points to the end of my file, which explains why the new string gets appended to the end. Does anyone understand what is happening here or rather what is causing that behavior?

The Reader is not controller by the file.Seek. You have declared the reader as: reader := bufio.NewReader(file) and then you read one line at a time bytes, _, _ := reader.ReadLine() however the file.Seek does not change the position that the reader is reading.
Suggest you read about the ReadSeeker in the docs and switch over to using that. Also there is an example using the SectionReader.

Aside from the incorrect seek usage, the difficulty is that the line you're replacing isn't the same length as the replacement. The standard approach is to create a new (temporary) file with the modifications. Assuming that is successful, replace the original file with the new one.
package main
import (
"bufio"
"io"
"io/ioutil"
"log"
"os"
)
func main() {
// file we're modifying
name := "text.txt"
// open original file
f, err := os.Open(name)
if err != nil {
log.Fatal(err)
}
defer f.Close()
// create temp file
tmp, err := ioutil.TempFile("", "replace-*")
if err != nil {
log.Fatal(err)
}
defer tmp.Close()
// replace while copying from f to tmp
if err := replace(f, tmp); err != nil {
log.Fatal(err)
}
// make sure the tmp file was successfully written to
if err := tmp.Close(); err != nil {
log.Fatal(err)
}
// close the file we're reading from
if err := f.Close(); err != nil {
log.Fatal(err)
}
// overwrite the original file with the temp file
if err := os.Rename(tmp.Name(), name); err != nil {
log.Fatal(err)
}
}
func replace(r io.Reader, w io.Writer) error {
// use scanner to read line by line
sc := bufio.NewScanner(r)
for sc.Scan() {
line := sc.Text()
if line == "two" {
line = "This is a test."
}
if _, err := io.WriteString(w, line+"\n"); err != nil {
return err
}
}
return sc.Err()
}
For more complex replacements, I've implemented a package which can replace regular expression matches. https://github.com/icholy/replace
import (
"io"
"regexp"
"github.com/icholy/replace"
"golang.org/x/text/transform"
)
func replace2(r io.Reader, w io.Writer) error {
// compile multi-line regular expression
re := regexp.MustCompile(`(?m)^two$`)
// create replace transformer
tr := replace.RegexpString(re, "This is a test.")
// copy while transforming
_, err := io.Copy(w, transform.NewReader(r, tr))
return err
}

OS package has Expand function which I believe can be used to solve similar problem.
Explanation:
file.txt
one
two
${num}
four
main.go
package main
import (
"fmt"
"os"
)
var FILENAME = "file.txt"
func main() {
file, err := os.ReadFile(FILENAME)
if err != nil {
panic(err)
}
mapper := func(placeholderName string) string {
switch placeholderName {
case "num":
return "three"
}
return ""
}
fmt.Println(os.Expand(string(file), mapper))
}
output
one
two
three
four
Additionally, you may create a config (yml or json) and
populate that data in the map that can be used as a lookup table to store placeholders as well as their replacement strings and modify mapper part to use this table to lookup placeholders from input file.
e.g map will look like this,
table := map[string]string {
"num": "three"
}
mapper := func(placeholderName string) string {
if val, ok := table[placeholderName]; ok {
return val
}
return ""
}
References:
os.Expand documentation: https://pkg.go.dev/os#Expand
Playground

Related

Prevent ReadFile or ReadAll from reading EOF

I start learning Go and I am a bit puzzled by the fact it includes the EOF when using the ioutil.ReadFile function. I want, for example, to read a file and parse all its lines on a field separator.
Sample input File:
CZG;KCZG;some text
EKY;KEKY;some text
A50;KA50;some text
UKY;UCFL;some text
MIC;KMIC;some text
K2M;K23M;some text
This is what I do to read and parse that file:
import(
"fmt"
"log"
"io/ioutil"
"strings"
)
func main() {
/* Read file */
airportsFile := "/path/to/file/ad_iata"
content, err := ioutil.ReadFile(airportsFile)
if err != nil {
log.Fatal(err)
}
/* split content on EOL */
lines := strings.Split(string(content), "\n")
/* split line on field separator ; */
for _, line := range lines {
lineSplit := strings.Split(line, ";")
fmt.Println(lineSplit)
}
}
The string.Split function adds a empty element at the end of the lineSplit slice when it sees the EOF (nothing to parse). Therefore, if I want to access the second index of that slice (lineSplit[1]) I run into a panic: runtime error: index out of range. I have to restrict the range by doing this
/* split line on field separator ; */
lenLines := len(lines) -1
for _, line := range lines[:lenLines] {
lineSplit := strings.Split(line, ";")
fmt.Println(lineSplit[1])
}
Is there a better way if I want to keep using ReadFile for its terseness ?
The same problem occurs when using ioutil.ReadAll
There is no such thing as an "EOF byte" or "EOF character". What you are seeing is probably caused by a line break character ('\n') at the very end of the file.
To read a file line by line, it's more idiomatic to use bufio.Scanner instead:
file, err := os.Open(airportsFile)
if err != nil {
log.Fatal(err)
}
defer file.Close()
scanner := bufio.NewScanner(file)
for scanner.Scan() {
line := scanner.Text()
// ... use line as you please ...
}
if err := scanner.Err(); err != nil {
log.Fatal(err)
}
And this actually addresses your problem, because Scanner will read the final newline without starting a new line, as evidenced by this playground example.
Your input File seeems to be CSV file, so you can use encoding/csv
airportsFile := "/path/to/file/ad_iata"
content, err := os.Open(airportsFile)
if err != nil {
log.Fatal(err)
}
r := csv.NewReader(content)
r.Comma = ';'
records, err := r.ReadAll() /* split line on field separator ; */
if err != nil {
log.Fatal(err)
}
fmt.Println(records)
which looks terse enough for me and provide correct output
[[CZG KCZG some text] [EKY KEKY some text] [A50 KA50 some text] [UKY UCFL some text] [MIC KMIC some text] [K2M K23M some text]]
You may use scanner.Err() to check for errors on file read.
// Err returns the first non-EOF error that was encountered by the Scanner.
func (s *Scanner) Err() error {
if s.err == io.EOF {
return nil
}
return s.err
}
In general in go the idiomatic way to read and parse a file is to use bufio.NewScanner which accept as an input parameter the file to read and returns a new Scanner.
Considering the above remarks here is a way you can read and parse a file:
package main
import (
"bufio"
"fmt"
"os"
)
func main() {
input, err := os.Open("example.txt")
if err != nil {
panic("Error happend during opening the file. Please check if file exists!")
os.Exit(1)
}
defer input.Close()
scanner := bufio.NewScanner(input)
for scanner.Scan() {
line := scanner.Text()
fmt.Printf("%v\n", line)
}
if err := scanner.Err(); err != nil {
fmt.Fprintln(os.Stderr, "reading input:", err)
}
}

How do I read in a large flat file

I have a flat file that has 339276 line of text in it for a size of 62.1 MB. I am attempting to read in all the lines, parse them based on some conditions I have and then insert them into a database.
I originally attempted to use a bufio.Scan() loop and bufio.Text() to get the line but I was running out of buffer space. I switched to using bufio.ReadLine/ReadString/ReadByte (I tried each) and had the same problem with each. I didn't have enough buffer space.
I tried using read and setting the buffer size but as the document says it actually a const that can be made smaller but never bigger that 64*1024 bytes. I then tried to use File.ReadAt where I set the starting postilion and moved it along as I brought in each section to no avail. I have looked at the following examples and explanations (not an exhaustive list):
Read text file into string array (and write)
How to Read last lines from a big file with Go every 10 secs
reading file line by line in go
How do I read in an entire file (either line by line or the whole thing at once) into a slice so I can then go do things to the lines?
Here is some code that I have tried:
file, err := os.Open(feedFolder + value)
handleError(err)
defer file.Close()
// fileInfo, _ := file.Stat()
var linesInFile []string
r := bufio.NewReader(file)
for {
path, err := r.ReadLine("\n") // 0x0A separator = newline
linesInFile = append(linesInFile, path)
if err == io.EOF {
fmt.Printf("End Of File: %s", err)
break
} else if err != nil {
handleError(err) // if you return error
}
}
fmt.Println("Last Line: ", linesInFile[len(linesInFile)-1])
Here is something else I tried:
var fileSize int64 = fileInfo.Size()
fmt.Printf("File Size: %d\t", fileSize)
var bufferSize int64 = 1024 * 60
bytes := make([]byte, bufferSize)
var fullFile []byte
var start int64 = 0
var interationCounter int64 = 1
var currentErr error = nil
for currentErr != io.EOF {
_, currentErr = file.ReadAt(bytes, st)
fullFile = append(fullFile, bytes...)
start = (bufferSize * interationCounter) + 1
interationCounter++
}
fmt.Printf("Err: %s\n", currentErr)
fmt.Printf("fullFile Size: %s\n", len(fullFile))
fmt.Printf("Start: %d", start)
var currentLine []string
for _, value := range fullFile {
if string(value) != "\n" {
currentLine = append(currentLine, string(value))
} else {
singleLine := strings.Join(currentLine, "")
linesInFile = append(linesInFile, singleLine)
currentLine = nil
}
}
I am at a loss. Either I don't understand exactly how the buffer works or I don't understand something else. Thanks for reading.
bufio.Scan() and bufio.Text() in a loop perfectly works for me on a files with much larger size, so I suppose you have lines exceeded buffer capacity. Then
check your line ending
and which Go version you use path, err :=r.ReadLine("\n") // 0x0A separator = newline? Looks like func (b *bufio.Reader) ReadLine() (line []byte, isPrefix bool, err error) has return value isPrefix specifically for your use case
http://golang.org/pkg/bufio/#Reader.ReadLine
It's not clear that it's necessary to read in all the lines before parsing them and inserting them into a database. Try to avoid that.
You have a small file: "a flat file that has 339276 line of text in it for a size of 62.1 MB." For example,
package main
import (
"bytes"
"fmt"
"io"
"io/ioutil"
)
func readLines(filename string) ([]string, error) {
var lines []string
file, err := ioutil.ReadFile(filename)
if err != nil {
return lines, err
}
buf := bytes.NewBuffer(file)
for {
line, err := buf.ReadString('\n')
if len(line) == 0 {
if err != nil {
if err == io.EOF {
break
}
return lines, err
}
}
lines = append(lines, line)
if err != nil && err != io.EOF {
return lines, err
}
}
return lines, nil
}
func main() {
// a flat file that has 339276 lines of text in it for a size of 62.1 MB
filename := "flat.file"
lines, err := readLines(filename)
fmt.Println(len(lines))
if err != nil {
fmt.Println(err)
return
}
}
It seems to me this variant of readLines is shorter and faster than suggested peterSO
func readLines(filename string) (map[int]string, error) {
lines := make(map[int]string)
data, err := ioutil.ReadFile(filename)
if err != nil {
return nil, err
}
for n, line := range strings.Split(string(data), "\n") {
lines[n] = line
}
return lines, nil
}
package main
import (
"fmt"
"os"
"log"
"bufio"
)
func main() {
FileName := "assets/file.txt"
file, err := os.Open(FileName)
if err != nil {
log.Fatal(err)
}
defer file.Close()
scanner := bufio.NewScanner(file)
for scanner.Scan() {
fmt.Println(scanner.Text())
}
}

Replace a line in text file Golang

How do I replace a line in a text file with a new line?
Assume I've opened the file and have every line in an array of string objects i'm now looping through
//find line with ']'
for i, line := range lines {
if strings.Contains(line, ']') {
//replace line with "LOL"
?
}
}
What matters here is not so much what you do in that loop. It's not like you're gonna be directly editing the file on the fly.
The most simple solution for you is to just replace the string in the array and then write the contents of the array back to your file when you're finished.
Here's some code I put together in a minute or two. It properly compiles and runs on my machine.
package main
import (
"io/ioutil"
"log"
"strings"
)
func main() {
input, err := ioutil.ReadFile("myfile")
if err != nil {
log.Fatalln(err)
}
lines := strings.Split(string(input), "\n")
for i, line := range lines {
if strings.Contains(line, "]") {
lines[i] = "LOL"
}
}
output := strings.Join(lines, "\n")
err = ioutil.WriteFile("myfile", []byte(output), 0644)
if err != nil {
log.Fatalln(err)
}
}
There's a gist too (with the same code)
https://gist.github.com/dallarosa/b58b0e3425761e0a7cf6

How to read/scan a .txt file in GO

The .txt file has many lines which each contain a single word. So I open the file and pass it to the reader:
file, err := os.Open("file.txt")
check(err)
reader := bufio.NewReader(file)
Now I want to store each line in a slice of strings. I believe I need to use ReadBytes, ReadString, ReadLine, or on of the Scan functions. Any advice on how to implement this would be appreciated. Thanks.
You can use ioutil.ReadFile() to read all lines into a byte slice and then call split on the result:
package main
import (
"fmt"
"io/ioutil"
"log"
"strings"
)
func main() {
data, err := ioutil.ReadFile("/etc/passwd")
if err != nil {
log.Fatal(err)
}
lines := strings.Split(string(data), "\n")
for _, line := range lines {
fmt.Println("line:", string(line))
}
}
Having r as an instance of *bufio.Reader, and myList as a slice of strings, than one could just loop and read lines till EOL.
for {
line, err := r.ReadBytes('\n')
if err != nil {
break
}
myList = append(myList, string(line))
}

How to improve this file reading code

I currently have this piece of code that will read a file line by line (delimited by a \n)
file, _ := os.Open(filename) //deal with the error later
defer file.Close()
buf := bufio.NewReader(file)
for line, err := buf.ReadString('\n'); err != io.EOF; line, err = buf.ReadString('\n')
{
fmt.Println(strings.TrimRight(line, "\n"))
}
However I don't feel comfortable with writing buf.ReadString("\n") twice in the for loop, does anyone have any suggestions for improvement?
bufio.ReadString reads until the first occurrence of delim in the input,
returning a string containing the data up to and including the
delimiter. If ReadString encounters an error before finding a
delimiter, it returns the data read before the error and the error
itself (often io.EOF). ReadString returns err != nil if and only if
the returned data does not end in delim.
If buf.ReadString('\n') returns an error other than io.EOF, for example bufio.ErrBufferFull, you will be in an infinite loop. Also, if the file doesn't end in a '\n', you silently ignore the data after the last '\n'.
Here's a more robust solution, which executes buf.ReadString('\n') once.
package main
import (
"bufio"
"fmt"
"io"
"os"
"strings"
)
func main() {
filename := "FileName"
file, err := os.Open(filename)
if err != nil {
fmt.Println(err)
return
}
defer file.Close()
buf := bufio.NewReader(file)
for {
line, err := buf.ReadString('\n')
if err != nil {
if err != io.EOF || len(line) > 0 {
fmt.Println(err)
return
}
break
}
fmt.Println(strings.TrimRight(line, "\n"))
}
}
Most code that reads line by line can be improved by not reading line by line. If your goal is to read the file and access the lines, something like the following is almost always better.
package main
import (
"fmt"
"io/ioutil"
"log"
"strings"
)
func main() {
b, err := ioutil.ReadFile("filename")
if err != nil {
log.Fatal(err)
}
s := string(b) // convert []byte to string
s = strings.TrimRight(s, "\n") // strip \n on last line
ss := strings.Split(s, "\n") // split to []string
for _, s := range ss {
fmt.Println(s)
}
}
Any errors come to you at a single point so error handling is simplified. Stripping a newline off the last line allows for files that may or may not have that final newline, as Peter suggested. Most text files are tiny compared to available memory these days, so reading these in one gulp is appropriate.

Resources