Using Go to create text files from columns of a csv data frame - go

I am trying to loop over a CSV file and output a text file titled after each row in the first column. Each text file is then populated with data from the other rows for that column. I am able to print the contents of the CSV to a text file, but I cannot get the logic down using a for loop to grab the index of column one and use that to create/title a new .txt file.
package main
import (
"fmt"
"io"
"io/ioutil"
"log"
"os"
)
func main() {
fmt.Println("Enter file path to CSV: ")
var csvFile string
_, err := fmt.Scanln(&csvFile)
if err != nil {
log.Fatal("Cannot read input")
return
}
//open file
inFile, err := os.Open(csvFile)
if err != nil {
log.Fatal(err)
}
defer inFile.Close()
readMe, _ := ioutil.ReadAll(inFile)
blankFile, err := os.Create(`C:\temp\test.txt`)
if err != nil {
log.Fatal(err)
}
defer blankFile.Close()
//write data to text file
outFile, err := blankFile.Write(readMe)
if err == io.EOF {
log.Fatalln("Failed")
} else if err != nil {
log.Fatal(err)
}
//print bytes total
fmt.Println(outFile, " bytes printed")
}

Take multiple columns from a csv and print each column to a new text
file.
Loop over a csv and produce a new text file that will be titled after
each column in row #1. Each text file will then be populated with data
from the other rows for that column.
For example,
package main
import (
"encoding/csv"
"fmt"
"io"
"os"
"path/filepath"
)
func CsvFileToTxtFiles(inFile string) error {
in, err := os.Open(inFile)
if err != nil {
return err
}
defer in.Close()
r := csv.NewReader(in)
hdr, err := r.Read()
if err != nil {
return err
}
f := make([]*os.File, len(hdr))
w := make([]*csv.Writer, len(hdr))
pfx := filepath.Clean(inFile)
pfx = pfx[:len(pfx)-len(filepath.Ext(pfx))]
for i, col := range hdr {
var err error
f[i], err = os.Create(pfx + "." + col + ".txt")
if err != nil {
return err
}
defer f[i].Close()
w[i] = csv.NewWriter(f[i])
defer w[i].Flush()
}
for {
row, err := r.Read()
if err != nil {
if row == nil && err == io.EOF {
break
}
return err
}
for i, col := range row {
err := w[i].Write([]string{col})
if err != nil {
return err
}
}
}
for i := range hdr {
var err error
w[i].Flush()
err = w[i].Error()
if err != nil {
return err
}
err = f[i].Close()
if err != nil {
return err
}
}
return nil
}
func main() {
if len(os.Args) <= 1 {
usage := "usage: " + filepath.Base(os.Args[0]) + " FILE"
fmt.Fprintln(os.Stderr, usage)
return
}
inFile := os.Args[1]
err := CsvFileToTxtFiles(inFile)
if err != nil {
fmt.Fprintln(os.Stderr, err)
return
}
}
Output:
$ cat ioj.test.csv
one,two,three
1,2,3
11,22,33
$ go run ioj.go ioj.test.csv
$ cat ioj.test.one.txt
1
11
$ cat ioj.test.two.txt
2
22
$ cat ioj.test.three.txt
3
33
$

Related

Creating/Writing/Reading CSV file in Golang

The following code should create a CSV file and write the employee data to it, then read it out in the terminal. The code will create the file and not print the contents in the terminal. If I take the last chunk of code out to print the contents, it will work as an independent program as long as the CSV file is present. Why does the code not work when written all as one?
package main
import (
"encoding/csv"
"fmt"
"os"
"sort"
)
type Employee struct {
Name string
Age int
Salary float64
}
func main() {
employees := []Employee{
{"Fred", 44, 56000.00},
{"Amy", 25, 65000.00},
{"Zack", 29, 20400.00},
{"Jerry", 73, 120500.00},
}
sort.Slice(employees, func(i, j int) bool {
return employees[i].Name < employees[j].Name
})
file, err := os.Create("employees.csv")
if err != nil {
panic(err)
}
defer file.Close()
writer := csv.NewWriter(file)
defer writer.Flush()
header := []string{"Name", "Age", "Salary"}
if err := writer.Write(header); err != nil {
panic(err)
}
for _, employee := range employees {
record := []string{employee.Name, fmt.Sprintf("%d", employee.Age), fmt.Sprintf("%.2f", employee.Salary)}
if err := writer.Write(record); err != nil {
panic(err)
}
}
file, err = os.Open("employees.csv")
if err != nil {
panic(err)
}
defer file.Close()
reader := csv.NewReader(file)
records, err := reader.ReadAll()
if err != nil {
panic(err)
}
for _, record := range records {
fmt.Println(record)
}
}
// This code prints the contents of the CSV correctly when run separately
package main
import (
"encoding/csv"
"fmt"
"os"
)
func main() {
file, err := os.Open("employees.csv")
if err != nil {
fmt.Println("Error opening file:", err)
return
}
defer file.Close()
reader := csv.NewReader(file)
reader.FieldsPerRecord = -1
records, err := reader.ReadAll()
if err != nil {
fmt.Println("Error reading CSV data:", err)
return
}
for _, record := range records {
fmt.Println(record)
}
}
The first program creates the CSV file with the expected cells, but instead of printing the contents to the terminal, it exits without printing anything.
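A likely cause, offered as an assumption since no answer is recorded here: csv.Writer buffers its output, and the deferred writer.Flush() only runs when main returns, so the file is still empty when it is reopened and ReadAll returns no records. The sketch below flushes and closes the file before reading it back. The helper name writeThenRead and the temp file name are illustrative, not from the original program.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"os"
	"path/filepath"
)

// writeThenRead writes rows to path as CSV, flushes and closes the file,
// and only then reopens it for reading. (Hypothetical helper name.)
func writeThenRead(path string, rows [][]string) ([][]string, error) {
	out, err := os.Create(path)
	if err != nil {
		return nil, err
	}
	w := csv.NewWriter(out)
	for _, row := range rows {
		if err := w.Write(row); err != nil {
			out.Close()
			return nil, err
		}
	}
	// Flush now, not in a defer: the buffered records must reach the
	// file before it is read back.
	w.Flush()
	if err := w.Error(); err != nil {
		out.Close()
		return nil, err
	}
	if err := out.Close(); err != nil {
		return nil, err
	}

	in, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	defer in.Close()
	return csv.NewReader(in).ReadAll()
}

func main() {
	// Illustrative path and data, shortened from the question.
	path := filepath.Join(os.TempDir(), "employees_sketch.csv")
	rows := [][]string{
		{"Name", "Age", "Salary"},
		{"Amy", "25", "65000.00"},
		{"Fred", "44", "56000.00"},
	}
	records, err := writeThenRead(path, rows)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	for _, record := range records {
		fmt.Println(record)
	}
}
```

In the original program, calling writer.Flush() (and checking writer.Error()) right after the write loop, before the second os.Open, should have the same effect.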

how to get multiple line inputs in golang - interview coding

For the below type of inputs in golang coding interviews, what is the best way to get the input?
Input:
3
hello elloh
test estt
tier riet
I found two methods:
Method 1:
reader := bufio.NewReader(os.Stdin)
var lines []string
for {
line, err := reader.ReadString('\n') // this reads only one line
if err != nil {
log.Fatal(err)
}
if len(strings.TrimSpace(line)) == 0 {
break
}
line_s := strings.Split(line, " ")
lines = append(lines, line_s...)
}
Method 2:
bytes, err := ioutil.ReadAll(os.Stdin)
fmt.Println(len(bytes))
if err == nil {
input := strings.Split(string(bytes), "\n")
count, _ := strconv.Atoi(input[0])
fmt.Println(input)
var lines []string
for i := 1; i < count; i++ {
line := strings.Split(input[i], " ")
lines = append(lines, line...)
}
fmt.Println(lines)
}
But I am not sure how to signal the end of input from stdin in Method 2.
Please suggest the best method to get the input.
Use bufio.Scanner to read input. Use a function to encapsulate complexity and implementation details. For example,
package main
import (
"bufio"
"fmt"
"os"
"strconv"
"strings"
)
func readData(s *bufio.Scanner) ([][]string, error) {
var data [][]string
if !s.Scan() {
return nil, s.Err()
}
nLine, err := strconv.Atoi(strings.TrimSpace(s.Text()))
if err != nil {
return nil, err
}
for ; nLine > 0 && s.Scan(); nLine-- {
data = append(data, strings.Fields(s.Text()))
}
if err := s.Err(); err != nil {
return nil, err
}
if nLine != 0 {
err := fmt.Errorf("missing %d lines of data", nLine)
return nil, err
}
return data, nil
}
func main() {
s := bufio.NewScanner(os.Stdin)
data, err := readData(s)
if err != nil {
fmt.Fprintln(os.Stderr, err)
os.Exit(1)
}
fmt.Println(len(data))
for _, datum := range data {
fmt.Println(datum)
}
}
https://go.dev/play/p/0Xwp3-hwGyK
Input:
3
hello elloh
test estt
tier riet
Output:
3
[hello elloh]
[test estt]
[tier riet]

How to skip the first row when reading a csv file?

I have an awkward csv file and I need to skip the first row to read it.
I'm doing this easily with python/pandas
df = pd.read_csv(filename, skiprows=1)
but I don't know how to do it in Go.
package main
import (
"encoding/csv"
"fmt"
"log"
"os"
)
type mwericsson struct {
id string
name string
region string
}
func main() {
rows := readSample()
fmt.Println(rows)
//appendSum(rows)
//writeChanges(rows)
}
func readSample() [][]string {
f, err := os.Open("D:/in/20190629/PM_IG30014_15_201906290015_01.csv")
if err != nil {
log.Fatal(err)
}
rows, err := csv.NewReader(f).ReadAll()
f.Close()
if err != nil {
log.Fatal(err)
}
return rows
}
Error:
2019/07/01 12:38:40 record on line 2: wrong number of fields
PM_IG30014_15_201906290015_01.csv:
PTN Ethernet-Port RMON Performance,PORT_BW_UTILIZATION,2019-06-29 20:00:00,33366
DeviceID,DeviceName,ResourceName,CollectionTime,GranularityPeriod,PORT_RX_BW_UTILIZATION,PORT_TX_BW_UTILIZATION,RXGOODFULLFRAMESPEED,TXGOODFULLFRAMESPEED,PORT_RX_BW_UTILIZATION_MAX,PORT_TX_BW_UTILIZATION_MAX
3174659,H1095,H1095-11-ISM6-1(to ZJBSC-V1),2019-06-29 20:00:00,15,22.08,4.59,,,30.13,6.98
3174659,H1095,H1095-14-ISM6-1(to T6147-V),2019-06-29 20:00:00,15,2.11,10.92,,,4.43,22.45
skip the first row when reading a csv file
For example,
package main
import (
"bufio"
"encoding/csv"
"fmt"
"io"
"os"
)
func readSample(rs io.ReadSeeker) ([][]string, error) {
// Skip first row (line)
row1, err := bufio.NewReader(rs).ReadSlice('\n')
if err != nil {
return nil, err
}
_, err = rs.Seek(int64(len(row1)), io.SeekStart)
if err != nil {
return nil, err
}
// Read remaining rows
r := csv.NewReader(rs)
rows, err := r.ReadAll()
if err != nil {
return nil, err
}
return rows, nil
}
func main() {
f, err := os.Open("sample.csv")
if err != nil {
panic(err)
}
defer f.Close()
rows, err := readSample(f)
if err != nil {
panic(err)
}
fmt.Println(rows)
}
Output:
$ cat sample.csv
one,two,three,four
1,2,3
4,5,6
$ go run sample.go
[[1 2 3] [4 5 6]]
$
$ cat sample.csv
PTN Ethernet-Port RMON Performance,PORT_BW_UTILIZATION,2019-06-29 20:00:00,33366
DeviceID,DeviceName,ResourceName,CollectionTime,GranularityPeriod,PORT_RX_BW_UTILIZATION,PORT_TX_BW_UTILIZATION,RXGOODFULLFRAMESPEED,TXGOODFULLFRAMESPEED,PORT_RX_BW_UTILIZATION_MAX,PORT_TX_BW_UTILIZATION_MAX
3174659,H1095,H1095-11-ISM6-1(to ZJBSC-V1),2019-06-29 20:00:00,15,22.08,4.59,,,30.13,6.98
3174659,H1095,H1095-14-ISM6-1(to T6147-V),2019-06-29 20:00:00,15,2.11,10.92,,,4.43,22.45
$ go run sample.go
[[DeviceID DeviceName ResourceName CollectionTime GranularityPeriod PORT_RX_BW_UTILIZATION PORT_TX_BW_UTILIZATION RXGOODFULLFRAMESPEED TXGOODFULLFRAMESPEED PORT_RX_BW_UTILIZATION_MAX PORT_TX_BW_UTILIZATION_MAX] [3174659 H1095 H1095-11-ISM6-1(to ZJBSC-V1) 2019-06-29 20:00:00 15 22.08 4.59 30.13 6.98] [3174659 H1095 H1095-14-ISM6-1(to T6147-V) 2019-06-29 20:00:00 15 2.11 10.92 4.43 22.45]]
$
Simply call Reader.Read() to read a line, then proceed to read the rest with Reader.ReadAll().
See this example:
src := "one,two,three\n1,2,3\n4,5,6"
r := csv.NewReader(strings.NewReader(src))
if _, err := r.Read(); err != nil {
panic(err)
}
records, err := r.ReadAll()
if err != nil {
panic(err)
}
fmt.Println(records)
Output (try it on the Go Playground):
[[1 2 3] [4 5 6]]
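One caveat with this approach, relevant to the file in the question: when Reader.FieldsPerRecord is 0 (the default), the first Read fixes the expected field count, so a skipped 4-field row followed by 11-field rows still fails with a "wrong number of fields" error. A minimal sketch that sets FieldsPerRecord to -1 to disable the check. The helper name skipFirstRow and the shortened sample data are illustrative.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strings"
)

// skipFirstRow parses src as CSV, discards the first record and returns
// the rest. (Hypothetical helper name.)
func skipFirstRow(src string) ([][]string, error) {
	r := csv.NewReader(strings.NewReader(src))
	// A negative value disables the per-record field count check, so the
	// skipped row may have a different number of fields than the others.
	r.FieldsPerRecord = -1
	if _, err := r.Read(); err != nil {
		return nil, err
	}
	return r.ReadAll()
}

func main() {
	// Shortened stand-in for the question's file: a 4-field title row
	// followed by 3-field rows.
	src := "title,metric,2019-06-29 20:00:00,33366\nid,name,value\n1,a,2.5\n"
	rows, err := skipFirstRow(src)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(rows)
}
```

For a file on disk, the same two calls apply with the *os.File passed to csv.NewReader in place of the strings.Reader.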
While it was informative to learn about io.ReadSeeker, I think a simpler way to skip the first line/row (often the header) of a CSV is to read all the records and slice off the first one, as follows:
func readCsv(filename string) [][]string {
f, err := os.Open(filename)
if err != nil {
log.Fatal(err)
}
defer f.Close()
records := [][]string{}
r := csv.NewReader(f)
for {
record, err := r.Read()
if err == io.EOF {
break
}
if err != nil {
log.Fatal(err)
}
records = append(records, record)
}
return records[1:] // skip the header
}
We can also call ReadBytes('\n') on a bufio.Reader to discard the first line, and then pass that same bufio.Reader to csv.NewReader:
func readSample(reader io.Reader) ([][]string, error) {
// If reader is already a *bufio.Reader, there is no need to wrap it again.
buf, ok := (reader).(*bufio.Reader)
if !ok {
buf = bufio.NewReader(reader)
}
_, err := buf.ReadBytes('\n')
if err != nil {
return nil, err
}
rows, err := csv.NewReader(buf).ReadAll()
if err != nil {
return nil, err
}
return rows, nil
}

No output to error file

I'm coding a little Go program.
It reads the files in a directory line by line, keeps only lines with a certain prefix, normalizes the data, and writes each record to one of two files, depending on whether the normalized record has a certain number of elements.
Data is being written to the Data file, but errors are not being written to the Errors file.
Debugging I see no issue.
Any help is much appreciated.
Thanks,
Martin
package main
import (
"bufio"
"fmt"
"io/ioutil"
"log"
"os"
"strings"
)
func main() {
//Output file - Data
if _, err := os.Stat("allData.txt"); os.IsNotExist(err) {
var file, err = os.Create("allData.txt")
if err != nil {
fmt.Println(err)
return
}
defer file.Close()
}
file, err := os.OpenFile("allData.txt", os.O_WRONLY|os.O_APPEND, 0644)
if err != nil {
panic(err)
}
w := bufio.NewWriter(file)
//Output file - Errors
if _, err := os.Stat("errorData.txt"); os.IsNotExist(err) {
var fileError, err = os.Create("errorData.txt")
if err != nil {
fmt.Println(err)
return
}
defer fileError.Close()
}
fileError, err := os.OpenFile("errorData.txt", os.O_WRONLY|os.O_APPEND, 0644)
if err != nil {
panic(err)
}
z := bufio.NewWriter(fileError)
//Read Directory
files, err := ioutil.ReadDir("../")
if err != nil {
log.Fatal(err)
}
//Build file path
for _, f := range files {
fName := string(f.Name())
sPath := string("../" + fName)
sFile, err := os.Open(sPath)
if err != nil {
fmt.Println(err)
return
}
//Create scanner
scanner := bufio.NewScanner(sFile)
scanner.Split(bufio.ScanLines)
var lines []string
// This is the buffer now
for scanner.Scan() {
lines = append(lines, scanner.Text())
}
for _, line := range lines {
sRecordC := strings.HasPrefix((line), "DATA:")
if sRecordC {
splitted := strings.Split(line, " ")
splittedNoSpaces := deleteEmpty(splitted)
if len(splittedNoSpaces) == 11 {
splittedString := strings.Join(splittedNoSpaces, " ")
sFinalRecord := string(splittedString + "\r\n")
if _, err = fmt.Fprintf(w, sFinalRecord); err != nil {
}
}
if len(splittedNoSpaces) < 11 {
splitted := strings.Split(line, " ")
splittedNoSpaces := deleteEmpty(splitted)
splittedString := strings.Join(splittedNoSpaces, " ")
sFinalRecord := string(splittedString + "\r\n")
if _, err = fmt.Fprintf(z, sFinalRecord); err != nil {
}
err = fileError.Sync()
if err != nil {
log.Fatal(err)
}
}
}
}
}
err = file.Sync()
if err != nil {
log.Fatal(err)
}
}
//Delete Empty array elements
func deleteEmpty(s []string) []string {
var r []string
for _, str := range s {
if str != "" {
r = append(r, str)
}
}
return r
}
Don't open the file multiple times, and don't check for the file's existence before creating it, just use the os.O_CREATE flag. You're also not deferring the correct os.File.Close call, because it's opened multiple times.
When using a bufio.Writer, you should always call Flush() to ensure that all data has been written to the underlying io.Writer.
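A minimal sketch of that advice, not a drop-in replacement for the program above: each output file is opened once with os.O_CREATE, and both bufio.Writers are flushed before the files are closed. The helper names openAppend and splitRecords, the wantFields parameter, and the sample lines are assumptions made for illustration. Unlike the original, the sketch sends every record that does not have exactly wantFields fields to the error file and ends lines with "\n".

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

// openAppend opens path for appending, creating it if it does not exist.
func openAppend(path string) (*os.File, error) {
	return os.OpenFile(path, os.O_CREATE|os.O_WRONLY|os.O_APPEND, 0644)
}

// splitRecords normalizes each "DATA:" line and appends it to dataPath when
// it has exactly wantFields fields, otherwise to errPath. It returns how
// many records went to each file. (Hypothetical helper.)
func splitRecords(lines []string, dataPath, errPath string, wantFields int) (int, int, error) {
	nData, nErr := 0, 0
	dataFile, err := openAppend(dataPath)
	if err != nil {
		return nData, nErr, err
	}
	defer dataFile.Close()
	errFile, err := openAppend(errPath)
	if err != nil {
		return nData, nErr, err
	}
	defer errFile.Close()

	dataW := bufio.NewWriter(dataFile)
	errW := bufio.NewWriter(errFile)
	for _, line := range lines {
		if !strings.HasPrefix(line, "DATA:") {
			continue
		}
		fields := strings.Fields(line) // splits on whitespace, drops empties
		w, count := errW, &nErr
		if len(fields) == wantFields {
			w, count = dataW, &nData
		}
		if _, err := fmt.Fprintln(w, strings.Join(fields, " ")); err != nil {
			return nData, nErr, err
		}
		*count++
	}
	// Without these calls, buffered output may never reach the files.
	if err := dataW.Flush(); err != nil {
		return nData, nErr, err
	}
	if err := errW.Flush(); err != nil {
		return nData, nErr, err
	}
	return nData, nErr, nil
}

func main() {
	lines := []string{"DATA: a  b c", "DATA: a b", "ignored line"}
	nData, nErr, err := splitRecords(lines, "allData.txt", "errorData.txt", 4)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(nData, "data records,", nErr, "error records")
}
```

In the question's program, the error file is synced with fileError.Sync(), but the bufio.Writer z wrapped around it is never flushed, so the buffered records may never reach the file.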

Golang: fetching data from 1 CSV File to another

I am new to golang, and I am trying to copy data from one CSV file to a new CSV file, but I need only 2 records from the old CSV file.
How would you fetch only the first two records of that file?
Here is what I have tried so far (also on play.golang.org):
package main
import (
"encoding/csv"
"fmt"
"io"
"os"
)
func main() {
//SELECTING THE FILE TO EXTRACT.......
csvfile1, err := os.Open("data/sample.csv")
if err != nil {
fmt.Println(err)
return
}
defer csvfile1.Close()
reader := csv.NewReader(csvfile1)
for i := 0; i < 3; i++ {
record, err := reader.Read()
if err == io.EOF {
break
} else if err != nil {
fmt.Println(err)
return
}
csvfile2, err := os.Create("data/SingleColomReading.csv")
if err != nil {
fmt.Println(err)
return
}
defer csvfile2.Close()
records := []string{
record,
}
writer := csv.NewWriter(csvfile2)
//fmt.Println(writer)
for _, single := range records {
er := writer.Write(single)
if er != nil {
fmt.Println("error", er)
return
}
fmt.Println(single)
writer.Flush()
//fmt.Println(records)
//a:=strconv.Itoa(single)
n, er2 := csvfile2.WriteString(single)
if er2 != nil {
fmt.Println(n, er2)
}
}
}
}
Fixing your program,
package main
import (
"encoding/csv"
"fmt"
"io"
"os"
)
func main() {
csvfile1, err := os.Open("data/sample.csv")
if err != nil {
fmt.Println(err)
return
}
defer csvfile1.Close()
reader := csv.NewReader(csvfile1)
csvfile2, err := os.Create("data/SingleColomReading.csv")
if err != nil {
fmt.Println(err)
return
}
writer := csv.NewWriter(csvfile2)
for i := 0; i < 2; i++ {
record, err := reader.Read()
if err != nil {
if err == io.EOF {
break
}
fmt.Println(err)
return
}
err = writer.Write(record)
if err != nil {
fmt.Println(err)
return
}
}
writer.Flush()
err = csvfile2.Close()
if err != nil {
fmt.Println(err)
return
}
}
However, since you are only interested in copying records (lines) as a whole and not individual fields of a record, you could use bufio.Scanner, as @VonC suggested. For example,
package main
import (
"bufio"
"fmt"
"os"
)
func main() {
csvfile1, err := os.Open("data/sample.csv")
if err != nil {
fmt.Println(err)
return
}
defer csvfile1.Close()
scanner := bufio.NewScanner(csvfile1)
csvfile2, err := os.Create("data/SingleColomReading.csv")
if err != nil {
fmt.Println(err)
return
}
writer := bufio.NewWriter(csvfile2)
nRecords := 0
for scanner.Scan() {
n, err := writer.Write(scanner.Bytes())
if err != nil {
fmt.Println(n, err)
return
}
err = writer.WriteByte('\n')
if err != nil {
fmt.Println(err)
return
}
if nRecords++; nRecords >= 2 {
break
}
}
if err := scanner.Err(); err != nil {
fmt.Println(err)
return
}
err = writer.Flush()
if err != nil {
fmt.Println(err)
return
}
err = csvfile2.Close()
if err != nil {
fmt.Println(err)
return
}
}
It would be easier to:
read your CSV file into a string slice (one line per element), for the first two lines only
var lines []string
scanner := bufio.NewScanner(file)
nblines := 0
for scanner.Scan() {
lines = append(lines, scanner.Text())
if nblines++; nblines >= 2 {
break
}
}
Then you can range over lines to write those two lines to the destination file.
lines contains at most 2 elements.
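Putting the two steps together, a minimal sketch: the helper names readFirstLines and writeLines are illustrative, and the paths in main are the ones from the question. Note that a line-based copy assumes no quoted CSV field contains a newline.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
)

// readFirstLines returns at most n lines from the file at path.
// (Hypothetical helper name.)
func readFirstLines(path string, n int) ([]string, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()

	var lines []string
	scanner := bufio.NewScanner(f)
	for len(lines) < n && scanner.Scan() {
		lines = append(lines, scanner.Text())
	}
	return lines, scanner.Err()
}

// writeLines creates (or truncates) the file at path and writes each line
// followed by a newline. (Hypothetical helper name.)
func writeLines(path string, lines []string) error {
	f, err := os.Create(path)
	if err != nil {
		return err
	}
	w := bufio.NewWriter(f)
	for _, line := range lines {
		if _, err := w.WriteString(line + "\n"); err != nil {
			f.Close()
			return err
		}
	}
	// Flush the buffer before closing so nothing is lost.
	if err := w.Flush(); err != nil {
		f.Close()
		return err
	}
	return f.Close()
}

func main() {
	lines, err := readFirstLines("data/sample.csv", 2)
	if err != nil {
		fmt.Println(err)
		return
	}
	if err := writeLines("data/SingleColomReading.csv", lines); err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(len(lines), "lines copied")
}
```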

Resources