How do I write a gota dataframe to a csv? - go

I have found many code examples of writing to a CSV by passing in a [][]string, like the following:
package main

import (
    "encoding/csv"
    "log"
    "os"
)

var data = [][]string{
    {"Row 1", "30"},
    {"Row 2", "60"},
    {"Row 3", "90"},
}

func main() {
    file, err := os.Create("tutorials_technology.csv")
    if err != nil {
        log.Fatal(err)
    }
    defer file.Close()

    w := csv.NewWriter(file)
    for _, value := range data {
        if err := w.Write(value); err != nil {
            log.Fatalln("Error writing record to csv: ", err)
        }
    }
    w.Flush()
}
However, I haven't found any code examples that show how to use the gota dataframe.WriteCSV() function to write to a CSV. In the gota dataframe documentation, there isn't an example for writing to a csv, but there is an example for reading from a csv.
The dataframe method WriteCSV() requires an io.Writer as input. I wasn't sure how to satisfy that interface.
The following didn't work
writer := csv.NewWriter(f)
df.WriteCSV(writer) // TODO This writer needs to be a []byte writer
I've been working on this for quite a while. Does anyone have any clues?
I have looked into turning my gota dataframe into a [][]string type, but that's a little inconvenient because I put my data into a gota dataframe with the package's LoadStructs() function, and I had read in some CSVs in a semi-custom way before putting them into structs.
So I could write a function to turn my structs into a [][]string format, but I feel like that is pretty tedious and I'm sure there has got to be a better way. In fact, I'm sure there is because the dataframe type has the WriteCSV() method but I just haven't figured out how to use it.
Here are my structs
type CsvLine struct {
    Index  int
    Date   string
    Symbol string
    Open   float64
    High   float64
    Low    float64
    Close  float64
    // Volume float64
    // Market_Cap float64
}

type File struct {
    Rows []CsvLine
}
Disclaimer: I am a little bit of a golang newbie. I've only been using Go for a couple months, and this is the first time I've tried to write to a file. I haven't interacted much with the io.Writer interface, but I hear that it's very useful.
And yes, I frequently look at the Golang.org blog and I've read "Effective Go" and I keep referencing it.

So it turns out I misunderstood the io.Writer interface and I didn't understand what the os.Create() function returns.
It turns out the code is even simpler and easier than I thought it would be.
Here is the working code example:
df := dataframe.LoadStructs(file.Rows)

f, err := os.Create(outFileName)
if err != nil {
    log.Fatal(err)
}
defer f.Close()

if err := df.WriteCSV(f); err != nil {
    log.Fatal(err)
}
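For reference, here is a self-contained sketch of the same fix (assuming the github.com/go-gota/gota/dataframe import path and a trimmed-down version of the CsvLine struct from the question; adjust both to your project):

package main

import (
    "log"
    "os"

    "github.com/go-gota/gota/dataframe"
)

// CsvLine is a trimmed-down version of the struct from the question.
type CsvLine struct {
    Index  int
    Date   string
    Symbol string
}

func main() {
    rows := []CsvLine{
        {Index: 0, Date: "2019-01-01", Symbol: "AAA"},
        {Index: 1, Date: "2019-01-02", Symbol: "BBB"},
    }

    df := dataframe.LoadStructs(rows)

    f, err := os.Create("out.csv")
    if err != nil {
        log.Fatal(err)
    }
    defer f.Close()

    // os.Create returns an *os.File, which already implements
    // io.Writer, so it can be passed to WriteCSV directly.
    if err := df.WriteCSV(f); err != nil {
        log.Fatal(err)
    }
}

The point that's easy to miss: no csv.NewWriter wrapper is needed, because *os.File itself satisfies io.Writer.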

Related

Uses of io.ReadCloser

Could someone please explain (and/or share examples of) when and why readers should be closed explicitly, i.e. implement io.ReadCloser, not just io.Reader?
For example, when you are working with files, or any resource that must be closed to release what it has allocated (memory, file descriptors, or resources owned by C code called from Go).
You may use it when you have both Read and Close methods. Here is an example showing how one common function can work with different types through io.ReadCloser:
package main

import (
    "fmt"
    "io"
    "log"
    "os"
)

func main() {
    f, err := os.Open("./main.go")
    if err != nil {
        log.Fatal(err)
    }
    doIt(f)
    doIt(os.Stdin)
}

func doIt(rc io.ReadCloser) {
    defer rc.Close()
    buf := make([]byte, 4)
    n, err := rc.Read(buf)
    if err != nil {
        log.Fatal(err)
    }
    fmt.Printf("%s\n", buf[:n])
}
Run it and enter 12345 as input. Output:
pack
12345
1234
Here pack is the first four bytes of main.go, 12345 is the terminal echoing your input, and 1234 is the first four bytes read from stdin.
See also:
Does Go automatically close resources if not explicitly closed?
It's for an explicit declaration that a value is both a Reader and a Closer. So let's say you write some functionality that reads data, but you also want to close the resource after doing so (again, so as not to leak descriptors):
func ...(r io.ReadCloser) {
    defer r.Close()
    ... // some reading
}
Anything you pass to it must implement both interfaces, whether it's an os.File or any custom struct. In this case you are forcing the client of your API to provide Read and Close implementations, not just io.Reader.
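To make that concrete, here is a minimal sketch (stringReadCloser is a made-up name for this example) of a custom type that satisfies io.ReadCloser simply by having both methods; the standard library's io.NopCloser (Go 1.16+, previously ioutil.NopCloser) does the same for readers that have nothing to close:

package main

import (
    "fmt"
    "io"
    "strings"
)

// stringReadCloser embeds *strings.Reader for Read and adds a
// no-op Close, which together satisfy io.ReadCloser.
type stringReadCloser struct {
    *strings.Reader
}

func (s stringReadCloser) Close() error { return nil }

func main() {
    var rc io.ReadCloser = stringReadCloser{strings.NewReader("hello")}
    defer rc.Close()

    b, err := io.ReadAll(rc)
    if err != nil {
        fmt.Println(err)
        return
    }
    fmt.Println(string(b)) // hello

    // Standard library equivalent for a reader with nothing to close:
    rc2 := io.NopCloser(strings.NewReader("world"))
    defer rc2.Close()
}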

Google sheets API - download data with no formatting

Using Go, when fetching sheet data, the data arrives with its cell formatting applied,
i.e. "$123,456" while I need the original 123456.
Is there something in the API that can remove formatting, like formatting: false?
code:
package main

import (
    "io/ioutil"
    "log"

    "golang.org/x/net/context"
    "golang.org/x/oauth2/google"
    "gopkg.in/Iwark/spreadsheet.v2"
)

const spreadsheetID = "###" // Please set the Spreadsheet ID.

func main() {
    service := authenticate()
    spreadsheet, err := service.FetchSpreadsheet(spreadsheetID)
    checkError(err)
    sheet, err := spreadsheet.SheetByIndex(1)
    checkError(err)
    for _, row := range sheet.Rows {
        var csvRow []string
        for _, cell := range row {
            csvRow = append(csvRow, cell.Value)
        }
        log.Println(csvRow)
    }
}

// authenticate builds a Sheets service from a service-account key file.
func authenticate() *spreadsheet.Service {
    data, err := ioutil.ReadFile("secret.json")
    checkError(err)
    conf, err := google.JWTConfigFromJSON(data, spreadsheet.Scope)
    checkError(err)
    client := conf.Client(context.TODO())
    service := spreadsheet.NewServiceWithClient(client)
    return service
}

func checkError(err error) {
    if err != nil {
        panic(err.Error())
    }
}
Yes, there is a way. The Method: spreadsheets.values.get endpoint has a request parameter called valueRenderOption; one of its values is UNFORMATTED_VALUE, which, as its name suggests, gives you back all the data without formatting.
Try this request with the range of values you want, play around with the API, and see the unformatted values it returns.
You want to retrieve $123,456 as 123456 from Google Spreadsheet.
$123,456 is what the cell format displays; the underlying value is the number.
You want to achieve this using gopkg.in/Iwark/spreadsheet.v2 with golang.
You have already been able to get and put values in Google Spreadsheet using a service account with the Sheets API.
If my understanding is correct, how about this answer? Please think of this as just one of several possible answers.
Modification points:
When I looked at the source of the gopkg.in/Iwark/spreadsheet.v2 library, I noticed that the values are retrieved with the spreadsheets.get method of the Sheets API.
Furthermore, it uses spreadsheetId,properties.title,sheets(properties,data.rowData.values(formattedValue,note)) as the fields, and the fields appear to be constant.
The reason for your issue is that the values are retrieved as formattedValue. In your case, the values need to be retrieved as userEnteredValue.
To achieve your goal with the gopkg.in/Iwark/spreadsheet.v2 library, you therefore need to modify the library's source.
Modified script:
Please modify the files of gopkg.in/Iwark/spreadsheet.v2 as follows. Of course, please back up the original files so you can restore the original library.
1. service.go
Modify line 116 as follows.
From:
fields := "spreadsheetId,properties.title,sheets(properties,data.rowData.values(formattedValue,note))"
To:
fields := "spreadsheetId,properties.title,sheets(properties,data.rowData.values(formattedValue,note,userEnteredValue))"
2. sheet.go
Modify line 52 as follows.
From:
Value: cellData.FormattedValue,
To:
Value: strconv.FormatFloat(cellData.UserEnteredValue.NumberValue, 'f', 4, 64),
And add "strconv" to import section like below.
import (
"encoding/json"
"strings"
"strconv"
)
3. cell_data.go
Modify line 8 as follows.
From:
// UserEnteredFormat *CellFormat `json:"userEnteredFormat"`
To:
UserEnteredValue struct {
    NumberValue float64 `json:"numberValue"`
} `json:"userEnteredValue"`
(The field must be named UserEnteredValue with the userEnteredValue JSON tag so that it matches both the fields string set in service.go and the cellData.UserEnteredValue.NumberValue reference in sheet.go.)
Result:
In this case, your script does not need to be modified. After the above modification, when you run your script, you will see [123456.0000] in the console. As an important point, this library treats values as the string type, and this modification keeps that convention. If you want to use another type, please modify the library accordingly.
Other pattern:
As another way to achieve your goal, how about using google-api-go-client? You can read about it in the Go Quickstart. When google-api-go-client is used, the sample script becomes as follows. In this case, as a test case, the spreadsheets.get method is used.
Sample script 1:
In this sample script, authenticate() and checkError() from your script are reused with small modifications.
package main

import (
    "fmt"
    "io/ioutil"
    "net/http"

    "golang.org/x/net/context"
    "golang.org/x/oauth2/google"
    "google.golang.org/api/sheets/v4"
)

func main() {
    c := authenticate()
    sheetsService, err := sheets.New(c)
    checkError(err)
    spreadsheetId := "###"       // Please set the Spreadsheet ID.
    ranges := []string{"Sheet1"} // Please set the sheet name.
    resp, err := sheetsService.Spreadsheets.Get(spreadsheetId).Ranges(ranges...).Fields("sheets.data.rowData.values.userEnteredValue").Do()
    checkError(err)
    for _, row := range resp.Sheets[0].Data[0].RowData {
        for _, col := range row.Values {
            fmt.Println(col.UserEnteredValue)
        }
    }
}

func authenticate() *http.Client {
    data, err := ioutil.ReadFile("serviceAccount_20190511.json")
    checkError(err)
    conf, err := google.JWTConfigFromJSON(data, sheets.SpreadsheetsScope)
    checkError(err)
    client := conf.Client(context.TODO())
    return client
}

func checkError(err error) {
    if err != nil {
        panic(err.Error())
    }
}
Sample script 2:
When spreadsheets.values.get is used, main() is as follows.
func main() {
    c := authenticate()
    sheetsService, err := sheets.New(c)
    checkError(err)
    spreadsheetId := "###" // Please set the Spreadsheet ID.
    sheetName := "Sheet1"  // Please set the sheet name.
    resp, err := sheetsService.Spreadsheets.Values.Get(spreadsheetId, sheetName).ValueRenderOption("UNFORMATTED_VALUE").Do()
    checkError(err)
    fmt.Println(resp.Values)
}
Here, UNFORMATTED_VALUE is used to retrieve the values without the cell format. This has already been covered by alberto vielma's answer.
References:
Method: spreadsheets.get
google-api-go-client
Go Quickstart
Method: spreadsheets.values.get
If I misunderstood your question and this was not the direction you want, I apologize.

io.Writer in Go - beginner trying to understand them

As a beginner in Go, I have problems understanding io.Writer.
My target: take a struct and write it into a json file.
Approach:
- use encoding/json.Marshal to convert my struct into bytes
- feed those bytes to an os.File Writer
This is how I got it working:
package main

import (
    "encoding/json"
    "os"
)

type Person struct {
    Name       string
    Age        uint
    Occupation []string
}

func MakeBytes(p Person) []byte {
    b, _ := json.Marshal(p)
    return b
}

func main() {
    gandalf := Person{
        "Gandalf",
        56,
        []string{"sourcerer", "foo fighter"},
    }
    myFile, err := os.Create("output1.json")
    if err != nil {
        panic(err)
    }
    myBytes := MakeBytes(gandalf)
    myFile.Write(myBytes)
}
After reading this article, I changed my program to this:
package main

import (
    "encoding/json"
    "io"
    "os"
)

type Person struct {
    Name       string
    Age        uint
    Occupation []string
}

// The correct name for this function would be simply Write,
// but I use WriteToFile for my understanding.
func (p *Person) WriteToFile(w io.Writer) {
    b, _ := json.Marshal(*p)
    w.Write(b)
}

func main() {
    gandalf := Person{
        "Gandalf",
        56,
        []string{"sourcerer", "foo fighter"},
    }
    myFile, err := os.Create("output2.json")
    if err != nil {
        panic(err)
    }
    gandalf.WriteToFile(myFile)
}
In my opinion, the first example is more straightforward and easier for a beginner to understand... but I have the feeling that the 2nd example is the idiomatic Go way of achieving the target.
Questions:
1. Is the above assumption correct (that the 2nd option is idiomatic Go)?
2. Is there a difference between the above options? Which one is better?
3. Are there other ways to achieve the same target?
Thank you,
WM
The benefit of using the second method is that if you accept a Writer interface, you can pass anything which implements Write -- not only a file but an http.ResponseWriter, for example, or os.Stdout, without changing the struct's methods.
You can see this handy blog post walking through the io package. The author makes the case that accepting readers and writers as parameters makes your code more flexible, in part because so many functions in the standard library use the Reader and Writer interfaces.
As you come to use Go more, you'll notice how much the standard library leans on Reader and Writer interfaces, and probably come to appreciate it :)
So this function (renamed):

// WriteJson writes the JSON representation of Person to w.
func (p *Person) WriteJson(w io.Writer) error {
    b, err := json.Marshal(*p)
    if err != nil {
        return err
    }
    _, err = w.Write(b)
    return err
}
Would write to a File, an http Response, a user's Stdout, or even a simple bytes Buffer, making testing a bit simpler.
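For example, here is a minimal sketch of exercising it against a bytes.Buffer (Person and WriteJson copied from above); no file is involved, which is what makes tests simpler:

package main

import (
    "bytes"
    "encoding/json"
    "fmt"
    "io"
)

type Person struct {
    Name       string
    Age        uint
    Occupation []string
}

// WriteJson writes the JSON representation of Person to w.
func (p *Person) WriteJson(w io.Writer) error {
    b, err := json.Marshal(*p)
    if err != nil {
        return err
    }
    _, err = w.Write(b)
    return err
}

func main() {
    gandalf := Person{"Gandalf", 56, []string{"sourcerer", "foo fighter"}}

    var buf bytes.Buffer // bytes.Buffer implements io.Writer
    if err := gandalf.WriteJson(&buf); err != nil {
        fmt.Println(err)
        return
    }
    fmt.Println(buf.String()) // {"Name":"Gandalf","Age":56,"Occupation":["sourcerer","foo fighter"]}
}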
I renamed it because of what it does; that is, this function takes a Person struct and:
- marshals the struct into a JSON representation
- writes the JSON to a Writer
- returns any error arising from marshalling/writing
One more thing: you might be confused as to what a Writer is, because it is not a data type but rather an interface -- that is, a behavior a data type provides, a predefined method set that a type implements. Anything that implements the Write method, then, is considered a writer.
This can be a bit difficult for beginners to grasp at first, but there are lots of resources online to help understand interfaces (Readers and Writers are some of the more common interfaces you'll encounter, along with Error(), i.e. all errors).
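To illustrate, here is a minimal sketch (countingWriter is a made-up type for this example): having a single Write method is all it takes for a type to be usable anywhere an io.Writer is expected:

package main

import (
    "fmt"
    "io"
)

// countingWriter discards the data but counts the bytes "written".
// The one Write method is enough to satisfy io.Writer.
type countingWriter struct {
    n int
}

func (c *countingWriter) Write(p []byte) (int, error) {
    c.n += len(p)
    return len(p), nil
}

func main() {
    cw := &countingWriter{}
    var w io.Writer = cw // compiles: *countingWriter implements io.Writer

    fmt.Fprintf(w, "hello, %s", "world") // Fprintf accepts any io.Writer
    fmt.Println(cw.n)                    // 12
}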

Is there a simple way to convert data base rows to JSON in Golang

From what I've seen so far, converting database rows to JSON or to []map[string]interface{} is not simple. I have to create two slices and then loop through the columns and create keys every time.
...Some code
tableData := make([]map[string]interface{}, 0)
values := make([]interface{}, count)
valuePtrs := make([]interface{}, count)
for rows.Next() {
    for i := 0; i < count; i++ {
        valuePtrs[i] = &values[i]
    }
    rows.Scan(valuePtrs...)
    entry := make(map[string]interface{})
    for i, col := range columns {
        var v interface{}
        val := values[i]
        b, ok := val.([]byte)
        if ok {
            v = string(b)
        } else {
            v = val
        }
        entry[col] = v
    }
    tableData = append(tableData, entry)
}
Is there any package for this? Or am I missing some basics here?
I'm dealing with the same issue; as far as my investigation goes, there looks to be no other way.
All the packages that I have seen use basically the same method.
A few things you should know that will hopefully save you time:
- the database/sql package converts all the data to the appropriate types
- if you are using the mysql driver (go-sql-driver/mysql), you need to add a parameter to your connection string for it to return time.Time instead of a string (use ?parseTime=true; the default is false) -- see the sketch below
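For example, a minimal sketch of such a connection string (user, password, and database name are placeholders):

package main

import (
    "database/sql"
    "log"

    _ "github.com/go-sql-driver/mysql"
)

func main() {
    // With parseTime=true, DATE and DATETIME columns scan into
    // time.Time values instead of strings/[]byte.
    db, err := sql.Open("mysql", "user:password@tcp(127.0.0.1:3306)/mydb?parseTime=true")
    if err != nil {
        log.Fatal(err)
    }
    defer db.Close()
}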
You can use tools written by the community to offload the overhead:
- sqlx, a minimalistic wrapper around database/sql, which internally uses a similar reflection-based approach
- if you need more functionality, try an "orm": gorp, gorm
If you are interested in diving deeper, check out:
- the use of reflection in the sqlx package, sqlx.go line 560
- data type conversion in the database/sql package, convert.go line 86
One thing you could do is create a struct that models your data.
Note: I am using MS SQL Server.
So let's say you want to get a user:
type User struct {
    ID       int    `json:"id,omitempty"`
    UserName string `json:"user_name,omitempty"`
    ...
}
Then you can do this:

func GetUser(w http.ResponseWriter, req *http.Request) {
    var u User
    params := mux.Vars(req)
    db, err := sql.Open("mssql", "server=ServerName")
    if err != nil {
        log.Fatal(err)
    }
    err = db.QueryRow("select Id, UserName from [YourDatabase].dbo.Users where Id = ?", params["id"]).Scan(&u.ID, &u.UserName)
    if err != nil {
        log.Fatal(err)
    }
    if err := json.NewEncoder(w).Encode(&u); err != nil {
        log.Fatal(err)
    }
}
Here are the imports I used
import (
    "database/sql"
    "encoding/json"
    "log"
    "net/http"

    _ "github.com/denisenkom/go-mssqldb"
    "github.com/gorilla/mux"
)
This allowed me to get data from the database and get it into JSON.
This takes a while to code, but it works really well.
Not in the Go distribution itself, but there is the wonderful jmoiron/sqlx:
import "github.com/jmoiron/sqlx"
tableData := make([]map[string]interface{}, 0)
for rows.Next() {
entry := make(map[string]interface{})
err := rows.MapScan(entry)
if err != nil {
log.Fatal("SQL error: " + err.Error())
}
tableData = append(tableData, entry)
}
If you know the data type that you are reading, then you can read into that data type without using a generic interface.
Otherwise, there is no solution regardless of the language used, due to the nature of JSON itself.
JSON does not carry a description of composite data structures. In other words, JSON is a generic key-value structure. When a parser encounters what is supposed to be a specific structure, there is no identification of that structure in JSON itself. For example, if you have a structure User, the parser would not know how a set of key-value pairs maps to your structure User.
The problem of type recognition is usually addressed with a document schema (a.k.a. XSD in the XML world) or explicitly through a passed expected data type.
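In Go, "passing the expected data type" is exactly what happens when you hand encoding/json a typed destination; a minimal sketch (the User struct here is just an example):

package main

import (
    "encoding/json"
    "fmt"
)

type User struct {
    ID       int    `json:"id"`
    UserName string `json:"user_name"`
}

func main() {
    data := []byte(`{"id": 1, "user_name": "gopher"}`)

    // The parser cannot infer User from the JSON alone; we supply
    // the expected type explicitly via the destination argument.
    var u User
    if err := json.Unmarshal(data, &u); err != nil {
        fmt.Println(err)
        return
    }
    fmt.Printf("%+v\n", u) // {ID:1 UserName:gopher}
}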
One quick way to get an arbitrary and generic []map[string]interface{} out of these query libraries is to populate an array of interface pointers with the same size as the number of columns in the query, and then pass that to the Scan function:
// For example, for the go-mssqldb lib:
queryResponse, err := d.pool.Query(query)
if err != nil {
    return nil, err
}
defer queryResponse.Close()

// Holds all the end-results
results := []map[string]interface{}{}

// Getting details about all the fields from the query
fieldNames, err := queryResponse.Columns()
if err != nil {
    return nil, err
}

// Creating interface-type pointers within an array of the same
// size as the number of columns we have, so that we can properly
// pass this to the "Scan" function and get all the query parameters back :)
var scanResults []interface{}
for range fieldNames {
    var v interface{}
    scanResults = append(scanResults, &v)
}

// Parsing the query results into the result map
for queryResponse.Next() {
    // This variable will hold the values for all the columns, keyed by column name
    rowValues := map[string]interface{}{}

    // Cleaning up old values just in case
    for _, column := range scanResults {
        *(column.(*interface{})) = nil
    }

    // Scan into the array of pointers
    err := queryResponse.Scan(scanResults...)
    if err != nil {
        return nil, err
    }

    // Map the pointers back to their values and the associated column names
    for index, column := range scanResults {
        rowValues[fieldNames[index]] = *(column.(*interface{}))
    }

    results = append(results, rowValues)
}

return results, nil

From io.Reader to string in Go

I have an io.ReadCloser object (from an http.Response object).
What's the most efficient way to convert the entire stream to a string object?
EDIT:
Since 1.10, strings.Builder exists. Example:
buf := new(strings.Builder)
_, err := io.Copy(buf, r)
// check errors
fmt.Println(buf.String())
OUTDATED INFORMATION BELOW
The short answer is that it will not be efficient, because converting to a string requires a complete copy of the byte array. Here is the proper (non-efficient) way to do what you want:
buf := new(bytes.Buffer)
buf.ReadFrom(yourReader)
s := buf.String() // Does a complete copy of the bytes in the buffer.
This copy is done as a protection mechanism. Strings are immutable; if you could convert a []byte to a string without a copy, you could change the contents of the string. However, Go allows you to disable the type-safety mechanisms using the unsafe package. Use the unsafe package at your own risk. Hopefully the name alone is a good enough warning. Here is how I would do it using unsafe:
buf := new(bytes.Buffer)
buf.ReadFrom(yourReader)
b := buf.Bytes()
s := *(*string)(unsafe.Pointer(&b))
There we go, you have now efficiently converted your byte array to a string. Really, all this does is trick the type system into calling it a string. There are a couple of caveats to this method:
- There are no guarantees this will work in all Go compilers. While this works with the plan-9 gc compiler, it relies on "implementation details" not mentioned in the official spec. You cannot even guarantee that this will work on all architectures or not be changed in gc. In other words, this is a bad idea.
- That string is mutable! If you make any calls on that buffer, it will change the string. Be very careful.
My advice is to stick to the official method. Doing a copy is not that expensive, and it is not worth the evils of unsafe. If the string is too large to copy, you should not be making it into a string.
Answers so far haven't addressed the "entire stream" part of the question. I think the good way to do this is ioutil.ReadAll. With your io.ReadCloser named rc, I would write:
Go >= v1.16
if b, err := io.ReadAll(rc); err == nil {
    return string(b)
} ...
Go <= v1.15
if b, err := ioutil.ReadAll(rc); err == nil {
    return string(b)
} ...
data, _ := ioutil.ReadAll(response.Body)
fmt.Println(string(data))
func copyToString(r io.Reader) (res string, err error) {
    var sb strings.Builder
    if _, err = io.Copy(&sb, r); err == nil {
        res = sb.String()
    }
    return
}
The most efficient way would be to always use []byte instead of string.
In case you need to print data received from the io.ReadCloser, the fmt package can handle []byte, but it isn't efficient because the fmt implementation will internally convert []byte to string. In order to avoid this conversion, you can implement the fmt.Formatter interface for a type like type ByteSlice []byte.
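A minimal sketch of that idea (ByteSlice as named above; note that fmt.State is itself an io.Writer, so the bytes can be written out with no string conversion):

package main

import "fmt"

type ByteSlice []byte

// Format writes the bytes straight to the fmt.State (an io.Writer),
// bypassing the []byte-to-string conversion fmt would otherwise do.
func (b ByteSlice) Format(f fmt.State, verb rune) {
    f.Write(b)
}

func main() {
    data := ByteSlice("data received from a reader")
    fmt.Printf("%v\n", data) // data received from a reader
}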
var b bytes.Buffer
b.ReadFrom(r)
// b.String()
I like the bytes.Buffer struct. I see it has ReadFrom and String methods. I've used it with a []byte but not an io.Reader.
