I am trying to convert a UTF-8 string into a Latin-1 string (replacing every character that Latin-1 cannot represent with '?') so that I can save it to a text file used by a Latin-1-only system.
For testing purposes I used this code:
package main
import (
"errors"
"fmt"
"os"
"golang.org/x/text/encoding/charmap"
"golang.org/x/text/transform"
)
func main() {
strUTF8 := "example 1: Г, example 2: ≤, example 3: “, etc" // UTF-8 string to convert (contains 3 runes that Latin-1 cannot represent)
t := charmap.ISO8859_1.NewEncoder() // transformer to Latin-1
ini := 0 // ini is the start position of the part of the string still to be analyzed
strISO88591 := "" // initiate Latin-1 compatible string
counter := 0 // put a counter to forcibly break after 5 iters of the loop ('for' as a 'while' is not working as expected)
err := errors.New("initiate err with non-nil value") // initiate err with non-nil value to enter the loop
for err != nil { // loop should exit for when 'err != nil' evaluates to false, that is, when err == nil
str, n, err := transform.String(t, strUTF8[ini:])
if err != nil {
ini = ini + n
runes := []rune(strUTF8[ini:])
ini = ini + len(string(runes[0])) //initial position of string in next iter should jump chars already converted + not allowed rune
str = str + "?"
}
strISO88591 = strISO88591 + str
// prints to display results
fmt.Println("sISO88591:", strISO88591)
fmt.Println("err:", err)
fmt.Println("err!=nil:", err != nil)
fmt.Println()
// with the following 3 lines below it works (why not in the 'for' statement????)
//if err == nil {
// break
//}
// put a counter to forcibly break after 5 iters of the loop
counter += 1
if counter > 4 {
fmt.Println("breaking forcibly")
break
}
}
f, _ := os.Create("test.txt")
defer f.Close()
_, err = f.WriteString(strISO88591)
if err != nil {
panic(err)
}
}
That code prints in the terminal:
sISO88591: example 1: ?
err: encoding: rune not supported by encoding.
err!=nil: true
sISO88591: example 1: ?, example 2: ?
err: encoding: rune not supported by encoding.
err!=nil: true
sISO88591: example 1: ?, example 2: ?, example 3: ?
err: encoding: rune not supported by encoding.
err!=nil: true
sISO88591: example 1: ?, example 2: ?, example 3: ?, etc
err: <nil>
err!=nil: false
sISO88591: example 1: ?, example 2: ?, example 3: ?, etc, etc
err: <nil>
err!=nil: false
breaking forcibly
As we can see, after the 4th iteration 'err != nil' evaluates to false, so I expected the 'for err != nil' loop to exit, but it never does (until I forcibly break it with a counter).
Isn't 'for' supposed to work like 'while' in other languages? Am I doing something wrong?
Specs:
Go version: go1.19.5 windows/amd64
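You are redeclaring err. The := inside the loop body declares a new err scoped to that body, so the loop condition keeps testing the outer err, which never changes: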
for err != nil { // This is using err declared before
str, n, err := transform.String(t, strUTF8[ini:]) // This is redeclaring and shadowing previous err
...
You can fix it by declaring str and n separately and using plain assignment:
for err != nil {
var str string
var n int
str, n, err = transform.String(t, strUTF8[ini:]) // This assigns to the err declared before the loop
...
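The difference between = and := in the loop body can be seen in a small standard-library-only sketch. The step function is a hypothetical stand-in for transform.String, not part of any package; it fails until it has been called a given number of times.

```go
package main

import (
	"errors"
	"fmt"
)

// step is a hypothetical stand-in for transform.String: it returns an
// error until it has been called limit times.
func step(calls *int, limit int) error {
	*calls++
	if *calls < limit {
		return errors.New("not done yet")
	}
	return nil
}

// countIterations runs a while-style loop and reports how many times
// the body executed before err became nil.
func countIterations(limit int) int {
	calls := 0
	err := errors.New("non-nil so the loop is entered")
	for err != nil {
		// Plain '=' assigns to the outer err that the condition tests.
		// A ':=' here would declare a second err scoped to the body
		// and leave the outer one unchanged.
		err = step(&calls, limit)
	}
	return calls
}

func main() {
	fmt.Println("iterations:", countIterations(4))
}
```

With plain assignment the loop ends as soon as step returns nil, which is the while-style behavior the question expected.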
Go's for also covers while-style and infinite loops. As a very popular Go tutorial puts it: "If you omit the loop condition it loops forever, so an infinite loop is compactly expressed."
So for is the right construct here. The problem in your code is the way err is declared inside the loop.
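For the underlying goal (replacing every rune that Latin-1 cannot represent with '?'), a loop over the runes may be simpler than restarting the transformer after each error. This is a sketch of a different technique from the one in the question: it does not use golang.org/x/text and relies on the fact that ISO-8859-1 maps byte values 0x00 to 0xFF directly onto Unicode code points U+0000 to U+00FF. The name toLatin1 is hypothetical.

```go
package main

import "fmt"

// toLatin1 is a hypothetical helper (not part of golang.org/x/text).
// It encodes s as ISO-8859-1, replacing every rune above U+00FF with '?'.
func toLatin1(s string) []byte {
	out := make([]byte, 0, len(s))
	for _, r := range s {
		if r > 0xFF {
			// Not representable in Latin-1. Invalid UTF-8 also ends up
			// here, because range yields U+FFFD for it.
			out = append(out, '?')
			continue
		}
		out = append(out, byte(r))
	}
	return out
}

func main() {
	in := "example 1: Г, example 2: ≤, example 3: “, etc"
	fmt.Printf("%s\n", toLatin1(in))
}
```

The resulting byte slice can be written to the file with f.Write instead of f.WriteString.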
Related
If I have something like this
Case 1:
if str, err := m.something(); err != nil {
return err
}
fmt.Println(str) //str is undefined variable
Case 2:
str, err := m.something();
fmt.Println(str) //str is ok
My question is: why does the scope of the variable str change when it's used in a form like this?
if str, err := m.something(); err != nil {
return err
//str scope ends
}
Because if statements (and for, and switch) are implicit blocks, according to the language spec, and := both declares and assigns. Variables declared in the if header therefore go out of scope when the if statement ends. If you want str to be available after the if, declare the variables first, and then assign to them with = in the if statement:
var str string
var err error
if str, err = m.something(); err != nil {
// ...
}
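A complete, runnable version of that pattern might look like the following. The something function is a hypothetical stand-in for m.something(), and describe exists only for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// something is a hypothetical stand-in for m.something().
func something(fail bool) (string, error) {
	if fail {
		return "", errors.New("something failed")
	}
	return "value", nil
}

// describe declares str and err before the if statement, so both
// variables outlive it.
func describe(fail bool) (string, error) {
	var str string
	var err error
	if str, err = something(fail); err != nil {
		return "", err
	}
	return "got " + str, nil // str is still in scope here
}

func main() {
	fmt.Println(describe(false))
	fmt.Println(describe(true))
}
```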
I would like to parse a package and output all of the strings in the code. The specific use case is to collect sql strings and run them through a sql parser, but that's a separate issue.
Is the best way to do this to just parse this line by line? Or is it possible to regex this or something? I imagine that some cases might be nontrivial, such as multiline strings:
str := "This is
the full
string"
// want > This is the full string
Use the go/scanner package to scan for strings in Go source code:
src, err := os.ReadFile(fname)
if err != nil {
// handle error
}
// Create *token.File to scan.
fset := token.NewFileSet()
file := fset.AddFile(fname, fset.Base(), len(src))
var s scanner.Scanner
s.Init(file, src, nil, 0)
for {
pos, tok, lit := s.Scan()
if tok == token.EOF {
break
}
if tok == token.STRING {
str, _ := strconv.Unquote(lit) // named str to avoid shadowing the scanner s
fmt.Printf("%s: %s\n", fset.Position(pos), str)
}
}
https://go.dev/play/p/849QsbqVhho
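The same approach as a self-contained sketch that scans an in-memory source instead of a file. The helper name extractStrings and the file name "example.go" are made up for illustration. Note that a string literal spanning several lines has to be a raw (backquoted) literal in Go; the scanner reports it as token.STRING as well, and strconv.Unquote handles both forms.

```go
package main

import (
	"fmt"
	"go/scanner"
	"go/token"
	"strconv"
)

// extractStrings is a hypothetical helper that returns the unquoted value
// of every string literal found in src.
func extractStrings(src []byte) []string {
	fset := token.NewFileSet()
	file := fset.AddFile("example.go", fset.Base(), len(src))

	var s scanner.Scanner
	s.Init(file, src, nil, 0) // nil error handler: scan errors are ignored

	var out []string
	for {
		_, tok, lit := s.Scan()
		if tok == token.EOF {
			break
		}
		if tok != token.STRING {
			continue
		}
		// Unquote accepts both "interpreted" and `raw` literals.
		if str, err := strconv.Unquote(lit); err == nil {
			out = append(out, str)
		}
	}
	return out
}

func main() {
	src := []byte("package p\n\nvar q = \"SELECT 1\"\nvar r = `This is\nthe full\nstring`\n")
	for _, str := range extractStrings(src) {
		fmt.Printf("%q\n", str)
	}
}
```

Import paths are string literals too, so a real collector would probably want to skip them or filter the results before handing them to a SQL parser.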
package main
import (
"fmt"
"log"
)
func main() {
var num int
n, err := fmt.Scanf("%d", &num)
fmt.Println(n, num)
if err != nil {
log.Fatalln(err)
}
var r1 rune
n, err = fmt.Scanf("%c", &r1)
fmt.Println(n, r1)
if err != nil {
log.Fatalln(err)
}
var r2 rune
n, err = fmt.Scanf("%c", &r2)
fmt.Println(n, r2)
if err != nil {
log.Fatalln(err)
}
}
Input (keyboard keys):
1 enter a enter
Output:
1 1
1 97
1 10
Why is the value of r2 \n while the value of r1 is a?
From the documentation comment of fmt.Scanf:
Newlines in the input must match newlines in the format. The one exception: the verb %c always scans the next rune in the input, even if it is a space (or tab etc.) or newline.
It seems that the newline after the %d is consumed but the newline after %c is not. Is the newline after the %d a mismatch?
Another example: https://play.studygolang.com/p/lRgxrUqyBTI. I tried to use a buffer as a substitute for stdin, but the output differs from what I get with stdin.
Go version: go1.17.1 windows/amd64
A newline is a single character that is valid input for a rune but not for an int, so when you scan into a rune the newline is read and stored, which does not happen for other types.
If you don't capture the newline while reading into a rune, you will get the error "unexpected newline" if you try to read from stdin again.
I suggest you read runes as follows, so that you always capture the newline and don't get unexpected results:
var r1, r2 rune
n, err = fmt.Scanf("%c%c", &r1, &r2)
fmt.Println(n, r1, r2)
if err != nil {
log.Fatalln(err)
}
package main
import "fmt"
func main() {
fmt.Println("Enter a number: ")
var addendOne int = fmt.Scan()
fmt.Println("Enter another number: ")
var addendTwo int = fmt.Scan()
sum := addendOne + addendTwo
fmt.Println(addendOne, " + ", addendTwo, " = ", sum)
}
This raises an error:
multiple values in single-value context.
Why does it happen and how do we fix it?
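fmt.Scan returns two values, and you're only catching one into addendOne. Note also that the first return value is the number of items scanned, not the number that was typed; the input is stored through the pointer you pass to Scan.
You should catch the error as well, like this:
var addendTwo int
_, err := fmt.Scan(&addendTwo)
if err != nil {
// handle error here
}
If you want to ignore the error value (not recommended!), do it like this:
_, _ = fmt.Scan(&addendTwo)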
fmt.Scan() returns two values and your code expects just one when you call it.
The Scan signature func Scan(a ...interface{}) (n int, err error) returns first the number of scanned items and then possibly an error. A nil value in the error position indicates that there was no error. The scanned values themselves are stored in the arguments, which must be pointers.
Change your code like this:
var addendOne, addendTwo int
_, err := fmt.Scan(&addendOne)
if err != nil {
// Check your error here
}
fmt.Println("Enter another number: ")
_, err = fmt.Scan(&addendTwo)
if err != nil {
// Check your error here
}
If you really want to ignore the errors, you can use the blank identifier _:
_, _ = fmt.Scan(&addendOne)
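Because Scan returns an int and an error, you have to receive both values, for example with the := syntax, which is shorthand for declaring and initializing. The numbers themselves are read into the variables you pass by pointer:
var addendOne, addendTwo int
_, err := fmt.Scan(&addendOne)
_, err = fmt.Scan(&addendTwo)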
From golang fmt documentation:
func Scan(a ...interface{}) (n int, err error)
I'm trying to parse a string from a WebSocket connection in Go. I'm implementing both sides of the connection, so the data format is entirely up to me.
As this is a simple app (mostly for learning purposes), I've come up with the format "ActionId Data", where ActionId is a uint8. BackendHandler is the handler for every request on the WebSocket connection.
Platform information
kuba:~$ echo {$GOARCH,$GOOS,`6g -V`}
amd64 linux 6g version release.r60.3 9516
code:
const ( // Specifies ActionId's
SabPause = iota
)
func BackendHandler(ws *websocket.Conn) {
buf := make([]byte, 512)
_, err := ws.Read(buf)
if err != nil { panic(err.String()) }
str := string(buf)
tmp, _ := strconv.Atoi(str[:0])
data := str[2:]
fmt.Println(tmp, data)
switch tmp {
case SabPause:
// Here I get `parsing "2": invalid argument`
// when passing "0 2" to websocket connection
minutes, ok := strconv.Atoui(data)
if ok != nil {
panic(ok.String())
}
PauseSab(uint8(minutes))
default:
panic("Unmatched input for BackendHandler")
}
}
All the output: (note the Println that I used for inspecting)
0 2
panic: parsing "2": invalid argument [recovered]
panic: runtime error: invalid memory address or nil pointer dereference
I couldn't find the code that raises this error, only the place where the error code is defined (it is platform dependent). I'd appreciate general ideas for improving my code, but mainly I just want to solve the conversion problem.
Is this related to my buffer -> string conversion and slice manipulation (I didn't want to use the SplitAfter methods)?
Edit
This code reproduces the problem:
package main
import (
"strconv"
"io/ioutil"
)
func main() {
buf , _ := ioutil.ReadFile("input")
str := string(buf)
_, ok := strconv.Atoui(str[2:])
if ok != nil {
panic(ok.String())
}
}
The file input has to contain 0 2\r\n (depending on the line ending, it may look different on other OSes). This code can be fixed by adding an end index to the reslice, like this:
_, ok := strconv.Atoui(str[2:3])
You didn't provide a small compilable and runnable program to illustrate your problem. Nor did you provide full and meaningful print diagnostic messages.
My best guess is that you have a C-style null-terminated string. For example, simplifying your code,
package main
import (
"fmt"
"strconv"
)
func main() {
buf := make([]byte, 512)
buf = []byte("0 2\x00") // test data
str := string(buf)
tmp, err := strconv.Atoi(str[:0])
if err != nil {
fmt.Println(err)
}
data := str[2:]
fmt.Println("tmp:", tmp)
fmt.Println("str:", len(str), ";", str, ";", []byte(str))
fmt.Println("data", len(data), ";", data, ";", []byte(data))
// Here I get `parsing "2": invalid argument`
// when passing "0 2" to websocket connection
minutes, ok := strconv.Atoui(data)
if ok != nil {
panic(ok.String())
}
_ = minutes
}
Output:
parsing "": invalid argument
tmp: 0
str: 4 ; 0 2 ; [48 32 50 0]
data 2 ; 2 ; [50 0]
panic: parsing "2": invalid argument
runtime.panic+0xac /home/peter/gor/src/pkg/runtime/proc.c:1254
runtime.panic(0x4492c0, 0xf840002460)
main.main+0x603 /home/peter/gopath/src/so/temp.go:24
main.main()
runtime.mainstart+0xf /home/peter/gor/src/pkg/runtime/amd64/asm.s:78
runtime.mainstart()
runtime.goexit /home/peter/gor/src/pkg/runtime/proc.c:246
runtime.goexit()
----- goroutine created by -----
_rt0_amd64+0xc9 /home/peter/gor/src/pkg/runtime/amd64/asm.s:65
If you add my print diagnostic statements to your code, what do you see?
Note that your tmp, _ := strconv.Atoi(str[:0]) statement is probably wrong, since str[:0] is equivalent to str[0:0], which is equivalent to the empty string "".
I suspect that your problem is that you are ignoring the n return value from ws.Read. For example (including diagnostic messages), I would expect,
buf := make([]byte, 512)
buf = buf[:cap(buf)]
n, err := ws.Read(buf)
if err != nil {
panic(err.String())
}
fmt.Println(len(buf), n)
buf = buf[:n]
fmt.Println(len(buf), n)
Also, try using this code to set tmp,
tmp, err := strconv.Atoi(str[:1])
if err != nil {
panic(err.String())
}
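The code in this thread targets release r60, where strconv.Atoui and err.String() still existed; in current Go the equivalents are strconv.ParseUint and err.Error(). As a sketch of the advice above in current Go, the following hypothetical helper (parseMessage is not from the question) uses only the first n bytes of the buffer and splits on whitespace, so a trailing \r\n or the zero padding of the buffer never reaches the number parser.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseMessage is a hypothetical helper for the "ActionId Data" format.
// buf is the read buffer and n the byte count returned by Read.
func parseMessage(buf []byte, n int) (action int, minutes uint8, err error) {
	// Only the first n bytes hold data; the rest of buf is zero padding.
	fields := strings.Fields(string(buf[:n]))
	if len(fields) != 2 {
		return 0, 0, fmt.Errorf("want 2 fields, got %d", len(fields))
	}
	action, err = strconv.Atoi(fields[0])
	if err != nil {
		return 0, 0, err
	}
	// bitSize 8 makes ParseUint reject values that do not fit in a uint8.
	var m uint64
	m, err = strconv.ParseUint(fields[1], 10, 8)
	if err != nil {
		return 0, 0, err
	}
	return action, uint8(m), nil
}

func main() {
	buf := make([]byte, 512)
	n := copy(buf, "0 2\r\n") // simulates ws.Read filling part of the buffer
	fmt.Println(parseMessage(buf, n))
}
```

Passing len(buf) instead of n makes the zero bytes part of the input, which is the kind of stray data that broke the original conversion.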