library/package in go that handles string encoding?


Are there any similar libraries/packages in go that emulate what vis(3) and unvis(3) do for BSD systems? I'm trying to do something that requires representation of strings that contain special characters like whitespace and such.

Not exactly, but if you are looking for URL encoding, you can do all the URL encoding you want with the net/url package:
see: Encode / decode URLs
and: Is there any example and usage of url.QueryEscape ? for golang
sample code:
fmt.Println(url.QueryEscape("https://stackoverflow.com/questions/tagged/go test\r \r\n"))
output:
https%3A%2F%2Fstackoverflow.com%2Fquestions%2Ftagged%2Fgo+test%0D+%0D%0A
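To go back the other way, net/url also has url.QueryUnescape, and url.PathEscape for path segments. Here is a minimal sketch; the helper name roundTrip is made up for illustration and is not part of the package:

```go
package main

import (
	"fmt"
	"net/url"
)

// roundTrip is a hypothetical helper: it query-escapes s and then decodes
// the escaped form again with url.QueryUnescape.
func roundTrip(s string) (escaped, decoded string, err error) {
	escaped = url.QueryEscape(s)
	decoded, err = url.QueryUnescape(escaped)
	return escaped, decoded, err
}

func main() {
	escaped, decoded, err := roundTrip("go test\r\n")
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Println(escaped)                  // go+test%0D%0A
	fmt.Println(decoded == "go test\r\n") // true

	// PathEscape encodes a space as %20 instead of '+'.
	fmt.Println(url.PathEscape("go test")) // go%20test
}
```

QueryEscape is meant for query values, so a space becomes '+'. If the encoded text ends up in a path, PathEscape is probably the better fit.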
or write your own:
in Go, a string is in effect a read-only slice of bytes, and by convention it holds UTF-8 encoded text.
You can get the bytes like this:
str := "UTF-8"
bytes := []byte(str) // string to slice
fmt.Println(str, bytes) // UTF-8 [85 84 70 45 56]
or convert bytes to string like this:
s := string([]byte{85, 84, 70, 45, 56, 32, 0xc2, 0xb5}) // slice to string
fmt.Println(s) // UTF-8 µ
0xC2 0xB5 is UTF-8 (hex) for Character 'MICRO SIGN' (U+00B5) see: http://www.fileformat.info/info/unicode/char/00b5/index.htm
also you may get bytes like this:
for i := 0; i < len(s); i++ {
fmt.Printf("%d: %d, ", i, s[i])
//0: 85, 1: 84, 2: 70, 3: 45, 4: 56, 5: 32, 6: 194, 7: 181,
}
or in compact Hex format:
fmt.Printf("% x\n", s) // 55 54 46 2d 38 20 c2 b5
and get runes (Unicode codepoints) like this:
for i, v := range s {
fmt.Printf("%d: %v, ", i, v)
//0: 85, 1: 84, 2: 70, 3: 45, 4: 56, 5: 32, 6: 181,
}
see: What is a rune?
and convert rune to string:
r := rune(181)
fmt.Printf("%#U\n", r) // U+00B5 'µ'
st := "this is UTF-8: " + string(r)
fmt.Println(st) // this is UTF-8: µ
convert slice of runes to string:
rs := []rune{181, 181, 181, 181}
sr := string(rs)
fmt.Println(sr) // µµµµ
convert string to slice of runes:
br := []rune(sr)
fmt.Println(br) //[181 181 181 181]
The %q (quoted) verb escapes any non-printable byte sequences in a string so the output is unambiguous; with the plus flag (%+q) it also escapes every non-ASCII character:
fmt.Printf("%+q \n", "Hello, 世界") // "Hello, \u4e16\u754c"
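If you also need to decode the escaped form again (the unvis direction), strconv.QuoteToASCII and strconv.Unquote make a reversible pair. This is only a sketch of the idea, not a vis(3) clone: the escape syntax is Go's, plain spaces are left as-is, and the helper names encode and decode are made up here.

```go
package main

import (
	"fmt"
	"strconv"
)

// encode is a hypothetical vis-like helper: it returns s as a double-quoted
// Go string literal with control and non-ASCII characters escaped.
func encode(s string) string {
	return strconv.QuoteToASCII(s)
}

// decode reverses encode by interpreting the quoted literal.
func decode(q string) (string, error) {
	return strconv.Unquote(q)
}

func main() {
	raw := "a b\tc\r\n\x00 µ"
	q := encode(raw)
	fmt.Println(q) // "a b\tc\r\n\x00 \u00b5"

	back, err := decode(q)
	fmt.Println(back == raw, err) // true <nil>
}
```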
unicode.IsSpace reports whether the rune is a space character as defined by Unicode's White Space property; in the Latin-1 space this is
'\t', '\n', '\v', '\f', '\r', ' ', U+0085 (NEL), U+00A0 (NBSP).
sample code:
package main
import (
"bytes"
"fmt"
"unicode"
)
func main() {
var buf bytes.Buffer
s := "\u4e16\u754c \u0020\r\n 世界"
for _, r := range s {
if unicode.IsSpace(r) {
buf.WriteString(fmt.Sprintf("\\u%04x", r))
} else {
buf.WriteString(string(r))
}
}
st := buf.String()
fmt.Println(st)
}
output:
世界\u0020\u0020\u000d\u000a\u0020世界
You can find more functions in the unicode/utf8, unicode, strconv and strings packages:
https://golang.org/pkg/unicode/utf8/
https://golang.org/pkg/unicode/
https://golang.org/pkg/strings/
https://golang.org/pkg/strconv/
https://blog.golang.org/strings

Related

Go copy bytes into struct fields with reflection

How can I iterate over a byte slice and assign them to the fields of a struct?
type s struct {
f1 []byte
f2 []byte
f3 []byte
}
func S() s {
x := s{}
x.f1 = make([]byte, 4)
x.f2 = make([]byte, 2)
x.f3 = make([]byte, 2)
return x
}
func main() {
data := []byte{83, 117, 110, 83, 0, 1, 0, 65}
Z := S()
// pseudo code from here
i := 0
for field in Z {
field = data[i:i+len(field)]
i += len(field)
}
}
Expecting:
f1 = [83,117,110,83]
f2 = [0,1]
f3 = [0,65]
I've done this in C/C++ before but I can't figure out how to do it in Go. I need the assigning function to be generic as I'm going to have several different structs some of which may not exist in the stream.
Ideally I want to pass in the initialized struct and my code would iterate over the struct fields filling them in.

Leverage the reflection code in the encoding/binary package.
Step 1: Declare the fields as arrays instead of slices.
type S struct {
F1 [4]byte
F2 [2]byte
F3 [2]byte
}
Step 2: Decode the data to the struct using binary.Read
var s S
data := []byte{83, 117, 110, 83, 0, 1, 0, 65}
err := binary.Read(bytes.NewReader(data), binary.LittleEndian, &s)
if err != nil {
log.Fatal(err)
}
Step 3: Done!
fmt.Print(s) // prints {[83 117 110 83] [0 1] [0 65]}
https://go.dev/play/p/H-e8Lusw0RC
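Put together as one program, the three steps might look like the sketch below. The names Header and decodeHeader are made up for illustration. The second call shows what binary.Read reports when the input is shorter than the struct, which may be useful if some structs are missing from the stream:

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// Header mirrors the S struct above. binary.Read needs exported,
// fixed-size fields, hence arrays rather than slices.
type Header struct {
	F1 [4]byte
	F2 [2]byte
	F3 [2]byte
}

// decodeHeader is a hypothetical helper that fills a Header from data.
// The byte order does not matter here because every field is a byte array.
func decodeHeader(data []byte) (Header, error) {
	var h Header
	err := binary.Read(bytes.NewReader(data), binary.LittleEndian, &h)
	return h, err
}

func main() {
	h, err := decodeHeader([]byte{83, 117, 110, 83, 0, 1, 0, 65})
	fmt.Println(h, err) // {[83 117 110 83] [0 1] [0 65]} <nil>

	// Three bytes are not enough to fill the 8-byte struct.
	_, err = decodeHeader([]byte{83, 117, 110})
	fmt.Println(err) // unexpected EOF
}
```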

You can use reflect.Copy. Like the built-in copy, it copies data into the destination up to its length. Make sure the fields you need to set are exported.
func main() {
data := []byte{83, 117, 110, 83, 0, 1, 0, 65}
z := S{
F1: make([]byte, 4),
F2: make([]byte, 2),
F3: make([]byte, 2),
}
SetBytes(&z, data)
fmt.Println(z) // {[83 117 110 83] [0 1] [0 65]}
}
func SetBytes(dst any, data []byte) {
v := reflect.ValueOf(dst)
if v.Kind() != reflect.Ptr {
panic("dst must be addressable")
}
v = v.Elem()
j := 0
for i := 0; i < v.NumField(); i++ {
field := v.Field(i)
if field.Kind() != reflect.Slice {
continue
}
j += reflect.Copy(v.Field(i), reflect.ValueOf(data[j:]))
}
}
Since data is assumed to be always []byte, you can subslice it directly.
Alternatively, you can use reflect.Value#Slice:
d := reflect.ValueOf(data)
// and later
j += reflect.Copy(v.Field(i), d.Slice(j, d.Len()))
Playground: https://go.dev/play/p/o1MR1qrW5pL

Go code running result in local environment is not same with run in go play

I am using Go to implement an algorithm described below:
There is an array in which exactly one number appears once and every other number appears three times. Find the number that appears only once.
My code is listed below:
import (
"testing"
)
func findBySum(arr []int) int {
result := 0
sum := [32]int{}
for i := 0; i < 32; i++ {
for _, v := range arr {
sum[i] += (v >> uint(i)) & 0x1
}
sum[i] %= 3
sum[i] <<= uint(i)
result |= sum[i]
}
return result
}
func TestThree(t *testing.T) {
// except for one number, every other number appears three times
a1 := []int{11, 222, 444, 444, 222, 11, 11, 17, -123, 222, -123, 444, -123} // unique number is 17
a2 := []int{11, 222, 444, 444, 222, 11, 11, -17, -123, 222, -123, 444, -123} // unique number is -17
t.Log(findBySum(a1))
t.Log(findBySum(a2))
}
However, the result on my PC is wrong, while the same code running at https://play.golang.org/p/hEseLZVL617 is correct, and I do not know why.
Result on my PC: [output not preserved]
Result on https://play.golang.org/p/hEseLZVL617: [output not preserved]
As we can see, when the unique number is positive, both results are right, but when the unique number is negative, the result on my PC is wrong and the online result is right.
I think it has something to do with the bit operations in my code, but I can't find the root cause.
I used IDEA 2019.1.1, and my Go version is listed below: [output not preserved]
I don't know why the same code works fine online but not on my local PC. Can anyone help me analyze this? Thanks in advance!

The size of int is platform dependent: it may be 32-bit or 64-bit. On the Go Playground it's 32-bit; on your local machine it's 64-bit.
If we change your example to use int64 explicitly instead of int, the Go Playground gives the same (wrong) result as your local machine:
func findBySum(arr []int64) int64 {
result := int64(0)
sum := [32]int64{}
for i := int64(0); i < 32; i++ {
for _, v := range arr {
sum[i] += (v >> uint64(i)) & 0x1
}
sum[i] %= 3
sum[i] <<= uint(i)
result |= sum[i]
}
return result
}
func TestThree(t *testing.T) {
// except for one number, every other number appears three times
a1 := []int64{11, 222, 444, 444, 222, 11, 11, 17, -123, 222, -123, 444, -123} // unique number is 17
a2 := []int64{11, 222, 444, 444, 222, 11, 11, -17, -123, 222, -123, 444, -123} // unique number is -17
t.Log(findBySum(a1))
t.Log(findBySum(a2))
}
You perform bitwise operations that assume a 32-bit integer size. To get correct results locally (where your architecture, and thus the size of int and uint, is 64-bit), change all ints to int32 and uint to uint32:
func findBySum(arr []int32) int32 {
result := int32(0)
sum := [32]int32{}
for i := int32(0); i < 32; i++ {
for _, v := range arr {
sum[i] += (v >> uint32(i)) & 0x1
}
sum[i] %= 3
sum[i] <<= uint(i)
result |= sum[i]
}
return result
}
func TestThree(t *testing.T) {
// except for one number, every other number appears three times
a1 := []int32{11, 222, 444, 444, 222, 11, 11, 17, -123, 222, -123, 444, -123} // unique number is 17
a2 := []int32{11, 222, 444, 444, 222, 11, 11, -17, -123, 222, -123, 444, -123} // unique number is -17
t.Log(findBySum(a1))
t.Log(findBySum(a2))
}
Lesson: if you perform calculations whose results depend on the representation size, always be explicit and use fixed-size types like int32, int64, uint32, uint64.
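To see where the wrong value comes from, here is a small sketch that runs the same 32-iteration loop on both fixed-size types. The function names singleInt32 and singleLow32 are made up for this illustration. With int64 values, bits 32 to 63 of a negative number are never examined, so the sign is lost:

```go
package main

import "fmt"

// singleInt32 examines all 32 bits of an int32, so the sign bit is
// included and negative results come out correctly.
func singleInt32(arr []int32) int32 {
	var result int32
	for i := uint(0); i < 32; i++ {
		var sum int32
		for _, v := range arr {
			sum += (v >> i) & 1
		}
		result |= (sum % 3) << i
	}
	return result
}

// singleLow32 runs the same loop on int64 values but, like the original
// code with a 64-bit int, only looks at bits 0..31.
func singleLow32(arr []int64) int64 {
	var result int64
	for i := uint(0); i < 32; i++ {
		var sum int64
		for _, v := range arr {
			sum += (v >> i) & 1
		}
		result |= (sum % 3) << i
	}
	return result
}

func main() {
	a32 := []int32{11, 222, 444, 444, 222, 11, 11, -17, -123, 222, -123, 444, -123}
	a64 := []int64{11, 222, 444, 444, 222, 11, 11, -17, -123, 222, -123, 444, -123}
	fmt.Println(singleInt32(a32)) // -17
	fmt.Println(singleLow32(a64)) // 4294967279, the low 32 bits of -17
}
```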

Read input from console in Unicode instead of UTF-8 (hex) in golang

I am trying to read a user input with bufio in console. The text can have some special characters (é, à, ♫, ╬,...).
The code look like this :
reader := bufio.NewReader(os.Stdin)
input, _ := reader.ReadString('\n')
If I type for example "é", the ReadString will read it as "c3 a9" instead of "00e9". How can I read the text input in Unicode instead of UTF-8 ? I need to use this value as a hash table key.
Thanks

Go strings are conceptually a read-only slice of a read-only byte array. The encoding of that byte array is not specified, but string constants will be UTF-8 and using UTF-8 in other strings is the recommended approach.
Go provides convenience functions for accessing the UTF-8 as Unicode code points (or runes in Go-speak). A range loop over a string will do the UTF-8 decoding for you. Converting to []rune will give you a rune slice, i.e. the Unicode code points in order. These goodies only work on UTF-8 encoded strings and byte arrays. I would strongly suggest using UTF-8 internally.
An example:
package main
import (
"bufio"
"fmt"
"os"
)
func main() {
reader := bufio.NewReader(os.Stdin)
input, _ := reader.ReadString('\n')
println("non-range loop - bytes")
for i := 0; i < len(input); i++ {
fmt.Printf("%d %d %[2]x\n", i, input[i])
}
println("range-loop - runes")
for idx, r := range input {
fmt.Printf("%d %d %[2]c\n", idx, r)
}
println("converted to rune slice")
rs := []rune(input)
fmt.Printf("%#v\n", rs)
}
With the input: X é X
non-range loop - bytes
0 88 58
1 32 20
2 195 c3
3 169 a9
4 32 20
5 88 58
6 10 a
range-loop - runes
0 88 X
1 32
2 233 é
4 32
5 88 X
6 10
converted to rune slice
[]int32{88, 32, 233, 32, 88, 10}
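For the hash table part of the question, the UTF-8 string itself can serve as a map key; no conversion is needed for lookups to work. If a code point form such as "00e9" is really wanted, it can be derived from the runes. A rough sketch, where codePointKey is a hypothetical helper and the input is assumed to be valid UTF-8:

```go
package main

import (
	"fmt"
	"strings"
)

// codePointKey is a hypothetical helper: it renders each rune of s as
// at least four hex digits, separated by spaces.
func codePointKey(s string) string {
	parts := make([]string, 0, len(s))
	for _, r := range s {
		parts = append(parts, fmt.Sprintf("%04x", r))
	}
	return strings.Join(parts, " ")
}

func main() {
	input := "é\n" // stands in for what ReadString('\n') returns
	word := strings.TrimRight(input, "\r\n")

	// The UTF-8 encoded string works directly as a map key.
	counts := map[string]int{}
	counts[word]++
	fmt.Println(counts["é"]) // 1

	fmt.Println(codePointKey(word)) // 00e9
}
```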

Unicode and UTF-8 are not directly comparable: Unicode assigns code points to characters, while UTF-8 is one way of encoding those code points as bytes. A Go string can hold Unicode text encoded as UTF-8. I learned a lot about this by reading Strings, bytes, runes and characters in Go.
To answer your question:
You can use DecodeRuneInString from the unicode/utf8 package.
s := "é"
r, _ := utf8.DecodeRuneInString(s)
fmt.Printf("%x", r) // e9
DecodeRuneInString(s) returns the first UTF-8 encoded character (rune) in s along with that character's width in bytes. So if you want the Unicode code point of each rune in a string, here is how to do it. This is the example given in the linked documentation, only slightly modified.
str := "Hello, 世界"
for len(str) > 0 {
r, size := utf8.DecodeRuneInString(str)
fmt.Printf("%x %v\n", r, size)
str = str[size:]
}
Try in Playground.
Alternatively, as Juergen points out, you can use a range loop over the string to get the runes it contains.
str := "Hello, 世界"
for _, r := range str {
fmt.Printf("%x \n", r)
}
Try in Playground

How can I convert from int to hex

I want to convert from int to hex in Golang.
In strconv, there is a method that converts strings to hex. Is there a similar method to get a hex string from an int?

Since hex is just a way of writing an integer, you can ask the fmt package for a string representation of that integer, using fmt.Sprintf() and the %x or %X verb.
See playground
i := 255
h := fmt.Sprintf("%x", i)
fmt.Printf("Hex conv of '%d' is '%s'\n", i, h)
h = fmt.Sprintf("%X", i)
fmt.Printf("HEX conv of '%d' is '%s'\n", i, h)
Output:
Hex conv of '255' is 'ff'
HEX conv of '255' is 'FF'

"Hex" isn't a real thing. You can use a hexadecimal representation of a number, but there's no difference between 0xFF and 255. More info on that can be found in the docs, which point out that you can use 0xff to define the integer constant 255. As you mention, if you're trying to find the hexadecimal representation of an integer, you could use strconv:
package main
import (
"fmt"
"strconv"
)
func main() {
fmt.Println(strconv.FormatInt(255, 16))
// gives "ff"
}
Try it in the playground
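The reverse direction is strconv.ParseInt with base 16. A short sketch of the round trip; the wrapper names toHex and fromHex are made up here:

```go
package main

import (
	"fmt"
	"strconv"
)

// toHex is a hypothetical wrapper around strconv.FormatInt with base 16.
func toHex(n int64) string {
	return strconv.FormatInt(n, 16)
}

// fromHex parses a hex string without a "0x" prefix back into an int64.
func fromHex(s string) (int64, error) {
	return strconv.ParseInt(s, 16, 64)
}

func main() {
	fmt.Println(toHex(255))  // ff
	fmt.Println(toHex(-255)) // -ff

	n, err := fromHex("ff")
	fmt.Println(n, err) // 255 <nil>

	// With an explicit base of 16, a "0x" prefix is rejected.
	_, err = fromHex("0xff")
	fmt.Println(err != nil) // true
}
```

Note that FormatInt writes negative numbers with a leading minus sign rather than in two's complement form.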

When formatting bytes, hex needs a two-digit representation with a leading zero.
For example: 1 => '01', 15 => '0f', etc.
It is possible to force Sprintf to respect this:
h := fmt.Sprintf("%02x", 14)
fmt.Println(h) // 0e
h2 := fmt.Sprintf("%02x", 231)
fmt.Println(h2) // e7
The pattern "%02x" means:
'0' pads with zeros instead of spaces
'2' sets the minimum output width to two characters
'x' formats the number in lowercase hexadecimal
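For a whole byte slice, the padding is handled for you: encoding/hex and the %x verb applied to a []byte both emit two digits per byte. A small sketch, with hexBytes as a made-up wrapper name:

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// hexBytes is a hypothetical wrapper: it encodes each byte of b as
// exactly two lowercase hex digits.
func hexBytes(b []byte) string {
	return hex.EncodeToString(b)
}

func main() {
	b := []byte{1, 15, 231}
	fmt.Println(hexBytes(b)) // 010fe7
	fmt.Printf("%x\n", b)    // 010fe7
	fmt.Printf("% x\n", b)   // 01 0f e7
}
```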

i := 4357640193405743614
h := fmt.Sprintf("%016x",i)
fmt.Printf("Decimal: %d,\nHexa: %s", i, h)
# Result
Decimal: 4357640193405743614,
Hexa: 3c7972ab0ae9f1fe
Playground: https://play.golang.org/p/ndlMyBdQjmT

Sprintf is more versatile, but FormatInt is faster. Choose whichever suits you better:
func Benchmark_sprintf(b *testing.B) { // 83.8 ns/op
for n := 0; n < b.N; n++ {
_ = fmt.Sprintf("%x", n)
}
}
func Benchmark_formatint(b *testing.B) { // 28.5 ns/op
bn := int64(b.N)
for n := int64(0); n < bn; n++ {
_ = strconv.FormatInt(n, 16)
}
}

For example, if the value is a uint32, you can pull out its individual bytes (here named r, g, b) with shifts and masks:
var p uint32
p = 4278190335
r := p >> 24 & 0xFF
g := p >> 16 & 0xFF
b := p >> 8 & 0xFF
fmt.Println(r, g, b)//255 0 0
You can also check this online tool for reference: https://cryptii.com/pipes/integer-encoder
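The same split can be done in one call with encoding/binary, and %08x prints the full 32-bit value as hex. A sketch, where splitBytes is a made-up helper name and big-endian order is chosen so the most significant byte comes first:

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// splitBytes is a hypothetical helper: it returns the four bytes of p,
// most significant byte first.
func splitBytes(p uint32) [4]byte {
	var out [4]byte
	binary.BigEndian.PutUint32(out[:], p)
	return out
}

func main() {
	p := uint32(4278190335)
	fmt.Printf("%08x\n", p)    // ff0000ff
	fmt.Println(splitBytes(p)) // [255 0 0 255]
}
```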

Converting []uint8 to float64

What is the best way to handle an http resp.Body which is formatted as []uint8 and not as JSON?
I would like to convert the bytes into a float64.
This is the returned value response:
value : %!F([]uint8=[48 46 48 48 49 50 53 53 50 49])

Try using ParseFloat from the strconv package (play):
b := []uint8{48, 46, 48, 48, 49, 50, 53, 53, 50, 49}
f, err := strconv.ParseFloat(string(b), 64)
if err != nil {
// Handle parse error
}
fmt.Printf("%f\n", f) // 0.001255
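With a real response you would first read the body into a byte slice. Bodies often end with a newline, which ParseFloat rejects, so trimming whitespace first is a reasonable precaution. A sketch under those assumptions: parseFloatBytes is a hypothetical helper, strings.NewReader stands in for resp.Body, and io.ReadAll needs Go 1.16 or later.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"strconv"
	"strings"
)

// parseFloatBytes is a hypothetical helper: it trims surrounding
// whitespace from raw and parses the rest as a float64.
func parseFloatBytes(raw []byte) (float64, error) {
	return strconv.ParseFloat(string(bytes.TrimSpace(raw)), 64)
}

func main() {
	// body stands in for resp.Body in this sketch.
	body := strings.NewReader("0.00125521\n")

	raw, err := io.ReadAll(body)
	if err != nil {
		fmt.Println("read error:", err)
		return
	}
	f, err := parseFloatBytes(raw)
	fmt.Println(f, err) // 0.00125521 <nil>
}
```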
