Convert rune to int? - go

In the following code, I iterate over a string rune by rune, but I'll actually need an int to perform some checksum calculation. Do I really need to encode the rune into a []byte, then convert it to a string and then use Atoi to get an int out of the rune? Is this the idiomatic way to do it?
// The string `s` only contains digits.
var factor int
for i, c := range s[:12] {
if i % 2 == 0 {
factor = 1
} else {
factor = 3
}
buf := make([]byte, 1)
_ = utf8.EncodeRune(buf, c)
value, _ := strconv.Atoi(string(buf))
sum += value * factor
}
On the playground: http://play.golang.org/p/noWDYjn5rJ

The problem is simpler than it looks. You convert a rune value to an int value with int(r). But your code implies you want the integer value out of the ASCII (or UTF-8) representation of the digit, which you can trivially get with r - '0' as a rune, or int(r - '0') as an int. Be aware that out-of-range runes will corrupt that logic.

For example, sum += (int(c) - '0') * factor,
package main
import (
"fmt"
"strconv"
"unicode/utf8"
)
func main() {
s := "9780486653556"
var factor, sum1, sum2 int
for i, c := range s[:12] {
if i%2 == 0 {
factor = 1
} else {
factor = 3
}
buf := make([]byte, 1)
_ = utf8.EncodeRune(buf, c)
value, _ := strconv.Atoi(string(buf))
sum1 += value * factor
sum2 += (int(c) - '0') * factor
}
fmt.Println(sum1, sum2)
}
Output:
124 124

why don't you do only "string(rune)".
s:="12345678910"
var factor,sum int
for i,x:=range s{
if i%2==0{
factor=1
}else{
factor=3
}
xstr:=string(x) //x is rune converted to string
xint,_:=strconv.Atoi(xstr)
sum+=xint*factor
}
fmt.Println(sum)

val, _ := strconv.Atoi(string(v))
Where v is a rune
More concise but same idea as above

Related

Splitting big.Int by digit

I'm trying to split a big.Int into a number of int64s such that each is a portion of the larger number, with a standard offset of 18 digits. For example, given the following input value of 1234512351234088800000999, I would expect the following output: [351234088800000999, 1234512]. For negative numbers, I would expect all of the parts to be negative (i.e. -1234512351234088800000999 produces [-351234088800000999, -1234512]).
I already know I can do this to get the result I want:
func Split(input *big.Int) []int64 {
const width = 18
asStr := in.Coefficient().Text(10)
strLen := len(asStr)
offset := 0
if in.IsNegative() {
offset = 1
}
length := int(math.Ceil(float64(strLen-offset) / width))
ints := make([]int64, length)
for i := 1; i <= length; i++ {
start := strLen - (i * width)
end := start + width
if start < 0 || (start == 1 && asStr[0] == '-') {
start = 0
}
ints[i-1], _ = strconv.ParseInt(asStr[start:end], 10, 64)
if offset == 1 && ints[i-1] > 0 {
ints[i-1] = 0 - ints[i-1]
}
}
return ints
}
However, I don't like the idea of using string-parsing nor do I like the use of strconv. Is there a way I can do this utilizing the big.Int directly?
You can use the DivMod function to do what you need here, with some special care to handle negative numbers:
var offset = big.NewInt(1e18)
func Split(input *big.Int) []int64 {
rest := new(big.Int)
rest.Abs(input)
var ints []int64
r := new(big.Int)
for {
rest.DivMod(rest, offset, r)
ints = append(ints, r.Int64() * int64(input.Sign()))
if rest.BitLen() == 0 {
break
}
}
return ints
}
Multiplying each output by input.Sign() ensures that each output will be negative if the input is negative. The sum of the output values multiplied by 1e18 times their position in the output should equal the input.

golang convert big.Float to big.Int

convert big.Float to big.Int, i write code below, but it overflow with uint64, so what's the correct way to cenvert big.Float to big.Int.
package main
import "fmt"
import "math/big"
func FloatToBigInt(val float64) *big.Int {
bigval := new(big.Float)
bigval.SetFloat64(val)
coin := new(big.Float)
coin.SetInt(big.NewInt(1000000000000000000))
bigval.Mul(bigval, coin)
result := new(big.Int)
f,_ := bigval.Uint64()
result.SetUint64(f)
return result
}
func main() {
fmt.Println("vim-go")
fmt.Println(FloatToBigInt(float64(10)))
fmt.Println(FloatToBigInt(float64(20)))
fmt.Println(FloatToBigInt(float64(30)))
fmt.Println(FloatToBigInt(float64(40)))
fmt.Println(FloatToBigInt(float64(50)))
fmt.Println(FloatToBigInt(float64(100)))
fmt.Println(FloatToBigInt(float64(1000)))
fmt.Println(FloatToBigInt(float64(10000)))
}
A big int bigger than uint64 will always cause an overflow as uint64 has fixed size. You should use the following method on *Float:
func (*Float) Int
The changes required would be:
func FloatToBigInt(val float64) *big.Int {
bigval := new(big.Float)
bigval.SetFloat64(val)
// Set precision if required.
// bigval.SetPrec(64)
coin := new(big.Float)
coin.SetInt(big.NewInt(1000000000000000000))
bigval.Mul(bigval, coin)
result := new(big.Int)
bigval.Int(result) // store converted number in result
return result
}
Working example: https://play.golang.org/p/sEhH6iPkrK
Use the function Float.Int(nil)
I have worked with a regular float64 number (not big.Float) and found out that conversion via string is the most precise way. Check it out
Note: the example is for float64 -> decimal(,20) conversion.
func bigIntViaString(flt float64) (b *big.Int) {
if math.IsNaN(flt) || math.IsInf(flt, 0) {
return nil // illegal case
}
var in = strconv.FormatFloat(flt, 'f', -1, 64)
const parts = 2
var ss = strings.SplitN(in, ".", parts)
// protect from numbers without period
if len(ss) != parts {
ss = append(ss, "0")
}
// protect from ".0" and "0." values
if ss[0] == "" {
ss[0] = "0"
}
if ss[1] == "" {
ss[1] = "0"
}
const (
base = 10
fraction = 20
)
// get fraction length
var fract = len(ss[1])
if fract > fraction {
ss[1], fract = ss[1][:fraction], fraction
}
in = strings.Join([]string{ss[0], ss[1]}, "")
// convert to big integer from the string
b, _ = big.NewInt(0).SetString(in, base)
if fract == fraction {
return // ready
}
// fract < 20, * (20 - fract)
var (
ten = big.NewInt(base)
exp = ten.Exp(ten, big.NewInt(fraction-int64(fract)), nil)
)
b = b.Mul(b, exp)
return
}
https://play.golang.org/p/_lkyQ_0udjd

How to remove Unicode characters from byte buffer in Go?

I have a bytes.Buffer type variable which I filled with Unicode characters:
var mbuff bytes.Buffer
unicodeSource := 'کیا حال ھے؟'
for i,r := range(unicodeSource) {
mbuff.WriteRune(r)
}
Note: I iterated over a Unicode literals here, but really the source is an infinite loop of user input characters.
Now, I want to remove a Unicode character from any position in the buffer mbuff. The problem is that characters may be of variable byte sizes. So I cannot just pick out the ith byte from mbuff.String() as it might be the beginning, middle, or end of a character. This is my trivial (and horrendous) solution:
// removing Unicode character at position n
var tempString string
currChar := 0
for _, ch := range(mbuff.String()) { // iterate over Unicode chars
if currChar != n { // skip concatenating nth char
tempString += ch
}
currChar++
}
mbuff.Reset() // empty buffer
mbuff.WriteString(tempString) // write new string
This is bad in many ways. For one, I convert buffer to string, remove ith element, and write a new string back into the buffer. Too many operations. Second, I use the += operator in the loop to concatenate Unicode characters into a new string. I am using buffers in the first place exactly to avoid concatenation using += which is slow as this answer points out.
What is an efficient method to remove the ith Unicode character in a bytes.Buffer?
Also what is an efficient way to insert a Unicode character after i-1 Unicode characters (i.e. in the ith place)?
To remove the ith rune from a slice of bytes, loop through the slice counting runes. When the ith rune is found, copy the bytes following the rune down to the position of the ith rune:
func removeAtBytes(p []byte, i int) []byte {
j := 0
k := 0
for k < len(p) {
_, n := utf8.DecodeRune(p[k:])
if i == j {
p = p[:k+copy(p[k:], p[k+n:])]
}
j++
k += n
}
return p
}
This function modifies the backing array of the argument slice, but it does not allocate memory.
Use this function to remove a rune from a bytes.Buffer.
p := removeAtBytes(mbuf.Bytes(), i)
mbuf.Truncate(len(p)) // backing bytes were updated, adjust length
playground example
To remove the ith rune from a string, loop through the string counting runes. When the ith rune is found, create a string by concatenating the segment of the string before the rune with the segment of the string after the rune.
func removeAt(s string, i int) string {
j := 0 // count of runes
k := 0 // index in string of current rune
for k < len(s) {
_, n := utf8.DecodeRuneInString(s[k:])
if i == j {
return s[:k] + s[k+n:]
}
j++
k += n
}
return s
}
This function allocates a single string, the result. DecodeRuneInString is a function in the standard library unicode/utf8 package.
Taking a step back, go often works on Readers and Writers, so an alternative solution would be to use the text/transform package. You create a Transformer, attach it to a Reader and use the new Reader to produce a transformed string. For example here's a skipper:
func main() {
src := strings.NewReader("کیا حال ھے؟")
skipped := transform.NewReader(src, NewSkipper(5))
var buf bytes.Buffer
io.Copy(&buf, skipped)
fmt.Println("RESULT:", buf.String())
}
And here's the implementation:
package main
import (
"bytes"
"fmt"
"io"
"strings"
"unicode/utf8"
"golang.org/x/text/transform"
)
type skipper struct {
pos int
cnt int
}
// NewSkipper creates a text transformer which will remove the rune at pos
func NewSkipper(pos int) transform.Transformer {
return &skipper{pos: pos}
}
func (s *skipper) Transform(dst, src []byte, atEOF bool) (nDst, nSrc int, err error) {
for utf8.FullRune(src) {
_, sz := utf8.DecodeRune(src)
// not enough space in the dst
if len(dst) < sz {
return nDst, nSrc, transform.ErrShortDst
}
if s.pos != s.cnt {
copy(dst[:sz], src[:sz])
// track that we stored in dst
dst = dst[sz:]
nDst += sz
}
// track that we read from src
src = src[sz:]
nSrc += sz
// on to the next rune
s.cnt++
}
if len(src) > 0 && !atEOF {
return nDst, nSrc, transform.ErrShortSrc
}
return nDst, nSrc, nil
}
func (s *skipper) Reset() {
s.cnt = 0
}
There may be bugs with this code, but hopefully you can see the idea.
The benefit of this approach is it could work on a potentially infinite amount of data without having to store all of it in memory. For example you could transform a file this way.
Edit:
Remove the ith rune in the buffer:
A: Shift all runes one location to the left (Here A is faster than B), try it on The Go Playground:
func removeRuneAt(s string, runePosition int) string {
if runePosition < 0 {
return s
}
r := []rune(s)
if runePosition >= len(r) {
return s
}
copy(r[runePosition:], r[runePosition+1:])
return string(r[:len(r)-1])
}
B: Copy to new buffer, try it on The Go Playground
func removeRuneAt(s string, runePosition int) string {
if runePosition < 0 {
return s // avoid allocation
}
r := []rune(s)
if runePosition >= len(r) {
return s // avoid allocation
}
t := make([]rune, len(r)-1) // Apply replacements to buffer.
w := copy(t, r[:runePosition])
w += copy(t[w:], r[runePosition+1:])
return string(t[:w])
}
C: Try it on The Go Playground:
package main
import (
"bytes"
"fmt"
)
func main() {
str := "hello"
fmt.Println(str)
fmt.Println(removeRuneAt(str, 1))
buf := bytes.NewBuffer([]byte(str))
fmt.Println(buf.Bytes())
buf = bytes.NewBuffer([]byte(removeRuneAt(buf.String(), 1)))
fmt.Println(buf.Bytes())
}
func removeRuneAt(s string, runePosition int) string {
if runePosition < 0 {
return s // avoid allocation
}
r := []rune(s)
if runePosition >= len(r) {
return s // avoid allocation
}
t := make([]rune, len(r)-1) // Apply replacements to buffer.
w := copy(t, r[0:runePosition])
w += copy(t[w:], r[runePosition+1:])
return string(t[0:w])
}
D: Benchmark:
A: 745.0426ms
B: 1.0160581s
for 2000000 iterations
1- Short Answer: to replace all (n) instances of a character (or even a string):
n := -1
newR := ""
old := "µ"
buf = bytes.NewBuffer([]byte(strings.Replace(buf.String(), old, newR, n)))
2- For replacing the character(string) in the ith instance in the buffer, you may use:
buf = bytes.NewBuffer([]byte(Replace(buf.String(), oldString, newOrEmptyString, ith)))
See:
// Replace returns a copy of the string s with the ith
// non-overlapping instance of old replaced by new.
func Replace(s, old, new string, ith int) string {
if len(old) == 0 || old == new || ith < 0 {
return s // avoid allocation
}
i, j := 0, 0
for ; ith >= 0; ith-- {
j = strings.Index(s[i:], old)
if j < 0 {
return s // avoid allocation
}
j += i
i = j + len(old)
}
t := make([]byte, len(s)+(len(new)-len(old))) // Apply replacements to buffer.
w := copy(t, s[0:j])
w += copy(t[w:], new)
w += copy(t[w:], s[j+len(old):])
return string(t[0:w])
}
Try it on The Go Playground:
package main
import (
"bytes"
"fmt"
"strings"
)
func main() {
str := `How are you?µ`
fmt.Println(str)
fmt.Println(Replace(str, "µ", "", 0))
buf := bytes.NewBuffer([]byte(str))
fmt.Println(buf.Bytes())
buf = bytes.NewBuffer([]byte(Replace(buf.String(), "µ", "", 0)))
fmt.Println(buf.Bytes())
}
func Replace(s, old, new string, ith int) string {
if len(old) == 0 || old == new || ith < 0 {
return s // avoid allocation
}
i, j := 0, 0
for ; ith >= 0; ith-- {
j = strings.Index(s[i:], old)
if j < 0 {
return s // avoid allocation
}
j += i
i = j + len(old)
}
t := make([]byte, len(s)+(len(new)-len(old))) // Apply replacements to buffer.
w := copy(t, s[0:j])
w += copy(t[w:], new)
w += copy(t[w:], s[j+len(old):])
return string(t[0:w])
}
3- If you want to remove all instances of Unicode character (old string) from any position in the string, you may use:
strings.Replace(str, old, "", -1)
4- Also this works fine for removing from bytes.buffer:
strings.Replace(buf.String(), old, newR, -1)
Like so:
buf = bytes.NewBuffer([]byte(strings.Replace(buf.String(), old, newR, -1)))
Here is the complete working code (try it on The Go Playground):
package main
import (
"bytes"
"fmt"
"strings"
)
func main() {
str := `کیا حال ھے؟` //How are you?
old := `ک`
newR := ""
fmt.Println(strings.Replace(str, old, newR, -1))
buf := bytes.NewBuffer([]byte(str))
// for _, r := range str {
// buf.WriteRune(r)
// }
fmt.Println(buf.Bytes())
bs := []byte(strings.Replace(buf.String(), old, newR, -1))
buf = bytes.NewBuffer(bs)
fmt.Println(" ", buf.Bytes())
}
output:
یا حال ھے؟
[218 169 219 140 216 167 32 216 173 216 167 217 132 32 218 190 219 146 216 159]
[219 140 216 167 32 216 173 216 167 217 132 32 218 190 219 146 216 159]
5- strings.Replace is very efficient, see inside:
// Replace returns a copy of the string s with the first n
// non-overlapping instances of old replaced by new.
// If old is empty, it matches at the beginning of the string
// and after each UTF-8 sequence, yielding up to k+1 replacements
// for a k-rune string.
// If n < 0, there is no limit on the number of replacements.
func Replace(s, old, new string, n int) string {
if old == new || n == 0 {
return s // avoid allocation
}
// Compute number of replacements.
if m := Count(s, old); m == 0 {
return s // avoid allocation
} else if n < 0 || m < n {
n = m
}
// Apply replacements to buffer.
t := make([]byte, len(s)+n*(len(new)-len(old)))
w := 0
start := 0
for i := 0; i < n; i++ {
j := start
if len(old) == 0 {
if i > 0 {
_, wid := utf8.DecodeRuneInString(s[start:])
j += wid
}
} else {
j += Index(s[start:], old)
}
w += copy(t[w:], s[start:j])
w += copy(t[w:], new)
start = j + len(old)
}
w += copy(t[w:], s[start:])
return string(t[0:w])
}

Convert int32 to string in Golang

I need to convert an int32 to string in Golang. Is it possible to convert int32 to string in Golang without converting to int or int64 first?
Itoa needs an int. FormatInt needs an int64.
One line answer is fmt.Sprint(i).
Anyway there are many conversions, even inside standard library function like fmt.Sprint(i), so you have some options (try The Go Playground):
1- You may write your conversion function (Fastest):
func String(n int32) string {
buf := [11]byte{}
pos := len(buf)
i := int64(n)
signed := i < 0
if signed {
i = -i
}
for {
pos--
buf[pos], i = '0'+byte(i%10), i/10
if i == 0 {
if signed {
pos--
buf[pos] = '-'
}
return string(buf[pos:])
}
}
}
2- You may use fmt.Sprint(i) (Slow)
See inside:
// Sprint formats using the default formats for its operands and returns the resulting string.
// Spaces are added between operands when neither is a string.
func Sprint(a ...interface{}) string {
p := newPrinter()
p.doPrint(a)
s := string(p.buf)
p.free()
return s
}
3- You may use strconv.Itoa(int(i)) (Fast)
See inside:
// Itoa is shorthand for FormatInt(int64(i), 10).
func Itoa(i int) string {
return FormatInt(int64(i), 10)
}
4- You may use strconv.FormatInt(int64(i), 10) (Faster)
See inside:
// FormatInt returns the string representation of i in the given base,
// for 2 <= base <= 36. The result uses the lower-case letters 'a' to 'z'
// for digit values >= 10.
func FormatInt(i int64, base int) string {
_, s := formatBits(nil, uint64(i), base, i < 0, false)
return s
}
Comparison & Benchmark (with 50000000 iterations):
s = String(i) takes: 5.5923198s
s = String2(i) takes: 5.5923199s
s = strconv.FormatInt(int64(i), 10) takes: 5.9133382s
s = strconv.Itoa(int(i)) takes: 5.9763418s
s = fmt.Sprint(i) takes: 13.5697761s
Code:
package main
import (
"fmt"
//"strconv"
"time"
)
func main() {
var s string
i := int32(-2147483648)
t := time.Now()
for j := 0; j < 50000000; j++ {
s = String(i) //5.5923198s
//s = String2(i) //5.5923199s
//s = strconv.FormatInt(int64(i), 10) // 5.9133382s
//s = strconv.Itoa(int(i)) //5.9763418s
//s = fmt.Sprint(i) // 13.5697761s
}
fmt.Println(time.Since(t))
fmt.Println(s)
}
func String(n int32) string {
buf := [11]byte{}
pos := len(buf)
i := int64(n)
signed := i < 0
if signed {
i = -i
}
for {
pos--
buf[pos], i = '0'+byte(i%10), i/10
if i == 0 {
if signed {
pos--
buf[pos] = '-'
}
return string(buf[pos:])
}
}
}
func String2(n int32) string {
buf := [11]byte{}
pos := len(buf)
i, q := int64(n), int64(0)
signed := i < 0
if signed {
i = -i
}
for {
pos--
q = i / 10
buf[pos], i = '0'+byte(i-10*q), q
if i == 0 {
if signed {
pos--
buf[pos] = '-'
}
return string(buf[pos:])
}
}
}
The Sprint function converts a given value to string.
package main
import (
"fmt"
)
func main() {
var sampleInt int32 = 1
sampleString := fmt.Sprint(sampleInt)
fmt.Printf("%+V %+V\n", sampleInt, sampleString)
}
// %!V(int32=+1) %!V(string=1)
See this example.
Use a conversion and strconv.FormatInt to format int32 values as a string. The conversion has zero cost on most platforms.
s := strconv.FormatInt(int64(n), 10)
If you have many calls like this, consider writing a helper function similar to strconv.Itoa:
func formatInt32(n int32) string {
return strconv.FormatInt(int64(n), 10)
}
All of the low-level integer formatting code in the standard library works with int64 values. Any answer to this question using formatting code in the standard library (fmt package included) requires a conversion to int64 somewhere. The only way to avoid the conversion is to write formatting function from scratch, but there's little point in doing that.
func FormatInt32(value int32) string {
return fmt.Sprintf("%d", value)
}
Does this work?

bytewise compare varint encoded int64's

I'm using levigo, the leveldb bindings for Go. My keys are int64's and need to be kept sorted. By default, leveldb uses a bytewise comparator so I'm trying to use varint encoding.
func i2b(x int64) []byte {
b := make([]byte, binary.MaxVarintLen64)
n := binary.PutVarint(b, x)
return key[:n]
}
My keys are not being sorted correctly. I wrote the following as a test.
var prev int64 = 0
for i := int64(1); i < 1e5; i++ {
if bytes.Compare(i2b(i), i2b(prev)) <= 0 {
log.Fatalf("bytewise: %d > %d", b2i(prev), i)
}
prev = i
}
output: bytewise: 127 > 128
playground
I'm not sure where the problem is. Am I doing the encoding wrong? Is varint not the right encoding to use?
EDIT:
BigEndian fixed width encoding is bytewise comparable
func i2b(x int64) []byte {
b := make([]byte, 8)
binary.BigEndian.PutUint64(b, uint64(x))
return b
}
The varint encoding is not bytewise comparable* wrt to the order of the values it caries. One option how to write the ordering/collating function (cmp bellow) is for example:
package main
import (
"encoding/binary"
"log"
)
func i2b(x int64) []byte {
var b [binary.MaxVarintLen64]byte
return b[:binary.PutVarint(b[:], x)]
}
func cmp(a, b []byte) int64 {
x, n := binary.Varint(a)
if n < 0 {
log.Fatal(n)
}
y, n := binary.Varint(b)
if n < 0 {
log.Fatal(n)
}
return x - y
}
func main() {
var prev int64 = 0
for i := int64(1); i < 1e5; i++ {
if cmp(i2b(i), i2b(prev)) <= 0 {
log.Fatal("fail")
}
prev = i
}
}
Playground
(*) The reason is (also) the bit fiddling performed.

Resources