Inverted Return from strings.Replace() Golang - go

I have a large dataset where I needed to do some string manipulation (I know strings are immutable). The Replace() function in the strings package does exactly what I need, except I need it to search in reverse.
Say I have this string: AA-BB-CC-DD-EE
Run this script:
package main
import (
"fmt"
"strings"
)
func main() {
fmt.Println(strings.Replace("AA-BB-CC-DD-EE", "-", "", 1))
}
It outputs: AABB-CC-DD-EE
What I need is: AA-BBCCDDEE, where the first instance of the search key is kept and every later instance is removed.
Splitting the string, inserting the dash, and joining it back together works, but I suspect there is a more performant way to achieve this.

String slices!
in := "AA-BB-CC-DD-EE"
afterDash := strings.Index(in, "-") + 1
fmt.Println(in[:afterDash] + strings.Replace(in[afterDash:], "-", "", -1))
(might require some tweaking to get the behavior you want in the case that the input has no dashes).
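The slicing idea above can be wrapped in a small helper that spells out the no-separator case. This is only a sketch: keepFirst is a hypothetical name, not a function from the strings package, and it assumes Go 1.18 or later for strings.Cut.

```go
package main

import (
	"fmt"
	"strings"
)

// keepFirst keeps the first occurrence of sep in s and removes every later one.
// keepFirst is a hypothetical helper name used only for this sketch.
func keepFirst(s, sep string) string {
	// strings.Cut splits s around the first occurrence of sep.
	before, after, found := strings.Cut(s, sep)
	if !found {
		return s // no separator at all: nothing to remove
	}
	return before + sep + strings.ReplaceAll(after, sep, "")
}

func main() {
	fmt.Println(keepFirst("AA-BB-CC-DD-EE", "-")) // AA-BBCCDDEE
	fmt.Println(keepFirst("AABBCC", "-"))         // AABBCC
}
```

Only the part after the first separator is scanned for replacements, so the string is still copied once, as in the one-liner.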

This can be another solution
package main
import (
"strings"
"fmt"
)
func Reverse(s string) string {
n := len(s)
runes := make([]rune, n)
for _, rune := range s {
n--
runes[n] = rune
}
return string(runes[n:])
}
func main() {
S := "AA-BB-CC-DD-EE"
S = Reverse(strings.Replace(Reverse(S), "-", "", strings.Count(S, "-")-1))
fmt.Println(S)
}
Another solution:
package main
import (
"fmt"
"strings"
)
func main() {
S := strings.Replace("AA-BB-CC-DD-EE", "-", "*", 1)
S = strings.Replace(S, "-", "", -1)
fmt.Println(strings.Replace( S, "*", "-", 1))
}

I think you want to use strings.Map rather than rigging things with compositions of functions. It's basically meant for this scenario: character replacement with more complex requirements than Replace and cousins can handle. The definition:
Map returns a copy of the string s with all its characters modified according to the mapping function. If mapping returns a negative value, the character is dropped from the string with no replacement.
Your mapping function can be built with a fairly simple closure:
func makeReplaceFn(toReplace rune, skipCount int) func(rune) rune {
count := 0
return func(r rune) rune {
if r == toReplace && count < skipCount {
count++
} else if r == toReplace && count >= skipCount {
return -1
}
return r
}
}
From there, it's a very straightforward program:
strings.Map(makeReplaceFn('-', 1), "AA-BB-CC-DD-EE")
Run on the Playground, this produces the desired output:
AA-BBCCDDEE
I'm not sure whether this is faster or slower than the other solutions without benchmarking. On one hand, it has to call a function for each rune in the string; on the other hand, it doesn't have to convert (and thus copy) between a []byte/[]rune and a string between function calls. The subslicing answer by hobbs is probably the best overall.
In addition, the method can be easily adapted to other scenarios (e.g. retaining every other dash), with the caveat that strings.Map can only do rune to rune mapping, and not rune to string mapping like strings.Replace does.
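As an illustration of that adaptability, here is a sketch of the "retain every other dash" variant. The names makeEveryOtherFn and keepEveryOther are hypothetical, and the closure relies on strings.Map visiting the runes in order, as the closure in this answer already does.

```go
package main

import (
	"fmt"
	"strings"
)

// makeEveryOtherFn returns a mapping function for strings.Map that keeps the
// 1st, 3rd, 5th, ... occurrence of target and drops the 2nd, 4th, ...
// The closure carries state, so build a fresh one for every string.
func makeEveryOtherFn(target rune) func(rune) rune {
	seen := 0
	return func(r rune) rune {
		if r != target {
			return r
		}
		seen++
		if seen%2 == 0 {
			return -1 // a negative result tells strings.Map to drop the rune
		}
		return r
	}
}

// keepEveryOther applies a fresh mapping function to s.
func keepEveryOther(s string, target rune) string {
	return strings.Map(makeEveryOtherFn(target), s)
}

func main() {
	fmt.Println(keepEveryOther("AA-BB-CC-DD-EE", '-')) // AA-BBCC-DDEE
}
```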

This was a fun question to answer. While the solutions offered work neatly, splitting and rejoining, to say nothing of calling Replace three times, doesn't seem likely to be performant.
The answer? Don't reinvent the wheel. The Go standard library has already almost solved this problem with Replace(), so let's tweak it. I stumbled a bit over how the API of the new function should work, and finally settled on keeping the signature of strings.Replace and changing only the meaning of the last parameter:
func ReplaceAfter(s,old,new string,skip int) string
The parameter skip replaces n to make its meaning clear: the caller specifies how many instances of old to leave untouched before replacing starts. skip==0 is defined as replacing every instance, and skip==-1 is defined as replacing no instances.
From here there were really only a few bits of the function that needed changing.
func ReplaceAfter(s, old, new string, skip int) string {
if old == new || skip < 0 { // changed: any negative skip (including -1) replaces nothing
return s // avoid allocation
}
// Compute number of replacements.
m := strings.Count(s, old)
if m == 0 || m < skip { // changed
return s // avoid allocation
} // changed (removed else if)
// Apply replacements to buffer.
n := m - skip // changed, n means the same thing but is calculated
t := make([]byte, len(s)+n*(len(new)-len(old))) // longer buffer
w := 0
start := 0
for i := 0; i < m; i++ {
j := start
if len(old) == 0 {
if i > 0 {
_, wid := utf8.DecodeRuneInString(s[start:])
j += wid
}
} else {
j += strings.Index(s[start:], old)
}
if i >= skip { // changed, replace
w += copy(t[w:], s[start:j])
w += copy(t[w:], new)
} else { // changed, skip ahead
w += copy(t[w:], s[start:j+len(old)])
}
start = j + len(old)
}
w += copy(t[w:], s[start:])
return string(t[0:w])
}
Here's a playground link with a working demo. If you're interested, I also copied and adapted the relevant Test functions from go/src/strings/, to make sure that the function as written behaved itself predictably.
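For readers without access to the linked tests, one way to sanity-check ReplaceAfter is to compare it against a shorter, slower reference built from strings.SplitN. The sketch below is an assumption-laden stand-in, not the adapted tests mentioned above: replaceAfterSplit is a hypothetical name, it only handles a non-empty old, and it treats any negative skip as "replace nothing".

```go
package main

import (
	"fmt"
	"strings"
)

// replaceAfterSplit leaves the first skip occurrences of old untouched and
// replaces every later occurrence with new. Hypothetical reference helper;
// it allocates more than ReplaceAfter and ignores the empty-old case.
func replaceAfterSplit(s, old, new string, skip int) string {
	if skip < 0 || old == "" || old == new {
		return s
	}
	// Split off the first skip occurrences; the final part is the unsplit rest.
	parts := strings.SplitN(s, old, skip+1)
	if len(parts) < skip+1 {
		return s // fewer than skip occurrences: nothing left to replace
	}
	last := len(parts) - 1
	parts[last] = strings.ReplaceAll(parts[last], old, new)
	// Joining with old restores exactly the skipped occurrences.
	return strings.Join(parts, old)
}

func main() {
	fmt.Println(replaceAfterSplit("AA-BB-CC-DD-EE", "-", "", 1)) // AA-BBCCDDEE
	fmt.Println(replaceAfterSplit("AA-BB-CC-DD-EE", "-", "", 0)) // AABBCCDDEE
	fmt.Println(replaceAfterSplit("AA-BB-CC-DD-EE", "-", "+", 2)) // AA-BB-CC+DD+EE
}
```

Feeding the same inputs to both functions and comparing the results gives a quick cross-check for the non-empty old cases.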

Related

how to invoke a function from the result of another function

package main
import "fmt"
func Reverse(str string) string {
r := ""
for i := len(str) - 1; i >= 0; i-- {
r += string(str[i])
// fmt.Println(r)
}
return r
}
func Generate(str string) string {
str = Reverse(str)
// vowel := ""
for _, rne := range str {
if rne == 'a' {
str += "A"
}
if rne == 'e' {
str += "E"
}
if rne == 'i' {
str += "I"
}
if rne == 'o' {
str += "O"
}
if rne == 'u' {
str += "U"
}
}
return Reverse(str)
}
func main() {
fmt.Println(...("haigolang123"))
}
This program should take the result of one function and pass it to the next.
I'm wondering how to invoke a function on the result of another function.
The expected output is "321gnAlOgIAh".
I didn't get why you are trying to reverse the string twice if your input is haigolang123 and your expected output is 321gnAlOgIAh. Let's refactor step by step.
For vowels, if all you need to do is convert lower case to upper case, you can directly subtract 32 from the rune (since 'a'=97 and 'A'=65). So, use a function to factor out the membership check.
func in(c rune, list []rune) bool {
for _, l := range list {
if c == l {
return true
}
}
return false
}
It can be used as follows:
vowelsLower := []rune{'a', 'e', 'i', 'o', 'u'}
// Some code here
if in(c, vowelsLower) {
result += string(c-32)
}
There are many ways to append strings; refer here when working specifically with strings. However, we are working with individual characters, so it is easier to append to a byte slice (assuming ASCII input). Looking at the bigger picture, a []byte can be converted directly to a string when needed.
var result []byte
// Some code here
if in(c, vowelsLower) {
result = append(result, byte(c-32)) // 32 == 'a' - 'A'
}
While returning,
return string(result)
This is your code with these changes.
Additionally, why iterate twice (once in Generate, and again in Reverse)? Try iterating in reverse and switching the vowel case in the same pass. The noticeable difference with this approach is that it uses bytes directly.
Ranging over a string yields runes. Indexing a string yields bytes. Of course, each can be converted to the other.
Since we were already using bytes in the previous approach, the code looks like this.
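The linked code is not reproduced here, so the following is only a minimal sketch of the single-pass idea, assuming ASCII input. isLowerVowel is a hypothetical helper name, and this version may differ from the original link.

```go
package main

import "fmt"

// isLowerVowel reports whether b is one of the ASCII vowels a, e, i, o, u.
func isLowerVowel(b byte) bool {
	switch b {
	case 'a', 'e', 'i', 'o', 'u':
		return true
	}
	return false
}

// Generate walks s from the last byte to the first, upper-casing lower-case
// vowels on the way, so the string is reversed and converted in one pass.
// Assumption: s is ASCII; multi-byte UTF-8 characters would be corrupted.
func Generate(s string) string {
	result := make([]byte, 0, len(s))
	for i := len(s) - 1; i >= 0; i-- {
		b := s[i]
		if isLowerVowel(b) {
			b -= 'a' - 'A' // subtract 32 to get the upper-case letter
		}
		result = append(result, b)
	}
	return string(result)
}

func main() {
	fmt.Println(Generate("haigolang123")) // 321gnAlOgIAh
}
```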
Happy coding!!
In Go, write
package main
import "fmt"
func toUpper(r rune) rune {
switch r {
case 'a', 'e', 'i', 'o', 'u':
r &= 0b1101_1111
}
return r
}
func Generate(s string) string {
g := []rune(s)
for i, j := 0, len(g)-1; i <= j; i, j = i+1, j-1 {
g[i], g[j] = toUpper(g[j]), toUpper(g[i])
}
return string(g)
}
func main() {
s := "haigolang123"
fmt.Printf("%q\n", s)
g := Generate(s)
fmt.Printf("%q\n", g)
}
https://go.dev/play/p/pGRas6qsi8O
"haigolang123"
"321gnAlOgIAh"
Go is designed for efficient solutions.
In Go, strings are immutable. Concatenating strings a and b creates a new string of length len(a) + len(b) and copies both a and b into it. It can get expensive.
Testing characters for all the vowels, even after you have matched one, is unnecessary.
Refactor your functional decomposition of Generate to include reversing a string while using a toUpper function for vowels.

How to find the distance between two runes

I'm trying to solve a couple of example programming problems to familiarize myself with the language.
I am iterating over a string as follows:
func main() {
fullFile := "abcdDefF"
for i := 1; i < len(fullFile); i++ {
println(fullFile[i-1], fullFile[i], fullFile[i-1]-fullFile[i])
}
}
In the loop I want to get the difference between the current rune and the previous rune (trying to identify lower-case/upper-case pairs by finding any pairs where the difference is 32).
Strangely, the subtraction doesn't work properly (it seems to yield addition in cases where I would expect a negative number), although I would expect it to since runes are represented by int32.
Figured it out: indexing a string returns a byte (uint8), not a rune, and unsigned subtraction wraps around instead of going negative.
Explicitly converting to int makes everything work as expected.
func main() {
fullFile, _ := ioutil.ReadFile("input/input.txt")
previous := 0
current := 0
for i := 1; i < len(fullFile); i++ {
previous = int(fullFile[i-1])
current = int(fullFile[i])
println(current, previous, current-previous)
}
}
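To make the wrap-around concrete, here is a small sketch comparing byte subtraction with int subtraction on the sample string from the question. The helper names byteDiff, intDiff, and isCasePair are hypothetical, and isCasePair assumes the input consists of ASCII letters only.

```go
package main

import "fmt"

// byteDiff subtracts as uint8, so a negative result wraps around modulo 256.
func byteDiff(a, b byte) byte { return a - b }

// intDiff converts to int first, so the result can be negative.
func intDiff(a, b byte) int { return int(a) - int(b) }

// isCasePair reports whether a and b are exactly 32 apart in either direction.
// Assumption: both bytes are ASCII letters.
func isCasePair(a, b byte) bool {
	d := intDiff(a, b)
	return d == 32 || d == -32
}

func main() {
	s := "abcdDefF"
	for i := 1; i < len(s); i++ {
		prev, cur := s[i-1], s[i]
		fmt.Printf("%c %c byte=%d int=%d pair=%t\n",
			prev, cur, byteDiff(cur, prev), intDiff(cur, prev), isCasePair(prev, cur))
	}
}
```

For the pair 'd', 'D' the byte difference is 224 while the int difference is -32, which is the "addition" effect described above.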

Better to compare slices or bytes?

I'm just curious on which of these methods is better (or if there's an even better one that I'm missing). I'm trying to determine if the first letter and last letter of a word are the same, and there are two obvious solutions to me.
if word[:1] == word[len(word)-1:]
or
if word[0] == word[len(word)-1]
As I understand it, the first is just pulling slices of the string and doing a string comparison, while the second is pulling the character from either end and comparing as bytes.
I'm curious if there's a performance difference between the two, and if there's any "preferable" way to do this?
In Go, strings are UTF-8 encoded. UTF-8 is a variable-length encoding.
package main
import "fmt"
func main() {
word := "世界世"
fmt.Println(word[:1] == word[len(word)-1:])
fmt.Println(word[0] == word[len(word)-1])
}
Output:
false
false
If you really want to compare a byte, not a character, then be as precise as possible for the compiler. Obviously, compare a byte, not a slice.
BenchmarkSlice-4 200000000 7.55 ns/op
BenchmarkByte-4 2000000000 1.08 ns/op
package main
import "testing"
var word = "word"
func BenchmarkSlice(b *testing.B) {
for i := 0; i < b.N; i++ {
if word[:1] == word[len(word)-1:] {
}
}
}
func BenchmarkByte(b *testing.B) {
for i := 0; i < b.N; i++ {
if word[0] == word[len(word)-1] {
}
}
}
If by letter you mean rune, then use:
func eqRune(s string) bool {
if s == "" {
return false // or true if that makes more sense for the app
}
f, _ := utf8.DecodeRuneInString(s) // 2nd return value is rune size. ignore it.
l, _ := utf8.DecodeLastRuneInString(s) // 2nd return value is rune size. ignore it.
if f != l {
return false
}
if f == unicode.ReplacementChar {
// First and last are invalid UTF-8. Fallback to
// comparing bytes.
return s[0] == s[len(s)-1]
}
return true
}
If you mean byte, then use:
func eqByte(s string) bool {
if s == "" {
return false // or true if that makes more sense for the app
}
return s[0] == s[len(s)-1]
}
Comparing individual bytes is faster than comparing string slices as shown by the benchmark in another answer.
A string is a sequence of bytes. Your method works if you know the string contains only ASCII characters. Otherwise, you should use a method that handles multibyte characters instead of string indexing. You can convert it to a rune slice to process code points or characters, like this:
r := []rune(s)
return r[0] == r[len(r) - 1]
You can read more about strings, byte slices, runes, and code points in the official Go Blog post on the subject.
To answer your question, there's no significant performance difference between the two index expressions you posted.
Here's a runnable example:
package main
import "fmt"
func EndsMatch(s string) bool {
r := []rune(s)
if len(r) == 0 {
return false // an empty string has no first or last rune to compare
}
return r[0] == r[len(r)-1]
}
func main() {
tests := []struct{
s string
e bool
}{
{"foo", false},
{"eve", true},
{"世界世", true},
}
for _, t := range tests {
r := EndsMatch(t.s)
if r != t.e {
fmt.Printf("EndsMatch(%s) failed: expected %t, got %t\n", t.s, t.e, r)
}
}
}
Prints nothing.

Golang: find first character in a String that doesn't repeat

I'm trying to write a function that finds the first character in a string that doesn't repeat. So far I have this:
package main
import (
"fmt"
"strings"
)
func check(s string) string {
ss := strings.Split(s, "")
smap := map[string]int{}
for i := 0; i < len(ss); i++ {
(smap[ss[i]])++
}
for k, v := range smap {
if v == 1 {
return k
}
}
return ""
}
func main() {
fmt.Println(check("nebuchadnezzer"))
}
Unfortunately, in Go there's no guarantee of order when you iterate over a map, so every time I run the code I get a different value. Any pointers?
Using a map and two loops:
func check(s string) string {
m := make(map[rune]uint, len(s)) //preallocate the map size
for _, r := range s {
m[r]++
}
for _, r := range s {
if m[r] == 1 {
return string(r)
}
}
return ""
}
The benefit of this is that it uses just two loops over the string, versus the nested loops you get with strings.ContainsRune and strings.IndexRune (each of those functions loops internally).
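A variant of the same two-pass idea can return the rune together with a boolean, which separates "no unique character" from an empty result. This is a sketch; firstUniqueRune is a hypothetical name, not part of the answer above.

```go
package main

import "fmt"

// firstUniqueRune returns the first rune that occurs exactly once in s.
// The boolean is false when every rune repeats or s is empty.
func firstUniqueRune(s string) (rune, bool) {
	counts := make(map[rune]int)
	for _, r := range s { // first pass: count every rune
		counts[r]++
	}
	for _, r := range s { // second pass: string order, not map order
		if counts[r] == 1 {
			return r, true
		}
	}
	return 0, false
}

func main() {
	for _, s := range []string{"nebuchadnezzer", "aabb", "世界世"} {
		if r, ok := firstUniqueRune(s); ok {
			fmt.Printf("%q: %c\n", s, r)
		} else {
			fmt.Printf("%q: no unique character\n", s)
		}
	}
}
```

The second loop ranges over the string rather than the map, which is what makes the result deterministic.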
Efficient (in time and memory) algorithms for grabbing all or the first unique byte http://play.golang.org/p/ZGFepvEXFT:
func FirstUniqueByte(s string) (b byte, ok bool) {
occur := [256]byte{}
order := make([]byte, 0, 256)
for i := 0; i < len(s); i++ {
b = s[i]
switch occur[b] {
case 0:
occur[b] = 1
order = append(order, b)
case 1:
occur[b] = 2
}
}
for _, b = range order {
if occur[b] == 1 {
return b, true
}
}
return 0, false
}
As a bonus, the above function should never generate any garbage. Note that I changed your function signature to be a more idiomatic way to express what you're describing. If you need a func(string) string signature anyway, then the point is moot.
That can certainly be optimized, but one solution (which isn't using map) would be:
func check(s string) string {
unique := ""
for pos, c := range s {
if strings.ContainsRune(unique, c) {
unique = strings.Replace(unique, string(c), "", -1)
} else if strings.IndexRune(s, c) == pos {
unique = unique + string(c)
}
}
fmt.Println("All unique characters found: ", unique)
if len(unique) > 0 {
_, size := utf8.DecodeRuneInString(unique)
return unique[:size]
}
return ""
}
This is after the question "Find the first un-repeated character in a string"
krait suggested below that the function should:
return a string containing the first full rune, not just the first byte of the utf8 encoding of the first rune.

Go Unpacking Array As Arguments

So in Python and Ruby there is the splat operator (*) for unpacking an array as arguments. In Javascript there is the .apply() function. Is there a way of unpacking an array/slice as function arguments in Go? Any resources for this would be great as well!
Something along the lines of this:
func my_func(a, b int) (int) {
return a + b
}
func main() {
arr := []int{2,4}
sum := my_func(arr)
}
You can use a vararg syntax similar to C:
package main
import "fmt"
func my_func(args ...int) int {
sum := 0
for _, v := range args {
sum = sum + v
}
return sum
}
func main() {
arr := []int{2,4}
sum := my_func(arr...)
fmt.Println("Sum is ", sum)
}
Now you can sum as many things as you'd like. Notice the important ... after the slice when you call the my_func function.
Running example: http://ideone.com/8htWfx
Either your function is variadic, in which case you can pass a slice with the ... notation as Hunter McMillen shows, or your function has a fixed number of arguments and you unpack them yourself when writing your code.
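For the fixed-arity case, unpacking by hand could look like the sketch below. callWithPair is a hypothetical wrapper, and myFunc stands in for the my_func from the question; the length check replaces the arity check that a dynamic language would perform at call time.

```go
package main

import (
	"errors"
	"fmt"
)

// myFunc stands in for the two-argument my_func from the question.
func myFunc(a, b int) int {
	return a + b
}

// callWithPair unpacks a slice into myFunc's two parameters by hand.
// Hypothetical wrapper: it rejects slices that do not hold exactly two values.
func callWithPair(args []int) (int, error) {
	if len(args) != 2 {
		return 0, errors.New("callWithPair: need exactly 2 arguments")
	}
	return myFunc(args[0], args[1]), nil
}

func main() {
	sum, err := callWithPair([]int{2, 4})
	fmt.Println(sum, err) // 6 <nil>
}
```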
If you really want to do this dynamically on a function of fixed number of arguments, you can use reflection:
package main
import "fmt"
import "reflect"
func my_func(a, b int) (int) {
return a + b
}
func main() {
arr := []int{2,4}
var args []reflect.Value
for _, x := range arr {
args = append(args, reflect.ValueOf(x))
}
fun := reflect.ValueOf(my_func)
result := fun.Call(args)
sum := result[0].Interface().(int)
fmt.Println("Sum is ", sum)
}
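reflect.Value.Call panics when the argument count or types do not match, so a more defensive version might inspect the function type first. The sketch below is one possible shape, not the only one: callInts is a hypothetical helper, and it assumes a non-variadic function whose parameters and single result are all plain int.

```go
package main

import (
	"fmt"
	"reflect"
)

func myFunc(a, b int) int {
	return a + b
}

// callInts calls fn with the values in args after checking that fn is a
// non-variadic function taking len(args) ints and returning one int.
// Hypothetical helper for this sketch only.
func callInts(fn interface{}, args []int) (int, error) {
	v := reflect.ValueOf(fn)
	if v.Kind() != reflect.Func {
		return 0, fmt.Errorf("callInts: %T is not a function", fn)
	}
	t := v.Type()
	intType := reflect.TypeOf(0)
	if t.IsVariadic() || t.NumIn() != len(args) {
		return 0, fmt.Errorf("callInts: cannot pass %d arguments to %s", len(args), t)
	}
	if t.NumOut() != 1 || t.Out(0) != intType {
		return 0, fmt.Errorf("callInts: %s does not return a single int", t)
	}
	in := make([]reflect.Value, len(args))
	for i, x := range args {
		if t.In(i) != intType {
			return 0, fmt.Errorf("callInts: parameter %d of %s is not int", i, t)
		}
		in[i] = reflect.ValueOf(x)
	}
	return int(v.Call(in)[0].Int()), nil
}

func main() {
	fmt.Println(callInts(myFunc, []int{2, 4})) // 6 <nil>
	_, err := callInts(myFunc, []int{2})
	fmt.Println(err != nil) // true
}
```

The checks turn a runtime panic into an ordinary error, at the cost of restricting the helper to int-only signatures.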
https://play.golang.org/p/2nN6kjHXIsd
I had a reason to unpack some vars from a map[string]string with single quotes around some of them as well as without. Here's the logic for it and the play link up top has the full working snippet.
func unpack(a map[string]string) string {
var stmt, val string
var x, y []string
for k, v := range a {
x = append(x, k)
y = append(y, "'"+v+"'")
}
stmt = "INSERT INTO tdo.rca_trans_status (" + strings.Join(x, ", ")
val = ") VALUES (" + strings.Join(y, ",") + ");"
return stmt + val
}
Which presents cleanly for a mssql query as:
INSERT INTO tdo.rca_trans_status (rca_json_body, original_org, md5sum, updated, rca_key) VALUES ('blob','EG','2343453463','2009-11-10 23:00:00','prb-180');
No, there's no direct support for this in the language. Python and Ruby, as well as the JavaScript you mention, are all dynamic/scripting languages. Go is far closer to, for example, C than to any dynamic language. The 'apply' functionality is handy for dynamic languages, but of little use for static languages like C or Go.

Resources