How to convert any negative value to zero with bitwise operators? - go

I'm writing the PopBack() operation for a LinkedList in Go, the code looks like this:
// PopBack will remove an item from the end of the linked list
func (ll *LinkedList) PopBack() {
    lastNode := &ll.node
    for *lastNode != nil && (*lastNode).next != nil {
        lastNode = &(*lastNode).next
    }
    *lastNode = nil
    if ll.Size() != 0 {
        ll.size -= 1
    }
}
I don't like the last if clause; if the size is zero we don't want to decrement it to a negative value. I was wondering: is there a bitwise operation that, whatever the value is after the decrement, converts it to zero only when it is negative?

Negative values have the sign bit set, so you can do this:
ll.size += (-ll.size >> 31)
Suppose ll.size is int32 and ll.Size() returns ll.size. Of course this also implies that size is never negative. When the size is positive, the right shift sign-extends -ll.size to -1; when the size is zero, the result is 0.
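To see the trick in isolation, here is a small standalone sketch (my own illustration, not part of the answer), assuming a signed 32-bit size:
func clampedDecrement(size int32) int32 {
    // -size >> 31 is -1 when size > 0 (sign-extended) and 0 when size == 0,
    // so this subtracts 1 from a positive size and leaves a zero size at zero.
    return size + (-size >> 31)
}
clampedDecrement(5) // 4
clampedDecrement(1) // 0
clampedDecrement(0) // 0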
If ll.size is int64 then change the shift count to 63. If ll.size is uint64 you can simply cast to int64, provided the size is never larger than 2^63. But if the size can be that large (although that's almost impossible to occur even in the far future) then things are much trickier:
mask := uint64(-int64(ll.size >> 63)) // all ones if ll.size >= (1 << 63)
ll.size = ((ll.size - 1) & mask) | ((ll.size + uint64(-int64(ll.size) >> 63)) & ^mask)
It's basically a bitwise mux of the kind commonly used in bit hacks, because in Go you cannot cast a bool to an int without an if.
Neither of these is particularly readable at first glance, so the if block is usually better.

Trade the nil check in each iteration of the loop for a single nil check before the loop. With this change, the loop runs faster and size can simply be decremented unconditionally.
func (ll *LinkedList) PopBack() {
    if ll.node == nil {
        return
    }
    lastNode := &ll.node
    for (*lastNode).next != nil {
        lastNode = &(*lastNode).next
    }
    *lastNode = nil
    ll.size -= 1
}

Related

Why is accessing a variable so much slower than accessing len()?

I wrote this function uniq that takes in a sorted slice of ints
and returns the slice with duplicates removed:
func uniq(x []int) []int {
    i := 0
    for i < len(x)-1 {
        if x[i] == x[i+1] {
            copy(x[i:], x[i+1:])
            x = x[:len(x)-1]
        } else {
            i++
        }
    }
    return x
}
and uniq2, a rewrite of uniq with the same results:
func uniq2(x []int) []int {
    i := 0
    l := len(x)
    for i < l-1 {
        if x[i] == x[i+1] {
            copy(x[i:], x[i+1:])
            l--
        } else {
            i++
        }
    }
    return x[:l]
}
The only difference between the two functions
is that in uniq2, instead of slicing x
and directly accessing len(x) each time,
I save len(x) to a variable l
and decrement it whenever I shift the slice.
I thought that uniq2 would be slightly faster than uniq
because len(x) would no longer be called each iteration,
but in reality, it is inexplicably much slower.
With this test that generates a random sorted slice
and calls uniq/uniq2 on it 1000 times,
which I run on Linux:
func main() {
    rand.Seed(time.Now().Unix())
    for i := 0; i < 1000; i++ {
        _ = uniq(genSlice())
        //_ = uniq2(genSlice())
    }
}

func genSlice() []int {
    x := make([]int, 0, 1000)
    for num := 1; num <= 10; num++ {
        amount := rand.Intn(1000)
        for i := 0; i < amount; i++ {
            x = append(x, num)
        }
    }
    return x
}
$ go build uniq.go
$ time ./uniq
uniq usually takes 5-6 seconds to finish,
while uniq2 is more than two times slower,
taking between 12 and 15 seconds.
Why is uniq2, where I save the slice length to a variable,
so much slower than uniq, where I directly call len?
Shouldn't it be slightly faster?
You expect roughly the same execution time because you think they do roughly the same thing.
The only difference between the two functions is that in uniq2, instead of slicing x and directly accessing len(x) each time, I save len(x) to a variable l and decrement it whenever I shift the slice.
This is wrong.
The first version does:
copy(x[i:], x[i+1:])
x = x[:len(x)-1]
And second does:
copy(x[i:], x[i+1:])
l--
The first difference is that the first version assigns (copies) a slice header, which is a reflect.SliceHeader value of three integers (24 bytes on a 64-bit architecture), while l-- does a simple decrement, which is much faster.
But the main difference does not stem from this. The main difference is that since the first version changes the x slice (the header, the length included), you end up copying fewer and fewer elements, while the second version does not change x and always copies to the end of the slice: x[i+1:] is equivalent to x[i+1:len(x)].
To demonstrate, imagine you pass a slice with length=10 and having all equal elements. The first version will copy 9 elements first, then 8, then 7 etc. The second version will copy 9 elements first, then 9 again, then 9 again etc.
Let's modify your functions to count the number of copied elements:
func uniq(x []int) []int {
    count := 0
    i := 0
    for i < len(x)-1 {
        if x[i] == x[i+1] {
            count += copy(x[i:], x[i+1:])
            x = x[:len(x)-1]
        } else {
            i++
        }
    }
    fmt.Println("uniq copied", count, "elements")
    return x
}

func uniq2(x []int) []int {
    count := 0
    i := 0
    l := len(x)
    for i < l-1 {
        if x[i] == x[i+1] {
            count += copy(x[i:], x[i+1:])
            l--
        } else {
            i++
        }
    }
    fmt.Println("uniq2 copied", count, "elements")
    return x[:l]
}
Testing it:
uniq(make([]int, 1000))
uniq2(make([]int, 1000))
Output is:
uniq copied 499500 elements
uniq2 copied 998001 elements
uniq2() copies twice as many elements!
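That matches the arithmetic: with 1000 equal elements, uniq copies 999 + 998 + ... + 1 = 999*1000/2 = 499500 elements, while uniq2 performs 999 shifts that each copy 999 elements, i.e. 999*999 = 998001.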
If we test it with a random slice:
uniq(genSlice())
uniq2(genSlice())
Output is:
uniq copied 7956671 elements
uniq2 copied 11900262 elements
Again, uniq2() copies roughly 1.5 times more elements! (But this greatly depends on the random numbers.)
Try the examples on the Go Playground.
The "fix" is to modify uniq2() to copy until l:
copy(x[i:], x[i+1:l])
l--
With this "appropriate" change, performance is roughly the same.

how to manipulate very long string to avoid out of memory with golang

I'm trying, for personal skills improvement, to solve the HackerRank challenge:
There is a string, s, of lowercase English letters that is repeated infinitely many times. Given an integer, n, find and print the number of letter a's in the first n letters of the infinite string.
1 <= len(s) <= 100 && 1 <= n <= 10^12
Very naively I thought this code would be fine:
fs := strings.Repeat(s, int(n)) // full string
ss := fs[:n] // sub string
fmt.Println(strings.Count(ss, "a"))
Obviously I exploded the memory and got an "out of memory" error.
I've never faced this kind of issue, and I'm clueless about how to handle it.
How can I manipulate a very long string and avoid running out of memory?
I hope this helps: you don't have to actually count by running through the string; that is the naive approach. You need to use some basic arithmetic to get the answer without running out of memory. I hope the comments help.
// Wrapped in a function here (signature assumed, not part of the original answer)
// so the fragment is self-contained.
func countAs(s string, n int64) int64 {
    var answer int64
    // 1st figure out how many a's are present in s.
    aCount := int64(strings.Count(s, "a"))
    // How many times will s repeat in its entirety if it had to be of length n?
    repeats := n / int64(len(s))
    remainder := n % int64(len(s))
    // If n/len(s) is not perfectly divisible, there is a remainder; check if that's the case.
    // If s is of length 5 and the value of n = 22, then the first 2 characters of s would repeat an extra time.
    if remainder > 0 {
        aCountInRemainder := strings.Count(s[:remainder], "a")
        answer = (aCount * repeats) + int64(aCountInRemainder)
    } else {
        answer = aCount * repeats
    }
    return answer
}
There might be other methods but this is what came to my mind.
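As a quick sanity check of the arithmetic (using the countAs wrapper assumed above): for s = "aba" and n = 10, the first ten letters of the infinite string are "abaabaabaa", which contain 7 a's, and the formula gives 2*3 + 1 = 7 as well:
fmt.Println(countAs("aba", 10)) // 7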
As you found out, if you actually generate the string you will end up having that huge memory block in RAM.
One common way to represent a "big sequence of incoming bytes" is to implement it as an io.Reader (which you can view as a stream of bytes), and have your code run a r.Read(buff) loop.
Given the specifics of the exercise you mention (a fixed string repeated n times), the number of occurrences of a specific letter can also be computed straight from the number of occurrences of that letter in s, plus something more (I'll let you figure out what multiplications and counting should be done).
How do you implement a Reader that repeats the string without allocating it 10^12 times?
Note that, when implementing the .Read() method, the caller has already allocated its buffer. You don't need to repeat your string in memory, you just need to fill the buffer with the correct values -- for example by copying your data into the buffer byte by byte.
Here is one way to do it:
type RepeatReader struct {
    str   string
    count int
}

func (r *RepeatReader) Read(p []byte) (int, error) {
    if r.count == 0 {
        return 0, io.EOF
    }
    // at each iteration, pos will hold the number of bytes copied so far
    var pos = 0
    for r.count > 0 && pos < len(p) {
        // to copy slices over, you can use the built-in 'copy' function
        // at each iteration, you need to write bytes *after* the ones you have already copied,
        // hence the "p[pos:]"
        n := copy(p[pos:], r.str)
        // update the amount of copied bytes
        pos += n
        // bad computation for this first example:
        // I decrement one complete count, even if str was only partially copied
        r.count--
    }
    return pos, nil
}
https://go.dev/play/p/QyFQ-3NzUDV
To have a complete, correct implementation, you also need to keep track of the offset you need to start from next time .Read() is called:
type RepeatReader struct {
    str    string
    count  int
    offset int
}

func (r *RepeatReader) Read(p []byte) (int, error) {
    if r.count == 0 {
        return 0, io.EOF
    }
    var pos = 0
    for r.count > 0 && pos < len(p) {
        // when copying over to p, you should start at r.offset:
        n := copy(p[pos:], r.str[r.offset:])
        pos += n
        // update r.offset:
        r.offset += n
        // if one full copy of str has been issued, decrement 'count' and reset 'offset' to 0
        if r.offset == len(r.str) {
            r.count--
            r.offset = 0
        }
    }
    return pos, nil
}
https://go.dev/play/p/YapRuioQcOz
You can now count the a's while iterating through this Reader.
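As an illustration of that last step, here is my own sketch (a hypothetical countAsStreaming helper, not part of the answer; it needs the bytes and io imports): wrap the reader in io.LimitReader so only the first n bytes are consumed, then count the 'a' bytes chunk by chunk, so memory stays bounded by the buffer size.
func countAsStreaming(s string, n int64) int64 {
    // enough full copies of s to cover at least n bytes
    repeats := int(n/int64(len(s))) + 1
    // LimitReader ends the stream with io.EOF after exactly n bytes
    r := io.LimitReader(&RepeatReader{str: s, count: repeats}, n)
    buf := make([]byte, 4096)
    var total int64
    for {
        m, err := r.Read(buf)
        total += int64(bytes.Count(buf[:m], []byte{'a'}))
        if err != nil {
            return total
        }
    }
}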

Convert string of numbers into 'binary representation'

I recently did the "Winning Lottery Ticket" coding challenge on HackerRank.
https://www.hackerrank.com/challenges/winning-lottery-ticket/
The idea is to count the combinations of two lines which together contain all numbers from 0-9; in the example below it's 5 combinations in total.
56789
129300455
5559948277
012334556
123456879
The idea is to change the representation to something quicker for checking whether all numbers are contained.
Example representation:
1278 --> 0110000110
Example using the first two lines from above:
56789129300455 --> 1111111111
When checking whether a number is contained in the concatenation of 2 lines, I can abort directly if I encounter a zero, because that's not going to be a pair covering all of 0-9.
This logic works, but it fails when there is a huge number of lines to compare.
// Go code
func winningLotteryTicket(tickets []string) int {
    counter := 0
    for i := 0; i < len(tickets); i++ {
        for j := i + 1; j < len(tickets); j++ {
            if err := bitMask(fmt.Sprintf("%v%v", tickets[i], tickets[j])); err == nil {
                counter++
            }
        }
    }
    return counter
}

func bitMask(s string) error {
    for i := 0; i <= 9; i++ {
        if !strings.Contains(s, strconv.Itoa(i)) {
            return errors.New("No Pair")
        }
    }
    return nil
}
Not sure if this representation is called a bitmask; if not, please correct me and I will adjust this post.
From my point of view there is no way to improve performance on the concatenation of the strings, because I have to check each combination.
For the check in the bitMask function of whether a number is contained within the string, I'm not sure.
Do you have an idea how this could perform better?
Bit masks are integers, not strings of ones and zeros. It's called a bitmask because we're not interested in the numerical value of these integers but only in the bit pattern. We can use bitwise operations on integers and those are really fast because they are implemented in hardware, directly in the CPU.
Here is a function that turns a string into an actual bitmask, with each one-bit signaling that a particular digit is present in the string:
func mask(s string) uint16 {
    // We need ten bits, one for each possible decimal digit in s, so int16 and
    // uint16 are the smallest possible integer types that fit. For bitmasks it
    // is typical to select an unsigned type because the sign bit doesn't have
    // any meaning. As I said earlier, mask's numerical value is irrelevant.
    var mask uint16
    for _, c := range s {
        switch c {
        case '0':
            mask |= 0b0000000001
        case '1':
            mask |= 0b0000000010
        case '2':
            mask |= 0b0000000100
        case '3':
            mask |= 0b0000001000
        case '4':
            mask |= 0b0000010000
        case '5':
            mask |= 0b0000100000
        case '6':
            mask |= 0b0001000000
        case '7':
            mask |= 0b0010000000
        case '8':
            mask |= 0b0100000000
        case '9':
            mask |= 0b1000000000
        }
    }
    return mask
}
This is rather verbose, but it should be pretty obvious what happens.
Note that the binary number literals can be replaced with bit shifts:
0b0000000001 is the same as 1<<0 (1 shifted zero times to the left)
0b0000000010 is the same as 1<<1 (1 shifted one time to the left)
0b0000000100 is the same as 1<<2 (1 shifted two times to the left), and so on
Using this, and taking advantage of the fact that the bytes '0' through '9' are themselves just integers (48 through 57 in decimal, given by their place in the ASCII table), we can shorten this function like so:
func mask(s string) uint16 {
    var mask uint16
    for _, c := range s {
        if '0' <= c && c <= '9' {
            mask |= 1 << (c - '0')
        }
    }
    return mask
}
To check two lines, then, all we have to do is OR the masks for the lines and compare to 0b1111111111 (i.e. check if all ten bits are set):
package main

import "fmt"

func main() {
    a := "56789"
    b := "129300455"
    mA := mask(a)
    mB := mask(b)
    fmt.Printf("mask(%11q) = %010b\n", a, mA)
    fmt.Printf("mask(%11q) = %010b\n", b, mB)
    fmt.Printf("combined = %010b\n", mA|mB)
    fmt.Printf("all digits present: %v\n", mA|mB == 0b1111111111)
}

func mask(s string) uint16 {
    var mask uint16
    for _, c := range s {
        if '0' <= c && c <= '9' {
            mask |= 1 << (c - '0')
        }
    }
    return mask
}
mask( "56789") = 1111100000
mask("129300455") = 1000111111
combined = 1111111111
all digits present: true
Try it on the playground: https://play.golang.org/p/mr1KqnC9phB
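To tie this back to the original winningLotteryTicket function, one possible way to use it (my own sketch, not from the answer) is to compute each ticket's mask once, so the pairwise check becomes a single OR-and-compare instead of building a concatenated string and calling strings.Contains ten times:
func winningLotteryTicket(tickets []string) int {
    masks := make([]uint16, len(tickets))
    for i, t := range tickets {
        masks[i] = mask(t) // mask as defined above
    }
    counter := 0
    for i := 0; i < len(masks); i++ {
        for j := i + 1; j < len(masks); j++ {
            if masks[i]|masks[j] == 0b1111111111 {
                counter++
            }
        }
    }
    return counter
}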

Why is leetcode saying my atoi answer is incorrect? Is it actually incorrect? Or is there a bug in leetcode?

I am doing the atoi problem in leetcode and I submitted my code below which isn't too important. I am wondering if it is a valid failure leetcode gave me. It seems like my code is doing the right thing.
Here is the problem description:
Here is the code:
const (
    MaxInt32 = 1<<31 - 1
    MinInt32 = -1 << 31
)

func myAtoi(str string) int {
    if len(str) < 1 {
        return 0
    }
    // count to keep track of the number of digits
    count := 0
    // container for digits
    values := make([]rune, 0)
    // constants we are going to need
    minus := "-"
    plus := "+"
    lastWasPrefix := false
    // is the number negative or lead with dash
    negative := false
    clean := strings.TrimSpace(str)
    for _, char := range clean {
        isNumber := unicode.IsNumber(char)
        isMinus := string(char) == minus
        isPlus := string(char) == plus
        isPrefix := isMinus || isPlus
        // checking for two prefixes following each other
        if isPrefix && lastWasPrefix {
            return 0
        }
        if isPrefix {
            lastWasPrefix = true
        }
        curLen := len(values)
        if !isNumber && !isPrefix && curLen == 0 {
            return 0
        }
        if !isNumber && !isPrefix && curLen != 0 {
            break
        }
        if isMinus {
            negative = true
            continue
        }
        if isNumber {
            // add value in order and inc. count
            values = append(values, char)
            count++
        }
    }
    postLen := len(values)
    if postLen == 0 {
        return 0
    }
    multiplier := int32(1)
    ten := int32(10)
    total := int32(0)
    for i := postLen - 1; i >= 0; i-- {
        diff := MaxInt32 - total
        added := CharToNum(values[i]) * multiplier
        // added will be zero if we overflow the int
        if added > diff || added < 0 {
            return MinInt32
        }
        total += added
        multiplier *= ten
    }
    if negative {
        return int(total * int32(-1))
    } else {
        return int(total)
    }
}

/**
a rune is a unicode char so we need to convert from the unicode code point to a digit
*/
func CharToNum(r rune) int32 {
    for i := 48; i <= 57; i++ {
        if int(r) == i {
            return int32(r) - 48
        }
    }
    return -1
}
Any help understanding this error would be much appreciated. I don't want any help with the algorithm. Is this a valid error or not?
Without checking your algorithm I can see the following in the error message:
The maximum 32-bit int value is 2,147,483,647, which is expected to be returned when you get a string representing a larger value than that (e.g. your input was "2147483648", which is larger by one). Your program apparently returns -2147483648.
The specification is ambiguous: "if the numerical value is out of the range of representable values, INT_MAX or INT_MIN is returned". The authors had in mind to return the value matching in sign, but this is not clearly stated.
So I would say that when you return INT_MIN for a number that is larger than INT_MAX, this could be considered correct (although it is somewhat illogical).
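If you do want the sign-matching behaviour the grader seems to expect, one option (a sketch against the overflow check in the question's code, not something this answer prescribes) is to clamp according to the sign that was parsed:
if added > diff || added < 0 {
    // overflow: clamp to the bound matching the parsed sign
    if negative {
        return MinInt32
    }
    return MaxInt32
}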

How can I identify a matching pattern for a given number using go?

I'm trying to identify pattern matches for a given telephone number range to use in a Cisco Communications Manager platform.
Essentially, an 'X' matches the numbers 0-9 in a telephone number, and you can specify a range of digits using the [x-y] notation.
Given a telephone number range of 02072221000-02072221149, consisting of 150 numbers, this should produce two patterns: 020722210XX and 020722211[0-4]X
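(As a check: 020722210XX covers the 100 numbers 02072221000-02072221099, and 020722211[0-4]X covers the 50 numbers 02072221100-02072221149, i.e. 150 in total.)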
Obviously I'd like it to work on any range provided. I just can't seem to get my head around how to generate those patterns given the number ranges.
Any thoughts would be greatly appreciated.
Many thanks.
I believe I found a decent algorithm which should handle this for you. I apologize ahead of time if any of the explanation isn't detailed enough, but a lot of this came to intuition which can be hard to explain.
I started with more simplified cases, figuring out a method for how to get the fewest number of patterns from a comparison. For my examples I'll be comparing 211234 to 245245.
After a bit of thinking I worked out that you need to take the range of numbers from the smaller number up to 9, and handle a special case for the lowest digit of the smaller number. To explain in a bit more detail: in the number 211234 the ideal is to represent the last digit as an X, but we can only do that in cases where the digit may be [0-9]. The only case in this example where we can't use [0-9] is when our tens digit is 3, because then we have a lower limit of 4 on the units digit. This logic then propagates up the rest of the number as we head toward the most significant digit. So for the tens digit in the next case we have a lower bound of 4 (one more than 3), because the case where the tens digit is 3 is handled separately. For our tens range we therefore end up with [4-9], because the next digit over does not restrict our range.
In fact we won't be restricted until the most significant digit, which is bounded by the numbers in the range between the numbers we're comparing. After working a few problems out by hand I noticed a bit of a pattern, a pyramid of Xs, in the cases where the numbers' digits were significantly apart:
compare: 211234
to: 245245
21123[4-9]
2112[4-9]X
211[3-9]XX
21[2-9]XXX
2[2-3]XXXX
24[0-4]XXX
245[0-1]XX
2452[0-3]X
24524[0-5]
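(As a sanity check, these nine patterns cover 6 + 60 + 700 + 8000 + 20000 + 5000 + 200 + 40 + 6 = 34012 numbers, exactly matching 245245 - 211234 + 1.)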
This was my first hint as to how to handle it: starting from the least significant digit and moving up, taking advantage of the symmetry, but handling the case where we hit the "top of the pyramid". This example is easy, though; there are many corner cases which will cause issues. For the sake of brevity I'm not going to go into detail, but I'll give a short explanation for each:
What do you do when the 2 compared digits has one number between them, such as between 4 and 6?
In this case simply use the single digit in place of a range.
What do you do when the 2 compared digits have no number between them, such as between 4 and 5?
In this case throw away the row in which you'd handle the numbers between the digits as all cases will be handled explicitly.
What do you do when the minimum number in the range is 8?
In this case when we add 1 to the number to get a lower bound for the range we get a 9, which means we can simply substitute in a 9 rather than a range of [9-9]
What do you do when the minimum number in the range is 9?
In this case we simply don't bother handling that number as when handling the next digit up it should be covered by its use of X
I'm sure I'm missing some corner cases which I handle in the code which I simply didn't think to put in this list. I'm willing to clarify any part of the code if you just leave a comment asking.
Below is my stab at it in Go. It could probably be a bit more DRY but this is what I came up with after fiddling for a bit. I'm also pretty new to Go so please notify me of any spirit fouls in the comments and I'll correct them.
I don't guarantee this will handle every case, but it handled every case I threw at it. It's up to you to turn it into a script which takes in 2 strings ;)
Edit: I just realized via the example in the question (which for some reason I never ran) that this doesn't always condense the provided range into the smallest number of outputs, but it should always give patterns which cover every case. Despite this drawback I think it's a good step in the right direction for you to work on top of. I'll update the answer if I find the time to get it to condense cases where the previous range is 1-9 and the special case is 0. The best way to do that might be to condense these cases "manually" after the initial generation.
package main

import (
    "fmt"
    "strconv"
)

func getStringFromMinAndMax(min int, max int) (string, bool) {
    minstr := strconv.Itoa(min)
    maxstr := strconv.Itoa(max)
    if max == min {
        return minstr, false
    }
    if max < min {
        return minstr, false
    }
    return "[" + minstr + "-" + maxstr + "]", true
}

func main() {
    str1 := "211234"
    str2 := "245245"
    diffLength := 0
    for i := 0; i < len(str1); i++ {
        diffLength = i + 1
        number1, _ := strconv.Atoi(str1[:len(str1)-i-1])
        number2, _ := strconv.Atoi(str2[:len(str2)-i-1])
        if number1 == number2 {
            break
        }
    }
    elems := (diffLength * 2) - 1
    output := make([]*[]string, elems+1)
    for i := 0; i < elems; i++ {
        newSlice := make([]string, diffLength)
        output[i] = &newSlice
    }
    for digit := 0; digit < diffLength; digit++ {
        for j := 0; j < diffLength; j++ {
            if j == digit {
                if output[j] != nil {
                    min, _ := strconv.Atoi(string(str1[len(str1)-(digit+1)]))
                    max := 9
                    if digit == diffLength-1 {
                        max, _ = strconv.Atoi(string(str2[len(str1)-(digit+1)]))
                        max = max - 1
                    }
                    if digit != 0 {
                        min = min + 1
                    }
                    if min < 10 {
                        maxchar := strconv.Itoa(max)[0]
                        minchar := strconv.Itoa(min)[0]
                        newVal, safe := getStringFromMinAndMax(min, max)
                        if digit == diffLength-1 && !safe && (str1[len(str1)-(digit+1)] == maxchar || str2[len(str2)-(digit+1)] == minchar) {
                            output[j] = nil
                        } else {
                            (*output[j])[diffLength-digit-1] = newVal
                        }
                    } else {
                        output[j] = nil
                    }
                }
                if j != diffLength-1 && output[elems-1-j] != nil {
                    min := 0
                    max, _ := strconv.Atoi(string(str2[len(str1)-(digit+1)]))
                    if digit != 0 {
                        max = max - 1
                    }
                    if max >= 0 {
                        newVal, _ := getStringFromMinAndMax(min, max)
                        (*output[elems-1-j])[diffLength-digit-1] = newVal
                    } else {
                        output[elems-1-j] = nil
                    }
                }
            } else {
                if j > digit {
                    if output[j] != nil {
                        (*output[j])[diffLength-digit-1] = "X"
                    }
                    if j != diffLength-1 && output[elems-1-j] != nil {
                        (*output[elems-1-j])[diffLength-digit-1] = "X"
                    }
                } else {
                    if output[j] != nil {
                        (*output[j])[diffLength-digit-1] = string(str1[len(str1)-digit-1])
                    }
                    if j != diffLength-1 && output[elems-1-j] != nil {
                        (*output[elems-1-j])[diffLength-digit-1] = string(str2[len(str2)-digit-1])
                    }
                }
            }
        }
    }
    for _, list := range output {
        if list != nil {
            if len(str1) != diffLength {
                fmt.Printf(str1[:len(str1)-diffLength])
            }
            for _, element := range *list {
                fmt.Printf(element)
            }
            fmt.Printf("\n")
        }
    }
}
Footnotes:
diffLength is the number of characters on the end of the strings which differ, I couldn't think of a better way to get this number than what's in the script...
Me setting an output to nil is me saying, "This one will be handled explicitly, so throw it away"
j is a variable for which output I'm setting... But this also gets mirrored to the bottom, so I couldn't think of a concise name to give it thus I left it j.
digit is tracking which digit from the right we are modifying
