How can I identify a matching pattern for a given number using Go?

I'm trying to identify pattern matches for a given telephone number range to use in a Cisco Communications Manager platform.
Essentially, an 'X' matches the numbers 0-9 in a telephone number, and you can specify a range of digits using the [x-y] notation.
Given a telephone number range of 02072221000-02072221149 (150 numbers), the output should be two patterns: 020722210XX and 020722211[0-4]X
Obviously I'd like it to work on any range provided. I just can't seem to get my head around how to generate those patterns given the number ranges.
Any thoughts would be greatly appreciated.
Many thanks.

I believe I found a decent algorithm which should handle this for you. I apologize ahead of time if any of the explanation isn't detailed enough, but a lot of this came from intuition, which can be hard to explain.
I started with more simplified cases, figuring out a method for how to get the fewest number of patterns from a comparison. For my examples I'll be comparing 211234 to 245245.
After a bit of thinking I worked out that, for each digit, you take the range from the smaller number's digit up to 9 and handle the lowest value of that digit as a special case. In more detail: in the number 211234 the ideal is to represent the last digit as an X, but we can only do that where the digit may be anything in [0-9]. The only case in this example where we can't use [0-9] is when the tens digit is 3, because the last digit then has a lower limit of 4. This logic propagates up the rest of the number as we head toward the most significant digit. For the tens digit, the lower bound becomes 4, because the case where the tens digit is 3 is already handled specially. So the tens range ends up as [4-9], because the next digit over does not restrict it.
In fact we won't be restricted until the most significant differing digit, which is bounded by the range between the numbers we're comparing. After working a few problems out by hand I noticed a pyramid of Xs in the cases where the numbers' digits were significantly apart:
compare: 211234
to: 245245
21123[4-9]
2112[4-9]X
211[3-9]XX
21[2-9]XXX
2[2-3]XXXX
24[0-4]XXX
245[0-1]XX
2452[0-3]X
24524[0-5]
This was my first hint as to how to handle it. Starting from the least significant moving up, taking advantage of the symmetry, but handling the case where we hit the "top of the pyramid". This example is easy though, there are many corner cases which will cause issues. For the sake of brevity I'm not going to go into detail for each but I'll give a short explanation for each:
What do you do when the 2 compared digits have one number between them, such as between 4 and 6?
In this case simply use the single digit in place of a range.
What do you do when the 2 compared digits have no number between them, such as between 4 and 5?
In this case throw away the row in which you'd handle the numbers between the digits as all cases will be handled explicitly.
What do you do when the minimum number in the range is 8?
In this case when we add 1 to the number to get a lower bound for the range we get a 9, which means we can simply substitute in a 9 rather than a range of [9-9]
What do you do when the minimum number in the range is 9?
In this case we simply don't bother handling that number as when handling the next digit up it should be covered by its use of X
I'm sure I'm missing some corner cases which I handle in the code which I simply didn't think to put in this list. I'm willing to clarify any part of the code if you just leave a comment asking.
Below is my stab at it in Go. It could probably be a bit more DRY but this is what I came up with after fiddling for a bit. I'm also pretty new to Go so please notify me of any spirit fouls in the comments and I'll correct them.
I don't guarantee this will handle every case, but it handled every case I threw at it. It's up to you to turn it into a script which takes in 2 strings ;)
Edit: I just realized via the example in the question (which for some reason I never ran) that this doesn't always condense the provided range into the smallest number of outputs, but it should always give patterns which cover every case. Despite this drawback I think it's a good step in the right direction for you to build on. I'll update the answer if I find the time to get it to condense cases where the previous range is 1-9 and the special case is 0. The best approach might end up being to condense these cases "manually" after the initial generation.
package main
import (
"strconv"
"fmt"
)
func getStringFromMinAndMax(min int, max int) (string, bool){
minstr := strconv.Itoa(min)
maxstr := strconv.Itoa(max)
if max == min {
return minstr, false
}
if max < min{
return minstr, false
}
return "["+minstr+"-"+maxstr+"]", true
}
func main(){
str1 := "211234"
str2 := "245245"
diffLength := 0
for i := 0; i < len(str1); i++{
diffLength = i+1
number1, _ := strconv.Atoi(str1[:len(str1)-i-1])
number2, _ := strconv.Atoi(str2[:len(str2)-i-1])
if number1 == number2 {
break
}
}
elems := (diffLength * 2)-1
output := make([]*[]string, elems+1)
for i := 0; i < elems; i++ {
newSlice := make([]string, diffLength)
output[i] = &newSlice
}
for digit := 0; digit < diffLength; digit++ {
for j := 0; j < diffLength; j++ {
if j == digit {
if output[j] != nil {
min, _ := strconv.Atoi(string(str1[len(str1)-(digit+1)]))
max := 9
if digit == diffLength-1 {
max, _ = strconv.Atoi(string(str2[len(str1)-(digit+1)]))
max = max - 1
}
if digit != 0{
min = min+1
}
if min < 10 {
maxchar := strconv.Itoa(max)[0]
minchar := strconv.Itoa(min)[0]
newVal, safe := getStringFromMinAndMax(min, max)
if digit == diffLength-1 && !safe && (str1[len(str1)-(digit+1)] == maxchar || str2[len(str2)-(digit+1)] == minchar) {
output[j] = nil
} else {
(*output[j])[diffLength-digit-1] = newVal
}
} else {
output[j] = nil
}
}
if j != diffLength-1 && output[elems-1-j] != nil {
min := 0
max, _ := strconv.Atoi(string(str2[len(str1)-(digit+1)]))
if digit != 0{
max = max-1
}
if max >= 0{
newVal, _ := getStringFromMinAndMax(min, max)
(*output[elems-1-j])[diffLength-digit-1] = newVal
} else {
output[elems-1-j] = nil
}
}
} else {
if j > digit {
if output[j] != nil {
(*output[j])[diffLength-digit-1] = "X"
}
if j != diffLength-1 && output[elems-1-j] != nil {
(*output[elems-1-j])[diffLength-digit-1] = "X"
}
} else {
if output[j] != nil {
(*output[j])[diffLength-digit-1] = string(str1[len(str1)-digit-1])
}
if j != diffLength-1 && output[elems-1-j] != nil {
(*output[elems-1-j])[diffLength-digit-1] = string(str2[len(str2)-digit-1])
}
}
}
}
}
for _, list := range output {
if list != nil{
if len(str1) != diffLength{
fmt.Print(str1[:len(str1)-diffLength])
}
for _, element := range *list {
fmt.Print(element)
}
fmt.Println()
}
}
}
Footnotes:
diffLength is the number of characters on the end of the strings which differ, I couldn't think of a better way to get this number than what's in the script...
Me setting an output to nil is me saying, "This one will be handled explicitly, so throw it away"
j is a variable for which output I'm setting... But this also gets mirrored to the bottom, so I couldn't think of a concise name to give it thus I left it j.
digit is tracking which digit from the right we are modifying
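The edit above notes that the posted code does not always produce the fewest patterns. As a rough sketch of one way to address that (not the answer author's method), the range can be split recursively, digit by digit: a leading digit is folded into the middle bracket whenever the remaining digits of the lower bound are all 0 or the remaining digits of the upper bound are all 9. The names rangePatterns, digitClass and allOf are made up for this illustration, and the sketch assumes both bounds are digit strings of equal length with the lower bound not above the upper bound. It has not been proven to give the minimum in every case.

```go
package main

import (
	"fmt"
	"strings"
)

// digitClass renders a single-digit range: "5", "X" (for 0-9) or "[a-b]".
func digitClass(lo, hi byte) string {
	switch {
	case lo == hi:
		return string(lo)
	case lo == '0' && hi == '9':
		return "X"
	default:
		return fmt.Sprintf("[%c-%c]", lo, hi)
	}
}

// allOf reports whether every byte of s equals c (true for the empty string).
func allOf(s string, c byte) bool {
	return strings.Count(s, string(c)) == len(s)
}

// rangePatterns returns patterns covering every number from lo to hi inclusive.
// Assumption: lo and hi are digit strings of equal length and lo <= hi.
func rangePatterns(lo, hi string) []string {
	if len(lo) == 0 {
		return []string{""}
	}
	var out []string
	if lo[0] == hi[0] {
		// Shared leading digit: keep it and recurse on the remainder.
		for _, p := range rangePatterns(lo[1:], hi[1:]) {
			out = append(out, string(lo[0])+p)
		}
		return out
	}
	rest := len(lo) - 1
	midLo, midHi := lo[0], hi[0]
	if !allOf(lo[1:], '0') {
		// The lowest leading digit is only partly covered: handle it separately.
		for _, p := range rangePatterns(lo[1:], strings.Repeat("9", rest)) {
			out = append(out, string(lo[0])+p)
		}
		midLo++
	}
	var tail []string
	if !allOf(hi[1:], '9') {
		// The highest leading digit is only partly covered: handle it separately.
		for _, p := range rangePatterns(strings.Repeat("0", rest), hi[1:]) {
			tail = append(tail, string(hi[0])+p)
		}
		midHi--
	}
	if midLo <= midHi {
		// Fully covered leading digits collapse into one bracket followed by Xs.
		out = append(out, digitClass(midLo, midHi)+strings.Repeat("X", rest))
	}
	return append(out, tail...)
}

func main() {
	for _, p := range rangePatterns("02072221000", "02072221149") {
		fmt.Println(p)
	}
}
```

For the range in the question this yields 020722210XX and 020722211[0-4]X, and for 211234 to 245245 it reproduces the nine-row pyramid listed earlier.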

Related

how to manipulate very long string to avoid out of memory with golang

For personal skills improvement, I'm trying to solve this HackerRank challenge:
There is a string, s, of lowercase English letters that is repeated infinitely many times. Given an integer, n, find and print the number of letter a's in the first n letters of the infinite string.
1 <= len(s) <= 100 and 1 <= n <= 10^12
Very naively I thought this code would be fine:
fs := strings.Repeat(s, int(n)) // full string
ss := fs[:n] // sub string
fmt.Println(strings.Count(ss, "a"))
Obviously this blew up the memory and I got an "out of memory" error.
I have never faced this kind of issue, and I'm clueless about how to handle it.
How can I manipulate a very long string without running out of memory?
I hope this helps, you don't have to actually count by running through the string.
That is the naive approach. You need to use some basic arithmetic to get the answer without running out of memory, I hope the comments help.
// repeatedString counts the a's in the first n letters of s repeated forever.
// The function name and signature are assumed; the original snippet showed only the body.
func repeatedString(s string, n int64) int64 {
	// First figure out how many a's are present in s.
	aCount := int64(strings.Count(s, "a"))
	// How many times s repeats in its entirety within the first n letters.
	repeats := n / int64(len(s))
	// Letters left over after the full repeats.
	// If s has length 5 and n = 22, the first 2 characters of s appear one extra time.
	remainder := n % int64(len(s))
	answer := aCount * repeats
	if remainder > 0 {
		answer += int64(strings.Count(s[:remainder], "a"))
	}
	return answer
}
There might be other methods but this is what came to my mind.
As you found out, if you actually generate the string you will end up having that huge memory block in RAM.
One common way to represent a "big sequence of incoming bytes" is to implement it as an io.Reader (which you can view as a stream of bytes), and have your code run a r.Read(buff) loop.
Given the specifics of the exercise you mention (a fixed string repeated n times), the number of occurrences of a specific letter can also be computed straight from the number of occurrences of that letter in s, plus something more (I'll let you figure out what multiplications and counting should be done).
How to implement a Reader that repeats the string without allocating 10^12 times the string ?
Note that, when implementing the .Read() method, the caller has already allocated the buffer. You don't need to repeat your string in memory, you just need to fill the buffer with the correct values, for example by copying your data byte by byte into the buffer.
Here is one way to do it :
type RepeatReader struct {
str string
count int
}
func (r *RepeatReader) Read(p []byte) (int, error) {
if r.count == 0 {
return 0, io.EOF
}
// at each iteration, pos will hold the number of bytes copied so far
var pos = 0
for r.count > 0 && pos < len(p) {
// to copy slices over, you can use the built-in 'copy' method
// at each iteration, you need to write bytes *after* the ones you have already copied,
// hence the "p[pos:]"
n := copy(p[pos:], r.str)
// update the amount of copied bytes
pos += n
// bad computation for this first example :
// I decrement one complete count, even if str was only partially copied
r.count--
}
return pos, nil
}
https://go.dev/play/p/QyFQ-3NzUDV
To have a complete, correct implementation, you also need to keep track of the offset you need to start from next time .Read() is called :
type RepeatReader struct {
str string
count int
offset int
}
func (r *RepeatReader) Read(p []byte) (int, error) {
if r.count == 0 {
return 0, io.EOF
}
var pos = 0
for r.count > 0 && pos < len(p) {
// when copying over to p, you should start at r.offset :
n := copy(p[pos:], r.str[r.offset:])
pos += n
// update r.offset :
r.offset += n
// if one full copy of str has been issued, decrement 'count' and reset 'offset' to 0
if r.offset == len(r.str) {
r.count--
r.offset = 0
}
}
return pos, nil
}
https://go.dev/play/p/YapRuioQcOz
You can now count the a's while iterating through this Reader.
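As a minimal sketch of that last step, the loop below reads from the offset-tracking RepeatReader in fixed-size chunks and counts a target byte, so memory use stays at the size of the buffer. The helper countByte and its 4096-byte buffer are assumptions made for this illustration, and io.LimitReader is used to stop after the first n letters. For n near 10^12 this still has to stream every byte, so the arithmetic approach shown earlier remains far faster.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// RepeatReader mirrors the offset-tracking reader shown above.
type RepeatReader struct {
	str    string
	count  int
	offset int
}

func (r *RepeatReader) Read(p []byte) (int, error) {
	if r.count == 0 {
		return 0, io.EOF
	}
	pos := 0
	for r.count > 0 && pos < len(p) {
		n := copy(p[pos:], r.str[r.offset:])
		pos += n
		r.offset += n
		if r.offset == len(r.str) {
			r.count--
			r.offset = 0
		}
	}
	return pos, nil
}

// countByte is a hypothetical helper: it reads at most limit bytes from r
// and counts how many of them equal target.
func countByte(r io.Reader, limit int64, target byte) (int64, error) {
	buf := make([]byte, 4096)
	lr := io.LimitReader(r, limit)
	var total int64
	for {
		n, err := lr.Read(buf)
		// Count within the bytes actually read before looking at the error.
		total += int64(bytes.Count(buf[:n], []byte{target}))
		if err == io.EOF {
			return total, nil
		}
		if err != nil {
			return total, err
		}
	}
}

func main() {
	// "aba" repeated 4 times is "abaabaabaaba"; only the first 10 letters are counted.
	r := &RepeatReader{str: "aba", count: 4}
	n, err := countByte(r, 10, 'a')
	fmt.Println(n, err)
}
```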

How to convert any negative value to zero with bitwise operators?

I'm writing the PopBack() operation for a LinkedList in Go, the code looks like this:
// PopBack will remove an item from the end of the linked list
func (ll *LinkedList) PopBack() {
lastNode := &ll.node
for *lastNode != nil && (*lastNode).next != nil {
lastNode = &(*lastNode).next
}
*lastNode = nil
if ll.Size() != 0 {
ll.size -= 1
}
}
I don't like the last if clause; if the size is zero we don't want to decrement to a negative value. I was wondering if there is a bitwise operation such that, whatever the value is after the decrement, it is converted to zero only if it's negative?
Negative values have the sign bit set, so you can do it like this:
ll.size += (-ll.size >> 31)
This assumes ll.size is an int32, that ll.Size() returns ll.size, and that size is never negative. When the size is positive, -ll.size is negative and the arithmetic right shift sign-extends it to -1; when the size is zero, the result is 0.
If ll.size is int64, change the shift count to 63. If ll.size is uint64, you can simply convert to int64 as long as the size is always smaller than 2^63. If the size can be that large (although that is almost impossible, even in the far future), things get much trickier:
mask := uint64(-int64(ll.size >> 63)) // all ones if ll.size >= (1 << 63)
ll.size = ((ll.size - 1) & mask) | ((ll.size + uint64(-int64(ll.size) >> 63)) & ^mask)
It's basically a bitwise mux, a construct often used in bit hacks, and it is needed here because Go cannot convert a bool to an int without an if.
Neither version is very readable at first glance, so the if block is usually better.
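To make the int32 variant concrete, here is a small sketch that isolates the expression in a function and prints a few cases. The function name decClamp is invented for this illustration, and the sketch keeps the answer's assumption that the size is never negative.

```go
package main

import "fmt"

// decClamp decrements size by one unless it is already zero, without a branch.
// Assumption: size is never negative.
func decClamp(size int32) int32 {
	// -size is negative for any positive size, so the arithmetic shift yields -1.
	// For size == 0 the shift yields 0 and the value is left unchanged.
	return size + (-size >> 31)
}

func main() {
	for _, s := range []int32{0, 1, 2, 1000} {
		fmt.Println(s, "->", decClamp(s))
	}
}
```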
Trade a nil check in each iteration of the loop for a single nil check before the loop. With this change, the loop runs faster and the operator for updating size is subtraction.
func (ll *LinkedList) PopBack() {
if ll.node == nil {
return
}
lastNode := &ll.node
for (*lastNode).next != nil {
lastNode = &(*lastNode).next
}
*lastNode = nil
ll.size -= 1
}

Convert string of numbers into 'binary representation'

I recently did the "Winning Lottery Ticket" coding challenge on HackerRank.
https://www.hackerrank.com/challenges/winning-lottery-ticket/
The idea is to count the pairs of lines which together contain all digits from 0-9; in the example below there are 5 such pairs in total.
56789
129300455
5559948277
012334556
123456879
My idea is to change the representation into something quicker for checking whether all digits are contained.
Example representation:
1278 --> 0110000110
Example with using the first two lines from above:
56789129300455 --> 1111111111
When checking whether every digit is contained in the concatenation of 2 lines, I can abort as soon as I encounter a zero, because that is not going to be a pair with all of 0-9.
This logic works, but it is too slow when there is a huge number of lines to compare.
// Go code
func winningLotteryTicket(tickets []string) int {
counter := 0
for i := 0; i < len(tickets); i++ {
for j := i + 1; j < len(tickets); j++ {
if err := bitMask(fmt.Sprintf("%v%v", tickets[i], tickets[j])); err == nil {
counter++
}
}
}
return counter
}
func bitMask(s string) error {
for i := 0; i <= 9; i++ {
if !strings.Contains(s, strconv.Itoa(i)) {
return errors.New("No Pair")
}
}
return nil
}
I'm not sure if this representation is called a bitmask; if not, please correct me and I will adjust this post.
From my point of view there is no way to improve performance on the concatenation of the strings, because I will have to check each combination.
As for checking whether a digit is contained in the string in the function "bitMask", I'm not sure.
Do you have an idea how this could perform better?
Bit masks are integers, not strings of ones and zeros. It's called a bitmask because we're not interested in the numerical value of these integers but only in the bit pattern. We can use bitwise operations on integers and those are really fast because they are implemented in hardware, directly in the CPU.
Here is a function that turns a string into an actual bitmask, with each one-bit signaling that a particular digit is present in the string:
func mask(s string) uint16 {
// We need ten bits, one for each possible decimal digit in s, so int16 and
// uint16 are the smallest possible integer types that fit. For bitmasks it
// is typical to select an unsigned type because the sign bit doesn't have
// any meaning. As I said earlier, mask's numerical value is irrelevant.
var mask uint16
for _, c := range s {
switch c {
case '0':
mask |= 0b0000000001
case '1':
mask |= 0b0000000010
case '2':
mask |= 0b0000000100
case '3':
mask |= 0b0000001000
case '4':
mask |= 0b0000010000
case '5':
mask |= 0b0000100000
case '6':
mask |= 0b0001000000
case '7':
mask |= 0b0010000000
case '8':
mask |= 0b0100000000
case '9':
mask |= 0b1000000000
}
}
return mask
}
This is rather verbose, but it should be pretty obvious what happens.
Note that the binary number literals can be replaced with bit shifts:
0b0000000001 is the same as 1<<0 (1 shifted zero times to the left)
0b0000000010 is the same as 1<<1 (1 shifted one time to the left)
0b0000000100 is the same as 1<<2 (1 shifted two times to the left), and so on
Using this, and taking advantage of the fact that the bytes '0' through '9' are themselves just integers (48 through 57 in decimal, given by their place in the ASCII table), we can shorten this function like so:
func mask(s string) uint16 {
var mask uint16
for _, c := range s {
if '0' <= c && c <= '9' {
mask |= 1 << (c - '0')
}
}
return mask
}
To check two lines, then, all we have to do is OR the masks for the lines and compare to 0b1111111111 (i.e. check if all ten bits are set):
package main
import "fmt"
func main() {
a := "56789"
b := "129300455"
mA := mask(a)
mB := mask(b)
fmt.Printf("mask(%11q) = %010b\n", a, mA)
fmt.Printf("mask(%11q) = %010b\n", b, mB)
fmt.Printf("combined = %010b\n", mA|mB)
fmt.Printf("all digits present: %v\n", mA|mB == 0b1111111111)
}
func mask(s string) uint16 {
var mask uint16
for _, c := range s {
if '0' <= c && c <= '9' {
mask |= 1 << (c - '0')
}
}
return mask
}
mask( "56789") = 1111100000
mask("129300455") = 1000111111
combined = 1111111111
all digits present: true
Try it on the playground: https://play.golang.org/p/mr1KqnC9phB
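The question also asks about the cost of comparing every pair of lines. One possible way to avoid that, sketched here under the assumption that only the pair count is needed, is to note that there are just 1024 distinct masks: count how many tickets share each mask, then compare masks instead of tickets. The helper countWinningPairs and the constant allDigits are names invented for this sketch.

```go
package main

import "fmt"

// allDigits has the ten low bits set, one per decimal digit (0b1111111111).
const allDigits = 1<<10 - 1

// mask is the shortened function from the answer above.
func mask(s string) uint16 {
	var m uint16
	for _, c := range s {
		if '0' <= c && c <= '9' {
			m |= 1 << (c - '0')
		}
	}
	return m
}

// countWinningPairs is a hypothetical helper that counts ticket pairs whose
// combined digits cover 0-9, without comparing every pair of tickets.
func countWinningPairs(tickets []string) int64 {
	// freq[m] is the number of tickets whose mask equals m.
	var freq [allDigits + 1]int64
	for _, t := range tickets {
		freq[mask(t)]++
	}
	var pairs int64
	for a := 0; a <= allDigits; a++ {
		if freq[a] == 0 {
			continue
		}
		for b := a + 1; b <= allDigits; b++ {
			if a|b == allDigits {
				pairs += freq[a] * freq[b]
			}
		}
	}
	// Tickets that hold all ten digits on their own also pair with each other.
	pairs += freq[allDigits] * (freq[allDigits] - 1) / 2
	return pairs
}

func main() {
	tickets := []string{"56789", "129300455", "5559948277", "012334556", "123456879"}
	fmt.Println(countWinningPairs(tickets))
}
```

The work after building the frequency table is bounded by roughly 1024 × 1024 mask comparisons regardless of how many tickets there are.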

How to find the distance between two runes

I'm trying to solve a couple of example programming problems to familiarize myself with the language.
I am iterating over a string as follows:
func main() {
fullFile := "abcdDefF"
for i := 1; i < len(fullFile); i++ {
println(fullFile[i-1], fullFile[i], fullFile[i-1]-fullFile[i])
}
}
In the loop I want to get the difference between the current rune and the previous rune (trying to identify lower-case/upper-case pairs by finding any pairs where the difference is 32).
Strangely, the subtraction doesn't work properly (in fact seems to yield addition in cases where I would expect a negative number) although I would expect it to since runes are represented by int32.
Figured it out: indexing the string returns a byte, and since byte is unsigned, the subtraction wraps around instead of going negative.
After explicitly converting to int, everything works as expected.
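import "os"

func main() {
	// os.ReadFile replaces the deprecated ioutil.ReadFile (Go 1.16 and later).
	fullFile, err := os.ReadFile("input/input.txt")
	if err != nil {
		panic(err)
	}
	for i := 1; i < len(fullFile); i++ {
		// Convert each byte to int before subtracting so the result can be negative.
		previous := int(fullFile[i-1])
		current := int(fullFile[i])
		println(current, previous, current-previous)
	}
}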
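The wrap-around is easy to see on a two-letter string. The sketch below contrasts the raw byte subtraction with the int conversion; the helper name caseDistance is made up for this illustration.

```go
package main

import "fmt"

// caseDistance returns the signed distance from prev to cur.
// Converting to int first keeps the result from wrapping around.
func caseDistance(prev, cur byte) int {
	return int(cur) - int(prev)
}

func main() {
	s := "aA"
	// Indexing a string yields bytes (uint8): 65 - 97 wraps around to 224.
	fmt.Println(s[1] - s[0])
	// With the conversion the difference is the expected -32.
	fmt.Println(caseDistance(s[0], s[1]))
}
```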

How to generate a fixed length random number in Go?

What is the fastest and simplest way to generate fixed length random numbers in Go?
Say to generate 8-digits long numbers, the problem with rand.Intn(100000000) is that the result might be far less than 8-digits, and padding it with leading zeros doesn't look like a good answer to me.
I.e., I care about the quality of the randomness mostly in terms of the length of the result. So I'm thinking, for this specific problem, would the following be the fastest and simplest way to do it?
99999999 - rand.Int63n(90000000)
I.e., I guess Int63n might be better for my case than Intn. Is that true, or is it only wishful thinking? Regarding the randomness of the full 8 digits, would the two be the same, or is one really better than the other?
Finally, any better way than above?
UPDATE:
Please do not provide low + rand(hi-low) as the answer, as everyone knows that. It is equivalent of what I'm doing now, and it doesn't answer my real question, "Regarding randomness of the full 8-digits, would the two be the same, or there is really one better than the other? "
If nobody can answer that, I'll plot a 2-D scatter plot between the two and find out myself...
Thanks
This is a general-purpose function for generating numbers within a half-open range [low, hi):
func rangeIn(low, hi int) int {
return low + rand.Intn(hi-low)
}
See it on the Playground
In your specific case, generating 8-digit numbers, call rangeIn(10000000, 100000000): the upper bound is exclusive, so the results fall in [10000000, 99999999].
It depends on the value range you want to use.
If you allow values in [0, 99999999] and zero-pad when there are fewer than 8 characters, use fmt, like fmt.Sprintf("%08d", rand.Intn(100000000)).
If you don't want padding, so that values fall in [10000000, 99999999], add a base, like ranNumber := 10000000 + rand.Intn(90000000)
See it on Playground
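For the general n-digit case, the base-plus-offset idea can be wrapped in a helper. This is only a sketch: the name fixedLength and the limit of 18 digits are assumptions chosen so the bounds fit in an int64. On the Intn versus Int63n question, the math/rand documentation describes both as returning a pseudo-random value in the half-open interval [0, n), so for an 8-digit range either should serve equally well.

```go
package main

import (
	"fmt"
	"math/rand"
)

// fixedLength returns a pseudo-random integer with exactly `digits` decimal digits.
// Assumption: 1 <= digits <= 18, so that 10^digits still fits in an int64.
func fixedLength(digits int) int64 {
	low := int64(1)
	for i := 1; i < digits; i++ {
		low *= 10
	}
	// There are 9*low numbers with this many digits: [low, 10*low).
	return low + rand.Int63n(9*low)
}

func main() {
	for i := 0; i < 3; i++ {
		fmt.Println(fixedLength(8))
	}
}
```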
This version uses the crypto/rand package to generate the number.
// generateRandomNumber returns a uniformly chosen number with exactly
// numberOfDigits decimal digits. It expects 1 <= numberOfDigits <= 18.
func generateRandomNumber(numberOfDigits int) (int, error) {
	// Smallest number with that many digits, e.g. 1000 for n = 4.
	lowLimit := int64(math.Pow10(numberOfDigits - 1))
	// How many numbers have that many digits, e.g. 9000 for n = 4.
	span := 9 * lowLimit
	// rand.Int returns a uniform value in [0, span).
	offset, err := rand.Int(rand.Reader, big.NewInt(span))
	if err != nil {
		return 0, err
	}
	// Shifting by lowLimit keeps every n-digit number equally likely,
	// e.g. [1000, 9999] for n = 4.
	return int(lowLimit + offset.Int64()), nil
}
I recently needed to do something like this, but with a certain byte length (rather than number of digits) and with numbers larger than max int64 (so using math/big.Int). Here was my general solution:
See on the Playground (with added code comments)
func generateRandomBigInt(numBytes int) (*big.Int, error) {
	value := make([]byte, numBytes)
	if _, err := rand.Read(value); err != nil {
		return nil, err
	}
	// Re-roll the leading byte until it is non-zero, so the result
	// really occupies numBytes bytes.
	for value[0] == 0 {
		if _, err := rand.Read(value[:1]); err != nil {
			return nil, err
		}
	}
	return new(big.Int).SetBytes(value), nil
}

Resources