What is the best way to get substring info from a string - algorithm

I am writing a function to extract some data. The data contains information that I don't need, so I'm looking for an optimal way to find the part I do need.
What I have is this
e := strings.Index(result, "Country") //Searching for the country info like Country: US
Once I have the index, I look for the information I need, such as the next word. That is what I'm after.
Any idea how to solve this in an optimal way?

You can use FindStringIndex from the regexp package. It is guaranteed to run in time linear in the size of the input. FindStringIndex returns a slice of two elements defining the location of the matched string so you can easily get the next word.
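A minimal sketch of that suggestion, assuming the input looks like "Country: US" as in the comment in the question. The pattern, the function name nextWordAfterCountry and the sample input are made up for illustration; only regexp.MustCompile, FindStringIndex and strings.Fields come from the standard library.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// countryRe matches the literal "Country:" plus any whitespace after it.
// The "Country: US" layout is an assumption taken from the question.
var countryRe = regexp.MustCompile(`Country:\s*`)

// nextWordAfterCountry returns the first whitespace-separated word that
// follows "Country:" and reports whether such a word was found.
func nextWordAfterCountry(s string) (string, bool) {
	// FindStringIndex returns [start, end] of the leftmost match, or nil.
	loc := countryRe.FindStringIndex(s)
	if loc == nil {
		return "", false
	}
	// Everything after the match; the first field is the next word.
	rest := strings.Fields(s[loc[1]:])
	if len(rest) == 0 {
		return "", false
	}
	return rest[0], true
}

func main() {
	word, ok := nextWordAfterCountry("City: Austin Country: US Zip: 73301")
	fmt.Println(word, ok) // US true
}
```

For a fixed literal such as "Country:", strings.Index plus the length of the literal gives the same offset without a regular expression; the regexp version mainly helps when the label or the spacing varies.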

I'm not sure what your result string looks like, but if you can turn it into a format such as "City:San Fran,State:CA,Country:US", it can easily be converted into a map. For instance:
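result := "City:San Fran,State:CA,Country:US"
fields := strings.Split(result, ",")
fieldsMap := map[string]string{}
for _, field := range fields {
	// SplitN with a limit of 2 keeps any further ":" inside the value.
	pair := strings.SplitN(field, ":", 2)
	if len(pair) != 2 {
		continue // skip fields that are not in "key:value" form
	}
	fieldsMap[pair[0]] = pair[1]
}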
Now you can access any value very easily:
country := fieldsMap["Country"]
fmt.Println(country) // prints US

Here is another way to find the index of a substring.
Split the string on that substring; the length of the first element of the result is the index of the substring's first occurrence.
https://play.golang.org/p/x7QWyuC_1wK
result := "City:San Fran,State:CA,Country:US"
// split the string
s := strings.Split(result, "US")
// index of US
index := len(s[0])
fmt.Println(index)
// Output:
// 31

Related

Why do I get 6 entries in an array created from a map with only 3 entities?

I have a seemingly simple issue with my understanding of map types in Go. If I create a simple map such as
var supportedCards = map[string]string{
	"8f14e45fceea167a5a36dedd4bea2543": "Charming",
	"1679091c5a880faf6fb5e6087eb1b2dc": "Citi",
	"e4da3b7fbbce2345d7772b0674a318d5": "Chase",
}
Populate the keys into an array
keys := make([]string, len(supportedCards))
for k := range supportedCards {
	keys = append(keys, k)
}
Then try to join these keys into a comma separated value that I can append to any string
fmt.Println(strings.Join(keys,","))
I expect the result to be
8f14e45fceea167a5a36dedd4bea2543,1679091c5a880faf6fb5e6087eb1b2dc,e4da3b7fbbce2345d7772b0674a318d5
But what I really see is
,,,8f14e45fceea167a5a36dedd4bea2543,1679091c5a880faf6fb5e6087eb1b2dc,e4da3b7fbbce2345d7772b0674a318d5
Why does iterating through the map create 6 entries instead of just 3?
https://play.golang.org/p/Ou67K1Kfvsf
With
keys := make([]string, len(supportedCards))
you create a []string with three empty elements. You can assign to them using their indices.
When you append to it later, new entries are added at the end, producing a slice of length 6 whose first three elements are empty.
In addition to @xarantolus's answer: instead of using indices, you can keep your for-range loop unchanged by creating the slice like this:
keys := make([]string, 0, len(supportedCards))
The third argument is the capacity of the slice. Ref: https://tour.golang.org/moretypes/13
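Both fixes described in the answers above can be sketched side by side. The helper names keysByIndex and keysByAppend are made up for this illustration; the map is the one from the question.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// keysByIndex allocates len(m) elements up front and fills them by index.
func keysByIndex(m map[string]string) []string {
	keys := make([]string, len(m))
	i := 0
	for k := range m {
		keys[i] = k
		i++
	}
	return keys
}

// keysByAppend starts with length 0 and capacity len(m), so append fills
// the reserved space instead of adding after empty elements.
func keysByAppend(m map[string]string) []string {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	return keys
}

func main() {
	supportedCards := map[string]string{
		"8f14e45fceea167a5a36dedd4bea2543": "Charming",
		"1679091c5a880faf6fb5e6087eb1b2dc": "Citi",
		"e4da3b7fbbce2345d7772b0674a318d5": "Chase",
	}
	keys := keysByAppend(supportedCards)
	sort.Strings(keys) // map iteration order is not specified, so sort for stable output
	fmt.Println(len(keys), strings.Join(keys, ","))
	fmt.Println(len(keysByIndex(supportedCards)))
}
```

Either variant yields exactly three keys. Their order can differ between runs because Go does not specify map iteration order, which is why the example sorts before printing.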

Efficient log parsing in golang

What would be an efficient way (in terms of both performance and readability) of parsing lines in a log file and extracting points of interest?
For example:
*** Time: 2/1/2019 13:51:00
17.965 Pump 10 hose FF price level 1 limit 0.0000 authorise pending (Type 00)
17.965 Pump 10 State change LOCKED_PSTATE to CALLING_PSTATE [31]
38.791 Pump 10 delivery complete, Hose 1, price 72.9500, level 1, value 100.0000, volume 1.3700, v-total 8650924.3700, m-total 21885705.8800, T13:51:38
The things I need to extract are the pump number (10 for pump 10), the price level, the limit, the _PSTATE changes, the values from the delivery complete line, etc.
Currently I'm using a regular expression to capture each one and using capture groups. But it feels inefficient and there is quite a bit of duplication.
For example, I have a bunch of these:
reStateChange := regexp.MustCompile(`^(?P<offset>.*) Pump (?P<pump>\d{2}) State change (?P<oldstate>\w+_PSTATE) to (?P<newstate>\w+)_PSTATE`)
Then inside a loop
if match := reStateChange.FindStringSubmatch(text); len(match) > 0 {
	matched = true
	for i, name := range match {
		result[reStateChange.SubexpNames()[i]] = name
	}
} else if match := otherReMatch.FindStringSubmatch(text); len(match) > 0 {
	matched = true
	for i, name := range match {
		// use the names of the expression that actually matched
		result[otherReMatch.SubexpNames()[i]] = name
	}
} else if strings.Contains(text, "*** Time:") {
}
It feels that there could be a much better way to do this. I would trade some performance for readability. The log files are only really 10MB max. Often smaller.
I'm after some suggestions on how to make this better in golang.
If all your log lines are similar to that sample you posted, they seem quite structured so regular expressions might be a bit overkill and hard to generalize.
Another option would be to transform each of those lines into a slice of strings ([]string) by using strings.Fields, or even strings.FieldsFunc so that you can strip both white space and commas.
Then you can design an interface like:
type LogLineProcessor interface {
	CanParse(line []string) bool
	GetResultFrom(line []string) LogLineResult
}
Where LogLineResult is a struct containing the extracted information.
You can then define multiple structs with methods that implement LogLineProcessor (each implementation would look at specific positions on that []string to realize if it is a line it can process or not, like looking for the words "hose", "FF" and "price" in the positions it expects to find them).
The GetResultFrom implementations would also extract each data point from specific positions in the []string (it can rely on that information being there if it already determined it was one of the lines it can process).
You can create a var processors []LogLineProcessor, put all your processors in there and then just iterate over that slice:
line := strings.Fields(text)
for _, processor := range processors {
	if processor.CanParse(line) {
		result := processor.GetResultFrom(line)
		// do whatever needed with the result
	}
}
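To make the idea concrete, here is a hedged sketch with one processor for the "State change" line from the question. The fields of LogLineResult, the stateChange type, and the helpers splitLine and process are all assumptions made for illustration; the answer does not define them. The token positions are taken from the sample line only.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// LogLineResult is a hypothetical result type; its fields are assumptions.
type LogLineResult struct {
	Kind   string
	Pump   string
	Fields map[string]string
}

// LogLineProcessor recognizes one kind of line and extracts data from it.
type LogLineProcessor interface {
	CanParse(line []string) bool
	GetResultFrom(line []string) LogLineResult
}

// stateChange handles lines such as:
//   17.965 Pump 10 State change LOCKED_PSTATE to CALLING_PSTATE [31]
type stateChange struct{}

// CanParse checks for the fixed words at the positions seen in the sample.
func (stateChange) CanParse(line []string) bool {
	return len(line) >= 8 &&
		line[1] == "Pump" && line[3] == "State" &&
		line[4] == "change" && line[6] == "to"
}

// GetResultFrom relies on CanParse having accepted the line already.
func (stateChange) GetResultFrom(line []string) LogLineResult {
	return LogLineResult{
		Kind: "state_change",
		Pump: line[2],
		Fields: map[string]string{
			"offset":   line[0],
			"oldstate": line[5],
			"newstate": line[7],
		},
	}
}

// splitLine tokenizes on white space and commas using strings.FieldsFunc.
func splitLine(text string) []string {
	return strings.FieldsFunc(text, func(r rune) bool {
		return r == ',' || unicode.IsSpace(r)
	})
}

// process returns the result of the first processor that accepts the line.
func process(processors []LogLineProcessor, text string) (LogLineResult, bool) {
	line := splitLine(text)
	for _, p := range processors {
		if p.CanParse(line) {
			return p.GetResultFrom(line), true
		}
	}
	return LogLineResult{}, false
}

func main() {
	processors := []LogLineProcessor{stateChange{}}
	text := "17.965 Pump 10 State change LOCKED_PSTATE to CALLING_PSTATE [31]"
	res, ok := process(processors, text)
	fmt.Println(ok, res.Pump, res.Fields["oldstate"], res.Fields["newstate"])
}
```

Adding support for another line type then means writing one more small struct and appending it to the processors slice, which is where the readability gain over a long if/else chain of regular expressions would come from.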

Is there a built-in means to remove the first item from an array?

Learning Go, what a great language.
Is there a built-in means to remove the first item in an array? Kind of like PHP's array_shift
I have a string, "the brown fox jumps"
I've found strings.Fields() which turns it into an array. I'd like to turn that string into two strings:
"the", "brown fox jumps"
words := strings.Fields(theFoxString)
firstWord := // unshift first word from words
otherWords := // join what's left of words with ' '
Thank you for your help!
If we have any slice a, we can do this:
x, a := a[0], a[1:]
So using your code, we get:
words := strings.Fields(theFoxString)
firstWord, otherWords := words[0], words[1:]
Keep in mind the underlying array hasn't changed, but the slice we are using to look at that array has. For most purposes this is ok (and even advantageous performance wise!), but it is something to be aware of.
Source:
https://github.com/golang/go/wiki/SliceTricks
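Since the question asks for the remainder as a single string, the rest of the slice can be joined back together with strings.Join. A small sketch, where the function name shiftWord is made up for illustration and an empty input is handled explicitly because words[0] would panic on an empty slice:

```go
package main

import (
	"fmt"
	"strings"
)

// shiftWord splits s into its first whitespace-separated word and the
// remaining words joined by single spaces.
func shiftWord(s string) (first, rest string) {
	words := strings.Fields(s)
	if len(words) == 0 {
		// Nothing to shift; indexing words[0] here would panic.
		return "", ""
	}
	return words[0], strings.Join(words[1:], " ")
}

func main() {
	first, rest := shiftWord("the brown fox jumps")
	fmt.Printf("%q %q\n", first, rest) // "the" "brown fox jumps"
}
```

Note that strings.Fields collapses runs of white space, so the joined remainder always uses single spaces regardless of the spacing in the input.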

Get nth return value from function in Go

I started learning Go yesterday and so far I am fascinated. But I ran into a problem.
Let's assume that a function returns multiple values (awesome feature). How can I use only the nth value and ignore the first n-1 values? For example, I just want to check whether a key exists in a hash map without doing anything with the value.
var value, hasElement = myMap[key]
The error I get is
main.go:15: value declared and not used
If I just print value it will all be fine but that is not good.
In Go, the character _ is a placeholder indicating that you don't care about the value. Therefore, you can do
var _, hasElement = myMap[key]
which will ignore the first return value.
This extends to functions with any number of return values:
var a, _, _, d, e, _ = bigFunction()
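A short runnable sketch of both cases. The functions hasKey and bounds are hypothetical and exist only to show the blank identifier with a map lookup and with a function that returns three values.

```go
package main

import "fmt"

// hasKey reports whether key is present, discarding the stored value.
func hasKey(m map[string]int, key string) bool {
	_, ok := m[key]
	return ok
}

// bounds is a hypothetical function with three return values:
// the smallest element, the largest element, and the element count.
func bounds(nums []int) (lo, hi, n int) {
	if len(nums) == 0 {
		return 0, 0, 0
	}
	lo, hi = nums[0], nums[0]
	for _, v := range nums[1:] {
		if v < lo {
			lo = v
		}
		if v > hi {
			hi = v
		}
	}
	return lo, hi, len(nums)
}

func main() {
	ages := map[string]int{"alice": 30}
	fmt.Println(hasKey(ages, "alice"), hasKey(ages, "bob")) // true false

	// Keep only the second of three return values.
	_, hi, _ := bounds([]int{4, 9, 2})
	fmt.Println(hi) // 9
}
```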

Pattern for lookup in Go array

Go has convenient syntax to define array lookup tables:
var myTable = [...]string{
	'a': "aaaa",
	'b': "bbbb",
	'z': "zoro",
}
In some cases (where the keys are in a known and not too large range) this is a more efficient way to build a table than a map. However, with a map it is easy to do a lookup and find out whether a key is missing. To do a lookup by index in this array I have to do:
if index < len(myTable) {
	if val := myTable[index]; val != "" {
		// here I know index exists in array and val is its value
	}
}
Is there simpler / more common pattern or library function to do this?
I don't think there is any special builtin syntax to remove the need for a bounds check here. One option would be to wrap the code in a custom type. For example:
type StringTable []string

func (st StringTable) Get(i int) string {
	if i < 0 || i >= len(st) {
		return ""
	}
	return st[i]
}
You can still use the same initialiser syntax with the custom type:
myTable := StringTable{
	'a': "aaaa",
	'b': "bbbb",
	'z': "zoro",
}
fmt.Printf("%#v\n", myTable.Get('a'))
fmt.Printf("%#v\n", myTable.Get(-5))
fmt.Printf("%#v\n", myTable.Get('~')) // greater than 'z'
You can play around with this example here: http://play.golang.org/p/nhti2dVE8B
"In some cases (where the keys are in a known and not too large range) this is a more efficient way to build a table than a map."
Yes. You want to translate your key into an offset in the array. Then you can do the lookup in constant time.
Let's say you know all your keys will be in the set A-Z. So you create an array of 26 entries. When a key comes in, you subtract the ASCII value of "A" to get the index into your table. That gives you constant-time lookups that avoid hashing, which is typically faster than a map lookup.
Note that you don't actually store the key anywhere, it's implicit. If you want to print out your table, you'd need to generate all keys (A-Z) yourself.
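The A-Z scheme described above could look like the following sketch. The table contents and the names table and lookup are made up for illustration; an empty string is treated as "no entry", which is an assumption that only works if empty values are never stored.

```go
package main

import "fmt"

// table holds one slot per letter 'A'..'Z'. The values are illustrative.
// The index of each entry is the letter's offset from 'A'.
var table = [26]string{
	'A' - 'A': "alpha",
	'B' - 'A': "bravo",
	'Z' - 'A': "zulu",
}

// lookup converts key to an offset from 'A' and reports whether a
// non-empty entry is stored at that offset.
func lookup(key byte) (string, bool) {
	if key < 'A' || key > 'Z' {
		return "", false // outside the range the table covers
	}
	val := table[key-'A']
	return val, val != ""
}

func main() {
	fmt.Println(lookup('B')) // bravo true
	fmt.Println(lookup('C')) //  false (slot exists but is empty)
	fmt.Println(lookup('~')) //  false (out of range)

	// The keys are implicit, so printing the table means generating them.
	for c := byte('A'); c <= 'Z'; c++ {
		if v, ok := lookup(c); ok {
			fmt.Printf("%c=%s\n", c, v)
		}
	}
}
```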

Resources