How to strings.Split on newline? - go

I'm trying to do the rather simple task of splitting a string by newlines.
This does not work:
temp := strings.Split(result,`\n`)
I also tried ' instead of ` but no luck.
Any ideas?

You have to use "\n".
Splitting on `\n` searches for an actual \ followed by n in the text, not the newline byte.

For those of us who sometimes work on Windows, it helps to
normalize the line endings with a replace before the split:
strings.Split(strings.ReplaceAll(windows, "\r\n", "\n"), "\n")

It does not work because you're using backticks:
Raw string literals are character sequences between back quotes ``. Within the quotes, any character is legal except back quote. The value of a raw string literal is the string composed of the uninterpreted (implicitly UTF-8-encoded) characters between the quotes; in particular, backslashes have no special meaning and the string may contain newlines.
Reference: http://golang.org/ref/spec#String_literals
So, when you're doing
strings.Split(result,`\n`)
you're actually splitting on the two consecutive characters "\" and "n", not on the newline character "\n". To do what you want, simply use "\n" instead of backticks.
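A minimal sketch of the difference, assuming a made-up three-line input. The helper names splitOnNewline and splitOnRawBackslashN are hypothetical and exist only for this illustration:

```go
package main

import (
	"fmt"
	"strings"
)

// splitOnNewline splits s on the one-byte newline character.
func splitOnNewline(s string) []string {
	return strings.Split(s, "\n")
}

// splitOnRawBackslashN splits s on the two-character sequence backslash + n.
func splitOnRawBackslashN(s string) []string {
	return strings.Split(s, `\n`)
}

func main() {
	text := "first\nsecond\nthird" // hypothetical sample input

	// The interpreted literal is 1 byte long, the raw literal 2 bytes.
	fmt.Println(len("\n"), len(`\n`))

	// Three parts with the real newline, one part with the raw separator.
	fmt.Println(len(splitOnNewline(text)))
	fmt.Println(len(splitOnRawBackslashN(text)))
}
```

The raw separator only finds a match when the input itself contains a backslash followed by n, for example text read from a file that stores escape sequences verbatim.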

Your code doesn't work because you're using backticks instead of double quotes. However, you should be using a bufio.Scanner if you want to support Windows.
import (
	"bufio"
	"strings"
)
// SplitLines returns the lines of s; the scanner strips both "\n" and "\r\n" endings.
func SplitLines(s string) []string {
	var lines []string
	sc := bufio.NewScanner(strings.NewReader(s))
	for sc.Scan() {
		lines = append(lines, sc.Text())
	}
	return lines
}
Alternatively, you can use strings.FieldsFunc (this approach skips blank lines)
strings.FieldsFunc(s, func(c rune) bool { return c == '\n' || c == '\r' })
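To make the blank-line difference concrete, here is a small sketch that runs both approaches on the same input. The function names scanLines and fieldLines and the sample input are assumptions made for this illustration:

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// scanLines collects lines with bufio.Scanner. The default split function
// strips a trailing "\r", so "\r\n" and "\n" endings both work.
func scanLines(s string) []string {
	var lines []string
	sc := bufio.NewScanner(strings.NewReader(s))
	for sc.Scan() {
		lines = append(lines, sc.Text())
	}
	return lines
}

// fieldLines splits on any run of '\r' or '\n', so empty lines disappear.
func fieldLines(s string) []string {
	return strings.FieldsFunc(s, func(c rune) bool { return c == '\n' || c == '\r' })
}

func main() {
	input := "a\r\n\r\nb\n" // hypothetical input with one blank line

	fmt.Printf("%q\n", scanLines(input))  // ["a" "" "b"]
	fmt.Printf("%q\n", fieldLines(input)) // ["a" "b"]
}
```

One caveat worth keeping in mind: with its default buffer, bufio.Scanner stops at lines longer than bufio.MaxScanTokenSize (64 KiB), so very long lines would need a larger buffer set via the scanner's Buffer method.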

import "regexp"
var lines []string = regexp.MustCompile("\r?\n").Split(inputString, -1)
MustCompile() creates a regular expression that splits on both \r\n and \n.
Split() performs the split; the second argument sets the maximum number of parts, with -1 meaning unlimited.
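A runnable sketch of the regexp approach, assuming a mixed-ending sample string. The names lineBreak and splitLines are hypothetical; the pattern is written as a raw string here so that the regexp engine, rather than the Go compiler, interprets the backslashes:

```go
package main

import (
	"fmt"
	"regexp"
)

// lineBreak matches an optional carriage return followed by a line feed.
var lineBreak = regexp.MustCompile(`\r?\n`)

// splitLines splits s on every "\r\n" or "\n" with no limit on the parts.
func splitLines(s string) []string {
	return lineBreak.Split(s, -1)
}

func main() {
	input := "one\r\ntwo\nthree" // hypothetical input with mixed endings

	fmt.Printf("%q\n", splitLines(input))         // ["one" "two" "three"]
	fmt.Printf("%q\n", lineBreak.Split(input, 2)) // ["one" "two\nthree"]
}
```

Compiling the pattern once at package level avoids recompiling it on every call.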

Single quotes (') don't work because '\n' is a rune literal, not a string.
temp := strings.Split(result,'\n')
go compiler: cannot use '\u000a' (type rune) as type string in argument to strings.Split
definition: Split(s, sep string) []string
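If the separator is only available as a rune, converting it to a string first satisfies the signature. A short sketch, where splitOnRune is a hypothetical helper name:

```go
package main

import (
	"fmt"
	"strings"
)

// splitOnRune converts the rune separator to a string before splitting,
// because strings.Split only accepts string arguments.
func splitOnRune(s string, sep rune) []string {
	return strings.Split(s, string(sep))
}

func main() {
	fmt.Printf("%q\n", splitOnRune("a\nb\nc", '\n')) // ["a" "b" "c"]
}
```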

Related

How to escape a string with single quotes

I am trying to unquote a string that uses single quotes in Go (the syntax is the same as Go string literal syntax, but with single quotes instead of double quotes):
'\'"Hello,\nworld!\r\n\u1F60ANice to meet you!\nFirst Name\tJohn\nLast Name\tDoe\n'
should become
'"Hello,
world!
😊Nice to meet you!
First Name John
Last Name Doe
How do I accomplish this?
strconv.Unquote doesn't work on \n newlines (https://github.com/golang/go/issues/15893 and https://golang.org/pkg/strconv/#Unquote), and chaining strings.ReplaceAll calls would be a pain, since it would have to support all Unicode code points and other backslash codes like \n, \r, and \t.
I may be asking for too much, but it would be nice if it automatically validates the Unicode like how strconv.Unquote might be able to do/is doing (it knows that x Unicode code points may become one character), since I can do the same with unicode/utf8.ValidString.
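For context, here is a small sketch of how strconv.Unquote treats a few inputs, based on my reading of its documented behavior. The wrapper tryUnquote is a hypothetical helper for this illustration:

```go
package main

import (
	"fmt"
	"strconv"
)

// tryUnquote reports whether strconv.Unquote accepts the quoted text.
func tryUnquote(quoted string) (string, bool) {
	s, err := strconv.Unquote(quoted)
	return s, err == nil
}

func main() {
	// An escaped newline inside double quotes is accepted.
	fmt.Println(tryUnquote(`"a\nb"`))

	// A real newline byte between the quotes is rejected.
	fmt.Println(tryUnquote("\"a\nb\""))

	// Single quotes are read as a rune literal, so longer text is rejected.
	fmt.Println(tryUnquote(`'ab'`))
}
```

This is why a single-quoted, multi-character string cannot be passed to strconv.Unquote directly.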
@CeriseLimón came up with this answer, and I wrapped it in a function with some extra handling for \n sequences. First, it swaps ' and " and turns each \n into an actual newline. Then it calls strconv.Unquote on each line, since strconv.Unquote cannot handle newlines, swaps ' and " back, and joins the pieces together.
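// unquote decodes a single-quoted string that otherwise follows Go string literal syntax.
func unquote(s string) (string, error) {
	// Drop the outer quotes, swap ' and ", and turn each \n escape into a real newline.
	replaced := strings.NewReplacer(
		`'`, `"`,
		`"`, `'`,
		`\n`, "\n",
	).Replace(s[1 : len(s)-1])
	unquoted := ""
	for _, line := range strings.Split(replaced, "\n") {
		// strconv.Unquote rejects raw newlines, so unquote line by line.
		tmp, err := strconv.Unquote(`"` + line + `"`)
		if err != nil {
			return "", err
		}
		unquoted += tmp + "\n"
	}
	// Swap the quotes back and drop the newline appended after the last line.
	return strings.NewReplacer(
		`"`, `'`,
		`'`, `"`,
	).Replace(unquoted[:len(unquoted)-1]), nil
}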

How to Print ascii text in go like python does

How do I print ASCII text in Go the way Python does?
[Screenshot: Using Python]
[Screenshot: Using Golang]
The problem is that your text contains a backtick (`), which happens to be the delimiter for Go's raw string literals. The situation is comparable to your Python code if the text contained three consecutive double quotes, which is the delimiter used in your Python code.
I don't see any quick way out of this without modifying your ASCII text, since Go offers no alternative raw string delimiter the way Python does. You may want to store your ASCII text in a text file and read it from there:
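import (
	....
	....
	"io/ioutil"
)
// banner returns the contents of ascii.txt as a string.
func banner() string {
	// ioutil.ReadFile still works; since Go 1.16, os.ReadFile is the preferred equivalent.
	b, err := ioutil.ReadFile("ascii.txt")
	if err != nil {
		panic(err)
	}
	return string(b)
}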
If you're OK with a slight modification to the ASCII text source, you can temporarily use another character that isn't used anywhere else in the text to represent the backtick, and then do a string replacement to put the actual backtick in place. Or you can use fmt.Sprintf to supply the problematic backtick:
ascii := fmt.Sprintf(`....%c88b...`, '`')
fmt.Println(ascii)
// output:
// ....`88b...
Yes, but you have to split out the lines that contain a backtick and put them in standard double-quoted strings (").
... +
"888 6(, ` ' " +
...
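The placeholder and concatenation ideas can be sketched as follows. The art fragment is shortened, and the '~' placeholder and the function names are assumptions for this illustration; any character the art does not use would do:

```go
package main

import (
	"fmt"
	"strings"
)

// withPlaceholder writes the art with '~' standing in for each backtick
// and swaps the real character in afterwards.
func withPlaceholder() string {
	art := `....~88b...`
	return strings.ReplaceAll(art, "~", "`")
}

// withConcat splices the backtick in as an interpreted string literal
// between two raw string literals.
func withConcat() string {
	return `....` + "`" + `88b...`
}

func main() {
	fmt.Println(withPlaceholder())
	fmt.Println(withConcat())
}
```

Both functions produce the same text; the placeholder version keeps the art readable in the source, while concatenation needs no reserved character.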

cmd line parameter string.Contains behaving differently from hardcoded parameter

I'm looking to get some clarification on why these two strings.Contains() calls behave differently.
package main
import (
	"strings"
	"os"
	"errors"
	"fmt"
)
func main() {
	hardcoded := "col1,col2,col3\nval1,val2,val3"
	if !strings.Contains(hardcoded, "\n") {
		panic(errors.New("The hardcoded string should contain a new line"))
	}
	fmt.Println("New line found in hardcoded string")
	if len(os.Args) == 2 {
		fmt.Println("parameter:", os.Args[1])
		if !strings.Contains(os.Args[1], "\n") {
			panic(errors.New("The parameter string should contain a new line."))
		}
		fmt.Println("New line found in parameter string")
	}
}
If I run this with
go run input-tester.go col1,col2,col3\\nval1,val2,val3
I get the following
New line found in hardcoded string
parameter: col1,col2,col3\nval1,val2,val3
panic: The parameter string should contain a new line.
goroutine 1 [running]:
panic(0x497100, 0xc42000e310)
/usr/local/go/src/runtime/panic.go:500 +0x1a1
main.main()
/home/user/Desktop/input-tester.go:21 +0x343
exit status 2
I can see that the printed string has the same format as the hardcoded string, yet strings.Contains() doesn't find the "\n".
I'm guessing this is an oversight on my part. Can anyone explain what I'm missing or misunderstanding?
It behaves differently because in the hardcoded string the compiler interprets the \n inside double quotes as a single newline character.
The command-line argument is an ordinary string that never passes through the compiler: it contains the two characters \ and n, while the condition looks for "\n", which is a newline character.
Put simply, `\n` compares against the two consecutive characters "\" and "n", not against the newline character "\n".
So for command-line arguments, use:
if !strings.Contains(os.Args[1], `\n`) {
	panic(errors.New("The parameter string should contain a new line."))
}
Reference : https://golang.org/ref/spec#String_literals
Raw string literals are character sequences between back quotes, as in `foo`. Within the quotes, any character may appear except back quote. The value of a raw string literal is the string composed of the uninterpreted (implicitly UTF-8-encoded) characters between the quotes; in particular, backslashes have no special meaning and the string may contain newlines.

Normalizing text input to ASCII

I am building a small tool which parses a user's input, finds common pitfalls in writing, and flags them so the user can improve their text. So far everything works well except for text that has curly quotes instead of normal ASCII straight quotes. I have a hack now which does a string replacement for opening and closing single curly quotes and opening and closing double curly quotes, like so:
cleanedData := bytes.Replace([]byte(data), []byte("’"), []byte("'"), -1)
I feel like there must be a better way to handle this in the stdlib so I can also convert other non-ASCII characters to an ASCII equivalent. Any help would be greatly appreciated.
The strings.Map function looks to me like what you want.
I don't know of a generic 'ToAscii' type function, but Map has a nice approach for mapping runes to other runes.
Example (updated):
func main() {
	data := "Hello “Frank” or ‹François› as you like to be ‘called’"
	fmt.Printf("Original: %s\n", data)
	cleanedData := strings.Map(normalize, data)
	fmt.Printf("Cleaned: %s\n", cleanedData)
}
// normalize maps curly and angle quotes to their straight ASCII equivalents.
func normalize(in rune) rune {
	switch in {
	case '“', '‹', '”', '›':
		return '"'
	case '‘', '’':
		return '\''
	}
	return in
}
Output:
Original: Hello “Frank” or ‹François› as you like to be ‘called’
Cleaned: Hello "Frank" or "François" as you like to be 'called'
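strings.Map replaces one rune with one rune, so it cannot expand a character such as an ellipsis into three dots. Where that matters, strings.NewReplacer might be an option. A sketch with a small hypothetical table (asciiReplacer and toASCII are made-up names, and the table is not a complete transliteration):

```go
package main

import (
	"fmt"
	"strings"
)

// asciiReplacer maps a few typographic characters to ASCII strings.
// The table is a small hypothetical sample.
var asciiReplacer = strings.NewReplacer(
	"“", `"`,
	"”", `"`,
	"‘", "'",
	"’", "'",
	"…", "...",
)

// toASCII applies the replacement table to s.
func toASCII(s string) string {
	return asciiReplacer.Replace(s)
}

func main() {
	fmt.Println(toASCII("“Wait…” she said, ‘really’?"))
}
```

Characters that are not in the table, such as accented letters, pass through unchanged.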

Go lang differentiate "\n" and line break

I am trying to read string output generated by a Linux command with the following code:
out, err := exec.Command("sh", "-c", cmd).Output()
The above out is of type []byte. How can I differentiate a "\n" character contained in the line content from a real line break? I tried
strings.Split(output, "\n")
and
bufio.NewScanner(strings.NewReader(output))
but they both split the whole string buffer whenever they see a "\n" character.
OK, to clarify, an "unreal" break is a "\n" character contained in a string as follows,
Print first result: "123;\n234;\n"
Print second result: "456;\n"
The whole output is one big multi-line string, which may also contain other quoted strings. I am processing the whole output in my Go program, but I can't control the command output to add a backslash before the "\n" character.
To clarify further: I want to process a byte sequence that contains strings within a string, preserving the "\n" inside each inner string and using only the outer-level "\n" to break lines. So for the following byte sequence:
First line: "test1"
Second line: "123;\n234;\n345;"
Third line: "456;\n567;"
Fourth line: "test4"
I want to get 4 lines when processing the whole sequence, instead of getting 7 total lines. It's an old project, but I remember that in Python I could get the 4 lines directly using syntax like "for line in f", and print the content of the second inner string instead of rendering it.
It's possible that your "\n" is actually the escaped version of a line break character. You can replace these with real line breaks by searching for the escaped version and replacing with the non escaped version:
strings.Replace(sourceStr, `\n`, "\n", -1)
Inside backticks a backslash has no special meaning, so `\n` denotes the two characters \ and n (the escaped form), whereas "\n" is the actual line break character.
There is no distinction between a "real" and an "unreal" line break.
If you're using a Unix-like system, the end of a line in a text file is denoted by the LF or '\n' character. You cannot have a '\n' character in the middle of a line.
A string in memory can contain as many '\n' characters as you like. The string "foo\nbar\n", when written to a text file, will create two lines, "foo" and "bar".
There is no effective difference between
fmt.Println("foo")
fmt.Println("bar")
and
fmt.Printf("foo\nbar\n")
Both print the same sequence of 2 lines, as does this:
fmt.Println("foo\nbar")
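The equivalence can be checked by capturing the output in a buffer instead of printing it. A small sketch, where viaPrintln and viaPrintf are hypothetical helper names:

```go
package main

import (
	"bytes"
	"fmt"
)

// viaPrintln writes two lines with two Fprintln calls.
func viaPrintln() string {
	var buf bytes.Buffer
	fmt.Fprintln(&buf, "foo")
	fmt.Fprintln(&buf, "bar")
	return buf.String()
}

// viaPrintf writes the same two lines with embedded "\n" characters.
func viaPrintf() string {
	var buf bytes.Buffer
	fmt.Fprintf(&buf, "foo\nbar\n")
	return buf.String()
}

func main() {
	fmt.Println(viaPrintln() == viaPrintf()) // true
	fmt.Printf("%q\n", viaPrintln())         // "foo\nbar\n"
}
```

Both helpers produce the same bytes, so nothing in the output records which call produced a given line break.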
The encoding/csv package might suit your needs:
package main
import (
	"encoding/csv"
	"fmt"
	"strings"
)
const s = `First line: "test1"
Second line: "123;
234;
345;"
Third line: "456;
567;"
Fourth line: "test4"
`
func main() {
	// Treat ':' as the field separator so each quoted value becomes one field.
	r := csv.NewReader(strings.NewReader(s))
	r.Comma = ':'
	r.TrimLeadingSpace = true
	a, e := r.ReadAll()
	if e != nil {
		panic(e)
	}
	fmt.Printf("%q\n", a)
}
Result:
[
["First line" "test1"]
["Second line" "123;\n234;\n345;"]
["Third line" "456;\n567;"]
["Fourth line" "test4"]
]
https://golang.org/pkg/encoding/csv
strings.Trim(string, "\f\t\r\n ")
