Scandinavian characters not working in go-lang go-instagram API bindings - go

Hi, I'm trying to wrap my head around what seems to be a problem with multibyte support in this open source library (https://github.com/carbocation/go-instagram/). I am using the code below to retrieve information about the tag "blue" in Swedish. However, I get an empty array when I try.
fmt.Println("Starting instagram download.")
client := instagram.NewClient(nil)
client.ClientID = "myid"
media, _, _ := client.Tags.RecentMedia("blå", nil)
fmt.Println(media)
I have tried using the API through the browser, and there are several pictures tagged with this tag. I have also tried the code snippet with an English tag such as blue, and that returns the latest pictures as expected. I would be glad if anyone could explain why this happens. I'd like to update the library so it supports multibyte characters, but I don't have the Go knowledge required. Is this a Go problem or a problem with the library?
Thank you

The problem is in validTagName():
// Strip out things we know Instagram won't accept. For example, hyphens.
func validTagName(tagName string) (bool, error) {
	//\W matches any non-word character
	reg, err := regexp.Compile(`\W`)
	if err != nil {
		return false, err
	}
	if reg.MatchString(tagName) {
		return false, nil
	}
	return true, nil
}
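In Go's regexp package, \W is ASCII-only: it matches precisely [^0-9A-Za-z_]. The å in blå therefore counts as a non-word character, and validTagName returns false for the tag. This validation check is incorrect for tags that contain non-ASCII letters.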
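A Unicode-aware check could look roughly like the sketch below. It is not the library's code: the name isValidTagName is hypothetical, and the rule "letters, digits and underscore only" is an assumption about what Instagram accepts. It relies on the \p{L} (letter) and \p{N} (number) Unicode classes, which Go's regexp package supports.

```go
package main

import (
	"fmt"
	"regexp"
)

// nonTagChar matches any rune that is not a Unicode letter, a Unicode
// number, or an underscore. The accepted set is an assumption, not a
// documented Instagram rule.
var nonTagChar = regexp.MustCompile(`[^\p{L}\p{N}_]`)

// isValidTagName is a hypothetical replacement for validTagName.
// It rejects empty tags and tags containing any other character.
func isValidTagName(tagName string) bool {
	return tagName != "" && !nonTagChar.MatchString(tagName)
}

func main() {
	for _, tag := range []string{"blue", "blå", "light-blue"} {
		fmt.Printf("%q valid: %v\n", tag, isValidTagName(tag))
	}
}
```

Compiling the pattern once at package level also avoids recompiling it on every call, which the original function does.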

Related

How to join URLs in Go?

I'm creating a web crawler in Go. After parsing and scraping all the URLs on a page, I get hyperlinks in different formats:
/my/next/page
my/next/page
//my_next_page
https://different-domain.com
As you can see, there are many combinations here, and in some cases the URL points to an entirely different domain. Plain string joins will not work here. How do I join URLs correctly, so that each resulting URL can be fed back into the crawler to be parsed and scraped in turn?
Hyperlinks need special handling because, as shown above, the href of an anchor tag can take many forms. The function below resolves a hyperlink against the URL of the page it was found on:
func joinURLs(baseURL, hyperlink string) string {
	parse, err := url.Parse(hyperlink)
	if err != nil {
		return ""
	}
	base, err := url.Parse(baseURL)
	if err != nil {
		return ""
	}
	nextURLToCrawl := base.ResolveReference(parse)
	return nextURLToCrawl.String()
}
The best thing about this function is that it also handles hyperlinks that point to a different domain, so you do not have to check yourself whether the hostname is the same. Your code stays much more declarative.
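To see how the four formats from the question resolve, the function can be run against a sample page URL. The base URL https://example.com/blog/post is made up for illustration; joinURLs is repeated so the program is self-contained.

```go
package main

import (
	"fmt"
	"net/url"
)

// joinURLs resolves hyperlink relative to baseURL, following the usual
// URL reference resolution rules. It returns "" if either input is invalid.
func joinURLs(baseURL, hyperlink string) string {
	ref, err := url.Parse(hyperlink)
	if err != nil {
		return ""
	}
	base, err := url.Parse(baseURL)
	if err != nil {
		return ""
	}
	return base.ResolveReference(ref).String()
}

func main() {
	// Hypothetical page the links were scraped from.
	base := "https://example.com/blog/post"
	links := []string{
		"/my/next/page",                // absolute path on the same host
		"my/next/page",                 // path relative to /blog/
		"//my_next_page",               // scheme-relative: host only
		"https://different-domain.com", // already absolute
	}
	for _, link := range links {
		fmt.Printf("%-30s -> %s\n", link, joinURLs(base, link))
	}
}
```

Note that a path without a leading slash is resolved against the directory of the base URL (/blog/ here), not against the site root.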

Serving image from string in http golang

I need to display an image in response to an HTTP GET, but I can only use a string as the response body.
For example: headers: image/png, body: Aeacxxffsaf (an encoded representation or similar).
It's more or less like this website, https://codebeautify.org/base64-to-image-converter, but I want the string to be rendered as an image when it is fetched with HTTP GET.
Here is a code snippet to explain:
//string that is generated from image (encoded)
encString := "iVBORw0KGgoAAAANSUhEUgAAANIAAAAzCAYAAADigVZl..."
//set http headers to png
//and assign the encString to the body
Is there any way to do that, serving an image using only a string?
Sorry if my question is a bit confusing, but this is the best I can describe it. I have been searching for an answer for several days.
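You do that just like with any other content; just decode the base64 first.
func handler(w http.ResponseWriter, r *http.Request) {
	encString := "iVBORw0KGgoAAAANSUhEUgAAANIAAAAzCAYAAADigVZl..."
	// Turn the base64 text back into the raw PNG bytes.
	data, err := base64.StdEncoding.DecodeString(encString)
	if err != nil {
		http.Error(w, "invalid image data", http.StatusInternalServerError)
		return
	}
	w.Header().Set("Content-Type", "image/png")
	if _, err := w.Write(data); err != nil {
		// The client has probably gone away; nothing more can be sent.
		log.Printf("writing image: %v", err)
	}
}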
However, if you want to display it in the browser without decoding it on the server, you will have to do some client-side hacking.
It depends on how the receiving side interprets it.
A base64 string, a base32 string, and a byte array are all just different representations of the same bytes.
Whether it shows up as an image depends entirely on how the other side interprets those bytes.
EDIT:
I see what you mean now. Have a look at the image/jpeg package.
I don't have a code snippet to share right now, but with it you should be able to load your image into a buffer and decode it to get an image.Image value.
Afterwards you can encode that image.Image into your response body. Do let me know if it works for you.
Remember to set the appropriate header on your writer:
w.Header().Set("Content-Type", "image/jpeg")

Parsing Dynamic Time Format in Go

I'm working with an external API that will sometimes shorten the returned timestamp if the last values are zero.
layout := "2006-01-02T15:04:05.000"
opened, err := time.Parse(layout, externallyFormattedTimestamp)
if err != nil {
	fmt.Println("something went wrong parsing the timestamp")
	fmt.Println(err.Error())
}
If the returned externallyFormattedTimestamp is 2018-11-05T06:19:59.827, this will parse without issue.
But if the time returned ends with zeros, it'll be shortened like 2018-11-05T06:19:59.8 instead of 2018-11-05T06:19:59.800, and break. Do I need to manually append zeros to the string before parsing, or is there something I missed built into Go that will handle this for me?
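Use this layout instead: 2006-01-02T15:04:05.999. In a layout, 0s after the decimal point require exactly that many fractional digits, whereas 9s mark fractional digits that may be absent.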
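A quick way to check this is to parse both forms of the timestamp with the .999 layout. The helper name parseTimestamp is made up for this sketch; the sample values come from the question.

```go
package main

import (
	"fmt"
	"time"
)

// layout uses 9s for the fractional seconds, so the input may carry
// fewer digits after the decimal point, or none at all.
const layout = "2006-01-02T15:04:05.999"

// parseTimestamp is a hypothetical helper wrapping time.Parse.
func parseTimestamp(s string) (time.Time, error) {
	return time.Parse(layout, s)
}

func main() {
	inputs := []string{
		"2018-11-05T06:19:59.827",
		"2018-11-05T06:19:59.8",
		"2018-11-05T06:19:59",
	}
	for _, in := range inputs {
		ts, err := parseTimestamp(in)
		if err != nil {
			fmt.Println("parse error:", err)
			continue
		}
		fmt.Printf("%-24s -> %d ms\n", in, ts.Nanosecond()/1e6)
	}
}
```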

unicode being output literally instead of as unicode

I am creating an IRC bot using Go as a first project to get to grips with the language. One of the bot functions is to grab data from the TVmaze API and display in the channel.
I have imported an env package which allows the bot admin to define how the output is displayed.
For example SHOWSTRING="#showname# - #status# – #network.name#"
I am trying to add functionality to it so that the admin can use IRC formatting functionality which is accessed with \u0002 this is bold \u0002 for example.
I have a function which generates the string that is being returned and displayed in the channel.
func generateString(show Show) string {
	str := os.Getenv("SHOWSTRING")
	r := strings.NewReplacer(
		"#ID#", string(show.ID),
		"#showname#", show.Name,
		"#status#", show.Status,
		"#network.name#", show.Network.Name,
	)
	result := r.Replace(str)
	return result
}
From what I have read, I think I need to use the rune data type instead of string and then convert the runes into a string before output.
I am using the https://github.com/thoj/go-irceven package for interacting with IRC.
Although I think that using rune is the correct way to go, I have tried a few things that have confused me.
If I add \u0002 to the SHOWSTRING from the env, it returns \u0002House\u0002 - Ended - Fox. I am doing this with con.Privmsg(roomName, tvmaze.ShowLookup("house"))
However, if I try con.Privmsg(roomName, "\u0002This should be bold\u0002"), it outputs bold text.
What is the best option here? If it is converting the string into runes and then back to a string, how do I go about doing that?
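I needed to use strconv.Unquote() on the string my function returns. In a Go string literal, the compiler turns \u0002 into the control character, but the value read from the environment contains the six literal characters \u0002, so the escape has to be interpreted at run time.
The new generateString function now outputs the correct string and looks like this: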
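func generateString(show Show) string {
	str := os.Getenv("SHOWSTRING")
	r := strings.NewReplacer(
		// Assumes show.ID is an int: string(show.ID) would produce the rune
		// with that code point, not the decimal digits.
		"#ID#", strconv.Itoa(show.ID),
		"#showname#", show.Name,
		"#status#", show.Status,
		"#network.name#", show.Network.Name,
	)
	result := r.Replace(str)
	// Wrap the text in quotes so Unquote interprets escapes such as \u0002.
	ret, err := strconv.Unquote(`"` + result + `"`)
	if err != nil {
		fmt.Println("Error unquoting the string")
		return result // fall back to the text without interpreted escapes
	}
	return ret
}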
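The difference between an escape in source code and the same characters arriving at run time can be shown in isolation. This is a small sketch with a made-up helper name, interpretEscapes; the raw string literal stands in for a value read from the environment.

```go
package main

import (
	"fmt"
	"strconv"
)

// interpretEscapes is a hypothetical helper: it treats s as the body of a
// double-quoted Go string literal and resolves escapes such as \u0002.
// It fails if s contains an unescaped double quote or an invalid escape.
func interpretEscapes(s string) (string, error) {
	return strconv.Unquote(`"` + s + `"`)
}

func main() {
	// Raw string: the backslash and "u0002" are six separate characters,
	// just as they would be in an environment variable.
	fromEnv := `\u0002House\u0002`

	got, err := interpretEscapes(fromEnv)
	if err != nil {
		fmt.Println("unquote failed:", err)
		return
	}
	fmt.Println(len(fromEnv), len(got))           // lengths before and after
	fmt.Println(got == "\u0002House\u0002")       // same as the compiled literal
}
```

One caveat with this approach: a SHOWSTRING that contains a bare double quote makes Unquote return an error, so the error path needs a sensible fallback.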

Download public file from Google Drive - Golang

I have a zip file stored on Google Drive (it is shared publicly). I want to know how to download it in Golang. This current code just creates a blank file named "file.zip":
package main

import (
	"fmt"
	"io"
	"net/http"
	"os"
)

func main() {
	url := "https://docs.google.com/uc?export=download&id=0B2Q7X-dUtUBebElySVh1ZS1iaTQ"
	fileName := "file.zip"
	fmt.Println("Downloading file...")

	output, err := os.Create(fileName)
	if err != nil {
		fmt.Println("Error while creating", fileName, "-", err)
		return
	}
	defer output.Close()

	response, err := http.Get(url)
	if err != nil {
		fmt.Println("Error while downloading", url, "-", err)
		return
	}
	defer response.Body.Close()

	n, err := io.Copy(output, response.Body)
	if err != nil {
		fmt.Println("Error while saving", fileName, "-", err)
		return
	}
	fmt.Println(n, "bytes downloaded")
}
This appears to be a bug, either in Google Drive or in Go; I'm not sure which!
The problem is that the first URL you gave redirects to a second URL, which looks something like this:
https://doc-00-c8-docs.googleusercontent.com/docs/securesc/ha0ro937gcuc7l7deffksulhg5h7mbp1/8i67l6m6cdojptjuh883mu0qqmtptds1/1376330400000/06448503420061938118/*/0B2Q7X-dUtUBebElySVh1ZS1iaTQ?h=16653014193614665626&e=download
Note the * in the URL, which is legal according to this Stack Overflow question. However, it does have a special meaning as a delimiter.
Go fetches the URL with the * encoded as %2A, like this:
https://doc-00-c8-docs.googleusercontent.com/docs/securesc/ha0ro937gcuc7l7deffksulhg5h7mbp1/8i67l6m6cdojptjuh883mu0qqmtptds1/1376330400000/06448503420061938118/%2A/0B2Q7X-dUtUBebElySVh1ZS1iaTQ?h=16653014193614665626&e=download
Google replies to this with "403 Forbidden".
Google doesn't seem to decode the %2A back into a *.
According to this article on Wikipedia, * is a reserved character, and if a reserved character has a special meaning in a URI scheme and must be used for some other purpose, it has to be percent-encoded.
I'm not enough of an expert on this to say who is right, but since Google wrote both parts of the problem, it is definitely their fault somewhere!
Here is the program I was using for testing
I found the solution.
Use: https://googledrive.com/host/ID
Instead of: https://docs.google.com/uc?export=download&id=ID
I'm still investigating why this is happening. In the meantime you can use this workaround:
http://play.golang.org/p/SzGBAiZdGJ
CheckRedirect is called whenever a redirect happens, and there you can set the URL's Opaque field to the path so that the path is sent without being percent-encoded.
Francesc
