Selenium + Go - how to? - go

I'm starting to look at Selenium with Go, but I haven't found much information.
I'm using github.com/tebeka/selenium.
In Python, I just install (pip install selenium) and code like this to open a browser:
from selenium import webdriver
driver = webdriver.Chrome(executable_path=r'./chromedriver.exe')
driver.get('http://www.hp.com')
How do I do the same in go?
I'm trying this, but it does not open the browser like Python does:
package main
import (
"fmt"
"github.com/tebeka/selenium"
)
func main() {
selenium.ChromeDriver("./chromedriver.exe")
caps := selenium.Capabilities{"browserName": "chrome"}
selenium.NewRemote(caps, fmt.Sprintf("http://www.google.com", 80))
}
Is there a simple way in Go to just open the browser on my machine like those 3 Python lines do?
Thanks!

Python's Selenium bindings start the driver and the browser for you, but the Go package does not. You have to start the ChromeDriver service explicitly. This is a simple example of using ChromeDriver with Go:
package main
import (
	"github.com/tebeka/selenium"
	"github.com/tebeka/selenium/chrome"
)
func main() {
	// Start the ChromeDriver service on port 4444.
	service, err := selenium.NewChromeDriverService("./chromedriver", 4444)
	if err != nil {
		panic(err)
	}
	defer service.Stop()
	caps := selenium.Capabilities{}
	caps.AddChrome(chrome.Capabilities{Args: []string{
		"--window-size=1920,1080",
		"--no-sandbox",
		"--disable-dev-shm-usage",
		"--disable-gpu",
		// "--headless", // uncomment this line to hide the browser window
	}})
	// An empty URL makes NewRemote fall back to its default address on port 4444.
	driver, err := selenium.NewRemote(caps, "")
	if err != nil {
		panic(err)
	}
	// Open the page and report a failed navigation instead of ignoring it.
	if err := driver.Get("https://www.google.com"); err != nil {
		panic(err)
	}
}

Related

Dynamic Discovery Client For GCP Golang

I've recently shifted from python to golang. I had been using python to work with GCP.
I used to pass in the scopes and mention the discovery client I wanted to create like this :
def get_client(scopes, api, version="v1"):
    service_account_json = os.environ.get("SERVICE_ACCOUNT_KEY_JSON", None)
    if service_account_json is None:
        sys.exit("Exiting !!! No SSH_KEY_SERVICE_ACCOUNT env var found.")
    credentials = service_account.Credentials.from_service_account_info(
        json.loads(b64decode(service_account_json)), scopes=scopes
    )
    return discovery.build(api, version, credentials=credentials, cache_discovery=False)
And this would create my desired discovery client, whether it be compute engine service or sqladmin
However, in Go I can't seem to find an equivalent.
I found this : https://pkg.go.dev/google.golang.org/api/discovery/v1
For any client I want to create, I would have to import its package and then construct it, like this:
https://cloud.google.com/resource-manager/reference/rest/v1/projects/list#examples
package main
import (
"fmt"
"log"
"golang.org/x/net/context"
"golang.org/x/oauth2/google"
"google.golang.org/api/cloudresourcemanager/v1"
)
func main() {
ctx := context.Background()
c, err := google.DefaultClient(ctx, cloudresourcemanager.CloudPlatformScope)
if err != nil {
log.Fatal(err)
}
cloudresourcemanagerService, err := cloudresourcemanager.New(c)
if err != nil {
log.Fatal(err)
}
req := cloudresourcemanagerService.Projects.List()
if err := req.Pages(ctx, func(page *cloudresourcemanager.ListProjectsResponse) error {
for _, project := range page.Projects {
// TODO: Change code below to process each `project` resource:
fmt.Printf("%#v\n", project)
}
return nil
}); err != nil {
log.Fatal(err)
}
}
So I have to import each client library to get its client:
"google.golang.org/api/cloudresourcemanager/v1"
There's no dynamic creation of it.
Is it even possible, given that Go is strictly type-checked? 🤔
Thanks.
No, this is not possible with the Golang Google Cloud library.
You've nailed the point on the strict type checking, as it would definitely defeat the benefits of compile time type checking. It would also be a bad Golang practice to return different objects with different signatures, as we don't do duck typing and instead we rely on interface contracts.
Golang is boring and verbose, and it's like that by design :)
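The credential-loading half of the Python helper does carry over to Go; only the dynamic discovery.build part has no counterpart. Below is a minimal sketch of that half using only the standard library. The function names decodeKey and keyFromEnv are made up for illustration. The resulting bytes could then be passed to each typed constructor, for example through option.WithCredentialsJSON from google.golang.org/api/option, which is left out here so the sketch runs without third-party modules.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"errors"
	"fmt"
	"os"
)

// decodeKey turns a base64-encoded service-account key into raw JSON bytes.
// It rejects input that is not valid base64 or whose payload is not valid JSON.
func decodeKey(encoded string) ([]byte, error) {
	raw, err := base64.StdEncoding.DecodeString(encoded)
	if err != nil {
		return nil, fmt.Errorf("decode key: %w", err)
	}
	if !json.Valid(raw) {
		return nil, errors.New("decode key: payload is not valid JSON")
	}
	return raw, nil
}

// keyFromEnv mirrors the Python helper: it reads SERVICE_ACCOUNT_KEY_JSON
// and fails if the variable is unset.
func keyFromEnv() ([]byte, error) {
	encoded, ok := os.LookupEnv("SERVICE_ACCOUNT_KEY_JSON")
	if !ok {
		return nil, errors.New("SERVICE_ACCOUNT_KEY_JSON is not set")
	}
	return decodeKey(encoded)
}

func main() {
	key, err := keyFromEnv()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	fmt.Printf("loaded %d bytes of credentials\n", len(key))
}
```

Each service (compute, sqladmin, and so on) still needs its own import and constructor call, but all of them can share the key loaded here.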

Golang Colly Scraping - Website Captcha Catches My Scrape

I wrote a scraper for Amazon product titles, but Amazon's captcha catches it. Out of 10 runs of go run main.go, 8 hit the captcha and only 2 returned the product title.
I researched this but only found solutions for Python, none for Go. Is there a solution for me?
package main
import (
"fmt"
"strings"
"github.com/gocolly/colly"
)
func main() {
// Create a Collector restricted to Amazon domains
c := colly.NewCollector(
colly.AllowedDomains("www.amazon.com", "amazon.com"),
)
c.OnHTML("div", func(h *colly.HTMLElement) {
capctha := h.Text
title := h.ChildText("span#productTitle")
fmt.Println(strings.TrimSpace(title))
fmt.Println(strings.TrimSpace(capctha))
})
// Start the collector
c.Visit("https://www.amazon.com/Bluetooth-Over-Ear-Headphones-Foldable-Prolonged/dp/B07K5214NZ")
}
Output:
Enter the characters you see below Sorry, we just need to make sure
you're not a robot. For best results, please make sure your browser is
accepting cookies.
If you don't mind a different package, I wrote a package to search HTML
(essentially a thin wrapper around github.com/tdewolff/parse):
package main
import (
"github.com/89z/parse/html"
"net/http"
"os"
)
func main() {
req, err := http.NewRequest(
"GET", "https://www.amazon.com/dp/B07K5214NZ", nil,
)
if err != nil {
panic(err)
}
req.Header = http.Header{
"User-Agent": {"Mozilla"},
}
res, err := new(http.Transport).RoundTrip(req)
if err != nil {
panic(err)
}
defer res.Body.Close()
lex := html.NewLexer(res.Body)
lex.NextAttr("id", "productTitle")
os.Stdout.Write(lex.Bytes())
}
Result:
Bluetooth Headphones Over-Ear, Zihnic Foldable Wireless and Wired Stereo
Headset Micro SD/TF, FM for Cell Phone,PC,Soft Earmuffs &Light Weight for
Prolonged Waring(Rose Gold)
https://github.com/89z/parse
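The example above sends an explicit User-Agent header, which Colly's default collector in the question does not set to a browser-like value. Here is a minimal standard-library sketch of building such a request. The helper name newBrowserLikeRequest and the header values are assumptions chosen for illustration, and sending these headers may reduce how often the captcha page appears but is not guaranteed to avoid it. With Colly, the colly.UserAgent collector option should serve the same purpose.

```go
package main

import (
	"fmt"
	"net/http"
)

// newBrowserLikeRequest builds a GET request carrying headers that a browser
// would normally send. The header values are illustrative assumptions.
func newBrowserLikeRequest(url string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("User-Agent", "Mozilla/5.0 (X11; Linux x86_64)")
	req.Header.Set("Accept-Language", "en-US,en;q=0.9")
	return req, nil
}

func main() {
	req, err := newBrowserLikeRequest("https://www.amazon.com/dp/B07K5214NZ")
	if err != nil {
		panic(err)
	}
	// Print the request that would be sent; pass req to an http.Client's Do
	// method to actually perform it.
	fmt.Println(req.Method, req.URL)
	for name, values := range req.Header {
		fmt.Println(name+":", values)
	}
}
```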

How to open an URL in a browser with an authentication header?

In Go we can launch a browser window to open a URL using exec.Command. An example can be found here
My question is: how can we open that URL with a custom header?
If you're using Chrome, you could use Chrome DevTools Protocol to attach to a running Chrome instance and issue a command to navigate to a URL with specific headers.
First, launch Chrome with the Chrome DevTools Protocol enabled by using the flag --remote-debugging-port=9222.
You'll get a response similar to DevTools listening on ws://127.0.0.1:9222/devtools/browser/2393d6e8-a85d-40a2-a79e-13f1585ff336
Pass that ws://... URL into the program below:
package main
import (
"context"
"flag"
"log"
"github.com/chromedp/cdproto/network"
"github.com/chromedp/chromedp"
)
func main() {
var devToolWsURL string
flag.StringVar(&devToolWsURL, "devtools-ws-url", "", "DevTools Websocket URL")
flag.Parse()
// Create contexts.
actxt, cancelActxt := chromedp.NewRemoteAllocator(context.Background(), devToolWsURL)
defer cancelActxt()
// Create new tab.
ctxt, _ := chromedp.NewContext(actxt)
// Custom header.
headers := map[string]interface{}{
"X-Header": "my request header",
}
task := chromedp.Tasks{
network.Enable(),
network.SetExtraHTTPHeaders(network.Headers(headers)),
chromedp.Navigate("http://google.com"),
}
// Run task.
err := chromedp.Run(ctxt, task)
if err != nil {
log.Fatal(err)
}
}
Notes:
9222 is the default port for this protocol, but you can use any port you want.
I didn't include the exec.Command code for brevity.
References:
Header example
Remote Chrome control example
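The exec.Command step that was left out above could look roughly like the sketch below, which uses only the standard library. The binary name google-chrome and the helper names chromeCommand and parseDevToolsURL are assumptions for illustration. The sketch only prepares the command and parses a sample startup line, so it runs without Chrome installed.

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// chromeCommand prepares (but does not start) a Chrome process with the
// DevTools protocol enabled on the given port.
func chromeCommand(binary string, port int) *exec.Cmd {
	return exec.Command(binary, fmt.Sprintf("--remote-debugging-port=%d", port))
}

// parseDevToolsURL extracts the ws:// URL from Chrome's startup line
// "DevTools listening on ws://...". ok is false if the line does not match.
func parseDevToolsURL(line string) (url string, ok bool) {
	const prefix = "DevTools listening on "
	line = strings.TrimSpace(line)
	if !strings.HasPrefix(line, prefix) {
		return "", false
	}
	rest := strings.TrimPrefix(line, prefix)
	if !strings.HasPrefix(rest, "ws://") {
		return "", false
	}
	return rest, true
}

func main() {
	// Assumed binary name; adjust it for your platform.
	cmd := chromeCommand("google-chrome", 9222)
	fmt.Println("would run:", strings.Join(cmd.Args, " "))

	sample := "DevTools listening on ws://127.0.0.1:9222/devtools/browser/2393d6e8-a85d-40a2-a79e-13f1585ff336"
	if url, ok := parseDevToolsURL(sample); ok {
		fmt.Println("pass to -devtools-ws-url:", url)
	}
}
```

In a real program you would start the command, scan its output for the matching line, and hand the parsed URL to chromedp.NewRemoteAllocator.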
UPDATE
Found a simpler way. You can just launch Chrome straight from chromedp by overriding the default headless option:
func main() {
// Create contexts.
opts := append(chromedp.DefaultExecAllocatorOptions[:], chromedp.Flag("headless", false))
actx, cancel := chromedp.NewExecAllocator(context.Background(), opts...)
ctx, cancel := chromedp.NewContext(actx)
// Call cancel() to close Chrome on some condition.
if false {
cancel()
}
// Custom header.
headers := map[string]interface{}{
"X-Header": "my request header",
}
task := chromedp.Tasks{
network.Enable(),
network.SetExtraHTTPHeaders(network.Headers(headers)),
chromedp.Navigate("http://tested.com"),
}
// Run task.
err := chromedp.Run(ctx, task)
if err != nil {
log.Fatal(err)
}
}

Can Anyone help me with this GoLand terminal Error

I'm using the GoLand IDE and I'm having a problem running my Go web app from the Terminal: nothing seems to happen.
Here is the problem: the terminal just prints the command prompt again when I try to run the code.
C:\Users\Evan\go\src\awesomeProject9>go run main.go
C:\Users\Evan\go\src\awesomeProject9>
package main
import (
"fmt"
"html/template"
"net/http"
)
var tpl *template.Template
func init(){
template.Must(template.ParseGlob("templates/*.html"))
}
func main() {
http.HandleFunc("templates/index", idx)
http.ListenAndServe("8000", nil)
fmt.Println("hello World")
}
func idx(w http.ResponseWriter, r *http.Request){
tpl.ExecuteTemplate(w, "templates/index.html", nil)
}
Thanks to @zerkms for pointing out that I was wrong. I ran into the exact mistake I was going to warn you about later on:
You really should check the err returned by the functions you call, since it helps a lot! For starters, simply:
err := http.ListenAndServe("8000", nil)
if err != nil {
log.Fatal(err)
}
This logs the following and exits:
2018/12/18 10:43:16 listen tcp: address 8000: missing port in address
The correct line is:
err := http.ListenAndServe(":8000", nil)
WRONG (my original answer, kept only for documentation):
ListenAndServe doesn't block the further code execution....
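To see why "8000" is rejected while ":8000" is accepted, without starting a server, you can run the address through net.SplitHostPort, which applies the same host:port shape check. This is a small sketch; the helper name checkAddr is made up for illustration.

```go
package main

import (
	"fmt"
	"net"
)

// checkAddr reports whether addr has the host:port shape that a TCP listen
// address needs. The host part may be empty, as in ":8000".
func checkAddr(addr string) error {
	_, _, err := net.SplitHostPort(addr)
	return err
}

func main() {
	for _, addr := range []string{"8000", ":8000", "localhost:8000"} {
		if err := checkAddr(addr); err != nil {
			fmt.Printf("%q is invalid: %v\n", addr, err)
			continue
		}
		fmt.Printf("%q is ok\n", addr)
	}
}
```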

How to take screenshot of a website using Golang?

Given a URL, I want to take a screenshot of the website using Go. I searched but didn't find anything. Can anyone please help me?
You can use a Go version of Selenium if you want to go that route. https://godoc.org/github.com/tebeka/selenium
There is no pure Go way to do this at the moment, since it must involve a browser in some form.
The easiest path to achieve this functionality is probably:
Find a nice NodeJS library to take website screenshots
Create a NodeJS script that suits your needs for taking screenshots (i/o and settings)
Execute this NodeJS script from Golang and handle the results in your Golang code
Not the cleanest method to get this done though - if you want it cleaner you probably have to build/find a golang package that controls a browser so you can skip the NodeJS middleman.
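The last of those steps, running the NodeJS script from Go, could be sketched as follows. The script name screenshot.js and its two arguments (URL and output path) are hypothetical, as are the helper names. The sketch assumes node is on the PATH and reports an error otherwise.

```go
package main

import (
	"context"
	"fmt"
	"os/exec"
	"time"
)

// screenshotArgs builds the argument list for a hypothetical Node.js script,
// screenshot.js, that takes a URL and an output path.
func screenshotArgs(url, outPath string) []string {
	return []string{"screenshot.js", url, outPath}
}

// takeScreenshot runs the script with node and, on failure, returns an error
// that includes the script's combined stdout and stderr.
func takeScreenshot(ctx context.Context, url, outPath string) error {
	cmd := exec.CommandContext(ctx, "node", screenshotArgs(url, outPath)...)
	if out, err := cmd.CombinedOutput(); err != nil {
		return fmt.Errorf("node screenshot failed: %w: %s", err, out)
	}
	return nil
}

func main() {
	// Give the script a bounded amount of time so a hung browser cannot
	// block the Go program forever.
	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
	defer cancel()
	if err := takeScreenshot(ctx, "https://google.com", "shot.png"); err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println("wrote shot.png")
}
```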
I solved this issue using https://github.com/mafredri/cdp and a Chrome headless docker container.
You can see my service example here: https://gist.github.com/efimovalex/9f9b815b0d5b1b7889a51d46860faf8a
A few more tools using Go and Chrome/Chromium include:
gowitness CLI app
screenshot library
web2image CLI app based on chromedp
I wrote a program for this specific task. Here is sample code that browses google.com and takes a screenshot.
package main
import (
"time"
driver "github.com/dreygur/webdriver"
)
func main() {
url := `https://google.com`
driver.RunServer("./geckodriver")
driver.GetSession()
driver.Get(url)
time.Sleep(8 * time.Second)
driver.Screenshot("google")
time.Sleep(8 * time.Second)
defer driver.Kill()
}
To install the module, run go get github.com/dreygur/webdriver
You can use chromedp.
Note that you need the Chrome browser installed.
Example:
package main
import (
	"context"
	"fmt"
	"os"
	"time"
	"github.com/chromedp/chromedp"
)
// TakeScreenshot opens url in Chrome and returns a PNG screenshot of the page.
func TakeScreenshot(ctx context.Context, url string) ([]byte, error) {
	// NewContext allocates a browser tab; cancel releases it again.
	tabCtx, cancel := chromedp.NewContext(ctx)
	defer cancel()
	var buf []byte
	if err := chromedp.Run(tabCtx,
		chromedp.Navigate(url),
		chromedp.Sleep(3*time.Second), // crude wait for the page to render
		chromedp.CaptureScreenshot(&buf),
	); err != nil {
		return nil, err
	}
	return buf, nil
}
func main() {
	data, err := TakeScreenshot(context.Background(), "https://google.com")
	if err != nil {
		panic(err)
	}
	// Write the PNG bytes to disk and report a failed write.
	if err := os.WriteFile("./shot.png", data, 0o644); err != nil {
		panic(err)
	}
	fmt.Println("screenshot taken!")
}

Resources