How to reset or discard the buffered data - go

Not able to reset or discard the buffer.
I am trying to get data over the serial port, where I receive a data packet of some fixed length every 10 seconds. I have an infinite for loop to receive the data packets continuously. After receiving a new data packet I reset the buffer, but when the next data packet arrives it overwrites the buffer and I get a mixed data packet.
Say I should receive the packet abcdef every n seconds. With the following code I instead receive bcdefa, then after n seconds cdefab, then defabc, and so on:
package main

import (
    "bufio"
    "log"
    "time"

    "github.com/tarm/serial"
)

func main() {
    c := &serial.Config{Name: "/dev/ttyUSB0", Baud: 57600}
    s, err := serial.OpenPort(c)
    if err != nil {
        log.Println(err)
        return
    }
    for {
        time.Sleep(time.Second / 2)
        reader := bufio.NewReader(s)
        pck, err := reader.Peek(46)
        if err != nil {
            log.Println(err)
        }
        go parse(pck)
        reader.Reset(s)
    }
}
How do I reset or discard the buffered data effectively so that I receive the exact data packet?

Bear in mind I can't verify what I'm saying here...
1/ You must not instantiate the bufio reader at each iteration.
2/ bufio.Reader.Peek does NOT advance the reader (see the short sketch after this list).
https://golang.org/pkg/bufio/#Reader.Peek
3/ Unless you get a malformed packet, I don't think you need to reset at all.
4/ Please indent your code at play.golang.org.
5/ You are not checking the read error for termination.
6/ Every package I can find for working with serial ports in Go exposes an io.Reader, so an additional bufio.Reader may be unnecessary. I suspect you are using https://godoc.org/github.com/tarm/serial#OpenPort
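To illustrate point 2, here is a small standalone sketch (unrelated to the serial code; it uses a strings.Reader as the source): Peek returns the upcoming bytes without consuming them, so a later Read still sees the same data.
package main

import (
    "bufio"
    "fmt"
    "strings"
)

func main() {
    r := bufio.NewReader(strings.NewReader("abcdef"))

    peeked, _ := r.Peek(3) // returns "abc" without advancing the reader
    fmt.Printf("peeked: %s\n", peeked)

    buf := make([]byte, 3)
    n, _ := r.Read(buf) // still reads "abc": Peek did not consume it
    fmt.Printf("read:   %s\n", buf[:n])
}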
This is probably not the definitive answer, but it should help.
package main

import (
    "io"
    "log"
    "time"

    "github.com/tarm/serial"
)

func main() {
    // config taken from the question
    c := &serial.Config{Name: "/dev/ttyUSB0", Baud: 57600}
    s, err := serial.OpenPort(c)
    if err != nil {
        log.Fatal(err)
    }
    pck := make([]byte, 46)
    for {
        time.Sleep(time.Second / 2)
        n, err := s.Read(pck)
        if err != nil {
            if err == io.EOF {
                break
            }
            log.Println(err)
        }
        // copy the n bytes just read so the parse goroutine does not race
        // with the next Read into the same buffer
        packet := make([]byte, n)
        copy(packet, pck[:n])
        go parse(packet) // parse is the OP's packet handler
    }
}

Related

How to make concurrent GET requests from url pool

I completed the suggested Go tour, watched some tutorials and Gopher conference talks on YouTube. And that's pretty much it.
I have a project which requires me to send GET requests and store the results in files. But the number of URLs is around 80 million.
I'm testing with 1000 URLs only.
Problem: I don't think I managed to make it concurrent, although I've followed some guidelines. I don't know what's wrong. But maybe I'm wrong and it is concurrent; it just didn't seem fast to me, the speed felt like sequential requests.
Here is the code I've written:
package main

import (
    "bufio"
    "io/ioutil"
    "log"
    "net/http"
    "os"
    "sync"
    "time"
)

var wg sync.WaitGroup // synchronization to wait for all the goroutines

func crawler(urlChannel <-chan string) {
    defer wg.Done()
    client := &http.Client{Timeout: 10 * time.Second} // single client is sufficient for multiple requests
    for urlItem := range urlChannel {
        req1, _ := http.NewRequest("GET", "http://"+urlItem, nil) // generating the request
        req1.Header.Add("User-agent", "Mozilla/5.0 (X11; Linux i586; rv:31.0) Gecko/20100101 Firefox/74.0") // changing user-agent
        resp1, respErr1 := client.Do(req1) // sending the prepared request and getting the response
        if respErr1 != nil {
            continue
        }
        defer resp1.Body.Close()
        if resp1.StatusCode/100 == 2 { // means server responded with 2xx code
            text1, readErr1 := ioutil.ReadAll(resp1.Body) // try to read the sourcecode of the website
            if readErr1 != nil {
                log.Fatal(readErr1)
            }
            f1, fileErr1 := os.Create("200/" + urlItem + ".txt") // creating the relative file
            if fileErr1 != nil {
                log.Fatal(fileErr1)
            }
            defer f1.Close()
            _, writeErr1 := f1.Write(text1) // writing the sourcecode into our file
            if writeErr1 != nil {
                log.Fatal(writeErr1)
            }
        }
    }
}

func main() {
    file, err := os.Open("urls.txt") // the file containing the url's
    if err != nil {
        log.Fatal(err)
    }
    defer file.Close() // don't forget to close the file
    urlChannel := make(chan string, 1000) // create a channel to store all the url's
    scanner := bufio.NewScanner(file)     // each line has another url
    for scanner.Scan() {
        urlChannel <- scanner.Text()
    }
    close(urlChannel)
    _ = os.Mkdir("200", 0755) // if it's there, it will create an error, and we will simply ignore it
    for i := 0; i < 10; i++ {
        wg.Add(1)
        go crawler(urlChannel)
    }
    wg.Wait()
}
My question is: why is this code not working concurrently? How can I solve the problem mentioned above? Is there something I'm doing wrong when making concurrent GET requests?
Here's some code to get you thinking. I put the URLs in the code so it is self-sufficient, but you'd probably be piping them to stdin in practice. There's a few things I'm doing here that I think are improvements, or at least worth thinking about.
Before we get started, I'll point out that I put the complete url in the input stream. For one thing, this lets me support http and https both. I don't really see the logic behind hard coding the scheme in the code rather than leaving it in the data.
First, it can handle arbitrarily sized response bodies (your version reads the body into memory, so it is limited by some number of concurrent large requests filling memory). I do this with io.Copy().
text1, readErr1 := ioutil.ReadAll(resp1.Body) reads the entire http body. If the body is large, it will take up lots of memory. io.Copy(f1,resp1.Body) would instead copy the data from the http response body directly to the file, without having to hold the whole thing in memory. It may be done in one Read/Write or many.
http.Response.Body is an io.ReadCloser because the HTTP protocol expects the body to be read progressively. http.Response does not yet have the entire body, until it is read. That's why it's not just a []byte. Writing it to the filesystem progressively while the data "streams" in from the tcp socket means that a finite amount of system resources can download an unlimited amount of data.
But there's even more benefit. io.Copy will call ReadFrom() on the file. If you look at the Linux implementation (for example): https://golang.org/src/os/readfrom_linux.go and dig a bit, you'll see it actually uses copy_file_range. That system call is cool because:
The copy_file_range() system call performs an in-kernel copy between two file descriptors without the additional cost of transferring data from the kernel to user space and then back into the kernel.
*os.File knows how to ask the kernel to deliver data directly from the tcp socket to the file without your program even having to touch it.
See https://golang.org/pkg/io/#Copy.
Second, I make sure to use all the url components in the filename. URLs with different query strings go to different files. The fragment probably doesn't differentiate response bodies, so including that in the path may be ill-considered. There's no awesome heuristic for turning URLs into valid file paths - if this were a serious task, I'd probably store the data in files named after a shasum of the url or something (see the short sketch after these notes) - and create an index of results stored in a metadata file.
Third, I handle all errors. req1, _ := http.NewRequest(...) might seem like a convenient shortcut, but what it really means is that, at best, you won't know the real cause of any errors. I usually add some descriptive text to the errors when percolating up, to make sure I can easily tell which error I'm returning.
Finally, I return successfully processed URLs so that I can see the final results. When scanning millions of URLs, you'd probably also want a list of which failed, but a count of successful ones is a good start at sending final data back for a summary.
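As an aside, here is a minimal sketch of the shasum-based naming idea mentioned above (the contentPath helper is hypothetical and not part of the program below): hash the full URL and use the digest as a fixed-length, filesystem-safe file name, keeping the URL-to-file mapping in a separate index.
package main

import (
    "crypto/sha256"
    "encoding/hex"
    "fmt"
)

// contentPath is a hypothetical helper: it maps any URL string to a
// fixed-length, filesystem-safe file name by hashing the full URL.
func contentPath(rawURL string) string {
    sum := sha256.Sum256([]byte(rawURL))
    return hex.EncodeToString(sum[:]) + ".txt"
}

func main() {
    fmt.Println(contentPath("https://farrellit.net/?3=2&#1"))
}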
package main

import (
    "bufio"
    "bytes"
    "fmt"
    "io"
    "log"
    "net/http"
    "net/url"
    "os"
    "path/filepath"
    "time"
)

const urls_text = `http://danf.us/
https://farrellit.net/?3=2&#1
`

func crawler(urls <-chan *url.URL, done chan<- int) {
    var processed int = 0
    defer func() { done <- processed }()
    client := http.Client{Timeout: 10 * time.Second}
    for u := range urls {
        if req, err := http.NewRequest("GET", u.String(), nil); err != nil {
            log.Printf("Couldn't create new request for %s: %s", u.String(), err.Error())
        } else {
            req.Header.Add("User-agent", "Mozilla/5.0 (X11; Linux i586; rv:31.0) Gecko/20100101 Firefox/74.0") // changing user-agent
            if res, err := client.Do(req); err != nil {
                log.Printf("Failed to get %s: %s", u.String(), err.Error())
            } else {
                filename := filepath.Base(u.EscapedPath())
                if filename == "/" || filename == "" {
                    filename = "response"
                } else {
                    log.Printf("URL Filename is '%s'", filename)
                }
                destpath := filepath.Join(
                    res.Status, u.Scheme, u.Hostname(), u.EscapedPath(),
                    fmt.Sprintf("?%s", u.RawQuery), fmt.Sprintf("#%s", u.Fragment), filename,
                )
                if err := os.MkdirAll(filepath.Dir(destpath), 0755); err != nil {
                    log.Printf("Couldn't create directory %s: %s", filepath.Dir(destpath), err.Error())
                } else if f, err := os.OpenFile(destpath, os.O_WRONLY|os.O_CREATE|os.O_TRUNC, 0644); err != nil {
                    log.Printf("Couldn't open destination file %s: %s", destpath, err.Error())
                } else {
                    if b, err := io.Copy(f, res.Body); err != nil {
                        log.Printf("Could not copy %s body to %s: %s", u.String(), destpath, err.Error())
                    } else {
                        log.Printf("Copied %d bytes from body of %s to %s", b, u.String(), destpath)
                        processed++
                    }
                    f.Close()
                }
                res.Body.Close()
            }
        }
    }
}

const workers = 3

func main() {
    urls := make(chan *url.URL)
    done := make(chan int)
    var submitted int = 0
    var inputted int = 0
    var successful int = 0
    for i := 0; i < workers; i++ {
        go crawler(urls, done)
    }
    sc := bufio.NewScanner(bytes.NewBufferString(urls_text))
    for sc.Scan() {
        inputted++
        if u, err := url.Parse(sc.Text()); err != nil {
            log.Printf("Could not parse %s as url: %v", sc.Text(), err)
        } else {
            submitted++
            urls <- u
        }
    }
    close(urls)
    for i := 0; i < workers; i++ {
        successful += <-done
    }
    log.Printf("%d urls input, %d could not be parsed. %d/%d valid URLs successful (%.0f%%)",
        inputted, inputted-submitted,
        successful, submitted,
        float64(successful)/float64(submitted)*100.0,
    )
}
When setting up a concurrent pipeline, a good guideline to follow is to always first set up and instantiate the listeners that will execute concurrently (in your case, crawlers), and then start feeding them data through the pipeline (in your case, the urlChannel).
In your example, the only thing preventing a deadlock is the fact that you've instantiated a buffered channel with the same number of slots as your test file has rows (1000). The code puts the URLs into urlChannel; since there are 1000 rows in your file, the channel can take all of them without blocking. If you put more URLs in the file, execution will block once urlChannel fills up, and since the crawlers are only started afterwards, nothing is draining it, so the program deadlocks.
Here is the version of the code that should work:
package main

import (
    "bufio"
    "io/ioutil"
    "log"
    "net/http"
    "os"
    "sync"
    "time"
)

func crawler(wg *sync.WaitGroup, urlChannel <-chan string) {
    defer wg.Done()
    client := &http.Client{Timeout: 10 * time.Second} // single client is sufficient for multiple requests
    for urlItem := range urlChannel {
        req1, _ := http.NewRequest("GET", "http://"+urlItem, nil) // generating the request
        req1.Header.Add("User-agent", "Mozilla/5.0 (X11; Linux i586; rv:31.0) Gecko/20100101 Firefox/74.0") // changing user-agent
        resp1, respErr1 := client.Do(req1) // sending the prepared request and getting the response
        if respErr1 != nil {
            continue
        }
        if resp1.StatusCode/100 == 2 { // means server responded with 2xx code
            text1, readErr1 := ioutil.ReadAll(resp1.Body) // try to read the sourcecode of the website
            if readErr1 != nil {
                log.Fatal(readErr1)
            }
            resp1.Body.Close()
            f1, fileErr1 := os.Create("200/" + urlItem + ".txt") // creating the relative file
            if fileErr1 != nil {
                log.Fatal(fileErr1)
            }
            _, writeErr1 := f1.Write(text1) // writing the sourcecode into our file
            if writeErr1 != nil {
                log.Fatal(writeErr1)
            }
            f1.Close()
        }
    }
}

func main() {
    var wg sync.WaitGroup
    file, err := os.Open("urls.txt") // the file containing the url's
    if err != nil {
        log.Fatal(err)
    }
    defer file.Close() // don't forget to close the file
    urlChannel := make(chan string)
    _ = os.Mkdir("200", 0755) // if it's there, it will create an error, and we will simply ignore it
    // first, initialize crawlers
    wg.Add(10)
    for i := 0; i < 10; i++ {
        go crawler(&wg, urlChannel)
    }
    // after crawlers are initialized, start feeding them data through the channel
    scanner := bufio.NewScanner(file) // each line has another url
    for scanner.Scan() {
        urlChannel <- scanner.Text()
    }
    close(urlChannel)
    wg.Wait()
}

Multiple serial requests result in empty buffer

The first TCP connection running on localhost on osx always parses the binary sent to it correctly. Subsequent requests lose the binary data, only seeing the first byte [8]. How have I failed to set up my Reader?
package main

import (
    "fmt"
    "log"
    "net"
    "os"

    "app/src/internal/handler"
    "github.com/golang-collections/collections/stack"
)

func main() {
    port := os.Getenv("SERVER_PORT")
    s := stack.New()
    ln, err := net.Listen("tcp", ":8080")
    if err != nil {
        log.Fatalf("net.Listen: %v", err)
    }
    fmt.Println("Serving on " + port)
    for {
        conn, err := ln.Accept()
        // defer conn.Close()
        if err != nil {
            log.Fatal("ln.Accept")
        }
        go handler.Handle(conn, s)
    }
}
package handler

import (
    "fmt"
    "io"
    "log"
    "net"

    "github.com/golang-collections/collections/stack"
)

func Handle(c net.Conn, s *stack.Stack) {
    fmt.Printf("Serving %s\n", c.RemoteAddr().String())
    buf := make([]byte, 0, 256)
    tmp := make([]byte, 128)
    n, err := c.Read(tmp)
    if err != nil {
        if err != io.EOF {
            log.Fatalf("connection Read() %v", err)
        }
        return
    }
    buf = append(buf, tmp[:n]...)
}
log:
Serving [::1]:51699
------------- value ---------------:QCXhoy5t
Buffer Length: 9. First Value: 8
Serving [::1]:51700
------------- value ---------------:
Buffer Length: 1. First Value: 8
Serving [::1]:51701
test sent over:
push random string:
QCXhoy5t
push random string:
GPh0EnbS
push random string:
4kJ0wN0R
The docs for Reader say:
Read reads up to len(p) bytes into p. It returns the number of bytes read (0 <= n <= len(p)) and any error encountered. Even if Read returns n < len(p), it may use all of p as scratch space during the call. If some data is available but not len(p) bytes, Read conventionally returns what is available instead of waiting for more.
So the most likely cause of your issue is that Read is returning the data available (in this case a single character). You can fix this by using ioutil.ReadAll or performing the read in a loop (the fact the data is being added to a buffer makes it look like that was the original intention) with something like:
for {
    n, err := c.Read(tmp)
    if err != nil {
        if err != io.EOF {
            // Note that data might have also been received - you should process that
            // if appropriate.
            log.Fatalf("connection Read() %v", err)
            return
        }
        break // All data received so process it
    }
    buf = append(buf, tmp[:n]...)
}
Note: There is no guarantee that any data is received; you should check the length before trying to access it (i.e. buf[0] may panic)
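For example, a minimal guard along those lines (a sketch only; the variable names follow the handler code above):
if len(buf) > 0 {
    fmt.Printf("Buffer Length: %d. First Value: %d\n", len(buf), buf[0])
} else {
    fmt.Println("no data received") // avoid indexing an empty buffer
}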

Can't figure out how to use buffers with binary web socket

Hi everyone!
I'm trying to get my Go code to work with the OpenStack serial console. It's exposed via a web socket, and I'm having problems with it.
I found the Gorilla WebSocket library (which is great) and took this example as a reference.
With a few tweaks, I now have code like this:
package main

import (
    "log"
    "net/http"
    "net/url"
    "os"
    "os/signal"
    "time"

    "github.com/gorilla/websocket"
)

func main() {
    DialSettings := &websocket.Dialer{
        Proxy:            http.ProxyFromEnvironment,
        HandshakeTimeout: 45 * time.Second,
        Subprotocols:     []string{"binary"},
        ReadBufferSize:   4096,
        WriteBufferSize:  4096,
    }
    log.SetFlags(0)
    interrupt := make(chan os.Signal, 1)
    signal.Notify(interrupt, os.Interrupt)
    u, _ := url.Parse("ws://172.17.0.64:6083/?token=d1763f2b-3466-424c-aece-6aeea2a733d5") // websocket url as it outputs from 'nova get-serial-console test' cmd
    log.Printf("connecting to %s", u.String())
    c, _, err := DialSettings.Dial(u.String(), nil)
    if err != nil {
        log.Fatal("dial:", err)
    }
    defer c.Close()
    done := make(chan struct{})
    go func() {
        defer close(done)
        for {
            _, message, err := c.ReadMessage()
            if err != nil {
                log.Println("read:", err)
                return
            }
            log.Printf("%s", message)
        }
    }()
    c.WriteMessage(websocket.TextMessage, []byte("\n")) // just to force output to console
    for {
        select {
        case <-done:
            return
        case <-interrupt:
            log.Println("interrupt")
            // Cleanly close the connection by sending a close message and then
            // waiting (with timeout) for the server to close the connection.
            err := c.WriteMessage(websocket.CloseMessage, websocket.FormatCloseMessage(websocket.CloseNormalClosure, ""))
            if err != nil {
                log.Println("write close:", err)
                return
            }
            select {
            case <-done:
            case <-time.After(time.Second):
            }
            return
        }
    }
}
And I get output like this:
connecting to ws://172.17.0.64:6083/?token=d1763f2b-3466-424c-aece-6aeea2a733d5
CentOS Linux 7
(C
ore)
K
erne
l
3.10.0-862.el7.x86_64
o
n an
x
86_64
centos
-test login:
Total mess...
I think it's because I receive just chunks of bytes with no way to delimit them. I need some buffer to store them and then do something like bufio.ReadLine. But I'm not the most experienced Go programmer, and I've run out of ideas on how to do this. In the end I just need strings to work with.
The log package writes each log message on a separate line. If the log message does not end with a newline, then the log package will add one.
These extra newlines are garbling the output. To fix the output, replace the call to log.Printf("%s", message) with a function that does not add newlines to the output. Here are some options:
Write the message to stderr (same destination as default log package config):
os.Stderr.Write(message)
Write the message to stdout (a more conventional location to write program output):
os.Stdout.Write(message)
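Applied to the read goroutine from the question, that change would look roughly like this (a sketch; it assumes the same imports as the code above, and only the Printf line differs from the original):
go func() {
    defer close(done)
    for {
        _, message, err := c.ReadMessage()
        if err != nil {
            log.Println("read:", err)
            return
        }
        // Write the raw bytes as-is; unlike log.Printf, this adds no
        // extra newline, so the console output is not broken into pieces.
        os.Stdout.Write(message)
    }
}()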

concurrency | goroutines | golang | buffered reader

I believe I am either misunderstanding how goroutines work, how buffered readers work, or both.
I expect asynchronous execution of the goroutine (a buffered reader with a for loop reading the buffer, waiting for a message from the server).
In METHOD A I call go xyz() before the client dials the server, so xyz() creates the buffer and starts reading in the background. Then the client dials the server; the server sends a message back; the client is reading the buffer, so it gets the message and prints it to the console.
What actually happens: the client sends the message to the server but does not pick up anything on the buffer while reading for a possible reply from the server. So it is running concurrently, because I know the for loop has not stopped, yet it lets the next line of code execute (the client sending the message to the server).
But in METHOD B I call xyz() NOT concurrently, after the client dials the server, and everything works as expected: the client gets the message back from the server and prints it to the console.
For METHOD A, we have the order:
///////////// steps 1 and 2 are in the goroutine called by go xyz()
1. creates the buffered reader
2. for loop -- reading the buffer for a message from the server -- print out
3. client dials the server
4. go xyz(conn, p)
5. fmt.Fprintf(conn, "Give me a hash to work on ...")
For METHOD B, we have the order:
///////////// steps 2 and 3 are in the goroutine called by xyz()
1. client dials the server
2. creates the buffered reader
3. for loop -- reading the buffer for a message from the server -- print out
4. fmt.Fprintf(conn, "Give me a hash to work on ...")
5. xyz(conn, p)
client.go
package main

import (
    "bufio"
    "fmt"
    "net"
)

func xyz(conn net.Conn, p []byte) {
    rd := bufio.NewReader(conn)
    for {
        _, err := rd.Read(p)
        if err == nil {
            fmt.Printf("SERVER : %s\n", p)
        } else {
            fmt.Printf("Some error %v\n", err)
        }
    }
}

func main() {
    p := make([]byte, 2048)
    conn, err := net.Dial("udp", "127.0.0.1:1234")
    if err != nil {
        fmt.Printf("Some error %v", err)
        return
    }
    go xyz(conn, p)
    fmt.Fprintf(conn, "Give me a hash to work on ...")
}
server.go
package main

import (
    "fmt"
    "net"
)

func sendResponse(conn *net.UDPConn, addr *net.UDPAddr, hash string) {
    _, err := conn.WriteToUDP([]byte("Hello, here is the hash - " + hash), addr)
    if err != nil {
        fmt.Printf("Couldn't send response %v", err)
    }
}

func main() {
    hash := "36"
    p := make([]byte, 2048)
    addr := net.UDPAddr{
        Port: 1234,
        IP:   net.ParseIP("127.0.0.1"),
    }
    ser, err := net.ListenUDP("udp", &addr)
    if err != nil {
        fmt.Printf("Some error %v\n", err)
        return
    }
    for {
        _, remoteaddr, err := ser.ReadFromUDP(p)
        fmt.Printf("CLIENT : %v : %s\n", remoteaddr, p)
        if err != nil {
            fmt.Printf("Some error %v", err)
            continue
        }
        go sendResponse(ser, remoteaddr, hash)
    }
}
The Go Programming Language Specification
Go statements
A "go" statement starts the execution of a function call as an
independent concurrent thread of control, or goroutine, within the
same address space.
... unlike with a regular call, program execution does not wait for
the invoked function to complete.
client.go starts the goroutine xyz and then keeps going to the end of the main function which terminates the program. The program doesn't wait for the xyz goroutine to run or finish.
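One way to observe the reply is to make main wait for the reading goroutine before returning. Here is a minimal sketch of that idea based on the question's client (it waits for a single reply via a done channel; a sync.WaitGroup, or simply reading in main, would also work):
package main

import (
    "bufio"
    "fmt"
    "net"
)

// xyz prints the first reply from the server and then closes done,
// so main knows a response was handled before it exits.
func xyz(conn net.Conn, p []byte, done chan<- struct{}) {
    rd := bufio.NewReader(conn)
    n, err := rd.Read(p)
    if err != nil {
        fmt.Printf("Some error %v\n", err)
    } else {
        fmt.Printf("SERVER : %s\n", p[:n])
    }
    close(done)
}

func main() {
    p := make([]byte, 2048)
    conn, err := net.Dial("udp", "127.0.0.1:1234")
    if err != nil {
        fmt.Printf("Some error %v", err)
        return
    }
    done := make(chan struct{})
    go xyz(conn, p, done)
    fmt.Fprintf(conn, "Give me a hash to work on ...")
    <-done // block until the goroutine has handled a reply
}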

Go: zlib uncompressing a slice of bytes

I am trying to parse a file that annoyingly consists of many separately zipped segments. I have parsed these segments one at a time into a slice of bytes, and I want to uncompress them as I go.
Here is my current code that does the decompressing, which doesn't work. from and to are just set at the top as an example; in reality they are set by the code. data is the byte slice containing the entire file. I don't want to seek the file while it's on disk because it's located on another server, so it's only realistic for me to load the entire file into a []byte first and then parse it.
from, to := 0, 1000;
b := bytes.NewReader(data[from:from+to])
z, err := zlib.NewReader(b)
CheckErr(err)
defer z.Close()
p := make([]byte,0,1024)
z.Read(p)
fmt.Println(string(p))
So how is it so massively difficult just to unzip a slice of bytes? Anyway...
The problem appears to be with how I am reading it out. Where it says z.Read, that doesn't seem to do anything.
How can I read the entire thing in one go into a slice of bytes?
Here's an outline for you. Note: In Go, CHECK FOR ERRORS!
package main

import (
    "bytes"
    "compress/zlib"
    "fmt"
    "io/ioutil"
)

func readSegment(data []byte, from, to int) ([]byte, error) {
    b := bytes.NewReader(data[from : from+to])
    z, err := zlib.NewReader(b)
    if err != nil {
        return nil, err
    }
    defer z.Close()
    p, err := ioutil.ReadAll(z)
    if err != nil {
        return nil, err
    }
    return p, nil
}

func main() {
    from, to := 0, 1000
    data := make([]byte, from+to)
    // ** parse input segments into data **
    p, err := readSegment(data, from, to)
    if err != nil {
        fmt.Println(err)
        return
    }
    fmt.Println(string(p))
}
Use ReadAll(r io.Reader) ([]byte, error) from the io/ioutil package on the zlib reader:
p, err := ioutil.ReadAll(z) // read the decompressed output, not the raw bytes in b
fmt.Println(string(p))
Read only reads up to the length of the given slice. Note that in your case p := make([]byte, 0, 1024) has length 0 (and capacity 1024), so z.Read(p) reads nothing; use make([]byte, 1024) if you want to read up to 1024 bytes at a time.
To read in chunks of 1024 bytes:
p := make([]byte, 1024)
for {
    numBytes, err := z.Read(p)
    // do what you want with p[:numBytes]
    if err == io.EOF {
        // you are done; the final chunk may be shorter than len(p)
        break
    }
    if err != nil {
        // handle any other read error
        break
    }
}
If you are getting the data from a webserver, you might even do
import (
    "compress/zlib"
    "io/ioutil"
    "net/http"
)
...
resp, errGet := http.Get("http://example.com/somefile")
// do error handling
defer resp.Body.Close()
z, errZ := zlib.NewReader(resp.Body)
// do error handling
p, err := ioutil.ReadAll(z)
// do error handling
since resp.Body happens to be an io.Reader, like most io-related types.
