I want to use inotify to watch for file changes and epoll to monitor whether any inotify event has occurred. However, I'm having trouble receiving inotify events. Specifically, how do I know whether I've read every event that occurred?
If the inotify fd is set to non-blocking mode, read() returns EAGAIN when there's nothing left to read, which is nice. But if the inotify fd is in blocking mode, read() blocks indefinitely.
For example, if you
call read(2) by asking to read a certain amount of data and
read(2) returns a lower number of bytes, you can be sure of
having exhausted the read I/O space for the file descriptor.
Ref: epoll(7)
According to epoll(7), I can be sure I've exhausted the fd if read() returns fewer bytes than requested. But how should I deal with read() returning exactly the requested number of bytes?
Here's the code I've tried.
func main() {
	var err error
	fd, _ := unix.InotifyInit()
	epollfd, _ := unix.EpollCreate(1)
	unix.EpollCtl(epollfd, unix.EPOLL_CTL_ADD, fd, &unix.EpollEvent{
		Fd:     int32(fd),
		Events: unix.EPOLLIN,
	})
	unix.InotifyAddWatch(fd, os.Args[1], unix.IN_ALL_EVENTS)
	epollevents := make([]unix.EpollEvent, 8)
	for {
		// epoll wait
		var nepoll int
		if nepoll, err = unix.EpollWait(epollfd, epollevents, -1); err == nil {
			// nothing happens
		} else if errors.Is(err, unix.EINTR) {
			continue // ignore interrupts
		} else {
			log.Fatal(err.Error())
		}
		// process inotify events
		for i := 0; i < nepoll; i++ {
			// buff receives raw bytes from read();
			// eventBuff accumulates them for parsing,
			// so a burst of data larger than one read can be handled
			eventBuff := make([]byte, (unix.SizeofInotifyEvent+unix.PathMax)*2)
			buff := make([]byte, unix.SizeofInotifyEvent+unix.PathMax)
			offset, nr, lastUnread := 0, 0, 0
			for {
				// read loop in case of a large amount of data
				if nr, err = unix.Read(int(epollevents[i].Fd), buff); err == nil {
					// nothing happens
				} else if errors.Is(err, unix.EAGAIN) {
					println("exhaust")
					break
				} else {
					log.Fatal(err.Error())
				}
				copy(eventBuff[lastUnread:lastUnread+nr], buff[:nr])
				lastUnread += nr
				for offset < lastUnread &&
					(nr < len(buff) || lastUnread-offset >= unix.PathMax+unix.SizeofInotifyEvent) {
					event := (*unix.InotifyEvent)(unsafe.Pointer(&eventBuff[offset]))
					switch event.Mask {
					case unix.IN_ACCESS:
						// ...
					case unix.IN_OPEN:
						// ...
					case unix.IN_CLOSE_WRITE:
						// ...
					case unix.IN_CLOSE_NOWRITE:
						// ...
					}
					offset += unix.SizeofInotifyEvent + int(event.Len)
				}
				// move the unread tail to the front of eventBuff
				copy(eventBuff[:lastUnread-offset], eventBuff[offset:lastUnread])
				lastUnread -= offset
				offset = 0
			}
		}
		fmt.Println("Over") // NEVER gets printed in blocking mode
	}
}
I used to do it like:
...
ws, err := websocket.Dial(url, "", origin)
...
var buffer = make([]byte, 512)
var rs = make([]byte, 0, 512)
L:
for {
	m, err := ws.Read(buffer)
	if err != nil {
		if err == io.EOF {
			break L
		}
		fmt.Println(err.Error())
		return
	}
	rs = append(rs, buffer[:m]...)
	if m < 512 {
		break L
	}
}
This has a bug: if the message's length is exactly 512 or 1024 or 2048 bytes, the loop never breaks; it gets stuck at ws.Read(), waiting without returning io.EOF.
Afterwards I observed that ws.Len() is always longer than the message's length by 4.
I rewrote the code as:
var buffer = make([]byte, 512)
var rs = make([]byte, 0, 512)
var sum = 0
L:
for {
	m, err := ws.Read(buffer)
	if err != nil {
		if err == io.EOF {
			break L
		}
		fmt.Println(err.Error())
		return
	}
	rs = append(rs, buffer[:m]...)
	sum += m
	if sum >= ws.Len()-4 {
		break L
	}
}
This way works.
But the number 4 is a magic number.
Is there a way to find the message's real length?
Some friends suggest splitting and reassembling the message packets myself, but I think a WebSocket library should handle framing rather than leave it to the caller.
What is the proper way for a WebSocket client to read a message?
It looks like you are using the golang.org/x/net/websocket package. It's not possible to reliably detect message boundaries using that package's Read method.
To fix, use websocket.Message to read messages.
var msg string
err := websocket.Message.Receive(ws, &msg)
if err != nil {
	// handle error
}
// msg is the message
Note that the golang.org/x/net/websocket documentation says:
This package currently lacks some features found in an alternative and more actively maintained WebSocket package:
https://godoc.org/github.com/gorilla/websocket
The Gorilla documentation and examples show how to read messages.
I have a goroutine that is constantly blocked reading the stdin, like this:
func routine() {
	for {
		data := make([]byte, 8)
		os.Stdin.Read(data)
		otherChannel <- data
	}
}
The routine waits to read 8 bytes via stdin and feeds another channel.
I want to gracefully stop this goroutine from the main thread. However, since the goroutine will almost always be blocked reading from stdin, I can't find a good solution to force it to stop. I thought about something like:
func routine(stopChannel chan struct{}) {
	for {
		select {
		case <-stopChannel:
			return
		default:
			data := make([]byte, 8)
			os.Stdin.Read(data)
			otherChannel <- data
		}
	}
}
However, the problem is that if there is no more input in the stdin when the stopChannel is closed, the goroutine will stay blocked and not return.
Is there a good approach to make it return immediately when the main thread wants?
Thanks.
To detect that os.Stdin has been closed: check the error value returned by os.Stdin.Read().
One extra point: although you state that in your case you will always receive 8-byte chunks, you should still check that you indeed received 8 bytes of data.
func routine() {
	for {
		data := make([]byte, 8)
		n, err := os.Stdin.Read(data)
		// error handling: the basic thing to do is "on error, return"
		if err != nil {
			// if os.Stdin got closed, .Read() will return io.EOF
			if err == io.EOF {
				log.Printf("stdin closed, exiting")
			} else {
				log.Printf("stdin: %s", err)
			}
			return
		}
		// check that 'n' is big enough:
		if n != 8 {
			log.Printf("short read: only %d bytes, exiting", n)
			return // instead of returning, you may want to keep '.Read()'ing,
			// or you may use 'io.ReadFull(os.Stdin, data)' instead of '.Read()'
		}
		// a habit to have: truncate your read buffers to 'n' after a .Read()
		otherChannel <- data[:n]
	}
}
When a buffered writer (bufio.Writer) is used and an error occurs, how can I retry?
For example, say I've written 4096 bytes using Write() and an error occurs when the bufio.Writer automatically flushes the data. Then I want to retry writing those 4096 bytes; how can I do it?
It seems I must keep a 4096-byte buffer myself to perform the retry. Otherwise I'm not able to get at the data that failed to be flushed.
Any suggestions?
You'll have to use a custom io.Writer that keeps a copy of all data, so that it can be re-used in case of a retry.
This functionality is not part of the standard library, but shouldn't be hard to implement yourself.
When bufio.Writer fails on a Write(...), it returns the number of bytes written (n) and the reason why (err).
What you could do is the following. (Note I haven't tried this yet, so it may be a little wrong and could use some cleaning up.)
func writeSomething(data []byte, w *bufio.Writer) (err error) {
	var pos, written int
	for pos != len(data) {
		written, err = w.Write(data[pos:])
		if err != nil {
			if err == io.ErrShortWrite {
				pos += written // write was short; update pos and keep going
				continue
			} else if netErr, ok := err.(net.Error); ok && netErr.Temporary() {
				continue // temporary error; don't update pos so it will try writing again
			} else {
				return err // unrecoverable error, bail
			}
		}
		pos += written
	}
	return nil
}
I have a Go server that takes input from a number of TCP clients that stream in data. The format is a custom format and the end delimiter could appear within the byte stream so it uses bytes stuffing to get around this issue.
I am looking for hotspots in my code and this throws up a HUGE one, and I'm sure it could be made more efficient, but I'm not quite sure how at the moment given the provided Go functions.
The code is below and pprof shows the hotspot to be the popPacketFromBuffer function. It looks at the current buffer after each byte has been received and searches for the endDelimiter on its own; if there are two delimiters in a row, the delimiter is escaped data within the packet itself.
I did look at using ReadBytes() instead of ReadByte(), but it looks like I need to specify a delimiter, and I'm worried that this could cut off a packet mid-stream. And in any case, would it be more efficient than what I am doing now?
Within the popPacketFromBuffer function it is the for loop that is the hotspot.
Any ideas?
// Read client data from channel
func (c *Client) listen() {
	reader := bufio.NewReader(c.conn)
	clientBuffer := new(bytes.Buffer)
	for {
		c.conn.SetDeadline(time.Now().Add(c.timeoutDuration))
		b, err := reader.ReadByte()
		if err != nil {
			c.conn.Close()
			c.server.onClientConnectionClosed(c, err)
			return
		}
		wrErr := clientBuffer.WriteByte(b)
		if wrErr != nil {
			log.Println("Write Error:", wrErr)
		}
		packet := popPacketFromBuffer(clientBuffer)
		if packet != nil {
			c.receiveMutex.Lock()
			packetSize := uint64(len(packet))
			c.bytesReceived += packetSize
			c.receiveMutex.Unlock()
			packetBuffer := bytes.NewBuffer(packet)
			b, err := uncompress(packetBuffer.Bytes())
			if err != nil {
				log.Println("Unzip Error:", err)
			} else {
				c.server.onNewMessage(c, b)
			}
		}
	}
}
func popPacketFromBuffer(buffer *bytes.Buffer) []byte {
	bufferLength := buffer.Len()
	if bufferLength >= 125000 { // 1MB in bytes is roughly this
		log.Println("Buffer is too large ", bufferLength)
		buffer.Reset()
		return nil
	}
	tempBuffer := buffer.Bytes()
	length := len(tempBuffer)
	// Return on zero length buffer submission
	if length == 0 {
		return nil
	}
	endOfPacket := -1
	// Determine the endOfPacket position by looking for an instance of our delimiter
	for i := 0; i < length-1; i++ {
		if tempBuffer[i] == endDelimiter {
			if tempBuffer[i+1] == endDelimiter {
				i++
			} else {
				// We found a single delimiter, so consider this the end of a packet
				endOfPacket = i - 2
				break
			}
		}
	}
	if endOfPacket != -1 {
		// Grab the contents of the provided packet
		extractedPacket := buffer.Bytes()
		// Extract the last byte as we were super greedy with the read operation to check for stuffing
		carryByte := extractedPacket[len(extractedPacket)-1]
		// Clear the main buffer now we have extracted a packet from it
		buffer.Reset()
		// Add the carryByte over to our new buffer
		buffer.WriteByte(carryByte)
		// Ensure packet begins with a valid startDelimiter
		if extractedPacket[0] != startDelimiter {
			log.Println("Popped a packet without a valid start delimiter")
			return nil
		}
		// Remove the start and end caps
		slice := extractedPacket[1 : len(extractedPacket)-2]
		return deStuffPacket(slice)
	}
	return nil
}
It looks like you call popPacketFromBuffer() every time a single byte is received from the connection. However, popPacketFromBuffer() copies the whole buffer and inspects every byte for delimiters each time, which may be what is overwhelming the profiler. It seems to me you don't need the loop
for i := 0; i < length-1; i++ {
	if tempBuffer[i] == endDelimiter {
		if tempBuffer[i+1] == endDelimiter {
			i++
		} else {
			// We found a single delimiter, so consider this the end of a packet
			endOfPacket = i - 2
			break
		}
	}
}
in popPacketFromBuffer(). Maybe, instead of the loop, just testing the last two bytes
if (buffer[len(buffer)-2] == endDelimiter) && (buffer[len(buffer)-1] != endDelimiter) {
	// It's a packet
}
would be enough for the purpose.
I'm reading values from a channel in a loop like this:
for {
	capturedFrame := <-capturedFrameChan
	remoteCopy(capturedFrame)
}
To make it more efficient, I would like to read these values in a batch, with something like this (pseudo-code):
for {
	capturedFrames := <-capturedFrameChan
	multipleRemoteCopy(capturedFrames)
}
But I'm not sure how to do that. If I call capturedFrames := <-capturedFrameChan multiple times it's going to block.
Basically, what I would like is to read all the available values in captureFrameChan and, if none is available, it blocks as usual.
What would be the way to accomplish this in Go?
Something like this should work:
for {
	// we initialize our slice. You may want to add a larger cap to avoid multiple memory allocations on `append`
	capturedFrames := make([]Frame, 1)
	// We block waiting for a first frame
	capturedFrames[0] = <-capturedFrameChan
forLoop:
	for {
		select {
		case buf := <-capturedFrameChan:
			// if there are more frames immediately available, we add them to our slice
			capturedFrames = append(capturedFrames, buf)
		default:
			// else we move on without blocking
			break forLoop
		}
	}
	multipleRemoteCopy(capturedFrames)
}
Try this (for channel ch with type T):
for firstItem := range ch { // ensures that no batch is empty
	var itemsBatch []T
	itemsBatch = append(itemsBatch, firstItem)
Remaining:
	for len(itemsBatch) < BATCHSIZE { // caps the size of a batch
		select {
		case item := <-ch:
			itemsBatch = append(itemsBatch, item)
		default:
			break Remaining
		}
	}
	// Consume itemsBatch here...
}
But, if BATCHSIZE is constant, this code would be more efficient:
var i int
itemsBatch := [BATCHSIZE]T{}
for firstItem := range ch { // ensures that no batch is empty
	itemsBatch[0] = firstItem
Remaining:
	for i = 1; i < BATCHSIZE; i++ { // caps the size of a batch
		select {
		case itemsBatch[i] = <-ch:
		default:
			break Remaining
		}
	}
	// Now you have itemsBatch with length i <= BATCHSIZE;
	// consume that here...
}
By using len(capturedFrames), you can do it like below:
for {
	select {
	case frame := <-capturedFrames:
		frames := []Frame{frame}
		n := len(capturedFrames) // snapshot once: each receive below shrinks the length
		for i := 0; i < n; i++ {
			frames = append(frames, <-capturedFrames)
		}
		multipleRemoteCopy(frames)
	}
}
It seems you could also just benchmark
for {
	capturedFrame := <-capturedFrameChan
	go remoteCopy(capturedFrame)
}
without any codebase refactoring, to see if it increases efficiency.
I've ended up doing it as below. Basically I've used len(capturedFrames) to know how many frames are available, then retrieved them in a loop:
for {
	var paths []string
	itemCount := len(capturedFrames)
	if itemCount <= 0 {
		time.Sleep(50 * time.Millisecond)
		continue
	}
	for i := 0; i < itemCount; i++ {
		f := <-capturedFrames
		paths = append(paths, f)
	}
	err := multipleRemoteCopy(paths, opts)
	if err != nil {
		fmt.Printf("Error: could not remote copy \"%s\": %s", paths, err)
	}
}