A long time ago I built a website with Go, and now I want to track online users. I want to do this without Redis, working with a session ID.
What is the best way to do this?
I wrote a global handler:
type Tracker struct {
    http.Handler
}

// NewTracker wraps an existing handler so every request passes through it.
func NewTracker(handler http.Handler) *Tracker {
    return &Tracker{Handler: handler}
}

func (h *Tracker) ServeHTTP(w http.ResponseWriter, r *http.Request) {
    log.Println(r.RemoteAddr)
    h.Handler.ServeHTTP(w, r)
}
...
srv := &http.Server{
    Handler:      NewTracker(e),
    Addr:         "127.0.0.1" + port, // port should include the leading colon, e.g. ":8080"
    WriteTimeout: 15 * time.Second,
    ReadTimeout:  15 * time.Second,
}
log.Fatal(srv.ListenAndServe())
I think one thing I could do is: add a session ID on the client, save it in a map on the server, and count online users from there.
Is that a good and correct way?
A global handler, middleware (if you're using a router package, look at what it offers), or just calling a stats function on popular pages would be enough. Be careful to exclude bots, RSS hits, and other traffic you don't care about.
Assuming you have one process and want to track users seen online in the last 5 minutes or so, then yes, a server-side map would be fine. You can issue tokens (depends on the user allowing cookies, and takes bandwidth on each request) or just hash the IP (works pretty well, with potential for slight undercounting). You then need to expire entries after some interval, and use a mutex to protect the map. On restart you lose the count, and if you run two processes you can't do this at all; that is the downside of in-memory storage, and you'd need a separate caching process to persist. So this is not suitable for large sites, but you could easily move to a more persistent store later.
var PurgeInterval = 5 * time.Minute

var identifiers = make(map[string]time.Time)
var mu sync.RWMutex

...

// Hash ip + ua for anonymity in our store
hasher := sha256.New()
hasher.Write([]byte(ip))
hasher.Write([]byte(ua))
id := base64.URLEncoding.EncodeToString(hasher.Sum(nil))

// Insert the entry with the current time
mu.Lock()
identifiers[id] = time.Now()
mu.Unlock()

...

// Clear the cache at intervals; compute the cutoff once, outside the loop
purgeTime := time.Now().Add(-PurgeInterval)
mu.Lock()
for k, v := range identifiers {
    if v.Before(purgeTime) {
        delete(identifiers, k)
    }
}
mu.Unlock()
Something like that.
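If you then want to read the current count (the original goal of tracking online users), a read-locked helper over the same map works. A minimal sketch assuming the variables above; CountOnline is my own name, not from the snippet:

// CountOnline returns the number of identifiers seen within PurgeInterval.
// An RLock suffices because we only read the map here.
func CountOnline() int {
    mu.RLock()
    defer mu.RUnlock()
    cutoff := time.Now().Add(-PurgeInterval)
    count := 0
    for _, seen := range identifiers {
        if seen.After(cutoff) {
            count++
        }
    }
    return count
}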
Related
I am having some problems with concurrent HTTP connections in Go. Kindly read the whole question; as the actual code is quite long, I am using pseudocode.
In short, I have to create a single API which internally calls 5 other APIs, unifies their responses, and sends them back as a single response.
I am using goroutines to call those 5 internal APIs, along with a timeout, and channels to ensure that every goroutine has completed; then I unify their responses and return the result.
Things go fine when I do local testing: my response time is around 300 ms, which is pretty good.
The problem arises when I run a Locust load test with 200 users: my response time goes as high as 7-8 seconds. I think it has to do with the HTTP client waiting for resources, since we are running a high number of goroutines.
For example, 1 API call spins up 5 goroutines, so if each of the 200 users makes requests at, say, 5 req/sec, the total number of goroutines gets much higher. Again, this is only my assumption.
P.S. Normally the API I am building has a pretty good response time; I am using caching and so on, and any response greater than 400 ms should not be the case.
So can anyone please tell me how I can tackle this problem of response time increasing as the number of concurrent users increases?
Locust test report
Pseudocode:

Simple route:

group.POST("/test", controller.testHandler)
Controller:

type Worker struct {
    NumWorker int
    Data      chan structures.Placement
}

e := Worker{
    NumWorker: 5,                                  // number of worker goroutines
    Data:      make(chan structures.Placement, 5), // buffer size
}

// spawn the worker goroutines
for i := 0; i < e.NumWorker; i++ {
    wg.Add(1)
    go ad.GetResponses(params, resChan, &wg) // makes the HTTP call and sends the response on the channel
}

// unify all the responses and return the result as our response
for v := range resChan {
    switch v.Type {
    case A:
        finalResponse.A = v
    case B:
        finalResponse.B = v
    }
}
return finalResponse
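One subtlety this pseudocode glosses over: for v := range resChan only terminates once the channel is closed, so the usual shape is to close it after all the workers are done. A minimal, self-contained sketch of that pattern (Result, callAPI, and gather are placeholder names, not from the original code):

package main

import "sync"

type Result struct{ Body string }

// callAPI stands in for the real HTTP call each worker makes.
func callAPI() Result { return Result{} }

func gather(numWorkers int) []Result {
    var wg sync.WaitGroup
    resChan := make(chan Result, numWorkers)

    for i := 0; i < numWorkers; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            resChan <- callAPI()
        }()
    }

    // Close the channel once every worker has sent its result,
    // so the range loop below can terminate.
    go func() {
        wg.Wait()
        close(resChan)
    }()

    var out []Result
    for v := range resChan {
        out = append(out, v)
    }
    return out
}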
Request HTTP client:

// I am using a global HTTP client with a custom transport so that I can use resources effectively.
var client *http.Client

func init() {
    tr := &http.Transport{
        MaxIdleConns:        100,
        MaxConnsPerHost:     100,
        MaxIdleConnsPerHost: 100,
        TLSHandshakeTimeout: 0, // no TLS handshake timeout
    }
    client = &http.Client{Transport: tr, Timeout: 10 * time.Second}
}
func GetResponses(params Params, resChan chan<- *http.Response, wg *sync.WaitGroup) {
    defer wg.Done()
    res, _ := client.Do(req)
    resChan <- res
}
So I did some debugging and span monitoring, and it turns out Redis was the culprit. See https://stackoverflow.com/a/70902382/9928176 to get an idea of how I solved it.
In a Go web server I want to measure the time taken by an HTTP controller. I call time.Now() before invoking the controller function and time.Since() after it returns. But if the controller makes a long remote I/O request that takes 1 second, or the process is throttled, or the controller is parallelized with goroutines, then that time is not exactly what I want.
If we assume an analogy to the bash time command, then this technique gives me the real time:
time go build
real 0m5,204s
user 0m12,012s
sys 0m2,043s
How can I measure user and sys times for a function run (preferably for a goroutine plus its forked children) in a Go program (preferably with standard packages)?
This is my profiler implementation. How can I extend it with sys and user time per goroutine?
const HeaderCost = "Cost"

// Timed middleware will set the Cost header in the http response.
func Timed(h http.Handler) http.HandlerFunc {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        h.ServeHTTP(&responseWriterWithTimer{
            ResponseWriter: w,
            headerWritten:  false,
            startedAt:      time.Now(),
        }, r)
    })
}

type responseWriterWithTimer struct {
    http.ResponseWriter
    headerWritten bool
    startedAt     time.Time
}

func (w *responseWriterWithTimer) WriteHeader(statusCode int) {
    w.Header().Set(
        HeaderCost,
        strconv.FormatFloat(
            time.Since(w.startedAt).Seconds(),
            'g', -1, 64, // shortest representation of the float64
        ),
    )
    w.ResponseWriter.WriteHeader(statusCode)
    w.headerWritten = true
}

func (w *responseWriterWithTimer) Write(b []byte) (int, error) {
    if !w.headerWritten {
        w.WriteHeader(http.StatusOK)
    }
    return w.ResponseWriter.Write(b)
}
If you want to do basic instrumentation at runtime, you can wrap your handlers to measure their execution time:
func perfMiddleware(h http.Handler) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        t := time.Now()
        h.ServeHTTP(w, r)
        log.Printf("handler took %s", time.Since(t))
    })
}
You could expose this more easily using expvar. Going beyond this, there are also numerous instrumentation/telemetry/APM libraries available for Go if you look for them, along with metrics management solutions like the TICK stack, Datadog, and so on.
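For instance, a rough sketch of the expvar route might look like this (the counter name handler_nanos_total and the scheme are my own, not a library convention):

package main

import (
    "expvar"
    "net/http"
    "time"
)

// Cumulative handler time in nanoseconds; expvar publishes it
// automatically at /debug/vars on the default mux.
var handlerNanos = expvar.NewInt("handler_nanos_total")

func expvarMiddleware(h http.Handler) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        t := time.Now()
        h.ServeHTTP(w, r)
        handlerNanos.Add(time.Since(t).Nanoseconds())
    })
}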
As for the real, user, and sys data output by time, these are POSIX measures that don't cleanly apply to instrumenting a Go HTTP handler (or any other unit of code at runtime), for a number of reasons:
- Goroutines have no parent/child relationship; all are equal peers, so there is no metric of the time taken by "children" of your handler.
- Most of the I/O is handled within the stdlib, which isn't instrumented to this level (and instrumentation at this level would have a non-negligible performance impact of its own).
You can of course instrument each piece individually, which is often more useful; for example, instrument your HTTP handlers, as well as any code that is making its own external requests, in order to measure the performance of each component. From this you can analyze the data and get a much clearer picture of what is taking time, in order to address any performance issues you find.
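As a concrete example of instrumenting one piece individually, an outbound call can be timed separately from the handler that makes it (a sketch; timedGet is a hypothetical helper, not a stdlib function):

import (
    "log"
    "net/http"
    "time"
)

// timedGet measures just the external request, independent of
// whatever handler happens to call it.
func timedGet(client *http.Client, url string) (*http.Response, error) {
    t := time.Now()
    resp, err := client.Get(url)
    log.Printf("external GET %s took %s", url, time.Since(t))
    return resp, err
}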
If you want to measure something in isolation, benchmarks are probably exactly what you're after.
If you're trying to measure a http.Handler, you can use httptest.NewRecorder and httptest.NewRequest to create a new response writer and request object and just invoke the handler directly inside your benchmark.
func BenchmarkHTTPHandler(b *testing.B) {
    req := httptest.NewRequest("GET", "/foo", nil)
    myHandler := thingtotest.Handler{}
    for n := 0; n < b.N; n++ {
        myHandler.ServeHTTP(httptest.NewRecorder(), req)
    }
}
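Assuming the benchmark lives in a _test.go file, you would run it with something like go test -bench=. (add -benchmem to also see allocations) to get per-iteration timings for the handler.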
How can I measure user and sys times for a function run
You cannot. That distinction is not observable for Go functions.
(But honestly: measuring them is of no real use and doesn't make much sense. This sounds like an XY problem.)
I already created a function for limiting API logins to 50 requests per day.

package middleware

import (
    "log"
    "net"
    "net/http"
    "sync"
    "time"

    "golang.org/x/time/rate"
)

var limit = 50
// Create a custom request struct which holds the rate limiter for each
// visitor and the last time the request was seen.
type request struct {
    limiter  *rate.Limiter
    lastSeen time.Time
}

// Change the map to hold values of the type request.
// defaultTime using 3 minutes
var requests = make(map[string]*request)
var mu sync.Mutex

func getRequest(ip string, limit int) *rate.Limiter {
    mu.Lock()
    defer mu.Unlock()
    v, exists := requests[ip]
    if !exists {
        limiter := rate.NewLimiter(1, limit)
        requests[ip] = &request{limiter, time.Now()}
        return limiter
    }
    // Update the last seen time for the visitor.
    v.lastSeen = time.Now()
    return v.limiter
}
func throttle(next http.Handler, limit int) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        ip, _, err := net.SplitHostPort(r.RemoteAddr)
        if err != nil {
            log.Println(err.Error())
            http.Error(w, "Internal Server Error", http.StatusInternalServerError)
            return
        }
        limiter := getRequest(ip, limit)
        // Call Allow() only once per request; every call consumes a token.
        if !limiter.Allow() {
            http.Error(w, http.StatusText(http.StatusTooManyRequests), http.StatusTooManyRequests)
            return
        }
        next.ServeHTTP(w, r)
    })
}
Is it correct? Because when I try it, requests still pass; the limit is not enforced. I have my doubts about NewLimiter():

limiter := rate.NewLimiter(1, limit)

Does it mean one user can only make 50 login requests per day? (I already read the docs, but I do not understand.)
From the rate docs:
func NewLimiter(r Limit, b int) *Limiter
NewLimiter returns a new Limiter that allows events up to rate r and
permits bursts of at most b tokens.
So the first parameter is the rate limit, not the second. Burst is the number of requests you want to allow that occur faster than the rate limit permits: typically one uses a value of 1 to disallow bursting; anything higher will let that many requests in before the regular rate limit kicks in. Anyway...
To create the rate.Limit for your needs, you can use the helper function rate.Every():

rt := rate.Every(24 * time.Hour / 50)
limiter := rate.NewLimiter(rt, 1)
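Wired into the getRequest function from the question, that might look like the following (a sketch based on the snippet above; the burst of 1 disallows bursting, per the earlier note):

func getRequest(ip string, limit int) *rate.Limiter {
    mu.Lock()
    defer mu.Unlock()
    v, exists := requests[ip]
    if !exists {
        // One token every 24h/limit (about 29 minutes for limit = 50).
        rt := rate.Every(24 * time.Hour / time.Duration(limit))
        limiter := rate.NewLimiter(rt, 1)
        requests[ip] = &request{limiter, time.Now()}
        return limiter
    }
    v.lastSeen = time.Now()
    return v.limiter
}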
NewLimiter(1, 50) means 1 request/second with a burst of up to 50 requests. It's a token bucket: there are 50 tokens, each accepted API call uses up one token, and the tokens are regenerated at the given rate, up to the burst size. Your code is creating a limiter per IP address, so the limit is per IP address (which I guess you are approximating as one IP address being one user).
If you're running on a single persistent server, and the server and code never restart, then you may be able to get something like 50 requests/day per user by specifying a rate of 50 / (3600*24) and a burst of 50. (Note: 3600*24 is the number of seconds in a day.) But the rate-limiting package you're using is not designed for such coarse rate limiting (on the order of requests per day); it's designed to prevent server overload under heavy traffic in the short term (on the order of requests per second).
You probably want a rate limiter that works with a database or similar (perhaps using a token bucket scheme, since that can be implemented efficiently). There's probably a package somewhere for that, but I don't know of one off the top of my head.
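If you do go the persistent route, even a simple fixed-window counter keyed by user and calendar day gives you "50 logins per day" semantics. A minimal sketch, with an in-memory map standing in for the database (allowLogin and the key format are my own invention):

package main

import (
    "sync"
    "time"
)

var (
    dayMu     sync.Mutex
    dayCounts = make(map[string]int) // key: user + current date
)

// allowLogin permits up to limit logins per user per calendar day.
// In a real system dayCounts would live in a database or Redis so the
// count survives restarts and is shared between processes.
func allowLogin(user string, limit int) bool {
    key := user + "|" + time.Now().Format("2006-01-02")
    dayMu.Lock()
    defer dayMu.Unlock()
    if dayCounts[key] >= limit {
        return false
    }
    dayCounts[key]++
    return true
}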
Defining the problem:
We have IoT devices, each of which sends us logs about car locations. We want to compute the distance each car travels online! So whenever a log comes in (after putting it in a queue, etc.) we do this:
type Delta struct {
    DeviceId string
    time     int64
    Distance float64
}

var LastLogs = make(map[string]FullLog)
var Distances = make(map[string]Delta)

func addLastLog(l FullLog) {
    LastLogs[l.DeviceID] = l
}

func AddToLogPerDay(l FullLog) {
    //mutex.Lock()
    if val, ok := LastLogs[l.DeviceID]; ok {
        if distance, exist := Distances[l.DeviceID]; exist {
            x := computingDistance(val, l)
            Distances[l.DeviceID] = Delta{
                DeviceId: l.DeviceID,
                time:     distance.time + 1,
                Distance: distance.Distance + x,
            }
        } else {
            Distances[l.DeviceID] = Delta{
                DeviceId: l.DeviceID,
                time:     1,
                Distance: 0,
            }
        }
    }
    addLastLog(l)
}
which basically calculates the distance using a utility function! So in Distances each device ID is mapped to some distance traveled. Now here is where the problem starts: while these distances are being added to the Distances map, I want a goroutine to put this data into the database. But since there are many devices and many logs, doing a query for every log is not a good idea. So I need to do this every 5 seconds, meaning every 5 seconds try to flush all the distances most recently added to the map. I wrote this function:
func UpdateLogPerDayTable() {
    for {
        for _, distance := range Distances {
            logs := model.HourPerDay{}
            result := services.CarDBProvider.DB.Table(model.HourPerDay{}.TableName()).
                Where("created_at > ? AND device_id = ?", getCurrentData(), distance.DeviceId).
                Find(&logs)
            if result.Error != nil && !result.RecordNotFound() {
                log.Infof("Something went wrong while checking the log: %v", result.Error)
            } else {
                if !result.RecordNotFound() {
                    logs.CountDistance = distance.Distance
                    logs.CountSecond = distance.time
                    err := services.CarDBProvider.DB.Model(&logs).
                        Update(map[string]interface{}{
                            "count_second":   logs.CountSecond,
                            "count_distance": logs.CountDistance,
                        })
                    if err.Error != nil {
                        log.Infof("Something went wrong while updating the log: %v", err.Error)
                    }
                } else if result.RecordNotFound() {
                    dayLog := model.HourPerDay{
                        Model:         gorm.Model{},
                        DeviceId:      distance.DeviceId,
                        CountSecond:   int64(distance.time),
                        CountDistance: distance.Distance,
                    }
                    err := services.CarDBProvider.DB.Create(&dayLog)
                    if err.Error != nil {
                        log.Infof("Something went wrong while adding the log: %v", err.Error)
                    }
                }
            }
        }
        time.Sleep(time.Second * 5)
    }
}
It is called as go utils.UpdateLogPerDayTable() in another goroutine. However, there are several problems here:
1. I don't know how to make Distances safe, so that when I add to it in one goroutine and read it somewhere else, everything is OK. (The problem is that I want to use Go channels for this and have no idea how to do it.)
2. How can I schedule tasks in Go for this problem?
3. I will probably add Redis to store all the devices that are online, so I can do the select query faster and just update the actual database, and also add an expire time in Redis so that if a device doesn't send data for some time, it vanishes! Where should I put this code?
Sorry if my explanation wasn't enough, but I really need some help, specifically with the code implementation.
Go has a really cool pattern using for/select over multiple channels. This allows you to batch the distance writes using both a timeout and a max record size. Using this pattern requires channels.
The first thing is to model your distances as a channel:

distances := make(chan Delta)

Then you can keep track of the current batch:

var deltas []Delta

Then:
ticker := time.NewTicker(time.Second * 5)
for {
    select {
    case <-ticker.C:
        // 5 seconds are up: flush to the DB
        // and reset deltas
    case d := <-distances:
        deltas = append(deltas, d)
        if len(deltas) >= maxDeltasPerFlush {
            // flush
            // reset deltas
        }
    }
}
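Putting it together, a fuller sketch might look like this (flushToDB and maxDeltasPerFlush are stand-ins for your own DB code and tuning, not part of the snippet above):

package main

import "time"

// Delta mirrors the question's struct (time field exported here).
type Delta struct {
    DeviceId string
    Time     int64
    Distance float64
}

const maxDeltasPerFlush = 100 // tune to your DB's comfortable batch size

// flushToDB stands in for the gorm update/create logic from the question.
func flushToDB(deltas []Delta) {}

// batchWriter drains the distances channel, flushing every 5 seconds
// or whenever the batch fills up, whichever comes first.
func batchWriter(distances <-chan Delta) {
    ticker := time.NewTicker(5 * time.Second)
    defer ticker.Stop()

    var deltas []Delta
    flush := func() {
        if len(deltas) > 0 {
            flushToDB(deltas)
            deltas = deltas[:0]
        }
    }
    for {
        select {
        case <-ticker.C:
            flush() // 5 seconds are up
        case d, ok := <-distances:
            if !ok {
                flush() // channel closed: flush what's left and stop
                return
            }
            deltas = append(deltas, d)
            if len(deltas) >= maxDeltasPerFlush {
                flush() // batch is full
            }
        }
    }
}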
I don't know how to make Distances safe, so that when I add to it in one goroutine and read it somewhere else, everything is OK. (The problem is that I want to use Go channels for this and have no idea how to do it.)
If you intend to keep a map and share memory, you need to protect it using mutual exclusion (a mutex) to synchronize access between goroutines. Using a channel instead lets you send a copy to the channel, removing the need to synchronize access to the Delta object. Depending on your architecture, you could also create a pipeline of goroutines connected by channels, so that only a single goroutine (a monitor goroutine) accesses the Deltas, which also removes the need for synchronization.
How can I schedule tasks in Go for this problem?
Use a channel as the primitive for how you pass Deltas between goroutines :)
I will probably add Redis to store all the devices that are online, so I can do the select query faster and just update the actual database, and also add an expire time in Redis so that if a device doesn't send data for some time, it vanishes! Where should I put this code?
This depends on your final architecture. You could write a decorator for the select operation which checks Redis first and then goes to the DB; the caller of that function wouldn't have to know about it. Write operations can be done the same way: write to the persistent store and then write back to Redis with the cached value and the expiration. With decorators the client doesn't need to know about any of this; it just performs reads and writes, and the cache logic lives inside the decorators. There are many ways to do this, and it is largely dependent on where your implementation settles.
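A read-through cache decorator in that spirit could be as small as the following sketch (Store and Cache are hypothetical interfaces I've invented to illustrate the shape, not types from your code):

package main

import "time"

// Delta is the cached value (mirroring the question's struct).
type Delta struct {
    DeviceId string
    Distance float64
}

// Store is the read interface implemented by both the real DB layer
// and the caching decorator, so callers can't tell them apart.
type Store interface {
    Get(deviceID string) (Delta, error)
}

// Cache is a thin Redis-style wrapper with TTL support.
type Cache interface {
    Get(key string) (Delta, bool)
    Set(key string, d Delta, ttl time.Duration)
}

// cachedStore decorates a Store with a read-through cache.
type cachedStore struct {
    cache Cache
    db    Store
}

func (c *cachedStore) Get(deviceID string) (Delta, error) {
    if d, ok := c.cache.Get(deviceID); ok {
        return d, nil // cache hit
    }
    d, err := c.db.Get(deviceID)
    if err != nil {
        return Delta{}, err
    }
    // Write back with an expiry so idle devices vanish from the cache.
    c.cache.Set(deviceID, d, 5*time.Minute)
    return d, nil
}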
This simple HTTP server contains a call to time.Sleep() that makes each request take five seconds. When I quickly load multiple tabs in a browser, it is obvious that the requests are queued and handled sequentially. How can I make it handle concurrent requests?
package main

import (
    "fmt"
    "net/http"
    "time"
)

func serve(w http.ResponseWriter, r *http.Request) {
    fmt.Fprintln(w, "Hello, world.")
    time.Sleep(5 * time.Second)
}

func main() {
    http.HandleFunc("/", serve)
    http.ListenAndServe(":1234", nil)
}
Actually, I just found the answer to this after writing the question, and it is very subtle. I am posting it anyway, because I couldn't find the answer on Google. Can you see what I am doing wrong?
Your program already handles the requests concurrently. You can test it with ab, a benchmarking tool shipped with Apache 2:
ab -c 500 -n 500 http://localhost:1234/
On my system, the benchmark takes a total of 5043 ms to serve all 500 concurrent requests. It's just your browser that limits the number of connections per website.
Benchmarking Go programs isn't that easy, by the way, because you need to make sure that your benchmarking tool isn't the bottleneck and that it can also handle that many concurrent connections. Therefore, it's a good idea to use a couple of dedicated machines to generate load.
From the standard library's server.go, a goroutine is spawned in the Serve function when a connection is accepted. Below is the snippet:
// Serve accepts incoming connections on the Listener l, creating a
// new service goroutine for each. The service goroutines read requests and
// then call srv.Handler to reply to them.
func (srv *Server) Serve(l net.Listener) error {
    for {
        rw, e := l.Accept()
        if e != nil {
            ......
        }
        c, err := srv.newConn(rw)
        if err != nil {
            continue
        }
        c.setState(c.rwc, StateNew) // before Serve can return
        go c.serve()
    }
}
If you use XHR requests, make sure the XHR instance is a local variable.
For example, xhr = new XMLHttpRequest() creates a global variable. When you make parallel requests with the same xhr variable, you receive only one result. So you must declare it locally, like this: var xhr = new XMLHttpRequest().