apm SetCustom function causing random panic - go

I have written a Go app using the Gin framework with APM enabled, and I am facing a problem that occurs randomly.
To enable APM and Elasticsearch monitoring, I use the middleware as shown below:
func NewRouter() *gin.Engine {
	gin.SetMode(gin.DebugMode)
	router := gin.New()
	router.Use(middlewares.Recover())
	router.Use(apmgin.Middleware(router))
	router.Use(middlewares.Logger(), gin.Logger())
	//API Route Group
	v2 := router.Group("/api")
	//V1 Route
	v2Routes(v2)
	return router
}
Up to now, logging to Elasticsearch works fine with all the necessary parameters.
But now I also want to send my POST request data, which is not logged by default.
Inside the main handler function of the API, I use a custom function to log the POST request data to APM:
func SendPostDataToAPM(ctx context.Context, data interface{}) {
	defer RoutineRecovery()
	tx := apm.TransactionFromContext(ctx)
	tx.Context.SetCustom("postData", data)
}
The main problem is that SendPostDataToAPM panics randomly: it works fine for some time, then panics on one request, and works fine again on the next request.
The panic comes specifically from the last line, tx.Context.SetCustom("postData", data):
panic: runtime error: invalid memory address or nil pointer dereference
If anyone knows a workaround for this, kindly help.
Updates
I know that there is a nil problem here, so I added a nil check before calling SetCustom. That helped, but there are still rare cases (only 1 or 2 times a day) where even the nil check does not prevent the panic.
So I am trying to understand this random behaviour: why does the panic occur only occasionally?
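One possible explanation, offered as an assumption rather than a confirmed diagnosis: as far as I know, the Elastic APM Go agent clears a transaction's data when End is called, so a transaction must not be updated after it has ended. If SendPostDataToAPM sometimes runs after the middleware has ended the transaction (for example from a goroutine that outlives the handler), a plain `tx != nil` check passes but `tx.Context` still dereferences a nil pointer. The sketch below reproduces that failure mode with hypothetical stand-in types (`Transaction`, `TransactionData` and `setCustomSafe` are illustrative names, not the agent's API) and shows a guard that checks both conditions.

```go
package main

import (
	"fmt"
	"sync"
)

// TransactionData and Transaction are hypothetical stand-ins that mimic a
// tracer whose End method clears the embedded data pointer.
type TransactionData struct {
	Custom map[string]interface{}
}

type Transaction struct {
	mu sync.Mutex
	*TransactionData
}

func NewTransaction() *Transaction {
	return &Transaction{TransactionData: &TransactionData{Custom: map[string]interface{}{}}}
}

// End marks the transaction as finished. Any later field access through the
// embedded pointer would panic with a nil pointer dereference.
func (tx *Transaction) End() {
	tx.mu.Lock()
	defer tx.mu.Unlock()
	tx.TransactionData = nil
}

// setCustomSafe stores a value only if the transaction exists and has not
// ended. It reports whether the value was stored.
func setCustomSafe(tx *Transaction, key string, value interface{}) bool {
	if tx == nil {
		return false // no transaction in the context
	}
	tx.mu.Lock()
	defer tx.mu.Unlock()
	if tx.TransactionData == nil {
		return false // transaction already ended
	}
	tx.Custom[key] = value
	return true
}

func main() {
	tx := NewTransaction()
	fmt.Println(setCustomSafe(tx, "postData", "a")) // live transaction
	tx.End()
	fmt.Println(setCustomSafe(tx, "postData", "b"))  // ended transaction
	fmt.Println(setCustomSafe(nil, "postData", "c")) // missing transaction
}
```

With the real agent the internal lock is not accessible from outside, so the closest equivalent is probably to call SetCustom synchronously inside the handler, before the middleware ends the transaction, and to check both `tx` and `tx.TransactionData` for nil.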

Related

API design for fire and forget endpoints

I’m currently maintaining a few HTTP APIs based on the standard library and gorilla mux and running in kubernetes (GKE).
We’ve adopted the http.TimeoutHandler as our “standard” way to have a consistent timeout error management.
A typical endpoint implementation will use the following “chain”:
MonitoringMiddleware => TimeoutMiddleware => … => handler
so that we can monitor a few key metrics per endpoint.
One of our APIs is typically used in a “fire and forget” mode, meaning that clients push some data and do not care about the API response. We are facing two issues:
The Go standard HTTP server cancels a request context when the client connection is no longer active (godoc).
The TimeoutHandler returns a “timeout” response whenever the request context is done (see code).
This means that we are not processing requests to completion when the client disconnects which is not what we want and I’m therefore looking for solutions.
The only discussion I could find that somewhat relates to my issue is https://github.com/golang/go/issues/18527; however, the workaround suggested there,
"The workaround is your application can ignore the Handler's Request.Context()"
would mean that the monitoring middleware would not report the "proper" status: the handler would perform the request processing in its goroutine, but the TimeoutHandler would still be enforcing the status, so observability would be broken.
For now, I’m not considering removing our middlewares, as they help keep our APIs consistent in both behaviour and observability. My conclusion so far is that I need to “fork” the TimeoutHandler and use a custom context for handlers that should not depend on whether the client is still waiting for the response.
The gist of my current idea is to have:
type TimeoutHandler struct {
	handler Handler
	body    string
	dt      time.Duration
	// BaseContext optionally specifies a function that returns
	// the base context for controlling the server request processing.
	// If BaseContext is nil, the default is req.Context().
	// If non-nil, it must return a non-nil context.
	BaseContext func(*http.Request) context.Context
}
func (h *TimeoutHandler) ServeHTTP(w ResponseWriter, r *Request) {
	reqCtx := r.Context()
	if h.BaseContext != nil {
		reqCtx = h.BaseContext(r)
	}
	ctx, cancelCtx := context.WithTimeout(reqCtx, h.dt)
	defer cancelCtx()
	r = r.WithContext(ctx)
	...
	case <-reqCtx.Done():
		tw.mu.Lock()
		defer tw.mu.Unlock()
		w.WriteHeader(499) // write status for monitoring;
		// no need to write a body since no client is listening.
	case <-ctx.Done():
		tw.mu.Lock()
		defer tw.mu.Unlock()
		w.WriteHeader(StatusServiceUnavailable)
		io.WriteString(w, h.errorBody())
		tw.timedOut = true
	}
The middleware BaseContext callback would return context.Background() for requests to the “fire and forget” endpoint.
One thing I don’t like is that in doing so I’m losing any context keys written so this new middleware would have strong usage constraints. Overall I feel like this is more complex than it should be.
Am I completely missing something obvious?
Any feedback on API instrumentation (maybe our middlewares are an antipattern) /fire and forget implementations would be welcomed!
EDIT: as most comments are that a request for which the client does not wait for a response has unspecified behavior, I checked for more information on typical clients for which this happens.
From our logs, this happens for user agents that seem to be mobile devices. I can imagine that connections can be much more unstable and the problem will likely not disappear.
I therefore still think a solution is needed, since this is currently creating false-positive alerts.

Proper logging implementation in Golang package

I have a small Go package that does some work. This work is expected to produce a high number of errors, and this is OK. Currently all errors are ignored. Yes, it may look strange, but visit the link and check the main purpose of the package.
I'd like to extend the package so that errors that occur at runtime can be seen. But due to my lack of software design skills, I have some questions I cannot answer.
At first, I thought of implementing logging inside the package using an existing logging library (zerolog, zap, or something else). But will that be OK for the package's users? They might want to use a different logging package or change the output format.
Maybe it's possible to provide a way for the user to inject their own logger?
I'd like to provide an easily configurable way of logging that can be switched on or off on the user's demand.
Some Go libraries handle logging like this.
In your package, define a logger interface:
type YourLogging interface {
	Errorf(format string, args ...interface{})
	Warningf(format string, args ...interface{})
	Infof(format string, args ...interface{})
	Debugf(format string, args ...interface{})
}
and define a variable of this interface type, with a setter:
var mylogger YourLogging
func SetLogger(l YourLogging) {
	mylogger = l
}
In your functions, you can then call it for logging:
mylogger.Infof(..)
mylogger.Errorf(...)
You don't need to implement the interface yourself; users can pass any logger that already implements it.
For example:
SetLogger(logrus.New()) // logging output to logrus (github.com/sirupsen/logrus)
Note that a plain writer such as os.Stdout does not have these methods, so logging to stdout needs a small adapter type that implements the interface.
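To make the injection idea concrete, here is a minimal sketch using only the standard library. It assumes a reduced two-method interface and adds two things the answer does not mention: a no-op default, so logging is off until the user opts in, and an adapter around *log.Logger. The names `Logger`, `nopLogger`, `stdLogger` and `doWork` are hypothetical.

```go
package main

import (
	"log"
	"os"
)

// Logger is the small interface the package depends on.
type Logger interface {
	Errorf(format string, args ...interface{})
	Infof(format string, args ...interface{})
}

// nopLogger discards everything. It is the default, so logging stays off
// unless the user installs a logger.
type nopLogger struct{}

func (nopLogger) Errorf(string, ...interface{}) {}
func (nopLogger) Infof(string, ...interface{})  {}

var logger Logger = nopLogger{}

// SetLogger installs l. Passing nil switches logging off again.
func SetLogger(l Logger) {
	if l == nil {
		l = nopLogger{}
	}
	logger = l
}

// stdLogger adapts the standard library's *log.Logger to Logger.
type stdLogger struct{ l *log.Logger }

func (s stdLogger) Errorf(format string, args ...interface{}) {
	s.l.Printf("ERROR "+format, args...)
}

func (s stdLogger) Infof(format string, args ...interface{}) {
	s.l.Printf("INFO "+format, args...)
}

// doWork stands in for package code that reports a non-fatal problem.
func doWork() {
	logger.Errorf("start idle xact failed: %s", "connection refused")
}

func main() {
	doWork() // silent: the default logger discards output
	SetLogger(stdLogger{log.New(os.Stdout, "", 0)})
	doWork() // now written to stdout
}
```

A package-level variable like this is not safe to change while other goroutines are logging; if that matters, the logger could instead be passed in through a constructor or an options struct.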
In Go, you will see some libraries implement logging interfaces, as other answers have suggested. However, you could avoid the need for your package to log at all if you structured your application differently.
For example, in your example application you linked, your main application runtime calls idleexacts.Run(), which starts this function.
// startLoop starts workload using passed settings and database connection.
func startLoop(ctx context.Context, log log.Logger, pool db.DB, tables []string, jobs uint16, minTime, maxTime time.Duration) error {
	rand.Seed(time.Now().UnixNano())
	// Increment maxTime up to 1 due to rand.Int63n() never return max value.
	maxTime++
	// While running, keep required number of workers using channel.
	// Run new workers only until there is any free slot.
	guard := make(chan struct{}, jobs)
	for {
		select {
		// Run workers only when it's possible to write into channel (channel is limited by number of jobs).
		case guard <- struct{}{}:
			go func() {
				table := selectRandomTable(tables)
				naptime := time.Duration(rand.Int63n(maxTime.Nanoseconds()-minTime.Nanoseconds()) + minTime.Nanoseconds())
				err := startSingleIdleXact(ctx, pool, table, naptime)
				if err != nil {
					log.Warnf("start idle xact failed: %s", err)
				}
				// When worker finishes, read from the channel to allow starting another worker.
				<-guard
			}()
		case <-ctx.Done():
			return nil
		}
	}
}
The problem here is that all of the orchestration of your logic happens inside your package. Instead, this loop should run in your main application, and the package should provide users with simple actions such as selectRandomTable() or createTempTable().
If the orchestration lived in your main application and the package only provided simple actions, it would be much easier to return errors to the user as part of the function calls.
It would also make your package easier for others to reuse, because simple actions leave users free to combine them in ways you did not intend.

Putting ListenAndServe into goroutine

I'm having trouble estimating whether there will be side effects of running http.ListenAndServe in a goroutine.
I want to make it possible for Prometheus to collect stats from the /metrics endpoint of a service running a Kafka client (a Kafka consumer in an infinite for loop):
var addr = flag.String("listen-address", ":8070", "The address to listen on for HTTP requests.")
func main() {
	flag.Parse()
	http.Handle("/metrics", promhttp.Handler())
	go http.ListenAndServe(*addr, nil)
	for {....}
What would be the best practices to start the monitoring endpoint and run the infinite loop?
Best practice is to check for and handle errors. The error from http.ListenAndServe is ignored.
If return from http.ListenAndServe is fatal to the application, then use the following code or some variation on it to handle the error.
go func() {
	log.Fatal(http.ListenAndServe(*addr, nil))
}()
The call to log.Fatal logs the error and exits the application.
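If exiting the whole process from inside the goroutine is too abrupt, one variation is to send the error back to the main loop over a channel and let the loop decide what to do. This is a minimal sketch of that variation; the helper name `serveAsync` is made up, and the out-of-range port is only there so the example fails quickly without binding a socket.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// serveAsync starts an HTTP server in its own goroutine and returns a
// channel that receives the error ListenAndServe exits with.
func serveAsync(addr string, handler http.Handler) <-chan error {
	errc := make(chan error, 1) // buffered so the goroutine never blocks
	go func() {
		errc <- http.ListenAndServe(addr, handler)
	}()
	return errc
}

func main() {
	// Port 99999 is out of range, so this fails immediately. It stands in
	// for any startup failure, such as a port that is already in use.
	errc := serveAsync("127.0.0.1:99999", nil)

	for {
		select {
		case err := <-errc:
			fmt.Println("metrics server stopped:", err)
			return
		default:
			// Consume one message here (omitted in this sketch).
			time.Sleep(10 * time.Millisecond)
		}
	}
}
```

The loop can then shut down its consumer cleanly before exiting, which log.Fatal does not allow because it skips deferred functions.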

Mattermost + New Relic APM

I want to use New Relic APM in the Mattermost application. In order to monitor the performance of the application, I have added the code (as described by New Relic) at the top of the createPost API request handler in the api/post.go file.
func createPost(c *Context, w http.ResponseWriter, r *http.Request) {
	config := newrelic.NewConfig("mylocalstarfp", "####12337")
	app, err1 := newrelic.NewApplication(config)
	fmt.Println("config")
	fmt.Println(config)
	if nil != err1 {
		fmt.Println(err1)
		// os.Exit(1)
	}
	txn := app.StartTransaction("mylocalstar",w, r)
	defer txn.End()
	post := model.PostFromJson(r.Body)
	.....
	.......
}
The application is displayed on the New Relic dashboard, and attributes like CPU and memory are shown, but no response time or throughput attributes are displayed.
As per the New Relic documentation (https://github.com/newrelic/go-agent), this code has to be added in the main/init block or at the start of the function whose performance we need to monitor.
But I am not able to monitor anything, as the response time and throughput attributes are not being displayed.
Maybe I am adding the code in the wrong place.
I have also tried adding the code at the beginning of the main() function in the mattermost.go file, but without success.
Please suggest where I have to add the code.
Secondly, they have also mentioned that:
If you are using the standard HTTP library package, you can create transactions by wrapping HTTP requests, as an alternative to instrumenting a function's code.
Here is a before-and-after example of an HTTP handler being wrapped:
Before:
http.HandleFunc("/users", usersHandler)
After:
http.HandleFunc(newrelic.WrapHandleFunc(app, "/users", usersHandler))
This automatically starts and ends a transaction with the request and response writer.
Given this, where should I add the code in Mattermost?
You might try the latest release (1.3), which has support for short-lived processes, and then add the code below
config := newrelic.NewConfig("mylocalstarfp", "####12337")
app, err1 := newrelic.NewApplication(config)
to mattermost.go, passing the app variable to wherever you want to monitor transactions.
That’s not a guarantee, however. Just a thought not backed up by any testing.
Got the solution, hence posting for others to refer.
Solved the issue to track each request by this code in mattermost:
BaseRoutes.NeedTeam.Handle(newrelic.WrapHandle(app, "/users", ApiAppHandler(usersHandler))).Methods("POST")

Passing queries between chaincodes with privacy enabled

I have two chaincodes - let's call them A and B - and I am trying to get A to invoke a method on B, in a setup that has privacy enabled. An example of the sort of call I'm trying to make is shown below.
func (e *ChaincodeA) someFuncOnChaincodeA(stub *shim.ChaincodeStub, args []string) ([]byte, error) {
	//Do stuff
	newArgs := []string{"somevalue1","somevalue2"}
	msg, err := stub.InvokeChaincode(chaincodeBName,"someFuncOnChaincodeB",newArgs)
	if err != nil{
		fmt.Println(err.Error())
	}
	return msg, err
}
However, whenever I try to run this, it gives me the following error messages before killing my chaincode:
[72047168]Error chaincode-chaincode interactions not supported for
with privacy enabled.
Sending ERROR Error starting Simple chaincode:
Error handling message:
[72047168-5f5a-4017-862a-1329660e2076]Chaincode handler FSM cannot
handle message (COMPLETED) with payload size (0) while in state: ready
Process finished with exit code 0
Evidently privacy interferes with chaincode-chaincode communications. Is there any way around this, to enable communications while maintaining privacy? Or is it a best-practice to put absolutely everything into a single gigantic chaincode?
Additionally, why does privacy interfere with chaincode-chaincode communications? I don't understand exactly why this occurs.
It appears that this is a known issue with Hyperledger that is being worked on. As such, there is currently no workaround, but there will likely be one in the future once the issue is dealt with.
Relevant issue

Resources