Go 1.18.1
pprof report
3549.93kB 49.73% 49.73% 3549.93kB 49.73% src/lag_monitor.PublishLagMetricToDataDog
514kB 7.20% 56.93% 514kB 7.20% bufio.NewWriterSize
512.88kB 7.18% 64.11% 512.88kB 7.18% encoding/pem.Decode
512.69kB 7.18% 71.30% 1536.98kB 21.53% crypto/x509.parseCertificate
512.50kB 7.18% 78.48% 512.50kB 7.18% crypto/x509.(*CertPool).AddCert
This piece of code does not appear to release memory, and based on pprof, the function below is the one consuming the most memory.
func caller() {
events := make([]string, 0)
//....
PublishLagMetricToDataDog(ctx, strings.Join(events, ","))
}
func PublishLagMetricToDataDog(ctx context.Context, events string) error {
msg := fmt.Sprintf(`{
"series": [%v]
}`, events)
b := []byte(msg)
resp, err := http.Post("https://api.datadoghq.com/api/v1/series?api_key="+env.GetDataDogKey(), "application/json", bytes.NewBuffer(b))
if err != nil {
logger.Error(ctx, "Error submitting event to datadog, err = ", err)
return err
}
logger.Info(ctx, resp)
return nil
}
The above function is called in a loop. Since there are no global variables and no reference to the byte slice escapes PublishLagMetricToDataDog, I am not able to pinpoint the memory leak. I read about Reset() and Truncate(), but they do not release the underlying memory.
You must close the response body for every http response you receive. Not doing so will potentially lead to resource leaks, such as the one you've observed.
Solution:
resp, err := http.Post("https://api.datadoghq.com/api/v1/series?api_key="+env.GetDataDogKey(), "application/json", bytes.NewBuffer(b))
if err != nil {
logger.Error(ctx, "Error submitting event to datadog, err = ", err)
return err
}
logger.Info(ctx, resp)
_ = resp.Body.Close() // <--- Add this
return nil
}
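Two further refinements are worth considering (a sketch, not part of the original fix): use defer so the body is closed on every return path, and drain it so the keep-alive connection can be reused by the transport. This assumes the io package is imported.
resp, err := http.Post("https://api.datadoghq.com/api/v1/series?api_key="+env.GetDataDogKey(), "application/json", bytes.NewBuffer(b))
if err != nil {
	logger.Error(ctx, "Error submitting event to datadog, err = ", err)
	return err
}
// Close on every return path, not just the happy one.
defer resp.Body.Close()
// Drain the body so the underlying connection can be reused.
_, _ = io.Copy(io.Discard, resp.Body)
logger.Info(ctx, resp)
return nil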
I have to use the goelastic library to bulk-insert data coming from Pulsar, but I have a problem.
Pulsar sends the data in batches of 1000 records. When I insert them into Elasticsearch, the error below sometimes occurs, and it causes data loss. Thanks for any answer.
ERROR: circuit_breaking_exception: [parent] Data too large, data for [indices:data/write/bulk[s]] would be [524374312/500mb], which is larger than the limit of [510027366/486.3mb], real usage: [524323448/500mb], new bytes reserved: [50864/49.6kb], usages [request=0/0b, fielddata=160771183/153.3mb, in_flight_requests=50864/49.6kb, model_inference=0/0b, eql_sequence=0/0b, accounting=6898128/6.5mb]
This is the bulk-indexing code:
func InsertElastic(y []models.CP, ElasticStruct *config.ElasticStruct) {
	fmt.Println("------------------")
	var i, x int // i: successfully indexed docs, x: docs received from Pulsar
	bi, err := esutil.NewBulkIndexer(esutil.BulkIndexerConfig{
		Index:      enum.IndexName,
		Client:     ElasticStruct.Client,
		FlushBytes: 10e+6,
	})
	if err != nil {
		panic(err)
	}
	start := time.Now().UTC()
	for _, doc := range y {
		data, err := json.Marshal(doc)
		if err != nil {
			panic(err)
		}
		err = bi.Add(
			context.Background(),
			esutil.BulkIndexerItem{
				Action: "index",
				Body:   bytes.NewReader(data),
				OnSuccess: func(ctx context.Context, item esutil.BulkIndexerItem, res esutil.BulkIndexerResponseItem) {
					i++
				},
				OnFailure: func(ctx context.Context, item esutil.BulkIndexerItem, res esutil.BulkIndexerResponseItem, err error) {
					if err != nil {
						log.Printf("ERROR: %s", err)
					} else {
						log.Printf("ERROR: %s: %s", res.Error.Type, res.Error.Reason)
					}
				},
			},
		)
		if err != nil {
			log.Fatalf("Unexpected error: %s", err)
		}
		x++
	}
	if err := bi.Close(context.Background()); err != nil {
		log.Fatalf("Unexpected error: %s", err)
	}
	dur := time.Since(start)
	fmt.Println(dur)
	fmt.Println("Success writing data to elastic : ", i)
	fmt.Println("Success incoming data from pulsar : ", x)
	fmt.Println("Difference : ", x-i)
	fmt.Println("Now : ", time.Now().UTC().String())
	if i < x {
		fmt.Println("FATAL")
	}
	fmt.Println("------------------")
}
TL;DR
It seems like you do not have enough JVM heap on your node.
You are hitting a circuit breaker that prevents Elasticsearch from running out of memory (OOM).
Solution(s)
Increase the JVM heap; the Elasticsearch documentation on sizing your nodes explains how.
Send smaller bulk requests, for example by lowering the bulk indexer's flush threshold as sketched below.
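A minimal sketch of the second option, reusing the question's esutil.BulkIndexer setup; the thresholds below are placeholder values you would tune for your cluster:
bi, err := esutil.NewBulkIndexer(esutil.BulkIndexerConfig{
	Index:         enum.IndexName,
	Client:        ElasticStruct.Client,
	NumWorkers:    2,               // fewer concurrent bulk requests in flight
	FlushBytes:    1e+6,            // flush roughly every 1 MB instead of 10 MB
	FlushInterval: 5 * time.Second, // also flush periodically even if the byte threshold is not reached
})
if err != nil {
	panic(err)
}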
Let's say I have a handler that makes a request and gets the latest data on the selected stock:
func (ss *stockService) GetStockInfo(ctx *gin.Context) {
code := ctx.Param("symbol")
ss.logger.Info("code", code)
url := fmt.Sprintf("URL/%v", code)
ss.logger.Info(url)
req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
if err != nil {
errs.HTTPErrorResponse(ctx, &ss.logger, errs.New(errs.Internal, err))
return
}
resp, err := http.DefaultClient.Do(req)
if err != nil {
errs.HTTPErrorResponse(ctx, &ss.logger, errs.New(errs.Internal, err))
return
}
defer resp.Body.Close()
var chart ChartResponse
err = json.NewDecoder(resp.Body).Decode(&chart)
if err != nil {
errs.HTTPErrorResponse(ctx, &ss.logger, errs.New(errs.Internal, err))
return
}
ctx.JSON(http.StatusOK, chart)
}
And I want to add caching here. Since I don't have a lot of experience yet, I'm interested in how to interact with the cache properly.
I think that if, for example, it is not possible to save to the cache for some reason, then the data can simply be fetched from the API again. What I wonder is whether it would be right to save to the cache in a separate goroutine and immediately return the response:
func (ss *stockService) GetStockInfo(ctx *gin.Context) {
code := ctx.Param("symbol")
stockInfo, err := ss.cache.Get(code)
if err == nil {
// FIND
...
ctx.JSON(http.StatusOK, chart)
} else {
ss.logger.Info("code", code)
url := fmt.Sprintf("URL/%v", code)
ss.logger.Info(url)
req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
...
err = json.NewDecoder(resp.Body).Decode(&chart)
// IS IT A GOOD WAY ?
go ss.cache.Save(code,chart,expireAt)
ctx.JSON(http.StatusOK, chart)
}
}
I use Redis as the cache.
I will be glad if someone can point out what is wrong with this approach.
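One caveat with the asynchronous save: the goroutine should not reuse the request-scoped gin.Context, and a failed cache write should be treated as non-fatal, since the next request simply falls back to the API. A rough sketch of the tail of the else branch, assuming ss.cache.Save returns an error:
// Respond to the client first; the cache write is fire-and-forget.
ctx.JSON(http.StatusOK, chart)

// Save in the background. Do not use the gin.Context inside this
// goroutine: it is tied to the HTTP request and is not safe to use
// after the handler returns.
go func(code string, chart ChartResponse) {
	if err := ss.cache.Save(code, chart, expireAt); err != nil {
		// Not fatal: the next request will just hit the API again.
		ss.logger.Info("cache save failed", err)
	}
}(code, chart)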
I'm new when it comes to Google Pub/Sub (and pubsub applications in general). I'm also relatively new when it comes to Go.
I'm working on a pretty heavy backend service application that already has too many responsibilities. The service needs to fire off one message to a Google Pub/Sub topic for each incoming request. It only needs to "fire and forget": if something goes wrong with the publishing, nothing further needs to happen. The messages are not crucial (only used for analytics), but there will be many of them. We estimate between 50 and 100 messages per second for most of the day.
Now to the code:
func (p *publisher) Publish(message Message, log zerolog.Logger) error {
ctx := context.Background()
client, err := pubsub.NewClient(ctx, p.project)
if err != nil {
log.Error().Msgf("Error creating client: %v", err)
return err
}
// Close the client only after we know it was created successfully.
defer client.Close()
marshalled, _ := json.Marshal(message)
topic := client.Topic(p.topic)
result := topic.Publish(ctx, &pubsub.Message{
Data: marshalled,
})
_, err = result.Get(ctx)
if err != nil {
log.Error().Msgf("Failed to publish message: %v", err)
return err
}
return nil
}
Disclaimer: p *publisher only contains configuration.
I wonder if this is the best way? Will this lead to the service creating and closing a client 100 times per second? If so, then I guess I should create the client once and pass it as an argument to the Publish()-function instead?
This is how the Publish()-function gets called:
defer func(publisher publish.Publisher, message Message, log zerolog.Logger) {
	err := publisher.Publish(message, log)
	if err != nil {
		log.Error().Msgf("Failed to publish message: %v", err)
	}
}(publisher, message, logger)
Maybe the way to go is to hold the pubsubClient & pubsubTopic inside a struct?
type myStruct struct {
pubsubClient *pubsub.Client
pubsubTopic *pubsub.Topic
logger *yourLogger.Logger
}
func newMyStruct(projectID string) (*myStruct, error) {
	ctx := context.Background()
	pubsubClient, err := pubsub.NewClient(ctx, projectID)
	if err != nil {...}
	pubsubTopic := pubsubClient.Topic(topicName)
	return &myStruct{
		pubsubClient: pubsubClient,
		pubsubTopic:  pubsubTopic,
		logger:       Logger,
		// and whatever else you want :D
	}, nil
}
Then create a method on that struct that takes care of marshalling the message and sending it to Pub/Sub:
func (s *myStruct) request(ctx context.Context, data yourData) error {
	marshalled, err := json.Marshal(data)
	if err != nil {..}
	res := s.pubsubTopic.Publish(ctx, &pubsub.Message{
		Data: marshalled,
	})
	if _, err := res.Get(ctx); err != nil {..}
	return nil
}
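Since topic.Publish batches messages asynchronously in the background, it also helps to flush and release the client when the service shuts down. A minimal sketch using the cloud.google.com/go/pubsub API:
// Call this once on service shutdown.
func (s *myStruct) close() error {
	// Stop flushes any messages still buffered by Publish and stops
	// the topic's background goroutines.
	s.pubsubTopic.Stop()
	// Close releases the client's underlying connections.
	return s.pubsubClient.Close()
}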
I am a newbie with Kafka, and I am trying to build a service that sends mail with attached files.
Execution flow:
Kafka receives a message that asks for a mail to be sent
a download function fetches the files from their URLs, scales the images, and saves them to disk
when sending the mail, the files are read from that folder and attached to the form
Issues:
when I repeatedly send mails with large files, Kafka retries many times and I receive many duplicate mails
Kafka error: "kafka server: The provided member is not known in the current generation"
I looked at MaxProcessingTime, but when I test a single mail with a large file it still works fine
Kafka setup: 1 broker, 3 consumers
func (s *customerMailService) SendPODMail() error {
	filePaths, err := DownloadFiles(podURLs, orderInfo.OrderCode)
	if err != nil {
		countRetry := 0
		for countRetry <= NUM_OF_RETRY {
			filePaths, err = DownloadFiles(podURLs, orderInfo.OrderCode)
			if err == nil {
				break
			}
			countRetry++
		}
	}
	err = s.sendMailService.Send(ctx, orderInfo.CustomerEmail, tmsPod, content, filePaths)
	return err
}
The download function:
func DownloadFiles(files []string, orderCode string) ([]string, error) {
var filePaths []string
err := os.Mkdir(tempDir, 0750)
if err != nil && !os.IsExist(err) {
return nil, err
}
tempDirPath := tempDir + "/" + orderCode
err = os.Mkdir(tempDirPath, 0750)
if err != nil && !os.IsExist(err) {
return nil, err
}
for _, fileUrl := range files {
fileUrlParsed, err := url.ParseRequestURI(fileUrl)
if err != nil {
logrus.WithError(err).Infof("Pod url is invalid %s", orderCode)
return nil, err
}
extFile := filepath.Ext(fileUrlParsed.Path)
dir, err := os.MkdirTemp(tempDirPath, "tempDir")
if err != nil {
return nil, err
}
f, err := os.CreateTemp(dir, "tmpfile-*"+extFile)
if err != nil {
return nil, err
}
defer f.Close()
response, err := http.Get(fileUrl)
if err != nil {
return nil, err
}
defer response.Body.Close()
contentTypes := response.Header["Content-Type"]
isTypeAllow := false
for _, contentType := range contentTypes {
if contentType == "image/png" || contentType == "image/jpeg" {
isTypeAllow = true
}
}
if !isTypeAllow {
logrus.WithError(err).Infof("Pod image type is invalid %s", orderCode)
return nil, errors.New("Pod image type is invalid")
}
decodedImg, err := imaging.Decode(response.Body)
if err != nil {
return nil, err
}
resizedImg := imaging.Resize(decodedImg, 1024, 0, imaging.Lanczos)
imaging.Save(resizedImg, f.Name())
filePaths = append(filePaths, f.Name())
}
return filePaths, nil
}
The send-mail function:
func (s *tikiMailService) SendFile(ctx context.Context, receiver string, templateCode string, data interface{}, filePaths []string) error {
path := "/v1/emails"
fullPath := fmt.Sprintf("%s%s", s.host, path)
formValue := &bytes.Buffer{}
writer := multipart.NewWriter(formValue)
_ = writer.WriteField("template", templateCode)
_ = writer.WriteField("to", receiver)
if data != nil {
b, err := json.Marshal(data)
if err != nil {
return errors.Wrapf(err, "Cannot marshal mail data to json with object %+v", data)
}
_ = writer.WriteField("params", string(b))
}
for _, filePath := range filePaths {
part, err := writer.CreateFormFile(filePath, filepath.Base(filePath))
if err != nil {
return err
}
pipeReader, pipeWriter := io.Pipe()
go func() {
defer pipeWriter.Close()
file, err := os.Open(filePath)
if err != nil {
return
}
defer file.Close()
io.Copy(pipeWriter, file)
}()
io.Copy(part, pipeReader)
}
err := writer.Close()
if err != nil {
return err
}
request, err := http.NewRequest("POST", fullPath, formValue)
if err != nil {
return err
}
request.Header.Set("Content-Type", writer.FormDataContentType())
resp, err := s.doer.Do(request)
if err != nil {
return errors.Wrap(err, "Cannot send request to send email")
}
defer resp.Body.Close()
b, err := ioutil.ReadAll(resp.Body)
if err != nil {
return err
}
if resp.StatusCode != http.StatusOK {
return errors.New(fmt.Sprintf("Send email with code %s error: status code %d, response %s",
templateCode, resp.StatusCode, string(b)))
} else {
logrus.Infof("Send email with attachment ,code %s success with response %s , box-code", templateCode, string(b),filePaths)
}
return nil
}
Thanks.
My team found the problem: when I redeploy the k8s pods, there is a conflict over the leader/partition assignment, which causes a rebalance, and the pods then try to process the messages remaining in their buffers again.
Solution: I no longer fetch many messages into the buffer; I get one message at a time and process it, using this config:
ChannelBufferSize = 0
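For reference, a minimal sketch of that setting, assuming the consumer is built with the sarama client (the question does not name the library, but the error string and field names suggest it):
config := sarama.NewConfig()
// Do not prefetch messages into the consumer's internal channel;
// fetch and process one message at a time.
config.ChannelBufferSize = 0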
Example of the leader/partition conflict:
consumer A and consumer B start up at the same time
consumer A registers itself as leader and owns the topic with all partitions
consumer B registers itself as leader, then begins to rebalance and owns all partitions
consumer A rebalances and obtains all partitions, but cannot consume because its memberId is old and it needs a new one
consumer B rebalances again and owns the topic with all partitions, but they are already held by consumer A
My two cents: with very big attachments, the consumer takes quite a lot of time to read the file and send it as an attachment.
This increases the time between two poll() calls. If that time is greater than max.poll.interval.ms, the consumer is considered failed and the partition offset is not committed. As a result, the message is processed again, and eventually, if by chance the execution time stays below the poll interval, the offset is committed. The effect is that the same email is sent multiple times.
Try increasing the max.poll.interval.ms on the consumer side.
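max.poll.interval.ms is the Java client's name for this knob; if the consumer here uses sarama (an assumption, though the question's MaxProcessingTime suggests it), the closest equivalent is arguably Consumer.MaxProcessingTime. A minimal sketch with a placeholder value:
config := sarama.NewConfig()
// Allow more time per message (downloading and attaching large files)
// before the consumer is considered stalled and the group rebalances.
config.Consumer.MaxProcessingTime = 5 * time.Minute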
Is there a way to download a large file using Go that will store the content directly into a file instead of storing it all in memory before writing it to a file? Because the file is so big, storing it all in memory before writing it to a file is going to use up all the memory.
I'll assume you mean download via http (error checks omitted for brevity):
import ("net/http"; "io"; "os")
...
out, err := os.Create("output.txt")
defer out.Close()
...
resp, err := http.Get("http://example.com/")
defer resp.Body.Close()
...
n, err := io.Copy(out, resp.Body)
The http.Response's Body is a Reader, so you can use any function that takes a Reader to, for example, read a chunk at a time rather than all at once. In this specific case, io.Copy() does the grunt work for you.
A more descriptive version of Steve M's answer.
import (
	"fmt"
	"io"
	"net/http"
	"os"
)
func downloadFile(filepath string, url string) (err error) {
// Create the file
out, err := os.Create(filepath)
if err != nil {
return err
}
defer out.Close()
// Get the data
resp, err := http.Get(url)
if err != nil {
return err
}
defer resp.Body.Close()
// Check server response
if resp.StatusCode != http.StatusOK {
return fmt.Errorf("bad status: %s", resp.Status)
}
// Write the body to file
_, err = io.Copy(out, resp.Body)
if err != nil {
return err
}
return nil
}
The answer selected above using io.Copy is exactly what you need, but if you are interested in additional features like resuming broken downloads, auto-naming files, checksum validation or monitoring the progress of multiple downloads, check out the grab package.
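A minimal sketch using grab; treat the import path and version (github.com/cavaliergopher/grab/v3 at the time of writing) as something to verify for your setup:
import "github.com/cavaliergopher/grab/v3"

// downloadWithGrab saves the URL into dstDir and reports where the file ended up.
func downloadWithGrab(dstDir, url string) (string, error) {
	resp, err := grab.Get(dstDir, url)
	if err != nil {
		return "", err
	}
	return resp.Filename, nil
}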
Here is a sample: https://github.com/thbar/golang-playground/blob/master/download-files.go
Also, here is some code that might help you.
code:
func HTTPDownload(uri string) ([]byte, error) {
fmt.Printf("HTTPDownload From: %s.\n", uri)
res, err := http.Get(uri)
if err != nil {
log.Fatal(err)
}
defer res.Body.Close()
d, err := ioutil.ReadAll(res.Body)
if err != nil {
log.Fatal(err)
}
fmt.Printf("ReadFile: Size of download: %d\n", len(d))
return d, err
}
func WriteFile(dst string, d []byte) error {
fmt.Printf("WriteFile: Size of download: %d\n", len(d))
err := ioutil.WriteFile(dst, d, 0444)
if err != nil {
log.Fatal(err)
}
return err
}
func DownloadToFile(uri string, dst string) {
fmt.Printf("DownloadToFile From: %s.\n", uri)
if d, err := HTTPDownload(uri); err == nil {
fmt.Printf("downloaded %s.\n", uri)
if WriteFile(dst, d) == nil {
fmt.Printf("saved %s as %s\n", uri, dst)
}
}
}