I need to implement a service for my web server that refreshes an access token from some outside rest-api because that token has a 10 minute expiration time. (This is not an accesstoken that my server produces, it is
a token I receive from an outside api that allows me to use their services for a limited time)
For implementing timed functions in Go I've come across both cronjobs and functions using time.Ticker, however I havent come across any posts on the advantages/disadvantages of using one over the other and would like to one which would possibly be a better use for my situation.
If there is an optional route I'd be open to exploring it as well.
Thank you
time.Ticker is included with the Go standard library. No "cron" library is. So you reduce your external dependencies by using time.Ticker.
Cron is designed to run jobs on a specified schedule. Usually these jobs are run outside the Go program by the operating system. This isn't quite what you want. There are other job runners, and libraries called "cron" which are actually job runners, but again they're third party libraries.
A time.Ticker inside a goroutine is very simple and you can just have a nice infinite loop that fetches an API token every few minutes and sends it down a channel to wherever it's needed. That's maybe eight lines of code.
Related
Currently utilising the Google Dataflow with Python for batch processing. This works fine, however, I'm interested in getting a bit more speed out of my Dataflow Jobs without having to deal with Java.
Using the Go SDK, I've implemented a simple pipeline that reads a series 100-500mb files from Google Storage (using textio.Read), does some aggregation and updates CloudSQL with the results. The number of files being read can range from dozens to hundreds.
When I run the pipeline, I can see from the logs that files are being read serially, instead of in parallel, as a result the job takes much longer. The same process executed with the Python SDK triggers autoscaling and runs multiple reads within minutes.
I've tried specifying the number of workers using --num_workers=, however, Dataflow scales the job down to one instance after a few minutes and from the logs no parallel reads take place in the time the instance was running.
Something similar happens if I remove the textio.Read and implement a custom DoFn for reading from GCS. The read process is still run serially.
I'm aware the current Go SDK is experimental and lacks many features, however, I haven't found a direct reference to limitations with Parallel processing, here. Does the current incarnation of the Go SDK support parallel processing on Dataflow?
Thanks in advance
Managed to find an answer for this after actually creating my own IO package for the Go SDK.
SplitableDoFns are not yet available in the Go SDK. This key bit of functionality is what allows the Python and Java SDKs to perform IO operations in parallel and thus, much faster than the Go SDK at scale.
Now (GO 1.16) it's built-in :
https://pkg.go.dev/google.golang.org/api/dataflow/v1b3
I am in need to call an external *.exe compiled in C++
from ASP.NET WEB API 2 using Process (System.Diagnostics)
This executable does some image processing stuff and use lot of memory.
SO my question is if change my API calls to Async. or implement threads will it help, Or it doesn't matter?
Note: All i have is executable so i can not go for a CLI Wrapper.
You can separate the two. Your api is one thing, it needs to be fast, responsive to be able to serve the clients. Your image processing thing is different.
You could implement a queuing system. The api is responsible for adding a new item to this queue and nothing more. You could keep track of what tasks are being run in a separate sql table let's say. Imagine you have a sql table called Tasks. Your api chucks data in there and the status is "Not Running".
Some other app which lives on another machine entirely keeps an eye on this table and takes care of running that executable for each item. When it starts, it changes the status to Running, when it completes it's Done. You do whatever else you need. You could have an api endpoint which takes the ID of the task so your client can keep calling this endpoint to see what the status is. Or you could raise an event when it's done, depending on your application needs.
Bottom line, keep things separate, you gain nothing for blocking the api while a resources heavy task is running. Think what happens if you start that process 5 times, at the same time. You've just killed your api basically.
The app that does the heavy work, could even be located on a separate machine, so it doesn't affect the api at all.
I have a django application in heroku and one thing I need to do sometimes that take a little bit of time is sending emails.
This is a typical use case of using workers. Heroku offers support for workers, but I have to leave them running all the time (or start and stop them manually), which is annoying.
I would like to use a one-off process to send every email. One possibility I first thought of was using IronWorker, since I thought that I could simply add the job to ironworker's queue and it would be exectuted with a mex of 15 min delay, which is ok for me.
The problem is that with ironworker, I need to put in a zip file all the modules and their dependencies in order to run the job, so in my email use case, as I use "EmailMultiAlternatives" from "django.core.mail.message", I would need to include all the django framework in my zip file in order to be able to use it.
According to this link, it's possible to add/remove workers from the app. Is it possible to start one-off processes from the app?
Does anyone has a better solution?
Thanks in advance
I'm developing an app that needs to fetch a POP3 account every 5-15 minutes to check for new email and process it. I have written all the code except for the part where it automatically runs every 5-15 minutes.
I'm using Sinatra, DataMapper and hosting on Heroku which means cron jobs are out of the question, because Heroku only provides hourly cron jobs at best.
I have looked into Delayed::Job which doesn't natively support Sinatra nor DataMapper but there are workarounds for both. Since my Ruby knowledge is limited I couldn't find a way to merge these two forks into one working Delayed::Job for Sinatra/DataMapper solution.
Initially I used Mailman to check for emails which has built-in polling and runs continuously, but since it's not Rack-based it doesn't run on Heroku.
Any pointers on where to go next? Before you say: a different webhost, I should add I really prefer to stick with Heroku because of its ease of use (except of course, for the above issue).
Heroku supports CloudMailin
A simple trick is to write your code contained in a loop, then sleep at the bottom of it for however long you want:
Untested sample code...
loop do
do_something_way_cool()
sleep 5 * 60 # it's in minutes
end
If it has to be contained in the main body of the app then use a Thread to wrap it so the thread does the work. You'll need to figure out your shared data structures to transfer the data out of the loop. Queue is your friend there.
This is my setting: I have written a .NET application for local client machines, which implements a feature that could also be used on a webpage. To keep this example simple, assume that the client installs a software into which he can enter some data and gets some data back.
The idea is to create a webpage that holds a form into which the user enters the same data and gets the same results back as above. Due to the company's available web servers, the first idea was to create a mono webservice, but this was dismissed for reasons unknown. The "service" is not to be run as a webservice, but should be called by a PHP script. This is currently realized by calling the mono application via shell_exec from PHP.
So now I am stuck with a mono port of my application, which works fine, but takes way too long to execute. I have already stripped out all unnecessary dlls, methods etc, but calling the application via the command line - submitting the desired data via commandline parameters - takes approximately 700ms. We expect about 10 hits per second, so this could only work when setting up a lot of servers for this task.
I assume the 700m are related to the cost of starting the application every time, because it does not make much difference in terms of time if I handle the request only once or five hundred times (I take the original input, vary it slighty and do 500 iterations with "new" data every time. Starting from the second iteration, the processing time drops down to approximately 1ms per iteration)
My next idea was to setup the mono application as a remoting server, so that it only has to be started once and can then handle incoming requests. I therefore wrote another mono application that serves as the client. Calling the client, letting the client pass the data to the server and retrieving the result now takes 344ms. This is better, but still way slower than I would expect and want it to be.
I have then implemented a new project from scratch based on this blog post and get stuck with the same performance issues.
The question is: am I missing something related to the mono-projects that could improve the speed of the client/server? Although the idea of creating a webservice for this task was dismissed, would a webservice perform better under these circumstances (as I would not need the client application to call the service), although it is said that remoting is faster than webservices?
I could have made that clearer, but implementing a webservice is currently not an option (and please don't ask why, I didn't write the requirements ;))
Meanwhile I have checked that it's indeed the startup of the client, which takes most of the time in the remoting scenario.
I could imagine accessing the server via pipes from the command line, which would be perfectly suitable in my scenario. I guess this would be done using sockets?
You can try to use AOT to reduce the startup time. On .NET you would use ngen for that purpoise, on mono just do a mono --aot on all assemblies used by your application.
AOT'ed code is slower than JIT'ed code, but has the advantage of reducing startup time.
You can even try to AOT framework assemblies such as mscorlib and System.
I believe that remoting is not an ideal thing to use in this scenario. However your idea of having mono on server instead of starting it every time is indeed solid.
Did you consider using SOAP webservices over HTTP? This would also help you with your 'web page' scenario.
Even if it is a little to slow for you in my experience a custom RESTful services implementation would be easier to work with than remoting.