Read template in init or in handler function? - performance

I'm writing a basic server for a website. Now I face a (for me) difficult performance question. Is it better to read the template file in the init() function?
// Initialize all pages of website
func init(){
indexPageData, err := ioutil.ReadFile("./tpl/index.tpl")
check(err)
}
Or in the http.HandlerFunc?
func index(w http.ResponseWriter, req *http.Request){
indexPageData, err := ioutil.ReadFile("./tpl/index.tpl")
check(err)
indexPageTpl := template.Must(template.New("index").Parse(string(indexPageData)))
indexPageTpl.Execute(w, "test")
}
I think that with the first example, after the server has started there is no need to access the disk again, which should improve the performance of each request.
But during development I want to refresh the browser and see the new content, which the second example allows.
Does someone have a state-of-the-art solution? Or what is right from a performance point of view?

Let's analyze the performance:
We name your first solution (with slight changes, see below) a and your second solution b.
One request:
a: One disk access
b: One disk access
Ten requests:
a: One disk access
b: Ten disk accesses
10 000 000 requests:
a: One disk access
b: 10 000 000 disk accesses (this is slow)
So, performance is better with your first solution. But what about your concern regarding up-to-date data? From the documentation of func (t *Template) Execute(wr io.Writer, data interface{}) error:
Execute applies a parsed template to the specified data object, writing the output to wr. If an error occurs executing the template or writing its output, execution stops, but partial results may already have been written to the output writer. A template may be executed safely in parallel.
So, what happens is this:
You read a template from disk
You parse the file into a template
You choose the data to fill in the blanks with
You Execute the template with that data, the result is written out into an io.Writer
Your data is as up-to-date as you choose it. This has nothing to do with re-reading the template from disk, or even re-parsing it. This is the whole idea behind templates: One disk access, one parse, multiple dynamic end results.
The documentation quoted above tells us another thing:
A template may be executed safely in parallel.
This is very useful, because your http.HandlerFuncs are run in parallel if you have multiple requests in parallel.
So, what to do now?
Read the template file once,
Parse the template once,
Execute the template for every request.
I'm not sure you should read and parse in the init() function, because Must can panic (and don't use a relative, hard-coded path in there!). I would try to do that in a more controlled environment, e.g. provide a function (like New()) that creates a new instance of your server and does that work there; see the sketch below.
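For illustration, here is a minimal sketch of that approach. The constructor and handler names are assumptions for the example; the ./tpl/index.tpl path is taken from the question:

package main

import (
    "html/template"
    "net/http"
)

type server struct {
    indexTpl *template.Template
}

// newServer reads and parses the template exactly once; errors are returned
// instead of panicking inside init().
func newServer(tplPath string) (*server, error) {
    t, err := template.ParseFiles(tplPath)
    if err != nil {
        return nil, err
    }
    return &server{indexTpl: t}, nil
}

// index executes the cached template for every request; Execute is safe
// for concurrent use, so no further synchronization is needed.
func (s *server) index(w http.ResponseWriter, r *http.Request) {
    s.indexTpl.Execute(w, "test")
}

func main() {
    s, err := newServer("./tpl/index.tpl")
    if err != nil {
        panic(err) // handle the startup error however you prefer
    }
    http.HandleFunc("/", s.index)
    http.ListenAndServe(":8080", nil)
}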
EDIT: I re-read your question and I might have misunderstood you:
If the template itself is still in development then yes, you would have to read it on every request to have an up-to-date result. This is more convenient than to restart the server every time you change the template. For production, the template should be fixed and only the data should change.
Sorry if I got you wrong there.

Never read and parse template files in the request handler in production; that is about as bad as it gets and you should almost always avoid it. During development it is OK, of course.
Read this question for more details:
It takes too much time when using "template" package to generate a dynamic web page to client in golang
You could approach this in multiple ways. Here I list 4 with example implementation.
1. With a "dev mode" setting
You could have a constant or variable telling whether you're running in development mode, in which case templates are not cached.
Here's an example to that:
const dev = true

var indexTmpl *template.Template

func init() {
    if !dev { // Prod mode, read and cache template
        indexTmpl = template.Must(template.ParseFiles("./tpl/index.tpl"))
    }
}

func getIndexTmpl() *template.Template {
    if dev { // Dev mode, always read fresh template
        return template.Must(template.ParseFiles("./tpl/index.tpl"))
    } else { // Prod mode, return cached template
        return indexTmpl
    }
}

func indexHandler(w http.ResponseWriter, r *http.Request) {
    getIndexTmpl().Execute(w, "test")
}
2. Specify in the request (as a param) if you want a fresh template
When you develop, you may specify an extra URL parameter indicating to read a fresh template and not use the cached one, e.g. http://localhost:8080/index?dev=true
Example implementation:
var indexTmpl *template.Template

func init() {
    indexTmpl = getIndexTmpl()
}

func getIndexTmpl() *template.Template {
    return template.Must(template.ParseFiles("./tpl/index.tpl"))
}

func indexHandler(w http.ResponseWriter, r *http.Request) {
    t := indexTmpl
    if r.FormValue("dev") != "" {
        t = getIndexTmpl()
    }
    t.Execute(w, "test")
}
3. Decide based on host
You can also check the host name of the request, and if it is "localhost", you can omit the cache and use a fresh template. This requires the least extra code and effort. Note that you may want to accept other hosts as well, e.g. "127.0.0.1" (it's up to you what you want to include).
Example implementation:
var indexTmpl *template.Template

func init() {
    indexTmpl = getIndexTmpl()
}

func getIndexTmpl() *template.Template {
    return template.Must(template.ParseFiles("./tpl/index.tpl"))
}

func indexHandler(w http.ResponseWriter, r *http.Request) {
    t := indexTmpl
    if r.Host == "localhost" || strings.HasPrefix(r.Host, "localhost:") {
        t = getIndexTmpl()
    }
    t.Execute(w, "test")
}
4. Check template file last modified
You could also store the last modified time of the template file when it is loaded. Whenever the template is requested, you can check the last modified time of the source template file. If it has changed, you can reload it before executing it.
Example implementation:
type mytempl struct {
    t       *template.Template
    lastmod time.Time
    mutex   sync.Mutex
}

var indexTmpl mytempl

func init() {
    // You may want to call this in init() so the first request won't be slow.
    checkIndexTempl()
}

func checkIndexTempl() {
    nm := "./tpl/index.tpl"
    fi, err := os.Stat(nm)
    if err != nil {
        panic(err)
    }
    // Don't forget the locking!
    indexTmpl.mutex.Lock()
    defer indexTmpl.mutex.Unlock()
    if indexTmpl.lastmod != fi.ModTime() {
        // Changed, reload.
        indexTmpl.t = template.Must(template.ParseFiles(nm))
        indexTmpl.lastmod = fi.ModTime()
    }
}

func indexHandler(w http.ResponseWriter, r *http.Request) {
    checkIndexTempl()
    indexTmpl.t.Execute(w, "test")
}

Related

What is the default mode in GoColly, sync or async?

What is the default mode in which network requests are executed in GoColly? Since we have the Async method in the collector I would assume that the default mode is synchronous.
However, I see no particular difference when I execute these 8 requests in the program other than I need to use Wait for async mode. It seems as if the method only controls how the program is executed (the other code) and the requests are always asynchronous.
package main
import (
"fmt"
"github.com/gocolly/colly/v2"
)
func main() {
urls := []string{
"http://webcode.me",
"https://example.com",
"http://httpbin.org",
"https://www.perl.org",
"https://www.php.net",
"https://www.python.org",
"https://code.visualstudio.com",
"https://clojure.org",
}
c := colly.NewCollector(
colly.Async(true),
)
c.OnHTML("title", func(e *colly.HTMLElement) {
fmt.Println(e.Text)
})
for _, url := range urls {
c.Visit(url)
}
c.Wait()
}
The default collection mode is synchronous.
The confusing bit is probably the collector option colly.Async(), which ignores its actual param. In fact, the implementation at the time of writing is:
func Async(a ...bool) CollectorOption {
    return func(c *Collector) {
        c.Async = true // uh-oh...!
    }
}
Based on this issue, it was done this way for backwards compatibility, so that (I believe) you can pass the option with no param and it'll still work, e.g.:
colly.NewCollector(colly.Async()) // no param, async collection
If you remove the async option altogether and instantiate with just colly.NewCollector(), the network requests will be clearly sequential — i.e. you can also remove c.Wait() and the program won't exit right away.
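For comparison, a minimal sketch of the synchronous variant (reusing the urls slice and callback from the question; no other collector options assumed):

// Without colly.Async the collector is synchronous: each Visit blocks
// until the request and its callbacks have finished, so no Wait() is needed.
c := colly.NewCollector()
c.OnHTML("title", func(e *colly.HTMLElement) {
    fmt.Println(e.Text)
})
for _, url := range urls {
    c.Visit(url) // requests run one at a time, in order
}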

Go http client setup for multiple endpoints?

I reuse the http client connection to make external calls to a single endpoint. An excerpt of the program is shown below:
var AppCon MyApp
func New(user, pass string, platformURL *url.URL, restContext string) (*MyApp, error) {
if AppCon == (MyApp{}) {
AppCon = MyApp{
user: user,
password: pass,
URL: platformURL,
Client: &http.Client{Timeout: 30 * time.Second},
RESTContext: restContext,
}
cj, err := cookiejar.New(nil)
if err != nil {
return &AppCon, err
}
AppCon.cookie = cj
}
return &AppCon, nil
}
// This is an example only. There are many more functions which accept *MyApp as a pointer.
func(ma *MyApp) GetUser(name string) (string, error){
// Return user
}
func main(){
for {
// Get messages from a queue
// The message returned from the queue provide info on which methods to call
// 'm' is a struct with message metadata
c, err := New(m.un, m.pass, m.url)
go func(){
// Do something i.e c.GetUser("123456")
}()
}
}
I now have the requirement to set up client connections with different endpoints/credentials received via queue messages.
The problem I foresee is that I can't simply modify AppCon with the new endpoint details, since a pointer to MyApp is returned, which would reset c. This could cause a goroutine to make an HTTP call to an unintended endpoint. To make matters non-trivial, the program is not meant to have awareness of the endpoints (I was considering using a switch statement) but rather receive what it needs via queue messages.
Given the issues I've called out are correct, are there any recommendations on how to solve it?
EDIT 1
Based on the feedback provided, I am inclined to believe this will solve my problem:
Remove the use of a Singleton of MyApp
Decouple the http client from MyApp which will enable it for reuse
var httpClient *http.Client
func New(user, pass string, platformURL *url.URL, restContext string) (*MyApp, error) {
AppCon = MyApp{
user: user,
password: pass,
URL: platformURL,
Client: func() *http.Client {
if httpClient == nil {
httpClient = &http.Client{Timeout: 30 * time.Second}
}
return httpClient
}(),
RESTContext: restContext,
}
return &AppCon, nil
}
// This is an example only. There are many more functions which accept *MyApp as a pointer.
func(ma *MyApp) GetUser(name string) (string, error){
// Return user
}
func main(){
for {
// Get messages from a queue
// The message returned from the queue provide info on which methods to call
// 'm' is a struct with message metadata
c, err := New(m.un, m.pass, m.url)
// Must pass a reference
go func(c *MyApp){
// Do something i.e c.GetUser("123456")
}(c)
}
}
Disclaimer: this is not a direct answer to your question but rather an attempt to direct you to a proper way of solving your problem.
Try to avoid a singleton pattern for your MyApp. In addition, New is misleading: it doesn't actually create a new object every time. Instead, you could create a new instance every time while preserving (reusing) the http client connection.
Don't use constructions like AppCon == (MyApp{}); one day you will shoot yourself in the foot doing this. Use a pointer instead and compare it to nil.
Avoid race conditions. In your code you start a goroutine and immediately proceed to the new iteration of the for loop. Considering you re-use the whole MyApp instance, you essentially introduce a race condition.
By using cookies you make your connection somewhat stateful, but your task seems to require stateless connections. There might be something wrong with such an approach.
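To illustrate the first two points, here is a rough sketch under some assumptions (the struct fields are taken from the question; the package-level sharedClient is new): one *http.Client, which is safe for concurrent use, shared by a fresh MyApp instance per queue message.

// sharedClient is created once and reused; http.Client is safe for
// concurrent use and pools connections internally.
var sharedClient = &http.Client{Timeout: 30 * time.Second}

// New now really does create a new instance every time, while reusing the client.
func New(user, pass string, platformURL *url.URL, restContext string) (*MyApp, error) {
    return &MyApp{
        user:        user,
        password:    pass,
        URL:         platformURL,
        Client:      sharedClient,
        RESTContext: restContext,
    }, nil
}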

How do you securely serve a file in golang

I'm new to developing web applications. I'm working with golang and would like to serve user uploaded files securely, such as allowing them to view their own files only.
Now I have saved those files to a local file system with random names. If I serve the entire directory, malicious users may view other users' files. This sounds like a common use case; I wonder what's the best approach to deal with it?
This question is pretty vague, and architectural decisions must be made to optimize data access and secure the files.
However, here is a simple solution that might serve your use-case.
package main

import (
    "fmt"
    "mime"
    "net/http"
    "path/filepath"
)

// UserFilesMap maps a username to the set of files that user is allowed to access.
var UserFilesMap map[string]FilePermission

type FilePermission map[string]struct{}

// FileServer is the function that serves files.
func FileServer(w http.ResponseWriter, r *http.Request) {
    // Get the file path the user wants to access (strip the "/getFile/" prefix).
    filename := r.URL.Path[9:]

    var uname, pass string
    var ok bool
    if uname, pass, ok = r.BasicAuth(); !ok {
        w.WriteHeader(http.StatusForbidden)
        return
    }
    if !(uname == "user" && pass == "1234") {
        w.WriteHeader(http.StatusForbidden)
        return
    }

    // Checking if the user has permission to access the file.
    if _, ok := UserFilesMap[uname][filename]; !ok {
        w.WriteHeader(http.StatusForbidden)
        return
    }

    w.Header().Set("Content-Type", mime.TypeByExtension(filepath.Ext(filename)))
    http.ServeFile(w, r, "files/"+filename)
}

func main() {
    UserFilesMap = make(map[string]FilePermission)
    // UserFilesMap["user"] = FilePermission{"xyz.txt": struct{}{}}
    UserFilesMap["user"] = FilePermission{"abc.txt": struct{}{}}

    http.HandleFunc("/getFile/", FileServer)
    if err := http.ListenAndServe(":8080", nil); err != nil {
        fmt.Println("Error in ListenAndServe")
    }
}
Here, I used a map to store the permissions of files. I would suggest you go for a SQL table instead.
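As a rough sketch of what such a SQL-backed check could look like (the table and column names here are made up for illustration):

// userHasFile reports whether the given user may access the given file,
// according to a hypothetical user_files table.
func userHasFile(db *sql.DB, username, filename string) (bool, error) {
    var n int
    err := db.QueryRow(
        "SELECT COUNT(1) FROM user_files WHERE username = ? AND filename = ?",
        username, filename,
    ).Scan(&n)
    return n > 0, err
}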
If your filenames are random and long enough, and you use a secure random generator, this is already reasonably secure (unless directory listing is enabled), with some limits though.
https://golang.org/pkg/crypto/rand/
A user will have access to the file only if they have the URL with the random name. The limitation, though, is that the URL will be saved in the browser history; if someone else finds it, they will also have access to it.
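A minimal sketch of generating such a name with crypto/rand and encoding/hex (the 32-byte length is an arbitrary choice):

// randomName returns a hex-encoded 256-bit random file name.
func randomName() (string, error) {
    b := make([]byte, 32)
    if _, err := rand.Read(b); err != nil { // crypto/rand
        return "", err
    }
    return hex.EncodeToString(b), nil
}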

Go- authentication logic pattern in web app

I want to determine a simple and useful pattern for user authentication in a web app being written in golang.
I have come up with two patterns. The first one keeps the programmer's functions separate from the authentication logic and leaves cleaner HandleFunc parts in main(), so that one can see which parts are under authentication control just by looking at main().
The second one makes the programmer include a decision in every function that deals with authentication-required URLs: an if statement checks via an authp() function defined elsewhere.
Which one is better pattern for such necessity?
What are the better patterns for this job?
Is it even possible to pass a function to http.HandleFunc with a signature other than func urlFunc(ResponseWriter, *Request), but like func urlFunc(successFunc, failFunc) or func urlFunc(ResponseWriter, *Request, successFunc, failFunc), as in the authGateKeeper function of the First Way below? If not, is there a suitable workaround for that?
//First Way
package main
func authGateKeeper(successFunc, failFunc) {
if (authp()) {
successFunc
} else {
failFunc
}
}
func authp() boolean {
//authentication logic, db query, or session check etc.
}
//usage in main
http.HandleFunc("/", authGateKeeper)
//Second Way; other alternative, in each function check pattern
func f(w, r) {
if (authp()) {
//function's processes
} else {
//the fail case function or processes
}
}
func authp() boolean {
//authentication logic, db query, or session check etc.
}
//usage in main
http.HandleFunc("/", f)
There are many ways to spin this, and it's arguable whether one is outright "better". I'd strongly suggest writing some middleware that wraps your routes and enforces the check, calling the wrapped handler only on success.
Note that I'm going to make a few assumptions here as you haven't told us how you're managing sessions (cookies? server-side?) and/or what kind of authorization you might need on top of authentication.
// Middleware - a function that sits in the 'middle' of your request processing.
func RequireAuth(h http.Handler) http.Handler {
    fn := func(w http.ResponseWriter, r *http.Request) {
        // Assuming gorilla/sessions
        session, err := store.Get(r, "name")
        if err != nil {
            // Raise HTTP 500
            return
        }
        // We'll assume you're storing the userID in the cookie|server session
        // upon login elsewhere.
        id := session.Values["userID"]
        // Probably returns a *yourapp.User
        user, err := db.GetUser(id)
        if err != nil {
            // Raise HTTP 500
            return
        }
        if user == nil {
            http.Error(w, http.StatusText(http.StatusUnauthorized), http.StatusUnauthorized)
            // Don't forget these 'naked' returns - if you miss one, your
            // handler will keep processing beyond the error and result in
            // unintended side effects
            return
        }
        // Further checks here - i.e. checking user.Active == true, etc.

        // The userID matches one in the DB, so let's proceed
        h.ServeHTTP(w, r)
    }
    return http.HandlerFunc(fn)
}

// And in your router - assuming just vanilla net/http
http.Handle("/", RequireAuth(yourHandlerFunc))
http.Handle("/dashboard", RequireAuth(someOtherHandler))
// Note that using gorilla/mux or goji can help give you "subrouters" so you
// don't have to wrap every single route with your middleware (messy, error prone)
I'd also suggest some reading on Go middleware composition, which will help you in the future.
If you want to call a custom error page, just write a handler - e.g. UnauthorizedHandler that satisfies http.Handler and just call UnauthorizedHandler.ServeHTTP(w, r) instead of http.Error along the way.
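A minimal sketch of such a handler (the page body is just an example):

type UnauthorizedHandler struct{}

func (UnauthorizedHandler) ServeHTTP(w http.ResponseWriter, r *http.Request) {
    w.Header().Set("Content-Type", "text/html; charset=utf-8")
    w.WriteHeader(http.StatusUnauthorized)
    w.Write([]byte("<html><body><p>Please log in to view this page.</p></body></html>"))
}

// ...and inside RequireAuth, replace the http.Error call with:
// UnauthorizedHandler{}.ServeHTTP(w, r)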

Golang. What to use? http.ServeFile(..) or http.FileServer(..)?

I'm a little bit confused. Many examples show usage of both http.ServeFile(..) and http.FileServer(..), but they seem to have very similar functionality. Also, I have found no information about how to set a custom NotFound handler.
// This works and strip "/static/" fragment from path
fs := http.FileServer(http.Dir("static"))
http.Handle("/static/", http.StripPrefix("/static/", fs))
// This works too, but the "/static2/" fragment remains and needs to be stripped manually
http.HandleFunc("/static2/", func(w http.ResponseWriter, r *http.Request) {
http.ServeFile(w, r, r.URL.Path[1:])
})
http.ListenAndServe(":8080", nil)
I've tried to read the source code, and both of them use the underlying serveFile(ResponseWriter, *Request, FileSystem, string, bool) function. However, http.FileServer returns a fileHandler with its own ServeHTTP() method that does some preparation work before serving the file (e.g. path.Clean()).
So why is this separation needed? Which method is better to use? And how can I set a custom NotFound handler, for example when the requested file is not found?
The main difference is that http.FileServer effectively does an almost 1:1 mapping of an HTTP prefix onto a filesystem. In plain English, it serves up an entire directory path and all its children.
Say you had a directory called /home/bob/static and you had this setup:
fs := http.FileServer(http.Dir("/home/bob/static"))
http.Handle("/static/", http.StripPrefix("/static", fs))
Your server would take requests for e.g. /static/foo/bar and serve whatever is at /home/bob/static/foo/bar (or 404)
In contrast, ServeFile is a lower-level helper that can be used to implement something similar to FileServer, or to implement your own path munging, and any number of other things. It simply takes the named local file and sends it over the HTTP connection. By itself, it won't serve a whole directory prefix (unless you write a handler that does some lookup similar to FileServer).
NOTE: Serving up a filesystem naively is a potentially dangerous thing (there are potentially ways to break out of the rooted tree), hence I recommend that unless you really know what you're doing, you use http.FileServer and http.Dir, as they include checks to make sure people can't break out of the FS, which ServeFile doesn't.
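To make the contrast concrete, a small sketch (the paths and the favicon route are just examples): FileServer maps a whole tree, while ServeFile is handy for a single, fixed file.

// Serve everything under /home/bob/static at the /static/ URL prefix.
fs := http.FileServer(http.Dir("/home/bob/static"))
http.Handle("/static/", http.StripPrefix("/static/", fs))

// Serve exactly one file; the local path is hard coded, so there is no
// directory traversal to worry about.
http.HandleFunc("/favicon.ico", func(w http.ResponseWriter, r *http.Request) {
    http.ServeFile(w, r, "/home/bob/static/favicon.ico")
})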
Addendum
Your secondary question, how to provide a custom NotFound handler, is unfortunately not easily answered. Because this is called from the internal serveFile function, as you noticed, there's no easy place to hook into it. There are some sneaky options, like wrapping the response in your own ResponseWriter that intercepts the 404 response code, but I'll leave that exercise to you.
Here is a handler which sends a redirect to "/" if the file is not found. This comes in handy when adding a fallback for an Angular application, as suggested here, which is served from within a Go service.
Note: This code is not production ready. Only illustrative (at best :-)
package main

import (
    "log"
    "net/http"
)

type (
    // FallbackResponseWriter wraps an http.ResponseWriter and suppresses
    // a 404 status code. In such a case a fallback (here a redirect to "/") is served.
    FallbackResponseWriter struct {
        WrappedResponseWriter http.ResponseWriter
        FileNotFound          bool
    }
)

// Header returns the header of the wrapped response writer
func (frw *FallbackResponseWriter) Header() http.Header {
    return frw.WrappedResponseWriter.Header()
}

// Write sends bytes to the wrapped response writer; in case of FileNotFound
// it suppresses further writes (concealing the fact though)
func (frw *FallbackResponseWriter) Write(b []byte) (int, error) {
    if frw.FileNotFound {
        return len(b), nil
    }
    return frw.WrappedResponseWriter.Write(b)
}

// WriteHeader sends statusCode to the wrapped response writer
func (frw *FallbackResponseWriter) WriteHeader(statusCode int) {
    log.Printf("INFO: WriteHeader called with code %d\n", statusCode)
    if statusCode == http.StatusNotFound {
        log.Printf("INFO: Setting FileNotFound flag\n")
        frw.FileNotFound = true
        return
    }
    frw.WrappedResponseWriter.WriteHeader(statusCode)
}

// AddFallbackHandler wraps the handler func in another handler func that serves
// the fallback when the wrapped handler reports a 404 (filename is currently unused).
func AddFallbackHandler(handler http.HandlerFunc, filename string) http.HandlerFunc {
    log.Printf("INFO: Creating fallback handler")
    return func(w http.ResponseWriter, r *http.Request) {
        log.Printf("INFO: Wrapping response writer in fallback response writer")
        frw := FallbackResponseWriter{
            WrappedResponseWriter: w,
            FileNotFound:          false,
        }
        handler(&frw, r)
        if frw.FileNotFound {
            log.Printf("INFO: Serving fallback")
            http.Redirect(w, r, "/", http.StatusSeeOther)
        }
    }
}
It can be added as in this example (using goji as mux):
mux.Handle(pat.Get("/*"),
    AddFallbackHandler(http.FileServer(http.Dir("./html")).ServeHTTP, "/"))
