How do modules in Apache 2.2 operate?

By which mechanism is an Apache module loaded: during runtime, or during startup? Does that process use mechanisms like interprocess communication? Does Apache actively call functions in the module, does the module itself call Apache functions, or both?
For example, a GET request comes to Apache and mod_spnego (Kerberos authentication) is invoked. How does Apache know when to call the main function in the module code?

Apache and its modules are written in C. The modules are loaded during startup from shared object files (.so on Unix, .dll on Windows), known as Dynamic Shared Objects (DSOs); on Unix they are typically pulled in by LoadModule directives in httpd.conf. When building Apache HTTPD, it's also possible to compile the modules statically into the httpd binary. Either way, they run within the same process.
How does Apache know when to call the main function in module code?
Take a look at these articles, and see what they say about hooks.
Apache 2.2, Converting Modules from Apache 1.3 to Apache 2.0
The new architecture uses a series of hooks to provide for calling your functions. These you'll need to add to your module by way of a new function, static void register_hooks(void). The function is really reasonably straightforward once you understand what needs to be done. Each function that needs calling at some stage in the processing of a request needs to be registered, handlers do not.
Developing modules for the Apache HTTP Server 2.4
When handling requests in Apache HTTP Server 2.4, the first thing you will need to do is create a hook into the request handling process. A hook is essentially a message telling the server that you are willing to either serve or at least take a glance at certain requests given by clients. All handlers, whether it's mod_rewrite, mod_authn_*, mod_proxy and so on, are hooked into specific parts of the request process. As you are probably aware, modules serve different purposes; some are authentication/authorization handlers, others are file or script handlers while some third modules rewrite URIs or proxies content.
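In other words, the server, not the module, drives execution: at startup each module's register_hooks function records callbacks in Apache's per-phase hook lists, and httpd walks those lists as a request passes through its phases (authentication, content generation, and so on). A toy sketch of that mechanism, in Go purely for illustration (the real API is C, and every name below is invented):

package main

import "fmt"

// Hook is a callback a "module" registers for one phase of request
// processing. Returning true means "handled; stop running this phase."
type Hook func(uri string) bool

// hooks maps a request phase to the callbacks registered for it,
// loosely mirroring httpd's per-phase hook lists.
var hooks = map[string][]Hook{}

func registerHook(phase string, h Hook) {
	hooks[phase] = append(hooks[phase], h)
}

// The "server" drives the process: it calls into the modules' callbacks
// at each phase; the modules never run a main loop of their own.
func handleRequest(uri string) {
	for _, phase := range []string{"check_auth", "handler"} {
		for _, h := range hooks[phase] {
			if h(uri) {
				break
			}
		}
	}
}

func main() {
	// A "module" registers its callbacks at startup (cf. register_hooks).
	registerHook("check_auth", func(uri string) bool {
		fmt.Println("auth module inspects", uri)
		return false // decline: let other hooks in this phase run
	})
	registerHook("handler", func(uri string) bool {
		fmt.Println("content module serves", uri)
		return true // handled: stop this phase
	})
	handleRequest("/index.html")
}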

Related

ResolveRequestCache state takes a long time

I have a C# MVC application hosted in an IIS test environment, with only one action method in an ApiController. Clients call this single method, and depending on the parameters, different small processes are performed.
I am using IIS 10.0.17763. The application is built on .NET Framework 4.6.
I have disabled these modules, as I don't need them:
WebDAVModule
WindowsAuthentication
ScriptModule-4.0
DefaultAuthentication
ServiceModel-4.0
UrlAuthorization
FileAuthorization
The problem is that under a load test from JMeter, all calls somehow stay too long in the ResolveRequestCache state.
Can someone explain the problem behind this or suggest something to check? I am not using any kind of caching, due to business requirements.
Here is a screenshot of the request states from IIS.
Edit: I have removed some other modules too, to check the effect.
Here is the list of loaded modules in my application.

Add plugins to a Go program

I need to provide pluggable functionality to a Go program.
The idea is that a 3rd party can add functionality for a given path, i.e.
/alive maps to http://localhost:9876, or
/branding maps to http://localhost:9877 and so on.
I first tried to think of it as adding a JSON config file, where each such plugin would have an entry, e.g.:
{
  "Uri": "alive",
  "Address": "http://localhost:9876",
  "Handler": "github.com/user/repo/path/to/implementation"
},
This, though, blatantly reveals Java thinking, and feels utterly inadequate for Go: there is no notion of class loaders in Go, and loading this would mean having to use the loader package from Go's tools.
Any proposals on how to do this in a more Go-idiomatic way? In the end, I just need to be able to map a URI to a port and to an implementation.
Compile-time configuration
If you can live with compile-time configuration, then there is no need for a JSON (or any other) configuration file.
Your main package can import all the involved "plugins" and map their handlers to the appropriate paths, as sketched below. There is also no need to create multiple servers, although you may do so if that better fits your (or the modules') needs.
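A minimal sketch of this approach, assuming two hypothetical handler packages (the import paths and their Handler() functions are invented for illustration):

package main

import (
	"log"
	"net/http"

	// Hypothetical handler packages, compiled into the binary.
	"github.com/user/repo/alive"
	"github.com/user/repo/branding"
)

func main() {
	// The mapping is fixed at compile time; adding a module means
	// adding an import and a line here, then rebuilding.
	http.Handle("/alive", alive.Handler())
	http.Handle("/branding", branding.Handler())
	log.Fatal(http.ListenAndServe(":8080", nil))
}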
Run-time configuration
Run-time configuration and plugging in a new module require code to be loaded at run time. This is supported by the plugin package, but currently only under Linux.
For this you may use a JSON config file, where you would list the compiled plugins (the paths to the compiled plugin files) along with the URL paths you need to map them to.
In the main package you can read the config file and load the plugins, each of which should expose a variable or a function that returns the handler that handles the traffic (requests). Having a plugin return a handler is preferable, for performance reasons, to the plugin firing up its own HTTP server, but both can work (plugins returning a handler for you to register, or plugins launching their own servers).
Note that there is also no need to make the configuration "static"; the main app could receive and load new modules at runtime too (e.g. via a dedicated handler, which could receive the (file) path to the new module and the path to map it to, optionally maybe even the binary plugin code too; but don't forget about security!).
Note that while you can load plugins at runtime, there is no way to "unload" them. Once a plugin is loaded, it will stay in memory until the app exits.
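A sketch of the run-time variant, assuming each plugin is built with go build -buildmode=plugin and exports a function declared as func Handler() http.Handler; the config map below stands in for the JSON file:

package main

import (
	"fmt"
	"log"
	"net/http"
	"plugin"
)

// loadHandler opens a compiled plugin (.so) and looks up an exported
// function assumed to be declared as: func Handler() http.Handler
func loadHandler(path string) (http.Handler, error) {
	p, err := plugin.Open(path)
	if err != nil {
		return nil, err
	}
	sym, err := p.Lookup("Handler")
	if err != nil {
		return nil, err
	}
	fn, ok := sym.(func() http.Handler)
	if !ok {
		return nil, fmt.Errorf("plugin %s: Handler has wrong type", path)
	}
	return fn(), nil
}

func main() {
	// Hypothetical config: URL path -> compiled plugin file.
	plugins := map[string]string{
		"/alive":    "./alive.so",
		"/branding": "./branding.so",
	}
	for path, file := range plugins {
		h, err := loadHandler(file)
		if err != nil {
			log.Fatal(err)
		}
		http.Handle(path, h)
	}
	log.Fatal(http.ListenAndServe(":8080", nil))
}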
Separate, multiple apps
There is a third solution, in which your main app would act as a proxy. Here you may start the additional "modules" as separate apps, listening on localhost at specific ports, and the main app would forward incoming requests to the independent apps listening on different ports on localhost (or even on other hosts).
The standard library provides httputil.ReverseProxy, which does just this.
This does not require runtime code loading, as the "modules" are separate apps that can be launched separately. Still, it gives you runtime configuration flexibility, and this solution works on all platforms. Moreover, this setup supports taking down modules at runtime, as you can just as easily un-map the paths / close the apps of the independent modules.
The separate apps can be launched on their own or from / by the main app; both solutions are viable.
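A sketch of the proxy variant, reusing the example mapping from the question; httputil.NewSingleHostReverseProxy does the forwarding:

package main

import (
	"log"
	"net/http"
	"net/http/httputil"
	"net/url"
)

func main() {
	// Module mapping from the question: URL path -> backend address.
	backends := map[string]string{
		"/alive":    "http://localhost:9876",
		"/branding": "http://localhost:9877",
	}
	for path, addr := range backends {
		target, err := url.Parse(addr)
		if err != nil {
			log.Fatal(err)
		}
		// Each path is forwarded to its own independent backend app.
		http.Handle(path, httputil.NewSingleHostReverseProxy(target))
	}
	log.Fatal(http.ListenAndServe(":8080", nil))
}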

Do Rack-based web servers implement the FastCGI protocol?

I've read that CGI/FastCGI is a protocol for interfacing external applications to web servers.
So the web server (like Apache or Nginx) sends environment information and the page request itself to a FastCGI process over a socket; responses are returned by FastCGI to the web server over the same connection, and the web server subsequently delivers that response to the end user.
Now I'm confused between this and Rack, which is used by almost all Ruby web frameworks and libraries. It provides an interface for developing web applications in Ruby by wrapping HTTP requests and responses.
So, do Rack-based web servers like Unicorn, Thin, Passenger, or Puma represent the same FastCGI approach? Can I say that Unicorn is a Ruby implementation of FastCGI?
As you say:
FastCGI is a protocol
Rack is an API
So these are actually two quite different things, though they could be used together.
FastCGI specifies how two different processes should talk to each other
FastCGI, as a protocol, specifies how two different processes (nominally a web server and an application server or "FastCGI server") should talk to each other over a network connection. The specification defines records of data in a particular format that are sent and received by the two processes.
Exactly what the programs that send and receive these messages look like is not specified, and could be anything. On one side you might have a C program that assembles data in memory and then makes system calls to have the OS send the data, and on the other side you might have a Ruby program that opens a socket, reads data into Arrays, and then parses those data and builds a new object encapsulating the request.
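As a concrete illustration of how unconstrained the application side is, here is a minimal FastCGI responder using Go's standard net/http/fcgi package (the address and the handler body are arbitrary choices for this sketch):

package main

import (
	"fmt"
	"log"
	"net"
	"net/http"
	"net/http/fcgi"
)

func main() {
	// The front-end web server (nginx, Apache, ...) is configured to
	// forward requests to this socket using FastCGI records.
	l, err := net.Listen("tcp", "127.0.0.1:9000")
	if err != nil {
		log.Fatal(err)
	}
	// fcgi.Serve reads the records off the socket, decodes the CGI-style
	// variables and body back into an *http.Request, and encodes our
	// response as FastCGI records going the other way.
	err = fcgi.Serve(l, http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintf(w, "handled %s via FastCGI\n", r.URL.Path)
	}))
	if err != nil {
		log.Fatal(err)
	}
}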
Rack specifies what Ruby objects and methods must be made available to higher-level software
On the other hand, Rack, being a Ruby API specification, specifies precisely what Ruby objects and methods must be made available to higher-level software implementing some sort of web application, and how those objects and methods must behave, from the point of view of the application. (Don't be confused by the use of the word "protocol" in the document linked above. Here it's used not in the sense of data formats sent over a communications link, but in the object-oriented programming sense of the conceptual "messages" exchanged between objects to express program behavior, though this is actually, at various levels and times, implemented as function calls.)
Being an API specification, the user of the Rack API ought at least to behave as if it has no idea what's going on underneath the hood when it calls methods on the various objects an implementation of Rack presents. (Frequently it will have no idea.) It could be the case that the library actually has set up communication with a separate process acting as a web server, via FastCGI or some other protocol, and reads messages from the other process and sends messages back to it, based on what the application using the API implementation does. But on the other hand, you could equally (at least in theory) drop in a completely different implementation of the API that itself has Ruby code to run a web server, and the very same process that ran Ruby code for the web application would be running additional Ruby code to talk the HTTP protocol directly with a client web browser or whatever.
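Go's standard library makes this same layering visible, and can serve as a non-Ruby analogy: an http.Handler is the API contract the application codes against, and the very same handler can be driven by an in-process HTTP server, a FastCGI socket, or a one-shot CGI invocation. A sketch (the SERVE_MODE switch is invented for illustration):

package main

import (
	"fmt"
	"log"
	"net"
	"net/http"
	"net/http/cgi"
	"net/http/fcgi"
	"os"
)

// The application only knows this interface; it never sees the transport.
var app = http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
	fmt.Fprintln(w, "same app code, any transport")
})

func main() {
	switch os.Getenv("SERVE_MODE") { // invented env var, for illustration
	case "cgi":
		// Launched once per request by a web server, CGI-style:
		// request metadata arrives in env vars, body on stdin.
		if err := cgi.Serve(app); err != nil {
			log.Fatal(err)
		}
	case "fcgi":
		// Long-running process speaking FastCGI over a socket.
		l, err := net.Listen("tcp", "127.0.0.1:9000")
		if err != nil {
			log.Fatal(err)
		}
		log.Fatal(fcgi.Serve(l, app))
	default:
		// The process itself speaks HTTP directly to clients.
		log.Fatal(http.ListenAndServe(":8080", app))
	}
}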
You can't say that Unicorn (or any other implementation of the Rack API) is a "Ruby implementation of FastCGI"
The question does not apply in the way that you asked it, because the whole point of the Rack API specification is that you explicitly avoid thinking about the actual implementation of the services provided through that API. It could well be that some implementations are using FastCGI, but your application should work equally well with one that's not, and you really don't want to care about what's going on underneath the hood.

Why should I avoid using CGI?

I was trying to create my website using CGI and ERB, but when I search on the web, I see people saying I should always avoid using CGI, and always use Rack.
I understand CGI will fork a lot of Ruby processes, but if I use FastCGI, only one persistent process will be created, and it is adopted by PHP websites too. Plus, the FastCGI interface only creates one object per request and has very good performance, as opposed to Rack, which creates 7 objects at once.
Is there any specific reason I should not use CGI? Or is that just a false assumption, and is it entirely OK to use CGI/FastCGI?
CGI, by which I mean both the interface and the common programming libraries and practices around it, was written in a different time. It has a view of request handlers as distinct processes connected to the webserver via environment variables and standard I/O streams.
This was state-of-the-art in its day, when there were not really "web frameworks" and "embedded server modules" as we think of them today. Thus...
CGI tends to be slow
Again, the CGI model spawns one new process per connection. While spawning processes per se is cheap these days, heavy web app initialization — reading and parsing scores of modules, making database connections, etc. — makes this quite expensive.
CGI tends toward too-low-level (IMHO) design
Again, the CGI model explicitly mentions environment variables and standard input as the interface between request and handler. But ... who cares? That's much lower level than the app designer should generally be thinking about. If you look at libraries and code based on CGI, you'll see that the bulk of it encourages "business logic" right alongside form parsing and HTML generation, which is now widely seen as a dangerous mixing of concerns.
Contrast with something like Rack::Builder, where right away the coder is thinking of mapping a namespace to an action, and what that means for the broader web application. (Suddenly we are free to argue about the semantic web and the virtues of REST and this and that, because we're not thinking about generating radio buttons based off user-supplied input.)
Yes, something like Rack::Builder could be implemented on top of CGI, but, that's the point. It'd have to be a layer of abstraction built on top of CGI.
CGI tends to be sneeringly dismissed
Despite CGI working perfectly well within its limitations, despite it being simple and widely understood, CGI is often dismissed out of hand. You, too, might be dismissed out of hand if CGI is all you know.
Don't use CGI. Please. It's not worth it. Back in the 1990s when nobody knew better it seemed like a good idea, but that was when scripts were infrequent, used for special cases like handling form submissions, not driving entire sites.
FastCGI is an attempt at a "better CGI" but it's still deficient in a large number of ways, especially because you have to manage your FastCGI worker processes.
Rack is a much better system, and it works very well. If you use Rack, you have a wide variety of hosting systems to choose from, even Passenger which is really simple and reliable.
I don't know what you mean when you say Rack creates "7 objects at once", unless you mean there are 7 different Rack processes running somehow or you've made a mistake in your implementation.
I can't think of a single instance where CGI would be better than a Rack equivalent.
There's a lot of confusion about what CGI, Rack, etc. really are. As I describe here, Rack is an API and FastCGI is a protocol. CGI is also a protocol, but in its narrow sense also an implementation, and for what you're speaking of, it is not at all the same thing as FastCGI. So let's start with the background.
Back in the early 90s, web servers simply read files (HTML, images, whatever) off the disk and sent them to the client. People started to want to do some processing at the time of the request, and the early solution that came out was to run a program that would produce the result sent back to the client, rather than just reading the file. The "protocol" for this was for the web server to be given a URL that it was configured to execute as a program (e.g., /cgi-bin/my-script), where the web server would then set up a set of environment variables with various information about the request and run the program with the body of the request on the standard input. This was referred to as the "Common Gateway Interface."
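To make that interface concrete, here is what a bare CGI program can look like. The sketch is in Go, but any language works, which is rather the point; the environment variables are the standard CGI ones:

package main

import (
	"fmt"
	"io"
	"os"
)

// A bare CGI program: the web server has placed request metadata in
// environment variables and the request body on stdin; everything we
// write to stdout (headers, blank line, body) becomes the response.
func main() {
	body, _ := io.ReadAll(os.Stdin) // request body, if any
	fmt.Print("Content-Type: text/plain\r\n\r\n")
	fmt.Printf("method=%s path=%s query=%s body-bytes=%d\n",
		os.Getenv("REQUEST_METHOD"),
		os.Getenv("PATH_INFO"),
		os.Getenv("QUERY_STRING"),
		len(body))
}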
Given that this forks off a new process for every request, it's clearly inefficient, and you almost certainly don't want to use this style of dynamic request handling on high-volume web sites. (Starting a whole new process is relatively expensive in computational resources.)
One solution to making this more efficient is, rather than starting a new process, to send the request information to an existing process that's already running. This is what FastCGI is all about; it maintains a very similar interface to CGI (you have a set of variables with most of the request information, and a stream of data for the body of the request). But instead of setting actual Unix environment variables and starting a new process with the body on stdin, it sends a request, similar to an HTTP request, to an FCGI server already running on the machine, in which it specifies the values of these variables and the request body contents.
If the web server can have the program code embedded in it somehow, this becomes even more efficient because it just runs the code itself. Two classic examples of how you might do this would be:
Have PHP embedded in Apache, so that the "Apache server code" just calls the "PHP server code" that's part of the same process; and
Not run Apache at all, but have the web server be written in Ruby (or Python, or whatever) and load and run more Ruby code that's been custom-written to handle the request.
So where does Rack come into this? Rack is an API that lets code that handles web requests receive them in a common way, regardless of the web server. So, given some Ruby code to process a request that uses the Rack API, the web server might:
Be a Ruby web server that simply makes function calls in its own process to the Rack-compliant code that it loaded;
Be a web server (written in any language) that uses the FastCGI protocol to talk to another process with FastCGI server code that, again, makes function calls to the Rack-compliant code that handles the request; or
Be a server that starts a brand new process that interprets the CGI environment variables and standard input passed to it and then calls the Rack-compliant code.
So whether you're using CGI, FastCGI, another inter-process protocol, or an intra-process protocol makes no difference; you can do any of those using Rack, so long as the server either knows about Rack itself or is talking to a process that can understand CGI, FastCGI or whatever and call Rack-compliant code based on that request.
So:
For performance scaling, you definitely don't want to be using CGI; you want to be using FastCGI, a similar protocol (such as the Tomcat one), or direct in-process calling of the code.
If you use the Rack API, you don't need to worry at the early stages about which protocol you're using between your web server and your program, because the whole point of APIs like Rack is that you can change it later.

CherryPy, SQLAlchemy Core thread safety?

In my web-based app, I decided to use CherryPy 3.2 as the HTTP framework.
I am using the cherrypy.Application class to create a WSGI-compatible application object, which is served via Apache2 with mod_wsgi.
Also, I am using just the Core components of SQLAlchemy 0.7.3 (not the ORM). Some tools are available for CherryPy for correct session binding per request (like SATools), but SQLAlchemy's Session object is part of the ORM, not the Core.
So I have started thinking about how to make a similar tool, but without the Session.
The SQLAlchemy documentation says:
For a multiple-process application that uses the os.fork system call, or for example the Python multiprocessing module, it's usually required that a separate Engine be used for each child process.
So how do I correctly create one Engine per CherryPy thread, taking into account that the threads are (probably) created by Apache2?
Thank you!!
Edit: it may be important that the WSGI application is run in daemon mode by Apache2.
Under mod_wsgi, I don't think this is an issue, if I understand the question correctly, because the application isn't preloaded into memory prior to the fork in mod_wsgi. Instead, the application is loaded separately into each distinct process, so there are no issues with shared state inherited across a fork.
