multiple processes writing to a single log file - windows

This is intended to be a lightweight, generic solution, although the problem is currently with an IIS CGI application that needs to log the timeline of events (second resolution) for troubleshooting a situation where a later request ends up in the MySQL database BEFORE the earlier request!
So it boils down to logging debug statements to a single text file.
I could write a service that manages a queue as suggested in this thread:
Issue writing to single file in Web service in .NET
but deploying the service on each machine is a pain
or I could use a global mutex, but this would require each instance to open and close the file for each write
or I could use a database, which would handle this for me, but it doesn't make sense to use a database like MySQL to troubleshoot a timeline issue with itself. SQLite is another possibility, but this thread
http://www.perlmonks.org/?node_id=672403
suggests that it is not a good choice either.
I am really looking for a simple approach, something as blunt as writing to individual files for each process and consolidating them occasionally with a scheduled app. I do not want to over-engineer this, nor spend a week implementing it. It is only needed occasionally.
Suggestions?

Try the simplest solution first: each write to the log opens and closes the file. If you experience problems with this, which you probably won't, look for another solution.
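A minimal sketch of the open-write-close approach, in Python for brevity (the question does not say what language the CGI application is written in). The function name `log_line`, the file name `debug.log`, and the line format are assumptions for illustration. Writing each entry in a single call in append mode usually keeps entries whole, but it does not by itself guarantee that concurrent writers never interleave.

```python
import os
import time


def log_line(path, message):
    """Append one timestamped line, opening and closing the file per write."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    line = "%s [pid %d] %s\n" % (stamp, os.getpid(), message)
    # Append mode positions the write at the end of the file; the file is
    # closed as soon as the with-block exits, so no handle is held open.
    with open(path, "a", encoding="utf-8") as f:
        f.write(line)


if __name__ == "__main__":
    log_line("debug.log", "request received")
```

Including the process id in each line makes it easier to tell afterwards which instance wrote which entry.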

You can use file locking. Lock the file for writing, write the message, unlock.

My suggestion, to preserve performance, is to consider asynchronous logging. Why not send your log data over UDP to a service listening on a port, and have it write to the log file?

I would also suggest some kind of central logger that can be called by each process in an asynchronous way. Whether the communication is UDP, RPC, or something else is an implementation detail.
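A rough sketch of the central-logger idea over UDP, written in Python. The function names and the port 9999 are hypothetical. Because a single process owns the log file, the writers never contend for it; the trade-off is that UDP datagrams can be dropped or arrive out of order, so this suits debugging aids more than audit trails.

```python
import socket


def send_log(message, address):
    """Fire-and-forget: send one log message as a single UDP datagram."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.sendto(message.encode("utf-8"), address)


def handle_one(sock, log_path):
    """Receive one datagram and append it to the log file as one line."""
    data, _sender = sock.recvfrom(65535)
    text = data.decode("utf-8", errors="replace")
    with open(log_path, "a", encoding="utf-8") as f:
        f.write(text.rstrip("\n") + "\n")
    return text


def serve(log_path, host="127.0.0.1", port=9999):
    """Run the logger loop; port 9999 is an arbitrary, assumed choice."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind((host, port))
        while True:
            handle_one(sock, log_path)
```

Each client process would then call something like `send_log("request 42 started", ("127.0.0.1", 9999))` instead of touching the file itself.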

Even though it's an old post, does anyone have an idea why not to use the following concept:
Creating/opening a file with share mode of FILE_SHARE_WRITE.
Having a named global mutex, and opening it.
Whenever a file write is desired, lock the mutex first, then write to the file.
Any input?

Related

Windows Named Pipes connections

I have searched high and low for this answer. Can one code up a Named Pipes server where the connection that a client makes is persistent until you close the applications? This would be in C/C++. I'm not asking anyone to actually do this, as I am capable. To explain in a little more detail, in case my question is not clear: I want the client to connect to the server and then pass data back and forth without having to kill the connection at the end of each data transaction and start a new one for the next. It seems that in every example I have seen or read, the connection only lasted for that one data exchange. That seems wasteful and extremely time consuming. Then I want to thread it so I can have up to 8 clients on the same named pipe. If you know of example code that does this, that would be great also. I have already read the Microsoft examples, and they seem to be single data exchanges with new connections every time.
My confusion lies with the ReadFile() and WriteFile() functions. They need the pipe handle and pointers to the data structures, just like a file read/write on the hard drive. Those files can be opened at program start, used, and then finally closed just before you exit your application. There are risks to doing this, but it is sometimes necessary. So I want my server application to be in control, not the clients.
Thanks in advance. I will answer any questions if this is not clear to you.
I am not surprised at your answers. I was really wanting a way to keep the connection open per instance, but since this is not how pipes work, I get it. I have devised a better way to make my applications talk to each other. I originally had one server and many clients, so I turned that around and now have one client and many servers, each with a different pipe name. Since my client, which used to be the server, did most of the initiating of the messages, I can now better manage the way I send and request data via messages/pipes. The only drawback is not giving the data to all of them at once, but what is a few microseconds among friends? Please let me know if this will not work as I expect it will, before I spend a lot of time on the code. Thanks. Any suggestions are welcome.

how to monitor operation on mac platform

I am trying to capture file open/write/create operations. I have tried fslogger, which reports file creation, deletion, and some other operations, but not open/close operations.
Then I wrote a driver to do it. I can get open/close operations but not create operations, and what's more, it's too messy!
For example, if I open a file, modify it, and then close it, the driver sees a lot of open/write operations, and I have no way to tell which ones were really caused by the user's open/close.
Any hints about this?
Thanks.
Your best bet is going to be the KAuth system. You install your kauth handler (as a kernel extension) and get various callback codes when someone tries to create, open or close a file. This involves getting your callback in the critical path of opening files, so whatever you do has to be quick!
To quote:
KAUTH_SCOPE_FILEOP defines the following actions.
KAUTH_FILEOP_OPEN
KAUTH_FILEOP_CLOSE
KAUTH_FILEOP_CLOSE_MODIFIED
KAUTH_FILEOP_RENAME
KAUTH_FILEOP_EXCHANGE
KAUTH_FILEOP_LINK
KAUTH_FILEOP_EXEC
https://developer.apple.com/library/mac/technotes/tn2127/_index.html
If you're writing a kext you then have the question of how to get that info back into userland. FWIW I used Kqueue but you may have success with another method (let me know in the comments if you do!).
More info on Kauth here and KQueue here. It's not brilliantly documented, but there's enough info between those two to work out what you need to do.

Best practice when using a Rails app to overwrite a file that the app relies on

I have a Rails app that reads from a .yml file each time that it performs a search. (This is a full text search app.) The .yml file tells the app which URL it should be making search requests to, because different versions of the search index reside on different servers, and I occasionally switch between indexes.
I have an admin section of the app that allows me to rewrite the aforementioned .yml file so that I can add new search urls or remove unneeded ones. While I could manually edit the file on the server, I would prefer to be able to also edit it in my site admin section so that when I don't have access to the server, I can still make any necessary changes.
What is the best practice for making edits to a file that is actually used by my app? (I guess this could also apply to, say, an app that had the ability to rewrite one of its own helper files, post-deployment.)
Is it a problem that I could be in the process of rewriting this file while another user connecting to my site wants to perform a search? Could I make their search fail if I'm in the middle of a write operation? Should I initially write my new .yml file to a temp file and only later replace the original .yml file? I know that a write operation is pretty fast, but I just wanted to see what others thought.
UPDATE: Thanks for the replies everyone! Although I see that I'd be better off using some sort of caching rather than reading the file on each request, it helped to find out what the best way to actually do the file rewrite is, given that I'm specifically looking to re-read it each time in this specific case.
If you must use a file for this then the safe process looks like this:
Write the new content to a temporary file of some sort.
Use File.rename to atomically replace the old file with the new one.
If you don't use separate files, you can easily end up with a half-written broken file when the inevitable problems occur. The File.rename class method is just a wrapper for the rename(2) system call and that's guaranteed to be atomic (i.e. it either fully succeeds or fully fails, it won't leave you in an inconsistent in-between state).
If you want to replace /some/path/f.yml then you'd do something like this:
begin
  # Write your new stuff to /some/path/f.yml.tmp here
  File.rename('/some/path/f.yml.tmp', '/some/path/f.yml')
rescue SystemCallError => e
  # Log an error, complain loudly, fall over and cry, ...
end
As others have said, a file really isn't the best way to deal with this and if you have multiple servers, using a file will fail when the servers become out of sync. You'd be better off using a database that several servers can access, then you could:
Cache the value in each web server process.
Blindly refresh it every 10 minutes (or whatever works).
Refresh the cached value if connecting to the remote server fails (with extra error checking to avoid refresh/connect/fail loops).
Firstly, let me say that reading that file on every request is a performance killer. Don't do it! If you really really need to keep that data in a .yml file, then you need to cache it and reload only after it changes (based on the file's timestamp.)
But don't check the timestamp on every request - that's almost as bad. Check it on a request only if it's been n minutes since the last check, probably in a before_filter somewhere. And if you're running in threaded mode (most people aren't), be careful to guard it with a Mutex or something similar.
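The cache-and-recheck idea can be sketched as follows. This is in Python rather than Ruby, purely as an illustration of the logic; the class name `CachedFile` and the `check_interval` parameter are assumptions, and a Rails app would do the equivalent with File.mtime. The sketch is not thread-safe, so a threaded server would need a lock around `get`.

```python
import os
import time


class CachedFile:
    """Cache a file's contents; re-read only when its mtime has changed."""

    def __init__(self, path, check_interval=300.0):
        self.path = path
        self.check_interval = check_interval  # seconds between mtime checks
        self._mtime = None
        self._content = None
        self._last_check = None

    def get(self, now=None):
        # `now` is injectable so the throttling can be exercised in tests.
        now = time.monotonic() if now is None else now
        due = (self._last_check is None
               or now - self._last_check >= self.check_interval)
        if due:
            self._last_check = now
            mtime = os.stat(self.path).st_mtime_ns
            if mtime != self._mtime:
                # The file changed since the last read: reload it.
                with open(self.path, encoding="utf-8") as f:
                    self._content = f.read()
                self._mtime = mtime
        return self._content
```

Between checks, `get` returns the cached contents without touching the filesystem at all, which is the point of not checking the timestamp on every request.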
If you really want to do this via overwriting files, use the filesystem's locking features to block other threads from accessing your configuration file while it's being written. Maybe check out something like this.
I'd strongly recommend not using files for configuration that needs to be changed without re-deploying the app though. First, you're now requiring that a file be read every time someone does a search. Second, for security reasons it's generally a bad idea to allow your web application write access to its own code. I would store these search index URLs in the database or a memcached key.
edit: As @bioneuralnet points out, it's important to decide whether you need real-time configuration updates or just eventual syncing.

Creating proxy between application queries and Internet

Is it possible (for example with C++, but it does not really matter) to create a bridge/proxy application to get the data requested by another application? To be more detailed, I'm talking about an Adobe AIR based game. (I want to create a report with stats based on the data acquired, but that is not actually part of this question.)
Rather than a simple "boolean" answer, please provide a link to an example or documentation. Thanks
It would always be possible and, depending on your target operating system, may require a fair amount of effort, which raises the question: is there a reason you cannot use Fiddler or some packet sniffing software for your target OS?
You can write a proxy by hand; in Python it can be quite easy. All you have to do is set localhost as the proxy, then forward each request and pass the response back to the calling socket.
I started writing something like this some time ago. The idea was to write a simple replacement for dansguardian.
I've uploaded it to GitHub so you can take a look, if it helps.
I don't remember it well (I started writing it last year), but with some modifications it might fit your needs.
Conceptually, this is your configuration:
app_client -> [app_channel] -> proxy -> [server_channel] -> app_server
Your proxy starts a server socket, and the app_client connects to it. This is your app_channel. Now your proxy creates a connection to the app_server. This is your server_channel.
Now start 2 threads, one which reads from the app_channel and writes to the server_channel, the other reads from the server_channel and writes to the app_channel.
This will create a transparent connection to the app_server via your proxy. You can extract the data as you wish. If the data is encrypted though, there's very little you can actually do by way of analysis.
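The two-thread relay described above might look like this in Python. It is a sketch under assumptions: plain TCP, no TLS, and the function names and the `on_request`/`on_response` hooks are invented for illustration. The hooks are where you would inspect or record the traffic.

```python
import socket
import threading


def pump(src, dst, on_data=None):
    """Copy bytes from src to dst until src closes; optionally inspect them."""
    try:
        while True:
            data = src.recv(4096)
            if not data:
                break
            if on_data is not None:
                on_data(data)
            dst.sendall(data)
    except OSError:
        pass  # either side went away; fall through to shutdown
    finally:
        try:
            dst.shutdown(socket.SHUT_WR)  # propagate end-of-stream
        except OSError:
            pass


def relay(client, server_addr, on_request=None, on_response=None):
    """Bridge one accepted client socket (app_channel) to the real server."""
    server = socket.create_connection(server_addr)  # server_channel
    t1 = threading.Thread(target=pump, args=(client, server, on_request))
    t2 = threading.Thread(target=pump, args=(server, client, on_response))
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    client.close()
    server.close()


def serve(listen_addr, server_addr):
    """Accept app_client connections and relay each in its own thread."""
    with socket.create_server(listen_addr) as listener:
        while True:
            client, _ = listener.accept()
            threading.Thread(target=relay, args=(client, server_addr),
                             daemon=True).start()
```

The application still has to be pointed at the proxy's listening address instead of the real server, for example through its proxy settings or a hosts-file entry.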

Looking for pattern/approach/suggestions for handling long-running operation tied to web app

I'm working on a consumer web app that needs to do a long running background process that is tied to each customer request. By long running, I mean anywhere between 1 and 3 minutes.
Here is an example flow. The object/widget doesn't really matter.
1. Customer comes to the site and specifies object/widget they are looking for.
2. We search/clean/filter for widgets matching some initial criteria. <-- long running process
3. Customer further configures more detail about the widget they are looking for.
4. When the long running process is complete the customer is able to complete the last few steps before conversion.
Steps 3 and 4 aren't really important. I just mention them because we can buy some time while we are doing the long running process.
The environment we are working in is a LAMP stack-- currently using PHP. It doesn't seem like a good design to have the long running process take up an apache thread in mod_php (or fastcgi process). The apache layer of our app should be focused on serving up content and not data processing IMO.
A few questions:
Is our thinking right in that we should separate this "long running" part out of the apache/web app layer?
Is there a standard/typical way to break this out under Linux/Apache/MySQL/PHP (we're open to using a different language for the processing if appropriate)?
Any suggestions on how to go about breaking it out? E.g. do we create a daemon that churns through a FIFO queue?
Edit: Just to clarify, only about 1/4 of the long running process is database centric. We're working on optimizing that part. There is some work that we could potentially do, but we are limited in the amount we can do right now.
Thanks!
Consider providing the search results via AJAX from a web service instead of your application. Presumably you could offload this to another server and let your web application deal with the content as you desire.
Just curious: 1-3 minutes seems like a long time for a lookup query. Have you looked at indexes on the columns you are querying to improve the speed? Or do you need to do some algorithmic process -- perhaps you could perform some of this offline and prepopulate some common searches with hints?
As Jonnii suggested, you can start a child process to carry out background processing. However, this needs to be done with some care:
Make sure that any parameters passed through are escaped correctly
Ensure that more than one copy of the process does not run at once
If several copies of the process can run, there's nothing stopping a (not even malicious, just impatient) user from hitting reload on the page which kicks it off, eventually starting so many copies that the machine runs out of RAM and grinds to a halt.
So you can use a subprocess, but do it carefully, in a controlled manner, and test it properly.
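One common way to keep a second copy from starting is a lock file created with exclusive-create semantics. The sketch below is in Python and the function names are hypothetical; the same idea is available in PHP through fopen with mode 'x'. It does not handle a stale lock left behind by a crashed process, which a real deployment would need to deal with.

```python
import os


def acquire_lock(path):
    """Return True if we created the lock file, False if it already exists."""
    try:
        # O_EXCL makes creation fail if the file exists, so only one caller wins.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as f:
        f.write(str(os.getpid()))  # record the owner for later inspection
    return True


def release_lock(path):
    """Remove the lock file once the long-running work has finished."""
    os.remove(path)
```

Putting a customer or request id in the lock file name would allow one job per customer rather than one job overall.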
Another option is to have a daemon permanently running waiting for requests, which processes them and then records the results somewhere (perhaps in a database)
This is the poor man's solution:
exec ("/usr/bin/php long_running_process.php > /dev/null &");
Alternatively you could:
Insert a row into your database with details of the background request, which a daemon can then read and process.
Write a message to a message queue, which a daemon then reads and processes.
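The database-row variant can be sketched as below, in Python with SQLite standing in for MySQL so the example is self-contained. The `jobs` table, its columns, and the function names are assumptions. The sketch assumes a single daemon; with several workers, the select-then-update step would need an atomic claim.

```python
import sqlite3


def init(conn):
    """Create the (hypothetical) jobs table if it does not exist yet."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " payload TEXT NOT NULL,"
        " status TEXT NOT NULL DEFAULT 'pending',"
        " result TEXT)"
    )
    conn.commit()


def enqueue(conn, payload):
    """Called by the web request: record the job and return its id at once."""
    cur = conn.execute("INSERT INTO jobs (payload) VALUES (?)", (payload,))
    conn.commit()
    return cur.lastrowid


def process_next(conn, work):
    """Called in a loop by the daemon: run the oldest pending job, if any."""
    row = conn.execute(
        "SELECT id, payload FROM jobs WHERE status = 'pending'"
        " ORDER BY id LIMIT 1"
    ).fetchone()
    if row is None:
        return None
    job_id, payload = row
    result = work(payload)
    conn.execute(
        "UPDATE jobs SET status = 'done', result = ? WHERE id = ?",
        (result, job_id),
    )
    conn.commit()
    return job_id
```

The web page can then poll the row's status by id (for instance via AJAX) while the customer completes the intermediate steps.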
Here's some discussion on the Java version of this problem.
See java: what are the best techniques for communicating with a batch server
Two important things you might do:
Switch to Java and use JMS.
Read up on JMS but use another queue manager. Unix named pipes, for instance, might be an acceptable implementation.
Java servlets can do background processing. You could do something similar to this technology in a web technology with threading support. I don't know about PHP though.
Not a complete answer, but I would think of using AJAX and passing the 2nd step to something that's faster than PHP (C, C++, C#), then having a PHP function pick the results off of some stack, most likely just a database.
