Logging vs. Debugging

Background: I've inherited a web application that is intended to create on-the-fly connections between local and remote equipment. There are a tremendous number of moving parts right now: the app itself has changed significantly, the development toolchain was just updated, and both the local and remote equipment have been "modified" to support those changes.
The bright side is that it has a reasonable logging system: it writes debug messages to a file, and it can log to both that file and a real-time user screen. I have an opportunity to re-work the entire log/debug mechanism.
Examples:
All messages are time-stamped and prefixed with a severity level.
Logs are for the customer. They record the system's responses to his/her requests.
Any log that identifies a problem also suggests a solution.
Debugs are for developers and Tech Support. They reveal the system internals.
Debugs indicate the function and/or line that generated them.
The customer can adjust the debug level on the fly to set the verbosity.
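For instance, a single entry following those rules might look like this (a made-up line, purely to illustrate the format):

2014-06-12 14:02:33 ERROR [conn_mgr:open_remote():214] Remote handshake timed out; check that the remote device is reachable or increase handshake_timeout.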
Question: What best practices have you used as a developer, or seen as a consumer, that generate useful logs and debugs?
Edit: Many helpful suggestions so far, thanks! To clarify: I'm more interested in what to log (content, format, etc.) and the reasons for doing so, rather than in specific tools.
What was it about the best logs you've seen that made them most helpful?
Thanks for your help!

Don't confuse logging, tracing, and error reporting. Some people I know do, and it creates one hell of a log file to grep through in order to get the information you want.
If I want to have everything churned out, I separate it into the following:
Tracing -> dumps every action and step, timestamped, with the input and output data of that stage (the ugliest and largest file).
Logging -> logs the business process steps only; the client does an enquiry, so log the enquiry criteria and the output data, nothing more.
Error reporting / debugging -> exceptions logged detailing where they occurred, timestamped, with input/output data if possible, user information, etc.
That way, if any errors occur and the error/debug log doesn't contain enough information for my liking, I can always do a grep -A 50 -B 50 'timestamp' tracing_file to get more detail.
EDIT:
As has also been said, sticking to standard packages, like the built-in logging module for Python, is always a good idea. Rolling your own is not a great idea unless the language does not have one in its standard library. I do like wrapping the logging in a small function that takes the message and a value determining which logs it goes to, i.e. 1 = tracing, 2 = logging, 4 = debugging, so sending a value of 7 drops the message into all three (a sketch follows).
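A minimal sketch of that wrapper idea, in Java for illustration (the names, levels, and routing here are my assumptions, not any framework's API):

import java.util.logging.Logger;

public final class LogRouter {
    public static final int TRACING = 1, LOGGING = 2, DEBUGGING = 4;

    private static final Logger TRACE_LOG = Logger.getLogger("tracing");
    private static final Logger BIZ_LOG = Logger.getLogger("logging");
    private static final Logger DEBUG_LOG = Logger.getLogger("debugging");

    // Route one message to any combination of the three streams.
    public static void write(String message, int targets) {
        if ((targets & TRACING) != 0) TRACE_LOG.info(message);
        if ((targets & LOGGING) != 0) BIZ_LOG.info(message);
        if ((targets & DEBUGGING) != 0) DEBUG_LOG.info(message);
    }
}

// LogRouter.write("enquiry received", LogRouter.LOGGING);
// LogRouter.write("it all went wrong", 7); // 1 + 2 + 4 = all three streams

Attaching a separate FileHandler to each of the three loggers keeps the streams in separate files, which is what makes the timestamp-grep trick above work.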

The absolutely most valuable thing done with any logging framework is a "1-click" tool that gathers all logs and mails them to me, even when the application is deployed on a machine belonging to a customer.
And make good choices about what to log, so you can roughly follow the main paths through your application.
As frameworks, I've used the standards (log4net, log4java, log4c++).
Do NOT implement your own logging framework when there is already a good one out of the box. Most people who do just reinvent the wheel.

Some people never use a debugger but log everything. Those are different philosophies; you have to make your own choice, and there is plenty of advice out there along both lines. Note that this advice is not language-specific...
The Coding Horror guy has an interesting post about the logging problem and why excessive logging can be a waste of time in certain conditions.
I simply believe logging is for tracing things that can remain in production, while debugging is for development. Maybe that's too simple a way of seeing things, because some people use logs for debugging since they can't stand debuggers. But a debugging session can be a waste of time too: you shouldn't use it as a sort of test case, because nothing is written down and it will disappear after the session ends.
So I think my opinion about this is:
logging for necessary and useful traces through development and production environments, with development and production levels, using a log framework (the log4 family of tools)
a debugger for special, strange cases when things are going out of control
test cases, which are important and can save time otherwise spent in infernal, labyrinthine debugging sessions, used as an anti-regression method. Note that most people don't write test cases.
Coding Horror said to resist the tendency to log everything. That's right, but I've already seen a huge app that does the exact opposite in an elegant way (and through a database)...

I would just set up your logging system to have multiple logging levels. On the services I write I have a logging/audit entry for almost every action, and each is assigned an audit level of 1-5; the higher the number, the more audit events you get.
Very basic logging: starting, stopping, and restarting
Basic logging: processing x number of files, etc.
Standard logging: beginning processing, finished processing, etc.
Advanced logging: the beginning and end of every stage in processing
Everything: every action taken
You set the audit level in a config file so it can be changed on the fly (a sketch of the idea follows).
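A minimal sketch of that idea, in Java for illustration (the audit.level property name and the setLevel hook are my assumptions; a real service would re-read the config file to change the level on the fly):

import java.util.logging.Logger;

public final class AuditLog {
    private static final Logger LOG = Logger.getLogger("audit");

    // 1 = start/stop only ... 5 = every action taken.
    private static volatile int auditLevel = Integer.getInteger("audit.level", 3);

    public static void setLevel(int level) { auditLevel = level; } // on-the-fly change

    public static void audit(int level, String message) {
        if (level <= auditLevel) {
            LOG.info("[audit-" + level + "] " + message);
        }
    }
}

// AuditLog.audit(1, "service starting");
// AuditLog.audit(4, "beginning stage 2 of processing"); // dropped unless level >= 4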

Some general rules-of-thumb I have found to be useful in server-side applications:
requestID - assign a request ID to each incoming (HTTP) request and then log it on every log line, so you can easily grep those logs later by that ID and find all the relevant lines. If you think it would be very tedious to add that ID to every log statement, then note that Java logging frameworks, at least, have made it transparent with the Mapped Diagnostic Context (MDC); see the sketch after this list.
objectID - if your application/service deals with manipulating business objects that have a primary key, it is useful to attach that primary key to the diagnostic context as well. Later, if someone comes along with the question "when was this object manipulated?", you can easily grep by the objectID and see all the log records related to that object. In this context it is (sometimes) useful to use a Nested Diagnostic Context (NDC) instead of the MDC.
when to log? - at a minimum, you should log whenever you cross an important service/component boundary. That way you can later reconstruct the call flow and drill down into the particular codebase that seems to be causing the error.
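A minimal sketch of the requestID idea with SLF4J's MDC (class and key names are mine; with logback, the pattern token %X{requestId} then prints the ID on every line logged by that thread):

import java.util.UUID;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.slf4j.MDC;

public class RequestHandler {
    private static final Logger LOG = LoggerFactory.getLogger(RequestHandler.class);

    public void handle() {
        MDC.put("requestId", UUID.randomUUID().toString());
        try {
            LOG.info("request received");
            // ... every log line on this thread now carries the same requestId ...
            LOG.info("request completed");
        } finally {
            MDC.remove("requestId"); // don't leak the ID into pooled threads
        }
    }
}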
As I'm a Java developer, I will also give my experience with Java APIs and frameworks.
API
I'd recommend to use Simple Logging Facade for Java (SLF4J) - in my experience, it is the best facade to logging:
full-featured: it has not followed the least-common-denominator approach (like commons-logging); instead, it uses a graceful-degradation approach.
has adapters for practically all popular Java logging frameworks (e.g. log4j)
has solutions available on how to redirect all legacy logging APIs (log4j, commons-logging) to SLF4J
Implementation
The best implementation to use with SLF4J is Logback, written by the same guy who also created the SLF4J API.

Use an existing logging format, such as that used by Apache, and you can then piggyback on the many tools available for analysing the format.
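For example, a single line in Apache's Common Log Format, which a whole ecosystem of analysers already understands:

127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326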

Related

How do I catch and log ALL exceptions/errors globally for a WebAPI 2 application?

I have a web API going into a live environment soon, and I want to log all exceptions/errors globally.
I found multiple solutions online, some newer than others, like using Elmah or doing a custom version using "global error handling".
What is a reliable, painless solution to set up for this?
There are many different solutions depending on what you are looking for.
There are a few tools like Sentry that focus on exception tracking (basically Crashlytics, but for the web world). They focus a lot less on logs and mostly on uncaught (or sometimes caught) exceptions. The pro is that you don't have to code up many log messages, and their tools can gather other context like environment variables or SDK versions.
There is also a group of tools that focus on logging. Unlike the first group, there is no SDK or agent; rather, they hook directly into the stdout/syslog outputs. Such companies include Loggly and Logentries. These tools are designed more for searching millions of lines of logs.
There is also us (Moesif). We focus not on exception tracking but on API errors and analytics, capturing the context around API calls and the JSON payloads that go with them. While I work at Moesif, any of these options can be good and usually integrates pretty quickly. Each has a 30-day trial, so sometimes you can try a few and see what fits your workflow best.

What is best for logging

Since Umbraco v6 decided to implement logging to a text file by default, I would like to ask you guys what kind of logging you use.
Do you log to a text file on a production website, or do you log to a database table? Or do you implement any other kind of logging?
And what are the performance implications of this?
I do both types of logging, file as well as DB, in the production environment, as I need to audit logs and therefore need to have everything current and saved.
I use NLog (http://nlog-project.org/).
It's robust and fast; I have been using it in a production environment since last year.
It's good and gives you logging at various levels.
I would recommend you use NLog.
At one time I investigated the question of the best frameworks for logging and settled on NLog. I have used it on different projects and it has always shown good results.
With NLog you can send your logs to different targets: file, database, event log, console, email, NLog viewer, and so forth.
You can set up all the configuration in config files, which is very useful: you can easily control how and where you want your logs written.
Wrapper targets are also at your disposal (see details in the documentation). In my opinion the most useful is AsyncWrapper, which provides asynchronous, buffered execution of target writes; it will give you good performance.
There are also a lot of other cool features.

asp.net mvc logging framework

We need a logging framework that would allow us to do the following
1. Log all unhandled exceptions
2. Log the pages that a user visits and be able to reproduce their history (so we can try to optimise the site for user usage patterns)
3. Log any custom messages that we might want to output in various places in the code, i.e. warning/information
4. Log ASP.NET membership success/failure
5. Log any other system-related events
I've seen that ELMAH will do #1, the ASP.NET health monitoring seems like it will handle #4 and #5 and possibly #2 (not 100% sure on that one though), and something like either NLog or log4net will take care of #3.
I did find a tutorial on implementing all of these, but I'm wondering if it's necessary to have all of those libraries just for this type of logging. It seems a bit much, and then you have the overhead of trying to combine them into a standard type of view, if that's how you are going to display them.
Is there a simpler way than this ?
Personally I like having different systems logging different types of data. Each of these is, as I see it, a different domain of logging roles. An analyst shouldn't care about the #1 exceptions, those are for developers, and developers shouldn't care about, say, #2 (that's Google Analytics or Adobe SiteCatalyst territory). I feel that allowing the different systems to log as needed and distribute the logging information differently is appropriate, although as you stated it is a bit much.
This question should probably have been asked on Programmers.StackExchange.Com.

Performance testing application for bottle necks using production data

I have been tasked with looking for a performance testing solution for one of our Java applications running on a Weblogic server. The requirement is to record production requests (both GET and POST including POST data) and then run these requests in a performance test environment with a copy of the production database.
The reasons for using production requests instead of a test script are:
It is a large application with no existing test scripts, so it would be a large amount of work to write scripts covering the entire application.
Some performance issues only appear when users do a number of actions in a particular order.
To test using actual user interaction with the system, not an estimate of how users may interact with the system. We all know that users will do things we have not thought of.
I want to be able to fix performance issues and rerun the requests against the fixed code before releasing to production.
I have looked at using JMeter's Access Log Sampler with the server access logs; however, the access logs do not contain POST data, and the Access Log Sampler only looks at the request URL, so it cannot simulate users submitting form data.
I have also looked at using the JMeter HTTP Proxy Server; however, this can record the actions of only one user and requires the user to configure their browser to use the proxy. The same limitation exists with Tsung and The Grinder.
I have looked at using Wireshark and TCReplay, but recording at the packet level is excessive and will not give any useful reports at the request level.
Is there a better way to analyze production performance considering I need to be able to test fixes before releasing to production?
That is going to be a hard ask. I work with Visual Studio Test Edition to load test my applications, and we are only able to "estimate" the users' activity on the site.
It is possible to look at the logs and gather information on the likelihood of certain paths through your app. You can then look at the production database for the likely values entered in any POST requests. From that you will have to build load tests that approximate the usage patterns of your production site.
With any current tools I don't think it is possible to record and play back actual user interaction.
It is possible to alter your web app so that it records and logs every request and POST against session and datetime (a sketch follows at the end of this answer). This custom logging could then be used to generate load-test requests against a test website. This would be a serious code change to your existing site and would likely have performance impacts.
That said, I have worked with web apps that do this level of logging, and the ability to analyse the exact series of page posts/requests that caused an error is quite valuable to a developer.
So in summary: it is possible, but I have not heard of any off-the-shelf tools that do it.
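For what it's worth, a bare-bones sketch of that record-it-yourself idea as a servlet filter (Java; all names are mine, and note that logging a POST body for real requires wrapping the request and buffering its input stream, which this sketch deliberately skips):

import java.io.IOException;
import java.util.logging.Logger;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletRequest;

public class RequestRecorderFilter implements Filter {
    private static final Logger LOG = Logger.getLogger("request-recorder");

    @Override public void init(FilterConfig config) {}
    @Override public void destroy() {}

    @Override
    public void doFilter(ServletRequest req, ServletResponse res, FilterChain chain)
            throws IOException, ServletException {
        HttpServletRequest http = (HttpServletRequest) req;
        // One line per request: timestamp, session, method, URI, query string.
        LOG.info(String.format("%d %s %s %s?%s",
                System.currentTimeMillis(),
                http.getSession(true).getId(),
                http.getMethod(),
                http.getRequestURI(),
                http.getQueryString()));
        chain.doFilter(req, res);
    }
}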
Please check out this whitepaper by Impetus Technologies: http://www.impetus.com/plabs/sandstorm.html
Honestly, I'm not sure the task you're being asked to do is even possible, let alone a good idea. Depending on how complex the application's backend is, and how perfectly you can recreate its state (i.e., all the way down to external SOA services or the time/clock), it may not be possible to make those GET and POST requests reproduce the same behavior.
That said, performance testing against production data is always great, but it usually requires application-specific knowledge to stress said data. Simply repeating HTTP GETs and POSTs will almost certainly not yield useful results.
Good luck!
I would suggest the following to capture the production requests and simulate an accurate workload:
1) Use Coremetrics: Coremetrics provides solutions with which you can learn the application's usage patterns. This helps in coming up with an accurate workload model. The model can then be converted into test scripts and executed against a masked copy of the production database. This will give you accurate results about the application's real-world performance.
2) Another option would be to create a small utility using AOP (aspect-oriented programming) that traces all the requests and the corresponding method calls. This helps in identifying the production usage pattern and, in turn, in accurately simulating the workload. AOP frameworks such as AspectJ can be used; this would not require any changes to the code, since the instrumentation can be done on the fly. Another benefit is that it can be enabled for a specific time window only and then turned off. (A sketch of such an aspect follows.)
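A rough sketch of such an aspect using AspectJ annotations (the package and pointcut are assumptions; the aspect just times and logs every matched method):

import org.aspectj.lang.ProceedingJoinPoint;
import org.aspectj.lang.annotation.Around;
import org.aspectj.lang.annotation.Aspect;

@Aspect
public class RequestTraceAspect {

    // Hypothetical pointcut: every public method in the app's web layer.
    @Around("execution(public * com.example.web..*(..))")
    public Object trace(ProceedingJoinPoint pjp) throws Throwable {
        long start = System.nanoTime();
        try {
            return pjp.proceed(); // run the intercepted method
        } finally {
            System.out.printf("%s took %d us%n",
                    pjp.getSignature().toShortString(),
                    (System.nanoTime() - start) / 1000);
        }
    }
}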
Regards,
batterywalam

Logging *Business* Events - use logging framework?

Something doesn't feel right to me here, so I would like the community's input; perhaps I am approaching this in the wrong way...
Q: Is it appropriate to use traditional infrastructure logging frameworks (like log4net) to log business events?
When I say business events, I mean I want a global log like this:
xx:xx Customer A purchased widget B.
xx:xx Widget B was dispatched from warehouse.
xx:xx Customer B payment declined.
Most traditional infrastructure logging frameworks have event levels something like this:
FATAL
ERROR
WARN
INFO
DEBUG
And of course these messages don't fit well into those levels. The best description would be INFO, but these are important events, and INFO carries very low importance.
I would still like this as a 'log' (e.g. I don't want to have to extract this from my business objects each time I want to see it)
Seems to me I have two options:
1) Use a framework like log4net and just define a special logger for this (and live with the fact that it doesn't feel right).
2) Provide a service for performing this that doesn't rely on traditional logging services.
I'm leaning towards 2. What has anyone else done in similar situations?
Thanks!
What you're wanting sounds like an auditing service, not a logging service. If I'm right, your goals are to keep track of these business events for historical and maybe even reporting purposes. You can use the details in the audit to, for lack of a better phrase, place blame for events that happen in the system.
I probably wouldn't use a logging system, like log4j, for this purpose. In our system, auditing is a first class citizen as a full service.
--
HTH,
Dusty
Leave the logger for things having to do with the program, not the business. It is just a tool to help the developers.
Write your own system to log business events. If it is a business requirement to have a record, you will want something you have control over and you will need to use the logger above to keep track of how it works.
Basically, #2 in your question.
To me the idea of a business event is that it plays a role in some future business processing, anything from actually triggering business actions to simply being available for analytics.
Hence, it has completely different QoS requirements and needs its own API.
Conceivably that initially maps down to logging, but in the future it could go to reliable messaging or a DB.
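A minimal sketch of what that dedicated API might look like (Java for illustration; the subject/verb/object shape and all names are my own, echoing the log lines in the question):

import java.time.Instant;

public interface BusinessEventLog {
    void record(String subject, String verb, String object);
}

// Simplest possible implementation; swap in JDBC or reliable messaging
// later without touching the call sites.
class ConsoleBusinessEventLog implements BusinessEventLog {
    @Override
    public void record(String subject, String verb, String object) {
        System.out.printf("%s %s %s %s%n", Instant.now(), subject, verb, object);
    }
}

// events.record("Customer A", "purchased", "widget B");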
These sound like the sorts of things that your customers might potentially want to query or report on from within your application - the obvious choice would be the database.
In particular, in this case I'd feel that traditional logging frameworks wouldn't be suitable, because for data that you might later want to access within your application, logging frameworks let you do things that don't really make sense. For example, you might be able to change where the logging is sent based on the app.config file (which is unhelpful if you then try to read it from a different location).
That said, if a logging framework already allows you to do exactly what you want, then there isn't any shame in just using it as your implementation and saving yourself the effort:
class TransactionLogger
{
    // Thin wrapper: the rest of the app depends only on this class, so the
    // underlying framework can be swapped out later.
    public void Log(Message message)
    {
        // MyLoggingFramework stands in for whatever framework you choose,
        // plus whatever extra context it takes.
        MyLoggingFramework.Log(message.ToString());
    }
}
In my experience, business events comprise a large or huge number of technical operations behind the scenes, with only certain business events being important to the business.
This creates problems when trying to use a generic logging methodology, so in general, in the systems I've worked on, both are used:
logging for the technical aspects, and business event logging for the business events.
The business event logging doesn't use the same technology as the technical logging; instead it logs to a custom-designed history/audit table (sometimes these are split, depending on the required detail), which is designed specifically for each application. (This keeps the auditors and users nice and happy.)
This allows easy reporting and management of the information, while obviously expanding the scope of each specification slightly.
You could use a logging framework for it, but what you really need is business activity monitoring and event processing software. Off the top of my head, IBM WebSphere Business Monitor provides this capability. It processes Common Base Events (an IBM implementation of the Web Services Distributed Management Web Event Format standard) and then takes that data and creates business activity dashboards.
Check out DiALog: A Distributed Model for Capturing Provenance and Auditing Information. Apart from the distributed aspect, you can use its subject-predicate-object principle to record business events, and afterwards reconstruct particular trails.
Here is a related post of mine: Audit logging and exception management framework.
