What is best for logging - performance

Since Umbraco v6 decided to implement logging to a text file by default, I would like to ask what kind of logging you use.
Do you log to a text file on a production website, or do you log to a database table? Or do you implement any other kind of logging?
And what are the performance implications of this?

I do both types of logging, file as well as DB, in the production environment, as I need audit logs and therefore need everything current and saved.
I use NLog.
http://nlog-project.org/
It's robust, fast and good, and I have been using it in a production environment since last year.
It's good and gives you logging at various levels.

I would recommend using NLog.
At one point I investigated the question of the best logging frameworks and settled on NLog.
I have used it on different projects and it has always shown good results.
With NLog you can send your logs to different targets:
file, database, event log, console, email, NLogViewer and so forth.
You can set up all of the configuration in config files, which is very useful: you can easily control how and where you want to write your logs.
Wrapper targets are also at your disposal (see details in the documentation). In my opinion the most useful one is AsyncWrapper, which provides asynchronous, buffered execution of target writes. It will give you good performance.
There are also a lot of other cool features.
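For illustration, here is a minimal sketch (assuming NLog 4.x) of the setup described above, configuring a file target wrapped in an AsyncWrapper programmatically; the same configuration is normally kept in NLog.config, and the file path and messages are placeholders:

    using NLog;
    using NLog.Config;
    using NLog.Targets;
    using NLog.Targets.Wrappers;

    class Program
    {
        static void Main()
        {
            // Build a configuration in code; the same thing is normally put in NLog.config.
            var config = new LoggingConfiguration();

            // Plain file target (the path is just a placeholder).
            var fileTarget = new FileTarget("file") { FileName = "logs/app.log" };

            // Wrap it in an async wrapper so writes are buffered and happen off the calling thread.
            var asyncFile = new AsyncTargetWrapper("asyncFile", fileTarget);

            // Log Info and above to the async file target.
            config.AddRule(LogLevel.Info, LogLevel.Fatal, asyncFile);
            LogManager.Configuration = config;

            var logger = LogManager.GetCurrentClassLogger();
            logger.Info("Application started");
            logger.Error("Something went wrong");

            // Flush the buffered async writes before the process exits.
            LogManager.Shutdown();
        }
    }

The async wrapper buffers writes and performs them on a background thread, which is what keeps logging from blocking request threads on a busy production site.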

Related

How to define compatible logs

I have the chance to influence the log format of a logging solution we are about to set up for an existing backend system. It is not OpenTelemetry-based and may never be, but at the moment I can still make suggestions and would like to make sure the logs are written in a compatible format. Is there some kind of overview or definition I can use as a base? Some kind of list of mandatory fields that need to be filled?
I see you found the data model (https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/logs/data-model.md) in the specification - keep in mind, logging support for OpenTelemetry is currently not stable and so this may change. Generally, I suspect that if you use something like the Elastic Common Schema (https://www.elastic.co/guide/en/ecs/master/ecs-log.html) then you should be broadly compatible going forward.
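As a rough illustration of what "broadly compatible" can mean in practice, this sketch (in C#, purely as an example) writes one JSON object per log record using field names taken from the OpenTelemetry log data model: Timestamp, SeverityText, SeverityNumber, Body, TraceId, SpanId, Attributes and Resource. The concrete values and attribute names are made up:

    using System;
    using System.Collections.Generic;
    using System.Text.Json;

    class StructuredLogExample
    {
        static void Main()
        {
            // Field names follow the OpenTelemetry log data model; all values are placeholders.
            var record = new Dictionary<string, object>
            {
                ["Timestamp"] = DateTimeOffset.UtcNow.ToString("o"),
                ["SeverityText"] = "ERROR",
                ["SeverityNumber"] = 17, // the ERROR range starts at 17 in the OTel severity scale
                ["Body"] = "Payment failed for order",
                ["TraceId"] = "4bf92f3577b34da6a3ce929d0e0e4736",
                ["SpanId"] = "00f067aa0ba902b7",
                ["Attributes"] = new Dictionary<string, object>
                {
                    ["order.id"] = 12345,
                    ["customer.id"] = "c-789"
                },
                ["Resource"] = new Dictionary<string, object>
                {
                    ["service.name"] = "checkout-backend"
                }
            };

            // One JSON object per line keeps the output easy to ship and to parse later.
            Console.WriteLine(JsonSerializer.Serialize(record));
        }
    }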

What would be the advantages of using ELK for log management over a simple python logging + existing database log table combo?

Assuming I have many Python processes running on an automation server such as Jenkins, let's say I want to use Python's native logging module and, other than writing to the Jenkins console or to a log file, I want to store & centralize the logs somewhere.
I thought of using ELK for that, but then I realized that I can just as well create a dedicated log table in an existing database (I'm using Redshift), use something like Grafana for log dashboards/visualization and save myself the trouble of deploying a new system (most of the people in my team are familiar with Redshift but not with ElasticSearch).
Although it sounds straightforward, I feel like I'm not looking at the big picture and that I would be missing some powerful capabilities that components like Logstash were written for in the first place. What would these capabilities be, and how would it be advantageous to use ELK instead of my solution?
Thank you!
I have implemented a full ELK stack in my company in the past year.
The project was huge and took a lot of time to properly implement. The advantages of using ELK and not implementing our own centralized logging solution would be:
Not needing to reinvent the wheel - there is already a product that does just that (and the installation part is extremely easy).
It is battle-tested and can handle a huge amount of logs in a short time.
As your business and product grow and shift, you will need to parse more logs with different structures, which would mean DB changes in a self-built system. Logstash gives you endless possibilities for filtering and parsing those newly formatted logs.
It has Cluster and HA capabilities, and you can scale your logging system vertically and horizontally.
Very easy to maintain and change over time.
It can send the needed output to a variety of products including Zabbix, Grafana, Elasticsearch and many more.
Kibana will give you the ability to view the logs, build graphs and dashboards, set up alerts and more...
The options with ELK are really endless, and the more I work with it, the more I find new ways it can help me - not just for viewing logs from distributed remote server systems, but also for security alerts, SLA graphs and many other insights.

Microsoft.Extensions.Logging.Console unable to understand

I am confused about why we have to use this package.
Microsoft.Extensions.Logging.Console
I have 3 questions.
1. What is logging, what are the advantages of logging, and what are the pitfalls if I don't use it?
2. What is Logging.Console, and why do we have to use it?
3. What is LoggerFactory?
You should always have a logging facility in your application; without log entries it will be hard to find runtime bugs, as there will be no information about what is happening in your application. It is not required, but in a production environment it is a must-have.
Logging.Console configures the logging facility to print the log entries to the console; there are other provider options, and you can write a custom one as well.
The LoggerFactory is an abstraction that, behind the scenes, redirects your log messages to all installed providers, so you can have as many log outputs as you want by only changing the application startup.
I recommend reading the ASP.NET Core logging fundamentals documentation.
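To make that concrete, here is a minimal console-app sketch, assuming the Microsoft.Extensions.Logging and Microsoft.Extensions.Logging.Console packages are referenced; the category name and messages are placeholders:

    using Microsoft.Extensions.Logging;

    class Program
    {
        static void Main()
        {
            // The factory wires your log calls up to every registered provider;
            // here the only provider is the console one.
            using ILoggerFactory loggerFactory = LoggerFactory.Create(builder =>
            {
                builder.AddConsole(); // extension method from Microsoft.Extensions.Logging.Console
                builder.SetMinimumLevel(LogLevel.Information);
            });

            // Each logger carries a category name (usually the class it belongs to).
            ILogger logger = loggerFactory.CreateLogger("Program");

            logger.LogInformation("Application starting");
            logger.LogWarning("Disk space is low: {FreeMegabytes} MB left", 250);
            logger.LogError("Could not connect to the database");
        }
    }

Swapping AddConsole() for another provider (the debug provider, an event log provider, or a custom one) is the only change needed to send the same log calls somewhere else, which is exactly what the LoggerFactory abstraction buys you.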

Performance testing an application for bottlenecks using production data

I have been tasked with looking for a performance testing solution for one of our Java applications running on a Weblogic server. The requirement is to record production requests (both GET and POST including POST data) and then run these requests in a performance test environment with a copy of the production database.
The reasons for using production requests instead of a test script are:
It is a large application with no existing test scripts, so it would be a large amount of work to write scripts to cover the entire application.
Some performance issues only appear when users do a number of actions in a particular order.
To test using actual user interaction with the system, not an estimate of how users may interact with the system. We all know that users will do things we have not thought of.
I want to be able to fix performance issues and rerun the requests against the fixed code before releasing to production.
I have looked at using JMeter's Access Log Sampler with server access logs; however, the access logs do not contain POST data, and the Access Log Sampler only looks at the request URL, so it cannot simulate users submitting form data.
I have also looked at using the JMeter HTTP Proxy Server; however, this can record the actions of only one user and requires the user to configure their browser to use the proxy. The same limitation exists with Tsung and The Grinder.
I have looked at using Wireshark and tcpreplay, but recording at the packet level is excessive and will not give any useful reports at the request level.
Is there a better way to analyze production performance considering I need to be able to test fixes before releasing to production?
That is going to be a hard ask. I work with Visual Studio Test Edition to load test my applications, and we are only able to "estimate" the users' activity on the site.
It is possible to look at the logs and gather information on the likelihood of certain paths through your app. You can then look at the production database to see the likely values entered in any POST requests. From that you will have to build load tests that approach the usage patterns of your production site.
With any current tools I don't think it is possible to record and play back actual user interaction.
It is possible to alter your web app so that it records and logs every request and POST against session and datetime. This custom logging could then be used to generate load test requests against a test website. This would require serious code changes to your existing site and would likely have performance impacts.
That said, I have worked with web apps that do this level of logging and the ability to analyse the exact series of page posts/requests that caused an error is quite valuable to a developer.
So in summary: it is possible, but I have not heard of any off-the-shelf tools that do it.
Please check out this whitepaper by Impetus Technologies on this page: http://www.impetus.com/plabs/sandstorm.html
Honestly, I'm not sure the task you're being asked to do is even possible, let alone a good idea. Depending on how complex the application's backend is, and how perfectly you can recreate the state (i.e. all the way down to external SOA services or the time/clock), it may not be possible to make those GET and POST requests reproduce the same behavior.
That said, performance testing against production data is always great, but it usually requires application-specific knowledge that will stress said data. Simply repeating HTTP GETs and POSTs will almost certainly not yield useful results.
Good luck!
I would suggest the following to capture the production requests and simulate an accurate workload:
1) Use Coremetrics: Coremetrics provides solutions that let you understand application usage patterns. This would help in coming up with an accurate workload model. That model can then be converted into test scripts and executed against a masked copy of the production database. This will give you accurate results about the application's performance in real time.
2) Another option would be to create a small utility using AOP (an aspect-oriented approach) to trace all requests and the corresponding method calls. This would help in identifying production usage patterns and, in turn, in simulating the workload accurately. AOP frameworks such as AspectJ can be used. This would not require any changes to the code; the instrumentation can be done on the fly. The other benefit is that it can be enabled only for a specific time window and then turned off.

Logging vs. Debugging

Background: I've inherited a web application that is intended to create on-the-fly connections between local and remote equipment. There are a tremendous number of moving parts at the moment: the app itself has changed significantly; the development toolchain was just updated; and both the local and remote equipment have been "modified" to support those changes.
The bright side is that it has a reasonable logging system that will write debug messages to a file, and it will log to both the file and a real-time user screen. I have an opportunity to re-work the entire log/debug mechanism.
Examples:
All messages are time-stamped and prefixed with a severity level.
Logs are for the customer. They record the system's responses to his/her requests.
Any log that identifies a problem also suggests a solution.
Debugs are for developers and Tech Support. They reveal the system internals.
Debugs indicate the function and/or line that generated them.
The customer can adjust the debug level on the fly to set the verbosity.
Question: What best practices have you used as a developer, or seen as a consumer, that generate useful logs and debugs?
Edit: Many helpful suggestions so far, thanks! To clarify: I'm more interested in what to log: content, format, etc.--and the reasons for doing so--than specific tools.
What was it about the best logs you've seen that made them most helpful?
Thanks for your help!
Don't confuse logging, tracing and error reporting; some people I know do, and it creates one hell of a log file to grep through in order to get the information I want.
If I want to have everything churned out, I separate it into the following:
Tracing -> dumps every action and step, timestamped, with the input and output data of that stage (the ugliest and largest file).
Logging -> logs the business process steps only; the client does an enquiry, so log the enquiry criteria and output data, nothing more.
Error Reporting / Debugging -> exceptions logged detailing where they occurred, timestamped, with input/output data if possible, user information, etc.
That way, if any errors occur and the error/debug log doesn't contain enough information for my liking, I can always do a grep -A 50 -B 50 'timestamp' tracing_file to get more detail.
EDIT:
As has also been said, sticking to standard packages, like the built-in logging module for Python, is always a good idea. Rolling your own is not a great idea unless the language does not have one in its standard library. I do like wrapping the logging in a small function that takes the message and a value determining which logs it goes to, i.e. 1 - tracing, 2 - logging, 4 - debugging, so sending a value of 7 writes to all three, etc. (a sketch of this follows below).
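A minimal sketch of that kind of wrapper, shown here in C# with invented sink names (writing to the console where a real version would append to the three separate files described earlier), could look like this:

    using System;

    [Flags]
    enum LogSink
    {
        Tracing = 1,
        Logging = 2,
        Debugging = 4,
        All = Tracing | Logging | Debugging // 7 writes to all three
    }

    static class Log
    {
        // Hypothetical helper: routes one message to whichever sinks are selected.
        public static void Write(string message, LogSink sinks)
        {
            var line = $"{DateTime.UtcNow:o} {message}";

            if (sinks.HasFlag(LogSink.Tracing))
                Console.WriteLine($"[TRACE] {line}"); // a real version would append to the tracing file
            if (sinks.HasFlag(LogSink.Logging))
                Console.WriteLine($"[LOG]   {line}"); // ... the business-process log
            if (sinks.HasFlag(LogSink.Debugging))
                Console.WriteLine($"[DEBUG] {line}"); // ... the error/debug log
        }
    }

    class Demo
    {
        static void Main()
        {
            Log.Write("Enquiry received for customer 42", LogSink.Logging);
            Log.Write("Unexpected response from pricing service", LogSink.All); // value 7
        }
    }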
The absolute most valuable thing I have done with any logging framework is a "one-click" tool that gathers all logs and mails them to me, even when the application is deployed on a machine belonging to a customer.
And make good choices about what to log, so you can roughly follow the main paths through your application.
As frameworks I've used the standards (log4net, log4java, log4c++).
Do NOT implement your own logging framework when there already is a good one out of the box. Most people who do just reinvent the wheel.
Some people never use a debugger but log everything. Those are different philosophies, and you have to make your own choice. You can find plenty of advice along those lines. Note that this advice is not language-specific...
The Coding Horror guy has an interesting post about the logging problem and why abusive logging can be a waste of time in certain conditions.
I simply believe logging is for tracing things that should remain traceable in production, while debugging is for development. Maybe that's too simple a way of seeing things, because some people use logs for debugging since they can't stand debuggers. But debugger sessions can be a waste of time too: you shouldn't use them as a sort of test case, because nothing is written down and it all disappears after the debugging session.
So I think my opinion on this is:
logging for necessary and useful traces through development and production environments, with development and production levels, using a log framework (the log4 family of tools)
debugging mode for special, strange cases when things are going out of control
test cases are important and can save time spent in infernal, labyrinthine debugging sessions, used as an anti-regression method. Note that most people don't use test cases.
Coding Horror said to resist the tendency to log everything. That's right, but I've already seen a huge app that does the exact opposite in a clean way (and through a database)...
I would just set up your logging system to have multiple logging levels. On the services I write, I have a logging/audit entry for almost every action, and each is assigned an audit level from 1 to 5; the higher the number, the more audit events you get.
Very basic logging: starting, stopping, and restarting
Basic logging: processing x number of files, etc.
Standard logging: beginning processing, finished processing, etc.
Advanced logging: beginning and ending of every stage in processing
Everything: every action taken
You set the audit level in a config file so it can be changed on the fly.
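A rough sketch of that idea (the class, the level numbers and the messages are invented for the example); the service only emits events at or below the currently configured level:

    using System;

    class AuditLogger
    {
        // 1 = start/stop only ... 5 = every action; in practice this value is read from a config file.
        private readonly int _auditLevel;

        public AuditLogger(int auditLevel) => _auditLevel = auditLevel;

        public void Audit(int level, string message)
        {
            // A higher configured level lets more events through.
            if (level <= _auditLevel)
                Console.WriteLine($"{DateTime.UtcNow:o} [audit {level}] {message}");
        }
    }

    class Demo
    {
        static void Main()
        {
            var audit = new AuditLogger(auditLevel: 3); // e.g. the value loaded from config

            audit.Audit(1, "Service starting");                 // emitted
            audit.Audit(3, "Finished processing batch 7");      // emitted
            audit.Audit(5, "Read row 123 from staging table");  // suppressed at level 3
        }
    }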
Some general rules-of-thumb I have found to be useful in server-side applications:
requestID - assign a request ID to each incoming (HTTP) request and then log it on every log line, so you can easily grep those logs later by that ID and find all relevant lines. If you think it is very tedious to add that ID to every log statement, then at least the Java logging frameworks have made it transparent with the use of Mapped Diagnostic Context (MDC); see the sketch after this list.
objectID - if your application/service deals with manipulating business objects that have a primary key, then it is useful to attach that primary key to the diagnostic context as well. Later, if someone comes with the question "when was this object manipulated?", you can easily grep by the objectID and see all log records related to that object. In this context it is (sometimes) useful to actually use Nested Diagnostic Context instead of MDC.
when to log? - at least you should log whenever you cross an important service/component boundary. That way you can later reconstruct the call-flow and drill down to the particular codebase that seems to cause the error.
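That advice describes Java's MDC; purely to illustrate the same idea in the C# used elsewhere on this page, Microsoft.Extensions.Logging scopes can attach a request ID to every line written inside the scope (the property names and messages are placeholders):

    using System;
    using Microsoft.Extensions.Logging;

    class Program
    {
        static void Main()
        {
            using ILoggerFactory factory = LoggerFactory.Create(builder =>
                builder.AddSimpleConsole(options => options.IncludeScopes = true));

            ILogger logger = factory.CreateLogger("Orders");

            string requestId = Guid.NewGuid().ToString("N");

            // Every line logged inside this scope carries the RequestId (and OrderId),
            // so the log can later be grepped by that ID.
            using (logger.BeginScope("RequestId:{RequestId} OrderId:{OrderId}", requestId, 12345))
            {
                logger.LogInformation("Order received");
                logger.LogInformation("Order persisted");
            }
        }
    }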
As I'm a Java developer, I will also give my experience with Java APIs and frameworks.
API
I'd recommend to use Simple Logging Facade for Java (SLF4J) - in my experience, it is the best facade to logging:
full-featured: it has not followed the least-common-denominator approach (like commons-logging); instead, it uses a graceful-degradation approach.
has adapters for practically all popular Java logging frameworks (e.g. log4j)
has solutions available on how to redirect all legacy logging APIs (log4j, commons-logging) to SLF4J
Implementation
The best implementation to use with SLF4J is logback - written by the same guy who also created SLF4J API.
Use an existing logging format, such as that used by Apache, and you can then piggyback on the many tools available for analysing the format.

Resources