How to define compatible logs - open-telemetry

I have the chance to influence the log format of a logging solution we are about to set up for an existing backend system. It is not open-telemetry based and may never be, but at the moment I can still make suggestions and would like to make sure the logs are written in a compatible format. Is there some kind of overview or definition I can use as a base? Some kind of list of mandatory fields that need to be filled?

I see you found the data model (https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/logs/data-model.md) in the specification - keep in mind, logging support for OpenTelemetry is currently not stable and so this may change. Generally, I suspect that if you use something like the Elastic Common Schema (https://www.elastic.co/guide/en/ecs/master/ecs-log.html) then you should be broadly compatible going forward.
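As a rough illustration of what "compatible" could look like in practice, here is a minimal sketch of a formatter that writes JSON records using the top-level field names from the OpenTelemetry log data model (Timestamp, SeverityText, SeverityNumber, Body, Resource, Attributes). The class name, the `logger.name` attribute and the `service.name` value are assumptions made for the example, and since the specification is not stable, the field names should be checked against the current data model before relying on them.

```python
import json
import logging

# Severity numbers as listed in the OpenTelemetry log data model
# (DEBUG=5, INFO=9, WARN=13, ERROR=17, FATAL=21); 0 means "unspecified".
SEVERITY_NUMBER = {
    logging.DEBUG: 5,
    logging.INFO: 9,
    logging.WARNING: 13,
    logging.ERROR: 17,
    logging.CRITICAL: 21,
}


class OtelStyleJsonFormatter(logging.Formatter):
    """Hypothetical formatter: one JSON object per log record."""

    def __init__(self, service_name):
        super().__init__()
        # Resource describes the source of the log (assumed attribute name).
        self.resource = {"service.name": service_name}

    def format(self, record):
        entry = {
            # Nanoseconds since the Unix epoch.
            "Timestamp": int(record.created * 1_000_000_000),
            "SeverityText": record.levelname,
            "SeverityNumber": SEVERITY_NUMBER.get(record.levelno, 0),
            "Body": record.getMessage(),
            "Resource": self.resource,
            # Illustrative attribute; pick names that suit your system.
            "Attributes": {"logger.name": record.name},
        }
        return json.dumps(entry)
```

TraceId and SpanId are left out here because a backend without tracing has nothing to put in them; they can be added as extra keys later without changing the rest of the record.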

Related

Migration from eXist-db 1.4.x to 2.x

We're going to migrate our application from eXist-db 1.4.1 to ~2.2 (probably RC2).
I'm wondering if anybody has already done such a migration and what impediments they met?
I already found some documentation about this on the official website and tried to Google, but didn't find much. For now I know that there were significant changes in the security model and that some APIs have also changed. But I still want to know if somebody has investigated further or can share a success story.
The main difference between 1.4.1 and 2.1/2.2 is that stored XQueries now need the executable flag to be set. You can fix your permissions automatically using a query as described in the documentation.
It is also possible that some of your existing queries report errors on 2.1/2.2, which they did not before. In nearly all cases this happens because 1.4 was less strict about the XQuery specification and processed expressions which should not be allowed (the standard as well as the implementation evolved). Also, the query engine may now do additional checks to prevent potential issues. Usually the error messages by the compiler should directly lead to the code you have to fix. This may cost a few minutes, but it's worth the effort. Apart from this, no particular migration issues have been reported.

Which FHIR resource should I use for Treatment Preferences?

While trying to understand how an existing system will map to FHIR resources, I am stuck in the documentation on Treatment/Care Preferences like the ones outlined here: http://wiki.hl7.org/index.php?title=Care_Preference
Would these preferences be handled in a list of extended objects? Or will FHIR be implementing a CarePreference resource?
This isn't catered for in the current set of resources. I guess you use Other (http://hl7.org/implement/standards/fhir/other.htm). It does seem like the kind of thing we'd want to define a resource for, but I'm not aware of any plans for one right now. I forwarded the suggestion along to the appropriate team.
By the way, I'm not sure this question meets Stack Overflow's guidelines - it might get edited/closed.
"Other" is the solution for now. Speed of the development of a specific resource is likely to be dependent on the number asking for it and the detail of the use-cases they supply. Consider sharing these on the FHIR list server. Alerts might be another mechanism to flag important preferences.

What is best for logging

Since Umbraco v6 decided to implement logging to a text file by default, I would like to ask what kind of logging you use.
Do you log to a text file on a production website, or do you log to a database table? Or do you implement any other kind of logging?
And what are the performance implications of this?
I do both types of logging, to a file as well as to a DB, in the production environment, because I need to audit the logs and so need everything current and saved.
I use NLog.
http://nlog-project.org/
It's robust and fast, and I have been using it in a production environment since last year.
It also gives you logging at various levels.
I would recommend you to use NLog.
At one time I investigated which logging frameworks were best and settled on NLog.
I have already used it on different projects and it has always shown good results.
With NLog you can send your logs to different targets:
file, database, event log, console, email, nlogviewer and so forth.
You can set up all configuration in config files. It's very useful: you can easily set up how and where you want to write your logs.
Wrapper Targets are also at your disposal (see details in the documentation). In my opinion the most useful target is AsyncWrapper (it provides asynchronous, buffered execution of target writes). It will give you good performance.
There are also a lot of other cool features.
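The idea behind an asynchronous, buffered wrapper is not specific to NLog. As a language-neutral illustration (this is not NLog's API), here is a minimal sketch using the queue-based handlers from Python's standard library: the application thread only puts records on a queue, and a background listener does the slow write to the real target. The function name `build_async_logger` is made up for the example.

```python
import logging
import logging.handlers
import queue


def build_async_logger(name, target_handler):
    """Return (logger, listener); log calls only enqueue records."""
    log_queue = queue.Queue(-1)  # unbounded buffer between app and writer
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep the example isolated from root handlers
    logger.addHandler(logging.handlers.QueueHandler(log_queue))
    # The listener runs in its own thread and forwards records to the
    # slow target (file, database, email, ...).
    listener = logging.handlers.QueueListener(log_queue, target_handler)
    listener.start()
    return logger, listener
```

Calling `listener.stop()` at shutdown flushes whatever is still in the queue, which matters for the audit use case mentioned above.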

Logging vs. Debugging

Background: I've inherited a web application that is intended to create on-the-fly connections between local and remote equipment. There are a tremendous number of moving parts recently: the app itself has changed significantly; the development toolchain was just updated; and both the local and remote equipment have been "modified" to support those changes.
The bright side is that it has a reasonable logging system that will write debug messages to a file, and it will log to both the file and a real-time user screen. I have an opportunity to re-work the entire log/debug mechanism.
Examples:
All messages are time-stamped and prefixed with a severity level.
Logs are for the customer. They record the system's responses to his/her requests.
Any log that identifies a problem also suggests a solution.
Debugs are for developers and Tech Support. They reveal the system internals.
Debugs indicate the function and/or line that generated them.
The customer can adjust the debug level on the fly to set the verbosity.
Question: What best practices have you used as a developer, or seen as a consumer, that generate useful logs and debugs?
Edit: Many helpful suggestions so far, thanks! To clarify: I'm more interested in what to log: content, format, etc.--and the reasons for doing so--than specific tools.
What was it about the best logs you've seen that made them most helpful?
Thanks for your help!
Don't confuse Logging, Tracing and Error Reporting. Some people I know do, and it creates one hell of a log file to grep through in order to get the information I want.
If I want to have everything churned out, I separate it into the following:
Tracing -> Dumps every action and step, timestamped, with input and output data of that stage (ugliest and largest file)
Logging -> Log the business process steps only, client does enquiry so log the enquiry criteria and output data, nothing more.
Error Reporting / Debugging -> Exceptions logged detailing where it occurred, timestamped, input/output data if possible, user information etc
That way if any errors occurred and the Error/Debug log doesn't contain enough information for my liking I can always do a grep -A 50 -B 50 'timestamp' tracing_file to get more detail.
EDIT:
As has also been said, sticking to standard packages, such as the built-in logging module for Python, is always good. Rolling your own is not a great idea unless the language does not have one in its standard library. I do like wrapping the logging in a small function that takes the message and a value determining which logs it goes to, i.e. 1 - tracing, 2 - logging, 4 - debugging, so sending a value of 7 writes to all three.
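The wrapper described above can be sketched roughly as follows. This is only one possible reading of it: the logger names and the `write` function are hypothetical, and in a real setup each of the three loggers would get its own file handler.

```python
import logging

# Bit flags: combine them to address several logs at once (1 + 2 + 4 = 7).
TRACING, LOGGING, DEBUGGING = 1, 2, 4

_LOGGERS = {
    TRACING: logging.getLogger("app.tracing"),
    LOGGING: logging.getLogger("app.logging"),
    DEBUGGING: logging.getLogger("app.debugging"),
}


def write(message, targets):
    """Send message to every log whose bit is set; return the log names."""
    written = []
    for bit, logger in _LOGGERS.items():
        if targets & bit:
            logger.info(message)
            written.append(logger.name)
    return written
```

With this, `write("enquiry received", LOGGING | TRACING)` reaches the business log and the trace file, while `write("stack dump", 7)` reaches all three.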
The absolutely most valuable thing I have done with any logging framework is a "1-click" tool that gathers all logs and mails them to me, even when the application is deployed on a machine belonging to a customer.
And make good choices at what to log so you can roughly follow the main paths in your application.
As frameworks I've used the standards (log4net, log4java, log4c++)
Do NOT implement your own logging framework when there already is a good one out of the box. Most people who do just reinvent the wheel.
Some people never use a debugger but log everything. These are different philosophies; you have to make your own choice. You can find plenty of advice like these, or this one. Note that this advice is not language-specific...
The Coding Horror guy has an interesting post about the logging problem and why excessive logging can be a waste of time in certain conditions.
I simply believe logging is for tracing things that could remain in production. Debug is for development. Maybe that's too simple a way of seeing things, because some people use logs for debugging since they can't stand debuggers. But debugger mode can be a waste of time too: you shouldn't use it as a sort of test case, because it's not written down and will disappear after the debug session.
So I think my opinion about this is:
logging for necessary and useful traces through development and production environments, with development and production levels, using a log framework (log4 family tools)
debugging mode for special strange cases when things are getting out of control
test cases are important and can save time spent in infernal labyrinthine debugging sessions, used as an anti-regression method. Note that most people don't use test cases.
Coding Horror said to resist the tendency to log everything. That's right, but I've already seen a huge app that does the exact opposite in a neat way (and through a database)...
I would just set up your logging system to have multiple logging levels. In the services I write I have logging/audit for almost every action, and each is assigned an audit level from 1 to 5; the higher the number, the more audit events you get.
1. The very basic logging: starting, stopping, and restarting
2. Basic logging: processing x number of files, etc.
3. Standard logging: beginning to process, finished processing, etc.
4. Advanced logging: beginning and ending of every stage in processing
5. Everything: every action taken
You set the audit level in a config file so it can be changed on the fly.
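A minimal sketch of that scheme, assuming the config file is JSON with a single `audit_level` key (the key name and the `Auditor` class are invented for the example). Re-reading the file and calling `reload` is what makes the level adjustable without a restart.

```python
import json


class Auditor:
    """Keeps only events whose level is at or below the configured level."""

    def __init__(self, config_text):
        self.events = []
        self.reload(config_text)

    def reload(self, config_text):
        # Call again whenever the config file changes on disk.
        self.level = int(json.loads(config_text)["audit_level"])

    def audit(self, event_level, message):
        """Record message if event_level (1-5) is enabled; return whether it was."""
        if event_level <= self.level:
            self.events.append(message)
            return True
        return False
```

An event tagged 1 (start/stop) is always recorded, while an event tagged 4 (stage boundaries) only shows up once the configured level is raised to 4 or 5.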
Some general rules-of-thumb I have found to be useful in server-side applications:
requestID - assign a request ID to each incoming (HTTP) request and then log it on every log line, so you can easily grep those logs later by that ID and find all relevant lines. If you think it is too tedious to add that ID to every log statement, then at least Java logging frameworks have made it transparent with the use of Mapped Diagnostic Context (MDC).
objectID - if your application/service manipulates business objects that have a primary key, then it is useful to attach that primary key to the diagnostic context as well. Later, if someone comes with the question "when was this object manipulated?", you can easily grep by the objectID and see all log records related to that object. In this context it is (sometimes) useful to actually use Nested Diagnostic Context instead of MDC.
when to log? - at least you should log whenever you cross an important service/component boundary. That way you can later reconstruct the call-flow and drill down to the particular codebase that seems to cause the error.
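The request ID rule above can be sketched outside Java as well. The following is a minimal Python analogue of the MDC idea, not MDC itself: a context variable holds the current request ID and a logging filter copies it onto every record, so individual log statements never mention it. The names `request_id_var`, `RequestIdFilter` and `handle_request` are assumptions for the example.

```python
import contextvars
import logging
import uuid

# Holds the ID of the request currently being processed ("-" outside one).
request_id_var = contextvars.ContextVar("request_id", default="-")


class RequestIdFilter(logging.Filter):
    """Attach the current request ID to every record passing through."""

    def filter(self, record):
        record.request_id = request_id_var.get()
        return True


def handle_request(logger, payload):
    """Hypothetical request handler: the log calls do not pass the ID."""
    token = request_id_var.set(uuid.uuid4().hex)
    try:
        logger.info("received %s", payload)
        logger.info("finished %s", payload)
    finally:
        request_id_var.reset(token)  # restore the previous value
```

The filter is attached to a handler whose format string includes `%(request_id)s`; after that, grepping for one ID returns every line written while that request was being handled.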
As I'm a Java developer, I will also give my experience with Java APIs and frameworks.
API
I'd recommend using Simple Logging Facade for Java (SLF4J) - in my experience, it is the best logging facade:
full-featured: it has not followed the least-common-denominator approach (like commons-logging); instead, it degrades gracefully.
has adapters for practically all popular Java logging frameworks (e.g. log4j)
has solutions available on how to redirect all legacy logging APIs (log4j, commons-logging) to SLF4J
Implementation
The best implementation to use with SLF4J is logback - written by the same guy who also created SLF4J API.
Use an existing logging format, such as that used by Apache, and you can then piggyback on the many tools available for analysing the format.
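For instance, a line in the Apache Common Log Format can be produced with very little code. The sketch below assumes the usual layout (host, identity, user, bracketed timestamp, quoted request line, status, size); the function name and its parameters are made up for the example, and month names assume the default C/English locale.

```python
from datetime import datetime, timezone


def common_log_line(host, user, when, request_line, status, size):
    """Format one access-log entry in Common Log Format style."""
    # Example timestamp: 10/Oct/2000:13:55:36 +0000 (needs an aware datetime).
    timestamp = when.strftime("%d/%b/%Y:%H:%M:%S %z")
    user_field = user if user else "-"
    size_field = str(size) if size else "-"  # "-" when no bytes were sent
    return (
        f"{host} - {user_field} [{timestamp}] "
        f'"{request_line}" {status} {size_field}'
    )
```

Existing log analysers that understand this layout can then be pointed at the file without any custom parsing.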

What parts of application you prefer to be externalized as configuration and why?

What parts of your application are not coded?
I think one of the most obvious examples would be DB credentials - it's considered bad to have them hard-coded. And in most situations it is easy to decide whether you want something to be externalized or coded. For me the rules are simple. Some part of the application should be externalized if:
it can and should be changed by a non-developer, but not so often that it belongs in application settings defined in the UI (DB credentials, service URLs, etc.)
it does not require a programming language and seems unnatural when coded (localization)
Do you have anything to add?
This is a little related to this question about spring cfg.
Spring configuration seems a less obvious example to me, because in my practice it is never modified by anyone except the developer. And the road of externalizing can take you far away, to the entire project being "configured", not coded - so where to stop?
So please post here some examples from your experience, when you got benefit from having something configured, not coded - like dependency injection configuration in spring, etc.
And if you use spring - how often is configuration changed without recompiling?
Anything that needs to differ between different deployments of your application. That is, anything specific to the environment.
Examples include:
Database connection strings
URLs for web or WCF services
Logging configuration
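A minimal sketch of that rule, assuming the environment-specific values arrive as environment variables. The variable names (`APP_DATABASE_URL`, `APP_SERVICE_URL`, `APP_LOG_LEVEL`) and the defaults are invented for illustration; a properties or config file would work the same way.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    database_url: str
    service_url: str
    log_level: str


def load_settings(env=None):
    """Build Settings from the environment (or any mapping, for testing)."""
    env = os.environ if env is None else env
    return Settings(
        # Required: fails fast with KeyError if the deployment forgot it.
        database_url=env["APP_DATABASE_URL"],
        # Optional: sensible defaults for a developer machine.
        service_url=env.get("APP_SERVICE_URL", "http://localhost:8080"),
        log_level=env.get("APP_LOG_LEVEL", "INFO"),
    )
```

The same build can then be deployed to test and production with nothing but the environment changing.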
Any information your application uses that is "data" and that could change depending on where it is installed. Things like:
smtp mail server used to send e-mails
Database connect strings
Paths to file locations / folders used by the app
FTP servers & connect info
Active Directory servers used for authentication
Any links displayed in the application to external information sources
Warning limit values
I've even put the RegEx filters used to limit the allowable characters for data entry fields.
Besides the obvious changing stuff (paths, servers, ports, and so on), some people argue that you should be able to easily change whatever might reasonably change, for instance, say you have a generic engine which operates on the business logic (a rule engine).
You would then define the rules in a "configuration file", which ends up being nothing less than programming in a DSL instead of in the general-purpose language. The benefits are that it's closer to the domain, so it's easier and more maintainable, and that you can easily change things that would otherwise demand a new build.
The main argument behind this is that things you assumed would never change always end up changing nonetheless, so you better be prepared.
paths and server names/addresses come to mind..
I agree with your two conditions, which is why I:
Rarely include a config file as part of a Windows or Windows Mobile application (web apps yes).
If I did include a config file meant to be tweaked by end users, it certainly wouldn't be XML.
Employee emails/names since employees can come and go... (you should typically try to keep them out of an application though)
Configuration files should include:
deployment details
DB credentials
file paths
host names
anything that is used in many places but that may change
contact email addresses
options that aren't in the GUI
The last one is a bit open-ended, but very important. I've found it very useful to foresee variables that the client may, in the future, want to change. If changes are infrequent, I or they can edit the config file. If it becomes a frequent thing, it's trivial to add the option to the GUI, which isn't hardcoded.
I would also add encryption keys (which themselves should be encrypted)...
Basically the rule of thumb is: information the application needs BEFORE its regular, functional operation, data that it MUST have on hand (i.e. local and not networked).
Note that this data should not be dynamically changing or large amounts of it, otherwise it should be in the database.
With Spring apps I actually distinguish between two types of configuration:
Items externalized into property files which are "deploy time" concerns or "environment-specific": server IPs / addresses, file system locations, etc.
Spring XML configuration which can do lots of things, like indicate the overall application structure, apply behavior via AOP, etc.
I use Spring to wire all the beans in a J2SE application that has no GUI (a transactional switch). That way it's very easy for me to have different configurations in each deployment (we have this thing running in different countries), without having to code anything different.
Another thing I like to have is to manage all the SQL statements separately from the code, when I use plain JDBC (or Spring JDBC). Like in a properties file or XML or something, sometimes even as String properties in the beans that will use the statement (when there is only one bean that will use the statement, such as a DAO).
I am going to use Spring JDBC or vanilla JDBC for data persistence. Here we have decided to externalize all the SQL from the Java code so that it is more manageable in terms of SQL query tuning and optimization, and we don't need to disturb the Java code.
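The approach of keeping SQL outside the code can be sketched in a language-neutral way. The example below is in Python rather than Java/JDBC and is only an illustration: the statements live in an INI-style text (inlined as `SQL_FILE_TEXT` here so the sketch is self-contained, normally a separate file), and the DAO looks them up by name. The table, the statement names and the `CustomerDao` class are all hypothetical.

```python
import configparser
import sqlite3

# Normally the contents of a separate file that a DBA can tune.
SQL_FILE_TEXT = """
[customer]
create = CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)
insert = INSERT INTO customer (name) VALUES (?)
find_by_name = SELECT id, name FROM customer WHERE name = ?
"""


def load_statements(text):
    """Parse the statements into {section: {statement_name: sql}}."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return {section: dict(parser[section]) for section in parser.sections()}


class CustomerDao:
    """The DAO contains no SQL text, only statement names."""

    def __init__(self, connection, statements):
        self.connection = connection
        self.sql = statements["customer"]
        self.connection.execute(self.sql["create"])

    def add(self, name):
        self.connection.execute(self.sql["insert"], (name,))

    def find_by_name(self, name):
        cursor = self.connection.execute(self.sql["find_by_name"], (name,))
        return cursor.fetchall()
```

A query can then be rewritten or given an index hint by editing the statements file alone, as long as its parameters and result columns stay the same.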

Resources