Oscar Westra van Holthe - Kind


In order to write (web) applications that a user can trust, logging is essential. Good logging ensures that an application is:

Auditable
all activities that affect user state or balances are tracked
Traceable
it's possible to determine where an activity occurs in all tiers of the application
High integrity
logs cannot be overwritten or tampered with by local or remote users

Note that these items cover two areas: development and system administration.

Well-written applications will dual-purpose logs and activity traces for audit and monitoring, and make it easy to track a transaction without excessive effort or access to the system. They should possess the ability to easily track or identify potential fraud or anomalies end-to-end.

OK, this is all very nice, but what does it mean? As a programmer, when do I log what? I came up with these usages for the Log4J logging levels:

| Log level | Used for | Description |
| --- | --- | --- |
| FATAL | Application crashes | A crash occurs when the application cannot continue to run. In such a case, log as much as you can about the cause of the crash. |
| ERROR | Unrecoverable errors | Some errors are unrecoverable, and result in some functionality not being available. Log such errors as errors, and describe why they happened. |
| WARN | Recoverable errors | Other errors are recoverable. In such cases you should recover, and log the error as a warning. Naturally, you also log how it could have been prevented. |
| INFO | Data changes | In order to have a complete log of who does what to the data (and when), the INFO level is reserved exclusively for this. Here, log which user made what change to the data. |
| DEBUG | Application decisions | In order to debug an application, you need to know what it does. The debugging log level provides a first approximation by logging all application decisions. Note that this should include the data the decisions are based upon. |
| TRACE | Application flow | When the debug log has identified a possible location for a bug, you can use the tracing log level to find out exactly where it goes wrong. To help with this, the trace level should at least log every method entry and exit. If you think you need additional information, for example about the results of library calls or code you didn't write, add it here. |
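
To make the table a bit more concrete, here is a minimal sketch of one method that logs its flow at TRACE, its decision at DEBUG and its data change at INFO. Everything in it is an assumption made for illustration: the class `WithdrawalExample`, the `withdraw` method and the message formats are hypothetical, and the `Level` enum and in-memory `LOG` list are stand-ins for a Log4J logger so that the example runs without Log4J on the classpath.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical example class; not part of Log4J or of any real application. */
final class WithdrawalExample {

    /** Stand-in for the Log4J levels, so the sketch needs no dependency. */
    enum Level { FATAL, ERROR, WARN, INFO, DEBUG, TRACE }

    /** In-memory log; a real application would call a Log4J logger instead. */
    static final List<String> LOG = new ArrayList<>();

    static void log(Level level, String message) {
        LOG.add(level + " " + message);
    }

    /** Withdraws the amount if the balance covers it and returns the resulting balance. */
    static long withdraw(String user, long balance, long amount) {
        // Application flow: method entry.
        log(Level.TRACE, "enter withdraw");

        // Application decision, logged together with the data it is based on.
        boolean covered = amount <= balance;
        log(Level.DEBUG, "covered=" + covered + " (amount=" + amount + ", balance=" + balance + ")");

        long newBalance = balance;
        if (covered) {
            newBalance = balance - amount;
            // Data change: which user made what change.
            log(Level.INFO, "user=" + user + " changed balance from " + balance + " to " + newBalance);
        }

        // Application flow: method exit.
        log(Level.TRACE, "exit withdraw");
        return newBalance;
    }

    public static void main(String[] args) {
        withdraw("alice", 100, 30);
        LOG.forEach(System.out::println);
    }
}
```

With a real logger, the TRACE and DEBUG lines would normally be switched off in production, leaving only the INFO line as the audit record of the change.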

Generally, all output at the levels WARN, ERROR and FATAL should be cause for an investigation. Output at those levels, and at INFO, should at least be logged in a durable form (such as a backed-up file). Output for DEBUG and/or TRACE can be enabled when needed to diagnose problems.
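
The rule of thumb above boils down to comparing a message's level against a threshold. The following is only a sketch of that comparison, under the assumption that levels are ordered from most to least severe. The class `LogRouting` and its method names are hypothetical, and the `Level` enum is again a stand-in for the Log4J levels. In practice the thresholds would be set in the logging configuration rather than in code.

```java
/** Hypothetical helper that expresses the rule of thumb as threshold checks. */
final class LogRouting {

    /** Stand-in for the Log4J levels, ordered from most to least severe. */
    enum Level { FATAL, ERROR, WARN, INFO, DEBUG, TRACE }

    /** True if a message at the given level is at least as severe as the threshold. */
    static boolean isEnabled(Level level, Level threshold) {
        return level.ordinal() <= threshold.ordinal();
    }

    /** WARN, ERROR and FATAL output should be cause for an investigation. */
    static boolean needsInvestigation(Level level) {
        return isEnabled(level, Level.WARN);
    }

    /** INFO and anything more severe should end up in durable storage. */
    static boolean mustBeDurable(Level level) {
        return isEnabled(level, Level.INFO);
    }

    public static void main(String[] args) {
        for (Level level : Level.values()) {
            System.out.println(level
                    + " investigate=" + needsInvestigation(level)
                    + " durable=" + mustBeDurable(level));
        }
    }
}
```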

Logging sensitive information

As you can understand, log files are very valuable. Not only do they allow technicians to diagnose problems, but they also allow an application to be audited and traced to fulfill legal requirements.

But this means that they can and must contain sensitive information, which carries potential risks. But what exactly is sensitive information? Passwords? Sure. Social security numbers? Maybe. Medical information? Definitely, but not often useful. Financial information? Yes, but often logged out of necessity. So let's be honest: what exactly constitutes sensitive information is subject to change. Not only that, some information we believe to be sensitive must be logged to satisfy auditors. So regardless of whether you log information you know to be sensitive, you'll probably also log sensitive information you're not aware of. And that opens possibilities for industrial espionage, identity theft, etc.

The solution for a programmer is to listen carefully to whatever the auditors have to say. Not only do they decide what constitutes an acceptable risk, they can also help decide what measures to take to prevent the log files from falling into the wrong hands. This might mean that some information is not logged, but the auditors know whether the application is then still sufficiently auditable and traceable. Most measures, however, will be for the system administrators, as they guard the fortress that protects the log files.
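
On the programmer's side, "not logging some information" can be as simple as masking agreed-upon fields before a message is written. The sketch below assumes that the list of sensitive field names comes out of the discussion with the auditors. The class `LogRedactor`, the `key=value` message format and the `***` mask are all hypothetical choices made for illustration, not a prescribed approach.

```java
import java.util.Map;
import java.util.Set;

/** Hypothetical helper that masks sensitive fields before they reach the log. */
final class LogRedactor {

    /** Field names that must never be logged in clear text (assumed to be agreed with the auditors). */
    private final Set<String> sensitiveKeys;

    LogRedactor(Set<String> sensitiveKeys) {
        this.sensitiveKeys = sensitiveKeys;
    }

    /** Formats the fields as space-separated key=value pairs, masking the sensitive values. */
    String format(Map<String, String> fields) {
        StringBuilder line = new StringBuilder();
        for (Map.Entry<String, String> field : fields.entrySet()) {
            if (line.length() > 0) {
                line.append(' ');
            }
            // The key stays visible so the log still shows that the field was present.
            String value = sensitiveKeys.contains(field.getKey()) ? "***" : field.getValue();
            line.append(field.getKey()).append('=').append(value);
        }
        return line.toString();
    }
}
```

Keeping the field name while masking its value leaves a trace that the field was involved, which may help auditability. Whether that is sufficient is, as said, for the auditors to decide.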