First, Do No Harm - Designing Robust Infrastructure

Hippocrates - 460-377 BC

Hippocrates’ Primum non nocere, “First do no harm”

Several customers have asked for a notification mechanism that alerts them when errors are detected in their programs. Simply raising an event is straightforward, but our promise to our customers is that we’ll do the hard thinking that ensures Gibraltar is safe and robust in production systems. Our mantra is: first, do no harm.

In this case, we asked ourselves questions like:

  • What if a customer’s error notification logic is slow?  How do we ensure that it doesn’t slow down the application as a whole?
  • What if the program starts screaming thousands of errors?  How do we ensure that we don’t swamp the error notification handler?
  • What if there are errors in the customer’s error notification handler?  What if it throws an exception?  What if it hangs?

The result is a design in which the whole logging infrastructure, both Gibraltar itself and the customer logic that interfaces with it, stays robust and safe.

Our central Log object in Gibraltar Agent now has a MessageAlert event that is raised when warning, error, or critical messages are recorded.  This event has a number of safety features such as:

  • Asynchronous: The event is raised on a background thread that is not part of the logging path, ensuring that time spent handling the event will not slow down logging or affect other threads.
  • Batching: When a burst of qualifying messages is recorded, they will typically be raised together in a single event to allow more efficient processing.
  • Throttling: A minimum delay between events can be easily specified to ensure the event isn’t raised too frequently, particularly in error cascade scenarios.  Messages are batched up until the next time the event can be raised.
  • Hang Protection: If the event handler never returns, the Agent will continue to process messages but will stop queuing them for the event, allowing them to be released from memory.
  • Loop Protection: Messages that are recorded by your event handler will not cause additional events to be raised.  This prevents notification loops where an event handler records an error during notification which subsequently causes the message alert notification to be raised again.
  • Low Overhead: We don’t spin up anything (the thread, the queue, etc.) until someone subscribes to the event, so if you don’t use this feature it doesn’t consume any resources.
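
To make the list above concrete, here is a minimal sketch of how such a dispatcher could be put together. It is written in Python purely for illustration and is not Gibraltar's implementation: the `AlertDispatcher` class, its `subscribe` and `record` methods, and the `min_delay` parameter are all hypothetical names, and hang protection is left out to keep the sketch short.

```python
import queue
import threading
import time

# Severities that qualify for an alert (mirrors warning, error, critical).
QUALIFYING = {"warning", "error", "critical"}


class AlertDispatcher:
    """Hypothetical sketch of an asynchronous, batched, throttled alert event."""

    def __init__(self, min_delay=0.0):
        self.min_delay = min_delay   # throttling: minimum seconds between events
        self._handlers = []
        self._queue = queue.Queue()
        self._worker = None          # low overhead: no thread until first subscriber
        self._lock = threading.Lock()

    def subscribe(self, handler):
        """Register a handler; the background thread starts on first use."""
        with self._lock:
            self._handlers.append(handler)
            if self._worker is None:
                self._worker = threading.Thread(target=self._run, daemon=True)
                self._worker.start()

    def record(self, severity, text):
        """Called on the logging path; never waits for a handler."""
        if severity not in QUALIFYING:
            return
        if self._worker is None:
            return  # nobody is subscribed, so nothing is queued
        if threading.current_thread() is self._worker:
            return  # loop protection: messages logged by a handler raise no event
        self._queue.put((severity, text))

    def _run(self):
        while True:
            batch = [self._queue.get()]  # block until at least one message arrives
            while True:                  # batching: take everything already waiting
                try:
                    batch.append(self._queue.get_nowait())
                except queue.Empty:
                    break
            for handler in list(self._handlers):
                try:
                    handler(batch)
                except Exception:
                    pass  # a faulty handler must not break the dispatcher
            # Throttling: messages recorded during this pause pile up in the
            # queue and are delivered together in the next event.
            time.sleep(self.min_delay)
```

A caller would create one dispatcher, call `subscribe` with a function that accepts a list of `(severity, text)` tuples, and call `record` from the logging path. Pausing after each event is the simplest way to get throttling and batching from the same queue; a production design would also need to handle a handler that never returns.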

The MessageAlert event is particularly useful for automatically triggering immediate data transmission when an error occurs and for implementing your own error notification mechanism. The full detail of each log message is available in the event.

Check out our recent post on [Charting Enhancements Coming in Gibraltar](/blog/cool-charting-enhancements-coming-in-gibraltar "Charting Enhancements Coming in Gibraltar") for more examples of how we are incorporating customer feedback to ensure that Gibraltar provides a robust logging infrastructure allowing you to build rock solid .NET software.
