Intellisense Driven API Design

Like most framework developers, we put a lot of care and effort into what our public API looks like.  We read some good books (and even took their advice), looked at a lot of examples, and had some very long arguments about it.  We employed a range of modern object-oriented approaches to maximize code reuse: inheritance, interfaces, abstract classes…  It all made good sense to us and appeared to work great.

Our beta customers weren’t as successful.  Despite a real willingness on their part and frequent interactions with our team, they had a lot of trouble figuring out how to achieve the results they needed.  Fortunately, the dialog back and forth gave us a lot of insight into where people were getting tripped up.  The a-ha moment was one comment:

“I don’t think your notes were bad, I just think the manual event process is different than I expect from just poking around. Can you even call it the way I am trying? There were a couple of syntax attempts that looked ok to me, but didn’t work.”

Novices to an API often just try to use the objects.

That got us started down a new path:  How do we make it as obvious as possible from the namespace which objects developers should use? From nothing more than what’s visible in Intellisense, can they find the right method(s) and understand why?  Making that happen became our mantra - Intellisense Driven Design.  A moderately experienced developer should be able to figure out what to do just from what’s visible using Intellisense in Visual Studio.

To meet this goal we made a number of changes:

  1. Ruthlessly get rid of classes in the namespace:  Every class was an opportunity for confusion where someone asks “What is that, and why do I need it?”
  2. Reduce usage patterns:  We previously supported about five patterns of using the objects to suit a range of personal coding preferences.  We reduced this to the bare minimum of two patterns:  One declarative and one entirely programmatic.
  3. Separate things that are separate:  We were using inheritance to link together several things that we internally treated similarly.  At the API layer that confused users into thinking that the items were more interchangeable than they were.  We redesigned them to be more clearly distinct to prevent the objects from being interchangeable at the compiler level while preserving common usage patterns.
  4. Expose concrete classes:  While inheritance and interfaces maximize code reuse within the API itself, they tend to generate a lot of noise in the namespace.
  5. Make the API completely thread safe:  We expect users to want to use our API in multithreaded applications, but many developers aren’t fully versed on how to correctly handle shared data.  We changed the API to make it impossible to use in a thread unsafe manner.
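The thread-safety goal in point 5 boils down to never handing callers a check-then-add sequence they can get wrong.  A minimal sketch of that idea, assuming nothing about our actual internals (every name here is hypothetical, not a type from the shipped library):

```csharp
using System.Collections.Concurrent;

// Sketch: make registration the only way to get an object, and make
// registration atomic, so there is no thread-unsafe path to take.
public class MetricDefinition
{
    private static readonly ConcurrentDictionary<string, MetricDefinition> s_Definitions =
        new ConcurrentDictionary<string, MetricDefinition>();

    public string Key { get; private set; }

    private MetricDefinition(string key)
    {
        Key = key;
    }

    public static MetricDefinition Register(string category, string name)
    {
        string key = category + "." + name;

        // GetOrAdd is atomic from the caller's perspective: if two threads
        // register the same key at the same time, both get back the one
        // instance that was actually stored.
        return s_Definitions.GetOrAdd(key, k => new MetricDefinition(k));
    }
}
```

Because the constructor is private and Register always returns the shared instance, a caller simply cannot race another thread into creating a duplicate.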

To get this done we had to make some counter-optimizations.  The biggest issue in our way was that we reused the same classes to write and read data, but our customers would only ever use the API to write data.  That desire to reuse the classes was at the core of the additional complexity, driving extensive use of inheritance and interfaces.  It also necessitated exposing a large number of methods that were really only suitable for internal use.

To achieve the goals we wanted for usability we ended up creating a new API layer exclusively for our customers, even though that ended up creating a lot of duplication within our library.  In retrospect this seems almost obvious:  Customers don’t care about how clean our internal object model is or how much work it took, they care about having it be as easy as possible to understand.
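The shape of that customer-facing layer is essentially a facade: a small, write-only public surface that delegates to the richer internal model.  A hypothetical sketch (these type names are illustrative, not our real classes):

```csharp
// The internal model supports both reading and writing because we reuse it
// for both; none of that needs to be visible to customers.
internal class InternalSampledMetric
{
    private double m_LastValue;

    public void WriteValue(double value) { m_LastValue = value; }
    public double ReadValue() { return m_LastValue; } // only used internally
}

// The public facade duplicates a little code, but it exposes exactly one
// thing in Intellisense: the write-side operation customers actually use.
public class CustomerMetric
{
    private readonly InternalSampledMetric m_Inner = new InternalSampledMetric();

    public void WriteSample(double value)
    {
        m_Inner.WriteValue(value);
    }
}
```

The duplication lives entirely on our side of the line; what the customer sees is one class with one obvious method.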

The second counter-optimization was that a few of the additional usage patterns existed to provide the highest possible performance in certain scenarios.  This turned out to be a case of premature optimization: while they did improve performance by around 10%, the API was already fast enough even in its slowest usage pattern that such improvements were irrelevant in the real world.

The results

  • The number of exposed classes was reduced from 79 to 22.  Of the 22, 7 are attribute classes used for Declarative definition leaving only 15 classes users would ever need to code against.
  • The number of lines of code required to implement the average usage pattern was reduced by 50%.
  • Peak performance was reduced by 10%, but average performance was the same.  Despite the code duplication, the final binary ended up only 40kb larger than it was before.

What’s it all really mean?

Before the Refactoring

/// <summary>
/// Snapshot cache metrics
/// </summary>
public static void RecordCacheMetric(int pagesLoaded)
{
    CustomSampledMetricDefinition pageMetricDefinition;
    CustomSampledMetric pageMetric;
    CustomSampledMetricDefinition sizeMetricDefinition;
    CustomSampledMetric sizeMetric;
 
    //since sampled metrics have only one value per metric,
    //we have to create multiple metrics
    MetricDefinition workingDefinition;
    if (Log.Metrics.TryGetValue("Example", "Database.Engine", "Cache Pages", out workingDefinition) == false)
    {
        //doesn't exist yet - add it in all of its glory
        pageMetricDefinition = new CustomSampledMetricDefinition("Example", "Database.Engine", "Cache Pages", MetricSampleType.RawCount);
        pageMetricDefinition.Description = "The number of pages in the cache";
        pageMetricDefinition.UnitCaption = "Pages";
    }
    else
    {
        //it already exists- just need to type cast it.
        pageMetricDefinition = (CustomSampledMetricDefinition)workingDefinition;
    }
 
    if (Log.Metrics.TryGetValue("Example", "Database.Engine", "Cache Size", out workingDefinition) == false)
    {
        //doesn't exist yet - add it in all of its glory
        sizeMetricDefinition = new CustomSampledMetricDefinition("Example", "Database.Engine", "Cache Size", MetricSampleType.RawCount);
        sizeMetricDefinition.Description = "The number of bytes used by pages in the cache";
        sizeMetricDefinition.UnitCaption = "Bytes";
    }
    else
    {
        //it already exists- just need to type cast it.
        sizeMetricDefinition = (CustomSampledMetricDefinition)workingDefinition;
    }

    //now that we know we have the definitions, make sure we've defined the metric instances.
    if (pageMetricDefinition.Metrics.TryGetValue(null, out pageMetric) == false)
    {
        pageMetric = pageMetricDefinition.Metrics.Add(null);
    }

    if (sizeMetricDefinition.Metrics.TryGetValue(null, out sizeMetric) == false)
    {
        sizeMetric = sizeMetricDefinition.Metrics.Add(null);
    }

    //now go ahead and write those samples....
    pageMetric.WriteSample(pagesLoaded);
    sizeMetric.WriteSample(pagesLoaded * 8196); 
}

After the Refactoring

And with the new API changes:

/// <summary>
/// Snapshot cache metrics
/// </summary>
public static void RecordCacheMetric(int pagesLoaded)
{
    SampledMetricDefinition pageMetricDefinition;
    SampledMetricDefinition sizeMetricDefinition;

    //since sampled metrics have only one value per metric,
    //we have to create multiple metrics
    if (SampledMetricDefinition.TryGetValue("Example", "Database.Engine", "Cache Pages", out pageMetricDefinition) == false)
    {
        //doesn't exist yet - add it in all of its glory.
        //This call is MT safe - we get back the object in cache even if registered on another thread.
        pageMetricDefinition = SampledMetricDefinition.Register("Example", "Database.Engine", "cachePages", SamplingType.RawCount, "Pages", "Cache Pages", "The number of pages in the cache");
    }

    if (SampledMetricDefinition.TryGetValue("Example", "Database.Engine", "Cache Size", out sizeMetricDefinition) == false)
    {
        //doesn't exist yet - add it in all of its glory.
        //This call is MT safe - we get back the object in cache even if registered on another thread.
        sizeMetricDefinition = SampledMetricDefinition.Register("Example", "Database.Engine", "cacheSize", SamplingType.RawCount, "Bytes", "Cache Size", "The number of bytes used by pages in the cache");
    }

    //now that we know we have the definitions, make sure we've defined the metric instances.
    SampledMetric pageMetric = SampledMetric.Register(pageMetricDefinition, null);
    SampledMetric sizeMetric = SampledMetric.Register(sizeMetricDefinition, null);

    //now go ahead and write those samples....
    pageMetric.WriteSample(pagesLoaded);
    sizeMetric.WriteSample(pagesLoaded * 8196);
}

But wait, there’s more!

That doesn’t seem too impressive until you realize that with the new API it’s possible to do the following, still in a thread-safe manner:

/// <summary>
/// Snapshot cache metrics using fewest lines of code
/// </summary>
public static void RecordCacheMetricShortestCode(int pagesLoaded)
{
    //Alternately, it can be done in a single line of code each,
    //although somewhat less readable.  Note the WriteSample call after the Register call.
    SampledMetric.Register("Example", "Database.Engine", "cachePages", SamplingType.RawCount, "Pages", "Cache Pages", "The number of pages in the cache", null).WriteSample(pagesLoaded);

    SampledMetric.Register("Example", "Database.Engine", "cacheSize", SamplingType.RawCount, "Bytes", "Cache Size", "The number of bytes used by pages in the cache", null).WriteSample(pagesLoaded * 8196);
}

One line of code per metric: just two lines of code, down from 11 (excluding braces). More importantly, the original implementation wasn’t thread safe; you’d have to wrap each metric in an external lock in your own code.
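For comparison, here is roughly the burden the original API pushed onto callers: an external lock over the whole check-then-add sequence for each metric.  This is an illustrative sketch with hypothetical names, not code from the shipped library:

```csharp
using System.Collections.Generic;

public static class OldStyleMetricHelper
{
    // Without a lock, two threads could both miss in TryGetValue and then
    // both call Add, and the second Add would throw.
    private static readonly object s_Lock = new object();
    private static readonly Dictionary<string, object> s_Metrics =
        new Dictionary<string, object>();

    public static object GetOrAddMetric(string key)
    {
        lock (s_Lock)
        {
            object metric;
            if (s_Metrics.TryGetValue(key, out metric) == false)
            {
                metric = new object();
                s_Metrics.Add(key, metric);
            }
            return metric;
        }
    }
}
```

Moving that locking inside a single Register call means every caller gets it right for free, which is exactly the kind of decision a user poking around in Intellisense should never have to make.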

Closing Thoughts

Time will tell how successful we were with our goals. You can check out the current API documentation online, or download the product and try it out yourself. Our internal unit tests and examples give us a lot of confidence that it’s much improved, but the ultimate test is our customers.  While we’re huge believers in documentation, even we charge in and see how far we can get before reading anything, so why would we expect more from our customers?  We were very lucky to be able to correct this before we shipped the first version and became married to supporting it for backwards compatibility.

Finally, this highlights the value of leaving time for external beta testing.  We did three rounds of external beta testing, and each provided a wealth of essential information about our target market, what worked, and what didn’t.  While each round was invitation only, bringing in new participants every time taught us what we needed to know to make the best software.  Regardless of where you are on the Agile development spectrum, if you’re making an API you can be sure that no matter how well you think it through, your first users will point out key flaws.  Make sure you can address them before you’re committed to preserving your mistakes for all time.
