Wednesday, February 4, 2015

Logging and Error-Handling Guidelines

I'm pretty sure I sound kind of logging-obsessed to my fellow developers sometimes. I probably sound like a broken record when I ask to see the log while troubleshooting an issue, but when logging is done right it provides a clear path to the root cause of a problem. To my mind, this is about 1000 percent better than the alternative, which mostly consists of guessing and randomly trying things, followed by hiding from management when they come around demanding a status update.

So, logging is extremely valuable - at least good logging is. But what makes for good logging? I'm tempted to say "I know it when I see it", but in fact over the years I have managed to codify some guidelines about logging and error handling, which I'll lay out here.

We use log4net as our standard logging package for .NET projects. It's been around forever, and I know there are sexier options out there, but the guidelines below kind of assume log4net. If you use a different logging framework, I'm pretty sure most of the recommendations will still apply.

In a normal production environment, the logging level in configuration should be set to INFO. In log4net XML configuration, it would look something like this:

  <log4net>
    <appender name="TheAppender" etc.. />
    <root>
      <level value="INFO" />
      <appender-ref ref="TheAppender" />
    </root>
  </log4net>


When you are testing or troubleshooting, the level can be changed to DEBUG to gather more information about the system. That means that as you write log statements, you'll need to think about whether you'll want to see them generated as part of normal behavior, or only in debugging/troubleshooting situations.

Each class should have its own Logger instance. We declare it like this:

private static readonly log4net.ILog Log = log4net.LogManager.GetLogger(System.Reflection.MethodBase.GetCurrentMethod().DeclaringType.Name);

This allows the logger to automatically pick up the name of the class. Fully qualifying every type name in the declaration means the line can be copied "as is" into a new class in one action, without needing to add any "using" directives.

When coding, try to anticipate what types of log messages would be helpful to a future troubleshooter. This is a bit of a balance, because you want to avoid "spamming" the log with low-value information.

Log every request coming in at the highest code level with a level of INFO. This means controller methods, Main(), etc. This gives you a permanent record of all the requests that you handle.

When logging unexpected error conditions or exceptions you can't recover from, do so with a logging level of ERROR. This may sound obvious, but if you're consistent about it, you can quickly search through any log for the problem that Support is probably asking you about.

When you log an exception, include its full details: message, stack trace, inner exception (if any), etc. Log4net makes this easy: pass the exception object as the second argument to Log.Error() and it writes out the whole thing. Having the full stack trace in the log is enormously valuable for troubleshooting.

Log every condition that can be handled but may be a sign of trouble with a logging level of WARN. These often provide the "smoking gun" for more inscrutable errors.

When you get bug reports around unexpected edge cases, log the condition when you're fixing the issue. These are often WARN statements. 

Log every database read and write (in Entity Framework, etc.) with a logging level of DEBUG. These are almost always too frequent to log as INFO (and can leak sensitive information) but are enormously helpful for troubleshooting.

Log other interactions with the outside world (file interaction, communication with external web services, etc.) with a logging level of INFO, unless they're so common they really spam the log, in which case go with DEBUG. The same reasoning as for database access - out-of-process calls of any type are a frequent source of failure.

Log any other helpful data, as needed, with a logging level of DEBUG. These types of log statements tend to be up to the discretion of the developer, and often they may be removed once a section of code is stable.

Errors that are not caught anywhere in the code should be caught and logged by the framework if possible. ELMAH is one tool that does this for ASP.NET web applications. If not possible, the entry point method (controller method, Main()) should have a general catch exception that logs the error.
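
As a rough illustration of the "general catch" in an entry point, here is a minimal sketch of a console Main(). It assumes log4net is referenced and configured along the lines of the XML shown earlier. Run() is a hypothetical placeholder for the real work, and the exit codes 0 and 1 are my own choice for the example.

```csharp
using System;
using log4net;

public static class Program
{
    private static readonly ILog Log = LogManager.GetLogger(typeof(Program));

    public static int Main(string[] args)
    {
        // INFO: a permanent record that this run (or request) happened.
        Log.InfoFormat("Starting with {0} argument(s)", args.Length);
        try
        {
            Run(args);
            return 0;
        }
        catch (Exception ex)
        {
            // Last line of defense: nothing below caught this, so log the
            // full exception (message, stack trace, inner exceptions).
            Log.Error("Unhandled exception in Main", ex);
            return 1; // non-zero process exit code signals failure
        }
    }

    // Hypothetical placeholder for the application's real work.
    private static void Run(string[] args)
    {
        if (args.Length == 0)
        {
            throw new ArgumentException("Expected at least one argument.", "args");
        }
    }
}
```

This is the only try/catch in the sketch. Everything below Main() is free to let exceptions propagate up to it.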

If exceptions are caught, they should be re-thrown or completely handled. Under no circumstances should they leave the data in a bad state.

Try/catch blocks should be relatively rare and exist to handle specific error conditions. I've seen a lot of code where developers put a try/catch block in almost every method. This just adds cruft and overhead without adding any value. I think in most cases developers who do this are falling into a "cargo cult" mentality where they think they must add try/catch blocks everywhere in order to "do error handling right." They may be half-right - they should think about error handling in every method, but in my experience most of the time the best thing to do is simply let the exception be thrown and handled up the call stack.

Do not return status codes to indicate whether there was an error in a method. If there was an error, throw an exception from the method. This simplifies the calling code significantly as it does not have to check the return values or status codes of method calls. Older programmers who started out their careers using C or other languages without exception support are probably the most guilty of this.
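
To make the difference concrete, here is a small sketch. The Account and Transfers classes are invented for illustration only. Because Withdraw() throws rather than returning a status code, Transfer() has nothing to check: if the withdrawal fails, the deposit never runs and neither balance changes.

```csharp
using System;

public class Account
{
    public decimal Balance { get; private set; }

    public Account(decimal openingBalance)
    {
        Balance = openingBalance;
    }

    public void Deposit(decimal amount)
    {
        if (amount <= 0)
        {
            throw new ArgumentOutOfRangeException("amount", amount, "Amount must be positive.");
        }
        Balance += amount;
    }

    // Throws on failure instead of returning false or an error code.
    public void Withdraw(decimal amount)
    {
        if (amount <= 0)
        {
            throw new ArgumentOutOfRangeException("amount", amount, "Amount must be positive.");
        }
        if (amount > Balance)
        {
            throw new InvalidOperationException(
                string.Format("Cannot withdraw {0} from a balance of {1}.", amount, Balance));
        }
        Balance -= amount;
    }
}

public static class Transfers
{
    // No status checks needed: an exception from Withdraw() skips the
    // Deposit() call and propagates to whoever can actually handle it.
    public static void Transfer(Account source, Account target, decimal amount)
    {
        source.Withdraw(amount);
        target.Deposit(amount);
    }
}
```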

Sometimes it makes sense to catch exceptions at a lower level in order to log specific details about the exception at that level. If this is done, be sure to re-throw the exception.

When re-throwing an exception, it is usually better to use "throw" by itself, rather than throwing the caught exception. Here is a quick example:

    string base64 = …
    byte[] bytes;
    try
    {
        bytes = Convert.FromBase64String(base64);
    }
    catch (FormatException ex)
    {
        // Log the bad data for help in troubleshooting the problem
        Log.ErrorFormat("Bad base64 string: '{0}'", base64);
        throw; // this preserves the full stack trace
        // do not do this as it resets the stack trace:
        // throw ex;
    }

In the code above, the only reason there is a try/catch block is so we can log the bad base64 string in the case that it cannot be converted to a byte array. By calling "throw" rather than "throw ex", the stack trace will appear upstream as if it were not caught at all at this level.

Avoid NullReferenceExceptions as they typically do not contain adequate information about what went wrong. Validating method arguments and throwing ArgumentNullException, ArgumentOutOfRangeException, etc. is the best practice to avoid this.
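
Here is a minimal sketch of that kind of argument validation. TextUtil.Repeat() is a made-up helper, not something from a real library. The point is that a bad argument fails immediately, with the parameter name attached.

```csharp
using System;
using System.Linq;

public static class TextUtil
{
    // Hypothetical helper: returns 'text' repeated 'count' times.
    public static string Repeat(string text, int count)
    {
        // Fail fast, naming the offending parameter, rather than letting a
        // NullReferenceException surface somewhere less obvious later on.
        if (text == null)
        {
            throw new ArgumentNullException("text");
        }
        if (count < 0)
        {
            throw new ArgumentOutOfRangeException("count", count, "count must not be negative.");
        }

        return string.Concat(Enumerable.Repeat(text, count));
    }
}
```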

Along the same lines, be sure to validate the result of data queries, including Entity Framework or LINQ queries, and throw an appropriate exception. An exception of this type ("Contact #123 was not found in database") is much more helpful than a NullReferenceException occurring many method calls later.
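
A sketch of what that might look like, using an in-memory collection in place of a real Entity Framework query. The Contact class and GetContact() method are hypothetical, and KeyNotFoundException is just one reasonable choice of exception type.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Contact
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class ContactQueries
{
    public static Contact GetContact(IEnumerable<Contact> contacts, int contactId)
    {
        if (contacts == null)
        {
            throw new ArgumentNullException("contacts");
        }

        // FirstOrDefault returns null when nothing matches...
        Contact contact = contacts.FirstOrDefault(c => c.Id == contactId);

        // ...so check right here, where we still know what we were looking for.
        if (contact == null)
        {
            throw new KeyNotFoundException(
                string.Format("Contact #{0} was not found in database", contactId));
        }
        return contact;
    }
}
```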

Client applications need to be able to handle various failed requests to the server (server not responding, 404 errors, 500 errors, etc.). If this is not done, it often leaves the client application "hanging" in an unresponsive state.

Taking a step back, you'll want to think about the process around all this. When a bug or problem is discovered, part of getting it fixed or resolved should be a discussion of what additional logging or instrumentation would have helped to diagnose the issue. Since you generally can't (or shouldn't) do source-level debugging on a production system, you have to rely on production logs, which should help you reproduce the issue in a development environment.

Monday, December 1, 2014

My Queue-Based Life

"In the future, there will be so much going on that no one will be able to keep track of it."

--David Byrne, "In the Future", from The Knee Plays

I love Instapaper. It helped solve the "20 browser tabs open" problem where you click on something interesting to read - but it turns out to be kinda long, and you don't have time to read it right now. In fact, you don't really want to read it on your work computer at all; you'd rather pull it up on your tablet in your living room with a fire going in the fireplace and a glass of Dogfish Head 60-Minute IPA at your side. Consequently, I tend to have a fairly long list of interesting articles queued up to read at any given time, lovely articles from The Atlantic and Slate and The New Yorker and Vanity Fair and Wired and Rolling Stone and qz.com and the like.

However, the Instapaper articles compete with digital issues of Newsweek, which are full of articles about NEWS which come out EVERY SINGLE WEEK. I'm usually about 6-8 issues behind, which means that I can sometimes skip articles because the Big Question they're asking has been answered, or is no longer relevant at all.

Newsweek and Instapaper, of course, compete for my reading time with books, still mostly paper but some digital. Authors have this annoying habit of writing books on fascinating subjects that I can't wait to dive into, and so those titles are added to the list I keep on a Google Drive spreadsheet, which is also more or less duplicated in a Goodreads account. Many of these books have been purchased and sit there, waiting to be read as soon as I can get around to them.

If I don't feel like reading, no problem - TiVo dutifully records all kinds of interesting TV programs, hours and hours and hours worth, more than my wife and I could ever hope to watch. Lately, an "AFI's List of 100 Greatest Movies" smart search thingy has intersected with an enhanced cable subscription that gives us Turner Classic Movies, and the TiVo lit up like a pinball machine, recording "Casablanca" and "Citizen Kane" and "The African Queen" and "Bringing Up Baby" and "Platoon" and "The Last Picture Show" and so on and so on.

If we're not in the mood for something on TiVo, Netflix has us covered - there are maybe 50 movies and a few TV series in the queue there. Who knows if we'll ever get around to watching them? Actually, I know - we'll never get around to watching them all. And that's not including the dozens of TV shows or movies that my friends tell me I MUST watch - how could I be missing out on Game of Thrones or Fargo or The Walking Dead or Adventure Time or Girls or Black-ish or Doctor Who or The Knick, etc, etc, etc.

Meanwhile, when I'm exercising or driving or doing mundane household chores or just lying sick in bed, there are the podcasts. I've discovered I'm something of a podcast junkie. It is so, so easy to subscribe to a new one, promising hours and hours of interesting listening - techie podcasts like Hanselminutes and .NET Rocks; wonky podcasts like Planet Money and Freakonomics and The Commonwealth Club of California; the Slate Political and Culture gabfests; Marc Maron's WTF interviews, and just damned interesting podcasts like Radiolab and 99% Invisible. I'm months behind on some of those.

This may be the ultimate First World Problem. There is so much quality content being produced, by so many amazingly talented writers/thinkers/researchers/producers/directors/actors/singers/playwrights/I/could/go/on/and/on that I occasionally stress out that I'm not consuming it fast enough. Each separate stream of content threatens to become its own to-do list to be worked through as quickly and efficiently as possible, like Lucy and Ethel wrapping candies at the conveyor belt.

I have, sometimes, learned to let go, not worry about all of it piling up, and just cherry-pick the best stuff. This doesn't solve the larger problem, however, which is that consuming all this stuff - as great as it can be - is not the same as creating something or doing something. Yes, I am an info-junkie, but I have to remember to put that information to good use in the service of action, not merely learning for learning's sake, as attractive as that can be.

Wednesday, November 26, 2014

Creating a Shuffled Classical Music Playlist

I have a reasonably large classical music collection, and sometimes I'm in the mood to put classical music on, but I don't want to choose exactly what. Yes, I could find some classical music radio station, either over the air or on the internet, but having grown up in an age before universal streaming, sometimes I just want to listen to my music. I'm generally a big fan of randomly shuffled music, and I like Smart Playlists in iTunes. Using smart playlists, I can set up metadata-based playlists like "play a random selection of all rock or blues tracks rated 4 stars or higher that have not been played in the last 3 months."

Classical music is different than rock, pop, blues, jazz, country or most other forms of music. Besides the obvious things, you're often more interested in the composer than the performer. For example, I know I like Beethoven better than Vivaldi, but I wouldn't say that I like the Boston Symphony better than the New York Philharmonic. (This isn't 100% true, particularly for solo artists - I might have particular pianists, violinists, etc. that I really like.)

You'll notice another difference with classical music if you try to listen to it by shuffling tracks. You might get the second movement of a Mendelssohn symphony, followed by the third movement of a string quartet, followed by the first movement of a piano concerto, followed by the William Tell Overture. This is not quite what I'm looking for - I'd rather have it play the entire symphony, followed by the entire string quartet, followed by the whole piano concerto, followed by the William Tell Overture.

I could shuffle by album, but that's not quite "shuffle-y" enough. Plus, occasionally I have a piece that spans two discs. I have a nice recording of Tchaikovsky's Symphonies 4, 5, and 6 on two discs, but the poor Fifth Symphony gets split across the discs.

In other words, I don't want to shuffle by track but by work. A classical work is the typical unit of composition and performance - Beethoven wrote the 5th symphony as a cohesive whole, and usually all four movements are performed together and in succession.

Fortunately, iTunes supports classical composers and works, sort of. If you right-click on a track and choose "Get Info", you'll see something like this:



There is a specific field for Composer, but what about Work? Well, it turns out that iTunes treats the Grouping field as the work, at least for tracks in the Classical genre. You used to be able to see this in the Classical Music smart playlist in older versions of iTunes, but apparently they've changed things around in more recent versions so it doesn't say anything about the work there anymore.

To aid in the effort of adding composer and grouping information, I made smart playlists called "Classical with No Composer" and "Classical with No Work Grouping", which help to identify where I need to add metadata:


It's relatively easy to add composer information to everything, but quite a bit more work to add Work groupings if you have a large classical music collection. It turns out that having the Work entered for everything is nice, but not critically important, as I'll get to in a minute.

What I want, then, is to get all the tracks in the Classical genre that have a composer, group them by work, then shuffle the works, while preserving the track order within the work. Building on some earlier work coding against the iTunes API, this ultimately evolved into a high-level LINQ expression that's fairly expressive:

        List<IITFileOrCDTrack> tracks =
            iTunes.Library.FileTracks
                .InGenre("Classical")
                .WithComposer()
                .WithFile()
                .ShuffleByWork()
                .ToList();
  
This expression uses several extension methods on IEnumerable<IITFileOrCDTrack>:

        public static IEnumerable<IITFileOrCDTrack> InGenre(this IEnumerable<IITFileOrCDTrack> trackCollection, string genreName)
        {
            return from IITFileOrCDTrack track in trackCollection
                   where track.Genre == genreName
                   select track;
        }

        public static IEnumerable<IITFileOrCDTrack> WithComposer(this IEnumerable<IITFileOrCDTrack> trackCollection)
        {
            return from IITFileOrCDTrack track in trackCollection
                   where !string.IsNullOrWhiteSpace(track.Composer)
                   select track;
        }

        public static IEnumerable<IITFileOrCDTrack> WithFile(this IEnumerable<IITFileOrCDTrack> trackCollection)
        {
            return from IITFileOrCDTrack track in trackCollection
                   where File.Exists(track.Location)
                   select track;
        }
  
These are pretty straightforward, but obviously the ShuffleByWork method is the hard part. But first, we need a simple ClassicalWork class:

    public class ClassicalWork
    {
        public string Composer { get; private set; }
        public string Album { get; private set; }
        public string Name { get; private set; }
    }
  
This class also has a constructor, plus overrides of the System.Object methods Equals(), GetHashCode(), and ToString(), but there's nothing spectacular here.
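
For the curious, here is a sketch of what those omitted members might look like. This is my guess at a reasonable shape, not a copy of the real class: treating null as an empty string and the particular hash-combining scheme are both assumptions. The important part is that Equals() and GetHashCode() agree, since that is what lets LINQ's "group by" treat two ClassicalWork objects with the same values as the same key.

```csharp
using System;

public class ClassicalWork : IEquatable<ClassicalWork>
{
    public string Composer { get; private set; }
    public string Album { get; private set; }
    public string Name { get; private set; }

    public ClassicalWork(string composer, string album, string name)
    {
        // Assumption: normalize nulls so comparisons and hashing never throw.
        Composer = composer ?? string.Empty;
        Album = album ?? string.Empty;
        Name = name ?? string.Empty;
    }

    public bool Equals(ClassicalWork other)
    {
        return other != null
            && Composer == other.Composer
            && Album == other.Album
            && Name == other.Name;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as ClassicalWork);
    }

    public override int GetHashCode()
    {
        // Equal works must produce equal hash codes for grouping to work.
        unchecked
        {
            int hash = 17;
            hash = hash * 31 + Composer.GetHashCode();
            hash = hash * 31 + Album.GetHashCode();
            hash = hash * 31 + Name.GetHashCode();
            return hash;
        }
    }

    public override string ToString()
    {
        return string.Format("{0}: {1} ({2})", Composer, Name, Album);
    }
}
```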

The ShuffleByWork() method starts out by grouping all the tracks into ClassicalWork objects:

             var workGroups = from filetrack in filetracks
                             group filetrack by new ClassicalWork(filetrack.Composer, filetrack.Album, Classical.WorkName(filetrack.Grouping, filetrack.Name));
  
The Classical.WorkName() static method constructs the best possible name for the work. If we have a Grouping for the particular track, well, we know that's the work name, so we go with that. If not, we look at the track name. Overtures are normally only one track, and are good to intersperse between longer works, so if the track name contains "overture" we return it as the work name. Otherwise, we just return an empty string, which means that this track isn't really part of a work (but we'll still include it in the shuffled playlist).
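
The rules above are simple enough to sketch in a few lines. This is a reconstruction from the description, not the actual source. In particular, matching "overture" case-insensitively is my assumption.

```csharp
using System;

public static class Classical
{
    public static string WorkName(string grouping, string trackName)
    {
        // Rule 1: a Grouping, when present, is the work name.
        if (!string.IsNullOrWhiteSpace(grouping))
        {
            return grouping;
        }

        // Rule 2: a lone overture counts as its own one-track work.
        // (Assumption: the match ignores case.)
        if (!string.IsNullOrEmpty(trackName)
            && trackName.IndexOf("overture", StringComparison.OrdinalIgnoreCase) >= 0)
        {
            return trackName;
        }

        // Rule 3: otherwise the track is not part of any named work.
        return string.Empty;
    }
}
```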

Grouping tracks in this manner is really what defines what we consider to be a work. All tracks on a particular album that have the same composer and work name (as defined above) are considered to be the same work as far as our shuffling goes. This means that all tracks on an album that have the same composer but don't have a Grouping will still get grouped together into one pseudo-work. In practice, this is fine - if I have several tracks on an album by Sibelius, say, but don't know anything else about whether they fit together, it's perfectly OK to group them together.

So now workGroups is a list of all the works we're dealing with, each with a list of the tracks in the work. Next, we just shuffle all the work groups:

             var shuffledWorkGroups = workGroups.Shuffle();
  
The Shuffle() extension method is perfectly generic (in both senses of the word) and could be used to shuffle any collection - a deck of cards, etc:

        public static IEnumerable<T> Shuffle<T>(this IEnumerable<T> enumerable)
        {
            var random = new Random();
            var shuffled = from item in enumerable
                           orderby random.Next()
                           select item;

            return shuffled;
        }
  
The last thing to do in the ShuffleByWork() method is to iterate through each item in our shuffled work groupings, and return the tracks in each grouping ordered by the original track number:

            foreach (var workGroup in shuffledWorkGroups)
            {
                // Work name is workGroup.Key;
                var tracks = workGroup.OrderBy(t => t.TrackNumber);
                foreach (IITFileOrCDTrack track in tracks)
                {
                    yield return track;
                }
            }
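
If you want to play with the group-shuffle-flatten idea without iTunes installed, here is a self-contained sketch. The Track class is a hypothetical stand-in for IITFileOrCDTrack, with a plain Work string in place of the ClassicalWork key. Passing in the Random is my own tweak so that a run can be repeated with a fixed seed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for IITFileOrCDTrack.
public class Track
{
    public string Work { get; set; }
    public int TrackNumber { get; set; }
    public string Name { get; set; }
}

public static class WorkShuffler
{
    public static IEnumerable<Track> ShuffleByWork(IEnumerable<Track> tracks, Random random)
    {
        // Group tracks into works, then put the works in random order.
        // ToList() fixes that order once, up front.
        var shuffledGroups = tracks
            .GroupBy(t => t.Work)
            .OrderBy(g => random.Next())
            .ToList();

        foreach (var workGroup in shuffledGroups)
        {
            // Within a work, keep the original track order.
            foreach (Track track in workGroup.OrderBy(t => t.TrackNumber))
            {
                yield return track;
            }
        }
    }
}
```

Whatever order the works come out in, the movements of each work stay together and in sequence.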
  
Next, we'll create a playlist, deleting an existing playlist if necessary:

                IITUserPlaylist playlist = iTunes.FindPlaylistByName(playlistName);
                if (playlist != null)
                {
                    playlist.Delete();
                }
                playlist = iTunes.CreatePlaylist(playlistName);
                foreach (var track in tracks)
                {
                    playlist.AddTrack(track);
                }

Once the program runs, the playlist obligingly shows up in iTunes:


Finally, we'll create an M3U playlist, which is a very simple text file and will allow non-iTunes apps to access the data. 
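
In its simplest form, an M3U file is nothing more than one file path per line, so writing one takes very little code. The sketch below is illustrative rather than the project's actual code: M3uWriter is a name I made up, and it skips the optional extended M3U tags.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class M3uWriter
{
    // Writes one track path per line, in playlist order.
    public static void Write(string playlistPath, IEnumerable<string> trackPaths)
    {
        if (playlistPath == null)
        {
            throw new ArgumentNullException("playlistPath");
        }
        if (trackPaths == null)
        {
            throw new ArgumentNullException("trackPaths");
        }

        File.WriteAllLines(playlistPath, trackPaths);
    }
}
```

With the shuffled track list from earlier, the paths would come from each track's Location property, something like tracks.Select(t => t.Location).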

We'd like to shuffle this on a regular basis, so I need to create a little batch file that can be run as a scheduled task every night. The essence of that is this line:

ClassicalPlaylist.exe "Classical Shuffle" "%USERPROFILE%\Music\My Playlists\Classical Shuffle.m3u"

With that done, I can point all the various music devices and services in my home (Sonos, Plex, etc.) to that .m3u file.

The project and source code are up on GitHub if you want to take a look.

Wednesday, August 14, 2013

Support and Troubleshooting Techniques for Software Developers

Developers at the company where I work, including myself, do a regular rotation as 2nd or 3rd level product support. Basically, whenever our regular customer support folks have support questions they can't resolve on their own, they reach out to the on-call developer. As you might expect, this is not our favorite part of the job, but sometimes it's rewarding. To make it a little more bearable, sometimes I think of it as detective work. Think of a police procedural TV program like "Law and Order", where the detectives go around interviewing eyewitnesses to the original crime, which produces more leads, etc.

I'm assuming at this point that a support person has come to you with a customer problem that they can't solve. At this point, your job is to go where support people generally can't - into the code.

Root Cause Analysis


Here's my main point: Work Backwards. Yes, I said this in the last post as well, and I'll repeat the rest of it: Start at the point where the error first shows up, figure out what condition(s) cause that to occur, then figure out what causes that, etc. Methodically work your way up the chain of causality. Don't attempt a fix without any analysis, unless you have a really good reason to believe it will work in this situation - in which case, you're not troubleshooting, you're just applying a remedy to a known issue. If support didn't know about the issue, be sure they add it to their knowledge base.

Start with some Basic Information 


I probably sound like a broken record to our support team, because I'm always asking for two things right off the bat:

1. What version of the software is the customer running? If you have a single, web-based application that everyone in the world hits, this won't be a concern.
2. Can you get me the logs?

I need to know what version they're on so I know which version of the code to pull to start the causality investigation. I want the logs because the logs usually tell me a lot more about the error than the user-facing error message does (and sometimes there's no user-facing error message at all, just a "something didn't save correctly" or "so-and-so didn't respond"). If it's an exception thrown, the logs will give me the full exception stack trace, which is invaluable. Again, this is the way we tie the error to the code.

If your software does not log errors when they occur, stop right now and make it a high priority. Do not release your next version without it. If you encounter resistance on this, start looking for another job, because I can't think of many things more frustrating than having to support software that doesn't have a way to tell you when errors occur, and what the nature of those errors is.

Some other things to ask the customer (or have Support ask the customer):
  • If something had been working and suddenly started having problems, what has recently changed in the environment? New hardware, new software, version upgrades, etc.?
  • Is the problem isolated to one user? If so, start investigating what is different about that user. Insufficient permissions?
  • Is the problem isolated to a single workstation? If so, same thing - what's different about that workstation?
  • Is the problem isolated to a single piece of data? Perhaps some data got corrupted.

Be Methodical


Don't be in too much of a hurry - if you follow the steps, you'll almost always eventually come to the right answer. Apply logical reasoning: as a software developer, that's a big part of what you get paid for.

For example, if the user is seeing a particular error message, I need to know exactly what the error message says - preferably a screenshot or a text copy of the message. As a developer, your job is to tie the error back to the code. Sometimes you'll know exactly where in the code to look, but often, especially if your codebase is large, you'll need to search for the error text in the code.

Static Analysis


Once you've found the line of code that produces the error message, start doing some static analysis. Use your IDE's "find usages" or "find references to" functionality (or in the worst case, the straight "find" functionality) and start going up the call stack. The possible call stacks may start fanning out, but if you cross-reference against log messages, plus what you know about the configuration of the system, you'll usually have a good idea of the code path. You may need to go back to the support team or the client to get more information.

In modern AJAX web applications, the error may first be thrown on the server, but the call stack will eventually lead to a web service call, and from there into client-side JavaScript. The same rules apply, although the functional, untyped world of JavaScript can make static analysis a bit more difficult.

If you can't find the error or exception message in your code, it's probably coming from a third-party library you're depending on. Next stop: Google (or your favorite search engine) or maybe Stack Overflow. Once you find the library function call involved, look for where it's called in your code, then follow the call stack up as before.

OK, That Didn't Work...


If nothing has panned out so far, here are some other things to try. These are roughly in order of easiest-to-hardest.
  • Is this a known issue? Check your bug database, release notes, knowledge base, etc. Yes, support should have done this, but you probably have access to more resources (the bug database, for example). Also, with your perspective as a developer, you may search the knowledge base in a different way and come up with different (sometimes better) results.
  • If the error is repeatable, you might be able to bump up the logging level (turn on Debug logging, for example), then reproduce the error and get the new, enhanced log. The debug messages may tell you enough about what's going on to determine the problem.
  • Sometimes you'll need to look at the change history of code, to find out when something was broken (or was fixed). If your changesets are tied to tickets, either enhancements or bugs, it will further shed light on why a particular change was made.
  • See if your testing team can repeat the error. If so, they'll probably enter it as a defect.
  • See if you can repeat the error in your development environment. If so, you can probably capture the problem in the debugger, which will give you much richer information than the static analysis approach detailed above.

What To Do If You Get Stuck


  • Brainstorm the problem with your colleagues. They may remember a crucial piece of information you don't have, or come up with an approach you haven't thought of.
  • Relax your assumptions. If all assumptions were valid, the application would be working perfectly, right?
  • Some problems are really hard to diagnose because there are multiple problems interacting. Sometimes this is because there's a bug in the error-handling code.
  • Remember that everything is obeying physical laws, and there is a rational cause and effect going on. Do not treat the system as though it is magical.
  • That said, occasionally problems happen that you just can't explain. Hopefully this is very rare.

Workarounds


If it's a problem in production, you may actually have two separate tasks. The first is getting it working ASAP. This may involve manually fixing some data in a database, or it might require a workaround or temporary expedient of some kind. The second, if the problem is due to a software bug, is making sure a bug report is entered so it gets triaged, prioritized, and (hopefully) fixed at some point.

Make It Easier to Troubleshoot Next Time


If the logging or other visibility into the code is inadequate, making the issue difficult to troubleshoot, the bug report may be "the so-and-so module has insufficient logging for troubleshooting and support," or better yet, something more specific, e.g. "The Flooper module doesn't log the full path of the file when it gets a 'cannot open file' error."

One thing I've learned about logging over the years is you definitely want to be able to log every significant interaction with the external environment: file opens, database interactions, web service calls, etc. You can't control these external things, but you can control exactly how you call them. In the worst case, these log messages could serve as a helpful bug report to the developers of the external module, but in most cases, you'll find the problem is with your own code.

Troubleshooting Techniques for Software Support

In this post, I'd like to talk about some basic troubleshooting skills for software support people. If you're such a support person, I'm assuming you can apply some basic logic to a problem, but I'm not assuming you're any sort of programmer. In a future post, I'll talk about the same subject, only from the point of view of software developers.

Be Specific


Oftentimes problem reports come in rather vague - "there was a problem with the application." Uhhh, can you be a bit more specific? A lot of the job involves information-gathering, and often it's iterative - you get a problem report, do a little research and analysis, ask for more information, do more analysis, etc.

Be Methodical - Work Backwards From the Error


Here's my main point: Work Backwards. Start at the point where the error first shows up, figure out what condition(s) cause that to occur, then figure out what causes that, etc. Methodically work your way up the chain of causality. Don't attempt a fix without any analysis, unless you have a really good reason to believe it will work in this situation - in which case, you're not troubleshooting, you're just applying a remedy to a known issue. Good for you for having an effective knowledge base!


Don't Make the Problem Worse


Maybe it's because I work with medical software that enhances patient safety, but to me this brings to mind the Hippocratic Oath: "First, do no harm." Don't make the problem worse! I'll say it again: Work backwards. As Gene Kranz in the movie Apollo 13 said, "Work the problem, people." You probably don't have lives at stake, but the same principles apply. Do not jump to conclusions, and don't just "try stuff" hoping it will work! A lot of times, your first thought may be "Hmmm, I wonder if changing the twizzle setting to false might do something." Yes, it might, but it also stands a very good chance of making things worse, or causing a second, unrelated problem.


Educate Yourself


The more you know about how the application works, and how the pieces work together, the more effective you'll be. For example, if you know where local data is stored, you can look there to see if things generally look OK (file names, file sizes, etc.). This may lead you to investigate file permissions problems, etc. You don't need to know the nuts and bolts of everything down to the code level, but if you generally know how data flows through the system, what format it's in (XML, JSON, text, binary, etc.), and generally what the data is used for, you'll be much more effective. If you don't know any of this, and it's not written down anywhere, ask a developer to spend a few minutes mapping it all out with you.

Likewise, you'll make yourself a much more valuable support person if you familiarize yourself with tools that can help in troubleshooting. For example, if you're supporting any kind of web-based application, a tool that lets you see the HTTP traffic back and forth between the browser and server will be invaluable (Fiddler is probably the most popular such tool). Learn about it and play with it during your downtime. If you frequently deal with file contention problems in a Windows environment, get familiar with Process Monitor or other free troubleshooting tools from Microsoft.

Communicate Effectively


If you need to get help from developers or other people, be sure to summarize the problem with the relevant context. If you just forward an email thread without any additional explanation or context, it will probably not make much sense to the recipient, and they'll have to come back and ask you for more details, which will just waste everyone's time.

Wednesday, December 12, 2012

Updating Your iTunes Library with BPM Data

Last time, I mentioned that MixMeister BPM Analyzer does a great job of crawling through your music files and deriving the Beats Per Minute (BPM) for each track. However, running iTunes afterward and displaying the Beats Per Minute column, or setting up a smart playlist that uses the BPM attribute, I found that the BPM data had not fully made its way into the iTunes database. If you select a track, choose "Get Info", and go to the Info tab, you'll see that the BPM value suddenly flashes into place, as if iTunes is reloading the track metadata (title, artist, genre, etc.) on demand.

I like smart playlists, and was hoping to use them to help come up with a good 90 BPM playlist for running. For that, iTunes needs to know about all those individual track tempos, but with thousands of tracks the idea of clicking on each one to get it to load is a nonstarter.

Solution #1


Lifehacker has this helpful recipe, which I'll rewrite in a little more detail here:

  • Open iTunes
  • Select Music library
  • Make sure BPM is one of the columns displayed. If not, right-click on one of the other columns, and check "Beats per Minute"
  • Click on the BPM column to sort it, so the tracks that don't have BPM info are all at the beginning
  • Select all tracks showing blank BPM (or a subset if you have a lot of tracks and want to do this in batches)
  • Right-click on the selection and click "Get Info"
  • On the Multiple Item Information screen, click OK without modifying anything
  • If there are a lot of tracks, a progress bar should come up. The total time it takes to update everything will depend on how many tracks are involved and the speed of your hard drive, but shouldn't be more than a minute or two in most cases.
  • Close and reopen iTunes to refresh the music list

Only one problem: this didn't work for me, at least not consistently, and I was still left with thousands of "un-tempo'd" tracks. It may work for you; if it does, let me know in the comments.

Solution #2

(For the non-technical: skip to the bottom to download the app)

Time to write a little code. On Windows, iTunes has a COM interface (I hear you shuddering out there, but I'm nothing if not pragmatic). It doesn't appear to be well-documented, but a little web searching makes things clear enough. I'm working in C#/.NET/Visual Studio, but you could use Python, PowerShell, or almost anything else that can talk to COM.

In Visual Studio, create a new Console Application project, add a reference, go to the COM tab, and select "iTunes 1.13 Type Library". Instantiate a new iTunesAppClass object representing the iTunes application:

var iTunesApp = new iTunesAppClass();

When done, in a finally block, be sure to release the object (Marshal lives in the System.Runtime.InteropServices namespace):

Marshal.ReleaseComObject(iTunesApp);

The LibraryPlaylist property contains all the tracks:

var mainLibrary = iTunesApp.LibraryPlaylist;


The collection of tracks for a playlist is on the Tracks property, from which you can enumerate individual tracks:

foreach (var track in mainLibrary.Tracks) { .. }

We want to limit ourselves to file-based tracks. One way to do that is to cast each track object to the IITFileOrCDTrack interface with the "as" operator, and only proceed if the cast succeeds (that is, if the result is not null).

var filetrack = track as IITFileOrCDTrack;

Finally, the IITFileOrCDTrack interface has an UpdateInfoFromFile() method, which does what it sounds like:

filetrack.UpdateInfoFromFile();
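
For reference, here is a minimal sketch of how the fragments above might fit together in one console program. It is an illustration rather than the exact source of the downloadable project. It assumes the COM reference generates an interop namespace called iTunesLib. The counter and the try/catch around each track are my own additions for this sketch, a precaution so that one problem track does not stop the whole run.

```csharp
using System;
using System.Runtime.InteropServices;
using iTunesLib; // assumed name of the interop namespace generated by the COM reference

internal static class Program
{
    private static void Main()
    {
        var iTunesApp = new iTunesAppClass();
        try
        {
            var mainLibrary = iTunesApp.LibraryPlaylist;
            int refreshed = 0;
            int failed = 0;

            foreach (var track in mainLibrary.Tracks)
            {
                // The "as" cast yields null for tracks that are not file- or CD-based.
                var filetrack = track as IITFileOrCDTrack;
                if (filetrack == null)
                {
                    continue;
                }

                try
                {
                    // Ask iTunes to re-read the metadata (including BPM) from the file.
                    filetrack.UpdateInfoFromFile();
                    refreshed++;
                }
                catch (COMException)
                {
                    // Precaution added for this sketch: skip tracks iTunes cannot refresh.
                    failed++;
                }
            }

            Console.WriteLine("Refreshed {0} tracks, skipped {1}.", refreshed, failed);
        }
        finally
        {
            // Always release the COM object, even if something above throws.
            Marshal.ReleaseComObject(iTunesApp);
        }
    }
}
```

With a large library, expect this to take a while, since every track is touched once.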

If you're interested, you can download the source project, or the compiled application.


Thursday, October 4, 2012

90/180 BPM Running Playlists and Beat Detection Software

In my last post, I discussed my motivation for coming up with an iPod playlist of songs with tempos of 90 or 180 beats per minute (BPM), to listen to while running. This helps in running with a cadence of 180 strides per minute. Fortunately, the interwebs oblige with lots of suitable running playlists, as well as quite an extensive list at jog.fm.

With sample playlists in hand, I was able to put together one based on my own collection of music that looks something like this:

Ramblin' Man - The Allman Brothers Band
Never Is Enough - Barenaked Ladies
Wild Honey Pie - The Beatles
I'm a Loser - The Beatles
One After 909 - Beatles
Run for Your Life - The Beatles
Norwegian Wood (This Bird Has Flown) - The Beatles
Can't Find My Way Home - Blind Faith / Eric Clapton
Subterranean Homesick Blues - Bob Dylan
From A Buick 6 - Bob Dylan
Modern Love (1999 Digital Remaster) - David Bowie
16 Military Wives - The Decemberists
The Boys of Summer - Don Henley
Man With a Mission - Don Henley
Maxine - Donald Fagen
Son of a Preacher Man - Dusty Springfield
At Last - Etta James
It's Your Thing - Isley Brothers
Feelin' Alright - Joe Cocker
One More Time - Joe Jackson
All My Love - Led Zeppelin
Custard Pie - Led Zeppelin
Boom, Like That - Mark Knopfler
Don't Crash the Ambulance - Mark Knopfler
Sunday Morning - Maroon 5
The Impression That I Get - The Mighty Mighty Bosstones
The Flyer - Nanci Griffith
Live Forever - Oasis
All Around the World or the Myth of Fingerprints - Paul Simon
The Obvious Child - Paul Simon
In Your Eyes - Peter Gabriel
Watching the Clothes - The Pretenders
Precious - Pretenders
Fat Bottomed Girls - Queen
Finest Worksong - R.E.M.
Breaking the Girl - Red Hot Chili Peppers
Give It Away - Red Hot Chili Peppers
Yertle the Turtle - The Red Hot Chili Peppers
Black Cow - Steely Dan
Hello It's Me - Todd Rundgren
Bright Side Of The Road - Van Morrison

If you noticed that this list is full of a lot of moldy oldies, you're right. But I'm of a certain age, and that's what's in my iTunes. In any event, it's not a bad list, and I have been running happily with it for several weeks. However, I quickly tire of almost anything, and longer term I would like more variety.

Now, I should say that I own a lot of music. The CD rack holds about 800 discs, and it's full, and other CDs are squirreled away in various other locations. There are about 15,000 tracks in my iTunes music library, so I probably own a lot more tracks that are 90 or 180 BPM. How to find them? Ideally I'd like something that could crawl through all of my iTunes tracks, analyze the music, and automatically fill in the BPM information. I know it's not going to be fast, but that's OK. It can run overnight if necessary, or over many nights, for that matter. I found three programs for Windows, free and commercial, that claim to do what I'm looking for.

Cadence Desktop Pro


The Big Kahuna in the very sparse field of automatic beat detection software seems to be Cadence Desktop Pro. It claims to do pretty much exactly what I want: it communicates with iTunes so it knows about your playlists, makes it fairly easy to select a playlist or part of one, finds and displays the tempo for each track, and then gives you the option of saving the BPM info back to the tracks. It also gives you a way to verify the tempo by playing a track and tapping with the beat on the space bar.
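
If you're curious how tap-along verification can work, the arithmetic is simple: count the intervals between taps and divide by the elapsed time. The sketch below is my own illustration, not Cadence Desktop Pro's actual method. The TapTempo class and EstimateBpm method are hypothetical names, and the tap times are assumed to arrive as ascending millisecond timestamps.

```csharp
using System;
using System.Collections.Generic;

public static class TapTempo
{
    // Estimates beats per minute from tap timestamps (milliseconds, ascending).
    // Returns 0 when there are fewer than two taps or no time has elapsed.
    public static double EstimateBpm(IList<double> tapTimesMs)
    {
        if (tapTimesMs == null || tapTimesMs.Count < 2)
        {
            return 0;
        }

        double elapsedMs = tapTimesMs[tapTimesMs.Count - 1] - tapTimesMs[0];
        if (elapsedMs <= 0)
        {
            return 0;
        }

        // N taps span N - 1 beat intervals; 60,000 ms make one minute.
        int intervals = tapTimesMs.Count - 1;
        return 60000.0 * intervals / elapsedMs;
    }
}
```

For example, four taps at 0, 667, 1333, and 2000 ms cover three intervals in two seconds, which works out to 90 BPM.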



I downloaded and installed the trial version, and my initial results were good enough that I went ahead and paid $6.99 for the full version. The problem is, it crashes... a lot. It worked well at the beginning, but after a while it became completely unusable. Apparently this happens for other people who have a lot of tracks in their iTunes library. If you have a modest collection of tracks, probably fewer than a thousand, Cadence Desktop Pro is a very good tool for the purpose. Otherwise, you might want to evaluate carefully before buying.

MixMeister BPM Analyzer


MixMeister BPM Analyzer is a free tool intended for DJs to find the perfect beat, but looks promising for those of us who don't make it into nightclubs much these days. Its user interface is a bit simpler than Cadence Desktop, and it doesn't integrate with iTunes - it just works off music files on the filesystem. You can browse to a directory, and the program will analyze all music files in that directory as well as all subdirectories (incidentally, it only works with WAV, MP3, and WMA formats). This allowed me to quickly find several more songs in the 88-92 BPM range in an old playlist directory of running songs.


All in all, I like this tool. It's very fast, about 5 seconds per song on my reasonably high-end developer's PC. It automatically saves the Beats Per Minute data to each track, so you could certainly use it in conjunction with smart playlists on iTunes.
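
Once the BPM values are saved, picking out running tracks is just a range check. Here is a small sketch of the rule I have in mind, assuming a window of plus or minus 2 BPM around 90 (the 88-92 range mentioned above) and the doubled window of 176-184 around 180. The RunningTempo class, the Matches method, and the default tolerance are illustrative choices of mine, not anything built into iTunes or MixMeister.

```csharp
using System;

public static class RunningTempo
{
    // True when bpm is within `tolerance` of the target tempo,
    // or within twice that tolerance of double the target tempo.
    public static bool Matches(int bpm, int target = 90, int tolerance = 2)
    {
        bool nearTarget = Math.Abs(bpm - target) <= tolerance;
        bool nearDouble = Math.Abs(bpm - 2 * target) <= 2 * tolerance;
        return nearTarget || nearDouble;
    }
}
```

A smart playlist can express the same thing with two "BPM is in the range" rules, one for each window.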

BPM Detector Pro


The final entry in the world of automatic beat detectors is BPM Detector Pro. This is also a file-based program, and works quickly, but the look and feel is certainly nothing to write home about.



The trial version is very limited, as you can only analyze three songs at a time. I never did figure out its scheme for exporting BPM data. According to the help file, "Source files may either be renamed, or copied to a new file with a new name. In either case the BPM value can be placed either at the beginning or at the end of the new file name." I didn't want it to do either, and fortunately it didn't create any new files, unless it put them in an undisclosed location.

Conclusion


At $24.95 for the full version of BPM Detector Pro, I'll be sticking with MixMeister BPM Analyzer. Besides working well and being free, it also gets bonus points for not including the word "Pro" in its name.