Using XML for Code Documentation is Just Plain Wrong

I was just looking at some C# code at work today and it had XML Documentation (like javadoc or python docstrings, only with XML). Who was the idiot that came up with that idea? It's the most insane thing I've ever seen. Let's look at the predecessors to C#'s XML documentation:

Javadoc:

/**
 * Returns an Image object that can then be painted on the screen. 
 * The url argument must specify an absolute {@link URL}. The name
 * argument is a specifier that is relative to the url argument. 
 * <p>
 * This method always returns immediately, whether or not the 
 * image exists. When this applet attempts to draw the image on
 * the screen, the data will be loaded. The graphics primitives 
 * that draw the image will incrementally paint on the screen. 
 *
 * @param  url  an absolute URL giving the base location of the image
 * @param  name the location of the image, relative to the url argument
 * @return      the image at the specified URL
 * @see         Image
 */
 public Image getImage(URL url, String name) {
	try {
	    return getImage(new URL(url, name));
	} catch (MalformedURLException e) {
	    return null;
	}
 }

Then, doxygen, which looks a lot like javadoc:

      /**
       * a normal member taking two arguments and returning an integer value.
       * @param a an integer argument.
       * @param s a constant character pointer.
       * @see Test()
       * @see ~Test()
       * @see testMeToo()
       * @see publicVar()
       * @return The test results
       */
       int testMe(int a,const char *s);

Unfortunately Genshi doesn't syntax-highlight the javadoc comments, but they still look fairly readable. Now let's try a Python docstring example. There is no single standard: Epydoc, one of the documentation generators for Python, understands plaintext, javadoc, epydoc, and reStructuredText markup.

Python code with epydoc style docstrings:

def x_intercept(m, b):
    """
    Return the x intercept of the line M{y=m*x+b}.  The X{x intercept}
    of a line is the point at which it crosses the x axis (M{y=0}).
 
    This function can be used in conjunction with L{z_transform} to
    find an arbitrary function's zeros.
 
    @type  m: number
    @param m: The slope of the line.
    @type  b: number
    @param b: The y intercept of the line.  The X{y intercept} of a
              line is the point at which it crosses the y axis (M{x=0}).
    @rtype:   number
    @return:  the x intercept of the line M{y=m*x+b}.
    """
    return -b/m

Python code with one example of reStructuredText docstrings (this one includes the types of the parameters but they aren't necessary):

def fox_speed(size, weight, age):
    """
    Return the maximum speed for a fox.
 
    :Parameters:
      size
        The size of the fox (in meters)
      weight : float
        The weight of the fox (in stones)
      age : int
        The age of the fox (in years)
    """
    #[...]

I couldn't find any nice examples for C# XML Documentation. The C# XML Documentation Tutorial has some examples, but conveniently, none that include all the tags that I would need to replicate the javadoc example I showed above. So I'll convert the Java example to C#:

   /// <summary>
   /// Returns an Image object that can then be painted on the screen. 
   /// The url argument must specify an absolute <see cref="URL"/>. The name
   /// argument is a specifier that is relative to the url argument. 
   /// 
   /// This method always returns immediately, whether or not the 
   /// image exists. When this applet attempts to draw the image on
   /// the screen, the data will be loaded. The graphics primitives 
   /// that draw the image will incrementally paint on the screen.</summary>
   /// 
   /// <param name="url">an absolute URL giving the base location of the image</param>
   /// <param name="name">the location of the image, relative to the url argument</param>
   /// <returns>
   /// the image at the specified URL</returns>
   /// <seealso cref="Image">
   /// Read more about the Image class</seealso>
 public Image getImage(URL url, String name) {
	try {
	    return getImage(new URL(url, name));
	} catch (MalformedURLException e) {
	    return null;
	}
 }

I followed Microsoft's convention (because they know best) of putting the opening tags on a line on their own.

The javadoc sucks because you have to put in a <p> (or <br />?) to start a new paragraph, which is stupid. Otherwise it's pretty readable, especially the @param and @return tags, and the same goes for doxygen. The Epydoc-style Python docstrings suck: you have to specify each parameter's type with a separate @type tag and the return type with an @rtype tag. The reStructuredText example looks the best to me. No tags at all, except for the :Parameters: heading, which should be there anyway. The C# comments are an eyesore. Even if Visual Studio had syntax highlighting for the comments, they would still suck. Did Microsoft really look at the two major previous implementations (doxygen and javadoc) and decide that XML was a better way to document code?

I recently saw an interesting comment in scipy's source about one of scipy's guiding principles in designing the docstring standard for their codebase:

A guiding principle is that human readers of the text are given precedence over contorting docstrings so our tools produce nice output. Rather than sacrificing the readability of the docstrings, we have written pre-processors to assist tools like epydoc_ and sphinx_ in their task.

Microsoft clearly took the opposite route and decided to make code documentation readability by human readers a low priority.
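
For comparison, here is roughly what a docstring in that human-first, section-based plain-text style can look like. This is only a sketch: it reuses the x_intercept function from the epydoc example above, and the section layout is my assumption about the style, not a quote from scipy's standard.

```python
def x_intercept(m, b):
    """
    Return the x intercept of the line y = m*x + b.

    Parameters
    ----------
    m : number
        The slope of the line.
    b : number
        The y intercept of the line.

    Returns
    -------
    number
        The x value at which the line crosses the x axis (y = 0).
    """
    return -b / m


print(x_intercept(2, -4))
```

No markup tags at all, and it still reads fine in a plain text editor.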

Why I Chose Python Over C#

I was recently tasked with writing a simple UI, and the choice of language/framework quickly came down to two options: C# or Python. I eventually chose Python, and here's why:

XML

When I started sketching out a design for this project in C#, I ended up envisioning lots of XML configuration files in place of compiled-in, hard-coded configuration. In Python, however, XML is rarely required. Instead you can use Python modules and data structures to specify your configuration.

You can see this difference clearly if you compare C# and Java web frameworks to Python web frameworks. Django (a Python web framework) has a few configuration files. One is settings.py, which is basically just a list of key-value pairs. Some of the values, however, are Python tuples (immutable lists), so it looks like a hybrid between a .ini file and an XML file. The beauty of this is that you can run a program like pychecker or pylint on your program and, if you are trying to access a key in your settings file that doesn't exist, it will complain! Try doing that in a compiled language. So Django and other programs take advantage of the fact that, since the code isn't compiled, you can put all your config in code and easily tweak it later without needing to recompile anything.
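
Here is a rough sketch of the idea. To keep it in one runnable file, a class stands in for the settings.py module, and every name in it (Settings, DATABASE_NAME, describe, and so on) is made up for illustration and not taken from Django or from my project.

```python
class Settings:
    """Stand-in for a settings.py module; every name here is hypothetical."""
    DEBUG = True
    DATABASE_NAME = "example.db"
    # A tuple (immutable sequence) used as a value, as Django's settings do.
    INSTALLED_APPS = ("accounts", "reports")


def describe(cfg):
    """Build a one-line summary from the configuration."""
    mode = "debug" if cfg.DEBUG else "production"
    apps = ", ".join(cfg.INSTALLED_APPS)
    return f"{cfg.DATABASE_NAME} ({mode}), apps: {apps}"


print(describe(Settings))

# A misspelled key is an ordinary attribute error, which is the kind of
# mistake a checker such as pylint can flag before the program runs.
try:
    Settings.DATABSE_NAME
except AttributeError as exc:
    print("bad key:", exc)
```

Changing a setting means editing one line of Python and re-running the program.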

In web frameworks and other projects for C#/.NET and Java, XML is a huge part of configuration. Spring, Hibernate, Ant, and log4j/log4net all rely on either XML configuration or annotations (which just couple settings to your code and bake them into the build). So if you write your own applications in Java/C#, you will find yourself also using XML for configuration, and writing code to parse and possibly write XML as well. The only time I ever enjoyed using XML was when I was using the ElementTree library for Python, which is now part of Python's standard library.
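
For what it's worth, this is about how little code ElementTree needs to read a small XML config. It is a minimal sketch: the XML document and the load_settings helper are invented for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical configuration document, invented for this example.
CONFIG_XML = """<config>
  <setting name="log_level">debug</setting>
  <setting name="max_connections">10</setting>
</config>"""


def load_settings(xml_text):
    """Return a dict mapping each <setting> name attribute to its text."""
    root = ET.fromstring(xml_text)
    return {node.get("name"): node.text for node in root.findall("setting")}


print(load_settings(CONFIG_XML))
```

Note that every value comes back as a string, so anything numeric still has to be converted by hand.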

Less code

Python code is generally a lot shorter. This means it's quicker to write, quicker to read, quicker to debug, quicker to modify later. No braces, no semi-colons, terse data structures, list comprehensions.
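
A tiny illustration of what I mean by terse data structures and list comprehensions. The data is made up for the example.

```python
# Hypothetical data: log file names mapped to their sizes in bytes.
sizes = {"a.log": 1200, "b.log": 0, "c.log": 530}

# One list comprehension builds the list of non-empty files.
non_empty = [name for name, size in sizes.items() if size > 0]

print(non_empty)
```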

No compiling

Being able to modify code in the field is huge. Many times we've modified Python code in the field using a text editor after getting some obscure error and then re-ran the script. No re-compilation necessary.

Libraries

I tried porting a simple Python script to C# just to see how easy or hard it would be. The first step was porting the command-line option parsing. I used GNU getopt-style parsing, which is provided by Python's getopt library. No such thing is included in .NET. There is a third-party library, CSharpOptParse, but having to download it was a bit of a turn-off. Then I looked for an example of CSharpOptParse usage and I found one. Ugh. The Python getopt example is much nicer. If you don't like getopt there is also optparse (apparently, "optparse is a more convenient, flexible, and powerful library for parsing command-line options than getopt". I'll have to give it a try!). It looks even simpler than getopt!
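
To give a feel for it, here is a minimal getopt sketch. The option names (-v/--verbose and -o/--output) and the parse_args helper are hypothetical; they are not the options my script actually used.

```python
import getopt


def parse_args(argv):
    """Parse GNU-style options; return (verbose, output, positional args)."""
    # "vo:" means -v takes no argument and -o takes one.
    opts, args = getopt.gnu_getopt(argv, "vo:", ["verbose", "output="])
    verbose = False
    output = None
    for opt, value in opts:
        if opt in ("-v", "--verbose"):
            verbose = True
        elif opt in ("-o", "--output"):
            output = value
    return verbose, output, args


print(parse_args(["-v", "--output=out.txt", "input.txt"]))
```

In a real script you would pass sys.argv[1:] and catch getopt.GetoptError to print a usage message.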

The next thing I looked at was how to call an external executable in C#/.NET and capture stdout and/or stderr. Talk about annoying. Python's new-ish subprocess library is awesome.
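
Here is a small sketch of capturing both streams with subprocess. To keep it portable, the "external executable" is just a second Python interpreter running a one-liner; run_and_capture is a hypothetical helper name, and subprocess.run with capture_output needs a reasonably recent Python 3.

```python
import subprocess
import sys


def run_and_capture(cmd):
    """Run cmd (a list of arguments); return (exit code, stdout, stderr)."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.returncode, result.stdout, result.stderr


# Child process that writes one line to stdout and one line to stderr.
child = [
    sys.executable,
    "-c",
    "import sys; print('hello'); print('oops', file=sys.stderr)",
]
code, out, err = run_and_capture(child)
print(code, out.strip(), err.strip())
```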

Finally, my Python script does some path splitting using os.path.split and os.path.splitext. I did find .NET's Path class to be pretty convenient, although no better than Python's os.path.
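
For reference, the path splitting looks something like this. The split_path helper and the example path are made up for illustration.

```python
import os.path


def split_path(path):
    """Split a path into (directory, file name without extension, extension)."""
    directory, filename = os.path.split(path)
    stem, extension = os.path.splitext(filename)
    return directory, stem, extension


print(split_path("/tmp/reports/summary.txt"))
```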

Documentation

The Python documentation is far better than anything I have seen in the .NET/C# world. Maybe it's because smart people use Python.

Conclusion

I'm a long-time Python user, and I do like languages like C# and Java as well, but when I put Python and C# side by side for this simple little project, nothing competed with Python.
