Tuesday, May 31, 2011

Doxygen: Automatically Generated Documentation

I find that automatically generated documentation is a great way to explore an existing code-base.

Specifically, the documentation produced by Doxygen (www.doxygen.org/index.html) is very helpful.  Even better, Doxygen supports a large number of languages and can produce HTML, LaTeX, and man page documentation.  Here's an example (from http://www.scfbm.org/) of a Doxygen-generated UML class diagram:




Adding meta-information to the source code is as simple as documenting it via the prescribed documentation format.  For example, Doxygen reads Javadoc-style comments:

/**
 * A String representation of an object suitable for debugging.
 */
public String toString() {
...

It's also worthwhile to look at the tags (e.g., "@todo") that Doxygen supports for the language that you're using.
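
As a slightly larger sketch, here is a small class documented with a few of the tags Doxygen understands in Javadoc-style comments.  The Employee class and its fields are made up for illustration; @brief, @param, @return, and @todo are standard Doxygen commands.

```java
/**
 * @brief An employee identified by name and ID.
 *
 * Hypothetical class used only to illustrate Doxygen comment tags.
 *
 * @todo Validate that the ID is non-negative.
 */
class Employee {
    /** Display name of the employee. */
    private final String name;

    /** Numeric identifier of the employee. */
    private final int id;

    /**
     * Creates an employee.
     *
     * @param name display name of the employee
     * @param id   numeric identifier of the employee
     */
    Employee(String name, int id) {
        this.name = name;
        this.id = id;
    }

    /**
     * A String representation of an object suitable for debugging.
     *
     * @return the name and ID in the form "Employee[name=..., id=...]"
     */
    @Override
    public String toString() {
        return "Employee[name=" + name + ", id=" + id + "]";
    }
}
```

Note that Doxygen leaves out private and package-private members unless EXTRACT_PRIVATE and EXTRACT_PACKAGE are set to YES in the Doxyfile.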

Doxygen uses a configuration file, or Doxyfile, to determine the type and style of documentation to produce.  I find that the easiest way to get started is by using Doxygen's GUI configuration file wizard, doxywizard.

I highly recommend enabling the advanced features in "Expert" mode under the "Dot" topic for:
  • Class Diagrams,
  • Collaboration Graph,
  • UML Look,
  • Include/Included-by Graph,
  • Call/Caller Graph, and
  • Graphical Hierarchy.
Also in "Expert" mode, enable the "Source Browser."  It lets you do fast-paced, IDE-style code traversal while you familiarize yourself with the code-base.

Hibernate's Lazy Initialization

Exception in thread "main" org.hibernate.LazyInitializationException: failed to lazily initialize a collection of role: <CLASS>.<METHOD>, no session or session was closed

If you've ever received this error from Hibernate, you're not alone.  It happens because of Hibernate's "Lazy" initialization for joined relations.  It's a common problem because it occurs when you close the Hibernate session before you've accessed all the members that you need from the class --- which happens when you try to abstract the Object-Relational Mapping layer from the higher layers of your application.

For example: a Student class has a list of Classes.  Logically, you add the Hibernate queries for students to the Student class.  Your code likely has the following structure:
  • Create a new session
  • Make the query
  • Close the session
  • Return the result
Unfortunately, if you return the Student object and then call getClassList(), you end up with a LazyInitializationException.  This is because Hibernate only queries for the class list when you explicitly ask for it, and by then the session is already closed.
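
The four steps above, and the failure they cause, can be mimicked in plain Java without Hibernate.  This is only a sketch of the behaviour: FakeSession and LazyStudent are hypothetical stand-ins, not Hibernate classes, and the exception type is a plain IllegalStateException.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for a Hibernate session: it is only open or closed. */
class FakeSession {
    private boolean open = true;
    boolean isOpen() { return open; }
    void close() { open = false; }
}

/** Hypothetical entity whose class list is loaded on first access, not at query time. */
class LazyStudent {
    private final FakeSession session;
    private List<String> classList; // stays null until first requested

    LazyStudent(FakeSession session) { this.session = session; }

    List<String> getClassList() {
        if (classList == null) {
            // The deferred "query" needs a live session, like a lazy collection does.
            if (!session.isOpen()) {
                throw new IllegalStateException("no session or session was closed");
            }
            classList = new ArrayList<>(Arrays.asList("Algebra", "Biology"));
        }
        return classList;
    }
}

class LazyLoadingDemo {
    /** Mirrors the four steps: create a session, query, close, return. */
    static LazyStudent findStudent() {
        FakeSession session = new FakeSession();        // create a new session
        LazyStudent student = new LazyStudent(session); // make the query
        session.close();                                // close the session
        return student;                                 // return the result
    }

    public static void main(String[] args) {
        LazyStudent student = findStudent();
        try {
            student.getClassList(); // first access happens after close()
        } catch (IllegalStateException e) {
            System.out.println("Failed: " + e.getMessage());
        }
    }
}
```

The class list is never touched while the session is open, so the first call to getClassList() is the one that fails.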

Here's a way to keep higher-level classes from knowing about Hibernate, while maintaining the ability to create/destroy Hibernate session(s) appropriately:

   
import org.hibernate.Session;
import org.hibernate.SessionFactory;
import org.hibernate.cfg.Configuration;

/**
 * Keeps track of the Hibernate session for us
 */
public abstract class Queryable {

   /**
    * Hibernate session, shared by every subclass
    */
   private static Session session = null;

   /**
    * Get (or create a new) Hibernate session
    */
   public static Session getSession() {
      if( session == null ) {
         SessionFactory sessionFactory = new
           Configuration().configure().buildSessionFactory();
         session = sessionFactory.openSession();
      }
      return session;
   }

   /**
    * Manually close the session.
    */
   public void closeSession() {
      if( session != null ) {
         session.close();
         session = null;
      }
   }

}

Simply make the Student class extend Queryable and use getSession() everywhere you would have created or referenced the Hibernate session.  Don't forget to close the session when you're done with the object!

Also, thanks to the following website that gave me some insight into the problem:
http://www.javalobby.org/java/forums/t20533.html

Friday, May 27, 2011

Subgraphs in Graphviz Dot

Graphviz Dot is an extremely useful program for visualizing graph data (e.g., Finite State Machines, network topologies, etc.).  What makes Dot even more powerful is the ability to draw subgraphs (graphs within graphs).  This capability is a bit tricky, however, so here is the simplest "Hello World" I can give:

digraph dfg {
   compound = true;
   subgraph cluster_parent {
      color=blue;
      label="Parent";
      A->B [lhead=cluster_child_1];
      A->C;
      subgraph cluster_child_1 {
         color=green;
         label="Child #1";
         B->C;
      }
      subgraph cluster_child_2 {
         color=green;
         label="Child #2";
         C->D [ltail=cluster_child_1];
      }
      D->E;
      E->B [lhead=cluster_child_1];
   }
}

The program above produces:

The key (non-intuitive) pieces for working with subgraphs are:
  1. Subgraph names must start with the string "cluster" (for example, cluster_child_1); otherwise Dot does not draw the subgraph as a box.
  2. Edges that terminate at a subgraph (e.g., Node A to Child #1) must point at a node inside it and have the attribute "lhead" defined as the subgraph name.
  3. Edges that start at a subgraph (e.g., Child #1 to Node D) must leave from a node inside it and have the attribute "ltail" defined as the subgraph name.
Both attributes only take effect when the graph sets compound = true, as the example does.
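
If the graph is generated from a program rather than written by hand, the same rules apply to the emitted text.  Here is a minimal sketch in Java, assuming node names are simple identifiers that need no quoting; DotCluster and DotDemo are hypothetical helper names, not part of Graphviz.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that emits Dot text for one cluster subgraph. */
class DotCluster {
    private final String name;
    private final String label;
    private final List<String> edges = new ArrayList<>();

    DotCluster(String id, String label) {
        // Dot only draws a subgraph as a box if its name starts with "cluster".
        this.name = "cluster_" + id;
        this.label = label;
    }

    String name() { return name; }

    /** Adds an edge between two nodes that live inside this cluster. */
    DotCluster edge(String from, String to) {
        edges.add(from + "->" + to + ";");
        return this;
    }

    String toDot() {
        StringBuilder sb = new StringBuilder();
        sb.append("   subgraph ").append(name).append(" {\n");
        sb.append("      label=\"").append(label).append("\";\n");
        for (String e : edges) {
            sb.append("      ").append(e).append("\n");
        }
        sb.append("   }\n");
        return sb.toString();
    }
}

class DotDemo {
    /** Builds a graph with one cluster and one edge that stops at the cluster border. */
    static String build() {
        DotCluster child = new DotCluster("child_1", "Child #1").edge("B", "C");
        return "digraph dfg {\n"
             + "   compound=true;\n" // required for lhead/ltail to take effect
             + child.toDot()
             + "   A->B [lhead=" + child.name() + "];\n"
             + "}\n";
    }

    public static void main(String[] args) {
        System.out.print(build());
    }
}
```

The output of main can be saved to a file and rendered with the dot command, e.g. dot -Tpng.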

Thursday, May 26, 2011

Synchronized Text Editor Configuration

One of my most common pieces of advice is to get familiar with a text editor, configure it to your liking, and use it for everything!  I'm partial to Vim, but my advice applies to any configurable text editor (*cough* emacs *cough*).

Configuring your text editor can be a very time-consuming process, and there's nothing worse than having a different configuration on each of the computers you use.  My recommendation is to keep your text editor configuration in version control (e.g., Subversion, Mercurial, etc.) as a way of maintaining your preferences over time, as well as keeping your configuration consistent on every computer you use.

I keep my .vimrc under version control on GitHub as a way of sharing it:

https://github.com/radarku/kvim

The reason I chose GitHub is this great article about keeping all your Vim plugins up-to-date using Git submodules:

http://vimcasts.org/episodes/synchronizing-plugins-with-git-submodules-and-pathogen/

Wednesday, May 25, 2011

Modeling Inheritance in Relational Models

If you've ever designed a relational model, you probably discovered that there are different ways to model the object-oriented concept of inheritance in it.  This becomes particularly important when using an Object-Relational Mapping (ORM) library, such as Hibernate.

There are several ways to accomplish this modeling, but I'll only talk about two methods, which I call:
  1. Differentiation on a "discriminator" field
  2. Shared Primary Key

Differentiation on a "discriminator" field
With this solution, there is a single table for the parent, children, grandchildren, and so on.  In this table, one field is a "discriminator," from which the type of the object can be derived.  The benefit of this solution is that it is a very simple way to model inheritance.  Unfortunately, it's also not a very robust solution.  First, if the children/grandchildren/etc. have many (disjoint) fields, it can lead to a very sparsely populated table (i.e., lots of "NULL" values).  Furthermore, making changes to the structure of the objects or hierarchy can be very tricky.

id  type     parent_field  child_1_field  child_2_field
1   parent   1             NULL           NULL
2   child_1  2             1              NULL
3   child_2  3             NULL           1
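
To make the mapping concrete, here is a plain-Java sketch that flattens a small hierarchy into rows shaped like the table above.  The class names Parent, Child1, Child2, and SingleTableMapper are assumptions chosen to match the column names; no ORM library is involved.

```java
import java.util.Arrays;

/** Hypothetical hierarchy: Parent with two subclasses, each adding one field. */
class Parent {
    final int id;
    final int parentField;
    Parent(int id, int parentField) { this.id = id; this.parentField = parentField; }
}

class Child1 extends Parent {
    final int child1Field;
    Child1(int id, int parentField, int child1Field) {
        super(id, parentField);
        this.child1Field = child1Field;
    }
}

class Child2 extends Parent {
    final int child2Field;
    Child2(int id, int parentField, int child2Field) {
        super(id, parentField);
        this.child2Field = child2Field;
    }
}

class SingleTableMapper {
    /** Row layout: id, type, parent_field, child_1_field, child_2_field. */
    static Object[] toRow(Parent p) {
        String type = "parent";       // the discriminator value
        Integer c1 = null, c2 = null; // columns that do not apply stay NULL
        if (p instanceof Child1) {
            type = "child_1";
            c1 = ((Child1) p).child1Field;
        } else if (p instanceof Child2) {
            type = "child_2";
            c2 = ((Child2) p).child2Field;
        }
        return new Object[] { p.id, type, p.parentField, c1, c2 };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(toRow(new Parent(1, 1))));
        System.out.println(Arrays.toString(toRow(new Child1(2, 2, 1))));
        System.out.println(Arrays.toString(toRow(new Child2(3, 3, 1))));
    }
}
```

Every new subclass adds a branch to toRow and another mostly-NULL column to the row, which is the fragility described above.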


Shared Primary Key
Another way of modeling inheritance in a relational model is by constraining the child's primary key (PK) as a foreign key (FK) to the parent's PK.  This solution makes changing the structure or inheritance hierarchy simpler than the "discriminator" solution.  It also eliminates the unnecessary "NULL" values for inapplicable fields.

Parent:
id  parent_field
1   1
2   2
3   3

Child_1:
parent_id  child_1_field
2          1

Child_2:
parent_id  child_2_field
3          1
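
The same idea can be sketched in plain Java, with in-memory maps standing in for the Parent and Child_1 tables.  SharedKeyStore and its methods are hypothetical names used only for illustration, and the lookup emulates a join on parent.id = child_1.parent_id.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical in-memory stand-in for the Parent and Child_1 tables. */
class SharedKeyStore {
    // Parent table: id -> parent_field
    final Map<Integer, Integer> parent = new LinkedHashMap<>();
    // Child_1 table: parent_id (both PK and FK to Parent.id) -> child_1_field
    final Map<Integer, Integer> child1 = new LinkedHashMap<>();

    void insertParent(int id, int parentField) {
        parent.put(id, parentField);
    }

    void insertChild1(int id, int parentField, int child1Field) {
        insertParent(id, parentField); // every child also has a parent row
        child1.put(id, child1Field);   // the shared key ties the two rows together
    }

    /** Returns {id, parent_field, child_1_field}, or null if id is not a Child_1. */
    int[] loadChild1(int id) {
        if (!parent.containsKey(id) || !child1.containsKey(id)) {
            return null;
        }
        return new int[] { id, parent.get(id), child1.get(id) };
    }
}

class SharedKeyDemo {
    public static void main(String[] args) {
        SharedKeyStore store = new SharedKeyStore();
        store.insertParent(1, 1);
        store.insertChild1(2, 2, 1);
        int[] row = store.loadChild1(2);
        System.out.println(row[0] + " " + row[1] + " " + row[2]);
    }
}
```

No row carries a column that does not apply to it, and adding a new child type means adding a new table rather than widening an existing one.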

Tuesday, May 24, 2011

Cropping PDFs

If you've ever received a PDF with a lot of extraneous whitespace, there's a nice little utility to remove it quickly and easily:

# pdfcrop <PDF File>

See http://pdfcrop.sourceforge.net/

...and even better: if you're a LaTeX user, you might already have it because it's included in TeX Live!

In Ubuntu, it's in the package: texlive-extra-utils