Thursday, February 17, 2011

Kurikaesu Ayamachi No Sonotabi Hito Wa Tada Aoi Sora No Aosawo Shiru



Maven 3 introduces a new way to write POM files. In Maven 2, a POM has to be an XML file. In Maven 3 (through the Polyglot Maven extension) you can keep the XML format, but you can also write the POM as a Groovy file. Groovy is a nice scripting language with features that make it easy to implement a DSL (Domain-Specific Language).

Let's look at an example of a Maven 3 POM file:

XML file:

<dependencies>
     <dependency>
          <groupId>junit</groupId>
          <artifactId>junit</artifactId>
          <version>4.7</version>
          <scope>test</scope>
    </dependency>
</dependencies>

can be translated to:

Groovy file:

dependencies {
      dependency { groupId 'junit'; artifactId 'junit'; version '4.7'; scope 'test' }
}

As you can see, the Groovy file is as understandable as the XML file but much more readable: three lines instead of eight.

After seeing this new format, I got curious about how the Maven people built the parser with Groovy. It all comes down to two things: closures and dynamic invocation. Groovy provides both. So my next step was to take some XML configuration, rewrite it as Groovy configuration, and develop the "parser" in Groovy.

First of all I chose a Liquibase configuration file and trimmed it a lot (to keep the example simple).

Then I decided how I wanted the Groovy configuration file to look:

XML configuration file:


<?xml version="1.0" encoding="UTF-8"?>
<databaseChangeLog
xmlns="http://www.liquibase.org/xml/ns/dbchangelog/1.9"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog/1.9
http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-1.9.xsd">
     <createTable tableName="department">
          <column name="id" type="int"></column>
          <column name="name" type="string"></column>
     </createTable>
</databaseChangeLog>

If you compare this with a full Liquibase XML configuration you will notice that I have removed some tags. As I said, this reduction was made to simplify development.

And Groovy configuration file:

databaseChangeLog {
     createTable {
          id {type "int"}
         name {type "string"}
     }
}

A nice reduction from 11 lines to 6 for such a simple example (note that this simplified version also leaves out the tableName attribute).

Now it is time for some Groovy theory.

In Groovy a closure is a block of code between braces, with optional arguments followed by the statements to execute. When a closure is the last argument of a method call, it can be written right after the method name. Formally:

name { [closureArguments->] statements }

If you look closely at the Groovy configuration file, you can see this in the first statement:

databaseChangeLog {
....
}

You have a method name, an opening brace, no closure arguments, and a list of statements.
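The statement above can be read as an ordinary method call that receives a closure. Here is a minimal sketch of that idea, using a hypothetical method named block (it is not part of this post's code):

```groovy
// Hypothetical helper for illustration: a method whose only parameter is a closure.
def block(Closure body) {
    // Run the closure and return whatever its last statement evaluates to.
    body()
}

// Trailing-closure syntax: this is the same call as block({ 1 + 2 }).
def a = block { 1 + 2 }

// A closure may also declare arguments before the arrow.
def twice = { int x -> x * 2 }
def b = twice(21)

println "a=$a b=$b"
```

The braces only create the closure; nothing inside them runs until the receiving method decides to call it.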

In Groovy a function can be called as in Java, functionName(param1, param2, ...), but the parentheses can also be omitted when the call has at least one argument.

An example can be found in line 3 of the Groovy configuration file:

id {type "int"}

The function being called is named type, and it receives one string parameter with the value "int".
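A small sketch of both call styles, using a hypothetical Recorder class (an assumption made for illustration, not part of the parser):

```groovy
// Hypothetical class for illustration; its type method mirrors the DSL's type call.
class Recorder {
    List<String> calls = []

    void type(String value) {
        calls << value
    }
}

def r = new Recorder()
r.type("int")      // Java-style call
r.type "string"    // same kind of call, parentheses omitted

println r.calls
```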

We are almost done: we know that databaseChangeLog and createTable are method names that take a closure, and so are id and name.

You are probably thinking: "all right, I can see closures and I can see method calls, but I need to read the configured values". You are right, and that is the job of a Groovy class that lives in a separate file from the configuration script.

With this information we are almost done. Let me show the implementation and then explain the last trick.

First of all we need to define an ExpandoMetaClass (which allows you to dynamically add methods, constructors, properties and static methods using a neat closure syntax) that provides the top-level method, in our case databaseChangeLog.

script.metaClass = createEMC(script.class, { ExpandoMetaClass emc ->
     emc.databaseChangeLog = { Closure cl ->
          cl.delegate = new DatabaseChangeLogDelegate()
          cl.resolveStrategy = Closure.DELEGATE_FIRST
          cl()
     }
})

static ExpandoMetaClass createEMC(Class scriptClass, Closure cl) {
     ExpandoMetaClass emc = new ExpandoMetaClass(scriptClass, false)
     cl(emc)
     emc.initialize()
     return emc
}

This snippet has two important points.

The first one is emc.databaseChangeLog.

Here we tell Groovy that the script has a method named databaseChangeLog.
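To see this mechanism in isolation, here is a minimal sketch that adds a method to an object at runtime. The Greeter class and the greet method are hypothetical names chosen for illustration:

```groovy
// Hypothetical class for illustration; it starts with no methods of its own.
class Greeter {}

// Build a metaclass for Greeter and add a greet method before initializing it.
ExpandoMetaClass emc = new ExpandoMetaClass(Greeter, false)
emc.greet = { String who -> "Hello, " + who }
emc.initialize()

def g = new Greeter()
// Attach the metaclass to this one instance, as the parser does with the script object.
g.metaClass = emc

def message = g.greet("Maven")
println message
```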

Second one:

cl.delegate = new DatabaseChangeLogDelegate();

This line tells Groovy that the method calls made inside the databaseChangeLog closure should be resolved against a specific delegate object. Closure.DELEGATE_FIRST means that the delegate is tried first and the closure's owner (the script) only afterwards. Finally the closure body is executed with cl().

And the delegate class is:
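The effect of the delegate and the resolve strategy can be sketched on its own. The Owner and Helper classes below are hypothetical and exist only to show which object answers the call:

```groovy
// Hypothetical classes for illustration.
class Owner {
    String who() { "owner" }

    // The closure is created inside Owner, so this Owner instance is its owner.
    Closure make() {
        return { who() }
    }
}

class Helper {
    String who() { "delegate" }
}

Closure cl = new Owner().make()

// The default strategy is OWNER_FIRST: who() resolves on the Owner instance.
def before = cl()

// With a delegate and DELEGATE_FIRST, the same call resolves on Helper instead.
cl.delegate = new Helper()
cl.resolveStrategy = Closure.DELEGATE_FIRST
def after = cl()

println "$before -> $after"
```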

class DatabaseChangeLogDelegate {
     void createTable(Closure cl) {
           cl.delegate = new AttributeNameDelegate();
           cl.resolveStrategy = Closure.DELEGATE_FIRST;
           println "Creating Table";
           cl();
     }
}


DatabaseChangeLogDelegate defines what happens when a createTable call is found: it sets a new delegate on the closure, prints that a table is being created, and finally calls the closure.

So let's recap what Groovy executes once all the initialization is complete or, more formally, when script.run() (which runs our Groovy configuration script) is executed.

The script's top-level statement, the call to databaseChangeLog, is executed. Groovy finds this method in the ExpandoMetaClass, which sets a new delegate on the closure argument and then runs it. The closure body contains another method call, createTable. How is it resolved? Groovy looks for it in the delegate: DatabaseChangeLogDelegate has a createTable method. That method sets yet another delegate (cl.delegate = new AttributeNameDelegate();) for the next level of calls and runs its closure (cl()), which means resolving two more methods, id and name.

I know it is a little confusing. Think of a delegate class as an @Around aspect in Spring AOP: it can execute some logic before and after, and instead of a proceed call it has a call to the closure (cl()).

And now the last trick. You may be thinking: "fine, you are defining methods whose names you know in advance, like databaseChangeLog, createTable or type. But what about methods that are chosen by the user and can be any valid function name, like id or name?" Each table has its own attribute names, so how do we deal with methods whose names we don't know at development time?

Groovy lets a class define a method called methodMissing. Groovy calls it when a method is invoked on the object (here, the current delegate) and no matching method definition is found.

class AttributeNameDelegate {
     // Called by Groovy for any method this class does not define, e.g. id { ... }
     def methodMissing(String name, Object args) {
           if (args.length == 1 && args[0] instanceof Closure) {
                 Closure cl = args[0]
                 cl.delegate = new AttributeTypeDelegate()
                 cl.resolveStrategy = Closure.DELEGATE_FIRST
                 println name
                 cl()
           } else {
                 // Anything that is not "name { ... }" is reported as a missing method
                 throw new MissingMethodException(name, this.class, args as Object[])
           }
     }
}

As you can see, the same approach is used, but before the closure (the first argument) is called, the name of the invoked method is printed.
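Printing is enough to follow the flow, but a real parser has to keep the configured values. The following is a sketch of one way to do that, not the code of this post: the ColumnCollector and ColumnTypeCollector classes are hypothetical, and they store each column name and type in a map instead of printing them.

```groovy
// Hypothetical collector classes; the names are illustrative only.
class ColumnTypeCollector {
    String columnType

    void type(String value) {
        columnType = value
    }
}

class ColumnCollector {
    // Column name -> column type, in declaration order.
    Map<String, String> columns = [:]

    def methodMissing(String name, Object args) {
        if (args.length == 1 && args[0] instanceof Closure) {
            def typeCollector = new ColumnTypeCollector()
            Closure body = args[0]
            body.delegate = typeCollector
            body.resolveStrategy = Closure.DELEGATE_FIRST
            body()
            // The missing method's name is the column name.
            columns[name] = typeCollector.columnType
            return null
        }
        throw new MissingMethodException(name, this.class, args as Object[])
    }
}

def collector = new ColumnCollector()
Closure table = {
    id { type "int" }
    name { type "string" }
}
table.delegate = collector
table.resolveStrategy = Closure.DELEGATE_FIRST
table()

println collector.columns
```

After the closure runs, the collector holds the configuration as plain data that the rest of the program can use.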

The final code is:

package liquibase
import groovy.lang.Closure;
import groovy.lang.GroovyShell;
import groovy.lang.Script;

class GroovyLiquibase {
     static void main(String[] args) {
          runLiquidParser()
     }

     static void runLiquidParser() {
          runLiquidParser new File("test.groovy")
     }

     static void runLiquidParser(File file) {
          Script script = new GroovyShell().parse(file)
          script.metaClass = createEMC(script.class, { ExpandoMetaClass emc ->
               emc.databaseChangeLog = { Closure cl ->
                    cl.delegate = new DatabaseChangeLogDelegate()
                    cl.resolveStrategy = Closure.DELEGATE_FIRST
                    cl()
               }
          })
          script.run()
     }

     static ExpandoMetaClass createEMC(Class scriptClass, Closure cl) {
          ExpandoMetaClass emc = new ExpandoMetaClass(scriptClass, false)
          cl(emc)
          emc.initialize()
          return emc
     }
}

class DatabaseChangeLogDelegate {
      void createTable(Closure cl) {
          cl.delegate = new AttributeNameDelegate();
          cl.resolveStrategy = Closure.DELEGATE_FIRST;
          cl();
      }
}

class AttributeNameDelegate {
      // Called by Groovy for any method this class does not define, e.g. id { ... }
      def methodMissing(String name, Object args) {
           if (args.length == 1 && args[0] instanceof Closure) {
                Closure cl = args[0]
                cl.delegate = new AttributeTypeDelegate()
                cl.resolveStrategy = Closure.DELEGATE_FIRST
                println name
                cl()
           } else {
                // Anything that is not "name { ... }" is reported as a missing method
                throw new MissingMethodException(name, this.class, args as Object[])
           }
      }
}

class AttributeTypeDelegate {
      void type(String type) {
            println type
      }
}
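
To try the whole flow without a separate test.groovy file, the configuration can be parsed from a string. This is a simplified sketch under stated assumptions: it replaces the three delegate classes above with one hypothetical TraceDelegate that records every call in a list, and the config variable is an inline copy of the Groovy configuration file.

```groovy
// Hypothetical delegate for illustration: it records calls instead of printing them.
class TraceDelegate {
    List<String> events = []

    def methodMissing(String name, Object args) {
        if (args.length == 1 && args[0] instanceof Closure) {
            events << name
            // Reuse this same object as delegate for the nested level.
            Closure body = args[0]
            body.delegate = this
            body.resolveStrategy = Closure.DELEGATE_FIRST
            body()
        } else {
            // Leaf call such as type "int": record the name and its arguments.
            events << name + args.toList()
        }
        return null
    }
}

// Inline copy of the Groovy configuration file.
String config = '''
databaseChangeLog {
    createTable {
        id { type "int" }
        name { type "string" }
    }
}
'''

def trace = new TraceDelegate()
Script script = new GroovyShell().parse(config)

// Same wiring as runLiquidParser: give the script a databaseChangeLog method.
ExpandoMetaClass emc = new ExpandoMetaClass(script.class, false)
emc.databaseChangeLog = { Closure cl ->
    cl.delegate = trace
    cl.resolveStrategy = Closure.DELEGATE_FIRST
    cl()
}
emc.initialize()
script.metaClass = emc
script.run()

println trace.events
```

The recorded list shows the order in which Groovy resolves the calls, from createTable down to each type call.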


At first glance the code seems very complicated, but you will find that it is easy to develop and repetitive: once you have written one delegate, the others are almost identical and only change the required method names and configuration.

Some of the advantages of the DSL approach for configuration files are:

  • Less text to express the same information. Our new configuration file for Liquibase is essentially the XML file without the noise generated by the XML tags.
  • More concise and more readable.
  • Better extensibility. Because the file is based on a scripting language, it can be extended with script statements. For example, in the case of Maven, if you want to execute some Groovy statements during a specific goal, you have to add a plugin (GMaven) and configure it. See the GMaven website for examples of why this feature can be interesting: http://docs.codehaus.org/display/GMAVEN/Executing+Groovy+Code. But if your configuration file is already a Groovy file, you can write Groovy statements natively, without adding or configuring any new plugin.

With Groovy you can create your own Domain-Specific Language. My suggestion is to start thinking about a DSL for the configuration file when you are writing a submodule that will be reused in many other projects, each of which requires a different configuration.
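
The extensibility point can be sketched with a tiny DSL of its own. The TableSpec class and its column method are hypothetical (they are not part of the Liquibase example); the point is only that ordinary Groovy statements, here a for loop, can sit next to DSL calls inside the configuration closure:

```groovy
// Hypothetical mini-DSL for illustration: column(...) registers a column name.
class TableSpec {
    List<String> columns = []

    void column(String name) {
        columns << name
    }
}

def spec = new TableSpec()
Closure config = {
    column "id"
    // Plain Groovy inside the configuration: generate three columns in a loop.
    for (n in 1..3) {
        column("phone" + n)
    }
}
config.delegate = spec
config.resolveStrategy = Closure.DELEGATE_FIRST
config()

println spec.columns
```

An XML configuration would need either three repeated elements or an extra templating step to express the same thing.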

Of course, a DSL can be useful in situations other than the one described here. Some of you may think that XML is the best way to write configuration files, but I think it is always good to know the other valid possibilities and what they offer.

Download Source Code
