Experience by Doing: March 2011

Most programmers must be way ahead of me on this issue. I have to admit that I do not have solid knowledge of Maven. I have tried several times to understand it, but for me the XML is too hard to maintain, the one-artifact-per-project paradigm is too restrictive, and the web of plugins that has to be learned is too complex. I prefer Ant, which is a bit old-fashioned but well documented and readable. (Should I ever replace it, Gradle seems to be a good choice.)
What I envy in Maven is the archetypes and the dependency management. I really hate collecting all the dependency jars into a new project's lib directory. Fortunately there are other ways to get the same benefit, and that is the point of this post.

What is the advantage of using some kind of dependency management?

no need to store jars in VCS - lib directory becomes a first class citizen in .gitignore
no need to keep in mind transitive dependencies
in-house libraries can be stored in the same shared repository as the other libraries
easy to check upgrade possibilities

OK. So if it is so nice, how can I get there? When I considered Gradle as a replacement for Ant, I saw that it uses Ivy to manage libraries, so it came naturally to take a look at it. Let us gather the main goals to achieve:

test project that gets libs through declaration
shared repository for enterprise
publish in-house libs to shared repository

Declarative dependency
Setting up Ivy is easy: place ivy-[version].jar in the lib directory of Ant. The dependency declaration goes into the ivy.xml file in the project's root directory. It could look something like this:

<ivy-module version="2.0">
    <info organisation="hu.progmatx" module="test-ivy"/>
    <dependencies>
        <dependency org="org.apache.velocity" 
                       name="velocity" rev="1.5"/>
    </dependencies>
</ivy-module>

(A nice repository of dependency descriptions is available on the net.)
So we have the definition, but how do we get the files? Let us put together a very simple build.xml:

<project xmlns:ivy="antlib:org.apache.ivy.ant" 
         name="test-ivy" 
         default="resolve">
    <target name="resolve" 
            description="--> retrieve dependencies with ivy">
        <ivy:retrieve sync="true" 
                      symlink="true" 
                      refresh="true"/>
    </target>

    <target name="report" 
            description="--> report dependencies with ivy" 
            depends="resolve">
        <ivy:report />
    </target>
</project>

The main points are

the namespace declaration, which makes it easy to call Ivy tasks,
the resolve target, which calls the retrieve task,
the report target, which calls the report task to display the dependencies in html or graphml
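
As a small illustration of the report target above, the output location can be set explicitly. This is only a sketch: the build/ivy-report directory is an assumed name that does not come from the original build file, and the fragment is meant to replace the report target inside the same project element.

```xml
<!-- Sketch only: "build/ivy-report" is an assumed output directory. -->
<target name="report"
        description="--> report dependencies with ivy"
        depends="resolve">
    <!-- todir: where the report files are written.
         graph: also write a graphml file next to the html report. -->
    <ivy:report todir="${basedir}/build/ivy-report" graph="true"/>
</target>
```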

Some explanation of the retrieve task might come in handy. First it implicitly calls the resolve task to compute the dependency graph. Then it downloads the files into the local cache. Finally it would normally copy the files into the lib directory, but I prefer using symlinks (as you can see from the symlink attribute of the task). The sync attribute ensures that unused dependencies are removed from the lib directory.
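
To show where the retrieved jars end up being used, here is a minimal sketch of a compile target that builds against the lib directory. The src and build/classes directories and the lib.path id are my assumptions for illustration; only the retrieve attributes come from the build file above. The pattern is spelled out so it is visible that the jars land directly under lib/.

```xml
<project xmlns:ivy="antlib:org.apache.ivy.ant"
         name="test-ivy"
         default="compile">
    <target name="resolve">
        <!-- Explicit pattern: jars (or symlinks to the cache) go into lib/ -->
        <ivy:retrieve pattern="${basedir}/lib/[artifact]-[revision].[ext]"
                      sync="true"
                      symlink="true"/>
    </target>

    <!-- Assumed layout: sources in src/, classes in build/classes/ -->
    <target name="compile" depends="resolve">
        <mkdir dir="build/classes"/>
        <path id="lib.path">
            <fileset dir="lib" includes="*.jar"/>
        </path>
        <javac srcdir="src"
               destdir="build/classes"
               classpathref="lib.path"
               includeantruntime="false"/>
    </target>
</project>
```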

Repository
Our next step is to establish a shared place for libraries. The structure of the repository can be customized, and the layout Maven uses would also be suitable. Take a look at the following structure:

shared-repository/
└── no-namespace
    └── ant
        └── ant
            ├── ivys
            │   ├── ivy-1.6.xml
            │   ├── ivy-1.6.xml.md5
            │   └── ivy-1.6.xml.sha1
            └── jars
                ├── ant-1.6.jar
                ├── ant-1.6.jar.md5
                └── ant-1.6.jar.sha1

This can be accessed through the configuration of the following properties in the build.xml:

...
<property name="ivy.shared.default.root" value="/media/shared-repository"/>
<property name="ivy.shared.default.ivy.pattern" value="no-namespace/[organisation]/[module]/ivys/ivy-[revision].xml"/>
<property name="ivy.shared.default.artifact.pattern" value="no-namespace/[organisation]/[module]/[type]s/[artifact]-[revision].[ext]"/>
...
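
For orientation, these three properties feed the resolver named shared in Ivy's default settings. As far as I recall from the settings bundled with Ivy, that resolver is defined roughly as in the sketch below; treat it as an approximation, not a verbatim copy. Since the values are read when Ivy loads its settings, the Ant properties should be declared before the first Ivy task runs.

```xml
<!-- Approximation of the default "shared" resolver, written from memory. -->
<ivysettings>
    <resolvers>
        <filesystem name="shared">
            <!-- root + pattern together give the full path of each file -->
            <ivy pattern="${ivy.shared.default.root}/${ivy.shared.default.ivy.pattern}"/>
            <artifact pattern="${ivy.shared.default.root}/${ivy.shared.default.artifact.pattern}"/>
        </filesystem>
    </resolvers>
</ivysettings>
```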

Of course, with a shared repository I have to say goodbye to symlinks.

Publish libraries
Maintaining a repository like that by hand could be hard. Fortunately the ivy:publish and ivy:install tasks come to the rescue. The first can be used to upload our own homebrew jars, and the second is suitable for getting a copy of the dependencies from the default Maven repo. Let us do it one by one!
To publish your jar, just put the following in build.xml:

<target name="publish" depends="resolve"
        description="--> publish module to shared repository">
    <ivy:publish resolver="shared" pubrevision="1.0">
         <artifacts pattern="build/jars/[artifact].[ext]" />
    </ivy:publish>
</target>
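
The publish task expects the jar to already exist under build/jars. A minimal sketch of a target that produces it could look like this; the jar and publish-built-jar target names and the build/classes directory are assumptions of mine, and the jar is named after the module so that [artifact].[ext] matches test-ivy.jar when ivy.xml declares no explicit publications.

```xml
<!-- Assumes compiled classes already exist in build/classes -->
<target name="jar" depends="resolve">
    <mkdir dir="build/jars"/>
    <jar destfile="build/jars/test-ivy.jar" basedir="build/classes"/>
</target>

<!-- Variant of the publish target that builds the jar first -->
<target name="publish-built-jar" depends="jar"
        description="--> build the jar, then publish it to the shared repository">
    <!-- overwrite="true" allows republishing the same revision -->
    <ivy:publish resolver="shared" pubrevision="1.0" overwrite="true">
        <artifacts pattern="build/jars/[artifact].[ext]"/>
    </ivy:publish>
</target>
```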

OK, that was too easy, so let's look at the other problem! It seems to be more complex, because we have to edit two files. Append this to build.xml:

<property name="ivy.cache.dir" value="${basedir}/cache"/>
<property name="dest.repo.dir" value="/media/shared-repository"/>

<target name="maven2"
        description="--> install module from maven 2 repository">
    <ivy:settings id="copy.settings" file="${basedir}/ivysettings.xml"/>
    <ivy:install settingsRef="copy.settings"
                 organisation="org.apache.velocity"
                 module="velocity"
                 revision="1.5"
                 from="libraries"
                 to="my-repository"
                 overwrite="true"
                 transitive="true"/>
</target>

And let's create the ivysettings.xml!

<ivysettings>
    <settings defaultCache="${ivy.cache.dir}/no-namespace" 
                 defaultResolver="libraries"
                 defaultConflictManager="all" />  <!-- in order to get all revisions without any eviction -->
    <resolvers>
        <ibiblio name="libraries" m2compatible="true" />
        <filesystem name="my-repository">
            <ivy pattern="${dest.repo.dir}/no-namespace/[organisation]/[module]/ivys/ivy-[revision].xml"/>
            <artifact pattern="${dest.repo.dir}/no-namespace/[organisation]/[module]/[type]s/[artifact]-[revision].[ext]"/>
        </filesystem>
    </resolvers>
</ivysettings>

The point here is that we define two so-called resolvers: one connected to the standard Maven repository and one to our shared repository. The install task names the library to copy, and with the attributes set up this way we gather the transitive dependencies as well.
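
Once the shared repository is filled, a project could prefer it and fall back to the Maven repository only for libraries that are missing. The following is a sketch of such a settings file using a chain resolver; the chain name shared-first is made up, and dest.repo.dir is assumed to be defined as an Ant property as in the build file above.

```xml
<ivysettings>
    <settings defaultResolver="shared-first"/>
    <resolvers>
        <!-- returnFirst: stop at the first resolver that has the module -->
        <chain name="shared-first" returnFirst="true">
            <filesystem name="my-repository">
                <ivy pattern="${dest.repo.dir}/no-namespace/[organisation]/[module]/ivys/ivy-[revision].xml"/>
                <artifact pattern="${dest.repo.dir}/no-namespace/[organisation]/[module]/[type]s/[artifact]-[revision].[ext]"/>
            </filesystem>
            <!-- Fallback: the public Maven repository -->
            <ibiblio name="libraries" m2compatible="true"/>
        </chain>
    </resolvers>
</ivysettings>
```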

Conclusion
Wasn't it easy? I did not dare to think about it before: there seemed to be so many things to do, and all of them seemed so complex. Fortunately Ivy is gentle; it adapts to your pace, and you do not have to customize more than is necessary. I only scratched the surface, but it was already completely working. It was worth investing some time to gather some Experience by Doing!

I have just realized how easy it is to put up a service for searching my own code.
During an architect meeting the issue of code reuse came up. Some of the colleagues wanted to create a store for all the code we write, which would have to be tagged somehow to be searchable. Others wanted to generate and share documentation in HTML format and search that.
The problems with these solutions are obvious:

people under pressure do not bother to put their code into a searchable archive
poorly documented code cannot be searched
overdocumented code breaks the DRY principle
programmers speak programming languages, so it is easier for them to search the code itself

One of my friends threw in that Google is quicker and probably smarter than any kind of archive that we could put together. Maybe he was right, but I was eager to find some way to combine the two and put up an in-house code search site.
Actually I thought there were tons of apps like this out there in the Open Source world. But it turned out that the only relevant candidate was OpenGrok. I tried it, and I was completely satisfied. It not only does a full-text search on the code, but it also uses ctags to gain some insight into the structure. It can even browse VCS history, which might come in handy when studying the meaning of the code.

Putting up OpenGrok is easy:

unzip the archive
create a basedir for projects
check out the projects from the VCS
set up the etc/configuration.xml (mainly just setting the path)
run bin/OpenGrok index
build the webapp
deploy under Tomcat

What you might need to automate is updating the projects on a regular basis and running the indexing afterwards. But that does not require a master's degree.
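
Since Ant is already at hand, the update-then-index step could be sketched as a tiny build file. Everything specific here is an assumption: the install and source paths are hypothetical, the project directory name is made up, and git stands in for whatever VCS the projects really use. Only the bin/OpenGrok index call comes from the steps above. Scheduling (for example from cron) is left out.

```xml
<project name="opengrok-refresh" default="reindex">
    <!-- Hypothetical locations; adjust to the real installation -->
    <property name="opengrok.home" value="/opt/opengrok"/>
    <property name="src.root" value="/var/opengrok/src"/>

    <!-- Update one checked-out project; repeat the exec for each project -->
    <target name="update">
        <exec executable="git" dir="${src.root}/test-ivy" failonerror="true">
            <arg value="pull"/>
        </exec>
    </target>

    <!-- Run the indexer only after the update has succeeded -->
    <target name="reindex" depends="update">
        <exec executable="${opengrok.home}/bin/OpenGrok" failonerror="true">
            <arg value="index"/>
        </exec>
    </target>
</project>
```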

However, when I presented this to the architects, it turned out that I was not the only one who had worked on this item. One of the fellows showed us a nice wiki page comparing documentation generators.
The reason I mention this is that I want to emphasize the difference between reading some docs and doing something relevant.
I still believe in Experience By Doing...

Experience by Doing

Sunday, March 27, 2011

Manage Dependency

Friday, March 11, 2011

Grok Your Code