A summary of the new clojurehelper

I’ve been packaging dependencies for quite some time now and I regret not doing this earlier. When I started writing clojurehelper I guess I had a different idea of what I really needed in order to easily package libraries. As of now I have some 13 packages on mentors (some of them have already been accepted into NEW) plus 4 or 5 more on my hard drive almost ready to be uploaded. Apart from learning that you should never estimate the number of libraries you need in order to satisfy dependencies (because one leads to another, which leads to another, which happens to depend on some JSR that has not been included in Java yet, and then some other library compiles but is not compatible…), I also learned that some of the features I was including in clojurehelper were not very useful and that I was ignoring some basic functionality that I constantly needed. I used clojurehelper in “javahelper” mode to package most of the Clojure libraries, and maven-debian-helper for the Java ones. After talking to my mentor, we decided to go back to clojurehelper and finish integrating lein2 into the build system.

Sane(r) naming scheme and modularization

[Figure: simple schematic of clojurehelper]

The first thing that came to my mind when I started going through the code was “WHAT IS THIS?”. It was a bad sign that it took me some time to figure out again how everything worked. I guess it made sense back then when I knew what “tools”, “debtemplates”, “project” really were. So I spent some more time reading the code and came to the conclusion that some parts had to be refactored and renamed.

I explained in earlier posts that clojurehelper is only about parsing project.clj and creating debian/* files based on the information that was retrieved. The only special logic that is important but “hidden” is that the package variables (such as dependencies, package name, version, etc.) can also be retrieved from different sources, and they all must be merged somewhere.

So what used to be tools is now called parsers (since it only contains the pom parser and the project parser). I realized there was no real advantage in having the template logic encapsulated in an object, so all those methods are now functions and they are part of core. debtemplates (which ironically didn’t have the template logic) is also part of core.

 

Flexible specification of dependencies

Leiningen uses a project.clj file in order to know what it needs to fetch and configure before a build. When I started using maven-debian-helper I truly saw how closely related Leiningen is to Maven, and it made me think we could use the same approaches to solve the same problems. For maven-debian-helper to work it ‘tweaks’ the original pom.xml file so the maintainer can build the package against the library versions that are already in the archives, without packaging every single version.

An idea borrowed from maven-debian-helper is to create a rules file where the maintainer can indicate what libraries and which versions to use. Although it isn’t strictly necessary to have the dependencies installed in the system, Leiningen 2 likes to complain if it cannot find them. On top of that, javahelper needs to have the libraries installed to figure out their names by looking at the jars.

This is what a rules file looks like for now:

[Screenshot: rules file, 2013-09-13 22:30:25]

 

But it doesn’t really work like a rules file. It’s more of a list of dependencies that will be included in the project. For that reason you can delete a line if you don’t need it or just change a version manually.

Integration with Leiningen 2

The initial version of clojurehelper didn’t even rely on Leiningen. Our super sophisticated build system was a jar command which essentially just packaged the source files and included a manifest file as well. Leiningen 2 is much more robust than Leiningen 1 (thank you technomancy) and for that reason we can almost forget about how Clojure libraries are built and just ask Leiningen to do the hard work.

Ideally we would like to change project.clj during the build, just like maven-debian-helper does with pom.xml, but the problem is that project.clj is much harder to parse and edit (unless you have a Leiningen plugin, which I don’t rule out using in the future). Fortunately Leiningen 2 lets you create profiles and specify dependencies there. If you take this into account and also use the update-in task, you can do pretty much anything you want without editing project.clj.

I wrote a new script called lein_create_profile which reads debian/leiningen.rules and creates a profiles.clj file with the necessary dependencies. Leiningen 2 can then be called this way in order to build the jar file (this is in fact the content of lein_build):

lein -o update-in :dependencies empty -- with-profile debian-build jar

lein_configure (which comes after dh_auto_configure) makes sure that the profile is available before the build by calling lein_create_profile on the debian/leiningen.rules file created by lein_makepkg.
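As a rough sketch of what a script like lein_create_profile has to produce, the Python below renders a profiles.clj with a debian-build profile from a list of dependencies. The one-dependency-per-line "name version" input is an assumption made for this example (the real debian/leiningen.rules format may differ), and parse_rules and render_profile are hypothetical names, not the actual clojurehelper functions.

```python
def parse_rules(text):
    """Parse 'name version' lines (an assumed format) into (name, version) pairs."""
    deps = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('#'):
            continue  # skip blank lines and comments
        name, version = line.split()
        deps.append((name, version))
    return deps


def render_profile(deps, profile='debian-build'):
    """Render a Leiningen profiles map that lists only the given dependencies."""
    vectors = ' '.join('[%s "%s"]' % (name, version) for name, version in deps)
    return '{:%s {:dependencies [%s]}}\n' % (profile, vectors)


# Hypothetical rules content, only for illustration.
rules = """
# name version
org.clojure/clojure 1.4.0
quoin 0.1.1
"""
print(render_profile(parse_rules(rules)))
```

Since the build command empties :dependencies before applying the profile, whatever ends up in this map is the complete dependency list Leiningen sees.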

I like this profiles approach but I don’t know if we’ll find problems with it in the future. This is why the build system might change later on, although it will of course still rely on Leiningen 2 to do the work.

More interaction if we need it

maven-debian-helper is very interactive. I like the fact that if it doesn’t know what to do it will ask you to avoid surprises. Clojurehelper does this now when it comes to resolving dependencies.

During the execution of lein_makepkg, clojurehelper will try to look for unmet dependencies and ask the user what to do about them. For now it only prompts for a version if the version is not installed in /usr/share/maven-repo, but I will implement automatic Y.X -> Y.{1.2.3}.. type of matching shortly.

[Screenshot from 2013-09-13 22:49:30]
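A minimal sketch of the "is this version installed?" check, assuming the usual Maven repository layout of group/artifact/version directories under /usr/share/maven-repo. The function names are hypothetical and the repo argument exists only so the check can be tried against a scratch directory.

```python
import os


def installed_versions(group, artifact, repo='/usr/share/maven-repo'):
    """List the version directories found for group:artifact under repo."""
    path = os.path.join(repo, *(group.split('.') + [artifact]))
    if not os.path.isdir(path):
        return []
    return sorted(entry for entry in os.listdir(path)
                  if os.path.isdir(os.path.join(path, entry)))


def needs_prompt(group, artifact, version, repo='/usr/share/maven-repo'):
    """True when the requested version is missing and the user has to choose."""
    return version not in installed_versions(group, artifact, repo)
```

When needs_prompt returns True, the list from installed_versions is a natural set of choices to offer the user.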

 

Still missing

Since the major refactor and renaming, all the tests break because of the changes in the interface. I will be fixing this soon, as well as improving what has already been done on clojurehelper. Like I said previously, smart version guessing is not being done at all, and lein_makepkg should ask the user whether to export a “debian” version or not. Interoperability with maven-repo-helper has not been tested yet and this is a MUST, but at least Leiningen is creating the jars and the pom files properly and according to the rules that we set.

Yet better testing, parsers, core.cache, some thoughts on packaging

I’ve been trying to package core.cache for the past few days. core.cache is not hard to package at all, in fact it’s very close to data.xml, but my obsession with getting it right on ‘-1’ is what’s been keeping me from packaging it. Today babilen was kind enough to point out a few things to pay attention to when packaging core.* libraries, such as the difference between the information found in project.clj and the pom files. core.cache (version 0.6.2) comes with a project.clj file that differs to some degree from the pom file. After a lot of thought and recalling my first impressions of packaging in general, my conclusion is that nothing is set in stone (except for the Debian Policy), all packages are different, and everything is flexible. I look at quoin, my first package, and I now understand what in the past used to be just rules without much meaning. I still have a lot to learn but I feel more confident when playing with the debian files, the tools, etc.

So most of my goals for this week are almost done. I’m trying very hard to keep a TDD workflow and so far it has been working great. I started to take the tests more seriously when I could clearly see bugs in lein_makepkg with core.cache. This proved that the tests I had were insufficient and were not covering all the possible scenarios. After reading more on the topic, I discovered coverage.py and the coverage plugin for nose, and when I first tried it the tests were covering only 50% of the code (shame on me). I also blame my poor design from the beginning, which is why the goals set for last week were mostly centered on improving separation of concerns and trying to make everything very modular and structured. Tests now cover 84% of the code, yay! The reason it is not 100% is that the DebianTemplates class is not being tested directly (it is tested indirectly because the template tests call lein_makepkg, but it should be tested separately).

The old ProjectParser class now inherits from a new class ‘PropertiesParser’. The new design I mentioned here matches perfectly with my decision to call it PropertiesParser: a project (what we are packaging) consists of Properties (inherent to the Leiningen project itself) and Options. Both must be parsed and they can both come from different sources. Say we want to extract the properties of a project by parsing a pom file instead of the output of lein xml: we could create a class PomParser which provides project properties, and this class can inherit from PropertiesParser to have the same interface that all the parsers have. In fact this PomParser has been implemented already, and the next step would be to ask the user whether to use a project.clj or a pom.xml file (or both) when packaging. This was pointed out by babilen and I think it is a great idea.
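To illustrate the "same interface, different source" idea, here is a small sketch in which both parsers return the same dictionary from get_items(). The class names come from the post, but everything inside them (the ns and name_tag attributes, taking XML text instead of a file path, the keys of the returned dictionary) is an assumption for the example and not the actual leinpkg code.

```python
import xml.etree.ElementTree as ET

POM_NS = '{http://maven.apache.org/POM/4.0.0}'


class PropertiesParser(object):
    """Shared interface: parse XML text and expose get_items()."""

    ns = ''            # tag prefix; subclasses override it when needed
    name_tag = 'name'  # element that holds a project or dependency name

    def __init__(self, xml_text):
        self.root = ET.fromstring(xml_text)

    def _text(self, node, tag):
        # findtext only looks at direct children and returns None if absent
        return node.findtext(self.ns + tag)

    def get_items(self):
        deps = [(self._text(d, self.name_tag), self._text(d, 'version'))
                for d in self.root.iter(self.ns + 'dependency')]
        return {'name': self._text(self.root, self.name_tag),
                'version': self._text(self.root, 'version'),
                'dependencies': deps}


class ProjectParser(PropertiesParser):
    """Reads the project.xml layout (plain <name> and <version> elements)."""


class PomParser(PropertiesParser):
    """Reads a pom.xml, which is namespaced and uses <artifactId>."""

    ns = POM_NS
    name_tag = 'artifactId'
```

Code that consumes the properties only ever calls get_items(), so asking the user to pick project.clj or pom.xml comes down to picking which class to instantiate.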

test test test, fixes and core.cache

I’ve spent this weekend trying to improve our tests. I read about code coverage and a few features that nose offers which I’ve been ignoring for too long now. Today I tried to put everything together including some more tidying up of lein_makepkg. The result was mixed since I didn’t finish. Tomorrow I will give more details because I’m exhausted zZZzZZ.

New components, hopefully for the best

As I mentioned in my last post, lein_makepkg was getting just too messy. I have to admit I’ve been adding features not worrying too much about how maintainable that would be. Today I started to make some changes regarding the structure of lein_makepkg and some of the classes within the leinpkg package.

The first thing I noticed when I analyzed the code I had was that the ‘project’ dictionary was being read and written all over the script. This was making it hard for me to add more features since I had to be aware of where everything was. One of the initial steps when running lein_makepkg is to parse options from different sources: the configuration file, the command-line arguments, and the environment. There was also some logic on top of this. For example, if we wanted to look up an ITP bug number by using --guess-itp, we would have to make sure this is done after the --itp flag is processed. Another example would be the --no-configfile option, which disables configparser. So my solution was to create a class (OptionParser) to encapsulate argparse, configparser, and os.environ while providing some logic for the order in which the arguments should be parsed. After initialization this class provides a method, OptionParser.get_args(), which returns a dictionary of the options found from all sources, taking into account their priority (cmdline > environ > configfile). The idea is to not worry about how the options are processed after .get_args() returns.
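The priority rule (cmdline > environ > configfile) can be sketched roughly as follows. This is not the real OptionParser: the [clojurehelper] section name, the CLOJUREHELPER_* environment variables, and passing the config as text are all assumptions made so the example is self-contained (Python 3).

```python
import argparse
import configparser
import os


class OptionParser(object):
    """Merge options with priority cmdline > environ > configfile."""

    def __init__(self, argv, environ=None, config_text=''):
        parser = argparse.ArgumentParser()
        parser.add_argument('--package')
        parser.add_argument('--version')
        parser.add_argument('--no-configfile', action='store_true')
        self.cmdline = vars(parser.parse_args(argv))
        self.environ = os.environ if environ is None else environ
        self.config = configparser.ConfigParser()
        # --no-configfile disables the config source entirely
        if not self.cmdline.pop('no_configfile'):
            self.config.read_string(config_text)

    def get_args(self):
        options = {'package': None, 'version': None}
        # Lowest priority first; each later source overwrites the earlier ones.
        if self.config.has_section('clojurehelper'):
            options.update(self.config['clojurehelper'])
        for key in list(options):
            env_value = self.environ.get('CLOJUREHELPER_' + key.upper())
            if env_value:
                options[key] = env_value
        options.update((k, v) for k, v in self.cmdline.items()
                       if v is not None)
        return options
```

Callers only see the merged dictionary, which is the point: nothing after get_args() needs to know where a value came from.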

The second big step was to redefine the previous ‘Project’ class. This class was initialized with a path to project.xml; the file was parsed and all the important fields were extracted and put into members which could be read after initialization. This was another problem since the global ‘project’ dictionary had to be read and written using Project members. This class was renamed to ProjectParser, because in the end it is just an XML parser for project.xml. It provides a method, ProjectParser.get_items(), which returns the properties that were read from project.xml: the name of the Leiningen project, the version, the dependencies, etc.

Up to this point we have Leiningen properties and options, all nice and separated. The goal of lein_makepkg is to use these Leiningen properties to populate a debian folder, with the possibility of manually changing or overriding some fields depending on the options. So a new class ‘Project’ represents this idea. It takes a dictionary of Leiningen properties and a dictionary of options, and generates variables to be passed to the jinja templates. I’m not really sure if this was the right decision, but after some writing I realized that at least part of it is right: each of the variables we need for jinja is calculated in its own method within the Project class. So for instance if I want to redefine how the dependencies are calculated, I only have to redefine one method. This also makes unit testing easier, which was another one of the goals for this refactoring. These are a few of the methods:


def _get_package_name(self):
    """
    Returns the package name taking into account any possible options.
    Current name: lib<sourcename> OR any option that sets the package
    name.
    """

    name = ''
    if self.options['package']:
        name = self.options['package']
    else:
        name = 'lib' + self._get_source_name()

    return name

def _get_source_name(self):
    """
    Returns the source name.
    Current name: <propname>-clojure. Where propname is the name taken
    from the properties.
    """

    return self.properties['name'] + '-clojure'

def _get_version(self):
    """
    Returns the version of this package.
    """

    version = self.properties['version']
    if self.options['version']:
        version = self.options['version']

    return version

I’m sure some of the decisions I took might have to be reconsidered but at least lein_makepkg has improved.
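To show why one method per variable helps, here is a cut-down, hypothetical version of the Project class with a subclass that changes a single rule. get_template_vars and PlainNameProject are names invented for this sketch, and the use of dict.get is a simplification of the methods shown above.

```python
class Project(object):
    """Turns properties plus options into template variables."""

    def __init__(self, properties, options):
        self.properties = properties
        self.options = options

    def _get_source_name(self):
        return self.properties['name'] + '-clojure'

    def _get_package_name(self):
        return self.options.get('package') or 'lib' + self._get_source_name()

    def _get_version(self):
        return self.options.get('version') or self.properties['version']

    def get_template_vars(self):
        # Hypothetical entry point: one key per template variable.
        return {'source_name': self._get_source_name(),
                'package_name': self._get_package_name(),
                'version': self._get_version()}


class PlainNameProject(Project):
    """Hypothetical variant: source name without the '-clojure' suffix."""

    def _get_source_name(self):
        return self.properties['name']
```

Overriding _get_source_name also changes the derived package name, and each method can be unit-tested with nothing more than two small dictionaries.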

Packaged clojure.data.xml, more improvements on lein_makepkg, major refactor coming soon

Today was a very productive day. I managed to package clojure.data.xml without too much hassle. It is a very interesting package since it doesn’t use Leiningen and not all of the source has to be bundled into the package. I think clojure.data.xml has been the most fun library I’ve had the chance to play with so far; my other two packages, stencil and quoin, were very straightforward: download the source code, use our semi-standard rules file (or use clojurehelper --javahelper), fix a few more things and voilà, a fresh deb file ready to be installed. clojure.data.xml has a few interesting things:

  • comes with a changes file that we don’t need but process by default (package.docs)
  • has an src directory that includes tests
  • the tags changed over time in github which made it a little more difficult to write a watchfile

Ignoring CHANGES.md wasn’t much of an issue since changing the .docs file and only listing README.md was everything I needed. When I initially packaged clojure.data.xml I realized lein xml would complain because /clojure/data/xml.clj was not being found in the classpath. This was all due to the fact that src had a weird structure when I downloaded it from upstream. It included the tests and the data under the same folder (src) and our rules file automatically grabs the src and makes a jar out of it. The result was a jar file which contained the code and the tests under the same root directory:

`-- src
    |-- main
    |   `-- clojure
    |       `-- clojure
    |           `-- data
    |               `-- xml.clj
    `-- test
        `-- clojure
            `-- clojure
                `-- data
                    `-- xml
                        |-- test_emit.clj
                        |-- test_parse.clj
                        |-- test_seq_tree.clj
                        |-- test_sexp.clj
                        `-- test_utils.clj

Not what we want. The answer was to change the jar command on the rules file to make it read only from src/main/clojure/ resulting in a clojure/data/xml.clj structure.

The biggest change in lein_makepkg was to use configparser to read options from ~/.clojurehelper.conf. I had to read a lot about it in order to understand how it worked; as usual I had to spend some time fixing newbie mistakes. The tests now ignore the config file by default unless the ignore_configfile class variable is set to False manually. The strategy is to merge the options read from the command line into the configuration file properties. I also fixed a few other things such as pep8 and Python 3 compatibility.

Tomorrow I’ll work on a major refactor that must be done in lein_makepkg. I’m beginning to see how the current structure is getting less flexible, and with every change I make I have to think about whether it will break something else. Thanks to the tests I detect when breaking happens, but not always, so I think it is also time to improve the tests.

Survived exam period. Lein xml, clojure/data.xml, lein2 sequence, changed packaging test

I’ve been taking care of small issues after the third meeting with my mentors. The last week of exams was just horrible (3 final exams in 5 days) but hopefully my marks will be good =). So after fixing things like encoding, renaming, and fixing typos, I started working on lein xml again to get rid of the mess I initially created. The first approach was to print the xml file, line by line, with strings… Yes I know, an ugly naive solution that a monkey could have done better. So after some thinking this is what I came up with:

(ns leiningen.xml
    (:require [clojure.java.io :as io]
              [clojure.data.xml :as xml]))

(defn xml
    "Transform project.clj into a simpler XML file"
    [project & args]
    (with-open [out-file (io/writer "project.xml")]
      (let [project-xml (xml/element
                :project {}
                (xml/element :name {} (:name project))
                (xml/element :version {} (:version project))
                (xml/element :main {} (:main project))
                (xml/element :url {} (:url project))
                (xml/element :description {} (:description project))
                (xml/element :dependencies {}
                    (reduce
                        (fn [deps d]
                            (conj deps (xml/element :dependency {}
                                        (xml/element :name {} (str (d 0)))
                                        (xml/element :version {} (d 1)))))
                        nil (:dependencies project))))]
        (xml/emit project-xml out-file))))

I initially didn’t know what to use to export to xml. I thought of trying to not use any other libraries since it would add yet another dependency to our long list, but I went back to your titanpad and saw that data.xml is already planned to be packaged, so I’m just reusing what we’ll have.

I also want to package data.xml for fun. I think it is a very useful package and it looks very interesting since it doesn’t use Leiningen. I sent a few questions to the mailing list regarding the package.

I also made the lein2 sequence depend on javahelper, but that’s almost self-explanatory.

git-buildpackage has been replaced by dpkg-buildpackage when it comes to running the packaging tests. It made no sense to use git-buildpackage; it only added more noise to the whole script since the code had to be imported, and then all changes had to be committed before the build. dpkg-buildpackage is also faster, making the test run in about 18 seconds on my machine (it used to be about 26 seconds with git-buildpackage).

more updates on clojurehelper, itp for stencil, lein-xml, full packaging test

Before I took clojurehelper for a ride I remembered an issue we discussed in the clojure packaging team regarding a circular dependency that might be introduced if we package leiningen2 dependencies with clojurehelper. It is our hope that clojurehelper will use Leiningen to build packages in the future. As of today this is not the case, but we do use Leiningen during the packaging process because of the lein-xml plugin. So in order to use clojurehelper without actually using it I implemented the --javahelper flag of ln_makepkg: it creates a debian folder with a rules file that doesn’t use lein2 but javahelper. I believe this option will be deprecated after we manage to package leiningen2, but for now I think it can help us test and package dependencies.

I also updated leinpkg.tools.Project to parse project.xml instead of pom.xml. project.xml is what lein-xml produces from project.clj. ln_makepkg was also updated to call lein-xml instead of lein pom. I discovered a few bugs in lein-xml after testing it for a bit but they are fixed now. I had to adapt the tests as well. Speaking of tests, I created a new type of test, the “package test”, which builds a full package out of upstream’s tarball. I think this goes against unit testing, but I found myself building quoin so many times trying to find bugs in clojurehelper that I decided to create a test for this. It is a little slow but it brings me joy to see it build perfectly. Following the “refactor early, refactor often” idea, I made this new test more generic so other packages can be built with it. This is not as straightforward as testing the correctness of the templates, since the files require a few mandatory modifications before the package can be built (for instance, if the description is missing debuild will complain and stop). Different tests can override the _prebuild method to specify what they want to do before calling git-buildpackage. Here’s an example of how to add something to the description field of the control file:


from subprocess import call

def _prebuild(self):
    control_path = self.git_dir + '/debian/control'
    # Read the generated control file.
    with open(control_path, 'r') as control:
        lines = control.readlines()
    # Replace the last two lines with a short and a long description.
    lines[-2] = 'Description: Lorem ipsum dolor sit amet\n'
    lines[-1] = ' Lorem ipsum dolor sit amet, consectetur adipiscing elit.'
    with open(control_path, 'w') as control:
        control.writelines(lines)
    # Commit the change before the package is built.
    call('git add debian/control', shell=True, stdout=self.stdout_file)
    call('git commit -m "Added description to control"', shell=True,
         stdout=self.stdout_file)

I’m glad I’m writing many tests, and I’m glad they’re failing. For instance when I migrated from the old pom approach to the new lein-xml parser, the tests were the ones that spotted the mistakes even when I thought the change wouldn’t introduce any problems.

Many things are left to do. I decided to package core.cache and stencil which by the way clojurehelper managed to package with almost no interaction =).

Tomorrow first Debian weekly report. We’ll see how that goes.

A new member of the family: ln_builddocs

Last post ended with clojuretools not being able to build the documentation. Well, today I created the missing piece of the puzzle (one of many actually): ln_builddocs looks for markdown documents in the root folder and creates the html equivalents, placing them under doc/html. The rest is taken care of by the rest of the dh sequence. There’s not a lot of magic in the script itself: bash + markdown + sed + heredoc, nothing fancy, but it gets the job done and, most importantly, makes clojurehelper package the simpler libraries we have so far, like quoin and scout, with almost no interaction.
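The real ln_builddocs is a shell script, so the following is only a Python sketch of the same flow under a few assumptions: documents are recognised by a .md or .markdown extension, and an external markdown command that prints HTML to stdout is available. plan_docs and build_docs are hypothetical names, and the sed/heredoc wrapping of the output is left out.

```python
import os
import subprocess


def plan_docs(root, out_dir='doc/html'):
    """Map each top-level markdown file to its HTML target path."""
    plan = []
    for name in sorted(os.listdir(root)):
        base, ext = os.path.splitext(name)
        if ext.lower() in ('.md', '.markdown'):
            plan.append((os.path.join(root, name),
                         os.path.join(root, out_dir, base + '.html')))
    return plan


def build_docs(root):
    """Convert every planned file by calling the external markdown command."""
    for source, target in plan_docs(root):
        target_dir = os.path.dirname(target)
        if not os.path.isdir(target_dir):
            os.makedirs(target_dir)
        with open(target, 'w') as out:
            # Assumes a 'markdown' executable on PATH that writes to stdout.
            subprocess.check_call(['markdown', source], stdout=out)
```

Keeping the planning step separate from the conversion means the file discovery can be tested without the markdown tool installed.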

I have to say I spent most of the time trying to actually figure out a way to do all this cleanly. My first attempt was to dump the project variables from ln_makepkg into a package.control type of file in the hope that the rest of the helpers would just read from it and figure out what to do. I even created a local branch for this, set up ln_makepkg to dump a json of the project dictionary and so on… but when I stopped to actually think about it I realized it was wrong. I don’t mean that it was not working, but rather that it was just not right. Each helper should be able to do its job pretty much autonomously without some big central file. That’s how javatools works: each helper has a set of its own control files like .jlibs and .classpath. Requiring ln_builddocs to be used with ln_makepkg makes no sense for someone who is not packaging Clojure or wants to use the little helper by itself.

What’s missing now? A way for ln_build to figure out what to build without using the $PRODUCED_JAR environment variable would be nice. I was thinking of also including the CLASSPATH variable in ln_build but I don’t know how that would affect the rest of the jh_ tools from javatools. I’m pretty sure it would break jh_build, but then again I don’t think we have to use jh_build at all.

rules file, support for argument testing, itp feature, refactoring, more tests, ln_clean, ln_build

A lot has happened since my last post. Let’s start with ln_makepkg:

We can now create tests that make use of the argument parsing implemented for ln_makepkg. In order to pass arguments to ln_makepkg during a test one only has to initialize the _commandline_args variable accordingly (-i is for the new --itp option described below):


class TestSlingshot(TestTemplates):

    __test__ = True
    _pom_file = 'pom_slingshot.xml'
    _expected_source_name = 'slingshot-clojure'
    _expected_package_name = 'libslingshot-clojure'
    _commandline_args = '-i 699546'
    _expected_version = '0.10.3-1'
    _expected_itp_bug = '699546'

An important change made in the tests was to capture the “match and compare” pattern, mentioned in my last post, into one method:


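import re

def _search_compare(self, file, regex, expected):
    # 'file' holds the contents of the generated file as a string.
    result = re.search(regex, file)
    if expected and not result:
        print('Expected value was not found')
        assert False
    elif expected:
        # The first capture group of the regex is the value to compare.
        found = result.group(1)
        print('Found: %s Expected: %s' % (found, expected))
        assert found == expected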

I asked on the mailing list if it would be nice to have ln_makepkg automatically check for the ITP bug number that the package closes. After a few responses I decided to integrate it into ln_makepkg as an optional feature using wnpp-check from the devscripts package. In order to use this one can just call ln_makepkg with the --itp option to manually specify the bug number, or --guess-itp to fetch it from wnpp. I also wrote a test for this and it works quite nicely.


class TestQuoinArgsItp(TestTemplates):

    __test__ = True
    _pom_file = 'pom_quoin.xml'
    _expected_source_name = 'quoin-clojure'
    _expected_package_name = 'libquoin2-clojure'
    _commandline_args = '-p libquoin2-clojure -v 0.1.1 --guess-itp'
    _expected_version = '0.1.1-1'
    _expected_itp_bug = '710113'
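The ITP lookup itself could look roughly like the sketch below. It assumes wnpp-check prints one line per match containing "(ITP - #NNNNNN)" and ending with the package name; that output shape is my assumption, not something verified here, and parse_itp and guess_itp are hypothetical names.

```python
import re
import subprocess


def parse_itp(output, package):
    """Return the ITP bug number for package from wnpp-check style output."""
    for line in output.splitlines():
        match = re.search(r'\(ITP - #(\d+)\)', line)
        # Assumed layout: the package name is the last field of the line.
        if match and line.split()[-1] == package:
            return match.group(1)
    return None


def guess_itp(package):
    """Run wnpp-check and extract the bug number; None if nothing matches."""
    proc = subprocess.Popen(['wnpp-check', package], stdout=subprocess.PIPE,
                            universal_newlines=True)
    output, _ = proc.communicate()
    # The exit status is deliberately ignored; only the output is inspected.
    return parse_itp(output, package)
```

Splitting the parsing from the subprocess call lets the parsing be tested against a canned string without network access.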

I’ve been playing around with the rules file in order to make our debian packages more straightforward. Ideally we would only have to call dh --with clojurehelper in the rules file and forget about the rest. That’s what I’ve been trying to accomplish:

#!/usr/bin/make -f

include /usr/share/javahelper/java-vars.mk
export JAVA_HOME=/usr/lib/jvm/default-java
export CLASSPATH=/usr/share/java/clojure.jar
export PRODUCED_JAR=quoin.jar

%:
	dh $@ --with dh_lein2

The above works! =). I had to write ln_build and ln_clean. I wish they didn’t rely on the PRODUCED_JAR variable, and the sequence cannot build the documentation yet, but I’ll sort that out soon.