Setuptools, Package indexes, and releases

I’ve been doing package management related stuff a lot recently, and thanks to a nice little script Chris McDonough showed me, it’s easy to turn a directory full of eggs into a PyPI (Python Package Index) style index page.
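As a rough illustration of the idea (this is not Chris McDonough's script, which isn't reproduced here), a minimal sketch could look like the following. The function name `build_index_page` and the list of file suffixes are my own assumptions. It produces a single flat page of links, which is the shape easy_install's `--find-links` option can read; a full PyPI-style index would have one page per project.

```python
import html
import os
from urllib.parse import quote

# Assumption: these are the distribution file types we want to expose.
DIST_SUFFIXES = (".egg", ".tar.gz", ".zip")


def build_index_page(package_dir, title="Package index"):
    """Return an HTML page linking to every distribution file in package_dir.

    Hypothetical helper for illustration only.
    """
    names = sorted(
        name for name in os.listdir(package_dir)
        if name.endswith(DIST_SUFFIXES)
    )
    # One relative link per file; the page is meant to sit next to the files.
    links = "\n".join(
        '<a href="{}">{}</a><br/>'.format(quote(name), html.escape(name))
        for name in names
    )
    return "<html><head><title>{}</title></head><body>\n{}\n</body></html>\n".format(
        html.escape(title), links
    )
```

Writing the returned string to an `index.html` inside the same directory, and serving that directory over HTTP, would be enough for an installer to follow the links.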

And now I’m pretty much sold on the idea of having separate indexes for all our major releases, so that there’s a known, tested set of versions and dependencies that can be easy_installed with one command to get a working install. When combined with Ian Bicking’s virtualenv package, this pretty much solves the package management problem for users: they can keep a lightweight virtual environment for each version of the software they want to have, and they have a reliable and repeatable way to get a specific version of TurboGears along with all of its dependencies.
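To make the workflow concrete, here is a small sketch of the two commands involved: create a virtual environment, then run that environment's easy_install against a release-specific index. The helper name `install_commands`, the index URL, and the requirement name are placeholders I've made up for illustration; `--index-url` is easy_install's option for pointing at an alternative index.

```python
import os


def install_commands(env_dir, index_url, requirement):
    """Return the commands as argument lists; nothing is executed here.

    Hypothetical helper for illustration only.
    """
    # On Windows the scripts live in "Scripts" rather than "bin".
    easy_install = os.path.join(env_dir, "bin", "easy_install")
    return [
        # 1. Create an isolated environment for this release.
        ["virtualenv", env_dir],
        # 2. Install from the release-specific index instead of PyPI.
        [easy_install, "--index-url", index_url, requirement],
    ]


# Placeholder values; substitute the real index URL for a given release.
commands = install_commands(
    "tg2env", "http://example.com/tg2/index/", "TurboGears2"
)
```

Each entry in `commands` could then be passed to `subprocess.check_call` in order.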

For projects like TurboGears, Pylons, or Zope, where there’s a pretty large set of dependencies, it makes good sense for the development team to take on the burden of managing packages and versions for users. This is of course conceptually similar to creating a Linux distribution, but fortunately it’s orders of magnitude easier. We have around 40 packages in the TurboGears 2 index, but Ubuntu has well over 10,000 packages. And in a world of interconnected components, one of the core value-adding propositions of a “megaframework” is the ability to get a pre-vetted set of packages that are known to work together.

And it’s in that light that I’m thinking a time-based release schedule makes sense. It’s not so much that we need to change the core TG2 code all that often, but there are lots of components and interactions, and it seems like there’s value in providing regular “distributions” where we’ve tested everything and proved that it all works together well.

3 Responses to “Setuptools, Package indexes, and releases”


  1. Toshio Kuratomi

    While this is a style of development that has benefits for development releases and closed source software, it has dangers for a quick-moving open source software project. It can be beneficial by ensuring that your users always have access to the latest version of the libraries that your framework needs, but it is also easy to fall into the trap of requiring older versions of libraries than the current ones. The advantage of pegging your software to particular, older, private versions of libraries is that you can distribute a version of your entire software stack whose pieces you have tested as working with each other.

    There are many drawbacks to consider as well, though:

    * When you peg to an older version of an upstream library, you often have to maintain that library yourself, because upstream has moved on to later versions.

    * Upstream fixes take longer to propagate to your local repo.

    * Security fixes become a nightmare for users. Instead of upgrading one copy of a library on a system, the admin of the system or packager has to find and update the library in every project that uses it.

    In the realm of Linux distributions we are very conscious of the drawbacks of this style of development, to the point where some distributions ban bundled versions of libraries in their packages. Instead of promoting more of this, we try to help projects get their code working with the versions of upstream libraries that we ship. We do this in a variety of ways, from testing with new library versions and alerting you to problematic interactions with newer releases, to devoting our man hours to reading the source, patching software, and submitting fixes upstream that work with the versions of the libraries that we ship.

    For an open source project which runs on Linux, the best way to keep your framework updated with other libraries is to engage with your packagers on one or more distributions. The packagers have a vested interest in testing your software with new versions and keeping things running for their users. Working with them to help keep your software running on newer, maintained versions of libraries grows your developer community so that you can have more time for new features while the distributions spend more time porting code for newer versions of libraries. This is a win for everyone as it means the software has a larger base of maintainers who are each working on the things that have the most meaning for them.

  2. Toshio,

    Thanks for the detailed response. I have been thinking about it quite a bit, and I think that perhaps we’re somewhat talking about different things.

    My goal is not to tightly control the dependencies in the packages themselves, but to provide an “index” or “repository” of known working versions. As far as I can tell, this should have no negative impact on Linux distributors who want to repackage our stuff.

    Of course, if people use our “repository” rather than the one from their distribution, and if they use multiple virtual environments, they will have some security and update burden that they would not have if they just used what was in the distro. I want people to understand that trade-off, but I also want people to have that choice, and I want to do everything I can as a core TG dev to minimize the pain of not using the distro package.

    Partly this is because I know that for various reasons people are going to skip the packages and use stuff from us directly. And partly it’s because OS X and Windows are major target platforms for TurboGears, and they just don’t have good package repositories.

    But I also want to have a tested, well known set of working libraries because it’s my hope that this will help distributions to package up our stuff better. Of course distros are free to use newer stuff than we’ve got in our latest index, but if they want a shortcut, looking at the versions we’re using in our index should be a very quick way to find a set of known good eggs.

  3. Re: timed releases: Best. Idea. Ever.

    Django’s development process has never been as focused, productive, or fast as it’s been since we formally scheduled our next release. It’s amazing how committing to a date makes everyone suddenly more productive.

    Now that I’ve seen how well that plan works, I’m never going back. It’s a simply brilliant planning structure for Open Source.
