Archive for the 'TurboGears' Category

A peek at a new Sourceforge.net

So, I’ve been working on sf.net in various ways for about a year now.
http://sourceforge.net/p/. It’s written in Python using modern open source tools, from RabbitMQ, and MongoDB, to Git and Mercurial. And we are committed to making this the most open forge possible. We’re committed, to open processes, open code, and perhaps most importantly open data.

The first thing we did was create some new pages for downloads. Recently we releases a new service designed just for open source project leaders who want to use sf.net as a directory and downloads service.

But, we’re also aware that one of the most important services we provide is project hosting. For the last several months a small group of us have been trying to bring sourceforge.net’s tools into 2010. And now we’re releasing an early preview of those new developer/community tools:

We have a long way still to go, but every long journey begins with a single step, and today’s step is allowing you to try the new forge, to create new projects at:

https://sourceforge.net/register

Where you can go to get a new project, with our new tracker, wiki, git, svn, and other tools. Projects can have subprojects, and links to other tools hosted off site, along with the many features that sf.net brings (free web hosting, hosted apps, etc).

But, why do all this?

In 1999 SourceForge was cool.

It provided all the tools that an open source project needed to get going, from cvs hosting, to bug tracking, and e-mail list support.

They pioneered free free software project hosting, and helped to transform the software development culture from one which barely new about free software or open source, to one where nearly everybody I know uses open license software. Oh sure, some of them might not know it, but they have it on their phones, in their TVs, their wireless routers — not to mention all the websites they use everyday that run on open source.

But, time passed.

More alternatives came out, more projects (including my own) started self hosting, and the landscape of open source software development changed. SourceForge.net took a long time coming out with support for new tools like svn, and then git.

Still, SourceForge has a special place in my heart. Partly it’s nostalgia, I suppose, but I still think:

  • the core mission is still right
  • and there is still a real need

We (Open Source developers) still need tools like git, mercurial, and svn hosting. We still need bug trackers and mailing lists. And in a meeting of other open source project leaders last fall, nearly every single one of them identified the time wasted integrating and administering these tools as one of their most important frustrations.

Not enough…

But, for many sourceforge.net and other free project hosting services were just not good enough, they weren’t scriptable, the weren’t extensible, their data wasn’t portable, and so they felt like they had to take on that cost.

And I fundamentally believe that open source projects live an die by communication, and that sourceforge.net can do something new by integrating the various kinds of “conversations” that happen around the project. We can integrate mailing lists and forums, we can integrate SCM and ticket trackers, etc.

New and improved

So, a couple of us have been quietly working on something new. The new forge is designed around a few core ideas:

  • that data should be portable (every project gets their own database, which they can take with them if they want),
  • that the open source community ought to be able to extend and enhance the tools they need,
  • that integrating and cross linking the various kinds of conversations that open source projects need to have ought to be easier.

So, what we’re announcing today is more of a commitment to getting there on all these things, and a commitment to the “release early, release often” project management strategy.

So, expect us to take your feedback and make things better. Expect us to release lots of small fixes, and expect a few places where things are broken/incomplete because we value feedback more than polish at this point.

Premature optimization

We all know it’s bad. But, programming for performance in reasonable ways is good. So, what’s the difference?

Sometimes we think we know that a piece of code is important so we spend some time optimizing it. And in the end it’s less clear, and less maintainable, and it turns out that our bottlenecks are all elsewhere.

But, sometimes we do know where bottlenecks are going to be, we’ve learned from experience, and we know what needs to be done.

We know that architecture determines performance, and architecture isn’t easily bolted on at the end of the project.

So we have a conundrum. We shouldn’t optimize yet because we don’t know where the bottlenecks will be. We shouldn’t wait to optimize because we can’t easily retrofit a good architecture on a complex system.

Some of the conundrum is only apparent — there’s a difference between architectural problems that need to be set up front, and the kind of low level micro-optimization that obscures more than it helps. But, sometimes these conflicts are real — how do I know if I need a multi-process multi-consumer queue system for PDF generation before we build the system and benchmark it? If you don’t need it, that kind of extra architectural complexity just obscures the bit of code that actually solves the problem.

Solving the problem by going meta

Perhaps the problem really is that we’re dumb and optimize the wrong things at the wrong time. The solution to that problem is to get less dumb. Which means that we ought to spend time optimizing “learning”, both within our project processes, and across projects.

Codifying this learning is what the Patterns of Enterprise Application Architecture book was all about.

And I think it’s great as far as it goes, and if you haven’t read it you should buy it now.

But there are a lot of patterns that I can identify from my last half dozen projects that aren’t covered in PoEAA, so it would be great to see a next generation of books and blog posts that cover the modern architectural trade-offs that you have to make, something that covers some of the paterns of the web.

Scalability via in HTTP, etags, caching, and load balancing (the whole RESTful services argument), networked async processing patterns, etc. Scaling to the public web levels requires a whole different set of architectural principles than scaling to the old “enterprise” levels did, and that knowledge seems very much in flux.

It would be great if it also provided some advice for those of us who’ve moved into what Neil Ford has called the world of the Polyglot Programmer, patterns for coordinating activities across language barriers in a sensible way. That’s part of the nature of modern web systems too.

How do we expand Open Source?

So, one thing which keeps comming up in a bunch of different areas of my life is how we can expand the ethic of Open Source development.

People want TurboGears to do more than it does, they want other open source projects to grow, they want new open source projects in specific areas, and they want Open Source like activity in other professions like nursing or construction.

I definitely don’t have the answers. But I’ve had this conversation with a lot of folks over the last couple of months, and some of them had some great ideas.

So, in the spirit of opening up a larger conversation about these issues, here are a couple of thoughts distilled from all those conversations.

Institutionalizing Open Source Values

It is of course possible to create cultural institutions around which money can be channeled into Open Source development.

And all the legal mechanisms needed to structure those institutions in the right way are available today.

But the trick it seems is to create the institutions in such a way that money is delivered in small enough amounts that individuals remain in control. Money is powerfully persuasive, but one of the keys to the current success of open source is that collective action is always purely voluntary.

But at the same time the money needs to come in large enough amounts to make a difference. People need to be able to support lives and families on the work they do advancing various projects. To the extent that this is reliable income, we can remove competing priorities, and developers will be able to devote themselves more fully to projects that advance the common good.

So, the key to making all of this work is going to be the “bureaucracies” we create to manage the flow of money. They need to be tuned properly to the nature of the work, stable enough to provide a level of personal security, and perhaps above all they need to be financially transparent.

Creating the right kinds of organizational structures will help us channel the right amounts of money to the right people, and creating the wrong kinds will create perverse incentives that pollute the whole system.

Most of what’s been happening so far in this direction are ecosystems of companies built around open source offerings. This has worked pretty well, but it’s clear that there can be conflicts of interest, and the nature of commercial ownership leaves even the best run companies vulnerable to sudden changes (acquisition of small open source companies by huge proprietary competitors is already a fact of life).

But, what seems more interesting to me at this point is the number of foundations that are being are created for popular projects or groups of popular projects, etc.

These institutions will continue to grow, but they have the potential to change the way projects are run, so I expect a lot of fits and starts as we mature.

Open Source for other Professions

With that thought in mind perhaps lawyers, doctors, and other professions already have a form of the Open Source ethic, which has grown up around large institutions, and functions to spread knowledge and advance the state of the art of those groups. These institutions work to create new knowledge, train practitioners, and they seem to work pretty well.

If you haven’t caught on already I think it might be fair to say that this sub-section of these professions is called “academics.” ;)

Of course the university system isn’t perfect, and it’s taken hundreds of years to evolve to it’s current state, but I think it does provide some insight into how we might evolve larger institutional presences around open source, not in the next few years, but in the next few decades.

Python Template languages (Part 1 — Django)

I’ve been thinking a lot about template engines in Python recently. Partly because sourceforge.net’s new python code needed to choose a template language, and there were some questions about why we would choose one over the others.

But beyond that In the past few weeks used Genshi, Mako, Jinja, Django Templates, and Cheetah, and have been looking at, but not yet using out chameleon.genshi.

I figure all this promiscuous template library usage means that I should put my thoughts down somewhere. There are advantages and disadvantages of all these libraries, but I think that the choices are pretty clear once you know your constraints.

I’m not going to commit to covering them all in depth, but I’m going to try to put my thoughts about them down over the next few days.

For today let’s talk about the pros and cons of Django Templates. This is another post that has been developed over the last year or so, where typed stuff up while working on fossfor.us.

Django made making fossfor.us easy in lots of ways. Want threaded comments? Add the existing app in a couple hours — Done! Want OpenID? Again add an app — Done!

But it also had frustrations, and one of the biggest for me was the template language.

Continue reading ‘Python Template languages (Part 1 — Django)’

Things I’ve learned about Time Management

It’s easy enough to say that you don’t have enough time, but the reality is that time is the medium in which we live. Complaining you don’t have enough time very much like a fish complaining that he doesn’t have enough water. So, rather than complaining about the amount of time I have, I’ve been learning to think about my time management issues differently.

Where David Allen lead me wrong

The first insight I had is that Getting Things Done (GTD) has steared me wrong. This is hard to say because I think it’s a great book with a lot of great insights. In particular the strong admonition to get everything you need to do down on paper has changed the way I live and think. That list includes all the major and minor commitments I’ve made, and having it out of my head and in a system that I trust reduced my stress levels imensely.

But I found that I still have a lot of stress. And I found that I was still thrashing back and forth between projects without making the kind of decisive progress on any of them that I wanted to.

Why wasn’t the GTD process enough?

Because, as I eventually learned the GTD “inbox processing” strategy as described in the book is broken. You are supposed to choose between three options for each input: do it, deligate it, or defer it. But really there’s a key fourth option that is the really important one if you’re life is anything like mine.

What is that all important forth option?

  • Don’t do it. Say no. Avoid adding another commitment to your list.

Wait, there’s science behind this!

This is really critical because like any queue that’s processed by a limited resource (in this case my time and attention) filling it too full actually causes the system to break down. This process of breakdown even has a technical name that feels exactly right, it’s called “thrashing.”

Every programmer and computer user knows what this is like. Open too many apps at once, and your machine grinds to a halt, data keeps getting swapped out to disk and the rate at which the machine can process information goes down exponentially.

Ok, so know I knew the name for my problem. Naming it is good, but it’s not the solution.

So, how do I stop thrashing about and get stuff done?

At some point, i’m not exactly sure where, I had a realization that in life, just like a manufacturing line, or a software resource issue, there are two keys to preventing thrashing.

  1. Avoid too much multitasking. A system spends time switching between tasks and is less efficent when time-windows for work are too small, and task-switching happens too often.
  2. When multiple commitments are being made that require real-time or near real-time responses, you have to keep the task-switching window short enough that high-priority tasks can be scheduled immediately. This, of course, creates a tension with the first principle which suggests larger time windows between task-switching.

This is where Getting Things Done’s todo list system provided me with a lot of help. It helped me reduce my task-switching costs, and that’s allowed me to get a better balance on these two pressures.

Letting the work FLOW

I think there’s a very clear analogy with what happened at Toyota in the early days. Manufacturers had die presses which took a long time to change, so they would run them for a long time. This was great in that it helped keep the machine working and limited the downtime. But it also meant that there were long delays in the system since the switching time and the time for the long runs added up. Toyota decided that since they couldn’t afford more machines, they needed to figure out ways to reduce the cost of switching, which reduced total cycle time, made them more responsive to bugs (quality problems) and helped them to get more done in less time.

Reducing the cost of switching made it possible for the system to run differently, rather than batching up lots of work and pushing it through, it Toyota discovered that you could pull what you needed through the system just in time. This same thing works for time management, when you aren’t overloaded you can be more responsive to today’s needs, and you avoid the inevitable mismatches that come with long delays between request and response.

What happens when there’s still too much to do?

But, to come back to my main point, even when you’ve done everything you can do to reduce the task-switching costs, you still can have significant thrashing problems when the resource (that’s you or me) gets overscheduled.

Most projects I’ve worked on ended up in this situation at some point, where working harder stopped producing results because of schedule pressure and resource contention.

This is why saying no is a critical skill.

If you limit the commitments you make, you can provide rapid turn around on the commitements you do make, and everything runs much more smoothly because you’re thrashing less, and wasting less time on task switching.

Beyond the basics, I’ve discovered that one really critical notion is to have enough slack in your system to handle emergencies. If you schedule the system full, any high-priority, high-urgency task that enters the system can break the whole process.

Without slack in the system emergencies snowball…

I know this from experience, as I’ve had my share of emergencies in the last couple of years, and when there was slack in the system things settled down quickly, and when there wasn’t I ended up with new emergencies caused by the first emergency delaying things just a little bit too much.

So, when you are tempted to take on another commitment, think about what would happen to your life if you had to take a week off to deal with a death in the family, or to help a sick relative. If you don’t see any path to recovery, perhaps you’re over-scheduling your most important resource — you.

When No, really means Yes

In the end I discovered the biggest irony of time management: it’s only when you say No to some things, that you have the ability to say Yes, and make real commitments.

When you say no to helping a friend move, or to visiting family, or whatever feels valuable to you you will feel terrible. But, if you don’t say no sometimes to at least some of these things, you’ll end up not being able to do any of those things anyway, because you’re always behind and never quite able to fulfill the commitments you make.

I learned the hard way, hopefully you don’t have to ;)