Archive for the 'Ruby on Rails' Category

Working at SourceForge

I’ve been at SourceForge for a couple of months now, and it’s been great: the work is surprisingly fun and rewarding. There’s a local office, so I actually get to go and hang out with smart people whenever I want. I can still work from home, but having someplace to go in to has been a refreshing change.

I haven’t gotten to know many people outside the engineering team in Dexter, but they are great guys.

There’s lots of good stuff happening here: support for Bazaar, Mercurial, Git, Trac, and other options on SourceForge itself, improved feeds, other APIs for getting at SF data, and so on. But I’m only peripherally aware of all that at the moment because I was hired to work on “totally new stuff” which is written in Python.

What I’m working on

Our first new project is a site called FossFor.Us, and it was the vision for this site, and the team that is working on this and other new stuff, that sold me on coming to work for SourceForge. It’s written in Django, and it’s been my first really large Django project. While the experience has been pretty positive, there have been a number of things that have renewed my commitment to TurboGears development, but that’s a blog post for another day.

The backstory to the FossFor.Us site is that open source project hosting providers (SourceForge and its recent competitors) have traditionally been pulled in two very different directions by two very different sets of users:

  • developers of open source software
  • and people who just want to use that software.

That tension has held us back in the past: we have to serve everybody with the same portal, and it ends up not serving either community as well as it should. But since developers are the most vocal users, it’s been the second class of user that’s been most neglected.


These people are just looking to get things done, and don’t care about the “project” part of open source software; they are, at least at first, only interested in the “product.” In many ways the Free and Open Source Software community has not served these people well. FossFor.Us is, in its first incarnation, an attempt to create a window on the free software world that’s just about finding and using software. But in a larger sense it’s an attempt to help us as a community to connect with potential users better.

I think connecting FOSS geeks and users is actually important

It’s important because people aren’t aware that there are free options, and are paying for software they can’t afford. There’s a prototypical user (based on a real person) that we talk about a lot, who’s a single mom, has an old laptop, and struggles week to week to pay her bills, but who bought Photoshop, because “that’s how you edit photos.” Her family could have used that money to more productive ends, but because she needed to edit photos, and didn’t know about the free alternatives, all those opportunities are just lost.

Of course the same thing is true of small business owners, who could use free software to reduce their “overhead” costs, and actually spend money on creating things people love. Free software has the potential to lubricate the wheels of the economy, encourage entrepreneurial activity, and enrich people’s lives.

All of this is to say I think FossFor.Us is a way to serve the world by making the product of all the open source developers’ labor more easily available and more accessible to real people. And when my mom actually used it to find some software a couple weeks ago, I knew we’d done something right.

REST is a design for the long run

Found this quote in a recent discussion of REST.

REST is software design on the scale of decades: every detail is intended to promote software longevity and independent evolution. Many of the constraints are directly opposed to short-term efficiency.

Threads, Processes, Rails, TurboGears, and Scalability

Threads may not be the best way, or the only way, to scale out your code. Multi-process solutions seem more and more attractive to me.

Unfortunately multi-process and the JVM are currently two tastes that don’t taste great together. You can do it, but it’s not the kind of thing you want to do too much. So, the JRuby guys had a problem: Rails’ scalability story is only multi-process (Rails core is NOT thread safe), and Java’s not so good at that.

Solution: Running “multiple isolated execution environments” in a single java process.

I think that’s a neat hack. The JRuby team is to be congratulated on making this work. It lets Rails mix multi-process concurrency with multi-threaded concurrency, if only on the JVM. But it’s likely to incur some memory bloat, so it’s probably not as good as it would be if Rails itself were to become threadsafe.

I’m not sure that the Jython folks have done anything like this, and I’m not sure they should. It’s a problem Python folks don’t really have. Django used to have some thread-safety issues, but those have been worked out on some level. While the Django people aren’t promising anything about thread safety, it seems that there are enough people using it in a multi-threaded environment to notice if anything’s not working right.

At the same time, TurboGears has been threadsafe from the beginning, as have Pylons, Zope, and many other Python web dev tools. The point is, you have good web-framework options without resorting to multiple Python environments in one JVM.

Why you actually want multi-threaded execution…

In TurboGears we’ve found that the combination of both multi-threaded and multi-process concurrency works significantly better than either one would alone. This allows us to use threads to maximize the throughput of one process up to the point where python’s interpreter lock becomes the bottleneck, and use multi-processing to scale beyond that point, and to provide additional system redundancy.
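To make that combination concrete, here is a minimal sketch using only the Python standard library. It is not how TurboGears itself is deployed; the names handle_request, worker, and run are made up for illustration, and the sketch assumes a Unix-like system where the "fork" start method is available.

```python
import multiprocessing as mp
from concurrent.futures import ThreadPoolExecutor


def handle_request(n):
    """Hypothetical stand-in for handling one request."""
    return n * n


def worker(jobs, results, threads_per_process):
    """One OS process: serve its share of the jobs from a thread pool."""
    with ThreadPoolExecutor(max_workers=threads_per_process) as pool:
        for job, answer in zip(jobs, pool.map(handle_request, jobs)):
            results.put((job, answer))


def run(jobs, processes=2, threads_per_process=4):
    """Split jobs across processes; each process fans out over threads."""
    ctx = mp.get_context("fork")  # assumption: fork() exists on this OS
    results = ctx.Queue()
    # Deal the jobs out round-robin, one chunk per process.
    chunks = [jobs[i::processes] for i in range(processes)]
    procs = [
        ctx.Process(target=worker, args=(chunk, results, threads_per_process))
        for chunk in chunks
    ]
    for p in procs:
        p.start()
    # Drain the queue before joining so no child blocks on a full pipe.
    collected = dict(results.get(timeout=30) for _ in jobs)
    for p in procs:
        p.join()
    return collected
```

Calling run(list(range(8))) returns a dict mapping each job to its result. In a real deployment the threads earn their keep while requests wait on I/O, and the extra processes take over once the interpreter lock becomes the limit.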

A multi-threaded system is particularly important for people who use Windows, which makes multi-process computing much more memory intensive than it needs to be. As my Grandma always said, Windows “can’t fork worth a damn.” ;)

But, given how hard multi-threaded computing can be to get right, TurboGears and related projects work hard to keep our threads isolated and not manipulate any shared resources across threads. So, really it’s kinda like shared-memory optimized micro-processes running inside larger OS level processes, and that makes multi-threaded applications a lot more reasonable to wrap your brain around. Once you start down the path of lock management the non-deterministic character of the system can quickly overwhelm your brain.
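One common way to get that kind of isolation in Python is thread-local storage. The sketch below is only an illustration of the idea, not the actual TurboGears internals; the names request, current_user, handle, and serve are hypothetical.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

# Each thread sees its own independent set of attributes on this object.
request = threading.local()


def current_user():
    """Read per-thread state without passing it through every call."""
    return request.user


def handle(user):
    request.user = user  # visible only to the thread running this call
    time.sleep(0.01)     # let other threads run and set their own value
    return f"hello, {current_user()}"


def serve(users, threads=4):
    """Handle each user on a pool thread; results come back in input order."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        return list(pool.map(handle, users))
```

Because no thread ever touches another thread’s request data, there are no locks to manage, which is the whole point of the “micro-process” way of thinking.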

As far as I can see, the same would be true for a Ruby web server in Ruby 1.9, where there is both OS level thread support and an interpreter lock.

I’m well aware of the fact that Stackless, Twisted, and Nginx have proved that there are other (asynchronous) methods that can easily outperform the multi-threaded+multi-process model in throughput and concurrency per unit of server hardware. The async model requires thinking about the problem space pretty differently, so it’s not a drop in replacement, but for some problems async is definitely the way to go.
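For a feel of how different the async style looks, here is a small sketch using asyncio from the modern Python standard library. That is a swapped-in stand-in for the tools named above, not Twisted or Stackless themselves, and handle, serve, and run are made-up names.

```python
import asyncio
import time


async def handle(n, delay=0.05):
    """Hypothetical handler: the sleep stands in for waiting on a socket."""
    await asyncio.sleep(delay)
    return n * n


async def serve(jobs):
    """Run every handler on one thread; waits overlap instead of queueing."""
    return await asyncio.gather(*(handle(n) for n in jobs))


def run(jobs):
    """Return the results plus the wall-clock time the whole batch took."""
    start = time.perf_counter()
    results = asyncio.run(serve(jobs))
    return results, time.perf_counter() - start
```

With twenty jobs that each wait 0.05 seconds, the batch should finish in roughly the time of a single wait rather than twenty of them, all on one thread and with no locks. The cost is that every function on the call path has to be written in the async style.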

Anyway, hats off to the JRuby team, and here’s hoping that Rails itself becomes threadsafe at some point in the future.

So many revolutions, so little time.

Tim Bray is blogging about “inflection points” in the uptake of various technologies.

Python gets a very positive review:

Today you’d be nuts not to look seriously at PHP, Python, and Ruby.

So, the rise of the so-called scripting languages is one of the inflection points, but it’s not the only one.

He singles out web-framework development as one place where there’s a lot of stuff happening, and a lot of new “rails-like” frameworks are cropping up all the time. TurboGears will live or die in the context of a much larger web-development revolution, and we need to be prepared to make our way forward in the midst of that.

What comes after Rails will not be a Rails clone. It will learn the right lessons from Rails and avoid its pitfalls, but it will also need to carve out something new and better than Rails. For RDBMS users, I think the key difference between TG and Rails is the power and flexibility of SQLAlchemy. We need to “sell” this better.

There are a lot of other revolutions coming according to Tim. And I do think we’re looking at big changes in terms of everything from programming language choice, to web-development tools, to end-user desktops, and data persistence mechanisms. We’re also just beginning to see what the world of high-end JavaScript and other “rich” internet applications is going to do to our view of end-user software.

He doesn’t even mention the rise of EC2 and the Google App Engine as sea-changes in the way we buy computational resources, and I think that’s going to have a huge impact.

In the end my prediction is that the way we develop applications will change more in the next 5 years than it did in the last 5, and it’s time to start getting our heads wrapped around these issues, or we’ll be left behind.

TurboGears 2 sprint

I’ll be hosting a TG2 sprint here in Ann Arbor, Michigan on October 27th. Hopefully we’ll also have a large virtual presence from folks around the world.

I’ve created a page on the Wiki for Sprint Organization tasks:

If the sprint goes well it could get us very, very close to the point where we could reasonably do a TurboGears 2 technology preview release.

The main things we’ll need to do are sync up with the latest Pylons, improve our tests, and do some basic doc organization work (migrate some information from the doc strings to the docs wiki, and create some pages which link to external docs). There are lots of other things I want to do, like improve our user authentication/authorization/registration support, or create a toolbox tool for helping create SQLAlchemy models. So, there are tasks for anybody who can lend a hand.

If you’re available and want to help out (either attending physically or virtually), feel free to add your name to the wiki.

Likewise, if you can host an in-person sprint in another location, please feel free to edit the wiki with that information.