Archive for May, 2008

What’s up with “Mastery” in programming?

The term “Mastery” is kind of loaded with political and historical baggage. When we hear it we might go down the path of masters and slaves, domination and subjugation, and power as a tool to control others. But there’s another, perhaps even deeper, set of associations at work here too: mastery as excellence, as self-discipline, as control turned inward.


What does it mean to become a master developer? It’s not just about intelligence, or experience. It’s also about mastering yourself, training your intuitions, and learning new ways of thinking about yourself and the world. Like mastery in any other discipline, this is no easy process. It requires practice, the willingness to learn from your mistakes, and time.

I look at martial arts, horse training, dance, music, and many other fields and I see institutions designed to help people achieve mastery. I’m not so sure the software development field has yet developed any of these institutional aids. Programs like the MCSD or JCP certifications don’t really encourage mastery; if anything, they encourage lowest-common-denominator thinking.

Yet we do have masters, and we do see evidence of huge productivity differences between the best and worst developers. What we need are better ways of helping people achieve the highest level of mastery they are capable of. Sure, not everybody can become a black belt, but most people can if they are willing to take instruction, practice, and keep working at it long enough. In software we just don’t have the same learning resources.

We can do better. I’m not exactly sure of the details, but I know we can do better.

Some TurboGears Love

I just found this blog post about Paris Envies, and why Jon chose TurboGears for that project. I think we come out looking pretty good ;)

He was well acquainted with Ruby on Rails and Django when he was introduced to TurboGears:

I was sceptical at first, being in love with Django at that time. TurboGears taught me a lot of things and without it I’m sure Paris Envies wouldn’t be where it is today.

His site is very cool and well designed:

This site is a great example of what you can do with TurboGears, since it makes use of lots of TG features and add-ons. Widgets, JSON support, the user Registration module, and lots of other components are used. And the great thing is that he’s been able to rapidly add new features, and has been very happy with the flexibility of the TurboGears framework.

I made the Paris Envies mobile website in two days, no more. This included tests, integration of the WURFL mobile phone database (to get screen sizes) and Google Maps Static (I was using Yahoo Static Maps first, but google ones are much better ;)).

All in all, I would say this seems like a ringing endorsement of TurboGears.

I can code really faster with TurboGears than with any other framework, but when I say faster, I mean it.

And of course all this was done with TurboGears 1, which is still viable, still competitive, and still very much supported. Sure, we’re working hard on TurboGears 2 to provide many more industrial-strength solutions, and to make getting started even easier. But we’re also committed to maintaining and growing the TurboGears 1 platform at the same time. Unlike some other web frameworks out there, we’re not abandoning 1.0 in favor of 2.0. Sure, we’ll eventually phase out TG1 support when people don’t want it anymore, but that’s a long way off, and I don’t think shafting existing users is a path towards future success.


Prolonged adolescence is not a new problem; it’s just new to the masses:

Children of kings and great magnates were the first to grow up out of touch with the world. Suburbia means half the population can live like kings….

Paul Graham

You can’t shelter people from everything bad or scary, and expect them to live in the real world.

Project managers, system administrators, and parents should take note of this.

People can only step up and take responsibility when they actually know what’s going on. Seems to me that there are lessons for how we talk to people about project risks, how we handle e-mail spam problems, and how we think about IT services. It’ll be a while before I figure out what exactly all of those lessons are….

What is data?

Ocean asks on his blog: is data an asset?

Data is certainly not like many other assets: it doesn’t depreciate, you can copy it endlessly, and it’s next to impossible to imagine a commodities market for data. Heck, copying the data can either increase its value (think “The Da Vinci Code”) or decrease it (think passwords). People don’t pay for data as much as they pay for human attention. You can use data to get attention, or you can use attention to collate, assimilate, and otherwise transform raw data into useful information, but either way data needs people to understand and interpret it to become valuable.

So at best:

data + human_understanding == value

Bruce Schneier takes it one step further, calling data the pollution of the information age.

Data Pollution

Data sucks up space, time, and human attention. But more than that, data can be parsed, manipulated, and transformed to fit various agendas. And in a world where data about all of us is “owned” by various large corporations, from Amazon, to Google, to Enron, it’s not always clear how that data will be used. Besides which, millions of credit card numbers are stolen from various companies who store our data “in good faith.” Data costs money in terms of maintenance, in terms of storage, and in terms of liability. Heck, I know people who work for companies that have an e-mail retention policy, which is really more of a mandatory e-mail deletion policy.

Polluted Data

And that assumes that all that data is verifiably true, and that’s definitely not the case. I sold a car once and the new owner didn’t take it to the DMV to get it registered before his friend drove it without a license and got it impounded. And that showed up on my credit report for years. I have a friend who somehow ended up “deceased” even though she’s still very much alive and well.

All of this is to say that we, as software developers, IT managers, and companies in general, need to think a lot more about data, and to invest in some better terms for the various different things we call data.

We need to differentiate between raw data, information, and knowledge. We need to help our customers think about the life cycle of the data they want us to capture. We need to educate people about the costs and benefits associated with keeping data, and ultimately we need to follow the mantra:

Think before you store

And if you’re concerned about privacy and individual liberty, please take a few minutes and read Bruce’s article.

Threads, Processes, Rails, TurboGears, and Scalability

Threads may not be the best way, or the only way, to scale out your code. Multi-process solutions seem more and more attractive to me.

Unfortunately, multi-process and the JVM are currently two tastes that don’t taste great together. You can do it, but it’s not the kind of thing you want to do too much. So the JRuby guys had a problem: Rails’ scalability story is multi-process only (Rails core is NOT thread safe), and Java’s not so good at that….

Solution: Running “multiple isolated execution environments” in a single Java process.

I think that’s a neat hack. The JRuby team is to be congratulated on making this work. It lets Rails mix multi-process concurrency with multi-threaded concurrency, if only on the JVM. But it’s likely to incur some memory bloat, so it’s probably not as good as it would be if Rails itself were to become threadsafe.

I’m not sure that the Jython folks have done anything like this. And I’m not sure they should. It’s a solution to a problem Python folks don’t really have. Django used to have some thread-safety issues, but those have been worked out on some level. While the Django people aren’t promising anything about thread safety, it seems that there are enough people using it in a multi-threaded environment to notice if anything’s not working right.

At the same time, TurboGears has been threadsafe from the beginning, as have Pylons, Zope, and many other Python web dev tools. The point is, you have good web-framework options without resorting to multiple Python environments in one JVM.

Why you actually want multi-threaded execution…

In TurboGears we’ve found that the combination of multi-threaded and multi-process concurrency works significantly better than either one would alone. This allows us to use threads to maximize the throughput of one process up to the point where Python’s global interpreter lock becomes the bottleneck, and to use multiple processes to scale beyond that point and to provide additional system redundancy.
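Here’s a rough sketch of that threads-inside-processes shape, using only the Python standard library. This is not how TurboGears itself is deployed (a real setup would more likely run several server processes behind a proxy); the function names `handle_request`, `run_worker`, and `run_cluster`, and the pool sizes, are made up for illustration.

```python
import os
import threading
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def handle_request(n):
    """Hypothetical stand-in for one request handler; touches no shared state."""
    return (os.getpid(), threading.current_thread().name, n * n)


def run_worker(requests, threads_per_process=4):
    """Inside one process: spread the requests over a small thread pool."""
    with ThreadPoolExecutor(max_workers=threads_per_process) as pool:
        return list(pool.map(handle_request, requests))


def run_cluster(requests, processes=2, threads_per_process=4):
    """Across processes: hand each worker process its own slice of requests."""
    slices = [requests[i::processes] for i in range(processes)]
    with ProcessPoolExecutor(max_workers=processes) as pool:
        chunks = pool.map(run_worker, slices, [threads_per_process] * processes)
        return [item for chunk in chunks for item in chunk]


if __name__ == "__main__":
    # Each line shows which process and which thread served the request.
    for pid, thread_name, value in run_cluster(list(range(8))):
        print(pid, thread_name, value)
```

The threads keep a single process busy while requests are waiting on I/O; the extra processes are what get you past the interpreter lock and give you a spare if one of them dies.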

A multi-threaded system is particularly important for people who use Windows, which makes multi-process computing much more memory intensive than it needs to be. As my Grandma always said, Windows “can’t fork worth a damn.” ;)

But given how hard multi-threaded computing can be to get right, TurboGears and related projects work hard to keep threads isolated and to avoid manipulating any shared resources across threads. So really it’s kinda like shared-memory-optimized micro-processes running inside larger OS-level processes, and that makes multi-threaded applications a lot more reasonable to wrap your brain around. Once you start down the path of lock management, the non-deterministic character of the system can quickly overwhelm your brain.
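One common way to get that kind of isolation in Python is thread-local storage. The sketch below is only an illustration of the idea, not TurboGears internals: `request_state`, `handle_request`, `render_greeting`, and `serve` are hypothetical names, and the only standard-library pieces are `threading.local` and `ThreadPoolExecutor`.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Each thread sees its own independent set of attributes on this object.
request_state = threading.local()


def render_greeting():
    """Reads per-thread state; no lock is needed because nothing is shared."""
    return "Hello, %s!" % request_state.user


def handle_request(user):
    """Stores request data on the thread-local, then calls deeper code."""
    request_state.user = user
    return render_greeting()


def serve(users, threads=4):
    """Handle every request on a small thread pool, preserving input order."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        return list(pool.map(handle_request, users))


if __name__ == "__main__":
    print(serve(["ann", "bob", "cy"]))
```

Because every thread only ever reads what it wrote itself, there is nothing to lock, and each request can be reasoned about as if it were running alone.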

As far as I can see, the same would be true for a Ruby web server in Ruby 1.9, where there is both OS-level thread support and an interpreter lock.

I’m well aware of the fact that Stackless, Twisted, and Nginx have proved that there are other (asynchronous) methods that can easily outperform the multi-threaded plus multi-process model in throughput and concurrency per unit of server hardware. The async model requires thinking about the problem space pretty differently, so it’s not a drop-in replacement, but for some problems async is definitely the way to go.
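To give a feel for how different the async style looks, here is a minimal sketch using Python’s standard-library `asyncio` as a stand-in. It is not Twisted or Stackless code, and `handle_request`, `serve`, and `run` are hypothetical names; the `asyncio.sleep` call just simulates waiting on I/O.

```python
import asyncio


async def handle_request(n):
    """Simulated handler: await hands control back to the event loop
    while this request's (pretend) I/O is pending."""
    await asyncio.sleep(0.01)
    return n * n


async def serve(requests):
    """Run all handlers concurrently on a single thread."""
    return await asyncio.gather(*(handle_request(n) for n in requests))


def run(requests):
    """Synchronous entry point that starts an event loop for one batch."""
    return asyncio.run(serve(requests))


if __name__ == "__main__":
    print(run(range(5)))
```

All the waiting overlaps on one thread, with no locks and no thread pool, but every function on the call path has to be written to cooperate with the event loop, which is why it’s not something you can bolt onto an existing threaded app.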

Anyway, hats off to the JRuby team, and here’s hoping that Rails itself becomes threadsafe at some point in the future.