Archive for the 'Lean IT' Category

Premature optimization

We all know it’s bad. But, programming for performance in reasonable ways is good. So, what’s the difference?

Sometimes we think we know that a piece of code is important so we spend some time optimizing it. And in the end it’s less clear, and less maintainable, and it turns out that our bottlenecks are all elsewhere.

But sometimes we do know where the bottlenecks are going to be: we’ve learned from experience, and we know what needs to be done.

We know that architecture determines performance, and architecture isn’t easily bolted on at the end of the project.

So we have a conundrum. We shouldn’t optimize yet because we don’t know where the bottlenecks will be. We shouldn’t wait to optimize because we can’t easily retrofit a good architecture on a complex system.

Some of the conundrum is only apparent — there’s a difference between architectural problems that need to be set up front, and the kind of low level micro-optimization that obscures more than it helps. But, sometimes these conflicts are real — how do I know if I need a multi-process multi-consumer queue system for PDF generation before we build the system and benchmark it? If you don’t need it, that kind of extra architectural complexity just obscures the bit of code that actually solves the problem.
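To make that trade-off concrete, here is a minimal sketch of the same job written both ways. Everything in it is an assumption for illustration: `render_pdf` is a hypothetical stand-in for the real rendering work, and the queued version uses threads rather than separate processes just to keep the example short.

```python
import queue
import threading


def render_pdf(doc_id):
    # Hypothetical stand-in for the real PDF rendering work.
    return f"pdf-for-{doc_id}"


def render_all_direct(doc_ids):
    # The simple version: just the code that solves the problem.
    return [render_pdf(d) for d in doc_ids]


def render_all_queued(doc_ids, consumers=4):
    # The queued version: same result, many more moving parts.
    jobs = queue.Queue()
    results = {}
    lock = threading.Lock()

    def consume():
        while True:
            doc_id = jobs.get()
            if doc_id is None:  # sentinel: no more work for this consumer
                break
            pdf = render_pdf(doc_id)
            with lock:
                results[doc_id] = pdf

    workers = [threading.Thread(target=consume) for _ in range(consumers)]
    for w in workers:
        w.start()
    for d in doc_ids:
        jobs.put(d)
    for _ in workers:
        jobs.put(None)  # one sentinel per consumer
    for w in workers:
        w.join()
    # Return results in the same order the documents were requested.
    return [results[d] for d in doc_ids]
```

Both functions return the same list, which is the point: unless a benchmark says the direct version is too slow, the queue, the sentinels, and the lock are all overhead for whoever reads the code next.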

Solving the problem by going meta

Perhaps the problem really is that we’re dumb and optimize the wrong things at the wrong time. The solution to that problem is to get less dumb. Which means that we ought to spend time optimizing “learning”, both within our project processes, and across projects.

Codifying this learning is what the Patterns of Enterprise Application Architecture book was all about.

And I think it’s great as far as it goes, and if you haven’t read it you should buy it now.

But there are a lot of patterns that I can identify from my last half dozen projects that aren’t covered in PoEAA, so it would be great to see a next generation of books and blog posts that cover the modern architectural trade-offs that you have to make, something that covers some of the patterns of the web.

Scalability via HTTP, ETags, caching, and load balancing (the whole RESTful services argument), networked async processing patterns, etc. Scaling to public web levels requires a whole different set of architectural principles than scaling to the old “enterprise” levels did, and that knowledge seems very much in flux.
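As one small example of what such a pattern looks like, here is a rough sketch of ETag-based conditional responses. The function names and the tuple-shaped response are my own assumptions for illustration, not any framework’s API, and real `If-None-Match` handling also has to deal with lists of tags and weak validators, which this skips.

```python
import hashlib


def make_etag(body: bytes) -> str:
    # Derive a strong validator from the response body (assumed scheme).
    return '"' + hashlib.sha256(body).hexdigest() + '"'


def respond(body: bytes, if_none_match=None):
    # Returns (status, headers, payload) as a plain tuple.
    etag = make_etag(body)
    if if_none_match == etag:
        # The client already holds this exact representation: send no body.
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag}, body
```

The first request gets a 200 with the full body and an `ETag` header. If the client sends that tag back in `If-None-Match` and nothing has changed, the server answers 304 with an empty payload, which is where the bandwidth savings come from.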

It would be great if it also provided some advice for those of us who’ve moved into what Neal Ford has called the world of the Polyglot Programmer: patterns for coordinating activities across language barriers in a sensible way. That’s part of the nature of modern web systems too.

Working at SourceForge

I’ve been at SourceForge for a couple of months now, and it’s been great; the work is surprisingly fun and rewarding. There’s a local office, so I actually get to go and hang out with smart people whenever I want. I can still work from home, but having someplace to go in to has been a refreshing change.

I haven’t gotten to know many people outside the engineering team in Dexter, but they are great guys.

There’s lots of good stuff happening here: support for Bazaar, Mercurial, Git, Trac, and other options on SourceForge itself, improved feeds, other APIs for getting at SF data, etc. But I’m only peripherally aware of all that at the moment because I was hired to work on “totally new stuff” which is written in Python.

What I’m working on

Our first new project is a site called FossFor.Us, and it was the vision for this site, and the team that is working on this and other new stuff, that sold me on coming to work for SourceForge. It’s written in Django, and it’s been my first really large Django project. While the experience has been pretty positive, there have been a number of things that have renewed my commitment to TurboGears development, but that’s a blog post for another day.

The backstory to the FossFor.Us site is that open source project hosting providers (SourceForge and its recent competitors) have traditionally been pulled in two very different directions by two very different sets of users:

  • developers of open source software
  • and people who just want to use it.

And that tension has held us back in the past. We have to serve everybody with the same portal, and it ends up not serving either community as well as it should. But since developers are the most vocal users, it’s been the second class of user that’s been most neglected.

These people are just looking to get things done, and don’t care about the “project” part of open source software; they are, at least at first, only interested in the “product.” In many ways the Free and Open Source Software community has not served these people well.

FossFor.Us is, in its first incarnation, an attempt to create a window on the free software world that’s just about finding and using software. But in a larger sense it’s an attempt to help us as a community to connect with potential users better.

I think connecting FOSS geeks and users is actually important

It’s important because people aren’t aware that there are free options, and are paying for software they can’t afford. There’s a prototypical user (based on a real person) that we talk about a lot, who’s a single mom, has an old laptop, and struggles week to week to pay her bills, but who bought Photoshop, because “that’s how you edit photos.” Her family could have used that money to more productive ends, but because she needed to edit photos, and didn’t know about the free alternatives all those opportunities are just lost.

Of course the same thing is true of small business owners, who could use free software to reduce their “overhead” costs, and actually spend money on creating things people love. Free software has the potential to lubricate the wheels of the economy, encourage entrepreneurial activity, and enrich people’s lives.

All of this is to say I think FossFor.Us is a way to serve the world by making the product of all the open source developers’ labor more easily available and more accessible to real people. And when my mom actually used it to find some software a couple weeks ago, I knew we’d done something right.

What is data?

Ocean asks on his blog: is data an asset?

Data is certainly not like many other assets: it doesn’t depreciate, you can copy it endlessly, and it’s next to impossible to imagine a commodities market for data. Heck, copying the data can either increase its value (think “The Da Vinci Code”) or decrease it (think passwords). People don’t pay for data as much as they pay for human attention. You can use data to get attention or you can use attention to collate, assimilate, and otherwise transform raw data into useful information, but either way data needs people to understand and interpret it to become valuable.

So at best:

data + human_understanding == value

Bruce Schneier takes it one step further, calling data the pollution of the information age.

Data Pollution

Data sucks up space, time and human attention. But more than that, data can be parsed, manipulated, and transformed to fit various agendas. And in a world where data about all of us is “owned” by various large corporations, from Amazon, to Google, to Enron, it’s not always clear how that data will be used. Besides which, millions of credit card numbers are stolen from various companies who store our data “in good faith.” Data costs money in terms of maintenance, in terms of storage, and in terms of liability. Heck, I know people who work for companies who have an e-mail retention policy, which is really more of a mandatory e-mail deletion policy.

Polluted Data

And that assumes that all that data is verifiably true, and that’s definitely not the case. I sold a car once and the new owner didn’t take it to the DMV to get it registered before his friend drove it without a license and got it impounded. And that showed up on my credit report for years. I have a friend who somehow ended up “deceased” even though she’s still very much alive and well.

All of this is to say that we as software developers, IT managers, and companies in general need to think a lot more about data, and to invest in some better terms for the various different things we call data.

We need to differentiate between raw data, information, and knowledge. We need to help our customers think about the life cycle of the data they want us to capture. We need to educate people about the costs and benefits associated with keeping data, and ultimately we need to follow the mantra:

Think before you store

And if you’re concerned about privacy and individual liberty, please take a few minutes and read Bruce’s article.

The motivational meeting…

Last week, I ranted a little bit about motivational meetings. Today I’ll make the opposite case.

Why have motivational meetings?

The right way to use motivational meetings is to reaffirm the purposes of the group, and help people to connect the dots between their individual efforts and the collective goals of the group, and to connect those goals with their own individual aspirations.

Basically, motivating people is easy:

  • Give them work that is meaningful to them and to the organization
  • Treat them with respect

Treating people with respect includes paying them a fair wage, and not doing any of these things.

Among other things it also means not letting people who aren’t contributing to the common goals of the organization hold back the group by not doing their job.

Research has shown that one of the survey questions most highly correlated with motivation and performance is:

Does my supervisor, or someone at work, seem to care about me as a person?

Which is another way of asking whether your boss respects you. At the same time, the single highest correlation for any question was:

I get to do what I do best every day at work.

So, it’s really important to align people’s intrinsic skills and internal long-term motivational drivers with the work you ask them to do.

If you’re not doing those two things, motivational meetings are a loss. If you are doing them you can use a meeting to remind people of how their deeper motivations are connected to what they are doing now.

P.S. My info on the top questions and their correlation to performance comes from Gallup research via the very interesting book First, Break All the Rules, which is one of the best, and most evidence-based, books on managing for exceptional performance I’ve read.

So many revolutions, so little time.

Tim Bray is blogging about “inflection points” in the uptake of various technologies.

Python gets a very positive review:

Today you’d be nuts not to look seriously at PHP, Python, and Ruby.

So, the rise of the so-called scripting languages is one of the inflection points, but it’s not the only one.

He singles out web-framework development as one place where there’s a lot of stuff happening, and a lot of new “Rails-like” frameworks are cropping up all the time. TurboGears will live or die in the context of a much larger web-development revolution, and we need to be prepared to make our way forward in the midst of that.

What comes after Rails will not be a Rails clone. It will learn the right lessons from Rails and avoid the pitfalls of Rails, but it will also need to carve out something new and better than Rails. For RDBMS users, I think the key difference between TG and Rails is the power and flexibility of SQLAlchemy. We need to “sell” this better.

There are a lot of other revolutions coming according to Tim. And I do think we’re looking at big changes in terms of everything from programming language choice, to web-development tools, to end-user desktops, and data persistence mechanisms. We’re also just beginning to see what the world of high-end JavaScript and other “rich” internet applications is going to do to our view of end-user software.

He doesn’t even mention the rise of EC2 and the Google App Engine as sea-changes in the way we buy computational resources, and I think that’s going to have a huge impact.

In the end my prediction is that the way we develop applications will change more in the next 5 years than it did in the last 5, and it’s time to start getting our heads wrapped around these issues, or we’ll be left behind.