Archive for the 'writing' Category

Premature optimization

We all know it’s bad. But, programming for performance in reasonable ways is good. So, what’s the difference?

Sometimes we think we know that a piece of code is important so we spend some time optimizing it. And in the end it’s less clear, and less maintainable, and it turns out that our bottlenecks are all elsewhere.

But, sometimes we do know where bottlenecks are going to be, we’ve learned from experience, and we know what needs to be done.

We know that architecture determines performance, and architecture isn’t easily bolted on at the end of the project.

So we have a conundrum. We shouldn’t optimize yet because we don’t know where the bottlenecks will be. We shouldn’t wait to optimize because we can’t easily retrofit a good architecture on a complex system.

Some of the conundrum is only apparent — there’s a difference between architectural problems that need to be set up front, and the kind of low level micro-optimization that obscures more than it helps. But, sometimes these conflicts are real — how do I know if I need a multi-process multi-consumer queue system for PDF generation before we build the system and benchmark it? If you don’t need it, that kind of extra architectural complexity just obscures the bit of code that actually solves the problem.

Solving the problem by going meta

Perhaps the problem really is that we’re dumb and optimize the wrong things at the wrong time. The solution to that problem is to get less dumb. Which means that we ought to spend time optimizing “learning”, both within our project processes, and across projects.

Codifying this learning is what the Patterns of Enterprise Application Architecture book was all about.

And I think it’s great as far as it goes, and if you haven’t read it you should buy it now.

But there are a lot of patterns that I can identify from my last half dozen projects that aren’t covered in PoEAA, so it would be great to see a next generation of books and blog posts that cover the modern architectural trade-offs that you have to make, something that covers some of the paterns of the web.

Scalability via in HTTP, etags, caching, and load balancing (the whole RESTful services argument), networked async processing patterns, etc. Scaling to the public web levels requires a whole different set of architectural principles than scaling to the old “enterprise” levels did, and that knowledge seems very much in flux.

It would be great if it also provided some advice for those of us who’ve moved into what Neil Ford has called the world of the Polyglot Programmer, patterns for coordinating activities across language barriers in a sensible way. That’s part of the nature of modern web systems too.

How to write a better book (or just better docs!)

A lot of people tell me that they want to write a technical book for one reason or another. And I think that’s a great goal that can really stretch you as a communicator, as a programmer, and as a human being — so go for it. But if you’re thinking about it, I’d suggest that you learn from a couple of my mistakes. ;)

People might tell you that writing technical books sucks because you don’t make much money. (Which is true, as far as it goes). Or they may tell you that writing books sucks because it’s hard work. Or they might tell you how much time you spend away from those you love. And those things are true. But I don’t regret any of those things about writing the TurboGears book.

I do however, have a couple of process related regrets, and I’ve felt for a long time that I needed to write an article to codify some of the things I’ve learned about writing, so that prospective book authors and open source framework/library documenters have a shot at avoiding some of my rookie mistakes.

The two most important things that I learned from writing the TurboGears book were:

Book

  • Every single line of code needs to be tested, not just before it goes in the book, but every time you make changes. If you don’t do this code will get broken in the process of last minute reorganization, rewrites, and crazy insanity.
  • It’s better to take time to do it right, than to rush something out the door that’s not what people need.

The testing issue is the most critical thing about book writing and it comes in two parts — both of which are far too easy to ignore. First code needs to be tested to make sure that: it runs, it does the right thing, and it makes sense. The first two tests are automatable, and really need to be automated. Refactoring, and rewriting are fundamental to making good code and good books, and you can’t confidently refactor without tests. And since I think book authors should be testing the code to make sure it makes sense, but getting target audience readers to read-and-understand it and making it shockingly easy for them to provide feedback, it’s likely that lots of refactoring opportunities will come up.

Unfortunately, though the Pragmatic Press people have one, as do many, many authors, I’m not aware of a single openly available tool which is designed to testing book-code easy. And I think this is a shame because even if you’re not writing books, every open source library needs documentation, and most of them need tutorial style documentation which requires the same basic tools. So, I’m hoping that some of us can join forces to get a tool like this started at the PyCon Sprints next week.

There have been two approaches to the problem:

  1. Suck code from external source-code into the document itself.
  2. Take code from the document, and mark it up with a list of external resources needed to test it.

Based on my unscientific results it looks like the first approach is more popular than the second. But the second approach has one very significant advantage — all of the code is visible while you’re writing the text and therefore you are less likely to have “refactoring” bugs that cross the text/code boundary (a method name is changed in the code, but not the text that describes it).

With that said, there are a number of compelling advantages of the suck-in code method. First, it’s relatively language independent. You just need to define what comments you’ll use to mark off code in the project (to add formatting, and mark the beginning and ending boundaries) and create a simple structure that runs the native language tests, and then builds the document. You may need to adjust things slightly for languages with different commenting conventions. And it certianly seems like multi-language support would be a lot harder to achieve when pulling code out of the document.

Also, I’m very much a believer in the idea that both the source code and the document-text source should be in an plain text format that’s easy to keep in version control, easy to track and easy to diff. I also want to be able to use the same editor for both my document source and my source code.

But in order to mitigate the possibility of the kind of “refactoring” problems I mentioned a minute ago, we ought top make it really easy to create rendered documents. I suppose you could work in two windows with the source-document in one and the rendered version in a second, but it would be even better if you could leave the “processing directives” that grab the code above the rendered-source in a plain text document, and then mark the end of the code samples in the rendered document, so that a document could be safely edited (while looking at the source) and then re-rendered at will.

If you’ve got an internal toolchain you think might be valuable as a reference for us, please let me know. And if you’ve got a couple of days and want to contribute to making Open Source documentation better, while making it easier to write good technical books, feel free to drop in (in person, or virtually) to the TurboGears sprint at PyCon next week and we’ll see what we can do.

http://docs.turbogears.org/SprintOrganization