mod_wsgi anybody?

I’ve been meaning to blog about WSGI more lately. I still find Pythonistas who don’t know what WSGI is, and others who think it is some big scary J2EE-like specification which should be avoided at all costs.

I don’t think there’s any good reason to be scared; it’s actually pretty simple to understand and use. But that post will have to wait for another day. For now, I just wanted to mention that I caught wind of an interesting mod_ project for Apache which should interest WSGI users.

That’s right, now there’s a mod_wsgi project, being written by Graham Dumpleton.

I recently caught wind of mod_wsgi on the Hex Dump blog and the TurboGears mailing list. It’s not released yet, but it seems like a good idea to me.

Oh, and the “performance estimates” comparing cgi, mod_python and mod_wsgi look nice too.

Here’s hoping that it actually gets released next month as planned. If we could actually get ISPs to load this module, it would make the “commodity hosting” deployment options for modern Python web apps much better. Graham’s original post to the mod_python list about how he sees mod_wsgi fitting in with commodity Apache hosting is well worth the read.

So, here’s hoping that it’s a success.
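
P.S. As a rough illustration of the “pretty simple” claim above: a WSGI application is just a callable that takes the request environment and a `start_response` function, and returns an iterable of byte strings. The sketch below is mine, not anything taken from mod_wsgi; it assumes Python 3 and uses only the standard library’s `wsgiref`. The names `application`, `call_app` and `serve`, and the port number, are illustrative choices.

```python
from wsgiref.simple_server import make_server
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """A minimal WSGI app: read the environ, send a status, return bytes."""
    body = b"Hello, WSGI!\n"
    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def call_app(app, path="/"):
    """Invoke a WSGI app directly, with no web server involved.

    Hypothetical helper for illustration: it builds a fake request
    environ and captures what the app passes to start_response.
    """
    environ = {"PATH_INFO": path}
    setup_testing_defaults(environ)  # fills in the remaining required keys
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], captured["headers"], body


def serve(port=8000):
    """Run the app on wsgiref's development server (port is arbitrary)."""
    with make_server("127.0.0.1", port, application) as httpd:
        httpd.serve_forever()
```

Calling `call_app(application)` exercises the app without any server at all, and `serve()` runs it on a small development server. Because the server only ever sees the callable, the same app can be moved between hosting options without being rewritten.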

9 Responses to “mod_wsgi anybody?”


  1. Jon

    It’s funny how mod_python people always compare mod_python to CGI. And only CGI. And the mod_wsgi people are continuing this by comparing mod_wsgi to mod_python and CGI, leaving other solutions (such as FastCGI, SCGI, and reverse-proxying) by the wayside.

    The fact is that embedding the python interpreter in Apache the way mod_python does is a bad idea for application deployment. It makes upgrading harder. It makes each Apache process take more memory, which makes it harder to scale. It makes it impossible to separate the Apache process with its associated user privileges from the webapp process and its associated privileges. And it makes things difficult to debug, because you cannot break a mod_python app down into individual pieces; a FastCGI app can be tested with simpler servers such as Lighttpd to find out where the problem lies, or an app running with its own mini HTTP server can be accessed directly via the browser. As the main responder in freenode’s #python-web, I have seen the debugging problems mod_python inflicts far too often.

    My preferred method is to run my WSGI apps via a mini HTTP server, such as Paste’s httpserver or CherryPy’s wsgiserver, and then use mod_proxy in Apache to proxy to it. I use DJB’s daemontools to keep my app process running, but I could use supervisor or any other similar tool. I can upgrade my Apache process without restarting my app, and vice versa. If I wanted to, I could run my app in a chrooted or other similarly locked-down environment without restricting Apache at all. It means more freedom and, coincidentally, higher performance. (I don’t have statistics for this, unfortunately, but I have anecdotal evidence from several people.)

    Mark Mayo also has some interesting things to say about various methods of running webapps: http://www.vmunix.com/mark/blog/archives/2006/01/02/fastcgi-scgi-and-apache-background-and-future
    His arguments have significantly influenced my views; I believe you may find them similarly persuasive.

    Footnote:
    I will admit that I consider mod_python to be superior to the plain-CGI approach that many $3 hosts restrict you to, and if mod_wsgi can be successfully marketed to these hosts, it will be a boon to Python. But my worry is that mod_wsgi/mod_python will be presented as the ONLY approach for all scales of deployment, the way mod_python already is being presented by some people. There are alternatives, and my contention is that some of the alternatives are so much better that the mod_python community does users a disservice not to mention them.

  2. Jon

    Minor note: the channel name in my previous comment is actually #python.web, not #python-web

  3. Jon,

    Thanks for the extensive feedback.

    Currently I use mod_proxy, or mod_rewrite and a separate HTTP server, for all of my Python web application hosting. Like you, I have become progressively disillusioned with mod_python. And my experience with FastCGI and SCGI hasn’t been all that great either.

    So, HTTP based solutions seem to me to be the best option by far.

    But mod_wsgi provides another option, which is likely more easily deployed/controlled by ISPs. And there are applications that I have which I think would run well enough in this environment on cheap shared servers. But my options here at the moment are seriously constrained.

    If we could get to the point where we didn’t have to support TurboGears deployments using mod_python, because mod_wsgi were commonly available, we’d have a much simpler, more unified deployment story to tell.

    It’s not so much that I think mod_wsgi is the best possible solution, it’s just that I look forward to the simplicity, and relative ease of administration when compared with the existing troubles we see our users have with mod_python.

    And, as for performance, HTTP based solutions add enough overhead that it’s not entirely clear to me that we can make an a priori (sorry for the philosophy speak, it just means “before experience”) determination that HTTP based solutions are going to outperform mod_wsgi. So, let’s wait and see.

    But what I’m excited about is mod_wsgi as a simpler, better (for my purposes) replacement for mod_python, not as a replacement for mod_proxy.

  4. Graham Dumpleton

    In respect of Jon’s comments on the mod_wsgi performance figures, the only reason that the mod_wsgi performance estimates page doesn’t give a comparison to other Apache based solutions such as mod_fastcgi and mod_scgi is due to lack of time. I wasn’t even going to post performance figures until much later but Mark’s comments (Hex Dump Blog) forced my hand a bit. The reason for the lack of time was that I was about to go on holiday, which I am still on at this time, and would not have access to my own computers to do further work and would only have infrequent net access.

    Knowing that I would get criticised for not providing comparisons to more options than I did, I intentionally provided figures for serving up static files with the same Apache web server configuration. I also scaled all the figures against a normalised figure for the static files thus allowing the figure for static files to be used as a baseline. With such a baseline it would then be easier for others to try and do some meaningful comparisons against other options using their own computers by determining what figure they get for static files and thereby scaling their results correspondingly so they could be compared against what I provided.

    Given the time I had I believe I did as much as I could to provide some figures that could actually be compared meaningfully to something else. All too often you see figures for some web application serving technology, with the publisher of those figures professing it to be the fastest thing on earth, yet they don’t provide any baseline reference so comparisons can be done to evaluate how true that might be. I at least tried to avoid that. If people are still not happy, they’ll have to wait a few more weeks until I can do testing with mod_fastcgi and mod_scgi. BTW, there are no mod_wsgi people, it is just me. :-)

    As to the comments saying that the way that mod_python embeds the Python interpreter into Apache is bad, I can’t quite see the basis of some of the arguments presented. The only one that really has a good sound basis is that mod_python can’t provide a way of running a hosted Python application as a specific user different to that which Apache runs as, or for the application to run within a chroot’d environment.

    This is a known tradeoff with using mod_python and can well be an issue in a web hosting environment where the web hosting company doesn’t use operating system virtualisation to separate users and thus there can potentially be distinct applications running under the same Apache instance which are written by different customers.

    Although mod_fastcgi, mod_scgi or proxying solutions may provide a means of allowing distinct applications to be run in separate spaces as different users and optionally within a chroot’d environment, the fact is that web hosting companies who are providing really cheap solutions aren’t necessarily going to want to use those alternatives either, as they have their own trade-offs: the cost of setting them up and maintaining them could be seen as being much higher.

    Thus for cheap web hosting environments where they are trying to host as many users as possible on as little hardware as possible, mod_python isn’t acceptable because it doesn’t isolate users enough so they don’t use that, but at the same time the other solutions may involve more effort and cost to look after and manage and so they will not necessarily go for that either. So neither solution is really going to be acceptable for this case of web hosting environment and thus why you see at most only CGI being offered.

    Don’t get me wrong here and think that I am dismissing mod_fastcgi, mod_scgi and proxying options as valid alternatives as I am not. I don’t know who Jon thinks the mod_python people are, but on the mod_python mailing lists, where people specifically come to get help on mod_python, we will still point people at such other alternatives, thus steering them away from mod_python where their requirements mean that mod_python isn’t a reasonable option. So even though the lists may be for mod_python, we aren’t that biased that we don’t recognise that other solutions may be more appropriate for some people.

    Now, even though configuration of mod_wsgi may be as simple as CGI, because it works in a similar way to mod_python, the inability to separate users’ applications is going to continue to be an obstacle for getting it accepted in cheap web hosting environments. As such, although it could feasibly be used in such environments, mod_wsgi is not going to be promoted as being a good option in those cases either and so one will still probably only ever see CGI as the only option there.

    Overall, I’d say that trying to improve the lot for Python developers in that specific area is probably a waste of time. Web hosters in that area want what is cheapest, easiest to manage and where it is difficult for one user to interfere with another. Purely because of the power that using Python gives you, it is always going to be very hard to come up with a solution which would meet their requirements.

    On the issue of debugging of mod_python applications, experience on the mod_python mailing lists is that a fair number of people who have problems (beyond installation issues) and end up asking questions on the list are simply lazy, can’t be bothered to read what documentation does exist and thus are ignorant of how things work, or are ignorant of general debugging techniques or of specific things that can be done when dealing with HTTP applications to debug issues. Poor documentation doesn’t necessarily help either.

    A lot of the time people don’t even try the most basic step of instrumenting code with debug statements to print out messages marking progress through code, thus allowing them to see where their code gets to and what values things may have at certain points. Even if people do, they don’t even bother to try printing out stuff in the Apache request object to understand what it is that the browser may be sending.

    Instead you get people simply posting their code and expecting someone to tell them why it doesn’t work, without even properly describing the problem or what they may have tried already themselves to debug the problem. Sometimes they don’t even post their code or the errors they are getting and just give a vague question of why something may not always work.

    The inability of people to apply such basic debugging techniques on their own is more often the biggest problem and this will not change no matter which application development environment is being used.

    That said, once you get past these sort of issues, yes the fact that everything runs within Apache can make it hard to split out bits for separate testing, but it doesn’t mean it isn’t possible. There are various people on the mod_python lists who manage to do unit testing on their applications outside of Apache by creating fake request objects and infrastructure to inject them into their applications.

    If one really needs to look at step by step progress of an application and using logging to display progress is not sufficient, one can always use the Python interactive debugger module (pdb). I’d say that most would be ignorant that this can be done though. That it is necessary to run Apache in single process mode does though make it inconvenient and it would thus not necessarily be an option for debugging an application running within a web hosting environment where the user has no control of the web server itself. Similar problems will exist with any web server that daemonises itself as you will need to be able to force it to run in the foreground instead.

    With WSGI now becoming popular, no matter whether one uses mod_python, mod_fastcgi, mod_scgi or a proxying solution, debugging should be easier because one can swap out the web server and use another within which it is easier to perform debugging. To that end, mod_wsgi can be seen as being preferable to mod_python because it will require WSGI compliance and thus the user has that option to debug under a different web server.

    On the issue of memory use of mod_python and scalability, I can’t really see the issue here. Even if one uses something like mod_fastcgi you are going to have similar issues. A lot of these issues are dealt with by how you configure Apache and whether you use prefork or worker MPMs and how many child processes/threads you allow to be created, plus whether you try and use a single machine or load balance everything across multiple machines.

    Unfortunately most people don’t seem to really understand the options that may exist, or perhaps don’t even understand how mod_python actually works. There was a recent discussion thread on comp.lang.python which in part shows how prevalent the problem is of people not understanding how Apache/mod_python work together. The thread basically started out with someone complaining how bad mod_python was and why solutions such as mod_fastcgi and mod_scgi were better. In the end after everything was explained to him as to why his ideas about how things worked were wrong, the only thing left was the user and chroot issue mentioned above. These latter issues weren’t though even something he was necessarily complaining about.

    What was disappointing though was that some of those trying to explain how things worked, were themselves posting incorrect information. I see such wrong information being posted every so often on various forums and in documentation for some of the large Python frameworks. One posts followups correctly explaining how things work and asking documentation to be amended, but quite a lot of the time they still don’t fix the documentation and so the incorrect information continues to propagate.

    Anyway, enough, I have filled up my bit of spare time on my holiday and my need for a rant. I will say one more thing though. There is no intention on my part to promote mod_wsgi as the one and only solution, nor do I believe that mod_python is the solution either. Personally I think the whole mod_python code base should be done over again. At the moment mod_python is a poor solution when compared to what one can do with mod_perl and thus isn’t as good as it could be for those wanting to control and work closely with everything that Apache has to offer. As a result, it really stands as an application solution by itself rather than being what it should be, which is a way to extend Apache. So, I don’t necessarily have that much love for mod_python either; to me it is just a tool, like everything else, and I’ll use whatever makes best sense for the task at hand.

    BTW, Jon, if you are always helping people with mod_python stuff you might want to just direct them to the mod_python mailing lists where we deal with such questions all the time. Personally I don’t recollect ever seeing you posting on the mod_python forums and I can’t see anything in the archives either.

  5. Jon

    Graham, I’ve replied privately to you via email, since I didn’t figure I should be cluttering up this blog with a response.

    (On the other hand, if you _want_ to see my response, Mark, let me know.)

  6. I run apache behind nginx and mod_wsgi is *perfect* for that purpose!

  7. Agreed, Mark — we hope it’s a success too! We’ve used it at microPledge.com for some time now, and it’s simple and fast.

    BTW, if anyone likes what Graham’s doing with mod_wsgi, feel free to donate via microPledge — he’s set up a donations project for mod_wsgi here:

    http://micropledge.com/projects/modwsgi

    And no, we’re not in league with him. :-) We just like what he’s doing and the help he gives the mod_wsgi and Python communities.

  8. Hello,
    If you need a quickstart instruction on how to get turbogears working with mod_wsgi and apache2 check out this website. It is really easy to get things started.
    http://lucasmanual.com/mywiki/TurboGears#head-36b7eef1526da4fe58c73738c925f34f6bc93c1d
    Lucas