The latest versions of Ubuntu no longer seem to ship the tool udevinfo, which is vital for finding information about devices connected to the computer.

There is, however, a new tool called udevadm, and with a little syntax trick you can get it to spit out your familiar udevinfo syntax:

udevadm info -a -p `udevadm info -q path -n /dev/sdb`

shows:

Udevadm info starts with the device specified by the devpath and then
walks up the chain of parent devices. It prints for every device
found, all possible attributes in the udev rules key format.
A rule to match, can be composed by the attributes of the device
and the attributes from one single parent device.

  looking at device '/devices/pci0000:00/0000:00:13.2/usb2/2-1/2-1:1.0/host5/target5:0:0/5:0:0:0/block/sdb':
    KERNEL=="sdb"
    SUBSYSTEM=="block"
    DRIVER==""
    ATTR{range}=="16"
    ATTR{ext_range}=="256"
    ATTR{removable}=="1"
(...)

  looking at parent device '/devices/pci0000:00':
    KERNELS=="pci0000:00"
    SUBSYSTEMS==""
    DRIVERS==""

If you use this more often and don't like the idea of entering a huge line of code for such a simple command, drop the following into your .bashrc file (all in one line):

udevinfo () { udevadm info -a -p `udevadm info -q path -n "$1"`; }

Now (after starting a new session or typing source ~/.bashrc), a simple udevinfo /dev/sdb will do the trick.

Also helpful: A long time ago, I wrote a blog post about udev rules, showing what rules I used at the time to have consistent device names for my USB drives, no matter in what order I connect or disconnect them. The devices I mention there are long gone, but I keep going back to that post every time I need to write a new udev rule.
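
For illustration, a rule composed from attributes like the ones above might look like this (the serial number and symlink name are invented for this sketch, and the file name is just a common convention); it would go into a file such as /etc/udev/rules.d/10-local.rules:

# Always expose the first partition of this particular USB drive as /dev/backupdrive
KERNEL=="sd?1", SUBSYSTEMS=="usb", ATTRS{serial}=="0123456789AB", SYMLINK+="backupdrive"

With that rule in place, the partition shows up as /dev/backupdrive, no matter if the kernel happens to call it sdb1 or sdc1 that day.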

Read more…

I am delighted to report that in the project I am currently working on, I get to play with a lot of smiley faces:

Hope that lightens up your day a little :)

(Yes, you guessed right: For every smiley page there is an equivalent with a frowny face, but sheesh, don't tell anyone ;))

Read more…

When you look at the bottom of your GMail window, you'll notice links in the footer, cycling through more or less helpful tips as well as Google advertisements.

For years now, one of these links has been to the GMail Notifier for Mac:

Sadly though, this link to http://mail.google.com/mail/help/notifier/index.html (which forwards to http://toolbar.google.com/gmail-helper) has, for a long, long time now, led to a "not found" error. As you can't open a bug report with Google, I have emailed the GMail service people about this before, but I guess dead links in production software are not at the top of their todo list.

Ah well, maybe they google for "Gmail fail" sometimes and find the bug report this way ;)

Read more…

Everybody knows Mozilla makes Firefox. But there is a lot more software at work here at Mozilla that you might not be aware of. For example: What happens when you go to getfirefox.com and click on the download button?

By clicking on the button, you ask our servers to send you a specific file, for example: Firefox 3.6.3, for Windows, in German. On a small website, the server would just fetch the file and hand it to you. But if you need to handle millions of downloads a day like we do, a single server can't handle it all by itself, so it gets more complicated. In order to provide you with downloads, updates, etc., as fast and conveniently as possible, Mozilla collaborates with a number of mirror providers that have volunteered to host Firefox and other downloads on our behalf, thus sharing the load of our numerous downloads between a number of servers all over the world.

For some years now, we have been running a bundle of software called "Bouncer" to handle our downloads for us.

Bouncer consists of three components: the user-facing bounce script, an administrative interface called Tuxedo, and a mirror checker called Sentry.

First, the bounce script. It is the only component the "ordinary user" gets to interact with. After you click on a download link, it essentially does the following (a concrete example request follows this list):

  • It determines if the product you asked for exists.
  • Out of our list of mirrors, it picks one that has your file. Initially, it would pick one at random. Over the years, the logic has become more elaborate: it now takes into account what country you are currently in, as well as how strong the mirrors are (stronger mirrors serve more downloads, weaker ones serve fewer).
  • A split-second later, Bouncer refers you to the server it decided on, and that server will send you the file you asked for.
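
What that looks like in practice: the download button points at Bouncer rather than directly at a file. A request like the following (the product version and locale are just examples) asks Bouncer for a build and shows the redirect:

$ curl -I 'http://download.mozilla.org/?product=firefox-3.6.3&os=win&lang=de'

The interesting part of the response is the Location header, which points at the mirror Bouncer picked for you.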

But wait, there is more! How does Bouncer know what products are available, for what operating systems, and in what languages? That's where the admin interface comes in. We have a release engineering team who work hard every day to deliver the newest software versions to you in handy little packages. Previously, during every release, an engineer would manually tell Bouncer that a new version was available for download. But just last week, we improved this process by introducing a new interface to Bouncer, with a project called Tuxedo. The release engineering team can now, fully automatically, feed new versions into Bouncer at the time of release, with no manual intervention. With less time spent on repetitive tasks, we can spend more time making Firefox awesome.

Finally, the Sentry component is a script that periodically checks the health of our mirrors and adjusts our settings accordingly. This is to ensure that situations where you are forwarded to a mirror that is currently unavailable are very, very rare. So far, these mirror checks happen from Mozilla Headquarters, and therefore reflect the connectivity we get to the mirrors from here. In the future, we want to improve that by also taking our users' connectivity to the specific mirrors into account (for the geeks out there: network proximity != geographical proximity), which has the potential to result in faster download times, lower costs for mirror providers, and general happiness.

As you can see, there are a lot of things happening behind the scenes before Firefox makes its way onto your computer at home, and we are constantly working on improving the way we are doing things. Plus, as always: Bouncer is completely open source, and we have a public bug tracker, so if you notice any problems or see room for improvement, make sure to let us know.

Photo credit: "directions", CC-by licensed by Phillie Casablanca.

Read more…

After Apple and Microsoft (finally!) publicly announced that they are ready to pull the plug on Adobe Flash, the first makers of Flash web apps are starting to ditch it in favor of HTML5: as Techcrunch writes, Scribd, an online document hosting service, will focus its efforts on HTML5 from now on.

Scribd co-founder and chief technology officer Jared Friedman tells [Techcrunch]: “We are scrapping three years of Flash development and betting the company on HTML5 because we believe HTML5 is a dramatically better reading experience than Flash. Now any document can become a Web page.”

I am very pleased to hear that. Now that web standards are finally offering the kind of versatility modern web applications need, it is a fantastic development that companies are getting rid of the monster that is Flash. That's good for the user for so many reasons, and it's a great example of what HTML5 can really do.

Update: Ryan points out in the comments that Scribd has a demo document online of what this is going to look like. It's fantastic!

By the way: Another company I would like to see getting rid of Flash (in fact, I never understood why they used it in the first place) is slideshare. They are turning into a de-facto standard for posting presentation slides online, but as of yet, their main UI is solidly in Flash's claws. :(

Read more…

On a growing number of projects at Mozilla, we use a tool called Hudson that runs a complete set of tests on the code with every check-in. The beauty of this is that if you accidentally break something, you (and everyone else) will know immediately, so you can fix it quickly. We also use a bunch of plugins with Hudson, one of which assigns points to every check-in: if all tests pass, you get a positive number of points; if you broke something, you get a negative score.

An innocent little commit of mine gained me a whopping -100 points (yes, that is minus 100) today.

How did that happen? The build broke badly, but not because I wrote a pile of horrendous code or didn't test before committing. In fact, I've made it a habit to commit like this:

./manage.py test && git push origin master

This fun little one-liner will result in my code being pushed to the origin repository if and only if all tests pass.

So in my case, all tests passed locally, then broke horribly once the server ran them again. After a little research, it turned out that when I deleted a now-unneeded Python file, I did not remove its compiled cousin, the .pyc file, along with it. Sadly, this module was still imported somewhere else, and because Python still found the .pyc file locally, it did not mind the original .py file being gone, so all tests passed. On the server, however, with a completely clean environment, the file wasn't found, which made dozens of tests fail (all of them throwing an ImportError).
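
The effect is easy to reproduce (mymodule is a made-up name here; the point is that Python happily imports a .pyc whose .py has disappeared):

$ echo 'x = 42' > mymodule.py
$ python -c 'import mymodule'   # runs fine, and writes mymodule.pyc
$ rm mymodule.py                # the source file is gone ...
$ python -c 'import mymodule'   # ... but the stale .pyc still satisfies the import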

What's the lesson? In the short term, I should wipe my .pyc files before running tests. One way to do that would be adding something like

find . -type f -name '*.pyc' | xargs rm

to my ever-growing commit one-liner, but a more general solution might want to perform this inside the test running script. On the other hand, since that script is written in Python, some of the imports that could break have already been performed by the time the script runs.
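
Put together, the extended one-liner might look something like this (a sketch; I use find's -delete flag rather than xargs so an empty result doesn't trip up rm):

find . -name '*.pyc' -delete && ./manage.py test && git push origin master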

In general, run your tests in as clean an environment as possible. While any useful test framework will take care of your database having a consistent state for every test run, you also need to ensure that you start with a pristine baseline of your code -- especially if Hudson, the merciless butler, will rub it in your face when you don't ;) .

Read more…

A few days ago, a colleague of mine mentioned that the font I was using on my blog looked borderline ugly on Linux. Here's a screen shot:

As you can see, the uneven glyphs make it look goofy and certainly hard to read. The problem was that I used a font that seems to be present on many Mac and Windows computers, but was unavailable on my colleague's Linux box. His browser tried to substitute it with a different font -- with limited success.

So I decided to use a nifty little web feature called @font-face that allows me to define and embed my desired fonts into the website. Ideally, every browser on every platform will download the fonts I am using, and display my blog the way it is intended to look. The fonts I am using now are called Goudy Bookletter 1911 (for the headings) and Droid Serif (for the text).
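
In case you want to do something similar, the CSS boils down to a few lines like these (the file paths are placeholders, not my actual setup):

@font-face {
    font-family: 'Droid Serif';
    src: url('/fonts/DroidSerif-Regular.ttf');
}
body {
    font-family: 'Droid Serif', Georgia, serif;
}

The browser downloads the font file if the font is not installed locally, and the fallback fonts after the comma keep the page readable if that download fails.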

I hope you like the new fonts and find them pleasant to read. If you notice any problems, however, please let me know!

Thanks for the hint, Lars, and thanks to all commenters for providing valuable feedback!

Read more…

Today, Mozilla is starting the public process on revising its signature code license, the Mozilla Public License or MPL. Mitchell Baker, chair of the board of the Mozilla Foundation and author of the original MPL 1.0, has more information about the process on her blog.

The discussion is happening on the website mpl.mozilla.org, which looks something like this:

I am happy about this for a number of reasons. Of course, I made the website (the design is borrowed from mozilla.org), so I am naturally happy to see it being available to a wider audience.

But I also hope that the revision process itself will be successful. While the MPL has been a remarkable help in the success of Mozilla's desktop projects, it is unpleasant (to say the least) to use in web applications, for a number of reasons:

The hideous license block. The MPL is a file-based license: it allows any file in the project, even in the same directory, to be licensed differently. Therefore, each MPL-licensed code file must carry a comment block of more than 30 lines at the top. For big code modules, that's fine. For web applications, whose files often have only a handful of lines, this balloons the whole code base and makes files horribly unreadable. Sadly, the current license only allows an exception from that rule if the block is impossible "due to [the file's] structure", which would essentially only be the case if that file type did not allow comments.

The copyleft. This one is debatable, but it's a fact that some open source communities (one prominent example being the Python community) do not appreciate strong copyleft provisions. While the MPL (unlike the GNU GPL) does not have a tendency to "taint" other code, it is still not compatible with the BSD or MIT licenses' notion of "take it and do (almost) whatever you please with it". (As you may have noticed, the file-based MPL is both a curse and a blessing here.) I hope that the revision process can make it clearer how this applies to hosted applications (i.e., mostly web applications).

I am excited to see what the broad community discussion will bring to light over the next few months.

Read more…

Update: The author of pdftk, Sid Steward, left the following comment:

A new version of pdftk is available (1.43) that fixes many bugs. This release also features an installer [for] OS X 10.6. Please visit to learn more and download: www.pdflabs.com.
This blog post will stick around for the time being, but I (the author of this blog) advise you to always run the latest version so that you can enjoy the latest bug fixes.

OS X Leopard users: Sorry, neither this version nor the installer offered on pdflabs.com works on OS X before 10.6. You might be able to compile from source though. Let us know if you are successful.


Due to my being a remote employee, I get to juggle PDF files quite a bit. A great tool for common PDF manipulations (changing page order, combining files, rotating pages, etc.) has proven to be pdftk. Sadly, a current version for Mac OS X is not available on their homepage. In addition, it is annoying (to say the least) to compile, which is why, last time I checked, all three third-party package management systems I know of (MacPorts, Fink, and Homebrew) either did not have it at all or had broken versions.

Now I wouldn't be a geek if that kept me from compiling it myself. I took some hints from anoved.net who was nice enough to also provide a compiled binary, but sadly did not include the shared libraries it relies on.

Instead, I made an installer package that installs pdftk itself, as well as the handful of libraries it needs, into /usr/local. Once you have run it, you can open Terminal.app, and typing pdftk should greet you as follows:

$ pdftk
SYNOPSIS
       pdftk <input PDF files | - | PROMPT>
            [input_pw <input PDF owner passwords | PROMPT>]
            [<operation> <operation arguments>]
            [output <output filename | - | PROMPT>]
            [encrypt_40bit | encrypt_128bit]
(...)
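
To give you an idea of the everyday manipulations I mentioned above, here are two typical invocations (the file names are made up, of course):

$ pdftk in1.pdf in2.pdf cat output combined.pdf     # combine two files into one
$ pdftk in.pdf cat 1-12 14-end output out.pdf       # drop page 13 from a document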

You can download the updated package here: pdftk1.41_OSX10.6.dmg

(MD5 hash: ea945c606b356305834edc651ddb893d)

I only tested it on OS X 10.6.2; if you use it on older versions, please let me know in the comments whether it worked.

Read more…

As mentioned earlier, I dove a little into the world of non-relational databases for web applications. One of the more interesting ones seems to be MongoDB. By the way, a video of the presentation I attended is now online as well.

MongoDB not only seems to be "fully buzzword-compatible" ("MongoDB (from 'humongous') is a scalable, high-performance, open source, schema-free, document-oriented database."), it also looks like an interesting alternative storage backend for web applications, for various reasons I'd like to outline here.

Note that I haven't worked with MongoDB extensively, nor have any of the significant web applications I have worked on used non-relational databases yet. So you are very welcome to point out anything I got wrong in the comments.

First, some terminology: Schema-free and document-oriented essentially means that your data is stored as a loose collection of items in a bucket, not as rows in a table. Different items in the bucket can be uniform (in OOP-terms, instances of the same class), but they needn't be. In MongoDB, if you access a non-existent object, it'll spring into existence as an empty object. Likewise for a non-existent attribute.

How can that help us? Web applications have a much faster development cycle than traditional applications (an observation reflected, for example, in the recent development changes on AMO). With every feature change, related database changes have to be applied equally frequently, each time write-locking the site for up to several minutes, depending on how big the changes are. In a schema-free database, code changes can be rolled out smoothly and can start using new fields right away, on the affected items only. For example, in MongoDB, adding a nickname to the user profiles would be trivial, and every user object that never had a nickname before would simply be assumed to have an empty one. The tedious task of keeping the database schema in sync between development and deployment basically goes away entirely.
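
In the mongo shell, the nickname scenario looks roughly like this (collection and field names are made up for this sketch):

> db.users.insert({name: "veteran"})                      // stored before nicknames existed
> db.users.insert({name: "rookie", nickname: "newbie"})   // stored after the feature shipped
> db.users.find({nickname: "newbie"})                     // matches only the second document

No ALTER TABLE, no migration script: the old document simply has no nickname field, which the application can treat as an empty nickname.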

In traditional databases, we have gotten accustomed to the so-called ACID properties: Atomicity, Consistency, Isolation, Durability. By relaxing these properties, unconventional databases can score performance-wise, because less locking and less database-level abstraction is needed. Some exemplary ACID relaxations that I gathered about MongoDB are:

  • MongoDB does not have transactions, which affects both Atomicity and Isolation. This will let other threads observe intermediate changes while they happen, but in web applications that is often not a big deal.
  • MongoDB relies on eventual consistency, not strict consistency. That means that when a write occurs and the write command returns, we cannot be 100% sure that, from that moment on, all other processes will see only the updated data; they will merely see the changes eventually. This affects caching, because we can't invalidate and re-fill our caches immediately, but again, in web applications it's often not a big deal if updates take a few seconds to propagate.
  • Durability is also relaxed in the interest of speed: As we all know, accessing RAM takes a few nanoseconds, while hitting the hard drive is easily many thousands of times (!) slower. Therefore, MongoDB won't make sure your data is on the hard drive immediately. As a result, you can lose data that you thought was already written if your server goes down in the period between writing and actual storing to the hard drive. Luckily, that doesn't happen too often.

As you see, if our application is not a banking website and we are willing to part with some of the guarantees that traditional databases offer, we can use a database like MongoDB, which fits the way modern web applications are developed much more closely than regular RDBMSes do. Whether that's an option is something every project needs to decide on a case-by-case basis.

Read more…