There's an excellent xkcd web comic about slacking off while compiling code, but of course, I don't usually compile code, because I code in Python.

In the wake of a little server outage here at Mozilla, here's my version of the comic:

(Based on the above xkcd web comic. Licensed under a CC by-nc license.)

Read more…

Surely, you've heard of many fancy new features that HTML5 and related technologies bring to the Web now and in the future: open video on the web, canvas, transitions, what have you.

But sometimes it's the smallest things that have the biggest impact. Besides these hyped features, HTML5 also introduces a number of semantic form fields. Before, the only textual input the web knew was, well, plain text. It was up to the web application developer to enforce certain rules around that, like making sure the input is a number, or not empty, or even a valid website address (URL).

Firefox 4 understands these new input types and helps the user by enforcing correct values even before the user submits the form. Handling validation on the client enables a consistent form validation UI across websites and keeps the user from repeatedly submitting forms and waiting for the server-side validation to pass or fail. (NB: This does not relieve developers of performing server-side checks to ensure the security of their web application.)
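To make the NB concrete, here is a minimal sketch of what the matching server-side checks might look like in Python 3. The field names (name, rating, website) and the error messages are made up for illustration; they are not taken from the Firefox Input site.

```python
from urllib.parse import urlparse


def validate_form(data):
    """Return a dict mapping field names to error messages (empty if valid).

    The fields are hypothetical: 'name' is required, 'rating' must be a
    whole number, and 'website' must be an http(s) URL.
    """
    errors = {}

    # Required field: must not be empty or whitespace only.
    if not data.get('name', '').strip():
        errors['name'] = 'This field is required.'

    # Number field: must parse as an integer.
    try:
        int(data.get('rating', ''))
    except ValueError:
        errors['rating'] = 'Enter a whole number.'

    # URL field: needs an http(s) scheme and a host part.
    try:
        parsed = urlparse(data.get('website', ''))
        url_ok = parsed.scheme in ('http', 'https') and bool(parsed.netloc)
    except ValueError:
        url_ok = False
    if not url_ok:
        errors['website'] = 'Enter a valid URL.'

    return errors
```

The browser can reject bad input early, but the server still runs the same rules on whatever actually arrives.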

Here is what this looks like in a recent prototype of the Firefox Input site:

Another fun little feature, also pictured, is the placeholder text attribute. The grayed-out placeholder in a text box shows you an example of what you might enter into this field. Rather than explaining correct values in a huge label or a side note next to the field, developers can show their users much more easily what data they would like them to enter into the form fields.

All of this makes for fewer mistakes entering data into web forms, which is beneficial to both the user (getting the job done faster) and the developer (collecting better data). Win-win!

For much more detail on HTML5 forms, placeholders, validation, etc., take a look at Mark Pilgrim's excellent Dive Into HTML5. Also, don't miss out on Anthony Ricaud's in-depth description of HTML5 forms in Firefox on the Mozilla Hacks blog.

Read more…

Note: In case this is not clear, this blog post reflects my own opinion and experience, and is not an official statement on behalf of my employer.

Since I started working at Mozilla headquarters, my job interview volume has drastically increased -- both in person and by phone. As I had rarely conducted any job interviews before that, it was as much a learning experience for me as it was a critical part of their job search for the applicants. Initially, I was perhaps just as nervous as the interviewee.

The experience has been mixed: Sometimes, candidates come up with surprising answers, elegant solutions to simple problems -- or even just show that they have grasped a problem and its solutions from beginning to end.

Perhaps just as often, however, the experience is also increasingly frustrating. One of the worst things to notice is that the candidate can't code. Seriously? Yes, seriously. Given a relatively simple programming question, they either fail to come up with a solution at all or, worse, produce one that reveals shocking gaps in their knowledge of basic algorithms, understanding of code, or other qualities you would hope for in someone who makes software for a living and claims to have done so for a while.

Another thing that disappoints me in candidates is when it turns out they don't have professional goals. Sure, you can't be prepared for every question in the world. But if you are looking for a job, yet you are unsure where you want to go if you do get the position, how can this possibly convince me that you are the right person for the job?

For our web development team, this makes me wonder: Do people consider "web" in front of "developer" to be a weakening qualifier? Or more generally, does software engineering have a secret reputation of not being a real profession, with real professional requirements?

Mozilla is perhaps one of the few companies without a strict degree requirement: Some of our brightest minds have no formal college education, yet are incredibly successful and valued members of our community. The reason why this works out is that the Mozilla project, with many more volunteers than employees, is a meritocracy: If you do awesome things, people will respect you and turn to you for more awesomeness. If you aren't doing a good job, your impact on the community will be minute (and stay that way).

Maybe, though, this is sometimes mistaken as an excuse not to be really, really good. It's not about knowing everything there is about writing software. But if they are applying for one of the best positions in the industry, where they have to earn their respect rather than show off their formal credentials, shouldn't they at least try a little harder?

If I could give a few tips to applicants across the Web industry, to perhaps raise their chances of getting a job, and to improve the interview experience for both interviewer and interviewee, I would say:

  • Know your basics. There is a reason they teach algorithms and data structures very, very early in college. And complexity and logic. The works. Even more so, know your Web. If you can't explain the nature of an HTTP request, how can you know your own applications by heart? If you don't know how to secure a web application, how can you protect the privacy of your users?
  • Know why you're awesome. When an interviewer goes home that day, they want to feel excited about hiring you. Give them a reason. That's no invitation to be snotty -- but one to not hold back on exciting things there are to know about you professionally.

What's your experience with interviewing and hiring people? What are the things you want to see in an applicant? And what have you done to find the people who are right for you? I am interested in hearing your opinions in the comments.

Read more…

Cryptographic hash functions play an important part in application security: Usually, user passwords are hashed and stored in the database. When someone logs in, their input is hashed as well and compared to the database content. A weak hash is almost as bad as no hash at all: If someone steals (part of) your user database, they can analyze the hashed values to detect the actual password--and then use it, without the owner's knowledge, to log into your application on their behalf.

As part of a proactive web application security model, it is therefore important to stay ahead of the game attackers play and use sufficiently strong hashing to store passwords. Since cryptanalysts are spending great efforts on breaking these algorithms (with the help of increasingly fast and cheap computers), SHA-1 is by now considered only borderline in strength. Not a good position to be in if you want to write future-proof apps.

Django (our web app framework of choice at Mozilla) does not support anything stronger than its default, SHA-1, and has, in the past, WONTFIXed attempts to increase hash strengths, citing strong backwards compatibility as the reason. As long as Django targets Python 2.4 as its lowest common denominator, this is unlikely to change. Writing a full-blown, custom authentication backend for the purpose is an option (the Mozilla Add-ons project chose to do that), but it seemed like overkill to me, given that with the exception of the hash strength, Django's built-in authentication code works just fine.

So I decided to monkey-patch their auth model at run time, to add SHA-256 support to my application (while staying backwards-compatible with older password hashes possibly existing in the database).

The code is simple, and I made an effort to keep it as uninvasive as possible, so that it can be removed easily in case Django ever does get support for stronger hashes down the road. Let me know what you think: (Embedded from a Gist on Github).
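For readers who can't see the gist, here is a minimal, framework-free sketch of the underlying idea in Python 3. It is not the patch itself. It assumes passwords are stored as "algorithm$salt$hexdigest", with the digest computed over salt + password, which is my understanding of the format Django's auth model used at the time. All function names below are hypothetical.

```python
import binascii
import hashlib
import hmac
import os

# Algorithms we accept when verifying; new hashes always use the first one.
SUPPORTED = ('sha256', 'sha1')


def hexdigest(algorithm, salt, raw_password):
    """Hash salt + password with the named algorithm."""
    data = (salt + raw_password).encode('utf-8')
    return hashlib.new(algorithm, data).hexdigest()


def make_password(raw_password, algorithm='sha256'):
    """Build a new 'algorithm$salt$hexdigest' string with a random salt."""
    salt = binascii.hexlify(os.urandom(8)).decode('ascii')
    return '$'.join([algorithm, salt, hexdigest(algorithm, salt, raw_password)])


def check_password(raw_password, stored):
    """Verify a password against a stored hash, old (SHA-1) or new."""
    algorithm, salt, expected = stored.split('$', 2)
    if algorithm not in SUPPORTED:
        return False
    # Constant-time comparison of the two hex digests.
    return hmac.compare_digest(hexdigest(algorithm, salt, raw_password), expected)


def needs_upgrade(stored):
    """True if the stored hash still uses a legacy algorithm."""
    return not stored.startswith('sha256$')
```

Because the algorithm name travels with each hash, old SHA-1 entries keep verifying, and a successful login is a natural moment to re-hash the password with the stronger algorithm.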

Read more…

A while ago, I had to import some HTML into a Python script and found out that—while there is cgi.escape() for encoding to HTML—there did not seem to be an easy or well-documented way for decoding HTML entities in Python.

Silly, right?

Turns out, there are at least three ways of doing it, and which one you use probably depends on your particular app's needs.

1) Overkill: BeautifulSoup

BeautifulSoup is an HTML parser that will also decode entities for you, like this:

from BeautifulSoup import BeautifulSoup
soup = BeautifulSoup(html, convertEntities=BeautifulSoup.HTML_ENTITIES)

The advantage is its fault-tolerance. If your input document is malformed, it will do its best to extract a meaningful DOM tree from it. The disadvantage is, if you just have a few short strings to convert, introducing the dependency on an entire HTML parsing library into your project seems overkill.

2) Duct Tape: htmlentitydefs

Python comes with a list of known HTML entity names and their corresponding unicode codepoints. You can use that together with a simple regex to replace entities with unicode characters:

import htmlentitydefs, re
mystring = re.sub('&([^;]+);', lambda m: unichr(htmlentitydefs.name2codepoint[m.group(1)]), mystring)
print mystring.encode('utf-8')

Of course, this works. But I hear you saying, how in the world is this not in the standard library? And the geeks among you have also noticed that this will not work with numerical entities. While &copy; will give you ©, &#169; will fail miserably. If you're handling random, user-entered HTML, this is not a great option.
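If you do want to stay on the duct-tape route, the numeric-entity gap can be closed with a slightly longer callback. The following is a sketch written for Python 3, where the module is called html.entities and unichr() is simply chr(); the function name decode_entities is my own. Unknown or malformed entities are left untouched instead of raising an exception.

```python
import re
from html.entities import name2codepoint  # Python 2: htmlentitydefs

# Matches named (&copy;), decimal (&#169;) and hexadecimal (&#xA9;) entities.
_ENTITY_RE = re.compile(r'&(#[xX]?[0-9a-fA-F]+|\w+);')


def decode_entities(text):
    """Replace HTML entities in text with the characters they stand for."""
    def replace(match):
        body = match.group(1)
        try:
            if body[:2].lower() == '#x':
                return chr(int(body[2:], 16))   # hexadecimal reference
            if body.startswith('#'):
                return chr(int(body[1:]))       # decimal reference
            return chr(name2codepoint[body])    # named entity
        except (KeyError, ValueError, OverflowError):
            # Unknown name or invalid number: keep the original text.
            return match.group(0)
    return _ENTITY_RE.sub(replace, text)
```

With this, decode_entities('&copy; 2010') and decode_entities('&#169; 2010') both produce the same string, and something like '&bogus;' passes through unchanged.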

3) Standard library to the rescue: HTMLParser

After all this, I'll give you the option I like best. The standard lib's very own HTMLParser has an undocumented function unescape() which does exactly what you think it does:

>>> import HTMLParser
>>> h = HTMLParser.HTMLParser()
>>> s = h.unescape('&copy; 2010')
>>> s
u'\xa9 2010'
>>> print s
© 2010
>>> s = h.unescape('&#169; 2010')
>>> s
u'\xa9 2010'

So unless you need the advanced parsing capabilities of BeautifulSoup or want to show off your mad regex skills, this might be your best bet for squeezing unicode out of HTML snippets in Python.

Read more…

Last week, I secretly released version 1.2 of my Copy ShortURL add-on. It contains a lot of improvements based on your feedback! Here's the 411 on the new features and how to use them:

is.gd is the new default

I switched to is.gd (from tinyurl) as the default shortening service. I am affiliated with neither of them, but I thought the point of short URLs is, well, being short. So is.gd wins on that front. If you don't like that, don't fret, because...:

You can pick your own short URL service now

If you have a short URL service that you like more than the default, you can pick your own now. Instructions are in the README file on github (towards the bottom). By setting the preference extensions.copyshorturl.serviceURL in about:config, you can for example use tinyurl, bit.ly (requires an API key) and lots of other URL shorteners. If you have additional service URLs to share with the class, please leave a comment!
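To illustrate how a configurable service URL like this can work in general, here is a small Python 3 sketch. The %URL% placeholder and the shortener.example endpoint are assumptions made up for this example; check the README for the exact format the add-on expects.

```python
from urllib.parse import quote

# Hypothetical placeholder marking where the long URL goes in the template.
PLACEHOLDER = '%URL%'


def build_request_url(service_url, long_url):
    """Fill the percent-encoded long URL into a service URL template."""
    return service_url.replace(PLACEHOLDER, quote(long_url, safe=''))


# Example with a made-up shortening service:
request_url = build_request_url(
    'https://shortener.example/api?url=%URL%',
    'http://example.com/a b?x=1',
)
```

The important detail is the percent-encoding: the long URL has to be escaped completely, or its own query string gets mixed up with the shortener's parameters.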

Notifications

Initially, there was no way to tell if the add-on had already done its job, except for checking your clipboard contents (hint, if in doubt, yes, it worked). So I added unobtrusive Growl notifications for platforms that support it. For example:

If you don't have Growl, a Firefox notification bar is shown instead:

Finally, Copy ShortURL is now compatible with Firefox versions 3.6 to 4.0b5pre.

Hope you like it, and feel free to leave a comment here or file issues on github if anything is not working as expected.

Read more…

Since I last blogged about the Copy Short URL add-on, I stumbled across another, very popular example of automatically exposed short URLs:

Wordpress.com as well as self-hosted Wordpress instances have automatic short URLs now, starting with Wordpress version 3.0.

For example, this blog post on wordpress.com about a possible proof for P != NP has the shiny short URL http://wp.me/pr9Ir-1lN.

A recent blog post on my blog, in turn, has: http://fredericiana.com/?p=2921.

Of course, it's a little sad that the auto-generated short URLs on self-hosted Wordpress instances are so ugly, and they are not really short enough to use them easily on twitter or with other character-sensitive applications. But considering how long your average blog post URL is in the first place, it seems like a great win nonetheless.
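As far as I know, these short URLs are exposed as a link element with rel="shortlink" in the page head. Here is a minimal Python 3 sketch of how a script could pick one up from a page's HTML; the class and function names are my own, and the markup in the usage example is a simplified stand-in for what a real page sends.

```python
from html.parser import HTMLParser


class ShortlinkFinder(HTMLParser):
    """Remember the href of the first <link rel="shortlink"> element."""

    def __init__(self):
        super().__init__()
        self.shortlink = None

    def handle_starttag(self, tag, attrs):
        if tag != 'link' or self.shortlink is not None:
            return
        attrs = dict(attrs)
        # rel may hold several space-separated values.
        rels = (attrs.get('rel') or '').lower().split()
        if 'shortlink' in rels and attrs.get('href'):
            self.shortlink = attrs['href']


def find_shortlink(html):
    """Return the advertised short URL of a page, or None if there is none."""
    finder = ShortlinkFinder()
    finder.feed(html)
    return finder.shortlink


page = "<html><head><link rel='shortlink' href='http://wp.me/pr9Ir-1lN' /></head></html>"
short_url = find_shortlink(page)
```

A page without such a link element simply yields None, which is the point where a tool would fall back to a third-party shortening service.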

An unrelated side note: I filed a bug to expose bugzil.la URLs on Mozilla's bugzilla instance. It hasn't been picked up or resolved yet, so if you want to see this supported as much as I do, feel free to comment on or CC yourself on the bug!

Read more…

Noah's Ark

The fail pet collection is glad to welcome a new member! bit.ly, the URL shortening service used by default on twitter, hosts a family of pufferfish in their logo, and consequently, a really big one of them is also responsible for guarding their 404 page:

Let's hope all the sea creatures in my giant fail pet aquarium get along well...

Thanks for the link, dolske!

Read more…

Look what last.fm has in their robots.txt file:

User-Agent: *
...

Disallow: /harming/humans
Disallow: /ignoring/human/orders
Disallow: /harm/to/self

Allow: /

Oh, who wouldn't like geek humor :)

Thanks for pointing this out, jsocol!

Read more…

As it turns out, Blizzard Entertainment's game service battle.net has a fail pet!

Well, "fail pet" might be the wrong word for this, but who expected battle.net to have a cutesy kitten on their error pages anyway? Instead, they have a fellow who looks somewhat like the alien cousin of Lennie from "Of Mice and Men".

"Oops", indeed!

Update: Deb says, it's a Murloc. Thanks!

Thanks for the screen shot, clouserw!

Read more…