The HTML structure of webmail interfaces: Gmail, Hotmail, and Yahoo Mail

As part of the Zentact project I’ve been working on, we were asked to integrate with various webmail clients. This makes it easy to manage your contacts while sending email.

Doing this was a bit of a pain. Since all the code is minified, and each client uses JavaScript events differently, it took a good bit of work to figure out the details. I wanted to share this info in a blog post for programmers who come along in the future. If you don’t know/care about HTML, JavaScript events, the DOM, YUI, or AJAX, this post is not for you. Please enjoy one of my other fine posts, perhaps this post on military code names.

Before I begin: there was a ton of info learned (and already forgotten) about this process. This is not a complete guide, but is mostly a brain dump from implementing UI integration on three different webmail interfaces.

Gmail uses 6-character strings, [A-Za-z0-9] for all its classes. These classes remain the same from load-to-load, but I believe that they may change over time with minification. IDs are not as constant, and many are dynamically assigned. These start with a colon.
When you’re working with events, you may get inconsistent results. Some events are not fully propagated: they get captured, and you can’t find out about them. If onclick doesn’t work, try listening for onmousedown or onmouseup. One of them may get you notified of the event you want. The same advice goes for onkeydown, onkeyup, and onkeypress. That said, keep in mind that these three events fire in a particular order (keydown, then keypress, then keyup), so make sure you’ll be notified at the right time.
All of the webmail UIs use iframes. This lets them keep the code for loading the UI separate from the code that displays it. I know there are some cross-site scripting implications in this, but I’m not sure of all the details. Gmail’s loading screen (the loading bar they show you) is in a different iframe than the one that shows you the inbox. All of these iframes are at the root of the document, and there’s nothing else in there.
You could use Firebug breakpoints to pause the code and examine what’s going on, but nearly all the JS is minified. Since breakpoints can only be set by line, and there are multiple functions defined per line, it ends up not being helpful.
For its UI, Yahoo seems to use YUI, plus some other stuff on top of that. There are some weird results because of this. The body of the email editor is a group of DIVs: some are invisible, some are for border decoration, and others are for the background of the editor.
When we inserted elements into Yahoo Mail using regular DOM operations, they would appear behind other page elements, until another part of the UI was interacted with, when the screen would redraw and then they would bump into place. YUI seems to have its own redraw/repaint functionality, and it won’t play nice with DOM manipulations.
Hotmail is strangely one of the less-exotic interfaces. They use consistent IDs. I don’t think they’re hand-coded, however, because they follow a naming scheme that seems too machine-generated. But still, they are there, and you should take advantage of them.
When you’re using events, and you get notified of an event, use the event.originalTarget property to find out where in the DOM you are. That’s useful information when you’re dealing with a DOM tree of nonsense class names and IDs.
When you’re trying to figure out where in a DOM tree you are, don’t hesitate to go up several levels and check a great grandparent node, or a “cousin” node. Once you get a single point of reference, you can generally work out where everything else is, relative to it.
Some UIs open each message in its own iframe, which means that IDs are consistent since they’re in their own namespace.
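
Several of the tips above (recognizing Gmail’s class and ID patterns, listening for more than one event type, and climbing the tree from an event target) can be sketched as a few small helpers. This is a rough sketch, not code from the Zentact integration: every function name here is hypothetical, the patterns only encode the observations above, and registering the listeners in the capture phase is my own assumption about what helps when a page swallows events. The nodes only need className and parentNode properties, so the helpers work on real DOM elements or on plain objects.

```javascript
// Hypothetical helpers; names and patterns are illustrative assumptions.

// Matches the 6-character [A-Za-z0-9] class names described above.
function looksLikeMinifiedClass(name) {
  return /^[A-Za-z0-9]{6}$/.test(name);
}

// Dynamically assigned IDs were observed to start with a colon.
function looksLikeDynamicId(id) {
  return typeof id === "string" && id.charAt(0) === ":";
}

// Register the same handler for several event types, for example
// ["click", "mousedown", "mouseup"], since some may never reach you.
function listenForAny(element, types, handler) {
  types.forEach(function (type) {
    element.addEventListener(type, handler, true); // true = capture phase
  });
}

// Check the node itself and then up to maxLevels ancestors, returning the
// first one that carries the given class, or null if none does.
function findAncestorWithClass(node, className, maxLevels) {
  var current = node;
  for (var level = 0; current && level <= maxLevels; level++) {
    var raw = typeof current.className === "string" ? current.className : "";
    if (raw.split(/\s+/).indexOf(className) !== -1) {
      return current;
    }
    current = current.parentNode;
  }
  return null;
}
```

Inside a handler you would typically start from event.originalTarget (or event.target) and call findAncestorWithClass to reach a known container, then work out everything else relative to it.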

Also, thanks to Nate Koechley for helping me get through some of the Yahoo details.

If you’ve got other questions, shoot me an email. I remember more stuff, but might need a good question to shake it loose.

Twitter Autocomplete (Tw-autocomplete Firefox Extension)

After lots of code, tests, and fun, I’ve produced a Firefox extension to add a useful, new feature to Twitter, as opposed to writing Twitter extensions as a joke 😀

Simply put, the extension provides autocomplete for Twitter usernames from your own list of friends while you’re using the web interface at twitter.com. It’s totally secure — no separate login required. Just install it, and use Twitter naturally.

When you start typing messages to people — using “@user” or “d user” — a list of matching contacts (along with icons) will drop in. You can click the person’s name to fill their username into the text box, or use the arrow keys along with tab/enter to select. As an added bonus, if you can’t remember their username at all, just type their first name, and the extension will figure it out.
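
For the curious, the matching behavior described above can be sketched roughly like this. It is not the extension’s actual code: the function names, the regular expressions, and the shape of a friend record ({ username, name }) are all assumptions made for illustration.

```javascript
// Hypothetical sketch; not the extension's actual code.

// Returns the partial name being typed, or null if the text neither ends
// in an "@user" mention nor consists of a "d user" direct message prefix.
function extractPartialName(text) {
  var direct = /^d\s+(\w*)$/i.exec(text);
  if (direct) {
    return direct[1];
  }
  var mention = /(?:^|\s)@(\w*)$/.exec(text);
  return mention ? mention[1] : null;
}

// Friends are assumed to look like { username: "...", name: "First Last" }.
// A friend matches if the username or the first name starts with the
// partial text, ignoring case.
function matchFriends(friends, partial) {
  var needle = partial.toLowerCase();
  return friends.filter(function (friend) {
    var firstName = friend.name.split(" ")[0].toLowerCase();
    return friend.username.toLowerCase().indexOf(needle) === 0 ||
           firstName.indexOf(needle) === 0;
  });
}
```

Matching on the first name is what lets you type a person’s real name when you can’t remember their username.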

There is another autocomplete script for Twitter, but it requires installing extra libraries, and I think this is simpler. Clearly it’s a feature in demand.

The extension is hosted at addons.mozilla.org, a highly reputable site. They also provide lots of great management features that are handy to developers. I hope you enjoy using Tw-autocomplete.

Firefox extension debugging

One hugely important thing in coding is debugging. Unfortunately, a lot of Javascript debugging gets done via alert() calls. This gets awkward quickly, with the alerts affecting timing, and just being annoying if you have to dump large amounts of data out.

Firebug is a great development tool, and has a really handy logging interface that you can dump debugging info to. Just calling console.log(whatever) will dump it to the main Firebug interface as text that you can copy/paste, scroll through, etc.

If you’re developing a Firefox extension, this debugging capability is really useful. The catch is that calling console.log() doesn’t work: console isn’t defined for the browser, only for each window.

The trick? Call it directly from the Firebug extension object.

Firebug.Console.log()

Be sure to capitalize both Firebug and Console, and you’ll be good to go. In addition to having great capabilities for logging, the console will prevent your debugging messages from popping up to your users, in case you leave some code where it shouldn’t be.
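
If you want to be a little more defensive, you could wrap the call so it quietly does nothing when Firebug isn’t there. This is only a sketch: debugLog is a hypothetical name, and it assumes Firebug.Console.log is reachable from your extension code as described above.

```javascript
// Hypothetical wrapper; assumes Firebug.Console.log is reachable from
// extension code as described above, and does nothing otherwise.
function debugLog(message) {
  if (typeof Firebug !== "undefined" &&
      Firebug.Console &&
      typeof Firebug.Console.log === "function") {
    Firebug.Console.log(message);
    return true;  // the message was handed to Firebug
  }
  return false;   // Firebug is not installed or not loaded; stay silent
}
```

The return value is just a convenience so calling code (or a test) can tell whether the message went anywhere.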

By the way — if you found this helpful, check back here in a few days. I’ve submitted a presentation proposal to SXSW for Firefox extension development, where I’ve got tons of info for creating extensions for web applications. They collect votes from the community, and I’d like your support. Plus, if the presentation goes through, I’ll be collecting lots of my best tips and putting them online as a resource for the attendees. That means you’ll get all of them too, and you don’t have to go anywhere! 😀

Edit (2008-08-21): Added link for SXSW voting panel

An icon to save your ass

I just forwarded an email asking for an RFP around to the team at Cloudspace, and since I use address book autocomplete, I checked the email addresses very carefully before sending the email. I’ve heard of (and seen) too many instances where someone quickly sent an email, and addressed it to the wrong person. Funny how it seems to happen most when the person who actually receives the message is the one person who it should definitely have not gone to.

Ideally, my email client would have pictures of everyone, automatically grabbed from their Facebook/Myspace/Interblag Networking accounts. Humans are so visual that it would be immediately noticed if an email wasn’t going to who I wanted it to.

I think the next best thing, at least from a corporate email perspective, would be a highly noticeable icon in the mail window that would only display if all recipient email addresses matched a set of criteria. Some obvious ideas are:

All domains match each other
All domains are on a whitelist
All domains match the sender’s domain
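
The three checks above are simple enough to sketch. This is a hypothetical illustration, not an existing extension: the function names and the result shape are my own, and addresses are assumed to be plain "user@domain" strings.

```javascript
// Hypothetical sketch of the three checks; all names are illustrative.

// Everything after the last "@", lowercased.
function domainOf(address) {
  return address.slice(address.lastIndexOf("@") + 1).toLowerCase();
}

// Returns one boolean per criterion. Note that an empty recipient list
// passes every check, so a caller would want to handle that separately.
function checkRecipients(sender, recipients, whitelist) {
  var domains = recipients.map(domainOf);
  var senderDomain = domainOf(sender);
  var allowed = whitelist.map(function (d) { return d.toLowerCase(); });
  return {
    allMatchEachOther: domains.every(function (d) { return d === domains[0]; }),
    allOnWhitelist: domains.every(function (d) { return allowed.indexOf(d) !== -1; }),
    allMatchSender: domains.every(function (d) { return d === senderDomain; })
  };
}
```

The mail window would show the icon only when whichever of these flags you care about comes back true.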

It should be easy enough to hack together an extension for Thunderbird to implement the “highly noticeable icon,” but I know that photos of everyone are the killer feature of the two. I think it’ll be another few years before we have good enough pictures of everyone (in terms of being auto-grabbable, and not having to take their picture deliberately), but I think the concept will show up soon.

Finding text from a Firefox Extension

OK. Because this was confusing the hell out of me, I had to post about figuring it out.

Let’s say you’re a developer writing a Mozilla Firefox Extension that searches for text on a page (or rather, in a browser window). Suppose you have a button with the following functionality attached onclick:

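// Get the find interface for the currently selected tab.
var webBrowserFind = getBrowser().selectedBrowser.webBrowserFind;
webBrowserFind.searchString = TEXT_TO_SEARCH_FOR; // placeholder for your search text
// findNext() returns true if an occurrence was found.
var result = webBrowserFind.findNext();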

So, every time you click the button, it will search for the text in the current browser tab. Cool.

I did this, and it was working fine, until I wanted to find the second instance of the string on the page. I’d click it twice. Sometimes it would highlight the first instance again. Sometimes it would highlight the second instance. No real sign as to why, and as a kicker, it would never find past the first two instances of the string on the page.

Turns out that after a certain timeout, the nsIWebBrowserFind interface is going to reset which find instance it was at. This means that on the second click, it will find the first occurrence again. If you click twice within the timeout, you’ll get the second result. I didn’t see any mention of this in the documentation, and it’s not clear what the mechanism is that’s resetting the instance.

If you click really fast, you’ll get the third and fourth results. For once, getting frustrated and repeatedly clicking fast was the solution.

Hat tip to mfinkle and johnm of irc.mozilla.org#extdev for helping me find the nsIWebBrowserFind interface in SeaMonkey.