Common mistakes that broke Launchpad's browser compatibility

At this moment, Launchpad is providing all features to Opera, Internet Explorer (IE), and many other browsers. Many features like bug listings, bug tag auto-completion, changing bug status and privacy, choosing people, and editing subscriptions were broken in many browsers. Some features were broken for years because developers mistook their inexperience for something broken these browsers.

I am summarising what was wrong with Launchpad and how to avoid repeating the mistakes of the past. Launchpad reviewers, I urge you to read this so that bad code is not merged into Launchpad's trunk again. Launchpad engineers commonly work with Firefox and Chromium, two browsers with great support the latests technologies, These browsers are also very good at correcting mistakes in markup, CSS, and JavaScript. The lesser browsers are less forgiving; developers must fix their mistakes and on rare occasion write alternate code to make a feature work. Launchpad developers were too quick to blame the browser. They failed to recognise their own inexperience, and they failed to find the root cause of the problem.

Why Do We Want to Support More Browsers?

One user, lazarus, responded to my A tale of two travesties post asking "why bother with IE? Those 4% of users should switch to a DECENT browser."

Canonical's partners often run older releases of Firefox and Chromium. Some partners are running IE. Their staff cannot install software onto their computers, they have no choice but to use the corporate browser. The people who help build Ubuntu, who help Canonical enable Ubuntu to run on new hardware, cannot use Launchpad to do their job.

Consider users who use their phone or tablet with Launchpad. These browsers are also locked down, and they are commonly older versions of browsers. There were several bugs in the past that prevented Canonical staff from using their phones to login to Launchpad. They, even myself, had to find another computer to respond to a notification received via phone. Mobiles devices are now commonly used to work on the Net; Launchpad risks loosing users and projects who can choose another service.

Servers often do not have a GUI. We assume Ubuntu server is running headless. Servers also tend to be LTS installs that might can be 2 to 5 years old. Tasks such as reporting a bugs, updating it, and adding a comment must work in a text-based browser without JavaScript.

What is the Cost of Supporting More Browsers?

Again lazarus stated in his reply to my struggle to run IE in Ubuntu, "I'm really sick of seeing all those workarounds just to make it work for that damn browser."

I wrote less than 20 lines of code to Support IE, yet I removed more than than 125 lines that disabled it or other browsers. I Also removed about 200 lines of duplicate or alternate code from <noscript> elements. It takes fewer lines of code to support older and mediocre browsers than to not support them!

For every bug that was tagged as affecting IE, I discovered it also affected Opera, Konqueror, Firefox 3, and even the latest version of chromium. Fixing one bug solved an issue for multiple browsers each time. I found many bugs overlapped with or were duplicates of other bugs that were listed as a higher priority, yet remained unsolved. Understanding the root cause of a problem allows us to make a single change that fixes many things. Honestly, the most common way to fix an issue was to read the YUI documentation to learn what Launchpad was doing wrong. Thus, the root cause of IE not working is rarely IE itself...the JavaScript bugs in Launchpad are caused by inexperience.

Running other browsers can be difficult as I reported in my Travesty post. I now have a custom apache configuration that cannot be copied to other developer machines. Running Microsoft's official IE developer images take all my available RAM. I also run the VM image from an SD card because Windows is still bloatware. Installing Konqueror and all the KDE dependencies can also be a problem if you are short on disk space.

The maintenance savings by writing good code that supports all browsers are diminished by the days or weeks it takes to install all the browsers to reproduce a bug. If I cannot reproduce a bug, I cannot fix it. For many Launchpad developers, the best way to not fix a bug in a mediocre browser is to avoid introducing the bug. This is why we use YUI as our JS library, it reconciles the differences between browsers. This is why we have code standards and reviews, to ensure that we agree the code is correct and maintainable. We want to recognise bad code before it is merged into trunk and we absolutely must avoid dangerous code,

The Last Sign of Inexperience

This example of code is dangerous and unmaintainable:

if (Y.UA.ie) {
    return;
}

This is the last sign of inexperience. Bugs have already been slipped past review, merged into trunk, and are probably in production preventing user from completing their task. Why is IE not using the feature? Which browser versions and YUI versions were tested? This kind of guard often has an associated comment like:

# Usual IE brokenness.

I think the comment can also be interpreted to mean: "I do not have IE installed, I do not trust it, I do not trust my own code, I do not understand YUI".

Do not let this kind of guard merge. If one browser does not work, there are probably 5 other browsers, like those shipped with an Ubuntu LTS, that also do not work. This branch needs fixing! []{#diagnosis}

Is This a Presentation Problem?

Several bugs fixed by William Grant and myself were not JavaScript bugs as was reported. They were CSS or markup bugs. These bugs were trivially fixed in minutes when shown to the right person, yet some of these bugs were reported years ago. Your chances of fixing a presentation problem with JavaScript is similar to painting a house with a hammer. If the UI for a JavaScript feature does not look right, or not render at all, consider a CSS or markup change.

Non-conformant CSS the most common cause of broken presentation. I often uses Wikipedia's Comparison of Layout Engines when researching differences in layout. Even the latest versions of Firefox and Chromium have inherited defects from their layout engine. Firefox was showing a scrollbar because the form overlay buttons had a text-indent property set to twice the width of the page. Though Chromium did not show the scrollbar, I think Firefox was right in this case; I fixed the issue by setting a text indent that was a few characters wide. Many issues can be fixed using alternate properties or additional CSS selectors to add properties to the problem elements. For example, IE has poor support for visibility (introduced by Netscape in 1998), but has better support for display (introduced by Microsoft in 1998). IE was always showing the fields in the overlay forms that change the page. Though the form overlays were visibility: hidden, IE requires a specific selector for .yui3-lazr-formoverlay-hidden input to be display: none.

Experimental CSS can be both the cause and the solution to a display problem. We want Launchpad to look good, and we often use CSS3 to define the styles. Many CSS3 properties are not fully implemented in browsers, but there are vendor properties (ones with a dash prefix) that do support the presentation. Launchpad has many of these for Gecko and Webkit. We also need to add them for KHtml, Opera and IE. While IE does not support the CSS3 box-shadow property, it has supported shadows since 1998 and we added its vendor property to give all overlays shadows.

Invalid Markup breaks older browsers, but newer browsers are better at fixing (and hiding) markup problems. You can read about how browsers work in depth, but I think I can summarise the issue in a few sentences. When browsers discover an error while parsing the markup, they pause to fix the markup then carry on. For example, paragraphs cannot contain lists. Most browsers assume that the author meant that there is a paragraph before and after the list. If you query the DOM using JS, you may discover it does not match the source. When I switched the Launchpad test suite to use a webkit-based test browser many tests failed. Developers were ignoring test failures because they passed in Firefox. I traced the errors to existing bug reports and discovered code was creating invalid tables and Chromium fixed the markup differently than Firefox. IE is very prone to choke on malformed tables. The lesson is clear, write what you mean, or else browsers are free to reinterpret your bogus markup and topple your scripts.

Launchpad users will soon loose their last line of defence against invalid markup when Launchpad switches to Ubuntu 12.04. My squad stopped development a few months ago to learn why Launchpad's ec2 test runner failed a test that passed on all the other test runners. The reason is because the script used bad markup that toppled the Ubuntu 10.04 version of my test browser. The new webkit library in Ubuntu 12.04 is better at correcting markup, hiding from mistakes from tests, allowing the mistakes to be released to break older browsers.

Is This a Structural Problem?

Structural problems are issues where the choice in markup is brittle or unmaintainable.

Noscript is BAD. Netscape meant well by introducing the <noscript> element, but it leads to maintenance burdens and it assumes that all scripts run perfectly. The intent is to have two implementations and do not diverge. They do diverge though, leading to differing and confusing behaviours. If the script fails to initialise, the user gets no behaviour. This situation was exacerbated in Launchpad with the use of the if (Y.UA.ie) guards that meant IE users never saw the HTML or JavaScript links to change project roles or change blueprints. Launchpad prefers, and YUI recommends, progressive enhancement; write simple markup that does not need JavaScript, then modify the page to add the scripted features. Scripts that replace or hide the HTML feature should do that as the last step; If the script fails, there is still a usable link.

Mutating parent markup while parsing is dangerous. Several widgets like the TextAreaEditor were mutating their parent element before the parent element had closed itself. Only the mad and the brave mutate the thing they are iterating over. Chromium and Firefox handled the insane situation with grace, but other browsers threw a wobbly. IE 8 could not render the bug page, it stopped at the bug description. This situation is hard to see in a review. In general the script block must be outside of the elements it changes, but several widgets enhance parent and grandparent nodes, which means the markup appears to have redundantly nested elements. Some of this madness can be averted by running the script on DOMReady. There are many scripts that load as the page is parsed; I suspect these could all be changed to run at DOMReady.

Is This a YUI Problem?

YUI also has bugs. Has this issue been reported in YUI's bug tracker? Which version of YUI is Launchpad using? Does upgrading fix the issue? Maybe an issue needs to be reported?

There were several hacks in the Launchpad bug pages to load scripts during parsing or after the load event because DOMReady was buggy during YUI3 preview releases. Both Opera and IE had difficulty during running scripts during parse time, and running the scripts after the load event caused the page to flash. Launchpad has been running YUI3+ for almost two years, the hacks should have been removed. The code required a proper XXX comment to remind developers to update the scripts.

# XXX: 2010-03-12 bingo bug=234567:
# DOMReady is bug in YUI3 PR3. Test DOMReady again in YUI 3.0.

There was also a case where Launchpad ran a version of YUI that miswired the event handling in IE. Several if (Y.UA.ie) guards were added without a comment explaining the issue, nor a stating the bug number in YUI's bug tracker. I do not know when when these issues were fixed, but they are fixed in YUI 3.3.

Is This a EcmaScript or DOM Problem?

Launchpad was created in the age of EcmaScript 3 (ES3). This is now the age of ES5, and Launchpad developers are using ES5 browsers like Firefox and Chromium. Browser that the developers once used to build launchpad stopped working. Launchpad stopped working with the browsers available to Ubuntu LTS such as 08.04. Launchpad developers use jslint and YUI to ensure that code works in old and new browsers.

Was the lint report ignored? Launchpad's make lint command uses jslint to prevent known issues before they are merged into trunk. There are still a lot is issues reported in the old lazr modules that need fixing. We fixed coercion mistakes and unnested functions created in loops to address bugs in older browsers.

Which version of EcmaScript? There were a lot of failures caused by unsafe ES5 calls. William fixed many script failures in older browsers by ensuring that Y.Array() is called properly. It is very important to understand that Y.Array() returns a native array. This call is pointless:

var finklesnarks = Y.Array([]);

This call is dangerous because it creates a native ES3 array in older browsers, then calls an ES5 method:

Y.Array(finklesnarks).indexOf(fnord);

I was the author of some of these bad calls. The safe way to use ES5 array with YUI is:

Y.Array.indexOf(finklesnarks, fnord);

Are events happening in a different order? Are links doing both script actions and standard actions? YUI docs describe event bubbling, capturing is never mentioned. I thought this meant that YUI had disabled the capture phase of event propagation, simplifying how we work with events. No, this is not the case. Firefox and Chromium still prefer capture propagation, and IE uses bubbling propagation exclusively. Actions that work in the former browsers often have secondary, and wrong, actions happening in IE. This is because Firefox and Chromium are done propagating, but IE is just starting. Most Launchpad click handlers use e.preventDefault(), which means "stop the default action, continue propagating". IE does continue propagating and unexpected things happen. A lot of Launchpad links work more by accident rather than by design. The correct method to call is e.halt() which stops the default action and propagation. We want to use halt() most of the time. In cases where we want multiple subscribers to respond to an event, we are using a custom event.

Does YUI docs demonstrate that the code will work? This is an odd point to check, but a lot of code was merged into Launchpad trunk without regard to YUI documentation. The bug status widgets did not appear on click in some browsers because they are designed to only display if you used the left mouse button. But YUI documentation states that the button can only be known in the mouseup event...the mouse property explicitly states that it is not for use with click. Removing the unsupported check of event.button fixed a bug that has been open for 3 years.

Is This a Broken Browser Problem?

These are the cases where developers must write code to reconcile the differences between browser implementations, or choose to not support the browser.

Non-conformant surprises are cases where browsers claim do support a feature but do not. Launchpad new bug listing UI depends on history management to sort columns and show batches. History management depends on URLs, but IE's Link.pathname is missing the leading slash so URLs were not relative from root. The fix is trivial once you see the invalid paths in Launchpad's oops reports and then look at the code that makes URLs. As we do in Launchpad's python code, the method that wraps the pathname ensures there is a leading slash.

Alternate API nuisances are cases where the browser doesn't support the methods the script uses, and neither does YUI. The bug tag autocomplete feature never worked in IE, but there was code checking for it. I first refactored the method to check for the methods Launchpad prefers, then checks for the alternate methods. This was hard to fix, I spent an hour reading about how IE creates selection ranges before I could write a 5-line block to do what element.endSelection does.

Unsupported nightmares are cases where neither the browser or YUI support the method. There is one case in Launchpad were this happens. The series timelines use canvas to draw the series, milestones, and releases. It was disabled for all IE version, but IE 9 does support canvas if the page states it is HTML5 . Instead of checking for browser version, I created a method that checks if the browser can create the canvas, and if so, the series timeline is rendered. IE 8 and below do not get the interactive image. Other browsers that I have not planned for will show the canvas if they pass the check.

How do I ensure a browser falls back to working code?

At this moment, there is only one guard that prevents KHtml browsers using a feature. There might come a time to add more guards because Launchpad, YUI, and a kind of browser do not work, and no one has time to make it work. We need to know the browser version that was tested and the version of YUI, for example:

if (Y.UI.ie < 6 && Y.version <= 3.3)

I think we want several experienced YUI developers to agree that this is the proper course before giving up though. When a new version of the browser or YUI is released, we know to check the feature. Does the guard need to be revised, or can it be removed?