Improving Browser Performance with Parallelism

Earlier this year I saw a talk by Mozilla’s Lin Clark (@linclark) that had “The future of browser” in its name. It was really a talk about web performance in modern browsers, with a key focus on parallelism (more on that later). Here is what I learned:

With the increasing demand for faster web experiences, like those needed for animations, browsers had to become much faster at doing their job. If the browser does not repaint quickly enough, the experience does not flow as smoothly as we have become accustomed to. Today we expect higher screen resolutions and highly performant user experiences, even on web pages rendered on our phones. This has been made possible not only by the new hardware constantly being developed, but also by the ever-evolving capabilities of modern browsers like Chrome and Firefox. But how did they achieve the processing prowess necessary to utilize the more advanced hardware?

Let’s start by examining how the hardware first improved. We began with single-core CPUs, which performed simple short-term memory operations with the help of their registers and had access to longer-term memory stored in RAM. To build a more powerful processing system, we had to split processing across multiple cores, all of them accessing the same shared RAM. The cores, working side by side, take turns reading from and writing to RAM. But how do they coordinate this? How do they cooperate and ensure that the reads and writes happen in the correct order? Solving such “data race” issues requires careful strategy and proper timing. It also requires a way of sharing work that makes the most of the available processing power, an idea that GPUs (which can have thousands of separate cores working together) take to the extreme. For an example of how GPUs are used to distribute work, check out my recent article on GPUs.
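
To make the ordering problem concrete, here is a minimal sketch that simulates two cores incrementing one shared counter. It is a deterministic, single-threaded simulation rather than real parallel code, and the names `Memory`, `Core`, `incrementSteps`, and `simulate` are hypothetical, chosen only for this illustration.

```typescript
// Single-threaded simulation of two cores that each run "counter = counter + 1".
// All names here are hypothetical and exist only for this illustration.

type Memory = { counter: number }; // stands in for shared RAM
type Core = { register: number };  // each core has its own private register
type Step = (mem: Memory) => void;

// Split one increment into the two steps a core actually performs.
function incrementSteps(core: Core): { read: Step; write: Step } {
  return {
    read: (mem) => { core.register = mem.counter; },      // load RAM -> register
    write: (mem) => { mem.counter = core.register + 1; }, // store register + 1 -> RAM
  };
}

function simulate(schedule: "interleaved" | "takeTurns"): number {
  const mem: Memory = { counter: 0 };
  const a = incrementSteps({ register: 0 });
  const b = incrementSteps({ register: 0 });

  const steps: Step[] =
    schedule === "interleaved"
      ? [a.read, b.read, a.write, b.write]  // both cores read 0 before either writes
      : [a.read, a.write, b.read, b.write]; // core B waits until core A is done

  for (const step of steps) step(mem);
  return mem.counter;
}

const racy = simulate("interleaved"); // 1: one of the two increments is lost
const safe = simulate("takeTurns");   // 2: both increments survive
```

In the interleaved schedule both cores read the old value before either writes, so the second write overwrites the first. Real hardware and real browsers do not get to choose the schedule by hand like this, which is why coordination is needed.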

To take full advantage of these hardware improvements, browser developers had to upgrade the rendering engine. In basic terms, the rendering engine in a browser takes HTML and CSS, creates a plan from them, and then turns that plan into pixels. This is done in several phases, culminating in the use of GPUs to composite paint layers, all of which is explained in more detail in an article I wrote four years ago.

To upgrade this amazing engine, which brings daily web experiences to people around the globe, browser developers turned to parallelism. Parallelism, which Firefox has fully used since the summer of 2017, has been part of Chrome from the start, and it is a large part of why that browser was faster. What is parallelism? In the context of a web browser, it is the splitting of the browser’s computational work into separate tasks that can run at the same time, such as separate processes for different tabs. But utilizing it correctly, as with fine-grained parallelism that shares memory between many cores, requires very complicated technology and coordination. Not to mention that the resulting data races can cause some of the worst known security issues. Firefox developers, who slowly merged parallelism into their existing browser instead of starting from scratch, described the process as “like replacing jet engine parts in mid flight”.
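
Browser engines handle this coordination internally, but JavaScript exposes a similar shared-memory model through `SharedArrayBuffer` and `Atomics`. The following is a hedged sketch of that model, not how any browser is implemented: the helper `addHits` is a hypothetical name, and both calls run on a single thread here, so it shows only the shape of the API rather than real parallel execution.

```typescript
// A block of memory that, unlike a normal ArrayBuffer, can be shared between
// threads instead of being copied. Here it holds a single 32-bit counter.
const shared = new SharedArrayBuffer(Int32Array.BYTES_PER_ELEMENT);
const counter = new Int32Array(shared);

// Hypothetical task: bump the shared counter `times` times.
function addHits(view: Int32Array, times: number): void {
  for (let i = 0; i < times; i++) {
    // Atomics.add does the read, the add, and the write as one indivisible
    // step, so concurrent callers cannot overwrite each other's update.
    Atomics.add(view, 0, 1);
  }
}

// Assumption: in a real page each call would run in its own thread, each with
// its own view over the same buffer. Here they simply run one after another.
addHits(counter, 1000);
addHits(new Int32Array(shared), 1000); // second view over the same memory

const total = Atomics.load(counter, 0); // 2000
```

A plain `view[0] = view[0] + 1` would be a separate read and write, which is exactly the pattern that loses updates when two threads interleave.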

These new browser powers allow us to do much more than run different tabs at the same time on separate cores. We can now assign different cores to different parts of the same page’s UI (e.g. each Pinterest image on a different core). We can also run JavaScript on separate cores with web workers, which can now even share memory with each other. Finally, with the advent of WebAssembly, a low-level assembly-like language that runs with near-native performance and can be compiled from C/C++, performance is really starting to soar. For more information on WebAssembly and how it is used alongside the JavaScript in your browser, see:

Jeff Poyzner