
Why the Chrome DevTools Isn’t Enough — Profiling End Users

TL;DR: JavaScript is getting harder to scale because of its single-threaded nature and the ever-increasing complexity of web apps. The Chrome DevTools is our bread and butter for maintaining and improving the performance of web apps as they scale, but local profiling often isn't representative of the code execution slowing down end users. Companies facing these challenges, including Facebook, Microsoft, Slack, Dropbox, and Notion, have started profiling in production to find and fix performance issues faster and more effectively. This blog post outlines their approaches to profiling production.

Scaling JavaScript is Hard

The biggest reason web apps are hard to scale is the single-threaded nature of JavaScript. For a page to run at 60fps, all work in the browser must finish within a 16ms frame budget, so your web app’s code execution competes with garbage collection and the browser rendering pipeline (Render/Layout, Paint, and Composite) for that same 16ms. When the deadline is exceeded, the browser drops frames and the user perceives visual latency.

[Figure: a long task blocking user input]
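
You can watch these budget-busting tasks directly in the browser. Here's a minimal sketch using the standard Long Tasks API, which reports main-thread tasks longer than 50ms:

// Log every task that blocks the main thread for more than 50ms.
const observer = new PerformanceObserver((list) => {
  for (const entry of list.getEntries()) {
    console.log(`Long task blocked the main thread for ${Math.round(entry.duration)}ms`);
  }
});
observer.observe({ type: 'longtask', buffered: true });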

Web apps at scale (VSCode, Slack, Notion, Discord, Spotify, etc.) span millions of lines of JavaScript, which makes scaling them even more difficult. These companies have dedicated performance teams yet still struggle to maintain performance. On top of that, single-threaded performance is plateauing.

[Figure: single-threaded JavaScript core performance plateauing over time]

Maintaining the performance of web apps as they scale is only getting harder, and users are noticing.

[Figure: users complaining about a slow web app]

Web developers have essentially one tool at their disposal to fix these performance issues: the Chrome DevTools. But when the slowness happens on end user machines, it falls short.

Where the Chrome DevTools Falls Short

While the Chrome DevTools is a performance engineer’s bread and butter, it has a few shortcomings:

  1. Reproducing end user performance issues is difficult. Variance in end user hardware and user flows causes local Chrome DevTools profiles to stray from what's actually slow for end users. Getting accurate profiles of slow user flows requires users to manually reproduce them and cut tickets, and relying on user reports is brittle: most slow experiences go unreported.
  2. Regressions are difficult to attribute. Every new release of your app risks regressing performance. At scale, a release might merge over 100 commits from changes across multiple teams, making regression attribution difficult. We've seen teams guess at which changes to revert without any effect on the regression, then keep reverting until the metric recovered, or simply re-baseline their metrics to the regressed value.
  3. Identifying optimization opportunities is difficult. With limited visibility into the code paths that are slow for end users, it's hard to identify opportunities to improve performance metrics.

The underlying limitation of the Chrome DevTools is that it doesn’t show developers which JavaScript code paths are slow for end users. To overcome this, companies began exploring new methods for profiling performance in production environments.

Early Origins: Facebook’s End User Profiler Polyfill

Because of this limited visibility into production JavaScript performance, a team at Facebook decided to tackle the problem in September 2021 by running an experiment to profile JavaScript in production.

Here’s how the team described the value of the end user profiler polyfill:

This JS profiler was enabled for only a small percentage of Facebook users, and only instrumented functions of 10 statements or more in order to limit performance impact and to limit the quantity of profiling data collected. Nevertheless, we found it extremely valuable for understanding Facebook.com's performance in the field and for finding optimization opportunities.

Facebook’s early experimentation served as the basis for the Chrome team standardizing the Self-Profiling API.

The Self-Profiling API

The Self-Profiling API allows for programmatic and efficient end user JavaScript profiling. It's used by Slack, Microsoft, Facebook, Dropbox, Notion, and others to profile JavaScript running for end users.

Starting the profiler requires two configuration options:

  • sampleInterval: the desired sample interval in milliseconds; controls the profiler's CPU overhead
  • maxBufferSize: the maximum number of samples to buffer; controls the profiler's memory overhead

Here is an example of using the Self-Profiling API:

// Example of profiling a React render with the Self-Profiling API
const profiler = new Profiler({ sampleInterval: 10, maxBufferSize: 10_000 });

// Render your React app (React 18 createRoot API)
ReactDOM.createRoot(document.getElementById('root')).render(<App />);

// Stop sampling; resolves with the collected trace
const trace = await profiler.stop();
sendProfile(trace); // upload to your backend (sendProfile is app-specific)
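
With these settings, the buffer holds up to 10,000 samples taken roughly every 10ms, or about 100 seconds of profiling. Note that sampleInterval is a lower bound (browsers may round it up to their native sampling rate), and the spec also defines a samplebufferfull event that fires when the buffer fills, a natural signal to stop profiling and upload the trace.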

Overhead

When enabling the Self-Profiling API in production, Facebook noticed a <1% impact on pageload (p=0.05). At Palette, none of our largest customers (>100M users) have reported noticeable performance regressions after enabling the profiler.

Filling in the Gaps of the Chrome DevTools

Given the Self-Profiling API, and the Chrome DevTools’ lack of insight into slow code paths for end users, we asked: “What if the Chrome DevTools profiler were powered by production profiles from end users?”

Here are a few examples of use cases this would enable:

  1. Identifying the impact of 3rd party scripts on pageload for a specific end user (see the sketch below)
  2. Identifying slow React hooks and render logic for a specific end user
  3. Identifying application code that blocks interactions for a specific end user
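
The first use case, for example, can be served straight from a raw trace. Here's a sketch of attributing samples to 3rd party scripts, assuming the { frames, resources, stacks, samples } trace format from the Self-Profiling spec; firstPartyOrigin is an illustrative parameter:

// Count samples where a 3rd party script was on-CPU.
function thirdPartySampleCounts(trace, firstPartyOrigin) {
  const counts = new Map();
  for (const sample of trace.samples) {
    // Walk from the leaf frame toward the root, looking for a 3rd party frame.
    for (let id = sample.stackId; id !== undefined; id = trace.stacks[id].parentId) {
      const frame = trace.frames[trace.stacks[id].frameId];
      const url = frame.resourceId !== undefined ? trace.resources[frame.resourceId] : undefined;
      if (url && !url.startsWith(firstPartyOrigin)) {
        counts.set(url, (counts.get(url) ?? 0) + 1);
        break; // attribute each sample to the deepest 3rd party frame only
      }
    }
  }
  return counts; // script URL -> number of samples it was executing
}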

Efficiently Finding Bottlenecks by Merging Profiles

At scale, individually inspecting potentially millions of end user profiles isn’t practical, so we need a way to summarize them. Merging all collected profiles into a single profile does exactly that.

Here’s an example of what merging profiles looks like:

[Figure: merging individual end user profiles into a single merged profile]

Individual profiles let us understand performance issues for a specific user; a merged profile lets us answer the same questions across all of our end users.
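
Under the hood, merging can be as simple as folding every sampled stack into one weighted call tree. Here's a minimal sketch, assuming each profile has already been decoded into a list of stack samples (arrays of function names ordered root to leaf); the node shape is illustrative, not Palette's actual format:

// Merge many profiles into a single weighted call tree.
function mergeProfiles(profiles) {
  const root = { name: '(root)', selfSamples: 0, totalSamples: 0, children: new Map() };
  for (const stacks of profiles) {
    for (const stack of stacks) {
      root.totalSamples++;
      let node = root;
      for (const frame of stack) {
        if (!node.children.has(frame)) {
          node.children.set(frame, { name: frame, selfSamples: 0, totalSamples: 0, children: new Map() });
        }
        node = node.children.get(frame);
        node.totalSamples++; // frame was somewhere on the stack for this sample
      }
      node.selfSamples++; // the leaf frame was the one actually executing
    }
  }
  return root; // render as a flame graph: node width ∝ totalSamples
}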

Here are a few examples of questions profile merging can answer:

  1. Which 3rd party scripts impact initial page render the most across all end users?
  2. Which React hooks and render logic impact interaction performance across all end users?

Finding Regressions by Comparing Merged Profiles

Merged profiles also allow us to easily compare function execution time before and after a regression. Imagine you’re building a text editing app and notice typing performance regressed. Comparing the merged profiles of your app between the version before the regression and the version after would show you exactly which functions increased in execution time.

See this example of comparing performance before and after a regression, with total time increases and decreases in green and red respectively:

[Figure: comparison flamegraph showing before/after changes in function execution time]

With enough sampling data (at Palette, we recommend 1M profile stack samples), these merged profiles are stable enough to identify regressions.
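
Programmatically, the comparison boils down to diffing two merged call trees. Here's a minimal sketch, reusing the illustrative node shape from the merging example and normalizing by total samples so versions with different traffic are comparable:

// Rank functions by how much their share of total samples changed.
function diffMergedProfiles(before, after) {
  const shares = (root) => {
    const out = new Map();
    (function walk(node) {
      out.set(node.name, (out.get(node.name) ?? 0) + node.totalSamples / root.totalSamples);
      node.children.forEach(walk);
    })(root);
    return out; // note: recursive frames are over-counted; acceptable for a sketch
  };
  const a = shares(before);
  const b = shares(after);
  const names = new Set([...a.keys(), ...b.keys()]);
  return [...names]
    .map((name) => ({ name, delta: (b.get(name) ?? 0) - (a.get(name) ?? 0) }))
    .sort((x, y) => y.delta - x.delta); // biggest regressions first
}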

Here are a few examples of questions comparing merged profiles can answer:

  1. Which 3rd party scripts regressed pageload across all end users?
  2. Which React hooks and render logic regressed interaction performance across all end users?

Building Profile Merging and Comparison at Palette

At Palette, we’ve built out profile merging and comparison capabilities and call them Profile Aggregation and Profile Aggregate Comparison respectively.

[Figures: Profile Aggregation and Profile Aggregate Comparison in Palette]

These features are built on the merging and comparison concepts above, with advanced filtering capabilities layered on top.

[Figure: filtering a profile aggregate in Palette]

Filtering lets us drill into the functions running during a specific metric, on a certain page, and on certain versions of your app, unlocking even more specific performance questions, such as: “Which functions regressed our custom metric typing_lag in the last version of our app on the homepage?”

Palette allows the following filters:

| Rule | Description | Values |
| --- | --- | --- |
| Version | The version of your app | |
| Connection | The effective connection type of the session | slow-2g, 2g, 3g, 4g |
| Path | The page to filter metrics from | |
| Device | The device type of the session | Mobile, Desktop, Tablet |
| CPU Cores | The number of CPU cores of the device | |
| Memory | The amount of memory (GB) of the device | |
| Region | The region code | US, GB, FR, and more |
| Tag | The tag name and value set by tag | |
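
To give a feel for how these filters compose, here's a hypothetical sketch of filtering sessions by metadata before merging their profiles; the field names and session shape are illustrative, not Palette's actual schema:

// Keep only sessions whose metadata matches every active filter.
// Assume `sessions` is an array of { metadata, profile } collected from end users.
const matchesFilters = (metadata, filters) =>
  Object.entries(filters).every(([key, allowed]) => allowed.includes(metadata[key]));

// e.g. desktop sessions on 4g connections running a specific version
const filtered = sessions.filter((session) =>
  matchesFilters(session.metadata, {
    version: ['2.41.0'],  // hypothetical app version
    connection: ['4g'],   // e.g. from navigator.connection.effectiveType
    device: ['Desktop'],
  })
);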

Source Code Previews Using Source Maps

Palette allows uploading source maps to show unminified function names and source code, and even goes a step further by showing code diffs for regressed functions. Below is an example of a code diff that caused a typing performance regression in Notion:
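
As a sketch of the underlying mechanics, Mozilla's source-map library can map a minified frame's line and column (as reported in a Self-Profiling trace) back to the original source; the frame and output values here are illustrative:

// Map a minified { line, column } back to the original source position.
// (Watch the conventions: source-map expects a 1-based line and 0-based column.)
const { SourceMapConsumer } = require('source-map');

async function unminifyFrame(rawSourceMap, frame) {
  return SourceMapConsumer.with(rawSourceMap, null, (consumer) =>
    consumer.originalPositionFor({ line: frame.line, column: frame.column })
  );
  // e.g. => { source: "src/editor.ts", line: 120, column: 4, name: "applyTransaction" }
}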

[Figure: code diff that caused a typing performance regression in Notion]

Self-Profiling in the Wild — Notion Case Study

After building out Profile Aggregation and Comparison, Notion was one of Palette’s first large customers to use them in production.

By using Palette, Notion improved typing responsiveness, improved table interaction responsiveness, and fixed regressions to load and typing performance. Specifically, they were able to:

  • Reduce page load latency by 15-20%
  • Reduce typing latency by 15%
  • Identify the cause of 60% of regressions
  • Reduce time to resolution of some performance regressions by 3-4 weeks

Fixing Regressions to Document Loading

Notion measures document load performance with an internal custom metric called initial_page_render (IPR), and it often regresses because it’s highly sensitive to newly added initialization code. After integrating Palette and seeing a regression, the team created a Profile Aggregate Comparison to see which code regressed initial_page_render and saw the comparison flamegraph below:

[Figure: Profile Aggregate Comparison flamegraph for the initial_page_render regression]

Despite their codebase having grown past 1M lines of code, it was clear exactly which newly added code regressed document rendering. Here’s what one of their engineers said:

By using Profile Aggregate Comparison, I saw someone added some code that was supposed to be lazy loaded into our main bundle that gets executed on load. It stuck out like a sore thumb in the flame graph.

— Carlo Francisco (Performance Engineer at Notion)

To learn how they used Palette to fix other performance issues, read the full case study.

Conclusion

Profiling production with the Self-Profiling API is the next frontier of web performance, letting us understand end user performance issues that go undetected by the Chrome DevTools. Our north star for performance should be what’s slow for end users, not what’s slow locally, and production profiling enables exactly that.


Stop Guessing Why Your Frontend is Slow

Palette’s production JavaScript profiler and interaction performance metrics tell you why.

Palette is the most helpful tool we have for identifying the root cause of performance issues.

— John Ryan, Engineering Manager at Notion
Book a Demo