Facing the known unknowns and unknown unknowns


This post will sketch out what a Webpack 3 to 4 upgrade looks like in a large modern web application. We hope this is either an entertaining recollection or helpful for your own future upgrades.

Why upgrade?

Coursera has used Webpack in production for a few years now. Recently, we’ve been thinking about how to do code splitting more effectively in an effort to adhere to a performance budget.

As we were on Webpack 3, this put us in an uncomfortable position: CommonsChunkPlugin, the mechanism for extracting code shared by many split points into a common file, was removed in Webpack 4. We decided to upgrade to Webpack 4 rather than sink time into a removed plugin, while also hoping to pick up some of the new version's other benefits.

Just follow the instructions, right?

The Webpack documentation has a page dedicated to the upgrade from 3 to 4. Just follow the instructions and we’re done, right? Unfortunately not: in a large web application, some usages of Webpack are known unknowns and others are unknown unknowns. We needed to move both into the realm of known knowns to upgrade successfully.

Current State of the World

Before tackling the unknowns, here is the current state of the world. Coursera uses server-side rendering [SSR] to decrease the time to interaction, while supporting client-side rendering [CSR] for debugging and legacy use cases. The site is localized into eight languages, each with its own generated Webpack bundles. It consists of many single-page applications, each built with Webpack using our internal build system, Rapidos. Each single-page app is backed by an SSR server deployed as a service on Elastic Container Service, which serves requests coming in from our edge server.

In development mode, we use Webpack dev middleware along with a custom dev server to emulate the production edge server. The custom dev server forwards requests to a local copy of the SSR server to serve requests coming from developers’ laptops.

Coursera front-end infrastructure

Known Unknowns

In pursuing this upgrade, we had a few known unknowns that we needed to tackle one at a time. We call them “known unknowns” because, although we knew each case would break under Webpack 4, we did not know how to fix it. In each case, reading through the changelog, our own Webpack build code, and online resources unblocked us.

  • We had custom code that used the Webpack plugin API for injecting data into HTML templates. This post by the maintainer of ts-loader and this guide by the creator of Webpack were enough for us to understand what changes were necessary. Specifically, the plugin API now uses the hooks property to tap into Webpack compiler hooks or the developer’s custom hooks.

For us, this involved changing

compilation.plugin('html-webpack-plugin-after-html-processing', (htmlPluginData, callback) => {
  ...
});

to

compilation.hooks.htmlWebpackPluginAfterHtmlProcessing.tap('withInterpolatedTemplateContext', htmlPluginData => {
  ...
});
  • We used a variety of libraries that required bumping up versions to support Webpack 4 [e.g., html-webpack-plugin, happypack, webpack-dev-middleware …]. This required reading through the changelog or release notes of each library to find which version was Webpack 4 compatible.
  • Webpack 4 has a new mode field that applies certain optimizations by default. This post by Webpack founder Tobias Koppers gave us all the necessary information. For instance, we removed extraneous plugins like NoEmitOnErrorsPlugin.
  • We needed to migrate our old usage of CommonsChunkPlugin over to SplitChunksPlugin before investing more into code splitting. This required reading this gist describing the difference between the two plugins and then reading the docs. It took several rounds of reading and experimenting with the code before we understood the new concepts. Our old setting looked like this:
new webpack.optimize.CommonsChunkPlugin({
  name: 'app',
  async: true,
  children: true,
  minChunks: 3,
})

This translates to “create a single async common chunk that contains code common to 3+ async splits.” The equivalent SplitChunksPlugin setting, placed in config.optimization.splitChunks.cacheGroups, is:

{
  commons: {
    chunks: 'async',
    name: 'asyncCommonJS',
    minChunks: 3,
    // Ignores `minSize`, `maxSize`, and other defaults.
    enforce: true,
    priority: 0,
  },
  // The next 2 lines disable the 2 default `cacheGroups`, which are specified in
  // https://webpack.js.org/plugins/split-chunks-plugin/#optimization-splitchunks
  default: false,
  vendors: false,
}
  • Various Webpack internal files had been moved or renamed, so we needed to update our require paths to point at the correct files. For instance, require('webpack/schemas/webpackOptionsSchema.json') -> require('webpack/schemas/WebpackOptions.json')
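
Putting the mode and SplitChunksPlugin items together, a Webpack 4 config might be shaped roughly like the sketch below. It is an illustration only, not Coursera's actual config: the entry path, the output filename, and the isProduction flag are hypothetical, and only the commons cache group is shown.

```javascript
// webpack.config.js (illustrative sketch under the assumptions named above)

// Hypothetical flag; a real build may derive this differently.
const isProduction = process.env.NODE_ENV === 'production';

const config = {
  // `mode` is new in Webpack 4. 'production' switches on defaults such as
  // minification and `optimization.noEmitOnErrors`, so plugins like
  // NoEmitOnErrorsPlugin no longer need to be listed by hand.
  mode: isProduction ? 'production' : 'development',

  // Hypothetical entry and output, present only to make the shape complete.
  entry: { app: './src/index.js' },
  output: { filename: '[name].js' },

  // Note the singular key: `optimization`, not `optimizations`.
  optimization: {
    splitChunks: {
      cacheGroups: {
        // Replaces the old CommonsChunkPlugin instance: one async chunk
        // holding code shared by 3 or more async split points.
        commons: {
          chunks: 'async',
          name: 'asyncCommonJS',
          minChunks: 3,
          enforce: true,
          priority: 0,
        },
      },
    },
  },
};

module.exports = config;
```

Keeping the split-chunk settings under optimization, rather than in the plugins array, is what lets mode supply sensible defaults for everything that is not overridden.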

Unknown Unknowns

A variety of things that we classify as unknown unknowns surprised us in the process of upgrading. Unknown unknowns pop up in every upgrade: it is our job to understand the root cause so we can apply fixes from a position of knowledge.

JSON modules

By default, Webpack 4 treats files with a .json extension as JSON modules:

Source from Webpack 4 changelogs

This turned out not to play nicely with our usage of bundle-loader, which we used for lazy loading JSON. See this issue for more details. The fix is to force files with .json extensions to use json-loader, which does work with bundle-loader:

{
  test: /\.json$/,
  loader: 'json-loader',
  type: 'javascript/auto',
}
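
To show where such a rule lives and which files it picks up, here is a small sketch. The surrounding config object and the sample file names are made up for illustration. The pattern escapes the dot so that only a literal .json suffix matches.

```javascript
// Illustrative rule; the rest of the config is assumed to exist elsewhere.
const jsonRule = {
  // Escaped dot: match a literal ".json" suffix, not "any character + json".
  test: /\.json$/,
  loader: 'json-loader',
  // Opt out of Webpack 4's built-in JSON handling so that the loader's
  // JavaScript output is parsed as a regular module.
  type: 'javascript/auto',
};

// Rules like this one go under `module.rules` in the Webpack config.
const config = { module: { rules: [jsonRule] } };

// Hypothetical file names, to show which paths the rule applies to.
const samples = ['locales/en.json', 'data.json5', 'src/myjson', 'package.json'];
const matched = samples.filter(name => jsonRule.test.test(name));

console.log(matched); // only the paths that end in ".json"
```

With an unescaped dot, a path such as src/myjson would also match, because the dot would stand for any character.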

CSS Bundling

We previously used extract-text-webpack-plugin to bundle all our CSS into a single file. This library is no longer recommended for Webpack 4 CSS support [see here and here]. Since other folks had reported success using it with Webpack 4, we tried it anyway. This mostly worked, except that the CSS import order was no longer respected. Unfortunately, we rely on the cascading nature of CSS for style overrides in certain cases, so the CSS ordering has to be preserved.

This issue was a common theme with extract-text-webpack-plugin [e.g., see here], so we bit the bullet and migrated to mini-css-extract-plugin.

The config changes are as follows:

new ExtractTextPlugin({
  filename: '[name].[chunkhash].css',
  allChunks: true,
  disable: isDev(),
})

is now

new MiniCssExtractPlugin({
  filename: '[name].[chunkhash].css',
})

along with the following option in optimization.splitChunks.cacheGroups:

{
  name: 'allStyles',
  test: (m: Module) => m.constructor.name === 'CssModule',
  chunks: 'all',
  enforce: true,
  priority: 1,
}

We needed to use the test property to test on CssModule rather than looking for the .css extension because we use Stylus in our codebase.
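
The test predicate can be tried in isolation. In the sketch below, the two classes are stand-ins for Webpack's module objects (the real CssModule class is defined by mini-css-extract-plugin), and the label field and file names are hypothetical. Because the check looks at the class of the module rather than the file name, a stylesheet authored in Stylus is grouped the same way as a plain .css file.

```javascript
// Same predicate as the cache group's `test`, without the type annotation.
const isCssModule = m => m.constructor.name === 'CssModule';

const allStylesGroup = {
  name: 'allStyles',
  test: isCssModule,
  chunks: 'all',
  enforce: true,
  priority: 1,
};

// Stand-in classes (assumptions for this demo, not Webpack's real classes).
class CssModule {
  constructor(label) {
    this.label = label; // hypothetical field used only for printing
  }
}

class NormalModule {
  constructor(label) {
    this.label = label;
  }
}

const modules = [
  new CssModule('button.styl'), // extracted from a Stylus source
  new NormalModule('button.js'),
  new CssModule('reset.css'),
];

// Only the extracted stylesheets pass the predicate, whatever their extension.
const grouped = modules.filter(allStylesGroup.test).map(m => m.label);
console.log(grouped);
```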

After this change, we also hit this issue: the plugin generates an additional JavaScript file that takes over the original entrypoint. We modified our usage of html-webpack-include-assets-plugin to include the additional JavaScript file as well, which fixed module loading.

Larger stats.json output

We generate a Webpack stats.json file with every Webpack build, because Rapidos uses the list of bundled files both for visualization with webpack-bundle-visualizer and for understanding the transitive dependencies of each app. After the upgrade we got errors like RangeError: Invalid string length, which a search revealed to be caused by serializing a JSON object into a string that exceeds the maximum string length. The root cause was fairly obscure: it looks like Webpack 4 refactored the stats object format, which made the object a lot larger by default [source]. The source of this increase, issuerPath, is not configurable. After some experimenting, we changed our StatsPlugin configuration to the following:

modules: true,
chunks: true,
timings: false,
source: false,
reasons: true,

which still gave us the relevant information while keeping the file small enough to be produced.
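
As a rough sketch of how options like these shape the output, Webpack's Stats object accepts them through its toJson method. The helper name and the fake stats object below are hypothetical, and the StatsPlugin we use may wire things differently internally.

```javascript
// The options from the text, gathered into one object.
const statsOptions = {
  modules: true,
  chunks: true,
  timings: false,
  source: false, // leave module source text out of the output
  reasons: true,
};

// Hypothetical helper: `stats` is expected to expose toJson(options),
// as the Stats object from a Webpack build does.
function serializeStats(stats) {
  return JSON.stringify(stats.toJson(statsOptions));
}

// Fake stand-in for a real Stats object, so the sketch runs without Webpack.
const fakeStats = {
  toJson(options) {
    const entry = { name: './src/index.js' };
    if (options.source) {
      entry.source = 'console.log("hello")';
    }
    return { modules: options.modules ? [entry] : undefined };
  },
};

console.log(serializeStats(fakeStats));
```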

Webpack-multi-output and undocumented hooks usage

We use webpack-multi-output to generate multiple outputs for a single configuration. Each chunk that passes through this plugin may generate multiple chunks, one per locale. The Webpack chunk loading script then needs to be modified to make things work. In previous versions, we made use of an undocumented hook to do this:

Webpack 3 jsonp script hack for webpack-multi-output

Access to this hook is no longer allowed in Webpack 4, so we had to use the mainTemplate.hooks.render hook along with an undocumented stage flag to make this work. It now looks like this:

webpack-multi-output jsonp script hack for Webpack 4

We want to stop using webpack-multi-output in the near future: this hack lowers our bus factor, makes future upgrades hard, and disempowers non-experts who want to make changes.

What did we get out of this upgrade?

  • Build times are faster. In lazy loading cases, we’ve seen incremental compilation of one of our apps go from ~10s in Webpack 3 to ~2.5s in Webpack 4. Another app went from ~30s to ~20s. In production bundling, we saw a 10–30% decrease in the cold cache case, and a ~50–60% decrease with Uglify cache.
  • Bundle sizes are roughly identical.
  • We’ve reduced our dependency on deprecated, unmaintained, or outdated plugins.
  • We are now confident building on top of Webpack — through reading the source code of Webpack, keeping up with recent developments, and having a better mental image of Webpack internals [e.g., chunk graph algorithm, split chunks plugin], we now feel empowered to take full advantage of Webpack.

That’s all — we hope you enjoyed following along with Coursera’s journey upgrading from Webpack 3 to Webpack 4. We went through trials and tribulations, but think the final outcome was worth the effort.


