Migrating from Underscore to Lodash


The core Dropbox web application is 10 years old and used by millions of users per day. Hundreds of front-end engineers across multiple cities actively work on it. Unsurprisingly, our codebase is very large and somewhat irregular. Recently written parts have thorough test coverage, while other parts haven’t been updated in years.

Over the past two years we’ve worked to modernize our front-end stack. We’ve successfully moved from CoffeeScript to TypeScript, from jQuery to React, and from a custom Flux implementation to Redux. Having completed these migrations we identified our utility library, Underscore, as one more candidate for migration.

When we began our research, Underscore hadn’t seen an update in three years, and newer developers were hesitant to use a library that looked unmaintained. We wanted a utility library they could rely on.

Benefits of Lodash

Lodash is a utility library composed of many individual modules. It can be used as a complete library, as individual modules, or as a custom build consisting of just the necessary functions. It’s the single most used utility library on the web, and as a result is extremely battle tested. It heavily optimizes for front-end CPU performance in a way that Underscore doesn’t. For example, Lodash is implemented to take advantage of JIT in JavaScript engines.

It also offers new features that promote functional programming. For example, it’s well suited for building a functional selector layer between React and Redux, two technologies we use in our front-end codebase. Finally, Lodash is actively maintained, which is critical to long-term support of the library.

We wanted to use a strategic migration approach. By gathering consensus from our internal community, doing research first, and constructing a bespoke build for our environment before migrating our entire codebase, we hoped to avoid serious problems.

Getting alignment with a Web Enhancement Proposal

At Dropbox, we use a lightweight but formal proposal process to align on technology changes. Web Enhancement Proposals (WEPs) are based on Python’s PEPs. They let developers debate the pros and cons of universal changes and reach consensus before making codebase changes that will affect many developers and users.

Given the size of our codebase and the number of users that could be affected, creating a WEP for the migration was a natural first step. We identified the goal of creating a minimized, custom Lodash bundle that we can heavily cache and use throughout our primary web application, deprecating support for Underscore.js and migrating all currently used instances to Lodash. Over 100 engineers interacted with the document. We addressed concerns from front-end teams without too much heavy bureaucracy.

Doing the research first

Because we wanted a custom Lodash build with just the necessary functions, we needed to create a list of those functions.

We looked at how Underscore was being used in our codebase. We also looked at cases where JavaScript has evolved enough to permit using native solutions. Using this data, we put together a list of Lodash functions we should use. It wasn’t exhaustive, but it got us 90% of the way there.

We also needed to get Lodash playing nicely with our toolchain. We use Bazel—an open-source, extensible, scalable build tool—to coordinate our entire build process. Static typing is enforced with TypeScript. Bazel optimizes for a deterministic build process, but doesn’t have any built-in support for tree shaking JavaScript. For the unfamiliar, tree shaking is a process by which unused code is eliminated from a bundle.

Choosing the right tools

We needed to pick the right tool to produce a custom Lodash build with our chosen set of functionality and an accurate subset of TypeScript types. Initially we expected lodash-cli to provide the support we needed. Unfortunately, lodash-cli doesn’t have any notion of types, does a poor job of tree shaking, and is being deprecated with Lodash 5.0.0 in favor of bundling with Webpack.

Next we tried Rollup and Webpack, two popular JavaScript build/bundling tools. We were specifically interested in plugins that would allow us to minimize our bundle size as much as possible. The lodash-webpack-plugin and lodash-ts-imports-loader are both important for reducing bundle size. Webpack was the clear winner; it has much better plugin support for what we were building.

Building the bundle

Our first attempt was a single build that produced a pre-minified Lodash library and a TypeScript typings file. We had configured Webpack to produce an index.ts file that imported the Lodash functions we wanted and then reexported them for consumption by our web app.

With that config we were able to produce a bundle size we were happy with (12k). However, we quickly noticed that the generated typings file wasn’t going to play nicely with our actual codebase.

We had assumed the build would produce a single lodash.d.ts file containing all of the types we needed within the file. This assumption was false. Instead, the build created a file that imported functions from individual modules and reexported them. It didn’t encapsulate everything under a single Lodash package so much as output a series of unlinked functions. The lodash-ts-imports-loader plugin converts from the input to the output formats below:

Input

import {after, get} from 'lodash';

Output

import after from 'lodash/after';
import get from 'lodash/get';

This is great for tree shaking, but not so great for typing. While it produced a small JavaScript bundle, the types file attempted to import from individual Lodash modules. This didn’t work for us because developers needed to import from a single Lodash module. We wanted this both for developer experience and so we could serve it separately and trust browsers to cache it.
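To make the rewrite concrete, here is a minimal sketch of the kind of transformation involved. It is a simplified, regex-based stand-in and not the actual implementation of lodash-ts-imports-loader; the function name rewriteLodashImports is hypothetical, and aliased imports (`a as b`) are not handled.

```typescript
// Simplified stand-in for an import-rewriting loader (hypothetical helper,
// not the real lodash-ts-imports-loader implementation).
function rewriteLodashImports(source: string): string {
  // Matches named imports from the top-level 'lodash' package only.
  const namedImport = /import\s*\{([^}]+)\}\s*from\s*['"]lodash['"];?/g;
  return source.replace(namedImport, (_match: string, names: string) =>
    names
      .split(',')
      .map((name) => name.trim())
      .filter((name) => name.length > 0)
      // One default import per function, each from its own module path.
      .map((name) => `import ${name} from 'lodash/${name}';`)
      .join('\n')
  );
}

const rewritten = rewriteLodashImports("import {after, get} from 'lodash';");
console.log(rewritten);
// import after from 'lodash/after';
// import get from 'lodash/get';
```

Because each function now comes from its own module, a bundler can drop everything that is never imported. The typings generated from this rewritten source inherit the same per-module imports.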

The solution was splitting our build into a two-stage process. The first build created a typings file as though we weren’t doing any module splitting/tree shaking. The second build produced a properly treeshaken bundle. From this build we got a minified Lodash bundle and a typings file that looked like this:

import {after, ...} from 'lodash-full';
export {after, ...};

This build had a minor tradeoff in that we needed to check in a full version of the Lodash typings called lodash-full.d.ts.

This allowed our developers to write import * as lodash from 'lodash'; and not worry about how it was built. It also provided a firewall of sorts: attempting to use a Lodash function that we didn’t include in our bundle caused a type error.
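The firewall effect can be sketched in a few lines of TypeScript. This is an illustration under assumptions, not our actual typings: fullLibrary is a hypothetical stand-in for the complete library, and the choice of get and head as the bundled functions is arbitrary.

```typescript
// Stand-in for the full library surface (hypothetical; not real Lodash code).
const fullLibrary = {
  get: (obj: Record<string, unknown>, key: string): unknown => obj[key],
  head: <T>(items: T[]): T | undefined => items[0],
  size: (items: unknown[]): number => items.length,
};

// The custom typings expose only the functions chosen for the bundle.
type CustomLibrary = Pick<typeof fullLibrary, 'get' | 'head'>;

const lodash: CustomLibrary = {
  get: fullLibrary.get,
  head: fullLibrary.head,
};

// Included functions type-check and run as usual.
const firstItem = lodash.head([1, 2, 3]);

// @ts-expect-error 'size' is not declared in the custom typings.
const sizeReference = lodash.size;
```

The compiler rejects the last line unless the error is explicitly expected, so a missing function is caught during the build and not in the browser.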

Integrating with Bazel

As soon as Webpack produced a working Lodash bundle and typings file, we turned to integrating Webpack with Bazel. We created a Bazel build file which provides rules for building both the tree-shaken bundle and the typings:

package(default_visibility = ['//visibility:public'])

load('//build_tools/static_build:js.bzl', 'dbx_javascript_module_name')
load('//build_tools/static_build:node.bzl', 'dbx_webpack_build')

dbx_javascript_module_name(
    name = 'lodash-custom',
    src = ':lodash-custom-bundle',
    module_name = 'external/lodash',
)

extra_config_srcs = [
    'instructions.js',
]

lodash_srcs = [
    'lodash-custom.ts',
    'tsconfig.json',
]

lodash_deps = [
    '//build_tools/node/bazel-utils',
    '//npm/at_types/lodash',
    '//npm/lodash',
    '//npm/lodash-ts-imports-loader',
    '//npm/lodash-webpack-plugin',
    '//npm/ts-loader',
    '//npm/typescript',
    '//npm/uglifyjs-webpack-plugin',
]

dbx_webpack_build(
    name = 'lodash-custom-bundle',
    srcs = lodash_srcs,
    outs = [
        'lodash.js',
    ],
    config = 'treeshaking.config.js',
    extra_config_srcs = extra_config_srcs,
    deps = lodash_deps,
)

dbx_webpack_build(
    name = 'lodash-custom-typings',
    srcs = lodash_srcs,
    outs = [
        'lodash-custom.d.ts',
    ],
    config = 'typings.config.js',
    extra_config_srcs = extra_config_srcs,
    deps = lodash_deps,
)

Everything worked in development mode and we were very happy.

Problems with minification and source maps

The rest of our build process didn’t work well with a Webpack-produced minified file in place of the original source. In addition, the Webpack source maps didn’t integrate well with the rest of our Bazel-generated source maps. As a simple fix, we reconfigured Uglify to produce an unminified but tree-shaken file. We then let the rest of our toolchain handle minification and source maps.

We should note that the unminified Webpack bundle came with a fair amount of Webpack cruft: comments, dependency resolution, and other snippets Webpack needs to do its job. However, we decided that letting Bazel minify and remove most of this was a reasonable tradeoff to make.

The last issue we ran into involved LodashModuleReplacementPlugin. As we began the actual migration process, we realized we were stripping important functionality that some of our Underscore-dependent code was using. For example, we were missing Lodash shorthands, which allow us to call a function like keyBy with just a property name string instead of a function. We tweaked the plugin’s options in our config to keep these features, and we were able to move on.
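For readers unfamiliar with shorthands, the sketch below shows the two equivalent call styles. To stay self-contained it uses a small hypothetical stand-in for keyBy and does not import Lodash; the real library resolves the string form internally, which is the behavior the plugin’s shorthands option preserves.

```typescript
type Row = Record<string, unknown>;
type Iteratee<T> = string | ((item: T) => string);

// Hypothetical stand-in for keyBy that accepts both iteratee forms.
function keyBy<T extends Row>(items: T[], iteratee: Iteratee<T>): Record<string, T> {
  const result: Record<string, T> = {};
  for (const item of items) {
    // A string iteratee is shorthand for "read this property from the item".
    const key = typeof iteratee === 'string' ? String(item[iteratee]) : iteratee(item);
    result[key] = item;
  }
  return result;
}

const users = [
  { id: 'a1', name: 'Ada' },
  { id: 'b2', name: 'Grace' },
];

// Shorthand form and explicit function form produce the same index.
const byIdShorthand = keyBy(users, 'id');
const byIdFunction = keyBy(users, (user) => user.id);
```

Code written against Underscore often relies on the string form, so stripping shorthand support breaks such calls.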

Here are the final Webpack config files:

Typings Webpack config:

// This is separate from treeshaking.config.js because we want to create lodash typings
// out of the original source, not the rewritten source.
// lodash-ts-imports-loader rewrites lodash imports for tree shaking.
// Source: import { get } from 'lodash'
// Output: import * as get from 'lodash/get'.
// This is undesirable for our typings file so we separate the builds
const webpack = require('webpack');
const LodashModuleReplacementPlugin = require('lodash-webpack-plugin');
const bazelUtils = require('bazel-utils');
const buildEnv = bazelUtils.initBazelEnv(__dirname);
module.exports = {
  entry: {app: './lodash-custom.ts'},
  output: {
    // Note: bazel will throw out this file for us.
    filename: 'garbage.js',
    path: buildEnv.outputRoot,
    library: 'external/lodash',
    libraryTarget: 'amd',
  },
  module: {
    rules: [
      {
        test: /\.ts$/,
        use: {
          loader: 'ts-loader',
          options: {
            compilerOptions: {
              declaration: true,
            },
          },
        },
      },
    ],
  },
  plugins: [
    new webpack.DefinePlugin({
      NODE_ENV: JSON.stringify('production'),
    }),
  ],
  resolve: {
    symlinks: false,
  },
  resolveLoader: {
    symlinks: false,
  },
};

Treeshaking Webpack config:

const webpack = require('webpack');
const path = require('path');
const LodashModuleReplacementPlugin = require('lodash-webpack-plugin');
const UglifyJsPlugin = require('uglifyjs-webpack-plugin');
const TypingsInstructions = require('./instructions');
const bazelUtils = require('bazel-utils');
const buildEnv = bazelUtils.initBazelEnv(__dirname);

module.exports = {
  entry: {app: './lodash-custom.ts'},

  output: {
    filename: 'lodash.js',
    path: buildEnv.outputRoot,
    library: 'external/lodash',
    libraryTarget: 'amd',
  },

  module: {
    rules: [
      // Rewrites lodash imports for tree shaking.
      // Source: import { get } from 'lodash'
      // Output: import * as get from 'lodash/get'.
      {
        test: /\.ts$/,
        loader: 'lodash-ts-imports-loader',
        enforce: 'pre',
      },
      {
        test: /\.ts$/,
        use: {
          loader: 'ts-loader',
          options: {
            compilerOptions: {
              // We generate the typings file in typings.config.js.
              // We separate these because the typings file should use the non tree shaken version of
              // the imports.
              declaration: false,
            },
          },
        },
      },
    ],
  },

  plugins: [
    new webpack.DefinePlugin({
      NODE_ENV: JSON.stringify('production'),
    }),
    new LodashModuleReplacementPlugin({
      collections: true,
      cloning: true,
      flattening: true,
      memoizing: true,
      metadata: true,
      paths: true,
      shorthands: true,
      unicode: true,
    }),
    new UglifyJsPlugin({
      uglifyOptions: {
        compress: false,
        mangle: false,
        output: {
          beautify: true,
          comments: true,
        },
      },
      sourceMap: false,
    }),
  ],

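  resolve: {
    // Single resolve block: a repeated key would silently override the earlier one.
    extensions: ['.js', '.ts', '.json'],
    symlinks: false,
  },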

  resolveLoader: {
    symlinks: false,
  },
};

It was finally time for the big scary part: migrating the codebase. Everything until now had been implemented in a vacuum; we hadn’t yet gone out and touched anyone else’s code.

Previously, our web platform team had automated a migration of our codebase while moving from CoffeeScript to TypeScript. We leveraged that expertise and used Codemod to help with this transition.

First we needed a thorough list of Underscore functions in use. We discovered six different patterns for using Underscore and had to grep for each of them.

import {filter} from 'external/underscore';

// Pattern 1 - function imported directly
filter(data, func);


import * as _ from 'external/underscore';

// Pattern 2 - underscore imported as _
_.filter(data, func);


import * as $u from 'external/underscore';

// Pattern 3 - underscore imported as $u
$u.filter(data, func);

// Pattern 4 - type assertion
($u as any).contains(data, func);

// Pattern 5 - object oriented style
$u(data).filter(func)

// Pattern 6 - chain
$u.chain(data).filter(func).sortBy(sortFunc).value();
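As a rough illustration of how such a search could be scripted, the sketch below classifies a line of source against the six patterns and tallies the results. The regexes are hypothetical approximations and not the expressions we actually grepped for. Real code would also need to handle other aliases, multi-line calls, and comments.

```typescript
// Hypothetical regexes approximating the six usage patterns. Order matters:
// more specific patterns (chain, object-oriented) are checked first.
const usagePatterns: Array<[string, RegExp]> = [
  ['P6 chain', /\$u\.chain\(/],
  ['P5 object-oriented', /\$u\([^)]*\)\.\w+\(/],
  ['P4 type assertion', /\(\$u as any\)\.\w+\(/],
  ['P3 $u namespace', /\$u\.\w+\(/],
  ['P2 _ namespace', /(^|[^\w$])_\.\w+\(/],
  ['P1 direct import', /import\s*\{[^}]+\}\s*from\s*'external\/underscore'/],
];

// Returns the label of the first matching pattern, or null if none match.
function classifyLine(line: string): string | null {
  for (const [label, pattern] of usagePatterns) {
    if (pattern.test(line)) return label;
  }
  return null;
}

// Counts how many lines fall under each pattern.
function tallyPatterns(lines: string[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const line of lines) {
    const label = classifyLine(line);
    if (label !== null) counts[label] = (counts[label] ?? 0) + 1;
  }
  return counts;
}
```

Note that Pattern 1 can only be detected at the import statement, since the call site is a bare function name.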

We then constructed a table that looked like this:

Total  Function   Counts by pattern (in column order P1, P2 + P3, P4, P5, P6; empty cells omitted)
27     findWhere  25, 1, 1
5      first      5
16     flatten    8, 2, 6
0      foldl
0      foldr
11     forEach    11
1      functions  1
7      groupBy    4, 1, 2

We took our list of functions and created a list of mappings for our codemods. We separated native and lodash replacements and added notes for nuances. The list now looked like this:

Native replacements: contains

_.contains(list, value, [fromIndex]) => list.includes(value, [fromIndex])

Lodash replacements: countBy

_.countBy(list, iteratee, [context]) => lodash.countBy(list, iteratee)

You can see the entirety of this research here: Underscore Replacements.

We tested all of these by putting together a simple script that asserted equality.

https://www.dropbox.com/s/8xptvxghqz8210r/replacements.js
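The linked script is the authoritative version. As a minimal sketch of the idea, the harness below runs a legacy function and its proposed replacement over the same inputs and reports any differences. legacyContains is a hypothetical stand-in for the Underscore call, and findMismatches is an illustrative helper name. The example assumes an ES2016 or later target so that Array.prototype.includes is available.

```typescript
// Hypothetical stand-in for the legacy helper being replaced.
function legacyContains<T>(list: T[], value: T): boolean {
  return list.indexOf(value) !== -1;
}

// Runs both implementations on every input and collects any differences.
function findMismatches<Args extends unknown[], Result>(
  name: string,
  legacy: (...args: Args) => Result,
  replacement: (...args: Args) => Result,
  inputs: Args[],
): string[] {
  const failures: string[] = [];
  for (const args of inputs) {
    const expected = JSON.stringify(legacy(...args));
    const actual = JSON.stringify(replacement(...args));
    if (expected !== actual) {
      failures.push(`${name}(${JSON.stringify(args)}): ${expected} !== ${actual}`);
    }
  }
  return failures;
}

const containsMismatches = findMismatches<[number[], number], boolean>(
  'contains',
  legacyContains,
  (list, value) => list.includes(value),
  [
    [[1, 2, 3], 2],
    [[1, 2, 3], 5],
    [[], 1],
  ],
);
console.log(containsMismatches.length === 0 ? 'contains: ok' : containsMismatches.join('\n'));
```

Comparing serialized results keeps the harness generic, at the cost of missing differences that JSON cannot represent, such as undefined versus a missing key.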

We made a point of migrating all of our application code first and migrating the test code separately. That allowed us to run the migrated application code against the unmigrated tests, giving us confidence that we hadn’t introduced matching bugs into both at the same time, which would have hidden a problem.

Splitting the work

The actual codemods were mostly simple bash scripts that converted from one usage to another. We ran these on the entire codebase and then separated the changes into roughly ten different diffs (pull requests) organized by codebase ownership. This made it easier for us to track down owners for review.
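Our codemods were bash scripts, which aren’t reproduced here. To give a flavor of the approach, the following is a hypothetical TypeScript equivalent for a single mapping, contains to includes. It only rewrites simple single-line calls and leaves anything more complex untouched for manual review.

```typescript
// Hypothetical codemod: rewrites simple calls of the form
//   _.contains(list, value)  or  $u.contains(list, value)
// into list.includes(value). Calls with nested parentheses in their
// arguments do not match and are left for a human to convert.
function codemodContains(source: string): string {
  const simpleCall =
    /(^|[^\w$])(?:_|\$u)\.contains\(\s*([\w$.]+)\s*,\s*([\w$.'"]+)\s*\)/g;
  // $1 keeps the character before the call, $2 is the list, $3 is the value.
  return source.replace(simpleCall, '$1$2.includes($3)');
}

console.log(codemodContains('if (_.contains(ids, id)) {}'));
// if (ids.includes(id)) {}
```

Keeping the pattern deliberately narrow means the script may skip some convertible calls, but it avoids silently producing wrong code.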

We also manually migrated code that was too complicated to convert automatically. Because we had already split the diffs by ownership, it was easy for us to show stakeholders any code we had manually migrated or that needed extra attention. For example, anything written with $u.chain syntax needed to be converted by hand. Functions like object, has, create, matches, and template didn’t lend themselves to being automatically migrated to their Lodash or native equivalents. We also noted that forEach behaved differently for objects and arrays, so we needed to be diligent here. Finally, we weren’t able to codemod imports of individual functions from Underscore.

We spent a diligent week with these ten diffs open, constantly testing them, and rereading them with a thorough eye. We didn’t want to introduce any regressions with this migration.

After testing, review, and gathering sign-offs, we started landing the diffs. To minimize the time any page spent importing both Underscore and Lodash, we landed diffs for related features together. This took a couple of days, and then we were technically done.

Exactly one bug

After all this we had exactly one bug. There was a manual conversion on an internal tool that used splice where we should have used slice. We got a quick fix out and no external users were impacted.
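The offending code isn’t shown here, but the difference between the two methods is easy to demonstrate in general terms. The arrays below are made up for illustration.

```typescript
const original = ['a', 'b', 'c', 'd'];

// slice(start, end) returns a copy of the range and leaves the source untouched.
const sliced = original.slice(1, 3);

// splice(start, deleteCount) removes elements from the array in place
// and returns the removed elements.
const mutated = ['a', 'b', 'c', 'd'];
const removed = mutated.splice(1, 2);

console.log(sliced, original); // [ 'b', 'c' ] [ 'a', 'b', 'c', 'd' ]
console.log(removed, mutated); // [ 'b', 'c' ] [ 'a', 'd' ]
```

Both calls return the same two elements here, which is why a mix-up can pass a casual check: the difference only shows up in what happens to the source array, and in the meaning of the second argument.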

This was an involved process for what initially seemed like a straightforward migration. Using a lightweight proposal format before beginning work meant we were able to address the concerns of our users beforehand. Yet, some of our assumptions were wrong, and some tools didn’t do what we thought they would. Our requirements for a custom build resulted in extra steps. In the end, our approach of experimenting and testing first led to a nearly zero-problem migration. However, we knew we would have to actually teach the tool to make sure it gets used. We organized internal tech talks on functional programming with Lodash and showcased a Lodash “function of the week” in our internal frontend newsletter.

We hope our research will make this process easier for the next person who attempts to migrate a large 10-year-old codebase from one library to another.


