Yarn

What is Yarn?

Yarn caches every package it downloads, so it never needs to download the same package again. It also parallelizes operations to maximize resource utilization, which keeps install times short.

Safe, stable, reproducible projects

Yarn is a package manager that doubles as a project manager. Whether you work on one-shot projects or large monorepos, as a hobbyist or an enterprise user, we’ve got you covered.

The node_modules problem

The way installs used to work was simple: when running yarn install, Yarn would generate a node_modules directory that Node was then able to consume thanks to its built-in Node Resolution Algorithm. In this context, Node didn’t have to know the first thing about what a “package” was: it only reasoned in terms of files. “Does this file exist here? No? Let’s look in the parent node_modules then. Does it exist here? Still no? Too bad…”, and it kept going until it found the right one. This process was vastly inefficient, for several reasons:

  • The node_modules directories typically contained a gargantuan number of files. Generating them could account for more than 70% of the time needed to run yarn install. Even having preexisting installations wouldn’t save you, as package managers still had to diff the existing node_modules against what it should have been.

  • Because the node_modules generation was an I/O-heavy operation, package managers didn’t have a lot of leeway to optimize it much further than just doing a simple file copy – and even though we could have used hardlinks or copy-on-write when possible, we would still have needed to diff the current state of the filesystem before making a bunch of syscalls to manipulate the disk.

  • Because Node had no concept of packages, it also didn’t know whether a file was meant to be accessed (versus being available by the sheer virtue of hoisting). It was entirely possible that the code you wrote worked one day in development but broke later in production because you forgot to list one of your dependencies in your package.json.

  • Even at runtime, the Node resolution had to make a bunch of stat and readdir calls to figure out where to load every single required file from. It was extremely wasteful, and was part of why booting Node applications took so much time.

  • Finally, the very design of the node_modules folder was impractical in that it didn’t allow package managers to properly dedupe packages. Even though some algorithms could be employed to optimize the tree layout (hoisting), we still ended up unable to optimize some particular patterns – causing not only the disk usage to be higher than needed, but also some packages to be instantiated multiple times in memory.
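
The upward walk described above can be sketched as follows. This is a simplified illustration, not Node's actual implementation: it assumes POSIX-style absolute paths, ignores core modules and file extensions, and the names nodeModulesCandidates, findPackage, and the exists callback are hypothetical helpers introduced only for this example.

```typescript
// Simplified sketch: list the node_modules directories that would be probed,
// starting next to the requiring file and walking up to the filesystem root.
// Assumes POSIX-style absolute paths such as "/home/app/src".
function nodeModulesCandidates(fromDir: string): string[] {
  const parts = fromDir.split("/").filter((p) => p.length > 0);
  const candidates: string[] = [];
  for (let i = parts.length; i >= 0; i--) {
    // A directory that is itself named node_modules is skipped, so
    // ".../node_modules/node_modules" is never probed.
    if (i > 0 && parts[i - 1] === "node_modules") continue;
    const prefix = "/" + parts.slice(0, i).join("/");
    candidates.push((prefix === "/" ? "" : prefix) + "/node_modules");
  }
  return candidates;
}

// Hypothetical lookup: return the first candidate location where the
// package exists, or null once the root has been reached without a match.
// `exists` stands in for the filesystem checks made at runtime.
function findPackage(
  name: string,
  fromDir: string,
  exists: (path: string) => boolean,
): string | null {
  for (const dir of nodeModulesCandidates(fromDir)) {
    const candidate = `${dir}/${name}`;
    if (exists(candidate)) return candidate;
  }
  return null;
}
```

Each candidate in the list costs at least one filesystem check per required package, which is the runtime overhead the list above refers to. Note also that nothing in this walk checks whether the requiring package declared the dependency: any package that happens to sit in a parent node_modules is found.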

Fixing node_modules

When you think about it, Yarn already knows everything there is to know about your dependency tree – it even installs it on the disk for you. So the question becomes: why do we leave it to Node to locate the packages? Why don’t we simply tell Node where to find them, and inform it that any require call to X by Y was meant to access the files from a specific set of dependencies? It’s from this postulate that Plug’n’Play was created.

In this install mode (now the default starting from Yarn v2), Yarn generates a single .pnp.js file instead of the usual node_modules. Instead of containing the source code of the installed packages, the .pnp.js file contains a map linking a package name and version to a location on the disk, and another map linking a package name and version to its set of dependencies. Thanks to this efficient system, Yarn can tell Node exactly where to look for files being required – regardless of who asks for them!
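
As a rough illustration of the two maps described above, here is a minimal sketch. It is not the real .pnp.js format or API: the "name@version" key shape, the cache paths, the sample packages, and the resolveRequest function are all assumptions made for this example.

```typescript
// Illustrative data model only; NOT the real .pnp.js format or API.
type Locator = string; // hypothetical "name@version" key

// Map 1: package name and version -> location on disk (paths are made up).
const packageLocations = new Map<Locator, string>([
  ["my-app@1.0.0", "/project/"],
  ["lodash@4.17.21", "/cache/lodash-4.17.21/"],
  ["left-pad@1.3.0", "/cache/left-pad-1.3.0/"],
]);

// Map 2: package name and version -> its declared dependencies (name -> version).
const packageDependencies = new Map<Locator, Map<string, string>>([
  ["my-app@1.0.0", new Map<string, string>([["lodash", "4.17.21"]])],
  ["lodash@4.17.21", new Map<string, string>()],
  ["left-pad@1.3.0", new Map<string, string>()],
]);

// Resolve a require call made by `issuer` for `request` using only the maps.
// No directory walk is needed, and an undeclared dependency fails loudly
// even when the package is present on disk.
function resolveRequest(issuer: Locator, request: string): string {
  const deps = packageDependencies.get(issuer);
  if (deps === undefined) {
    throw new Error(`Unknown package: ${issuer}`);
  }
  const version = deps.get(request);
  if (version === undefined) {
    throw new Error(`${issuer} does not list ${request} as a dependency`);
  }
  const location = packageLocations.get(`${request}@${version}`);
  if (location === undefined) {
    throw new Error(`No location recorded for ${request}@${version}`);
  }
  return location;
}
```

In this sketch, resolveRequest("my-app@1.0.0", "lodash") returns the recorded lodash location, while resolveRequest("my-app@1.0.0", "left-pad") throws because my-app never declared left-pad, even though left-pad has a location on disk.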

This approach has various benefits:

  • Since we only need to generate a single text file instead of tens of thousands, installs are now pretty much instantaneous – the main bottleneck becomes the number of dependencies in your project rather than your disk performance.

  • Installs are more stable and reliable due to reduced I/O operations, which are prone to fail (especially on Windows, where writing and removing files in batch may trigger various unintended interactions with Windows Defender and similar tools).

  • Perfect optimization of the dependency tree (aka perfect hoisting) and predictable package instantiations.

  • The generated .pnp.js file can be committed to your repository as part of the Zero-Installs effort, removing the need to run yarn install in the first place.

  • Faster application startup, because the Node resolution doesn’t have to iterate over the filesystem hierarchy nearly as much as before (and soon won’t have to do it at all!).

Plug’n’Play

Unveiled in September 2018, Plug’n’Play is an innovative installation strategy for Node. Based on prior work from other languages (for example autoload from PHP), it presents interesting characteristics that build upon the regular CommonJS require workflow in an almost completely backward-compatible way.

Workspaces

Split your project into sub-components kept within a single repository.

Stability

Yarn guarantees that an install that works now will continue to work the same way in the future.

Documentation

Special care is put into our documentation, and we keep improving it based on your feedback.

Plugins

Yarn cannot solve all your problems – but it can be the foundation for others to do it.

Innovation

We believe in challenging the status quo. What should the ideal developer experience be like?

Openness

Yarn is an independent open-source project tied to no company. Your support makes us thrive.

official yarnpkg.com


src stackshare.io/yarn