The bloat of edge-case first libraries

Original link: https://43081j.com/2025/09/bloat-of-edge-case-libraries

## Dependency tree bloat: a call for simpler libraries

The modern JavaScript ecosystem suffers from an excess of fine-grained dependencies, often the result of over-engineering for rare edge cases. Rather than focusing on the common use case, many libraries prioritise handling potentially nonsensical inputs - such as accepting strings where numbers are expected - leading to unnecessary complexity and bloated dependency trees.

This "edge-case first" approach has produced libraries like `is-number` (which checks for number-like strings) and `is-arrayish` (which checks for array-like objects) that rack up enormous download counts, even though simpler native solutions (`typeof` or `Array.isArray`) are usually sufficient.

The author advocates building libraries that assume correct input types and validate only essential data integrity, shifting the responsibility for validation to the application layer - where it belongs. Tools like `npmgraph` and `node-modules.dev` can help identify these unnecessary dependencies and replace them with leaner alternatives, or even native functionality.

Ultimately, focusing on core functionality and reasonable assumptions leads to lighter, more performant libraries and a healthier JavaScript ecosystem.

## Hacker News discussion: the bloat of edge-case first libraries

A Hacker News thread centres on an article criticising the proliferation of small, overly defensive JavaScript libraries that handle large numbers of edge cases. The core argument is that these libraries often stem from a mindset of anticipating every possible input failure, leading to code bloat and unnecessary dependencies.

Many commenters agreed, arguing the problem is rooted in JavaScript's historically dynamic typing and the lack of a strong standard library; a statically typed language or a more comprehensive standard library would reduce the need for such fine-grained validation. Others countered that the problem lies not with the language itself, but with inexperience in defining clear function contracts and a tendency to over-engineer solutions.

The discussion also touched on the trade-off between code robustness and readability, the influence of historical JavaScript limitations, and the potential for tools like TypeScript and AI-assisted coding to address the issue. A recurring theme was that developers should trust callers to provide valid input and "let it fail" rather than programming defensively against every conceivable mistake. Ultimately, the conversation highlights the tension between defensive programming practices and maintaining a lean, efficient codebase.
## Original article

This is just some of what I’ve been pondering recently - particularly in terms of how we ended up with such overly-granular dependency trees.

I think we’ve ended up with many libraries in our ecosystem which are edge-case-first, the opposite of what I’d expect. I’ll give a few examples and some thoughts around this, mostly in the hope we can start to trim some of it away.

I believe a lot of the questionably small libraries hiding in our deep dependency trees are a result of over-engineering for inputs and edge cases we’ve probably never seen.

For example, say we’re building a clamp function:

export function clamp(value: number, min: number, max: number): number {
  return Math.min(Math.max(value, min), max);
}

Pretty simple!

What if someone passes nonsensical ranges? Let’s handle that.

export function clamp(value: number, min: number, max: number): number {
  if (min > max) {
    throw new Error('min must be less than or equal to max');
  }
  return Math.min(Math.max(value, min), max);
}

This is probably as far as I’d go. But let’s over-engineer - what if someone passes a number-like string?

export function clamp(value: number | string, min: number | string, max: number | string): number {
  if (typeof value === 'string' && Number.isNaN(Number(value))) {
    throw new Error('value must be a number or a number-like string');
  }
  if (typeof min === 'string' && Number.isNaN(Number(min))) {
    throw new Error('min must be a number or a number-like string');
  }
  if (typeof max === 'string' && Number.isNaN(Number(max))) {
    throw new Error('max must be a number or a number-like string');
  }
  if (Number(min) > Number(max)) {
    throw new Error('min must be less than or equal to max');
  }
  return Math.min(Math.max(Number(value), Number(min)), Number(max));
}

At this point, it seems clear to me we’ve just poorly designed our function. It solely exists to clamp numbers, so why would we accept strings?

But hey, let’s go further! What if other libraries also want to accept such loose inputs? Let’s extract this into a separate library:

import isNumber from 'is-number';

export function clamp(value: number | string, min: number | string, max: number | string): number {
  if (!isNumber(value)) {
    throw new Error('value must be a number or a number-like string');
  }
  if (!isNumber(min)) {
    throw new Error('min must be a number or a number-like string');
  }
  if (!isNumber(max)) {
    throw new Error('max must be a number or a number-like string');
  }
  if (Number(min) > Number(max)) {
    throw new Error('min must be less than or equal to max');
  }
  return Math.min(Math.max(Number(value), Number(min)), Number(max));
}

Whoops! We’ve just created the infamous is-number library!

This, in my opinion, is poor technical design we’ve all ended up dealing with over the years. Carrying the baggage of these overly-granular libraries that exist to handle edge cases we’ve probably never encountered.

I think it should have been:

export function clamp(value: number, min: number, max: number): number {
  return Math.min(Math.max(value, min), max);
}

Maybe with some min <= max validation, but even that is debatable. At this point, you may as well inline the Math.min(Math.max(...)) expression instead of using a dependency.
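
For example, at the call site (a trivial sketch; value, lower and upper are placeholder variables):

// Inlined directly - no clamp dependency needed at all:
const clamped = Math.min(Math.max(value, lower), upper);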

We should be able to define our functions to accept the inputs they are designed for, and not try to handle every possible edge case.

There are two things at play here: type checking (are the inputs the right data types?) and value validation (do the values make sense?).

A well designed library would assume the right data types have been passed in, but may validate that the values make sense (e.g. min is less than or equal to max).

These over-engineered libraries have decided to implement both at runtime - essentially run-time type checking and value validation. One could argue that this is just a result of building in the pre-TypeScript era, but that still doesn’t justify the overly specific value validation (e.g. the real is-number also checks that it is finite).

We shouldn’t build edge-case-first libraries, i.e. those which solve for edge cases we have yet to encounter or are unlikely to ever encounter.

Example: is-arrayish (76M downloads/week)

The is-arrayish library determines if a value is an Array or behaves like one.

There will be some edge cases where this matters a lot, where we want to accept something we can index into but don’t care if it is a real Array or not.

However, the common use case clearly will not be that and we could’ve just used Array.isArray() all along.
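
For the common case, the native check is all most code needs (a sketch; the object literal is just an example of an array-like value):

// The common case: a real Array, checked natively.
Array.isArray(['a', 'b']); // true
Array.isArray('not an array'); // false

// An "array-like" value - indexable with a length, but not an Array:
const arrayLike = { 0: 'a', 1: 'b', length: 2 };
Array.isArray(arrayLike); // false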

Example: is-number (90M downloads/week)

The is-number library determines if a value is a positive, finite number or number-like string (maybe we should name it is-positive-finite-number to be more accurate).

Again, there will be edge cases where we want to deal with number-like strings or we want to validate that a number is within a range (e.g. finite).

The common use case will not be this. The common use case will be that we want to check typeof n === 'number' and be done with it.

For those edge cases where we want to additionally validate what kind of number it is, we could use a library (but one which exists for the validation, not for the type check).
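
A minimal sketch of that common case (assertNumber is a hypothetical helper name, not from any of these libraries):

function assertNumber(n: unknown): asserts n is number {
  // The common case: reject anything that isn't a plain number.
  if (typeof n !== 'number' || Number.isNaN(n)) {
    throw new TypeError('expected a number');
  }
}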

Example: pascalcase (9.7M downloads/week)

The pascalcase library transforms text to PascalCase.

It has 1 dependency (camelcase) and accepts a variety of input types:

  • strings
  • null
  • undefined
  • arrays of strings
  • functions
  • arbitrary objects with toString methods

In reality, almost every user will be passing a string.
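
A string-only version needs no dependencies at all (a rough sketch, not a drop-in replacement for the real library):

// Assumes the input is already a string - no coercion, no dependencies:
export function pascalCase(input: string): string {
  return input
    .split(/[^a-zA-Z0-9]+/)
    .filter(Boolean)
    .map((word) => word[0].toUpperCase() + word.slice(1).toLowerCase())
    .join('');
}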

Example: is-regexp (10M downloads/week)

The is-regexp library checks if a value is a RegExp object, and supports cross-realm values.

In reality, almost every user will be passing a RegExp object, and not one from another realm.

For context, cross-realm values can happen when you retrieve a value from an iframe or VM for example:

const iframe = document.createElement('iframe');
document.body.appendChild(iframe); // contentWindow only exists once attached

iframe.contentWindow.RegExp === RegExp; // false - each realm has its own globals

const iframeRegex = iframe.contentWindow.someRegexp;

iframeRegex instanceof RegExp; // false - it was created by the iframe's RegExp
isRegex(iframeRegex); // true

This is indeed useful, and I do support this myself in chai (which I maintain). However, this is an edge case most libraries don’t need to care about.
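
For the minority who do need it, the usual technique is a brand check via Object.prototype.toString, which works across realms (a sketch of the general approach, not is-regexp’s exact source):

// Cross-realm RegExp detection via the internal [[Class]] brand:
function isRegExpCrossRealm(value: unknown): boolean {
  return Object.prototype.toString.call(value) === '[object RegExp]';
}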

We should build libraries which solve the common use case and make assumptions about the input types they will be given.

Example: scule (1.8M downloads/week)

scule is a library for transforming casing of text (e.g. camel case, etc).

It only accepts inputs it is designed for (strings and arrays of strings) and has zero dependencies.

In most of the functions it exports, it assumes valid input data types.
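
Usage is straightforward (a sketch, assuming scule’s documented exports):

import { pascalCase, kebabCase } from 'scule';

pascalCase('foo-bar'); // 'FooBar'
kebabCase('FooBar'); // 'foo-bar'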

Example: dlv (14.9M downloads/week)

dlv is a library for deep property access.

It only accepts strings and arrays of strings as the path to access, and assumes this (i.e. does no validation).
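
For example (a sketch based on dlv’s documented signature):

import dlv from 'dlv';

const obj = { a: { b: { c: 42 } } };

dlv(obj, 'a.b.c'); // 42
dlv(obj, ['a', 'b', 'c']); // 42 - array paths work too
dlv(obj, 'a.x.c', 'fallback'); // 'fallback' - default for missing paths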

Validation is important, and I want to be clear that I’m not saying we should stop validating our data.

However, we should usually be validating the data in the project that owns it (e.g. at the app level), and not in every library that later consumes it as input.
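
For example, validate once where the data enters the system, and let everything downstream assume the types are right (a sketch with hypothetical names):

// parseConfig is a hypothetical boundary validator - names are illustrative:
interface Config {
  min: number;
  max: number;
}

function parseConfig(raw: unknown): Config {
  const cfg = raw as Partial<Config> | null;
  if (typeof cfg?.min !== 'number' || typeof cfg?.max !== 'number') {
    throw new TypeError('config must contain numeric min and max');
  }
  return cfg as Config;
}

declare const rawInput: string; // e.g. the body of an HTTP request

// Downstream code (like clamp) can now safely assume numbers:
const { min, max } = parseConfig(JSON.parse(rawInput));
clamp(5, min, max);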

Deep dependencies applying validation like this actually shift the burden from where it belongs (at data boundaries) to deep in the dependency tree.

Often at this point, it is invisible to the consumer of the library.

How many people are passing values into is-number (via other libraries), not realising it will prevent them from using negative numbers and Infinity?

This post isn’t about overly-granular libraries in general, but I’d like to briefly mention them for visibility.

An overly-granular library is one where someone took a useful library and split it up into an almost atomic-level of granularity.

Some examples:

  • shebang-regex - 2LOC, does the same as startsWith('#!'), 86M downloads/week
  • is-whitespace - 7LOC, checks if a string is only whitespace, 1M downloads/week
  • is-npm - 8LOC, checks npm_config_user_agent or npm_package_json are set, 7M downloads/week
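
Each of these is a line or two of native code (a sketch mirroring the descriptions above; is-npm’s exact checks may differ):

// Native stand-ins for the micro-packages above:
const hasShebang = (src: string) => src.startsWith('#!');
const isOnlyWhitespace = (s: string) => s.trim() === '';
const isNpm = Boolean(
  process.env.npm_config_user_agent || process.env.npm_package_json,
);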

This comes down to personal preference, and some maintainers clearly favour it. The thinking seems to be that with atomic libraries, you can build your next library mostly from the building blocks you already have.

I don’t really agree with this and think downloading a package for #! 86 million times a week is a bit much.

The e18e community is already tackling a lot of this by contributing performance improvements across the ecosystem, including removing and replacing dependencies with more modern, performant ones.

Through these efforts, there’s already a useful list of replacements and an ESLint plugin.

As a maintainer

If you’re maintaining a library, it would be worth reviewing your dependencies to see if:

  • Any are replaceable by native functionality these days (e.g. Array.isArray)
  • Any are replaceable by smaller, less granular and/or more performant alternatives (e.g. scule instead of pascalcase)
  • Any are redundant if you make more assumptions about input types

Tools like npmgraph can help you visualise your dependency tree to make this task easier.

Also, being stricter around input types will allow you to reduce a lot of code and dependencies.

If you can assume the data being passed in is the correct type, you can leave validation up to the consumer.

As a user

Keep a close eye on your dependencies (both deep and direct), and what alternatives are available to your direct dependencies.

It is easy to stick with a dependency chosen long ago and never revisit whether a better option now exists. Many of these packages can be replaced with native functionality, or have more modern alternatives.

Useful tools:

  • npmgraph for visualising your dependency tree
  • node-modules.dev for visualising your dependencies and lots of useful meta data
  • Dependabot for keeping your dependencies up to date

On the topic of data, it is also worth ensuring validation happens at data boundaries rather than being delegated to various dependencies. Try to validate the type and value up front, before passing into dependencies.

Most of these libraries exist to handle edge cases that certainly do exist. However, we are all paying the cost of supporting those edge cases, rather than only those who actually need them.

This is the wrong way around. Libraries should implement the main use case, and alternatives (or plugins) can exist to provide the edge cases the minority needs.

We should all be more aware of what is in our dependency tree, and should push for more concise, lighter libraries.
