Resolution limit of the eye: how many pixels can we see?

Original link: https://www.nature.com/articles/s41467-025-64679-2

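## Human visual resolution: beyond 20/20

This study shows that common display resolution standards (such as 60 pixels per degree, ppd, based on 20/20 vision) *underestimate* the true resolving power of the human eye. The experiments found a population-mean foveal resolution of 94 ppd, with individuals reaching as high as 120 ppd, well above the "retinal resolution" of current displays. The study also found that the resolution limit varies with colour: red-green patterns are resolved almost as well as greyscale (black-and-white) patterns, which challenges the practice of chroma subsampling in image and video compression. Resolution falls off rapidly with distance from the fovea, but at a different rate for each colour direction. The researchers developed a model to predict resolution limits across the population, underlining the importance of accounting for individual differences in display design. They also provide guidelines relating display resolution to optimal viewing distance, indicating that current recommendations for 4K and 8K televisions are too conservative. Finally, the findings have implications for techniques such as foveated rendering, suggesting that bandwidth can be saved by tailoring the resolution reduction to colour-specific visual sensitivity.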

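## Resolution and the human eye: summary

A recent article (nature.com) sparked a discussion on Hacker News about the limits of human visual perception and the value of higher screen resolutions such as 4K and 8K. Many commenters find that higher resolution visibly improves sharpness and detail, especially for tasks such as text rendering and gaming, but the practical benefit depends heavily on viewing distance and content. Users debated whether the difference between 1440p and 4K is significant at typical desktop distances, with some citing studies that show almost no perceptible difference. Others stressed that higher resolution helps with specific tasks, such as detailed image editing or close-up work on mobile devices. The core point is that the eye's resolution is about more than pixel count: focal length, contrast, and the way the brain processes visual information all play a part. Although our eyes have potentially high resolution, practical limits mean that returns diminish beyond a certain point. Ultimately, the "sweet spot" for resolution depends on individual needs and viewing conditions. Many commenters also stressed that colour accuracy matters as much as resolution, especially for creative professionals.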

Original article

Foveal resolution limit

Figure 2a reports the measured resolution limit for achromatic, red-green, and yellow-violet patterns. The experiment measured the highest resolution that can be reliably detected by an observer at various eccentricities from the fovea. The resolution is expressed in pixels per degree (ppd). The corresponding maximum spatial frequency in cycles per visual degree is equal to half of the reported ppd values. The thresholds are reported for detecting a high-contrast Gabor patch as well as for identifying a decrease in resolution in black text on a white background, and in white text on a black background simulating dark mode. The detailed breakdown of the mean, median, and 95th percentile threshold ppd values observed across different colour directions and eccentricity levels is provided in Supplementary Table 2. First, we will focus on the results for foveal vision, corresponding to an eccentricity of 0°.

Fig. 2: Spatial sensitivity and resolution limits for various colour directions across the visual field.

a The measured resolution limit in pixels-per-degree (ppd) at each eccentricity across the sample (N = 18), with median (open circles), 95% Confidence Intervals (CIs; error bars), and mean (horizontal bars). Numbers next to the violins indicate median ppd values of the observed data. Dashed lines represent the model fit. The edges of the shaded violin plot areas indicate the 95th percentile of thresholds. b Heatmap showing the cumulative probability density of resolution limits within the observer sample, centred around predictions from the fitted Watson (2018)³⁰ model. The model was fitted using measured resolution limits at specific eccentricities (Fig. 2a) and interpolated across those eccentricities to provide a continuous representation. c Ideal display vertical resolution as a function of viewing distance expressed in display height (H). The red horizontal bars indicate the ITU-R BT.2100-2³¹ recommended viewing distances for various display resolutions: FHD (2K), 4K, and 8K. d Required pixel-per-inch (ppi) resolution needed as a function of viewing distance (meters). In plots (b–d), blue areas indicate that any further increase in pixel resolution would not be perceptible to almost all observers, while yellow areas represent resolutions that will be within the visual perceptual limit of almost all observers. The dotted and dashed lines represent different percentiles of the sample as shown in the legend.

The widely accepted 20/20 vision standard, established by Snellen, suggests that the human eye can resolve detail at an angular resolution of 1 min of arc, which corresponds to 60 pixels per degree (ppd); Supplementary Section 2 (“Visual acuity units conversion”) details the conversion between these units. This measure is derived from the design of the Snellen chart, where the smallest letters on the 20/20 line subtend an angle of 5 min of arc, with each critical feature of these letters subtending 1 min of arc when viewed from 20 feet (6 m)^22,23. This 1 arc minute value has historically been considered the threshold for human visual resolution, discussed in more detail in Supplementary Section 1 (“Historical context of visual acuity standards”), leading to the assumption that 60 ppd is sufficiently high for display purposes. However, younger observers with no optical abnormalities usually have acuities better than 20/20. In the context of displays, the Ultra Retina XDR display found in the 7th generation Apple iPad Pro (2024, 13-inch) has an effective resolution of 65 ppd when viewed from 35 cm away, the shortest comfortable viewing distance. Both the 20/20 assumption and the Retina display resolution are significantly lower than the population mean of 94 ppd we measured in our experiment, or individual values as high as 120 ppd (see Supplementary Fig. 5). This demonstrates that the 60–65 ppd range is not the “retinal resolution” for a display. Note that high-contrast content, such as our Gabor patch stimulus, is not outside of the norm for content typically seen on displays: notably, text is often rendered at maximum contrast. To demonstrate, we also measured the detection threshold for text, both black-on-white and white-on-black (dark mode), and obtained values closely matching the resolution limit for sinusoidal gratings, as indicated by the “Text” data points in Fig. 2a. Our results clearly indicate that the resolution limit of the eye is higher than broadly assumed in the industry.
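The unit relationships used in this section can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the conversion from the Supplementary: it treats one pixel as one critical feature of a Snellen letter, evaluates the pixel angle at the centre of a flat screen viewed head-on, and uses a pixel density of 264 ppi for the iPad example, a figure that does not appear in the text above. All function names are hypothetical.

```python
import math


def ppd_to_cycles_per_degree(ppd):
    """One full cycle (light + dark) needs two pixels, so cpd is half of ppd."""
    return ppd / 2.0


def ppd_to_snellen_denominator(ppd):
    """Snellen denominator x in 20/x, assuming one pixel spans one critical feature."""
    arcmin_per_pixel = 60.0 / ppd
    return 20.0 * arcmin_per_pixel


def display_ppd(ppi, distance_cm):
    """Pixels per degree at the screen centre of a flat display viewed head-on."""
    pixel_pitch_cm = 2.54 / ppi
    pixel_angle_deg = math.degrees(2.0 * math.atan(pixel_pitch_cm / (2.0 * distance_cm)))
    return 1.0 / pixel_angle_deg


print(ppd_to_cycles_per_degree(60))    # 20/20 vision expressed in cycles per degree
print(ppd_to_snellen_denominator(94))  # Snellen equivalent of the measured mean
print(display_ppd(264, 35))            # assumed 264 ppi panel viewed from 35 cm
```

With these assumptions the tablet example lands in the low 60s, in the same range as the 65 ppd quoted above; the small gap may come from the assumed pixel density or from a different definition of effective resolution.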

It could also be surprising that the foveal resolution limit of red-green patterns is similar to that of achromatic patterns: 89 ppd for red-green vs. 94 ppd for achromatic. It must be noted, however, that we did not try to isolate observers’ individual chromatic mechanisms via the heterochromatic flicker paradigm²⁴, as we wanted to capture data that could generalise across the population. Our results cast doubt on the common practice of chroma subsampling found in almost every lossy image and video format, from JPEG image coding to H.265 or AV1 video encoding. The assumption of chroma subsampling is that the resolution of chromatic channels can be reduced twofold in relation to the achromatic channel due to the lower sensitivity of the visual system to high-frequency chromatic contrast. Our data suggest that this only holds for the yellow-violet colour direction, with the maximum resolution of 53 ppd, but not for the red-green direction, consistent with the vision science theory that the isoluminant red-green pathway is the most sensitive opponent-colour channel of the human visual system²⁵.
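The twofold assumption can be checked against the foveal figures quoted above with simple arithmetic. The sketch below only compares the ratio of the achromatic limit to each chromatic limit with the factor of two assumed by chroma subsampling; the dictionary and function names are hypothetical.

```python
# Foveal resolution limits quoted in the text, in pixels per degree.
FOVEAL_LIMIT_PPD = {"achromatic": 94.0, "red-green": 89.0, "yellow-violet": 53.0}

SUBSAMPLING_FACTOR = 2.0  # chroma resolution halved relative to the achromatic channel


def achromatic_to_chroma_ratio(channel):
    """How many times lower the chromatic limit is than the achromatic one."""
    return FOVEAL_LIMIT_PPD["achromatic"] / FOVEAL_LIMIT_PPD[channel]


for channel in ("red-green", "yellow-violet"):
    ratio = achromatic_to_chroma_ratio(channel)
    print(f"{channel}: ratio {ratio:.2f} vs assumed factor {SUBSAMPLING_FACTOR:.0f}")
```

The yellow-violet ratio comes out close to two, while the red-green ratio stays close to one, which is the asymmetry the paragraph describes.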

Resolution limit in periphery

Figure 2a shows the rapid decline of the resolution limit as the stimulus was presented at increased eccentricities. This is in line with the established understanding that visual acuity and colour discrimination decrease as the stimulus moves away from the fovea, primarily due to the fall-off in cone density and the increase in receptive field size in the retina^4,26.

The notable aspect of our results is that the resolution limit declines with increased eccentricity differently across colour directions. The achromatic resolution limit declines 2.3× between foveal vision and 10° eccentricity, while red-green declines 4.9× and yellow-violet 4.8×. Popular techniques, such as foveated rendering^27,28 or foveated compression²⁹, are optimised for achromatic vision. Our results suggest that these techniques could provide further computational and bandwidth savings by lowering the resolution requirements for the chromatic channels.
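To give a feel for the potential savings, the sketch below combines the foveal limits quoted earlier with the decline factors above to estimate the limits at 10° eccentricity, and then the fraction of samples a chromatic channel would need relative to the achromatic channel in a two-dimensional image. It is a back-of-the-envelope estimate: it mixes rounded figures from the text, and the names are hypothetical.

```python
# Figures quoted in the text: foveal limits (ppd) and decline factors at 10 degrees.
FOVEAL_PPD = {"achromatic": 94.0, "red-green": 89.0, "yellow-violet": 53.0}
DECLINE_AT_10_DEG = {"achromatic": 2.3, "red-green": 4.9, "yellow-violet": 4.8}


def limit_at_10_degrees(channel):
    """Estimated resolution limit (ppd) at 10 degrees eccentricity."""
    return FOVEAL_PPD[channel] / DECLINE_AT_10_DEG[channel]


def chroma_sample_fraction(channel):
    """Samples needed by a chromatic channel relative to achromatic, in 2D."""
    linear_ratio = limit_at_10_degrees(channel) / limit_at_10_degrees("achromatic")
    return linear_ratio ** 2  # resolution scales along both image axes


for channel in FOVEAL_PPD:
    print(f"{channel}: about {limit_at_10_degrees(channel):.1f} ppd at 10 degrees")
for channel in ("red-green", "yellow-violet"):
    print(f"{channel}: {chroma_sample_fraction(channel):.0%} of achromatic samples")
```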

Modelling the resolution limit

To interpolate and extrapolate our measurements, we fit the contrast sensitivity model presented by Watson (2018)³⁰:

$$\log S^{\mathrm{c}}(e,\rho) = \log S_{0}^{\mathrm{c}} + k_{\rho}^{\mathrm{c}}\left(1 + k_{e}^{\mathrm{c}}\, e\right)\rho, \quad \forall\, \mathrm{c} \in \{\mathrm{Ach}, \mathrm{RG}, \mathrm{YV}\}, \tag{1}$$

where S^c is the contrast sensitivity of the colour channel c for a given stimulus at eccentricity (e) and spatial frequency (ρ). S₀ is the baseline sensitivity affected by other stimulus parameters (luminance, temporal frequency, size, etc). k_ρ and k_e are the parameters of the model representing linear decrease with respect to spatial frequency and retinal eccentricity, respectively. The contrast sensitivity S is the inverse of the contrast of the stimulus. In our study, the contrast value is fixed for each colour direction (values reported in the Supplementary Table 1). We optimise the values of S₀, k_ρ and k_e to predict the measured ρ values from our data. The rearranged equation predicting the spatial frequency threshold as an inverse factor of eccentricity follows:

$$\rho(e) = \log\left(\frac{S^{\mathrm{c}}}{S_{0}^{\mathrm{c}}}\right)\frac{1}{k_{\rho}^{\mathrm{c}}\left(1 + k_{e}^{\mathrm{c}}\, e\right)}, \quad \forall\, \mathrm{c} \in \{\mathrm{Ach}, \mathrm{RG}, \mathrm{YV}\}. \tag{2}$$

The fitted model is drawn as dashed lines in Fig. 2a. More details of the fitting procedure and the parameter values are provided in the Supplementary Section 4.3 (“Parameters of the resolution limit model”).
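Eqs. (1) and (2) translate directly into code. The sketch below is illustrative only: the parameter values are hypothetical, chosen so that the curve reproduces two figures quoted earlier for the achromatic channel (94 ppd at the fovea and a 2.3× decline at 10°, which gives 1 + 10·k_e = 2.3, so k_e = 0.13). They are not the fitted parameters from the Supplementary, and base-10 logarithms are an assumption.

```python
def log_sensitivity(e, rho, log_s0, k_rho, k_e):
    """Eq. (1): log contrast sensitivity at eccentricity e (deg) and frequency rho (cpd)."""
    return log_s0 + k_rho * (1.0 + k_e * e) * rho


def threshold_frequency(e, log_s, log_s0, k_rho, k_e):
    """Eq. (2): highest visible spatial frequency (cpd) for a stimulus of log sensitivity log_s."""
    return (log_s - log_s0) / (k_rho * (1.0 + k_e * e))


# Hypothetical parameters (NOT the paper's fitted values).
LOG_S0 = 2.0    # baseline log sensitivity
K_RHO = -0.04   # negative: sensitivity falls as spatial frequency rises
K_E = 0.13      # derived from the 2.3x decline at 10 degrees
LOG_S = 0.12    # log sensitivity implied by the fixed stimulus contrast

for ecc in (0, 5, 10, 20):
    cpd = threshold_frequency(ecc, LOG_S, LOG_S0, K_RHO, K_E)
    print(f"{ecc:>2} deg: {cpd:.1f} cpd = {2 * cpd:.1f} ppd")
```

Because Eq. (2) depends on eccentricity only through the factor 1/(1 + k_e·e), the whole curve is fixed by the foveal limit and k_e once the contrast is fixed.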

Resolution limit across the population

In practical applications, it is important to know how the resolution limit varies across the population. This lets us make decisions that are relevant for the majority of the population. For example, we may want to design a display that has “retinal resolution” for 95% of people rather than only for an average observer. To model the variation of the resolution limit in populations, we used the model from Eq. (2) to find the mean threshold, and then fitted a normal distribution to the per-observer data. To estimate the probability distribution at eccentricities not measured in our dataset, we linearly interpolated the parameters of the Gaussian distribution as detailed in the Supplementary Section 4.4 (“Probability distribution across the population”). The cumulative distribution, shown in Fig. 2b, demonstrates a large variation across the population, especially at larger eccentricities. For example, if a median observer can see up to 22 ppd at 20° eccentricity, this value increases to 35 ppd for the 95th percentile of our sample. This shows the importance of considering individual differences in populations when designing technology aligned with human vision. We also tested the effect of viewing distance on the resolution limit, but did not observe a consistent trend among our sample of observers. More details of this investigation are discussed in the Supplementary Section 4.5 (“The effect of viewing distance”).
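As a rough illustration of the population model, the sketch below builds a normal distribution from the two figures quoted for 20° eccentricity (median 22 ppd, 95th percentile 35 ppd) using Python's standard `statistics.NormalDist`. It assumes that the median equals the mean and that the sample's 95th percentile coincides with that of the Gaussian; the paper's own fitted parameters are in the Supplementary and are not reproduced here.

```python
from statistics import NormalDist


def population_model(median_ppd, p95_ppd):
    """Normal distribution whose 50th and 95th percentiles match the given values."""
    z95 = NormalDist().inv_cdf(0.95)  # about 1.645 standard deviations
    sigma = (p95_ppd - median_ppd) / z95
    return NormalDist(mu=median_ppd, sigma=sigma)


def fraction_resolving_more_than(dist, ppd):
    """Share of observers whose resolution limit exceeds the given ppd."""
    return 1.0 - dist.cdf(ppd)


at_20_deg = population_model(22.0, 35.0)
print(f"standard deviation: {at_20_deg.stdev:.1f} ppd")
print(f"observers above 30 ppd: {fraction_resolving_more_than(at_20_deg, 30.0):.0%}")
```

A display meant to be beyond the limit of 95% of observers at that eccentricity would then target `at_20_deg.inv_cdf(0.95)`, which returns the 35 ppd the distribution was built from.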

We may also want to know how the resolution limit, expressed in ppd units, translates to actual displays and viewing distances. This is shown in Fig. 2c, where we plot the relationship between the display resolution (number of horizontal lines) and the viewing distance (measured in display heights). Our model predictions can be compared with the ITU-R BT.2100-2³¹ recommended viewing distances for television, shown as red horizontal lines in Fig. 2c. Since Full HD (FHD) resolution was not designed to deliver a perfect image, the ITU recommendation of 3.2 display heights does not bring the reproduction below the visibility threshold. Our model indicates that a distance of at least 6 display heights would be necessary to satisfy the acuity limits of 95% of the observers. For 4K and 8K displays, the ITU suggests viewing distances of 1.6–3.2 and 0.8–3.2 display heights, respectively. Our model shows that those ranges are overly conservative and that there is little benefit from 8K resolution when seated further than 1.3 display heights from the screen. Used in this way, our model provides a framework to update existing guidelines and to establish new recommendations based on the limitations of our vision. In Fig. 2d, we plot the relation between pixel density (in pixels-per-inch) and viewing distance and show the screen resolution for two different devices. To allow readers to test their own displays, we created an online display resolution calculator.
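The relationship between line count, viewing distance in display heights, and ppd can be sketched as follows. This is a geometric approximation rather than the paper's calculator: it assumes a flat screen viewed head-on, evaluates the pixel angle at the screen centre, and uses the usual line counts of 1080, 2160 and 4320 for FHD, 4K and 8K. Function names are hypothetical.

```python
import math


def ppd_at_centre(lines, distance_in_heights):
    """Pixels per degree at the screen centre, for a display with `lines` rows
    viewed from `distance_in_heights` display heights."""
    pixel_angle = 2.0 * math.atan(1.0 / (2.0 * lines * distance_in_heights))
    return 1.0 / math.degrees(pixel_angle)


def distance_for_ppd(lines, target_ppd):
    """Viewing distance (in display heights) at which the display reaches target_ppd."""
    pixel_angle = math.radians(1.0 / target_ppd)
    return 1.0 / (2.0 * lines * math.tan(pixel_angle / 2.0))


for name, lines, distance in (("FHD", 1080, 3.2), ("4K", 2160, 1.6), ("8K", 4320, 0.8)):
    print(f"{name} at {distance} H: {ppd_at_centre(lines, distance):.1f} ppd")

# Distance at which FHD would match the 94 ppd population mean quoted earlier.
print(f"FHD reaches 94 ppd at {distance_for_ppd(1080, 94.0):.1f} H")
```

Under these assumptions the closest ITU distances all work out to roughly 60 ppd, that is, the 20/20 design point, which is why they fall short of the limits measured here.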

Example: foveated rendering

Foveated rendering, found in many commercial XR headsets, reduces the quality of rendered content depending on how far a portion of the screen is from the gaze location^27,28. Foveated rendering typically reduces the resolution of rendered content to save bandwidth and the computational cost of rendering. The majority of foveated rendering methods consider only the perception of achromatic contrast and are manually tuned. Here, we show how our measurements could be used to find the right thresholds for foveated rendering for achromatic and chromatic contrast.

Here, we consider a simplified task of foveated filtering, which could improve video compression in foveated streaming: we remove the high-frequency contrast that is invisible to the human eye. To produce the result shown in Fig. 3, we decompose an image into achromatic, red-green and yellow-violet components of the DKL colour space³², decompose each component into frequency bands using a Laplacian pyramid³³, and then set to zero the coefficients below the threshold contrast for a given eccentricity according to our fitted model for an average observer. More details of the image simulation can be found in the Supplementary Section 4.6 (“Foveated filtering”). When the reconstructed image in Fig. 3 is seen from the right distance, and the gaze is directed towards the 0° target, the loss of resolution at larger eccentricities should be invisible.
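The idea can be illustrated on a single scanline in pure Python. The sketch below is a heavily simplified stand-in for the procedure described above, not the authors' implementation. It works in 1D on one channel and uses a non-decimated stack of box-blur differences instead of a true Laplacian pyramid. It also drops a whole band wherever that band's nominal resolution exceeds the local resolution limit, rather than thresholding individual coefficients by contrast. The limit function reuses the illustrative achromatic values from the text (94 ppd at the fovea, 2.3× lower at 10°), and eccentricity is approximated as pixel offset divided by ppd. All names are hypothetical.

```python
def box_blur(signal, radius):
    """Box-blur a 1D list, shrinking the window at the edges."""
    n = len(signal)
    blurred = []
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        blurred.append(sum(signal[lo:hi]) / (hi - lo))
    return blurred


def band_stack(signal, levels):
    """Split a signal into band-pass layers plus a low-pass residual.
    The layers and the residual sum back to the original signal."""
    bands, current = [], [float(v) for v in signal]
    for level in range(levels):
        blurred = box_blur(current, 2 ** level)
        bands.append([c - b for c, b in zip(current, blurred)])
        current = blurred
    return bands, current


def achromatic_limit_ppd(eccentricity_deg):
    """Illustrative limit: 94 ppd at the fovea, 2.3x lower at 10 degrees."""
    return 94.0 / (1.0 + 0.13 * eccentricity_deg)


def foveated_filter(signal, display_ppd, gaze_index, limit_ppd, levels=4):
    """Keep each band only where the eye could resolve it at that eccentricity."""
    bands, residual = band_stack(signal, levels)
    filtered = list(residual)
    for level, band in enumerate(bands):
        band_ppd = display_ppd / 2 ** level  # rough resolution carried by this band
        for i, coeff in enumerate(band):
            eccentricity = abs(i - gaze_index) / display_ppd  # small-angle estimate
            if band_ppd <= limit_ppd(eccentricity):
                filtered[i] += coeff
    return filtered


# A one-pixel checker pattern on a 60 ppd display, with the gaze near the left edge.
scanline = [float(i % 2) for i in range(1300)]
result = foveated_filter(scanline, display_ppd=60.0, gaze_index=100,
                         limit_ppd=achromatic_limit_ppd)
print(result[100], result[1000])  # untouched at the gaze, smoothed in the periphery
```

Applying the same filter per DKL component, with a separate limit function for each colour direction, would remove chromatic detail earlier than achromatic detail as eccentricity grows.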

Fig. 3: Eccentricity-dependent filtering that removes invisible details to improve coding performance.

a Original image. b Filtered image; the contour lines show the retinal eccentricity positions relative to the gaze position. The filter was applied uniformly across discrete segments of eccentricity to better show the differences. Please zoom the page on the screen such that the red rectangle in the bottom-left corner of the simulated image is approximately the size of a credit card and view the image from 50 cm away. When the gaze is centred on the red backpack in the image, the degradation of high-frequency details in the periphery will not be noticeable to the human eye.