You can use newline characters in URLs

Original link: https://lemire.me/blog/2026/02/28/you-can-use-newline-characters-in-urls/

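URLs, the addresses of web content, can become long and unwieldy. HTML tolerates newline and tab characters inside a URL, so a long URL can be formatted for readability. The WHATWG URL specification does flag these characters as invalid, but it only raises a non-fatal validation error that browsers typically ignore, and the characters are then removed. In practice, URLs formatted this way usually *work*. Whitespace is handled differently in **data URLs**, which embed a file (such as an image) directly in the URL string. When a base64 data URL is decoded, any ASCII whitespace character (including the space) is ignored. This allows readable formatting of embedded data such as base64-encoded images or SVG graphics. In short, ordinary URLs tolerate only tabs and newlines, whereas data URLs give more freedom to format the embedded content, which can make some markup cleaner and easier to read.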


Original article

We locate web content using special addresses called URLs. We are all familiar with addresses like https://google.com. URLs can get long and difficult to read, so we might be tempted to format them in HTML using newline and tab characters, like so:

<a href="https://lemire.me/blog/2026/02/21/
        how-fast-do-browsers-correct-utf-16-strings/">my blog post</a>

It will work.

Let us refer to the WHATWG URL specification that browsers follow. It makes two statements in sequence.

  1. If input contains any ASCII tab or newline, invalid-URL-unit validation error.
  2. Remove all ASCII tab or newline from input.

Notice how it reports an error if there is a tab or newline character, but continues anyway? The specification says that "A validation error does not mean that the parser terminates" and it encourages systems to report errors somewhere. Effectively, the error is ignored, although it might be logged. Thus our HTML is fine in practice.
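The two steps quoted above can be sketched in a few lines of Python. This is only an illustration of the tab and newline handling, not the specification's parser: the function name `remove_tab_newline` and the returned error list are hypothetical.

```python
from __future__ import annotations

# "ASCII tab or newline" covers U+0009 (tab), U+000A (LF) and U+000D (CR).
ASCII_TAB_OR_NEWLINE = "\t\n\r"


def remove_tab_newline(url_input: str) -> tuple[str, list[str]]:
    """Hypothetical helper mirroring the two quoted steps."""
    errors: list[str] = []
    # Step 1: record a validation error, but keep going.
    if any(ch in ASCII_TAB_OR_NEWLINE for ch in url_input):
        errors.append("invalid-URL-unit")
    # Step 2: remove every ASCII tab or newline from the input.
    cleaned = "".join(ch for ch in url_input if ch not in ASCII_TAB_OR_NEWLINE)
    return cleaned, errors


cleaned, errors = remove_tab_newline("https://go\nogle.c\nom")
print(cleaned)  # https://google.com
print(errors)   # ['invalid-URL-unit']
```

The error is collected rather than raised, which matches the idea that a validation error is reported somewhere without stopping the parser.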

The following is also fine:

<a href="https://go
ogle.c
om" class="button">Visit Google</a>

You can also use tabs. But you cannot arbitrarily insert any other whitespace.

Yet there are cases when you can use any ASCII whitespace character: data URLs. Data URLs (also called data URIs) embed small files—like images, text, or other content—directly inside a URL string, instead of linking to an external resource. Data URLs are a special kind of URL and they follow different rules.

A typical data URL might look like data:image/png;base64,iVBORw0KGgoAAAANSUhEUg... where the string iVBORw0KGgoAAAANSUhEUg... is the binary data of the image that has been encoded with base64. Base64 is a text format that can represent any binary content: we use 64 ASCII characters so that each character encodes 6 bits. Your binary email attachments are base64 encoded.
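To make the six-bits-per-character arithmetic concrete, here is a small Python sketch. The helper `encode_group` is hypothetical and handles exactly three bytes (24 bits, hence four characters); the standard `base64` module is used only to cross-check the result.

```python
import base64

# The 64-character alphabet of standard base64 (RFC 4648).
ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)


def encode_group(three_bytes: bytes) -> str:
    """Encode exactly three bytes (24 bits) as four 6-bit characters."""
    if len(three_bytes) != 3:
        raise ValueError("expected exactly three bytes")
    value = int.from_bytes(three_bytes, "big")
    # Read the 24 bits six at a time, most significant bits first.
    return "".join(ALPHABET[(value >> shift) & 0x3F] for shift in (18, 12, 6, 0))


sample = b"\x89PN"  # the first three bytes of a PNG file
print(encode_group(sample))               # iVBO
print(base64.b64encode(sample).decode())  # iVBO
```

The output matches the first four characters of the example data URL, since every PNG file starts with the same signature bytes.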

On the web, when decoding a base64 string, you ignore all ASCII whitespace characters (including the space character itself). Thus you can embed a PNG image in HTML as follows.

<img src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAA
                                 QAAAAECAIAAAAmkwkpAAAAEUl
                                 EQVR4nGP8z4AATEhsPBwAM9EB
                                 BzDn4UwAAAAASUVORK5CYII=" />

This HTML code is valid and will insert a tiny image in your page.
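As a rough check outside the browser, the following Python sketch decodes the same formatted data URL by first dropping every ASCII whitespace character. The helper `decode_base64_data_url` is hypothetical and deliberately simplified: it assumes the part before the comma ends exactly with `;base64` and it does no percent-decoding.

```python
import base64
import struct

# ASCII whitespace: tab, line feed, form feed, carriage return, space.
ASCII_WHITESPACE = "\t\n\x0c\r "

data_url = """data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAA
                                 QAAAAECAIAAAAmkwkpAAAAEUl
                                 EQVR4nGP8z4AATEhsPBwAM9EB
                                 BzDn4UwAAAAASUVORK5CYII="""


def decode_base64_data_url(url: str) -> bytes:
    """Hypothetical helper: decode the payload of a base64 data URL."""
    header, _, payload = url.partition(",")
    if not header.endswith(";base64"):
        raise ValueError("not a base64 data URL")
    # Ignore all ASCII whitespace before decoding.
    compact = "".join(ch for ch in payload if ch not in ASCII_WHITESPACE)
    return base64.b64decode(compact, validate=True)


png = decode_base64_data_url(data_url)
print(png[:8] == b"\x89PNG\r\n\x1a\n")  # True: the PNG signature
# Width and height are stored as big-endian integers in the IHDR chunk.
width, height = struct.unpack(">II", png[16:24])
print(width, height)
```

Passing `validate=True` makes the decoder reject any leftover character outside the base64 alphabet, so the whitespace removal is what allows the formatted string to decode.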

But there is more. A data URL can also be used to insert an SVG image. SVG (Scalable Vector Graphics) is an XML-based vector image format that describes 2D graphics using mathematical paths, shapes, and text instead of pixels.
The following should draw a very simple sunset:

<img src='data:image/svg+xml,
<svg width="200" height="200" 
     xmlns="http://www.w3.org/2000/svg">
  <rect width="100%" height="100%" fill="blue" /> 
  <!-- the sky -->
  <circle cx="100" cy="110" r="50" fill="yellow" />  
  <!-- the sun -->
  <rect x="0" y="120" width="200" height="80" fill="brown" />  
  <!-- the ground -->
</svg>' />

Observe how I was able to format the SVG code so that it is readable.
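One way to see why the formatted SVG survives is to mimic what happens to the attribute value: tabs and newlines are removed, while the spaces stay and keep the XML attributes apart. The Python sketch below is an approximation under stated assumptions. It only strips tabs and newlines and splits at the first comma; real data URL processing also percent-decodes the body, which this sketch skips.

```python
import xml.etree.ElementTree as ET

src = """data:image/svg+xml,
<svg width="200" height="200"
     xmlns="http://www.w3.org/2000/svg">
  <rect width="100%" height="100%" fill="blue" />
  <!-- the sky -->
  <circle cx="100" cy="110" r="50" fill="yellow" />
  <!-- the sun -->
  <rect x="0" y="120" width="200" height="80" fill="brown" />
  <!-- the ground -->
</svg>"""

# Remove ASCII tabs and newlines, as URL parsing does; spaces are kept.
flattened = "".join(ch for ch in src if ch not in "\t\n\r")
prefix, _, body = flattened.partition(",")

# The flattened body is still well-formed XML.
root = ET.fromstring(body)
shapes = [child.tag.split("}")[1] for child in root]
print(prefix)  # data:image/svg+xml
print(shapes)  # ['rect', 'circle', 'rect']
```

The indentation matters here: the leading spaces before `xmlns` are what separate it from the previous attribute once the newline is gone.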

Further reading: Nizipli, Y., & Lemire, D. (2024). Parsing millions of URLs per second. Software: Practice and Experience, 54(5), 744-758.
