Four Column ASCII (2017)

Original link: https://garbagecollected.org/2017/01/31/four-column-ascii/

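A Hacker News post revealed an interesting way to visualize the ASCII table: presenting it in four columns brings out hidden patterns. ASCII is traditionally shown as one linear sequence, but arranging it in columns of 32 shows how the control characters and printable characters are logically organized. The key insight is that ASCII uses 7 bits: the first two bits define a "group", and the remaining five bits select the character within that group. The table shows that pressing `CTRL+[` produces `ESC` because the two share the same last five bits (11011), and `CTRL` effectively clears the first two bits. This explains why particular `CTRL` key combinations produce control characters such as newline (`^J`), backspace (`^H`) and tab (`^I`). It also clarifies why a Windows text file displayed with `cat -A` shows `^M`: it stands for carriage return (CR), part of the CR+LF line ending that Windows uses and Unix does not. The column view makes the underlying structure of ASCII surprisingly intuitive.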

Hacker News discussion: Four Column ASCII (2017) (garbagecollected.org), 8 points, submitted by tempodox, 2 comments.

rbanffy: This is also why Teletype keyboard layouts have the parentheses on 8 and 9, while modern keyboard layouts have them on 9 and 0 (the layout popularized by the IBM Selectric). The original Apple II computer used this layout too, with a "bell" above the G.

Terretta (in reply): What happened with this block and the keyboard key arrangement?

  ESC [ { 11011
  FS \ | 11100
  GS ] } 11101

Also curious why keys get to open and close braces, but… single and double quotes have no opening and closing forms and are stacked together instead. Every time I type Option-{ and Option-Shift-{ …
Original article

I found this gem on Hacker News the other day. User soneil posted a link to a four column version of the ASCII table that blew my mind. I wanted to repost it here so it is easier to discover.

Here's an excerpt from the comment:

I always thought it was a shame the ascii table is rarely shown in columns (or rows) of 32, as it makes a lot of this quite obvious. eg, http://pastebin.com/cdaga5i1 It becomes immediately obvious why, eg, ^[ becomes escape. Or that the alphabet is just 40h + the ordinal position of the letter (or 60h for lower-case). Or that we shift between upper & lower-case with a single bit.
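The arithmetic in that comment is easy to check. Here is a minimal Python sketch; the helper names `letter_ordinal` and `toggle_case` are made up for illustration and only make sense for ASCII letters:

```python
def letter_ordinal(ch: str) -> int:
    """Position of an ASCII letter in the alphabet (A/a = 1 ... Z/z = 26)."""
    # The low five bits of a letter's code are its ordinal position.
    return ord(ch) & 0x1F


def toggle_case(ch: str) -> str:
    """Flip the single bit (20h) that separates upper and lower case."""
    # Assumption: ch is an ASCII letter; other characters are not handled.
    return chr(ord(ch) ^ 0x20)


# Upper case is 40h + ordinal, lower case is 60h + ordinal.
assert ord("A") == 0x40 + 1
assert ord("z") == 0x60 + 26
```

For example, `letter_ordinal("M")` gives 13 and `toggle_case("a")` gives `"A"`.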

You know in ASCII there are 32 characters at the beginning of the table that don't represent a written symbol. Backspace, newline, escape - that sort of thing. These are called control characters.

In the terminal you can type these control characters by holding the CTRL (control characters, get it?) key in combination with another key. For example, as many experienced vim users know, pressing CTRL+[ in the terminal (which is ^[ in caret notation) is the same as pressing the ESC key. But why is the escape key triggered by the [ character? Why not another character? This is the insight soneil shares with us.

Remember that ASCII is a 7 bit encoding. Let's split those seven bits as follows:

  • The first two bits denote the group of the character (2^2 so 4 possible values)
  • The remaining five bits describe a character (2^5 so 32 possible values)
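As a rough sketch of that split, the following Python function pulls the two group bits and the five character bits out of a code point. The name `split_ascii` is hypothetical, not something from the original post:

```python
def split_ascii(ch: str) -> tuple[str, str]:
    """Return (group bits, character bits) for a 7-bit ASCII character."""
    code = ord(ch)
    if code > 0x7F:
        raise ValueError("not a 7-bit ASCII character")
    group = code >> 5    # top two of the seven bits
    value = code & 0x1F  # bottom five bits
    return format(group, "02b"), format(value, "05b")
```

With this, `split_ascii("[")` returns `("10", "11011")` and `split_ascii("\x1b")` (the ESC character) returns `("00", "11011")`.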

In the linked table, which I reproduce below, the four groups are represented by the columns and the rows represent the values.

00   01   10   11
NUL  Spc  @    `    00000
SOH  !    A    a    00001
STX  "    B    b    00010
ETX  #    C    c    00011
EOT  $    D    d    00100
ENQ  %    E    e    00101
ACK  &    F    f    00110
BEL  '    G    g    00111
BS   (    H    h    01000
TAB  )    I    i    01001
LF   *    J    j    01010
VT   +    K    k    01011
FF   ,    L    l    01100
CR   -    M    m    01101
SO   .    N    n    01110
SI   /    O    o    01111
DLE  0    P    p    10000
DC1  1    Q    q    10001
DC2  2    R    r    10010
DC3  3    S    s    10011
DC4  4    T    t    10100
NAK  5    U    u    10101
SYN  6    V    v    10110
ETB  7    W    w    10111
CAN  8    X    x    11000
EM   9    Y    y    11001
SUB  :    Z    z    11010
ESC  ;    [    {    11011
FS   <    \    |    11100
GS   =    ]    }    11101
RS   >    ^    ~    11110
US   ?    _    DEL  11111
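
If you want to regenerate the table yourself, a small Python sketch along these lines should do it. The control character abbreviations are copied from the table above; `label` and `four_column_rows` are illustrative names, and the column width of 5 is an arbitrary choice:

```python
CONTROL_NAMES = [
    "NUL", "SOH", "STX", "ETX", "EOT", "ENQ", "ACK", "BEL",
    "BS", "TAB", "LF", "VT", "FF", "CR", "SO", "SI",
    "DLE", "DC1", "DC2", "DC3", "DC4", "NAK", "SYN", "ETB",
    "CAN", "EM", "SUB", "ESC", "FS", "GS", "RS", "US",
]


def label(code: int) -> str:
    """Printable label for a 7-bit ASCII code."""
    if code < 32:
        return CONTROL_NAMES[code]
    if code == 32:
        return "Spc"
    if code == 127:
        return "DEL"
    return chr(code)


def four_column_rows() -> list[str]:
    """One row per 5-bit value, one cell per 2-bit group."""
    rows = []
    for value in range(32):
        cells = [label((group << 5) | value) for group in range(4)]
        rows.append("".join(c.ljust(5) for c in cells) + format(value, "05b"))
    return rows


if __name__ == "__main__":
    print("".join(format(g, "02b").ljust(5) for g in range(4)).rstrip())
    print("\n".join(four_column_rows()))
```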

Now in this table, look for ESC. It's in the first group, fifth from the bottom. It's in the first column so its group has bits '00', and its row has bits '11011'. Now look along the same row: what else is there? Yep, the '[' character is there, albeit in a different column:

  • 10 11011 means [
  • 00 11011 means ESC

So when you type CTRL+[ for ESC, you're asking for the equivalent of the character 11011 ([) out of the control set. Pressing CTRL simply sets all bits but the last 5 to zero in the character that you typed. You can imagine it as a bitwise AND.

  10 11011 ([)
& 00 11111 (CTRL)
= 00 11011 (ESC)
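
The same AND can be written as a one-line Python function. This is only a model of the description above, not how any particular terminal implements the CTRL key, and `ctrl` is a made-up name:

```python
def ctrl(ch: str) -> str:
    """Model CTRL+<key> as keeping only the low five bits of the key's code."""
    return chr(ord(ch) & 0b0011111)


assert ctrl("[") == "\x1b"  # ESC
assert ctrl("J") == "\n"    # LF, newline
assert ctrl("H") == "\b"    # BS, backspace
assert ctrl("I") == "\t"    # TAB
```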

This is why ^J types a newline, ^H types a backspace and ^I types a tab. This is why if you cat -A a Windows text file, it has ^M printed all over (meaning CR, because newlines are CR+LF on Windows).
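
Caret notation goes the other way: it takes a control character and names the key from the third column that sits on the same row. A minimal sketch, assuming the usual convention of flipping the 40h bit (which also gives ^? for DEL); `caret` and `show_control` are hypothetical helpers and do not reproduce the exact output of cat -A, which also marks line ends:

```python
def caret(ch: str) -> str:
    """Caret notation for ASCII control characters; others pass through."""
    code = ord(ch)
    if code < 0x20 or code == 0x7F:
        # Flipping bit 40h maps 00 xxxxx to 10 xxxxx, and DEL (7Fh) to '?'.
        return "^" + chr(code ^ 0x40)
    return ch


def show_control(text: str) -> str:
    """Make every control character in text visible."""
    return "".join(caret(c) for c in text)


assert show_control("a\tb\r\n") == "a^Ib^M^J"
```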
