DOS Memory Management

Original link: http://www.os2museum.com/wp/dos-memory-management/

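## DOS Memory Management: Summary

Memory management in early DOS (1.x and 2.0) evolved to handle RAM beyond the initial 64K limit. DOS 2.0 introduced three functions, `ALLOC`, `DEALLOC`, and `SETBLOCK` (allocate, free, and resize, respectively), to manage a contiguous “memory arena” divided up by memory control blocks (MCBs). Memory is allocated in units of paragraphs (16 bytes), and each MCB tracks ownership (a process ID) and size.

The system maintains a chain of MCBs and coalesces free blocks during allocation to limit fragmentation. Simple as it looks, DOS memory management has several pitfalls: zero-sized blocks can exist, a process can take over ownership of a block through `SETBLOCK`, and a bug causes `SETBLOCK` to resize the block even when the call fails.

Later versions (2.11 and 5.0) added features. DOS 2.11 introduced the undocumented first fit/best fit/last fit allocation strategies through `INT 21h/58h`. DOS 5.0 extended this with support for upper memory blocks (UMBs), allowing allocation from both conventional memory and UMBs, which made the system more complex but also more flexible. Despite these additions, block-based allocation tracked by MCBs remained the core of DOS memory management.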


Original article

The memory management in DOS is simple, but that simplicity may be deceptive. There are several rather interesting pitfalls that programming documentation often does not mention.

DOS 1.x (1981) had no explicit memory management support. It was designed to run primarily on machines with 64K RAM or less, or not too much more (the original PC could not have more than 64K RAM on the system board, although RAM expansion boards did exist). A COM program could easily access (almost) 64K memory when loaded, and many programs didn’t rely on even having that much. In fact the early PCs often only had 64K or 48K RAM installed. But the times were rapidly changing.

DOS 2.0 was developed to support the IBM PC/XT (introduced in March 1983), which came with 128K RAM standard, and models with 256K appeared soon enough. Even the older PCs could be upgraded with additional RAM, and DOS needed to have some mechanism to deal with that extra memory.

The DOS memory management was probably written sometime around summer 1982, and it meshed with the newly added process management functions (EXEC/EXIT/WAIT)—allocated memory is owned by the current process, and gets freed when that process terminates. Note that some versions of the memory manager source code (ALLOC.ASM) include a comment that says ‘Created: ARR 30 March 1983’. That cannot possibly be true because by the end of March 1983, PC DOS 2.0 was already released, and included the memory management support. The DOS 2.0 memory management functions were already documented in the PC DOS 2.0 manual dated January 1983.

In PC DOS 2.0, three memory management functions were introduced: ALLOC (48h), DEALLOC (49h), and SETBLOCK (4Ah). The DEALLOC function may be better known as “free” and SETBLOCK as “resize”. The all-caps names are used in the actual DOS source code.

Structure

The memory managed by DOS (the “memory arena”) starts out as a single contiguous block. It begins just past the end of statically allocated memory and ends at the end of conventional memory. The available memory can be subdivided into smaller blocks through allocation. After a number of cycles of allocating and freeing memory, the available memory may be split up into a relatively large number of blocks, often a mix of free and used memory.

Each block of memory is prefixed by a header. Note that in the DOS source code, this is called an “arena header”. In third party literature, it is usually called a “memory control block” or MCB. This article will use the MCB terminology.

First of all, DOS manages memory in units of paragraphs (16 bytes), not individual bytes. This approach is derived from the segmented 8086 architecture. Managing memory in paragraph units allows DOS to use 16-bit quantities to record the starting address and size of each block. In addition, the starting paragraph address is also implicitly the segment address of the block. Note that due to tracking sizes in terms of paragraphs, DOS memory blocks are not limited to 64K.
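The paragraph arithmetic is easy to check by hand. The following is a minimal Python sketch, not DOS code; the helper names `bytes_to_paragraphs` and `segment_to_linear` are made up for illustration.

```python
PARAGRAPH = 16  # bytes per paragraph on the 8086


def bytes_to_paragraphs(nbytes: int) -> int:
    """Round a byte count up to a whole number of paragraphs."""
    return (nbytes + PARAGRAPH - 1) // PARAGRAPH


def segment_to_linear(segment: int) -> int:
    """Linear (physical) address of offset 0 in the given segment."""
    return segment * PARAGRAPH


print(bytes_to_paragraphs(100))         # 7
print(hex(segment_to_linear(0x1234)))   # 0x12340
print(0xFFFF * PARAGRAPH)               # 1048560 bytes, just under 1 MB
```

The last line shows why a 16-bit paragraph count is enough: the largest value a 16-bit size can express already covers (almost) the whole 1MB address space.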

The MCB by necessity takes up an entire paragraph, even though only 5 bytes were initially used; the following is an excerpt from MS-DOS 2.11 DOSSYM.ASM:

;
; arena item
;
arena   STRUC
arena_signature     DB  ?               ; 4D for valid item, 5A for last item
arena_owner         DW  ?               ; owner of arena item
arena_size          DW  ?               ; size in paragraphs of item
arena   ENDS

The signature byte is ‘M’ for all memory blocks except the last one, with the last block using a ‘Z’ signature. Perhaps ‘M’ stands for “memory” and ‘Z’ for “last” (block), or perhaps MZ are the initials of Mark Zbikowski, one of the core developers of DOS 2.0.

The DOS memory management functions check the signature of each MCB they work with. If it’s not ‘M’ or ‘Z’, an error is reported, and if that happens, all bets are off because something in the system corrupted memory, and nothing can be trusted anymore.

The owner of a memory block is a 16-bit word. It is set to zero to indicate a free block. A non-zero value is normally the PID (process identifier) of the owner, that is, the address of the PSP of the owning process. This is important when a process terminates, because DOS automatically frees all memory blocks that the process owned. Note that DOS performs no validity checks on the owner; any process can free or resize any block, regardless of who owns it, and the MCB owner need not be a valid PID.

The size is simply the size of the memory block in paragraphs, in theory up to (almost) 1MB.
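To make the layout concrete, here is a small Python sketch that decodes the first five bytes of a 16-byte header according to the structure quoted above. The `MCB` tuple and `parse_mcb` function are illustrative names, and the sample bytes are invented rather than taken from a real system.

```python
import struct
from typing import NamedTuple


class MCB(NamedTuple):
    signature: bytes  # b"M", or b"Z" for the last block
    owner: int        # 0 = free, otherwise normally the owner's PSP segment
    size: int         # block size in paragraphs, header not included


def parse_mcb(paragraph: bytes) -> MCB:
    """Decode the 5 bytes used by the DOS 2.x arena header (little-endian)."""
    if len(paragraph) < 5:
        raise ValueError("need at least 5 bytes of header")
    signature, owner, size = struct.unpack_from("<cHH", paragraph, 0)
    if signature not in (b"M", b"Z"):
        raise ValueError("bad signature: arena is corrupt")
    return MCB(signature, owner, size)


# Invented sample: an 'M' block owned by PSP 1234h, 10h paragraphs long.
sample = b"M" + (0x1234).to_bytes(2, "little") + (0x0010).to_bytes(2, "little") + bytes(11)
header = parse_mcb(sample)
print(header.signature, hex(header.owner), header.size)  # b'M' 0x1234 16
```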

The entirety of the memory managed by DOS is described by a chain of MCBs. The start of the chain is located through the arena_head variable within DOS. Each memory block is immediately followed by the MCB describing the next block, except for the last block in the chain (with the ‘Z’ signature) which has no follower.

The MCB chain acts somewhat like a linked list, but it is not a linked list. Instead of using some kind of a link pointer to the next item in the chain, the chain structure is implied by the location and size of the memory blocks. The memory blocks can only be processed in strictly ascending order and there cannot be any gaps between them.
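The implied linkage can be sketched in a few lines of Python. This is a toy model, not DOS code: memory is a `bytearray`, the helper names `write_mcb` and `walk_chain` are hypothetical, and the three blocks are invented. The only rule taken from the text is that the next MCB sits at the current MCB's segment plus one header paragraph plus the block size.

```python
import struct

PARA = 16  # bytes per paragraph


def write_mcb(mem, seg, signature, owner, size):
    """Store a 5-byte arena header at the start of paragraph `seg`."""
    struct.pack_into("<cHH", mem, seg * PARA, signature, owner, size)


def walk_chain(mem, head_seg):
    """Yield (mcb_segment, signature, owner, size) in ascending order."""
    seg = head_seg
    while True:
        signature, owner, size = struct.unpack_from("<cHH", mem, seg * PARA)
        if signature not in (b"M", b"Z"):
            raise ValueError(f"corrupt MCB at segment {seg:#06x}")
        yield seg, signature, owner, size
        if signature == b"Z":
            return  # the last block has no follower
        seg += 1 + size  # skip the header paragraph and the block itself


mem = bytearray(0x40 * PARA)              # toy machine with 40h paragraphs
write_mcb(mem, 0x10, b"M", 0x0011, 0x0F)  # owned block, next MCB at 20h
write_mcb(mem, 0x20, b"M", 0x0000, 0x07)  # free block, next MCB at 28h
write_mcb(mem, 0x28, b"Z", 0x0011, 0x17)  # last block, ends at 40h

for seg, signature, owner, size in walk_chain(mem, 0x10):
    print(f"{seg:#06x} {signature.decode()} owner={owner:#06x} size={size:#06x}")
```

Note that there is no way to step backwards from a given MCB; reaching the previous block means starting over from the head of the chain.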

Functions

The DOS memory functions are simple enough. The ALLOC function takes the desired size in paragraphs, and either returns a pointer to newly allocated memory in the AX register, or returns an error code in AX and the size of the largest free block in the BX register.

DOS programs often call ALLOC twice, first attempting to allocate FFFFh paragraphs, which will fail and return the maximum available size. The available maximum is then allocated in the next ALLOC call. Because DOS isn’t a multi-tasking OS, this simple approach reliably works.

A successful ALLOC returns the segment address of the newly allocated memory block in register AX, and the paragraph immediately preceding the allocated memory contains the block’s MCB header.

The DEALLOC function is very simple, only setting the block’s owner to zero to mark it as free.

The SETBLOCK function is somewhat like realloc() in the Standard C library, but never moves the allocated block. Resizing a block to the same or smaller size will always succeed, and will free up the remaining memory. Resizing to a larger size may fail, and if it does, the maximum available size will be returned in the BX register (just like when allocating).

Coalescing

An important feature of the DOS memory manager is coalescing of free memory, i.e. merging adjacent free memory blocks.

If a program successfully calls the ALLOC function twice, it will often own two adjacent memory blocks. If it then calls DEALLOC on each block, there will be two adjacent free memory blocks. When allocating memory again, these free blocks somehow need to be coalesced so that the free memory does not become endlessly fragmented.

DOS uses a simple strategy which will always coalesce free blocks when necessary. It works as follows:

  • The DEALLOC function performs no coalescing whatsoever.
  • The SETBLOCK function will coalesce all free blocks that immediately follow the memory block being resized (even if the new size is the same or smaller).
  • The ALLOC function processes the entire MCB chain and coalesces all free memory blocks that can be coalesced; this ensures that ALLOC always finds the largest available free block.

Naively one might think that the DEALLOC function is a good time to coalesce, but it’s not. If two adjacent blocks are freed, and the block higher in memory is freed last, coalescing can’t be done because DOS cannot reach the previous (lower in memory) MCB, only the next one.

The ALLOC function does the heavy lifting, but that is inevitable: Only by walking the entire MCB chain can DOS coalesce all eligible memory and ensure that the largest free block is found. This means that calling the ALLOC function can be somewhat expensive if the MCB chain is long. In practice, there are unlikely to be more than a few dozen MCBs, even in a heavily loaded system.
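The division of labor between DEALLOC and ALLOC can be modeled in Python. This is a simplified sketch under stated assumptions, not a transcription of ALLOC.ASM: the `Arena` class is hypothetical, blocks are kept as `[owner, size]` pairs whose addresses are implied by their order, coalescing runs as a separate pass before the search, and only first fit is modeled.

```python
class Arena:
    """Toy MCB chain: a list of [owner, size] pairs in ascending address order."""

    def __init__(self, head_seg, total_paragraphs):
        self.head = head_seg
        # One free block; one paragraph is taken by its own MCB.
        self.blocks = [[0, total_paragraphs - 1]]

    def _index_of(self, block_seg):
        seg = self.head
        for i, (_, size) in enumerate(self.blocks):
            if seg + 1 == block_seg:  # the block starts right after its MCB
                return i
            seg += 1 + size
        raise ValueError("no block at that segment")

    def dealloc(self, block_seg):
        # Only marks the block free; no coalescing happens here.
        self.blocks[self._index_of(block_seg)][0] = 0

    def alloc(self, owner, want):
        # Pass 1: merge every run of adjacent free blocks.
        i = 0
        while i < len(self.blocks) - 1:
            cur, nxt = self.blocks[i], self.blocks[i + 1]
            if cur[0] == 0 and nxt[0] == 0:
                cur[1] += 1 + nxt[1]  # the absorbed MCB paragraph is reclaimed too
                del self.blocks[i + 1]
            else:
                i += 1
        # Pass 2: first fit, remembering the largest free block seen.
        seg, largest = self.head, 0
        for i, (blk_owner, size) in enumerate(self.blocks):
            if blk_owner == 0:
                largest = max(largest, size)
                if size >= want:
                    if size > want:  # split; the remainder needs its own MCB
                        self.blocks.insert(i + 1, [0, size - want - 1])
                    self.blocks[i] = [owner, want]
                    return True, seg + 1
            seg += 1 + size
        return False, largest


arena = Arena(head_seg=0x1000, total_paragraphs=0x100)
psp = 0x0900  # hypothetical owner PID
_, a = arena.alloc(psp, 0x10)
_, b = arena.alloc(psp, 0x10)
arena.dealloc(a)
arena.dealloc(b)
print(len(arena.blocks))                    # 3 free blocks, nothing merged yet
ok, largest = arena.alloc(psp, 0xFFFF)      # fails, but coalesces on the way
print(ok, hex(largest), len(arena.blocks))  # False 0xff 1
ok, seg = arena.alloc(psp, largest)
print(ok, hex(seg))                         # True 0x1001
```

The last three calls are the two-call idiom described earlier: a deliberately oversized request reports the largest free block, which is then allocated for real.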

Caveats

Now we come to the less obvious aspects of DOS memory management. Some are inevitable, some are strange, some are really bugs.

It is possible to have zero-sized memory blocks (i.e. blocks consisting solely of an MCB header). The ALLOC function does not refuse to allocate zero-sized blocks. In addition, zero-sized blocks are sometimes created unavoidably in the course of calling other functions. For example, if an existing allocation is resized to be exactly one paragraph smaller, the freed paragraph has to hold the MCB of the new free block, leaving that block with a size of zero.

The SETBLOCK function always sets the MCB owner when it succeeds. That is, a DOS process may resize any existing memory block, regardless of who owns it, and become the owner. If the resizing succeeds (and resizing to the same or smaller size always will), the calling process will become the owner of the block. Obviously, resizing memory blocks owned by other processes is a risky business.

It is possible to use SETBLOCK on a free memory block. Programs obviously should not be doing that, because they do not own free memory. However, DOS makes no attempt to prevent such calls. In addition, thanks to the surprising behavior noted above, successfully calling SETBLOCK on a free memory block will effectively allocate it.

When the SETBLOCK function fails because the requested size was too large, it returns the maximum available size that the block can be resized to. However, DOS already resized the memory block to that maximum available size. This is almost certainly a bug, one that Microsoft didn’t dare fix in later DOS versions.
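These caveats are easier to see in a model. The Python sketch below is an illustration under assumptions, not the DOS implementation: the `setblock` function and the block list are hypothetical, blocks are `[owner, size]` pairs in address order, and the choice to leave the owner untouched on failure is a modeling assumption. The behavior taken from the text is that following free blocks are coalesced first, that success always sets the owner, and that a failed grow leaves the block at its maximum size.

```python
def setblock(blocks, index, new_size, caller):
    """Model SETBLOCK on blocks[index]; returns (ok, size_or_maximum)."""
    block = blocks[index]
    # Absorb every free block that immediately follows, whatever the request.
    while index + 1 < len(blocks) and blocks[index + 1][0] == 0:
        block[1] += 1 + blocks[index + 1][1]
        del blocks[index + 1]
    if new_size > block[1]:
        # Failure: report the maximum, but the block stays grown to it.
        return False, block[1]
    if new_size < block[1]:
        # Split off the remainder; one paragraph goes to its MCB.
        blocks.insert(index + 1, [0, block[1] - new_size - 1])
        block[1] = new_size
    block[0] = caller  # success always makes the caller the owner
    return True, new_size


A, B = 0x0A00, 0x0B00  # two hypothetical process IDs
blocks = [[A, 0x10], [0, 0x20], [B, 0x08]]

print(setblock(blocks, 0, 0x40, A))  # (False, 49): too large, 31h is the maximum
print(blocks[0])                     # [2560, 49]: the block was grown anyway

print(setblock(blocks, 1, 0x08, A))  # (True, 8): same size, always succeeds
print(hex(blocks[1][0]))             # 0xa00: process A now owns B's block

print(setblock(blocks, 1, 0x07, A))  # (True, 7): shrink by one paragraph
print(blocks[2])                     # [0, 0]: a zero-sized free block
```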

DOS 2.11 Enhancements

In MS-DOS version 2.11, Microsoft added a memory management tweak which was never documented until much later. Note that this change was not in PC DOS 2.1, but was naturally included in PC DOS 3.0.

In MS-DOS 2.11, there is a new INT 21h function called AllocOper (58h). It allows the caller to set or get the memory allocation strategy. The available options are usually referred to as “first fit” (0, the default value), “best fit” (1), and “last fit” (2).

When the ALLOC function scans the entire MCB chain and coalesces memory, it makes a note of the first (lowest) free memory block that is big enough to satisfy the allocation, the last (highest) free memory block that is big enough, and also the smallest (best) memory block big enough to satisfy the allocation.

Obviously these are not necessarily three different blocks. Two or even all three of the possible allocation options may well be the same block, especially if the number of free memory blocks is low.
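The three candidates can be sketched as follows. This is a simplified Python illustration, not the DOS code: `pick_block` is a hypothetical helper, the free list is invented, and how real DOS breaks ties between equally sized best-fit candidates is not modeled (the sketch simply takes the lowest one).

```python
FIRST_FIT, BEST_FIT, LAST_FIT = 0, 1, 2  # strategy values named in the text


def pick_block(free_blocks, want, strategy):
    """Choose among free blocks given as (segment, size), lowest address first."""
    fits = [blk for blk in free_blocks if blk[1] >= want]
    if not fits:
        return None
    if strategy == FIRST_FIT:
        return fits[0]                         # lowest block that is big enough
    if strategy == LAST_FIT:
        return fits[-1]                        # highest block that is big enough
    if strategy == BEST_FIT:
        return min(fits, key=lambda b: b[1])   # smallest block that is big enough
    raise ValueError("unknown strategy")


free = [(0x1000, 0x80), (0x2000, 0x20), (0x3000, 0x40)]

for strategy in (FIRST_FIT, BEST_FIT, LAST_FIT):
    seg, size = pick_block(free, 0x20, strategy)
    print(strategy, hex(seg), hex(size))
# 0 0x1000 0x80
# 1 0x2000 0x20
# 2 0x3000 0x40

# With a larger request only one block qualifies, so all strategies agree.
print({pick_block(free, 0x50, s) for s in (FIRST_FIT, BEST_FIT, LAST_FIT)})
# {(4096, 128)}
```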

Note that INT 21h/58h was not documented in official Microsoft programming references for DOS 2.x, 3.x, and 4.0. The DOS 5.0 reference does document AllocOper, but claims that it was added in DOS 3.0, which is not quite true.

DOS 5.0 Enhancements

The DOS memory management saw very minimal changes from version 2.11 up to and including DOS 4.0. DOS 5.0 brought somewhat significant changes related to UMB support.

When DOS 5.0 and later runs with DOS=UMB, there will be not one but two memory arenas (assuming that UMBs are actually available). The AllocOper function (58h) was significantly extended to support UMBs.

In addition to the first/best/last fit allocation strategy introduced in DOS 2.11, DOS 5.0 introduces three additional strategies, each of which is combined with the first/best/last fit strategy:

  • Allocate from conventional memory only (backward compatible)
  • Allocate from UMBs first, then from conventional memory
  • Allocate from UMBs only
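How the two choices combine into one strategy value can be sketched in Python. The article does not give the numeric encoding; the bit values below (80h for “UMBs first”, 40h for “UMBs only”) are recalled from DOS 5.0 programming references and should be treated as an assumption to verify, and `describe_strategy` is a made-up helper.

```python
FIT_NAMES = {0: "first fit", 1: "best fit", 2: "last fit"}
HIGH_FIRST = 0x80  # assumed value: UMBs first, then conventional memory
HIGH_ONLY = 0x40   # assumed value: UMBs only

REGION_NAMES = {
    0: "conventional memory only",
    HIGH_FIRST: "UMBs first, then conventional memory",
    HIGH_ONLY: "UMBs only",
}


def describe_strategy(code: int) -> str:
    """Split a strategy value into its fit part and its memory-region part."""
    region_bits = code & (HIGH_FIRST | HIGH_ONLY)
    fit = FIT_NAMES.get(code & ~(HIGH_FIRST | HIGH_ONLY))
    if fit is None or region_bits not in REGION_NAMES:
        raise ValueError(f"unknown strategy {code:#04x}")
    return f"{fit}, {REGION_NAMES[region_bits]}"


for code in (0x00, 0x81, 0x42):
    print(f"{code:#04x}: {describe_strategy(code)}")
# 0x00: first fit, conventional memory only
# 0x81: best fit, UMBs first, then conventional memory
# 0x42: last fit, UMBs only
```

Under this encoding the pre-5.0 values 0, 1 and 2 keep their old meaning, which is what makes the first option backward compatible.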

In addition to supporting new memory allocation strategies as controlled by INT 21h/58h subfunctions 0 and 1 (get and set allocation strategy, respectively), DOS 5.0 also added new subfunctions 2 and 3. These allow the caller to query (subfunction 2) or set (subfunction 3) the UMB link.

That is, DOS 5.0 can either link or unlink UMBs from the standard memory chain (in conventional memory). To allocate memory from UMBs, a program must both set an allocation strategy which looks at UMBs and link the UMBs to the pool of available memory.

Note that the allocation strategy and UMB link setting are both global DOS state, not per-process. A DOS program which changes either the allocation strategy or the UMB link state should restore the original setting before it terminates, at least according to the MS-DOS 5.0 Programming Reference.
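The save-and-restore discipline can be illustrated with a small Python model. Nothing here talks to DOS: `FakeDos` is a hypothetical stand-in for the global state reached through INT 21h/58h, and the strategy value 80h is an assumed encoding for “UMBs first”.

```python
from contextlib import contextmanager


class FakeDos:
    """Stand-in for DOS global state; not a real API."""

    def __init__(self):
        self.strategy = 0x00     # first fit, conventional memory only
        self.umb_linked = False  # UMBs not linked into the chain


@contextmanager
def upper_memory(dos, strategy):
    """Temporarily link UMBs and switch strategy, then restore both."""
    saved = (dos.strategy, dos.umb_linked)      # query before changing
    dos.strategy, dos.umb_linked = strategy, True
    try:
        yield dos
    finally:
        dos.strategy, dos.umb_linked = saved    # restore even on error


dos = FakeDos()
with upper_memory(dos, strategy=0x80):          # 80h is an assumed value
    print(hex(dos.strategy), dos.umb_linked)    # 0x80 True
print(hex(dos.strategy), dos.umb_linked)        # 0x0 False
```

Because both settings are global, a program that skips the restore step changes where every later program's allocations land.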

Summary

Although DOS memory management is in principle very simple, users may find some of its behaviors surprising. The addition of UMB support in DOS 5.0 made DOS memory management noticeably less simple than before, although only TSRs and drivers tend to worry about upper memory.
