(comments)

Original link: https://news.ycombinator.com/item?id=43456669

The Hacker News discussion concerns AttilaT's "Shift-to-Middle Array," a data structure proposed as an alternative to `std::deque` that offers faster insertion and deletion at both ends and improved cache locality. It keeps a contiguous memory layout and redistributes free space toward the middle to reduce data movement. Early comments raise concerns about handling non-trivial types (those needing destructors or move constructors) and suggest a `static_assert` check; others ask for `begin()`/`end()` iterators. The thread compares the structure with gap buffers, ring buffers, and ExpandingRingBuffer, and asks whether the free and occupied regions are contiguous. orlp shares a similar, unfinished project, "devector," and proposes a different free-space management strategy to optimize memory use and avoid over-allocation. The discussion also touches on the potential advantages of the Shift-to-Middle Array over `VecDeque` (fully contiguous elements). One user notes that the benchmark link is broken.

Related Articles
  • Shift-to-Middle Array: A Faster Alternative to std::deque? 2025-03-24
  • (comments) 2023-12-18
  • (comments) 2024-09-02
  • (comments) 2024-07-14
  • (comments) 2025-03-17

  • Original article
    Shift-to-Middle Array: A Faster Alternative to Std:Deque? (github.com/attilatorda)
    22 points by AttilaT 39 minutes ago | 10 comments

    A couple of notes from looking at the C++ implementation:

    - this is going to have problems with non-trivial types (think about destructors or move constructors, like std::unique_ptr). If you don't want to deal with them, at least add a static_assert(std::is_trivially_copyable<T>::value == true);

    - front() doesn't return a reference and it doesn't even return the front

    - adding iterators (begin()/end()) will let it play nice with for( : ) loops, standard algorithms, etc.
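
    A minimal sketch of the interface changes suggested above. The class name and member names (data_, head_, tail_) are placeholders invented for illustration, not the repository's actual identifiers:

        #include <cstddef>
        #include <type_traits>

        template <typename T>
        class ShiftToMiddleArray {                    // placeholder name
            static_assert(std::is_trivially_copyable<T>::value,
                          "container memmoves its elements, so T must be trivially copyable");
            T* data_;                                 // contiguous storage
            std::size_t head_;                        // index of the first element
            std::size_t tail_;                        // one past the last element
        public:
            T&       front()       { return data_[head_]; }   // a reference to the actual front
            const T& front() const { return data_[head_]; }
            T*       begin()       { return data_ + head_; }  // enables range-for and <algorithm>
            T*       end()         { return data_ + tail_; }
            const T* begin() const { return data_ + head_; }
            const T* end()   const { return data_ + tail_; }
        };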



    Interesting! This is the kind of thing I like.

    I'm having a hard time understanding the description. If I understand right, it's kind of like an inside-out gap buffer, or a hybrid of a gap buffer and a ring buffer? Is the free space in the array always contiguous? If not, is the non-free space? How is it different from ExpandingRingBuffer?



    I recently developed a new data structure called the Shift-To-Middle Array, designed as an alternative to std::deque, std::vector, and linked lists. My goal was to optimize insertion and deletion at both ends, while also improving cache locality and performance compared to traditional implementations.

    What is the Shift-To-Middle Array? Unlike std::deque, which uses a fragmented block-based structure, the Shift-To-Middle Array maintains a contiguous memory layout. Instead of shifting elements inefficiently (like std::vector), it dynamically redistributes free space toward the middle, reducing unnecessary data movement.

    Key Features:
    Fast insertions & deletions at both ends (amortized O(1))
    Efficient cache utilization (better than linked lists)
    Supports fast random access (O(1))
    No pointer chasing (unlike linked lists)
    Parallelization & SIMD optimizations possible
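
    A rough illustration of the "redistribute free space toward the middle" step described above, not the author's actual code: the struct, field names, element type (int), and recenter are placeholders, and the element type is assumed trivially copyable:

        #include <cstddef>
        #include <cstdlib>
        #include <cstring>

        struct Buf {
            int*        data;   // contiguous storage
            std::size_t cap;    // total slots
            std::size_t head;   // index of the first element
            std::size_t tail;   // one past the last element
        };

        // Called when a push runs into either end of the buffer: grow if the
        // buffer is mostly full, then move the elements so the remaining free
        // space is split evenly between the two ends.
        void recenter(Buf& b) {
            std::size_t n = b.tail - b.head;
            std::size_t new_cap = b.cap;
            if (2 * n + 2 > new_cap)                   // more than ~half full: grow
                new_cap = 2 * new_cap + 4;             // amortized O(1) growth
            int* dst = (new_cap == b.cap)
                     ? b.data
                     : static_cast<int*>(std::malloc(new_cap * sizeof(int)));
            std::size_t new_head = (new_cap - n) / 2;  // centre the elements
            std::memmove(dst + new_head, b.data + b.head, n * sizeof(int));
            if (dst != b.data) { std::free(b.data); b.data = dst; }
            b.cap = new_cap; b.head = new_head; b.tail = new_head + n;
        }

        // push_back / push_front only call recenter() when their own end is full:
        //   if (b.tail == b.cap) recenter(b);  b.data[b.tail++] = v;   // back
        //   if (b.head == 0)     recenter(b);  b.data[--b.head] = v;   // front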

    Performance Benchmarks: I benchmarked Shift-To-Middle Array vs. std::deque vs. ExpandingRingBuffer vs. std::queue across different workloads. Some highlights:

    Push-heavy workload → Shift-To-Middle Array showed improved insertion performance over std::deque.

    Pop-heavy workload → Showed improvements in memory access and removal operations.

    Random insert/remove workloads → Demonstrated better cache efficiency compared to linked lists.

    (Full benchmarks and source code available below.)

    When Should You Use It?

    High-performance queue-like structures

    Game engines (handling real-time events efficiently)

    Networking applications (handling packet buffers)

    Dynamic sequences (e.g., computational geometry, physics sims)

    Would love to hear thoughts and feedback from the community! Have you encountered similar performance bottlenecks with std::deque or other dynamic structures?



    Sounds very cool! How do you implement efficient random deletes?


    I made something similar to this ~10 years ago: https://github.com/orlp/devector. I never finished it (writing proper containers in C++ is a nightmare [1] [2] [3]), although I did start a similar project in Rust a year or two ago... which I also haven't finished yet (the repo is still private). The double-ended vector is very similar to a regular vector, it can just have free space on both ends:

        <------------ cap_front ------------>
                          <------------ cap_back ------------>
        <-----------------  total_capacity  ----------------->
                          <-----  len  ----->
        <-- space_front -->                 <-- space_back -->
        [                 [    elements     ]                ]
                          ^
                          +--- ptr
    
    In the Rust crate I store 1 pointer and three lengths: len, space_front, space_back for a total size of 32 bytes compared to the usual 24 bytes of Vec.

    ---

    I don't think you always want to shift to the middle. Rather, I propose the following strategy (which I do in the Rust crate; I'm unsure if I did the same in the C++ implementation):

    1. When a request is made for more free space on one side, check if there is already enough free space, and if not,

    2. Compute an amortized growing capacity (e.g. double the current capacity), and take the maximum of that with the requested capacity. While doing this ensure you only take into account the capacity of the side you want more space on (e.g. cap_back in the above picture when growing the back),

    3. Check if halving the free space on the other side is sufficient to satisfy the amortized request, if yes, do not reallocate and just shift the values internally, otherwise,

    4. Allocate a new buffer with the computed capacity, plus the same amount of free space on the other side and copy over the values.

    The above strategy ensures you will not exceed 3N space (with doubling space on grow) even when the double-ended vector is used in a LIFO pattern. For example a regular Vec which doubles its size has a 2N total space worst-case.

    [1] https://stackoverflow.com/questions/26902006/may-the-element... [2] https://stackoverflow.com/questions/27453230/is-there-any-wa... [3] https://stackoverflow.com/questions/26744589/what-is-a-prope...
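
    For illustration only, a C++ rendering of the four-step strategy described above (the original is a Rust crate; the struct layout, field names, element type, and reserve_back are invented for this sketch, which assumes the vector already owns an allocation):

        #include <algorithm>
        #include <cstddef>
        #include <cstdlib>
        #include <cstring>

        struct Devector {
            int*        ptr;          // points at the first element, as in the diagram
            std::size_t len;          // number of elements
            std::size_t space_front;  // free slots before the elements
            std::size_t space_back;   // free slots after the elements
        };

        // Ensure there is room for `need` more elements at the back.
        void reserve_back(Devector& v, std::size_t need) {
            if (v.space_back >= need) return;                       // 1. already enough space

            // 2. amortized target, counting only the back side (cap_back = len + space_back)
            std::size_t cap_back     = v.len + v.space_back;
            std::size_t new_cap_back = std::max(2 * cap_back, v.len + need);

            // 3. is halving the front free space enough to reach that target?
            std::size_t shift = v.space_front - v.space_front / 2;  // slots donated by the front
            if (v.len + v.space_back + shift >= new_cap_back) {
                std::memmove(v.ptr - shift, v.ptr, v.len * sizeof(int));
                v.ptr         -= shift;
                v.space_front -= shift;
                v.space_back  += shift;
                return;                                             // shifted in place, no realloc
            }

            // 4. reallocate: new_cap_back on this side plus the same free space in front
            std::size_t new_total = v.space_front + new_cap_back;
            int* base = static_cast<int*>(std::malloc(new_total * sizeof(int)));
            std::memcpy(base + v.space_front, v.ptr, v.len * sizeof(int));
            std::free(v.ptr - v.space_front);        // old allocation starts space_front before ptr
            v.ptr        = base + v.space_front;
            v.space_back = new_cap_back - v.len;
        }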



    What benefits does this have over a standard VecDeque?


    The elements are completely contiguous, which can be nice for passing off (subslices) to other APIs, maximum speed iteration, etc.


    Does it have to move or resize when one of the sides reaches the end of the array? I presume that would be slower than a ring buffer that only grows when it's completely filled?


    For context: a VecDeque is a ring buffer backed by an array.


    Is it just me, or is the benchmark report link dead, so there's no way to see the comparison?
