#include <semistable/vector.hpp>
#include <iostream>
int main()
{
semistable::vector<int> x = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
auto it = x.begin() + 5;
std::cout << *it << "\n"; // prints 5
x.erase(x.begin()); // erases first element
std::cout << *it << "\n"; // prints 5 again!
}std::vector stores its contents in a contiguous block of memory, so mid insertions and
erasures invalidate references and iterators to previous elements. semistable::vector
is called semistable in the sense that, while references are still unstable, its iterators
correctly track elements in situations like the above.
semistable::vector stores elements contiguously and provides the same API as std::vector with the extra
guarantee of iterator stability (including end()). The library is header-only and depends solely on
Boost.Config.
C++11 or later required.
From the point of view of stability, there are three types of operation that cause iterators
to become invalid in a classical std::vector:
- insertion of elements before a given position,
- erasure of elements before a given position,
- reallocation to a new buffer (e.g. with a call to
reserve).
When any of these operations happens, semistable::vector creates a new epoch descriptor
indicating the change. Outstanding iterators internally point to the epoch that was
current when they were last used. All arrows in the diagram are std::shared_ptrs:
When the iterators are used, they follow the chain of epoch descriptors till the last one, making the necessary adjustments to their stored position along the way. This ensures that dereference (as well as other iterator operations) are consistent with the current state of the vector:
When an epoch descriptor is outdated (all outstanding iterators are past it), it gets automatically
deleted (no shared_ptr points to it any longer).
The graph shows normalized execution times of the following operations:
- traversal with
for_each, - repeated insertion at the end,
erase_ifof odd elements,- sorting of elements,
for std::vector<int>, std::list<int> and semistable::vector<int> with 0.5M elements.
Results are normalized to the execution time of std::vector. Benchmarks compiled with
clang-cl for Visual Studio 2022 in release mode.
Some observations:
semistable::vectoriterators provide araw()member function returning a plain pointer to the element (this is equivalent to callingstd::to_addresson the iterator). When usingraw()for traversal and sorting (that is,std::for_each(x.begin().raw(), x.end.raw(), ...),std::sort(x.begin().raw(), x.end.raw()), execution times are of course the same as withstd::vector. C++20 introduces the notion of contiguous iterators, which standard algorithms could in principle take advantage of by internally converting contiguous iterators to pointers for increased performance. In reality, though, no standard library implementation does that except for a handful of algorithms with a high chance of being eligible for autovectorization.std::list::sortis not entirely equivalent to sorting a vector (semistable or otherwise), as the former sorts nodes whereas the latter sorts values. So, astd::listiterator will point to the exact same value after sorting, which is not the case for vectors.
Like standard C++ containers, semistable::vector const and
const-like member
functions are thread safe. Iterator usage, however, requires extra precautions:
- The same iterator object can't be used concurrently in different threads, even for nominally const operations such as dereferencing (internally, thread-unsafe epoch traversal is triggered).
- An iterator can't be used concurrently with any thread-unsafe operation on
the
semistable::vectorit belongs in, even if the operation does not touch the piece of memory the iterator points to.
These limitations could in principle be avoided by modifying the library's implementation to use atomic shared pointers.
Currently, iterator stability is not enforced if an exception
is thrown other than by the allocator (typically, by a value_type construction or assignment operation).
There's no internal impediment to evolving the library so as to properly cover these cases.
If an iterator it is kept in the program and never touched while its semistable::vector is
being modified, the associated epoch chain will grow undefinitely because its head can't
be garbage-collected as long as it points at it.
Much as with std::vector, using a semistable::vector iterator pointing to an erased element is still
undefined behavior. The internal epoch machinery, however, could be easily leveraged so that
those illegal uses are detected and signaled via an exception or some other mechanism.
Per a question by Mark Hoemmen,
we've created a monothread version
of semistable::vector that internally uses
boost::local_shared_ptr
instead of std::shared_ptr. Perhaps surprisingly, performance is noticeably better than
with the thread-safe version.
Thanks to Dmitry Arkhipov for his help setting the CI for this project.



