Zero-copy resizeable vector. #
Code: bit-bcast/mmap_vector
We can append elements to an std::vector in O(1), amortized. This works, because when we run out of space (capacity), internally it allocates a new buffer that’s twice the size and copies the values from the old to the new buffer and frees the old buffer.
I thought this was optimal because:
-
we can’t really know if some other application used the space just above ours. Hence, we can’t be sure we’re not “blocked”,
-
we need consecutive addresses in RAM.
The thought is entirely flawed. Applications use virtual addresses; which means
our application has the whole 2**64
address range to itself (yes, the kernel
reserves the top third for itself). Virtual addresses are divided into pages,
usually 4kB, each page is mapped to 4kB of consecutive physical addresses; but
from one page to the next the physical address can jump.
Eventually, the thought is: why can’t a vector keep growing without copying? Some theory first:
Reserving Virtual Addresses #
Virtual addresses can be reserved using mmap
as follows:
void * mmap_allocate(size_t n) {
void * addr = mmap(nullptr, n, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
if(addr == MAP_FAILED) {
throw std::runtime_error("Failed to mmap, errno: " + std::to_string(errno));
}
return addr;
}
The nullptr
indicates that we don’t care about the starting address. It’s
important to pick PROT_NONE
, meaning nobody can ready or write to these
pages. Picking PROT_READ | PROT_WRITE
will cause the application to run out
of memory. We don’t need other processes to have access to this address range.
Hence, MAP_PRIVATE
and MAP_ANONYMOUS
means it’s not backed by a file. Then
-1
and 0
are dummy arguments.
Allocating Physical Memory #
We have the addresses, now we need to back some pages with physical memory.
This is done by changing the protection of all required pages to read and write (and then writing from a page):
void mmap_grow(void * addr, size_t n_old, size_t n_new) {
void *begin = next_page((char*) addr + n_old);
std::ptrdiff_t dn = (char *)addr + n_new - (char *)(begin);
if (dn > 0) {
if(mprotect(begin, n_new, PROT_WRITE | PROT_READ) == -1) {
throw std::runtime_error("Failed mprotect: " + std::to_string(errno));
}
}
}
Deallocating Physical Memory #
We can use madvise
with MADV_DONTNEED
followed by mprotect
with
PROT_NONE
to reduce the memory consumption.
Unmapping Virtual Addresses #
To deallocate everything we simply unmap the entire region:
void mmap_free(void * addr, size_t n) {
auto status = munmap(addr, n);
if(status == -1) {
throw std::runtime_error("Failed to unmap.");
}
}
Example: Zero-copy Resizeable Vector #
Using mmap_reserve
and mmap_resize
, it should be trivial to demonstrate the
idea with a vector that reserves virtual addresses for 1T elements, but only
adds physical memory one page at a time. Meaning we can iteratively grow the
vector without copying the existing values. First, however:
Check with Valgind #
Clearly, we should check our implementation for leaks; and valgrind
is the
program to use
==3016== Command: ./cmake-build-debug/mmap_vector
==3016==
terminate called after throwing an instance of 'std::runtime_error'
what(): Failed to mmap, errno: 22
==3016==
which StackOverflow tells us is expected, because Valgrind intercepts mmap
and replaces it with something incompatible.
… and there goes all viability of the idea.