Assembling Your Own Allocator

In addition to defining the interfaces above, this package also implements untyped composable memory allocators. They are untyped because they deal exclusively in void[] and have no notion of what type the allocated memory is destined for. They are composable because the included allocators are building blocks that can be assembled into complex, nontrivial allocators.

Unlike the allocators for the C and C++ programming languages, which manage the allocated size internally, these allocators require that the client maintain (or know a priori) the allocation size for each piece of memory allocated. Put simply, the client must pass the allocated size upon deallocation. Storing the size in the allocator has significant negative performance implications, and is virtually always redundant because client code needs knowledge of the allocated size in order to avoid buffer overruns. (See more discussion in a proposal for sized deallocation in C++.) For this reason, allocators herein traffic in void[] as opposed to void*.
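To illustrate why passing the size back is useful, here is a minimal sketch of a pool with two size classes. The type SizeClassPool and its 32-byte and 256-byte classes are hypothetical and not part of this package; the point is only that deallocate can pick the right free list from b.length alone, without a per-block header.

```d
/// Hypothetical pool with two size classes (32 and 256 bytes).
/// Assumes GC-allocated chunks of these sizes are at least 16-byte aligned.
struct SizeClassPool
{
    enum uint alignment = 16;
    private void*[] free32;   // recycled 32-byte chunks
    private void*[] free256;  // recycled 256-byte chunks

    void[] allocate(size_t n)
    {
        if (n == 0 || n > 256) return null;
        auto list = n <= 32 ? &free32 : &free256;
        immutable cap = n <= 32 ? 32 : 256;
        void* p;
        if ((*list).length)
        {
            // Reuse the most recently freed chunk of this class.
            p = (*list)[$ - 1];
            *list = (*list)[0 .. $ - 1];
        }
        else
        {
            p = (new ubyte[cap]).ptr;
        }
        return p[0 .. n];
    }

    bool deallocate(void[] b)
    {
        if (b is null) return true;
        // The slice length identifies the size class; no header is read.
        if (b.length <= 32) free32 ~= b.ptr;
        else free256 ~= b.ptr;
        return true;
    }
}

unittest
{
    SizeClassPool pool;
    auto a = pool.allocate(20);
    auto p = a.ptr;
    pool.deallocate(a);
    // The freed chunk is handed out again for another small request.
    assert(pool.allocate(30).ptr is p);
}
```

A void*-based interface would need to store the class (or size) next to each block to make the same decision.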

In order to be usable as an allocator, a type should implement the following methods with their respective semantics. Only alignment and allocate are required. If any of the other methods is missing, the allocator is assumed to not have that capability (for example, some allocators do not offer manual deallocation of memory). Allocators should NOT implement unsupported methods to always fail. For example, an allocator that lacks the capability to implement alignedAllocate should not define it at all (as opposed to defining it to always return null or throw an exception). The missing implementation statically informs other components about the allocator's capabilities and allows them to make design decisions accordingly.

Method name (with its postcondition, where one applies), followed by its semantics:

uint alignment;
Post: result > 0
Returns the minimum alignment of all data returned by the allocator. An allocator may implement alignment as a statically-known enum value only. Applications that need dynamically-chosen alignment values should use the alignedAllocate and alignedReallocate APIs.

size_t goodAllocSize(size_t n);
Post: result >= n
Allocators customarily allocate memory in discretely-sized chunks. Therefore, a request for n bytes may result in a larger allocation. The extra memory allocated goes unused and adds to the so-called internal fragmentation. The function goodAllocSize(n) returns the actual number of bytes that would be allocated upon a request for n bytes. This module defines a default implementation that returns n rounded up to a multiple of the allocator's alignment.

void[] allocate(size_t s);
Post: result is null || result.length == s
If s == 0, the call may return any empty slice (including null). Otherwise, the call allocates s bytes of memory and returns the allocated block, or null if the request could not be satisfied.

void[] alignedAllocate(size_t s, uint a);
Post: result is null || result.length == s
Similar to allocate, with the additional guarantee that the memory returned is aligned to at least a bytes. a must be a power of 2.

void[] allocateAll();
Offers all of the allocator's memory to the caller, so it's usually defined by fixed-size allocators. If the allocator is currently NOT managing any memory, then allocateAll() shall allocate and return all memory available to the allocator, and subsequent calls to all allocation primitives should not succeed (e.g. allocate shall return null). Otherwise, allocateAll only works on a best-effort basis, and the allocator is allowed to return null even if it does have available memory. Memory allocated with allocateAll is not otherwise special (e.g. it can be reallocated or deallocated with the usual primitives, if defined).

bool expand(ref void[] b, size_t delta);
Post: !result || b.length == old(b).length + delta
Expands b by delta bytes. If delta == 0, succeeds without changing b. If b is null, returns false (the null pointer cannot be expanded in place). Otherwise, b must be a buffer previously allocated with the same allocator. If expansion was successful, expand changes b's length to b.length + delta and returns true. Upon failure, the call effects no change upon the allocator object, leaves b unchanged, and returns false.

bool reallocate(ref void[] b, size_t s);
Post: !result || b.length == s
Reallocates b to size s, possibly moving memory around. b must be null or a buffer allocated with the same allocator. If reallocation was successful, reallocate changes b appropriately and returns true. Upon failure, the call effects no change upon the allocator object, leaves b unchanged, and returns false. An allocator should implement reallocate if it can derive some advantage from doing so; otherwise, this module defines a reallocate free function implemented in terms of expand, allocate, and deallocate.

bool alignedReallocate(ref void[] b, size_t s, uint a);
Post: !result || b.length == s
Similar to reallocate, but guarantees the reallocated memory is aligned at a bytes. The buffer must have originated from a call to alignedAllocate. a must be a power of 2 greater than (void*).sizeof. An allocator should implement alignedReallocate if it can derive some advantage from doing so; otherwise, this module defines an alignedReallocate free function implemented in terms of expand, alignedAllocate, and deallocate.

Ternary owns(void[] b);
Returns Ternary.yes if b has been allocated with this allocator. An allocator should define this method only if it can decide on ownership precisely and fast (in constant time, logarithmic time, or linear time with a low multiplication factor). Traditional allocators such as the C heap do not define such functionality. If b is null, the allocator shall return Ternary.no, i.e. no allocator owns the null slice.

Ternary resolveInternalPointer(void* p, ref void[] result);
If p is a pointer somewhere inside a block allocated with this allocator, result holds a pointer to the beginning of the allocated block and the call returns Ternary.yes. Otherwise, result holds null and the call returns Ternary.no. If the pointer points immediately after an allocated block, the result is implementation defined.

bool deallocate(void[] b);
If b is null, does nothing and returns true. Otherwise, deallocates memory previously allocated with this allocator and returns true if successful, false otherwise. An implementation that would not support deallocation (i.e. would always return false) should not define this primitive at all.

bool deallocateAll();
Post: empty
Deallocates all memory allocated with this allocator. If an allocator implements this method, it must specify whether its destructor calls it, too.

Ternary empty();
Returns Ternary.yes if and only if the allocator holds no memory (i.e. no allocation has occurred, or all allocations have been deallocated).

static Allocator instance;
Post: instance is a valid Allocator object
Some allocators are monostate, i.e. have only one instance and hold only global state. (Notable examples are C's own malloc-based allocator and D's garbage-collected heap.) Such allocators must define a static member named instance that serves as the symbolic placeholder for the global instance of the allocator. An allocator should not hold state and define instance simultaneously. Depending on whether the allocator is thread-safe or not, this instance may be shared.
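As a rough illustration of the method set above, the following sketch defines a small bump-the-pointer allocator and a generic client that adapts to the methods an allocator offers. TinyRegion and release are hypothetical names invented for this example, not part of the package. TinyRegion deliberately leaves deallocate undefined, so release detects its absence at compile time instead of calling a method that always fails.

```d
import std.typecons : Ternary;

/// Hypothetical bump-the-pointer allocator over a buffer stored in the object.
/// Blocks point into the object itself, so it must not be copied or moved
/// while they are in use.
struct TinyRegion(size_t capacity)
{
    enum uint alignment = 16;
    private align(16) ubyte[capacity] store;
    private size_t used; // always a multiple of alignment

    /// Rounds n up to a multiple of alignment.
    static size_t goodAllocSize(size_t n)
    {
        return (n + alignment - 1) / alignment * alignment;
    }

    void[] allocate(size_t n)
    {
        if (n == 0 || n > capacity) return null;
        immutable rounded = goodAllocSize(n);
        if (rounded > capacity - used) return null;
        void[] result = store[used .. used + n];
        used += rounded;
        return result;
    }

    Ternary owns(void[] b)
    {
        if (b is null) return Ternary.no;
        auto lo = cast(void*) store.ptr;
        return Ternary(b.ptr >= lo && b.ptr + b.length <= lo + capacity);
    }

    bool deallocateAll() { used = 0; return true; }

    Ternary empty() { return Ternary(used == 0); }

    // No deallocate: individual blocks cannot be released.
}

/// Hypothetical client helper: deallocates b only if A supports it.
bool release(A)(ref A a, void[] b)
{
    static if (__traits(hasMember, A, "deallocate"))
        return a.deallocate(b);
    else
        return false; // capability absent; memory is reclaimed some other way
}

unittest
{
    TinyRegion!64 r;
    auto b = r.allocate(10);
    assert(b.length == 10 && r.owns(b) == Ternary.yes);
    assert(!release(r, b));      // TinyRegion has no deallocate
    assert(r.deallocateAll() && r.empty == Ternary.yes);
}
```

The static if branch is chosen per allocator type, so a client written this way pays nothing at run time for the capability check.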

Sample Assembly

The example below features an allocator modeled after jemalloc, which uses a battery of free-list allocators spaced so as to keep internal fragmentation to a minimum. The FList definitions specify no bounds for the freelist because the Segregator does all size selection in advance.

Sizes through 3584 bytes are handled via freelists of staggered sizes. Sizes from 3585 bytes through 4072 KB are handled by a BitmappedBlock with a block size of 4 KB. Sizes above that are passed directly to the GCAllocator.

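// Module paths below assume the stdx.allocator package layout.
import stdx.allocator.building_blocks.allocator_list : AllocatorList;
import stdx.allocator.building_blocks.bitmapped_block : BitmappedBlock;
import stdx.allocator.building_blocks.bucketizer : Bucketizer;
import stdx.allocator.building_blocks.free_list : FreeList;
import stdx.allocator.building_blocks.segregator : Segregator;
import stdx.allocator.common : unbounded;
import stdx.allocator.gc_allocator : GCAllocator;

// Free list with no size bounds; the Segregator selects sizes in advance.
alias FList = FreeList!(GCAllocator, 0, unbounded);
alias A = Segregator!(
    8, FreeList!(GCAllocator, 0, 8),
    128, Bucketizer!(FList, 1, 128, 16),
    256, Bucketizer!(FList, 129, 256, 32),
    512, Bucketizer!(FList, 257, 512, 64),
    1024, Bucketizer!(FList, 513, 1024, 128),
    2048, Bucketizer!(FList, 1025, 2048, 256),
    3584, Bucketizer!(FList, 2049, 3584, 512),
    4072 * 1024, AllocatorList!(
        () => BitmappedBlock!(GCAllocator, 4096)(4072 * 1024)),
    GCAllocator
);
A tuMalloc;
auto b = tuMalloc.allocate(500);
assert(b.length == 500);
auto c = tuMalloc.allocate(113);
assert(c.length == 113);
// 113 + 14 = 127 stays within the same 113..128 bucket, so expansion succeeds.
assert(tuMalloc.expand(c, 14));
tuMalloc.deallocate(b);
tuMalloc.deallocate(c);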
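To make the size routing in the sample easier to follow, the sketch below restates it as two plain functions. Both tierFor and bucketCeiling are hypothetical helpers written for this explanation; they mirror the thresholds and bucket ranges of the sample and do not call the library. bucketCeiling assumes buckets of the form [min, min + step - 1], [min + step, min + 2 * step - 1], and so on.

```d
/// Hypothetical helper: names the component of the sample allocator
/// that would serve a request of n bytes.
string tierFor(size_t n)
{
    if (n <= 8) return "FreeList of 8-byte blocks";
    if (n <= 3584) return "Bucketizer of free lists";
    if (n <= 4072 * 1024) return "AllocatorList of BitmappedBlock";
    return "GCAllocator";
}

/// Hypothetical helper: largest size in the bucket that holds n, for a
/// bucketizer whose first bucket starts at min and whose buckets are
/// step bytes wide. Requires n >= min.
size_t bucketCeiling(size_t n, size_t min, size_t step)
{
    return min + ((n - min) / step + 1) * step - 1;
}

unittest
{
    // The 113-byte block from the sample lives in the 113..128 bucket,
    // which is why expanding it by 14 bytes (to 127) can succeed in place.
    assert(bucketCeiling(113, 1, 16) == 128);
    // The 500-byte block falls in the 449..512 bucket.
    assert(bucketCeiling(500, 257, 64) == 512);
    assert(tierFor(500) == "Bucketizer of free lists");
}
```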

Allocating memory for sharing across threads

One allocation pattern used in multithreaded applications is to share memory across threads, and to deallocate blocks in a different thread than the one that allocated them.

All allocators in this module accept and return void[] (as opposed to shared void[]). This is because at the time of allocation, deallocation, or reallocation, the memory is effectively not shared (if it were, it would reveal a bug at the application level).

The issue remains of calling a.deallocate(b) from a different thread than the one that allocated b. It follows that both threads must have access to the same instance a of the respective allocator type. By definition of D, this is possible only if a has the shared qualifier. It follows that the allocator type must implement allocate and deallocate as shared methods. That way, the allocator commits to allowing usable shared instances.

Conversely, allocating memory with one non-shared allocator, passing it across threads (by casting the obtained buffer to shared), and later deallocating it in a different thread (either with a different allocator object or with the same allocator object after casting it to shared) is illegal.
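The requirement described above can be sketched as follows. SharedCHeap and freeInOtherThread are hypothetical names for this example. The allocator is a thin monostate wrapper over the C heap whose allocate and deallocate are shared methods, so one shared instance can be used from any thread. The alignment value is a conservative assumption about malloc.

```d
import core.stdc.stdlib : free, malloc;
import core.thread : Thread;

/// Hypothetical monostate allocator over the C heap. It holds no state;
/// malloc and free are themselves safe to call from multiple threads.
struct SharedCHeap
{
    // Conservative assumption about the alignment malloc guarantees.
    enum uint alignment = double.alignof;

    void[] allocate(size_t n) shared
    {
        if (n == 0) return null;
        auto p = malloc(n);
        return p is null ? null : p[0 .. n];
    }

    bool deallocate(void[] b) shared
    {
        free(b.ptr); // free(null) is a no-op
        return true;
    }

    /// The single global instance, usable from every thread.
    static shared SharedCHeap instance;
}

/// Deallocates b in a freshly started thread through the shared instance.
bool freeInOtherThread(void[] b)
{
    bool ok;
    auto t = new Thread({ ok = SharedCHeap.instance.deallocate(b); });
    t.start();
    t.join();
    return ok;
}

unittest
{
    // Allocate in this thread, release in another one.
    auto block = SharedCHeap.instance.allocate(64);
    assert(block.length == 64);
    assert(freeInOtherThread(block));
}
```

Because both methods are marked shared, the compiler accepts calls on the shared instance from any thread without casts.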

Building Blocks

The table below gives a synopsis of predefined allocator building blocks, with their respective modules. Either import the needed modules individually, or import stdx.allocator.building_blocks, which imports them all publicly. The building blocks can be assembled in unbounded ways and also combined with your own. For a collection of typical and useful preassembled allocators and for inspiration in defining more such assemblies, refer to std.experimental.allocator.showcase.

Allocator (module): Description

NullAllocator (null_allocator): Very good at doing absolutely nothing. A good starting point for defining other allocators or for studying the API.

GCAllocator (gc_allocator): The system-provided garbage-collector allocator. This should be the default fallback allocator tapping into system memory. It offers manual free and dutifully collects litter.

Mallocator (mallocator): The C heap allocator, a.k.a. malloc/realloc/free. Use sparingly and only for code that is unlikely to leak.

AlignedMallocator (mallocator): Interface to OS-specific allocators that support specifying alignment: posix_memalign on Posix and _aligned_xxx on Windows.

AffixAllocator (affix_allocator): Allocator that allows and manages allocating extra prefix and/or suffix bytes for each block allocated.

BitmappedBlock (bitmapped_block): Organizes one contiguous chunk of memory in equal-size blocks and tracks allocation status at the cost of one bit per block.

FallbackAllocator (fallback_allocator): Allocator that combines two other allocators, primary and fallback. Allocation requests are first tried with primary, and upon failure are passed to the fallback. Useful for small and fast allocators fronting general-purpose ones.

FreeList (free_list): Allocator that implements a free list on top of any other allocator. The preferred size, tolerance, and maximum elements are configurable at compile time and run time.

SharedFreeList (free_list): Same features as FreeList, but packaged as a shared structure that is accessible to several threads.

FreeTree (free_tree): Allocator similar to FreeList that uses a binary search tree to adaptively store not one, but many free lists.

Region (region): Region allocator organizes a chunk of memory as a simple bump-the-pointer allocator.

InSituRegion (region): Region holding its own allocation, most often on the stack. Has statically-determined size.

SbrkRegion (region): Region using sbrk for allocating memory.

MmapAllocator (mmap_allocator): Allocator using mmap directly.

StatsCollector (stats_collector): Collects statistics about any other allocator.

Quantizer (quantizer): Allocates in coarse-grained quanta, thus improving performance of reallocations by often reallocating in place. The drawback is higher memory consumption because of allocated and unused memory.

AllocatorList (allocator_list): Given an allocator factory, lazily creates as many allocators as needed to satisfy allocation requests. The allocators are stored in a linked list. Requests for allocation are satisfied by searching the list in a linear manner.

Segregator (segregator): Segregates allocation requests by size and dispatches them to distinct allocators.

Bucketizer (bucketizer): Divides allocation sizes in discrete buckets and uses an array of allocators, one per bucket, to satisfy requests.
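As one illustration of combining building blocks, the sketch below fronts the GC heap with a small in-object region, following the "small and fast allocator fronting a general-purpose one" use described for FallbackAllocator. The alias SmallFirst and the helper liesInside are hypothetical names for this example, and the import paths assume the stdx.allocator module layout listed in this section.

```d
import stdx.allocator.building_blocks.fallback_allocator : FallbackAllocator;
import stdx.allocator.building_blocks.region : InSituRegion;
import stdx.allocator.gc_allocator : GCAllocator;

// A 1 KB region stored inside the allocator object, with the GC heap
// serving whatever the region cannot satisfy.
alias SmallFirst = FallbackAllocator!(InSituRegion!1024, GCAllocator);

/// Hypothetical helper: true if b starts within the bytes of obj itself.
bool liesInside(T)(ref T obj, void[] b)
{
    auto lo = cast(void*) &obj;
    auto hi = cast(void*) (&obj + 1);
    return b.ptr >= lo && b.ptr < hi;
}

unittest
{
    SmallFirst a;
    auto small = a.allocate(100);   // fits in the in-situ region
    auto large = a.allocate(4096);  // too big for the region, goes to the GC
    assert(small.length == 100 && liesInside(a, small));
    assert(large.length == 4096 && !liesInside(a, large));
    a.deallocate(small);
    a.deallocate(large);
}
```

The order of the two template arguments matters: the first allocator is always tried first, so the cheaper one should come first.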


module stdx.allocator.building_blocks.affix_allocator
module stdx.allocator.building_blocks.allocator_list
module stdx.allocator.building_blocks.bitmapped_block
module stdx.allocator.building_blocks.bucketizer
module stdx.allocator.building_blocks.fallback_allocator
module stdx.allocator.building_blocks.free_list
module stdx.allocator.building_blocks.free_tree
module stdx.allocator.building_blocks.kernighan_ritchie
module stdx.allocator.building_blocks.null_allocator
module stdx.allocator.building_blocks.quantizer
module stdx.allocator.building_blocks.region
module stdx.allocator.building_blocks.scoped_allocator
module stdx.allocator.building_blocks.segregator
module stdx.allocator.building_blocks.stats_collector

Allocator that collects useful statistics about allocations, both global and per calling point. The statistics collected can be configured statically by choosing combinations of Options appropriately.
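A minimal usage sketch follows, assuming the stdx.allocator module layout and that StatsCollector exposes counters named after its Options flags (numAllocate, numDeallocate, bytesUsed). The alias Counted is a hypothetical name. Only per-instance statistics are enabled here; per-call-point collection is left off.

```d
import stdx.allocator.building_blocks.stats_collector : Options, StatsCollector;
import stdx.allocator.gc_allocator : GCAllocator;

// Collect every per-instance statistic about the GC allocator.
alias Counted = StatsCollector!(GCAllocator, Options.all);

unittest
{
    Counted a;
    auto b = a.allocate(100);
    assert(a.numAllocate == 1);   // one call to allocate so far
    assert(a.bytesUsed == 100);   // bytes currently handed out
    a.deallocate(b);
    assert(a.numDeallocate == 1);
    assert(a.bytesUsed == 0);
}
```

Selecting a narrower combination of Options instead of Options.all keeps only the counters that are needed.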