Heap Management
- Heap allocation maps dynamic variables in the heap area.
- Heap deallocation involves freeing inactive memory space.
- Heap compaction improves memory utilzation & effiency and removes fragmentation.
1. Heap Allocation
The heap is managed in blocks, which contain:
- Housekeeping data (e.g. size, status)
- Object data (e.g. fields, methods).
Object references point to the object data, not the start of the block.
Traditional heap allocation is managed by a free list, a chain of free blocks in the heap. Each heap block contains tracking info in the housekeeping data section. We check in this order:
- If a free block of exactly the correct size is found, return it.
- If a block bigger than the requested size is found, split it into two blocks: one of the requested size, and the other with the remaining size.
- If no block is found, request more memory from the OS.
However, this algorithm is quite slow. We could:
- Maintain many free lists for different block sizes.
- Allocate more space for data that may grow.
We also need to consider memory alignment and negative impact on caching when we dont have spacial locality.
2. Garbage Collection
We can automatically deallocate from the heap with garbage collection. This could be done by:
- Reference counting - each block tracks a reference count in the housekeeping data. If a block is copied, the count is incremented. If a block is deleted, the count is decremented. When the count reaches zero, the block is deallocated. However, this requires extra code for pointer manipulation, and cannot handle cyclic references.
- Mark & Sweep - this contains two phases. Phase 1 marks blocks as live that are reachable from non-heap references. Phase 2 sweeps through the heap, deallocating unmarked blocks. This is slower than reference counting, but can handle cyclic references. It also batches deallocations, which is more efficient.
- Pointer reversal - stackless depth first traversal via pointers
, . The blocks are stored as chains. We can traverse the heap by reversing the pointers. This is faster than mark & sweep, but requires more memory. - Two Space GC - split heap into from space and to space. When the from space is full, we copy all live blocks to the to space. This has no pointer manipulation, and better spacial locality, but is not efficient for large heaps.
- Generational GC - divide th eheap into several areas based on the age of blocks. Adaptively perform different GC algorithms for different areas.