
Modern Java Garbage Collectors (JDK 25+): A Deep Dive

Ajanthan Sivalingarajah
·Feb 26, 2026·8 min read


Modern Java Garbage Collectors in JDK 25: Architecture, Generations, Allocation, and Memory Reclamation#

The Day GC Became a Production Incident#

We once had a service that looked perfect on dashboards. CPU steady. Memory within limits. Autoscaling behaving exactly as expected. Yet latency percentiles told a different story—p99 spikes appearing in clusters every few minutes.

No obvious bottleneck. No slow database queries. No thread starvation.

Then we turned on detailed GC logs.

What surfaced wasn’t catastrophic—no long full GCs, no OOMs. Instead, it was something more subtle and more dangerous: consistent micro-pauses. Each pause was small, but the frequency aligned perfectly with our latency spikes.

That was the turning point. GC wasn’t failing—it was doing exactly what it was designed to do. But our system design, allocation patterns, and collector choice were misaligned.

And that’s the uncomfortable truth: modern GC rarely “breaks” your system. It quietly shapes its behavior.


Why GC Still Matters (Even With ZGC and Shenandoah)#

There’s a persistent myth that modern collectors have “solved” GC.

They haven’t. They’ve shifted the problem.

In today’s environments:

  • Containers impose hard memory ceilings
  • CPU shares fluctuate under orchestration
  • Allocation rates spike unpredictably (especially in event-driven systems)
  • Tail latency matters more than averages

GC is now part of your latency model, not just your memory model.

You don’t tune GC to avoid failure. You tune it to:

  • Stabilize p95/p99 latency
  • Control memory footprint in constrained environments
  • Avoid pathological allocation/reclamation cycles

JVM Heap Architecture — The Model vs Reality#

At a conceptual level, the heap still looks familiar:

```mermaid
graph TD
  A[Young Gen] --> B[Eden]
  A --> C[Survivor S0]
  A --> D[Survivor S1]
  E[Old Gen] --> F[Tenured]
```

But modern collectors—especially G1, ZGC, and Shenandoah—don’t operate on contiguous generations. They operate on regions.

Region-Based Reality#

```mermaid
graph TD
  A[Heap] --> B[Region 1]
  A --> C[Region 2]
  A --> D[Region N]
  B --> E[Eden-like]
  C --> F[Survivor-like]
  D --> G[Old-like]
```

Each region can dynamically change roles. This gives collectors flexibility to:

  • Reclaim memory incrementally
  • Avoid large stop-the-world compactions
  • Optimize based on live data density

Generational Hypothesis — Still the Backbone#

Despite architectural changes, one assumption still holds:

Most objects die young.

In high-throughput systems:

  • Request-scoped objects dominate allocation
  • Intermediate objects (streams, DTOs, buffers) are short-lived
  • Only a small subset survives beyond a few cycles

Modern collectors either:

  • Explicitly implement generations (G1)
  • Or implicitly optimize for the same pattern (ZGC, Shenandoah)

Object Allocation — Why It’s Not Your Bottleneck#

Allocation is almost never the problem.

Thanks to TLABs (Thread Local Allocation Buffers), allocation is effectively:

  • Pointer increment
  • No locks
  • CPU-cache friendly

```mermaid
sequenceDiagram
  participant Thread
  participant TLAB
  participant Heap
  Thread->>TLAB: Allocate object
  alt TLAB has space
    TLAB-->>Thread: Fast path
  else TLAB full
    Thread->>Heap: Refill TLAB
    Heap-->>Thread: New buffer
  end
```

The Real Problem: Allocation Rate#

High allocation rate → more frequent GC cycles → more pressure on reclaim mechanisms.

That’s where things break down.
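One way to see allocation rate directly is the JDK-specific `com.sun.management.ThreadMXBean`, which exposes per-thread allocated bytes. This is a HotSpot-only sketch (the class and method names below are the real `com.sun.management` API; the workload inside the lambda is illustrative):

```java
import java.lang.management.ManagementFactory;

public class AllocRate {
    // Returns bytes allocated by the current thread while running the task.
    // Relies on the HotSpot-specific com.sun.management.ThreadMXBean.
    public static long allocatedBytes(Runnable task) {
        com.sun.management.ThreadMXBean mx =
                (com.sun.management.ThreadMXBean) ManagementFactory.getThreadMXBean();
        long tid = Thread.currentThread().getId();
        long before = mx.getThreadAllocatedBytes(tid);
        task.run();
        return mx.getThreadAllocatedBytes(tid) - before;
    }

    public static void main(String[] args) {
        long bytes = allocatedBytes(() -> {
            byte[][] junk = new byte[1000][];
            for (int i = 0; i < junk.length; i++) {
                junk[i] = new byte[10_240]; // ~10 MB of short-lived garbage
            }
        });
        System.out.println("Allocated ~" + (bytes >> 20) + " MiB");
    }
}
```

Wrapping a request handler this way gives you bytes-per-request, which multiplied by request rate is your allocation rate—the number that actually drives GC frequency.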


Object Lifecycle — From Eden to Reclamation#

```mermaid
graph LR
  A[Eden Allocation] --> B[Minor GC]
  B --> C[Survivor]
  C --> D[Promotion]
  D --> E[Old Gen]
  E --> F[Major GC]
```

Modern collectors modify this flow:

  • Promotions are adaptive
  • Regions replace contiguous spaces
  • Some collectors eliminate explicit generations

Modern Garbage Collectors in JDK 25#

Serial GC#

  • Single-threaded
  • Predictable pauses
  • Useful for tiny heaps or debugging

Parallel GC#

  • Throughput-focused
  • Uses all CPU cores
  • Long pauses

G1 GC#

  • Region-based
  • Pause-time goals
  • Balanced performance

ZGC#

  • Ultra-low latency
  • Concurrent everything
  • Uses colored pointers

Shenandoah#

  • Concurrent compaction
  • Brooks pointer indirection

Epsilon GC#

  • No-op collector
  • Useful for benchmarking

G1 GC — The Engineering Compromise#

G1 is not the fastest, nor the lowest latency. It’s the most balanced.

Architecture#

```mermaid
graph TD
  A[Heap] --> B[Regions]
  B --> C[Young Regions]
  B --> D[Old Regions]
  B --> E[Humongous Regions]
```

Core Idea#

Instead of collecting entire generations, G1:

  • Identifies regions with the most garbage
  • Collects them first
  • Meets a pause-time target

Collection Flow#

```mermaid
sequenceDiagram
  participant App
  participant G1
  App->>G1: Allocation pressure
  G1->>G1: Young GC
  G1->>G1: Mixed GC (young + old)
  G1-->>App: Controlled pause
```

Trade-offs#

Pros

  • Predictable pauses
  • Mature ecosystem
  • Works out-of-the-box

Cons

  • Still stop-the-world
  • Sensitive to humongous allocations
  • Pause predictability not perfect

Real-world usage#

G1 is ideal when:

  • You want stability without deep tuning
  • Latency matters, but isn’t ultra-critical
  • Heap sizes are moderate to large

ZGC — Latency as a First-Class Constraint#

ZGC was built with one goal: eliminate pause times as a concern.

Colored Pointers#

ZGC encodes metadata in pointers:

```mermaid
graph TD
  A[Reference] --> B[Marked]
  A --> C[Relocated]
  A --> D[Remapped]
```

This enables:

  • Concurrent marking
  • Concurrent relocation
  • No long pauses

Execution Model#

```mermaid
sequenceDiagram
  participant App
  participant ZGC
  App->>ZGC: Allocate
  ZGC->>ZGC: Concurrent mark
  ZGC->>ZGC: Concurrent relocate
  ZGC-->>App: Pause < 1ms
```

Trade-offs#

Pros

  • Near-zero pauses
  • No fragmentation
  • Scales to massive heaps

Cons

  • Higher CPU overhead
  • Needs memory headroom (~10–20%)
  • Less forgiving under extreme memory pressure

Real-world usage#

ZGC shines in:

  • Low-latency APIs
  • Financial systems
  • AI inference workloads

Shenandoah — Concurrent Compaction Done Differently#

Shenandoah takes a different route using Brooks pointers.

```mermaid
graph TD
  A[Object] --> B[Forwarding Pointer]
  B --> C[Data]
```

Key idea#

Every object has an indirection layer, allowing:

  • Relocation without stopping threads
  • Concurrent compaction

Trade-offs#

Pros

  • Low latency
  • Efficient compaction

Cons

  • Extra pointer overhead
  • Slightly higher memory footprint

GC Roots — Where Everything Begins#

```mermaid
graph TD
  A[GC Roots] --> B[Reachable Objects]
```

Roots include:

  • Thread stacks
  • Static fields
  • JNI references

Practical insight#

Memory leaks are rarely “forgotten objects.”
They are reachable objects you didn’t expect to be reachable.
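The classic case is a static collection that quietly keeps every "temporary" object reachable from a GC root. A minimal sketch (the cache and handler names are hypothetical):

```java
import java.util.ArrayList;
import java.util.List;

public class ReachableLeak {
    // A static field is a GC root: everything in this list stays live.
    static final List<byte[]> CACHE = new ArrayList<>();

    static void handleRequest() {
        byte[] payload = new byte[1024];
        CACHE.add(payload); // "temporary" object is now reachable indefinitely
    }

    public static void main(String[] args) {
        for (int i = 0; i < 1_000; i++) handleRequest();
        // No payload can be collected: the root -> CACHE -> payload chain is intact.
        System.out.println("Live payloads: " + CACHE.size());
    }
}
```

A heap dump would show every payload as perfectly reachable—no "leak" in the GC's eyes, only in yours.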


Remembered Sets and Write Barriers#

To support region-based collection, the JVM tracks cross-region references.

```mermaid
graph LR
  A[Region A] -->|Reference| B[Region B]
  B --> C[Remembered Set]
```

Write Barrier Example#

```java
obj.field = newValue;
```

Triggers:

  • Metadata update
  • Remembered set tracking
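Conceptually, the barrier emitted around that store looks something like the following simplified sketch. This is not HotSpot's actual barrier code—region membership is reduced to an integer tag, and the remembered set to a plain set:

```java
import java.util.HashSet;
import java.util.Set;

public class PostBarrierSketch {
    static class Obj {
        final int region; // which heap region this object lives in
        Obj field;
        Obj(int region) { this.region = region; }
    }

    // Stand-in for a remembered set: holders of cross-region references.
    static final Set<Obj> REMEMBERED = new HashSet<>();

    // What obj.field = newValue becomes once the post-write barrier is added.
    static void storeField(Obj holder, Obj value) {
        holder.field = value; // the actual store
        if (value != null && holder.region != value.region) {
            REMEMBERED.add(holder); // record the cross-region reference
        }
    }
}
```

The real barrier is a few machine instructions on the hot path plus a buffered slow path, but the shape is the same: the store happens, then bookkeeping runs only when a reference crosses a region boundary.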

Trade-off#

  • Adds overhead to writes
  • Enables efficient partial GC

Humongous Objects — The Silent Performance Killer#

Objects occupying at least half of a region are treated specially: G1 classifies them as humongous.

Problems#

  • Allocated directly in old regions
  • Hard to relocate
  • Cause fragmentation

Real-world triggers#

  • Large JSON payloads
  • Byte buffers
  • ML tensors

Mitigation#

  • Tune region size
  • Avoid large contiguous allocations
  • Stream data where possible
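The size rule itself is simple arithmetic. A sketch (the 4 MiB region size is illustrative—G1 picks region size from the heap size, or you set it with `-XX:G1HeapRegionSize`):

```java
public class HumongousCheck {
    // In HotSpot's G1, an object of at least half a region is humongous.
    static boolean isHumongous(long objectBytes, long regionBytes) {
        return objectBytes >= regionBytes / 2;
    }

    public static void main(String[] args) {
        long region = 4L << 20; // a 4 MiB region (illustrative)
        System.out.println(isHumongous(3L << 20, region)); // 3 MiB buffer
        System.out.println(isHumongous(1L << 20, region)); // 1 MiB buffer
    }
}
```

This is why bumping the region size is a legitimate mitigation: a 3 MiB payload is humongous with 4 MiB regions but an ordinary allocation with 16 MiB regions.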

GC Comparison#

| GC | Latency | Throughput | Heap Size | Best For |
| --- | --- | --- | --- | --- |
| Serial | High | Low | Small | Embedded |
| Parallel | High | High | Medium | Batch |
| G1 | Medium | Medium | Large | Default |
| ZGC | Very Low | Medium | Huge | Low latency |
| Shenandoah | Low | Medium | Large | Low latency alt |

Configuring the Right GC#

Basic Switching#

```
-XX:+UseG1GC
-XX:+UseZGC
-XX:+UseShenandoahGC
```

Example: Latency-sensitive setup (ZGC)#

```
-Xms8g
-Xmx8g
-XX:+UseZGC
-XX:ZUncommitDelay=300
```

(`ZUncommitDelay` is in seconds: ZGC waits 300 s before returning unused heap memory to the OS.)

Example: Balanced setup (G1)#

```
-XX:+UseG1GC
-XX:MaxGCPauseMillis=200
```

Observability — Where Theory Meets Reality#

Tools#

  • GC logs (-Xlog:gc)
  • Java Flight Recorder
  • Prometheus + Grafana
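As a starting point for log collection, unified logging can write timestamped, tagged GC logs; for example (the file name is illustrative):

```
-Xlog:gc*:file=gc.log:time,uptime,level,tags
```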

Metrics That Matter#

  • Allocation rate
  • Pause percentiles (p95/p99)
  • Promotion rate
  • Old gen occupancy
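Several of these can also be pulled in-process via the standard `java.lang.management` API—`GarbageCollectorMXBean` reports per-collector cycle counts and cumulative collection time (GC logs and JFR give finer-grained detail):

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcSnapshot {
    // One line per collector: cycle count and cumulative collection time.
    public static String snapshot() {
        StringBuilder sb = new StringBuilder();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            sb.append(gc.getName())
              .append(" count=").append(gc.getCollectionCount())
              .append(" timeMs=").append(gc.getCollectionTime())
              .append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(snapshot());
    }
}
```

Sampling this periodically and exporting the deltas is the usual way these numbers end up on a Prometheus dashboard.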

GC in Modern Systems#

Microservices#

  • High churn
  • G1 works well
  • ZGC for latency-critical paths

Cloud#

  • Memory limits amplify GC pressure
  • CPU throttling affects concurrent phases

AI Systems#

  • Massive allocation spikes
  • Large object graphs
  • ZGC often performs best

Where This Is Heading#

The JVM is moving toward:

  • Fully concurrent collectors
  • Region-based everything
  • Predictable memory behavior

But trade-offs remain:

  • CPU vs latency
  • Memory overhead vs stability
  • Simplicity vs control

What’s changing is not the existence of GC—but its role.

It’s no longer just memory management.
It’s a first-class performance characteristic.

And the engineers who understand that tend to be the ones debugging production issues while everyone else is still looking at CPU charts.