Artificial Intelligence has rapidly transitioned from experimental research to production infrastructure. Recommendation systems, fraud detection models, computer vision pipelines, and intelligent search engines now sit at the core of many enterprise platforms.
Yet one major architectural tension remains: most machine learning models are built using Python, while most enterprise systems run on Java.
For years, this created an awkward boundary between research teams and production systems. Data scientists trained models in Python notebooks. Backend engineers then struggled to operationalize those models in scalable services.
The Deep Java Library (DJL) was created to bridge that gap.
DJL allows Java developers to train, load, and run machine learning models directly inside Java applications, while still leveraging powerful deep learning engines like PyTorch, TensorFlow, MXNet, and ONNX Runtime under the hood.
In this guide, we will explore what DJL is, how it fits into the Java ecosystem, and how to set it up, run inference, train models, and deploy them in production.
This article is designed for experienced Java engineers who want to integrate machine learning into backend systems without switching languages.
If you attend a machine learning conference today, nearly every demo runs in Python. Libraries such as PyTorch, TensorFlow, and scikit-learn dominate the research ecosystem.
However, production infrastructure tells a different story.
Large-scale systems at banks, telecom companies, and e-commerce platforms are often built on Java, the JVM, and frameworks such as Spring Boot.
This creates a gap between AI research and production deployment.
Typical enterprise systems need to support AI capabilities such as recommendations, fraud detection, computer vision, and intelligent search.
Traditionally, teams solved this by deploying separate Python services for inference.
While functional, this architecture introduces additional complexity: more services to deploy, cross-language data serialization, and extra network hops on the inference path.
Running ML models directly inside Java services eliminates many of these issues.
This is where the Deep Java Library (DJL) enters the picture.
DJL is an open-source deep learning framework built specifically for Java developers. Instead of implementing its own deep learning engine, DJL provides a unified API that can run on top of multiple ML engines.
That means Java developers can write code once and run it with PyTorch, TensorFlow, MXNet, or ONNX Runtime.
The Deep Java Library is a high-level deep learning framework designed to make machine learning accessible to Java developers.
Rather than reinventing deep learning from scratch, DJL provides a Java-native abstraction layer over existing ML engines.
The design philosophy is simple: allow Java developers to leverage modern deep learning engines without writing Python.
DJL introduces several important abstractions, including NDArray, Model, Predictor, and Translator.
At a high level, DJL sits between the Java application and the native deep learning engine.
```mermaid
flowchart TD
    A[Java Application]
    B[DJL API]
    C[Engine Abstraction Layer]
    D[Deep Learning Engine]
    A --> B
    B --> C
    C --> D
    D -->|PyTorch| E
    D -->|TensorFlow| F
    D -->|MXNet| G
    D -->|ONNX Runtime| H
```
The application interacts only with the DJL API. The underlying engine handles the heavy numerical computation.
This design allows developers to switch engines without rewriting application code.
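As a hedged sketch, the target engine can be pinned explicitly through the Criteria builder; everything else in the application stays unchanged (the class name here is illustrative):

```java
import ai.djl.Application;
import ai.djl.modality.Classifications;
import ai.djl.modality.cv.Image;
import ai.djl.repository.zoo.Criteria;

public class EngineSelectionExample {

    // The same Criteria works with any installed engine;
    // only the optEngine hint changes.
    public static Criteria<Image, Classifications> buildCriteria(String engineName) {
        return Criteria.builder()
                .setTypes(Image.class, Classifications.class)
                .optApplication(Application.CV.IMAGE_CLASSIFICATION)
                .optEngine(engineName) // e.g. "PyTorch", "TensorFlow", "OnnxRuntime"
                .build();
    }

    public static void main(String[] args) {
        Criteria<Image, Classifications> criteria = buildCriteria("PyTorch");
        System.out.println(criteria.getEngine());
    }
}
```

Swapping the engine name is the only change needed; the corresponding engine dependency must be on the classpath.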
At the heart of DJL lies the NDArray abstraction.
NDArray represents multidimensional numerical data, similar to NumPy arrays in Python or tensors in PyTorch and TensorFlow.
Example:
```java
try (NDManager manager = NDManager.newBaseManager()) {
    NDArray array = manager.create(new float[]{1f, 2f, 3f});
    NDArray result = array.mul(2); // element-wise multiplication
    System.out.println(result);
}
```
This simple API hides the complexity of the underlying engine while maintaining high performance.
DJL introduces a clear model lifecycle.
```
Model
  |
  v
Predictor
  |
  v
Inference
```
A Model represents the machine learning artifact.
A Predictor is responsible for performing inference using that model.
DJL includes a Model Zoo containing pre-trained models for common tasks.
These include models for image classification, object detection, and sentiment analysis.
Developers can load these models with minimal configuration.
The DJL ecosystem consists of several modules that support the full machine learning lifecycle.
The main abstraction layer that developers interact with.
Key components include NDArray, NDManager, Model, Predictor, and Translator.
DJL supports multiple backends.
Common integrations include PyTorch, TensorFlow, MXNet, and ONNX Runtime.
Each engine provides optimized native execution.
The DJL Model Zoo hosts pre-trained models that can be used immediately.
Example categories: computer vision, natural language processing, and classification.
DJL includes dataset loaders for common machine learning datasets.
Examples include MNIST and CIFAR-10.
DJL includes APIs for the full training loop: trainers, loss functions, optimizers, and metrics.
DJL Serving is a production-grade model server similar to TensorFlow Serving.
It allows models to be exposed as scalable APIs.
Let's walk through setting up DJL in a Maven-based project.
Create a standard Maven project structure.
```
src
└── main
    └── java
```
Add DJL dependencies to your pom.xml.
```xml
<dependencies>
    <dependency>
        <groupId>ai.djl</groupId>
        <artifactId>api</artifactId>
        <version>0.27.0</version>
    </dependency>
    <dependency>
        <groupId>ai.djl</groupId>
        <artifactId>model-zoo</artifactId>
        <version>0.27.0</version>
    </dependency>
    <dependency>
        <groupId>ai.djl.pytorch</groupId>
        <artifactId>pytorch-engine</artifactId>
        <version>0.27.0</version>
    </dependency>
</dependencies>
```
DJL dynamically downloads the correct native engine.
For example, the PyTorch engine downloads the appropriate binary for Linux, macOS, or Windows.
DJL supports GPU acceleration automatically if CUDA is available.
If a compatible CUDA installation is detected, DJL loads a GPU-enabled native build; otherwise it falls back to the CPU build.
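As a minimal sketch, the device choice can also be made explicit at runtime by asking the engine how many GPUs it sees (the class name here is illustrative):

```java
import ai.djl.Device;
import ai.djl.engine.Engine;

public class DeviceSelectionExample {

    // Hedged sketch: pick a GPU when the engine reports one,
    // otherwise fall back to CPU.
    public static Device pickDevice() {
        return Engine.getInstance().getGpuCount() > 0 ? Device.gpu() : Device.cpu();
    }

    public static void main(String[] args) {
        System.out.println("Running on: " + pickDevice());
    }
}
```

The resulting Device can then be passed to model loading via the Criteria builder's optDevice option.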
The most common use case is running inference using pre-trained models.
Let’s walk through a practical example.
We will load a pre-trained image classification model and run prediction.
```java
Image img = ImageFactory.getInstance().fromFile(Paths.get("image.jpg"));

Criteria<Image, Classifications> criteria = Criteria.builder()
        .setTypes(Image.class, Classifications.class)
        .optApplication(Application.CV.IMAGE_CLASSIFICATION)
        .build();

try (ZooModel<Image, Classifications> model = criteria.loadModel();
     Predictor<Image, Classifications> predictor = model.newPredictor()) {
    Classifications result = predictor.predict(img);
    System.out.println(result);
}
```
The Predictor handles preprocessing, model inference, and output conversion.
DJL uses Translators to convert between Java objects and model tensors.
Example pipeline:
```
Image -> NDArray -> Model -> NDArray -> Classifications
```
Translators encapsulate this logic.
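A minimal custom Translator might look like the following sketch; the float[]-to-class-index mapping is illustrative, not a translator DJL ships:

```java
import ai.djl.ndarray.NDList;
import ai.djl.translate.Translator;
import ai.djl.translate.TranslatorContext;

// Hedged sketch: maps a raw float[] feature vector to the index of the
// highest-scoring output class. Preprocessing and postprocessing live here,
// so application code never touches NDArrays directly.
public class ArgMaxTranslator implements Translator<float[], Integer> {

    @Override
    public NDList processInput(TranslatorContext ctx, float[] input) {
        // Java object -> NDArray (model input)
        return new NDList(ctx.getNDManager().create(input));
    }

    @Override
    public Integer processOutput(TranslatorContext ctx, NDList output) {
        // NDArray (model output) -> Java object
        return (int) output.singletonOrThrow().argMax().getLong();
    }
}
```

A translator like this is passed to the Criteria builder so every Predictor created from the model applies the same conversion logic.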
DJL also supports training models directly in Java.
Let’s examine a simplified MNIST digit classifier.
```java
Mnist dataset = Mnist.builder()
        .setSampling(32, true) // batch size 32, shuffled
        .build();
dataset.prepare();

Block block = new SequentialBlock()
        .add(Blocks.batchFlattenBlock(784)) // 28x28 image -> 784-element vector
        .add(Linear.builder().setUnits(128).build())
        .add(Activation.reluBlock())
        .add(Linear.builder().setUnits(10).build()); // 10 digit classes

try (Model model = Model.newInstance("mnist-model")) {
    model.setBlock(block);
    try (Trainer trainer = model.newTrainer(
            new DefaultTrainingConfig(Loss.softmaxCrossEntropyLoss()))) {
        trainer.initialize(new Shape(1, 784)); // must be called before training
        for (Batch batch : trainer.iterateDataset(dataset)) {
            EasyTrain.trainBatch(trainer, batch);
            trainer.step();
            batch.close();
        }
    }
}
```
This loop performs forward pass, backpropagation, and optimizer updates.
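The same per-batch loop can also be driven by DJL's built-in helper; as a hedged sketch (the wrapper class and epoch count here are illustrative):

```java
import ai.djl.training.EasyTrain;
import ai.djl.training.Trainer;
import ai.djl.training.dataset.Dataset;
import ai.djl.translate.TranslateException;
import java.io.IOException;

public class FitExample {

    // Hedged sketch: EasyTrain.fit runs forward pass, backpropagation,
    // and optimizer step per batch, repeated for the given number of
    // epochs; passing null skips validation.
    public static void fit(Trainer trainer, Dataset dataset, int epochs)
            throws IOException, TranslateException {
        EasyTrain.fit(trainer, epochs, dataset, null);
    }
}
```

Using the helper keeps the training code short while remaining equivalent to the manual loop above.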
The Model Zoo simplifies the use of pre-trained models.
Developers can load models without worrying about network architectures, weight files, or preprocessing details.
Example model categories: image classification, object detection, and text classification.
Loading a model typically requires only a few lines of code.
DJL also supports publishing custom models into the Model Zoo format.
This enables teams to share trained models across services.
DJL relies on highly optimized native engines.
Performance benefits include native-speed tensor operations, optimized inference kernels, and optional GPU acceleration.
CPU
+ simpler deployment
+ lower infrastructure cost
GPU
+ faster deep learning inference
+ higher throughput
DJL automatically selects the appropriate device when available.
Batching allows multiple inputs to be processed simultaneously.
Benefits include higher throughput, better hardware utilization, and lower per-request overhead.
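Conceptually, batching just groups pending inputs before a single model call. A plain-Java sketch (the helper name is hypothetical; DJL's Predictor also offers batchPredict for lists of inputs):

```java
import java.util.ArrayList;
import java.util.List;

public class MicroBatcher {

    // Hypothetical helper: split pending inputs into fixed-size batches
    // so each model call amortizes its per-invocation overhead.
    public static <T> List<List<T>> toBatches(List<T> items, int batchSize) {
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < items.size(); i += batchSize) {
            batches.add(items.subList(i, Math.min(i + batchSize, items.size())));
        }
        return batches;
    }
}
```

Each resulting batch would then be passed to the model in one call instead of invoking inference per item.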
Enterprise systems commonly deploy DJL in several ways.
Spring Boot services can expose inference endpoints.
Example architecture:
```mermaid
flowchart LR
    Client --> REST_API
    REST_API --> DJL_Service
    DJL_Service --> Model
    Model --> Result
    Result --> Client
```
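A minimal Spring Boot endpoint might look like the following sketch; ClassificationService is a hypothetical wrapper around a DJL Predictor, not part of either framework:

```java
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RestController;

// Hypothetical service interface holding the DJL Predictor and Translator.
interface ClassificationService {
    String classify(byte[] imageBytes);
}

@RestController
public class InferenceController {

    private final ClassificationService service;

    public InferenceController(ClassificationService service) {
        this.service = service;
    }

    @PostMapping("/classify")
    public String classify(@RequestBody byte[] imageBytes) {
        // Delegate to the service so the controller stays free of ML details.
        return service.classify(imageBytes);
    }
}
```

Keeping the Predictor behind a service bean also makes it easy to share one loaded model across requests.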
Large-scale analytics pipelines often process millions of records.
DJL can run inside batch jobs.
Real-time applications such as fraud detection may perform inference on streaming events.
DJL Serving provides a production-ready model server.
Features include HTTP inference endpoints, multi-model hosting, and dynamic batching.
A realistic discussion must acknowledge that Python remains dominant in research.
Many organizations adopt a hybrid approach:
```
Python: model training
   |
   v
Exported model artifact
   |
   v
Java: inference service
```
This architecture combines the strengths of both ecosystems.
AI infrastructure is evolving quickly.
Several trends are emerging in the JVM world.
Dedicated inference services are becoming standard components of microservice architectures.
Systems such as vector databases enable semantic search and retrieval.
Large language models are increasingly integrated with backend services.
Frameworks like Spring AI are enabling this in Java.
Many organizations are building internal platforms that manage model training, deployment, versioning, and monitoring.
Java services frequently act as the orchestration layer.
Machine learning has historically been associated with Python, but the reality of modern enterprise systems is far more complex.
Production systems require reliability, scalability, observability, and tight integration with existing infrastructure.
The Deep Java Library (DJL) provides a powerful bridge between machine learning research and enterprise Java applications.
By abstracting over deep learning engines such as PyTorch and TensorFlow, DJL enables developers to run sophisticated machine learning workloads directly inside Java services.
For teams already invested in the JVM ecosystem, DJL represents a compelling path toward integrating AI capabilities without abandoning familiar tools and architectural patterns.
As AI continues to reshape software architecture, the ability to embed intelligent capabilities directly inside Java microservices will become increasingly valuable.
DJL makes that future not only possible, but practical.