Artificial Intelligence has rapidly transitioned from experimental research to production infrastructure. Recommendation systems, fraud detection models, computer vision pipelines, and intelligent search engines now sit at the core of many enterprise platforms.
Yet one major architectural tension remains: most machine learning models are built using Python, while most enterprise systems run on Java.
For years, this created an awkward boundary between research teams and production systems. Data scientists trained models in Python notebooks. Backend engineers then struggled to operationalize those models in scalable services.
The Deep Java Library (DJL) was created to bridge that gap.
DJL allows Java developers to train, load, and run machine learning models directly inside Java applications, while still leveraging powerful deep learning engines like PyTorch, TensorFlow, MXNet, and ONNX Runtime under the hood.
In this guide, we will explore what DJL is, how it fits into the Java ecosystem, and how to set it up, run inference, train models, and deploy them in production.
This article is designed for experienced Java engineers who want to integrate machine learning into backend systems without switching languages.
If you attend a machine learning conference today, nearly every demo runs in Python. Libraries such as PyTorch, TensorFlow, and scikit-learn dominate the research ecosystem.
However, production infrastructure tells a different story.
Large-scale systems at banks, telecom companies, and e-commerce platforms are often built on Java, the JVM, and frameworks such as Spring Boot.
This creates a gap between AI research and production deployment.
Typical enterprise systems need to support AI capabilities such as recommendations, fraud detection, computer vision, and intelligent search.
Traditionally, teams solved this by deploying separate Python services for inference.
While functional, this architecture introduces additional complexity: more services to deploy, cross-language data serialization, and extra network hops on the inference path.
Running ML models directly inside Java services eliminates many of these issues.
This is where the Deep Java Library (DJL) enters the picture.
DJL is an open-source deep learning framework built specifically for Java developers. Instead of implementing its own deep learning engine, DJL provides a unified API that can run on top of multiple ML engines.
That means Java developers can write code once and run it with PyTorch, TensorFlow, MXNet, or ONNX Runtime.
The Deep Java Library is a high-level deep learning framework designed to make machine learning accessible to Java developers.
Rather than reinventing deep learning from scratch, DJL provides a Java-native abstraction layer over existing ML engines.
The design philosophy is simple: allow Java developers to leverage modern deep learning engines without writing Python.
DJL introduces several important abstractions, including NDArray, Model, Predictor, and Translator.
At a high level, DJL sits between the Java application and the native deep learning engine.
```mermaid
flowchart TD
    A[Java Application]
    B[DJL API]
    C[Engine Abstraction Layer]
    D[Deep Learning Engine]
    A --> B
    B --> C
    C --> D
    D -->|PyTorch| E
    D -->|TensorFlow| F
    D -->|MXNet| G
    D -->|ONNX Runtime| H
```
The application interacts only with the DJL API. The underlying engine handles the heavy numerical computation.
This design allows developers to switch engines without rewriting application code.
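As a hedged sketch, the target engine can be pinned explicitly through the Criteria builder; everything else in the application stays unchanged (the class name here is illustrative):

```java
import ai.djl.Application;
import ai.djl.modality.Classifications;
import ai.djl.modality.cv.Image;
import ai.djl.repository.zoo.Criteria;

public class EngineSelectionExample {

    // The same Criteria works with any installed engine;
    // only the optEngine hint changes.
    public static Criteria<Image, Classifications> buildCriteria(String engineName) {
        return Criteria.builder()
                .setTypes(Image.class, Classifications.class)
                .optApplication(Application.CV.IMAGE_CLASSIFICATION)
                .optEngine(engineName) // e.g. "PyTorch", "TensorFlow", "OnnxRuntime"
                .build();
    }

    public static void main(String[] args) {
        Criteria<Image, Classifications> criteria = buildCriteria("PyTorch");
        System.out.println(criteria.getEngine());
    }
}
```

Swapping the engine name is the only change needed; the corresponding engine dependency must be on the classpath.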
At the heart of DJL lies the NDArray abstraction.
NDArray represents multidimensional numerical data, similar to NumPy arrays in Python or tensors in PyTorch and TensorFlow.
Example:
```java
try (NDManager manager = NDManager.newBaseManager()) {
    NDArray array = manager.create(new float[]{1f, 2f, 3f});
    NDArray result = array.mul(2); // element-wise multiplication
    System.out.println(result);
}
```
This simple API hides the complexity of the underlying engine while maintaining high performance.
DJL introduces a clear model lifecycle.
```
Model
  |
  v
Predictor
  |
  v
Inference
```
A Model represents the machine learning artifact.
A Predictor is responsible for performing inference using that model.
DJL includes a Model Zoo containing pre-trained models for common tasks.
These include models for image classification, object detection, and sentiment analysis.
Developers can load these models with minimal configuration.
The DJL ecosystem consists of several modules that support the full machine learning lifecycle.
The main abstraction layer that developers interact with.
Key components include NDArray, NDManager, Model, Predictor, and Translator.
DJL supports multiple backends.
Common integrations include PyTorch, TensorFlow, MXNet, and ONNX Runtime.
Each engine provides optimized native execution.
The DJL Model Zoo hosts pre-trained models that can be used immediately.
Example categories: computer vision, natural language processing, and classification.
DJL includes dataset loaders for common machine learning datasets.
Examples include MNIST and CIFAR-10.
DJL includes APIs for the full training loop: trainers, loss functions, optimizers, and metrics.
DJL Serving is a production-grade model server similar to TensorFlow Serving.
It allows models to be exposed as scalable APIs.
Let's walk through setting up DJL in a Maven-based project.
Create a standard Maven project structure.
```
src
└── main
    └── java
```
Add DJL dependencies to your pom.xml.
```xml
<dependencies>
    <dependency>
        <groupId>ai.djl</groupId>
        <artifactId>api</artifactId>
        <version>0.27.0</version>
    </dependency>
    <dependency>
        <groupId>ai.djl</groupId>
        <artifactId>model-zoo</artifactId>
        <version>0.27.0</version>
    </dependency>
    <dependency>
        <groupId>ai.djl.pytorch</groupId>
        <artifactId>pytorch-engine</artifactId>
        <version>0.27.0</version>
    </dependency>
</dependencies>
```
DJL dynamically downloads the correct native engine.
For example, the PyTorch engine downloads the appropriate binary for Linux, macOS, or Windows.
DJL supports GPU acceleration automatically if CUDA is available.
If a compatible CUDA installation is detected, DJL loads a GPU-enabled native build; otherwise it falls back to the CPU build.
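As a minimal sketch, the device choice can also be made explicit at runtime by asking the engine how many GPUs it sees (the class name here is illustrative):

```java
import ai.djl.Device;
import ai.djl.engine.Engine;

public class DeviceSelectionExample {

    // Hedged sketch: pick a GPU when the engine reports one,
    // otherwise fall back to CPU.
    public static Device pickDevice() {
        return Engine.getInstance().getGpuCount() > 0 ? Device.gpu() : Device.cpu();
    }

    public static void main(String[] args) {
        System.out.println("Running on: " + pickDevice());
    }
}
```

The resulting Device can then be passed to model loading via the Criteria builder's optDevice option.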
The most common use case is running inference using pre-trained models.
Let’s walk through a practical example.
We will load a pre-trained image classification model and run prediction.
```java
Image img = ImageFactory.getInstance().fromFile(Paths.get("image.jpg"));

Criteria<Image, Classifications> criteria = Criteria.builder()
        .setTypes(Image.class, Classifications.class)
        .optApplication(Application.CV.IMAGE_CLASSIFICATION)
        .build();

try (ZooModel<Image, Classifications> model = criteria.loadModel();
     Predictor<Image, Classifications> predictor = model.newPredictor()) {
    Classifications result = predictor.predict(img);
    System.out.println(result);
}
```
The Predictor handles preprocessing, model inference, and output conversion.
DJL uses Translators to convert between Java objects and model tensors.
Example pipeline:
```
Image -> NDArray -> Model -> NDArray -> Classifications
```
Translators encapsulate this logic.
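A minimal custom Translator might look like the following sketch; the float[]-to-class-index mapping is illustrative, not a translator DJL ships:

```java
import ai.djl.ndarray.NDList;
import ai.djl.translate.Translator;
import ai.djl.translate.TranslatorContext;

// Hedged sketch: maps a raw float[] feature vector to the index of the
// highest-scoring output class. Preprocessing and postprocessing live here,
// so application code never touches NDArrays directly.
public class ArgMaxTranslator implements Translator<float[], Integer> {

    @Override
    public NDList processInput(TranslatorContext ctx, float[] input) {
        // Java object -> NDArray (model input)
        return new NDList(ctx.getNDManager().create(input));
    }

    @Override
    public Integer processOutput(TranslatorContext ctx, NDList output) {
        // NDArray (model output) -> Java object
        return (int) output.singletonOrThrow().argMax().getLong();
    }
}
```

A translator like this is passed to the Criteria builder so every Predictor created from the model applies the same conversion logic.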
DJL also supports training models directly in Java.
Let’s examine a simplified MNIST digit classifier.
```java
Mnist dataset = Mnist.builder()
        .setSampling(32, true) // batch size 32, shuffled
        .build();
dataset.prepare();

Block block = new SequentialBlock()
        .add(Blocks.batchFlattenBlock(784)) // 28x28 image -> 784-element vector
        .add(Linear.builder().setUnits(128).build())
        .add(Activation.reluBlock())
        .add(Linear.builder().setUnits(10).build()); // 10 digit classes

try (Model model = Model.newInstance("mnist-model")) {
    model.setBlock(block);
    try (Trainer trainer = model.newTrainer(
            new DefaultTrainingConfig(Loss.softmaxCrossEntropyLoss()))) {
        trainer.initialize(new Shape(1, 784)); // must be called before training
        for (Batch batch : trainer.iterateDataset(dataset)) {
            EasyTrain.trainBatch(trainer, batch);
            trainer.step();
            batch.close();
        }
    }
}
```
This loop performs forward pass, backpropagation, and optimizer updates.
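The same per-batch loop can also be driven by DJL's built-in helper; as a hedged sketch (the wrapper class and epoch count here are illustrative):

```java
import ai.djl.training.EasyTrain;
import ai.djl.training.Trainer;
import ai.djl.training.dataset.Dataset;
import ai.djl.translate.TranslateException;
import java.io.IOException;

public class FitExample {

    // Hedged sketch: EasyTrain.fit runs forward pass, backpropagation,
    // and optimizer step per batch, repeated for the given number of
    // epochs; passing null skips validation.
    public static void fit(Trainer trainer, Dataset dataset, int epochs)
            throws IOException, TranslateException {
        EasyTrain.fit(trainer, epochs, dataset, null);
    }
}
```

Using the helper keeps the training code short while remaining equivalent to the manual loop above.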
The Model Zoo simplifies the use of pre-trained models.
Developers can load models without worrying about network architectures, weight files, or preprocessing details.
Example model categories: image classification, object detection, and text classification.
Loading a model typically requires only a few lines of code.
DJL also supports publishing custom models into the Model Zoo format.
This enables teams to share trained models across services.
DJL relies on highly optimized native engines.
Performance benefits include native-speed tensor operations, optimized inference kernels, and optional GPU acceleration.
CPU
+ simpler deployment
+ lower infrastructure cost
GPU
+ faster deep learning inference
+ higher throughput
DJL automatically selects the appropriate device when available.
Batching allows multiple inputs to be processed simultaneously.
Benefits include higher throughput, better hardware utilization, and lower per-request overhead.
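Conceptually, batching just groups pending inputs before a single model call. A plain-Java sketch (the helper name is hypothetical; DJL's Predictor also offers batchPredict for lists of inputs):

```java
import java.util.ArrayList;
import java.util.List;

public class MicroBatcher {

    // Hypothetical helper: split pending inputs into fixed-size batches
    // so each model call amortizes its per-invocation overhead.
    public static <T> List<List<T>> toBatches(List<T> items, int batchSize) {
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < items.size(); i += batchSize) {
            batches.add(items.subList(i, Math.min(i + batchSize, items.size())));
        }
        return batches;
    }
}
```

Each resulting batch would then be passed to the model in one call instead of invoking inference per item.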
Enterprise systems commonly deploy DJL in several ways.
Spring Boot services can expose inference endpoints.
Example architecture:
```mermaid
flowchart LR
    Client --> REST_API
    REST_API --> DJL_Service
    DJL_Service --> Model
    Model --> Result
    Result --> Client
```
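A minimal Spring Boot endpoint might look like the following sketch; ClassificationService is a hypothetical wrapper around a DJL Predictor, not part of either framework:

```java
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RestController;

// Hypothetical service interface holding the DJL Predictor and Translator.
interface ClassificationService {
    String classify(byte[] imageBytes);
}

@RestController
public class InferenceController {

    private final ClassificationService service;

    public InferenceController(ClassificationService service) {
        this.service = service;
    }

    @PostMapping("/classify")
    public String classify(@RequestBody byte[] imageBytes) {
        // Delegate to the service so the controller stays free of ML details.
        return service.classify(imageBytes);
    }
}
```

Keeping the Predictor behind a service bean also makes it easy to share one loaded model across requests.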
Large-scale analytics pipelines often process millions of records.
DJL can run inside batch jobs.
Real-time applications such as fraud detection may perform inference on streaming events.
DJL Serving provides a production-ready model server.
Features include HTTP inference endpoints, multi-model hosting, and dynamic batching.
A realistic discussion must acknowledge that Python remains dominant in research.
Many organizations adopt a hybrid approach:
```
Python: model training
   |
   v
Exported model artifact
   |
   v
Java: inference service
```
This architecture combines the strengths of both ecosystems.
AI infrastructure is evolving quickly.
Several trends are emerging in the JVM world.
Dedicated inference services are becoming standard components of microservice architectures.
Systems such as vector databases enable semantic search and retrieval.
Large language models are increasingly integrated with backend services.
Frameworks like Spring AI are enabling this in Java.
Many organizations are building internal platforms that manage model training, deployment, versioning, and monitoring.
Java services frequently act as the orchestration layer.
Machine learning has historically been associated with Python, but the reality of modern enterprise systems is far more complex.
Production systems require reliability, scalability, observability, and tight integration with existing infrastructure.
The Deep Java Library (DJL) provides a powerful bridge between machine learning research and enterprise Java applications.
By abstracting over deep learning engines such as PyTorch and TensorFlow, DJL enables developers to run sophisticated machine learning workloads directly inside Java services.
For teams already invested in the JVM ecosystem, DJL represents a compelling path toward integrating AI capabilities without abandoning familiar tools and architectural patterns.
As AI continues to reshape software architecture, the ability to embed intelligent capabilities directly inside Java microservices will become increasingly valuable.
DJL makes that future not only possible, but practical.