Performance analysis of DeepSeek R1 AI Inference using vLLM on ND-H100-v5

Introduction: The DeepSeek R1 model represents a new frontier in large-scale reasoning for AI applications. Designed to tackle complex inference tasks, R1 pushes the boundaries of what’s possible, but not without significant infrastructure demands. To deploy DeepSeek R1 effectively with an inference engine such as vLLM, high-performance hardware is essential. Specifically, the…

29/08/2025, Azure High Performance Computing (HPC) Blog

You may be interested in

Running GPU accelerated workloads with NVIDIA GPU Operator on AKS
Azure High Performance Computing (HPC) Blog,
Dr. Wolfgang De Salvador - EMEA GBB HPC/AI Infrastructure Senior Specialist; Dr. Kai Neuffer - Principal Program Manager, Industry and Partner Sales - Energy Industry. Resources and references used in…
23/02/2024
Benchmark EDA workloads on Azure Intel Emerald Rapids (EMR) VMs
Azure High Performance Computing (HPC) Blog,
Co-authors: Nalla Ram and Wiener Evgeny, Intel. Electronic Design Automation (EDA) consists of a collection of software tools and workflows used for designing semiconductor products, most notably advanced computer…
25/11/2024
Unmounting Azure Managed Lustre Filesystem in a CycleCloud HPC cluster using Azure Scheduled Events.
Azure High Performance Computing (HPC) Blog,
There is a known behaviour in Lustre when a VM that has the Lustre filesystem mounted is evicted or deleted as part of a workflow without releasing the filesystem lock. Lustre…
04/10/2023
Slurm custom image for a locked-down environment and faster start-up time, Azure CycleCloud
Azure High Performance Computing (HPC) Blog,
Environment: CycleCloud: 8.7.1; Slurm project 3.0.11; Slurm version: 23.11.10-2; OS of compute and execute: marketplace AlmaLinux HPC image gen 2 8.10. PREREQUISITES: - working CC install (mine is currently…
22/05/2025
Creating a SLURM Cluster for Scheduling NVIDIA MIG-Based GPU Accelerated workloads
Azure High Performance Computing (HPC) Blog,
Today, researchers and developers often use a dedicated GPU for their workloads, even when only a fraction of the GPU's compute power is needed. The NVIDIA A100, A30, and H100…
07/07/2024
Open OnDemand with Azure CycleCloud Workspace for Slurm
Azure High Performance Computing (HPC) Blog,
Introduction: Open OnDemand, developed by the Ohio Supercomputer Center (OSC), is an open-source, web-based portal designed to offer seamless access to high-performance computing (HPC) resources. The integration with Azure CycleCloud…
27/05/2025
Saying Goodbye to HPC Pack 2012 R2: End of Life Date Reached April 11th 2023
Azure High Performance Computing (HPC) Blog,
Introduction: Microsoft announced many years ago that it would end support for its High-Performance Computing (HPC) Pack 2012 R2 on April 11th, 2023. This means that Microsoft will no longer…
11/04/2023
Running DeepSeek-R1 on a single NDv5 MI300X VM
Azure High Performance Computing (HPC) Blog,
Contributors: Davide Vanzo, Yuval Mazor, Jesse Lopez. DeepSeek-R1 is an open-weights reasoning model built on DeepSeek-V3, designed for conversational AI, coding, and complex problem-solving. It has gained significant attention…
01/02/2025
Daily schedule: Microsoft in-booth sessions at NVIDIA GTC
Azure High Performance Computing (HPC) Blog,
We look forward to seeing you at the NVIDIA GTC AI Conference March 17 - 21 in San Jose, CA or virtually. Visit Microsoft booth #514 for daily informative sessions…
06/03/2025
Running OpenFOAM simulations on Azure Batch
Azure High Performance Computing (HPC) Blog,
OpenFOAM (Open Field Operation and Manipulation) is an open-source computational fluid dynamics (CFD) software package. It provides a comprehensive set of tools for simulating and analyzing complex fluid flow and…
12/05/2023
Quick HPC Cluster Creation with Apps using CycleCloud and EESSI: A WRF example
Azure High Performance Computing (HPC) Blog,
Would you like to have a single script to quickly provision High Performance Computing (HPC) clusters with access to several ready-to-use HPC applications (WRF, GROMACS, OpenFOAM, and many more) so…
22/02/2024
Siemens Simcenter™ STAR-CCM+™ on Azure HBv4
Azure High Performance Computing (HPC) Blog,
A suite of solvers is included in Simcenter STAR CCM+ for solving problems involving complex geometries and physical phenomena. The wide range of physics models includes CFD, computational solid mechanics,…
10/06/2025