Achieving Optimal Performance for DeepSeek Expert Parallelism (DeepEP) on Azure

DeepEP is a high-performance communication library developed by DeepSeek AI to optimize Mixture-of-Experts (MoE) and expert parallelism (EP) in large-scale AI models. It provides high-throughput, low-latency all-to-all GPU kernels for MoE dispatch and combine operations, which are critical for efficiently routing data between expert modules during training and inference.
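To make the dispatch/combine terminology concrete, the following is a minimal NumPy sketch of what an MoE layer does logically: a router picks top-k experts per token, tokens are grouped per expert (dispatch), each expert processes its group, and outputs are summed back per token with router weights (combine). This is a toy illustration, not DeepEP's API; in DeepEP the dispatch and combine steps become all-to-all GPU communication across ranks. All names here are hypothetical.

```python
import numpy as np

# Toy MoE dispatch/combine sketch (hypothetical, not DeepEP's API).
rng = np.random.default_rng(0)
num_tokens, hidden, num_experts, top_k = 4, 8, 3, 2

tokens = rng.standard_normal((num_tokens, hidden))
router_logits = rng.standard_normal((num_tokens, num_experts))

# Router: softmax over experts, keep top-k per token, renormalize weights.
probs = np.exp(router_logits) / np.exp(router_logits).sum(axis=1, keepdims=True)
topk_idx = np.argsort(-probs, axis=1)[:, :top_k]     # (num_tokens, top_k)
topk_w = np.take_along_axis(probs, topk_idx, axis=1)
topk_w /= topk_w.sum(axis=1, keepdims=True)

# Toy experts: one linear layer each.
experts = [rng.standard_normal((hidden, hidden)) for _ in range(num_experts)]

# Dispatch: group token indices by destination expert
# (in a real EP setup this grouping is realized as an all-to-all exchange).
dispatch = {e: [] for e in range(num_experts)}
for t in range(num_tokens):
    for e in topk_idx[t]:
        dispatch[int(e)].append(t)

# Combine: each expert processes its tokens; results are scattered back
# to the owning tokens, weighted by the router probabilities.
out = np.zeros_like(tokens)
for e, tok_ids in dispatch.items():
    if not tok_ids:
        continue
    y = tokens[tok_ids] @ experts[e]
    for j, t in enumerate(tok_ids):
        w = topk_w[t][list(topk_idx[t]).index(e)]
        out[t] += w * y[j]

print(out.shape)  # one combined activation per token: (4, 8)
```

The grouping step is why communication dominates MoE cost at scale: every rank must exchange variable-sized token batches with every other rank, which is exactly the pattern DeepEP's kernels accelerate.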
