Abstract The theme of this blog is “Simplicity”. Today’s HPC user has an overabundance of choices when it comes to HPC Schedulers, clouds, infrastructure in those clouds, and data management…
by Mark Gitau (Software Engineer) Introduction For the MLPerf Inference v5.1 submission, Azure shared performance results on the new ND GB200 v6 virtual machines. A single ND GB200 v6 VM…
Introduction The DeepSeek R1 model represents a new frontier in large-scale reasoning for AI applications. Designed to tackle complex inference tasks, R1 pushes the boundaries of what’s possible—but not without…
Introduction: Many customers run multiple Teamcenter-SPDM solutions across the enterprise, mixing multiple instances, multiple ISV vendors, and hybrid cloud/on-prem implementations. This fragmentation reduces the customer’s ability to uniformly access data.…
Introduction Following our previous evaluation of Llama 3.1 8B inference performance on Azure’s ND-H100-v5 infrastructure using vLLM, this report broadens the scope to compare inference performance across a range of…
Introduction The pace of development in large language models (LLMs) has continued to accelerate as the global AI community races toward the goal of artificial general intelligence (AGI). Today’s most…
by Mishty Dhekial (Software Engineer Intern) and Hugo Affaticati (Cloud Infrastructure Engineer) Why Llama? The Llama3 8B model was selected as the focus of this analysis due to its relevance…
Architecture The Ansys Minerva baseline architecture has four distributed tiers (client, web, enterprise, and resource) in a single Azure availability zone. Each tier aligns to a function, and communication flows between these…
High Performance Computing (HPC) environments are essential for research, engineering, and data-intensive workloads. To efficiently manage compute resources and job submissions, organizations rely on robust scheduling and orchestration tools. In…
by Hugo Affaticati (Cloud Infrastructure Engineer), Amirreza Rastegari (Senior Software Engineer), Jie Zhang (Principal Software Engineer), and Michael Ringenburg (Principal Software Engineer Manager). Interconnects enable communication between VMs (also…
Overview Semiconductor (or Electronic Design Automation [EDA]) companies prioritize reducing time to market (TTM), which depends on how quickly tasks such as chip design validation and pre-foundry work can be…
Generative AI has been the buzz across engineering, science and consumer applications, including EDA. It was the centerpiece of the keynotes at both SNUG and CadenceLive, and it will feature…
Workflow NX users access the NX application deployed on Azure Virtual Desktop via the Remote Desktop Protocol (RDP). Users can access the NX application either by logging into the Session Desktop or…
Introduction The NVIDIA NeMo Framework is a scalable, cloud-native generative AI framework designed to support the development of Large Language Models (LLMs) and Multimodal Models (MMs). The NeMo Framework provides…
While working with enterprise HPC teams, a question we often hear is: how do we use InfiniBand or GPUs on Azure with Red Hat Linux? Since Red Hat does not provide HPC-enabled…
We have shown in earlier blogs how to use EESSI to get access to highly optimized applications for different CPU architectures, e.g.: Accessing the EESSI Common Stack of Scientific…
Disclaimer: The slurm-cluster-health-manager project is a sample tool created specifically for the article it accompanies. This is not an official Microsoft product, and it is not supported or maintained by Microsoft.…
Simcenter STAR-CCM+ includes a suite of solvers for problems involving complex geometries and physical phenomena. Its wide range of physics models covers CFD, computational solid mechanics,…
Introduction Open OnDemand, developed by the Ohio Supercomputer Center (OSC), is an open-source, web-based portal designed to offer seamless access to high-performance computing (HPC) resources. The integration with Azure CycleCloud…
1. Simulation Meets Deep Learning: A New Paradigm At the heart of this confluence lies the potential to fuse numerical rigor with data flexibility. Key emerging patterns include: a. Surrogate…
Environment: CycleCloud 8.7.1; Slurm project 3.0.11; Slurm version 23.11.10-2; OS of compute and execute nodes: marketplace AlmaLinux HPC image Gen 2 8.10. Prerequisites: - working CC install (mine is currently…
Environment: CycleCloud 8.7.1; Slurm project 3.0.11; Slurm version 23.11.10-2; OS of compute and execute nodes: marketplace AlmaLinux HPC image Gen 2 8.10. Prerequisites: - working CC install (mine is currently…
Overview As GPU clusters grow in scale, failure recovery becomes a critical part of maintaining workload resiliency and maximizing compute resource utilization. In this article series, I’ll walk through how…
DeepEP DeepEP is a high-performance communication library developed by DeepSeek AI to optimize Mixture-of-Experts (MoE) and expert parallelism (EP) in large-scale AI models. It provides high-throughput, low-latency all-to-all GPU kernels for…
High Performance Computing Cluster: A high-performance computing (HPC) cluster is a collection of interconnected computers (nodes) that work together to perform complex computational tasks at high speeds, far beyond the…
Nextflow is one of the most widely adopted open-source workflow orchestrators in the scientific research domain. In genomics, a pipeline refers to a series of computational steps designed to analyze…
NVIDIA DGX Cloud benchmarking provides a standardized framework for evaluating the performance of large-scale AI workloads using containerized recipes. Each recipe targets a specific workload and supports flexible configuration across…
Table of Contents: What is Computer-Aided Engineering (CAE)? Why Move CAE to the Cloud? Cloud vs. On-Premises What Makes Azure Special for CAE Workloads? What Makes Azure Stand Out Among Public…
Prerequisites Microsoft Azure Subscription: Ensure you have an active Microsoft Azure subscription. Virtual Network: Ensure you have a VNet with connectivity to your corporate network (for example, over VPN) Resource…
Ubuntu Pro is a premium subscription service offered by Canonical. It is designed to provide additional features, tools, and extended support for Ubuntu users, particularly those in enterprise or production…
Cost-optimized AI inference, virtual workstations, and cloud gaming. AI inferencing and graphics-intensive applications continue to demand cost-effective, low-power, high-performance GPUs with more GPU memory and faster CPUs. Today…
As HPC & AI workloads continue to scale in complexity and performance demands, ensuring visibility into the underlying infrastructure becomes critical. This guide presents an essential monitoring solution for AI…
Running high-performance computing (HPC) and AI workloads in the cloud requires a flexible and scalable orchestration platform. Microsoft Azure CycleCloud, when combined with Slurm, provides an efficient solution for managing…
In our previous deep dive on the performance of the ND GB200 v6 virtual machines, we explored the architectural improvements of the NVIDIA GB200 NVL72 one component at a time.…
We are thrilled to announce the preview of Open OnDemand integration with Azure CycleCloud Workspace for Slurm. Open OnDemand, developed by the Ohio Supercomputer Center (OSC), is an open-source, web-based…
Today we are thrilled to announce the General Availability of Azure's latest AI infrastructure Virtual Machines, the ND GB200 v6. Azure is proud to be one of the first cloud…
Today, we're excited to announce that hibernation for Azure GPU VMs is now generally available. We announced the extension of hibernation support to GPU Virtual Machines (VMs) in Azure in April…
For a comprehensive understanding of our benchmarking methodologies and detailed performance results, please refer to our benchmarking guide available on the official Azure GitHub repository: Azure AI Benchmarking Guide. Breakdown…
In today’s fast-paced digital landscape, High-Performance Computing (HPC) is a critical engine powering innovation across industries—from automotive and aerospace to energy and manufacturing. To keep pace with escalating performance demands…
We look forward to seeing you at the NVIDIA GTC AI Conference March 17 - 21 in San Jose, CA or virtually. Visit Microsoft booth #514 for daily informative sessions…
(Co-authored by: Rafael Salas, Sreevatsa Anantharamu, Jithin Jose) Introduction NCCL The NVIDIA Collective Communications Library (NCCL) is one of the most widely used communication libraries for AI training and inference. It features GPU-focused…
Calling all AI innovators, champions, and pioneers... The impact of AI on businesses is profound and Microsoft is committed to helping every company, regardless of its size, and everyone, regardless…
What is Azure CycleCloud? Azure CycleCloud is an enterprise-friendly tool for orchestrating and managing HPC environments on Azure. With Azure CycleCloud, users can provision infrastructure for HPC systems, deploy…
Contributors: Davide Vanzo, Yuval Mazor, Jesse Lopez DeepSeek-R1 is an open-weights reasoning model built on DeepSeek-V3, designed for conversational AI, coding, and complex problem-solving. It has gained significant attention…
Introduction This guide provides step-by-step instructions on how to run DeepSeek on Azure Kubernetes Service (AKS). The setup utilizes an ND-H100-v5 VM to accommodate the 4-bit quantized 671-billion parameter model on a…
High-performance computing thrives on efficient GPU resource sharing, and integrating NVIDIA’s CUDA Multi-Process Service (MPS) with CycleCloud-managed Slurm clusters can revolutionize how teams optimize their workloads. CUDA MPS streamlines…