Abstract The theme of this blog is “Simplicity”. Today’s HPC user has an overabundance of choices when it comes to HPC Schedulers, clouds, infrastructure in those clouds, and data management…
by Mark Gitau (Software Engineer) Introduction For the MLPerf Inference v5.1 submission, Azure shared performance results on the new ND GB200 v6 virtual machines. A single ND GB200 v6 VM…
Introduction The DeepSeek R1 model represents a new frontier in large-scale reasoning for AI applications. Designed to tackle complex inference tasks, R1 pushes the boundaries of what’s possible—but not without…
Introduction: Many customers run multiple Teamcenter-SPDM solutions across the enterprise, mixing multiple instances, multiple ISV vendors, and hybrid cloud/on-prem implementations. This fragmentation reduces the customer’s ability to uniformly access data.…
Introduction Following our previous evaluation of Llama 3.1 8B inference performance on Azure’s ND-H100-v5 infrastructure using vLLM, this report broadens the scope to compare inference performance across a range of…
Introduction The pace of development in large language models (LLMs) has continued to accelerate as the global AI community races toward the goal of artificial general intelligence (AGI). Today’s most…
by Mishty Dhekial (Software Engineer Intern) and Hugo Affaticati (Cloud Infrastructure Engineer) Why Llama? The Llama3 8B model was selected as the focus of this analysis due to its relevance…
Architecture The Ansys Minerva baseline architecture has four distributed tiers (client, web, enterprise, and resource) in a single Azure availability zone. Each tier aligns to a function, and communication flows between these…
High Performance Computing (HPC) environments are essential for research, engineering, and data-intensive workloads. To efficiently manage compute resources and job submissions, organizations rely on robust scheduling and orchestration tools. In…
by Hugo Affaticati (Cloud Infrastructure Engineer), Amirreza Rastegari (Senior Software Engineer), Jie Zhang (Principal Software Engineer), and Michael Ringenburg (Principal Software Engineer Manager). Interconnects enable communication between VMs (also…
Overview Semiconductor (or Electronic Design Automation [EDA]) companies prioritize reducing time to market (TTM), which depends on how quickly tasks such as chip design validation and pre-foundry work can be…
Generative AI has been the buzz across engineering, science and consumer applications, including EDA. It was the centerpiece of the keynotes at both SNUG and CadenceLive, and it will feature…
Workflow NX users access the NX application deployed on Azure Virtual Desktop via the Remote Desktop Protocol (RDP). Users can access the NX application either by logging into the Session Desktop or…
Introduction The NVIDIA NeMo Framework is a scalable, cloud-native generative AI framework designed to support the development of Large Language Models (LLMs) and Multimodal Models (MMs). The NeMo Framework provides…
While working with enterprise HPC teams, a question we often hear is: how do we use InfiniBand or GPUs on Azure with Red Hat Linux? Since Red Hat does not provide HPC-enabled…
We have shown in earlier blogs how to use EESSI to get access to highly optimized applications for different CPU architectures, e.g.: Accessing the EESSI Common Stack of Scientific…
Disclaimer: The slurm-cluster-health-manager project is a sample tool created specifically for the article it accompanies. This is not an official Microsoft product, and it is not supported or maintained by Microsoft.…
Simcenter STAR-CCM+ includes a suite of solvers for problems involving complex geometries and physical phenomena. Its wide range of physics models covers CFD, computational solid mechanics,…
Introduction Open OnDemand, developed by the Ohio Supercomputer Center (OSC), is an open-source, web-based portal designed to offer seamless access to high-performance computing (HPC) resources. The integration with Azure CycleCloud…
1. Simulation Meets Deep Learning: A New Paradigm At the heart of this confluence lies the potential to fuse numerical rigor with data flexibility. Key emerging patterns include: a. Surrogate…
Environment: CycleCloud 8.7.1; Slurm project 3.0.11; Slurm version 23.11.10-2; OS of compute and execute nodes: marketplace AlmaLinux HPC image Gen 2 8.10. Prerequisites: - working CC install (mine is currently…
Environment: CycleCloud 8.7.1; Slurm project 3.0.11; Slurm version 23.11.10-2; OS of compute and execute nodes: marketplace AlmaLinux HPC image Gen 2 8.10. Prerequisites: - working CC install (mine is currently…
Overview As GPU clusters grow in scale, failure recovery becomes a critical part of maintaining workload resiliency and maximizing compute resource utilization. In this article series, I’ll walk through how…
DeepEP DeepEP is a high-performance communication library developed by DeepSeek AI to optimize Mixture-of-Experts (MoE) and expert parallelism (EP) in large-scale AI models. It provides high-throughput, low-latency all-to-all GPU kernels for…
High Performance Computing Cluster: A high-performance computing (HPC) cluster is a collection of interconnected computers (nodes) that work together to perform complex computational tasks at high speeds, far beyond the…
Nextflow is one of the most widely adopted open-source workflow orchestrators in the scientific research domain. In genomics, a pipeline refers to a series of computational steps designed to analyze…
NVIDIA DGX Cloud benchmarking provides a standardized framework for evaluating the performance of large-scale AI workloads using containerized recipes. Each recipe targets a specific workload and supports flexible configuration across…
Table of Contents: What is Computer-Aided Engineering (CAE)? Why Move CAE to the Cloud? Cloud vs. On-Premises What Makes Azure Special for CAE Workloads? What Makes Azure Stand Out Among Public…
Prerequisites Microsoft Azure Subscription: Ensure you have an active Microsoft Azure subscription. Virtual Network: Ensure you have a VNet with connectivity to your corporate network (for example, over VPN) Resource…
Ubuntu Pro is a premium subscription service offered by Canonical. It is designed to provide additional features, tools, and extended support for Ubuntu users, particularly those in enterprise or production…
Cost-optimized AI inference, virtual workstations, and cloud gaming. AI inferencing and graphics-intensive applications continue to demand cost-effective, low-power, high-performance GPUs with more GPU memory and faster CPUs. Today…
As HPC & AI workloads continue to scale in complexity and performance demands, ensuring visibility into the underlying infrastructure becomes critical. This guide presents an essential monitoring solution for AI…
Running high-performance computing (HPC) and AI workloads in the cloud requires a flexible and scalable orchestration platform. Microsoft Azure CycleCloud, when combined with Slurm, provides an efficient solution for managing…
In our previous deep dive on the performance of the ND GB200 v6 virtual machines, we explored the architectural improvements of the NVIDIA GB200 NVL72 one component at a time.…
We are thrilled to announce the preview of Open OnDemand integration with Azure CycleCloud Workspace for Slurm. Open OnDemand, developed by the Ohio Supercomputer Center (OSC), is an open-source, web-based…
Today we are thrilled to announce the General Availability of Azure's latest AI infrastructure Virtual Machines, the ND GB200 v6. Azure is proud to be one of the first cloud…
Today, we're excited to announce that hibernation for Azure GPU VMs is now generally available. We announced the extension of hibernation support to GPU Virtual Machines (VMs) in Azure in April…
For a comprehensive understanding of our benchmarking methodologies and detailed performance results, please refer to our benchmarking guide available on the official Azure GitHub repository: Azure AI Benchmarking Guide. Breakdown…
In today’s fast-paced digital landscape, High-Performance Computing (HPC) is a critical engine powering innovation across industries—from automotive and aerospace to energy and manufacturing. To keep pace with escalating performance demands…
We look forward to seeing you at the NVIDIA GTC AI Conference March 17 - 21 in San Jose, CA or virtually. Visit Microsoft booth #514 for daily informative sessions…
(Co-authored by: Rafael Salas, Sreevatsa Anantharamu, Jithin Jose) Introduction NCCL The NVIDIA Collective Communications Library (NCCL) is one of the most widely used communication libraries for AI training and inference. It features GPU-focused…
Calling all AI innovators, champions, and pioneers... The impact of AI on businesses is profound and Microsoft is committed to helping every company, regardless of its size, and everyone, regardless…
What is Azure CycleCloud? Azure CycleCloud is an enterprise-friendly tool for orchestrating and managing HPC environments on Azure. With Azure CycleCloud, users can provision infrastructure for HPC systems, deploy…
Contributors: Davide Vanzo, Yuval Mazor, Jesse Lopez DeepSeek-R1 is an open-weights reasoning model built on DeepSeek-V3, designed for conversational AI, coding, and complex problem-solving. It has gained significant attention…
Introduction This guide provides step-by-step instructions on how to run DeepSeek on Azure Kubernetes Service (AKS). The setup utilizes an ND-H100-v5 VM to accommodate the 4-bit quantized 671-billion parameter model on a…
High-performance computing thrives on efficient GPU resource sharing, and integrating NVIDIA’s CUDA Multi-Process Service (MPS) with CycleCloud-managed Slurm clusters can revolutionize how teams optimize their workloads. CUDA MPS streamlines…