Optimizing Language Model Inference on Azure

By Shantanu Deepak Patankar, Software Engineer Intern, and Hugo Affaticati, Technical Program Manager 2

Inefficient inference optimization can lead to skyrocketing costs for customers, making it crucial to establish clear performance benchmarking numbers. This blog sets the standard for expected performance, helping customers make informed decisions that maximize efficiency and…
