The proliferation of generative AI, driven by the advent of Large Language Models (LLMs), Vision-Language Models (VLMs), and agentic workflows, has fundamentally transformed the requirements for high-performance computing infrastructure. For users of the IndiaAI compute portal and the broader ecosystem, the selection of Graphics Processing Units (GPUs) has evolved from a simple comparison of peak floating-point performance (FLOPS) into a complex optimization problem involving memory capacity and bandwidth, interconnect topology, and architectural specialization.