1. Pradeep Rao Vennamaneni. Optimizing Cloud-Native LLM Workloads with Serverless GPU Orchestration and Token-Aware Scheduling. tajet [Internet]. 2024 Apr. 25 [cited 2025 Oct. 9];4(04):33-59. Available from: https://www.theamericanjournals.com/index.php/tajet/article/view/6603