(1) Pradeep Rao Vennamaneni. Optimizing Cloud-Native LLM Workloads With Serverless GPU Orchestration and Token-Aware Scheduling. tajet 2024, 4, 33–59.