Roland Huß: Generative AI on Kubernetes, Paperback
Generative AI on Kubernetes
- Operationalizing Large Language Models
You can order this title now. It will be shipped to you as soon as it becomes available.
- Publisher: O'Reilly Media, 04/2026
- Binding: Paperback
- Language: English
- ISBN-13: 9781098171926
- Item number: 12452462
- Length: 404 pages
- Weight: 644 g
- Dimensions: 233 x 178 mm
- Thickness: 21 mm
- Release date: 7 April 2026
- Note: This item is not in German.
Blurb
Generative AI is revolutionizing industries, and Kubernetes has fast become the backbone for deploying and managing these resource-intensive workloads. This book serves as a practical, hands-on guide for MLOps engineers, software developers, Kubernetes administrators, and AI professionals ready to combine AI innovation with the power of cloud native infrastructure. Authors Roland Huß and Daniele Zonca provide a clear road map for training, fine-tuning, deploying, and scaling GenAI models on Kubernetes, addressing challenges like resource optimization, automation, and security along the way.
With actionable insights and real-world examples, readers will learn to tackle the opportunities and complexities of managing GenAI applications in production environments. Whether you're experimenting with large-scale language models or facing the nuances of AI deployment at scale, you'll uncover the expertise you need to operationalize this exciting technology effectively.
- Learn how to deploy LLMs more efficiently with optimized inference runtimes
- Get hands-on with GPU scheduling, including hardware detection and multinode scaling
- Monitor and understand LLM-specific metrics like Time to First Token and token throughput
- Know when to fine-tune a model or when retrieval augmentation is the better choice
- Discover how to evaluate models with standardized benchmarks before committing GPU resources
- Learn to run agentic applications with secure tool integration, identity management, and persistent state