Gemma 2, the next generation of Google Gemma models, is now optimized with TensorRT-LLM and packaged as an NVIDIA NIM inference microservice.
Experience and test Llama3-ChatQA models at scale with the performance-optimized NVIDIA NIM inference microservice through the NVIDIA API catalog.
As generative AI experiences rapid growth, the community has stepped up to foster this expansion in two significant ways: swiftly publishing state-of-the-art foundation models, and streamlining their integration into application development and production. NVIDIA is aiding this effort by optimizing foundation models to enhance performance, allowing enterprises to generate tokens faster…
At Google I/O 2024, Google announced Firebase Genkit, a new open-source framework for developers to add generative AI to web and mobile applications using models like Google Gemini and Google Gemma. With Firebase Genkit, you can build apps that integrate intelligent agents, automate customer support, use semantic search, and convert unstructured data into insights. Genkit also includes a developer UI…
Speakers from NVIDIA, Meta, Microsoft, OpenAI, and ServiceNow will discuss the latest tools, optimizations, trends, and best practices for large language models (LLMs).
Join us in person or virtually to learn about the power of RAG, with insights and best practices from experts at NVIDIA, visionary CEOs, data scientists, and others.