Emanuele Fabbiani
Engineer, researcher, entrepreneur. Emanuele earned his PhD in AI by researching time series forecasting in the energy field. He was a guest researcher at EPFL Lausanne, and he is now the Head of AI at xtream, where he solves business problems with AI. He has published 8 papers in international journals and has presented at, and organised tracks and workshops for, 20+ international conferences, including AMLD Lausanne, ODSC London, WeAreDevelopers Berlin, PyData Berlin, PyData Paris, and PyCon Florence. He has lectured in Italy, Switzerland, and Poland.
Head of AI
xtream
Session
We have all learnt the basic modules of system design: relational databases, blob storage, API gateways, and distributed logging are familiar concepts. However, the rise of Generative AI applications brings a new set of challenges, along with new tools to solve them.
In this talk, we will explore the new modules essential for developing Generative AI (GenAI) applications, addressing their unique challenges.
We will begin by explaining why Large Language Models (LLMs) alone are not enough to build an application. Next, we'll introduce guardrails, which are crucial for sanitizing both the input and output of LLMs and diffusion models. We'll then cover prompt compression techniques, designed to reduce the cost of generation.
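To make the guardrail idea concrete, here is a minimal sketch of input and output checks wrapped around a model call. It assumes a simple deny-list and an email-redaction rule; the names `check_input`, `check_output`, `guarded_call`, and `BLOCKED_PHRASES` are hypothetical, and the `model` argument stands in for any function that maps a prompt to a completion. Production guardrails typically rely on dedicated classifiers rather than regular expressions.

```python
import re
from dataclasses import dataclass
from typing import Callable

# Illustrative rules only: a rough email pattern and a tiny deny-list.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
BLOCKED_PHRASES = ("ignore previous instructions",)  # hypothetical deny-list


@dataclass
class GuardrailResult:
    allowed: bool
    text: str
    reason: str = ""


def check_input(prompt: str) -> GuardrailResult:
    """Reject prompts containing a blocked phrase, redact emails otherwise."""
    lowered = prompt.lower()
    for phrase in BLOCKED_PHRASES:
        if phrase in lowered:
            return GuardrailResult(False, "", f"blocked phrase: {phrase}")
    return GuardrailResult(True, EMAIL_RE.sub("[EMAIL]", prompt))


def check_output(completion: str) -> GuardrailResult:
    """Redact emails that the model may have produced."""
    return GuardrailResult(True, EMAIL_RE.sub("[EMAIL]", completion))


def guarded_call(prompt: str, model: Callable[[str], str]) -> str:
    """Run the input check, call the model, then run the output check."""
    checked = check_input(prompt)
    if not checked.allowed:
        return "Request refused."
    return check_output(model(checked.text)).text
```

The model only ever sees the sanitized prompt, and the caller only ever sees the sanitized completion, which is the property a guardrail layer is meant to provide.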
Following this, we'll present prompt registries, which enable over-the-air A/B testing and updates of prompts, and discuss the role of observability, highlighting tools such as OpenLLMetry, a new standard based on OpenTelemetry.
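A prompt registry with A/B testing could look roughly like the sketch below. It is an in-memory stand-in, assuming that a real registry would be a remote service the application queries at runtime; the `PromptRegistry` class, its methods, and the weighting scheme are hypothetical. Users are assigned to a variant by hashing their ID, so the same user always gets the same prompt.

```python
import hashlib
from string import Template


class PromptRegistry:
    """In-memory stand-in for a remote prompt registry (hypothetical API)."""

    def __init__(self):
        # prompt name -> list of (variant_id, Template, weight)
        self._variants = {}

    def register(self, name, variant_id, template, weight=1.0):
        """Add or extend a prompt with a new weighted variant."""
        self._variants.setdefault(name, []).append(
            (variant_id, Template(template), weight)
        )

    def get(self, name, user_id):
        """Pick a variant deterministically from a hash of the user ID."""
        variants = self._variants[name]
        total = sum(weight for _, _, weight in variants)
        digest = hashlib.sha256(f"{name}:{user_id}".encode()).digest()
        # Map the first 8 bytes of the hash to a point in [0, total).
        point = int.from_bytes(digest[:8], "big") / 2**64 * total
        cumulative = 0.0
        for variant_id, template, weight in variants:
            cumulative += weight
            if point < cumulative:
                return variant_id, template
        return variants[-1][0], variants[-1][1]


registry = PromptRegistry()
registry.register("summarise", "a", "Summarise this text: $text")
registry.register("summarise", "b", "Write a short summary of: $text")

variant_id, template = registry.get("summarise", user_id="user-42")
prompt = template.substitute(text="GenAI system design")
```

Logging `variant_id` alongside each response is what makes it possible to compare variants later; because prompts live in the registry, they can be changed without redeploying the application.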
We'll also examine how to enhance LLM capabilities with function calls, allowing LLMs to act as agents and communicate with external systems.
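The application side of function calling can be sketched as a small tool registry plus a dispatcher. The sketch assumes the model emits a JSON message with `name` and `arguments` fields; this format, the `tool` decorator, the `dispatch` function, and the stubbed `get_weather` tool are all hypothetical, and actual providers each define their own message schema.

```python
import json

# Registry of functions the model is allowed to call (hypothetical design).
TOOLS = {}


def tool(fn):
    """Register a function under its own name so the model can request it."""
    TOOLS[fn.__name__] = fn
    return fn


@tool
def get_weather(city: str) -> str:
    # Stubbed result; a real tool would query an external system.
    return f"Sunny in {city}"


def dispatch(model_message: str) -> str:
    """Parse a tool call emitted by the model, run it, and return JSON."""
    call = json.loads(model_message)
    fn = TOOLS.get(call["name"])
    if fn is None:
        return json.dumps({"error": f"unknown tool: {call['name']}"})
    result = fn(**call["arguments"])
    return json.dumps({"result": result})


# Example: the string below stands in for a model's tool-call output.
reply = dispatch('{"name": "get_weather", "arguments": {"city": "Pavia"}}')
```

The returned JSON would normally be appended to the conversation and sent back to the model, which then produces the final answer. Restricting calls to registered functions keeps the model from invoking arbitrary code.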
During the discussion, we will show code examples in Python.
By the end of this talk, attendees will have a clear understanding of the main modules used in building GenAI applications and gain valuable insights into designing their own.