Intent Over Keywords: From Stitched Embeddings to a Production Fashion Agent

Production AI Engineering

 

Build, Break, and Ship Real Multimodal AI Systems

 

Most AI courses show you how to build something that works.

This project shows you why those same systems fail in production.

You will build three interconnected AI systems that evolve from a common problem:

How do you build a multimodal AI application that remains accurate, observable, and reliable when real users start interacting with it?

Along the way, you will encounter architectural decisions that seem correct at first, only to discover later that they introduce hidden failure modes.

Some of those failures are subtle.

Some are expensive.

And some completely change how you think about AI system design.


Project 1: The Architecture That Should Work

You begin by building a multimodal Video RAG system using separate models for:

  • Visual understanding

  • Audio understanding

  • Text understanding

On paper, the architecture looks reasonable.

In practice, something unexpected happens.

The system retrieves information.

The system generates answers.

Yet certain queries consistently produce results that feel "almost correct" but are not actually correct.

The deeper you investigate, the more you realize the problem is not prompting.

It is not chunking.

It is not the vector database.

It is something much more fundamental.

By the end of this project, you will understand why many multimodal architectures quietly fail without engineers noticing.


Project 2: The Redesign

Now rebuild the same system.

Fewer components.

Fewer moving parts.

Fewer opportunities for failure.

This time you will use a completely different retrieval philosophy.

The result is not just a cleaner architecture.

It fundamentally changes:

  • Retrieval behavior

  • Latency characteristics

  • Operational complexity

  • Failure patterns

Most importantly, you will learn how a single architectural decision can eliminate an entire category of bugs.

This project is where many students experience their biggest "aha" moment.


Project 3: StyleMe

The final project brings everything together.

You will build an AI fashion assistant capable of understanding:

  • Text

  • Images

  • Multi-turn conversations

But this is not just another chatbot.

Before retrieval happens, the system must decide:

  • What the user actually wants

  • Whether retrieval is even required

  • Which workflow should execute next

The agent learns to think before it acts.
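
As a rough illustration of that pre-retrieval decision, here is a minimal sketch of a router that picks a workflow before any search runs. Everything in it is an assumption made for illustration: the `Workflow` names, the `Turn` type, and the keyword cues are hypothetical, and the actual StyleMe agent may well make this decision with a language model or a trained classifier rather than string matching.

```python
from dataclasses import dataclass
from enum import Enum


class Workflow(Enum):
    """Hypothetical workflows the agent can choose between."""
    CHAT = "chat"                  # answer directly, no retrieval
    TEXT_SEARCH = "text_search"    # retrieve catalog items from a text query
    IMAGE_SEARCH = "image_search"  # retrieve items similar to an uploaded image


@dataclass
class Turn:
    """One user message; has_image marks an attached photo."""
    text: str
    has_image: bool = False


# Hypothetical cue phrases standing in for a real intent model.
SHOPPING_CUES = ("find", "show me", "looking for", "recommend", "similar")


def route(turn: Turn) -> Workflow:
    """Decide which workflow should execute next, before any retrieval."""
    text = turn.text.lower()
    if turn.has_image:
        # An attached image implies the user wants visually similar items.
        return Workflow.IMAGE_SEARCH
    if any(cue in text for cue in SHOPPING_CUES):
        # The message reads like a product request, so retrieval is needed.
        return Workflow.TEXT_SEARCH
    # Small talk or follow-up chatter: skip retrieval entirely.
    return Workflow.CHAT
```

With this sketch, `route(Turn("Find me a red summer dress"))` selects the text search workflow, while a message such as "Thanks, that helps!" goes straight to chat and never touches the retrieval layer. Keeping the decision in one small function also gives a natural place to log which workflow was chosen and why.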

As the project grows, new challenges emerge:

  • Observability

  • Evaluation

  • Tracing

  • Deployment

  • Reliability

These are the problems that separate demos from products.


What Makes This Different

 

Most AI projects stop when the answer looks correct.

This project starts there.

You will learn:

  • Why retrieval systems fail even when the vector search works

  • Why multimodal systems behave differently from text-only systems

  • How agents make decisions before tool execution

  • How to trace AI workflows across multiple components

  • How production teams evaluate AI systems beyond accuracy

  • What actually happens when these systems move from local development to cloud infrastructure


What You Will Walk Away With

 

By the end of this project, you will have built:

✓ A multimodal Video RAG system

✓ A unified retrieval architecture

✓ A production-ready AI agent

✓ End-to-end observability pipelines

✓ Cloud deployment workflows

More importantly, you will understand the reasoning behind every architectural decision.

Not just how to build the system.

Why it was built that way.

And why alternative approaches fail.


Who This Project Is For

 

This project is designed for developers who already know the basics of AI and want to understand what happens beyond tutorials.

If you have ever wondered:

  • Why does my RAG system retrieve irrelevant information?

  • When should I use multiple embedding models?

  • What does observability mean for AI applications?

  • How do production AI teams evaluate quality?

  • What separates a demo from a deployable product?

You will find those answers here.

And probably discover questions you didn't know to ask.
