FlightDeck
Project Concept
GenUI FlightDeck is a web UI and agentic experimentation system for building trustworthy generated interfaces. It turns every generated UI into a measurable experiment: agents create declarative UI Blueprints, validate them against an approved Catalog of trusted Components and the rules in frontend/design.md, deploy them as Variants on a product Surface, and learn from real interaction evidence.
The first visible Surface is an event-discovery experience, but the product is not just an events app. The event UI is the testbed for the larger FlightDeck loop: generate multiple interface approaches, choose the most relevant first action for a user task, test which approach works better for different behavior archetypes, and feed those results into a structured Reasoning Bank.
FlightDeck is designed around a controlled GenUI safety model. Agents should not invent executable frontend code at runtime. They produce declarative Blueprints that reference pre-approved Components, while the client renders those Components through a trusted renderer. The system should eventually align this model with A2UI-style payloads, AG-UI/CopilotKit runtime interaction, and LangChain/LangGraph orchestration.
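A minimal sketch of the Catalog check this safety model implies, assuming a Blueprint is a tree of nodes that name a Component and pass it props. The Catalog contents, the Node shape, and the validate_blueprint function are all hypothetical and used only for illustration; the project's actual schema is not specified here.

```python
from dataclasses import dataclass, field

# Hypothetical Catalog: approved Component name -> allowed prop names.
CATALOG = {
    "EventList": {"title"},
    "EventCard": {"title", "date", "cta_label"},
    "PrimaryButton": {"label", "action_id"},
}


@dataclass
class Node:
    """One declarative Blueprint node: data only, no executable code."""
    component: str
    props: dict = field(default_factory=dict)
    children: list = field(default_factory=list)


def validate_blueprint(node, catalog=CATALOG, path="root"):
    """Return a list of error strings; an empty list means every node is approved."""
    errors = []
    allowed = catalog.get(node.component)
    if allowed is None:
        errors.append(f"{path}: component '{node.component}' is not in the Catalog")
    else:
        # Reject any prop the Catalog does not list for this Component.
        for prop in sorted(set(node.props) - allowed):
            errors.append(f"{path}: prop '{prop}' is not allowed on {node.component}")
    for i, child in enumerate(node.children):
        errors.extend(validate_blueprint(child, catalog, f"{path}.children[{i}]"))
    return errors


# Illustrative data only.
good = Node("EventList", {"title": "This weekend"}, [
    Node("EventCard", {"title": "Jazz night", "date": "2025-05-17", "cta_label": "Save"}),
])
bad = Node("EventList", {"title": "This weekend"}, [
    Node("RawHtml", {"html": "<script>...</script>"}),
    Node("PrimaryButton", {"label": "Go", "onclick": "alert(1)"}),
])

print(validate_blueprint(good))
for error in validate_blueprint(bad):
    print(error)
```

In this sketch the renderer would only ever receive Blueprints whose error list is empty, so an agent cannot introduce an unapproved Component or an inline handler through a payload.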
The project does four things:
Generate UI Blueprints: Given a task, persona/archetype, prior telemetry, and design rules, the system creates controlled interface candidates for the highest-leverage action point.
Critique and validate them: A Critique Agent checks schema, Catalog usage, frontend/design.md, UX Laws, accessibility, copy clarity, motion, and experiment isolation before a Blueprint enters the Library.
Run live experiments: Experiments assign validated Blueprints as Variants, serve them on a Surface, and collect signals such as first action, task completion, backtracks, variant switches, latency, accessibility status, and feedback.
Turn evidence into improvement: The Reasoning Bank stores structured outcomes and proposed design rules, then future agents use that evidence to improve new Blueprints, reports, frontend/design.md, and possibly system prompts.
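The experiment and evidence steps above can be sketched as follows, assuming Variants are assigned by hashing the experiment and user identifiers and that each interaction is logged with an archetype, a Variant, and a task-completion flag. The function names, field names, and identifiers are hypothetical; the real assignment strategy and telemetry schema may differ.

```python
import hashlib
from collections import defaultdict


def assign_variant(experiment_id, user_id, variants):
    """Deterministically map a user to one Variant so repeat visits stay consistent."""
    digest = hashlib.sha256(f"{experiment_id}:{user_id}".encode()).hexdigest()
    return variants[int(digest, 16) % len(variants)]


def completion_rates(events):
    """Aggregate task-completion rate per (archetype, variant) pair."""
    totals = defaultdict(lambda: [0, 0])  # key -> [completed, seen]
    for event in events:
        key = (event["archetype"], event["variant"])
        totals[key][0] += 1 if event["task_completed"] else 0
        totals[key][1] += 1
    return {key: done / seen for key, (done, seen) in totals.items()}


# Illustrative data only.
variants = ["bp-list", "bp-map"]
events = [
    {"archetype": "explorer", "variant": "bp-list", "task_completed": True},
    {"archetype": "explorer", "variant": "bp-list", "task_completed": False},
    {"archetype": "planner", "variant": "bp-map", "task_completed": True},
]

print(assign_variant("exp-1", "user-42", variants))
print(completion_rates(events))
```

A summary like this is the kind of structured outcome a Reasoning Bank entry could hold. A real analysis would also need sample sizes and uncertainty estimates before recommending ship, iterate, or hold.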
The north-star product is an experimentation cockpit for generated interfaces. Designers and UXRs should get first-click evidence and design-rule recommendations. Developers should get schema, renderer, Catalog, latency, and replay information. PMs should get experiment status, uncertainty, and ship/iterate/hold recommendations. QA should get accessibility, regression, and broken-action evidence.
Entry
Status: Submitted
Last saved: May 16 at 5:14 PM -03
Team Roster
Luiz Henrique Simoes Team Lead RSVP Approved
Data Scientist, Designer, PM at N/A
Daniel Brito RSVP Approved
Data Engineer at -