Multi-flag experiment capability in Feature Experimentation (cross-flag MVT / interleaved testing)
Customer: Inspire Brands
Request: Multi-flag experiment capability in Feature Experimentation (cross-flag MVT / interleaved testing)
Problem:
Inspire Brands uses Feature Experimentation to manage the testing program across their quick-service restaurant (QSR) digital ordering funnel. They deploy individual feature flags for distinct features on the same page. For example, their menu page has separate flags controlling a search feature, a product carousel, and filters. Over time, multiple independent flags accumulate on a single page or funnel area, and the team needs to understand which combination of those flags produces the best user experience.
Today, the only way to test combinations of features across flags is to consolidate them into a single feature flag with multiple variables and variations, which essentially rebuilds what already exists. This creates significant engineering rework and produces oversized flags that lose meaning after the test concludes. It also doesn't reflect how teams actually build: features ship incrementally on independent flags, and the need to test combinations emerges later, based on real-time data.
The customer is scaling from 2 testing pods to 7–10 and currently runs tests one after another to avoid cross-test contamination, so a backlog of 4 tests at 2 weeks each takes 8 weeks to complete. They want to run factorial/combination tests across existing flags without rebuilding them under a single flag.
Desired capability:
The ability to create an experiment at the project level that spans multiple feature flags — selecting existing flags and their variations, and automatically generating (or manually defining) the combinations to test. This would function similarly to how Web Experimentation's MVT uses sections and combinations, but applied to Feature Experimentation flags.
Example use case:
Flag 1 (Menu Search): Variation A / Variation B
Flag 2 (Product Carousel): Variation C / Variation D
Desired experiment: Test A/C, A/D, B/C, B/D as distinct experiences with unified reporting on which combination drives the best outcome.
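To make the example concrete, here is a minimal Python sketch of how the combinations could be generated from existing flags and how a user could be assigned to one of them. It is illustrative only and does not use any Feature Experimentation SDK call. The flag keys (menu_search, product_carousel), the function names, the experiment key, and the equal-allocation hash bucketing are all assumptions made for this sketch.

```python
import hashlib
from itertools import product

# Hypothetical flag keys; variation names are taken from the example above.
flags = {
    "menu_search": ["A", "B"],
    "product_carousel": ["C", "D"],
}


def generate_combinations(flags):
    """Return the full-factorial set of combinations, one dict per experience."""
    keys = list(flags)
    return [dict(zip(keys, combo)) for combo in product(*(flags[k] for k in keys))]


def assign_combination(user_id, experiment_key, combinations):
    """Deterministically map a user to one combination (equal allocation assumed)."""
    digest = hashlib.sha256(f"{experiment_key}:{user_id}".encode()).hexdigest()
    return combinations[int(digest, 16) % len(combinations)]


combinations = generate_combinations(flags)
for combo in combinations:
    print("/".join(combo.values()))  # prints A/C, A/D, B/C, B/D on separate lines

# The same user always lands in the same combination for a given experiment key.
assigned = assign_combination("user-123", "menu_page_mvt", combinations)
```

The number of combinations is the product of the variation counts, so adding the filters flag with two variations would raise the count from 4 to 8. That growth is one reason the request includes the option to define combinations manually rather than always running the full factorial.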
Business impact:
Inspire Brands plans to scale to 7–10 product teams running concurrent experiments by Q4 2026. Without this capability, they face a choice between costly flag consolidation rework or sacrificing the ability to understand interaction effects across features. This is a blocker to their experimentation maturity and a potential retention risk as their program scales. Their engineering lead (Andy Ades) has volunteered to be a technical POC if the product team has questions about feasibility or implementation on the client side.