Package Design Testing: What Changing One Thing on a Package Reveals

This article draws on insights gathered through package design testing across more than 30 eye tracking studies.

When a CPG brand redesigns its packaging, it sometimes changes everything at once. More often, it makes a targeted call: swap the background color, update the label architecture, or lead with a product image instead of a text claim. The logic is sound and matches research best practice: isolate the variable, then measure the impact.

That makes packaging redesign studies some of the richest sources of insight in eye tracking research. Each design variant is exposed to 100 shoppers independently, making every study a controlled experiment. When you look across a library of them, patterns emerge that no single study could surface on its own. And when you layer in competitive shelf audits, where a brand’s packaging is evaluated against real category neighbors, those patterns get a second test against the world as it exists on the shelf.

Over the past three years, Rich Insights has conducted more than 30 packaging studies across a wide range of brands and categories, representing over 6,000 individual shopper sessions. The categories span beverages, snacks, frozen foods, personal care, household goods, and specialty foods. The design changes tested range from a single background color swap to full aesthetic overhauls.

Taken together, this data tells a consistent story about what moves the eye on the shelf and what doesn’t. Here are some of the most durable lessons.

    1. Color is the fastest attention signal, and the performance gap is larger than most brands expect

    Across the study library, no single element has a more immediate effect on shelf standout than background or label color. Brands often make color decisions for brand equity or aesthetic reasons. The data repeatedly shows that those decisions carry a heavier shelf performance cost than most anticipate.

    The effect shows up with striking clarity when the same package is tested with only a color change. In one study, switching a product’s background from beige to white with no other changes moved noticeability from 75% to 83% and first-seen performance from 14% to 23%. That is a nine-percentage-point swing in first-seen from a single design decision: more people noticed the pack overall, and more of them noticed it first.
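
The arithmetic behind those figures is worth making explicit, because a change in percentage points and a relative change are easy to confuse. The sketch below is illustrative only: the `lift` helper is a hypothetical name, not part of any study tooling, and the inputs are simply the percentages quoted above.

```python
def lift(before_pct: float, after_pct: float) -> tuple[float, float]:
    """Return (percentage-point change, relative change in percent)."""
    point_change = after_pct - before_pct
    relative_change = point_change / before_pct * 100
    return point_change, relative_change


# Percentages quoted above for the beige-to-white background swap.
noticed_points, noticed_relative = lift(75, 83)
first_seen_points, first_seen_relative = lift(14, 23)

print(f"Noticeability: {noticed_points:+.0f} points ({noticed_relative:+.1f}% relative)")
print(f"First seen:    {first_seen_points:+.0f} points ({first_seen_relative:+.1f}% relative)")
```

Read this way, the nine-point gain in first-seen is roughly a 64% relative increase over the beige baseline, while the eight-point gain in noticeability is a relative increase of about 11%.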

    Color distinctiveness is not an aesthetic preference; it is a functional asset with performance consequences that can be quantified and tested. Aim to own a unique color or palette in your category. Package design testing is the only reliable way to know which color position is actually available in your specific category.

    2. Food photography and flavor imagery are the most reliable standout and purchase drivers

    Across every food and beverage study in this library, designs that led with clear, high-quality product photography or flavor imagery outperformed those that carried the flavor story through typography, abstract color coding, or descriptive text. This finding holds across categories and shelf environments.

    In one beverage study testing three full design approaches, the design that replaced abstract color-coded packaging with photorealistic fruit photography was noticed by 91% of shoppers in 1.0 seconds, was seen first by 31%, and captured 12% of purchases. The two designs without product photography ranged from 82–84% noticed, with first-seen rates under 16% and purchase rates of 5–6%.

    Competitive shelf data consistently identifies food photography as a primary driver of both standout and purchase conversion. In frozen food, snack, and beverage categories alike, brands with the largest, clearest product images tended to capture disproportionate purchases relative to their shelf space. One study found that a brand featuring prominent food imagery outperformed a much larger-footprint competitor on purchases, suggesting image quality was doing more work than shelf real estate. Smaller brands take note!
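
One way to express "disproportionate purchases relative to shelf space" is to divide a brand's share of purchases by its share of shelf facings. The sketch below assumes that definition; the `share_index` helper, the brand names, and every number in it are hypothetical illustration values, not data from the studies.

```python
def share_index(purchase_share: float, shelf_share: float) -> float:
    """Purchase share divided by shelf share.

    A value above 1.0 means the brand wins more purchases than its
    shelf footprint alone would predict.
    """
    if shelf_share <= 0:
        raise ValueError("shelf_share must be positive")
    return purchase_share / shelf_share


# Hypothetical shelf: facings and purchase counts are made up.
brands = {
    "Brand A (large footprint)": {"facings": 12, "purchases": 25},
    "Brand B (prominent food imagery)": {"facings": 4, "purchases": 30},
    "Brand C": {"facings": 8, "purchases": 45},
}
total_facings = sum(b["facings"] for b in brands.values())
total_purchases = sum(b["purchases"] for b in brands.values())

indices = {
    name: share_index(b["purchases"] / total_purchases, b["facings"] / total_facings)
    for name, b in brands.items()
}
for name, idx in sorted(indices.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: index {idx:.2f}")
```

In this made-up example the small-footprint brand with prominent imagery scores well above 1.0 while the large-footprint brand scores below it, which is the pattern the audits describe.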

    Shoppers make fast, pattern-matching decisions on the shelf. A visible, appetizing product image dramatically lowers the cognitive load of that decision. Designs that make a shopper work harder to understand what something looks or tastes like are starting at a structural disadvantage, regardless of how good the product actually is. This is one of the most actionable and consistent findings across our package design testing library.

    3. A clear viewing path outperforms a dense one, even when the dense design has more to say

    One of the most consistent findings across my library is that designs with a simpler, more ordered visual architecture (fewer equally weighted elements competing for attention) outperform cluttered designs, even when the cluttered design carries more information. This holds in redesign studies, competitive audits, and direct element-level comparisons.

    In one beverage study, reorganizing an existing label into a strict top-to-bottom visual hierarchy, without adding or removing any elements, reduced the total time for shoppers to process all label content from 1.7 to 1.5 seconds. In a shelf encounter lasting just a few seconds, faster information transfer is a meaningful advantage.

    In another study, two versions of a functional beverage were tested. Version 1 achieved a higher standout score on the shelf. Version 2 was purchased 2.5 times more often because it required less visual effort to navigate. It also increased the noticeability of critical elements such as flavor and ingredients. Standout got shoppers to look; clarity got them to buy.

    A personal care brand tested three label versions of the same product. The winning design replaced descriptive text with functional icons and simplified the hierarchy. It outperformed both predecessors on every measured element: brand recognition, message noticeability, and purchase intent. A household goods study reached the same conclusion: icons and contextual in-use imagery outperformed text-based benefit summaries on every metric, and the cleaner design was rated as representing higher quality. Shoppers can process icons and images much faster than text.

    A useful diagnostic: when engagement time is high but purchase rate is low, the cause is almost always a design that’s asking the eye to do too much. Long engagement on a confusing design is not the same as long engagement on a compelling one.
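
That diagnostic can be turned into a simple screening rule. The sketch below is a minimal illustration under stated assumptions: the `DesignResult` record, the `flag_overloaded` helper, both thresholds, and all sample values are hypothetical, and real cutoffs would need to come from your own category benchmarks.

```python
from dataclasses import dataclass


@dataclass
class DesignResult:
    name: str
    engagement_seconds: float  # average time spent looking at the pack
    purchase_rate: float       # share of shoppers who chose it, 0.0 to 1.0


def flag_overloaded(results, min_engagement=2.0, max_purchase_rate=0.06):
    """Return names of designs with long engagement but weak conversion.

    Both thresholds are hypothetical placeholders, not study benchmarks.
    """
    return [
        r.name
        for r in results
        if r.engagement_seconds >= min_engagement
        and r.purchase_rate <= max_purchase_rate
    ]


# Made-up results for three label variants.
results = [
    DesignResult("Dense label", engagement_seconds=2.6, purchase_rate=0.05),
    DesignResult("Clear hierarchy", engagement_seconds=2.4, purchase_rate=0.12),
    DesignResult("Minimal", engagement_seconds=1.1, purchase_rate=0.04),
]
print(flag_overloaded(results))
```

Only the dense label is flagged here: the clear-hierarchy variant holds attention just as long but converts, and the minimal variant has a different problem (it is not holding attention at all).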

    4. Benefit claims are systematically under-noticed across every category

    Across every category and study type, functional benefit claims, whether ingredient-based, certification-based, or format-based, are the most consistently under-noticeable elements on packaging. This holds for redesign studies, competitive audits, and direct claim testing.

    In one study testing multiple design variants specifically to improve the visibility of a functional ingredient claim, none of the variants cracked 22% findability, regardless of where or how the claim was placed. In another, a benefit categorization system built around distinct need states was tested across two shelf environments; shoppers simply didn’t notice the categorical callouts in either context.

    A single-pack element study offered particularly clear quantification: with 3 seconds of exposure, only about one-third of shoppers noticed any given benefit callout. The pack had a well-designed, flavor-forward viewing path, which was working as intended. But protein content, allergen certifications, and sugar callouts were each noticed by roughly 30–35% of shoppers in that window.

    Competitive shelf audits tell the same story from the outside. Across multiple audits, secondary claims (organic certifications, allergen callouts, functional ingredient lists, specialty badges) consistently register with fewer than 20% of shoppers.

    The gap between what shoppers say matters and what their eyes actually register on the shelf is a structural feature of the environment. Benefit claims need to earn their position through size, contrast, and placement. If a benefit is the main purchase trigger or brand distinction, then it should be treated as such. Package design testing with dedicated findability tasks is the only way to close that gap with confidence.

    5. The design that gets noticed most isn’t always the one that converts best

    A recurring finding is that the highest-noticeability design variant is not necessarily the highest-purchase variant. There is a difference between drawing the eye and prompting a decision.

    In one functional beverage study, the design with the higher standout score was outsold by its simpler counterpart, which was purchased 2.5 times more often. In a snack packaging study, the fastest-noticed design closed purchases quickly; the most-noticed design captured slightly more purchases but at significantly longer decision times.

    Several audits showed brands overperforming in attention relative to their shelf footprint while underperforming in purchase conversion because engagement ran into design complexity, rather than landing on a clear purchase signal.

    For brands evaluating competing design options, looking at the full purchase funnel (noticed, first seen, engaged, purchased, time to purchase) can tell a fundamentally different story than anchoring on any single metric.
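
To see how the funnel view can disagree with a single metric, it helps to ask which design wins each stage separately. The sketch below is illustrative only: the design names, every value in `funnel`, and the `winner_by_metric` helper are hypothetical and do not come from any of the studies.

```python
# Made-up funnel results for three competing designs.
funnel = {
    "Design A": {"noticed": 0.91, "first_seen": 0.22, "engaged": 0.48,
                 "purchased": 0.07, "seconds_to_purchase": 6.5},
    "Design B": {"noticed": 0.84, "first_seen": 0.28, "engaged": 0.40,
                 "purchased": 0.11, "seconds_to_purchase": 4.2},
    "Design C": {"noticed": 0.82, "first_seen": 0.15, "engaged": 0.52,
                 "purchased": 0.06, "seconds_to_purchase": 7.0},
}

# For time-based metrics, a smaller number is the better result.
LOWER_IS_BETTER = {"seconds_to_purchase"}


def winner_by_metric(data):
    """Map each funnel metric to the design that performs best on it."""
    metrics = next(iter(data.values())).keys()
    winners = {}
    for metric in metrics:
        pick = min if metric in LOWER_IS_BETTER else max
        winners[metric] = pick(data, key=lambda name: data[name][metric])
    return winners


for metric, design in winner_by_metric(funnel).items():
    print(f"{metric}: {design}")
```

In this made-up data, the most-noticed design is not the one that wins on purchases or on time to purchase, which is exactly the kind of split a single headline metric would hide.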

    6. Shelf context shapes performance; the same design wins differently depending on what’s next to it

    Several studies tested identical designs across different shelf environments, and the performance differences were large enough to matter strategically. Competitive context is not a secondary consideration in packaging design; it should be a primary one. This is where I see good and great designers diverge: a good designer creates a unique design that represents the brand, while a great designer also considers how it will be merchandised.

    In one energy drink study, three colorway variants were tested on two different shelf sets. The darkest, most high-contrast design dominated one shelf: most noticed, fastest recognized, highest purchase rate. On the second shelf, with a different competitive mix, that same design dropped significantly, while a lighter, more consistent design held its performance across both environments. The design optimized for one context didn’t hold its advantage in another.

    In a functional beverage study, a brand’s absolute noticeability improved when moving from a soda shelf to a functional shelf. But its competitive rank fell, and its purchase share dropped, because the functional shelf is populated by brands with stronger visual purchase signals and more established category cues. Better absolute metrics, worse relative outcome.
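
The difference between an absolute score and a competitive rank is easy to show with a small calculation. The sketch below uses hypothetical noticeability scores and brand names throughout; `rank_of` is an illustrative helper, and nothing here reproduces the study's actual data.

```python
def rank_of(brand, shelf):
    """1-based rank of `brand` by noticeability among everything on the shelf."""
    ordered = sorted(shelf, key=shelf.get, reverse=True)
    return ordered.index(brand) + 1


# Made-up noticeability scores (share of shoppers who noticed each brand).
soda_shelf = {"Test brand": 0.70, "Soda X": 0.66, "Soda Y": 0.61, "Soda Z": 0.55}
functional_shelf = {"Test brand": 0.76, "Func X": 0.88, "Func Y": 0.83, "Func Z": 0.72}

for label, shelf in [("soda", soda_shelf), ("functional", functional_shelf)]:
    score = shelf["Test brand"]
    rank = rank_of("Test brand", shelf)
    print(f"{label} shelf: noticed by {score:.0%}, rank {rank} of {len(shelf)}")
```

In this example the test brand's own score rises when it moves to the functional shelf, yet it slips from first to third because its new neighbors score higher still.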

    A color that owns a category in one retail environment may be common in another. An aesthetic that reads as premium in one category may read as generic in another. Testing on the actual competitive shelf is not optional for brands that want to create the most successful design possible.

    7. Small changes produce larger effects than most brands expect

    Across most of my studies, the impact of targeted, single-element changes is consistently larger than most brands expect. The data makes a strong case for testing incremental changes seriously, not just sweeping redesigns. When I worked with larger, more established brands, I saw this time and again: small changes create a butterfly effect that can drastically improve standout, understandability, or both.

    A single background color change: +8 percentage points of noticeability and +9 percentage points of first-seen. A label hierarchy reorganization with no elements added or removed: 2.5 times more purchases in one study. Replacing descriptive text with icons: improved standout for every measured element on the label. Swapping a concept illustration for a food photograph: all key elements on the pack became more noticeable, not just the image itself.

    Before any content is processed, the eye performs a rough, rapid triage, evaluating in fractions of a second whether something deserves a closer look. Color, imagery, and visual simplicity answer that question before a shopper has read a word. Everything else happens after.

    The most reliable lever package design testing reveals: making the first decision easier improves everything downstream. That principle is consistent across all 6,000+ shopper sessions and three years of research.

    What This Means for Package Design Testing

    Package design testing yields the most actionable results when you look at macro-level and element-level data together. This is also what makes eye tracking so valuable for testing a design. Macro-level shelf metrics (noticeability, time to notice, and purchase rate) tell you which design wins; front-panel element tracking tells you why. Both layers are needed to translate a result into an actionable design recommendation.

    Test claims directly if they matter to the purchase. Studies that measure overall noticeability won’t detect whether a specific functional claim is registering. If an attribute drives purchase intent in survey data, it warrants a dedicated findability task in eye tracking. The gap between claim importance and claim visibility is consistent and large enough to warrant testing.

    Define what’s being optimized before the study runs. A design that maximizes noticeability may not maximize purchase rate. A design that wins on first-seen may convert more slowly. These trade-offs are real and resolvable, but only if properly tested.

    Test against the actual competitive shelf. The same design will perform differently depending on what’s next to it. A concept test against a blank background tells you something. A shelf test against actual category competitors tells you what you need to know to decide.

    Rich Insights conducts eye tracking and shopper research for CPG brands evaluating packaging design, shelf placement, and purchase drivers. This article draws on more than 30 studies conducted between 2023 and 2026, representing over 6,000 individual shopper sessions across food, beverage, personal care, and household goods categories. Eye tracking data is collected using Tobii webcam-based technology, enabling remote shelf simulation studies at scale.