The UX Mental Model Score: An AI-driven evaluation framework

As a UX researcher, I’ve often felt our current toolbox is missing something crucial—methods that effectively bridge the gap between specific usability metrics and general satisfaction scores. Working with AI tools like Claude and ChatGPT, I developed the Mental Model Score (MMS) framework—a theoretical system that uniquely takes a contextual approach, evaluating applications from the user’s context rather than focusing solely on application characteristics. This framework addresses how users’ mental models, shaped by what they think is happening rather than what actually is, create critical gaps in our understanding. This isn’t just another case study of AI assistance; it’s the story of how collaborative intelligence created something that fills a genuine methodological need.

The gap in our UX research toolbox

The more I’ve worked in UX research, the more I’ve noticed a peculiar gap in our methodological toolbox. On one hand, we have highly specific tools that measure discrete interactions (time on task, error rates, eye tracking). On the other, we have general satisfaction metrics (NPS, CSAT, SUS) that provide overall scores but little diagnostic insight.

What’s missing is a middle ground—a framework that captures users’ internal representations of a system while acknowledging the critical balance between gain and pain in their experiences. Is the pain worth the gain? This question, similar to the concept of brand equity in marketing (the associated values toward a labeled experience), often gets lost in our current methods.

The challenge is particularly acute because traditional UX evaluation methods like heuristic evaluations and the System Usability Scale (SUS) primarily focus on the inherent characteristics of the application itself. They ask: “Does this application follow established design principles?” or “How usable is this system according to standardized criteria?” While valuable, these approaches often miss the critical contextual dimension of how users experience applications within their specific usage environments and in comparison to alternatives.

This insight led me to envision a new framework—one that would measure not just what users do or say, but how they internally represent systems and the balance between perceived value and effort, all within their unique usage contexts.

 

The contextual advantage: What makes MMS different

What sets the MMS framework apart from traditional UX evaluation methods is its contextual approach. Rather than evaluating an application in isolation against fixed standards, MMS considers the user’s entire ecosystem:

    1. Comparative evaluation: MMS enables comparison between different tools and substitutes by measuring mental model alignment from the user’s perspective, not just against abstract usability standards.
    2. Contextual understanding: The framework acknowledges that a user’s experience is shaped by their specific environment, previous tool experiences, and the alternatives available to them.
    3. User-centered rather than application-centered: While methods like heuristic evaluation focus on whether an application meets predefined criteria, MMS focuses on how well an application aligns with users’ contextual mental models.
    4. Ecosystem awareness: Traditional methods might rate an application highly on usability scales, yet miss that it fails to integrate with users’ broader tool ecosystem.

This contextual approach makes MMS particularly valuable for comparing different solutions within the same problem space and understanding why users might prefer a technically “inferior” product that better matches their mental models.

 

The collaborative creation process

With this challenge in mind, I turned to AI as a thought partner. Here’s how our collaboration unfolded:

    1. Concept exploration: I shared my observations about the gap in UX methodologies with Claude, explaining how users often judge experiences through a balance of gain versus pain within their specific contexts. The AI helped structure these intuitions into potential measurement frameworks.
    2. Framework refinement: Through iterative discussions with ChatGPT, we explored how to quantify users’ internal representations within their usage contexts, eventually settling on five key components:
        • Effort (E): The perceived cognitive and physical work users expend
        • Trust (T): Users’ confidence in the system’s reliability and intentions
        • Expectation Alignment (X): The gap between anticipated and actual behavior
        • Impact (I): How significantly misalignments affect the user’s gain/pain balance
        • Concern Factors (C): Specific anxiety points that weigh on the experience
    3. Formula development: Working with Claude’s logical reasoning capabilities, we created a mathematical representation: MMS = (w₁E + w₂T + w₃X + w₄I) – w₅C, essentially calculating whether the gains (positive factors) outweigh the pains (concerns).
    4. Insight generation logic: Perhaps most valuable was the AI’s help in creating interpretive frameworks—understanding what different score patterns reveal about users’ internal representations within their usage contexts.
    5. Calculator implementation: Finally, ChatGPT helped develop the actual HTML/CSS/JavaScript code for a functional MMS Calculator, transforming the theoretical framework into a testable tool.
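To make the formula concrete, here is a minimal JavaScript sketch of the weighted scoring logic (the calculator itself was built in HTML/CSS/JavaScript). The function name and the equal default weights are illustrative assumptions, not the calculator’s actual code:

```javascript
// Illustrative sketch of MMS = (w1·E + w2·T + w3·X + w4·I) − w5·C.
// Equal default weights are a placeholder assumption; the framework
// leaves the weighting scheme open to tuning.
function mentalModelScore(ratings, weights = { e: 1, t: 1, x: 1, i: 1, c: 1 }) {
  const { effort, trust, expectation, impact, concern } = ratings; // each rated 1–5
  return (
    weights.e * effort +
    weights.t * trust +
    weights.x * expectation +
    weights.i * impact -
    weights.c * concern
  );
}

// Example: mostly positive ratings with one moderate concern
console.log(mentalModelScore({ effort: 4, trust: 5, expectation: 4, impact: 3, concern: 2 })); // 14
```

With equal weights the score simply asks whether the summed positives outweigh the concern term, which mirrors the gain-versus-pain framing above.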

    Throughout this process, the AI wasn’t just a technical assistant but a conceptual collaborator, helping bridge the gap between vague observations and structured methodology.

     

    The framework in theory

    The Mental Model Score (MMS) framework aims to capture the balance between gain and pain in user experiences, based on how users internally represent systems within their specific contexts. Here’s how it works in theory:
    After conducting user research sessions, researchers input findings into the MMS Calculator, assigning ratings from 1-5 for each component:

      • User Effort: How much work users perceive in accomplishing tasks (1: High effort, 5: Low effort)
      • Trust: Users’ confidence in the system (1: No trust, 5: High trust)
      • Expectation Alignment: How well system behavior matches what users anticipate (1: Large mismatch, 5: Perfect match)
      • Impact: How significantly any misalignment affects the gain/pain balance (1: Minimal impact, 5: Severe impact)
      • Concern Factors: Specific worries that weigh on the experience (1: No concern, 5: Severe concern)

    Rather than judging a system against abstract standards, the MMS framework deliberately focuses on measuring how users perceive and internally represent their experiences within their specific contexts—recognizing that these contextual mental models, not objective application characteristics, drive user behavior and satisfaction.

    The calculator produces an overall score that represents whether the perceived gains outweigh the perceived pains, along with detailed insights into where mental models align or diverge from system reality within the user’s specific context.

     

    Beyond the numbers: The potential value

    The MMS framework’s potential value lies in how it bridges the gap between specific and general UX metrics while enabling contextual comparison:

      1. Capturing the Gain/Pain Balance: Like brand equity in marketing, MMS aims to quantify whether users feel the value gained is worth the effort invested.
      2. Enabling comparison across solutions: Unlike methods that evaluate applications in isolation, MMS allows for meaningful comparison between different tools and alternatives by measuring from the user’s contextual perspective.
      3. Respecting mental models: Rather than focusing solely on objective metrics, MMS acknowledges that users interact with systems as they believe them to be, not as they actually are.
      4. Avoiding rationalization traps: By focusing on internal representations rather than specific solution evaluations, MMS may help users express their true mental models rather than post-hoc rationalizations.
      5. Contextual insight: While traditional methods might tell you if an application is “good” according to established principles, MMS aims to tell you if it’s “right” for users in their specific contexts.

    I’m looking forward to testing whether this framework actually delivers these potential benefits in real-world scenarios.

     

    The future of human-AI collaboration in UX

    Creating the MMS framework has shown me how AI can help bridge the gap between observation and methodology in UX research. While AI can’t replace human intuition about user psychology, it excels at:

        • Pattern recognition: Identifying relationships between components that shape users’ internal representations
        • Framework building: Translating vague observations into structured measurement systems
        • Implementation: Rapidly converting theoretical constructs into testable tools

    This collaboration exemplifies how humans can identify gaps in existing approaches while AI helps formalize solutions that might otherwise remain intuitive but unstructured.

     

    Next steps

    With the framework defined and the calculator built, I’m now preparing to test whether the MMS approach actually captures users’ internal representations better than existing methods. I plan to:

      1. Apply the framework to upcoming user research projects
      2. Compare MMS findings with traditional UX metrics to identify unique insights
      3. Assess whether the contextual approach provides valuable comparative insights between different tools
      4. Refine the framework based on how well it captures users’ genuine mental models within their specific contexts

    I believe this approach has potential to address the methodological gap I’ve observed, but only real-world testing will determine whether it truly helps us understand how users internally represent systems within their unique usage contexts.

    Disclaimer

    The Mental Model Score (MMS) framework is currently a theoretical approach to evaluating users’ internal representations of systems and the balance between gain and pain in their experiences. It has not yet been tested in real-world scenarios and has not undergone formal validation. The framework and calculator are provided as experimental tools that will require significant testing and refinement.
    All insights and recommendations generated by the calculator should be considered speculative until validated through practical application. The collaborative AI process described represents my personal experience in framework development.

    THE STIMULUS EFFECT | Podcasts

    You can listen to the Stimulus Effect Podcasts on Spotify now!

    Mental Model Score (MMS) Calculator

    Calculate how well users' mental models align with your system design using the MMS framework. Higher scores indicate better alignment and user experience.

    Positive Factors

    1: High Effort 5: Low Effort
    1: No Trust 5: High Trust
    1: Large Mismatch 5: Perfect Match
    1: No Impact 5: Significant Impact

    Concern Factors

    Add the concerns that affect user experience:

    1: No Concern 5: Severe Concern

    MMS Calculation

    The Mental Model Score (MMS) is calculated using this formula:

    MMS = (Effort + Trust + Expectation + Impact) - (Average of Concern Factors)

    Higher MMS indicates stronger mental model alignment and better user experience.
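As an illustrative sketch (not the calculator’s actual source code), this unweighted formula can be expressed as:

```javascript
// Sketch of the calculator's stated formula:
// MMS = (Effort + Trust + Expectation + Impact) − (average of concern factors).
// Function name and input shape are illustrative assumptions.
function calculatorMms(effort, trust, expectation, impact, concerns) {
  const positives = effort + trust + expectation + impact;
  const avgConcern = concerns.length
    ? concerns.reduce((sum, c) => sum + c, 0) / concerns.length
    : 0; // no concerns entered → no penalty
  return positives - avgConcern;
}

// Example: four positive ratings plus two concern factors rated 3 and 5
console.log(calculatorMms(4, 4, 3, 4, [3, 5])); // 11
```

Averaging the concern factors (rather than summing them) keeps the penalty on the same 1–5 scale as the positive components, however many concerns a researcher adds.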

    Measure your mind’s digital defense

    How susceptible is your mind to algorithmic influence? The new Cognitive Resilience Diagnostic (CRD) offers a practical way to discover your personal cognitive vulnerabilities and strengths in the digital landscape. This self-assessment tool provides insights that can help you navigate our increasingly AI-mediated world with greater autonomy.

    Introducing the Cognitive Resilience Diagnostic (CRD)

    Imagine having a fitness tracker—but for your mind. The CRD tool quantifies your Cognitive Resilience Level by evaluating key dimensions such as focused attention, emotional regulation, and your ability to process and filter digital information. Using an innovative formula, it provides you with a personalized Cognitive Resilience Score (CRS) that highlights both your strengths and areas where you can enhance your mental defenses.

    You might ask: why is this important? In today’s hyper-connected world, awareness of how our digital surroundings function and influence us is crucial because these interactions directly affect our values, decision-making processes, and even our fundamental thought patterns—often without our conscious awareness.

    The origin story

    This project originated from my earlier post “The Digital Dance – Reclaiming Our Minds.” After exploring how technology shapes our thinking, I wondered: could we create a way to evaluate and track our individual cognitive resilience? With today’s powerful AI tools, developing such a framework proved more feasible than I initially imagined. I used both ChatGPT and Claude to create the first draft of what would become the “Cognitive Resilience Diagnostic (CRD)” framework. With that conceptual foundation, I leveraged Claude 3.7 Sonnet’s coding capabilities to build a fully functional HTML application.
    After a couple of tweaks, I can now present the Cognitive Resilience Diagnostic (CRD) – a self-assessment tool that helps you measure and strengthen your mind’s resistance to digital manipulation. Most importantly, it jumpstarts your awareness of your cognitive vulnerability to the influences, pushed narratives, biases, and subtle manipulations that permeate our digital environment.

    What is the Cognitive Resilience Diagnostic?

    The CRD is a comprehensive self-assessment framework designed to help you understand your unique psychological relationship with digital technology. Unlike simple screen time trackers or generic digital wellness advice, the CRD examines multiple dimensions of your cognitive interaction with the digital world:

        • How your attention responds to digital distractions
        • Your emotional reactions to social media and online content
        • Your information processing patterns when consuming digital media
        • Your specific vulnerability factors to algorithmic influence

    By completing the assessment, you’ll receive a personalized Cognitive Resilience Score (CRS) that quantifies your overall mental immunity to digital manipulation, along with detailed insights into your specific strengths and vulnerabilities.

    How does it work?

    The assessment evaluates 20 distinct dimensions across four major components that determine your cognitive resilience:

      1. Cognitive Resilience (CR) – This component measures your ability to maintain clear thinking despite digital distractions. It examines factors like attention quality, cognitive load management, and resistance to thought fragmentation.
      2. Emotional Regulation (ER) – This component evaluates how well you manage emotions triggered by digital content, including your resistance to emotional contagion and recovery time after exposure to triggering content.
      3. Information Processing (IP) – This component assesses how you handle the flood of information in the digital environment, including verification behaviors, diversity of information sources, and resistance to confirmation bias.
      4. Vulnerability Factors (VS) – This component identifies specific attributes that might increase your susceptibility to manipulation, such as dependence on social validation, exposure to echo chambers, and need for cognitive closure.

    The CRD also evaluates which brain system (reptilian, emotional, or rational) dominates your response in different digital contexts, providing crucial insights into when you might be most vulnerable to influence.
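The post doesn’t publish the formula behind the Cognitive Resilience Score, so the following JavaScript sketch is purely hypothetical: it assumes each dimension is the mean of its five 1–5 ratings, treats Vulnerability Factors as a penalty, and scales the result to 0–100. Every name, weight, and scaling choice here is an illustrative assumption:

```javascript
// Hypothetical CRS sketch — the actual CRD formula is not published in this post.
// Assumes: each dimension = mean of its five 1–5 ratings; CR, ER, IP averaged
// as resilience; VS halved and subtracted as a penalty; result scaled to 0–100.
const mean = (xs) => xs.reduce((a, b) => a + b, 0) / xs.length;

function cognitiveResilienceScore({ cr, er, ip, vs }) {
  // cr, er, ip, vs: arrays of five 1–5 ratings each
  const resilience = mean([mean(cr), mean(er), mean(ip)]);
  const vulnerability = mean(vs);
  return Math.round(((resilience - vulnerability / 2) / 5) * 100);
}

// Example: moderately resilient profile with low vulnerability ratings
console.log(
  cognitiveResilienceScore({
    cr: [4, 3, 4, 4, 3],
    er: [3, 3, 4, 4, 3],
    ip: [4, 4, 3, 4, 4],
    vs: [2, 3, 2, 2, 3],
  })
); // 48
```

The point of the sketch is only to show how four separately rated dimensions could roll up into one headline number; the real tool may weight, normalize, or combine them quite differently.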

    Upcoming improvements

    As the first version of the CRD tool, this application marks the beginning of an evolving journey toward deeper insights into our cognitive resilience. I will be gathering feedback and exploring enhancements to refine the assessment, expand its diagnostic dimensions, and offer more tailored recommendations over time. While you may notice areas that are still in development, future updates will aim to deliver a more robust and comprehensive tool for understanding and strengthening your mental defenses.

    Ready to discover your digital defense level?

    Taking the Cognitive Resilience Diagnostic is simple:

      1. Complete the self-assessment questionnaire (takes approximately 15-20 minutes)
      2. Receive your personalized Cognitive Resilience Score and detailed analysis
      3. Review your tailored recommendations for enhancing digital resilience
      4. Implement the suggested strategies in your daily digital life
      5. Re-assess periodically to track your progress

    The digital world isn’t going away, but you can develop the cognitive skills to navigate it on your own terms. The CRD gives you the insights and tools to strengthen your mental sovereignty in an increasingly AI-mediated world.

      Try it out now →

      Disclaimer

      Your Privacy is Our Priority: The Cognitive Resilience Diagnostic (CRD) operates entirely on your local device. No personal data is saved, stored, or transmitted to external servers during your assessment. Your responses and results remain completely private and are temporarily processed only for the duration of your session. Once you close your browser or navigate away, all information is automatically deleted. 
      The Cognitive Resilience Diagnostic (CRD) tool is intended for self-examination and personal awareness. It is NOT a substitute for professional mental health advice or clinical diagnosis. Users are encouraged to consult with a healthcare professional if they have concerns about their mental well-being.


      Cognitive Resilience Diagnostic Tool (CRD)

      Measure your susceptibility or resistance to various forms of information manipulation, cognitive distortion, and emotional contagion in digital environments.

      Instructions

      This questionnaire is designed to help you reflect on how you handle distractions, emotions, and information when using digital media. No personal data is saved; it is solely for your own self-assessment.

      Read each scenario and question carefully. Use the 5-point rating scale to indicate how much you agree with each statement (or how frequently you experience the described behavior). There are no "right" or "wrong" answers—respond based on your honest self-perception.

      The assessment has 20 questions across four dimensions and will take approximately 5-10 minutes to complete.

      Cognitive Resilience

      Your ability to stay focused, handle multiple information streams, and flexibly switch thinking modes (emotional, analytical, etc.) without feeling overwhelmed.

      1.
      Scenario: You're reading an important email while your phone keeps buzzing with social media notifications and there's background noise (e.g., music, people talking).
      How well can you remain focused on the email content despite these distractions?
      1 Strongly Disagree / Never
      2 Disagree / Rarely
      3 Neutral / Sometimes
      4 Agree / Often
      5 Strongly Agree / Always
      2.
      Scenario: You're trying to follow a news livestream while simultaneously responding to work chat messages and checking social media.
      How overwhelmed do you feel when handling multiple streams of information at once?
      1 Strongly Disagree / Never
      2 Disagree / Rarely
      3 Neutral / Sometimes
      4 Agree / Often
      5 Strongly Agree / Always
      3.
      Scenario: You come across a post that sparks a strong emotional reaction, but you also want to analyze it logically.
      How easily can you shift from an emotional to a more analytical mindset in this situation?
      1 Strongly Disagree / Never
      2 Disagree / Rarely
      3 Neutral / Sometimes
      4 Agree / Often
      5 Strongly Agree / Always
      4.
      Scenario: You see a breaking news headline that could be clickbait.
      How quickly do you engage critical thinking (e.g., fact-checking, questioning sources) to assess its credibility?
      1 Strongly Disagree / Never
      2 Disagree / Rarely
      3 Neutral / Sometimes
      4 Agree / Often
      5 Strongly Agree / Always
      5.
      Scenario: You're reading an online forum discussion and keep getting interrupted by pop-up ads or direct messages.
      How well do you maintain a coherent train of thought despite these frequent interruptions?
      1 Strongly Disagree / Never
      2 Disagree / Rarely
      3 Neutral / Sometimes
      4 Agree / Often
      5 Strongly Agree / Always

      Emotional Regulation

      Your awareness of and ability to manage emotional responses to digital content—especially when it is upsetting, polarizing, or highly charged.

      1.
      Scenario: You read an angry rant in the comment section of a social media post.
      How often do you find yourself adopting that anger or frustration after reading such comments?
      1 Strongly Disagree / Never
      2 Disagree / Rarely
      3 Neutral / Sometimes
      4 Agree / Often
      5 Strongly Agree / Always
      2.
      Scenario: A friend shares an emotional story about a controversial topic on their feed.
      How likely are you to experience an amplified emotional reaction beyond your usual response?
      1 Strongly Disagree / Never
      2 Disagree / Rarely
      3 Neutral / Sometimes
      4 Agree / Often
      5 Strongly Agree / Always
      3.
      Scenario: You come across a shocking headline while scrolling through your news feed.
      How aware are you of your own emotional responses (e.g., anxiety, anger, excitement) as you continue reading?
      1 Strongly Disagree / Never
      2 Disagree / Rarely
      3 Neutral / Sometimes
      4 Agree / Often
      5 Strongly Agree / Always
      4.
      Scenario: You see upsetting news about a global event that conflicts with your values.
      How quickly can you return to a balanced emotional state once you stop reading or take a break?
      1 Strongly Disagree / Never
      2 Disagree / Rarely
      3 Neutral / Sometimes
      4 Agree / Often
      5 Strongly Agree / Always
      5.
      Scenario: You need to decide whether to share or comment on a post that elicits a strong emotional response.
      How well do you balance your emotions and logical reasoning when deciding your next action?
      1 Strongly Disagree / Never
      2 Disagree / Rarely
      3 Neutral / Sometimes
      4 Agree / Often
      5 Strongly Agree / Always

      Information Processing

      How you seek out, verify, and interpret digital information, including willingness to consider multiple perspectives and filter out irrelevant details.

      1.
      Scenario: You want to learn more about a news story you just heard.
      How diverse are the sources (e.g., multiple news outlets, expert articles, fact-checking sites) you consult before forming an opinion?
      1 Strongly Disagree / Never
      2 Disagree / Rarely
      3 Neutral / Sometimes
      4 Agree / Often
      5 Strongly Agree / Always
      2.
      Scenario: You come across a surprising statistic on social media.
      How often do you verify that statistic with other reputable sources before accepting or sharing it?
      1 Strongly Disagree / Never
      2 Disagree / Rarely
      3 Neutral / Sometimes
      4 Agree / Often
      5 Strongly Agree / Always
      3.
      Scenario: You see an article that contradicts a long-held belief or perspective of yours.
      How willing are you to read it thoroughly and consider its viewpoint?
      1 Strongly Disagree / Never
      2 Disagree / Rarely
      3 Neutral / Sometimes
      4 Agree / Often
      5 Strongly Agree / Always
      4.
      Scenario: You're researching a new topic online but encounter a lot of unrelated content, ads, or tangential links.
      How easily can you filter out the irrelevant information to find what's truly important?
      1 Strongly Disagree / Never
      2 Disagree / Rarely
      3 Neutral / Sometimes
      4 Agree / Often
      5 Strongly Agree / Always
      5.
      Scenario: You're exploring a complex social or political issue that has many nuanced arguments.
      How well do you avoid seeing the issue in purely black-and-white terms?
      1 Strongly Disagree / Never
      2 Disagree / Rarely
      3 Neutral / Sometimes
      4 Agree / Often
      5 Strongly Agree / Always

      Vulnerability Factors

      Social and psychological tendencies that might influence how you form beliefs or share information in digital environments (e.g., seeking approval, echo chambers).

      1.
      Scenario: You share an opinion on social media, and it gets very few likes or comments.
      How important is external validation (e.g., likes, positive feedback) to how you feel about your opinion?
      1 Strongly Disagree / Never
      2 Disagree / Rarely
      3 Neutral / Sometimes
      4 Agree / Often
      5 Strongly Agree / Always
      2.
      Scenario: You frequently visit an online community or forum where most users share your perspective.
      How much time do you spend in such spaces versus exploring viewpoints that differ from yours?
      1 Strongly Disagree / Never
      2 Disagree / Rarely
      3 Neutral / Sometimes
      4 Agree / Often
      5 Strongly Agree / Always
      3.
      Scenario: You receive positive feedback (likes, shares, compliments) on your posts or comments.
      How strongly does this influence what or how you post in the future?
      1 Strongly Disagree / Never
      2 Disagree / Rarely
      3 Neutral / Sometimes
      4 Agree / Often
      5 Strongly Agree / Always
      4.
      Scenario: A well-known "authority" or expert posts a claim that supports your viewpoint.
      How likely are you to question the validity of their claim before accepting it?
      1 Strongly Disagree / Never
      2 Disagree / Rarely
      3 Neutral / Sometimes
      4 Agree / Often
      5 Strongly Agree / Always
      5.
      Scenario: You encounter a topic where information is incomplete or conflicting, and there is no clear answer.
      How comfortable are you with the ambiguity, rather than needing a definitive conclusion?
      1 Strongly Disagree / Never
      2 Disagree / Rarely
      3 Neutral / Sometimes
      4 Agree / Often
      5 Strongly Agree / Always

      Brain System Activation Assessment

      For each context, select which brain system tends to dominate your response:

      1.
      When consuming breaking news:
      Quick, instinctual reaction
      Reptilian Brain
      Feeling-based response
      Emotional Brain
      Analytical, measured approach
      Rational Brain
      2.
      During social media engagement:
      Quick, instinctual reaction
      Reptilian Brain
      Feeling-based response
      Emotional Brain
      Analytical, measured approach
      Rational Brain
      3.
      When consuming political information:
      Quick, instinctual reaction
      Reptilian Brain
      Feeling-based response
      Emotional Brain
      Analytical, measured approach
      Rational Brain
      4.
      When receiving personal criticism:
      Quick, instinctual reaction
      Reptilian Brain
      Feeling-based response
      Emotional Brain
      Analytical, measured approach
      Rational Brain
      5.
      When making financial decisions:
      Quick, instinctual reaction
      Reptilian Brain
      Feeling-based response
      Emotional Brain
      Analytical, measured approach
      Rational Brain
