How We Diagnose Failure

Slide Idea

This slide presents a diagnostic reasoning exercise: given a vague specification ("A dog happily walking on the High Line"), students are asked to take 30 seconds to predict what will fail when generating from this specification and to classify the failure type it indicates (ambiguity/design failure, model limitation, or ethical boundary). This activity develops predictive diagnostic capability—anticipating failures before they occur based on specification analysis.

Key Concepts & Definitions

Predictive Failure Diagnosis

Predictive failure diagnosis is the analytical practice of examining specifications before execution or generation to identify likely failure modes—anticipating what will go wrong, why it will fail, and what type of failure will result. This represents forward-looking diagnostic thinking: rather than waiting for outputs to fail and then diagnosing why (reactive diagnosis), predictive diagnosis examines inputs to identify weaknesses that will cause failures (proactive diagnosis). The exercise exemplifies this: students don't generate outputs and evaluate results; instead, they analyze the vague specification itself, predicting the failures it will produce. Research on expert-novice differences demonstrates that experts routinely engage in predictive diagnosis—they examine plans, specifications, or designs to identify likely problems before implementation, while novices often proceed to implementation and discover problems only through failure. Developing predictive diagnostic capability enables students to identify and fix specification problems early, when correction is inexpensive, rather than discovering them late through costly failures.

Source: Ericsson, K. A., & Smith, J. (Eds.). (1991). Toward a general theory of expertise: Prospects and limits. Cambridge University Press.

Specification Vagueness as Failure Predictor

Specification vagueness as failure predictor refers to the principle that underspecified, ambiguous, or imprecise requirements reliably predict implementation failures even before implementation begins—vague specifications guarantee varied interpretations, producing inconsistent results that fail to satisfy intent. The example specification "A dog happily walking on the High Line" exhibits multiple vagueness dimensions: "dog" (what breed, size, color, markings?), "happily" (what behavioral indicators of happiness—tail wagging, tongue out, bouncing gait, relaxed posture?), "walking" (what pace, direction, gait characteristics?). Each vague term permits wide interpretive latitude, enabling implementations that satisfy the literal specification while missing its intent. Research on requirements engineering demonstrates strong correlation between specification vagueness and project failures: projects with vague requirements experience higher rates of rework, cost overruns, and user dissatisfaction than projects with precise requirements, because vagueness prevents alignment between what stakeholders expect and what implementers produce. The predictive power of vagueness means specification quality assessment can reliably forecast implementation success probability before expensive implementation begins.

Source: Berry, D. M., Kamsties, E., & Krieger, M. M. (2003). From contract drafting to software specification: Linguistic sources of ambiguity. Technical report.
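
To make the vagueness dimensions concrete, here is a minimal sketch of what tightening "A dog happily walking on the High Line" into checkable attributes might look like. All field names and values below are invented for illustration; the point is the structure, not the particular choices.

```python
# A sketch of tightening "A dog happily walking on the High Line".
# All field names and values are invented for illustration.

vague_spec = "A dog happily walking on the High Line"

tightened_spec = {
    "subject": {                      # narrows "dog"
        "breed": "golden retriever",
        "size": "large",
        "coat": "light golden, no markings",
    },
    "mood_indicators": [              # operationalizes "happily"
        "tail wagging",
        "relaxed open mouth, tongue out",
        "bouncing gait",
    ],
    "action": {                       # pins down "walking"
        "gait": "casual trot",
        "pace": "leisurely",
        "direction": "toward camera",
    },
    "setting": "the High Line, New York City, midday, summer foliage",
}

# Every key answers a question the one-line spec left open, so an
# implementer no longer has to guess the intended interpretation.
assert "happily" not in str(tightened_spec)
```

Each vague term from the original sentence maps to an explicit field that any implementer, human or generative system, can verify against output.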

Failure Type Classification as Diagnostic Framework

Failure type classification as a diagnostic framework refers to the practice of categorizing anticipated or observed failures into distinct types (ambiguity/design failures, model limitations, ethical boundaries), enabling type-appropriate remediation rather than generic "fix it" responses. This classification serves a diagnostic function: identifying which type of failure will occur determines what intervention is needed. Ambiguity failures predict: outputs will vary inconsistently, some satisfying intent while others violate it, because vague specifications permit divergent interpretations—remediation requires specification tightening. Model limitation failures predict: systems cannot reliably produce what the specification requests regardless of precision, because requested outputs exceed current capabilities—remediation requires alternative approaches or goal adjustment. Ethical boundary failures predict: proceeding will conflict with values regardless of technical success, because the request itself raises ethical concerns; remediation requires values assessment, not technical adjustment. The exercise asks students to predict both what fails and what failure type—developing the understanding that failure diagnosis isn't complete until the failure cause is classified, enabling an appropriate response. Professional practice routinely employs failure classification precisely because different failure types require fundamentally different fixes.

Source: Raji, I. D., et al. (2020). Closing the AI accountability gap: Defining an end-to-end framework for internal algorithmic auditing. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency (pp. 33-44).
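
The classification-to-remediation mapping can be sketched in a few lines. The three category names follow the slide; the remediation strings are illustrative assumptions, not an established taxonomy.

```python
# A minimal sketch of failure-type classification driving remediation.
# Category names follow the slide; remediation strings are illustrative.
from enum import Enum

class FailureType(Enum):
    AMBIGUITY = "ambiguity/design failure"
    MODEL_LIMITATION = "model limitation"
    ETHICAL_BOUNDARY = "ethical boundary"

REMEDIATION = {
    FailureType.AMBIGUITY:
        "Tighten the specification: define vague terms, add constraints.",
    FailureType.MODEL_LIMITATION:
        "Adjust goals or pick an alternative approach; precision alone won't help.",
    FailureType.ETHICAL_BOUNDARY:
        "Run a values assessment; technical success doesn't justify proceeding.",
}

def remediate(failure: FailureType) -> str:
    """Map a classified failure to its type-appropriate fix."""
    return REMEDIATION[failure]

# The predicted failure for "A dog happily walking on the High Line" is
# inconsistent interpretation of vague terms, i.e. an ambiguity failure.
print(remediate(FailureType.AMBIGUITY))
```

The design choice matters: because remediation is keyed on type, a failure cannot be "fixed" until it has been classified, which is exactly the discipline the exercise trains.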

Diagnostic Reasoning from Incomplete Information

Diagnostic reasoning from incomplete information is the inferential process of making sound judgments about likely outcomes, failure modes, or problems despite having partial, ambiguous, or uncertain evidence—using available information to generate plausible predictions even when complete certainty is impossible. The 30-second exercise requires this: students cannot know definitively what will fail (they haven't generated outputs), but they must reason from specification characteristics to likely failures. This mirrors professional diagnostic reasoning across domains: physicians diagnose likely conditions from incomplete patient information, engineers predict likely failure modes from partial system specifications, designers anticipate likely usability problems from interface mockups. Research on diagnostic expertise demonstrates that sound diagnostic reasoning doesn't require complete information—it requires systematic analysis of available information, pattern recognition connecting current situations to known failure modes, and probabilistic thinking about likely outcomes rather than certain predictions. Developing comfort with incomplete-information diagnosis proves essential because real-world decision contexts rarely provide complete information before decisions must be made.

Source: Norman, G. (2005). Research in clinical reasoning: Past history and current trends. Medical Education, 39(4), 418-427.

Ambiguity Detection as Specification Analysis Skill

Ambiguity detection as a specification analysis skill refers to the developed capability to recognize when specifications contain undefined terms, permit multiple incompatible interpretations, or lack sufficient precision to guide consistent implementation—identifying vagueness that others might overlook. The exercise develops this skill by presenting overtly vague specifications requiring students to identify what's ambiguous: expert analysts immediately recognize "happily" as a subjective term requiring behavioral specification, recognize "dog" as a broad category requiring narrowing, and recognize the entire specification as insufficient for consistent generation. Research on requirements analysis demonstrates that ambiguity detection skill varies dramatically between novices and experts: novices often fail to recognize ambiguity, treating vague specifications as adequately clear, while experts systematically scan specifications to identify underspecified dimensions. Developing ambiguity detection enables students to evaluate specifications they write or receive, identifying weaknesses requiring clarification before they cause failures. This skill transfers broadly: students encounter vague assignment instructions, project requirements, research protocols, and client briefs—the ability to recognize what's underspecified and needs clarification proves valuable across contexts.

Source: Gause, D. C., & Weinberg, G. M. (1989). Exploring requirements: Quality before design. Dorset House.
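
The systematic scan experts perform can be caricatured as a lookup against a vague-term checklist. This toy sketch (the word list and reasons are invented; a real reviewer's checklist would be far larger and context-aware) shows the mechanical core of the skill:

```python
# A toy sketch of systematic ambiguity scanning. The vague-term list and
# reasons are invented for illustration.
import re

VAGUE_TERMS = {
    "happily": "subjective mood - specify observable behaviors",
    "dog": "broad category - specify breed, size, coat",
    "walking": "underspecified action - specify gait, pace, direction",
    "quickly": "unquantified - specify a measurable target",
    "modern": "subjective aesthetic - provide visual references",
}

def scan_specification(spec: str) -> list[tuple[str, str]]:
    """Return (term, why-it's-vague) pairs found in the specification."""
    words = re.findall(r"[a-z]+", spec.lower())
    return [(w, VAGUE_TERMS[w]) for w in words if w in VAGUE_TERMS]

issues = scan_specification("A dog happily walking on the High Line")
for term, reason in issues:
    print(f"{term!r}: {reason}")
# Flags 'dog', 'happily', and 'walking' - three dimensions needing tightening.
```

A keyword scan obviously cannot detect structural ambiguity or missing requirements; what experts add is judgment about which underspecified dimensions actually matter for intent.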

Why This Matters for Students' Work

Understanding predictive failure diagnosis and developing the ability to anticipate failures before they occur fundamentally changes how students approach specification evaluation, planning, and risk management—shifting from reactive problem-solving after failures occur to proactive problem prevention through early diagnosis.

Students often operate in reactive mode: they create specifications, attempt implementation or generation, discover failures, then diagnose what went wrong. While this reactive approach eventually identifies problems, it proves inefficient and expensive: discovering failures late means rework, wasted effort, and potential deadline pressure. Developing predictive diagnostic capability enables proactive mode: examine specifications before implementation, identify likely failures, fix specification weaknesses early when correction is cheap, and proceed to implementation with higher success probability. The exercise models this shift: rather than generating from the vague specification and then evaluating failures, students analyze the specification itself, predicting failures before generation. This predictive thinking transfers: when receiving assignment instructions, students can identify ambiguous aspects requiring clarification before investing effort; when planning projects, students can anticipate likely problems, adjusting plans preventively; when writing specifications for collaborators, students can predict what will confuse recipients, refining specifications before sharing.

Understanding that specification vagueness predicts failures develops students' specification quality standards. Students sometimes accept vague specifications, treating them as "good enough"—they don't recognize vagueness as a problem until failures occur. However, the example "A dog happily walking on the High Line" demonstrates that vague specifications reliably produce varied, inconsistent results: different generators interpret "happily" differently (some showing tail-wagging, some showing tongue-out expression, some showing bouncing gait) and interpret "dog" differently (different breeds, sizes, colors), creating outputs that technically satisfy the vague specification but vary dramatically in ways that may violate intent. Recognizing this predictive relationship—vagueness → inconsistent interpretations → failure to meet intent—raises students' quality standards: specifications that previously seemed adequate now appear problematically vague, requiring refinement.

Understanding failure type classification as a diagnostic framework develops students' remediation thinking. Students sometimes treat all failures generically: something didn't work, try again with modifications, hope for better results. However, different failure types require different fixes: ambiguity failures (from vague specifications) require specification tightening—trying again with same vague specification produces same inconsistent results; model limitations require alternative approaches or acceptance—refining specifications won't overcome fundamental capability gaps; ethical boundaries require values assessment—technical success doesn't justify proceeding if ethical concerns exist. The exercise develops this classification thinking by requiring students to predict not just what fails but what failure type results. This classification skill transfers: when encountering failures in any context, students can diagnose failure type determining appropriate remediation rather than applying random fixes hoping something works.

The 30-second time constraint develops students' rapid diagnostic thinking. Students sometimes believe sound diagnosis requires extensive analysis time, making them reluctant to make judgments without exhaustive consideration. However, professional contexts often require quick diagnostic assessments: during meetings, someone shares a vague proposal and you must immediately identify likely problems; during collaborative work, a partner suggests an approach and you must quickly assess feasibility; during time-pressured projects, you must rapidly evaluate multiple alternatives, identifying likely failures. The exercise demonstrates that sound predictive diagnosis doesn't necessarily require extended time—examining "A dog happily walking on the High Line" for 30 seconds suffices to identify the major ambiguity problems and predict inconsistent interpretation failures. Practicing rapid diagnosis builds confidence in making timely assessments rather than deferring judgment indefinitely.

Understanding diagnostic reasoning from incomplete information develops students' comfort with uncertainty. Students sometimes experience paralysis when facing incomplete information: they cannot make judgments without complete certainty, so they defer decisions waiting for more information that may never arrive. However, professional practice constantly requires making sound judgments despite incomplete information: specifications are always somewhat ambiguous, plans always have uncertainties, decisions must be made before all facts are known. The exercise normalizes incomplete-information diagnosis: students haven't seen generation outputs, haven't tested the specification, haven't verified what actually fails—yet they must predict failures based solely on specification analysis. This practice develops probabilistic thinking: students can't know for certain what will fail, but they can make educated predictions based on available information that prove useful despite uncertainty.

Developing ambiguity detection as an analytical skill enables students to become critical specification evaluators rather than passive specification consumers. Students often accept specifications as given without evaluating quality: assignment instructions are treated as fixed, project requirements are accepted without questioning, brief descriptions are implemented without clarification. However, professional practice requires evaluating specification quality: identifying what's underspecified, requesting clarification before proceeding, pushing back on inadequate requirements. The exercise develops this evaluative capability: students must analyze specifications identifying what's vague, what's ambiguous, what will cause problems. This critical evaluation skill transfers to evaluating specifications students receive (identifying what needs clarification) and evaluating specifications students create (recognizing their own ambiguities requiring refinement).

How This Shows Up in Practice (Non-Tool-Specific)

Filmmaking and Media Production

Film pre-production employs predictive failure diagnosis extensively during script and shot list review. Before expensive production begins, teams analyze plans identifying likely failures enabling preventive correction.

Script reading sessions specifically identify vague specifications predicting production problems. A script line like "Character enters angrily" prompts diagnosis: this will fail because "angrily" permits multiple incompatible interpretations (shouting? silent seething? aggressive body language? facial expression?). The production team predicts: without specification tightening, different rehearsals will produce inconsistent performances, the director will struggle to communicate specific intent, and multiple takes will be required to find the appropriate expression. Classification: ambiguity/design failure requiring specification tightening through directorial discussion with actors about the specific angry behaviors for this character at this moment.

Location scouting employs predictive diagnosis analyzing site specifications. A location description "urban exterior, industrial aesthetic" prompts analysis: this will fail because "industrial aesthetic" lacks precision (abandoned factory? active warehouse? renovated loft? shipping containers?). Team predicts varied location proposals not matching the director's intent. Remediation: request specification refinement (specific visual references, era designation, activity level requirements) before scouting begins.

Equipment planning uses predictive diagnosis from technical specifications. The camera department receives requests to "capture intimate emotional scenes with beautiful background blur." Analysis identifies vagueness: "intimate" (what framing? how close?), "beautiful blur" (how shallow depth of field? what bokeh characteristics?). Prediction: without tighter specification, camera team may bring wrong lenses, lighting setup may not support required aperture, framing may not match "intimate" intent. Classification: ambiguity failure requiring cinematographer-director discussion specifying precise technical requirements before equipment rental.

Budget and schedule planning employ predictive failure diagnosis from production specifications. A vague sequence description "complex action scene" prompts analysis: this will fail (cost and schedule overruns) because "complex" isn't quantified (how many stunts? what safety equipment? how many camera setups?). The team predicts: allocated budget and time will be insufficient for the actual complexity. Remediation: require a specification breakdown (specific action beats, stunt requirements, shot list detail) enabling accurate resource estimation.

Design

Interface design specification review employs predictive failure diagnosis before development begins, identifying vague requirements that will produce inconsistent implementations.

Design brief analysis identifies ambiguity predicting design divergence. A brief states "modern professional aesthetic appropriate for enterprise users." Analysis: "modern" permits multiple incompatible interpretations (minimalist? bold, colorful? trendy current? classic timeless?), "professional" is culture-dependent, "appropriate" lacks definition. Prediction: designers will propose wildly divergent designs all claiming to satisfy brief, stakeholders will disagree about which succeeds because "modern professional" wasn't defined, multiple revision cycles will be required converging on shared understanding. Classification: ambiguity failure requiring specification through visual references, specific aesthetic principles, competitive analysis establishing shared definition before design begins.

Accessibility requirement review employs predictive diagnosis from compliance specifications. A requirement states "design should be accessible." Analysis: catastrophically vague—"accessible" encompasses contrast ratios, font sizes, keyboard navigation, screen reader support, motion sensitivity, cognitive load, numerous specific WCAG criteria. Prediction: designers will implement partial accessibility (maybe high contrast colors) believing requirement satisfied, accessibility audit will reveal numerous violations requiring expensive redesign. Remediation: specify concrete accessibility requirements (WCAG 2.1 AA compliance, specific criteria testing, assistive technology compatibility) before design begins.

Responsive design specifications receive predictive diagnosis. Requirement: "interface should work on mobile devices." Analysis: "work" undefined (full feature parity? adapted subset? completely different mobile design?), "mobile devices" encompasses a vast size range (phone to tablet, landscape and portrait). Prediction: design will work poorly on some devices, stakeholder expectations won't match designer interpretation of "work," implementation will require extensive revision. Remediation: specify breakpoint strategy, feature prioritization across sizes, specific tested device configurations before wireframing.

Writing

Academic assignment design employs predictive failure diagnosis analyzing instructions before distribution to students, identifying ambiguous requirements that will produce varied incompatible responses.

Assignment instruction review identifies vague specifications. An assignment states: "Write an analysis of the reading." Analysis: "analysis" permits multiple interpretations (summary? critique? comparison? application?), "the reading" might refer to one text or multiple, length unspecified, format unspecified, evaluation criteria undefined. Prediction: students will submit wildly varied work (some summaries, some critiques, some comparisons), instructors will struggle evaluating inconsistent submissions against undefined criteria, students will feel evaluation was arbitrary. Classification: ambiguity failure requiring specification of analysis type, specific texts, length range, format requirements, evaluation rubric before assignment distribution.

Research protocol review employs predictive diagnosis. A protocol states "interview participants about their experiences." Analysis: "experiences" vastly underspecified (experiences with what specifically? what timeframe? what aspects?), interview structure undefined (structured? semi-structured? unstructured?), data recording method unspecified. Prediction: different researchers will collect incompatible data preventing meaningful analysis, research may not address intended questions, IRB may request clarification delaying approval. Remediation: specify precise research questions, interview protocol with specific question domains, data collection procedures before research begins.

Content brief analysis identifies specification gaps. A brief requests an "engaging article about climate change for a general audience." Analysis: "engaging" is subjective and undefined (narrative storytelling? data visualization? controversial angle? practical advice?), "general audience" encompasses an enormous range (high school educated? college educated? specific knowledge assumed?), article length is unspecified, tone is undefined. Prediction: the writer will make assumptions about the engagement approach that may mismatch the editor's expectations, and multiple revision rounds will be required. Remediation: specify engagement approach, audience knowledge level, length target, tone examples, and publication context before writing.

Computing and Engineering

Software requirements review employs systematic predictive failure diagnosis identifying vague specifications before expensive development begins.

Requirements specification analysis identifies ambiguity predicting implementation inconsistency. A requirement states: "The system should respond quickly to user actions." Analysis: "quickly" completely undefined (100ms? 1 second? 5 seconds?), "user actions" unspecified (which actions? all actions equally? some more critical?), "respond" undefined (show loading indicator? complete processing? return partial results?). Prediction: different developers will implement different performance targets believing requirements satisfied, performance testing will reveal inconsistencies, stakeholders will disagree about acceptable thresholds requiring rework. Classification: ambiguity failure requiring specific latency targets per action type, percentile requirements (95th percentile < Xms), loading state specifications before development.
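
For example, "responds quickly" becomes checkable once restated as a per-action 95th-percentile latency target. A sketch, with hypothetical targets and measured samples:

```python
# A sketch of turning "responds quickly" into a measurable requirement:
# a 95th-percentile latency target per action type. The action names and
# millisecond targets are hypothetical.
import statistics

LATENCY_TARGETS_MS = {
    "search": 200,
    "page_load": 1000,
    "autocomplete": 100,
}

def p95(samples_ms: list[float]) -> float:
    """95th-percentile latency from measured samples."""
    return statistics.quantiles(samples_ms, n=100)[94]

def meets_target(action: str, samples_ms: list[float]) -> bool:
    return p95(samples_ms) <= LATENCY_TARGETS_MS[action]

# With a precise target, "quick enough" is a measurement, not an opinion.
samples = [120, 140, 150, 155, 160, 170, 180, 185, 190, 450]
print(meets_target("search", samples))  # False: the slow tail pushes p95 past 200 ms
```

Note how the percentile formulation also forces the conversation the vague requirement avoided: the team must decide which actions matter and how bad the worst cases may be.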

Algorithm selection employs predictive diagnosis from performance specifications. A specification requests "efficient data processing for large datasets." Analysis: "efficient" ambiguous (time complexity? space complexity? throughput? latency?), "large" undefined (millions? billions? terabytes?), trade-offs unspecified. Prediction: algorithm choice may optimize wrong dimension (choosing space-efficient algorithm when time-efficiency matters more), performance may fail stakeholder expectations despite "efficient" claim. Remediation: specify concrete performance targets, dataset size ranges, resource constraints, acceptable trade-offs before algorithm design.

API design review identifies specification vagueness. Documentation states "API returns relevant results." Analysis: "relevant" completely subjective (relevant by what criteria? ranked how? filtered how?), result format unspecified, pagination undefined, error conditions unaddressed. Prediction: client developers will make incompatible assumptions about return format, relevance ranking will surprise users, integration will require extensive trial-and-error discovering actual behavior. Remediation: specify result schema, relevance criteria and algorithm, pagination mechanism, comprehensive error responses before API implementation.
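
A remediated version of "API returns relevant results" might look like the following contract sketch, where schema, ordering, pagination, and error shape are all explicit. All field names and the stated ranking criterion are assumptions for illustration.

```python
# A contract sketch replacing "API returns relevant results" with an
# explicit response shape. Field names and ranking criterion are invented.
from dataclasses import dataclass
from typing import Optional

@dataclass
class SearchResult:
    id: str
    title: str
    score: float                 # relevance = stated text-match score, not "relevant"

@dataclass
class SearchResponse:
    results: list[SearchResult]  # sorted by score, descending
    next_cursor: Optional[str]   # opaque pagination cursor; None when exhausted
    total_estimate: int          # approximate total match count

@dataclass
class ApiError:
    code: str                    # e.g. "INVALID_QUERY", "RATE_LIMITED"
    message: str

# Client developers can code against this contract instead of discovering
# behavior through trial and error.
page = SearchResponse(
    results=[SearchResult(id="r1", title="High Line walking guide", score=12.4)],
    next_cursor="abc123",
    total_estimate=87,
)
print(page.results[0].title)  # prints "High Line walking guide"
```

Even this small amount of structure answers the questions the vague documentation left open: what comes back, in what order, how to get the next page, and what errors look like.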

Common Misunderstandings

"Predictive diagnosis is just guessing—you can't know what will fail until you try it"

This misconception treats prediction as speculation rather than recognizing it as reasoned inference from specification analysis. While predictive diagnosis cannot achieve perfect certainty (you haven't actually generated outputs to observe failures), it's not random guessing—it's systematic analysis applying known patterns. The vague specification "A dog happily walking" reliably predicts certain failures: "happily" will be interpreted inconsistently (some generators showing tail-wagging, others tongue-out, others bouncing gait) because subjective terms permit varied interpretations; "dog" without breed/size/color specification will produce varied appearances; these variations constitute ambiguity failures traceable to underspecified requirements. This prediction isn't a guess—it's an application of the well-established principle that vague specifications produce inconsistent interpretations. Research on requirements engineering demonstrates strong correlation between specification characteristics and failure probabilities: vague requirements statistically predict higher failure rates than precise requirements. Professional practice relies on predictive diagnosis precisely because it proves reliable enough to guide decisions despite not providing perfect certainty. Engineers predict likely failure modes from design specifications before building prototypes; physicians predict likely diagnoses from symptoms before definitive test results; editors predict likely reader comprehension problems from draft text before publication. These predictions prove useful for decision-making despite being probabilistic rather than certain.

"The 30-second time limit means diagnostic quality doesn't matter—any quick answer is acceptable"

This misconception misinterprets the time constraint as an excuse for careless thinking rather than recognizing it as practice in rapid sound diagnosis. The 30-second limit doesn't mean diagnostic reasoning quality is unimportant—it means students must practice making sound assessments efficiently rather than requiring unlimited deliberation. The vague specification contains obvious ambiguities identifiable in seconds by anyone systematically analyzing it: "happily" is subjective requiring behavioral specification, "dog" lacks specificity, "walking" permits variation. Recognizing these ambiguities and predicting they'll cause inconsistent interpretation failures (ambiguity/design failure type) represents sound diagnosis achievable quickly. Professional contexts routinely require rapid diagnosis: during meetings, you have moments to assess whether a proposed approach will work; during code review, you must quickly identify likely bugs; during design critique, you must efficiently spot likely usability problems. The time constraint develops this professional capability—making sound judgments efficiently rather than deferring all judgments pending exhaustive analysis. However, rapid doesn't mean careless: students should still apply diagnostic reasoning systematically (scan the specification for vague terms, predict how vagueness enables varied interpretations, classify the resulting failure type) even when doing so quickly.

"If I can think of any possible way the specification could fail, my prediction is correct"

This misconception treats any failure prediction as equally valid rather than recognizing that diagnostic quality involves identifying most likely or most significant failures. While the vague specification could fail in multiple ways (ambiguity about "happily," ambiguity about "dog," ambiguity about "walking," potential ethical concerns if certain interpretations objectify animals, potential model limitations if requested context proves challenging), some failures are far more predictable and significant than others. The most obvious predictable failure: "happily" and other vague terms will be interpreted inconsistently producing varied outputs—this is a near-certain failure from an ambiguity/design problem. Less obvious: potential model limitations—but these depend on specific generation system capabilities not evident from specification alone. Sound diagnosis prioritizes likely significant failures rather than cataloging every conceivable problem. Professional diagnostic reasoning involves this prioritization: physicians diagnose most likely conditions before considering rare diseases; engineers identify most probable failure modes before exhaustive fault tree analysis; editors flag most significant comprehension barriers before minor style issues. The exercise develops this prioritization: in 30 seconds, identify the failure you're most confident will occur and classify its type—training judgment about what failures matter most and are most predictable.

"Failure prediction is negative thinking—optimistic approach assumes specifications are adequate until proven otherwise"

This misconception frames predictive diagnosis as pessimism rather than recognizing it as essential risk management and quality assurance practice. Predictive diagnosis isn't about negativity—it's about realistic assessment enabling early problem correction. The vague specification "A dog happily walking" isn't adequate just because we hope generators will interpret it matching our unstated intent—it's objectively insufficient because key terms lack definition, permitting inconsistent interpretation. Recognizing this inadequacy early enables specification refinement before any generation attempt, improving success probability. Proceeding optimistically with a vague specification doesn't prevent failures—it merely delays discovering them until after wasted generation effort. Research on software project management demonstrates that early requirements analysis identifying specification weaknesses dramatically reduces project failures and costs compared to optimistic assumptions that vague requirements will somehow work out. Professional practice across domains employs systematic failure prediction precisely because it improves outcomes: architecture review predicts structural weaknesses before construction; medical differential diagnosis considers likely problems before treatment; financial risk analysis predicts likely economic failures before investment. This isn't pessimism—it's responsible planning. Effective work requires balancing realistic problem anticipation (identifying likely failures) with constructive problem-solving (refining specifications to prevent those failures).

Scholarly Foundations

Ericsson, K. A., & Smith, J. (Eds.). (1991). Toward a general theory of expertise: Prospects and limits. Cambridge University Press.

Comprehensive analysis of expert performance across domains demonstrating that experts engage in extensive forward planning and problem anticipation while novices proceed to action discovering problems through failure. Discusses how expert physicians diagnose likely conditions before tests, expert chess players anticipate opponent moves before they occur, expert programmers identify likely bugs before execution. Establishes predictive diagnosis as characteristic of expertise across fields. Relevant for understanding that anticipating failures before they occur represents sophisticated professional capability students should develop.

Berry, D. M., Kamsties, E., & Krieger, M. M. (2003). From contract drafting to software specification: Linguistic sources of ambiguity. Technical report.

Systematic analysis of how natural language ambiguity in specifications creates implementation problems. Catalogs specific linguistic patterns that reliably predict varied interpretations: vague quantifiers ("many," "few"), subjective modifiers ("quickly," "happily"), underspecified categories ("dog" without attributes). Establishes that specification vagueness reliably predicts interpretation inconsistency enabling prediction of failures before implementation. Directly relevant for understanding how specification analysis enables failure prediction.

Raji, I. D., Smart, A., White, R. N., Mitchell, M., Gebru, T., Hutchinson, B., Smith-Loud, J., Theron, D., & Barnes, P. (2020). Closing the AI accountability gap: Defining an end-to-end framework for internal algorithmic auditing. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency (pp. 33-44).

Framework for AI system auditing emphasizing failure type classification (design failures, capability limitations, bias/fairness issues) as essential for appropriate remediation. Discusses how different failure types require different interventions: design failures require specification improvement, capability limitations require alternative approaches, bias issues require data or model changes. Establishes that failure classification enables targeted effective responses rather than generic fixes. Relevant for understanding why predicting failure type (not just predicting failure) matters for effective problem-solving.

Norman, G. (2005). Research in clinical reasoning: Past history and current trends. Medical Education, 39(4), 418-427.

Review of diagnostic reasoning research in medical education discussing how physicians make diagnoses from incomplete uncertain information. Establishes that sound diagnosis doesn't require complete information—it requires systematic analysis of available information, pattern recognition, and probabilistic thinking. Discusses how novices struggle with diagnostic uncertainty while experts comfortably make provisional diagnoses from partial information. Relevant for understanding diagnostic reasoning from incomplete information as a learnable professional skill.

Gause, D. C., & Weinberg, G. M. (1989). Exploring requirements: Quality before design. Dorset House.

Classic requirements text emphasizing importance of identifying ambiguities and underspecified aspects before implementation. Discusses questioning techniques for revealing vague requirements, methods for detecting common ambiguity patterns, and strategies for requirement clarification. Establishes ambiguity detection as essential requirements analysis skill requiring practice to develop. Relevant for understanding ambiguity detection as analytical capability applicable to specification evaluation.

Lamsweerde, A. van. (2009). Requirements engineering: From system goals to UML models to software specifications. Wiley.

Comprehensive requirements engineering text discussing how to analyze requirements for completeness, precision, and consistency before development. Discusses formal and informal techniques for detecting ambiguities, strategies for predicting likely implementation problems from requirement characteristics, and methods for requirement refinement. Establishes requirements analysis as systematic discipline enabling problem prediction and prevention. Relevant for understanding predictive diagnosis from specification analysis.

Reason, J. (1990). Human error. Cambridge University Press.

Foundational work on error analysis and failure prediction discussing how to analyze systems identifying latent failures before they manifest as active failures. Introduces concepts of error classification, failure mode analysis, and predictive safety assessment. While focused on safety-critical systems, principles transfer broadly: certain system characteristics reliably predict certain failure types enabling preventive intervention. Relevant for understanding failure type classification and prediction as general diagnostic practices.

Klein, G. A. (1998). Sources of power: How people make decisions. MIT Press.

Analysis of naturalistic decision-making demonstrating how experts make rapid assessments in time-pressured uncertain contexts. Discusses pattern recognition enabling quick problem diagnosis, mental simulation for predicting outcomes, and confidence calibration in incomplete-information reasoning. Establishes that rapid diagnosis doesn't require sacrificing quality—experts develop efficient diagnostic thinking through practice. Relevant for understanding 30-second diagnostic exercise as training in rapid sound assessment.

Boundaries of the Claim

The slide presents a diagnostic reasoning exercise predicting failures from vague specification and classifying failure types. This does not claim that perfect failure prediction is possible, that all failures are predictable from specification analysis alone, or that 30 seconds provides adequate time for comprehensive failure analysis in all contexts.

The characterization of this as predictive diagnosis enabling failure anticipation describes capability the exercise develops but doesn't guarantee prediction accuracy. Students' predictions may be incomplete (missing some failure modes while identifying others), incorrect (predicting failures that wouldn't actually occur), or oversimplified (recognizing vagueness without fully analyzing its implications). Prediction quality depends on analytical skill, experience with similar specifications, and understanding of generation system characteristics—capabilities students are still developing.

The vague specification example demonstrates obvious ambiguities enabling relatively straightforward failure prediction. Real-world specifications often contain subtle ambiguities requiring more sophisticated analysis, domain expertise, or extended consideration to recognize. The 30-second time constraint serves pedagogical purposes (practicing rapid diagnosis, maintaining activity pacing, focusing on most obvious issues) but shouldn't be interpreted as claiming all diagnostic reasoning should occur in 30 seconds or that this duration suffices for comprehensive specification analysis.

The framework doesn't specify: what constitutes a "correct" failure prediction (whether students must identify specific failures matching instructor expectations or whether various reasonable predictions are acceptable), how confident predictions must be to justify acting on them (when to proceed despite uncertainty versus when to require more information), whether all predictable failures matter equally (how to prioritize among multiple anticipated problems), or how to handle situations where different analysts predict conflicting failures from the same specification.

The emphasis on failure type classification (ambiguity, model limitation, ethical boundary) provides a useful framework but doesn't claim these are the only possible failure types, that all failures fit neatly into exactly one category, or that classification is always straightforward. Some failures involve multiple types simultaneously; some don't fit standard categories; some require extended analysis to classify accurately.

Reflection / Reasoning Check

1. After analyzing the vague specification "A dog happily walking on the High Line" for 30 seconds and predicting what will fail, reflect on your diagnostic process: What specific aspect of the specification did you identify as most likely to cause failure, and what reasoning led you to that prediction? How did you decide which failure type (ambiguity/design, model limitation, or ethical boundary) your predicted failure represents—what criteria did you use for classification? If you were to actually generate outputs from this specification, what would you look for to verify whether your prediction was accurate—what would confirm you were right versus reveal you missed the actual failure mode? More broadly, what does it mean to make diagnostic predictions from incomplete information (you haven't seen outputs yet)—is this kind of prediction useful despite not being certain, and if so, what makes uncertain predictions valuable for decision-making?

This question tests whether students can articulate their diagnostic reasoning process, explain their failure-type classification logic, understand prediction verification, and recognize the value of probabilistic reasoning despite uncertainty. An effective response would identify a specific vague element ("happily" lacks behavioral specification, "dog" lacks breed/size/color definition, or the specification overall lacks necessary detail), articulate why that vagueness predicts failure ("happily" permits multiple interpretations, such as tail wagging, tongue out, or bouncing gait, so different generators will produce inconsistent results), explain the classification reasoning (this is an ambiguity/design failure because the problem stems from an underspecified requirement, not from a system's inability to render well-specified content), describe a verification approach (compare multiple generation outputs looking for variation in how "happily" is interpreted: if outputs vary, the prediction is confirmed; if they are consistent, the prediction was wrong or the vagueness didn't matter), and recognize that probabilistic prediction is useful despite uncertainty (it enables early specification refinement that prevents likely failures even though the prediction isn't guaranteed to be accurate; it is better to fix probable problems proactively than to wait for certain failure reactively). This demonstrates understanding that diagnostic reasoning involves systematic analysis, classification logic, verification thinking, and comfort with useful-but-uncertain predictions.
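
The verification logic sketched in the paragraph above (checking whether a vague term is realized inconsistently across outputs) can be expressed as a trivial consistency check. This is a minimal sketch under an assumed workflow: the interpretation labels are hypothetical hand-annotations of each generated output, not something a real generator returns.

```python
# Sketch of the verification step: if the same vague term is realized
# differently across generated outputs, the ambiguity prediction is confirmed.
def prediction_confirmed(interpretations: list) -> bool:
    """True when outputs disagree on how the vague term was rendered."""
    return len(set(interpretations)) > 1

# Hypothetical annotations of four generations from "A dog happily walking...",
# each labeled by how "happily" was depicted:
labels = ["tail wagging", "tongue out", "tail wagging", "bouncing gait"]
print(prediction_confirmed(labels))  # prints True: interpretations varied
```

If all outputs had rendered "happily" the same way, the check would return False, indicating either that the prediction was wrong or that the vagueness didn't matter in practice.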

2. Consider the relationship between specification vagueness and failure predictability: Why does a vague specification like "A dog happily walking" enable you to predict failures before generation is even attempted? What specific mechanism connects vagueness to predictable failure? Now consider the contrast: Could you predict failures as easily from a very detailed, precise specification, or does vagueness make failures more predictable? What does this reveal about the relationship between specification quality and diagnostic difficulty: does a bad specification make diagnosis easier or harder? In professional contexts, what are the implications? If you can easily predict that vague specifications will cause failures, what should happen before implementation begins? What responsibility do people creating or receiving vague specifications have to address predictable failures proactively rather than waiting for them to occur?

This question tests understanding of the mechanisms connecting vagueness to failure, the relationship between specification quality and diagnostic predictability, and the implications for professional responsibility. An effective response would explain the mechanism (vagueness permits multiple interpretations → different interpreters make different choices → outputs vary inconsistently → some fail to match intent despite satisfying the literal vague specification), recognize the apparent paradox that bad specifications enable easier failure prediction than good ones (vague specifications reliably predict interpretation-inconsistency failures; precise specifications prevent that failure mode, so other failure types become relevant, such as model limitations and execution errors, which may be harder to predict from the specification alone), articulate the professional responsibility (if failures are predictable from vague specifications, specifications should be refined before implementation rather than proceeding in the knowledge that failures will occur; people creating vague specifications should tighten them before sharing, and people receiving vague specifications should request clarification rather than guessing intent), and recognize the economic logic (refining a specification early, when vagueness is detected, costs far less than implementing the vague specification, discovering failures, and doing rework). This demonstrates sophisticated understanding that diagnostic ease doesn't indicate problem severity (easily diagnosed problems still cause failures if not addressed) and that predictability creates a professional obligation for proactive correction.

Return to Slide Index