94% of Students Use AI for Assessed Work. Here's Why That's Not the Problem.

The Higher Education Policy Institute's 2026 Student Generative AI Survey landed last week with a statistic that made headlines: 94% of UK undergraduates now use generative AI to help with assessed work (HEPI, 2026). The reaction was predictable. Some commentators wrung their hands about academic integrity. Others called for stricter detection tools. A few suggested we simply accept defeat and let AI write everything.

All of these responses miss the point. The 94% figure is not evidence that students are cheating en masse. It is evidence that traditional assessment is broken, and has been for some time. We have been grading outputs that AI can now produce effortlessly, then acting surprised when students use AI to produce them. The problem is not the students. The problem is that we have been assessing the wrong things.

What Does the HEPI Survey Actually Show?

The HEPI survey, conducted by Savanta in December 2025 with 1,054 full-time UK undergraduates, reveals a more nuanced picture than the headline suggests. Yes, 94% use AI to help with assessed work. But "help" covers a broad spectrum: summarising dense readings, generating draft outlines, explaining difficult concepts, and checking grammar. Only 12% report directly including AI-generated text in their submissions, up from 8% the previous year.

The qualitative responses are particularly revealing. One student explained that "AI tools allowed me to quickly summarise dense readings and generate drafts or outlines for assignments, saving hours of tedious work and letting me focus on critical analysis and deeper understanding." Another admitted simply: "I'm not using my brain at all." The HEPI authors note the striking contrast: "There is a difference between a student who is utilising AI to advance their learning and a student who is outsourcing their learning to AI."

This distinction matters. The first student is using AI as cognitive scaffolding, freeing up mental resources for higher-order thinking. The second is experiencing what the OECD calls "cognitive offloading," letting AI do the thinking and missing out on the struggle necessary for genuine learning. Both behaviours register as "AI use," but their educational implications are opposite.

Why Detection Cannot Solve This Problem

The instinctive response to rising AI use is to improve detection. If students are using AI inappropriately, catch them. But this approach has failed, is failing, and will continue to fail for reasons that are now well documented.

Independent analyses have found that AI detection tools produce false positive rates between 5% and 20%, meaning genuine student work is routinely flagged as machine-generated (Nesenoff & Miltenberg, 2025). The consequences are not abstract. In 2024, the Australian Catholic University falsely accused approximately 1,500 students based on Turnitin's AI detector. Students lost graduate nursing positions. Transcripts were marked "results withheld" for months. Some were forced to hand over their entire internet browsing history to prove their innocence.
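
To put the 5% to 20% range in concrete terms, here is a back-of-the-envelope sketch. The cohort of 1,000 genuine submissions is a hypothetical figure chosen only for illustration, and the function name is ours; it simply multiplies the number of honest submissions by the false positive rate.

```python
def expected_false_flags(genuine_submissions: int, false_positive_rate: float) -> int:
    """Expected number of genuine submissions wrongly flagged as AI-generated."""
    return round(genuine_submissions * false_positive_rate)


# Hypothetical cohort size, chosen only for illustration.
cohort = 1000
low = expected_false_flags(cohort, 0.05)   # lower bound of the reported range
high = expected_false_flags(cohort, 0.20)  # upper bound of the reported range
print(f"{low} to {high} of {cohort} genuine submissions wrongly flagged")
```

Even at the lower bound, dozens of students in a cohort of that size would face an accusation for work they wrote themselves.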

These tools are particularly unreliable for non-native English speakers and neurodivergent students, whose writing patterns can naturally differ from what algorithms expect "human writing" to look like. In 2025, a Yale School of Management student sued, alleging wrongful suspension after GPTZero flagged their exam and citing discrimination against non-native English speakers. The detection arms race is not just ineffective; it actively harms the students it claims to protect.

Even when detection works, it addresses only the symptom. A student who successfully evades detection has still learned nothing from the assignment. A student who is caught has been punished, but the underlying incentive structure remains unchanged. As long as we assess outputs that AI can produce, we incentivise students to use AI to produce them. Detection treats the behaviour as a moral failure when it is actually a rational response to misaligned incentives.

The Real Problem: Assessing Outputs That No Longer Require Human Thinking

The traditional essay, the lab report, the problem set: these assessment formats evolved when producing them required sustained cognitive effort. If you wanted a well-structured argument about photosynthesis, you had to understand photosynthesis. If you wanted a coherent analysis of titration results, you had to perform the titration and think through the data. The output served as reasonable evidence of the process.

That correlation has shattered. A student can now prompt an AI to generate a sophisticated essay, complete with citations, in under a minute. They can request a lab report with plausible observations, appropriate error analysis, and well-reasoned conclusions without touching any equipment. The output looks correct because AI can synthesise what correct outputs look like. But the student has developed none of the skills the assignment was designed to teach.

This is not a new problem. Students have always found ways to shortcut assessments: copying from previous years, dividing work among study groups, purchasing essays online. AI has simply democratised the shortcut, making it accessible to everyone rather than just those with resources or connections. The scale has changed, but the underlying issue, that we assess outputs detached from process, has existed for decades.

What Does AI-Proof Assessment Look Like?

The OECD Digital Education Outlook 2026 answered this question explicitly: "Instead of grading the final paper, teachers should evaluate how a student interacted with AI, how they critiqued its output, and how they refined their ideas over time" (OECD, 2026). This is process-oriented assessment: evaluating not what students produce, but how they engage with learning to produce it.

In practice, this means capturing the process itself. When a student works through a problem, what decisions do they make? When they encounter difficulty, how do they respond? When their first approach fails, do they adapt or give up? These questions get at what education actually aims to develop: thinking skills, problem-solving capacity, and the ability to learn from mistakes.

Laboratory science offers a particularly clear example. You cannot AI your way through pipetting technique. You cannot prompt ChatGPT to develop your fine motor control, your ability to observe subtle colour changes, or your instinct for when an experiment feels wrong. These skills exist in the body and develop through practice. A student who watches AI generate a perfect lab report has learned nothing about working in a lab.

How Virtual Labs Make Process Assessment Possible

At WhimsyLabs, process-oriented assessment is not a retrofit in response to AI. It is how we designed our platform from the beginning. Our virtual labs capture every action a student takes: which equipment they select, in what order they perform steps, how they respond to unexpected results, and how their technique improves over time. This interaction logging creates a detailed record of scientific thinking in action.

Action logging is the foundation. When a student pipettes a solution, we track their technique: the angle they hold the pipette, the speed at which they dispense, whether they pre-wet the tip. When they measure temperature, we record not just the final reading but when they took it, how many readings they made, and whether they waited for equilibrium. These granular details reveal whether a student is developing proper laboratory habits or simply going through the motions.
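
As a rough illustration of what an action log like this might look like, here is a minimal sketch. It is not our production schema: the class names, the action labels ("insert_thermometer", "read_temperature"), and the 30-second equilibrium threshold are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class ActionEvent:
    t: float                 # seconds since the session started
    action: str              # hypothetical action label
    details: dict = field(default_factory=dict)


@dataclass
class ActionLog:
    events: list = field(default_factory=list)

    def record(self, t: float, action: str, **details) -> None:
        self.events.append(ActionEvent(t, action, details))

    def of_type(self, action: str) -> list:
        return [e for e in self.events if e.action == action]


def temperature_summary(log: ActionLog, equilibrium_wait_s: float = 30.0):
    """Summarise how a student took temperature readings.

    The 30 s threshold is an assumed value, not a real calibration.
    """
    inserted = log.of_type("insert_thermometer")
    readings = log.of_type("read_temperature")
    if not inserted or not readings:
        return None
    first_wait = readings[0].t - inserted[0].t
    return {
        "n_readings": len(readings),
        "first_wait_s": first_wait,
        "waited_for_equilibrium": first_wait >= equilibrium_wait_s,
    }


log = ActionLog()
log.record(10.0, "insert_thermometer")
log.record(15.0, "read_temperature", value_c=21.4)
log.record(50.0, "read_temperature", value_c=23.1)
print(temperature_summary(log))
```

In this example the student took two readings, but the first came only five seconds after inserting the thermometer, so the summary shows they did not wait for equilibrium. That is the kind of habit a final reading alone would never reveal.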

Technique grading builds on action logging to assess procedural competence. A student who reaches the correct endpoint in a titration through sloppy technique receives different feedback than one whose technique was precise but who miscalculated the molarity. Both need improvement, but in different ways. Traditional assessment, focused only on the final answer, cannot distinguish between them.
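
The distinction between those two students can be sketched in a few lines. This is an illustrative simplification, not our grading engine: the 0-to-1 technique score, the 0.7 pass mark, and the 2% tolerance on molarity are hypothetical values.

```python
def titration_feedback(
    technique_score: float,      # assumed scale: 0.0 (poor) to 1.0 (excellent)
    reported_molarity: float,
    expected_molarity: float,
    technique_pass: float = 0.7,  # hypothetical pass mark
    rel_tol: float = 0.02,        # hypothetical 2% tolerance on the answer
) -> str:
    """Return separate feedback for procedure and for calculation."""
    technique_ok = technique_score >= technique_pass
    calculation_ok = (
        abs(reported_molarity - expected_molarity) <= rel_tol * expected_molarity
    )
    if technique_ok and calculation_ok:
        return "on track"
    if technique_ok:
        return "review calculation"
    if calculation_ok:
        return "review technique"
    return "review technique and calculation"


# Correct answer reached through sloppy technique.
print(titration_feedback(0.4, 0.100, 0.100))
# Precise technique, but the molarity was miscalculated.
print(titration_feedback(0.9, 0.125, 0.100))
```

A grade based only on the final answer would pass the first student and fail the second. Scoring the two dimensions separately gives each of them the feedback they actually need.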

Expert pathway comparison places student behaviour in context. We have mapped how expert scientists approach common experiments, identifying the decision trees and problem-solving patterns that characterise skilled scientific thinking. When a student's approach diverges significantly from expert pathways, it signals an opportunity for targeted guidance. When their approach aligns with expert reasoning even if they reach an incorrect conclusion, it suggests their scientific thinking is developing appropriately.
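
One simple way to think about pathway comparison is as sequence similarity between a student's ordered steps and a set of expert sequences. The sketch below uses Python's standard difflib as a stand-in; it is not the method our platform uses, and the step names and expert pathways are invented for the example.

```python
from difflib import SequenceMatcher


def pathway_similarity(student_steps: list, expert_pathways: list) -> float:
    """Best similarity (0.0 to 1.0) between a student's steps and any expert pathway."""
    return max(
        SequenceMatcher(None, student_steps, expert).ratio()
        for expert in expert_pathways
    )


# Hypothetical expert pathways for a titration.
experts = [
    ["rinse_burette", "fill_burette", "add_indicator", "titrate", "record_volume"],
    ["rinse_burette", "add_indicator", "fill_burette", "titrate", "record_volume"],
]

# This student skipped rinsing the burette but otherwise followed an expert order.
student = ["fill_burette", "add_indicator", "titrate", "record_volume"]

score = pathway_similarity(student, experts)
print(f"similarity to closest expert pathway: {score:.2f}")
```

Taking the best match across several expert pathways matters because skilled scientists do not all work in the same order. A low score against every pathway is the signal worth a teacher's attention, not a deviation from one canonical sequence.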

The Assessment Crisis Is Also an Opportunity

The HEPI survey reveals something that should concern every educator: nearly two-thirds of students (65%) say assessment has changed significantly in response to AI, and many articulate anxiety about false accusations of misconduct. Students are living in a surveillance environment where their authentic work might be flagged as fraudulent, while AI-assisted work might pass undetected. The incentives are perverse.

But the same survey shows a path forward. Students overwhelmingly see AI skills as essential, with 68% believing these capabilities are necessary to thrive in today's world. They are not trying to cheat. They are trying to develop skills they correctly perceive as important. The problem is that current assessment structures force them to choose between developing AI literacy and demonstrating subject knowledge, when both should be possible simultaneously.

Process-oriented assessment resolves this tension. When we assess how students think rather than what they produce, AI becomes a tool rather than a threat. Students can use AI to summarise readings, generate draft outlines, or check their reasoning, while still being assessed on the cognitive work that matters: forming hypotheses, interpreting data, adapting to unexpected results, and developing physical laboratory skills that no AI can replicate.

What This Means for Science Education

Science education faces a particular challenge because laboratory skills are central to the discipline. A student who has never held a pipette, adjusted a microscope, or observed a chemical reaction cannot claim to understand science in any meaningful sense. Yet traditional lab assessment, focused on written reports, has always been vulnerable to shortcuts. AI has simply made the vulnerability impossible to ignore.

The solution is not to ban AI from science education. It is to assess what AI cannot do: the physical skills, the procedural competence, the ability to respond to real-time experimental challenges. Virtual labs that capture process data make this possible at scale. A student's interaction log provides richer evidence of their scientific development than any written report could, and it cannot be faked by any AI.

The HEPI authors conclude that "institutions have a crucial role in ensuring AI enhances learning rather than diminishing it." They recommend structured AI induction, curriculum changes to teach AI skills explicitly, and clear assessment-specific guidance. These are all important. But the most fundamental change required is rethinking what we assess in the first place.

Ninety-four percent of students using AI is not a crisis. It is a signal that assessment must evolve. The tools exist. The research supports it. The only question is whether education will adapt, or continue to fight a detection war it has already lost.
