The life agentic

Reflections on agentic AI

Earlier pages in this section teach how to operate agentic tools and how to avoid the immediate risks they create on your project. This page is different. It collects writing from people who have been using these tools long enough to start reflecting on the consequences — for skill, for comprehension, for ownership of what you build, and for the wider social context of software work.

The point is not to convert you to any one position. The point is to make sure you have read at least some of the public discussion before forming your own view. Most of the writing linked here was published while the situation was still moving — the canonical sources you would normally reach for in a more settled field do not yet exist.

A pragmatic suggestion: read at least one piece from each section. The reflection prompts at the end of each section are short and intentionally open; they work best after you have done the reading.

Intended learning outcomes covered on this page

After working through this page, students should be better able to:

Generation is hallucination; verification is the default

A useful reframing comes from practitioners at Thoughtworks: every generation is, in a strict sense, a hallucination. The model is always producing plausible text by predicting the next likely token, and the interesting question is not whether it is hallucinating but which generations happen to be true. Rebecca Parsons has been making that case for some years; the in-house Thoughtworks article puts it as “AI hallucination isn’t a system failure; it’s the natural result of a new kind of computing that works on probability, not on strict logic.” Martin Fowler discusses the same reframing in his own essay on LLMs and software development.

If you accept the reframing, verification stops being a special step you take when something looks wrong; it is the default mode of using the tool. That is also how the Verification page in this section is structured.
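The reframing can be made concrete with a toy sketch. The probabilities, the prompt, and the `KNOWN_FACTS` lookup below are all invented for illustration and do not describe any real model; the point is only that the sampling step never consults the truth, so checking has to happen as a separate step.

```python
import random

# Hypothetical next-token probabilities for the prompt
# "The capital of Australia is". The numbers are invented for illustration.
NEXT_TOKEN_PROBS = {
    "Canberra": 0.60,   # happens to be true
    "Sydney": 0.35,     # plausible but false
    "Melbourne": 0.05,  # plausible but false
}

# Hypothetical external source of truth, standing in for tests, docs, or data.
KNOWN_FACTS = {"The capital of Australia is": "Canberra"}


def generate(probs, rng):
    """Sample one token in proportion to its probability; truth is not an input."""
    tokens = list(probs)
    weights = list(probs.values())
    return rng.choices(tokens, weights=weights, k=1)[0]


def verify(prompt, completion):
    """Check a completion against the external source, independent of generation."""
    return KNOWN_FACTS.get(prompt) == completion


rng = random.Random(0)  # seeded so repeated runs give the same samples
prompt = "The capital of Australia is"
for _ in range(5):
    token = generate(NEXT_TOKEN_PROBS, rng)
    print(token, "verified" if verify(prompt, token) else "NOT verified")
```

Every sample comes out of the same procedure; only `verify` distinguishes the completions that happen to be true from the ones that do not.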

Further reading

Reflection prompts

Cognitive debt, comprehension debt, decoupled comprehension

The longer-term cost of agentic coding may not be in the code at all; it may be in what happens to the people who use the tools. Margaret-Anne Storey proposes the term cognitive debt for the erosion of shared mental models that builds up when AI generates code faster than a team can understand it. Technical debt lives in the code; cognitive debt lives in people, and it is harder to measure because none of the usual velocity metrics capture it.

Addy Osmani calls a closely related effect comprehension debt: the gap between how much code exists in your system and how much of it any human being genuinely understands. He points to a randomised study by Judy Shen and Alex Tamkin, How AI Impacts Skill Formation, which finds that relying on AI assistance while learning a new programming library impairs developers’ later conceptual understanding, code reading, and debugging, without delivering significant efficiency gains on average. The cost is paid in skill, not in clock time.

Joshua Bloom, writing about scientific work rather than software per se, captures the personal side of the same drift. After a week of working with agents, he wrote, “I started to get this nagging sense I was being slowly led into a state of stuporous acquiescence, that the whole package was working even if I couldn’t understand all of it.” A useful umbrella term for all three observations is decoupled comprehension — the code and your grasp of it have drifted apart. The mitigation, as Problematic cases puts it, is the verification workflow: recoupling claim to evidence, deliberately.

Further reading

Reflection prompts

Context, attention, and the long session

Teresa Torres writes about context rot: as a session gets longer and noisier, the value of each individual detail in the agent’s context diminishes, and eventually the agent’s behaviour degrades in ways that look like sloppiness or forgetfulness. The pattern is the same one Agentic concepts describes in operational terms; Torres’s article frames it from the user’s side — when to suspect context rot, why “start a fresh session” is cheaper than it feels, and how it shapes the rhythm of working with an agent over the course of a day.

Further reading

Reflection prompts

The lethal trifecta and the new security shape

Simon Willison’s lethal trifecta names the conditions under which prompt injection becomes especially dangerous: an agent that simultaneously has access to your private data, exposure to untrusted content, and the ability to communicate externally. Each leg on its own is usually manageable; together they let an attacker who has hidden instructions inside something the agent reads exfiltrate private data through the agent itself. The operational version of this, along with the practical defaults that follow from it, lives on the Problematic cases page. Read Willison’s original post for the underlying argument and for why it matters more in agentic settings than in single-turn chat. His longer Agentic engineering patterns guide is a useful companion piece on how practitioners are working with these constraints in practice.
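As a rough illustration, the three legs can be written down as a checklist for a single agent session. This is a minimal sketch, not Willison's own formulation or any tool's real API: the `AgentCapabilities` class, its field names, and the two example sessions are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class AgentCapabilities:
    """Hypothetical description of what one agent session is allowed to do."""
    reads_private_data: bool          # e.g. your files, mail, credentials
    sees_untrusted_content: bool      # e.g. web pages, issues, inbound email
    can_communicate_externally: bool  # e.g. HTTP requests, sending messages


def trifecta_legs(caps: AgentCapabilities) -> list[str]:
    """Return the names of the trifecta legs this session has."""
    legs = {
        "private data": caps.reads_private_data,
        "untrusted content": caps.sees_untrusted_content,
        "external communication": caps.can_communicate_externally,
    }
    return [name for name, present in legs.items() if present]


def has_lethal_trifecta(caps: AgentCapabilities) -> bool:
    """True only when all three legs are present at the same time."""
    return len(trifecta_legs(caps)) == 3


# Two hypothetical sessions: one with every leg, one with network access removed.
browsing_agent = AgentCapabilities(True, True, True)
offline_agent = AgentCapabilities(True, True, False)

print(has_lethal_trifecta(browsing_agent))
print(has_lethal_trifecta(offline_agent), trifecta_legs(offline_agent))
```

Removing any one leg breaks the combination, which is why it can help to think in terms of which leg to drop for a given task. A check like this only records what a session is permitted to do; it does not detect or block an injection.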

Further reading

Reflection prompts

Building without purpose, building for joy, building wastelands

Two related observations sit in tension here. On the one hand, agentic tools genuinely lower the barrier to starting a project. Thomas Ptacek’s piece The Emacsification of Software describes a culture in which AI-pilled developers are finally finishing the long list of personal tools they always wanted but never had time to write, building small, idiosyncratic software the same way long-time Emacs users build personal text-editor extensions. Lalit Maganti’s account of building syntaqlite is a concrete worked example: eight years of wanting to build a particular tool, three months of actually building it with AI help. Practitioners sometimes call this pattern SLIP (solves lingering but important problems), and it is one of the clearest reasons agentic tools win sceptics over.

On the other hand, the same friction-lowering makes it easy to accumulate half-finished projects nobody is maintaining. In the wider discussion you will hear terms such as productive uncertainty (making progress without being sure what just happened), compulsive construction (building without a clear goal because the agent makes it cheap to), cognitive overflow (development velocity exceeding developer comprehension), and project wasteland (the graveyard of started-but-abandoned agentic projects). None of these are formal terms, but they are recognisable in conversation.

Further reading

Reflection prompts

Education, literacy, and the duty of care

Mary Kalantzis and Bill Cope argue that generative AI is, more than anything, a technology of writing, and that the consequences for literacy and education may be on the scale of the invention of movable type. Their article is the most ambitious of the readings on this page and is worth reading slowly; they end with a proposal for what they call cyber-social literacy learning. One small observation worth holding on to is their remark that teachers may, ironically, dodge some of the displacement pressure that other knowledge-economy workers face, precisely because part of their job is the duty of care that comes with being with students in person.

Further reading

Reflection prompts

Does AI broaden or narrow the literature you read?

One of the more interesting empirical questions is whether LLM-assisted research broadens or narrows the literature people actually read. The answer so far is “both, depending on scale”. At the level of a single researcher, AI-aided search tools surface newer papers, more books, and sources that traditional keyword search misses; researchers using these tools publish more and cite more broadly. At the level of the field as a whole, however, LLM-suggested references concentrate heavily on already well-cited papers, and the topics that get studied appear to converge. The same tool that expands one person’s reading can contract the collective conversation.

This is not a contradiction; it is a difference of scale. It is worth holding both findings at once when you reason about your own use of these tools.

The clearest single statement of the paradox is in the title of a recent preprint by Hao, Xu, Li and Evans: Artificial Intelligence Tools Expand Scientists’ Impact but Contract Science’s Focus. Their study reports that scientists using AI publish around three times more papers and receive nearly five times more citations — and that the collective volume of scientific topics shrinks measurably. The Cornell Chronicle’s lay summary, reporting on a 2025 Science paper by Kusumegi and colleagues, makes the same individual-level finding: AI-using scientists produced roughly a third more papers on arXiv, with non-native English speakers gaining the largest boost, and AI-powered search tools surfaced newer and more diverse sources than traditional keyword search. A more pessimistic reading comes from Algaba and colleagues, who analysed nearly 275,000 LLM-generated references and found that LLMs systematically reinforce the Matthew effect in citations — they overwhelmingly suggest already-popular papers. Traberg, Roozenbeek, and van der Linden go further still, arguing in Communications Psychology that the rush to study and use AI is producing a scientific monoculture of methods, questions, and viewpoints. Whichever reading persuades you, the takeaway is the same: notice what your own LLM-aided literature search is and is not surfacing.

Further reading

Reflection prompts

Where this page goes next

The list above is not complete and will not stay current on its own. If you come across a piece worth adding — or one that has aged badly and should come down — let the course know. For the operational counterparts to the reflective writing collected here, see Problematic cases of using AI and Verification.