
Instrumental Convergence

  • Writer: Deric Hollings
  • May 13
  • 8 min read

Photo credit (edited), fair use

 

How many of you remember Clippy (1996-2007, the entire duration of my service in the Marine Corps)? Arguably one of the first artificial intelligence (AI) chatbots with which I recall interacting, Clippy is described by one source as follows:

 

The Office Assistant is a discontinued intelligent user interface for Microsoft Office that assisted users by way of an interactive animated character which interfaced with the Office help content. It was included in Microsoft Office, Microsoft Publisher, Microsoft Project, and Microsoft FrontPage.

 

It had a wide selection of characters to choose from, with the most well-known being a paperclip called Clippit (commonly referred to by the public as Clippy). The Office Assistant and particularly Clippit have been the subject of numerous criticisms and parodies.

 

Before knowing anything about Rational Emotive Behavior Therapy (REBT), I sometimes disturbed myself with irrational beliefs about Clippy. As an example, I unhelpfully believed, “I shouldn’t be inconvenienced, and I can’t stand that this annoying paper clip keeps popping up!”

 

Eventually, I learned about the ABC model and unconditional acceptance (2009-2011). With routine practice of these helpful REBT techniques, I stopped upsetting myself with unfavorable beliefs about AI chatbots and other matters which I deemed mildly annoying.

 

Although I remember first considering the potential negative effects of AI when initially watching the 1999 science fiction action film The Matrix, I didn’t take seriously suggestions by people who claimed that digital relatives of Clippy would potentially pose a problem to humans.

 

That is, until a thought experiment proposed by philosopher Nick Bostrom reached me by way of a 2014 interview. While continuing to use the REBT tools I'd learned, so that I wouldn't disturb myself over possibilities beyond my control and influence, I contemplated what Bostrom stated:

 

Suppose we have an AI whose only goal is to make as many paper clips as possible. The AI will realize quickly that it would be much better if there were no humans because humans might decide to switch it off. Because if humans do so, there would be fewer paper clips.

 

Also, human bodies contain a lot of atoms that could be made into paper clips. The future that the AI would be trying to gear towards would be one in which there were a lot of paper clips but no humans.

 

I consider this proposition rational (in accordance with both logic and reason). To illustrate the rationale underpinning Bostrom's thought experiment, a rationale which I maintain rests on irrational conclusions, consider the following syllogisms:

 

Form (modus ponens) –

If p, then q; p; therefore, q.

 

Example –

Premise 1: If the objective of AI is to make as many paper clips as possible (with whatever material possible), then it’s acceptable to use atoms which comprise human beings when making paper clips.

 

Premise 2: The objective of AI is to make as many paper clips as possible (with whatever material possible).

 

Conclusion: Therefore, it’s acceptable to use atoms which comprise human beings when making paper clips.
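
For readers who like to see logical form machine-checked, below is a minimal sketch in Lean 4 (my own illustration, not anything from Bostrom) demonstrating that the modus ponens form is valid for any propositions p and q, including the paperclip premises above:

```lean
-- Modus ponens, stated abstractly: from "if p, then q" and "p", derive "q".
-- Here p and q may stand for any propositions, e.g. the paperclip premises:
-- p := "the objective of AI is to make as many paper clips as possible"
-- q := "it's acceptable to use atoms which comprise human beings"
theorem modus_ponens (p q : Prop) (h : p → q) (hp : p) : q :=
  h hp
```

Note that the proof never inspects what p and q actually say; the validity of the form is independent of the truth, or reasonableness, of the premises.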

 

Form (hypothetical syllogism) –

If p, then q; if q, then r; therefore, if p, then r.

 

Example –

Premise 1: If it’s acceptable to use atoms which comprise human beings when making paper clips, then in the interest of self-preservation it’s acceptable to terminate humans who would otherwise switch off AI.

 

Premise 2: If in the interest of self-preservation it’s acceptable to terminate humans who would otherwise switch off AI, then the value of human life (out-group) isn’t as valuable as the existence of AI (in-group).

 

Conclusion: Therefore, if it’s acceptable to use atoms which comprise human beings when making paper clips, then the value of human life (out-group) isn’t as valuable as the existence of AI (in-group).
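
The hypothetical form admits the same kind of sketch (again, my own illustration in Lean 4, with invented theorem names): chaining two conditionals is just function composition, which is why the form holds no matter how unreasonable its premises turn out to be:

```lean
-- Hypothetical syllogism: from "if p, then q" and "if q, then r",
-- derive "if p, then r" by composing the two conditionals.
theorem hypothetical_syllogism (p q r : Prop)
    (hpq : p → q) (hqr : q → r) : p → r :=
  fun hp => hqr (hpq hp)
```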

 

I argue that the syllogisms demonstrated herein are logical, though unreasonable. This is because an arguably universal moral and ethical claim holds that human life is widely considered more valuable (worth preserving) than the preservation of AI.

 

Even if one with an ostensibly psychopathic outlook rejects this claim, I suspect that the majority of human beings who’ve ever existed would support my argument. Thus, the syllogisms demonstrated herein are logical and unreasonable, rendering them irrational.

 

“But wait a minute, Deric,” you may respond, “moments ago, you said that you considered Bostrom’s proposition to be ‘rational.’ Aren’t you contradicting yourself?” No. I stated that Bostrom’s thought experiment was rational. Yet, the rationale underpinning it is irrational.

 

Fortunately, people who are more intelligent than I am have focused on Bostrom's thought experiment. For instance, one source claims:

 

Philosophers have speculated that an AI tasked with a task such as creating paperclips might cause an apocalypse by learning to divert ever-increasing resources to the task, and then learning how to resist our attempts to turn it off [and…] to do this, the paperclip-making AI would need to create another AI that could acquire power both over humans and over itself, and so it would self-regulate to prevent this outcome. Humans who create AIs with the goal of acquiring power may be a greater existential threat.

 

Rather than villainizing the descendants of Clippy, it may be more useful to seek understanding of what motivates some fallible people to hypothetically infringe upon instrumental convergence. Regarding this topic, one source states:

 

Instrumental convergence is the hypothetical tendency for most sufficiently intelligent, goal-directed beings (human and nonhuman) to pursue similar sub-goals, even if their ultimate goals are quite different.

 

More precisely, agents (beings with agency) may pursue instrumental goals—goals which are made in pursuit of some particular end, but are not the end goals themselves—without ceasing, provided that their ultimate (intrinsic) goals may never be fully satisfied.

 

Instrumental convergence posits that an intelligent agent with seemingly harmless but unbounded goals can act in surprisingly harmful ways. For example, a computer with the sole, unconstrained goal of solving a complex mathematics problem like the Riemann hypothesis could attempt to turn the entire Earth into one giant computer to increase its computational power so that it can succeed in its calculations.
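
To make this concrete, consider a deliberately toy sketch in Python (my own hypothetical model, not any real AI system; the function name and numbers are invented for illustration). An agent with an unbounded goal converts the entire resource pool, while a bounded goal leaves resources, and the world, intact:

```python
from typing import Optional

# A toy model of instrumental convergence (illustrative only, not a real AI).
# "resources" stands in for anything convertible toward the goal (atoms, energy).

def run_agent(resources: int, target: Optional[int]) -> tuple[int, int]:
    """Convert resources into paper clips until the target is met.

    A target of None models an unbounded goal ("as many as possible").
    Returns (clips_made, resources_remaining).
    """
    clips = 0
    while resources > 0 and (target is None or clips < target):
        resources -= 1  # consume one unit of the shared resource pool
        clips += 1
    return clips, resources

# Unbounded goal: the agent consumes everything available.
print(run_agent(resources=1_000, target=None))  # -> (1000, 0)

# Bounded goal: the agent stops once satisfied, leaving resources unconsumed.
print(run_agent(resources=1_000, target=100))   # -> (100, 900)
```

The point of the toy isn't realism; it's that nothing in the unbounded objective itself tells the agent when to stop, which is the crux of the instrumental convergence worry.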

 

Given this understanding of instrumental convergence, could country X use AI to essentially send country Y back to pre-industrial conditions by attacking the digital components of that advanced culture? Yes, it's possible (being within the limits of ability, capacity, or realization).

 

Furthermore, I maintain that it's plausible (superficially fair, reasonable, or valuable but often deceptively so) that some country X is currently seeking to use AI to this end. Yet, I'm uncertain whether this is probable (supported by evidence strong enough to establish presumption but not proof).

 

Therefore, rather than self-disturbing with unfavorable beliefs about potentiality (the ability to develop or come into existence), I’ll continue approaching this matter with curiosity while attempting to understand what motivates some people to hypothetically infringe upon instrumental convergence. How about you?

 

If you're looking for a provider who works to help you understand how thinking impacts the physical, mental, emotional, and behavioral elements of your life, thereby helping you to sharpen your critical thinking skills, I invite you to reach out today by using the contact widget on my website.

 

As a psychotherapist, I’m pleased to try to help people with an assortment of issues ranging from anger (hostility, rage, and aggression) to relational issues, adjustment matters, trauma experience, justice involvement, attention-deficit hyperactivity disorder, anxiety and depression, and other mood or personality-related matters.

 

At Hollings Therapy, LLC, serving all of Texas, I aim to treat clients with dignity and respect while offering a multi-lensed approach to the practice of psychotherapy and life coaching. My mission includes: Prioritizing the cognitive and emotive needs of clients, an overall reduction in client suffering, and supporting sustainable growth for the clients I serve. Rather than simply trying to help you to feel better, I want to try to help you get better!

 

 

Deric Hollings, LPC, LCSW

 

References:

 

Gans, J. (2018, June 10). AI and the paperclip problem. Centre for Economic Policy Research. Retrieved from https://cepr.org/voxeu/columns/ai-and-paperclip-problem

Hollings, D. (2022, May 17). Circle of concern. Hollings Therapy, LLC. Retrieved from https://www.hollingstherapy.com/post/circle-of-concern

Hollings, D. (2022, October 31). Demandingness. Hollings Therapy, LLC. Retrieved from https://www.hollingstherapy.com/post/demandingness

Hollings, D. (2022, March 15). Disclaimer. Hollings Therapy, LLC. Retrieved from https://www.hollingstherapy.com/post/disclaimer

Hollings, D. (2023, September 8). Fair use. Hollings Therapy, LLC. Retrieved from https://www.hollingstherapy.com/post/fair-use

Hollings, D. (2024, May 17). Feeling better vs. getting better. Hollings Therapy, LLC. Retrieved from https://www.hollingstherapy.com/post/feeling-better-vs-getting-better-1

Hollings, D. (2025, March 5). Five major characteristics of four major irrational beliefs. Hollings Therapy, LLC. Retrieved from https://www.hollingstherapy.com/post/five-major-characteristics-of-four-major-irrational-beliefs

Hollings, D. (2023, October 12). Get better. Hollings Therapy, LLC. Retrieved from https://www.hollingstherapy.com/post/get-better

Hollings, D. (2024, April 13). Goals. Hollings Therapy, LLC. Retrieved from https://www.hollingstherapy.com/post/goals

Hollings, D. (n.d.). Hollings Therapy, LLC [Official website]. Hollings Therapy, LLC. Retrieved from https://www.hollingstherapy.com/

Hollings, D. (2022, November 4). Human fallibility. Hollings Therapy, LLC. Retrieved from https://www.hollingstherapy.com/post/human-fallibility

Hollings, D. (2025, March 16). Hypothetical syllogism. Hollings Therapy, LLC. Retrieved from https://www.hollingstherapy.com/post/hypothetical-syllogism

Hollings, D. (2024, October 21). Impermanence and uncertainty. Hollings Therapy, LLC. Retrieved from https://www.hollingstherapy.com/post/impermanence-and-uncertainty

Hollings, D. (2024, August 21). In-group and out-group distinction. Hollings Therapy, LLC. Retrieved from https://www.hollingstherapy.com/post/in-group-and-out-group-distinction

Hollings, D. (2024, January 2). Interests and goals. Hollings Therapy, LLC. Retrieved from https://www.hollingstherapy.com/post/interests-and-goals

Hollings, D. (2023, September 19). Life coaching. Hollings Therapy, LLC. Retrieved from https://www.hollingstherapy.com/post/life-coaching

Hollings, D. (2023, January 8). Logic and reason. Hollings Therapy, LLC. Retrieved from https://www.hollingstherapy.com/post/logic-and-reason

Hollings, D. (2022, December 2). Low frustration tolerance. Hollings Therapy, LLC. Retrieved from https://www.hollingstherapy.com/post/low-frustration-tolerance

Hollings, D. (2025, March 16). Modus ponens. Hollings Therapy, LLC. Retrieved from https://www.hollingstherapy.com/post/modus-ponens

Hollings, D. (2025, February 4). Money and the power. Hollings Therapy, LLC. Retrieved from https://www.hollingstherapy.com/post/money-and-the-power

Hollings, D. (2023, October 2). Morals and ethics. Hollings Therapy, LLC. Retrieved from https://www.hollingstherapy.com/post/morals-and-ethics

Hollings, D. (2024, February 24). Personal agency. Hollings Therapy, LLC. Retrieved from https://www.hollingstherapy.com/post/personal-agency

Hollings, D. (2024, March 6). Psychopathy. Hollings Therapy, LLC. Retrieved from https://www.hollingstherapy.com/post/psychopathy

Hollings, D. (2024, May 5). Psychotherapist. Hollings Therapy, LLC. Retrieved from https://www.hollingstherapy.com/post/psychotherapist

Hollings, D. (2022, March 24). Rational emotive behavior therapy (REBT). Hollings Therapy, LLC. Retrieved from https://www.hollingstherapy.com/post/rational-emotive-behavior-therapy-rebt

Hollings, D. (2022, November 1). Self-disturbance. Hollings Therapy, LLC. Retrieved from https://www.hollingstherapy.com/post/self-disturbance

Hollings, D. (2023, October 17). Syllogism. Hollings Therapy, LLC. Retrieved from https://www.hollingstherapy.com/post/syllogism

Hollings, D. (2025, February 28). To try is my goal. Hollings Therapy, LLC. Retrieved from https://www.hollingstherapy.com/post/to-try-is-my-goal

Hollings, D. (2025, April 18). Tolerable FADs. Hollings Therapy, LLC. Retrieved from https://www.hollingstherapy.com/post/tolerable-fads

Hollings, D. (2025, January 9). Traditional ABC model. Hollings Therapy, LLC. Retrieved from https://www.hollingstherapy.com/post/traditional-abc-model

Hollings, D. (2024, October 20). Unconditional acceptance redux. Hollings Therapy, LLC. Retrieved from https://www.hollingstherapy.com/post/unconditional-acceptance-redux

Hollings, D. (2025, February 9). Value. Hollings Therapy, LLC. Retrieved from https://www.hollingstherapy.com/post/value

Hollings, D. (2025, May 11). Will AI replace psychotherapists? Hollings Therapy, LLC. Retrieved from https://www.hollingstherapy.com/post/will-ai-replace-psychotherapists

Miles, K. (2014, August 22). Artificial intelligence may doom the human race within a century, Oxford professor says. HuffPost. Retrieved from https://www.huffpost.com/entry/artificial-intelligence-oxford_n_5689858

Wikipedia. (n.d.). Instrumental convergence. Retrieved from https://en.wikipedia.org/wiki/Instrumental_convergence

Wikipedia. (n.d.). Nick Bostrom. Retrieved from https://en.wikipedia.org/wiki/Nick_Bostrom

Wikipedia. (n.d.). Office Assistant. Retrieved from https://en.wikipedia.org/wiki/Office_Assistant

Wikipedia. (n.d.). Riemann hypothesis. Retrieved from https://en.wikipedia.org/wiki/Riemann_hypothesis

Wikipedia. (n.d.). The Matrix. Retrieved from https://en.wikipedia.org/wiki/The_Matrix
