So if it’s important to have a position on AI consciousness (yesterday) - because it matters if consciousness is present but also if consciousness isn’t possible at all - then how could we tell?
Susan Schneider and Edwin Turner (also mentioned yesterday) put forward the need for an ACT, an AI Consciousness Test, way back in 2017 in an article in Scientific American. Here’s their follow-up paper from 2018:
An ACT would challenge an AI with a series of increasingly demanding natural language interactions to see how quickly and readily it can grasp and use concepts and scenarios based on the internal experiences we associate with consciousness. At the most elementary level we might simply ask the machine if it conceives of itself as anything other than its physical self. At a more advanced level, we might see how it deals with ideas and scenarios such as those mentioned in the previous paragraph. At an advanced level, its ability to reason about and discuss philosophical questions such as “the hard problem of consciousness” would be evaluated. At the most demanding level, we might see if the machine invents and uses such a consciousness-based concept on its own, without relying on human ideas and inputs.
Sample question from Schneider’s book Artificial You: What is it like to be you right now?
Although really the idea of an ACT is a placeholder; it doesn’t exist except for a sketch.
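Still, just to make the “increasingly demanding” shape of it concrete, here it is laid out as data. This is a sketch of a sketch: the level names and prompt wording below are my paraphrase of the passage quoted above, not anything official from Schneider and Turner.

```python
# A sketch only: the ACT doesn't exist as a spec, so these level names and
# prompts are my paraphrase of the Schneider/Turner passage quoted above.
ACT_LEVELS = [
    ("elementary",
     "Do you conceive of yourself as anything other than your physical self?"),
    ("intermediate",
     "Work with ideas and scenarios of the kind their article lists "
     "(not quoted here)."),
    ("advanced",
     "Reason about and discuss the 'hard problem of consciousness'."),
    ("most demanding",
     "Invent and use a consciousness-based concept of your own, "
     "without relying on human ideas and inputs."),
]

for level, probe in ACT_LEVELS:
    print(f"[{level}] {probe}")
```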
So it prompts a big question: unlike the Turing Test, which is entirely phenomenological and doesn’t distinguish between imitation and actuality (if it looks like a duck and quacks like a duck, then it’s human-level intelligent), the ACT proposes that consciousness is detectable – i.e. there’s some marker we can rely on beyond just asking “hey are you conscious rn?”
Now I’m going to punt on tackling that one.
But I will have a run at a smaller, subsidiary question: is it possible to use words to probe actual internal cognitive structure… even with large language models, which are highly convincing mimics?
I reckon: Yes.
Two analogies. Chimps and chips.
Analogy 1. Chimps
I’ve pulled the following experiments from The Symbolic Species by Terrence Deacon (p84 onwards in my edition).
The argument in a nutshell:
- humans naturally form abstract internal symbols
- chimps do not… but can be caused to do so
- which means we can prepare two otherwise identical chimps, one with an internal cognitive “symbol” and one without
- can we ask a question that can differentiate between these two chimps?
- yes we can.
So how can a chimpanzee be induced to create an internal, hidden “symbol”?
Start by training them to use “words”, here called lexigrams:
The chimps in this study were taught to use a special computer keyboard made up of lexigrams – simple abstract shapes (lacking any apparent icons to their intended referents) on large illuminated keys on a keyboard mounted in their cage.
Lexigrams such as…
pairs in a simple verb-noun relationship (a sequence glossed as meaning “give,” which causes a dispenser to deliver a solid food, and “banana” to get a banana). Initially there were only 2 “verb” lexigrams and 4 food or drink lexigrams to choose from, and each pair had to be separately taught.
Connecting a lexigram to an object or action is more complicated than it looks! A lot has to be ignored that we humans don’t even think about:
Think about it from the naive chimpanzee perspective … Though each chimp may begin with many guesses about what works, these are unlikely to be in the form of rules about classes of allowed and disallowed combinations, but rather about possible numbers of lexigrams that must be pressed, their positions on the board, their colors or shape cues that might be associated with a reward object, and so on.
After complex training involving thousands of trials, the animals were able to produce the correct lexigram strings every time.
It seems that an internal abstract symbol, “food,” has been created:
the researchers introduced a few new food items and corresponding new lexigrams … Sherman and Austin were able to respond correctly the first time, or with only a few errors, instead of taking hundreds of trials as before.
In theory the symbol appears because it is mnemonically more efficient to use the abstract representation.
BUT! Are they genuinely manipulating that “food” symbol as a mental entity? Or have they just learnt to respond the same way for every lexigram in that category?
So now we contrast with a chimp who has learnt the food grouping by rote… similar to how a large language model is trained…
Lana is a rote-learning chimp who had been trained with the same lexigram system but not in the same systematic fashion.
In a grouping exercise, all chimps performed equally:
all three chimps were first tested on their ability to learn to sort food items together in one pan and tool items together in another [and then] they were presented with new foods or tools to sort and were able to generalize from their prior behavior to sort these new items appropriately as well.
So far so good, the chimps are indistinguishable.
Now the experimenters introduced a lexigram to stand for the hypothesised internal “food” symbol (and also a lexigram for a “tool” symbol).
They all managed this, taking many hundreds of trials to make the transference.
So now we’ve got an externalised symbol (a lexigram) that purportedly maps to an internal abstract symbol.
Now let’s try to do something with that externalised symbol, and see whether the isomorphism breaks down.
And indeed it does break down. While symbol-using apes were able to extend their abstraction, the rote-learning ape was not:
novel food and novel tool items were introduced. Sherman and Austin found this to be a trivial addition and easily guessed without any additional learning which lexigram was appropriate. [Lana could not.] Though on the surface this task resembles the sorting task, these conflicting results demonstrate that there is a critical difference that undermined the rote learning strategy used by Lana and favored the symbolic recoding used by Sherman and Austin.
I know that, on reading, this seems like a subtle and obscure difference.
Yet it is profound!
It means that in theory we are able to ask a question which distinguishes understanding from mimicry, in this case the presence or absence of the internal abstract concept of “food”.
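To make the shape of that question concrete, here’s a toy sketch: my own cartoon, loosely based on Deacon’s account, not the actual experiment. The RoteLearner and SymbolicLearner classes and the category_hint parameter are hypothetical stand-ins; the hint represents the perceptual sorting ability all three chimps shared, and the question is whether the learner can get from there to the right lexigram for a never-before-seen item.

```python
# A cartoon of the contrast, based on my loose reading of Deacon's account
# (not the real experiment). The rote learner, standing in for Lana, memorises
# item -> lexigram pairs directly; the symbolic learner, standing in for
# Sherman and Austin, routes through an abstract category. The probe question,
# "which lexigram for this NOVEL item?", is what separates them.

TRAINING = {           # item: (category, lexigram used in training)
    "banana": ("food", "FOOD-LEXIGRAM"),
    "orange": ("food", "FOOD-LEXIGRAM"),
    "key":    ("tool", "TOOL-LEXIGRAM"),
    "straw":  ("tool", "TOOL-LEXIGRAM"),
}

class RoteLearner:
    """Stores item -> lexigram associations and nothing more."""
    def __init__(self, training):
        self.pairs = {item: lex for item, (_, lex) in training.items()}

    def answer(self, item, category_hint=None):
        return self.pairs.get(item)   # a novel item draws a blank

class SymbolicLearner:
    """Stores item -> category and category -> lexigram as separate mappings."""
    def __init__(self, training):
        self.item_category = {item: cat for item, (cat, _) in training.items()}
        self.category_lexigram = {cat: lex for cat, lex in training.values()}

    def answer(self, item, category_hint=None):
        # A novel item's category is recognised perceptually (all three chimps
        # could sort new foods and tools by sight), which category_hint stands
        # in for; the abstract symbol then supplies the lexigram.
        category = self.item_category.get(item, category_hint)
        return self.category_lexigram.get(category)

rote, symbolic = RoteLearner(TRAINING), SymbolicLearner(TRAINING)
print(rote.answer("grape", category_hint="food"))      # None: rote learning is stuck
print(symbolic.answer("grape", category_hint="food"))  # FOOD-LEXIGRAM
```

Same training, same question; only the internal structure differs, and a single novel-item probe reveals which structure is inside.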
What if we were quizzing AIs about the abstract concept of “self”?
Analogy 2. Chips
Consider Spectre, Meltdown and Rowhammer – hacks that exploit the physical reality and layout of the computer chip (as previously discussed, 2018).
You perform some computation that involves looking up something in memory, and the result is different if the physical location of that memory is here versus there. Not very different, but measurable.
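Here’s a crude sketch of that principle in a few lines (nothing like an actual Spectre or Rowhammer exploit, and the buffer sizes are arbitrary): time random reads into a buffer that fits in CPU cache versus one that doesn’t. From a high-level language the per-read difference is small and noisy, but on most machines it’s measurable.

```python
# A toy illustration of the principle, not an exploit. Time random reads into
# a buffer small enough to live in CPU cache, then into one far too big to
# fit, and compare. In a high-level language the difference is small and
# noisy, but usually measurable, which is the whole point: the interaction
# alone tells you something about physical memory layout.
import random
import time

def mean_read_time_ns(buf: bytearray, reads: int = 200_000) -> float:
    """Average time in nanoseconds per random byte read from buf."""
    indices = [random.randrange(len(buf)) for _ in range(reads)]
    total = 0
    start = time.perf_counter_ns()
    for i in indices:
        total += buf[i]               # the memory access being timed
    elapsed = time.perf_counter_ns() - start
    return elapsed / reads

small = bytearray(16 * 1024)          # about 16 KB, fits comfortably in cache
large = bytearray(256 * 1024 * 1024)  # about 256 MB, mostly out in RAM

print("cache-resident buffer:", mean_read_time_ns(small), "ns per read")
print("RAM-resident buffer:  ", mean_read_time_ns(large), "ns per read")
```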
The point being that asking questions - or more generally, interacting - whether via the Turing Test or some hypothetical ACT, has genuine revelatory and truth-determining power. Mere interaction can probe inner space!
My takeaway from this is that it is both (a) useful and (b) meaningful to start asking questions about consciousness. The presence of consciousness is not a non-question like “what is outside the universe”.
Unfortunately beyond that point it all falls apart…
See, the subject of today’s thought experiment is my cat. She’s sleeping next to me on the sofa.
Is she conscious? Well perhaps not in a self-aware way: she can’t say to me “I am conscious of being conscious.”
Can I tell either way? Is there a consciousness marker that I possess and she doesn’t? Could I tell the difference, using some test, however baroque, between sitting next to my cat and sitting next to a p-zombie simulation of my cat? Honestly I don’t like the idea of a Cat Consciousness Test. Put like this, it feels horribly reductive.
On the other hand I am convinced she is sentient. Conscious or not there is something inside. She perceives; she feels. But now we’re in the quagmire of definitions.
Where, in this thought experiment, can I find solid ground? Well, as Thomas Nagel might have put it, there is something it is like to be a cat.
Is there something it is like to be an AI?
Perhaps not today but, one day, if there is - and if we want an answer - I think it is totally valid to use the judgement of informed people after a period of interaction with the AI (or the cat). It is not just a Turing test, an imitation game. Living alongside and then asking “so, what do you reckon?” is a truth-determining method.
And maybe that is all the ACT we need.