
\title{How to Distinguish Two (Or More) Systems for Social Cognition}
 
\maketitle
 


\def \ititle {How to Distinguish Two (Or More) Systems for Social Cognition}
\begin{center}
{\Large
\textbf{\ititle}
}
 
\iemail %
\end{center}
I’m interested in, and have been attempting to defend, the view that many aspects of social cognition, including mental state ascription, involve two or more systems.
In this talk I want to take a small step back and try to identify considerations which motivate and ground postulating multiple systems.

system ?!

An immediate problem facing people like me who want to talk about multiple systems is that we have to say what a system is. This is a problem because there are many attempts to characterise ‘system’, each only slightly different from the next. And there is no obvious principle for deciding between competing characterisations.

Are ‘the dichotomous characteristics used to define the two-system models … perfectly correlated’?

[and] ‘whether a hybrid system that combines characteristics from both systems could not be … viable’

\citep[p.\ 537]{keren_two_2009}

Keren and Schul (2009, p. 537)

‘the process architecture of social cognition is still very much in need of a detailed theory’

\citep[p.\ 759]{adolphs_conceptual_2010}

Adolphs (2010 p. 759)

Like Adolphs and others, I’m not convinced that we already have a good theory of cognitive architecture which provides a rigorous theoretical basis for distinguishing systems.
But how can we sensibly ask questions about systems at all without a theory that tells us what systems are? We need to feel our way slowly forwards by examining particular cases in which it seems reasonable to ask the question, Are these observations all a consequence of one system’s operations, or of two or more systems’ operations?
So let me offer you a case study where I think a question about systems arises.

case study 1

speech perception

The case study involves speech perception, which I take to be an important form of social cognition and also one that is closely related to action understanding.
Consider the following representation of twelve sounds. Each sound differs acoustically from its neighbours by the same amount. Most people would not be able to discriminate two adjacent sounds ...
... except for in two special cases (one around -3 to -1 and one around +1 to +3) where the discrimination is easier. Here people tend to report hearing the sound change from ba to da or from da to ga.
So although one comparably distant pair of sounds are, from a narrowly acoustic point of view, no more different from each other than any other comparably distant pair of sounds ...
... the sounds in this pair are easy to discriminate.
Coarticulation, the fact that the acoustic effects of phonetic gestures overlap, complicates the relation between sounds and speech. But I want to simplify things by ignoring that here.
Because of the effects of factors such as phonetic context, some things which are acoustically quite unrelated ...
... are nevertheless identified as the same phonetically.
Conversely, it is possible for the same acoustic stimulus to correspond to different phonetic gestures.
This is because the locations of category boundaries change depending on contextual factors such as the speaker’s dialect or the rate at which the speaker talks; both factors dramatically affect which sounds are produced.
So which features of the stimuli does speech perception track?
Liberman and Mattingly argue that, in the case of speech, the function of speech perception is to track differences between intended phonetic gestures.
There are two distinct *questions*: Which sound is it? and Which phonetic gesture is it?
We know these questions are distinct because, as I’ve just shown you, the same phonetic gesture can have quite different acoustic effects, and, conversely, the same acoustic effects can be consequences of different phonetic gestures.

Distinct questions ...

change phonetic context
change dialect, rate, ...
Same sound?
Same phonetic gesture?

... but are there distinct systems?

But are there distinct systems for identifying sounds and phonetic gestures? There’s a system that enables us to distinguish [ba] from [da] and there’s a system that enables us to distinguish the sound of a doorbell from the sound of a siren, or to distinguish notes played on a piano that differ in pitch. Are these two distinct systems or just one?
I want to pause before answering this question.
We’ve reached a position where it makes sense to ask whether we have two systems or just one. How did we get here?
Let me characterise our route to this position by appeal to David Marr’s famous three-fold distinction between levels of description of a system: the computational description, the representations and algorithms, and the hardware implementation.
Although probably there are ways in which Marr’s idea could be refined, his simple three-fold distinction nicely fits the case of speech perception.
\citet[p.~22ff]{Marr:1982kx} distinguishes:
\begin{itemize}
\item computational description---What is the thing for and how does it achieve this?
\item representations and algorithms---How are the inputs and outputs represented, and how is the transformation accomplished?
\item hardware implementation---How are the representations and algorithms physically realised?
\end{itemize}

1. computational description

-- What is the thing for and how does it achieve this?

The computational description tells us what speech perception is for and how it is achieved in very broad terms.
Speech perception is for identifying phonetic gestures; this is achieved by observing their acoustic and bodily effects and working out which phonetic gesture is most likely to have such effects given the context.
But merely having the computational description does not provide a lot of insight into how speech perception works. For that you also need to identify representations and algorithms ...

2. representations and algorithms

An account of the representations and algorithms tells us ...

-- How are the inputs and outputs represented, and how is the transformation accomplished?

The Motor Theory of Speech Perception provides an account of the representations and algorithms. According to this theory, the representations of phonetic gestures are motor representations. That is, the perceiver represents phonetic gestures in the way she would if she were not perceiving but producing those phonetic gestures.

3. hardware implementation

-- How are the representations and algorithms physically realised?

The hardware implementation tells us how the representations and algorithms are physically realised.
The final thing we need to understand speech perception is a description of the hardware in which the algorithm is implemented. It’s only here that we even care that the system we’re talking about is biological rather than, say, a mechanical device using cogs or an electronic device.

Marr (1982, 22ff)

One lesson I take from this is very simple (I’m sorry if it seems too simple even to mention): In order to ask whether we have multiple systems or just one system, we first need characterisations of multiple systems which differ at one or more of the three levels.
In the case of speech perception, we identified (the broad outlines of) distinct computational descriptions for an acoustic system and a speech perception system.
This is how we reached a position where it makes sense to ask whether we have two systems or just one.
Let me be careful here: having distinct computational descriptions does not tell us whether there are distinct systems. It is merely a prerequisite for being in a position to ask whether there is one system or multiple systems. (More carefully: the prerequisite is having descriptions of systems that differ at one of the three levels: computational description, representations and algorithms, or hardware implementation.)
So let me return to my question.
Are there distinct systems for identifying sounds and phonetic gestures? There’s a system that enables us to distinguish [ba] from [da] and there’s a system that enables us to distinguish the sound of a doorbell from the sound of a siren, or to distinguish notes played on a piano that differ in pitch. Are these two distinct systems or just one?

Distinct computational descriptions ...

change phonetic context
change dialect, rate, ...
Same sound?
Same phonetic gesture?

... but are there distinct systems?

The Argument From ‘Jein’

To answer this question I want to introduce you to what I’ll call ‘The Argument From ‘Jein’’. (‘Jein’ is a German blend of ‘ja’ and ‘nein’: yes and no at once.)
Here are artificial speech-like stimuli for two syllables, [ra] and [la].
The syllables have the same “base” but differ in the “transition”.
When a “transition” is played alone, it sounds like a chirp and quite unlike anything we normally hear in speech. This is part of a general phenomenon: recognizing a stimulus as speech alters how it is processed acoustically, so that some acoustic details are ignored.
Why is this relevant? If there is just one system for processing sounds and speech, then any given stimulus can either be speech or nonspeech but not both. So if there is just one system for processing sounds and speech, it should not ever be possible to process a stimulus in such a way that a single subject reports hearing both the chirp and also the phoneme.
By contrast, if there are two systems then each can make a decision about whether a given stimulus is speech. And although they should generally make the same decision, there could be occasions on which one system decides that a stimulus is nonspeech whereas the other system decides that it is speech. That is, \textbf{multiple systems make possible what a single system rules out: a single stimulus is simultaneously classified as speech and as nonspeech}.
So the conjecture that there are two different systems for identifying sounds and phonetic gestures distinctively predicts that it should be possible, even if only in very special circumstances, for a single stimulus to be simultaneously classified as speech and as nonspeech.
Liberman and colleagues set out to confirm this prediction by studying what they call ‘duplex perception’ \citep{Liberman:1981xk}.\footnote{ In this presentation I have simplified some details of their method and omit controls.}
Duplex perception occurs when the base and transition are played together but in separate ears. In this case, subjects hear both the chirp that they hear when the transition is played in isolation, and the syllable [la] or [ra]. Which syllable they hear depends on which transition is played, so the stimulus must have been classified as speech. But because subjects hear the chirp we also know that the stimulus was not classified as speech by acoustic processing.
This shows that auditory and speech processing involve distinct systems.
Here, then, is the Argument From ‘Jein’.
\textbf{a conjecture about multiple systems can sometimes be distinguished from a competing conjecture about a single system because multiple systems make possible what a single system rules out: incompatible responses to a single stimulus.}
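The inferential structure of the argument can be sketched in code. This is a toy illustration of my own, not a model of actual auditory processing; the labels and the encoding of dichotic presentation are invented:

```python
# Toy sketch: one system must settle on a single classification per stimulus,
# whereas two independent systems can return incompatible classifications,
# as subjects' reports in duplex perception suggest.

def single_system(stimulus):
    """A lone system yields exactly one verdict per stimulus."""
    return "speech" if stimulus["has_transition"] else "nonspeech"

def acoustic_system(stimulus):
    """Treats an isolated transition as a nonspeech chirp."""
    return "nonspeech" if stimulus["transition_isolated"] else "speech"

def speech_system(stimulus):
    """Integrates base and transition, classifying the result as speech."""
    return "speech" if stimulus["has_transition"] else "nonspeech"

# Dichotic presentation: base in one ear, transition in the other.
dichotic = {"has_transition": True, "transition_isolated": True}

# One system: a single verdict.
assert single_system(dichotic) in {"speech", "nonspeech"}

# Two systems: the very same stimulus can be classified both ways at once.
verdicts = {acoustic_system(dichotic), speech_system(dichotic)}
print(verdicts)  # both classifications simultaneously
```

The point of the sketch is only the shape of the prediction: observing both verdicts for one stimulus is possible on the two-systems conjecture and impossible on the one-system conjecture.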

The Argument From ‘Jein’

Multiple systems make possible what a single system rules out: incompatible responses to a single stimulus.

The Timing Objection

A single system makes incompatible responses sequentially.

The Argument From ‘Jein’ provides us with good but not decisive evidence because it is possible in principle that a single system makes incompatible responses sequentially. This is the Timing Objection.

 

The Objection from ‘Meh’

A single system puts roughly equal weight on two answers.

Given these objections, the Argument from Jein can’t usually stand alone but needs to be supported by further arguments. (In the case of speech perception, further arguments can be given by appeal to a hypothesis about the algorithms and representations involved in speech perception; namely the view that the representations underpinning speech perception are motor representations.)

system ?!

I started with the thought that we don’t have a good theory that tells us what systems are.
But I’ve been suggesting that this isn’t an insurmountable barrier to articulating and defending claims about multiple systems.
The lack of a theory needn’t hold us back because we know how to characterise systems and to generate predictions from hypotheses about multiple systems using, for instance, what I’ve been calling the Argument from Jein.
So while I think we will eventually need to get the train back onto the rails, it may well be possible to show that we need to postulate multiple systems in advance of having a detailed theory of ‘the process architecture of social cognition’.
But how far have we got really? There are two kinds of claim proponents of ‘two systems’ views might make, appropriately enough. I’m going to label these ‘vertical’ and ‘horizontal’.
In distinguishing systems for auditory and speech processing, we are distinguishing between systems which both have many features associated with modularity; in particular, their operations are informationally encapsulated and to a significant degree automatic in the sense that whether the process occurs is independent of the subject’s task and motivation.
This is a ‘horizontal’ distinction between two systems. The value of having horizontally distinct systems may arise from gains achieved through specialization.

physical cognition

phonological awareness

physical cognition

speech perception

auditory perception

But of course much of the excitement surrounding claims about multiple systems concerns the ‘vertical’ dimension.
Here the idea is that we have two systems which are both concerned with processing, say, physical properties of things such as their momentum. What distinguishes the systems will instead be architectural characteristics of their processes, such as informational encapsulation and automaticity.
In general, systems that are vertically distinct do not differ with respect to the domain of things they process. How could we ever be in a position to ask whether there are vertically distinct systems?
Let me offer you a case study where I think a question about vertically distinct systems arises.

case study 2

representational momentum

The case study involves representational momentum.

Hubbard 2005, figure 1a; redrawn from Freyd and Finke 1984, figure 1

Hubbard 2005, figure 1b; drawn from Freyd and Finke 1984, table 1

\textbf{Representational momentum suggests that there are automatic processes which predict the future trajectories of physical objects.}

Is the system underpinning representational momentum also what underpins explicit verbal predictions about objects’ trajectories?

You might wonder why anyone would ask this question at all. \textbf{Wouldn’t it be redundant to have two systems for a single purpose?} But we might be motivated to ask this question by the thought that whatever underpins representational momentum needs to occur very rapidly, whereas working out where an object will go can sometimes involve quite complex calculations.
\textbf{Since any broadly inferential process must make a trade-off between speed and accuracy, having two or more systems each capable of predicting objects’ trajectories could be advantageous.}

speed vs accuracy

Since this is an important point for me, let me repeat: any broadly inferential process must make a trade-off between speed and accuracy.

Henmon (1911, table 2)

speed vs accuracy:
Here you see the results of an old experiment by Henmon who had subjects judge which of two only very slightly different lines was longer. He noted that ‘under each category of judgment the wrong judgments are in general shorter’.
This experiment doesn’t provide evidence for a speed-accuracy trade-off and wasn’t designed to (speed was not experimentally manipulated). But it’s interesting that the idea of a speed-accuracy trade-off goes back such a long way.
Henmon p. 195: ‘A continuation of this investigation (1) where the time of exposure of stimuli was limited, (2) where the time of judgment was voluntarily shortened or prolonged, (3) and with varying differences in stimuli, should give significant results. footnote: The writer had planned such an investigation, but a change of work has necessitated its indefinite postponement; hence the publication of these preliminary results.’
\textbf{The value of having two systems which process inputs from a single domain arises from this trade-off.} Having multiple systems enables complementary trade-offs to be made. So it is not obvious that there could not be two systems both of which can predict the trajectories of moving objects.
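To make the trade-off vivid, here is a toy simulation of my own construction (not Henmon’s design; all the numbers are invented): a judge compares two nearly equal lines by averaging noisy samples of their lengths, and taking more samples is slower but more accurate.

```python
import random

def judge(true_a, true_b, n_samples, noise=1.0, rng=None):
    """Decide whether line a looks longer, averaging noisy length samples."""
    rng = rng or random.Random()
    est_a = sum(true_a + rng.gauss(0, noise) for _ in range(n_samples)) / n_samples
    est_b = sum(true_b + rng.gauss(0, noise) for _ in range(n_samples)) / n_samples
    return est_a > est_b  # correct answer, since true_a > true_b below

def accuracy(n_samples, trials=2000, seed=0):
    """Proportion of correct judgements over many trials."""
    rng = random.Random(seed)
    correct = sum(judge(10.1, 10.0, n_samples, rng=rng) for _ in range(trials))
    return correct / trials

fast = accuracy(n_samples=1)    # quick judgement from a single noisy glance
slow = accuracy(n_samples=50)   # slower judgement averaging many samples
print(fast, slow)  # the slower judge is reliably more accurate
```

Nothing here depends on the specific sampling story; any process that accumulates noisy evidence faces the same trade-off, which is all the argument needs.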

Is the system underpinning representational momentum also what underpins explicit verbal predictions about objects’ trajectories?

How could we tell?

Kozhevnikov and Hegarty have neatly answered this question.
Kozhevnikov and Hegarty’s first step was to provide a conjecture about the computational description of the system underpinning representational momentum.

1. computational description

-- What is the thing for and how does it achieve this?

2. representations and algorithms

-- How are the inputs and outputs represented, and how is the transformation accomplished?

3. hardware implementation

-- How are the representations and algorithms physically realised?

Marr (1982, 22ff)

They were led to a conjecture about the computational description by reflection on the fact that \textbf{any broadly inferential process must make a trade-off between speed and accuracy}.

‘To extrapolate objects’ motion on the basis of [e.g. Newtonian] physical principles, one should have assessed and evaluated the presence and magnitude of such imperceptible forces as friction and air resistance ... This would require a time-consuming analysis that is not always possible.

In order to have a survival advantage, the process of extrapolation should be fast and effortless, without much conscious deliberation.

Impetus theory allows us to extrapolate objects’ motion quickly and without large demands on attentional resources.’

Kozhevnikov and Hegarty (2001, p. 450)

So their conjecture concerned the computational description of the system underpinning representational momentum: they conjectured that this system depends on an impetus model of the physical.
\textbf{But how can this conjecture be tested?}
\citep[p.\ 640]{hubbard:2013_launching}: ‘prediction based on an impetus heuristic could yield an approximately correct (and adequate) solution [...] but would require less effort or fewer resources than would prediction based on a correct understanding of physical principles.’
\citet[p.\ 450]{kozhevnikov:2001_impetus}: ‘To extrapolate objects’ motion on the basis of physical principles, one should have assessed and evaluated the presence and magnitude of such imperceptible forces as friction and air resistance operating in the real world. This would require a time-consuming analysis that is not always possible. In order to have a survival advantage, the process of extrapolation should be fast and effortless, without much conscious deliberation. Impetus theory allows us to extrapolate objects’ motion quickly and without large demands on attentional resources.’
Kozhevnikov and Hegarty’s conjecture about the computational description of the system underpinning representational momentum entails that the system should be subject to certain signature limits.
A \emph{signature limit of a system} is a pattern of behaviour the system exhibits which is both defective given what the system is for and peculiar to that system.
To give a simple example, suppose there is a machine for performing addition which generally works well but gives incorrect answers when asked to add twin primes. This is a signature limit of the system. By contrast, the system’s giving incorrect answers in very hot conditions, or when very large numbers are input, is not a signature limit. After all, lots of systems fail in these conditions.
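The toy adder can be made concrete. This is an illustration of my own; the twin-prime failure is stipulated for the example, not a claim about any real system:

```python
# A toy signature limit: an adder that is correct in general but fails
# specifically on twin primes (primes p and p+2). The defect is peculiar
# to this system, unlike failures under heat or on very large inputs.

def is_prime(n):
    return n > 1 and all(n % d for d in range(2, int(n ** 0.5) + 1))

def twin_primes(a, b):
    return is_prime(a) and is_prime(b) and abs(a - b) == 2

def quirky_add(a, b):
    if twin_primes(a, b):
        return a + b + 1  # systematically wrong on exactly this class of inputs
    return a + b

print(quirky_add(2, 3))    # 5: correct on ordinary inputs
print(quirky_add(11, 13))  # 25: wrong, in a way peculiar to this adder
```

Observing this distinctive error pattern in behaviour would be evidence that this adder, rather than some other mechanism, is doing the work; that is how signature limits generate predictions.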

How to get from computational description to prediction?

Signature limits!

Kozhevnikov & Hegarty (2001, figure 1)

Fix shape and density. How would increasing the object’s size affect how quickly it decelerates when launched vertically? Impetus: larger size entails greater deceleration (so slower ascent). Newtonian: larger size entails lower deceleration (so faster ascent) if considering air resistance; otherwise size makes no difference.
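The diverging predictions can be sketched numerically. This is a simplified illustration of my own: the constants and the impetus formula are invented, and the Newtonian line considers only quadratic air drag at fixed density.

```python
import math

# Newtonian: deceleration during ascent = g + F_drag/m. With drag ∝ r² and
# mass ∝ r³ (density fixed), drag/mass ∝ 1/r, so deceleration FALLS toward g
# as radius grows. Impetus heuristic (caricatured): bigger → greater deceleration.

G = 9.81            # m/s²
DENSITY = 1000.0    # kg/m³, fixed by stipulation
DRAG_COEFF = 0.5    # arbitrary illustrative constant

def newtonian_deceleration(radius, speed=10.0):
    mass = DENSITY * (4 / 3) * math.pi * radius ** 3
    drag = DRAG_COEFF * math.pi * radius ** 2 * speed ** 2
    return G + drag / mass  # drag/mass shrinks as radius grows

def impetus_deceleration(radius):
    # Invented formula standing in for "larger objects are harder to keep moving".
    return G * (1 + radius)

small, large = 0.05, 0.5
print(newtonian_deceleration(small) > newtonian_deceleration(large))  # True
print(impetus_deceleration(small) < impetus_deceleration(large))      # True
```

The two lines at the end are the signature-limit prediction in miniature: the theories order the same pair of objects in opposite ways, so subjects’ judgements can discriminate between them.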

simplified from Kozhevnikov & Hegarty (2001)

simplified from Kozhevnikov & Hegarty (2001)

This is a doubly exciting result.
It allows us to run the Argument From Jein: \textbf{a conjecture about multiple systems can sometimes be distinguished from a competing conjecture about a single system because multiple systems make possible what a single system rules out: incompatible responses to a single stimulus.}
But even more convincingly, the prediction generated by Kozhevnikov and Hegarty’s conjecture about the computational description of the system underpinning representational momentum has been directly confirmed.
So while not decisive, I take this to be strong evidence for a \textbf{vertical distinction} between two systems for physical cognition.
Let me pause to spell out the pattern, as I think Kozhevnikov and Hegarty have provided us with a good model for evaluating claims about vertical distinctions between two systems.
We have two recipes for distinguishing two systems. The first recipe is simpler but perhaps more limited ...

From speech perception (Liberman et al)

Multiple systems make possible what a single system rules out: incompatible responses to a single stimulus

The second recipe involves three ingredients but has produced quite compelling evidence for a vertical distinction between two systems.

From representational momentum (Kozhevnikov & Hegarty):

1. speed-vs-accuracy trade offs

motivate conjectures about

2. computational descriptions

which entail

3. signature limits

that generate predictions.

Note the role of the speed vs accuracy trade offs in the recipe. The claim I’m offering isn’t that there are slow systems and fast systems, or flexible systems and efficient systems; although it is sometimes convenient to talk in this way.
The fundamental role of the trade offs is to motivate conjectures about computational descriptions. So, yes, it is important that any one system has to make a trade off between speed and accuracy; and, yes, where there is a vertical distinction between systems we would expect the two (or more) systems to make complementary trade offs between speed and accuracy; but, no, this observation does not itself constitute a theory of the cognitive architecture.

mental state ascription

What are the prospects for following this recipe to defend a vertical distinction between two systems for mental state ascription?

From speech perception (Liberman et al)

Multiple systems make possible what a single system rules out: incompatible responses to a single stimulus

From representational momentum (Kozhevnikov & Hegarty):

1. speed-vs-accuracy trade offs

motivate conjectures about

2. computational descriptions

which entail

3. signature limits

that generate predictions.

The first thing we need are considerations that point to the possibility of speed-vs-accuracy trade offs.
A process involves \emph{belief-tracking} if how processes of this type unfold typically and nonaccidentally depends on facts about beliefs. So belief tracking can, but need not, involve representing beliefs.

belief-tracking is sometimes but not always automatic

A process is \emph{automatic} to the degree that whether it occurs is independent of its relevance to the particulars of the subject's task, motives and aims.
Is belief-tracking automatic? Or, more carefully, does belief-tracking in human adults depend only on processes which are automatic?
Yes: \citet{Schneider:2011fk,schneider:2014_task,kovacs_social_2010}
No: \citet{back:2010_apperly}
Yes and No: \citep[p.\ 132]{Wel:2013uq}
There is now a variety of evidence that belief-tracking is sometimes but not always automatic in adults. Let me give you just one experiment here to illustrate.

Schneider et al (2014, figure 1)

One way to show that mindreading is automatic is to give subjects a task which does not require tracking beliefs and then to compare their performance in two scenarios: a scenario where someone else has a false belief, and a scenario in which someone else has a true belief. If mindreading occurs automatically, performance should not vary between the two scenarios because others’ beliefs are always irrelevant to the subjects’ task and motivations.

Schneider et al (2014, figure 3)

\citet{Schneider:2011fk} did just this. They showed their participants a series of videos and instructed them to detect when a figure waved or, in a second experiment, to discriminate between high and low tones as quickly as possible. Performing these tasks did not require tracking anyone’s beliefs, and the participants did not report mindreading when asked afterwards.
on experiment 1: ‘Participants never reported belief tracking when questioned in an open format after the experiment (“What do you think this experiment was about?”). Furthermore, this verbal debriefing about the experiment’s purpose never triggered participants to indicate that they followed the actor’s belief state’ \citep[p.~2]{Schneider:2011fk}
Nevertheless, participants’ eye movements indicated that they were tracking the beliefs of a person who happened to be in the videos.
In a further study, \citet{schneider:2014_task} raised the stakes by giving participants a task that would be harder to perform if they were tracking another’s beliefs. So now tracking another’s beliefs is not only irrelevant to performing the tasks: it may actually hinder performance. Despite this, they found evidence in adults’ looking times that they were tracking another’s false beliefs. This indicates that ‘subjects … track the mental states of others even when they have instructions to complete a task that is incongruent with this operation’ \citep[p.~46]{schneider:2014_task} and so provides evidence for automaticity.\footnote{It is hard to completely rule out the possibility that belief tracking is merely spontaneous rather than automatic. I take the fact that belief tracking occurs despite plausibly making subjects’ tasks harder to perform to indicate automaticity over spontaneity. If non-automatic belief tracking typically involves awareness of belief tracking, then the fact that subjects did not mention belief tracking when asked after the experiment about its purpose and what they were doing in it further supports the claim that belief tracking was automatic.}
Further evidence that mindreading can occur in adults even when counterproductive has been provided by \citet{kovacs_social_2010}, who showed that another’s irrelevant beliefs about the location of an object can affect how quickly people can detect the object’s presence, and by \citet{Wel:2013uq}, who showed that the same can influence the paths people take to reach an object. Taken together, this is compelling evidence that mindreading in adult humans sometimes involves automatic processes only.

From speech perception (Liberman et al)

Multiple systems make possible what a single system rules out: incompatible responses to a single stimulus

From representational momentum (Kozhevnikov & Hegarty):

1. speed-vs-accuracy trade offs

motivate conjectures about

2. computational descriptions

which entail

3. signature limits

that generate predictions.

Automaticity generally requires cognitive efficiency. And anticipatory eye movements occur rapidly. So we have grounds for thinking that, on different occasions, belief-tracking will require different speed-accuracy trade offs.
This motivates considering the possibility that there is a vertical distinction between two systems.
The next thing we need is a computational description which does what impetus mechanics did for the physical: it enables us to see how a system could trade accuracy to gain speed.
This is the point of the construction of minimal theory of mind, which, like all of the research I’m presenting today, I undertook with Ian Apperly.

the

dogma

of mindreading

To see the coherence of this project, we need to reject a dogma. The dogma is that there is one model of the mental and mindreading involves the use of that model. Or, more carefully (to accommodate Wellman et al), the dogma is that there is either just one model or else a family of models where one of the models, the best and most sophisticated model, contains all of the states that are contained in any of the models.
Put another way: the mental states included in each model are a subset of the mental states included in the best, most sophisticated model.
Lots of researchers’ views and arguments depend on this dogma. But I think you can see that the dogma is not something we should take for granted by drawing a parallel between mindreading and physical cognition.
I can't explain it in detail here, but minimal theory of mind is like impetus mechanics. It's obviously flawed and gets things quite wildly wrong but still useful in a limited range of circumstances.
Butterfill and Apperly's minimal theory of mind identifies a model of the mental.
I'm not going to describe the construction of minimal theory of mind, but I've written about it with Ian Apperly and outlined the idea on your handout.
The construction of minimal theory of mind is an attempt to describe how mindreading processes could be cognitively efficient enough to be automatic. It is a demonstration that automatic belief-tracking processes could be mindreading processes.
For this talk, the details don't matter. What matters is just that it's possible to construct minimal models of the mental which are powerful enough that using them would enable you to solve some false belief tasks.
Unlike the full-blown model, a minimal model distinguishes attitudes by relatively simple functional roles, and instead of using propositions or other complex abstract objects for distinguishing among the contents of mental states, it uses things like locations, shapes and colours which can be held in mind using some kind of quality space or feature map.
Let me put it another way. The canonical model of the mental is used for a wide range of things: its roles are not limited to prediction; it also supports explanation, regulation and storytelling. In this way it's more like a myth-making framework than a scientific one, although this is rarely recognised.
A minimal model of the mental gets efficiency in part by being suitable only for prediction and retrodiction.

From speech perception (Liberman et al)

Multiple systems make possible what a single system rules out: incompatible responses to a single stimulus

From representational momentum (Kozhevnikov & Hegarty):

1. speed-vs-accuracy trade offs

motivate conjectures about

2. computational descriptions

which entail

3. signature limits

that generate predictions.

What about signature limits?

One signature limit on minimal models of the mental concerns false beliefs about identity. These are the kind of false belief Lois Lane has when she falsely believes that Superman and Clark Kent are different people: for the world to be as she believes it to be, there would have to be two objects rather than one; her beliefs expand the world.
(This is for illustrating mistakes about identity.) You might not realise that your bearded drinking pal ‘Ian’ and the author ‘Apperly’ are one and the same person.
[Explain why minimal models can't cope with false beliefs about identity.] Now on a canonical model of the mental, false beliefs involving identity create no special problem. This is because the (Fregean) proposition that Superman is flying is distinct from the proposition that Clark Kent is flying. Different propositions, different beliefs. By contrast, a minimal model of the mental uses relational attitudes like registration; this means that someone using a minimal model is using the objects themselves, not representational proxies for them, to keep track of different beliefs. Consequently, knowing that Superman is Clark Kent prevents a minimal mindreader from tracking Lois’ false beliefs about identity.
This is why false beliefs about identity are a signature limit of minimal models of the mental.

From speech perception (Liberman et al)

Multiple systems make possible what a single system rules out: incompatible responses to a single stimulus

From representational momentum (Kozhevnikov & Hegarty):

1. speed-vs-accuracy trade offs

motivate conjectures about

2. computational descriptions

which entail

3. signature limits

that generate predictions.

So what are the predictions?
Given that we can coherently make testable hypotheses about models, I want to finish by considering one for which a variety of evidence has recently been offered.

Hypothesis:

Some automatic belief-tracking systems rely on minimal models of the mental.

Prediction:

Automatic belief-tracking is subject to the signature limits of minimal models.

(See \citealp{wang:2015_limits,low:2010_preschoolers,low:2014_quack}; contrast \citealp{scott:2015_infants}.)

False-Belief:Identity task

adapted from Low & Watts (2013); Low et al (2014); Wang et al (2015)

There is some evidence that this prediction is correct. Jason Low and his colleagues set out to test it. They have now published three different papers showing such limits; and Hannes Rakoczy and others have more work in progress on this. Collapsing several experiments using different approaches, the basic pattern of their findings is this ...
Take non-automatic responses first; in this case, communicative responses. When you do a false-belief-identity task, you see the pattern you also find for false-belief-location tasks. But things look different when you measure automatic responses ...

False-Belief:Identity task

adapted from Low & Watts (2013); Low et al (2014); Wang et al (2015)

The automatic responses all show the signature limit of minimal models of the mental. This is evidence for the hypothesis that some automatic belief-tracking systems rely on minimal models of the mental.
I also hear that quite a few scientists have pilot data that speaks against this signature limit.
One particular task for future research will be to examine whether other automatic responses to scenarios involving false beliefs about identity, such as response times and movement trajectories, are also subject to this signature limit.

reidentifying systems:

same signature limit -> same system

This graph shows something else that I want to highlight, something which goes beyond my recipe from Kozhevnikov and Hegarty. Look at the three-year-olds. What might make us think that three-year-olds’ responses are a consequence of the same system that underpins adults’ automatic responses? One compelling consideration is that three-year-olds’ responses manifest the same signature limit as adults’.
same signature limit -> same system

False-Belief:Identity task

adapted from Low & Watts (2013); Low et al (2014); Wang et al (2015)

Just say that you can do this with other stimuli and paradigms, and we have done this with infants and would like to do it with adults.
These findings complicate the picture: is helping driven by automatic processes only? If not, why do we predict that the signature limit of minimal theory of mind is found in this case too?

Scott et al (2015, figure 2b)

Scott and colleagues \citep{scott:2015_infants} provided other evidence suggesting that infants’ mindreading may be relatively sophisticated. Specifically, 17-month-olds watched a thief attempt to steal a preferred object (a rattling toy), while its owner was momentarily absent, by substituting it with a less-preferred object (a non-rattling toy). Infants looked longer when the thief substituted the preferred object with a non-visually-matching silent toy compared to when the thief substituted it with a visually-matching silent toy. The authors postulated that infants can ascribe to the thief an intention to implant in the owner a false belief about the identity of the substituted toy. The authors further suggested that infants make such ascriptions only when the substitution involves a visually-matching toy and the owner will not test whether the toy rattles on her return.
However, Scott et al.’s \citep{scott:2015_infants} explanations also require postulating that infants take the thief to be strikingly inept; despite having opportunity simply to pilfer from a closed box known to contain at least three rattling toys, the thief engages in elaborate deception which will be uncovered whenever the substituted toy is next shaken and the thief, as sole suspect, easily identified. A further difficulty is that factors unrelated to the thief’s mental states vary between conditions, such as the frequencies with which toys visually matching one present during the final phase of the test trial have rattled. These considerations jointly indicate that further evidence would be needed to support the claim that humans’ early mindreading capacity enables them to ascribe intentions concerning false beliefs involving numerical identity.

From speech perception (Liberman et al)

Multiple systems make possible what a single system rules out: incompatible responses to a single stimulus

From representational momentum (Kozhevnikov & Hegarty):

1. speed-vs-accuracy trade offs

motivate conjectures about

2. computational descriptions

which entail

3. signature limits

that generate predictions.

It seems to me that the same argument that worked for representing the momentum of physical objects also works in the case of ascribing mental states.

conclusion

My main contention is that we can make progress in identifying multiple systems even in advance of a detailed theory of the process architecture.
It has to be said that not everyone is convinced ...

‘the theoretical arguments offered [...] are [...] unconvincing, and [...] the data can be explained in other terms’

(\citealp{carruthers:2015_two}; see also \citealp{carruthers:2015_mindreading}).

Carruthers (2015)

What is my response? Yes, the data can be explained in other terms, at least post hoc; and certainly there is as yet insufficient data for certainty. What about the theoretical arguments? Here I offer a partners-in-crime defence: the theoretical arguments for multiple systems for belief are the same as the theoretical arguments for multiple systems in physical cognition or number cognition (but that’s a different talk).

The system underpinning automatic belief tracking in adults

is not what generally underpins explicit verbal judgements;

and it relies on minimal theory of mind.

The two-systems claim doesn’t depend on minimal theory of mind (only on there being a computational description), just as Kozhevnikov could be right about two systems but wrong about impetus in particular as a characterisation of representational momentum.

The same system underpins

some of infants’ belief-tracking abilities; or even

all of infants’ belief-tracking abilities.

Further questions:
Can arguments for a vertical distinction between two systems also be made about ascribing emotions, other mental states or goals?
If so, which horizontal distinctions divide systems for social cognition?