\title {Origins of Mind: Lecture Notes \\ Joint Action without Mindreading}
 
\maketitle
 

\def \ititle {Origins of Mind}
\def \isubtitle {Joint Action without Mindreading}
 
\
 
\begin{center}
{\Large
\textbf{\ititle}: \isubtitle
}
 
\iemail %
\end{center}
My talk today is about joint action and mindreading but to explain what motivates this research I want to start further back with the notion of reciprocity.

reciprocity

epistemological

phenomenological

onto- and phylogenetic

conceptual

Sometimes there is reciprocity in social cognition. It is not just that one individual is targeting another with her abilities to track actions, perceptions or thoughts. Rather, each individual is targeting her social intelligence at the other so that each is simultaneously both the subject and the target of social intelligence. This kind of reciprocity is characteristic of joint attention and at least some kinds of joint action.
We can distinguish three sorts of question about reciprocity.
The first is epistemological. Are there facts about others' minds or actions we do know but would not be in a position to know if reciprocity were impossible for us? Let me put this another way. It is a familiar idea that observation and inference are two ways of knowing. We can know facts about others' thoughts and actions by observing them, and by drawing inferences. Does reciprocity merely provide additional opportunities to observe and additional premises for inference? Or could there be a further way of knowing, one that essentially involves reciprocity? This is an epistemological question about reciprocity.
The second question concerns the emergence, in development or evolution. Which if any forms of reciprocity play a role in explaining the emergence of sophisticated forms of mindreading and referential communication?
There is also a conceptual question about reciprocity ... The conceptual question is, What forms of reciprocity are there?

reciprocity

One familiar way of characterising reciprocity involves higher-order ascriptions.

reciprocity

The mindreader ascribes to her target beliefs (say) about the mindreader's own beliefs and other mental states.

reciprocity

And if her target reciprocates, she might escalate by ascribing to the target beliefs about her own beliefs about the target's beliefs about her beliefs. While this might be useful in some situations,

reciprocity

there are two limits. First, this model of reciprocity can only apply where mindreaders are capable of metarepresentation, and there may be forms of mindreading that do not involve metarepresentation (or so I have argued elsewhere). Second, on this model of reciprocity the basic intuition goes unsatisfied.

reciprocity

without escalation

Reciprocity in mindreading should sometimes result in something like a meeting of minds rather than an escalation of higher-order ascriptions. The conceptual challenge is to capture this meeting of minds.
A radical alternative to the model of higher-order escalation involves the idea that there are psychological states with plural subjects. This is the idea that there is a single thought, intention or perception that is both mine and yours. This is a shared thought in the strongest sense: we share the thought in just the sense that sisters share a parent. This idea clearly captures an intuition about the meeting of minds, but perhaps unfortunately it's one familiar to Star Trek fans in the form of the Borg. The Borg idea has been endorsed by John Campbell in research on joint attention and by Hans Bernhard Schmid in research on joint action. I aim to be neutral about the Borg, that is, about whether there really are psychological states with plural subjects. I think everyone will agree that the bare appeal to this idea doesn't enable us to answer the questions about epistemology and emergence, perhaps because it sheds no light on the mechanisms that make reciprocity possible.
My broad aim in what follows is to see whether I can identify a model of reciprocity that avoids both higher-order escalation and direct appeal to the Borg idea of psychological states with plural subjects. Hopefully doing this will eventually put us in a position to answer the epistemological, ontogenetic and phylogenetic questions about reciprocity.

Joint Action

Those who have thought about reciprocity have focussed on two main cases, joint attention and joint action. In both cases the challenge of providing a model of reciprocity without higher-order escalation arises, but I think the two cases require quite separate treatment. Today my focus is joint action.
Let me start with a challenge about emergence.
 
\section{A Challenge and a Conjecture}
 
challenge
Explain the emergence of sophisticated human activities including mindreading.
The challenge is to explain the emergence, in evolution or development, of sophisticated forms of human activity including referential communication and mindreading.
A number of researchers have suggested that meeting this challenge requires us to invoke some kind of social interaction ...
According to what Moll and Tomasello call the Vygotskian Intelligence Hypothesis,

‘participation in … leads children to construct uniquely powerful forms of cognitive representation.’

(Moll \& Tomasello 2007)

\citep{Moll:2007gu}

‘perception, action, and cognition are grounded in

(Knoblich \& Sebanz 2006)

\citep[p.\ 103]{Knoblich:2006bn}

‘human cognitive abilities … [are] built upon

(Sinigaglia and Sparaci 2008)

\citep{sinigaglia:2008_roots}
I'm going to assume that they are right.
If we take these ideas seriously, the first question we need to ask is, What kinds of social interaction matter for the emergence of sophisticated human activities?

What kinds of social interaction?  Joint action!

There seems to be some consensus on the idea that joint action is particularly important.
But what is joint action?
A paradigm case of joint action would be two sisters cycling to school together.
By contrast, two strangers cycling side-by-side are performing parallel but merely individual actions.
Or, to take another paradigm case,
when members of a flash mob in the Central Cafe respond to a pre-arranged cue by noisily opening their newspapers, they perform a joint action.
But when someone not part of the mob just happens to noisily open her newspaper in response to the same cue, her action is not part of any joint action.%
\footnote{
See \citet{Searle:1990em}; in his example park visitors simultaneously run to a shelter, in one case as part of dancing together and in another case because of a storm.
Compare \citet{Pears:1971fk} who uses contrast cases to argue that whether something is an ordinary, individual action depends on its antecedents.
}
So the challenge was to explain the emergence of sophisticated human activities including referential communication and mindreading.
conjecture
Joint action plays a role in explaining how sophisticated human activities emerge.
And the conjecture I want to consider, borrowed from a variety of researchers, is that joint action plays a role in explaining how sophisticated human activities emerge.
Now there is a compelling objection to this conjecture.
It will take me a while to explain what the objection is.
The objection arises when we ask what joint action is.
 
\section{Joint Action vs Parallel but Merely Individual Action}
 
We introduced joint action by referring to contrasts between genuine joint action and parallel but merely individual action.
These and other contrast cases invite the question,
\textbf{What distinguishes joint actions from parallel but individual actions?}
 
\textbf{Question} What distinguishes joint actions from parallel but individual actions?
The first contrast case shows that the difference can’t be just a matter of coordination because people who merely happen to be cycling side-by-side also need to coordinate their actions in order to avoid colliding.
Note also that in both cases each individual's cycling is intentional, so our intentionally cycling together cannot be only a matter of our each intentionally cycling.
The second contrast case shows that the difference can’t just be that the resulting actions have a common effect because merely parallel actions can have common effects too.
 
At this point it is natural to appeal to intention.
If we are performing actions of some type phi,
then perhaps for our doing this to be a joint action is just for us to be doing this
because we each intend that we, you and I, phi together.
So in the case of the cycling sisters, each would intend that they, the two sisters, cycle to school together.
I'm going to come back to this idea so it's helpful to give it a label.
Let's call it the Simple Account of intentional joint action.

each sister intends that
they, the sisters, cycle to school together?

Does the appeal to togetherness make this circular? Not as long as we understand 'together' only in the sense in which the three legs of a tripod support the flask \emph{together}.
So we have to understand the intention as concerning an event type that could be a joint action but might also involve merely parallel actions.
 
\textbf{Simple Account}
(Intentional) joint action occurs when there is an act-type, φ, such that each of several agents intends that they, these agents, φ.
This certainly distinguishes the cases on your left from those on your right.
But we can see that the Simple Account is too simple as it stands by adapting an example from Gilbert and Bratman ...
Contrast two friends walking together in the ordinary way,
which is a paradigm case of collective agency,
with a situation where two gangsters walk together but each is forcing the other.
It works like this: Gangster 1 pulls a gun on Gangster 2 and says: “let’s walk”.
But Gangster 2 does the same thing to Gangster 1 simultaneously.
This is walking together in the Tarantino sense,
and clearly not a case of joint action.
At least it’s not joint action unless the central event of Reservoir Dogs is also a case of joint action.
Since in this case there is something which all the agents involved intend, it seems that our being involved in a joint action can't be a matter only of there being something such that we each intend that we do it together.
 
\section{Bratman on Shared Intention}
 
The leading, best developed and most influential way around this problem is due to Michael Bratman. His idea for avoiding this sort of problem is to suggest that we don’t just each intend the action but rather we each intend to act by way of the other's intentions.
We can put this by saying that our intentions must interlock: mine specify yours and yours mine.
Now this appeal to interlocking intentions enables Bratman to avoid counterexamples like the Tarantino walkers; if I intend that we walk by way of your intention that we walk, I suppose I can't rationally also point a gun at you and coerce you to walk.

‘each agent does not just intend that the group perform the […] joint action.

‘Rather, each agent intends as well that the group perform this joint action in accordance with subplans (of the intentions in favor of the joint action) that mesh’

(Bratman 1992: 332)

`each agent does not just intend that the group perform the […] joint action. Rather, each agent intends as well that the group perform this joint action in accordance with subplans (of the intentions in favor of the joint action) that mesh' \citep[p.\ 332]{Bratman:1992mi}.
Our plans are \emph{interconnected} just if facts about your plans feature in mine and conversely.
‘shared intentional [i.e.\ collective] agency consists, at bottom, in interconnected planning agency of the participants’ \citep{Bratman:2011fk}.
In making this idea more precise, Bratman proposes sufficient conditions for us to have a shared intention that we J ...
... the idea is then that an intentional joint action is an action that is appropriately related to a shared intention.

We have a shared intention that we J if

‘1. (a) I intend that we J and (b) you intend that we J

‘2. I intend that we J in accordance with and because of 1a, 1b, and meshing subplans of 1a and 1b; you intend [likewise] …

‘3. 1 and 2 are common knowledge between us’

(Bratman 1993: View 4)

\begin{minipage}{\columnwidth}
\emph{Bratman’s claim}. For you and I to have a collective/shared intention that we J it is sufficient that:
\begin{enumerate}[label=({\arabic*}),itemsep=0pt,topsep=0pt]
\item `(a) I intend that we J and (b) you intend that we J;
\item `I intend that we J in accordance with and because of 1a, 1b, and meshing subplans of 1a and 1b; you intend that we J in accordance with and because of 1a, 1b, and meshing subplans of 1a and 1b;
\item `1 and 2 are common knowledge between us' \citep[View 4]{Bratman:1993je}
\end{enumerate}
\end{minipage}
Note that the conditions require not just that we intend the joint action, but that we intend it because of each other's intentions, where this is common knowledge.
So we need not just intentions about intentions ...
... also you need to know things about my knowledge of your intentions concerning my intentions.
This indicates that, in general, having shared intentions requires mindreading at close to (or perhaps just beyond) the limits of most adult humans' abilities. Bratman's account of shared intention is an example where reciprocity is modeled as higher-order escalation.
And this is a problem for us.
This is a problem because our conjecture was that joint action plays a role in explaining how sophisticated human activities emerge.
objection
Joint action presupposes mindreading at the limits of human abilities.
But if joint action presupposes mindreading at close to the limits of human abilities,
and if mindreading abilities are a paradigm case of humans' cognitive sophistication,
then we must reject the conjecture.
For in appealing to joint action we would be presupposing what was supposed to be explained.
In what follows I want to defend the conjecture by identifying a way around the objection.
But before I do this, I want to mention a second problem with Bratman's account ...

minimal approach

What’s wrong with Bratman's account is exemplified by Beatrice and Baldric. Their problem is that they don’t conceive of their actions as an exercise of shared agency. What we want is some way to capture the sense that agents engaged in shared agency conceive of their actions as exercises of shared agency, without of course going in a circle by appealing directly to shared agency here. At this point it’s tempting to appeal to romantic notions of sharing, or to introduce distinctive ingredients like special modes of thought, special ontological constructs or special kinds of reasoning. I want to suggest a way of capturing the agents’ perspective without any such distinctive ingredients.
And in doing this, I want to try a minimal approach. Instead of starting with a sophisticated notion of shared intention, we try to find a minimal starting point and add ingredients as needed.
 
\section{A Minimal Approach}
 
I want to start with a claim from Kirk Ludwig's semantic analysis.
A \emph{joint action} is an event with two or more agents, as contrasted with an \emph{individual action} which is an event with a single agent \citep[p.\ 366]{ludwig_collective_2007}.
[Grounding]
Events $D_1$, ...\ $D_n$ \emph{ground} $E$, if: $D_1$, ...\ $D_n$ and $E$ occur;
$D_1$, ...\ $D_n$ are each (perhaps improper) parts of $E$; and
every event that is a proper part of $E$ but does not overlap $D_1$, ...\ $D_n$ is caused by some or all of $D_1$, ...\ $D_n$.
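One way to make the three clauses explicit is the following symbolic sketch. The notation is introduced here purely for illustration and is not Ludwig's: $x \sqsubseteq y$ abbreviates `$x$ is a (perhaps improper) part of $y$', $x \sqsubset y$ abbreviates `$x$ is a proper part of $y$', $O(x,y)$ abbreviates `$x$ overlaps $y$', and $C(S,x)$ abbreviates `the events in $S$ cause $x$'. The sketch also assumes that `does not overlap $D_1$, ...\ $D_n$' means `overlaps none of them'. On these assumptions, and given that $D_1$, ...\ $D_n$ and $E$ all occur, $D_1$, ...\ $D_n$ ground $E$ if
\[
\bigwedge_{i=1}^{n} D_i \sqsubseteq E
\quad\mbox{and}\quad
\forall x \Bigl[ \bigl( x \sqsubset E \wedge \bigwedge_{i=1}^{n} \neg O(x, D_i) \bigr) \rightarrow \exists S \bigl( \emptyset \neq S \subseteq \{D_1, \ldots, D_n\} \wedge C(S, x) \bigr) \Bigr].
\]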
For an individual to be \emph{among the agents of an event} is for there to be actions $a_1$, ...\ $a_n$ which ground this event where the individual is an agent of some (one or more) of these actions.
A joint action is an event with two or more agents \citep{ludwig_collective_2007}.
This definition is too broad.
To see why, consider an example.
Nora and Olive killed Fred.
Each fired a shot.
Neither shot was individually fatal but together they were deadly.
An ambulance arrived on the scene almost at once but Fred didn't make it to the hospital.
On the revised simple definition, this event is a joint action just because Nora and Olive are both agents of it.
Now suppose that Nora and Olive have no knowledge of each other, nor of each other's actions, and that their efforts are entirely uncoordinated.
We might even suppose that Nora and Olive are so antagonistic to each other that they would, if either knew the other's location, turn their guns on each other.
The event of their killing Fred is nevertheless a joint action on the revised simple definition.
 
Why is this a problem?
Because it shows that our present characterisation of joint action as an event with two or more agents doesn't match intuitions about contrasts between joint and parallel but merely individual action.
So we need to improve on this.
What is missing from this first attempt at characterising joint action?
Some joint actions are purposive in the sense that
among all their actual and possible consequences,
there are outcomes to which they are directed.
Can we capture the purposive aspect of joint action without invoking shared intention?
To this end, let me introduce the notion of a distributive goal.
A \emph{goal} is an outcome to which actions are, or might be, directed. (Contrast a \emph{goal-state}, an intention or other state of an agent linking an action to a goal to which it is directed.)
An outcome is a \emph{distributive goal} of two or more actions just if
(a) each action is individually directed to this outcome; and
(b) it is possible that: all actions succeed relative to this outcome.
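As a rough gloss on this definition (the predicate names are labels chosen here for illustration, not part of the definition itself): write $\mathrm{Dir}(a, G)$ for `action $a$ is directed to outcome $G$', $\mathrm{Succ}(a, G)$ for `$a$ succeeds relative to $G$', and $\mathrm{Poss}$ for `it is possible that'. Then $G$ is a distributive goal of actions $a_1$, ...\ $a_n$ (with $n \geq 2$) just if
\[
\bigwedge_{i=1}^{n} \mathrm{Dir}(a_i, G)
\;\wedge\;
\mathrm{Poss}\Bigl( \bigwedge_{i=1}^{n} \mathrm{Succ}(a_i, G) \Bigr).
\]
Note that one and the same outcome $G$ appears in every conjunct, so actions directed to merely similar outcomes do not thereby have a distributive goal.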
It's striking that the notion of a distributive goal is able to distinguish many standard contrasts between joint and parallel but merely individual action.
For example, take the strangers cycling the same route side-by-side. Their actions don't have a distributive goal.
Each cyclist's actions are directed at her own arrival. These are different outcomes; after all, if one falls off and gets taken to hospital, then the outcome to which her action was directed will not occur, although the outcome to which the other cyclist's action was directed could still occur.
Could we then say that a joint action is (a) an event with two or more agents where (b) the actions which are parts of this event have a distributive goal?
There are lots of objections to this view. One arises from the two gangsters who force each other to walk. Their actions have a distributive goal but they do not constitute a joint action.
[*CUT: another counterexample to the sufficiency of distributive goals] No. To see why not, consider this case. One dark night two communists each independently intend to paint a large bridge red. More carefully, each intends to perform some action which will ground or partially ground the painting of the bridge red. Because the bridge is large and they start from different ends, they have no idea of the other's involvement in their project until they meet in the middle. Here the conditions on our proposed definition are met: we have (a) an event with two or more agents where (b) the actions which are parts of this event have a distributive goal. But intuitively the bridge painting is not a joint action. This is a problem because we wanted a notion of joint action such that an implicit conception of this notion must be available through reflection on the supposedly paradigmatic cases of joint action.
What are we missing?
Just a moment ago I noted that some joint actions are purposive in the sense that
among all their actual and possible consequences, there are outcomes to which they are directed.
But we can say more than this.
In paradigm cases of joint action, it is the joint action as a whole that is directed to this outcome.
So it is not just a matter of each agent’s individual actions being directed to this outcome.
But where some actions have a distributive goal, this involves barely anything more than each action individually being directed to that goal.
How can we capture the idea that a joint action taken as a whole is directed to an outcome?
To this end let me introduce the notion of a collective goal.
An outcome is a \emph{collective goal} of two or more actions just if
(a) this outcome is a distributive goal of the actions;
(b) the actions are coordinated; and
(c) coordination of this type would normally facilitate occurrences of outcomes of this type.
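Schematically, and using labels that are assumptions of this sketch rather than part of the definition: let $\mathrm{DG}(G; a_1, \ldots, a_n)$ say that $G$ is a distributive goal of the actions in the sense defined earlier, let $\mathrm{Coord}_T(a_1, \ldots, a_n)$ say that the actions are coordinated by coordination of type $T$, and let $\mathrm{Fac}(T, K)$ say that coordination of type $T$ would normally facilitate occurrences of outcomes of type $K$. Then $G$, an outcome of type $K$, is a collective goal of $a_1$, ...\ $a_n$ just if, for some type of coordination $T$,
\[
\mathrm{DG}(G; a_1, \ldots, a_n)
\;\wedge\;
\mathrm{Coord}_T(a_1, \ldots, a_n)
\;\wedge\;
\mathrm{Fac}(T, K).
\]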
[*CUT] The activities of the communist bridge painters that I mentioned earlier do not have a collective goal because they are not coordinated.
Examples of activities that typically have collective goals include uprooting a small tree together and tickling a baby together to make it laugh.
Could we say that a joint action is (a) an event with two or more agents where (b) the actions which are parts of this event have a collective goal?
This depends on further specifying the coordination involved in a collective goal. Let me explain ...
The notion of a collective goal assumes that of coordination. This should be understood in a very broad sense.
For example, when two agents between them lift a heavy block by means of each agent pulling on either end of a rope connected to the block via a system of pulleys, their pullings count as coordinated just because the rope relates the force each exerts on the block to the force exerted by the other.
In this case, the agents' activities are coordinated by a mechanism in their environment, the rope, and not necessarily by any psychological mechanism.
Given that we are working with a broad notion of coordination, we should also allow that when two people walk in the Tarantino sense, the guns provide for the coordination of their actions. So we cannot say that for some actions to be a joint action it is sufficient that they have a collective goal.
To get an interesting notion of joint action, we need to further specify the type of coordination in virtue of which actions have a collective goal.
To make a conjecture based on work with bees and ants, in some cases ...
the coordination needed for a collective goal may even be supplied by behavioural patterns \citep{seeley2010honeybee} and pheromonal signals \citep[pp.\ 178-83, 206-21]{hoelldobler2009superorganism}.
But our concern is with collective goals whose coordination is due to psychological states.
Now I take it that a variety of psychological mechanisms can provide for the coordination needed for actions to have a collective goal. Indeed, one such mechanism is the interlocking structure of intentions and knowledge that Bratman specified in attempting to provide sufficient conditions for shared intention.
Here I want to consider cases where the coordination required for actions to have a collective goal involves motor representations rather than conscious thought.
 
\section{Motor Representation in Joint Action}
 

motor representation

I want to consider cases where the coordination required for actions to have a collective goal involves motor representations rather than conscious thought.
Let me start by stepping back and considering an individual action.
An agent moves a mug from one place to another, passing it from her left hand to her right hand half way [*demonstrate].
It’s a familiar idea that motor representations of outcomes resemble intentions in that they can trigger processes which are like planning in some respects.
These processes are like planning in that they involve starting with representations of relatively distal outcomes and gradually filling in details, resulting in a structure of motor representations that can be hierarchically arranged by the means-end relation \citep{bekkering:2000_imitation,grafton:2007_evidence}.
Processes triggered by motor representations of outcomes are also planning-like in that they involve selecting means for actions to be performed now in ways that anticipate future actions \citep{jeannerod_motor_2006,zhang:2007_planning,rosenbaum:2012_cognition}.
Now in this action of moving a mug, there is a need, even for the single agent, to coordinate the exchange between her two hands.
(If her action is fluid,
she may proactively adjust her left hand in advance of the mug’s being lifted by her right hand \citep[compare][]{diedrichsen:2003_anticipatory,hugon:1982_anticipatory, lum:1992_feedforward}.)
How could such tight coordination be achieved?
Part of the answer involves the fact that motor representations and processes concerning the actions involving each hand are not entirely independent of each other.
Rather there is a plan-like structure of motor representation for the whole action and motor representations concerning actions involving each hand are components of this larger plan-like structure.
It is in part because they are components of a larger plan-like structure that the movements of one hand constrain and are constrained by the movements of the other hand.
But how is any of this relevant to the case of joint action? Could there be a role for motor representation in coordinating joint action?
This would seem unlikely if one's motor representations concerned only one's own actions. But do they? There is at least one sense in which they do not. Outcomes are represented motorically not only in performing actions but sometimes in observing others acting \citep{Rizzolatti:1996eu,rizzolatti_functional_2010}.
This can happen not only when observing a single agent acting alone but even when observing several agents performing a joint action \citep{manera:2013_time}.

motor representations and processes in observation enable predictions

These motor representations trigger planning-like processes in the observer much like those that would occur if she were actually acting \citep{Jeannerod:2001yb,Gangitano:2001ft,ambrosini:2012_tie}.
Further, the occurrence of motor representations and planning-like processes in action observation is no mere quirk: it can enable observers to predict others' movements and to identify outcomes to which their actions are directed, so that it is almost as if the observer were covertly making predictions by planning how someone in the target's situation should proceed \citep{Wolpert:2003mg,kilner:2007_predictive,Costantini:2012fk}.
So much for observation. What about joint action?

in joint action,

motor representations of collective goals

Loehr et al (2013); Ménoret et al (in press); Tsai et al (2011)

which trigger planning-like processes

concerning others' actions

Kourtis et al (2013); Meyer et al (2011)

facilitate interpersonal coordination

Loehr \& Palmer (2011); Novembre et al (2013); Vesper et al (2013)
There is evidence for motor representations, in agents involved in a joint action, of outcomes to which the joint action (or some part of it) is or could be collectively directed. For instance there is evidence that, sometimes, a pianist playing a duet will represent not only her own and her partner's individual contributions but also the chord that she and her partner are supposed to produce together \citep{loehr:2013_monitoring}.%
\footnote{
Further studies using both neurophysiological and behavioural measures indicate that in some joint actions there are motor representations of outcomes to which that joint action could be collectively directed \citep{Menoret:2013fk,tsai:2011_groop_effect}.
}
We also know that these motor representations trigger planning-like motor processes ...
... and that these processes concern not only actions that will be performed by the agent in whom they occur, but also actions that will eventually be performed by others involved in the joint action. There is indeed evidence that such planning-like motor processes sometimes occur in joint action \citep{kourtis:2012_predictive, meyer:2011_joint}.
Further, for a variety of joint actions including passing an object, jumping together and playing a piano duet, motor representations concerning another’s actions can facilitate interpersonal coordination (\citealp[p.\ 9]{kourtis:2012_predictive}; \citealp{loehr:2011_temporal}; \citealp{novembre:2013_motor}; \citealp{vesper:2012_jumping}; \citealp{vesper:2013_our}).%
\footnote{
There is also evidence that when agents are engaged in joint action, they sometimes take into account future actions to be performed by others when choosing how to act now, and do so in much the way they would if they were performing the whole action alone \citep{meyer:2013_higher-order}.
}
If we considered only motoric aspects, it might almost seem as if each agent were planning to perform alone the whole of what is in fact a joint action. \textbf{But how could this provide for coordination of the agents' actions?}

Motor representations concern not only bodily configurations and movements but also more distal outcomes such as the grasping of a mug or the pressing of a switch \citep{butterfill:2012_intention,hamilton:2008_action,cattaneo:2009_representation}.
Some motor processes are planning-like in that they involve deriving means by which the outcomes could be brought about and in that they involve coordinating subplans \citep{jeannerod_motor_2006,zhang:2007_planning}.
Motor processes concerning actions others will perform occur in observing others act \citep{Gangitano:2001ft} (and even in observing several others act jointly \citep{manera:2013_time}) and enable us to anticipate their actions \citep{ambrosini:2011_grasping,aglioti_action_2008}.
In joint action, motor processes concerning actions another will perform can occur \citep{kourtis:2012_predictive, meyer:2011_joint},
and can inform planning for one's own actions \citep{vesper:2012_jumping,novembre:2013_motor,loehr:2011_temporal}.
In some joint actions, the agents have a single representation of the whole action (not only separate representations of each agent's part) \citep{tsai:2011_groop_effect,loehr:2013_monitoring,Menoret:2013fk},
and may each make a plan for both their actions \citep{meyer:2013_higher-order}.
Earlier we considered what is involved in performing an ordinary, individual action, where an agent moves a mug from one place to another passing it between her hands half-way.
Compare this individual action with the same action performed by two agents as a joint action.
One agent takes the mug and passes it to the other, who then places it.
The joint action is like the individual action in several respects.
First, the goal to which the joint action is directed is the same, namely to move the mug from here to there.
Second, there is a similar coordination problem---the agents’ two hands have to meet.
And, third, the evidence we have mentioned suggests that in joint action, motor representations and processes occur in each agent much like those that would occur if this agent were performing the whole action alone.
Why would this be helpful?
 
Suppose the agents' planning-like motor processes are similar enough that, in this context, they will reliably produce approximately the same plan-like structures of motor representations.
Then having a single planning-like motor process for the whole joint action in each agent means that
\begin{enumerate}
\item in each agent there is a plan-like structure of motor representations concerning each of the others’ actions,
\item each agent's plan-like structure concerning another's actions is approximately the same as any other agent's plan-like structure concerning those actions,
\item each agent's plan-like structure concerning her own actions is constrained by her plan-like structures concerning the others' actions.
\end{enumerate}
So each agent’s plan-like structure of motor representations concerning her own actions is indirectly constrained by the other agents' plan-like structures concerning their own actions
by virtue of being directly constrained by her plan-like structures concerning their actions.
In this way it is possible to use ordinary planning-like motor processes to achieve coordination in joint action.
What enables the two or more agents' plan-like structures of motor representations to mesh is not that they represent each other's plans but that they motorically process each other's actions and their own as parts of a single action.
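
As a rough illustration only, and not a model drawn from the studies cited above, the argument can be sketched as a toy program. The sketch assumes, hypothetically, that each agent runs the same deterministic planner over the whole mug-passing action and then executes only the steps assigned to her own role. The planner, the \texttt{Agent} class and the step labels are all invented for this sketch.

```python
from dataclasses import dataclass


def plan_whole_action(goal):
    """Toy stand-in for a planning-like motor process.

    Hypothetical: it returns steps for every agent's part of the action,
    labelled by role. The step labels are invented for illustration.
    """
    if goal != "move mug from here to there":
        raise ValueError("this sketch only knows one goal")
    return [
        ("giver", "grasp mug"),
        ("giver", "carry mug to midpoint"),
        ("receiver", "reach to midpoint"),
        ("receiver", "take mug"),
        ("receiver", "place mug"),
    ]


@dataclass
class Agent:
    role: str

    def plan(self, goal):
        # Each agent plans the whole action, not only her own part.
        return plan_whole_action(goal)

    def own_steps(self, goal):
        # Only the steps for the agent's own role lead all the way to action.
        return [step for role, step in self.plan(goal) if role == self.role]


goal = "move mug from here to there"
giver, receiver = Agent("giver"), Agent("receiver")

# The two plan-like structures match because the planners are alike,
# not because either agent represents the other's plan.
plans_match = giver.plan(goal) == receiver.plan(goal)
print(plans_match)
print(giver.own_steps(goal))
print(receiver.own_steps(goal))
```

Neither agent in the sketch holds any representation of the other's plan; coordination falls out of each running the same planning process over the whole action and acting only on her own part.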
 
So how does the joint action differ from the corresponding individual action?
There are at least two differences.
First, we now have two plan-like structures of motor representations because in each agent there is a planning-like motor process concerning the whole action.
These two structures of motor representations have to be identical or similar enough that the differences don’t matter for the coordination of the agents’ actions---let us abbreviate this by saying that they have to \emph{match}.
The need for matching planning-like structures is not specific to joint action;
it is also required where one agent observing another is able to predict her actions thanks to planning-like motor processes concerning the other’s actions (we mentioned evidence above that this occurs).
A second difference between the joint action and the individual action is this.
In joint action there are planning-like motor processes in each agent concerning some actions which she will not eventually perform.
There must therefore be something that prevents part but not all of the planning-like motor process from leading all the way to action.
Exactly how this selective prevention works is an open question.
We expect bodily and environmental constraints are often relevant.
There may also be differences in how others’ actions are processed motorically \citep[compare][]{novembre:2012_distinguishing}.
\footnote{\citep[p.\ 2901]{novembre:2012_distinguishing}: 'in the context of a joint action—the motor control system is particularly sensitive to the identity of the agent (self or other) of a represented action and that (social) contextual information is one means for achieving this distinction'}
And inhibition could be involved too \citep[compare][]{sebanz:2006_twin_peaks}.
My proposal, then, is this. Speaking motorically, sometimes agents are able to achieve coordination for joint action not by representing each other's plans but by treating each other's actions and their own as if they were parts of a single action.
So perhaps joint action is not always only a matter of intention, knowledge or commitment: perhaps sometimes joint action constitutively involves motor representation.
So let me return to the notion of a collective goal. I suggest that this notion, simple as it is, provides us with the core thing we need to understand purposive joint action. This is not to say that the notion of a collective goal provides us with deep insight into joint action; clearly it does not. Rather, we gain insight by understanding mechanisms of coordination that enable us to further specify ways for our actions to have a collective goal.
I've just argued that these mechanisms of coordination include a certain interagential structure of motor representation.
Let me be explicit about what this structure involves. First, there is an outcome to which a joint action could be collectively directed and in each agent there is a motor representation of this outcome. Second, these motor representations trigger planning-like processes in each agent which result in plan-like hierarchies of motor representations. Third, the plan-like hierarchy in each agent involves motor representations concerning another's actions as well as her own. Fourth, the plan-like hierarchies of motor representations in the agents nonaccidentally match. When all of these conditions are met, the result is an interagential structure of motor representations capable of providing the coordination needed for the actions in question to have a collective goal.
In short, then, some actions have collective goals in virtue of a certain interagential structure of motor representation. When this happens, it is almost as if we achieve coordination not by thinking about each other's plans but by each engaging in motor planning for all of our actions.


So this is my account of joint action without mindreading. These conditions are sufficient for an event with two or more agents to be a joint action: (a) the actions which are part of this event have a collective goal; and (b) the coordination required for the collective goal is provided by an appropriate interagential structure of motor representation.
How do these conditions rule out counterexamples like that involved when two agents walk together in the Tarantino sense? The interagential structure of motor representation ensures that we each treat the whole action as if it were our own. So, if my conditions are met, your pointing a gun at me would be almost like your pointing a gun at yourself in order to force yourself to do something.
challenge
Explain the emergence of sophisticated human activities including mindreading.
conjecture
Joint action plays a role in explaining how sophisticated human activities emerge.
objection
Joint action presupposes mindreading at the limits of human abilities.
This matters for the objection I mentioned earlier. If we characterise joint action in the standard way, by invoking shared intention, then I think it's reasonably clear that performing a joint action will require sophisticated cognitive abilities including mindreading. As we saw, this is fatal for the conjecture that joint action plays a role in explaining how sophisticated human activities emerge.
But we have just seen that we can avoid this objection by recognising that there is a constitutive role for motor representation in characterising joint action. One amazing thing about the recent work on motor representation in joint action is that it not only tells us about mechanisms of coordination; in combination with the minimal framework of distributive and collective goals, it also provides us with a new way of thinking about what joint action is. And one benefit of this is that it allows us to hold on to the conjecture that joint action is a core form of social intelligence, one that may well play a role in explaining how sophisticated human abilities like mindreading and referential communication emerge in evolution or development.


That, anyway, is why I think there can be joint action without mindreading, and why I think it matters that this is possible.
Sometimes interlocking intentions and knowledge states are not enough. Sometimes no amount of forming intentions about others’ intentions and acquiring knowledge of such intentions is sufficient, all by itself, for shared intention. In some cases, joint action requires changing perspective to conceive of your own and others’ actions almost as if they constituted a single action. Perhaps it is these kinds of joint action that matter for understanding the emergence of sophisticated forms of mindreading and referential communication.


 
\section{Appendix: Interconnected vs Parallel Planning}
 
The guiding idea behind Bratman's conditions for shared intention is this:
‘shared intentional agency consists, at bottom, in interconnected planning agency of the participants.’

(Bratman 2011, p. 11)

Facts about your plans feature in my plans & conversely.

We have a shared intention that we J if

‘1. (a) I intend that we J and (b) you intend that we J

‘2. I intend that we J in accordance with and because of 1a, 1b, and meshing subplans of 1a and 1b; you intend [likewise] …

‘3. 1 and 2 are common knowledge between us’

(Bratman 1993: View 4)

parallel planning

You plan our actions, yours and mine, and I plan our actions too

Here, interconnected planning is planning where facts about your plans feature in my plans & conversely. What I've just tried to show is that interconnected planning is not sufficient for joint action. Ayesha's and Ahmed's plans are interconnected and so are Beatrice's and Baldric's, but still each sees the other's actions only as opportunities to exploit and constraints to work around. Now my question is what would be sufficient for joint action ...
Suppose you and I are tasked with moving this table through that door.
In doing this, must my plan take into account facts about your intentions as well as about the weight of the table, width of the door &c?
This case has some special features: (i) there is a single most salient route for the table given our objective; (ii) there is a single most salient way of dividing up the roles between us.
I suggest that, in this situation, neither of us needs to form a plan involving the other's intentions.
The situation makes this redundant.
All we have to plan is how two people in our situations should move the table through the door.
To a first approximation, then, what the situation seems to call for is not that our plans are interconnected but rather that we each make a plan for the table-moving action as a whole. This is the inspiration for the view that we might arrive at sufficient conditions by reflecting on parallel rather than interconnected planning ...


parallel planning

You plan my actions as well as yours, and I do likewise.

In parallel planning, I plan all of our actions and you do the same.
I want to suggest that shared agency sometimes requires only parallel, and not interconnected, planning.