Monkey neurons were stimulated with blue light via electrode penetrations into the brain. The stimulation covered the full RF of each neuron.
A drifting Gabor stimulus appeared on half of the trials. The monkey was rewarded for moving toward the target on both Gabor-present and Gabor-absent trials. This task allowed us to test the stimulus preferences of HS neurons.
Observation of Object Motion
The object motion observed by monkey neurons may be a combination of several features, such as the movement of the individual parts of the target object, its speed, or the way it is held. Neurons that respond to the target object’s movement can also be activated when it is viewed from a different position than usual, or when the target is being moved by another object. This type of observation can be used to help the brain understand the dynamics of an object’s movement, and can thus help in planning or guiding actions.
To determine which visual features are crucial to elicit monkey neuron responses, we conducted control tasks that required the monkey to passively observe 12 stimuli. Four of these were the same as in the basic task (MGI, MGIII, HG and HM), but the remaining 8 were modified versions of these stimuli with their first or second part obscured using black shading. The shaded versions prevented the observation of the context or the beginning of forelimb movement, but they did not occlude actual grasping (or its mimicking). We found that neurons that exhibited HS (i.e., coding for the overarching goal expressed by the video stimulus) were more likely to respond to the masked version of the videos than to the original ones. We also found that these neurons were able to differentiate between the different masked versions, but they did not discriminate between the two video epochs (Video 1 and Video 2).
Summary of receptive field properties of MSTd and MSTl neurons. Each ellipse represents a cross-section through the best-fitting 2-dimensional (2D) Gaussian fit at half-maximal response amplitude. The data points represent the mean receptive field size for each stimulus condition, averaged over 2 monkeys.
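The half-maximal contour mentioned in the caption can be worked out directly. The sketch below is a minimal illustration, assuming an axis-aligned 2D Gaussian (no rotation term) and hypothetical parameter names (`sigma_x`, `sigma_y`, `amp`); it is not the fitting procedure used for the figure. For such a Gaussian, the response falls to half its peak on an ellipse whose semi-axes are sqrt(2 ln 2) times the standard deviations.

```python
import math

def gaussian_2d(x, y, amp, x0, y0, sigma_x, sigma_y):
    """Axis-aligned 2D Gaussian response profile (assumed form, no rotation)."""
    dx = (x - x0) ** 2 / (2.0 * sigma_x ** 2)
    dy = (y - y0) ** 2 / (2.0 * sigma_y ** 2)
    return amp * math.exp(-(dx + dy))

def half_max_ellipse(sigma_x, sigma_y):
    """Semi-axes of the contour where the Gaussian equals half its peak."""
    k = math.sqrt(2.0 * math.log(2.0))  # about 1.177
    return k * sigma_x, k * sigma_y

def ellipse_area(a, b):
    """Area enclosed by an ellipse with semi-axes a and b."""
    return math.pi * a * b

# Hypothetical receptive field: centre at the origin, sigmas in degrees.
a, b = half_max_ellipse(10.0, 5.0)
on_contour = gaussian_2d(a, 0.0, 40.0, 0.0, 0.0, 10.0, 5.0)
print(a, b, on_contour, ellipse_area(a, b))
```

The area of this ellipse is one possible scalar summary of receptive field size; whether the figure uses area, diameter, or another measure is not stated in the text.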
MSTd and MSTl neurons are known to exhibit a wide range of receptive field sizes, with MSTd exhibiting significantly larger receptive fields than MSTl. The difference in receptive field sizes is believed to reflect the differing functions of these two cortical areas. Location-dependent function is also seen in premotor cortex, where stimulation elicits movements to stereotyped postures depending on the location of the stimulating electrode: stimulation at one site causes the monkey to bring its arm in front of its eyes, whereas stimulation at another site induces an open-mouth posture.
Observation of Context
Researchers have identified neurons in monkey premotor cortex that respond differently during the observation of a goal-related or non-goal-related action. These neurons are selective for the overarching goal and not the specific objects or actions that comprise the target goal (Figure 3). They have also been shown to express a predictive ability by activating only when the observed action is likely to be followed by a goal-related action, namely bringing the object to mouth for eating or placing it into a container. This observation-action selectivity has been attributed to the presence of a pathway linking monkey area 45B in the prearcuate cortex with LST and LB1 in the lower bank of STS and with LIPa in the intraparietal sulcus (Moschovakis, 2004; Gerbella et al., 2010).
To identify these neurons, a set of control and task-related stimulation events was used. The stimulus was presented to the monkey on a computer monitor, and its position was controlled with digital output signals that specified the onset and offset of the fixation point, the beginning and end of stimulus presentation, and the time of reward delivery. The monkey was required to passively observe 6 different videos involving a variety of actions, agent behaviors and objects in the scene. Each video was matched to the same control conditions and had one of the 6 overarching goals.
The monkey was then asked to observe the experimenter either eat the food or place it into a container, and to press a button when the appropriate action was observed. If the monkey pressed the button in response to a goal-related action, it received a small amount of water. If the monkey pressed the button in response to a non-goal-related action, it received nothing. In addition, the trial was aborted and a beep was delivered if the monkey’s gaze did not stay within a fixed fixation window during stimulus presentation.
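The trial contingencies described above can be summarized in a short sketch. This is an illustration only: the function names, the square shape of the fixation window, and the handling of trials with no button press (treated here as unrewarded) are assumptions, since the text does not specify them.

```python
def gaze_stayed_in_window(samples, center, half_width):
    """True if every (x, y) gaze sample lies inside a square window.

    The square window and its half-width are assumptions for illustration.
    """
    cx, cy = center
    return all(
        abs(x - cx) <= half_width and abs(y - cy) <= half_width
        for x, y in samples
    )

def trial_outcome(gaze_ok, button_pressed, goal_related):
    """Map one trial onto the outcomes described in the text."""
    if not gaze_ok:
        return "aborted"      # fixation break: beep, trial aborted
    if button_pressed and goal_related:
        return "reward"       # small amount of water
    return "no_reward"        # wrong press, or no press (assumed)

# Hypothetical gaze trace in degrees, with a window of +/- 2 degrees.
trace = [(0.1, -0.3), (0.5, 0.2), (1.9, -1.0)]
ok = gaze_stayed_in_window(trace, center=(0.0, 0.0), half_width=2.0)
print(trial_outcome(ok, button_pressed=True, goal_related=True))
```

Checking fixation before the button press mirrors the order in the text, where a fixation break ends the trial regardless of the response.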
Neurons that responded to the observation of an overarching goal and did not respond to the execution of the same action were called “mirror neurons.” Another class of monkey neurons, which were activated by the observation of a food-eating act but not the execution of the same action, were called “motor-set neurons.” Motor-set neurons appear to fire when they sense that the monkey is about to perform a movement.
Observation of Action
For this experiment, researchers recorded the neuronal activity in the VLPF of two rhesus monkeys while they passively observed video clips depicting human or monkey goal-directed actions. While a previous study had shown that VLPF single neurons code for biological stimuli, this was the first to show that some neurons in this area also respond to the observation of goal-directed actions. Specifically, the research showed that some of these neurons preferentially responded to the observation of monkeys grasping objects from both first and third person perspective compared to other types of grasping and mimicking videos. They further found that these neurons were also more likely to discharge during the observation of the actual goal-directed action rather than its reenactment.
To determine whether this preference for goal-directed action observation was a result of context or merely of specific actions, the researchers used a control task. In this test, the monkey was shown the same 12 videos as in the basic task but with one or both of the first and second epochs obscured using black shading. For example, a neuron that responds more strongly to the observation of a monkey grasping from both first and third person perspective (Video Epoch 1) will decrease its discharge during obscured Video Epoch 2.
The results showed that these VLPF neurons are highly selective with respect to their response to both the action execution and its observation, yet they always display a weak modulation. This suggests that they are not coding contextual information about the actions but instead are using that contextual information to select their behavioral responses.
As the authors explain, this is in line with previous work on F5 showing that it codes not only object movement but also the interpretation of others’ behavior and its implications for oneself. In fact, this finding lends further support to the hypothesis that the role of F5 in understanding the intentions of others is to exploit their motor representations during action observation and execution.
To investigate this further, the authors employed a generative deep learning model called XDREAM that learns to generate images that activate particular neurons in VLPF. By searching a huge hypothesis space without assumptions about features or semantic categories, this model eventually evolved images that strongly resembled real-world objects.
Observation of Self
When we watch a video of ourselves performing an action, some of our neurons are activated. These are the so-called mirror neurons. Initially, researchers thought that these neurons coded a visual representation of our own actions, but it now seems more likely that they code other types of information. For example, orbitofrontal and lateral prefrontal neurons have been shown to be sensitive to the identity of other monkeys and can even predict what other monkeys will choose to do (ref. 49), which is a form of metaperception.
To understand these signals better, scientists analyzed the activity of VLPF neurons using a computational algorithm called XDREAM that takes the firing rate of a particular neuron as input and uses it to guide the evolution of an artificial image. The resulting image looks like noise at first, but over time it mutates, combines and creates new shapes that look more and more like faces or things in the animal’s environment.
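The search loop described above, in which a neuron's response scores candidates that are then mutated and recombined, can be sketched as a simple genetic algorithm. This is not the XDREAM implementation: the image generator is omitted, candidates are plain vectors of numbers, and `toy_firing_rate` is a made-up stand-in for a recorded neuron. All names and parameter values are hypothetical.

```python
import math
import random

def toy_firing_rate(code, preferred=0.5, peak=50.0):
    """Stand-in for a neuron: fires most when every element is near `preferred`."""
    dist2 = sum((c - preferred) ** 2 for c in code)
    return peak * math.exp(-dist2)

def evolve(score_fn, dim=8, pop_size=20, n_gen=50, n_keep=5, mut_sd=0.2, seed=0):
    """Evolve candidate codes so that score_fn (the 'firing rate') increases."""
    rng = random.Random(seed)
    # Start from random codes, the analogue of noise-like initial images.
    pop = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(pop_size)]
    best_history = []
    for _ in range(n_gen):
        ranked = sorted(pop, key=score_fn, reverse=True)
        best_history.append(score_fn(ranked[0]))
        parents = ranked[:n_keep]          # keep the top-scoring candidates
        children = []
        while len(children) < pop_size - n_keep:
            a, b = rng.sample(parents, 2)  # recombine two parents...
            child = [
                (x if rng.random() < 0.5 else y) + rng.gauss(0.0, mut_sd)
                for x, y in zip(a, b)      # ...and mutate each element
            ]
            children.append(child)
        pop = parents + children
    return max(pop, key=score_fn), best_history

best, history = evolve(toy_firing_rate)
print(history[0], toy_firing_rate(best))
```

Because the top candidates are carried over unchanged each generation, the best score in this sketch never decreases; in a real recording session the firing rate is noisy, so that guarantee would not hold.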
The study showed that some neurons are highly selective, responding only to certain videos. The neurons that responded to multiple stimuli were categorized as ‘multi-stimulate’, and their responses were compared with the responses of the other neurons. Moreover, the study also examined the influence of obscuring certain parts of a video on the neuron’s response.
For instance, a neuron that responds exclusively to the observation of a monkey grasping food from a first and third person perspective is categorized as a highly selective (HS) neuron. Its discharge begins before movement onset, peaks during the gripping phase and continues until the completion of the movement. Its discharge decreases when the first epoch is obscured and increases in the second epoch.
Other examples of HS neurons include those that respond to only one of the following videos: a monkey grasping food from first and third person perspectives, a human actor extending his arm in front of himself (Biological Movement, BM), or mimicking this action (Human Grasping, HG). Such a neuron does not respond to the other videos.
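The categories described above can be illustrated with a small classification sketch. It assumes that a neuron counts as responsive to a video when its response exceeds a fixed threshold, and that a highly selective neuron is one driven by exactly one video; the threshold, the response values, and the function name are hypothetical, and the statistical criterion actually used in the study is not given in the text.

```python
def classify_neuron(responses, threshold):
    """Classify a neuron from its per-video responses.

    responses: dict mapping a video label to a response magnitude
               (for example, a baseline-subtracted firing rate).
    Returns (category, list of videos that drove the neuron).
    """
    driven = [video for video, r in responses.items() if r > threshold]
    if not driven:
        return "unresponsive", driven
    if len(driven) == 1:
        return "highly selective", driven
    return "multi-stimulus", driven

# Hypothetical responses (spikes/s above baseline) to four of the videos.
neuron_a = {"MGI": 18.0, "MGIII": 1.2, "HG": 0.4, "HM": 2.1}
neuron_b = {"MGI": 12.0, "MGIII": 9.5, "HG": 0.8, "HM": 7.3}

print(classify_neuron(neuron_a, threshold=5.0))
print(classify_neuron(neuron_b, threshold=5.0))
```

The same function could be applied to responses recorded with the first or second epoch obscured, to check whether a neuron keeps its category when part of the video is masked.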