Spatiotemporal processing of real faces is supported by dissociable visual-sensing-modulated neural circuitry
Data files (Jul 09, 2025 version, 2.26 GB total):
- data.tgz (2.26 GB)
- FileExistTable.csv (2.59 KB)
- README.md (27.36 KB)
Abstract
Real faces elicit unique patterns of both visual sensing behavior and neural activity. To investigate the relationship between these phenomena, twenty participants underwent simultaneous acquisition of functional near infrared spectroscopy (fNIRS), electroencephalography (EEG), and eye-tracking while they viewed a real human face or a control robot face. We hypothesized that neural processing of real faces is influenced by patterns of visual acquisition. Regression analyses of fNIRS and eye-tracking revealed real-face-specific modulation of the right lateral (peak-t=3.68, p=0.001) and dorsal (peak-t=3.85, p<0.001) visual streams by fixation duration and dwell time, respectively. Standardized low-resolution brain electromagnetic tomography (sLORETA) identified significant alpha (8-13 Hz) oscillatory activity in the lateral and dorsal parietal clusters during human face viewing, suggesting a role for temporal binding in processing faces. These findings are consistent with our hypotheses and point to dissociable roles for the lateral and dorsal visual streams in live face processing.
This readme file was generated on 2024-09-20 by Dr. Megan Kelley.
GENERAL INFORMATION
Title of Dataset: Simultaneous fNIRS, EEG, and eye tracking data during real live human face viewing and robot face viewing, with different degrees of continuity
Author Information
Name: Megan Kelley
ORCID: 0000-0001-8612-5215
Institution: Yale School of Medicine
Address: 300 George Street, Suite 902, New Haven, CT, 06511, USA
Email: megan.kelley@yale.edu
Principal Investigator Information
Name: Joy Hirsch
ORCID: 0000-0002-1418-6489
Institution: Yale School of Medicine
Address: 300 George Street, Suite 902, New Haven, CT, 06511, USA
Email:
Author/Alternate Contact Information
Name: J. Adam Noah
ORCID: 0000-0001-9773-2790
Institution: Yale School of Medicine
Address: 300 George Street, Suite 902, New Haven, CT, 06511, USA
Email: adam.noah@yale.edu
Date of data collection: Approximate collection dates are 2022-01-01 through 2023-02-01.
Geographic location of data collection: 300 George Street, New Haven, CT, United States.
Information about funding sources that supported the collection of the data:
Title: Basic neural processing mechanisms of live human face viewing
Source: NIH: 1F99NS129174-01A1
PI: Kelley, Megan
Title: Mechanisms of Interpersonal Social Communication: Dual-Brain fNIRS Investigation
Source: NIMH - R01 MH107513
PI: Joy Hirsch, Ph.D.
Title: Mechanisms of Dynamic Neural Coupling during Face-to-Face Expressions of Emotion
Source: NIMH - R01MH119430
PI: Joy Hirsch, Ph.D.
Title: Neural Mechanisms for Social Interactions and Eye Contact in ASD
Source: NIMH, 1R01MH111629-01
PI: Joy Hirsch, Ph.D.
SHARING/ACCESS INFORMATION
Licenses/restrictions placed on the data: None.
Links to publications that cite or use the data: None
Links to other publicly accessible locations of the data: None
Links/relationships to ancillary data sets: None
Was data derived from another source? No
Recommended citation for this dataset:
Kelley, Megan S.; Noah, J. Adam; Zhang, Xian; Hirsch, Joy (2024). Simultaneous fNIRS, EEG, and eye tracking data during real live human face viewing and robot face viewing, with different degrees of continuity. [Dataset]. Dryad.
DATA & FILE OVERVIEW
For this data set, we have included three files: 1) a data.tgz archive containing all raw and exported data collected during the experiment; 2) this README.md file; and 3) a text file, FileExistTable.csv, which gives an overview of the dataset indicating which files are present or missing for each subject.
During data collection subjects completed two data recording visits on separate days. On one visit, the task was completed with a human partner and on the other visit, with a robot partner. Partner order was counterbalanced per subject.
Each visit consisted of two runs each of four conditions (described below). The order of conditions was counterbalanced across subjects, with each subject completing the same condition order during both visits.
File List:
The types of files included are briefly listed below. For full details on names and details specific to each file, see DATA-SPECIFIC INFORMATION section.
- FNIRS data files: csv files containing oxyhemoglobin, deoxyhemoglobin, and total concentration for each channel at each time point. Data were collected with a 6 ms sample time per channel.
- FNIRS channel location files: csv file containing MNI coordinates for each fNIRS channel.
- EEG data files: csv files containing scalp voltage for each channel, at a sampling rate of 256 Hz.
- EEG channel location files: csv file containing the MNI coordinates for each EEG channel.
- Eye tracking data files: csv file containing the location, timestamp, and classification of eye behavior during the tasks. These data were collected at 120 Hz.
File names follow a format indicating visit type, condition, and run. The format for each file name is included in DATA-SPECIFIC INFORMATION section.
Visit H: the task was completed with a human partner. Both fNIRS data and channel location files will start with H for human.
Visit R: the task was completed with a robot partner. Both fNIRS data and channel location files will start with R for robot.
For each visit there are four conditions; each run consists of 15 events. In all conditions, events consist of 3200 ms of face viewing time, subdivided differently depending on condition.
Condition 1 - NF: also called the “no flicker (NF)” condition, events consist of one stimulus epoch that is 3200 ms.
Condition 2 - LF: also called the “long epoch with flicker (LF)” condition, events consist of two stimulus epochs that are 1600 ms, with a 200 ms disruption between them.
Condition 3 - SF: also called the “short epoch with flicker (SF)” condition, events consist of four stimulus epochs that are 800 ms, with a 200 ms disruption between each.
Condition 4 - XF: also called the “extra short epoch with flicker (XF)” condition, events consist of eight stimulus epochs that are 400 ms, with a 200 ms disruption between each.
In all cases, the stimulus is a view of the partner’s face, enabled by the smart glass divider between the participant and the partner turning transparent.
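As an illustration, the epoch structure above can be written down programmatically. The following Python sketch (illustrative only; the epoch counts, durations, and 200 ms disruption are taken from the condition descriptions above) computes the face-viewing epoch onsets within a single event:

```python
# Number of epochs and epoch duration (ms) per condition, per the
# descriptions above; disruptions between epochs last 200 ms.
EPOCHS = {"NF": (1, 3200), "LF": (2, 1600), "SF": (4, 800), "XF": (8, 400)}
DISRUPTION_MS = 200

def epoch_onsets_ms(condition):
    """Onset of each face-viewing epoch relative to event onset, in ms."""
    n_epochs, epoch_ms = EPOCHS[condition]
    return [i * (epoch_ms + DISRUPTION_MS) for i in range(n_epochs)]

print(epoch_onsets_ms("SF"))  # [0, 1000, 2000, 3000]
```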
Filenames contain a three-letter code indicating condition and visit. For example, NFR indicates the NF condition (condition 1) for the robot visit; XFH indicates the XF condition (condition 4) for the human visit.
Each condition is conducted twice per partner. The filename will contain run1 or run2.
Additional related data collected that was not included in the current data package: None.
Are there multiple versions of the dataset? No
If yes, name of file(s) that was updated:
Why was the file updated?
When was the file updated?
METHODOLOGICAL INFORMATION
Description of methods used for collection/generation of data:
Paradigm
Participants completed two runs of a face viewing paradigm with two partners. The face viewing paradigm consisted of alternating periods of face viewing and no face viewing. Participants were seated 140 centimeters (cm) in front of and facing a partner with a “Smart Glass” divider (Hohofilm, Shanghai, China) at the midpoint between them. The divider can be made transparent or opaque under computer control, with a switching latency of ~10 ms. Participants faced straight ahead, keeping a consistent head posture while the opacity of the smart glass was toggled between transparent and opaque, allowing or preventing them from seeing their partner. When they could see their partner, participants were instructed to look at the face, allowing their eyes to move as felt comfortable and natural, and when they could not see it, they were instructed to keep their eyes open and focused on a dot centered on the divider.
Runs consisted of five task blocks alternating with 15-s rest blocks. Task blocks were divided into three face-viewing events which alternated with 3-s no-face-viewing periods. Face viewing events for all conditions consisted of 3200 ms of total face viewing time each; however, the continuity of view varied based on condition. During condition 1, face viewing events consisted of a single 3200-ms epoch of continuous view of the partner’s face with no disruptions. During condition 2, the 3200 ms of face viewing time was subdivided into two 1600-ms face viewing epochs with a single 200-ms disruption to face view. Condition 3 events consisted of four 800-ms face viewing epochs with three 200-ms disruptions to face view. Condition 4 events consisted of eight 400-ms face viewing epochs with seven 200-ms disruptions to face view. Disruptions occurred by “flickering” the smart glass from transparent to opaque for 200 ms, and then quickly back to transparent.
The face viewing paradigm was a modification of similar methods described previously (Dravida et al., 2020; Hirsch et al., 2017, 2022; Kelley et al., 2021; Noah et al., 2020; Parker et al., 2023).
FNIRS, EEG, and Eye tracking data collection
Data were collected via simultaneous fNIRS, EEG, and eye tracking while individuals viewed either a real human partner’s face or a robot partner’s face.
fNIRS data were collected via a multichannel continuous-wave system (LABNIRS, Shimadzu Corporation, Kyoto, Japan) consisting of forty emitter-detector optode pairs. During the task, optodes were connected to a cap placed on the participant’s head based on size to fit comfortably. For consistency in cortical coverage, the middle anterior optode was placed 1 cm above the nasion; the middle posterior optode was placed in line with the inion; and the CZ optode was aligned with the anatomical CZ. After cap placement, hair was cleared from optode holders using a lighted fiber-optic probe (Daiso, Hiroshima, Japan) prior to optode placement. Optodes were arranged in a matrix, contacting the scalp, enabling acquisition of 128 channels. After optode placement and prior to beginning the experiment, signal-to-noise ratio was assessed by measuring attenuation of light for each channel, with adjustments made as needed (Noah et al., 2015; Tachibana et al., 2011).
fNIRS signal acquisition, optode localization, and signal processing were similar to methods described previously (Dravida et al., 2020; Hirsch et al., 2017, 2022; Kelley et al., 2021; Noah et al., 2020).
Eye-movements were recorded using a desk-mounted Tobii Pro (Stockholm, Sweden) X3-120 eye-tracking system placed 70 cm in front of and slightly below the participant’s face. Eye behavior was recorded at 120 Hz. The eye tracker was calibrated for each participant using a transparent plane with three dots placed around the face of the partner. Participants were instructed to look at each dot in turn, and each gaze angle was recorded. Calibration was confirmed by having participants look at each eye and the nose of the partner and confirming alignment. Synchronized scene video capturing the participant’s view of the partner was recorded at 30 Hz with a resolution of 1280x720 pixels using a Logitech c920 camera (Lausanne, Switzerland) positioned directly behind and above the participant’s head. This enabled tagging of participant looking behavior within a manually placed “face box.”
EEG data were acquired via a 256-Hz, 32-electrode dual-bioamplifier g.USBamp system (g.tec Medical Engineering, Austria). The electrode layout was adapted from the 10-10 system to accommodate optode placement on the fNIRS cap. Saline conducting gel was manually placed for each electrode after optode placement to ensure scalp contact. Scalp contact was manually reviewed per electrode using a digital oscilloscope, and adjustments were made as needed.
Recording of optode locations
After completion of the tasks, locations of optodes and electrodes were recorded for each participant using the Structure Sensor scanner (Boulder, CO, USA), which creates a 3D model (.obj file) of the participant’s head and cap. Locations of the standard anatomical landmarks nasion, inion, cz, t3, and t4, as well as optode locations, were manually placed on the 3D model using MATLAB. Electrode locations were then determined by calculating the midpoint between surrounding optodes. Locations were then corrected for cap drift using custom MATLAB scripts which rotated optode and electrode locations around the Montreal Neurological Institute (MNI) X-axis from the left ear towards the midline (Eggebrecht et al., 2012; Okamoto & Dan, 2005). This was done to bring the cz optode in line with the anatomical cz according to original placement, to account for stereotyped tilting of the cap towards the left ear that could occur during optode removal.
Environmental/experimental conditions: During the experiment, the overhead lights of the room were extinguished. An experimenter was present out of the view of the participant during each run. Two directed lights were used to fully illuminate the partner’s face and eliminate shadows. Two diffuse bar-lights were used to softly illuminate the opaque SmartGlass during no-face-viewing periods to suppress the participant’s reflection. Participants only reported being able to see the rough outline of their head in the SmartGlass, with no internal features visible. These lights were on throughout run durations. Partners had comparable luminance (Human partner: 42.4 lumens; Robot partner: 42.6 lumens), with slightly lower luminance than the opaque smart glass (44.0 lumens). Luminance was measured with a luxmeter positioned at a typical eye height, placed 70 cm from and facing the smart glass divider and 140 cm from the partner.
Methods for processing the data: Data included in the data.tgz archive are raw, unprocessed data.
Instrument- or software-specific information needed to interpret the data:
Data are presented in a generic text file format, making them broadly accessible for analysis in a wide array of software. The software used by the lab is described here. fNIRS data were collected via a multichannel continuous-wave LABNIRS system, producing OMM files that were converted to text files, which can be analyzed using MathWorks MATLAB with the NIRS-SPM package (Ye et al., 2009). EEG data were collected with a 256-Hz, 32-electrode dual-bioamplifier g.USBamp system (g.tec Medical Engineering, Austria) and are analyzable through MathWorks MATLAB with the EEGLAB extension (Swartz Center for Computational Neuroscience, California, USA) (Delorme & Makeig, 2004). Eye-tracking data were collected with a Tobii Pro (Stockholm, Sweden) X3-120 eye-tracking system and are analyzable with Tobii Pro Lab.
Standards and calibration information, if appropriate:
During experimental set up, optode holders were cleared of hair using a lighted fiber-optic wand to ensure scalp contact. After fNIRS optode placement and prior to beginning the experiment, signal-to-noise ratio was assessed by measuring attenuation of light for each channel, with manual adjustments made as needed. After optode placement, saline conducting gel was manually placed in each EEG electrode to ensure scalp contact. Scalp contact was reviewed using an oscilloscope, and adjustments were made as needed. Eye tracking calibration was completed using Tobii Pro Lab’s calibration protocol: a vertical transparent plane was placed in alignment with the tip of the partner’s nose, three calibration points were positioned around the edge of the partner’s face, and participants were instructed to view each point in turn.
Describe any quality-assurance procedures performed on the data: Optode and electrode connectivity were reviewed and adjusted prior to starting the experiment and were further monitored over the course of the experiment so that connectivity issues could be addressed between runs if needed. After data collection, EEG data were manually reviewed and bad channels were removed and replaced with an interpolation calculated from the remaining channels. Channel 32 was removed for all subjects due to stereotyped connectivity issues arising from the cap design. Eye tracking was calibrated at the start of each visit using a transparent plane with three dots placed in alignment with the tip of the partner’s nose. Participants were instructed to focus on each dot in turn to calibrate looking behavior, and calibration was then confirmed by checking that the recorded gaze locations matched the locations the participant was instructed to look at.
People involved with sample collection, processing, analysis, and/or submission: Dr. Megan Kelley designed the experiment, participated in data collection, conducted analyses of the data, and produced this document. Lab post-graduate associates including Sophie Gardephe and Julianna Brenner were primarily responsible for fNIRS, EEG, and eye tracking set up and data collection, with assistance from other lab members as needed. Dr. J. Adam Noah was responsible for maintenance of the EEG, fNIRS, and eye tracking hardware and software; he also assisted in the design and implementation of the experiment, as well as data collection and analyses. Dr. Xian Zhang assisted in data analysis and software upkeep, and produced the code responsible for running the experiment and synchronizing the modalities. Dr. Joy Hirsch oversaw the project.
DATA-SPECIFIC INFORMATION
FileExistTable.csv is an organizational table indicating which files are present or missing.
- Each column corresponds to a subject.
- Each row corresponds to a file.
- Values in the table are binary indicating the presence or absence of each file for each subject.
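A minimal Python/pandas sketch for checking the table (the assumption that the first column holds the file names should be verified against the actual header):

```python
import pandas as pd

# Rows are files, columns are subjects; values are 1 (present) or 0 (missing).
# Assumes the first column holds the file names -- verify against the header.
table = pd.read_csv("FileExistTable.csv", index_col=0)

for subject in table.columns:
    missing = table.index[table[subject] == 0].tolist()
    if missing:
        print(f"Subject {subject} is missing {len(missing)} file(s): {missing}")
```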
For all file names, the following are used as place holders:
- VISIT indicates the type of partner the task is with, robot or human. The value of visit can be either R or H.
- ID indicates subject number, ranging from 01 to 21.
- CONDITION can be 1, 2, 3, or 4, corresponding with NF, LF, SF, and XF conditions respectively.
- RUNID indicates the first or second run and can be either 1 or 2.
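For example, expected fNIRS file names (format given in the fNIRS data section below) can be composed from these placeholders; the two-digit zero-padding of ID is an assumption to check against FileExistTable.csv:

```python
def fnirs_filename(visit, subject_id, condition, run):
    """Compose an fNIRS file name from the placeholders described above.

    visit: "H" or "R"; subject_id: 1-21; condition: 1-4; run: 1 or 2.
    The zero-padding of the subject ID is an assumption -- check it against
    the names listed in FileExistTable.csv.
    """
    return f"{visit}_{subject_id:02d}_c{condition}_r{run}_fnirs.csv"

print(fnirs_filename("H", 3, 1, 2))  # H_03_c1_r2_fnirs.csv
```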
fNIRS data
The format of the name of fNIRS files:
VISIT_ID_cCONDITION_rRUNID_fnirs.csv
A description of the contents of fNIRS files:
- Each row is a sample.
- Column 1 is time in seconds.
- Column 2 is the trigger value. There are two types of triggers in the files. The main ones relevant to the data set are those with a value >0 and <3000, which indicate the onset of the stimulus.
- Column 3 is the oxyhemoglobin concentration of ch1.
- Column 4 is the de-oxyhemoglobin concentration of ch1.
- Column 5 is the total-oxyhemoglobin concentration of ch1.
- Column 6 is the oxyhemoglobin concentration of ch2.
- Column 7 is the de-oxyhemoglobin concentration of ch2.
- […]
- The final column is the total-oxyhemoglobin concentration of ch134.
Columns indicated by […] continue in the pattern of three columns per channel corresponding to oxyhemoglobin, de-oxyhemoglobin, and total-oxyhemoglobin concentration in that order. There are a total of 134 channels.
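A Python sketch for loading and reshaping one fNIRS run (assuming the file has no header row; adjust if it does):

```python
import pandas as pd

# Load one fNIRS run (no header row assumed).
data = pd.read_csv("H_01_c1_r1_fnirs.csv", header=None).to_numpy()

time_s  = data[:, 0]          # column 1: time in seconds
trigger = data[:, 1]          # column 2: trigger values
chromo  = data[:, 2:]         # remaining columns: 3 per channel

# Reshape to (samples, channels, chromophores); chromophore index
# 0 = oxyhemoglobin, 1 = deoxyhemoglobin, 2 = total, following the
# column order described above.
n_channels = chromo.shape[1] // 3
chromo = chromo.reshape(len(time_s), n_channels, 3)

oxy_ch1 = chromo[:, 0, 0]     # oxyhemoglobin time course of channel 1

# Stimulus onsets: triggers with values > 0 and < 3000.
onset_times = time_s[(trigger > 0) & (trigger < 3000)]
```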
The format of the name of fNIRS channel location files:
VISIT_ID_xyz.csv
A description of the contents of fNIRS channel location files:
- Columns 1, 2, and 3 correspond to the MNI X – Y – Z coordinates, respectively.
- Rows correspond to channels; there are 134 rows, one per channel.
EEG data
The format of the name of EEG data files:
VISIT_ID_EEGData.csv
A description of the contents of EEG data files:
- Columns correspond to EEG channels.
- Rows correspond to samples, at a sample rate of 256 Hz.
The format of the name of EEG channel location files:
VISIT_ID_EEGxyz.csv
A description of the contents of EEG channel location files:
- Columns 1, 2, and 3 correspond to the MNI X – Y – Z coordinates, respectively.
- Rows correspond with channels. Channel names are, in row order: fp1, fp2, af3, af4, f7, f3, fz, f4, f8, fc5, fc1, fc2, fc6, t7, c3, cz, c4, t8, cp5, cp1, cp2, cp6, p7, p3, pz, p4, p8, po3, po4, o1, oz, and o2.
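A Python sketch for loading an EEG recording and labeling its columns with the channel names above (assuming the file has no header row; if channels were dropped for a given subject the column count may differ, so verify before applying the names):

```python
import pandas as pd

FS = 256  # Hz, EEG sampling rate

# Channel order taken from the channel-location description above.
CHANNELS = ["fp1", "fp2", "af3", "af4", "f7", "f3", "fz", "f4", "f8",
            "fc5", "fc1", "fc2", "fc6", "t7", "c3", "cz", "c4", "t8",
            "cp5", "cp1", "cp2", "cp6", "p7", "p3", "pz", "p4", "p8",
            "po3", "po4", "o1", "oz", "o2"]

eeg = pd.read_csv("H_01_EEGData.csv", header=None)
assert eeg.shape[1] == len(CHANNELS), "column count differs from channel list"
eeg.columns = CHANNELS
eeg.index = eeg.index / FS    # sample index -> seconds

oz_trace = eeg["oz"]          # e.g., the occipital midline channel
```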
The format of the name of EEG event file:
VISIT_ID_EEGevent.csv
A description of the contents of EEG event files:
- Rows correspond to each event.
- Column 1 is the sample number
- Column 2 is the condition, using the following legend.
1 = NF : the duration of one stimulus is 3200 ms
2 = LF : the duration of one stimulus is 1600 ms
3 = SF : the duration of one stimulus is 800 ms
4 = XF : the duration of one stimulus is 400 ms
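A Python sketch for reading the event file and converting sample numbers to seconds (assuming no header row):

```python
import pandas as pd

FS = 256  # Hz, EEG sampling rate
CONDITIONS = {1: "NF", 2: "LF", 3: "SF", 4: "XF"}

events = pd.read_csv("H_01_EEGevent.csv", header=None,
                     names=["sample", "condition"])
events["time_s"] = events["sample"] / FS
events["label"] = events["condition"].map(CONDITIONS)
print(events.head())
```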
Eye Tracking data
The format of the name of the eye tracking data file:
VISIT_ID_cCONDITION_rRUNID_eyetracking.csv
A description of the contents of eye-tracking files:
- Rows correspond to samples
- Columns 1 and 2 are time information: column 1 is time since the recording was started and column 2 is the timestamp on the computer.
- Columns 3-22 are descriptive information about the recording, software, date, and calibration accuracy. They will be identical for all samples in a file.
  - Column 3 contains technical information about the acquisition sensor.
  - Columns 4-12 are self-explanatory.
  - Columns 13 and 14 contain software details, including the name of the filter used to classify fixation behavior and the software version number. The filter used was the default Tobii Pro filter.
  - Columns 15 and 16 contain the spatial resolution of the recording.
  - Columns 17-22 contain average calibration accuracy and precision metrics.
    - Accuracy is the average difference between the real stimulus position and the measured gaze position when looking at that stimulus position. It is given in column 17 in millimeters and in column 20 in degrees.
    - Precision is the ability of the eye tracker to reliably reproduce the same gaze point measurement when looking at the same stimulus position, measured as root mean square (RMS). It is given in column 19 in millimeters and in column 22 in degrees. Precision SD is the spread of the measurements taken when looking at the same stimulus position, given in column 18 in millimeters and in column 21 in degrees.
- Column 23 is the timestamp on the eye tracker.
- Columns 24 and 25 contain event information, including triggers indicating when the smart glass was transparent (and thus the face was visible) and when the smart glass turned opaque, occluding the face. Column 24 contains the classification of the event, in terms of whether it was an event entered using a keyboard (e.g., a button press). Column 25 contains the actual value of the event. Events were marked using a custom Python UDP broadcast tool that communicated directly between the computer running the paradigm and the computer running the Tobii Pro Lab software.
- Columns 26-31 contain the XY coordinates of the gaze points in the plane of calibration. 26 and 27 contain coordinates of the average of the two eyes, 28 and 29 of the left eye alone, and 30 and 31 of the right eye alone.
- Columns 32-37 are the XYZ coordinates of the gaze vectors from the left and right eyes to the point in the environment that is in focus.
- Columns 38 and 39 are the pupil diameter of the left and right eyes.
- Columns 40 and 41 are the validity of the left and right eyes. Valid is 1 and no eye present is a value of 0.
- Columns 42-47 are the XYZ coordinates of the left and right eyes in the Display Area Coordinate System (DACS). DACS is a 3D coordinate system with its origin in the top left corner of the stimuli area on the plane of calibration, with Y pointing downward (a coordinate that is lower on the stimuli area will have a higher Y-value), X pointing horizontally across the calibration plane, and Z pointing towards the participant (A gaze point closer to the participant than the calibration plane will have a positive value). They are given in millimeters. See https://connect.tobii.com/s/article/Display-Area-Coordinate-System-DACS?language=en_US for description.
- Columns 48-53 are the XYZ coordinates of the gaze points in the media coordinate system (MCS). These columns should not be used for analysis as the definition of media was not standardized between participants.
- Column 54 is the classified eye behavior for each data point. Possible values include Saccade, Fixation, Microsaccade, Tremor, Drift, Smooth pursuit, Vergence, and Vestibular-ocular reflex. Descriptions of each can be found at the following link: https://www.tobii.com/resource-center/learn-articles/types-of-eye-movements
- Column 55 is the duration of the discrete behavioral event containing the data point. This value will be the same for all samples classified as belonging to the same event.
- Column 56 is a numeric indicator corresponding to the event type in column 55.
- Columns 57 and 58 are the XY coordinates of the fixated location on the calibration plane.
- Column 59 is not used in this experiment.
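A Python sketch for pulling the most commonly used columns out of an eye-tracking file. Column numbers above are 1-indexed, so column N is position N-1 in pandas; whether the file includes a header row is an assumption to verify (pass `header=None` if it does not):

```python
import pandas as pd

et = pd.read_csv("H_01_c1_r1_eyetracking.csv")

gaze_xy  = et.iloc[:, 25:27]   # columns 26-27: averaged gaze X, Y on the calibration plane
validity = et.iloc[:, 39:41]   # columns 40-41: left/right eye validity (1 = valid, 0 = no eye)
behavior = et.iloc[:, 53]      # column 54: classified eye behavior (Fixation, Saccade, ...)
duration = et.iloc[:, 54]      # column 55: duration of the containing behavioral event

# Keep samples in which at least one eye was detected.
valid_samples = et[(validity.iloc[:, 0] == 1) | (validity.iloc[:, 1] == 1)]
```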
Missing Data
Eye-tracking calibration could not be obtained for all subjects because some participants wore glasses or for other reasons. Where calibration could not be obtained, eye-tracking data are marked as missing.
Other missing data are due to equipment malfunction during the experiment.
References
Delorme, A., & Makeig, S. (2004). EEGLAB: an open source toolbox for analysis of single-trial EEG dynamics including independent component analysis. Journal of Neuroscience Methods, 134(1), 9–21.
Dravida, S., Noah, J. A., Zhang, X., & Hirsch, J. (2020). Joint Attention During Live Person-to-Person Contact Activates rTPJ, Including a Sub-Component Associated With Spontaneous Eye-to-Eye Contact. Frontiers in Human Neuroscience, 14. https://doi.org/10.3389/fnhum.2020.00201
Eggebrecht, A. T., White, B. R., Ferradal, S. L., Chen, C., Zhan, Y., Snyder, A. Z., Dehghani, H., & Culver, J. P. (2012). A quantitative spatial comparison of high-density diffuse optical tomography and fmri cortical mapping. Neuroimage, 61(4), 1120–1128. https://doi.org/10.1016/j.neuroimage.2012.01.124
Hirsch, J., Zhang, X., Noah, J. A., Dravida, S., Naples, A., Tiede, M., Wolf, J. M., & McPartland, J. C. (2022). Neural correlates of eye contact and social function in autism spectrum disorder. PLOS ONE, 17(11), e0265798. https://doi.org/10.1371/journal.pone.0265798
Hirsch, J., Zhang, X., Noah, J. A., & Ono, Y. (2017). Frontal, temporal, and parietal systems synchronize within and across brains during live eye-to-eye contact. NeuroImage, 157, 314–330. https://doi.org/10.1016/j.neuroimage.2017.06.018
Kelley, M., Noah, J. A., Zhang, X., Scassellati, B., & Hirsch, J. (2021). Comparison of human social brain activity during eye-contact with another human and a humanoid robot. Frontiers in Robotics and AI.
Noah, J. A., Ono, Y., Nomoto, Y., Shimada, S., Tachibana, A., Zhang, X., Bronner, S., & Hirsch, J. (2015). fMRI Validation of fNIRS Measurements During a Naturalistic Task. Journal of Visualized Experiments : JoVE, 100. https://doi.org/10.3791/52116
Noah, J. A., Zhang, X., Dravida, S., Ono, Y., Naples, A., McPartland, J. C., & Hirsch, J. (2020). Real-time eye-to-eye contact is associated with cross-brain neural coupling in angular gyrus. Frontiers in Human Neuroscience, 14, 19. https://doi.org/10.3389/fnhum.2020.00019
Okamoto, M., & Dan, I. (2005). Automated cortical projection of head-surface locations for transcranial functional brain mapping. Neuroimage, 26(1), 18–28. https://doi.org/10.1016/j.neuroimage.2005.01.018
Parker, T. C., Zhang, X., Noah, J. A., Tiede, M., Scassellati, B., Kelley, M., McPartland, J., & Hirsch, J. (2023). Neural and visual processing of social gaze cueing in typical and ASD adults. medRxiv, 2023.01.30.23284243.
Tachibana, A., Noah, J. A., Bronner, S., Ono, Y., & Onozuka, M. (2011). Parietal and temporal activity during a multimodal dance video game: An fNIRS study. Neuroscience Letters, 503(2), 125–130. https://doi.org/10.1016/j.neulet.2011.08.023
Ye, J. C., Tak, S., Jang, K. E., Jung, J., & Jang, J. (2009). NIRS-SPM: Statistical parametric mapping for near-infrared spectroscopy. NeuroImage, 44(2), 428–447. https://doi.org/10.1016/j.neuroimage.2008.08.036
1.1. Face viewing paradigm
Participants completed two runs of a face viewing paradigm with two partners (Figure 1). The paradigm is similar to methods described previously35–37,63–65. The face viewing paradigm alternated periods of face viewing and no face viewing (Figure 1A). Participants were seated 140 centimeters (cm) in front of and facing a partner with a “Smart Glass” divider (Smart Glass Country, Vancouver, Canada) at the midpoint between them. The divider can be made transparent or opaque under computer control, with a switching latency of ~10ms. Participants faced straight ahead, keeping a consistent head posture while the opacity of the smart glass was toggled between transparent and opaque, allowing or preventing them from seeing their partner. When they could see their partner, participants were instructed to look at the face, allowing their eyes to move as felt comfortable and natural, and when they could not see it, they were instructed to keep their eyes open and focused on a dot centered on the divider.
Runs (Figure 1B) consisted of five 16-second (s) long task blocks alternating with 15-s rest blocks. Task blocks were divided into three 3.4-s face-viewing events (orange lines) which alternated with 3-s no-face-viewing periods (short blue lines). Task blocks were subdivided because extended face viewing can cause social discomfort that confounds results. Events were further subdivided into epochs, described in section 1.2.
1.2. Face viewing epochs and “flicker” disruption to face view
The previously developed face viewing paradigm35–37 was modified by segmenting the face viewing events into two 1600-millisecond (ms) face viewing epochs separated by a 200-ms (opaque divider) disruption to face view (Figure 1, grey inset). This leverages the high temporal resolution of EEG, enabling comparison of the average initial and subsequent epochs to each other. This serves two exploratory purposes. First, it allows built-in replication of results if the same pattern emerges across both epochs. Second, it allows assessment of repetition suppression. While the second epoch on its own is not interpretable due to the inability to dissociate the neural impact of face viewing, disruption, and autocorrelation, a decrease in activity during the second epoch would suggest sensitivity to face viewing continuity or repetition suppression.
1.3. Human and robot partners
Participants performed the task with two partners: a real human and a dynamic robot called Maki36,66,67. Partner order was counterbalanced, and conditions with partners were completed on different days. The human partner allowed their eyes to move over the participant’s face as felt comfortable and natural, while maintaining a neutral expression; a steady breathing and blinking pattern; and a still posture. The clothing, hairstyle, and make-up of the human partner were allowed to vary. In all cases, the human partner was a young adult female.
The robot partner Maki (Figure 1, black inset) was a 3D-printed bust designed by HelloRobo (Atlanta, Georgia) which simulates human head and eye-movements but otherwise lacks human features, movements, or reactive capabilities. Maki was chosen due to the design emphasis on eye movements, as well as its similarity to the overall size and organization of the human face, controlling for the appearance, immediacy, and movements of a human face36,68. Maki’s eyes have comparable components to human eyes: whites surrounding a colored iris with a black “pupil.” Maki’s movements were driven by six servo motors controlled through the Arduino IDE, creating six degrees of freedom: the head turns left-right and tilts up-down; the left eye moves left-right and up-down; the right eye moves left-right and up-down; and the eyelids open and close. During runs, Maki engaged in a pseudorandom pattern of naturalistic blinks69 and saccade- and fixation-like eye movements. “Fixations” occurred at one of nine points of a three-by-three grid, with “saccades” from one point to another determined pseudo-randomly. Movements were driven by custom scripts written using MATLAB 2019a (MathWorks, Massachusetts, USA). Before performing the task, participants were introduced to Maki through a practice run, familiarizing them with its movements. This was to minimize neural effects related to novelty or surprise. Maki’s head angle was also manually adjusted until the participant reported that the robot was looking at them.
1.4. Participants
Twenty-one participants (11 female, mean age = 35.6 years) were recruited using publicly posted flyers, internet postings, and word of mouth. They were enrolled in the order that they expressed interest and passed a finger-thumb-tapping screening procedure to establish fNIRS signal validity70. Nineteen subjects completed the task with both the human and robot partners, while two completed it with a single partner—one with human and one with robot, resulting in a total of n=20 per partner. The subject pool was further limited as follows: one subject was excluded from all regression analyses as the eye tracker could not detect their eyes; two subjects were excluded from human regression analyses due to, in one case, insufficient fNIRS optode-scalp contact during human runs and, in the other, a technical error in eye tracking data collection during human runs.
1.5. Environment
During the experiment, the overhead lights of the room were extinguished. An experimenter was present out of the view of the participant during each run. Two directed lights were used to fully illuminate the partner’s face and eliminate shadows. Two diffuse bar-lights were used to softly illuminate the opaque SmartGlass during no-face-viewing periods in order to suppress the participant’s reflection. Participants only reported being able to see the rough outline of their head in the SmartGlass, with no internal features visible. These lights were on throughout run durations. Partners had comparable luminance (Human partner: 42.4 lumens; Robot partner: 42.6 lumens), with slightly lower luminance than the opaque smart glass (44.0 lumens). Luminance was measured with a luxmeter positioned at a typical eye height, placed 70 cm from and facing the smart glass divider and 140 cm from the partner.
1.6. Multimodal data collection
fNIRS, EEG, and eye-tracking data were collected simultaneously to assess cortical hemodynamics, neural oscillations, and eye behavior during the face viewing task.
1.6.1. Functional Near Infrared Spectroscopy (fNIRS)
FNIRS signal acquisition, optode localization, and signal processing were similar to methods described previously35–37,63,64. Data were collected via a multichannel continuous-wave system (LABNIRS, Shimadzu Corporation, Kyoto, Japan) consisting of forty emitter-detector optode pairs. During the task, optodes were connected to a cap placed on the participant’s head based on size to fit comfortably. For consistency in cortical coverage, the middle anterior optode was placed 1 cm above the nasion; the middle posterior optode was placed in line with the inion; and the CZ optode was aligned with the anatomical CZ. After cap placement, hair was cleared from optode holders using a lighted fiber-optic probe (Daiso, Hiroshima, Japan) prior to optode placement. Optodes were arranged in a matrix, contacting the scalp, enabling acquisition of 128 channels. After optode placement and prior to beginning the experiment, signal-to-noise ratio was assessed by measuring attenuation of light for each channel, with adjustments made as needed71,72.
1.6.2. Eye-tracking
Eye-movements were recorded using a desk-mounted Tobii Pro (Stockholm, Sweden) X3-120 eye-tracking system placed 70 cm in front of and slightly below the participant’s face. Eye behavior was recorded at 120 Hz. The eye tracker was calibrated for each participant using a transparent plane with three dots placed around the face of the partner. Participants were instructed to look at each dot in turn, and each gaze angle was recorded. Calibration was confirmed by having participants look at each eye and the nose of the partner and confirming alignment. Synchronized scene video capturing the participant’s view of the partner was recorded at 30 Hz with a resolution of 1280x720 pixels using a Logitech c920 camera (Lausanne, Switzerland) positioned directly behind and above the participant’s head. This enabled tagging of participant looking behavior within a manually placed “face box” (Section 1.7.2).
1.6.3. Electroencephalography (EEG)
EEG data were acquired via a 256-Hz, 32-electrode dual-bioamplifier g.USBamp system (g.tec Medical Engineering, Austria). The electrode layout was adapted from the 10-10 system to accommodate optode placement on the fNIRS cap. Saline conducting gel was manually placed for each electrode after optode placement to ensure scalp contact. Scalp contact was manually reviewed per electrode using an oscilloscope, and adjustments were made as needed.
1.6.4. Recording of individual optode and electrode placement
After completion of the tasks, locations of optodes and electrodes were recorded for each participant using the Structure Sensor scanner (XRPro LLC, Saratov, RU) which creates a 3D model of the participant’s head and cap73,74. Locations of the standard anatomical landmarks nasion, inion, cz, t3, and t4 as well as optode locations were manually placed on the 3D model. Electrode locations were then determined by calculating the midpoint between surrounding optodes. Locations were then corrected for cap drift using custom MATLAB scripts which rotated optode and electrode locations around the Montreal Neurological Institute (MNI) X-axis from left ear towards the midline. This was done to bring the cz optode in line with the anatomical cz according to original placement in order to account for stereotyped tilting of the cap towards the left ear that could occur during optode removal. Participant scans were normalized to MNI coordinates75 using NIRS-SPM76.
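The cap-drift correction is a rotation of all optode and electrode coordinates about the MNI X-axis. A minimal numpy sketch of that geometric step (illustrative only; the published correction used the authors' own custom MATLAB scripts):

```python
import numpy as np

def drift_angle(measured_cz, anatomical_cz):
    """Angle (radians) about the X-axis that maps the measured cz optode
    onto the anatomical cz in the Y-Z plane."""
    return (np.arctan2(anatomical_cz[2], anatomical_cz[1])
            - np.arctan2(measured_cz[2], measured_cz[1]))

def rotate_about_x(points, angle):
    """Rotate an Nx3 array of MNI coordinates about the X (left-right) axis."""
    c, s = np.cos(angle), np.sin(angle)
    rot = np.array([[1.0, 0.0, 0.0],
                    [0.0,   c,  -s],
                    [0.0,   s,   c]])
    return points @ rot.T

# corrected = rotate_about_x(optode_xyz, drift_angle(measured_cz, anatomical_cz))
```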
1.7. Preprocessing and analyses
A flow chart of preprocessing and analysis steps is shown in supplemental Figure S1.
1.7.1. fNIRS Preprocessing and main effect analysis
Analyses were conducted using NIRS-SPM76 and custom scripts in MATLAB 2019a. Raw fNIRS optical density data were converted to changes in relative chromophore concentrations using a Beer-Lambert equation77,78. Baseline drift was removed using NIRS-SPM wavelet detrending79. Global components attributable to blood pressure and other systemic effects80 were removed using a principal component analysis spatial filter81,82. Main effect general linear models (GLM)83 of face viewing > no-face viewing were constructed by convolving a boxcar model of events and rests (Figure 1B) with the canonical hemodynamic response function provided in SPM884. Face viewing events consisted of both the initial and subsequent face viewing epochs separated by the 200-ms disruption (Figure 1, grey inset). GLMs were subsequently fit to preprocessed data, providing beta values for each channel per participant and partner. Individual channel beta values were projected into normalized voxel space, and voxels were group averaged and then rendered on the MNI brain. Analyses were performed on the combined OxyHb-deOxyHb signal85,86, which is calculated by adding the absolute values of the concentration changes of OxyHb and deOxyHb. This combined signal reflects, through a single value, the expected and well-established task-related anticorrelated increase in OxyHb and concurrent decrease in deOxyHb87. For each partner, task-related activity was determined by contrasting voxel-wise activity during task blocks (Figure 1B, orange lines) with that during rest blocks (blue lines), which identifies cortical regions showing more activity during partner face viewing than during no face viewing. The main effect GLM was used to assess replication of prior findings of increased right supramarginal gyrus activity during real human face viewing as compared to robot face viewing36.
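For illustration, the combined OxyHb-deOxyHb signal and a main-effect boxcar regressor can be sketched as follows. This is not the published pipeline (which used NIRS-SPM and the canonical HRF shipped with SPM8); the double-gamma HRF here is a common stand-in, and the 3.4-s event duration is taken from the paradigm description above.

```python
import numpy as np
from scipy.stats import gamma

def combined_signal(oxy, deoxy):
    """Combined OxyHb-deOxyHb signal: |dOxyHb| + |dDeoxyHb|."""
    return np.abs(oxy) + np.abs(deoxy)

def hrf(dt, duration=32.0):
    """Double-gamma HRF as a stand-in for the SPM8 canonical HRF."""
    t = np.arange(0, duration, dt)
    h = gamma.pdf(t, 6) - gamma.pdf(t, 16) / 6.0
    return h / h.sum()

def main_effect_regressor(n_samples, dt, event_onsets_s, event_dur_s=3.4):
    """Boxcar over face-viewing events convolved with the HRF."""
    boxcar = np.zeros(n_samples)
    for onset in event_onsets_s:
        boxcar[int(onset / dt):int((onset + event_dur_s) / dt)] = 1.0
    return np.convolve(boxcar, hrf(dt))[:n_samples]
```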
1.7.2. Eye tracking preprocessing and behavioral analyses
Eye tracking data were processed using the Tobii Velocity-Threshold Identification Gaze Filter with default parameters88. Eye behavior was calculated from the average of both eyes. Noise reduction was completed using a moving median filter over three samples. Velocity was calculated using a window length of 20 ms. Interpolation of gaps in the data was not used. Looking behavior was determined using a “face-box” manually drawn around each partner’s face. The face box was an oval that encompassed eyes, nose, mouth, forehead, cheeks, and chin, but excluded hair and ears as much as possible. When the eyes of the participant fell within the bounds of the face-box of their partner, it was considered a “face hit.” Data points in which neither eye was detected due to technical issues or eye-blinks were considered invalid and excluded from subsequent calculations. Face-viewing events with more than one-third of data points invalid were excluded from analyses.
1.7.3. Fixation duration and dwell time calculation
Fixations were identified using the Tobii Pro Lab (version 1.171, Stockholm, Sweden) feature detection algorithm with the default settings. The fixation velocity threshold was 30°/s, and physically adjacent datapoints were treated as a single fixation if they occurred within 75 ms and 0.5° of visual angle. Fixations which began during a face viewing event were identified and their durations averaged per event. Dwell time—that is, cumulative face viewing time—was calculated from the amount of time per face viewing event that the eyes were within the partner’s face box, whether in fixation or not. To compare dwell time behavior, the total time that the eyes of the participant were within the face box of the partner was calculated as a proportion of the total time the face was visible per run. For the linear regression, event dwell time was calculated as the proportion of time spent in the face box per event.
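A hedged Python sketch of the per-event metrics. The column names (`time_s`, `valid`, `in_face_box`, `behavior`, `fixation_id`) are hypothetical labels for quantities derived from the eye-tracking file and the face-box tagging described above, not columns of the shipped data:

```python
import numpy as np

def event_eye_metrics(samples, event_start_s, event_end_s):
    """Return (dwell proportion, mean fixation duration in s) for one event,
    or None if more than one third of its samples are invalid."""
    ev = samples[(samples.time_s >= event_start_s) & (samples.time_s < event_end_s)]
    if len(ev) == 0 or (~ev.valid).mean() > 1 / 3:
        return None                       # insufficient valid data

    valid = ev[ev.valid]
    dwell = valid.in_face_box.mean()      # proportion of event spent in the face box

    fixations = valid[valid.behavior == "Fixation"]
    durations = fixations.groupby("fixation_id").time_s.agg(lambda t: t.max() - t.min())
    mean_fix_dur = durations.mean() if len(durations) else np.nan
    return dwell, mean_fix_dur
```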
1.7.4. Linear regression analyses combining fNIRS and eye tracking
Two regression analyses were conducted: one with mean fixation duration and one with dwell time. Each was added as a regressor to the GLM. Regressors were a boxcar model for each eye metric with the height of each event boxcar reflecting either the dwell time (ms) or the average fixation duration (ms) for that event. Values were then demeaned per run and events with insufficient data (>1/3 of data points invalid) were set to zero. The resulting model was then convolved with the canonical hemodynamic response function. The regressions were fit to each run to calculate model fit. Group averages were calculated per partner to identify regions of brain activity which were best explained by eye behavior during face viewing events. Resulting positive clusters can be interpreted as increasing in a manner that correlates with the eye behavior when viewing a face but not during rests. Regions which show a distinct relationship to behavioral metrics based on partner are interpreted as being involved in stimulus-specific visual sensing.
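A sketch of the eye-metric-modulated regressor (illustrative; the published regressors were built in MATLAB, and exactly how the demeaning interacts with the zeroed invalid events is an assumption here):

```python
import numpy as np

def eye_metric_regressor(event_onsets_s, event_values, valid_events,
                         n_samples, dt, event_dur_s=3.4):
    """Boxcar whose height per event is the demeaned eye metric
    (dwell time or mean fixation duration); invalid events are set to zero.
    Convolve the result with the HRF before fitting (see the sketch above)."""
    values = np.asarray(event_values, dtype=float)
    valid = np.asarray(valid_events, dtype=bool)

    values = values - values[valid].mean()   # demean within the run
    values[~valid] = 0.0                     # zero-out events with insufficient data

    boxcar = np.zeros(n_samples)
    for onset, height in zip(event_onsets_s, values):
        boxcar[int(onset / dt):int((onset + event_dur_s) / dt)] = height
    return boxcar
```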
1.7.5. EEG preprocessing.
Preprocessing was conducted using MATLAB, EEGLAB 2021 (Swartz Center for Computational Neuroscience, California, USA)89 and Brainstorm 3 90. Data were band-pass filtered to 1-100 Hz. 60-Hz line noise was removed using the CleanLine EEGLAB extension. Large, non-blink artifacts and bad channels were manually identified and removed. Data were re-referenced to the average channel (Miyakoshi, n.d.). Independent component analysis was conducted using the MATLAB runica function, and ICLabel91 was used to identify component sources. Components with <5% chance of having a neural source, as well as any which were manually identified as blink or eye movement components, were removed. Data were filtered to the alpha frequency band (8-13 Hz) for the analyses of interest: alpha frequency changes have been related to and co-localized with changes in blood oxygen signals and thus could feasibly drive differences in hemodynamics92–94. Data were then epoched to -1.5 to 5 s from the onset of the face-viewing events and averaged per person. This epoch length was chosen to enable temporal localization of alpha fluctuations and encompasses both the initial and subsequent face viewing epochs described in section 1.2. Similar treatment was applied to data filtered to the theta frequency band (4-8 Hz) in order to assess whether significant differences were specific to the alpha frequency or tied to more general processing found in multiple frequency bands. The theta band was chosen for comparison because it has also been tied to face processing95–97 and to spatial binding50. We reasoned that if differences are simply the result of real faces being more engaging—as opposed to being due to unique temporal binding demands—then we would see real-face-specific engagement of general binding processes and thus significant differences between human and robot in both the alpha and theta frequency bands.
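The full preprocessing (artifact rejection, re-referencing, ICA) was done in EEGLAB; as a minimal illustration of just the band-pass and epoching steps, a scipy sketch:

```python
import numpy as np
from scipy.signal import butter, filtfilt

FS = 256  # Hz

def bandpass(data, low_hz, high_hz, fs=FS, order=4):
    """Zero-phase Butterworth band-pass along the samples axis."""
    b, a = butter(order, [low_hz / (fs / 2), high_hz / (fs / 2)], btype="band")
    return filtfilt(b, a, data, axis=0)

def epoch(data, onset_samples, fs=FS, t_min=-1.5, t_max=5.0):
    """Cut epochs of t_min..t_max seconds around each onset sample."""
    pre, post = int(-t_min * fs), int(t_max * fs)
    return np.stack([data[s - pre:s + post] for s in onset_samples])

# e.g. alpha_epochs = epoch(bandpass(eeg_array, 8, 13), face_onset_samples)
```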
1.7.6. Source estimation analysis
Source localization was conducted using Brainstorm’s standardized low-resolution brain electromagnetic tomography (sLORETA)98 function. sLORETA was chosen as it is highly accurate and stable98,99. The default Montreal Neurological Institute (MNI) anatomy with boundary element method (BEM) modelling was used for all subjects, as anatomical MRI scans were not obtained. The MNI head model was linearly warped to each subject’s anatomical landmarks: T3, T4, Cz, inion, and nasion. Noise covariance was calculated using a pre-stimulus resting state baseline from -1000 to 0 ms100. The data covariance matrix was calculated using 0-1600 ms. Default regularization parameters were used: 0.1 for noise and 3 for signal-to-noise ratio101. Minimum norm imaging sLORETA, constrained normal to the cortex, was used to create a whole-cortex current flow model per subject. Single subject models were then projected to the un-warped MNI brain template, smoothed with a 3-mm kernel, and z-score normalized to each subject’s baseline of -1000 to 0 ms. Group averages were then calculated per partner.
1.7.7. Regions of interest
The largest continuous clusters from each of the linear regression analyses (section 1.7.4) were used to define anatomical regions of interest (ROIs) for assessing oscillatory activity. MNI coordinates for the centroid of each anatomical subregion for the two clusters—one corresponding to the lateral cortex and one to the dorsal parietal cortex—are as follows. The right dorsal parietal cortex cluster was made up of the superior parietal lobule [30, -51, 69], inferior parietal lobule [59, -50, 47], and dorsal post-central gyrus [26, -33, 75]. The right lateral cortex cluster was made up of the supramarginal gyrus [66, -28, 32], ventral postcentral gyrus [68, -12, 24], and ventral precentral gyrus [64, 9, 20].
1.7.8. Significance determination and comparison of partners and epochs
Source estimated current flow was extracted from the anatomical ROIs, absolute valued, and summed across the regions of each cluster to form the dorsal and lateral EEG traces. Analyses were conducted on the absolute values to ensure that phases, which are arbitrarily allocated in sLORETA, did not cancel out and thus artificially suppress results. The alpha current flow envelope was calculated using the MATLAB function envelope with the type “peak” and a window of 100 ms. It was then converted to log significance values reflecting change from baseline, as described below, and these values were then used to determine local maxima that were significantly different between the human and robot partner as well as between the initial and subsequent epoch.
A local maximum was defined as significantly greater than the comparable time point in the comparator if it met the following criteria: 1) the maximum was significantly greater than its own baseline (z>2.33; p<.01); and 2) the p-value of the maximum was more than an order of magnitude smaller than that of the comparable time point and of all time points within 100 ms in the comparator. For example, if a maximum occurred at 100 ms with a p-value of 0.002, then in order to be significantly different from the comparator according to the above criteria, all time points from 0-200 ms within the comparator would have to have a p-value >0.02.
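A Python restatement of this criterion (illustrative; `p_self` and `p_comp` are hypothetical per-sample p-value traces versus baseline for the condition containing the maximum and for the comparator):

```python
import numpy as np

def maximum_is_significantly_greater(p_self, p_comp, idx, fs, alpha=0.01):
    """Apply the two criteria to a local maximum at sample index `idx`."""
    # Criterion 1: the maximum is significant versus its own baseline.
    if p_self[idx] >= alpha:
        return False
    # Criterion 2: its p-value is more than an order of magnitude smaller
    # than the comparator's at the same time and within +/-100 ms.
    half = int(0.1 * fs)
    lo, hi = max(0, idx - half), min(len(p_comp), idx + half + 1)
    return bool(np.all(p_comp[lo:hi] > 10 * p_self[idx]))
```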
Differences were calculated between the initial epoch (Figure 1, grey inset) of human and robot face viewing in order to test the hypothesis that alpha activity was significantly greater during human than robot face viewing. Comparisons were also conducted between the initial and subsequent epochs of human face viewing to assess exploratory questions about repetition suppression. If a local maximum is significantly greater during the initial epoch than at the comparable timepoint in the subsequent epoch, then we can conclude that the signal in that region may be sensitive to repetition. Local maxima that are significantly greater than baseline and comparable across both epochs would suggest the region is not susceptible to repetition suppression. Such a result would also give greater evidence of real-face specificity as it is a built-in replication of results.