Smartphones versus goggles for video-oculography: current status and future direction
Abstract
Assessment of eye movements is the cornerstone of diagnosing vestibular disorders and differentiating central from peripheral causes of dizziness. Nonetheless, accurate assessment of eye movements is challenging, especially in the emergency department and primary care settings. To overcome this challenge, clinicians objectively measure eye movements using devices like video-oculography (VOG) goggles, which provide a video recording of the eye and quantified eye position traces. However, despite the value of VOG goggles in studying eye movements, barriers such as high prices and the need for dedicated operators have limited their use to subspecialty clinics. Recent advancements in the hardware and software of smartphones have positioned them as potential alternatives to VOG goggles that can reliably record and quantify eye movements. Although currently not as accurate as VOG goggles, smartphones can provide a cheap, widely available tool that can be used in various medical settings and even at home by patients. We review the current state and future directions of the devices that can be used for recording and quantifying eye movements.
INTRODUCTION
Eye movement assessment is valuable in evaluating acute and chronic conditions affecting the extraocular muscles and associated nerves, inner ears, and brain. These conditions range from benign, such as benign paroxysmal positional vertigo (BPPV), to life-threatening, like vestibular strokes [1]. Moreover, patients with dizziness, who account for approximately 5 million annual emergency department (ED) visits (3%–5% of the total), require eye movement evaluation for accurate diagnosis and management [2,3]. Studies have shown that bedside examination of eye movements with the Head Impulse test, Nystagmus, Test of Skew (HINTS) examination battery is superior to magnetic resonance imaging with diffusion-weighted imaging in detecting strokes during the first 24 hours in patients with continuous dizziness and spontaneous nystagmus [4]. On the other hand, BPPV is diagnosed by detecting a characteristic nystagmus after positional maneuvers [5]. While these tests have a high diagnostic yield, they require expertise in eye movement examination, detection of subtle abnormalities, and interpretation of the findings. Moreover, studies have shown that not all physicians perform the HINTS examination ideally [6]. Unfortunately, those with adequate expertise are not readily available throughout the United States [7].
Quantitative recording of eye movements addresses some of these challenges, provides the opportunity to detect subtle abnormalities, and can promote equitable access to eye movement examination in emergency rooms and underserved areas [8]. Here, we examine how recent technological advancements have changed the landscape of quantified eye tracking and offer insights into its future direction and impact on patient care.
VIDEO-OCULOGRAPHY GOGGLES
Video-oculography (VOG) is a method of quantified eye tracking that uses video recordings of the eyes. It is traditionally performed by VOG goggles, a pair of goggles with one or several built-in infrared cameras that enable eye detection and recording of eye movements while performing various tests [9]. These goggles need to be connected to a computer with software that processes the input and provides quantitative data and traces of eye movements based on the recordings. The clinician can then evaluate the quantified metrics (e.g., the gain of head impulse test [HIT], nystagmus velocity, or eye position traces), along with the videos of the eyes—and, in some cases, the videos from the examination room depicting the types of tests and positional maneuvers—using the same software.
Several VOG goggles are approved by the U.S. Food and Drug Administration (FDA). ICS Impulse (Natus Medical, Inc.) and EyeSeeCam (Interacoustics) are among the most widely used models in research and clinics [10,11]. The price of the goggles depends mainly on their model, the features they offer, and the complexity of analysis they provide, with higher-end devices routinely used in clinics and research costing as much as US $40,000. These VOG goggles require a trained technician to calibrate the instrument before recording each patient and to select the appropriate tests in the software before performing them on the patients. The technicians must also have adequate expertise in accurately performing the ocular and vestibular tests to obtain the correct measurements and output. The VOG goggles can quantify different components of the bedside HINTS and positional eye movement exam by detecting and measuring the slow phase velocity of nystagmus, measuring the gain of HIT for different semicircular canals and detecting catch-up saccades, and measuring the vertical misalignment of the eyes during the test of skew [12]. Furthermore, these goggles use an infrared camera at high frame rates (typically about 250 frames per second), enabling them to detect subtle or brief eye movements, which makes them especially valuable for detecting small covert saccades (i.e., catch-up saccades during HIT that would otherwise be invisible to the naked eye) [13]. The infrared sensors also enable clinicians to obtain eye movement recordings in the dark, which is essential when visual fixation needs to be removed (some types of nystagmus tend to be suppressed by fixation) and which provides overall better-quality eye tracking than visible light [14].
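To make the quantities above concrete, the sketch below computes a slow-phase velocity and a HIT gain from position and velocity traces. It is an illustrative simplification, not any vendor's algorithm: the 250 Hz sampling rate matches typical VOG goggles, but the 100 deg/s quick-phase threshold and the area-ratio definition of gain are assumptions made for this example.

```python
import numpy as np

FS = 250  # sampling rate in Hz, typical for VOG goggles

def eye_velocity(pos_deg):
    """Differentiate an eye-position trace (degrees) into deg/s."""
    return np.gradient(pos_deg) * FS

def slow_phase_velocity(pos_deg, quick_thresh=100.0):
    """Median eye velocity after discarding samples faster than
    `quick_thresh` deg/s, which are treated as quick phases of
    nystagmus (the threshold is an illustrative assumption)."""
    v = eye_velocity(pos_deg)
    return float(np.median(v[np.abs(v) < quick_thresh]))

def hit_gain(eye_vel, head_vel):
    """HIT gain, here defined as the ratio of the areas under the
    absolute eye- and head-velocity curves for one impulse."""
    return float(np.sum(np.abs(eye_vel)) / np.sum(np.abs(head_vel)))
```

A gain near 1.0 reflects a compensatory vestibulo-ocular reflex, whereas a low gain accompanied by catch-up saccades suggests vestibular hypofunction on the tested side.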
These features make the VOG goggles a fitting tool for objectively screening vestibular strokes in acutely dizzy patients referred to the ED and for evaluating eye movements in the outpatient setting. Studies have shown that quantitative HINTS recorded by VOG goggles can be used to diagnose vestibular strokes [15]. A multicenter randomized controlled clinical trial (Acute video-oculography for vertigo in emergency rooms for rapid triage; NCT02483429) aimed to study the use of VOG goggles in the initial evaluation of patients with acute vertigo; the preliminary results indicate that remote evaluation of VOG recordings by experts is more accurate than ED evaluation [16]. The final results are expected to be published later this year.
Despite the advantages of VOG goggles in quantifying HINTS battery and their potential use for screening vestibular strokes, the high price of these goggles, the need for a trained technician, and their limited availability in the EDs throughout the country are barriers to their widespread use. On the other hand, smartphones can offer potential solutions to overcome these obstacles and act as reasonable alternatives for recording and quantifying eye movements [17,18].
SMARTPHONES
Smartphones are ubiquitous devices possessing cameras and sensors that allow them to analyze facial features, detect attention, and track the gaze. These features have made smartphones a focus of interest for researchers in the field of eye movement tracking [19]. Multiple studies have investigated how videos captured by smartphones could be used to detect different features of eye movements. Most of the published literature on smartphone eye tracking investigates how eye movement videos obtained by a phone can be later analyzed to quantify certain eye movement features [17]. Despite the broad scope and utility of eye tracking in medicine, here we only discuss studies with adult participants that focused on quantifying components of the HINTS exam that could aid in diagnosing stroke in patients with acute vestibular syndrome. The majority of studies on quantifying nystagmus have relied on induced nystagmus in healthy volunteers in a controlled environment [17]. Nonetheless, Kıroğlu and Dağkıran [20] showed that video recordings of nystagmus during Ménière attacks (without quantification) can aid in establishing the diagnosis earlier in this patient group. On the other hand, studies on quantifying HIT gain have included both healthy volunteers and those with vestibular neuritis with abnormal vestibulo-ocular reflex and catch-up saccades [21,22].
Friedrich et al. [23] showed proof of concept for a convolutional neural network model that can quantify eye movement recordings of optokinetic nystagmus. In a similar study, van Bonn et al. [24] showed how a smartphone app can reliably detect nystagmus induced by optokinetic stimulation or caloric stimulation in healthy volunteers. Evaluating the HIT, Kuroda et al. [21] mounted a smartphone on a goggle frame and used the phone camera instead of VOG goggles to record the eyes while performing HIT. Their results indicate that phone videos can detect low HIT gains similarly to VOG and can detect catch-up saccades (although with lower sensitivity).
To provide a comprehensive alternative to VOG goggles in screening for vestibular strokes, our team has developed an eye and head tracking application that utilizes the augmented reality (ARKit) features of the iPhone’s (Apple Inc.) front camera and automatically detects facial features, pupil position, and phone distance from the face [18]. In addition, the app provides time-stamped data on the eye position coordinates and other recorded variables (e.g., head position, lighting), along with graphs depicting the eye and head positions during the recording, similar to those of the VOG goggles [22].
We showed how this app’s combined head and eye tracking could be used to measure the HIT gain and detect vestibular hypofunction and catch-up saccades [22]. We outlined the calibration process we utilized in the recording process and highlighted how final measurements were obtained from the raw data [25]. Recently, we shared data regarding the agreement of the measurements by our smartphone app and the VOG goggles, indicating a high correlation between the two in quantifying optokinetic nystagmus in healthy volunteers (Spearman correlation of 0.98 for horizontal optokinetic nystagmus and 0.94 for vertical). Moreover, we showed that using an average calibration paradigm was as accurate as individual calibration for each participant, saving considerable time during recording [26]. Further work is needed to determine the detection thresholds required for optimal output. Nonetheless, other studies on eye tracking have shown similar differences in the accuracy of smartphone tracking compared to the VOG goggles [19].
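The agreement statistic used in such device-comparison studies is straightforward to compute. The sketch below implements a Spearman rank correlation from scratch (it assumes no tied measurements; the paired values are invented for illustration and are not data from our study):

```python
import numpy as np

def spearman(x, y):
    """Spearman rank correlation: the Pearson correlation of the
    ranks of x and y (ties are not handled in this sketch)."""
    rx = np.argsort(np.argsort(x)).astype(float)
    ry = np.argsort(np.argsort(y)).astype(float)
    rx -= rx.mean()
    ry -= ry.mean()
    return float(np.sum(rx * ry) / np.sqrt(np.sum(rx**2) * np.sum(ry**2)))

# Hypothetical paired slow-phase velocities (deg/s) from the two devices
phone = [4.1, 7.9, 12.2, 15.8, 21.0]
goggles = [4.0, 8.3, 11.9, 16.1, 20.5]
```

Because both illustrative series rise together with no rank swaps, `spearman(phone, goggles)` returns 1.0 here; real paired measurements with occasional rank inversions yield values below 1, such as the 0.98 and 0.94 reported above.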
While the technical challenges of accurately performing the eye movement tests remain unchanged, smartphone applications make recording eye movements as simple as recording a selfie video, eliminating the need for a separate device and software. Moreover, smartphones are widely available, so even clinicians who see a limited number of dizzy patients could obtain quantified eye movement traces without purchasing expensive devices. While the studies that our team and research teams around the globe have conducted on smartphone eye tracking show promising results, several factors must be considered regarding the accuracy of smartphone recordings versus VOG goggles. The smartphone cameras tested so far record at frame rates well below the roughly 250 frames per second of VOG goggles; this could limit their ability to detect more subtle movements, especially saccadic eye movements. Moreover, the need for a source of ambient lighting in phone recording, as opposed to the infrared detectors of VOG goggles, limits their use in exams where visual fixation should be removed for optimal assessment. Furthermore, the goggles are worn on the patient's face, unlike phones, which must be placed at a certain distance to capture the facial features needed for optimal detection. Therefore, goggles remain at a relatively constant angle to the globe, while the angle between the phone camera and the patient's eyes changes as the head moves; this can degrade recording quality, as part of the face or one of the eyes may move out of the frame during the recording. Finally, perhaps the most critical obstacle to be addressed in smartphone eye tracking is the lack of a user-friendly output interface that frames results in a clinically relevant manner, sparing clinicians the time and challenge of reviewing individual videos or uploading them for analysis.
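The frame-rate limitation can be made concrete with simple arithmetic. Assuming a covert saccade lasts on the order of 50 ms (a typical order of magnitude for small saccades; the exact duration varies by patient and amplitude), the number of frames that capture it is simply duration times frame rate:

```python
def frames_captured(saccade_duration_s, fps):
    """Number of video frames that fall within a brief eye movement."""
    return saccade_duration_s * fps

# A hypothetical 50 ms covert saccade:
vog_frames = frames_captured(0.050, 250)   # about 12.5 frames on VOG goggles
phone_frames = frames_captured(0.050, 60)  # about 3 frames on a 60 fps phone camera
```

A movement sampled by only a handful of frames is far harder to segment and measure reliably, which is one plausible reason covert saccades are detected with lower sensitivity on phones.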
Our efforts are currently focused on making our application user-friendly with outputs that could guide the clinical decision-making of the providers.
FUTURE DIRECTION
Diagnostic Decision Support
Table 1 summarizes the comparison between the phone and the goggles. When considering the future of smartphone eye tracking, it is essential to remember that this technology is meant to complement the use of more advanced eye-tracking technologies, not replace them. Therefore, one must not expect a one-to-one match in the capabilities of smartphones and VOG goggles. Expanding access to eye movement quantification in settings that would have otherwise not utilized this technology is the core concept behind these applications. One can imagine a future in which VOG goggles are used in subspecialized clinics and in the EDs of large hospitals, while phone-based eye recording is used at home, in primary care settings, and in rural EDs that lack the funds or staff necessary for 24-hour VOG goggle recording.
One of the pathways that could reduce diagnostic errors in distinguishing central from peripheral dizziness could rely on smartphone applications like the one our team has developed. Automating output with embedded or cloud-based algorithms and providing expert opinions based on recorded data are examples of how these applications could be integrated into telehealth systems that are already in place. Likewise, the ubiquity of smartphones and the ease of disseminating applications make them a good fit for follow-up. For example, patients can be educated to use their smartphones at home to record their eye movements. In such a scenario, they could provide valuable information to their clinicians that might otherwise be impossible to obtain. Of note, in-home use of cellphone applications can yield diagnostic data in episodic conditions with asymptomatic intervals that might lack objective findings during a clinic visit. Regulatory procedures such as FDA de novo and 510(k) applications are another necessary step. However, payor approval following the FDA regulatory process remains a significant barrier to widespread use.
Artificial Intelligence and Machine Learning Algorithms
With rapid advancements in the technologies embedded in smartphones, in conjunction with the use of artificial intelligence (AI) and machine learning (ML), a future in which cell phones provide ample information for clinical decision-making is no longer a matter of if but of when. Nonetheless, AI and ML algorithms entail their own unique challenges, including validation and privacy. So far, most published studies of eye tracking by ML algorithms have been confined to controlled research environments, highlighting the need for more studies exploring the validity of these tools in the clinical setting [17]. The variety of approaches by which ML algorithms quantify eye movements also highlights the need for a best-practice guideline for AI in eye tracking, similar to those for AI in medical imaging, to guide future research efforts in this field [27,28]. Moreover, depending on the primary user, the usability of the recording process has to be methodically studied to guide the future development of the technology [29]. The de-identification guidance based on the Health Insurance Portability and Accountability Act (HIPAA) Privacy Rule by the United States Department of Health and Human Services includes biometric information and full photographic images of the face among the list of patient identifiers, to ensure patient privacy [30]. Although biometric identification based on face and iris recognition is well established [31], several recent studies have shown that gaze-tracking patterns [32] and eye movement information (e.g., saccades) [33] can be used as biometric data. Hence, we face a dual challenge of protecting both biometric data that could compromise cybersecurity [34] and private health information. In the face of these challenges, scientists have worked on de-identifying the eye movement data used for training models. Özdel et al. [35] devised a scanpath-analyzing model, which they made publicly available, that preserves the privacy of individual data; they tested this model on three public datasets with promising results. Moreover, evaluating a convolutional neural network of facial and eye movement data, Seyedi et al. [36] showed that it is possible to achieve good performance metrics while preserving the privacy of the data. Therefore, we can assume that even though using face and eye data for training and using AI models might pose a privacy threat, effective methods can be developed to mitigate such risks.
As the use of deep learning increases in the development of eye-tracking algorithms, issues such as the interpretability of algorithms, human-computer interaction, and data-quality standards take center stage. Given these challenges, research teams are attempting to provide solutions. Kumar et al. [37] investigated the challenges in the interpretability of algorithms and highlighted the key issues of “explainability beyond events,” “performance analysis of spatio-temporal data,” and “lack of annotation support for real-world data.” Regarding the need for improved human-computer interaction practices when dealing with fixed setups such as VOG goggle- or smartphone-enabled tracking, Valtakari et al. [38] provided a deep dive into the factors (“number of gaze sources, use of scripted behavior, desired freedom of movement, data quality, and ease of data analysis”) best suited for each type of eye-tracking setup (“head-free vs. head-boxed vs. head-restricted”), as well as a comprehensive guide and decision tree for establishing the required research environment. Furthermore, Adhanom et al. [39] explored the technological limitations in data quality for eye tracking, highlighting that low spatial precision and accuracy, high latency, low sampling rates, and calibration errors in real-world settings pose significant constraints on the standardization of eye-tracking solutions. Although the aforementioned issues limit the generalizability of existing solutions, they also serve as future directions for research in studying eye movement disorders via advanced deep-learning solutions.
CONCLUSION
Overall, smartphone applications that track eye movements have shown promise as diagnostic tools. With further investment and study in this field, we can expand the use of objective eye movement assessment beyond its current limits.
Notes
Funding/Support
None.
Conflicts of Interest
David Newman-Toker, Jorge Otero-Millan, Max Parker, and Nathan Farrell have a provisional patent application regarding the use of smartphones in tracking eye and head position. David Newman-Toker, Ali Saber Tehrani, Jorge Otero-Millan, Hector Rieiro, Pouya Barahim Bastani, Max Parker, and Nathan Farrell have a provisional patent application regarding using the EyePhone for recording saccades and smooth pursuit.
Availability of Data and Materials
All data generated or analyzed during this study are included in this published article. Other data may be requested from the corresponding author.
Authors' Contributions
Conceptualization: PBB, SB, DNT, AST; Investigation: PBB, SB, VP, HR, NF, MP, AST; Supervision: JOM, DNT, AST; Writing–Original Draft: All authors; Writing–Review & Editing: All authors.
All authors read and approved the final manuscript.