Text to Blind Motion

Hee Jae Kim¹, Kathakoli Sengupta¹, Masaki Kuribayashi², Hernisa Kacorri³, Eshed Ohn-Bar¹
¹Boston University  ²Waseda University  ³University of Maryland, College Park
NeurIPS 2024

BlindWays is the first multimodal 3D human motion benchmark for pedestrians who are blind, featuring data from 11 participants (varying in gender, age, visual acuity, onset of disability, mobility aid use, and navigation habits) in an outdoor navigation study. We provide rich two-level textual descriptions informed by third-person and egocentric videos.

High-Level Annotation

A blind man with a guide dog is walking up a set of stairs, holding the handle in his left hand. He walks confidently at a relatively fast pace and continues walking after reaching the top of the stairs.

Low-Level Annotation

A blind man with a guide dog is walking up a set of stairs, holding the handle in his left hand. He walks confidently up 11 stairs without hesitation. He reaches the top and takes seven more steps forward.

High-Level Annotation

A blind man with a cane in his right hand shuffles in place at one side of an intersection, turning in multiple directions in an attempt to orient himself. He keeps his cane in front of him, tapping the ground.

Low-Level Annotation

A blind man with a cane in his right hand takes two steps forward, and then two side steps left while turning about 90 degrees to the right and tapping the ground in front of him with his cane. He then takes three small side steps to the right while tapping the ground in front of him with his cane. He then takes a small step to turn his body another 90 degrees to the right while tapping the ground in front of him with his cane.

High-Level Annotation

A blind woman with a cane is walking to avoid obstacles in her path. She appears to be trying to enter the chapel. She uses the cane in her right hand to change direction whenever there is an obstruction in front of her.

Low-Level Annotation

A blind woman with a cane in her right hand is moving ahead, using her cane to find obstacles in her path. She finds the green barrier in her path with the help of her cane, which she sweeps in front of her from right to left. Once she finds the green barrier, she turns to her left and moves ahead, continuing to sweep her cane from right to left.

Abstract

People who are blind perceive the world differently than those who are sighted. This often translates to different motion characteristics; for instance, when crossing at an intersection, blind individuals may move in ways that could potentially be more dangerous, e.g., veering further from the path and employing touch-based exploration around curbs and obstacles that may seem unpredictable. Yet, the ability of 3D motion models to capture such behavior has not been previously studied, as existing datasets for 3D human motion currently lack diversity and are biased toward people who are sighted. In this work, we introduce BlindWays, the first multimodal motion benchmark for pedestrians who are blind. We collect 3D motion data using wearable sensors with 11 blind participants navigating eight different routes in a real-world urban setting. Additionally, we provide rich textual descriptions that capture the distinctive movement characteristics of blind pedestrians and their interactions with both the navigation aid (e.g., a white cane or a guide dog) and the environment. We benchmark state-of-the-art 3D human prediction models, finding poor performance with off-the-shelf and pre-training-based methods on our novel task. To contribute toward safer and more reliable autonomous systems that can reason over diverse human movements in their environments, we publicly release our novel text-and-motion benchmark.

Data Organization

This dataset contains contributions from 11 blind participants, labelled with participant IDs P0 through P10. Below is a brief overview of its structure:

  • Motion Data: The motion data for each participant, collected over 8 different routes, is stored in the Motion folder. Each file contains joint data of shape (num_frames × 23 × 3), i.e., 23 body joints with 3D coordinates per frame, captured via Xsens motion capture sensors; see the loading sketch after this list. Please click the "Visualisation" tab to access our visualisation script for the dataset.
  • Annotations: Annotations were collected from both expert and novice annotators, providing high-level and low-level descriptions of each motion sequence. These annotations are stored in the Annotations folder.
  • croissant.json: Metadata for our dataset.
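
As a quick orientation, the snippet below loads one motion sequence and its annotation. This is a minimal sketch: the file paths follow the directory layout shown below, but the JSON field names ("high_level", "low_level") are illustrative assumptions rather than the documented schema.

```python
import json
import numpy as np

# Load one motion sequence: an array of shape (num_frames, 23, 3),
# i.e., 23 body joints with 3D coordinates per frame.
motion = np.load("BlindWays/Motion/P0_0001.npy")
print(motion.shape)

# Load the matching annotation. The key names below are assumptions;
# inspect the JSON files for the actual schema.
with open("BlindWays/Annotations/P0_0001.json") as f:
    annotation = json.load(f)
print(annotation.get("high_level"))
print(annotation.get("low_level"))
```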

Directory Organization of BlindWays:

  • BlindWays
    • Motion
      • P0_0001.npy
      • P0_0002.npy
      • ...
      • P10_0001.npy
      • ...
    • Annotations
      • P0_0001.json
      • P0_0002.json
      • ...
      • P10_0001.json
      • ...
    • croissant.json
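
Since each annotation shares its file stem with the corresponding motion file, pairing the two modalities reduces to matching stems. A hedged sketch, assuming the root folder is named BlindWays as above:

```python
from pathlib import Path

root = Path("BlindWays")

# Pair each motion file with the annotation that shares its stem,
# e.g., Motion/P0_0001.npy <-> Annotations/P0_0001.json.
pairs = []
for motion_path in sorted((root / "Motion").glob("P*_*.npy")):
    annotation_path = root / "Annotations" / f"{motion_path.stem}.json"
    if annotation_path.exists():
        pairs.append((motion_path, annotation_path))

print(f"Found {len(pairs)} motion/annotation pairs")
```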


