2nd Workshop on Visual Perception for Navigation in Human Environments

The JackRabbot Social Grouping and Activity Dataset and Benchmark


Workshop Goals: Dataset and Challenges

This is the second workshop in the JRDB workshop series, addressing the many perceptual problems an autonomous robot must solve to operate, interact and navigate in human environments. These perception tasks include any 2D or 3D visual scene understanding problem, as well as problems pertinent to human action, intention and social behaviour understanding, such as 2D-3D human detection, tracking and forecasting; 2D-3D human body skeleton pose estimation, tracking and forecasting; and human social grouping and activity recognition.

Recently, the community has paid increasing attention to the problem of human activity understanding, thanks to the availability of several large-scale annotated datasets for this computer vision and robotics task. However, the existing datasets for this problem are often collected from platforms such as YouTube and are limited to 2D annotations of individual actions and activities. The main focus of our CVPR workshop is the novel problem of social human activity understanding, consisting of three sub-tasks: individual action detection, social group identification and social activity recognition. We also introduce JRDB-Act, a large-scale, ego-centric and multi-modal dataset.

The JRDB dataset contains 67 minutes of annotated sensory data acquired from the JackRabbot mobile manipulator, comprising 54 indoor and outdoor sequences in a university campus environment. The sensory data includes a stereo RGB 360° cylindrical video stream, 3D point clouds from two LiDAR sensors, audio and GPS positions. In addition to the existing 2D-3D person detection and tracking annotations, we will release a new set of annotations for this dataset: 1) individual actions, 2) human social group formation, and 3) the social activity of each social group. Using these unique annotations, we will launch two new benchmarks and challenges for this workshop. We also have, as invited speakers, world-renowned experts in the field of visual perception for understanding human action, intention and social behaviour. Finally, we aim to foster discussion between the attendees to find useful synergies and applications of the solutions to these (or similar) perceptual tasks.

Dataset

The currently available annotations on the JackRabbot dataset and benchmark (JRDB) include:

  • 2D bounding box annotations around all the pedestrians visible in five RGB streams and their cylindrical composition.
  • 3D oriented cuboid annotations around pedestrians in two Velodyne-16 LiDAR streams.
  • 2D-3D associations between bounding boxes and cuboids.
  • Time consistent trajectories (tracks) for all annotated individuals in both 2D and 3D.
In addition to the above, we have provided a new set of annotations, including:

  • Individual action labels for all the individuals visible in the RGB 360° cylindrical video stream, covering:
      • Human pose actions (11 exclusive categories, including a miscellaneous class)
      • Human-to-human interaction actions (3 categories, including a miscellaneous class)
      • Human-to-object interaction actions (12 categories, including a miscellaneous class)
      • An action label difficulty level (4 categories) for each attribute
  • Social group formation for all the individuals visible in the RGB 360° cylindrical video stream (ranging from groups of 1 person to groups of 29 people)
  • Social activity labels for each social group, derived as the majority of the individual action labels within that group
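The social activity derivation above amounts to a majority vote over group members' individual action labels. A minimal sketch in Python of one plausible scheme (the label names and input format here are hypothetical illustrations, not the official JRDB-Act annotation format or toolkit):

```python
from collections import Counter

def social_activity(member_actions):
    """Derive a group's social activity labels as the individual action
    labels shared by a strict majority of its members.

    member_actions: list of per-person action label lists (hypothetical format).
    """
    counts = Counter()
    for actions in member_actions:
        counts.update(set(actions))  # each member votes at most once per label
    majority = len(member_actions) / 2
    return sorted(label for label, c in counts.items() if c > majority)

# Example: a group of three people
group = [
    ["walking", "talking"],
    ["walking", "talking", "holding something"],
    ["walking"],
]
print(social_activity(group))  # → ['talking', 'walking']
```

Here "holding something" is dropped because only one of the three members carries that label, while "walking" and "talking" are held by a majority of the group.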
Open Challenge

    In addition to the four existing benchmarks and challenges on JRDB, i.e. the 2D and 3D person detection and tracking challenges, in this workshop we organise three new challenges using our new annotations:
  • Human social group identification
  • Individual action detection
  • Social activity recognition
    The winner of each of the seven challenges will be awarded a $100-$300 Amazon gift card and a certificate. The winners will also have the opportunity to present their work as a spotlight (5 minutes) and a poster during the workshop.
    Guidelines for participation: Participants should strictly follow the submission policy provided on the main JRDB webpage, which can be found here. Also, in order to distinguish challenge submissions from regular submissions, each submission name should carry a CVPR21 tag, e.g., "submissionname_CVPR21". Submissions without this tag will not be considered for the challenge. Each challenge submission should be accompanied by an extended abstract submitted via our CMT webpage (details are available below) or by a link to an existing arXiv preprint/publication.
    Evaluation: We use the first metric after "name" in each leaderboard as the main evaluation metric for ranking entries. The evaluation instructions and toolkits for all seven benchmarks are available here.

    Call for Papers

    We invite researchers to submit papers addressing topics related to autonomous (robot) navigation in human environments. Relevant topics include, but are not limited to:

  • Social group formation and identification
  • Individual, group and social activity recognition
  • 2D or 3D human detection and tracking
  • 2D or 3D skeleton pose estimation and tracking
  • Human body reconstruction
  • Motion prediction and social models
  • 2D or 3D human face detection
  • Human gaze estimation
  • Visual and social navigation in crowded scenes
  • Traversability estimation
  • Dataset proposals and bias analysis
  • New metrics and performance measures for visual perception problems related to autonomous navigation
    Submissions may follow the CVPR format (4-8 double-column pages excluding references), with a submission deadline of April 12, or be extended abstracts (1 double-column page excluding references), with a submission deadline of May 30. Accepted papers will have the opportunity to be presented as a poster during the workshop; however, only papers in CVPR format will appear in the proceedings. By submitting to this workshop, the authors agree to the review process and understand that we will do our best to match papers to the best possible reviewers. The reviewing process is double-blind. Submission to the challenges is independent of paper submission, but we encourage authors to submit to one of the challenges.

    Important dates:

  • Submission deadline for full papers: April 13, 23:59 PST
  • Acceptance notification for full papers: April 17, 23:59 PST
  • Camera-ready deadline for full papers: April 19, 23:59 PST
  • Submission deadline for extended abstracts: May 30, 23:59 PST
  • Acceptance notification for extended abstracts: June 10, 23:59 PST
    Submissions can be made here. If you have any questions about submitting, please contact us here.

    Program

    Start Time (PST) | End Time (PST) | Description
    12:30 | 12:40 | Introduction
    12:40 | 13:10 | Invited Talk: Bastian Leibe, RWTH Aachen University - TBA
    13:10 | 13:30 | Full Papers' Oral Presentation: Yuhang He et al., "Know Your Surroundings: Panoramic Multi-Object Tracking by Multimodality Collaboration"; Emre Hatay et al., "Learning to Detect Phone-related Pedestrian Distracted Behaviors with Synthetic Data"
    13:30 | 14:00 | Invited Talk: Laura Leal-Taixé/Aljosa Osep, Technical University Munich - Tracking Every Object and Pixel
    14:00 | 14:30 | Coffee Break & Video Demo Session
    14:30 | 15:00 | Invited Talk: Marynel Vázquez, Yale University - Applications of Graph Neural Networks to Spatial Reasoning in Robotics
    15:00 | 15:30 | Invited Talk: Kris Kitani, Carnegie Mellon University - Modeling Attention in Social Group Interactions
    15:30 | 15:45 | Introduction to the JRDB Activity Dataset and Challenge
    15:45 | 16:10 | Challenge Winners' Presentation
    16:10 | 16:40 | Invited Talk: Juan Carlos Niebles, Stanford University - TBA
    16:40 | 16:50 | Discussion, Closing Remarks and Awards

    Invited Speakers

    Juan Carlos Niebles

    Bastian Leibe

    Marynel Vázquez

    Kris Kitani

    Aljosa Osep

    Program Committee

    Name | Organization
    Amir A. Sadeghian | AiBee, Inc.
    Ehsan Adeli | Stanford University
    Fatemeh Saleh | Australian National University
    Mohsen Fayyaz | University of Bonn
    Boxiao Pan | Stanford University
    Bingbin Liu | Carnegie Mellon University
    Shyamal Buch | Stanford University
    Jingwei Ji | Stanford University
    Shun Taguchi | Toyota Central R&D Labs, Inc.

    Organizers

    Mahsa Ehsanpour

    Roberto Martín-Martín

    Claudia Pérez-D’Arpino

    Ian Reid

    Silvio Savarese

    Hamid Rezatofighi

    Sponsors