PSCD: Panoramic Semantic Change Detection Dataset

Ken Sakurada, Mikiya Shibuya, Weimin Wang, Yukuko Furukawa, Masaki Onishi,
National Institute of Advanced Industrial Science and Technology

The PSCD dataset is an image database for semantic scene change detection. It comprises 770 panoramic image pairs, each consisting of images I0 and I1 taken in urban areas at two different time points, t0 and t1. For each pair, the dataset provides the binary change masks C0, C1, the semantic labels S0, S1, the instance labels D0, D1, and the attributes A0, A1 (3D object, 2D texture).

Dataset Details

Specification

The panoramic images use the equirectangular projection. The resolution of the original images is 4,000 × 2,000. The top and bottom parts of the dataset images are cropped as follows.

 
crop.jpeg
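
Because the images are equirectangular, each pixel corresponds to a viewing direction. The following is a minimal sketch of that relationship, not part of the dataset tools. It assumes longitude zero at the image centre and latitude positive upwards, and it uses a hypothetical `top_offset` parameter for the number of rows removed from the top by the crop; the actual value has to be read from the crop figure.

```python
import math

# Original panorama resolution stated in the specification.
FULL_WIDTH, FULL_HEIGHT = 4000, 2000


def pixel_to_angles(u, v, top_offset=0):
    """Map pixel (u, v) of a cropped image to (longitude, latitude) in radians.

    Pixel centres are taken at (u + 0.5, v + 0.5).
    top_offset is a hypothetical parameter: the number of rows cropped
    from the top of the original 4000 x 2000 panorama.
    """
    lon = ((u + 0.5) / FULL_WIDTH - 0.5) * 2.0 * math.pi
    lat = (0.5 - (v + top_offset + 0.5) / FULL_HEIGHT) * math.pi
    return lon, lat
```

With `top_offset=0` the function describes the uncropped panorama, where the left edge maps to a longitude of -pi and the top edge to a latitude of +pi/2.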
 

Class Definition

For original annotation data

We define 11 classes for the original annotation data as follows.

  • Ignore (No Change)
  • Humans
  • Group of Humans
  • Vehicles
  • Group of Vehicles
  • Barrier
  • Structure
  • Lane Marking
  • Vegetation
  • Object (Traffic)
  • Object (Others)

For semantic change detection task

Furthermore, we define 8 classes for the semantic scene change detection task as follows.

  • Ignore (No Change)
  • Human
  • Vehicles
  • Barrier
  • Structure
  • Lane Marking
  • Object (Traffic)
  • Object (Others)
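
The two class lists above can be related by name. The sketch below is only an illustration: the merge rules (the group classes folding into their single-object classes) are an assumption inferred from the names, and the source does not say what happens to Vegetation, so the sketch leaves it unmapped instead of guessing. The names `ASSUMED_MERGE` and `to_task_class` are hypothetical.

```python
# Class names copied from the two lists above.
ORIGINAL_CLASSES = [
    "Ignore (No Change)", "Humans", "Group of Humans", "Vehicles",
    "Group of Vehicles", "Barrier", "Structure", "Lane Marking",
    "Vegetation", "Object (Traffic)", "Object (Others)",
]
TASK_CLASSES = [
    "Ignore (No Change)", "Human", "Vehicles", "Barrier", "Structure",
    "Lane Marking", "Object (Traffic)", "Object (Others)",
]

# Assumption: group classes merge into the corresponding task class.
ASSUMED_MERGE = {
    "Humans": "Human",
    "Group of Humans": "Human",
    "Group of Vehicles": "Vehicles",
}


def to_task_class(name):
    """Return the task class for an original class name, or None if unmapped."""
    if name not in ORIGINAL_CLASSES:
        raise KeyError(name)
    merged = ASSUMED_MERGE.get(name, name)
    # Vegetation has no counterpart in the 8-class list, so it stays unmapped.
    return merged if merged in TASK_CLASSES else None
```

Under these assumptions every one of the 8 task classes is reachable from at least one original class, and Vegetation is the only original class that returns `None`.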

Directory Structure

PSCD

| --README.txt

| --t0 // 00000000.png - 00000769.png

| --t1 // 00000000.png - 00000769.png

| --mask_t0 // 00000000.png - 00000769.png

| --mask_t1 // 00000000.png - 00000769.png

| --mask // 00000000.png - 00000769.png

| --label_t0 // 00000000.png - 00000769.png

| --label_t1 // 00000000.png - 00000769.png

| --privacy_mask_t0 // 00000000.png - 00000769.png

| --privacy_mask_t1 // 00000000.png - 00000769.png

| --label_instance_t0 // 00000000.png - 00000769.png

| --label_instance_t1 // 00000000.png - 00000769.png

| --attribute_t0 // 00000000.png - 00000769.png

| --attribute_t1 // 00000000.png - 00000769.png

| --sensor_mask.png
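
Given the layout above, all files that belong to one image pair share the same eight-digit file name. The following sketch collects those paths for a pair index; the function name `pair_paths` and the returned dictionary layout are illustrative choices, not part of the dataset.

```python
from pathlib import Path

# Sub-directories that hold one PNG per image pair (from the tree above).
PER_PAIR_DIRS = [
    "t0", "t1",
    "mask_t0", "mask_t1", "mask",
    "label_t0", "label_t1",
    "privacy_mask_t0", "privacy_mask_t1",
    "label_instance_t0", "label_instance_t1",
    "attribute_t0", "attribute_t1",
]
NUM_PAIRS = 770  # files are numbered 00000000.png to 00000769.png


def pair_paths(root, index):
    """Return a dict mapping each directory name to the file for one pair."""
    if not 0 <= index < NUM_PAIRS:
        raise IndexError(f"pair index {index} is outside 0..{NUM_PAIRS - 1}")
    root = Path(root)
    name = f"{index:08d}.png"
    paths = {d: root / d / name for d in PER_PAIR_DIRS}
    # The sensor mask is a single file shared by all pairs.
    paths["sensor_mask"] = root / "sensor_mask.png"
    return paths
```

For example, `pair_paths("PSCD", 0)["label_t1"]` points at `PSCD/label_t1/00000000.png`. The paths can then be opened with any image library.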

Examples

RGB image I0

t0.jpeg

RGB image I1

t1.jpeg

Semantic change label S0

label_t0.jpeg

Semantic change label S1

label_t1.jpeg

Change mask C0

mask_t0.jpeg

Change mask C1

mask_t1.jpeg

Change mask C0 & C1

(Not aligned)

mask.jpeg

Privacy mask P0

privacy_mask_t0.jpeg

Privacy mask P1

privacy_mask_t1.jpeg

Instance change label D0

label_instance_t0.jpeg

Instance change label D1

label_instance_t1.jpeg

Attributes (2D, 3D) A0

attribute_t0.jpeg

Attributes (2D, 3D) A1

attribute_t1.jpeg

Sensor mask Msensor

sensor_mask.jpeg


Experimental Results

Change Detection

Method                    F1 Score   mIoU
Siamese-CDResNet (Ours)   0.697      0.719
CSCDNet (Ours)            0.698      0.719
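
For readers who want to compute comparable numbers, the sketch below shows the standard definitions of the F1 score and mIoU for a single pair of binary change masks. It is not the authors' evaluation code: the source does not state how scores are aggregated over images, and the masks are assumed here to be flat sequences of 0/1 values of equal length.

```python
def confusion_counts(pred, truth):
    """Count true/false positives and negatives for two binary sequences."""
    tp = fp = fn = tn = 0
    for p, t in zip(pred, truth):
        if p and t:
            tp += 1
        elif p and not t:
            fp += 1
        elif not p and t:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def f1_score(pred, truth):
    """F1 of the 'change' class: 2*TP / (2*TP + FP + FN)."""
    tp, fp, fn, _ = confusion_counts(pred, truth)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mean_iou(pred, truth):
    """Mean of the IoU of the 'change' and 'no change' classes."""
    tp, fp, fn, tn = confusion_counts(pred, truth)
    ious = []
    for inter, union in ((tp, tp + fp + fn), (tn, tn + fp + fn)):
        if union:  # skip a class that is absent from both masks
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0
```

A mask image can be flattened into such a sequence by thresholding its pixels, for instance treating every non-zero pixel as "change".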


Semantic Change Detection

CSCDNet + SSCDNet GT mask + SSCDNet CSSCDNet
Training data (CD / SCD) PCD / Vistas PSCD (mask) / Vistas - / Vistas PSCD (full)
DA for training - - - n/a
mIoU 0.192 0.196 0.215 0.223 0.303 0.283 0.322

Download

You may only use this dataset for academic research purposes and shall not redistribute it without our permission. When using this dataset in an academic paper, please cite the following paper as a reference:

@inproceedings{sakurada2020weakly,
  title={{Weakly Supervised Silhouette-based Semantic Scene Change Detection}},
  author={Sakurada, Ken and Shibuya, Mikiya and Wang, Weimin},
  booktitle={ICRA},
  pages={6861--6867},
  year={2020},
  organization={IEEE}
}

The terms of use are displayed after you click the following button. Once you agree to the terms of use, you can download the PSCD dataset.

Download

Contact

Please contact us at the following email address if you have any concerns.
M-pscd-ml@aist.go.jp

Personal information provided via email will be used only for processing and responding to inquiries, and will be handled appropriately as set forth in the following Privacy Policy.
https://www.aist.go.jp/aist_e/privacy_policy/index_en.html

Acknowledgement

This work is partially supported by KAKENHI 18K18071 and the New Energy and Industrial Technology Development Organization (NEDO).

Our class definition file format follows the Mapillary Vistas Dataset [1]. We would like to acknowledge G. Neuhold et al. We use the face [2] and the number plate [3] detection algorithms for mosaicing images. We would also like to acknowledge Deepak Babu Sam and Sérgio Montazzolli Silva et al.

Reference

[1] G. Neuhold, T. Ollmann, S. Rota Bulò, and P. Kontschieder, 
    "The Mapillary Vistas Dataset for Semantic Understanding of Street Scenes",
    International Conference on Computer Vision (ICCV), 2017.
[2] Deepak Babu Sam, Skand Vishwanath Peri, Mukuntha Narayanan Sundararaman, 
    Amogh Kamath, R. Venkatesh Babu,   
    "Locate, Size and Count: Accurately Resolving People in Dense Crowds via Detection", 
    IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2020.
[3] Sérgio Montazzolli Silva, Cláudio Rosito Jung,
    "License Plate Detection and Recognition in Unconstrained Scenarios",
    European Conference on Computer Vision (ECCV), 2018.