PSCD: Panoramic Semantic Change Detection Dataset

The PSCD dataset is an image database for semantic scene change detection. It comprises 770 panoramic image pairs. Each pair consists of images I₀ and I₁ taken at two different time points, t₀ and t₁. The panoramic images were taken in urban areas. For each pair, the PSCD dataset provides the binary change masks C₀, C₁, the semantic labels S₀, S₁, the instance labels D₀, D₁, and the attributes A₀, A₁ (3D object, 2D texture).

Dataset Details

Specification

The projection type of the panoramic images is equirectangular. The resolution of the original images is 4,000 × 2,000 pixels. The dataset images are the original images with the top and bottom parts cropped.
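
For readers unfamiliar with the equirectangular layout, the following minimal sketch maps a pixel of the uncropped 4,000 × 2,000 image to longitude and latitude. The function name `pixel_to_angles` and the angle conventions (longitude increasing to the right, latitude positive toward the top row) are assumptions made for illustration, not part of the dataset specification, and the crop applied to the released images is not modeled.

```python
import math

# Original (uncropped) resolution stated in the specification.
WIDTH, HEIGHT = 4000, 2000


def pixel_to_angles(x, y, width=WIDTH, height=HEIGHT):
    """Map a pixel center to (longitude, latitude) in radians.

    Assumed convention (not taken from the dataset documentation):
    longitude lies in [-pi, pi) and increases to the right,
    latitude lies in (-pi/2, pi/2) and is positive toward the top row.
    """
    lon = ((x + 0.5) / width - 0.5) * 2.0 * math.pi
    lat = (0.5 - (y + 0.5) / height) * math.pi
    return lon, lat
```

In an equirectangular image every column corresponds to a fixed longitude and every row to a fixed latitude, which is why the full panorama has a 2:1 aspect ratio.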

Class Definition

For original annotation data

We define 11 classes for the original annotation data as follows.

Ignore (No Change)
Humans
Group of Humans
Vehicles
Group of Vehicles
Barrier
Structure
Lane Marking
Vegetation
Object (Traffic)
Object (Others)

For semantic change detection task

Furthermore, we define 8 classes for the semantic scene change detection task as follows.

Ignore (No Change)
Human
Vehicles
Barrier
Structure
Lane Marking
Object (Traffic)
Object (Others)

Directory Structure

PSCD
| --README.txt
| --t0 // 00000000.png - 00000769.png
| --t1 // 00000000.png - 00000769.png
| --mask_t0 // 00000000.png - 00000769.png
| --mask_t1 // 00000000.png - 00000769.png
| --mask // 00000000.png - 00000769.png
| --label_t0 // 00000000.png - 00000769.png
| --label_t1 // 00000000.png - 00000769.png
| --privacy_mask_t0 // 00000000.png - 00000769.png
| --privacy_mask_t1 // 00000000.png - 00000769.png
| --label_instance_t0 // 00000000.png - 00000769.png
| --label_instance_t1 // 00000000.png - 00000769.png
| --attribute_t0 // 00000000.png - 00000769.png
| --attribute_t1 // 00000000.png - 00000769.png
| --sensor_mask.png
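
Given the layout above, the files belonging to one image pair can be collected with a small helper. This is a minimal sketch: the function name `pair_paths`, the dictionary keys, and the `root` argument are hypothetical, and only the directory names and the eight-digit file naming are taken from the listing.

```python
from pathlib import Path

NUM_PAIRS = 770  # files run from 00000000.png to 00000769.png

# Per-pair subdirectories, copied from the directory listing.
PER_PAIR_DIRS = (
    "t0", "t1",
    "mask_t0", "mask_t1", "mask",
    "label_t0", "label_t1",
    "privacy_mask_t0", "privacy_mask_t1",
    "label_instance_t0", "label_instance_t1",
    "attribute_t0", "attribute_t1",
)


def pair_paths(root, index):
    """Return a dict mapping each subdirectory name to the file of pair `index`.

    The single shared sensor mask is added under the key "sensor_mask".
    """
    if not 0 <= index < NUM_PAIRS:
        raise IndexError(f"pair index must be in [0, {NUM_PAIRS}), got {index}")
    name = f"{index:08d}.png"
    paths = {d: Path(root) / d / name for d in PER_PAIR_DIRS}
    paths["sensor_mask"] = Path(root) / "sensor_mask.png"
    return paths
```

The helper only builds paths and does not check that the files exist, so it can be used before the archive is extracted to its final location.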

Examples

RGB image I₀
RGB image I₁
Semantic change label S₀
Semantic change label S₁
Change mask C₀
Change mask C₁
Change mask C₀ & C₁ (Not aligned)
Privacy mask P₀
Privacy mask P₁
Instance change label D₀
Instance change label D₁
Attributes (2D, 3D) A₀
Attributes (2D, 3D) A₁
Sensor mask M_sensor

Experimental Results

Change Detection


Method	F₁ Score	mIoU
Siamese-CDResNet (Ours)	0.697	0.719
CSCDNet (Ours)	0.698	0.719
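
For reference, the two metrics can be computed from a predicted and a ground-truth binary change mask as in the sketch below. It uses the standard definitions and assumes that mIoU is the mean of the IoU of the change class and the IoU of the no-change class; this page does not state the exact evaluation protocol (for example, how scores are averaged over images), so the function `change_metrics` is an illustration rather than the evaluation code behind the reported numbers.

```python
def change_metrics(pred, truth):
    """Return (F1, mIoU) for two flat sequences of 0/1 labels (1 = change).

    Assumption: mIoU averages the IoU of the change and no-change classes.
    """
    if len(pred) != len(truth):
        raise ValueError("pred and truth must have the same length")

    # Confusion-matrix counts with "change" as the positive class.
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    tn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 0)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)

    iou_change = tp / (tp + fp + fn) if tp + fp + fn else 0.0
    iou_no_change = tn / (tn + fp + fn) if tn + fp + fn else 0.0
    miou = (iou_change + iou_no_change) / 2
    return f1, miou
```

Masks loaded as 2D arrays would need to be flattened and binarized before being passed to this function.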

Semantic Change Detection

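Method	CSCDNet + SSCDNet	CSCDNet + SSCDNet	CSCDNet + SSCDNet	CSCDNet + SSCDNet	GT mask + SSCDNet	GT mask + SSCDNet	CSSCDNet
Training data (CD / SCD)	PCD / Vistas	PCD / Vistas	PSCD (mask) / Vistas	PSCD (mask) / Vistas	- / Vistas	- / Vistas	PSCD (full)
DA for training	-	✓	-	✓	-	✓	n/a
mIoU	0.192	0.196	0.215	0.223	0.303	0.283	0.322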

Download

You may only use this dataset for academic research purposes and shall not redistribute it without our permission. When using this dataset in an academic paper, please cite the following paper as a reference:

@inproceedings{sakurada2020weakly,
  title={{Weakly Supervised Silhouette-based Semantic Scene Change Detection}},
  author={Sakurada, Ken and Shibuya, Mikiya and Wang, Weimin},
  booktitle={ICRA},
  pages={6861--6867},
  year={2020},
  organization={IEEE}
}

The terms of use are displayed after you click the following button. Once you agree to the terms of use, you can download the PSCD dataset.

Download

Contact

Please contact us at the following email address if you have any concerns.
M-pscd-ml@aist.go.jp

Personal information provided via email will be used only for processing and responding to inquiries, and will be handled appropriately as set forth in the following Privacy Policy.
https://www.aist.go.jp/aist_e/privacy_policy/index_en.html

Acknowledgement

This work is partially supported by KAKENHI 18K18071 and the New Energy and Industrial Technology Development Organization (NEDO).

Our class definition file format follows that of the Mapillary Vistas Dataset [1]. We would like to acknowledge G. Neuhold et al. We use face [2] and number plate [3] detection algorithms for mosaicing images. We would also like to acknowledge Deepak Babu Sam et al. and Sérgio Montazzolli Silva et al.

References

[1] G. Neuhold, T. Ollmann, S. Rota Bulò, and P. Kontschieder, 
    "The Mapillary Vistas Dataset for Semantic Understanding of Street Scenes",
    International Conference on Computer Vision (ICCV), 2017.
[2] Deepak Babu Sam, Skand Vishwanath Peri, Mukuntha Narayanan Sundararaman, 
    Amogh Kamath, R. Venkatesh Babu,   
    "Locate, Size and Count: Accurately Resolving People in Dense Crowds via Detection", 
    IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2020.
[3] Sérgio Montazzolli Silva, Cláudio Rosito Jung,
    "License Plate Detection and Recognition in Unconstrained Scenarios",
    European Conference on Computer Vision (ECCV), 2018.