Comprehensive urban space representation with varying numbers of street-level images
Keywords: Street-level imagery, urban space representation, multimodal data fusion, deep learning, urban village recognition
Abstract Type: Paper Abstract
Authors:
Yingjing Huang Peking University
Fan Zhang Peking University
Yong Gao Peking University
Wei Tu Shenzhen University
Fabio Duarte Massachusetts Institute of Technology
Carlo Ratti Massachusetts Institute of Technology
Diansheng Guo Tencent Corporation
Yu Liu Peking University
Abstract
Street-level imagery, with its granular detail, has revolutionized the way we observe and understand vast urban landscapes. Such imagery allows urban analysts to delve deeper into the intricate morphology of urban environments. However, previous studies have analyzed individual street-level images in isolation, limiting their ability to represent a spatial unit, such as a street or grid cell, which may contain anywhere from a handful to hundreds of street-level images. A more comprehensive and representative approach is therefore required to capture the complexity and diversity of urban environments at different spatial scales. To address this issue, this study proposes an innovative deep learning-based module, Vision-LSTM, which effectively obtains a vector representation from a varying number of street-level images within a spatial unit. The effectiveness of the module is validated through experiments on recognizing urban villages in Shenzhen, China, achieving reliable recognition results (overall accuracy: 91.6%) through multimodal learning that combines street-level imagery with remote sensing imagery and social sensing data. Moreover, when benchmarked against conventional image fusion techniques, Vision-LSTM showed a superior capability to discern and capture the intricate correlations among the street-level images of a spatial unit. This approach not only amplifies the research potential of street-level imagery but also paves the way for richer, multimodal learning-driven urban studies. As urban environments continue to evolve, tools like Vision-LSTM will be instrumental in ensuring that our understanding remains comprehensive, detailed, and forward-looking.
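The core idea of the module, aggregating a variable-length sequence of per-image feature vectors into one fixed-size vector per spatial unit via an LSTM, can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: the class name, dimensions, and random initialization are assumptions, and the per-image features are stand-ins for CNN embeddings.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class VisionLSTMSketch:
    """Illustrative LSTM aggregator (hypothetical, not the paper's code):
    consumes a variable number of per-image feature vectors and returns
    one fixed-size representation for the spatial unit."""

    def __init__(self, feat_dim, hidden_dim, seed=0):
        rng = np.random.default_rng(seed)
        d = feat_dim + hidden_dim
        # One weight matrix and bias per gate: input, forget, cell, output.
        self.W = {g: rng.normal(0.0, 0.1, size=(d, hidden_dim)) for g in "ifco"}
        self.b = {g: np.zeros(hidden_dim) for g in "ifco"}
        self.hidden_dim = hidden_dim

    def aggregate(self, image_feats):
        """image_feats: array of shape (n_images, feat_dim); n_images varies."""
        h = np.zeros(self.hidden_dim)
        c = np.zeros(self.hidden_dim)
        for x in image_feats:
            z = np.concatenate([x, h])
            i = sigmoid(z @ self.W["i"] + self.b["i"])  # input gate
            f = sigmoid(z @ self.W["f"] + self.b["f"])  # forget gate
            g = np.tanh(z @ self.W["c"] + self.b["c"])  # candidate cell state
            o = sigmoid(z @ self.W["o"] + self.b["o"])  # output gate
            c = f * c + i * g
            h = o * np.tanh(c)
        return h  # fixed-size vector regardless of n_images

# Two spatial units with very different image counts yield same-size vectors,
# which can then be fused with remote sensing and social sensing features.
model = VisionLSTMSketch(feat_dim=512, hidden_dim=128)
unit_a = model.aggregate(np.random.default_rng(1).normal(size=(3, 512)))
unit_b = model.aggregate(np.random.default_rng(2).normal(size=(87, 512)))
```

The key property shown here is that `unit_a` and `unit_b` have identical shape despite the units containing 3 and 87 images, which is what makes downstream multimodal fusion with fixed-size remote sensing and social sensing features straightforward.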
Submitted By:
Yingjing Huang
huangyingjing@stu.pku.edu.cn
This abstract is part of a session: GeoAI and Deep Learning Symposium: Urban AI and Sustainable Built Environment