Why data science is important for the built environment
Why building industry professionals should learn how to code
A jump start in the Python Programming Language
Overview of the Pandas data analysis library
Guidance in the loading, processing, and merging of data
Visualization of data from buildings
Basic machine learning concepts applied to building data
Examples of parametric analysis for the integrated design process
Examples of how to process time-series data from IoT sensors
Examples of analysis of thermal comfort data from occupants
Numerous starting points for using data science in other building-related tasks
The building industry is exploding with data sources that impact the energy performance of the built environment and the health and well-being of occupants. Spreadsheets just don’t cut it anymore as the sole analytics tool for professionals in this field. Mainstream data science courses might provide skills such as programming and statistics; however, they lack the applied context of buildings, which is the most important part for beginners.
This course focuses on the development of data science skills for professionals specifically in the built environment sector. It targets architects, engineers, and construction and facilities managers with little or no previous programming experience. Data science skills are introduced in the context of the building life cycle phases. Participants will use large, open data sets from the design, construction, and operations of buildings to learn and practice data science techniques.
Essentially, this course is designed to add new tools and skills to supplement spreadsheets. Major technical topics include data loading, processing, visualization, and basic machine learning using the Python programming language, the Pandas data analytics library, the scikit-learn machine learning library, and the web-based Colaboratory environment. In addition, the course will provide numerous learning paths for various built environment-related tasks to facilitate further growth.
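As a rough illustration of the loading, processing, and merging steps mentioned above, the following minimal sketch uses Pandas on two tiny in-memory tables. It is not taken from the course material: the column names (building_id, energy_kwh, floor_area_m2) and all values are hypothetical and chosen only to show the pattern.

```python
import pandas as pd

# Hypothetical meter readings in kWh; names and values are illustrative only.
readings = pd.DataFrame({
    "building_id": ["A", "A", "B", "B"],
    "energy_kwh": [120.0, 80.0, 300.0, 200.0],
})

# Hypothetical building metadata keyed by the same identifier.
metadata = pd.DataFrame({
    "building_id": ["A", "B"],
    "floor_area_m2": [1000.0, 2000.0],
})

# Processing: total energy per building.
totals = readings.groupby("building_id", as_index=False)["energy_kwh"].sum()

# Merging: attach floor area, then normalize energy use by area.
merged = totals.merge(metadata, on="building_id", how="left")
merged["kwh_per_m2"] = merged["energy_kwh"] / merged["floor_area_m2"]

print(merged)
```

In a real project the two tables would typically come from files (for example via pd.read_csv) rather than being typed in by hand; the grouping and merging steps stay the same.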
Week 1: Introduction to Course and Python Fundamentals – This introduction gives an overview of key Python concepts and of the motivating factors for building industry professionals to learn to code. The NZEB at the NUS School of Design and Environment is introduced as an example of a building that uses various data science-related technologies in its design, construction, and operations.
Week 2: Introduction to the Pandas Data Analytics Library and Design Phase Application Examples – The foundational functions of Pandas are demonstrated in the context of the integrated design process through the processing of data from parametric EnergyPlus models. Examples of further learning paths are introduced for the Design Phase, including building information modeling (BIM) using Revit or Rhino, spatial analytics, and Python libraries for building performance modeling.
Week 3: Pandas Analysis of Time-Series Data from IoT and Construction Phase Application Examples – Pandas functions for time-series analysis are demonstrated in the Construction Phase through the analysis of hourly IoT data from electrical energy meters. Examples of further learning paths are introduced for the Construction Phase, including project management, building management system (BMS) data analysis, and digital construction such as robotic fabrication.
Week 4: Statistics and Visualization Basics and Operations Phase Application Examples – Various statistical aggregations and visualization techniques using Pandas and the Seaborn library are demonstrated on Operations Phase occupant comfort data from the ASHRAE Thermal Comfort Database II. Examples of further learning paths are introduced for the Operations Phase, including energy auditing, IoT analysis, occupant detection, and reinforcement learning.
Week 5: Introduction to Machine Learning for the Built Environment – This concluding section gives an overview of the motivations and opportunities for the use of prediction in the built environment. Prediction, classification, and clustering using the scikit-learn library are demonstrated on electrical meter and occupant comfort data. The course concludes with suggestions for more in-depth Python, data science, and statistics courses on edX.
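To give a feel for the kind of analysis outlined in Weeks 3 and 5, the sketch below resamples hourly meter data to daily totals with Pandas and then clusters the daily load profiles with scikit-learn. It is only an illustration under stated assumptions, not the course's own notebook: the meter data is synthetic (a constant base load plus a weekday working-hours load), and the variable names and the choice of KMeans with two clusters are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

# Synthetic hourly meter data for 14 days, starting on a Monday (assumption).
index = pd.date_range("2021-01-04", periods=14 * 24, freq=pd.Timedelta(hours=1))
is_weekday = index.dayofweek < 5
is_working_hour = (index.hour >= 8) & (index.hour < 18)

# 50 kW during weekday working hours, 10 kW otherwise (illustrative values).
load = np.where(is_weekday & is_working_hour, 50.0, 10.0)
meter = pd.Series(load, index=index, name="power_kw")

# Time-series step: aggregate the hourly readings to daily totals (kWh).
daily = meter.resample("D").sum()

# Reshape to one row per day and one column per hour of the day.
profiles = pd.DataFrame(
    {"date": index.date, "hour": index.hour, "power_kw": load}
).pivot(index="date", columns="hour", values="power_kw")

# Clustering step: group days with similar load shapes.
model = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = model.fit_predict(profiles.to_numpy())
clusters = pd.Series(labels, index=profiles.index, name="cluster")

print(daily.head(7))
print(clusters.head(7))
```

With this synthetic input, weekdays and weekends fall into separate clusters. Real meter data is noisier, so the number of clusters and any preprocessing (such as normalization) would need to be chosen for the data set at hand.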
Development of this curriculum was led by Dr. Clayton Miller with support from NUS students Charlene Tan, Chun Fu, James Zhan, Matias Quintana, and Vanessa Neo.