Baltimore Housing Prices Disparity for Comparable Neighborhoods: A Case for Enabling Interactive, Visual Exploration of Neighborhoods

Akshay Peshave, Siraj Memon, Vedmurtty Chavan, Tim Oates
Computer Science & Electrical Engineering
University of Maryland, Baltimore County
{peshave1, siraj1, ved1, oates}@umbc.edu

Abstract

As government agencies increasingly make public data available online, opportunities arise to leverage such data for descriptive, predictive and prescriptive analytics. One domain where these technological capabilities are applicable is the real-estate development and housing market domain. This domain is of interest to home buyers, investors and policy makers. Diverse and varying preferences of residents of a geography are latent behavioral factors that affect residential property prices.

This paper describes a geographical-area-agnostic housing typology classifier for Baltimore City communities or neighborhoods. Further, it discusses correlation analysis and composite Vital Signs scores to characterize the city population's perceptions of different community development categories. These scores enable community clustering to investigate price disparity in comparable communities based on configurable categories and year-on-year trend analysis. Various visualization possibilities are discussed in conjunction with these approaches to make a case for interactive, visual exploration of geographical communities, which may be extended to comparative analysis across geographies.

Keywords: Multi-label Classification, Correlation Analysis, Clustering, Data Visualization, Baltimore City, Housing and Community Development.

1 INTRODUCTION

As government agencies increasingly make public data available online, opportunities arise to leverage such data for descriptive, predictive and prescriptive analytics.
State-of-the-art data technologies and analytic methods can be used to drive innovation and development of platforms to help inform public policy effectively and efficiently for social good. This also makes direct information provision to and engagement of the public-at-large possible at scale. Interactive technology platforms which enable custom-tailored analyses and engagement processes will benefit policy makers, policy enforcers and the general public. Data science technologies also allow federated open data consumption and analytics to provide insights at varied levels of abstraction, which will help citizens and policymakers make informed choices.

One domain where these technological capabilities are useful is the real-estate development and housing market. Consumers buy residential properties for personal use and as investment instruments. Consequently, the housing market is susceptible to volatility due to its sensitivity to varying factors that affect public perception and pricing bubbles. A wide array of data needs to be acquired and analyzed to inform such high-monetary-value decisions. In addition to geographic and housing market macro-data, a more subtle and latent behavioral factor affects property pricing: diverse and varying preferences of residents of a geography.

The ability to conveniently access and analyze aggregated forms of the data that affect home prices is important for analyzing such latent factors. Interactive platforms consuming relevant data from government agencies and third-party organizations and applying state-of-the-art data science can efficiently enable this. Such platforms can serve to inform public policy and developmental efforts as well as help consumer decision making. The analytics workflow may take many forms based on the use-case and requirements thereof. This work explores one such analytics workflow for the residential property market in Baltimore City.
The task is to enable visual exploration of Baltimore City neighborhoods which are similar in terms of some character-