by Federica Bianco, Zuzanna Kłyszejko and Sinziana Eckner, 03/08/17
The world of science and scientific inference has changed profoundly in the last 10 years or so: hail the data revolution. Our ability to process data grows exponentially (Moore’s law), and our ability to store it… well, so far it has kept up. But unlike Galileo’s telescope or Darwin’s theory of evolution, this revolution is taking place across disciplines. Its byproducts are a whole new set of innovative research methods and the birth of new scientific fields. One of them - which is the subject of this blog post - is Urban Science.
Not that urban science did not exist before the new millennium: civil engineering is a form of urban science, and sociologists who study the urban environment are urban scientists too. But a new approach to urban science, one with a tight connection to data science, is growing in popularity: the study of the city as a system, or better yet a “complex system” (i.e. a system composed of multiple interrelated parts), enabled by large and diverse (i.e. high-dimensional) urban datasets.
Cities are growing: over 80% of the US population and 54% of the world population lived in cities as of 2014. By comparison, in 1950 only 30% of the world population was urban. By 2050 this percentage is projected to reach 66%. Source: UN report, World Urbanization Prospects
Cities are a headache for sustainability, a source of pollution, a sink that requires a tremendous amount of energy. They were not designed with continuous growth in mind. Growth happens organically, and sometimes overnight.
However, cities are also centers that foster creativity through opportunities for human connection. Each city is a "complex system" composed of humans, the natural environment, and the built environment (buildings, cars, roads, the power grid), all interacting with each other in uncountably many ways. We can collect a tremendous amount of data for each of these components, either deliberately (e.g. through surveys or energy metering) or more or less accidentally (Twitter geo-tags can tell us where people are gathering). Even better, a tremendous amount of data is already publicly available! This is especially true for NYC: on March 7, 2012, former Mayor Bloomberg signed Local Law 11, commonly known as the “Open Data Law,” which mandated that all public data be made available through a single web portal by the end of 2018.
Image: One tale, three cities - NYC parks, NYC driving roads, NYC noise complaints. Source: https://github.com/fedhere/WIMLDSSmartCities
Rich data sets, found in abundance and openly available on the web, plus a creative, diverse bunch of people who are excited about working on cool problems and making an impact: that is a good recipe for studying the pulse of the city (its activity and rhythms), its metabolism (consumption and production) and its vulnerabilities, and for identifying problems and generating solutions that strengthen city resilience and improve quality of life.
The NYC Open Data team maintains a portal through which 98 city agencies and offices share data. The selection includes 311 complaints, NYPD and FDNY statistics, data on all NYC park properties, a census that catalogs every single tree in NYC, taxi data including pickup and drop-off locations and tip amounts for billions of rides, restaurant inspection results, 3-D models of buildings, energy consumption for every large building in NYC and much more.
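To give a flavor of how little code it takes to start exploring records like the 311 complaints, here is a minimal sketch that counts complaints by hour of day. The sample records are made up for illustration, and the field names `created_date` and `complaint_type` are assumptions about how an export from the portal might look, so check the actual dataset before reusing them.

```python
from collections import Counter
from datetime import datetime

# Made-up sample records for illustration only; the field names are
# assumptions about what a 311 export might contain.
SAMPLE_COMPLAINTS = [
    {"created_date": "2017-03-01T08:15:00", "complaint_type": "Noise"},
    {"created_date": "2017-03-01T08:47:00", "complaint_type": "Heating"},
    {"created_date": "2017-03-01T23:05:00", "complaint_type": "Noise"},
    {"created_date": "2017-03-02T23:40:00", "complaint_type": "Noise"},
]


def complaints_by_hour(records, complaint_type=None):
    """Count records per hour of day, optionally keeping one complaint type."""
    counts = Counter()
    for rec in records:
        if complaint_type is not None and rec["complaint_type"] != complaint_type:
            continue
        hour = datetime.fromisoformat(rec["created_date"]).hour
        counts[hour] += 1
    return counts


noise_by_hour = complaints_by_hour(SAMPLE_COMPLAINTS, "Noise")
for hour, n in sorted(noise_by_hour.items()):
    print(f"{hour:02d}:00  {n} noise complaint(s)")
```

With real data, the same hourly counts are a first step toward looking at the rhythm of the city across a day.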
We prepared a number of possible questions to investigate during the hack day, ranging from exploring crime stats around subway stations, finding disruptive events in historical traffic data, comparing Uber and taxi ridership, and discovering abusive construction sites reported to 311, to measuring areas at high risk of disease or analyzing the clustering of tobacco and liquor stores and their proximity to vulnerable populations.
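Several of these questions boil down to "how many things happened near this place?". As a rough sketch of one way to answer that, the snippet below counts points within a given radius of a location using the haversine great-circle distance. The coordinates are invented for illustration (they are not real station or incident locations), and the function names are our own.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def count_within(center, points, radius_m):
    """Count the points that lie within radius_m meters of center."""
    return sum(
        1
        for lat, lon in points
        if haversine_m(center[0], center[1], lat, lon) <= radius_m
    )


# Invented coordinates for illustration only.
STATION = (40.7500, -73.9900)
INCIDENTS = [
    (40.7505, -73.9900),  # a few dozen meters north
    (40.7500, -73.9880),  # under 200 meters east
    (40.7700, -73.9900),  # a couple of kilometers north
]

nearby = count_within(STATION, INCIDENTS, radius_m=250)
print(f"{nearby} of {len(INCIDENTS)} incidents within 250 m")
```

For city-scale datasets a spatial index or a geospatial library would be a better fit than this brute-force loop, but the idea is the same.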
If you're looking for inspiration on what kinds of projects could come out of our hack day, here are a handful of examples of clever, creative analyses of NYC open data: pedestrian and cyclist safety, a piece on stop and frisk by the NYCLU, another blog on stop and frisk, a publication about the effects of police surges on crime, the city's rhythm by the hour and one of many pieces on Citi Bike.
Come join us, learn with us, explore with us, make new friends and, most importantly, have a great time!