Big data at the heart of smart cities

2015-09-15 03:07:59

“Smart city” is now a buzzword in South Asia, possibly due to the Modi government’s commitment to build 100 smart cities in India.  In Sri Lanka, some argue that Kandy should be the first smart city; others, that it is logical to start with Colombo.

Smart cities are about feedback
A true “smart city” is characterized by enhanced feedback loops within the complex system of systems that constitutes the city.  In the not-smart city, government and other decision-makers act without adequate and timely feedback.  Surveys are the principal source of systematic data, but they are expensive and cumbersome, and are thus rarely used.  Experimentation is not a viable option.

In the smart city, feedback is abundant.  Sensors generate big data which, when processed and used for resource allocation, improve the functioning of the city.

At one extreme of smart-city approaches lies the vision of a centrally coordinated city resting on pervasive use of specialized sensors (e.g., one under each parking space; multiple sensors at intersections), real-time or non-real-time analysis of the resultant big-data flows, and reliance on mathematical models.  South Korea’s Songdo is the exemplar.  At the other end of the continuum lies the “crowd-sourced” smart city, wherein the workings of the city are sought to be transformed by apps developed at “hackathons” by outside, and mostly volunteer, coders.     

Both approaches have weaknesses.  Central coordination is very expensive and hard to get right.  When mistakes are made, they are difficult to correct.

Crowdsourcing is attractive because of its potential to unleash decentralized innovation and because it is cheap.  But unless it is highly structured, both with regard to the data sets that are provided and in terms of implementing the solutions that are developed, transformational results are difficult to achieve.

A middle option focuses on citizens moving through time and space in the city as the primary sensors.  They generate the big data that, when analyzed, constitute the feedback that is the essence of a smart city.  Experimentation and learning are integral to this low-cost approach.  It is especially appropriate for the organically developed, congested cities in developing countries, where the costs of installing and maintaining city-owned sensors would be quite high.

Big data have always been there, but it is only recently that analysis has become tractable.  Over the past decades more and more activity has been “datafied,” to use the neologism coined by Mayer-Schönberger and Cukier.  The resulting very large data sets include, but are not limited to, schema-less (unstructured, but processable) data.  Until recently, constraints of computer memory, retrieval and processing limited the use of these data to entities that could afford supercomputers.  Hardware and memory have declined in price and improved in functionality, and open-source software has been developed, democratizing big data analytics.  This has enabled non-profit entities such as LIRNEasia to mobilize local data scientists to undertake research that can contribute to smart-city developments.  This work has attracted attention from many, including the UN organization dealing with big data, and has been covered by newspapers in the region.

Citizens as sensors
Smart cities require timely and accurate data such as those about land use, about where people live and congregate and when, about their mobility, their economic conditions, where they spend their money, and about their social networks.  The sole source of comprehensive data in countries like ours at this time is the ubiquitous mobile phone.  Mobile network big data (MNBD) are generated by all phones, smart and otherwise.

MNBD include call-detail records (CDRs), which are generated when calls and texts are sent or received, the Internet is used, or prepaid value is loaded, and visitor-location registry (VLR) data, which are generated when handsets “tell” base transceiver stations (BTS) that they are in the coverage area.  These “meta-data” are collected for network operation and billing and exclude the content of communications.

LIRNEasia has demonstrated the value of MNBD in Sri Lanka.  Pseudonymized, historical CDRs from multiple mobile operators have been analyzed to understand and monitor land use, congregations of people, peak and off-peak travel patterns, communities, and traffic.  Correlations have been validated using other datasets where available.  The findings, some of which are described below, have been shared with senior government officials in urban planning and statistics.  

Changes in population density  
Colombo is a small city of 550,000 according to the 2012 Census.  It has lost population since the previous census.  CDRs were analyzed to measure diurnal changes in population density and gain insights into who commutes into the city and from where.

Because CDRs are generated only when owners send or receive a call or text, interpolation techniques are used to fill the gaps so that the location of a phone can be plotted on an hourly basis.  The population “hot spots” identified using MNBD have been correlated with the findings of a conventional transport survey that cost USD 300,000-400,000.
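
The article does not say which interpolation technique was used.  As a rough illustration only, the sketch below assumes the simplest option: for each hour, a phone is placed at the tower of its nearest-in-time CDR event.  The function name, the input format and the tower labels are hypothetical.

```python
from bisect import bisect_left


def hourly_locations(events, hours=range(24)):
    """Place one phone at a tower for every hour of a day.

    events: list of (hour_of_day_as_float, tower_id) tuples for one phone
            on one day (a hypothetical, simplified stand-in for CDRs).
    Returns {hour: tower_id}; empty if the phone generated no events.
    Assumption: nearest-in-time fill, not necessarily the method used
    in the research described in the article.
    """
    if not events:
        return {}
    events = sorted(events)
    times = [t for t, _ in events]
    result = {}
    for h in hours:
        i = bisect_left(times, h)
        # The only candidates are the events just before and just after h.
        candidates = events[max(i - 1, 0): i + 1]
        result[h] = min(candidates, key=lambda e: abs(e[0] - h))[1]
    return result


# Invented example: three events at towers "A" and "B".
locations = hourly_locations([(8.5, "A"), (13.0, "B"), (21.25, "A")])
```

Summing such hourly placements over all phones gives a per-tower head count for each hour, which is the raw material for the density comparisons that follow.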

The analyses suggested that certain areas, such as the southern and central parts of the city of Colombo and a few other locations such as industrial zones outside the city, serve as sinks, attracting large numbers of people from surrounding suburban sources.  The northern part of the city, where the poor are concentrated, also functions as a source, showing lower density at midday on weekdays relative to midnight.  Many additional insights on where people come from, where they congregate and when have been generated.
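
The source/sink distinction amounts to comparing an area's weekday midday population with its midnight population.  The sketch below is one hedged way to express that comparison; the function name, the 10 percent threshold and the figures in the example are invented for illustration and do not come from the research.

```python
def classify_areas(midday, midnight, threshold=0.1):
    """Label each area 'sink', 'source' or 'balanced'.

    midday, midnight: dicts mapping area name -> estimated weekday population.
    threshold: hypothetical minimum relative change needed to assign a label.
    """
    labels = {}
    for area, night in midnight.items():
        day = midday.get(area, 0)
        if night == 0:
            # No residents at midnight: any daytime presence makes it a sink.
            labels[area] = "sink" if day > 0 else "balanced"
            continue
        change = (day - night) / night
        if change > threshold:
            labels[area] = "sink"      # gains people during the day
        elif change < -threshold:
            labels[area] = "source"    # loses people during the day
        else:
            labels[area] = "balanced"
    return labels


# Invented numbers, for illustration only.
labels = classify_areas(
    midday={"central": 1800, "north": 1500, "suburb": 1550},
    midnight={"central": 1000, "north": 2000, "suburb": 1500},
)
```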

Insights on land use
The diurnal loading patterns of BTS can provide valuable insights on land use.  BTS in the Colombo District (population 2.34 million; includes most of the Colombo metropolitan area and the City) have been classified into distinct categories by the application of unsupervised machine learning techniques to diurnal loading data.

The two polar cases are shown in Figure 1.  The left-hand profile, where the peak use occurs around midday and the weekday loading pattern differs from the weekend pattern, is from a commercial area.  The right-hand profile is from a BTS in a residential area.  Here, the peak occurs at around 7 PM and there is no significant difference between weekday and weekend.

Principal Component Analysis was used to extract the dominant patterns in each BTS’s diurnal loading.  Using an unsupervised machine learning technique, the 15 principal components of each BTS were used to classify the BTS into three categories reflecting different types of land use: predominantly commercial, predominantly residential, and mixed.  It is possible to further disaggregate the intermediate locales to show which way they “lean,” i.e., “leaning commercial” or “leaning residential.”
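
To make the idea concrete, the sketch below groups synthetic 24-hour load profiles into three clusters.  It is a simplification under stated assumptions, not the method used in the research: it skips the Principal Component Analysis step and clusters the normalized hourly profiles directly, and it uses k-means because the article does not name the unsupervised technique.  All profiles are artificial.

```python
import math


def bump(peak_hour, width=3.0):
    """Synthetic load curve: a smooth peak centred on peak_hour."""
    return [math.exp(-((h - peak_hour) / width) ** 2) for h in range(24)]


def normalize(profile):
    """Scale a profile to sum to 1 so that shape, not volume, is compared."""
    total = sum(profile)
    return [v / total for v in profile]


def kmeans(points, k, iterations=50):
    """Plain k-means, initialized deterministically from evenly spaced points."""
    step = len(points) / k
    centroids = [list(points[int(i * step)]) for i in range(k)]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assign each profile to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centroids


# Artificial towers: midday peak, evening peak, and both peaks.
raw_profiles = [
    bump(12), bump(13),                                  # commercial-like
    bump(19), bump(20),                                  # residential-like
    [a + b for a, b in zip(bump(12), bump(19))],         # mixed
    [a + b for a, b in zip(bump(13), bump(20))],         # mixed
]
profiles = [normalize(p) for p in raw_profiles]
cluster_ids, _ = kmeans(profiles, k=3)
```

The cluster numbers themselves carry no meaning; an analyst would still have to inspect each cluster's average profile to decide which one is commercial, residential or mixed.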

The analysis costs very little and can be done at frequent intervals, unlike the industry surveys that are conducted by the National Statistical Organization (NSO) every three or four years.

How to be really smart
Smart cities are not a synonym for special economic zones, where defined geographical areas are provided with high-quality infrastructure services as was done in the Bentota tourist zone and in the Katunayake EPZ.  

Infrastructure investments are needed, of course, but the city becomes smart only when its functioning improves because of enhanced feedback made possible by big data analytics and appropriate responses.  

It is always possible to invite the IBMs and Ciscos to transplant hardware-intensive solutions from Marseilles or Stockholm.  But the really smart approach would be to build on the low-cost, context-sensitive approach centered on MNBD that has already been demonstrated in Sri Lanka and is drawing increasing attention outside the country as well.
