Word Count: 2801
This literature review first identifies the human development problem of urban crowdedness that people currently face. It then introduces several measurements and indexes that scientists have proposed for extracting data from the urban model. Next, it examines present case studies of urban population, identifying their techniques, models, methodologies, strengths, and shortcomings. It then presents the Agent-Based Model in the context of urban population, identifying its usefulness, strengths, weaknesses, and real-world applications. Finally, it provides a conclusion and proposes a broad research question: how can we extract and measure data effectively in the study of urban population?
The global population has been growing at an unprecedented rate in the past decades. It took only 12 years for the world population to add a staggering one billion people. What is more surprising is that over half of the global population lives in cities, which cover merely three percent of the world’s land surface. While the booming urban population brings economic prosperity and diversity to cities, it has also caused many social problems and harmed the urban environment through traffic congestion, noise, skyrocketing real estate prices, and more crowded living spaces. However, big data and geospatial technologies can be applied to tackle the urban population issue: by examining and analyzing the data and understanding population trends and behaviors, we can find optimal solutions and thus improve the urban environment. The review focuses mainly on China and South Asia because of their large populations. This literature review relates to Amartya Sen’s definition of human development by proposing solutions and ideas for a better urban living environment. It also relates to the United Nations’ goal of Sustainable Cities and Communities. In this literature review, I will examine current papers on the topic of urban population, identify their gaps and shortcomings, and analyze the future potential of this field of study in order to address a broad research question: how can data science methods and technologies be practically applied to the study of urbanization to help us understand population trends and behaviors in this urban complex adaptive system, and ultimately offer solutions that optimize the urban living environment, contributing to human development by expanding the environmental freedom of sustainability?
It is indisputable that the field of data science is becoming more and more relevant in everyone’s daily life. Technologies such as artificial intelligence and neural networks, along with numerous models and big data, are being applied ever more widely across many fields of study, including the study of urbanization and population. However, as scientists pay more attention to this field, they have found that few reliable methods and standards have been developed to address the problem of urban population measurement, owing to the “unclearness in the definition of urban functional compactness” (Lan et al. 2021, p. 2). In light of this, some scientists have proposed various indexes and models to standardize and measure urban population more effectively, and these have succeeded to some extent in real-world applications (Lan et al. 2021; Yao et al. 2017).
In their study, Lan et al. use datasets such as POIs (points of interest), GIS (geographic information systems), VIIRS (Visible Infrared Imaging Radiometer Suite), and RNO (Roads Network of OpenStreetMap) to create the Functional Compactness Index (FCI), which seeks to measure and “distinguish the differences [of] the functions [and] compactness” in urban areas. This technique works by first “[taking] the street blocks in RNO data as the basic spatial analysis unit. Then, the FCI identifies the functional attributes of each street block according to the POIs. Finally, it uses NPP/VIIRS nighttime light data to determine the intensity of human activity in functional zones.” In this study, POIs are used to determine the function of every basic street block unit. Functions are “identified based on the proportion of each kind” out of six kinds, and a street block is assigned to whichever function has the largest number of points. Meanwhile, the intensity of human activity is based on the VIIRS data (Lan et al. 2021, p. 3). The FCI is designed to measure both the compactness and the functionality of a particular urban area. According to the authors, the index has successfully modeled four important Chinese cities, leading to two conclusions about urban population: greater human activity intensity means more compact city functions, and the mixing of residential zones with other zones is positively correlated with the compactness of a city. However, the research gap in this paper is that the FCI cannot monitor or measure the flow, or movement, of the population; it can only provide an overview of a city’s population density and of the functionality of a chunk of urban area.
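The first two steps described above (assigning each street block the function with the most POIs, then attaching a nighttime-light intensity) can be illustrated with a minimal Python sketch. This is not the authors' implementation and does not reproduce the FCI formula itself, which is not given in this review. The category names, the block identifiers, the sample light values, and the helper names `classify_block` and `mean_intensity` are all hypothetical, and the alphabetical tie-break is an assumption made only so the result is deterministic.

```python
from collections import Counter

def classify_block(poi_categories):
    """Return (dominant_function, proportions) for one street block.

    poi_categories: list of category labels, one per POI in the block.
    The category names used below are hypothetical placeholders.
    """
    if not poi_categories:
        return None, {}
    counts = Counter(poi_categories)
    total = sum(counts.values())
    proportions = {kind: n / total for kind, n in counts.items()}
    # Assumed tie-break: highest count first, then alphabetical order.
    dominant = min(counts, key=lambda kind: (-counts[kind], kind))
    return dominant, proportions

def mean_intensity(light_values):
    """Average nighttime-light value of the pixels falling in a block."""
    return sum(light_values) / len(light_values) if light_values else 0.0

# Hypothetical input: POI labels and nighttime-light pixel values per block.
blocks = {
    "block_a": {"pois": ["residential", "residential", "commercial"],
                "lights": [12.0, 18.0]},
    "block_b": {"pois": ["industrial", "commercial", "commercial", "public"],
                "lights": [40.0, 44.0, 48.0]},
}

summary = {}
for block_id, data in blocks.items():
    function, shares = classify_block(data["pois"])
    summary[block_id] = {
        "function": function,
        "shares": shares,
        "intensity": mean_intensity(data["lights"]),
    }

for block_id, row in summary.items():
    print(block_id, row["function"], round(row["intensity"], 1))
```

In this toy input, the second block is labeled commercial and carries the higher light intensity, which is the kind of per-block record that an index such as the FCI would then combine.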
The measurement method above meets the demand for an accurate overview of urban population data; however, there also needs to be a detailed measurement of the population in small areas of a city. To address the problem that conventional geospatial models focus only on large-scale population distribution, with spatial analysis units that are too large and general, a model that measures microscale population distribution at the local level has also been introduced (Yao et al. 2017). Using datasets from POIs and APIs (application programming interfaces), together with the important Realtime Tencent User Density (RTUD) provided by Tencent, the model works by first choosing the high-population POIs to “map the preliminary population disaggregation;” then constructing a nonlinear population model that integrates RFA and other big data “to indicate the spatial heterogeneity;” and finally calculating the microscale population distribution at the building level (Yao et al. 2017, p. 1226-1227). Because the datasets the researchers emphasize most include time-sensitive data such as the RTUD, we can get a much better idea of the population distribution at specific times of day; moreover, the model offers more accuracy and detail than the FCI, since it measures population distribution in smaller spatial analysis units at the building level. Its application also appears successful in modeling the microscale population distribution of China’s largest cities (Beijing, Shanghai, and Guangzhou). However, one deficiency of this model that cannot be ignored is its reliance on a dataset from the private company Tencent. Although Tencent has reached 808 million users in China and covers more than 93% of the population in China’s largest cities, the dataset might still create a bias that works against or ignores people who do not use Tencent. It also raises a privacy question: should private companies share their users’ data with third parties, regardless of intent?
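The general idea of disaggregating a known population total down to individual buildings can be shown with a short Python sketch. It is a deliberately simplified stand-in, assuming a plain proportional allocation in which each building's weight is its floor area multiplied by an observed user density; it is not the nonlinear model built by the authors. The building names, field names, numbers, and the function `disaggregate` are all hypothetical.

```python
def disaggregate(total_population, buildings):
    """Split an area's population across its buildings by weight.

    Assumption for illustration only: weight = floor_area * user_density.
    The study discussed above fits a nonlinear model instead.
    """
    weights = {
        name: info["floor_area"] * info["user_density"]
        for name, info in buildings.items()
    }
    weight_sum = sum(weights.values())
    if weight_sum == 0:
        raise ValueError("all building weights are zero; cannot allocate")
    return {
        name: total_population * weight / weight_sum
        for name, weight in weights.items()
    }

# Hypothetical buildings inside one census area of 1000 people.
buildings = {
    "tower": {"floor_area": 2000.0, "user_density": 3.0},
    "shop": {"floor_area": 500.0, "user_density": 4.0},
    "depot": {"floor_area": 1000.0, "user_density": 2.0},
}

estimates = disaggregate(1000, buildings)
for name, people in estimates.items():
    print(f"{name}: {people:.0f}")
```

Because the allocation is proportional, the building estimates always sum back to the area total, so the fine-scale map stays consistent with the coarser census figure. The sketch also makes the bias concern visible: a building whose occupants rarely use the data provider's services gets a low weight regardless of how many people live there.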
Although the two proposed models have addressed the lack of urban population measurement to some extent, improvements can still be made so that a measurement model can represent the entire urban population and detect the population’s trends, flows, and dynamics in real time, providing scientists and policy makers with useful data for optimizing the urban environment.
Having discussed the concepts and models of geospatial data in the field of urban population, we now examine a few case studies in which data science models and technologies are applied in real-world scenarios. By looking at how each study extracts and analyzes data and what outcomes and conclusions it draws, we can obtain an overview of the advantages and shortcomings of current data science applications in the real world. To better illustrate the progress that the field of data science has made over the past years, I believe it is beneficial to compare two case studies, dated 2014 (Shirazi et al. 2014) and 2019 (Xie et al. 2019), respectively.
In the 2014 study, geospatial techniques are applied to analyze the city growth and population trend of Lahore, Pakistan (Shirazi et al. 2014). The authors use techniques such as remote sensing, Land Use and Land Cover (LULC) analysis, and GIS. The image-processing technique of remote sensing is crucial in this study, creating a base map that “is the only entity which directs the data to a clear spatial dimension;” combining it with survey data from the Survey of Pakistan and other government institutions, the researchers created a suitable base map for GIS modeling. Since the researchers are interested in the past development of the city of Lahore, the source maps were scanned and digitized so that a growth map of Lahore could be developed. Over the 63 years since Independence, the growth of Lahore “has resulted in major development in south and southeast directions” due to physical barriers and the national boundary; it is also worth noting that the physical growth of the city “has been along major roads” (Shirazi et al. 2014, p. 275-277). It is not hard to tell that the geospatial technology used in this study is comparatively primitive: the map data sources were used to create the growth map and to identify the physical barriers, political factors, and other reasons that have shaped the urban development and city limits of Lahore. The study provides only a broad overview that might be useful for future planning; it neither addresses any practical problems nor offers an in-depth analysis of city growth.
In a much more recent study, the relationship between polycentric urbanization and urban population dynamics is investigated (Xie et al. 2019). This study uses geospatial techniques similar to those of the previous one, such as remote sensing and GIS; however, it adds further technologies and datasets, such as the Defense Meteorological Satellite Program Operational Linescan System (DMSP/OLS) to extract the contour of built-up areas, Geographically Weighted Regression (GWR), and social media data (in particular Weibo, a Chinese microblogging platform comparable to Twitter). The researchers develop the model in three steps: first, “relative radiation correction” is performed as data preprocessing; second, GWR is used to define the city centers and subcenters; third, “the minimum cost distance and the optimal route buffer” and the “multitemporal built-up areas” are extracted to be analyzed in the economic corridor. Three major Chinese cities are examined with the model, and the approach is compared with other methodologies. For the three cities, Wuhan, Xi’an, and Shenyang, the model produces optimal routes with the lowest transportation costs connecting the main centers and the subcenters, generated via the “minimum cost distance algorithm;” it also detects the growth of each city and the emergence of city subcenters, along with the expansion of the road network. The results show success in both detecting and predicting the direction of urban expansion, in addition to establishing the relationship between the “polycentric structure and urban dynamics” (Xie et al. 2019, p. 1, 11-13, 20). Compared with the similar 2014 case study, the advancement of data science and methodology is evident: policy makers are presented with a more comprehensive and accurate growth map with which to determine future economic policies and urban regulations.
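The notion of a minimum-cost route between a city center and a subcenter can be made concrete with a small Python sketch. It assumes the cost surface is a grid in which each cell holds the cost of entering it, and it uses Dijkstra's algorithm over four-connected neighbors as a stand-in; the paper's own cost surface and algorithm details are not reproduced here. The grid values, the coordinates, and the function name `least_cost_path` are hypothetical.

```python
import heapq

def least_cost_path(cost, start, goal):
    """Dijkstra search on a grid; entering a cell costs cost[row][col].

    Returns (total_cost, path), where path is a list of (row, col) cells.
    The start cell's own cost is not counted.
    """
    rows, cols = len(cost), len(cost[0])
    best = {start: 0.0}
    previous = {}
    queue = [(0.0, start)]
    while queue:
        dist, cell = heapq.heappop(queue)
        if cell == goal:
            break
        if dist > best[cell]:
            continue  # stale queue entry, a cheaper route was already found
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols:
                new_dist = dist + cost[nr][nc]
                if new_dist < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = new_dist
                    previous[(nr, nc)] = cell
                    heapq.heappush(queue, (new_dist, (nr, nc)))
    if goal not in best:
        return float("inf"), []
    path = [goal]
    while path[-1] != start:
        path.append(previous[path[-1]])
    path.reverse()
    return best[goal], path

# Hypothetical cost surface: low values mark cheap corridors such as roads.
cost_surface = [
    [1, 1, 9],
    [9, 1, 9],
    [9, 1, 1],
]

total, path = least_cost_path(cost_surface, start=(0, 0), goal=(2, 2))
print(total, path)
```

In this toy grid the cheapest route follows the column of low-cost cells, which mirrors the observation in both case studies that urban growth tends to track major roads.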
However, problems similar to those in the previous section persist here: the bias of the data, the inability to simulate human activities at the individual level, and the privacy concerns are all gaps that future urban geospatial population studies need to address. While current geospatial techniques can produce satisfactory results as references for urban management, there is definitely room for future improvement and refinement.
Having identified the gaps and shortcomings of conventional urban geospatial techniques, in this section I will introduce Agent-Based Modeling (ABM), a geospatial data science method, by explaining its methodology, discussing its pros and cons, and arguing why applying ABM can improve the outcomes of geospatial urban population studies. The idea of Agent-Based Modeling was introduced in the 1940s but did not become widespread until the 1990s. It is now widely applied in many academic disciplines, such as biology, epidemiology, and the social sciences, because of its ability to simulate individual activities and behaviors and to show how the trends and dynamics of lower-level subjects alter the behavior of the higher-level system.
A study that applies ABM to our topic of urban population by investigating human exposure to urban environmental stresses (Yang et al. 2018) gives us the opportunity to understand Agent-Based Models more thoroughly. By definition, ABM “considers the essential known and measurable aspects of an agent;” more importantly, ABM is able to acknowledge the “nonlinearities” and the “underlying dynamic processes,” which reflect the true nature of a complex system such as human society. In this study, the researchers construct the ABM framework with three overlapping layers: “spatial data of the concerned urban environment, concentrations of environmental stress sources, and human activities.” Within this framework, environmental stress sources that vary by time of day are treated as factors that influence the exposure of individual agents, who “dynamically follow their daily life according to predetermined rules that are set according to empirical studies and specific surveys.” During the simulation, the model collects and summarizes “both individual and collective exposure and inform relevant exposure reduction strategies,” so that human exposure to pollution can be measured and analyzed. Because the urban population is extremely diverse and individuals all behave in their own ways, the researchers “group people with similar attributes and behaviors,” using all kinds of personal characteristics including “age, gender, work, income, education, living and working location, and access to cars or public transport, as well as the environmental conditions.” For ease of data collection and computation, the daily routine of a given group of agents is assumed to be uniform (Yang et al. 2018, p. 6-9). With a model that reflects individual behaviors along with their trends and influences, the study successfully simulates the city of Hamburg and provides precise data to the researchers.
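The structure of such a framework can be sketched in a few lines of Python. The sketch keeps the three ingredients described above (places, time-varying stress concentrations, and grouped daily routines) but everything concrete in it is an assumption for illustration: the locations, the concentration values, the two agent groups, and the names `CONCENTRATION`, `ROUTINES`, `Agent`, and `simulate_day` are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass

# Hypothetical hourly stress concentration at each location (24 values).
CONCENTRATION = {
    "home": [5.0] * 24,
    "office": [8.0] * 24,
    # Road values peak during assumed rush hours (6-9 and 16-19).
    "road": [10.0] * 6 + [40.0] * 4 + [20.0] * 6 + [40.0] * 4 + [10.0] * 4,
}

def build_routine(segments):
    """Expand (location, hours) pairs into a 24-entry hourly routine."""
    routine = []
    for location, hours in segments:
        routine.extend([location] * hours)
    if len(routine) != 24:
        raise ValueError("a routine must cover exactly 24 hours")
    return routine

# One uniform daily routine per group, as the grouping assumption implies.
ROUTINES = {
    "commuter": build_routine(
        [("home", 7), ("road", 1), ("office", 9), ("road", 1), ("home", 6)]
    ),
    "home_worker": build_routine([("home", 24)]),
}

@dataclass
class Agent:
    group: str
    exposure: float = 0.0

def simulate_day(agents):
    """Step through 24 hours, accumulating each agent's exposure."""
    for hour in range(24):
        for agent in agents:
            location = ROUTINES[agent.group][hour]
            agent.exposure += CONCENTRATION[location][hour]
    return sum(agent.exposure for agent in agents)

agents = [Agent("commuter"), Agent("commuter"), Agent("home_worker")]
collective = simulate_day(agents)
for agent in agents:
    print(agent.group, agent.exposure)
print("collective exposure:", collective)
```

Even this toy version yields both individual and collective exposure figures, and it shows where an exposure reduction strategy could be tested: changing a routine (for example, shifting the commute hour) changes the totals without touching the rest of the model.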
This study demonstrates both the viability and the advantages of applying the Agent-Based Model in urban studies; it also addresses the problem raised previously regarding the failure to simulate at the individual level. Additionally, the use of surveys avoids the concern over privacy violations, as well as some of the possible bias of private-sector data.
Following Barder’s argument that society functions as a complex adaptive system and that development is its evolutionary process, ABM’s distinctive ability to model subjects’ behavior in order to observe its effects on the higher-level system will help us considerably in identifying and addressing the population problems that hinder human urban development. Another study, which explores the connections between ABM and Computational Social Science (Conte & Paolucci 2014), offers similar opinions: Agent-Based Modeling is the “only known approach apt to model and reproduce sets of heterogeneous agents interacting and communicating in different ways,” which is central to urban population studies that involve massive numbers of individuals and complex behaviors. Moreover, “ABM allows the interplay between different levels of a social system to be modeled and observed,” enabling researchers to observe overall effects that cannot be seen by modeling or simulating a single unit. Of course, ABM has shortcomings: the minimal-conditions logic designed to keep the computation simple can often generate agents that “become arbitrary, poorly comparable, competent in highly specific domains of knowledge and disarmingly inapt in any other,” which “reduces the validity of ABMs.” Also, the overuse of ABM can lead to the neglect of “the micro-foundations of phenomena” in the scientific world (Conte & Paolucci 2014, p. 2-5). As in the debate between Anderson and Kitchin, I side with Kitchin: pursuing only present efficiency and convenience without understanding the underlying theory and mechanisms will obstruct the development of science and technology.
This literature review has proposed a research question about using geospatial technologies to improve human well-being in the urban environment, examined recent studies of urban population and pointed out their strengths and weaknesses, and introduced the Agent-Based Model and its application as a way to improve the precision and trustworthiness of geospatial urban studies. The main gaps identified are the limitations of the data that researchers are able to collect, the failure of the models to simulate at the individual level, and the privacy concerns of citizens. As the section above shows, ABM is able to solve some of these problems, but few studies directly apply ABM to identify and address urban population problems. The essential research question is therefore: how can we identify, measure, and extract urban population data more effectively, so that we can tackle the development problems caused by overcrowding and optimize the urban living environment for its citizens?
References:
Lan, T., Shao, G., Xu, Z., Tang, L., & Sun, L. (2021). Measuring urban compactness based on functional characterization and human activity intensity by integrating multiple geospatial data sources. Ecological Indicators, 121. https://doi.org/10.1016/j.ecolind.2020.107177
Yao, Y., Liu, X., Li, X., Zhang, J., Liang, Z., Mai, K., & Zhang, Y. (2017). Mapping fine-scale population distributions at the building level by integrating multisource geospatial big data. International Journal of Geographical Information Science, 31(6), 1220–1244. https://doi.org/10.1080/13658816.2017.1290252
Xie, Z., Ye, X., Zheng, Z., Li, D., Sun, L., Li, R., & Benya, S. (2019). Modeling Polycentric Urbanization Using Multisource Big Geospatial Data. Remote Sensing, 11(3), 310. https://doi.org/10.3390/rs11030310
Shirazi, S. A., & Kazmi, S. J. H. (2014). Analysis of Population Growth and Urban Development in Lahore-Pakistan using Geospatial Techniques: Suggesting some future Options. South Asian Studies, 29(1), 269–280.
Yang, L., Hoffmann, P., Scheffran, J., Rühe, S., Fischereit, J., & Gasser, I. (2018). An Agent-Based Modeling Framework for Simulating Human Exposure to Environmental Stresses in Urban Areas. Urban Science, 2(2), 36. https://doi.org/10.3390/urbansci2020036
Conte, R., & Paolucci, M. (2014, June 10). On agent-based modeling and computational social science. Retrieved March 22, 2021, from https://www.frontiersin.org/articles/10.3389/fpsyg.2014.00668/full