PhD Thesis

“Towards a Data-driven Approach for Agent-Based Modelling: Simulating Spanish Postmodernisation”

Interdisciplinary PhD in Computer Science (combining Artificial Intelligence, Social Networks and Sociology)
Awarded by the Universidad Complutense de Madrid
Mention: Honors (Sobresaliente Cum Laude) & European Mention
Language: Full text in English & summary in Spanish

Advisors

Juan Pavón, Professor of Computer Science at the Universidad Complutense de Madrid (Spain), Director of the Research Group in Software Agents GRASIA
Millán Arroyo, Professor of Sociology at the Universidad Complutense de Madrid (Spain)

Jury

Nigel Gilbert, Professor of Sociology at the University of Surrey (UK), Director of the Centre for Research in Social Simulation (CRESS)
Luis Antunes, Professor of Computer Science at the Universidade de Lisboa (Portugal), Director of the Group of Studies in Social Simulation (GUESS)
Guillaume Deffuant, Researcher at Cemagref (France), Head of the Laboratoire d’Ingénierie des Systèmes Complexes (LISC)
Pablo Gervás, Professor of Computer Science at the Universidad Complutense de Madrid (Spain), Director of the Research Group Natural Interaction based on Language (NIL)
Luis Hernández-Yáñez, Professor of Computer Science at the Universidad Complutense de Madrid (Spain), Chairperson of the Department of Software Engineering and Artificial Intelligence

Hosting labs

Funding

This project was supported by a 4-year scholarship from the Universidad Complutense de Madrid (used during 2007-2009).

Keywords

agent-based model, data-driven modelling, demography, friendship, fuzzy logic, microsimulation, postmodern society, social network, social values, social simulation, societal transition, surveys, transition.

Abstract

In the last decades, computer simulation in general, and agent-based modelling (ABM) in particular, have become mainstream modelling techniques in many scientific fields, especially in Social Sciences such as Sociology or Economics. Social simulation allows the study of the complexity inherent to social phenomena, and it is attracting multidisciplinary research teams in order to manage this complexity.

There are different methodologies for ABM that, after compiling experience in processes, methods and tools, attempt to provide a systematic way to tackle new problems. Both the Multi-Agent Systems field and ABM have tried to provide robust methodologies to guide researchers in the modelling process.

However, there is an important epistemological distinction among agent-based models that these methodologies do not consider. Models can be classified depending on their research aim, and this classification can have methodological implications. Sometimes researchers seek a generic model to explain a social phenomenon from a high degree of abstraction, one that is simple enough to be used as an illustration of a specific theory or hypothesis. On the other hand, researchers may prefer to focus on the expressiveness of the model, together with the empirical descriptiveness of a specific case study. The first case corresponds to Theoretical Research, while the second corresponds to Data-driven Research.

Nowadays, most models are conceived from the theoretical approach, and thus methodologies are frequently biased towards it. However, without disregarding the role of theory, models can also seek expressiveness. In order to do that, they may have needs that are not met by general methodologies. For instance, issues such as empirical initialisation, the limitations of data collection, thorough empirical validation or the role of data in the design are not usually considered in those methodologies. Thus, there is no complete ABM methodology that, acknowledging that data-driven research has a different approach and different aims, provides a specific workflow for data-driven model development. Such a methodology should consider the key role of empirical data throughout all the modelling stages. This gap has caused most data-driven models to be constructed without a common framework.

This work attempts to fill this gap and build a complete methodology to guide data-driven agent-based modelling. It argues that when data from the observation of the real phenomenon are available, the modelling and simulation process involves additional stages. The methodology attempts to guide the injection of empirical data into simulations, bringing them closer to the real phenomenon under study, while acknowledging the important role of theory in the whole process. The approach is therefore complemented with a systematic method for exploring the model space in order to achieve comprehensible but descriptive models. This method was coined ‘Deepening KISS’, as explained in the methodological chapter.

This methodology is supported technically by the specification and implementation of a social agent framework. The framework is structured in modules which can be enabled at will in order to facilitate the exploration of the model space and its incremental construction, both within the data-driven approach. Instead of attempting to be general-purpose, this agent framework focuses on a family of problems which it is best suited to tackle.
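The idea of modules that can be enabled at will may be illustrated with a minimal sketch. This is not the framework implemented in the thesis: the names `Agent`, `ModularModel`, `ageing` and `friendship`, as well as the toy rule that links agents whose ages differ by at most five years, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


@dataclass
class Agent:
    """Hypothetical social agent with an age and a set of friend ids."""
    ident: int
    age: int
    friends: Set[int] = field(default_factory=set)


# A module is any function that updates the population for one time step.
Module = Callable[[List[Agent]], None]


def ageing(population: List[Agent]) -> None:
    """Toy demographic module: every agent gets one year older."""
    for agent in population:
        agent.age += 1


def friendship(population: List[Agent]) -> None:
    """Toy friendship module (assumed rule): link agents of similar age."""
    for a in population:
        for b in population:
            if a.ident < b.ident and abs(a.age - b.age) <= 5:
                a.friends.add(b.ident)
                b.friends.add(a.ident)


class ModularModel:
    """Runs only the modules that are currently enabled, in registration order."""

    def __init__(self, population: List[Agent], modules: Dict[str, Module]):
        self.population = population
        self.modules = modules
        self.enabled: Set[str] = set()

    def enable(self, name: str) -> None:
        if name not in self.modules:
            raise KeyError(f"unknown module: {name}")
        self.enabled.add(name)

    def disable(self, name: str) -> None:
        self.enabled.discard(name)

    def step(self) -> None:
        for name, module in self.modules.items():
            if name in self.enabled:
                module(self.population)


# Start from the simplest configuration, then add one module at a time.
model = ModularModel(
    [Agent(0, 20), Agent(1, 22), Agent(2, 40)],
    {"ageing": ageing, "friendship": friendship},
)
model.enable("ageing")
model.step()                # only the demographic module runs
model.enable("friendship")
model.step()                # both modules run
```

Switching modules on one at a time, as in the last lines, is one way to explore the model space incrementally: each added module can be compared against the simpler configuration before the next one is enabled.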

Moreover, an in-depth case study was developed to test and validate the application of the methodology and the proposed framework. This case study addresses the complex issue of the evolution of social values, together with the emergence of friendship and the demographic dynamics involved.

The construction of this agent-based model, coined Mentat, can be summarised in a series of key milestones. The proposed data-driven methodology is applied intensively throughout its development. The modelling process combines (a) a bottom-up and (b) a top-down approach. The former is represented by the social network arising from micro-level behaviour and friendship dynamics; the latter relies on the elaborate demographic model. The conceptualisation and specification of both have been justified theoretically in order to support their development. They have been implemented within the modular agent framework, designed in incremental layers. Mentat features are structured in modules which can be enabled or disabled in order to explore the model space following the stages defined in the methodology.

The model is validated from a quantitative macro perspective (empirical validation), from a qualitative micro perspective (social dynamics matching the theoretical assumptions) and from a theoretical perspective (discussing its sociological consistency). Different Artificial Intelligence techniques are applied and combined in the model, testing the adaptability of the framework and the usefulness of these techniques for social simulation. Mentat serves as a case study of the methodology and framework, but it also provides some sociological insight into the problem under study by giving new support to specific theories. The ABM specifically stresses the key significance of demographic dynamics in the case study: the evolution of social values in Spain at the end of the 20th century. This implies that intergenerational changes are considerably more important than intragenerational ones in this Spanish context, and supports Inglehart’s theories of values evolution.