Different assumptions, such as transitivity of edges, can be built in by choosing different objective functions. The likelihood of this class of models in explaining a set of data can be calculated, and computer algorithms are used to find the best fit to the data (given a particular kind of objective function), including statistics on how likely or unlikely that best model was (e.g., [13]).
For example, if one has panels of survey data about actor properties, one chooses some assumptions about the nature of the relationships between actors, and an algorithm will produce a dynamic network model for these.
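As a rough illustration of the statistics such an objective function might weight, the following sketch computes two common ERGM statistics (edge and triangle counts); networkx, the example graph, and the choice of statistics are assumptions here, not the fitting machinery described in [13].

    # Sketch: statistics a simple two-parameter ERGM objective function might weight.
    import networkx as nx

    def ergm_statistics(G):
        """Edge and triangle counts, two sufficient statistics of a simple ERGM."""
        n_edges = G.number_of_edges()
        # nx.triangles counts triangles per node; each triangle is counted three times
        n_triangles = sum(nx.triangles(G).values()) // 3
        return n_edges, n_triangles

    def ergm_log_weight(G, theta_edges, theta_triangles):
        """Unnormalised log-weight theta . s(G) of a graph under this ERGM."""
        e, t = ergm_statistics(G)
        return theta_edges * e + theta_triangles * t

    G = nx.karate_club_graph()          # stand-in for an observed network
    print(ergm_statistics(G))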
This process is analogous to regressing an equation (with variables and parameters) against a set of data: one determines the form of the equation and then finds the best fit to the data. By comparing the fit of different models one gains information about the data, for example, to what extent a linear model plus random noise explains the data, or whether including a certain variable explains the data better than leaving it out. In the case of ERGM, the above machinery allows one to find which class of networks, resulting from a given objective function, best fits a set of data. In other words, it gives a “surprise-free” view of what a network that gave the data might look like.
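To make the regression analogy concrete, a minimal sketch (with synthetic data and numpy, both assumptions) compares the residual error of a linear fit with and without an extra variable; the model with the lower residual sum of squares explains the data better.

    # Sketch: compare a "linear model + noise" fit with and without an extra variable.
    import numpy as np

    rng = np.random.default_rng(0)
    x1 = rng.normal(size=200)
    x2 = rng.normal(size=200)
    y = 2.0 * x1 + 0.5 * x2 + rng.normal(scale=0.3, size=200)   # synthetic data

    def residual_sum_of_squares(X, y):
        """Least-squares fit of y on the columns of X; return the residual error."""
        coef, rss, *_ = np.linalg.lstsq(X, y, rcond=None)
        return rss[0] if rss.size else float(np.sum((y - X @ coef) ** 2))

    X_small = np.column_stack([np.ones_like(x1), x1])         # linear model + noise
    X_full = np.column_stack([np.ones_like(x1), x1, x2])      # with the extra variable
    print(residual_sum_of_squares(X_small, y), residual_sum_of_squares(X_full, y))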
From the point of view of validating a simulation, the available empirical data could be used to infer the best kind of ERGM model, against which the actual network produced by a simulation could be judged by calculating the probability of the output network being generated by that particular ERGM model. This would be a kind of “null model” comparison, giving an indication of the extent to which the simulation deviated from the “surprise-free” ERGM baseline model. This scheme has a number of advantages: (1) a well-defined basis on which the comparison is made, (2) the ability to use non-network data (e.g., waves of panel data), and (3) the ability for assumptions, such as transitivity, to be made explicit. However, it still boils down to a single measure each time against a particular class of ERGM models, and it does depend on the extent to which the network is in fact “surprise-free”.
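As a rough sketch of such a comparison, the simplest ERGM baseline is a Bernoulli (Erdős-Rényi) graph whose edge probability is estimated from the empirical data; the log-likelihood of the simulated network under this baseline then indicates how far it deviates from the surprise-free model. networkx and the stand-in graphs are assumptions, and a realistic baseline would include further statistics such as transitivity.

    # Sketch: score a simulated network against the simplest "null" ERGM baseline.
    import math
    import networkx as nx

    def bernoulli_log_likelihood(G, p):
        """Log-likelihood of graph G under an independent-edge (Bernoulli) model."""
        n = G.number_of_nodes()
        pairs = n * (n - 1) // 2
        edges = G.number_of_edges()
        return edges * math.log(p) + (pairs - edges) * math.log(1.0 - p)

    empirical = nx.karate_club_graph()              # stand-in for the observed data
    p_hat = nx.density(empirical)                   # estimated baseline edge probability
    simulated = nx.erdos_renyi_graph(empirical.number_of_nodes(), p_hat, seed=1)

    print(bernoulli_log_likelihood(simulated, p_hat))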
7 Towards a Scheme for Validating Networks
In this section, we give an example of how some of the aforementioned network
measures were used to compare simulated results against real life data and to explore
the parameter space of an agent-based model. Elsewhere [1, 2, 3], we used the affinity
measure to identify the compatibility of nodes based on their attributes. For the cor-
responding agent-based model, we developed an algorithm that enabled agents to find
other agents with similar attributes. In particular, it identified the significance of each
attribute in order to determine the compatibility among the agents. The higher the
affinity measure, the higher the importance of an attribute. Hence, this insight helped us determine the parameter space of the local processes in our model.
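The affinity measure itself is defined in [1, 2, 3]; purely as a hypothetical illustration of scoring how much an attribute matters for tie formation, the sketch below compares the share of edges whose endpoints agree on an attribute with the share expected by chance (networkx and the example graph's "club" attribute are assumptions, not the measure used in our work).

    # Hypothetical attribute-affinity score, NOT the measure from [1, 2, 3].
    import networkx as nx

    def attribute_affinity(G, attr):
        """Excess of same-attribute edges over the level expected by chance."""
        values = nx.get_node_attributes(G, attr)
        same = sum(1 for u, v in G.edges() if values[u] == values[v])
        observed = same / G.number_of_edges()
        labels = list(values.values())
        n = len(labels)
        expected = sum(labels.count(x) * (labels.count(x) - 1) for x in set(labels)) / (n * (n - 1))
        return observed - expected

    G = nx.karate_club_graph()                      # nodes carry a "club" attribute
    print(attribute_affinity(G, "club"))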
After collecting simulation results, we then compared the degree distribution (both the distribution type and its fitted parameter values), the clustering coefficient, the average community modularity (along with the identified number of communities), and the calculated Silo indices of the attribute space. We observed time series of the clustering coefficient and of the standard deviation in the number of links over the course of each simulation.
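A minimal sketch of how such summary measures could be computed for an observed and a simulated network is given below; networkx and the stand-in graphs are assumptions, and the Silo index is specific to the cited work and is omitted.

    # Sketch: summary measures for comparing a simulated network against observed data.
    import networkx as nx
    from networkx.algorithms import community

    def summarise(G):
        degrees = [d for _, d in G.degree()]
        communities = community.greedy_modularity_communities(G)
        return {
            "mean_degree": sum(degrees) / len(degrees),
            "clustering_coefficient": nx.average_clustering(G),
            "n_communities": len(communities),
            "modularity": community.modularity(G, communities),
        }

    observed = nx.karate_club_graph()               # stand-in for the empirical network
    simulated = nx.erdos_renyi_graph(observed.number_of_nodes(), nx.density(observed), seed=2)
    print(summarise(observed))
    print(summarise(simulated))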