2-stars (subgraphs consisting of a node with two spokes, so a node with degree 3 has three 2-stars associated with it) given the number of nodes, and have these act as variables $z_i$ of your model, and then tweak the associated coefficients $\theta_i$ to get them tuned to a certain type of behavior you observe or wish to simulate. If $z_1$ refers to the number of triangles, then a positive value for $\theta_1$ would indicate a tendency toward a larger number of triangles, for example.
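To make the statistics concrete, here is a minimal sketch that counts 2-stars and triangles on a small made-up graph (the edge list is hypothetical, chosen only for illustration). It uses the fact that a node of degree $d$ contributes $\binom{d}{2}$ 2-stars:

```python
from itertools import combinations

# Hypothetical toy graph, stored as a set of undirected edges.
edges = {(0, 1), (0, 2), (1, 2), (1, 3)}
nodes = {u for e in edges for u in e}
adj = {v: set() for v in nodes}
for u, v in edges:
    adj[u].add(v)
    adj[v].add(u)

# A node of degree d contributes C(d, 2) 2-stars.
two_stars = sum(len(nbrs) * (len(nbrs) - 1) // 2 for nbrs in adj.values())

# Count triangles by checking every triple of nodes for mutual adjacency.
triangles = sum(
    1 for a, b, c in combinations(sorted(nodes), 3)
    if b in adj[a] and c in adj[a] and c in adj[b]
)
print(two_stars, triangles)  # prints: 5 1
```

Note that node 1 has degree 3 and alone contributes three of the five 2-stars, matching the parenthetical remark above.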
Additional graph statistics that have been introduced include $k$-stars (subgraphs consisting of a node with $k$ spokes, so a node with degree $k+1$ has $k+1$ $k$-stars associated with it), degree, or alternating $k$-stars, an aggregate statistic on the number of $k$-stars for various $k$.
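Since a node of degree $d$ contributes $\binom{d}{k}$ $k$-stars, the $k$-star count depends only on the degree sequence. A small sketch, using a made-up degree sequence (the alternating $k$-star statistic then combines these counts across $k$ with alternating signs and a damping weight, a detail omitted here):

```python
from math import comb

# Hypothetical degree sequence of a small graph, one entry per node.
degrees = [3, 2, 2, 1]

def k_stars(degrees, k):
    # S_k: a node of degree d contributes C(d, k) k-stars.
    return sum(comb(d, k) for d in degrees)

print([k_stars(degrees, k) for k in (1, 2, 3)])  # prints: [8, 5, 1]
```

The degree-(k+1) node contributes $k+1$ of the $k$-stars, as described above: here the degree-3 node contributes three of the five 2-stars.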
Let's give you an idea of what an ERGM might look like formula-wise:
$$\Pr(Y = y) = \frac{1}{\kappa} \exp\bigl(\theta_1 z_1(y) + \theta_2 z_2(y) + \theta_3 z_3(y)\bigr)$$
Here we're saying that the probability of observing one particular realization of a random graph or network, $Y$, is a function of the graph statistics or properties, which we just described as denoted by $z_i$.
In this framework, a Bernoulli network is a special case of an ERGM, where we have only one variable, corresponding to the number of edges.
Inference for ERGMs
Ideally, though in some cases unrealistic in practice, one could observe a sample of several networks, $Y_1, \dots, Y_n$, each represented by its adjacency matrix, say for a fixed number $N$ of nodes.
Given those networks, we could model them as independent and
identically distributed observations from the same probability model.
We could then make inferences about the parameters of that model.
As a first example, if we fix a Bernoulli network, which is specified by the probability $p$ of the existence of any given edge, we can calculate the likelihood of any of our sample networks having come from that Bernoulli network as

$$L = \prod_{i=1}^{n} p^{d_i} \bigl(1 - p\bigr)^{D - d_i}$$
where $d_i$ is the number of observed edges in the $i$th network and $D$ is the total number of dyads in the network, as earlier. Then we can back out an estimator for $p$ as follows:
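Setting the derivative of $\log L$ to zero gives the maximum likelihood estimate $\hat{p} = \sum_i d_i / (nD)$, the fraction of all observed dyads that are edges. A sketch with made-up edge counts:

```python
# Maximizing log L = sum_i [d_i log p + (D - d_i) log(1 - p)]
# gives the MLE p_hat = (sum_i d_i) / (n * D).

# Made-up sample: edge counts of n = 3 observed networks
# on N = 5 nodes, so D = N * (N - 1) / 2 = 10 dyads each.
N = 5
D = N * (N - 1) // 2
d = [4, 6, 5]
n = len(d)

p_hat = sum(d) / (n * D)
print(p_hat)  # prints: 0.5
```

This is just the familiar binomial MLE: pool every dyad across the sampled networks and take the observed edge frequency.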