Perhaps our first network

Adolescent Social Structure by Jim Moody

Can reveal surprising structure

Adolescent Social Structure by Jim Moody

This used to be surprising

The political blogosphere and the 2004 election: Divided they blog by Lada Adamic

Protein-Protein Networks

Network Map of Protein-Protein Interactions by Erich E. Wanker of the Max Delbrück Center for Molecular Medicine (MDC)

Social media networks

A social network analysis of Twitter: Mapping the digital humanities community by Grandjean & Mauro

Hairballs in Political Economy

BIT Formation

Structures do hide in hairballs …

International Conflict Event Warning System (ICEWS): Material Conflict by Minhas, Hoff, & Ward

Outline

  • What makes network data particular?
  • Talk about the "A" in AME
  • Talk about the "M" in AME
  • Additive and Multiplicative Effects (amen)
  • Does it work? Application
  • What's next?

Relational data

Relational data consists of

  • a set of units or nodes
  • a set of measurements, \(y_{ij}\), specific to pairs of nodes (i,j)

What's wrong with GLM?

GLM: \(y_{ij} \sim \beta^{T} X_{ij} + e_{ij}\)

Networks typically show evidence against independence of \(\{e_{ij} : i \neq j\}\)

Not accounting for dependence can lead to:

  • biased effects estimation
  • uncalibrated confidence intervals
  • poor predictive performance
  • inaccurate description of network phenomena
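These failure modes are easy to see in a small Monte Carlo. Below is a sketch in Python/numpy (mine, not from the talk; the setup — a sender-level covariate plus sender-clustered errors — is invented for illustration): the spread of OLS estimates across simulations is several times larger than the naive standard error suggests.

```python
import numpy as np

rng = np.random.default_rng(0)
n, reps, beta = 30, 200, 1.0
est, naive_se = [], []
for _ in range(reps):
    a = rng.normal(0, 1, n)                      # sender effects shared across each row
    w = rng.normal(0, 1, n)                      # a sender-level covariate
    X = np.tile(w[:, None], (1, n))              # x_ij = w_i for all j
    Y = beta * X + a[:, None] + rng.normal(0, 1, (n, n))
    x, y = X.ravel(), Y.ravel()
    bhat = (x @ y) / (x @ x)                     # OLS slope, no intercept
    resid = y - bhat * x
    s2 = (resid @ resid) / (len(y) - 1)
    est.append(bhat)
    naive_se.append(np.sqrt(s2 / (x @ x)))
ratio = np.std(est) / np.mean(naive_se)          # > 1 => naive SEs are too small
print(round(ratio, 1))
```

The naive confidence intervals implied here would be badly uncalibrated, which is exactly the second bullet above.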

We've been hearing this concern for decades now:

Thompson & Walker (1982)
Frank & Strauss (1986)
Kenny (1996)
Krackhardt (1998)

Beck et al. (1998)
Signorino (1999)
Li & Loken (2002)
Hoff and Ward (2004)

Snijders (2011)
Erikson et al. (2014)
Aronow et al. (2015)
Athey et al. (2016)

What network phenomena? Sender heterogeneity

Values across a row, say \(\{y_{ij},y_{ik},y_{il}\}\), may be more similar to each other than other values in the adjacency matrix because each of these values has a common sender \(i\)

What network phenomena? Receiver heterogeneity

Values across a column, say \(\{y_{ji},y_{ki},y_{li}\}\), may be more similar to each other than other values in the adjacency matrix because each of these values has a common receiver \(i\)

What network phenomena? Sender-Receiver Covariance

Actors who are more likely to send ties in a network may also be more likely to receive them

What network phenomena? Reciprocity

Values of \(y_{ij}\) and \(y_{ji}\) may be statistically dependent
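All four dependence patterns can be generated and detected with a few lines of code. An illustrative numpy sketch (all parameter values are my own):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
a = rng.normal(0, 1, n)                  # sender heterogeneity
b = 0.5 * a + rng.normal(0, 0.8, n)      # receiver effects, correlated with senders
Z = rng.normal(size=(n, n))
eps = (Z + 0.5 * Z.T) / np.sqrt(1.25)    # corr(eps_ij, eps_ji) = 0.8
Y = a[:, None] + b[None, :] + eps
off = ~np.eye(n, dtype=bool)             # ignore self-ties

row_het = np.corrcoef(a, Y.mean(axis=1))[0, 1]               # sender pattern
col_het = np.corrcoef(b, Y.mean(axis=0))[0, 1]               # receiver pattern
sr_cov  = np.corrcoef(Y.mean(axis=1), Y.mean(axis=0))[0, 1]  # sender-receiver covariance
recip   = np.corrcoef(Y[off], Y.T[off])[0, 1]                # reciprocity
print(round(row_het, 2), round(col_het, 2), round(sr_cov, 2), round(recip, 2))
```

Each summary comes out strongly positive, so a model for relational data needs terms for all four patterns.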

Let's explore some data

# library(devtools) ; devtools::install_github('s7minhas/amen')
library(amen) # Load additive and multiplicative effects pkg
data(IR90s) # Load trade data
# (Y extracted beforehand from IR90s$dyadvars: logged trade flows)
Y[1:5,1:5] # Data organized in an adjacency matrix
##           ARG        AUL       BEL        BNG        BRA
## ARG        NA 0.05826891 0.2468601 0.03922071 1.76473080
## AUL 0.0861777         NA 0.3784364 0.10436002 0.21511138
## BEL 0.2700271 0.35065687        NA 0.01980263 0.39877612
## BNG 0.0000000 0.01980263 0.1222176         NA 0.01980263
## BRA 1.6937791 0.23901690 0.6205765 0.03922071         NA

Parsing the hairball - Nodal Effects

Parsing the hairball - Covariance

Anything else going on … reciprocity?

Is the USA just lucky?

# Reciprocity
cor(c(Y), c(t(Y)), use='complete')
## [1] 0.9392867
# Reciprocity beyond nodal variation?
senMean = apply(Y, 1, mean, na.rm=TRUE) # row (sender) means
recMean = apply(Y, 2, mean, na.rm=TRUE) # column (receiver) means
globMean = mean(Y, na.rm=TRUE)
resid = Y - outer(senMean, recMean, '+') + globMean # two-way residuals
cor(c(resid), c(t(resid)), use='complete')
## [1] 0.8591242
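For readers following along outside R, here is the same two-step check as reusable functions in Python/numpy, applied to simulated data rather than the IR90s trade flows (the function names are mine): raw reciprocity is inflated by correlated sender/receiver activity, and some reciprocity survives after the additive structure is removed.

```python
import numpy as np

def reciprocity(Y):
    off = ~np.eye(Y.shape[0], dtype=bool)     # drop the diagonal
    return np.corrcoef(Y[off], Y.T[off])[0, 1]

def residual_reciprocity(Y):
    # two-way residuals: subtract row and column means, add back the grand mean
    resid = Y - Y.mean(axis=1, keepdims=True) - Y.mean(axis=0, keepdims=True) + Y.mean()
    return reciprocity(resid)

rng = np.random.default_rng(2)
n = 150
a = rng.normal(0, 1, n)
b = 0.9 * a + 0.44 * rng.normal(0, 1, n)      # big senders are also big receivers
Z = rng.normal(size=(n, n))
eps = 0.8 * (Z + Z.T) / np.sqrt(2) + 0.6 * rng.normal(size=(n, n))
Y = a[:, None] + b[None, :] + eps
raw, net = reciprocity(Y), residual_reciprocity(Y)
print(round(raw, 2), round(net, 2))
```

As with the trade data, the residual reciprocity is smaller than the raw correlation but remains well above zero.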

Social Relations Model

  • The patterns mentioned above can be represented by the following model (Warner et al. 1979; Li & Loken 2002):

\[ \begin{aligned} y_{ij} &= \color{red}{\mu} + \color{red}{e_{ij}} \\ e_{ij} &= a_{i} + b_{j} + \epsilon_{ij} \\ \{ (a_{1}, b_{1}), \ldots, (a_{n}, b_{n}) \} &\sim N(0,\Sigma_{ab}) \\ \{ (\epsilon_{ij}, \epsilon_{ji}) : \; i \neq j\} &\sim N(0,\Sigma_{\epsilon}), \text{ where } \\ \Sigma_{ab} = \begin{pmatrix} \sigma_{a}^{2} & \sigma_{ab} \\ \sigma_{ab} & \sigma_{b}^2 \end{pmatrix} \;\;\;\;\; &\Sigma_{\epsilon} = \sigma_{\epsilon}^{2} \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix} \end{aligned} \]

  • \(\mu\) baseline measure of network activity (for the purpose of regression we turn this into \(\beta^{T}X\))
  • \(e_{ij}\) residual variation that we will use the SRM to decompose

Social Relations Model: Nodal Effects

\[ \begin{aligned} y_{ij} &= \mu + e_{ij} \\ e_{ij} &= \color{red}{a_{i} + b_{j}} + \epsilon_{ij} \\ \color{red}{\{ (a_{1}, b_{1}), \ldots, (a_{n}, b_{n}) \}} &\sim N(0,\Sigma_{ab}) \\ \{ (\epsilon_{ij}, \epsilon_{ji}) : \; i \neq j\} &\sim N(0,\Sigma_{\epsilon}), \text{ where } \\ \Sigma_{ab} = \begin{pmatrix} \sigma_{a}^{2} & \sigma_{ab} \\ \sigma_{ab} & \sigma_{b}^2 \end{pmatrix} \;\;\;\;\; &\Sigma_{\epsilon} = \sigma_{\epsilon}^{2} \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix} \end{aligned} \]

  • row/sender effect (\(a_{i}\)) & column/receiver effect (\(b_{j}\))
  • Modeled jointly to account for correlation in how active an actor is in sending and receiving ties

Social Relations Model: Nodal Variance

\[ \begin{aligned} y_{ij} &= \mu + e_{ij} \\ e_{ij} &= a_{i} + b_{j} + \epsilon_{ij} \\ \{ (a_{1}, b_{1}), \ldots, (a_{n}, b_{n}) \} &\sim N(0,\color{red}{\Sigma_{ab}}) \\ \{ (\epsilon_{ij}, \epsilon_{ji}) : \; i \neq j\} &\sim N(0,\Sigma_{\epsilon}), \text{ where } \\ \color{red}{\Sigma_{ab}} = \begin{pmatrix} \sigma_{a}^{2} & \sigma_{ab} \\ \sigma_{ab} & \sigma_{b}^2 \end{pmatrix} \;\;\;\;\; &\Sigma_{\epsilon} = \sigma_{\epsilon}^{2} \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix} \end{aligned} \]

  • \(\sigma_{a}^{2}\) and \(\sigma_{b}^{2}\) capture heterogeneity in the row and column means
  • \(\sigma_{ab}\) describes the linear relationship between these two effects (i.e., whether actors who send [receive] a lot of ties also receive [send] a lot of ties)

Social Relations Model: Dyadic Variance

\[ \begin{aligned} y_{ij} &= \mu + e_{ij} \\ e_{ij} &= a_{i} + b_{j} + \color{red}{\epsilon_{ij}} \\ \{ (a_{1}, b_{1}), \ldots, (a_{n}, b_{n}) \} &\sim N(0,\Sigma_{ab}) \\ \color{red}{\{ (\epsilon_{ij}, \epsilon_{ji}) : \; i \neq j\}} &\sim N(0,\color{red}{\Sigma_{\epsilon}}), \text{ where } \\ \Sigma_{ab} = \begin{pmatrix} \sigma_{a}^{2} & \sigma_{ab} \\ \sigma_{ab} & \sigma_{b}^2 \end{pmatrix} \;\;\;\;\; & \color{red}{\Sigma_{\epsilon}} = \sigma_{\epsilon}^{2} \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix} \end{aligned} \]

  • \(\epsilon_{ij}\) captures the within-dyad effect
  • Second-order dependencies are described by \(\sigma_{\epsilon}^{2}\)
  • Within-dyad correlation, aka reciprocity, is represented by \(\rho\)
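A quick way to internalize the two covariance blocks is to draw from them directly and check the implied moments. A numpy sketch with invented parameter values:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 2000
S_ab = np.array([[1.0, 0.5],
                 [0.5, 2.0]])           # sigma_a^2, sigma_ab, sigma_b^2
rho, s2e = 0.6, 1.5
S_eps = s2e * np.array([[1, rho],
                        [rho, 1]])      # within-dyad covariance

ab = rng.multivariate_normal([0, 0], S_ab, size=n)   # (a_i, b_i) pairs
ee = rng.multivariate_normal([0, 0], S_eps, size=n)  # (eps_ij, eps_ji) pairs

print(np.round(np.cov(ab.T), 1))   # should be near S_ab
print(np.round(np.cov(ee.T), 1))   # should be near S_eps
```

The empirical covariances recover \(\Sigma_{ab}\) and \(\Sigma_{\epsilon}\), making concrete what each variance parameter governs.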

What can we do with this?

  • Let's model trade using the SRRM (social relations regression model) framework

\[ \begin{aligned} y_{i,j} = \beta_d^T \textbf{x}_{d,i,j} + \beta_r^T \textbf{x}_{r,i} +\beta_c^T \textbf{x}_{c,j} + a_i + b_j + \epsilon_{i,j} \end{aligned} \]

Variables we might want to include:

  • Log(Pop.) of \(i\) and \(j\)
  • Log(GDP) of \(i\) and \(j\)
  • Polity of \(i\) and \(j\)
  • Number of conflicts from \(i\) to \(j\)
  • Log(Distance) between \(i\) and \(j\)
  • Log Number of common IGOs between \(i\) and \(j\)

Probit Regression Framework

(Hoff 2005; Westveld & Hoff 2010; Hoff et al. 2013; Fosdick & Hoff 2015; Minhas et al. 2016)

Threshold model: linking latent \(Z\) to \(Y\)

  • \(y_{ij} = 1(z_{ij}>0)\)
  • \(z_{ij} = \beta^{T} x_{ij} + e_{ij}\)
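The threshold link is worth seeing numerically: cutting a latent Gaussian at zero reproduces probit tie probabilities. A small numpy check (the covariate value and coefficient are invented):

```python
import math
import numpy as np

rng = np.random.default_rng(4)
m = 200_000
beta, x = 1.0, 0.8                                   # invented coefficient and covariate
z = beta * x + rng.normal(0, 1, m)                   # latent scale z_ij
y = (z > 0).astype(int)                              # observed tie: y_ij = 1(z_ij > 0)
phi = 0.5 * (1 + math.erf(beta * x / math.sqrt(2)))  # Phi(beta * x)
print(round(y.mean(), 3), round(phi, 3))
```

The empirical tie rate matches \(\Phi(\beta^{T}x)\), which is why binary AME reduces to probit regression once the error structure is specified.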

Social relations model: inducing network covariance

  • \(e_{ij} = a_{i} + b_{j} + \epsilon_{ij}\)
  • \(\{(a_{1},b_{1}),\ldots,(a_{n},b_{n})\} \sim N(0, \Sigma_{ab})\)
  • \(\{(\epsilon_{ij},\epsilon_{ji}) : i \neq j \} \sim N(0, \Sigma_{\epsilon})\)

Estimation:

  • MCMC algorithm in which we iteratively sample from the full conditional distributions of each parameter of interest (i.e., Gibbs sampling)

Running the model in R

MCMC routine:

Arguments:

  • Y an n x n square relational matrix
  • Xdyad an n x n x pd array of dyadic covariates
  • Xrow an n x pr array of sender covariates
  • Xcol an n x pc array of receiver covariates
  • rvar TRUE/FALSE: fit sender random effects
  • cvar TRUE/FALSE: fit receiver random effects
  • dcor TRUE/FALSE: fit dyadic correlation
  • model one of "nrm", "bin", "ord", "cbin", "frn", "rrl"
  • intercept TRUE/FALSE: fit with an intercept?
  • symmetric TRUE/FALSE: is the network undirected (symmetric)?
  • nscan number of iterations of the Markov chain
  • burn burn in for the chain
  • odens output density
  • R dimension of multiplicative effects

Inputting nodal covariates

Nodal covariates should be structured as:

  • an \(n \times p\) matrix of covariates, where \(n\) corresponds to number of actors and \(p\) covariates
  • In the directed case, sender and receiver covariates need to be supplied separately via Xrow and Xcol
Xn[1:10,]
##          pop      gdp polity
## ARG 3.548755 5.864710   7.18
## AUL 2.895912 6.011414  10.00
## BEL 2.314514 5.370685  10.00
## BNG 4.789989 5.177956   5.00
## BRA 5.070915 6.963597   8.00
## CAN 3.377588 6.531009  10.00
## CHN 7.091101 8.114522  -7.00
## COL 3.652734 5.324862   7.82
## EGY 4.063542 5.371521  -3.55
## FRN 4.082272 7.101956   9.00

Inputting dyadic covariates

Dyadic covariates should be structured as:

  • an \(n \times n \times p\) array of covariates, where \(p\) now corresponds to the number of dyadic covariates
Xd[1:3,1:3,]
conflicts
##     ARG AUL BEL
## ARG  NA   0   0
## AUL   0  NA   0
## BEL   0   0  NA
distance
##       ARG   AUL   BEL
## ARG    NA 11.72 11.31
## AUL 11.72    NA 16.71
## BEL 11.31 16.71    NA
shared_igos
##      ARG  AUL  BEL
## ARG   NA 3.83 3.92
## AUL 3.83   NA 4.02
## BEL 3.92 4.02   NA

Running SRM model with covariates

fitSRM = ame(Y=Y,
             Xdyad=Xd, # incorp dyadic covariates
             Xrow=Xn, # incorp sender covariates
             Xcol=Xn, # incorp receiver covariates
             symmetric=FALSE, # tell AME trade is directed
             intercept=TRUE, # add an intercept             
             model='nrm', # model type
             rvar=TRUE, # sender random effects (a)
             cvar=TRUE, # receiver random effects (b)
             dcor=TRUE, # dyadic correlation
             R=0, # we'll get to this later
             nscan=10000, burn=5000, odens=25,
             plot=FALSE, print=FALSE, gof=TRUE
             )

Objects returned in fitSRM

names(fitSRM)
##  [1] "BETA" "VC"   "APM"  "BPM"  "U"    "V"    "UVPM" "EZ"   "YPM"  "GOF"

\(\beta\) trace plot & distribution

paramPlot(fitSRM$BETA[,1:5])

\(\beta\) trace plot & distribution

paramPlot(fitSRM$BETA[,6:ncol(fitSRM$BETA)])

SRM variance parameters

grid.arrange( paramPlot(fitSRM$VC),
  arrangeGrob( abPlot(fitSRM$APM, 'Sender Effects'),
               abPlot(fitSRM$BPM, 'Receiver Effects') ), ncol=2 )

Capturing network features?

gofPlot(fitSRM$GOF, symmetric=FALSE)

What are we missing?

  • Homophily: "birds of a feather flock together"
  • Stochastic equivalence: nothing as pithy to say here, but this model focuses on community detection

Let's build on what we have so far and find an expression for \(\gamma\):

\[ y_{ij} \approx \beta^{T} X_{ij} + a_{i} + b_{j} + \gamma(u_{i},v_{j}) \]

Latent class model/blockmodels

(Holland et al. 1983; Nowicki & Snijders 2001; Rohe et al. 2011; Airoldi et al. 2013)

Each node \(i\) is a member of an (unknown) latent class:

\[ \textbf{u}_{i} \in \{1, \ldots, K \}, \; i \in \{1,\ldots, n\} \]

The probability of a tie between \(i\) and \(j\) is:

\[ Pr(Y_{ij}=1 | \textbf{u}_{i}, \textbf{u}_{j}) = \theta_{\textbf{u}_{i} \textbf{u}_{j}} \]

  • Nodes in the network may have a low or high probability of ties: \(\theta_{kk}\) may be small or large
  • Nodes in the same class are stochastically equivalent
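A minimal blockmodel draw makes stochastic equivalence concrete: tie probabilities depend only on the pair's latent classes. Illustrative numpy sketch (the class count and \(\theta\) values are invented):

```python
import numpy as np

rng = np.random.default_rng(5)
n, K = 300, 2
theta = np.array([[0.30, 0.02],
                  [0.02, 0.20]])        # theta[k, l] = Pr(tie | classes k, l)
u = rng.integers(0, K, n)               # latent class of each node
P = theta[u[:, None], u[None, :]]       # pairwise tie probabilities
Y = (rng.random((n, n)) < P).astype(int)
np.fill_diagonal(Y, 0)

within = Y[np.ix_(u == 0, u == 0)].mean()    # dense within class 0
between = Y[np.ix_(u == 0, u == 1)].mean()   # sparse across classes
print(round(within, 2), round(between, 2))
```

Same-class nodes have identical tie-probability profiles, so any two of them are exchangeable — the definition of stochastic equivalence.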

Software packages:

  • CRAN: statnet (Handcock et al. 2016)
  • CRAN: blockmodels (Leger 2015)

LCM for community detection

Newman (2006): Adjectives and Nouns

White & Murphy (2016): Mixed membership stochastic block model

Latent distance model

(Hoff et al. 2002; Krivitsky et al. 2009; Sewell & Chen 2015)

Each node \(i\) has an unknown latent position

\[ \textbf{u}_{i} \in \mathbb{R}^{k} \]

The probability of a tie from \(i\) to \(j\) depends on the distance between them

\[ \text{log odds}(Y_{ij}=1 | \textbf{u}_{i}, \textbf{u}_{j}) = \theta - |\textbf{u}_{i} - \textbf{u}_{j}| \]

  • Nodes nearby one another are more likely to have a tie, and will likely have similar ties to others
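Reading \(\theta - |\textbf{u}_{i} - \textbf{u}_{j}|\) as a log-odds, a short numpy sketch shows the homophily this induces — pairs with nearby latent positions get much higher tie probabilities than distant pairs (all parameter values invented):

```python
import numpy as np

rng = np.random.default_rng(6)
n, k, theta = 200, 2, 1.0
U = rng.normal(0, 1, (n, k))                         # latent positions
D = np.linalg.norm(U[:, None, :] - U[None, :, :], axis=2)
P = 1 / (1 + np.exp(-(theta - D)))                   # inverse-logit of theta - distance
Y = (rng.random((n, n)) < P).astype(int)             # tie draws
np.fill_diagonal(Y, 0)

near = P[(D > 0) & (D < 1)].mean()                   # close pairs
far = P[D > 3].mean()                                # distant pairs
print(round(near, 2), round(far, 2))
```

Because distances obey the triangle inequality, two nodes that share a common neighbor tend to be close to each other as well, which is why this model generates transitive, homophilous ties.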

Software packages:

  • CRAN: latentnet (Krivitsky et al. 2015)
  • CRAN: VBLPCM (Salter-Townshend 2015)

LDM for low dim representations of homophily

Kirkland (2012): North Carolina Legislators

Kuh et al. (2015): Discerning prey and predators from a food web

Latent factor model

(Hoff 2003; Hoff 2007)

Each node \(i\) has an unknown latent factor

\[ \textbf{u}_{i} \in \mathbb{R}^{k} \]

The probability of a tie from \(i\) to \(j\) depends on their latent factors

\[ \begin{aligned} \text{log odds}(Y_{ij}=1 | \textbf{u}_{i}, \textbf{u}_{j}) =& \theta + \textbf{u}_{i}^{T} \Lambda \textbf{u}_{j} \, \text{, where} \\ &\Lambda \text{ is a } K \times K \text{ diagonal matrix} \end{aligned} \]

  • Can account for both stochastic equivalence and homophily
  • Comes at the cost of harder-to-interpret multiplicative factors … let's see what I mean
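What the multiplicative factors buy us can be seen in a four-node toy example: with positive \(\Lambda\) entries the scores \(\textbf{u}_{i}^{T}\Lambda\textbf{u}_{j}\) reward within-group ties (homophily), while negative entries reward between-group ties — a stochastically equivalent, disassortative pattern that a distance model cannot produce. Numpy sketch (all values invented):

```python
import numpy as np

u = np.array([1.0, 1.0, -1.0, -1.0])      # one latent dimension, two groups
Lam_pos, Lam_neg = 1.0, -1.0              # diagonal Lambda, K = 1
homophilous = Lam_pos * np.outer(u, u)    # within-group scores are high
disassortative = Lam_neg * np.outer(u, u) # between-group scores are high
print(homophilous)
print(disassortative)
```

Flipping the sign of \(\Lambda\) flips which pairs score highly, which is the flexibility — and the interpretive burden — of the latent factor approach.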

Software packages:

  • CRAN: amen (Hoff et al. 2015)

Putting it all together: AME

\[ \begin{aligned} y_{ij} &= g(\theta_{ij}) \\ &\theta_{ij} = \beta^{T} \mathbf{X}_{ij} + e_{ij} \\ &e_{ij} = a_{i} + b_{j} + \epsilon_{ij} + \textbf{u}_{i}^{T} \textbf{D} \textbf{v}_{j} \\ \end{aligned} \]

  • \(a_{i} + b_{j} + \epsilon_{ij}\) are additive random effects that account for sender, receiver, and within-dyad dependence
  • multiplicative effects, \(\textbf{u}_{i}^{T} \textbf{D} \textbf{v}_{j}\), capture higher-order dependence patterns that are left over in \(\theta\) after accounting for any known covariate information
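The full linear predictor is easy to assemble once each piece is in hand. A numpy sketch of one draw (dimensions and parameter values invented); note the multiplicative term \(\textbf{U}\textbf{D}\textbf{V}^{T}\) is exactly a rank-\(R\) matrix:

```python
import numpy as np

rng = np.random.default_rng(8)
n, R = 50, 2
beta = 0.5
X = rng.normal(size=(n, n))                    # one dyadic covariate
a, b = rng.normal(0, 1, (2, n))                # additive sender/receiver effects
U = rng.normal(0, 1, (n, R))                   # multiplicative sender factors
V = rng.normal(0, 1, (n, R))                   # multiplicative receiver factors
D = np.diag([1.0, 0.5])
eps = rng.normal(0, 1, (n, n))
theta = beta * X + a[:, None] + b[None, :] + U @ D @ V.T + eps
print(theta.shape, np.linalg.matrix_rank(U @ D @ V.T))
```

This is why the `R` argument in amen controls model complexity: it caps the rank of the leftover dependence the multiplicative term can absorb.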

Estimating with multiplicative effects

Multiplicative effects can be added by setting the R input parameter

fitAME = ame(Y=Y,
             Xdyad=Xd, # incorp dyadic covariates
             Xrow=Xn, # incorp sender covariates
             Xcol=Xn, # incorp receiver covariates
             symmetric=FALSE, # tell AME trade is directed
             intercept=TRUE, # add an intercept             
             model='nrm', # model type
             rvar=TRUE, # sender random effects (a)
             cvar=TRUE, # receiver random effects (b)
             dcor=TRUE, # dyadic correlation
             R=2, # 2 dimensional multiplicative effects
             nscan=10000, burn=5000, odens=25,
             plot=FALSE, print=FALSE, gof=TRUE
             )

Capturing network features part 2

gofPlot(fitAME$GOF, symmetric=FALSE)

Visualizing the multiplicative effects

ggCirc(Y=Y, U=fitAME$U, V=fitAME$V)

Benefits of this approach

  • At its core, AME is just a GLM with random effects used to ensure that we can treat dyadic observations as conditionally independent
  • AME can be used:
    • on both undirected and directed data,
    • on longitudinal and static networks,
    • and on a variety of distribution types we commonly encounter in political science (binomial, Gaussian, and ordinal).

Real world comparison

Cranmer et al. (2017)

  • Great paper comparing a few inferential network approaches
  • Utilized Swiss climate change policy collaboration network as application (Ingold, 2008)

\(\beta\) Estimates

Which approach fits \(Y\) best?

Out-of-sample Network Cross-Validation

Which approach fits network dependencies best?

What else can we do with AME?

Null model: model with no covariates

  • Weschle (2017)
  • Gallop and Minhas (WP)
  • Greenhill (2016)

Covariate estimation in longitudinal networks

  • Ward et al. (2013)
  • Metternich et al. (2015)
  • Dorff et al. (WP)

To end

  • LFM is a powerful framework that has proven useful

  • A lot of other things going on:
    • Community structure in longitudinal, multidimensional arrays (Mucha et al. 2010)
    • Multilinear tensor regression (Hoff 2015, Schein et al. 2015, Minhas et al. 2016)
    • Intersection of network based methods to text analysis (Henry et al. 2016, Huang et al. 2015)
  • Takeaway here is that these methods are useful when we study systems in which interactions are interdependent
  • These interdependent relations may at times be of interest themselves or in other cases may just help us to better predict

Simulation analysis

Does AME actually reduce bias?

  • Cranmer & Desmarais (2017)

Hoff provides an argument that rests on exchangeability (Aldous, 1985)

Playing on ERGM's turf

Basis of simulation analysis

# Network simulation
simY = simulate.formula(network.initialize(n) ~ edges + edgecov(edgeVar) + networkTerm, 
                 coef=c(
                   interceptValue,
                   dyadParamValue,
                   netParamValue
                   ) )

# Run ergm
ergm(simY ~ edges + edgecov(edgeVar) + networkTerm)

# Run ame with and without multiplicative effects
ame(simY, Xdyad=edgeVar, R=0)
ame(simY, Xdyad=edgeVar, R=2)

Preliminary results