Monday, August 12, 2013

Using abn: additive Bayesian networks

Since finishing all the imputation stuff I've mostly been working on applying additive Bayesian networks to my plant invasion dataset. I did some basic research and decided to try the abn package, for no particular reason other than that it seemed to do what I wanted best and looked to have good documentation.

Based on my not-at-all-exhaustive research, abn has two advantages over other Bayesian network packages: (1) I think it is the newest, and I like that; (2) it can model mixed data, in that it accepts variables with a Gaussian, Poisson or binary distribution. In practice this means that categorical factors have to be split into multiple yes (1) / no (0) binary factors.
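
A minimal sketch of that recoding step in base R, assuming a made-up data frame `toy` with a single categorical column `life.form` (neither is from the real dataset). Each level becomes its own 0/1 column, stored as a two-level factor in the same way the binary variables are forced to factors in the data preparation further down.

```r
# Toy data: one categorical trait with three levels (hypothetical values)
toy <- data.frame(life.form = factor(c("annual", "shrub", "tree", "annual")))

# One indicator column per level; "- 1" drops the intercept so no level is omitted
ind <- model.matrix(~ life.form - 1, data = toy)
colnames(ind) <- levels(toy$life.form)

# Store each indicator as a two-level factor (0 = no, 1 = yes)
binary <- as.data.frame(lapply(as.data.frame(ind),
                               function(x) factor(x, levels = c(0, 1))))

str(binary)
```

One three-level factor turns into three binary columns here, which is why the variable count grows so quickly.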

The abn package seems to have come from epidemiological research, which may be why I couldn't find any papers where it is used in ecology. I also liked the idea of trying something ecologists haven't used yet.

A few caveats before starting. First, this package needs complete data, so (multiple) imputation or deletion of missing data (rarely ideal) is essential before you even start. I get the feeling that with imputation you are layering models onto models, and none of this is great, but you have to work things out as you can, I suppose. Also, because a network takes so long to run, re-running the abn analysis for each of the m imputed data sets becomes unappealing, which means you're unlikely to go the extra mile to get the benefit of multiple imputation.

Second, splitting your categorical variables into several binary factors inflates the number of variables plugged into the model, possibly by a lot. I had 34 variables after splitting and a dataset of 466 species. This can greatly increase the computation time for the networks. But there are ways around it (to do with the ban and retain matrices, as I will show).

Third, you will need to install Graphviz to visualise the networks.

So, let's go. First, manipulate the data and set up the distribution list.

# load abn library
  library(abn)

# set working directory
  setwd("your directory")

# load an imputed data set
  imp <- read.csv("imputed_data_1.csv")

# subset and arrange the data frame to keep the variables you want
abn.nox <- imp[,c(4, 5:11, 13:27, 29, 32:41)]
# NOTE: `dat` is not created in this post; it must be a data frame, loaded beforehand, with the same rows as `imp`
abn.nox$dispersal.vector.no <- dat$dispersal.vector.no
# force factors to be factors if they aren't already
abn.nox$str.comepetitive <- as.factor(abn.nox$str.comepetitive)
abn.nox$str.stress <- as.factor(abn.nox$str.stress)
abn.nox$str.ruderal <- as.factor(abn.nox$str.ruderal)
abn.nox <- na.omit(abn.nox) # drop any rows that still contain missing values
# final 34 columns, in the same order as the distribution list below
abn.nox.sub <- abn.nox[,c(1:23, 28, 35, 25:27, 29:34)]

# setup distribution list for each node
  subdists <- list(states.nox="poisson",
mrt="gaussian",
cz.cult="poisson",
us.cult="poisson",
habitats="gaussian",
flor.zone="gaussian",
alt.range.cz="gaussian",
squares.cz="gaussian",
clonal.index="poisson",
ldmc="gaussian",
ssb.range="gaussian",
sla="gaussian",
monocarp="binomial",
polycarp="binomial",
annual="binomial",
shrub="binomial",
tree="binomial",
str.comepetitive="binomial",
str.stress="binomial",
str.ruderal="binomial",
height="poisson",
flower.period="poisson",
popagule.length="poisson",
pol.vector.no="gaussian",
dispersal.vector.no="gaussian",
self.pol="binomial",
insect.pol="binomial",
wind.popl="binomial",
ant.dispersed="binomial",
exoz.dispersed="binomial",
endoz.dispersed="binomial",
wind.dispersed="binomial",
water.dispersed="binomial",
self.dispersed="binomial"
);
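
Since the list and the data frame have to agree, a quick check before building the cache can save a failed run. The sketch below uses a tiny made-up data frame and list (`toy`, `toy.dists`), not the real data, and the three checks reflect my assumptions about what abn expects (one named entry per column, two-level factors for binomial nodes, no missing values).

```r
# Hypothetical three-column data frame and matching distribution list
toy <- data.frame(
  count   = c(0L, 3L, 1L, 2L),
  size    = c(1.2, 0.4, 2.2, 1.9),
  present = factor(c(0, 1, 1, 0))
)
toy.dists <- list(count = "poisson", size = "gaussian", present = "binomial")

# Every column has one entry in the list, in the same order
names.match <- identical(names(toy.dists), names(toy))

# Nodes declared binomial are two-level factors
binom <- names(toy.dists)[unlist(toy.dists) == "binomial"]
binom.ok <- all(sapply(toy[binom], function(x) is.factor(x) && nlevels(x) == 2))

# No missing values anywhere
complete <- !any(is.na(toy))

c(names.match = names.match, binom.ok = binom.ok, complete = complete)
```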

abn uses 'ban' and 'retain' matrices to ban or retain parent arcs (this is all really well explained on the abn website). These matrices can reflect prior beliefs about how some variables could interact with others, and they let you ban silly relationships you know aren't possible.

The idea is to start with one 'parent' per node and then alter the ban and retain matrices based on the previous model to narrow down the search space, enabling more parents to be added.

TIP: make each matrix in Excel first so that you can have your variables labelled as column and row headings, then save the matrix as a .txt file. Copy and paste the text from the .txt file into R and add the needed commas to the ends of the rows. The matrix is read as follows: the variable in the column is allowed (0) or banned (1) from being a parent of the variable in the row (i.e. there is directionality coded in these matrices).
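
If typing out rows of zeros and ones feels error-prone, the same kind of matrix can be filled in by name instead. This is only a sketch with three hypothetical nodes (`a`, `b`, `c`); rows are the children and columns are the candidate parents, following the interpretation above.

```r
nodes <- c("a", "b", "c")  # hypothetical node names

# Start with everything allowed (0)
ban <- matrix(0, nrow = length(nodes), ncol = length(nodes),
              dimnames = list(nodes, nodes))

# Ban b and c from being parents of a
ban["a", c("b", "c")] <- 1

# Ban the arc c -> b
ban["b", "c"] <- 1

ban
```

Indexing by name keeps the direction of each ban explicit, which is easy to lose track of in a 34-by-34 block of digits.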

##### 1 parent
  subban <- matrix(c(
0,0,1,0,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
1,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
1,1,1,1,1,1,1,1,1,1,1,1,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
1,1,1,1,1,1,1,1,1,1,1,1,1,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,
1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,1,1,0,0,0,0,0,0,
1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,1,0,1,0,0,0,0,0,0,
1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,1,1,0,0,0,0,0,0,0,
1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,1,1,1,1,1,
1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,1,0,1,1,1,1,
1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,1,1,0,1,1,1,
1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,1,1,1,0,1,1,
1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,1,1,1,1,0,1,
1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,1,1,1,1,1,0
),byrow=TRUE,ncol=34);
colnames(subban)<-rownames(subban)<-names(abn.nox.sub); #names must be set (abn.nox.sub is the 34-column data frame)

  subretain <- matrix(c(
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
),byrow=TRUE,ncol=34);
colnames(subretain)<-rownames(subretain)<-names(abn.nox.sub); #names must be set (abn.nox.sub is the 34-column data frame)

# parent limits set to one initially
maxparen <- 1

# Make cache
subcache1 <- buildscorecache(data.df=abn.nox.sub, data.dists=subdists, dag.banned=subban, dag.retained=subretain, max.parents=maxparen);

# perform heuristic search. Other option is an exhaustive search followed by (computationally intensive) bootstrapping to adjust for overfitting. The heuristic search can be used to produce a 50% consensus network that gets around overfitting and is less computationally intensive.
subheur1 <-search.hillclimber(score.cache=subcache1, num.searches=1000, seed=0, verbose=FALSE, trace=FALSE, timing.on=TRUE);

# dot output for the consensus network in Graphviz
tographviz(dag.m=subheur1$consensus, data.df=abn.nox.sub, data.dists=subdists, outfile="abn1.dot");

# fit model to 50% consensus directed acyclic graph (DAG)
cons.mat1 <- subheur1$consensus
cons.mod1 <- fitabn(dag.m = cons.mat1, data.df=abn.nox.sub, data.dists=subdists,compute.fixed=TRUE);

# get the goodness of fit for the model (log marginal likelihood)
cons.mod1$mlik

Next, increase the number of parents allowed, adding further arcs to the 'ban' matrix (based on the previous model) so that the search space stays manageable.

##### 2 parents
subban <- matrix(c(
0,0,1,0,1,1,1,1,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,
1,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,
1,1,0,1,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,
1,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,
1,1,1,1,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,
1,1,1,1,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,
1,1,1,1,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,
1,1,1,1,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,
1,1,1,1,1,1,1,1,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,
1,1,1,1,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,
1,1,1,1,1,1,1,1,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,
1,1,1,1,1,1,1,1,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,
1,1,1,1,1,1,1,1,1,1,1,1,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
1,1,1,1,1,1,1,1,1,1,1,1,1,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,
1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,
1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,
1,1,1,1,1,1,1,1,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,
1,1,1,1,1,1,1,1,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,
1,1,1,1,1,1,1,1,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,
1,1,1,1,1,1,1,1,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,
1,1,1,1,1,1,1,1,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,
1,1,1,1,1,1,1,1,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,1,0,0,1,1,0,0,0,0,0,1,
1,1,1,1,1,1,1,1,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,1,0,1,0,1,0,0,0,0,0,1,
1,1,1,1,1,1,1,1,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,1,0,1,1,0,0,0,0,0,0,1,
1,1,1,1,1,1,1,1,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,1,1,1,1,1,
1,1,1,1,1,1,1,1,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,1,0,1,1,1,1,
1,1,1,1,1,1,1,1,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,1,1,0,1,1,1,
1,1,1,1,1,1,1,1,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,1,1,1,0,1,1,
1,1,1,1,1,1,1,1,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,1,1,1,1,0,1,
1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0
),byrow=TRUE,ncol=34);
colnames(subban)<-rownames(subban)<-names(abn.nox.sub); #names must be set (abn.nox.sub is the 34-column data frame)

This time the retain matrix is simply the consensus network from the previous search:
subretain <- cons.mat1

# parent limits
  maxparen <- 2

# Make cache
subcache2 <- buildscorecache(data.df=abn.nox.sub, data.dists=subdists, dag.banned=subban, dag.retained=subretain, max.parents=maxparen);

# perform heuristic search
subheur2 <-search.hillclimber(score.cache=subcache2, num.searches=1000, seed=0, verbose=FALSE, trace=FALSE, timing.on=TRUE, dag.retained=subretain);

# dot output for the consensus network
tographviz(dag.m=subheur2$consensus, data.df=abn.nox.sub, data.dists=subdists, outfile="abn2.dot");

# fit model to consensus DAG and get the goodness of fit
cons.mat2 <- subheur2$consensus
cons.mod2 <- fitabn(dag.m = cons.mat2, data.df=abn.nox.sub, data.dists=subdists,compute.fixed=TRUE);
cons.mod2$mlik

Then you just need to keep raising the number of parents and adding bans to the ban matrix. For example, if you have set the limit to two parents but some variables only have one parent, all other parents of those variables can be banned, because we know that fewer than two parents are supported. In my case, the optimal number was six parents.
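
That banning rule can be applied programmatically rather than by hand. The sketch below assumes a hypothetical three-node consensus matrix `cons` (rows as children, columns as parents) from a search with a limit of two parents. Any node that ended up with fewer than two parents has all of its other candidate parents banned.

```r
nodes   <- c("a", "b", "c")  # hypothetical node names
max.par <- 2                 # parent limit used in the search that produced cons

# Hypothetical consensus DAG: b has parent a; c has parents a and b
cons <- matrix(c(0, 0, 0,
                 1, 0, 0,
                 1, 1, 0),
               byrow = TRUE, ncol = 3, dimnames = list(nodes, nodes))

# Existing ban matrix (nothing banned yet)
ban <- matrix(0, 3, 3, dimnames = list(nodes, nodes))

# Children that did not use up the parent limit
unsat <- rowSums(cons) < max.par

# For those rows, ban every arc that is not already in the consensus DAG
ban[unsat, ] <- pmax(ban[unsat, ], 1 - cons[unsat, ])

# Leave the diagonal at 0, as in the matrices above
diag(ban) <- 0
ban
```

Here `a` (no parents) and `b` (one parent) get their remaining candidate parents banned, while `c`, which reached the limit, stays open for the next round.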

When no more parents can be added OR there is no further improvement in the goodness of fit of the fitted model, you have the optimal number of parents.
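
That stopping rule is easy to check once the scores from each round are collected. In the sketch below the `mlik` values are invented placeholders, one per parent limit, and not results from the real data. A larger (less negative) log marginal likelihood is taken to mean a better fit.

```r
# Placeholder log marginal likelihoods for parent limits 1 to 5 (made-up numbers)
mlik <- c("1" = -5200, "2" = -5050, "3" = -4990, "4" = -4990, "5" = -4990)

# Change in score each time the limit is raised by one
gain <- diff(mlik)

# Smallest limit after which raising it brings no further improvement
best <- as.integer(names(mlik)[which(gain <= 0)[1]])
best
```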

Now to the bit that you can't find on the abn website. The programme has recently been changed a bit but the documentation hasn't been updated. Mostly, it's OK. But when plotting the posterior density estimates, it all goes wrong. One quick, helpful email exchange with Fraser Lewis, the author of the package, and I had the right code (yes, I could have worked it out, but I am not good at writing loops!):

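# cons.mod6nox: the fit for the six-parent consensus DAG (not created in the code above; fitted the same way as cons.mod2)
library(Cairo);
CairoPDF("abnNoxious6ParentsMarginals.pdf");
# loop over nodes, then over each parameter's marginal posterior density for that node
for(i in seq_along(cons.mod6nox$marginals)){
  onenode <- cons.mod6nox$marginals[[i]];
  for(j in seq_along(onenode)){
    plot(onenode[[j]], type="l", main=names(onenode)[j]);
  }
}
dev.off();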

Then, I modified the .dot file so that all the labels were as I liked them, with proper publication-ready names and with the arcs labelled with median effect sizes, and also dashed to indicate negative median values, or solid to indicate positive median values.
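
Editing the .dot file by hand gets tedious for a big network, so the edge lines can also be written from R. The sketch below assumes a hypothetical DAG matrix `dag` (rows as children, columns as parents), a matrix `med` of median effect sizes with the same layout, and a named vector `pretty` of display names. All three are invented for the example, with two example values borrowed from the diagram code further down.

```r
nodes  <- c("mrt", "states.nox", "ldmc")
pretty <- c(mrt        = "Minimum residence time",
            states.nox = "Number of states noxious",
            ldmc       = "Leaf dry matter content")

# Hypothetical DAG: mrt -> states.nox and ldmc -> states.nox
dag <- matrix(0, 3, 3, dimnames = list(nodes, nodes))
dag["states.nox", c("mrt", "ldmc")] <- 1

# Hypothetical median effect sizes for those arcs
med <- dag
med["states.nox", "mrt"]  <- 0.48
med["states.nox", "ldmc"] <- -0.45

# One DOT line per arc: dashed if the median is negative, solid otherwise
arcs   <- which(dag == 1, arr.ind = TRUE)
child  <- rownames(dag)[arcs[, "row"]]
parent <- colnames(dag)[arcs[, "col"]]
value  <- med[arcs]
style  <- ifelse(value < 0, "style=dashed, ", "")
edges  <- sprintf('"%s"->"%s" [%slabel="%.2f"];',
                  pretty[parent], pretty[child], style, value)

writeLines(c("digraph dag {", edges, "}"))
```

The node declarations (shapes and fill colours) would still need to be added separately.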


And this is the Graphviz code to make the diagram (NB the values for the arcs are not actually derived from the data; they're just an example):

digraph dag { 

"Number of states noxious"[shape=diamond, style=filled, fillcolor=grey];
"Minimum residence time"[shape=oval];
"Czech cultivation index"[shape=diamond];
"U.S. cultivation index"[shape=diamond];
"Habitats"[shape=oval];
"Floral zones"[shape=oval];
"Czech altitudinal range"[shape=oval];
"Czech grid squares"[shape=oval];
"Clonality index"[shape=diamond];
"Leaf dry matter content"[shape=oval];
"Seed bank longevity"[shape=oval];
"Specific leaf area"[shape=oval];
"Monocarp"[shape=square];
"Polycarp"[shape=square];
"Annual"[shape=square];
"Shrub"[shape=square];
"Tree"[shape=square];
"Competitive \n Grime strategy"[shape=square];
"Stress tolerant \n Grime strategy"[shape=square];
"Ruderal \n Grime strategy"[shape=square];
"Height"[shape=diamond];
"Flowering period"[shape=diamond];
"Popagule length"[shape=diamond, label="Propagule length"];
"Number of \n pollination vectors"[shape=oval];
"Number of \n dispersal vectors"[shape=oval];
"Self pollinated"[shape=square];
"Insect pollinated"[shape=square];
"Wind pollinated"[shape=square];
"Ant dispersed"[shape=square];
"Exozoochory"[shape=square];
"Endozoochory"[shape=square];
"Wind dispersed"[shape=square];
"Water dispersed"[shape=square];
"Self dispersed"[shape=square];



"Minimum residence time"->"Number of states noxious" [label="0.48"];
"Czech cultivation index"->"U.S. cultivation index" [label="0.12"];
"U.S. cultivation index"->"Number of states noxious" [label="0.89"];
"U.S. cultivation index"->"Minimum residence time" [label="0.26"];
"Habitats"->"Czech altitudinal range" [arrowhead=none, arrowtail=none, label="0.25"];
"Habitats"->"Czech grid squares" [arrowhead=none, arrowtail=none, label="0.60"];
"Czech grid squares"->"Minimum residence time" [label="0.25"];
"Czech grid squares"->"U.S. cultivation index" [label="0.20"];
"Czech grid squares"->"Floral zones" [arrowhead=none, arrowtail=none, label="0.38"];
"Czech grid squares"->"Czech altitudinal range" [arrowhead=none, arrowtail=none, label="0.37"];
"Clonality index"->"Wind pollinated" [arrowhead=none, arrowtail=none, label="0.45"];
"Leaf dry matter content"->"Number of states noxious" [style=dashed, label="-0.45"];
"Leaf dry matter content"->"Specific leaf area" [arrowhead=none, arrowtail=none, style=dashed, label="-0.30"];
"Seed bank longevity"->"Habitats" [label="0.50"];
"Seed bank longevity"->"Czech grid squares" [label="0.27"];
"Seed bank longevity"->"Popagule length" [style=dashed, label="-0.33"];
"Specific leaf area"->"Self pollinated" [arrowhead=none, arrowtail=none, label="0.46"];
"Polycarp"->"Clonality index" [label="0.84"];
"Polycarp"->"Height" [style=dashed, label="-1.05"];
"Annual"->"Czech cultivation index" [style=dashed, label="-1.93"];
"Annual"->"Clonality index" [style=dashed, label="-1.51"];
"Annual"->"Height" [style=dashed, label="-1.64"];
"Annual"->"Self pollinated" [label="1.73"];
"Annual"->"Ant dispersed" [style=dashed, label="-1.12"];
"Annual"->"Endozoochory" [style=dashed, label="-2.31"];
"Shrub"->"Czech cultivation index" [label="0.66"];
"Shrub"->"Leaf dry matter content" [label="0.79"];
"Shrub"->"Height" [label="0.64"];
"Shrub"->"Exozoochory" [style=dashed, label="-3.91"];
"Shrub"->"Endozoochory" [label="3.27"];
"Shrub"->"Wind dispersed" [style=dashed, label="-1.68"];
"Shrub"->"Water dispersed" [style=dashed, label="-22.77"];
"Tree"->"Number of states noxious" [style=dashed, label="-2.62"];
"Tree"->"Czech cultivation index" [label="1.35"];
"Tree"->"Leaf dry matter content" [label="0.76"];
"Tree"->"Height" [label="2.43"];
"Tree"->"Flowering period" [style=dashed, label="-0.64"];
"Tree"->"Self pollinated" [style=dashed, label="-1.94"];
"Tree"->"Exozoochory" [style=dashed, label="-3.00"];
"Tree"->"Water dispersed" [style=dashed, label="-22.55"];
"Competitive \n Grime strategy"->"Habitats" [label="0.71"];
"Competitive \n Grime strategy"->"Floral zones" [style=dashed, label="-0.54"];
"Competitive \n Grime strategy"->"Popagule length" [arrowhead=none, arrowtail=none, label="0.54"];
"Stress tolerant \n Grime strategy"->"Habitats" [style=dashed, label="-0.32"];
"Stress tolerant \n Grime strategy"->"Popagule length" [arrowhead=none, arrowtail=none, style=dashed, label="-0.47"];
"Height"->"Czech grid squares" [label="0.03"];
"Height"->"Competitive \n Grime strategy" [label="5.16"];
"Height"->"Stress tolerant \n Grime strategy" [arrowhead=none, arrowtail=none, style=dashed, label="-0.48"];
"Height"->"Ruderal \n Grime strategy" [arrowhead=none, arrowtail=none, style=dashed, label="-1.45"];
"Height"->"Popagule length" [arrowhead=none, arrowtail=none, label="0.03"];
"Height"->"Wind pollinated" [arrowhead=none, arrowtail=none, label="0.16"];
"Height"->"Ant dispersed" [arrowhead=none, arrowtail=none, style=dashed, label="-2.14"];
"Flowering period"->"Floral zones" [label="0.18"];
"Flowering period"->"Seed bank longevity" [arrowhead=none, arrowtail=none, label="0.23"];
"Flowering period"->"Insect pollinated" [arrowhead=none, arrowtail=none, label="0.41"];
"Number of \n dispersal vectors"->"Seed bank longevity" [arrowhead=none, arrowtail=none, label="0.22"];
"Self pollinated"->"Competitive \n Grime strategy" [arrowhead=none, arrowtail=none, style=dashed, label="-1.37"];
"Self pollinated"->"Ruderal \n Grime strategy" [arrowhead=none, arrowtail=none, label="0.66"];
"Self pollinated"->"Number of \n pollination vectors" [label="1.83"];
"Insect pollinated"->"Clonality index" [arrowhead=none, arrowtail=none, style=dashed, label="-0.32"];
"Insect pollinated"->"Number of \n pollination vectors" [label="1.74"];
"Insect pollinated"->"Exozoochory" [arrowhead=none, arrowtail=none, style=dashed, label="-2.03"];
"Insect pollinated"->"Water dispersed" [arrowhead=none, arrowtail=none, style=dashed, label="-1.25"];
"Wind pollinated"->"Number of states noxious" [label="0.89"];
"Wind pollinated"->"Leaf dry matter content" [arrowhead=none, arrowtail=none, label="0.79"];
"Wind pollinated"->"Number of \n pollination vectors" [label="1.75"];
"Ant dispersed"->"Competitive \n Grime strategy" [arrowhead=none, arrowtail=none, label="1.28"];
"Ant dispersed"->"Stress tolerant \n Grime strategy" [arrowhead=none, arrowtail=none, label="0.89"];
"Ant dispersed"->"Number of \n dispersal vectors" [label="1.17"];
"Ant dispersed"->"Self pollinated" [arrowhead=none, arrowtail=none, label="1.05"];
"Ant dispersed"->"Insect pollinated" [arrowhead=none, arrowtail=none, label="2.11"];
"Ant dispersed"->"Wind pollinated" [arrowhead=none, arrowtail=none, style=dashed, label="-2.70"];
"Exozoochory"->"Number of states noxious" [label="0.74"];
"Exozoochory"->"Popagule length" [arrowhead=none, arrowtail=none, label="0.38"];
"Exozoochory"->"Number of \n dispersal vectors" [label="1.03"];
"Exozoochory"->"Self pollinated" [arrowhead=none, arrowtail=none, style=dashed, label="-0.57"];
"Exozoochory"->"Wind pollinated" [arrowhead=none, arrowtail=none, label="2.30"];
"Endozoochory"->"Height" [arrowhead=none, arrowtail=none, style=dashed, label="-0.91"];
"Endozoochory"->"Popagule length" [arrowhead=none, arrowtail=none, label="0.43"];
"Endozoochory"->"Number of \n dispersal vectors" [label="0.78"];
"Endozoochory"->"Insect pollinated" [arrowhead=none, arrowtail=none, label="3.82"];
"Endozoochory"->"Wind pollinated" [arrowhead=none, arrowtail=none, style=dashed, label="-1.58"];
"Wind dispersed"->"Number of \n dispersal vectors" [label="0.85"];
"Water dispersed"->"Leaf dry matter content" [arrowhead=none, arrowtail=none, style=dashed, label="-0.58"];
"Water dispersed"->"Competitive \n Grime strategy" [arrowhead=none, arrowtail=none, style=dashed, label="-1.80"];
"Water dispersed"->"Number of \n dispersal vectors" [label="1.15"]; 

}