conmat: Programmatic generation of synthetic contact matrices
2 April 2025
But how do contact matrices come about!?
Contact diary studies follow individuals and ask them:
They are undoubtedly the gold standard for data on the number of contacts.
But they are prohibitively expensive, and logistically very challenging.
In 2008, Mossong et al. published POLYMOD, a contact diary study of 7,290 participants across Europe.
It remains the most widely cited contact diary study.
Prem et al. built a model that predicts the number of contacts from the age structure of the population.
This model allows prediction for settings outside the original POLYMOD countries.
Not much information to train the model. We model the number of contacts an individual in age bin i has with individuals in age bin j:
\begin{aligned} c_{ij} = \beta_{0,i} + \beta_1(|i-j|) + &\beta_2(|i-j|^2) + \beta_3(i\times j) + \beta_4(i+j) +\\ &\beta_5\max(i, j) + \beta_6\min(i, j) \end{aligned}
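To make the terms in this equation concrete, here is a minimal base R sketch that builds each covariate for every pair of age bins. The function name `model_terms` and the column names are hypothetical, chosen for illustration only; they are not part of conmat.

```r
# Build the covariates from the model equation for every (i, j) pair.
# `model_terms` and its column names are illustrative, not conmat's API.
model_terms <- function(ages) {
  pairs <- expand.grid(i = ages, j = ages)
  transform(
    pairs,
    abs_diff    = abs(i - j),    # |i - j|
    abs_diff_sq = abs(i - j)^2,  # |i - j|^2
    product     = i * j,         # i x j
    total       = i + j,         # i + j
    larger      = pmax(i, j),    # max(i, j)
    smaller     = pmin(i, j)     # min(i, j)
  )
}

# Lower limits of 5-year age bins from 0 to 75
covariates <- model_terms(seq(0, 75, by = 5))
head(covariates)
```

Each row is one (i, j) pair, so 16 age bins give 256 rows. The terms are not independent: `larger + smaller` equals `total`, and `larger - smaller` equals `abs_diff`.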
🤓 Akshually 🤓
We want to fit this as a generalised additive model, so in practice each of these terms is a spline, which satisfies the smoothness requirements.
This is very similar to the approach of Prem et al., who perform post-hoc smoothing of the parameters after MCMC.
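As a rough illustration of what "splines on these terms" means, the sketch below fits a generalised additive model with mgcv to simulated data. It assumes a Poisson family with a log link and uses only two of the terms; both are simplifications for illustration, not a description of conmat's internals. The data are simulated, not from any survey.

```r
library(mgcv)  # recommended package, ships with standard R installations

set.seed(1)

# Simulated data (hypothetical): one row per (age_from, age_to) pair, with
# a Poisson contact count whose mean decays as the age gap grows.
ages <- 0:39
pairs <- expand.grid(age_from = ages, age_to = ages)
pairs$abs_diff <- abs(pairs$age_from - pairs$age_to)
pairs$age_sum <- pairs$age_from + pairs$age_to
pairs$contacts <- rpois(nrow(pairs), lambda = exp(1 - 0.1 * pairs$abs_diff))

# s() puts a penalised spline on each term instead of a single coefficient.
fit <- gam(
  contacts ~ s(abs_diff) + s(age_sum),
  family = poisson,
  data = pairs
)

# Fitted mean contacts on the response scale, one value per age pair
pairs$fitted <- as.numeric(predict(fit, type = "response"))
```

Because the smoothness penalty is estimated as part of the fit, no separate smoothing step is needed afterwards.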
We’ve used POLYMOD-trained models for over a decade.
There are 44 other surveys available in one collection on Zenodo, and probably more out there in the wild.
Recent work from Harris et al. has shown that differences in survey design can have a significant impact on these synthetic contact matrices.
Tip: conmat allows for training on a survey of choice, and projection onto a target demographic of choice.
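library(dplyr)  # summarise()
library(readr)  # read_csv()

# Download the survey from Zenodo and total the home-setting contacts
china_survey <- socialmixr::get_survey("https://doi.org/10.5281/zenodo.3878754") |>
  summarise(contacts = sum(cnt_home))

# Age distribution of the target population
china_pop <- read_csv("./data/china_pop_age_dist.csv")
china_pop_cm <- conmat::as_conmat_population(
  data = china_pop,
  age = lower.age.limit,
  population = population
)

# Train the contact model on the survey
model <- conmat::fit_single_contact_model(
  contact_data = china_survey,
  population = china_pop_cm
)

# Project contacts onto the target population in 5-year age bands
predicted_contacts <- conmat::predict_contacts(
  model = model,
  population = china_pop_cm,
  age_breaks = c(seq(0, 80, by = 5), Inf)
)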
These matrices are for a Chinese population, using a Chinese survey (left) and POLYMOD (right), for the home setting.
conmat is a new, open-source, programmable system to generate synthetic contact matrices.
Available right now:
or pre-computed matrices on Zenodo: https://zenodo.org/records/12776714
SPECTRUM-SPARK Seed Funding
Code and slides are available at https://github.com/MikeLydeamore/talk-conmat-projections