Dear MIMI collaborators,
This is the third newsletter on the progress of the MIMI study. We are thankful for your support, and especially for the help you provided with the quality control process over the last year, which is presented in more detail below.
Missingness patterns are described in the table below (green = data present, red = data absent).
Based on these patterns and on the feasibility of harmonization, Karin Jensevik has now completed the data harmonization process, which determines which variables to include in the analyses. You can find a very condensed summary of the harmonization process in the table below.
Stefan Gustafsson has assessed pre-analytical issues related to the blood samples. As seen in the figure below (B), there were unfortunately substantial differences in storage times between cohorts, and in a few cohorts also slight differences between cases and subcohorts. Citrate/EDTA issues were also present (A). We have now settled on strategies for handling these issues.
Thereafter, Stefan Gustafsson and Erik Lampa have worked hard on understanding the missingness patterns in proteins and metabolites, and have run numerous simulations with different settings reflecting different assumptions about the missingness-generating mechanisms. We have now settled on strategies for handling the missing data, which will differ somewhat between metabolites and proteins.
We have performed simulations of various methods for handling the cluster structure in the study, which was slightly challenging given the weighted case-cohort approach, and have now settled on robust methods. We observed worrying differences in incidence rates between the cohorts, but, after communication with you, we have now retrieved the correct totals (for subcohort weights) and will be able to move on with the analyses. Slight differences in incidence rates remain, but we believe these are real, as illustrated in the figure below.
After this long QC process, we are now ready to move on to the next stage, the primary analyses, which we expect to initially produce two separate manuscripts: one on the proteomics data and one on the metabolomics data. The two analyses will be published separately because of slight differences in analysis settings, mostly related to differences in missingness assumptions between the two datasets. Thereafter, more manuscripts will hopefully follow. Our publication manager, Ida Björkgren, will handle all administrative work related to submission of the manuscripts.