Improving ramification detection of St. Nicolas House Analysis

A combination approach




St. Nicolas House Analysis, snha, network reconstruction, R-squared gaining, linear model check, graph estimation


The St. Nicolas House Analysis (SNHA) is a new graph estimation method for detecting extensive interactions among variables. It ranks absolute bivariate correlation coefficients in descending order, thereby creating hierarchical association chains. These chains characterize the dependence structure of interacting variables and can be visualized in a network graph as a chain of end-to-end connected edges, each representing a direct relationship between the nodes it connects. The main advantage of this relatively new approach is that it produces fewer false-positive edges resulting from indirect or transitive associations than standard correlation-based or linear-model-based approaches. Here we aim to improve the detection of ramifications in graphs by adding further data processing layers to SNHA. These layers comprise combinations of the extensions R-squared gaining (RSG) and linear model check (LMC).
SNHA together with these extensions was benchmarked against default SNHA and other reference methods available for the programming language R. Combinations of RSG, LMC and bootstrapping improve SNHA performance across different network types, albeit at the cost of longer computation time.
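
The ranking step described above can be illustrated with a minimal sketch. It is not the implementation used in the snha R package. The correlation values are invented for a hypothetical chain A - B - C - D, the function names `association_chain` and `chain_edges` are made up for this example, and the additional checks that SNHA applies to a chain before accepting its edges are omitted.

```python
# Simplified illustration of building an association chain by ranking
# absolute bivariate correlations. All values and names are hypothetical.

# Invented correlation matrix for a chain A - B - C - D
# (neighbours correlate with |r| = 0.8; indirect pairs are weaker).
corr = {
    "A": {"A": 1.0, "B": 0.8, "C": -0.64, "D": -0.512},
    "B": {"A": 0.8, "B": 1.0, "C": -0.8, "D": -0.64},
    "C": {"A": -0.64, "B": -0.8, "C": 1.0, "D": 0.8},
    "D": {"A": -0.512, "B": -0.64, "C": 0.8, "D": 1.0},
}


def association_chain(corr, start):
    """Order all other variables by descending |r| with the start variable."""
    others = [v for v in corr if v != start]
    ranked = sorted(others, key=lambda v: abs(corr[start][v]), reverse=True)
    return [start] + ranked


def chain_edges(chain):
    """Connect consecutive chain members end to end."""
    return list(zip(chain, chain[1:]))


chain = association_chain(corr, "A")
edges = chain_edges(chain)
print(chain)  # ['A', 'B', 'C', 'D']
print(edges)  # [('A', 'B'), ('B', 'C'), ('C', 'D')]
```

In this toy case the indirect associations A-C, A-D and B-D do not become edges, although their correlations are far from zero. A plain correlation threshold of, say, 0.5 would have reported them as additional edges.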






How to Cite

Chen, S., Moris, C., & Groth, D. (2024). Improving ramification detection of St. Nicolas House Analysis: A combination approach. Human Biology and Public Health, 1.



International Student Summer School