

Start: October 2019
Duration: 3 Years
Domain: Foodborne Zoonoses, Antimicrobial Resistance
Members: IP, INRA, ANSES (France); PHE (UK)
Contact: Dr Sylvain Brisse (IP)

Codes4strains: Tracking bacterial pathogens through sources, geography and time using stable phylogenetically informative genome codes

The implementation of genome sequencing in public health microbiology has allowed the natural variation exhibited by pathogenic bacteria to be leveraged for infectious disease surveillance and outbreak detection. Genotype information derived from WGS allows the monitoring of pathogenic potential and the tracking of epidemic behaviour, to inform infection control, diagnostic and treatment practice.

To track strains globally, and as they spread between the environment, food, animals and humans, universal strain nomenclatures are necessary. Two main approaches to strain nomenclature currently exist.

First, core genome Multilocus Sequence Typing (cgMLST) is widely applied for bacterial pathogen surveillance. It relies on predefined gene loci, the sequence variants of which are given unique identifiers (allelic numbers). The resulting allelic profiles are either given unique identifiers (cgST) or grouped based on their similarity, generally using the single-linkage clustering method.
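As a minimal sketch of how allelic profiles can be compared, the Python snippet below counts the loci at which two profiles carry different allele numbers. The isolate names, the five-locus profiles and the choice to skip uncalled loci are illustrative assumptions, not part of any particular cgMLST scheme.

```python
from typing import Optional, Sequence

# One allele number per locus; None marks a locus that was not called (assumption).
Profile = Sequence[Optional[int]]


def allelic_distance(a: Profile, b: Profile) -> int:
    """Count loci with different alleles, skipping loci missing in either profile."""
    if len(a) != len(b):
        raise ValueError("profiles must cover the same loci")
    return sum(
        1
        for x, y in zip(a, b)
        if x is not None and y is not None and x != y
    )


# Hypothetical profiles over five loci.
profiles = {
    "isolate_A": [1, 4, 2, 7, 3],
    "isolate_B": [1, 4, 2, 9, 3],
    "isolate_C": [5, 4, None, 9, 8],
}

d_ab = allelic_distance(profiles["isolate_A"], profiles["isolate_B"])
d_ac = allelic_distance(profiles["isolate_A"], profiles["isolate_C"])
d_bc = allelic_distance(profiles["isolate_B"], profiles["isolate_C"])
print(d_ab, d_ac, d_bc)  # 1 3 2
```

Pairwise distances of this kind are the input to the similarity-based grouping mentioned above.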

An alternative approach, known as the SNP address, was developed at Public Health England. Unlike MLST, it is based on single nucleotide polymorphisms (SNPs) called against a reference genome. Single-linkage clustering is performed on the resulting SNP distances between isolates. An original concept of the SNP address is to apply several thresholds to the allelic or SNP differences. The ‘address’ is a multi-position code in which each position gives the cluster membership at one of a series of descending thresholds of genetic (SNP) distance among strains. The result is a multi-level nomenclature that provides a good approximation of the phylogenetic relatedness among isolates. Likewise, several cgMLST thresholds can be used to provide phylogenetic information in addition to classification, as was done for Listeria monocytogenes by the group of the main applicant.
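The multi-level idea can be sketched in a few lines of Python: run single-linkage clustering once per threshold, from the largest to the smallest, and join the cluster numbers into a dotted code. The isolate names, distances and the three thresholds (50, 10, 0) are made up for illustration, and the cluster numbering (order of first appearance) is an assumption rather than the scheme used by any production system.

```python
from itertools import combinations


def single_linkage(names, dist, threshold):
    """Label isolates by single-linkage cluster: link pairs with distance <= threshold."""
    parent = {n: n for n in names}

    def find(n):
        # Union-find lookup with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b in combinations(names, 2):
        if dist[frozenset((a, b))] <= threshold:
            parent[find(a)] = find(b)

    # Number clusters in order of first appearance (illustrative convention).
    ids, labels = {}, {}
    for n in names:
        root = find(n)
        ids.setdefault(root, len(ids) + 1)
        labels[n] = ids[root]
    return labels


def multilevel_address(names, dist, thresholds):
    """Concatenate cluster labels at descending thresholds into one code per isolate."""
    levels = [single_linkage(names, dist, t) for t in sorted(thresholds, reverse=True)]
    return {n: ".".join(str(level[n]) for level in levels) for n in names}


# Hypothetical pairwise SNP distances between four isolates.
names = ["s1", "s2", "s3", "s4"]
dist = {
    frozenset(("s1", "s2")): 3,
    frozenset(("s1", "s3")): 40,
    frozenset(("s2", "s3")): 42,
    frozenset(("s1", "s4")): 300,
    frozenset(("s2", "s4")): 301,
    frozenset(("s3", "s4")): 310,
}

addresses = multilevel_address(names, dist, thresholds=(50, 10, 0))
print(addresses)
# {'s1': '1.1.1', 's2': '1.1.2', 's3': '1.2.3', 's4': '2.3.4'}
```

Isolates that share a longer prefix of the code are more closely related: here s1 and s2 agree at the first two levels, s3 only at the first, and s4 at none.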

Providing multi-level information on phylogenetic relatedness has proved helpful for epidemiological investigations and for prospective surveillance. This has facilitated outbreak detection and has provided the framework for case/control studies at different diversity levels, depending on the length or complexity of an outbreak. Further, using a flexible level of divergence to define an ‘outbreak type’ aids hypothesis generation and may, in some cases, allow the specific source of the outbreak to be identified by maximizing the power of case-control source attribution studies.

SNP and cgMLST approaches have complementary characteristics. One strength of the cgMLST approach is its standardization (it uses predefined sets of loci, whereas SNP approaches have proven difficult to standardize), which maximizes the applicability of the method for international or cross-sector strain comparisons where analysis is performed independently. In turn, whole-genome SNPs are more discriminatory than cgMLST, which relies on a predefined set of ‘core’ loci. Therefore, SNP and cgMLST should be regarded as two useful approaches to be integrated jointly in future genomic epidemiology strategies.

However, one major limitation of current SNP address or multi-level cgMLST classifications is that they use single-linkage clustering to define groups. This approach is unstable, as the fusion of predefined groups upon discovery of ‘intermediate’ genotypes is an inherent mathematical property of single-linkage. This issue is pertinent within epidemiological timescales, where intermediate genotypes have a high probability of being sampled. It is the experience of both applicant groups that the fusion of predefined groups is difficult to handle in practice and introduces nomenclatural confusion.
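The fusion effect can be reproduced with a toy example. In the Python sketch below, two isolates that are 8 differences apart form separate groups at a threshold of 5, but the groups merge as soon as an intermediate genotype 4 differences from each is added. The isolate names, distances and threshold are hypothetical.

```python
def single_linkage_clusters(names, dist, threshold):
    """Return single-linkage clusters (as frozensets) at the given distance threshold."""
    groups = [{n} for n in names]
    merged = True
    while merged:
        merged = False
        for i in range(len(groups)):
            for j in range(i + 1, len(groups)):
                # Two groups merge if ANY cross-group pair is within the threshold.
                if any(
                    dist[frozenset((a, b))] <= threshold
                    for a in groups[i]
                    for b in groups[j]
                ):
                    groups[i] |= groups.pop(j)
                    merged = True
                    break
            if merged:
                break
    return {frozenset(g) for g in groups}


# Hypothetical distances: A and B are far apart, C sits between them.
dist = {
    frozenset(("A", "B")): 8,
    frozenset(("A", "C")): 4,
    frozenset(("B", "C")): 4,
}

before = single_linkage_clusters(["A", "B"], dist, threshold=5)
after = single_linkage_clusters(["A", "B", "C"], dist, threshold=5)
print(len(before), len(after))  # 2 1
```

Any identifier previously assigned to the separate groups of A and B has to be revised once C is sampled, which is the instability described above.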

Currently, no genomic nomenclature system for bacterial pathogens combines complete stability of identifiers, high standardization and reproducibility, and high resolution. This gap is an important barrier in the field of genomic epidemiology and slows down communication and action against the transmission of pathogens across sectors, world regions and long periods of time. This critical gap was addressed in the Codes4strains PhD project.

Congratulations to Mélanie for being awarded her doctorate degree in Autumn 2022!

Project Assets

PhD Final Thesis Report

Tessier, E., Hennart, M., Badell, E., Passet, V., Toubiana, J., Biron, A., Gourinat, A. C., Merlet, A., Colot, J., & Brisse, S. (2023). Genomic Epidemiology of Corynebacterium diphtheriae in New Caledonia. Microbiology spectrum. 11(3), e0461622. DOI:

Arcari, G., Hennart, M., Badell, E., & Brisse, S. (2023). Multidrug-resistant toxigenic Corynebacterium diphtheriae sublineage 453 with two novel resistance genomic islands. Microbial genomics. 9(1), mgen000923. DOI:

Museux, K., Arcari, G., Rodrigo, G., Hennart, M., Badell, E., Toubiana, J., & Brisse, S. (2023). Corynebacteria of the diphtheriae Species Complex in Companion Animals: Clinical and Microbiological Characterization of 64 Cases from France. Microbiology spectrum. e0000623. Advance online publication. DOI:

Hennart, M., Crestani, C., Bridel, S., Armatys, N., Brémont, S., Carmi-Leroy, A., Landier, A., Passet, V., Fonteneau, L., Vaux, S., Toubiana, J., Badell, E., & Brisse, S. (2023). A global Corynebacterium diphtheriae genomic framework sheds light on current diphtheria reemergence.

Hennart, M., Guglielmini, J., Bridel, S., Maiden, M.C.J., Jolley, K.A., Criscuolo, A., Brisse, S. (2022). A Dual Barcoding Approach to Bacterial Strain Nomenclature: Genomic Taxonomy of Klebsiella pneumoniae Strains. Molecular Biology and Evolution. 39 (7), msac135. DOI:

12-month PhD report-December 2021.

Guglielmini, J., Hennart, M., Badell, E., Toubiana, J., Criscuolo, A., & Brisse, S. (2021). Genomic Epidemiology and Strain Taxonomy of Corynebacterium diphtheriae. Journal of clinical microbiology. 59(12), e0158121. DOI:

Badell, E., Alharazi, A., Criscuolo, A., Almoayed, K., Lefrancq, N., Bouchez, V., Guglielmini, J., Hennart, M., Carmi-Leroy, A., Zidane, N., Pascal-Perrigault, M., Lebreton, M., Martini, H., Salje, H., Toubiana, J., Dureab, F., Dhabaan, G., & Brisse, S., Rawah, A. & Al-Somainy, A. (2021). Ongoing diphtheria outbreak in Yemen: a cross-sectional and genomic epidemiology study. The Lancet Microbe. 2(8), E386-E396.  DOI:

Badell, E., Hennart, M., Rodrigues, C., Passet, V., Dazas, M., Panunzi, L., Bouchez, V., Carmi-Leroy, A., Toubiana, J., & Brisse, S. (2020). Corynebacterium rouxii sp. nov., a novel member of the diphtheriae species complex. Research in microbiology. 171(3-4), 122–127. DOI:

Hennart, M., Panunzi, LG., Rodrigues, C., Gaday, Q., Baines, SL., Barros-Pinkelnig, M., Carmi-Leroy, A., Dazas, M., Wehenkel, AM., Didelot, X., Toubiana, J., Badell, E., Brisse, S. (2020). Population genomics and antimicrobial resistance in Corynebacterium diphtheriae. Genome Medicine. 12, p 1-18. DOI:

Deliverables of Codes4strains project (Part1)

Deliverables of Codes4strains project (Part2)

Poster presentation at 13th International Meeting on Microbial Epidemiological Markers (IMMEM XIII), Bath, UK. 14-17th September 2022.

Poster presentation & participation in 3-minute thesis competition at One Health EJP Annual Scientific Meeting, Orvieto, Italy. 11-13th April 2022.



About me:
My name is Melanie and I am 24 years old. I have held a Master’s degree in Bioinformatics and Modelling (BIM) from Sorbonne University since July 2018. My academic background in this multidisciplinary course has allowed me to develop my knowledge of biology and to acquire skills in informatics, such as the use and implementation of bioinformatics tools. I have therefore chosen to focus my scientific profile on methodological development for the analysis of large genomic datasets.

What motivated me to do a PhD:
My professional experience in the Unit “Biodiversity and Epidemiology of Bacterial Pathogens” at the Institut Pasteur allowed me to discover the field of molecular epidemiology and reinforced my interest in comparative genomics. In addition, pursuing my doctoral studies will allow me to explore and design methods of bacterial nomenclature that will facilitate communication between different public health actors in the future. This project will allow me to benefit from the expertise of the various laboratories in bacterial population genomics and bioanalysis in the field of microbiology.


