Artificial intelligence and gene regulation

  • High-throughput experiments such as next-generation sequencing are often used to answer simple biological questions, for example “Which genes are more highly expressed in breast cancer than in normal tissue?”. Given the huge amount of information each experiment generates, this is like having privileged access to an oracle and asking “What time is it?”. Machine learning is an excellent tool for discovering hidden information in large amounts of data. Such methods allow life scientists not only to get better answers but also to generate novel hypotheses. Our lab looks for opportunities in medical and fundamental biology data where information theory and machine learning can make a substantial impact. Examples of our discoveries include using Shannon’s entropy to uncover transcriptional disorder in cancer (PLoS CB, 2008), simulating a biologist’s behavior to identify a method for detecting microRNA targets (Nature Methods, 2009), and using novel bioinformatics strategies to reveal the impact of introns on gene expression (Cell, 2013; Genome Biology, 2017; Nature Communications, 2017).
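As an illustration of the information-theoretic idea mentioned above, here is a minimal sketch of Shannon's entropy, H = -Σ p_i log2(p_i), applied to an expression profile. It is not the procedure used in the cited PLoS CB paper; the function name `shannon_entropy` and the toy count vectors are hypothetical and chosen only for illustration.

```python
import math


def shannon_entropy(counts):
    """Return the Shannon entropy (in bits) of a vector of non-negative counts.

    The counts are normalised to proportions p_i, then H = -sum(p_i * log2(p_i)).
    Zero counts contribute nothing to the sum.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive value")
    entropy = 0.0
    for c in counts:
        if c > 0:
            p = c / total
            entropy -= p * math.log2(p)
    return entropy


# Hypothetical toy profiles (read counts for four genes), not real data.
focused = [970, 10, 10, 10]       # expression dominated by one gene
dispersed = [250, 250, 250, 250]  # expression spread evenly over all genes

print(shannon_entropy(focused))
print(shannon_entropy(dispersed))
```

Under this measure, a profile dominated by a few genes has low entropy and an evenly spread profile has high entropy, which is one way a notion such as "transcriptional disorder" could be quantified.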


    Artificial Intelligence in Biology and Health, October 2018: a look back at the symposium, with videos of all the speakers


    William Ritchie
    Aubin Thomas
    Sylvain Barriere
    Lucile Broseus
    Claudio Lorenzi
    Stanislas Fereol


    Exploring the Roles of CREBRF and TRIM2 in the Regulation of Angiogenesis by High-Density Lipoproteins.

    Wong NKP, Cheung H, Solly EL, Vanags LZ, Ritchie W, Nicholls SJ, Ng MKC, Bursill CA, Tan JTM

    2018 - Int J Mol Sci, 19(7)

    Request the full article (PMID: 29958463)

    An NF90/NF110-mediated feedback amplification loop regulates dicer expression and controls ovarian carcinoma progression.

    Barbier J, Chen X, Sanchez G, Cai M, Helsmoortel M, Higuchi T, Giraud P, Contreras X, Yuan G, Feng Z, Nait-Saidi R, Deas O, Bluy L, Judde JG, Rouquier S, Ritchie W, Sakamoto S, Xie D, Kiernan R

    2018 - Cell Res, 28(5):556-571

    Request the full article (PMID: 29563539)

    IRFinder: assessing the impact of intron retention on mammalian gene expression

    Middleton R, Gao D, Thomas A, Singh B, Au A, Wong JJ, Bomane A, Cosson B, Eyras E, Rasko JE, Ritchie W.

    2017 - Genome Biol, 18(1):51

    Request the full article (PMID: 28298237)

    Intron retention is regulated by altered MeCP2-mediated splicing factor recruitment

    Wong JJ, Gao D, Nguyen TV, Kwok CT, van Geldermalsen M, Middleton R, Pinello N, Thoeng A, Nagarajah R, Holst J, Ritchie W, Rasko JEJ

    2017 - Nat Commun, 8:15134

    Request the full article (PMID: 28480880)

    Team publications

  • Ultra-long sequencing to detect cancer-associated intron retention

    Intron retention (IR) occurs when an intron is included in a mature mRNA. IR was previously regarded as a by-product of faulty splicing, and transcripts with retained introns are often rapidly degraded by a surveillance mechanism called nonsense-mediated decay (NMD). We discovered that numerous cell types exploit this mechanism, increasing the number of transcripts with retained introns so that they are targeted for degradation, during granulopoiesis (Cell, 2013), in pluripotent stem cells (Nature, 2014) and during erythrocyte differentiation (Blood, 2016). IR was recently found to play a major role in modulating tumour suppressor genes in hundreds of different cancers (Nature Genetics, 2015). However, because IR could not previously be identified reliably, numerous studies have overlooked potential biomarkers and therapeutic targets linked to this novel type of gene regulation. In this project, we will combine new long RNA sequencing with classical Illumina sequencing to define IR with unprecedented accuracy. This will enable us to identify the IR features that contribute to normal development and disease.
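To make the idea of quantifying IR concrete, here is a minimal sketch of one common way to summarise retention for a single intron: the fraction of transcripts that keep the intron, estimated from the read depth inside the intron and the number of reads spliced across it. The text above does not specify a metric, so this formula, the function name `ir_ratio` and the example numbers are illustrative assumptions, not the project's actual method.

```python
def ir_ratio(intron_depth, spliced_reads):
    """Estimate the intron retention ratio for one intron.

    intron_depth  -- read depth measured inside the intron (retained transcripts)
    spliced_reads -- reads spanning the exon-exon junction (spliced transcripts)

    Returns intron_depth / (intron_depth + spliced_reads), or None when the
    intron has no coverage at all and the ratio is undefined.
    """
    if intron_depth < 0 or spliced_reads < 0:
        raise ValueError("depths must be non-negative")
    total = intron_depth + spliced_reads
    if total == 0:
        return None
    return intron_depth / total


# Hypothetical numbers for illustration only.
print(ir_ratio(10, 90))  # mostly spliced transcripts
print(ir_ratio(60, 40))  # mostly retained transcripts
```

A ratio near 0 means the intron is almost always spliced out, and a ratio near 1 means it is almost always retained. With short reads the two depths are estimated separately, which is one reason accurate IR detection is difficult.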

    Programming genetic networks to extract hidden information in sequencing data

    Advances in next-generation sequencing (NGS) methods have revealed that transcription is more pervasive, more diverse and more cryptic than expected. Despite this heterogeneity, and although our understanding of transcript architecture is incomplete, bioinformatics analyses of these data frequently begin with the same biased step: reads are mapped to a reference genome or transcriptome. This step accounts neither for major changes in the genome or transcriptome, as can occur in multiple cancers, nor for the small sequence variations common between individuals. As a result, only a portion of the transcriptional information measured by NGS is used to discover meaningful signatures that distinguish biological samples.