Integrating transcription factor binding site information with gene expression datasets

Ian B. Jeffery, Stephen F. Madden, Paul A. McGettigan, Guy Perrière, Aedín C. Culhane, Desmond G. Higgins

Research output: Contribution to journal › Article › peer-review

Abstract

Motivation: Microarrays are widely used to measure gene expression differences between sets of biological samples. Many of these differences will be due to differences in the activities of transcription factors. In principle, these differences can be detected by associating motifs in promoters with differences in gene expression levels between the groups. In practice, this is hard to do.

Results: We combine correspondence analysis, between-group analysis and co-inertia analysis to determine which motifs, from a database of promoter motifs, are strongly associated with differences in gene expression levels. Given a database of motifs and gene expression levels from a set of arrays, the method produces a ranked list of motifs associated with any specified split in the arrays. We give an example using the Gene Atlas compendium of gene expression levels for human tissues, in which we search for motifs that are associated with expression in central nervous system (CNS) or muscle tissues. Most of the motifs that we find are known from previous work to be strongly associated with expression in CNS or muscle. We give a second example using a published prostate cancer dataset, in which we can simply and clearly find which transcriptional pathways are associated with differences between benign and metastatic samples.
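The input/output shape described in the Results (a motif table and an expression table sharing the same genes, a specified split of the arrays, and a ranked motif list) can be illustrated with a toy sketch. This is not the authors' method: it replaces the correspondence analysis, between-group analysis and co-inertia analysis pipeline with a much simpler stand-in, the covariance between per-gene motif counts and the per-gene difference in mean expression between the two groups. The function name `rank_motifs`, the gene, motif and sample names, and all values are hypothetical.

```python
from statistics import mean


def rank_motifs(expression, motif_counts, group_a, group_b):
    """Rank motifs by |covariance| between motif counts and the per-gene
    difference in mean expression (group A minus group B).

    Simplified stand-in for illustration only; NOT the co-inertia
    analysis described in the abstract.
    """
    genes = sorted(expression)
    # Per-gene difference in mean expression between the two groups.
    diff = {
        g: mean(expression[g][s] for s in group_a)
        - mean(expression[g][s] for s in group_b)
        for g in genes
    }
    diff_mean = mean(diff.values())
    # Every motif seen in at least one promoter.
    motifs = sorted({m for g in genes for m in motif_counts.get(g, {})})
    scores = {}
    for m in motifs:
        counts = [motif_counts.get(g, {}).get(m, 0) for g in genes]
        count_mean = mean(counts)
        # Covariance across genes between motif count and expression shift.
        scores[m] = sum(
            (c - count_mean) * (diff[g] - diff_mean)
            for c, g in zip(counts, genes)
        ) / len(genes)
    # Largest absolute association first; the sign gives the direction.
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)


# Hypothetical toy data: 4 genes, 2 "CNS" arrays and 2 "muscle" arrays.
expression = {
    "g1": {"cns1": 9.0, "cns2": 8.0, "mus1": 2.0, "mus2": 3.0},
    "g2": {"cns1": 8.0, "cns2": 9.0, "mus1": 3.0, "mus2": 2.0},
    "g3": {"cns1": 2.0, "cns2": 3.0, "mus1": 9.0, "mus2": 8.0},
    "g4": {"cns1": 5.0, "cns2": 5.0, "mus1": 5.0, "mus2": 5.0},
}
motif_counts = {
    "g1": {"motif_A": 2, "motif_C": 1},
    "g2": {"motif_A": 1, "motif_C": 1},
    "g3": {"motif_B": 3, "motif_C": 1},
    "g4": {"motif_C": 1},
}

ranked = rank_motifs(expression, motif_counts, ["cns1", "cns2"], ["mus1", "mus2"])
for motif, score in ranked:
    print(f"{motif}\t{score:+.3f}")
```

In this toy setting a positive score means the motif is enriched in promoters of genes expressed more highly in the first group, a negative score means the opposite, and a motif present in every promoter scores zero because it cannot discriminate between the groups.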

Original language: English
Pages (from-to): 298-305
Number of pages: 8
Journal: Bioinformatics
Volume: 23
Issue number: 3
Publication status: Published - 1 Feb 2007
Externally published: Yes
