Volume no: 8, Issue no: 1, September (2012)

USING DATA MINING TOOLS TO FIND SIMILARITIES IN GENETIC PREDICTORS FOR COLON CANCER RECURRENCE

Authors: JOHN IHRIE and MINDY HONG
Pages: [1] - [13]
Received Date: August 21, 2012

Abstract

Prognosis predictors based on gene lists have been proposed to supplement existing methods for predicting risk of recurrence in colon cancer patients. Currently, staging systems are used to assess risk in individual patients, but these systems often lack accuracy. Genetic predictors might improve risk assessment; however, different research teams often obtain dissimilar gene lists. In this study, web-based data mining tools are used to explore similarities among seven gene lists that are difficult to discern at the gene level. These lists are examined at three levels: gene, pathway, and network. WebGestalt is applied to identify statistically significant pathways in each list; Genes2Networks is then employed to search for relevant networks for each possible pair of lists and to create a network for all seven lists combined. Finally, the positive matching index is used to compare each list with every other list at all three levels. Even though the gene lists showed little or no similarity at the gene level, similarities were generally greater at the pathway and network levels. Four non-list genes (AR, EGFR, GSN, and CEBPB) that might play a role in colon cancer recurrence are identified in the combined-list network. The results help support the widely held belief that biological networks play an important role in disease behaviour and suggest that these seven prognosis predictors might be more similar than they appear. Comparing genetic prognosis predictors might help scientists better understand the underlying biology of colon cancer and gene-based prediction.
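
The pairwise comparison step can be illustrated with a minimal sketch. It assumes the positive matching index (PMI) in the form usually attributed to Dos Santos and Deutsch, where a is the number of items shared by two lists and b and c are the numbers of items unique to each list: PMI = a / |b - c| * ln((a + max(b, c)) / (a + min(b, c))), reducing to a / (a + b) when b = c. The abstract does not state the formula, so this definition, the function names, and the placeholder identifiers (G1, G2, ...) are assumptions for illustration and not the authors' code or data.

```python
from itertools import combinations
from math import log


def positive_matching_index(list_a, list_b):
    """PMI of two item lists (assumed Dos Santos and Deutsch definition)."""
    set_a, set_b = set(list_a), set(list_b)
    a = len(set_a & set_b)   # items shared by both lists
    b = len(set_a - set_b)   # items only in the first list
    c = len(set_b - set_a)   # items only in the second list
    if a == 0:
        return 0.0           # no overlap (also covers two empty lists)
    if b == c:
        return a / (a + b)   # limiting case of the formula when b equals c
    lo, hi = min(b, c), max(b, c)
    return a / (hi - lo) * log((a + hi) / (a + lo))


def pairwise_pmi(named_lists):
    """Compare every list with every other list; keys are name pairs."""
    return {
        (n1, n2): positive_matching_index(named_lists[n1], named_lists[n2])
        for n1, n2 in combinations(sorted(named_lists), 2)
    }


# Hypothetical placeholder lists; the items could equally be pathway or
# network-node identifiers when comparing at those levels.
example_lists = {
    "list_1": ["G1", "G2", "G3"],
    "list_2": ["G2", "G3", "G4"],
    "list_3": ["G5", "G6"],
}
scores = pairwise_pmi(example_lists)
```

With seven lists, `pairwise_pmi` would return 21 scores, one per pair. The same function applies unchanged at the gene, pathway, and network levels, since only the identifiers in the lists differ.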

Keywords

biological networks, biological pathways, colon cancer, colorectal cancer, data mining, Genes2Networks, genetic predictor, graph theory, positive matching index, prognosis predictor, WebGestalt.