Figure 2. Validation of the lists of driver candidates produced by each method and the approach taken to combine them.
(A) Proportion of genes included in the Cancer Gene Census (CGC) as a function of the number of top-ranking genes taken from the list retrieved by each method in the pan-cancer analysis. Note that OncodriveCLUST retrieves only 72 genes and therefore does not appear in the last two histograms, and ActiveDriver retrieves 95 genes and thus does not appear in the last histogram. (B) Venn diagram showing the overlap between the genes selected by each method in the pan-cancer analysis. The numbers in parentheses represent the proportion of CGC genes in each group. Note that the CGC rates of groups of genes exhibiting more than one signal of positive selection range from 13% (OncodriveCLUST-MuSiC) to 92% (genes with all four signals). In contrast, these rates are rather low among genes that possess only one signal, ranging between 4% (MuSiC) and 11% (OncodriveFM). Based on these results, we decided to establish the quasi-majority vote described in Methods to select the genes in the core of the HCD list. In other words, genes with at least two signals of positive selection in the pan-cancer analysis and/or in any per-project analysis were nominated as high-confidence drivers. (C) Bar graph detailing the proportion of CGC genes as a function of the number of signals of positive selection identified in each gene.
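The selection rule stated in the caption can be illustrated with a minimal Python sketch. It is not the authors' implementation, and the full quasi-majority vote is defined in Methods. The sketch assumes that "at least two signals" means at least two methods flagging the same gene within a single analysis (pan-cancer or one project). The function names, the input layout, and the gene labels (`GENE_A` and so on) are hypothetical and serve only as toy data.

```python
from collections import defaultdict


def count_signals(calls):
    """Return {gene: highest number of methods flagging it in any one analysis}.

    `calls` is a hypothetical layout: {analysis_name: {method_name: set_of_genes}}.
    """
    best = defaultdict(int)
    for by_method in calls.values():
        per_gene = defaultdict(int)
        for genes in by_method.values():
            for gene in genes:
                per_gene[gene] += 1
        for gene, n in per_gene.items():
            best[gene] = max(best[gene], n)
    return dict(best)


def high_confidence(calls, min_signals=2):
    """Genes with at least `min_signals` signals in some analysis (assumed rule)."""
    return {g for g, n in count_signals(calls).items() if n >= min_signals}


def cgc_rate_by_signals(calls, cgc):
    """Proportion of CGC genes within each signal-count group, as in panel C."""
    groups = defaultdict(list)
    for gene, n in count_signals(calls).items():
        groups[n].append(gene in cgc)
    return {n: sum(flags) / len(flags) for n, flags in groups.items()}


# Toy data only; these are not results from the paper.
toy_calls = {
    "pan_cancer": {
        "MuSiC": {"GENE_A", "GENE_B"},
        "OncodriveFM": {"GENE_A", "GENE_C"},
        "OncodriveCLUST": {"GENE_A"},
        "ActiveDriver": set(),
    },
    "project_1": {
        "MuSiC": {"GENE_C"},
        "OncodriveFM": {"GENE_C"},
    },
}
toy_cgc = {"GENE_A"}

print(sorted(high_confidence(toy_calls)))
print(cgc_rate_by_signals(toy_calls, toy_cgc))
```

In this toy input, `GENE_C` has a single signal in the pan-cancer analysis but two in `project_1`, so it passes the assumed threshold through the per-project route, while `GENE_B` does not.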