Expansion of the data-supported druggable proteome
The landmark 2002 publication by Hopkins and Groom coined the term “druggable genome” (1) for the predicted total of proteins likely to bind small molecules with approximately lead-like chemical properties, experimentally useful binding affinity and consequent activity modulation. By paralogous extrapolation from the 120 approved drug targets known at that time (i.e. the drugged genome) they arrived at a figure of 3000. However, this was based on a proteome estimate of 30,000 that has since shrunk to a more defined total of 20,203 Swiss-Prot canonical human protein entries. The pages for these entries on the UniProtKB website now include four database cross-references in the new Chemistry section, which allow a more detailed update of the druggable proteome based largely on chemistry-to-protein mapping data curated from the literature. The resulting figures are thus evidence-supported statistics rather than homology-based transitive estimates. The cross-references link to 2927 target entries from ChEMBL, 2191 from BindingDB, 1563 from DrugBank and 1340 from the IUPHAR/BPS Guide to PHARMACOLOGY (GtoPdb). Statistical comparisons between these sets, which form the subject of this work, are informative both for defining druggable sets and for following their continued expansion. The union of all four sets, 3603, encompasses ~18% of the proteome. However, the proportion that would match our own GtoPdb criteria for druggability mapping is difficult to estimate, since the chemistry-to-protein curation strategies and source selections for each database diverge considerably (2). This is manifest in the relatively high content unique to a single source, 1147 proteins (31% of the union). However, the sources converge as a 4-way intersect for 490 proteins (13% of the union) that include an updated drugged proteome. Concordance between at least two independent sources (i.e. the non-unique proportion) expands this to 2456 proteins, or 12% of the proteome. This represents the most precise data-supported druggable proteome snapshot for each UniProtKB release.
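The overlap statistics quoted above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name `overlap_summary` and the toy accessions are hypothetical, and only the four headline counts (proteome, union, unique, 4-way intersect) are taken from the text. Percentages computed this way come out close to those quoted, although rounding may differ by a point.

```python
from collections import Counter


def overlap_summary(sources):
    """Summarise overlap between named sets of protein accessions.

    `sources` maps a database name to the set of accessions it covers.
    (Hypothetical helper for illustration only.)
    """
    # Count in how many sources each accession appears.
    counts = Counter(acc for members in sources.values() for acc in set(members))
    n_sources = len(sources)
    return {
        "union": len(counts),
        "unique": sum(1 for c in counts.values() if c == 1),
        "two_or_more": sum(1 for c in counts.values() if c >= 2),
        "all_sources": sum(1 for c in counts.values() if c == n_sources),
    }


def pct(part, whole):
    """Return `part` as a percentage of `whole`."""
    return 100.0 * part / whole


# Toy data with made-up accessions, standing in for the real cross-reference lists.
toy = {
    "ChEMBL": {"P1", "P2", "P3", "P4"},
    "BindingDB": {"P1", "P2", "P5"},
    "DrugBank": {"P1", "P3"},
    "GtoPdb": {"P1", "P6"},
}
print(overlap_summary(toy))

# Headline counts as quoted in the text.
PROTEOME = 20203
UNION = 3603
UNIQUE_TO_ONE_SOURCE = 1147
FOUR_WAY_INTERSECT = 490

# Proteins supported by at least two sources = union minus single-source proteins.
in_two_or_more = UNION - UNIQUE_TO_ONE_SOURCE

print(f"union / proteome:        {pct(UNION, PROTEOME):.1f}%")
print(f"two or more / proteome:  {pct(in_two_or_more, PROTEOME):.1f}%")
print(f"unique / union:          {pct(UNIQUE_TO_ONE_SOURCE, UNION):.1f}%")
print(f"4-way intersect / union: {pct(FOUR_WAY_INTERSECT, UNION):.1f}%")
```

Counting source membership per accession, rather than enumerating every pairwise and three-way intersection, gives the unique, two-or-more and all-source totals in a single pass.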
Orthogonal comparative analyses of these intersecting sets will be presented, including by Gene Ontology functional categories, target class content, secreted vs. non-secreted proteins, and disease gene links. For reasons that will be expanded on, this druggable proteome assessment is highly useful in pharmacology and drug discovery, because the database mappings point to lead compounds that can serve as chemical starting points for target validation experiments. Utility also extends to chemical biology via the use of activity-modulating small molecules as probes for function. Initiatives such as the “Illuminating the Druggable Genome Program” (NIH) and efforts to address the untargeted kinome (3) are certain to expand druggable genome coverage, although the conversion rate to solidly validated new therapeutic targets is likely to remain low.
1) Hopkins AL and Groom CR (2002) Nat Rev Drug Discov 1(9): 727-730.
2) Southan C et al. (2013) Mol Inform 32(11-12): 881-897.
3) Knapp S et al. (2013) Nat Chem Biol 9(1): 3-6.