
NEC Corporation announced that it has verified the effectiveness of a new drug screening system, ChemMiner™, which utilizes data mining techniques such as active learning.
A ‘passive’ learning system obtains data for learning and then proceeds to learn from it. In active learning, the learning system ‘actively’ chooses the data that should be learned next. The active learning system enables a ten-fold improvement in screening performance, resulting in an approximately 90% decrease in screening cost compared with conventional screening systems. The system’s effectiveness has been verified through collaborative research with Tanabe Seiyaku Co., Ltd., a Japanese pharmaceutical company, to which the system will be delivered in September.
The main points of this new system are as follows:
Development of a novel technique called “exponential selection”
At an actual screening site, active compounds (compounds that are effective for drug purposes when they interact with a protein in some form) are rarely detected, so discovering them has been one of the issues with conventional screening. NEC’s proprietary new technique scores compounds using the information gained from prior screening and selects them with a low or high probability according to that score. Low-scored compounds are still selected, with low probability, which aids the selection of more informative compounds during the next stage of screening. This method makes it possible to find extremely rare active compounds in sets of several hundred thousand to several million compounds.
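The release does not give the formula behind “exponential selection”. As a minimal sketch, assuming only what the name suggests, the code below weights each compound by the exponential of its score and samples without replacement, so low-scored compounds can still be picked, but rarely. The function name `exponential_selection` and the parameter `beta` are hypothetical and are not taken from NEC’s system.

```python
import math
import random

def exponential_selection(scores, k, beta=1.0, rng=None):
    """Pick up to k distinct indices, weighting index i by exp(beta * scores[i]).

    Illustrative assumption only: NEC's actual scoring and selection
    rule is not described in the release.
    """
    rng = rng or random.Random()
    # Subtract the max score before exponentiating for numerical stability.
    top = max(scores)
    weights = [math.exp(beta * (s - top)) for s in scores]
    remaining = list(range(len(scores)))
    chosen = []
    for _ in range(min(k, len(remaining))):
        # Draw one index in proportion to its weight, then remove it
        # so that no compound is selected twice.
        pick = rng.choices(
            remaining, weights=[weights[i] for i in remaining], k=1
        )[0]
        chosen.append(pick)
        remaining.remove(pick)
    return chosen

# Example: the compound at index 3 has a much higher score, so it is
# usually selected, yet the others keep a small chance.
batch = exponential_selection([0.0, 0.0, 0.0, 5.0], k=2, rng=random.Random(0))
```

A larger `beta` makes the selection greedier; `beta=0` reduces it to uniform random selection.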
Development of a novel technique called “descriptor sampling”
With active learning, data learned at an early stage can limit the diversity of a system, so new types of compounds are often not discovered. This represents another major issue of the random screening method. With this unique technique, some descriptors (the numerical features that describe each compound) are masked and are not used for learning. This allows greater diversity, which can lead to the finding of diverse groups of compounds.
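The masking idea can be sketched as follows. This is an illustration under stated assumptions, not NEC’s implementation: the release does not say how descriptors are chosen, so the sketch masks them uniformly at random, and the names `sample_descriptor_mask`, `apply_mask`, and `keep_fraction` are hypothetical.

```python
import random

def sample_descriptor_mask(n_descriptors, keep_fraction=0.5, rng=None):
    """Return the sorted indices of the descriptors that stay visible.

    Assumption: descriptors are masked uniformly at random.
    """
    rng = rng or random.Random()
    n_keep = max(1, round(n_descriptors * keep_fraction))
    return sorted(rng.sample(range(n_descriptors), n_keep))

def apply_mask(compound_descriptors, kept):
    """Drop the masked descriptor columns from every compound row."""
    return [[row[j] for j in kept] for row in compound_descriptors]

# Example: two compounds with four descriptors each; the learner
# would only see the two columns that survive the mask.
kept = sample_descriptor_mask(4, keep_fraction=0.5, rng=random.Random(1))
visible = apply_mask([[0.1, 0.2, 0.3, 0.4], [1.0, 2.0, 3.0, 4.0]], kept)
```

Drawing a fresh mask for each learner or each screening round would give each one a different view of the compounds, which is one plausible way such masking could encourage diverse selections.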
Improved system performance as compared with conventional methods
By applying the previous two techniques to a type of G protein-coupled receptor (GPCRs are a family of proteins that are major, important screening targets in the drug discovery field), NEC was able to demonstrate improved screening system performance compared with conventional methods for several data groups. The number of wet assay experiments carried out during screening, which are vital to finding active compounds, was reduced by 88% to 97% compared with conventional methods. The computer-based experiments used a library of 260,000 compounds containing approximately 1,500 “hits” that bind to the GPCR. In the first experiment, 5,000 data points were learned, among which 37 hits were included. Using NEC’s new method, 91% of these “hits” were located at only 12% of the cost of the conventional method. This improvement reduces screening costs by approximately 90%, from several hundred thousand dollars to several tens of thousands of dollars.
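The figures quoted above can be checked with some simple arithmetic. The sketch below assumes that “these hits” refers to the roughly 1,500 hits in the library and that “12% of the cost” corresponds directly to an 88% cost reduction; the release does not state how cost was measured.

```python
# Figures quoted in the release.
library_size = 260_000
total_hits = 1_500
initial_batch = 5_000
initial_hits = 37

# Hit rate across the whole library: about 0.58%.
base_hit_rate = total_hits / library_size

# Hit rate in the first learned batch: 0.74%.
initial_hit_rate = initial_hits / initial_batch

# Assumption: "91 percent of these hits" refers to the ~1,500 library hits.
hits_found = round(0.91 * total_hits)

# Assumption: spending 12% of the conventional cost means an 88% saving,
# the lower end of the quoted 88% to 97% range.
cost_reduction = 1 - 0.12

print(f"library hit rate: {base_hit_rate:.2%}")
print(f"first-batch hit rate: {initial_hit_rate:.2%}")
print(f"hits located: {hits_found}")
print(f"cost reduction: {cost_reduction:.0%}")
```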
During the drug discovery process, a huge chemical library, consisting of anywhere from one hundred thousand to one million chemical compounds, is screened in search of compounds effective in drug creation. This incurs exorbitant cost, as screening requires costly wet experiments. NEC expects this new system to respond to the need for a more economical drug screening system, which has long been sought in the pharmaceutical field.
NEC’s Bio-IT Business Promotion Center will begin offering outsourcing services for the screening of new drugs in September 2005 and plans to begin sales of ChemMiner™ at a later date.