Comparison of logP and logD correction models trained with public and proprietary data sets

PMID: 35359246 | DOI: 10.1007/s10822-022-00450-9

Publication type: Journal Article

Abstract

In drug discovery, the octanol/water partition and distribution coefficients, logP and logD, are widely used as metrics of molecular lipophilicity, which in turn strongly influences the bioactivity and bioavailability of potential drugs. There are a variety of established methods, mostly fragment- or atom-based, for calculating logP, while logD prediction generally relies on calculated logP and pKa to estimate the neutral and ionized populations at a given pH. Algorithms such as ClogP have limitations that generally lead to systematic errors for chemically related molecules, while pKa estimation is generally more difficult because of the interplay of electronic, inductive and conjugation effects in ionizable moieties. We propose an integrated machine learning QSAR modeling approach that predicts logD by training the model on experimental data while using ClogP and pKa predicted by commercial software as model descriptors. By optimizing the loss function for the ClogD calculated by the software, we build a correction model that incorporates both descriptors from the software and available experimental logD data. Additionally, we calculate logP from the logD model using the software-predicted pKa values. Here, we have trained models using publicly or commercially available logD data to show that this approach can improve on commercial software predictions of lipophilicity. When applied to other logD data sets, this approach extends the domain of applicability of logD and logP predictions beyond that of the commercial software. The performance of these models compares favorably with that of models built with a larger set of proprietary logD data.
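
The conventional route from logP and pKa to logD, and the reverse step of recovering logP from a logD value, can be illustrated with a minimal sketch. It assumes a single ionizable group and that only the neutral species partitions into octanol, which is a common textbook simplification and not necessarily what the authors or the commercial software implement. The function names and the default pH of 7.4 are hypothetical choices made for illustration.

```python
import math


def ionization_penalty(pka: float, ph: float, kind: str) -> float:
    """Return log10(1 / neutral fraction) for a monoprotic acid or base.

    Assumption (not from the abstract): one ionizable group, and only the
    neutral form enters the octanol phase.
    """
    if kind == "acid":
        exponent = ph - pka      # acids ionize as pH rises above pKa
    elif kind == "base":
        exponent = pka - ph      # bases ionize as pH falls below pKa
    else:
        raise ValueError("kind must be 'acid' or 'base'")
    return math.log10(1.0 + 10.0 ** exponent)


def logd_from_logp(logp: float, pka: float, ph: float = 7.4,
                   kind: str = "acid") -> float:
    """Estimate logD at the given pH from logP and pKa."""
    return logp - ionization_penalty(pka, ph, kind)


def logp_from_logd(logd: float, pka: float, ph: float = 7.4,
                   kind: str = "acid") -> float:
    """Invert the relation above: recover logP from logD and pKa."""
    return logd + ionization_penalty(pka, ph, kind)


if __name__ == "__main__":
    # Hypothetical acid: logP = 3.0, pKa = 4.4, measured at pH 7.4.
    logd = logd_from_logp(3.0, 4.4, ph=7.4, kind="acid")
    print(f"logD at pH 7.4: {logd:.3f}")
    print(f"logP recovered: {logp_from_logd(logd, 4.4, ph=7.4):.3f}")
```

Under these assumptions, any error in the predicted pKa feeds directly into the ionization term, which is one reason a correction model trained on experimental logD data can be useful.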
