https://doi.org/10.1093/bib/bbaf171
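Summary
- Development of a new software tool for IDP conformer sampling: the paper introduces IDPConformerGenerator, an open-source tool that can comprehensively explore the structural diversity of intrinsically disordered proteins (IDPs), and it provides a flexible framework that can integrate a variety of experimental constraints.
- Experimental constraints can be integrated (NMR, FRET, etc.): data from experiments such as NMR, SAXS, and FRET can be included as constraints, which enables generation of physically plausible structural ensembles and improves the precision of experiment-based modeling.
- Supports both side-chain rotamer and backbone sampling: backbones are sampled with a PDB-based statistical approach and side chains with the Dunbrack rotamer library, supporting structure generation with atomic-level accuracy.
- Various starting points and constraints can be combined: specific inputs such as sequence, secondary structure bias, dihedral angle constraints, and excluded volume can be adjusted to build sequence-specific conformational ensembles.
- Emphasis on differentiation from existing IDP modeling tools: with flexible backbone sampling, a modular plugin design, and support for user-defined constraints, it shows improved performance over existing tools such as Flexible-Meccano, TraDES, and Flexible Tail.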
Abstract
Phase separation (PS) is essential in cellular processes and disease mechanisms, highlighting the need for predictive algorithms to analyze uncharacterized sequences and accelerate experimental validation. Current high-accuracy methods often rely on extensive annotations or handcrafted features, limiting their generalizability to sequences lacking such annotations and making it difficult to identify key protein regions involved in PS. We introduce Phase Separation’s Transfer-learning Prediction (PSTP), which combines conformational embeddings with large language model embeddings, enabling state-of-the-art PS predictions from protein sequences alone. PSTP performs well across various prediction scenarios and shows potential for predicting novel-designed artificial proteins. Additionally, PSTP provides residue-level predictions that are highly correlated with experimentally validated PS regions. By analyzing 160 000+ variants, PSTP characterizes the strong link between the incidence of pathogenic variants and residue-level PS propensities in unconserved intrinsically disordered regions, offering insights into underexplored mutation effects. PSTP’s sliding-window optimization reduces its memory usage to a few hundred megabytes, facilitating rapid execution on typical CPUs and GPUs. Offered via both a web server and an installable Python package, PSTP provides a versatile tool for decoding protein PS behavior and supporting disease-focused research.
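The abstract does not say how the sliding-window optimization is implemented. As a rough illustration of the general idea only, the sketch below scores a long sequence in overlapping windows and averages the per-residue scores where windows overlap, so that only one window is passed to the scorer at a time. The names `sliding_window_scores`, `score_window`, and `toy_scorer`, as well as the default window and stride sizes, are hypothetical and are not taken from PSTP; the toy scorer is a placeholder, not a biological model.

```python
from typing import Callable, List, Sequence


def sliding_window_scores(
    sequence: str,
    score_window: Callable[[str], Sequence[float]],
    window: int = 256,   # assumed window length, not PSTP's actual setting
    stride: int = 128,   # assumed step between window starts
) -> List[float]:
    """Average per-residue scores over overlapping windows.

    `score_window` is a hypothetical stand-in for any model that returns
    one score per residue of the fragment it receives.
    """
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    if stride > window:
        raise ValueError("stride must not exceed window, or residues are skipped")

    n = len(sequence)
    totals = [0.0] * n   # running sum of scores per residue
    counts = [0] * n     # number of windows covering each residue

    start = 0
    while start < n:
        end = min(start + window, n)
        # Only this fragment is handed to the scorer at any one time.
        fragment_scores = score_window(sequence[start:end])
        for offset, score in enumerate(fragment_scores):
            totals[start + offset] += score
            counts[start + offset] += 1
        if end == n:
            break
        start += stride

    return [total / count for total, count in zip(totals, counts)]


def toy_scorer(fragment: str) -> List[float]:
    """Placeholder scorer: 1.0 for G/Y/F/W/R residues, else 0.0."""
    return [1.0 if residue in "GYFWR" else 0.0 for residue in fragment]


if __name__ == "__main__":
    print(sliding_window_scores("GGAAYY", toy_scorer, window=4, stride=2))
```

In this sketch the memory needed by the scorer depends on the window length rather than on the full sequence length, which is the general reason a windowed approach can keep memory use low for long proteins.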