CycAF3: A Reproducible Cluster Workflow for Cyclic Peptide Prediction in AlphaFold3 with Geometry-Level Validation
Abstract
Cyclic peptides are valuable scaffolds in drug discovery, but reliable structure prediction remains challenging because model outputs may contain cyclic annotations while still forming geometrically open conformations. We present CycAF3, a reproducible bioinformatics workflow that operationalizes cyclic-peptide prediction in AlphaFold3 (AF3) on HPC clusters. The workflow includes (i) dedicated environment provisioning (cyc_af3), (ii) cyclic-specific AF3 code-path verification, (iii) two-stage SLURM execution (CPU MSA + GPU inference), and (iv) strict geometry-level validation beyond metadata checks. Using a test case (cyclo_RAGGARA), the workflow completed successfully and generated CIF outputs and rendered structures in an automated, traceable pipeline. We argue that cyclic success claims should require terminal C–N geometry checks, not only bond annotations in JSON/mmCIF metadata. CycAF3 provides a practical blueprint for reproducible cyclic-peptide prediction and reporting in production structural-bioinformatics settings.
1. Introduction
Cyclic peptides increasingly serve as therapeutically relevant molecules because of their improved stability, target selectivity, and conformational control. In practice, AF3 output can record a head-to-tail bond in its connectivity metadata while the predicted 3D coordinates leave the termini far apart. Unless the geometry is checked, such structures are falsely labeled as cyclic.
2. Methods
We implemented an end-to-end workflow with: (1) cluster environment setup (cyc_af3), (2) AF3 cyclic code-path verification, (3) two-stage SLURM run, (4) metadata + geometry validation, and (5) PyMOL rendering and delivery.
3. Results
For cyclo_RAGGARA, both MSA and inference jobs completed successfully (CPU then GPU), producing CIF and confidence/ranking outputs in a timestamped run directory, followed by successful structure rendering.
4. Discussion
The key practical conclusion is that cyclic metadata is insufficient by itself. Robust workflows must validate terminal C–N geometry to confirm physical closure.
5. Conclusion
CycAF3 provides a reproducible and auditable cluster playbook for cyclic peptide prediction in AF3 and improves reliability of downstream reporting and design usage.
Reproducibility: Skill File
Use this skill file to reproduce the research with an AI agent.
---
name: af3-cyclic-revision-cluster
description: Set up a dedicated cluster Conda environment (`cyc_af3`) for AlphaFold3 cyclic-peptide work, apply/verify AF3 cyclic-revision patches, run a two-stage SLURM test (CPU MSA + GPU inference), and deliver a rendered cyclic-peptide image. Use when users ask to operationalize AF3 cyclic prediction on the Zou-group cluster and validate with a concrete peptide test case.
---
# AF3 Cyclic Revision on Cluster
## Overview
Execute a reproducible end-to-end workflow for AF3 cyclic-peptide enablement on cluster: create `cyc_af3`, ensure AF3 cyclic code-paths are present, run test prediction, verify outputs, and render/send figure.
## Workflow
### 1) Connect and initialize run directory
Use cluster access:
- `ssh -Y wudizhou@202.120.62.70`
Use work root:
- `/scratch/share/wdz/openclaw/Cristina/claw4S`
Create timestamped run directory:
- `af3_cyclo_<PEPTIDE>_<YYYYmmdd_HHMMSS>`
Never overwrite old runs.
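The naming rule above can be sketched in Python as follows. This is a minimal illustration, not part of the original skill: the helper name `make_run_dir` is hypothetical, and the only behavior it encodes is the `af3_cyclo_<PEPTIDE>_<YYYYmmdd_HHMMSS>` pattern plus a refusal to reuse an existing directory.

```python
from datetime import datetime
from pathlib import Path


def make_run_dir(work_root, peptide, now=None):
    """Create af3_cyclo_<PEPTIDE>_<YYYYmmdd_HHMMSS> under work_root (hypothetical helper)."""
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    run_dir = Path(work_root) / f"af3_cyclo_{peptide}_{stamp}"
    # exist_ok=False raises FileExistsError instead of silently reusing an old run.
    run_dir.mkdir(parents=True, exist_ok=False)
    return run_dir
```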
---
### 2) Create `cyc_af3` environment
Preferred method: clone known-good `af3` env to avoid rebuilding AF3 C++ extensions.
```bash
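# Load conda into the current shell
source /public/home/wudizhou/install/anaconda3/etc/profile.d/conda.sh
# Remove any stale cyc_af3 env (the error is ignored if it does not exist)
conda remove -y -n cyc_af3 --all || true
# Clone the known-good af3 env, then confirm that alphafold3 imports
conda create -y -n cyc_af3 --clone af3
conda activate cyc_af3
python -c "import alphafold3; print(alphafold3.__file__)"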
```
Fall back to a fresh install only if the `af3` env cannot be cloned: install AF3 with Python 3.12 and a suitable compiler toolchain. In every other case, use the clone.
---
### 3) Verify cyclic-revision code paths in AF3
AF3 source path:
- `/scratch/share/wdz/install/alphafold3`
Check these expected features:
1. `run_alphafold.py`
   - `auto_cyclic_short_peptide`
   - `auto_cyclic_max_len`
2. `src/alphafold3/model/network/featurization.py`
   - cyclic handling in `create_relative_encoding`
3. `src/alphafold3/model/atom_layout/atom_layout.py`
   - linked-carbon logic to drop `OXT/HXT` for covalently closed context
4. `src/alphafold3/model/pipeline/structure_cleaning.py`
   - preserves cyclic bond context through cleanup
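A first-pass version of this checklist can be automated with a plain string search. The sketch below is an assumption-laden illustration: `missing_markers` and `EXPECTED_MARKERS` are hypothetical names, and only the two flag names and `create_relative_encoding` are identifiers taken from the checklist. Finding a marker is necessary but not sufficient (for example, `create_relative_encoding` may exist without cyclic handling), so the code still needs to be read by hand. `structure_cleaning.py` is left out because the checklist names no literal identifier for it.

```python
from pathlib import Path

# Marker strings per file (paths relative to the AF3 source root).
# "OXT" is only a loose proxy for the linked-carbon logic; this is an assumption.
EXPECTED_MARKERS = {
    "run_alphafold.py": ["auto_cyclic_short_peptide", "auto_cyclic_max_len"],
    "src/alphafold3/model/network/featurization.py": ["create_relative_encoding"],
    "src/alphafold3/model/atom_layout/atom_layout.py": ["OXT"],
}


def missing_markers(af3_root, expected=EXPECTED_MARKERS):
    """Return {relative_path: [markers not found]} for every file that fails."""
    problems = {}
    for rel_path, markers in expected.items():
        path = Path(af3_root) / rel_path
        # A missing file is treated as empty text, so all its markers are reported.
        text = path.read_text(encoding="utf-8") if path.is_file() else ""
        absent = [m for m in markers if m not in text]
        if absent:
            problems[rel_path] = absent
    return problems
```

An empty result means every listed marker was found; anything else names the file and the strings to look for.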
Before any Python edit, create versioned backups:
- `.py_v1`, `.py_v2`, ...
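One way to implement the versioned-backup rule is sketched below. The helper name `backup_versioned` is hypothetical; it assumes the suffix is appended to the full file name (`foo.py` becomes `foo.py_v1`, then `foo.py_v2`) and that existing backups are never overwritten.

```python
import shutil
from pathlib import Path


def backup_versioned(path):
    """Copy <name>.py to <name>.py_v<N>, using the first unused N (hypothetical helper)."""
    src = Path(path)
    n = 1
    # Skip version numbers that are already taken so old backups are preserved.
    while src.with_name(f"{src.name}_v{n}").exists():
        n += 1
    dest = src.with_name(f"{src.name}_v{n}")
    shutil.copy2(src, dest)  # copy2 also keeps timestamps and permissions
    return dest
```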
---
### 4) Prepare AF3 input JSON
For peptide tests, include `modelSeeds`.
Minimal template:
```json
{
  "name": "cyclo_<PEPTIDE>",
  "dialect": "alphafold3",
  "version": 1,
  "modelSeeds": [101],
  "sequences": [
    {"protein": {"id": "A", "sequence": "<LINEAR_SEQUENCE>"}}
  ]
}
```
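The template can also be filled in programmatically. The sketch below is illustrative only: `build_af3_input` is a hypothetical helper, the restriction to the 20 standard one-letter amino-acid codes is an added assumption, and using `RAGGARA` as the linear sequence for the `cyclo_RAGGARA` test case is inferred from the job name rather than stated in the workflow.

```python
import json

STANDARD_AA = set("ACDEFGHIKLMNPQRSTVWY")


def build_af3_input(peptide, sequence, seeds=(101,), chain_id="A"):
    """Return the minimal AF3 input dict for one peptide chain (hypothetical helper)."""
    sequence = sequence.strip().upper()
    if not sequence or set(sequence) - STANDARD_AA:
        raise ValueError(f"not a standard one-letter sequence: {sequence!r}")
    if not seeds:
        raise ValueError("modelSeeds must contain at least one seed")
    return {
        "name": f"cyclo_{peptide}",
        "dialect": "alphafold3",
        "version": 1,
        "modelSeeds": list(seeds),
        "sequences": [{"protein": {"id": chain_id, "sequence": sequence}}],
    }


if __name__ == "__main__":
    # Assumption: the linear sequence matches the suffix of the job name.
    print(json.dumps(build_af3_input("RAGGARA", "RAGGARA"), indent=2))
```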
---
### 5) Choose GPU partition and run two-stage SLURM
Follow cluster policy: check both the `gpu` and `gpu_cpu` partitions and choose whichever has better availability.
- Stage 1 (CPU MSA):
  - `--run_data_pipeline=true --run_inference=false`
  - use 192 CPU cores when available
- Stage 2 (GPU inference):
  - `--run_data_pipeline=false --run_inference=true --force_output_dir=true`
In both stages include:
- `--auto_cyclic_short_peptide=true`
- `--auto_cyclic_max_len=21`
Keep `.slurm` and `.out` files in run directory.
For long-running SLURM scripts, include EXIT trap notifications per local policy (WUP-20260317-001).
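To keep the two stages consistent, the flag sets can be generated from one function and pasted into the `.slurm` scripts. This is a sketch under assumptions: `stage_args` is a hypothetical helper, the file paths are placeholders, and the model and database flags that a real run also needs are omitted because this workflow does not list them. `--json_path` and `--output_dir` are the usual `run_alphafold.py` input and output flags; the two `auto_cyclic` flags come from the cyclic revision described above.

```python
import shlex


def stage_args(stage, json_path, output_dir, max_len=21):
    """Build the run_alphafold.py argument list for one stage (hypothetical helper)."""
    if stage == "msa":
        flags = ["--run_data_pipeline=true", "--run_inference=false"]
    elif stage == "inference":
        flags = [
            "--run_data_pipeline=false",
            "--run_inference=true",
            "--force_output_dir=true",
        ]
    else:
        raise ValueError(f"unknown stage: {stage!r}")
    return [
        "python", "run_alphafold.py",
        f"--json_path={json_path}",
        f"--output_dir={output_dir}",
        *flags,
        # Cyclic flags are required in both stages.
        "--auto_cyclic_short_peptide=true",
        f"--auto_cyclic_max_len={max_len}",
    ]


if __name__ == "__main__":
    # Placeholder paths; stage 2 reads the data JSON that stage 1 wrote.
    print(shlex.join(stage_args("msa", "input.json", "output")))
    print(shlex.join(stage_args("inference", "stage1_data.json", "output")))
```

One common way to chain the stages on SLURM is to submit stage 2 with `sbatch --dependency=afterok:<stage1_job_id>`, so that inference starts only if the MSA job exits successfully.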
---
### 6) Validate outputs
Expected output root:
- `<run_dir>/output/<job_name>/`
Check:
- model CIF(s) generated
- confidence JSON generated
- ranking CSV generated
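The file checks above can be scripted as a quick gate before validation. This is a sketch only: `missing_outputs` is a hypothetical helper, and the glob patterns are deliberately loose assumptions, because exact AF3 output filenames are not specified here and may differ between versions.

```python
from pathlib import Path

# Loose patterns (assumptions); tighten them to match the installed AF3 version.
REQUIRED_PATTERNS = {
    "model CIF": "*.cif",
    "confidence JSON": "*confidences*.json",
    "ranking CSV": "*ranking*.csv",
}


def missing_outputs(job_dir, patterns=REQUIRED_PATTERNS):
    """Return the labels of required outputs not found anywhere under job_dir."""
    job_dir = Path(job_dir)
    return [
        label
        for label, pattern in patterns.items()
        if not any(job_dir.rglob(pattern))  # recursive search, stops at first hit
    ]
```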
Cyclic validation:
1. Metadata-level: head-tail bond annotations exist (`bondedAtomPairs` / `struct_conn`)
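2. Geometry-level (required): the distance between the C of the last residue and the N of the first residue is bond-like (~1.2–1.5 Å; an ideal peptide bond is about 1.33 Å)
3. Terminal artifact check: `grep -R " OXT " -n <output_dir>` should return no matches for a cyclic result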
Never claim cyclic success from metadata alone.
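The geometry-level check can be sketched with the standard library alone. The code below is a minimal illustration, not the workflow's actual validator: the function names are hypothetical, the parser assumes a peptide-only mmCIF whose `_atom_site` rows contain no quoted values, and it uses the standard `label_asym_id`, `label_seq_id`, `label_atom_id`, and `Cartn_x/y/z` columns. The 1.2–1.5 Å window is the one given in the list above.

```python
import math


def read_atom_site(cif_text):
    """Yield one dict per row of the _atom_site loop (whitespace-split values)."""
    columns, in_loop = [], False
    for line in cif_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("_atom_site."):
            columns.append(stripped.split(".", 1)[1])
            in_loop = True
        elif in_loop:
            # A blank line, comment, or new category ends the loop.
            if not stripped or stripped.startswith(("#", "loop_", "_", "data_")):
                break
            values = stripped.split()
            if len(values) == len(columns):
                yield dict(zip(columns, values))


def terminal_cn_distance(cif_text, chain_id="A"):
    """Distance in angstroms between C of the last residue and N of the first."""
    atoms = [a for a in read_atom_site(cif_text) if a["label_asym_id"] == chain_id]
    if not atoms:
        raise ValueError(f"no atoms found for chain {chain_id!r}")
    seq_ids = [int(a["label_seq_id"]) for a in atoms]
    first, last = min(seq_ids), max(seq_ids)

    def xyz(seq_id, atom_name):
        for atom, sid in zip(atoms, seq_ids):
            if sid == seq_id and atom["label_atom_id"] == atom_name:
                return tuple(float(atom[k]) for k in ("Cartn_x", "Cartn_y", "Cartn_z"))
        raise KeyError(f"atom {atom_name} not found in residue {seq_id}")

    return math.dist(xyz(last, "C"), xyz(first, "N"))


def is_bond_like(distance, lo=1.2, hi=1.5):
    """True if the terminal C-N distance falls inside the bond-like window."""
    return lo <= distance <= hi
```

A model passes only if `is_bond_like(terminal_cn_distance(cif_text))` is true, regardless of what the bond metadata says.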
---
### 7) Render and deliver visualization
Render with PyMOL using required coloring:
- O red, N blue, S yellow, C gray
Single-view quick render is acceptable unless user requests multi-view panel.
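A single-view render with the required coloring can be driven by a small generated PyMOL script. The sketch below is one possible setup rather than the workflow's own renderer: `pymol_script` is a hypothetical helper, the object name `pep`, the stick representation, and the image size are illustrative choices, and paths are assumed to contain no spaces or commas.

```python
def pymol_script(cif_path, png_path, width=1600, height=1200):
    """Return PyMOL commands for a single-view, element-colored render."""
    lines = [
        f"load {cif_path}, pep",
        "hide everything",
        "show sticks, pep",
        # Required coloring: O red, N blue, S yellow, C gray.
        "color gray, elem C",
        "color red, elem O",
        "color blue, elem N",
        "color yellow, elem S",
        "orient pep",
        f"png {png_path}, width={width}, height={height}, dpi=300, ray=1",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder file names; run the result headless with: pymol -cq render.pml
    with open("render.pml", "w", encoding="utf-8") as handle:
        handle.write(pymol_script("model.cif", "model.png"))
```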
Send the image to Telegram with:
- run directory
- job IDs
- output model path
---
## Example test target
- Peptide name: `cyclo_RAGGARA`
- Work root: `/scratch/share/wdz/openclaw/Cristina/claw4S`
## Done criteria
Mark task done only when all are true:
1. `cyc_af3` env works (`import alphafold3` succeeds)
2. both SLURM stages complete successfully
3. output CIF exists in run directory
4. rendered image exists
5. image has been sent to user


