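Biomedical AI Agent Skills Collection (1,676 skills)
One-sentence overview: Awesome Bio Agent Skills is a curated collection of AI agent skills for biomedical research, covering genomics, proteomics, single-cell analysis, clinical AI, and protein design.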
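Core Features and Key Concepts
Project Goals
- Resource consolidation: gathers reusable AI agent skills for biomedical research so that work is not duplicated.
- Standardized format: each skill lives in its own folder containing a SKILL.md file that follows a common template, so agent frameworks can load and invoke it.
- Cross-domain coverage: spans genomics, proteomics, single-cell analysis, clinical medicine, protein design, and other key areas.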
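Data Sources
The skills come from 20 public repositories, including but not limited to:
- bioskills
- Other repositories not listed here (the Sources section shows the full list)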
Skill Categories (15)
- Genomics (526 skills)
- Proteomics
- Single-Cell Analysis
- Biology and AI
- Clinical and Medical
- Transcriptomics
- Database Query
- Multi-Omics Integration
- Bioinformatics Utilities
- Visualization
- Workflow Orchestration
- Epigenomics
- Pathway Analysis
- Metagenomics
- Protein Design
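Skill Format
Each skill folder contains:
- SKILL.md: the skill description, inputs and outputs, dependencies, usage, and examples.
- Optional additional files (such as configuration files or test data).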
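The folder layout described above can be checked with a few lines of Python. This is a minimal sketch that only assumes what the text states (one folder per skill, each holding a SKILL.md file); the function name `find_skills` and the directory passed to it are hypothetical, and nothing is assumed about the contents of SKILL.md.

```python
from pathlib import Path


def find_skills(skills_root):
    """Return the sorted names of sub-folders that contain a SKILL.md file.

    `skills_root` is a hypothetical path to a local copy of the skills
    directory; folders without SKILL.md and loose files are ignored.
    """
    root = Path(skills_root)
    return sorted(
        entry.name
        for entry in root.iterdir()
        if entry.is_dir() and (entry / "SKILL.md").is_file()
    )
```

Calling `find_skills("skills")` on a local checkout would list the folders that look like valid skills, which is a quick way to spot folders missing their SKILL.md.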
Example skills (a partial list from the Genomics category):
| Skill | Source | Description |
|---|---|---|
| bio-alignment-amplicon-clipping | bioskills | Trim PCR primers from aligned reads in amplicon-panel BAMs using samtools ampliconclip. Use when processing SARS-CoV-2 ARTIC, hereditary cancer pan... |
| bio-alignment-filtering | bioskills | Filter alignments by flags, mapping quality, and regions using samtools view and pysam. Use when extracting specific reads, removing low-quality al... |
| bio-alignment-indexing | bioskills | Create and use BAI/CSI indices for BAM/CRAM files using samtools and pysam. Use when enabling random access to alignment files or fetching specific... |
| bio-alignment-io | bioskills | Read, write, and convert multiple sequence alignment files using Biopython Bio.AlignIO. Supports Clustal, PHYLIP, Stockholm, FASTA, Nexus, and othe... |
| bio-alignment-msa-parsing | bioskills | Parse and analyze multiple sequence alignments using Biopython. Extract sequences, identify conserved regions, analyze gaps, work with annotations,... |
| bio-alignment-msa-statistics | bioskills | Calculate alignment statistics including sequence identity, conservation scores, substitution matrices, and similarity metrics. Use when comparing... |
| bio-alignment-multiple | bioskills | Perform multiple sequence alignment using MAFFT, MUSCLE5, ClustalOmega, or T-Coffee. Guides tool and algorithm selection based on dataset size, seq... |
| bio-alignment-pairwise | bioskills | Perform pairwise sequence alignment using Biopython Bio.Align.PairwiseAligner. Use when comparing two sequences, finding optimal alignments, scorin... |
| bio-alignment-sorting | bioskills | Sort alignment files by coordinate or read name using samtools and pysam. Use when preparing BAM files for indexing, variant calling, or paired-end... |
| bio-alignment-structural | bioskills | Align protein structures using Foldseek 3Di, TM-align, US-align, DALI, or Foldmason for structural MSA. Predict, score, and superpose backbone coor... |
| bio-alignment-trimming | bioskills | Trim multiple sequence alignments using ClipKIT, trimAl, BMGE, Divvier, or HMMcleaner with mode selection guidance per downstream goal. Use when re... |
| bio-alignment-validation | bioskills | Validate alignment quality with insert size distribution, proper pairing rates, GC bias, strand balance, and other post-alignment metrics. Use when... |
| bio-atac-seq-allele-specific-accessibility | bioskills | Detect allele-specific chromatin accessibility from ATAC-seq using WASP, GATK ASEReadCounter, or RASQUAL. Use when mapping cis-regulatory genetic v... |
| bio-atac-seq-atac-peak-calling | bioskills | Call accessible chromatin regions from ATAC-seq BAM files using MACS3, MACS2, Genrich, or HMMRATAC. Use when identifying open chromatin from aligne... |
| bio-atac-seq-consensus-peakset | bioskills | Build a differential-ready consensus peakset from per-replicate ATAC-seq peaks using iterative overlap removal, fixed-width re-centering, and major... |
| bio-bam-statistics | bioskills | Generate alignment statistics using samtools flagstat, stats, depth, coverage, and mosdepth. Use when assessing alignment quality, calculating cove... |
| bio-basecalling | bioskills | Convert raw Nanopore signal data (FAST5/POD5) to nucleotide sequences using Dorado basecaller. Covers model selection, GPU acceleration, modified b... |
| bio-bedgraph-handling | bioskills | Create, manipulate, and convert bedGraph files for genome browser visualization. Covers bedGraph format, conversion to/from bigWig, normalization,... |
| bio-biomart-queries | bioskills | Bulk-query Ensembl BioMart (and other BioMart instances) for cross-database ID mapping, gene/transcript/exon coordinates, and ortholog tables. Use... |
| bio-causal-genomics-colocalization-analysis | bioskills | Test whether two or more traits share a causal variant at a locus using Bayesian colocalization (coloc.abf, coloc.susie, HyPrColoc, moloc, eCAVIAR,... |
| bio-causal-genomics-effector-gene-prioritization | bioskills | Maps GWAS-implicated loci to candidate effector (causal) genes by integrating variant-to-gene (V2G) features via Open Targets L2G (Mountjoy 2021),... |
| bio-causal-genomics-fine-mapping | bioskills | Resolves GWAS associations to candidate causal variants and credible sets via SuSiE, susie_rss, FINEMAP, CAVIAR, DAP-G, PAINTOR, PolyFun, SuSiEx, M... |
| bio-causal-genomics-genetic-correlation | bioskills | Estimate bivariate genetic correlation (rg) between traits from GWAS summary statistics or individual-level genotypes using cross-trait LDSC, HDL,... |
| bio-causal-genomics-genomic-sem | bioskills | Fits structural equation models to GWAS summary statistics using GenomicSEM (Grotzinger 2019), including common-factor models, confirmatory factor... |
| bio-causal-genomics-mediation-analysis | bioskills | Decompose total effects into direct and indirect paths through mediators using mediation, CMAverse 4-way, HIMA/HIMA2 high-dimensional, BAMA, two-st... |
| bio-causal-genomics-mendelian-randomization | bioskills | Estimate causal effects of an exposure on an outcome from GWAS summary statistics using genetic instruments. Implements IVW (fixed/random), MR-Egge... |
| bio-causal-genomics-pleiotropy-detection | bioskills | Detect and adjust for horizontal pleiotropy in two-sample Mendelian randomization by distinguishing uncorrelated (UHP) from correlated (CHP) pleiot... |
| bio-causal-genomics-proteome-mr-drug-target | bioskills | Runs cis-pQTL Mendelian randomization for drug-target validation using UKB-PPP (Olink), deCODE (SomaScan), Fenland, INTERVAL, ARIC, and FinnGen-PPP... |
| bio-causal-genomics-transcriptome-wide-association | bioskills | Performs gene-level association from GWAS summary statistics via genetically predicted tissue expression using FUSION, PrediXcan, S-PrediXcan, S-Mu... |
| bio-cfdna-preprocessing | bioskills | Preprocesses cell-free DNA sequencing data including adapter trimming, alignment optimized for short fragments, and UMI-aware duplicate removal usi... |
| bio-chipseq-allele-specific-binding | bioskills | Detects allele-specific transcription factor or histone modification binding from heterozygous-variant ChIP-seq using WASP (reference-bias filter;... |
| bio-chipseq-chip-deep-learning | bioskills | Trains and applies base-resolution deep learning models on ChIP-seq / ChIP-nexus / CUT&RUN data. Uses BPNet (Avsec ...) |
(The table above shows only part of the Genomics category; for the full list, see the bioskill_index_v3.csv file in the project repository.)
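To see how skills are distributed across categories, the index file can be tallied with the standard library. The sketch below is illustrative only: the column name `category` is an assumption, since the actual header of bioskill_index_v3.csv is not shown here, so the function raises a clear error if the column is absent.

```python
import csv
from collections import Counter


def count_by_column(csv_path, column="category"):
    """Count rows per distinct value of `column` in a CSV index file.

    The default column name "category" is an assumption; pass the real
    header name once you have inspected the file.
    """
    with open(csv_path, newline="", encoding="utf-8") as fh:
        reader = csv.DictReader(fh)
        fields = reader.fieldnames or []
        if column not in fields:
            raise KeyError(f"column {column!r} not found; available: {fields}")
        return Counter(row[column] for row in reader)
```

For example, `count_by_column("bioskill_index_v3.csv").most_common()` would return the categories ordered by skill count, provided the column name matches.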
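Usage
The collection is designed as a skill library for agent frameworks. You can:
- Clone the repository locally.
- Copy the subfolders under the skills directory (one per skill) into the skills directory of your agent framework (such as OpenClaw, NanoClaw, or Biomni).
- The agent will automatically detect the SKILL.md files and invoke the corresponding tools based on their descriptions.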
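The copy step above can be scripted. The following is a minimal sketch, not an official installer: the function name `install_skills` and both directory arguments are hypothetical, and the location of the skills directory for each framework must be taken from that framework's own documentation.

```python
import shutil
from pathlib import Path


def install_skills(src_root, dest_root, overwrite=False):
    """Copy every skill folder (one containing SKILL.md) into dest_root.

    Returns the names of the folders that were copied. Existing targets
    are skipped unless overwrite=True. Both paths are hypothetical.
    """
    src, dest = Path(src_root), Path(dest_root)
    dest.mkdir(parents=True, exist_ok=True)
    installed = []
    for folder in sorted(src.iterdir()):
        # Only folders that follow the documented layout count as skills.
        if not (folder.is_dir() and (folder / "SKILL.md").is_file()):
            continue
        target = dest / folder.name
        if target.exists() and not overwrite:
            continue
        shutil.copytree(folder, target, dirs_exist_ok=True)
        installed.append(folder.name)
    return installed
```

Skipping existing folders by default keeps local edits to an already installed skill from being silently replaced.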
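Contributing
- New skill folders are welcome (follow the SKILL.md template).
- Reports of missing or incorrect skills are welcome.
- See CONTRIBUTING.md in the repository (if it exists).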
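FAQ
Q1: Are the skills independent of one another? Yes. Each skill is a self-contained folder with the description and code it needs.
Q2: Which agent frameworks do the skills target? They are compatible with Claude-based agent frameworks, with explicit support for OpenClaw, NanoClaw, and Biomni. Other frameworks may also work but have not been formally tested.
Q3: Is the skill count accurate? The collection contains 1,676 deduplicated skills; the index file is bioskill_index_v3.csv.
Q4: Can the skills be run locally as-is? A skill may include Python scripts or shell commands, but these depend on external tools (such as samtools or Biopython). Read the dependency notes in the corresponding SKILL.md.
Q5: Which are the 20 source repositories? See the Sources section (## Sources) in the project repository.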
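Regarding Q4, a quick pre-flight check can show whether the tools a skill depends on are present before an agent tries to run it. This sketch uses only the Python standard library; the function name `check_dependencies` is hypothetical, and the lists of executables and modules must come from the skill's own SKILL.md.

```python
import importlib.util
import shutil


def check_dependencies(executables=(), modules=()):
    """Return a list of missing dependencies, empty if all were found.

    `executables` are command names looked up on PATH; `modules` are
    top-level Python import names (for example "Bio" for Biopython).
    """
    missing = []
    for exe in executables:
        if shutil.which(exe) is None:
            missing.append(f"executable:{exe}")
    for mod in modules:
        if importlib.util.find_spec(mod) is None:
            missing.append(f"module:{mod}")
    return missing
```

For a skill that relies on samtools and Biopython, `check_dependencies(["samtools"], ["Bio"])` returns an empty list when both are available and names whichever is missing otherwise.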
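Quick Reference
| Category | Skill count (example) | Notes |
|---|---|---|
| Genomics | 526 | WGS/WES analysis, variant annotation, GWAS, CNV, structural variants, haplotype phasing, genome assembly |
| Proteomics | Not listed | Proteomics |
| Single-Cell Analysis | Not listed | Single-cell analysis |
| Biology and AI | Not listed | Biology combined with artificial intelligence |
| Clinical and Medical | Not listed | Clinical and medical applications |
| Transcriptomics | Not listed | Transcriptomics |
| Database Query | Not listed | Database queries |
| Multi-Omics Integration | Not listed | Multi-omics integration |
| Bioinformatics Utilities | Not listed | Bioinformatics tools |
| Visualization | Not listed | Visualization |
| Workflow Orchestration | Not listed | Workflow orchestration |
| Epigenomics | Not listed | Epigenomics |
| Pathway Analysis | Not listed | Pathway analysis |
| Metagenomics | Not listed | Metagenomics |
| Protein Design | Not listed | Protein design |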
Note: apart from Genomics, none of the counts above are given in the source; treat bioskill_index_v3.csv in the project repository as authoritative.
License
This project is released under the CC0 1.0 Universal license, which places it in the public domain.