
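Biomedical AI Agent Skill Collection (1,676 skills)

Overview in one sentence: Awesome Bio Agent Skills is a curated collection of AI Agent skills for biomedical research, covering genomics, proteomics, single-cell analysis, clinical AI, protein design, and related fields.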

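Core Features and Key Concepts

Project Goals

  • Resource consolidation: gathers reusable AI Agent skills for biomedical research so that work is not duplicated.
  • Standardized format: each skill lives in its own folder containing a SKILL.md file that follows a uniform template, which makes it easy for Agent frameworks to load and invoke.
  • Cross-domain coverage: spans key areas including genomics, proteomics, single-cell analysis, clinical medicine, and protein design.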

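Data Sources

Skills come from 20 public repositories, including but not limited to:

  • bioskills
  • Other repositories not named here (the Sources section shows the full list)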

Skill Categories (15)

  1. Genomics: 526 skills
  2. Proteomics
  3. Single-Cell Analysis
  4. Biology and AI
  5. Clinical and Medical
  6. Transcriptomics
  7. Database Query
  8. Multi-Omics Integration
  9. Bioinformatics Utilities
  10. Visualization
  11. Workflow Orchestration
  12. Epigenomics
  13. Pathway Analysis
  14. Metagenomics
  15. Protein Design

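Skill Format

Each skill folder contains:

  • SKILL.md: the skill description, inputs and outputs, dependencies, usage instructions, examples, and so on.
  • Optional additional files (such as configuration files or test data).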
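The folder layout described above can be checked with a few lines of Python. This is a minimal sketch, assuming only what the text states (a skill is a folder that contains a SKILL.md file); the function name `find_skills` and the root path passed to it are hypothetical and not part of the project.

```python
from pathlib import Path


def find_skills(skills_root):
    """Return the sorted names of subfolders that contain a SKILL.md file."""
    root = Path(skills_root)
    return sorted(
        child.name
        for child in root.iterdir()
        # A folder counts as a skill only if SKILL.md sits directly inside it.
        if child.is_dir() and (child / "SKILL.md").is_file()
    )
```

Calling `find_skills("skills")` from the root of a local clone would list the skill folder names, assuming the skills sit under a directory named skills as the usage steps suggest.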

Skill examples (a partial list of skills in the Genomics category):

| Skill | Source | Description |
| --- | --- | --- |
| bio-alignment-amplicon-clipping | bioskills | Trim PCR primers from aligned reads in amplicon-panel BAMs using samtools ampliconclip. Use when processing SARS-CoV-2 ARTIC, hereditary cancer pan... |
| bio-alignment-filtering | bioskills | Filter alignments by flags, mapping quality, and regions using samtools view and pysam. Use when extracting specific reads, removing low-quality al... |
| bio-alignment-indexing | bioskills | Create and use BAI/CSI indices for BAM/CRAM files using samtools and pysam. Use when enabling random access to alignment files or fetching specific... |
| bio-alignment-io | bioskills | Read, write, and convert multiple sequence alignment files using Biopython Bio.AlignIO. Supports Clustal, PHYLIP, Stockholm, FASTA, Nexus, and othe... |
| bio-alignment-msa-parsing | bioskills | Parse and analyze multiple sequence alignments using Biopython. Extract sequences, identify conserved regions, analyze gaps, work with annotations,... |
| bio-alignment-msa-statistics | bioskills | Calculate alignment statistics including sequence identity, conservation scores, substitution matrices, and similarity metrics. Use when comparing... |
| bio-alignment-multiple | bioskills | Perform multiple sequence alignment using MAFFT, MUSCLE5, ClustalOmega, or T-Coffee. Guides tool and algorithm selection based on dataset size, seq... |
| bio-alignment-pairwise | bioskills | Perform pairwise sequence alignment using Biopython Bio.Align.PairwiseAligner. Use when comparing two sequences, finding optimal alignments, scorin... |
| bio-alignment-sorting | bioskills | Sort alignment files by coordinate or read name using samtools and pysam. Use when preparing BAM files for indexing, variant calling, or paired-end... |
| bio-alignment-structural | bioskills | Align protein structures using Foldseek 3Di, TM-align, US-align, DALI, or Foldmason for structural MSA. Predict, score, and superpose backbone coor... |
| bio-alignment-trimming | bioskills | Trim multiple sequence alignments using ClipKIT, trimAl, BMGE, Divvier, or HMMcleaner with mode selection guidance per downstream goal. Use when re... |
| bio-alignment-validation | bioskills | Validate alignment quality with insert size distribution, proper pairing rates, GC bias, strand balance, and other post-alignment metrics. Use when... |
| bio-atac-seq-allele-specific-accessibility | bioskills | Detect allele-specific chromatin accessibility from ATAC-seq using WASP, GATK ASEReadCounter, or RASQUAL. Use when mapping cis-regulatory genetic v... |
| bio-atac-seq-atac-peak-calling | bioskills | Call accessible chromatin regions from ATAC-seq BAM files using MACS3, MACS2, Genrich, or HMMRATAC. Use when identifying open chromatin from aligne... |
| bio-atac-seq-consensus-peakset | bioskills | Build a differential-ready consensus peakset from per-replicate ATAC-seq peaks using iterative overlap removal, fixed-width re-centering, and major... |
| bio-bam-statistics | bioskills | Generate alignment statistics using samtools flagstat, stats, depth, coverage, and mosdepth. Use when assessing alignment quality, calculating cove... |
| bio-basecalling | bioskills | Convert raw Nanopore signal data (FAST5/POD5) to nucleotide sequences using Dorado basecaller. Covers model selection, GPU acceleration, modified b... |
| bio-bedgraph-handling | bioskills | Create, manipulate, and convert bedGraph files for genome browser visualization. Covers bedGraph format, conversion to/from bigWig, normalization,... |
| bio-biomart-queries | bioskills | Bulk-query Ensembl BioMart (and other BioMart instances) for cross-database ID mapping, gene/transcript/exon coordinates, and ortholog tables. Use... |
| bio-causal-genomics-colocalization-analysis | bioskills | Test whether two or more traits share a causal variant at a locus using Bayesian colocalization (coloc.abf, coloc.susie, HyPrColoc, moloc, eCAVIAR,... |
| bio-causal-genomics-effector-gene-prioritization | bioskills | Maps GWAS-implicated loci to candidate effector (causal) genes by integrating variant-to-gene (V2G) features via Open Targets L2G (Mountjoy 2021),... |
| bio-causal-genomics-fine-mapping | bioskills | Resolves GWAS associations to candidate causal variants and credible sets via SuSiE, susie_rss, FINEMAP, CAVIAR, DAP-G, PAINTOR, PolyFun, SuSiEx, M... |
| bio-causal-genomics-genetic-correlation | bioskills | Estimate bivariate genetic correlation (rg) between traits from GWAS summary statistics or individual-level genotypes using cross-trait LDSC, HDL,... |
| bio-causal-genomics-genomic-sem | bioskills | Fits structural equation models to GWAS summary statistics using GenomicSEM (Grotzinger 2019), including common-factor models, confirmatory factor... |
| bio-causal-genomics-mediation-analysis | bioskills | Decompose total effects into direct and indirect paths through mediators using mediation, CMAverse 4-way, HIMA/HIMA2 high-dimensional, BAMA, two-st... |
| bio-causal-genomics-mendelian-randomization | bioskills | Estimate causal effects of an exposure on an outcome from GWAS summary statistics using genetic instruments. Implements IVW (fixed/random), MR-Egge... |
| bio-causal-genomics-pleiotropy-detection | bioskills | Detect and adjust for horizontal pleiotropy in two-sample Mendelian randomization by distinguishing uncorrelated (UHP) from correlated (CHP) pleiot... |
| bio-causal-genomics-proteome-mr-drug-target | bioskills | Runs cis-pQTL Mendelian randomization for drug-target validation using UKB-PPP (Olink), deCODE (SomaScan), Fenland, INTERVAL, ARIC, and FinnGen-PPP... |
| bio-causal-genomics-transcriptome-wide-association | bioskills | Performs gene-level association from GWAS summary statistics via genetically predicted tissue expression using FUSION, PrediXcan, S-PrediXcan, S-Mu... |
| bio-cfdna-preprocessing | bioskills | Preprocesses cell-free DNA sequencing data including adapter trimming, alignment optimized for short fragments, and UMI-aware duplicate removal usi... |
| bio-chipseq-allele-specific-binding | bioskills | Detects allele-specific transcription factor or histone modification binding from heterozygous-variant ChIP-seq using WASP (reference-bias filter;... |
| bio-chipseq-chip-deep-learning | bioskills | Trains and applies base-resolution deep learning models on ChIP-seq / ChIP-nexus / CUT&RUN data. Uses BPNet (Avsec ...) |

(The table above shows only part of the Genomics category. For the full list, see the bioskill_index_v3.csv file in the project repository.)
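Since the full list lives in a CSV index, a short script can summarize it. The sketch below counts rows grouped by one column using only the standard library. The column names in bioskill_index_v3.csv are not given in this document, so the `column` argument is an assumption that must be matched to the real header row; the function raises an error listing the available columns if the name is wrong.

```python
import csv
from collections import Counter


def count_by_column(csv_path, column):
    """Count the rows of a CSV file grouped by the value in `column`."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        fieldnames = reader.fieldnames or []
        # The real header of the index file is unknown here, so fail loudly
        # instead of guessing when the requested column is absent.
        if column not in fieldnames:
            raise KeyError(f"column {column!r} not found; available: {fieldnames}")
        return Counter(row[column] for row in reader)
```

For example, `count_by_column("bioskill_index_v3.csv", "Source")` would tally skills per source repository, provided the file actually has a column with that name.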

Usage

This collection is designed to serve as a "skill library" for Agent frameworks. You can:

  1. Clone the repository locally:
    git clone https://github.com/BioTender-max/awesome-bio-agent-skills.git
    
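  2. Copy the subfolders of the skills directory (one per skill) into the skills directory of your Agent framework (such as OpenClaw, NanoClaw, or Biomni).
  3. The Agent will automatically detect the SKILL.md files and invoke the corresponding tools based on their descriptions.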
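The copy step can also be scripted. The following is a minimal sketch, assuming a local clone and a destination folder of your choosing; the function name `install_skills` is hypothetical, and the actual location of the skills directory for OpenClaw, NanoClaw, or Biomni is not stated here and has to come from each framework's own documentation.

```python
import shutil
from pathlib import Path


def install_skills(skills_root, target_dir):
    """Copy every skill folder (one containing SKILL.md) into target_dir."""
    source = Path(skills_root)
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    installed = []
    for child in sorted(source.iterdir()):
        if child.is_dir() and (child / "SKILL.md").is_file():
            # dirs_exist_ok=True (Python 3.8+) lets a re-run refresh an existing copy.
            shutil.copytree(child, target / child.name, dirs_exist_ok=True)
            installed.append(child.name)
    return installed
```

The function returns the names of the folders it copied, which makes it easy to confirm that the expected skills arrived before starting the Agent.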

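How to Contribute

  • New skill folders are welcome (follow the SKILL.md template).
  • Reports of missing or incorrect skills are welcome.
  • See CONTRIBUTING.md in the repository (if it exists).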

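FAQ

Q1: Are all skills independent? Yes. Each skill is a self-contained folder that includes the descriptions and code it needs.

Q2: Which Agent frameworks are the skills built for? They are compatible with Claude-based Agent frameworks, with explicit support for OpenClaw, NanoClaw, and Biomni. Other frameworks may also work but have not been formally tested.

Q3: Is the skill count accurate? The collection contains 1,676 deduplicated skills; the index file is bioskill_index_v3.csv.

Q4: Can these skills be run locally as-is? A skill may contain Python scripts or shell commands, but it depends on external tools (such as samtools or Biopython). Read the dependency notes in the corresponding SKILL.md.

Q5: Which are the 20 source repositories? See the Sources section of the project repository (heading: ## Sources).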
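Regarding Q4, it can help to verify that a skill's external dependencies are present before running it. This is a rough sketch using only the standard library; the function name `check_dependencies` is hypothetical, and the lists of tools and modules must be taken from the skill's own SKILL.md, since this document does not define a machine-readable dependency format.

```python
import importlib.util
import shutil


def check_dependencies(commands=(), modules=()):
    """Return the command-line tools and Python modules that are missing."""
    # shutil.which returns None when an executable is not on PATH.
    missing_commands = [name for name in commands if shutil.which(name) is None]
    # find_spec returns None when a top-level module cannot be located.
    missing_modules = [
        name for name in modules if importlib.util.find_spec(name) is None
    ]
    return {"commands": missing_commands, "modules": missing_modules}
```

For instance, `check_dependencies(commands=["samtools"], modules=["Bio", "pysam"])` reports which of these are absent; note that Biopython is imported under the module name `Bio`.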

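Quick Reference

| Category | Number of skills (example) | Notes |
| --- | --- | --- |
| Genomics | 526 | WGS/WES analysis, variant annotation, GWAS, CNV, structural variants, haplotype phasing, genome assembly |
| Proteomics | Not listed | Proteomics |
| Single-Cell Analysis | Not listed | Single-cell analysis |
| Biology and AI | Not listed | Biology combined with artificial intelligence |
| Clinical and Medical | Not listed | Clinical and medical |
| Transcriptomics | Not listed | Transcriptomics |
| Database Query | Not listed | Database queries |
| Multi-Omics Integration | Not listed | Multi-omics integration |
| Bioinformatics Utilities | Not listed | Bioinformatics utilities |
| Visualization | Not listed | Visualization |
| Workflow Orchestration | Not listed | Workflow orchestration |
| Epigenomics | Not listed | Epigenomics |
| Pathway Analysis | Not listed | Pathway analysis |
| Metagenomics | Not listed | Metagenomics |
| Protein Design | Not listed | Protein design |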

Note: apart from Genomics, none of the counts above are given in the original source; treat the project's bioskill_index_v3.csv as authoritative.

License

This project is released under the CC0 1.0 Universal license and is in the public domain.