Get the FREE Ultimate OpenClaw Setup Guide →

editing-obo-ontologies

Scanned
npx machina-cli add skill ai4curation/curation-skills/editing-obo-ontologies --openclaw
Files (1)
SKILL.md
7.2 KB

OBO Ontology Editing Guide

This skill provides guidance and tools for editing ontologies in OBO format.

Project Layout Conventions

Most OBO ontologies follow a similar structure:

  • Main development file is typically src/ontology/{ontology}-edit.obo
  • Individual terms can be checked out to terms/ directory for editing
  • Some projects may have different layouts - check the project's documentation

Querying Ontology Terms

Use the obo-grep.pl script for searching OBO files:

  • Look at a specific term by ID:
    • obo-grep.pl --noheader -r 'id: ONTO:0004177' src/ontology/{ontology}-edit.obo
  • All mentions of an ID:
    • obo-grep.pl --noheader -r 'ONTO:0004177' src/ontology/{ontology}-edit.obo
  • Search by regex (e.g., all mentions of hand or foot):
    • obo-grep.pl --noheader -r '(hand|foot)' src/ontology/{ontology}-edit.obo
  • Search is much faster than full file reads
  • ONLY search the main edit file (usually src/ontology/{ontology}-edit.obo)
  • DO NOT do manual greps or read entire files unless necessary

Before Making Edits

  • Read the request carefully and make a plan, especially if there is nuance
  • If a PMID is mentioned, try to read it using: aurelian fulltext PMID:NNNNNN
  • This also works for DOIs and URLs for scientific papers (if accessible)
  • ALWAYS check proposed parent terms for consistency
  • Check project-specific guidelines if available

Editing Workflow

IMPORTANT: Use Checkout/Checkin for Large Files

  • Do not edit large ontology files directly
  • Use the checkout/checkin workflow for individual terms
  • Check out a term: obo-checkout.pl src/ontology/{ontology}-edit.obo ONTO:1234567 [OTHER_IDS]
  • This creates a single stanza file: terms/{ontology}_1234567.obo (note: colon replaced with underscore)
  • Edit the small file in the terms/ folder
  • Check back in: obo-checkin.pl src/ontology/{ontology}-edit.obo ONTO:1234567 [OTHER_IDS]
  • Checking in updates the edit file and removes the file from terms/
  • You can edit multiple terms in one batch file if needed

Scripts Available

This skill includes three essential scripts:

  1. obo-grep.pl - Fast searching of OBO files
  2. obo-checkout.pl - Extract terms to individual files for editing
  3. obo-checkin.pl - Merge edited terms back into main file

All scripts are available in your PATH when this skill is loaded.

OBO Format Guidelines

Basic Structure

  • Term ID format: ONTO:NNNNNNN (check project conventions for number of digits)
  • Each term requires:
    • id: - unique identifier
    • name: - human-readable label
    • namespace: - ontology namespace
    • def: - definition with references in square brackets
  • Use standard relationship types: is_a, part_of, has_part, etc.
  • Follow existing term patterns for consistency

Handling New Term Requests (NTRs)

  • Check project conventions for temporary ID ranges
  • Example: Some projects use ranges like ONTO:777xxxx for new terms
  • Always check for ID clashes: grep 'id: ONTO:777' src/ontology/{ontology}-edit.obo
  • NEVER guess ontology IDs - use search tools to find actual terms
  • NEVER guess PMIDs for references - do web searches if needed

Citations and References

  • Cite publications appropriately: def: "..." [PMID:nnnn, doi:mmmm]
  • Fetch full text when needed: aurelian fulltext <PMID:nnn> (also works with DOIs and URLs)
  • All synonyms should include proper citations
  • Never use empty brackets [] without a source

Synonyms

Synonyms should include proper attribution:

Correct:

synonym: "alternative name" EXACT [PMID:12345678]
synonym: "abbrev" EXACT ABBREVIATION [PMID:12345678]

Relationships and Logical Definitions

  • All terms should have at least one is_a parent
  • Logical definitions follow genus-differentia form
  • Text definitions should mirror logical definitions
  • Include source attribution for relationships when based on literature:

Logical Definitions (intersection_of)

Example of proper intersection_of usage:

[Term]
id: ONTO:0000715
name: specific disease
def: "A general disease that involves specific location." [PMID:12345678]
is_a: ONTO:0001082 ! general disease
intersection_of: ONTO:0004971 ! general disease
intersection_of: disease_has_location UBERON:0000029 ! specific location

Note that in OWL this corresponds to: 'specific disease' EquivalentTo 'general disease' and 'disease has location' some 'specific location'

Obsoleting Terms

  • Obsolete terms should have NO logical axioms (is_a, relationship, intersection_of)
  • Obsolete terms may have one replaced_by tag (exact replacement)
  • Or multiple consider tags (suggested alternatives)
  • Always include obsolescence reason and tracker reference

Example of simple obsolescence:

[Term]
id: ONTO:0100334
name: obsolete term name
property_value: IAO:0000231 OMO:0001000
property_value: IAO:0000233 "https://github.com/{project}/issues/XXXX" xsd:anyURI
is_obsolete: true
replaced_by: ONTO:0100321

Example with considerations instead of replacement:

[Term]
id: ONTO:0100229
name: obsolete term name
def: "OBSOLETE. Original definition here." [original references]
property_value: IAO:0000231 OMO:0001000
property_value: IAO:0000233 "https://github.com/{project}/issues/XXXX" xsd:anyURI
is_obsolete: true
consider: ONTO:0100259
consider: ONTO:0100260

Important Notes on Obsolescence

  • Synonyms and xrefs can be migrated to replacement terms judiciously
  • Never do complete merges with alt_id - use obsolescence with replacement instead
  • No relationships should point to an obsolete term
  • When obsoleting, you may need to rewire other terms to "skip" the obsoleted term

Metadata Best Practices

  • Link to issue trackers: property_value: IAO:0000233 "https://github.com/{project}/issues/XXXX" xsd:anyURI
  • Sign new terms (don't tag pre-existing terms):
    property_value: http://purl.org/dc/terms/creator https://orcid.org/0000-0001-2345-6789
    
  • All terms should have definitions with at least one reference (preferably PMID)
  • Dates are typically auto-generated by build processes

Syntax Checking

Validate OBO syntax using ROBOT:

robot convert --catalog src/ontology/catalog-v001.xml \
  -i src/ontology/{ontology}-edit.obo \
  -f obo \
  -o {ontology}-edit.TMP.obo

Use -vvv flag for full stack trace if there are errors.

Design Patterns

Many OBO ontologies use DOSDP (Dead Simple Ontology Design Patterns):

  • Check src/patterns/dosdp-patterns/*.yaml for project-specific patterns
  • Follow existing patterns when creating similar terms
  • Common patterns include:
    • Location-based disease patterns
    • Gene-related disease patterns
    • Part-of hierarchies
    • Abnormality patterns

Important Reminders

  • NEVER guess identifiers of any kind
  • If you include an identifier not provided by the user, you MUST verify it
  • PMIDs can be checked with aurelian or web search
  • Always follow project-specific conventions and check existing examples
  • When in doubt, ask for clarification rather than making assumptions

Source

git clone https://github.com/ai4curation/curation-skills/blob/main/editing-obo-ontologies/SKILL.mdView on GitHub

Overview

This skill provides guidance and tools for editing ontologies in OBO format. It covers project layout, term querying with obo-grep.pl, and a checkout/checkin workflow to edit individual terms without touching large files.

How This Skill Works

Term queries use obo-grep.pl to search the main edit file (src/ontology/{ontology}-edit.obo) for speed. For safe edits, check out a term with obo-checkout.pl to create a small file under terms/{ontology}_<id>.obo, edit it there, then check in with obo-checkin.pl to merge changes back into the main file.

When to Use It

  • You need to locate specific terms or patterns quickly using obo-grep.pl in the main edit file.
  • You are editing the properties of a single term without touching the entire ontology.
  • You want to validate parent/child consistency before finalizing edits.
  • You are creating new terms (NTRs) and must avoid ID clashes and incorrect IDs.
  • You need to cite PMIDs/DOIs in definitions and synonyms and ensure citations are present.

Quick Start

  1. Step 1: obo-grep.pl --noheader -r 'ONTO:0004177' src/ontology/{ontology}-edit.obo
  2. Step 2: obo-checkout.pl src/ontology/{ontology}-edit.obo ONTO:1234567 [OTHER_IDS]
  3. Step 3: Edit terms/{ontology}_1234567.obo, then obo-checkin.pl src/ontology/{ontology}-edit.obo ONTO:1234567 [OTHER_IDS]

Best Practices

  • Always use the checkout/checkin workflow for individual terms.
  • Limit searches to the main edit file with obo-grep.pl; avoid full-file grepping.
  • Ensure required fields exist: id, name, namespace, def, and citations.
  • When adding NTRs, check for ID clashes and follow project conventions.
  • Cite publications in defs and synonyms; never leave empty brackets without a source.

Example Use Cases

  • Query ONTO:0004177 in src/ontology/{ontology}-edit.obo with obo-grep.pl
  • Checkout ONTO:1234567 to terms/{ontology}_1234567.obo and edit
  • Edit def: 'def: ... [PMID:...]' and update synonyms
  • Create an NTR like ONTO:777xxxx and verify against existing IDs with grep
  • Run obo-checkin.pl to merge edited terms back into the main file

Frequently Asked Questions

Add this skill to your agents
Sponsor this space

Reach thousands of developers