data-management-plan-creator

npx machina-cli add skill aipoch/medical-research-skills/data-management-plan-creator --openclaw

Data Management Plan (DMP) Creator

Automatically generate draft Data Management and Sharing Plans (DMSP) compliant with NIH 2023 policy requirements and FAIR principles.

Overview

This Skill generates comprehensive Data Management and Sharing Plans (DMSP) that meet NIH's 2023 Final Policy for Data Management and Sharing. The output follows FAIR principles (Findable, Accessible, Interoperable, Reusable) to ensure research data is properly managed and shared.

Requirements

  • Python 3.8+
  • No external dependencies required (uses standard library only)

Usage

Command Line

python scripts/main.py \
    --project-title "Your Research Project Title" \
    --pi-name "Principal Investigator Name" \
    --institution "Your Institution" \
    --data-types "genomic,imaging,clinical" \
    --repository "GEO,Figshare" \
    --output dmsp_draft.md

Interactive Mode

python scripts/main.py --interactive

As a Module

from scripts.main import DMSPCreator

creator = DMSPCreator(
    project_title="Cancer Genomics Study",
    pi_name="Dr. Jane Smith",
    institution="National Cancer Institute",
    data_types=["genomic sequencing", "clinical metadata"],
    estimated_size_gb=500,
    repositories=["dbGaP", "GEO"],
    sharing_timeline="6 months after study completion"
)

dmsp = creator.generate_plan()
creator.save_to_file("dmsp_output.md")

Parameters

| Parameter | Type | Default | Required | Description |
|---|---|---|---|---|
| --project-title | string | - | Yes | Title of the research project |
| --pi-name | string | - | Yes | Name of the Principal Investigator |
| --institution | string | - | Yes | Research institution or organization |
| --data-types | string | - | Yes | Comma-separated list of data types (e.g., "genomic,imaging,clinical") |
| --estimated-size | float | - | No | Estimated data size in GB |
| --repository | string | - | Yes | Comma-separated list of target repositories |
| --sharing-timeline | string | No later than the end of the award period | No | When data will be shared |
| --access-restrictions | string | - | No | Any access restrictions (e.g., "controlled-access for sensitive data") |
| --format-standards | string | - | No | Data format standards to be used |
| --output | string | dmsp_[timestamp].md | No | Output file path |
| --interactive | flag | - | No | Run in interactive mode |
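
As a rough illustration, the flags in the table could be declared with argparse along the following lines. This is a sketch under assumptions, not the contents of scripts/main.py: the names build_parser, parse_args, and REQUIRED are hypothetical, and so is the timestamp format used for the default output name. Required flags are checked after parsing rather than with required=True so that --interactive can be passed on its own.

```python
import argparse
from datetime import datetime

# Flags marked "Yes" in the Required column (argparse dest names).
REQUIRED = ("project_title", "pi_name", "institution", "data_types", "repository")


def build_parser() -> argparse.ArgumentParser:
    """Declare the flags from the parameter table (hypothetical layout)."""
    p = argparse.ArgumentParser(description="Draft an NIH DMSP")
    p.add_argument("--project-title")
    p.add_argument("--pi-name")
    p.add_argument("--institution")
    p.add_argument("--data-types", help="comma-separated, e.g. genomic,imaging")
    p.add_argument("--estimated-size", type=float, help="estimated size in GB")
    p.add_argument("--repository", help="comma-separated target repositories")
    p.add_argument("--sharing-timeline",
                   default="No later than the end of the award period")
    p.add_argument("--access-restrictions")
    p.add_argument("--format-standards")
    p.add_argument("--output")
    p.add_argument("--interactive", action="store_true")
    return p


def parse_args(argv=None) -> argparse.Namespace:
    """Parse argv, enforcing required flags unless --interactive is set."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if not args.interactive:
        missing = [name for name in REQUIRED if not getattr(args, name)]
        if missing:
            flags = ", ".join("--" + m.replace("_", "-") for m in missing)
            parser.error(f"missing required arguments: {flags}")
    if args.output is None:
        # Assumed format for the "[timestamp]" placeholder in the table.
        stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
        args.output = f"dmsp_{stamp}.md"
    return args
```

Calling parse_args(["--interactive"]) succeeds with only the defaults filled in, while omitting a required flag in non-interactive mode exits with a usage error.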

NIH DMSP Required Elements

The generated plan addresses all six required elements per NIH policy:

  1. Data Type - Types and estimated amount of scientific data
  2. Related Tools, Software and/or Code - Tools needed to access/manipulate data
  3. Standards - Standards for data/metadata to be applied
  4. Data Preservation, Access, and Associated Timelines - Repository selection and sharing timeline
  5. Access, Distribution, or Reuse Considerations - Factors affecting subsequent access
  6. Oversight of Data Management and Sharing - Plans for compliance monitoring
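
One way to sanity-check a generated draft is to confirm that each of the six element titles appears as a heading. The sketch below assumes the draft uses one heading line per element, optionally prefixed with Markdown hashes or a number; the helper missing_elements is hypothetical and is not part of the skill.

```python
from typing import List

# Element titles copied from the list above.
NIH_DMSP_ELEMENTS = (
    "Data Type",
    "Related Tools, Software and/or Code",
    "Standards",
    "Data Preservation, Access, and Associated Timelines",
    "Access, Distribution, or Reuse Considerations",
    "Oversight of Data Management and Sharing",
)


def missing_elements(draft: str) -> List[str]:
    """Return the element titles that never appear as a heading line."""
    headings = {
        # Drop leading '#', digits, dots, and spaces: "## 1. Data Type" -> "data type"
        line.lstrip("#0123456789. ").strip().lower()
        for line in draft.splitlines()
        if line.strip()
    }
    return [title for title in NIH_DMSP_ELEMENTS if title.lower() not in headings]
```

An empty list means every element has a heading; it says nothing about whether the text under each heading is adequate.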

FAIR Principles Implementation

Findable

  • Persistent identifiers (DOIs)
  • Rich metadata with standard vocabularies
  • Registration in searchable repositories

Accessible

  • Standardized communication protocols
  • Metadata available even if data is no longer available
  • Access procedures clearly documented

Interoperable

  • Standard data formats
  • Standard terminologies and vocabularies
  • Qualified references to other data

Reusable

  • Detailed provenance information
  • Clear usage licenses
  • Domain-relevant community standards

Example Output

The generated DMSP includes:

  • Executive summary
  • NIH-compliant section headers
  • Specific language for data type descriptions
  • FAIR-aligned metadata standards
  • Repository recommendations
  • Timeline for data sharing
  • Access control procedures
  • Roles and responsibilities

License

MIT License - See project root for details.

Risk Assessment

| Risk Indicator | Assessment | Level |
|---|---|---|
| Code Execution | Python/R scripts executed locally | Medium |
| Network Access | No external API calls | Low |
| File System Access | Read input files, write output files | Medium |
| Instruction Tampering | Standard prompt guidelines | Low |
| Data Exposure | Output files saved to workspace | Low |

Security Checklist

  • No hardcoded credentials or API keys
  • No unauthorized file system access (../)
  • Output does not expose sensitive information
  • Prompt injection protections in place
  • Input file paths validated (no ../ traversal)
  • Output directory restricted to workspace
  • Script execution in sandboxed environment
  • Error messages sanitized (no stack traces exposed)
  • Dependencies audited
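
The two path-related items in the checklist (no ../ traversal, output restricted to the workspace) can be enforced with a small helper like the one below. This is a minimal sketch, assuming a single workspace directory; resolve_in_workspace is a hypothetical name, and the skill's actual validation may differ. It uses Path.relative_to rather than Path.is_relative_to so that it runs on Python 3.8.

```python
from pathlib import Path


def resolve_in_workspace(user_path: str, workspace: str) -> Path:
    """Resolve user_path under workspace, rejecting paths that escape it."""
    root = Path(workspace).resolve()
    # Joining with an absolute user_path discards root, so absolute paths
    # outside the workspace are rejected by the check below as well.
    candidate = (root / user_path).resolve()
    try:
        # relative_to raises ValueError when candidate is not under root.
        candidate.relative_to(root)
    except ValueError:
        raise ValueError(f"path escapes workspace: {user_path!r}") from None
    return candidate
```

Resolving before the check means that sequences such as out/../../secret.md are normalized first, so they are rejected even though they begin inside the workspace.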

Prerequisites

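# Optional: the skill uses only the standard library (see Requirements).
# Run this only if the repository provides a requirements.txt.
pip install -r requirements.txt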

Evaluation Criteria

Success Metrics

  • Successfully executes main functionality
  • Output meets quality standards
  • Handles edge cases gracefully
  • Performance is acceptable

Test Cases

  1. Basic Functionality: Standard input → Expected output
  2. Edge Case: Invalid input → Graceful error handling
  3. Performance: Large dataset → Acceptable processing time

Lifecycle Status

  • Current Stage: Draft
  • Next Review Date: 2026-03-06
  • Known Issues: None
  • Planned Improvements:
    • Performance optimization
    • Additional feature support

Source

git clone https://github.com/aipoch/medical-research-skills

SKILL.md path in the repository: scientific-skills/Academic writing/data-management-plan-creator/SKILL.md

Overview

Automatically generate draft Data Management and Sharing Plans (DMSP) that comply with NIH 2023 Final Policy and apply FAIR principles. The tool outputs comprehensive plans covering NIH's six required elements, data types, repositories, standards, timelines, and oversight to support compliant and reusable data sharing.

How This Skill Works

The skill runs on Python 3.8+ using only the standard library. It collects inputs such as project title, PI, institution, data types, estimated size, repositories, and sharing timeline through the CLI or interactive mode. It then assembles the six NIH-required elements and FAIR-aligned sections into a Markdown DMSP draft. The same functionality is available as a module for programmatic generation.
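
The collect-then-assemble flow described above might look roughly like this. It is a sketch under assumptions: PlanInputs, split_csv, and render_draft are hypothetical names, the heading layout is invented for illustration, and only two of the six sections are rendered to keep the example short.

```python
from dataclasses import dataclass
from typing import List


def split_csv(value: str) -> List[str]:
    """Split a comma-separated flag value, dropping blanks and extra spaces."""
    return [item.strip() for item in value.split(",") if item.strip()]


@dataclass
class PlanInputs:
    project_title: str
    pi_name: str
    institution: str
    data_types: List[str]
    repositories: List[str]
    sharing_timeline: str = "No later than the end of the award period"


def render_draft(inputs: PlanInputs) -> str:
    """Assemble a partial Markdown draft from the collected inputs."""
    lines = [
        f"# Data Management and Sharing Plan: {inputs.project_title}",
        "",
        f"Principal Investigator: {inputs.pi_name}",
        f"Institution: {inputs.institution}",
        "",
        "## Data Type",
        *[f"- {d}" for d in inputs.data_types],
        "",
        "## Data Preservation, Access, and Associated Timelines",
        f"Repositories: {', '.join(inputs.repositories)}",
        f"Sharing timeline: {inputs.sharing_timeline}",
    ]
    return "\n".join(lines) + "\n"
```

Keeping input normalization (split_csv) separate from rendering lets the CLI, the interactive prompts, and the module interface all feed the same renderer.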

When to Use It

  • Preparing an NIH grant proposal (R01/R21) that requires a compliant Data Management and Sharing Plan
  • Plans involving multiple data types and repositories needing a clear sharing timeline
  • Projects with restricted data requiring controlled-access metadata and licenses
  • Early-stage data planning to define formats, standards, and provenance
  • Compliance reviews or institutional approvals before final NIH submission

Quick Start

  1. Run the CLI with the required arguments, quoting any value that contains spaces: python scripts/main.py --project-title "Your Research Project Title" --pi-name "Principal Investigator Name" --institution "Your Institution" --data-types "genomic,imaging,clinical" --repository "GEO,dbGaP" --sharing-timeline "6 months after study completion" --output dmsp_draft.md
  2. For guided prompts instead, run python scripts/main.py --interactive
  3. To use the module directly, construct a DMSPCreator as shown under "As a Module", then call dmsp = creator.generate_plan() followed by creator.save_to_file("dmsp_output.md")

Best Practices

  • Clearly define data types and estimated data volumes up front
  • Map each data type to the six NIH-required DMSP elements
  • Select repositories early and document access, licensing, and sharing timeline
  • Specify data formats, metadata standards, and vocabularies for interoperability
  • Incorporate governance and oversight steps for ongoing compliance
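
A simple planning table can make gaps in these practices visible before the draft is generated. The sketch below is illustrative only: the PLAN entries are example values, and planning_gaps is a hypothetical helper, not something the skill provides.

```python
from typing import Dict, List

# Illustrative entries only; fill in your project's own choices.
PLAN: Dict[str, Dict[str, str]] = {
    "genomic": {"repository": "GEO", "standard": ""},
    "imaging": {"repository": "", "standard": "DICOM"},
}


def planning_gaps(data_types: List[str],
                  plan: Dict[str, Dict[str, str]]) -> Dict[str, List[str]]:
    """Map each data type to the planning fields it is still missing."""
    required = ("repository", "standard")
    report: Dict[str, List[str]] = {}
    for data_type in data_types:
        entry = plan.get(data_type, {})
        missing = [name for name in required if not entry.get(name)]
        if missing:
            report[data_type] = missing
    return report
```

With the example table, planning_gaps(["genomic", "imaging", "clinical"], PLAN) reports a missing standard for genomic data, a missing repository for imaging data, and both fields missing for clinical data.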

Example Use Cases

  • Draft DMSP for an NIH R01 in cancer genomics using dbGaP and GEO with a six‑month post-study sharing timeline
  • Clinical study data with controlled-access metadata and a clear access procedure
  • Imaging data plan following DICOM standards with FAIR metadata and repository registration
  • Multi‑site microbiome project coordinating data formats, licenses, and cross‑institution access
  • Pilot study releasing non-sensitive data openly with a documented usage license
