Get the FREE Ultimate OpenClaw Setup Guide →

data-export-pdf

Scanned
npx machina-cli add skill Microck/ordinary-claude-skills/data-export-pdf --openclaw
Files (1)
SKILL.md
11.2 KB

PDF Report Generation (Universal)

Overview

This skill enables you to create professional PDF reports containing analysis summaries, formatted tables, and embedded visualizations. Unlike cloud-hosted solutions, this skill uses the reportlab Python library and executes locally in your environment, making it compatible with ALL LLM providers including GPT, Gemini, Claude, DeepSeek, and Qwen.

When to Use This Skill

  • Generate analysis reports with text and tables
  • Create summary PDFs with embedded plots
  • Export formatted documentation
  • Produce publication-ready supplementary materials
  • Combine multiple analysis results into a single document

How to Use

Step 1: Import Required Libraries

from reportlab.lib.pagesizes import letter, A4
from reportlab.lib.styles import getSampleStyleSheet, ParagraphStyle
from reportlab.lib.units import inch
from reportlab.lib import colors
from reportlab.platypus import SimpleDocTemplate, Paragraph, Spacer, Table, TableStyle, PageBreak, Image
from reportlab.lib.enums import TA_CENTER, TA_LEFT, TA_JUSTIFY
from datetime import datetime
import matplotlib.pyplot as plt

Step 2: Create Basic PDF Document

# Create PDF file
pdf_filename = "analysis_report.pdf"
doc = SimpleDocTemplate(pdf_filename, pagesize=letter)
story = []  # Container for PDF elements

# Get default styles
styles = getSampleStyleSheet()
title_style = styles['Title']
heading_style = styles['Heading1']
normal_style = styles['Normal']

# Add title
story.append(Paragraph("Analysis Report", title_style))
story.append(Spacer(1, 0.2*inch))

# Add date
date_text = f"Generated: {datetime.now().strftime('%Y-%m-%d %H:%M')}"
story.append(Paragraph(date_text, normal_style))
story.append(Spacer(1, 0.3*inch))

# Build PDF
doc.build(story)
print(f"✅ PDF saved to: {pdf_filename}")

Step 3: Add Text Content

story = []

# Title
story.append(Paragraph("Single-Cell RNA-seq Analysis Report", title_style))
story.append(Spacer(1, 0.2*inch))

# Section heading
story.append(Paragraph("1. Overview", heading_style))
story.append(Spacer(1, 0.1*inch))

# Paragraph text
overview_text = """
This report summarizes the single-cell RNA-seq analysis performed on the dataset.
The analysis includes quality control, normalization, dimensionality reduction,
clustering, and cell type annotation.
"""
story.append(Paragraph(overview_text, normal_style))
story.append(Spacer(1, 0.2*inch))

Step 4: Add Tables

# Prepare table data
table_data = [
    ['Metric', 'Value'],  # Header
    ['Total Cells', '5,000'],
    ['Total Genes', '20,000'],
    ['Mean Genes/Cell', '2,500'],
    ['Median UMIs/Cell', '10,000']
]

# Create table
table = Table(table_data, colWidths=[2.5*inch, 2*inch])

# Style table
table.setStyle(TableStyle([
    # Header styling
    ('BACKGROUND', (0, 0), (-1, 0), colors.grey),
    ('TEXTCOLOR', (0, 0), (-1, 0), colors.whitesmoke),
    ('ALIGN', (0, 0), (-1, -1), 'CENTER'),
    ('FONTNAME', (0, 0), (-1, 0), 'Helvetica-Bold'),
    ('FONTSIZE', (0, 0), (-1, 0), 12),

    # Body styling
    ('BACKGROUND', (0, 1), (-1, -1), colors.beige),
    ('GRID', (0, 0), (-1, -1), 1, colors.black),
    ('FONTNAME', (0, 1), (-1, -1), 'Helvetica'),
    ('FONTSIZE', (0, 1), (-1, -1), 10),
]))

story.append(table)
story.append(Spacer(1, 0.3*inch))

Step 5: Embed Images/Plots

# Save matplotlib figure first
fig, ax = plt.subplots(figsize=(6, 4))
# ... create your plot ...
plot_filename = "temp_plot.png"
fig.savefig(plot_filename, dpi=150, bbox_inches='tight')
plt.close(fig)

# Add image to PDF
story.append(Paragraph("2. UMAP Visualization", heading_style))
story.append(Spacer(1, 0.1*inch))
img = Image(plot_filename, width=4*inch, height=3*inch)
story.append(img)
story.append(Spacer(1, 0.2*inch))

Complete Example: Analysis Report

from reportlab.lib.pagesizes import letter
from reportlab.lib.styles import getSampleStyleSheet
from reportlab.lib.units import inch
from reportlab.lib import colors
from reportlab.platypus import SimpleDocTemplate, Paragraph, Spacer, Table, TableStyle, Image
from datetime import datetime
import matplotlib.pyplot as plt
import pandas as pd

def create_analysis_report(adata, output_path="analysis_report.pdf"):
    """Create comprehensive PDF analysis report"""

    # Initialize PDF
    doc = SimpleDocTemplate(output_path, pagesize=letter)
    story = []
    styles = getSampleStyleSheet()

    # Title
    story.append(Paragraph("Single-Cell RNA-seq Analysis Report", styles['Title']))
    story.append(Spacer(1, 0.2*inch))
    story.append(Paragraph(f"Generated: {datetime.now().strftime('%Y-%m-%d %H:%M')}", styles['Normal']))
    story.append(Spacer(1, 0.3*inch))

    # Overview
    story.append(Paragraph("1. Dataset Overview", styles['Heading1']))
    story.append(Spacer(1, 0.1*inch))

    overview_data = [
        ['Metric', 'Value'],
        ['Total Cells', f'{adata.n_obs:,}'],
        ['Total Genes', f'{adata.n_vars:,}'],
        ['Observations', ', '.join(adata.obs.columns[:5].tolist())],
    ]

    table = Table(overview_data, colWidths=[2.5*inch, 3.5*inch])
    table.setStyle(TableStyle([
        ('BACKGROUND', (0, 0), (-1, 0), colors.grey),
        ('TEXTCOLOR', (0, 0), (-1, 0), colors.whitesmoke),
        ('ALIGN', (0, 0), (-1, -1), 'LEFT'),
        ('FONTNAME', (0, 0), (-1, 0), 'Helvetica-Bold'),
        ('GRID', (0, 0), (-1, -1), 1, colors.black),
        ('BACKGROUND', (0, 1), (-1, -1), colors.beige),
    ]))
    story.append(table)
    story.append(Spacer(1, 0.3*inch))

    # Cluster distribution
    if 'clusters' in adata.obs:
        story.append(Paragraph("2. Cluster Distribution", styles['Heading1']))
        story.append(Spacer(1, 0.1*inch))

        cluster_counts = adata.obs['clusters'].value_counts().sort_index()
        cluster_data = [['Cluster', 'Cell Count', 'Percentage']]
        total_cells = adata.n_obs

        for cluster, count in cluster_counts.items():
            percentage = (count / total_cells) * 100
            cluster_data.append([str(cluster), str(count), f'{percentage:.1f}%'])

        table = Table(cluster_data, colWidths=[1.5*inch, 1.5*inch, 1.5*inch])
        table.setStyle(TableStyle([
            ('BACKGROUND', (0, 0), (-1, 0), colors.grey),
            ('TEXTCOLOR', (0, 0), (-1, 0), colors.whitesmoke),
            ('ALIGN', (0, 0), (-1, -1), 'CENTER'),
            ('FONTNAME', (0, 0), (-1, 0), 'Helvetica-Bold'),
            ('GRID', (0, 0), (-1, -1), 1, colors.black),
            ('BACKGROUND', (0, 1), (-1, -1), colors.lightblue),
        ]))
        story.append(table)
        story.append(Spacer(1, 0.3*inch))

    # Visualization (if UMAP exists)
    if 'X_umap' in adata.obsm:
        story.append(Paragraph("3. UMAP Visualization", styles['Heading1']))
        story.append(Spacer(1, 0.1*inch))

        # Create UMAP plot
        fig, ax = plt.subplots(figsize=(6, 5))
        scatter = ax.scatter(
            adata.obsm['X_umap'][:, 0],
            adata.obsm['X_umap'][:, 1],
            c=adata.obs['clusters'].astype('category').cat.codes if 'clusters' in adata.obs else 'blue',
            s=5, alpha=0.5
        )
        ax.set_xlabel('UMAP1')
        ax.set_ylabel('UMAP2')
        ax.set_title('UMAP Projection')

        plot_path = 'temp_umap.png'
        fig.savefig(plot_path, dpi=150, bbox_inches='tight')
        plt.close(fig)

        img = Image(plot_path, width=5*inch, height=4*inch)
        story.append(img)

    # Build PDF
    doc.build(story)
    print(f"✅ PDF report saved to: {output_path}")

    return output_path

# Usage
create_analysis_report(adata, "my_analysis_report.pdf")

Best Practices

  1. Page Size: Use letter (US) or A4 (international) for standard documents
  2. Margins: SimpleDocTemplate has default margins (1 inch); adjust with leftMargin, rightMargin, etc.
  3. Images: Save matplotlib figures at 150-300 DPI for good quality
  4. Tables: Keep column counts reasonable (4-6 columns max for readability)
  5. File Cleanup: Delete temporary image files after PDF creation
  6. Memory: For large documents, build in sections to manage memory

Advanced Features

Custom Page Header/Footer

from reportlab.lib.pagesizes import letter
from reportlab.pdfgen import canvas

def add_header_footer(canvas_obj, doc):
    canvas_obj.saveState()
    # Header
    canvas_obj.setFont('Helvetica', 9)
    canvas_obj.drawString(inch, letter[1] - 0.5*inch, "Analysis Report")
    # Footer
    canvas_obj.drawString(inch, 0.5*inch, f"Page {doc.page}")
    canvas_obj.restoreState()

doc = SimpleDocTemplate(pdf_filename, pagesize=letter)
doc.build(story, onFirstPage=add_header_footer, onLaterPages=add_header_footer)

Multi-Column Layout

from reportlab.platypus import Frame, PageTemplate

frame1 = Frame(doc.leftMargin, doc.bottomMargin, doc.width/2-6, doc.height, id='col1')
frame2 = Frame(doc.leftMargin+doc.width/2+6, doc.bottomMargin, doc.width/2-6, doc.height, id='col2')

doc.addPageTemplates([PageTemplate(id='TwoCol', frames=[frame1, frame2])])

Color-Coded Tables

# Highlight significant results
for i, row in enumerate(deg_results):
    if row['qvalue'] < 0.05:
        table.setStyle(TableStyle([
            ('BACKGROUND', (0, i+1), (-1, i+1), colors.yellow)
        ]))

Common Use Cases

QC Report

qc_metrics = {
    'Total Cells': adata.n_obs,
    'Median Genes/Cell': int(adata.obs['n_genes'].median()),
    'Median UMIs/Cell': int(adata.obs['n_counts'].median()),
    'Mean Mito %': f"{adata.obs['percent_mito'].mean():.2f}%"
}

table_data = [['Metric', 'Value']] + [[k, str(v)] for k, v in qc_metrics.items()]
# ... create table as shown above

DEG Summary Table

# Top 10 upregulated genes
top_genes = deg_df.nlargest(10, 'log2FC')[['gene', 'log2FC', 'qvalue']]
table_data = [['Gene', 'log2FC', 'Q-value']]
for _, row in top_genes.iterrows():
    table_data.append([row['gene'], f"{row['log2FC']:.2f}", f"{row['qvalue']:.2e}"])

Troubleshooting

Issue: "reportlab not found"

Solution:

import subprocess
subprocess.check_call(['pip', 'install', 'reportlab'])

Issue: "Image not found"

Solution: Ensure image path is correct and file exists before adding to PDF:

import os
if os.path.exists(plot_filename):
    img = Image(plot_filename, width=4*inch, height=3*inch)
    story.append(img)

Issue: "Table exceeds page width"

Solution: Reduce column widths or font size:

table = Table(data, colWidths=[1.5*inch, 1.5*inch, 2*inch])
table.setStyle(TableStyle([('FONTSIZE', (0, 0), (-1, -1), 8)]))

Technical Notes

  • Library: Uses reportlab (pure Python, widely supported)
  • Execution: Runs locally in the agent's sandbox
  • Compatibility: Works with ALL LLM providers (GPT, Gemini, Claude, DeepSeek, Qwen, etc.)
  • File Size: Text-heavy PDFs are small (<1MB); image-heavy PDFs can be 5-20MB
  • Performance: Typical report generation takes 1-3 seconds

References

Source

git clone https://github.com/Microck/ordinary-claude-skills/blob/main/skills_all/data-export-pdf/SKILL.mdView on GitHub

Overview

Generates professional PDF reports containing analysis summaries, formatted tables, and embedded visuals using the ReportLab library. Runs locally in your environment, avoiding cloud dependencies and ensuring compatibility with any LLM provider.

How This Skill Works

The skill builds a PDF document from a story of flowables. You import libraries, create a SimpleDocTemplate, and assemble content with Paragraphs, Tables, and Images, then call doc.build(story) to write the final PDF.

When to Use It

  • Generate analysis reports with text and tables
  • Create summary PDFs with embedded plots
  • Export formatted documentation
  • Produce publication-ready supplementary materials
  • Combine multiple analysis results into a single document

Quick Start

  1. Step 1: Import Required Libraries
  2. Step 2: Create Basic PDF Document
  3. Step 3: Add Text Content, Tables, and Images to the Story

Best Practices

  • Plan document structure before coding (title, sections, and order of sections).
  • Reuse a consistent style sheet (getSampleStyleSheet) and define clear typography.
  • Use Table and TableStyle to create readable, publication-quality tables with header styling.
  • Embed visuals by saving plots as PNGs and adding Image flowables with fixed sizes.
  • Test with different page sizes (letter or A4) and DPI to ensure clarity for distribution.

Example Use Cases

  • Single-Cell RNA-seq Analysis Report with text, a metrics table, and a UMAP image embedded.
  • UMAP Visualization embedded in a PDF alongside an overview section.
  • Method summary and QC metrics exported as a single, publication-ready document.
  • A supplementary materials PDF combining results from multiple analyses.
  • A reusable PDF template that stitches together multiple report sections into one file.

Frequently Asked Questions

Add this skill to your agents
Sponsor this space

Reach thousands of developers