
paper-analyzer

npx machina-cli add skill proyecto26/sherlock-ai-plugin/paper-analyzer --openclaw

Academic Paper Analyzer – In-Depth Analysis of Academic Papers

Core Capabilities

  • MinerU Cloud API for high-precision PDF parsing
  • Automatic extraction of images, tables, and LaTeX formulas
  • Multiple writing styles: storytelling / academic / concise
  • Optional formula explanations: insert formula images with detailed symbol explanations
  • Optional code analysis: combine explanations with GitHub open-source code
  • Output Markdown + HTML (base64-embedded images)

Prerequisites

MinerU API Token

  1. Visit https://mineru.net and register an account
  2. Obtain an API Token
  3. Set an environment variable (recommended):
    export MINERU_TOKEN="your_token_here"
    

Dependency Installation

pip install requests markdown

Workflow

Step 1: PDF Parsing (Using MinerU API)

python scripts/mineru_api.py <pdf_path> <output_dir>

Or pass the token directly:

python scripts/mineru_api.py paper.pdf ./output YOUR_TOKEN
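The two ways of supplying the token can be sketched as a small lookup. This is a minimal illustration, not the code in scripts/mineru_api.py: the helper name resolve_token and the rule that a command-line token takes precedence over MINERU_TOKEN are assumptions.

```python
import os


def resolve_token(argv, env=None):
    """Return the MinerU token from the CLI arguments or the environment.

    Hypothetical helper. argv is expected to look like
    [script, pdf_path, output_dir, optional_token].
    """
    env = os.environ if env is None else env
    if len(argv) > 3 and argv[3]:
        # Assumption: a token passed on the command line wins.
        return argv[3]
    token = env.get("MINERU_TOKEN", "")
    if not token:
        raise ValueError(
            "MinerU token missing: set MINERU_TOKEN or pass it as the third argument"
        )
    return token
```

Calling resolve_token(sys.argv) at the top of a wrapper script fails early with a clear message when neither source provides a token.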

Output:

  • output_dir/*.md – Markdown files (including formulas and tables)
  • output_dir/images/ – High-quality extracted images
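To confirm that parsing produced the layout listed above, a short check can enumerate the Markdown files and extracted images. The function name list_outputs is hypothetical; only the *.md files and the images/ directory come from the description above.

```python
from pathlib import Path


def list_outputs(output_dir):
    """Return (markdown_files, image_files) found under output_dir."""
    root = Path(output_dir)
    markdown_files = sorted(root.glob("*.md"))
    images_dir = root / "images"
    if images_dir.is_dir():
        image_files = sorted(p for p in images_dir.iterdir() if p.is_file())
    else:
        image_files = []  # parsing may not have extracted any images
    return markdown_files, image_files
```

An empty Markdown list after Step 1 is a sign that parsing failed or that the wrong output directory was given.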

Step 2: Extract Paper Metadata

python scripts/extract_paper_info.py <output_dir>/*.md paper_info.json
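The command above relies on the shell to expand <output_dir>/*.md. When driving the step from Python, the expansion has to be done explicitly. The sketch below assumes the script takes one Markdown file followed by the JSON path; how it behaves when the glob matches several files is not stated here, so the helper (a hypothetical name) simply picks the first match.

```python
import glob
import os
import sys


def build_extract_command(output_dir, info_path="paper_info.json"):
    """Build the Step 2 command line, expanding <output_dir>/*.md ourselves."""
    markdown_files = sorted(glob.glob(os.path.join(output_dir, "*.md")))
    if not markdown_files:
        raise FileNotFoundError(f"no Markdown files found in {output_dir}")
    # Assumption: a single Markdown file, then the output JSON path.
    return [
        sys.executable,
        "scripts/extract_paper_info.py",
        markdown_files[0],
        info_path,
    ]
```

The returned list can be passed to subprocess.run(cmd, check=True), which avoids shell quoting problems with paths that contain spaces.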

Step 3: Style Selection (Ask the User)

Before generating the article, you must ask the user to choose the following options:

1. Writing Style (Required)

| Style | Characteristics | Use Cases |
| --- | --- | --- |
| storytelling | Starts from intuition, uses metaphors and examples, narrative-driven | Blogs, tech columns, popular science |
| academic | Professional terminology, rigorous expression, preserves original concepts | Academic reports, surveys, research group sharing |
| concise | Straight to the point, tables and lists, high information density | Quick reads, paper overviews, technical research |

2. Formula Option (Optional)

| Option | Description |
| --- | --- |
| with-formulas | Insert formula images and explain symbol meanings in detail |
| no-formulas (default) | Pure text description, no formula images |

3. Code Option (Optional, only if the paper has a GitHub repository)

| Option | Description |
| --- | --- |
| with-code | Clone the repository, include key source code, and explain it alongside the paper |
| no-code (default) | No code analysis |
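The three choices and their defaults can be captured in a small validated record. This is a sketch for illustration: the class name ArticleOptions and the github_url field are hypothetical, while the option values and defaults are the ones listed above.

```python
from dataclasses import dataclass
from typing import Optional

STYLES = ("storytelling", "academic", "concise")


@dataclass(frozen=True)
class ArticleOptions:
    style: str                         # required, the user must choose
    formulas: str = "no-formulas"      # default listed above
    code: str = "no-code"              # default listed above
    github_url: Optional[str] = None   # hypothetical field for the repo

    def __post_init__(self):
        if self.style not in STYLES:
            raise ValueError(f"style must be one of {STYLES}, got {self.style!r}")
        if self.formulas not in ("with-formulas", "no-formulas"):
            raise ValueError(f"unknown formula option: {self.formulas!r}")
        if self.code not in ("with-code", "no-code"):
            raise ValueError(f"unknown code option: {self.code!r}")
        if self.code == "with-code" and not self.github_url:
            # with-code only applies when the paper has a repository
            raise ValueError("with-code requires a GitHub repository URL")
```

For example, ArticleOptions("concise") accepts both defaults, while ArticleOptions("academic", code="with-code") is rejected until a repository URL is supplied.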

Step 4: Intelligent Article Generation

(...)

API Limits

  • Maximum file size: 200MB
  • Maximum pages per file: 600
  • Supports PDF, DOC, PPT, images, and more
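A pre-flight check against the size limit avoids uploading a file the API will reject. The sketch below covers only the file size. Checking the 600-page limit would need a PDF library and is left out. Reading "200MB" as 200 × 1024 × 1024 bytes is an assumption, and the helper name is hypothetical.

```python
import os

# Assumption: "200MB" is interpreted as 200 MiB.
MAX_FILE_BYTES = 200 * 1024 * 1024


def check_file_size(path, limit=MAX_FILE_BYTES):
    """Return the file size in bytes, or raise if it exceeds the limit."""
    size = os.path.getsize(path)
    if size > limit:
        raise ValueError(
            f"{path} is {size} bytes, which exceeds the limit of {limit} bytes"
        )
    return size
```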

Source

https://github.com/proyecto26/sherlock-ai-plugin/blob/main/skills/paper-analyzer/SKILL.md

Overview

paper-analyzer converts academic papers into in-depth technical articles using the MinerU Cloud API for high-precision PDF parsing. It automatically extracts images, tables, and LaTeX formulas, supports optional formula explanations and GitHub code analysis, and produces both Markdown and HTML output.

How This Skill Works

The tool parses PDFs via the MinerU Cloud API to extract content such as images, tables, and LaTeX formulas. After parsing, you choose a writing style (storytelling, academic, or concise) and optional features (with-formulas and/or with-code); it then generates article-ready Markdown and HTML with embedded visuals.
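The embedded visuals in the HTML output are base64-encoded images. One way such embedding can work is sketched below using only the standard library. It is an illustration of the technique, not the skill's own implementation, and the function names are hypothetical. It rewrites local img sources in an HTML string to data URIs and leaves remote or missing images untouched.

```python
import base64
import mimetypes
import re
from pathlib import Path

# Matches the src attribute of an <img> tag written with double quotes.
IMG_SRC = re.compile(r'(<img\b[^>]*\bsrc=")([^"]+)(")')


def to_data_uri(image_path):
    """Encode an image file as a base64 data URI."""
    mime = mimetypes.guess_type(str(image_path))[0] or "application/octet-stream"
    encoded = base64.b64encode(Path(image_path).read_bytes()).decode("ascii")
    return f"data:{mime};base64,{encoded}"


def embed_images(html, base_dir):
    """Replace local image paths in html with base64 data URIs."""
    def replace(match):
        src = match.group(2)
        path = Path(base_dir) / src
        if src.startswith(("http://", "https://", "data:")) or not path.is_file():
            return match.group(0)  # leave remote or missing images as they are
        return match.group(1) + to_data_uri(path) + match.group(3)

    return IMG_SRC.sub(replace, html)
```

Embedding makes the HTML file self-contained, at the cost of a file roughly one third larger than the images it carries.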

When to Use It

  • You need a storytelling piece for a tech blog that explains a paper intuitively.
  • You must produce an academic report preserving professional terminology and concepts.
  • You want a concise, high-density overview for quick internal review.
  • You have a GitHub repo linked to the paper and want integrated code analysis.
  • You require a math-heavy article with formula images and detailed symbol explanations.

Quick Start

  1. Set up your MinerU API token and install dependencies (pip install requests markdown).
  2. Run the MinerU parsing script on your PDF to generate Markdown and extract images.
  3. Choose a writing style and optional features, then generate the Markdown/HTML article.

Best Practices

  • Define the target writing style before generation to guide tone and structure.
  • Provide the paper as a clean PDF to improve parsing accuracy (images, tables, formulas).
  • Use with-formulas when equations are central to the paper's contributions.
  • If code analysis is needed, attach the related GitHub repo and choose with-code.
  • Review the Markdown/HTML output to verify embedded images render correctly.

Example Use Cases

  • Story-driven blog post explaining a machine learning paper with formula explanations.
  • Academic survey preserving methods and terminology for conference proceedings.
  • Concise quick-read overview for a research group briefing.
  • HTML article with embedded images for a conference handout or slides.
  • Code-enhanced article linking to a GitHub repo and analyzing its implementation.
