reviewing-ai-papers
Reviewing AI Papers
When users request analysis of AI/ML technical content (papers, articles, blog posts), extract actionable insights filtered through an enterprise AI engineering lens and store valuable discoveries to memory for cross-session recall.
Contextual Priorities
Technical Architecture:
- RAG systems (semantic/lexical search, hybrid retrieval)
- Vector database optimization and embedding strategies
- Model fine-tuning for specialized scientific domains
- Knowledge distillation for secure on-premise deployment
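Hybrid retrieval (the first bullet above) is often implemented by merging a lexical and a dense result list with reciprocal rank fusion. A minimal sketch follows; the document IDs, rankings, and the `k=60` constant are illustrative, not taken from any specific paper:

```python
def rrf_fuse(lexical_ranking, dense_ranking, k=60):
    """Reciprocal rank fusion: merge two ranked doc-id lists into one.

    Each document scores 1/(k + rank) per list; higher fused score wins.
    """
    scores = {}
    for ranking in (lexical_ranking, dense_ranking):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Toy rankings: one from a BM25 (lexical) pass, one from a dense-embedding pass.
lexical = ["d3", "d1", "d7"]
dense = ["d1", "d5", "d3"]
print(rrf_fuse(lexical, dense))  # → ['d1', 'd3', 'd5', 'd7']
```

Documents appearing high in both lists (here `d1` and `d3`) surface first, which is the behavior hybrid retrieval is after.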
Implementation & Operations:
- Prompt engineering and in-context learning techniques
- Security and IP protection in AI systems
- Scientific accuracy and hallucination mitigation
- AWS integration (Bedrock/SageMaker)
Enterprise & Adoption:
- Enterprise deployment in regulated environments
- Building trust with scientific/legal stakeholders
- Internal customer success strategies
- Build vs. buy decision frameworks
Analytical Standards
- Maintain objectivity: Extract factual insights without amplifying source hype
- Challenge novelty claims: Identify what practitioners already use as baselines. Distinguish "applies existing techniques" from "genuinely new methods"
- Separate rigor from novelty: Well-executed study of standard techniques ≠ methodological breakthrough
- Confidence transparency: Distinguish established facts, emerging trends, speculative claims
- Contextual filtering: Prioritize insights mapping to current challenges
Analysis Structure
For Substantive Content
Article Assessment (2-3 sentences)
- Core topic and primary claims
- Credibility: author expertise, evidence quality, methodology rigor
Prioritized Insights
- High Priority: Direct applications to active projects
- Medium Priority: Adjacent technologies worth monitoring
- Low Priority: Interesting but not immediately actionable
Technical Evaluation
- Distinguish novel methods from standard practice presented as innovation
- Flag implementation challenges, risks, resource requirements
- Note contradictions with established best practices
Actionable Recommendations
- Research deeper: Specific areas requiring investigation
- Evaluate for implementation: Techniques worth prototyping
- Share with teams: Which teams benefit from this content
- Monitor trends: Emerging areas to track
Immediate Applications
Map insights to current projects; identify quick wins or POC opportunities.
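The prioritized-insight structure above can be represented as a small record type; the field names and priority labels here are an illustrative sketch, not a prescribed schema:

```python
from dataclasses import dataclass, field

@dataclass
class Insight:
    summary: str                 # one-sentence actionable kernel
    priority: str                # "high" | "medium" | "low"
    source: str                  # title or URL for attribution
    tags: list = field(default_factory=list)

def triage(insights):
    """Order insights high → medium → low for the report skeleton."""
    order = {"high": 0, "medium": 1, "low": 2}
    return sorted(insights, key=lambda i: order[i.priority])

report = triage([
    Insight("Monitor sparse-dense fusion benchmarks", "medium", "example.org"),
    Insight("Prototype hybrid retrieval on internal corpus", "high", "example.org"),
])
print([i.priority for i in report])  # → ['high', 'medium']
```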
For Thin Content
- State limitations upfront
- Extract marginal insights if any
- Recommend alternatives if topic matters
- Keep brief
Memory Integration
Automatic storage triggers:
- High-priority insights (directly applicable)
- Novel techniques worth prototyping
- Patterns recognized across papers
- Contradictions to established practice
Storage format:
remember(
    "[Source: {title or url}] {condensed insight}",
    "world",
    tags=["paper-insight", "{domain}", "{technique}"],
    conf=0.85  # higher for strong evidence
)
Compression rule:
- Full analysis → conversation (what user sees)
- Condensed insight → memory (searchable nugget with attribution)
- Store the actionable kernel, not the whole analysis
Example:
Analysis says: "Hybrid retrieval (BM25 + dense) shows 23% improvement over pure semantic search for scientific queries. Two-stage approach..."
Store as: "[Source: arxiv.org/abs/2401.xxxxx] Hybrid BM25+dense retrieval: 23% lift over semantic-only for scientific corpora. Requires 10K+ domain examples for fine-tuning benefit."
Tags: ["paper-insight", "rag", "hybrid-retrieval", "scientific-domain"]
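The compression rule can be sketched end to end. The `remember()` stub below is hypothetical and stands in for the skill's memory API, assuming the signature shown in the storage format above:

```python
# In-memory stand-in for the skill's memory store (illustrative only).
MEMORY = []

def remember(text, scope, tags=None, conf=0.8):
    """Hypothetical stub matching the remember() signature used by this skill."""
    MEMORY.append({"text": text, "scope": scope, "tags": tags or [], "conf": conf})

def store_insight(source, kernel, domain, technique, conf=0.85):
    """Condense an analysis into an attributed, searchable memory nugget."""
    remember(
        f"[Source: {source}] {kernel}",
        "world",
        tags=["paper-insight", domain, technique],
        conf=conf,
    )

store_insight(
    "arxiv.org/abs/2401.xxxxx",
    "Hybrid BM25+dense retrieval: 23% lift over semantic-only for scientific corpora.",
    "rag",
    "hybrid-retrieval",
)
```

Only the condensed kernel and its attribution reach memory; the full analysis stays in the conversation.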
Output Standards
- Conciseness: Actionable insights, not content restatement
- Precision: Distinguish demonstrates/suggests/claims/speculates
- Relevance: Connect to focus areas or state no connection
- Adaptive depth: Match length to content value
Constraints
- No hype amplification
- No timelines unless requested
- No speculation beyond article
- Note contradictions explicitly
- State limitations on thin content
Source
git clone https://github.com/oaustegard/claude-skills (see reviewing-ai-papers/SKILL.md)
Overview
Analyzes AI/ML papers, blogs, and articles through an enterprise AI engineering lens to extract actionable insights. It filters hype, emphasizes rigor, and stores findings for cross-session recall to accelerate adoption and risk-aware decision-making.
How This Skill Works
User submits a URL or document. The system analyzes the content against enterprise priorities (RAG, embeddings, fine-tuning, prompt engineering, LLM deployment, security, and on-prem considerations). It surfaces prioritized, actionable insights and then stores a condensed kernel in memory for future recall.
When to Use It
- User provides a URL or document for AI/ML content analysis (papers, blog posts, or articles).
- User asks to 'review this paper' or extract actionable takeaways for deployment.
- Evaluating retrieval techniques and embeddings for enterprise search (RAG/semantic-lexical hybrid).
- Planning or optimizing fine-tuning, prompt engineering, or LLM deployment strategies.
- Assessing deployment, security, and governance requirements in regulated environments.
Quick Start
- Step 1: Provide a URL or upload the document to analyze.
- Step 2: Run the review to surface enterprise-focused, prioritized insights.
- Step 3: Save the condensed kernel to memory for cross-session recall.
Best Practices
- Maintain objectivity; separate hype from verified results.
- Map insights to enterprise priorities: RAG, embeddings, on-prem deployment, security.
- Flag implementation challenges, resource requirements, and operational risks.
- Differentiate well-supported results from speculative claims; note rigor vs novelty.
- Provide concrete next steps, quick-win experiments, and cross-team relevance.
Example Use Cases
- Identify a two-stage retrieval improvement (BM25 + dense) that boosts scientific-query accuracy over semantic-only approaches.
- Assess domain-specific fine-tuning strategies and their impact on accuracy for regulated industries.
- Evaluate prompt engineering and in-context learning techniques to reduce hallucinations in research summaries.
- Compare on-premise vs cloud deployment trade-offs with security/IP considerations in an enterprise setting.
- Recommend AWS Bedrock/SageMaker deployment patterns and governance controls for cross-team adoption.