c-cpp-to-lean4-translator
Scannednpx machina-cli add skill ArabelaTso/Skills-4-SE/c-cpp-to-lean4-translator --openclawC/C++ to Lean4 Translator
Overview
Transform C or C++ programs into equivalent Lean4 code that preserves the original semantics while leveraging Lean4's functional programming paradigm, strong type system, and proof capabilities.
Translation Workflow
Step 1: Analyze Input Code
Understand the C/C++ program structure and semantics:
-
Identify program components:
- Functions and their signatures
- Data structures (structs, classes, arrays)
- Control flow patterns (loops, conditionals)
- Memory management (allocation, pointers)
- I/O operations
- Dependencies and includes
-
Understand semantics:
- What does the program compute?
- What are the inputs and outputs?
- Are there side effects?
- What are the invariants and preconditions?
-
Note translation challenges:
- Pointer arithmetic
- Mutable state
- Imperative loops
- Manual memory management
- Undefined behavior
Step 2: Design Lean4 Structure
Plan the Lean4 equivalent before writing code:
-
Choose appropriate types:
Intfor signed integersNatfor unsigned integers and array indicesFloatfor floating-point numbersArrayfor dynamic arraysListfor linked lists- Custom
structuretypes for structs/classes
-
Determine purity:
- Pure functions: return values directly
- Side effects: use
IOmonad - Mutable state: use
IO.ReforSTmonad
-
Plan control flow translation:
- Loops → Recursive functions
- Mutable variables → Function parameters
- Early returns → Conditional expressions
-
Handle memory:
- Stack allocation → Direct values
- Heap allocation → Automatic memory management
- Pointers → Direct values or references
Step 3: Translate Code
Follow these translation principles:
Functions
Pattern: Pure function
// C/C++
int add(int a, int b) {
return a + b;
}
-- Lean4
def add (a b : Int) : Int :=
a + b
Pattern: Function with side effects
// C/C++
void printSum(int a, int b) {
printf("%d\n", a + b);
}
-- Lean4
def printSum (a b : Int) : IO Unit :=
IO.println (a + b)
Control Flow
Pattern: If-else
// C/C++
int max(int a, int b) {
if (a > b) return a;
else return b;
}
-- Lean4
def max (a b : Int) : Int :=
if a > b then a else b
Pattern: For loop → Tail recursion
// C/C++
int sum(int n) {
int result = 0;
for (int i = 0; i < n; i++) {
result += i;
}
return result;
}
-- Lean4
def sum (n : Nat) : Nat :=
let rec loop (i acc : Nat) : Nat :=
if i >= n then acc
else loop (i + 1) (acc + i)
loop 0 0
Pattern: While loop → Recursion
// C/C++
int factorial(int n) {
int result = 1;
while (n > 1) {
result *= n;
n--;
}
return result;
}
-- Lean4
def factorial (n : Nat) : Nat :=
let rec loop (n acc : Nat) : Nat :=
if n <= 1 then acc
else loop (n - 1) (acc * n)
loop n 1
Data Structures
Pattern: Struct
// C/C++
struct Point {
int x;
int y;
};
-- Lean4
structure Point where
x : Int
y : Int
deriving Repr
Pattern: Array
// C/C++
int arr[5] = {1, 2, 3, 4, 5};
-- Lean4
def arr : Array Int := #[1, 2, 3, 4, 5]
Pointers and References
Key principle: Lean4 doesn't have raw pointers. Translate based on usage:
- Read-only pointers: Pass by value
- Output parameters: Return values (use tuples for multiple returns)
- Mutable references: Use
IO.Refor return new values
// C/C++ - Output parameter
void swap(int* a, int* b) {
int temp = *a;
*a = *b;
*b = temp;
}
-- Lean4 - Return tuple
def swap (a b : Int) : Int × Int :=
(b, a)
Step 4: Ensure Type Correctness
Lean4's type system is strict. Address common type issues:
-
Integer types:
- Use
Natfor non-negative values (array indices, counts) - Use
Intfor potentially negative values - Convert explicitly:
n.toNat,n.toInt
- Use
-
Array bounds:
- Lean4 requires proof of valid indices
- Use safe accessors:
arr.get?,arr[i]? - Or use
arr[i]!with runtime check
-
Division:
- Natural number division:
n / m(rounds down) - Integer division: use
Int.div - Handle division by zero explicitly
- Natural number division:
-
Type annotations:
- Add explicit types when inference fails
- Use
: Typefor clarity
Step 5: Test and Verify
Ensure the translated code works correctly:
-
Compile check:
lake build -
Create test cases:
#eval add 2 3 -- Should output 5 #eval factorial 5 -- Should output 120 -
Compare outputs:
- Run original C/C++ program
- Run translated Lean4 program
- Verify outputs match for same inputs
-
Handle edge cases:
- Empty arrays
- Zero values
- Negative numbers
- Boundary conditions
Step 6: Optimize and Refine
Improve the translated code:
-
Use Lean4 idioms:
- Replace manual recursion with
List.foldl,Array.foldl - Use pattern matching instead of nested if-else
- Leverage standard library functions
- Replace manual recursion with
-
Add documentation:
/-- Calculate the sum of first n natural numbers -/ def sum (n : Nat) : Nat := n * (n + 1) / 2 -
Consider performance:
- Use tail recursion for loops
- Prefer
ArrayoverListfor random access - Use
@[inline]for small functions
Common Translation Patterns
For detailed patterns, see translation_patterns.md.
Quick Reference
| C/C++ | Lean4 |
|---|---|
int x | def x : Int |
unsigned int x | def x : Nat |
float x | def x : Float |
bool x | def x : Bool |
char* str | def str : String |
int arr[] | def arr : Array Int |
struct S | structure S where |
for (...) | let rec loop ... |
while (...) | let rec loop ... |
if (...) {...} | if ... then ... else ... |
switch (...) | match ... with |
return x | x (last expression) |
void f() | def f : IO Unit |
printf(...) | IO.println ... |
Examples
Example 1: Simple Algorithm
C/C++ Input:
int gcd(int a, int b) {
while (b != 0) {
int temp = b;
b = a % b;
a = temp;
}
return a;
}
Lean4 Output:
def gcd (a b : Nat) : Nat :=
if b = 0 then a
else gcd b (a % b)
Example 2: Array Processing
C/C++ Input:
int findMax(int arr[], int size) {
int max = arr[0];
for (int i = 1; i < size; i++) {
if (arr[i] > max) {
max = arr[i];
}
}
return max;
}
Lean4 Output:
def findMax (arr : Array Int) : Option Int :=
if arr.isEmpty then
none
else
some (arr.foldl max arr[0]!)
Example 3: Struct with Methods
C/C++ Input:
struct Rectangle {
int width;
int height;
int area() {
return width * height;
}
int perimeter() {
return 2 * (width + height);
}
};
Lean4 Output:
structure Rectangle where
width : Nat
height : Nat
deriving Repr
def Rectangle.area (r : Rectangle) : Nat :=
r.width * r.height
def Rectangle.perimeter (r : Rectangle) : Nat :=
2 * (r.width + r.height)
Example 4: I/O Program
C/C++ Input:
#include <stdio.h>
int main() {
int n;
printf("Enter a number: ");
scanf("%d", &n);
printf("Factorial: %d\n", factorial(n));
return 0;
}
Lean4 Output:
def factorial (n : Nat) : Nat :=
if n <= 1 then 1
else n * factorial (n - 1)
def main : IO Unit := do
IO.print "Enter a number: "
let input ← IO.getStdIn >>= (·.getLine)
match input.trim.toNat? with
| some n =>
IO.println s!"Factorial: {factorial n}"
| none =>
IO.println "Invalid input"
Best Practices
- Start simple: Translate basic functions first, then build up complexity
- Preserve semantics: Ensure the Lean4 code computes the same results
- Use types wisely: Leverage Lean4's type system for correctness
- Embrace immutability: Prefer pure functions over mutable state
- Test thoroughly: Verify outputs match for various inputs
- Document assumptions: Note any semantic differences or limitations
- Leverage standard library: Use built-in functions when available
- Handle errors gracefully: Use
OptionorExceptfor error cases
Limitations and Considerations
- Undefined behavior: C/C++ undefined behavior must be handled explicitly in Lean4
- Performance: Functional code may have different performance characteristics
- Concurrency: C/C++ threading requires different approaches in Lean4
- Low-level operations: Bit manipulation and pointer arithmetic need careful translation
- External libraries: C/C++ library calls may not have direct Lean4 equivalents
- Macros: C preprocessor macros need manual translation
- Templates: C++ templates translate to Lean4 generics differently
Resources
- Translation patterns: See translation_patterns.md for comprehensive pattern catalog
- Lean4 documentation: https://lean-lang.org/documentation/
- Lean4 standard library: https://github.com/leanprover/lean4/tree/master/src/Init
Source
git clone https://github.com/ArabelaTso/Skills-4-SE/blob/main/skills/c-cpp-to-lean4-translator/SKILL.mdView on GitHub Overview
Transforms C or C++ programs into Lean4 code that preserves original semantics while leveraging Lean4's functional style and strong typing. The output is intended to be well-typed, executable Lean4 code that can run and be verified. It covers functions, data structures, and imperative patterns, translating loops, structs, and arrays into Lean4 equivalents while handling pointers and side effects appropriately.
How This Skill Works
The process follows a three-step workflow: analyze input to identify components, design a Lean4 structure with appropriate types and purity, then translate code using established patterns such as pure functions, IO for side effects, tail-recursive loops, and Lean4 structures for structs. It maps C concepts to Lean4 primitives like Int, Nat, Float, Array, List, and custom structures, and converts control flow to recursion or conditional expressions. Pointers and memory management are reframed as direct values, references, or IO/state, ensuring the result remains executable and well-typed.
When to Use It
- You want to port a C or C++ algorithm to Lean4 for verification or functional style.
- You need a Lean4 version of an imperative program that preserves semantics.
- You are translating loops and mutable state into recursive Lean4 patterns.
- You want Lean4 data structures for structs and arrays to replace C equivalents.
- You seek Lean4 code that can be compiled and run with IO effects when needed.
Quick Start
- Step 1: Provide your C/C++ code snippet or small project.
- Step 2: The translator analyzes semantics, determines Lean4 types, purity, and memory mapping.
- Step 3: Receive well-typed Lean4 code ready to compile and test in your environment.
Best Practices
- Identify pure functions and side effects early to decide IO usage.
- Map C types to Lean4 types: Int, Nat, Float, Array, List, and custom structures.
- Translate loops into tail-recursive functions and mutable state into parameters or IO references.
- Handle memory and pointers by using direct values or references rather than raw pointers.
- Test translations incrementally with small, representative inputs to validate semantics.
Example Use Cases
- Translate a simple sum function from C to Lean4 and verify results.
- Convert a factorial implementation using a while loop to a tail-recursive Lean4 version.
- Port a C struct Point { int x; int y; } into a Lean4 structure and use it in calculations.
- Translate an array processing routine to Lean4 using Array and map-like patterns.
- Translate a function with an output parameter (pointer) into a Lean4 function returning a tuple.