BLADE

PyPI version Maintenance Python 3.11+ Codecov Colab Notebook BLADE framework

BLADE is a Python framework for benchmarking the Llm Assisted Design and Evolution of algorithms. BLADE (Benchmark suite for LLM-driven Automated Design and Evolution) provides a standardized benchmark suite for evaluating automatic algorithm design algorithms, particularly those generating metaheuristics by large language models (LLMs). It focuses on heuristic optimization and integrates a diverse set of problems and methods, facilitating fair and comprehensive benchmarking.

Documentation Guide

  • Start here for conceptual usage: Introduction and Quick Start.

  • For benchmark catalog and taxonomy: Benchmarks.

  • For API reference by feature area: BLADE API.

  • For interactive monitoring: Webapp.

  • Comprehensive Benchmark Suite: Covers various classes of black-box optimization problems.

  • LLM-Driven Algorithm Design: Supports algorithm evolution and design using large language models.

  • Built-In Baselines: Includes state-of-the-art metaheuristics for comparison and LLM-driven AAD algorithms.

  • Automatic Logging & Visualization: Integrated with IOHprofiler for performance tracking.

Included Benchmark Function Sets

BLADE incorporates several benchmark function sets to provide a comprehensive evaluation environment:

Name

Short Description

Number of Functions

Multiple Instances

BBOB (Black-Box Optimization Benchmarking)

A suite of 24 noiseless functions designed for benchmarking continuous optimization algorithms. Reference

24

Yes

SBOX-COST

A set of 24 boundary-constrained functions focusing on strict box-constraint optimization scenarios. Reference

24

Yes

MA-BBOB (Many-Affine BBOB)

An extension of the BBOB suite, generating functions through affine combinations and shifts. Reference

Generator-Based

Yes

GECCO MA-BBOB Competition Instances

A collection of 1,000 pre-defined instances from the GECCO MA-BBOB competition, evaluating algorithm performance on diverse affine-combined functions. Reference

1,000

Yes

HLP (High-Level Properties)

Generated benchmarks guided by high-level property combinations (e.g., separable, multimodality).

Generator-Based

Yes

In addition, several real-world applications are included.

Real World Benchmarks

Name

Description

Analysis

Auto-Correlation 1

Minimise max(g) / I^2 for non-negative signals under fixed discretisation of [-1/4, 1/4].

Auto-Correlation 2

Maximise L_2^2 / (L_1 · L_∞) for non-negative signals using discrete auto-convolution.

Auto-Correlation 3

Minimise max(||g||) / I^2 for real-valued signals with non-zero integral.

AutoML

AutoML Pipelines

Generate and evaluate machine learning pipelines using scikit-learn.

Combinatorics

Erdős Minimum-Overlap Problem

Minimise the suprenum overlap integral between complementary measurable functions.

Euclidean Steiner Tree Problem

Minimise MST(points + Steiner points) / MST(points) ratio by adding optimal Steiner nodes.

Graph Colouring Problem

Minimise the number of colours needed to colour graph nodes so adjacent nodes never share a colour.

Fourier

Fourier Uncertainty Inequality

Minimise uncertainty bound for functions of form P(x)e^{-πx²} under Hermite constraints.

Geometry

Heilbronn (Unit Triangle)

Maximise the area of the smallest triangle formed by 11 points in a unit-area triangle.

Heilbronn (Unit Convex Region)

Maximise the area of the smallest triangle formed by 13-14 points in a unit-area convex region.

Kissing Number (11D)

Maximise number of integer vectors satisfying high-dimensional kissing constraints.

Min/Max Distance Ratio

Minimise squared ratio of maximum to minimum pairwise distances (2D/3D variants).

Spherical Code

Maximise the minimum pairwise angle among 30 points on a unit sphere.

Kernel Tuner

Kernel Tuning Benchmark

Evaluate metaheuristics for hardware kernel optimisation under constraints.

Logistics

Travelling Salesman Problem

Minimise total tour distance visiting each 2D point exactly once.

Vehicle Routing Problem

Minimise total travel distance for capacitated vehicles serving weighted customers.

Matrix Multiplication via Tensor Decomposition

Tensor CP Factorisation

Find smallest CP rank enabling exact matrix multiplication under quantised factors.

Number Theory

Sums vs Differences

Maximise c(U) measuring imbalance between sumsets and difference sets.

Packing

Circle Packing

Maximise total packed circle area inside a circular container without overlap.

Hexagonal Packing

Minimise area of smallest enclosing hexagon containing disjoint regular hexagons.

Rectangle Packing

Pack disjoint circles inside a fixed-perimeter rectangle under containment constraints.

Unit Square Packing

Pack disjoint circles inside a unit square while satisfying non-overlap constraints.

These benchmarks are provided with ready-to-run instances in run_benchmarks/, while reusable benchmark definitions are organized under iohblade/benchmarks by domain.

Included Search Methods

The suite contains the state-of-the-art LLM-assisted search algorithms:

Algorithm

Description

Link

LLaMEA

Large Language Model Evolutionary Algorithm

code, paper

EoH

Evolution of Heuristics

code, paper

FunSearch

Google’s GA-like algorithm

code, paper

ReEvo

Large Language Models as Hyper-Heuristics with Reflective Evolution

code, paper

LLM-Driven Heuristics Neighbourhood Search

LLM-Driven Neighborhood Search for Efficient Heuristic Design

code, paper

Monte Carlo Tree Search

Monte Carlo Tree Search for Comprehensive Exploration in LLM-Based Automatic Heuristic Design

code, paper

Note

FunSearch is currently not yet integrated.

Supported LLM APIs

BLADE supports integration with various LLM APIs to facilitate automated design of algorithms:

LLM Provider

Description

Integration Notes

Gemini

Google’s multimodal LLM designed to process text, images, audio, and more. Reference

Accessible via the Gemini API, compatible with OpenAI libraries. Reference

OpenAI

Developer of GPT series models, including GPT-4, widely used for natural language understanding and generation. Reference

Integration through OpenAI’s REST API and client libraries.

Ollama

A platform offering access to various LLMs, enabling local and cloud-based model deployment. Reference

Integration details can be found in their official documentation.

Claude

Anthropic’s Claude models for safe and capable language generation. Reference

Accessed via the Anthropic API.

DeepSeek

Developer of the DeepSeek family of models for code and chat. Reference

Access via OpenAI compatible API at https://api.deepseek.com.

Evaluating against Human Designed Baselines

An important part of BLADE is the final evaluation of generated algorithms against state-of-the-art human-designed algorithms. In the iohblade.baselines part of the package, several well-known SOTA black-box optimizers are implemented to compare against, including but not limited to CMA-ES and DE variants.

For the final validation, BLADE uses IOHprofiler, providing detailed tracking and visualization of performance metrics.

🤖 Contributing

Contributions to BLADE are welcome! Here are a few ways you can help:

  • Report Bugs: Use GitHub Issues to report bugs.

  • Feature Requests: Suggest new features or improvements.

  • Pull Requests: Submit PRs for bug fixes or feature additions.

Please refer to CONTRIBUTING.md for more details on contributing guidelines.

License

Distributed under the MIT License. See LICENSE for more information.

Cite us

If you use BLADE in your research, please consider citing the associated paper:

TBA

Indices and tables