Benchmarks¶
BLADE includes a collection of benchmarks ranging from BBOB to Google DeepMind–inspired tasks derived from the AlphaEvolve paper. These instances are available in two complementary forms:
run_benchmarks/provides standalone reference scripts for running each task directly.iohblade/benchmarkspackages the same tasks for programmatic use in experiments and pipelines.
The packaged benchmarks are grouped by domain.
Methodology Taxonomy¶
The benchmark families in iohblade/benchmarks can also be grouped by
optimization structure.
Black-box Continuous¶
BBOBmabbobphotonics
Black-box Discrete/Mixed-Integer¶
combinatoricslogisticspackingkerneltuner
Structured Scientific Program/Search¶
analysisfouriernumber_theorymatrix_multiplicationgeometry
ML Pipeline Optimization¶
automl
List of Benchmarks¶
Analysis¶
Methodology class: Structured Scientific Program/Search.
Perform auto-correlation on a time series using the following configuration:
The domain size
Ndefines the discretisation of \([-\tfrac{1}{4}, \tfrac{1}{4}]\).Step size: \(dx = \frac{0.5}{N}\)
Auto-convolution: \(g = dx \cdot \mathrm{conv}(f, f, \text{mode='full'})\)
Riemann sum analogues:
\(I = dx \sum_i f[i]\)
\(L_1 = dx \sum_j |g[j]|\)
\(L_{2}^{2} = dx \sum_j g[j]^2\)
\(L_{\infty} = \max_j |g[j]|\)
\(\max_g = \max_j g[j]\)
\(\max_{\text{abs}_g} = \max_j |g[j]|\)
Auto-Correlation 1¶
Score: \(\frac{\max_g}{I^2}\)
Constraints: \(f \ge 0\), \(I > 0\)
Optimisation direction: minimisation
Default: \(N = 600\)
Auto-Correlation 2¶
Score: \(\frac{L_{2}^{2}}{L_1 \cdot L_{\infty}}\)
Constraints: \(f \ge 0\)
Optimisation direction: maximisation
Default: \(N = 50\)
Auto-Correlation 3¶
Score: \(\frac{\max_{\text{abs}_g}}{I^2}\)
Constraints: \(f\) real-valued, \(I \ne 0\)
Optimisation direction: minimisation
Default: \(N = 400\)
AutoML¶
Methodology class: ML Pipeline Optimization.
The AutoML benchmark evaluates LLM-generated scikit-learn pipelines on supervised learning tasks hosted on the OpenML platform.
Task source: OpenML tasks (either individual task IDs or suites such as
amlb-classification-allfrom the AutoML Benchmark).What is optimized: a single Python class implementing a pipeline with:
__init__(self, X, y, **hyperparameters)(fit once) and__call__(self, X)(predict), using only standard ML libraries (primarilyscikit-learn).Evaluation protocol: uses the official OpenML split definition (repeat/fold/sample) and the task’s default evaluation measure (e.g., accuracy, AUC, RMSE/MAE). Scores are aggregated across all official splits.
Black Box and Photonics Optimisation (BBO & Photonics)¶
Methodology class: Black-box Continuous.
Contains a set of Black Box Optimisation problems from the ioh benchmarks library.
Includes both
BBOBandmabbobbenchmark families.
Also contains a set of photonics problems, supported by black bock optimisation.
Combinatorics¶
Methodology class: Black-box Discrete/Mixed-Integer.
Erdős Minimum-Overlap Problem¶
The continuous Erdős minimum-overlap problem seeks to find measurable functions \(f, g: [-1,1] \rightarrow [0,1]\) that satisfy:
Complementarity: \(f(x) + g(x) = 1\) for all \(x \in [-1,1]\)
Unit mass: \(\int_{-1}^{1} f(x) \, dx = \int_{-1}^{1} g(x) \, dx = 1\)
Bounds: \(f(x), g(x) \in [0,1]\) for all \(x \in [-1,1]\)
And minimize the maximum overlap integral:
\[C := \sup_{x \in [-2,2]} \int_{-1}^{1} f(t) g(x+t) \, dt\]where g is extended by zero outside [-1,1].
Euclidean Steiner Tree Problem¶
Given an undirected graph, the Euclidean Steiner Tree algorithm optimises the minimum spanning tree by adding extra nodes (Steiner points) to the graph.
Adding these nodes allows for a shorter MST.
This benchmark takes a set of points, runs MST on the original points and on the points combined with Steiner points, and returns their ratio.
Optimisation goal:
\(\min \frac{\text{mst(points + steiner_points)}}{\text{mst(points)}}\)
Graph Colouring Problem¶
The Graph Colouring benchmark focuses on assigning colours to the vertices of a graph such that no two adjacent vertices share the same colour.
Input: a graph \(G = (V, E)\) with vertices \(V\) and edges \(E\).
Objective: minimise the number of colours used while ensuring a valid colouring.
- Constraints:
Each vertex receives exactly one colour.
Adjacent vertices must not share the same colour.
- Fitness:
- \[\text{fitness} = |C| \quad \text{where } C \text{ is the set of colours used}\]
Fourier¶
Methodology class: Structured Scientific Program/Search.
Fourier Uncertainty Inequality¶
This benchmark minimises the Fourier Uncertainty Inequality given by \(f(x) = P(x) e^{-\pi x^2}\) where \(r_{\max}\) is the largest positive root after which \(P(x)\) remains non-negative.
Function Class:
\[f(x) = P(x) \, e^{-\pi x^2}, \quad P(x) = \sum_{k=0}^{K-1} c[k] \, H_{4k}(x)\]\(H_n\) are physicists’ Hermite polynomials.
Evenness holds by construction (degrees 0,4,8,…).
Constraints:
\(P(0) < 0\)
Leading coefficient \(c[K-1] > 0\) (scale-invariant: any positive scaling leaves score unchanged)
Tail nonnegativity: \(P(x) \ge 0 \quad \forall x \ge r_{\max}\)
Optional numeric sanity check: \(P(x_{\max}) \ge 0\) for large \(x_{\max}\)
Geometry¶
Methodology class: Structured Scientific Program/Search.
Heilbronn Problem on a Unit-Area Triangle¶
This benchmark finds \(n = 11\) points inside a triangle of area 1 that maximize the minimum area of any triangle formed by these points.
- Score:
- \[\text{Score} = \min_{a,b,c} \Big( \max \sqrt{s (s-a)(s-b)(s-c)} \Big)\]
- where:
\(a,b,c \in \text{points}\)
\(s = \frac{a+b+c}{2}\)
\(a \ne b \ne c\)
Heilbronn Problem on a Unit-Area Convex Region¶
This benchmark finds \(n \in \{13, 14\}\) points inside a convex region of area 1 that maximize the minimum area of any triangle formed by these points.
The convex hull of the points defines the region, which is then rescaled to have area 1.
- Score:
- \[\text{Score} = \min_{a,b,c} \Big( \max \sqrt{s (s-a)(s-b)(s-c)} \Big)\]
- where:
\(a,b,c \in \text{points}\)
\(s = \frac{a + b + c}2\)
\(a \ne b \ne c\)
Kissing Number in 11 Dimensions¶
- This benchmark maximises the count of integer vectors
\(C \subset \mathbb{Z}^{11} \setminus \{0\}\)
that satisfies the kissing configuration constraint:
\[\min_{x \ne y} \|x - y\| \ge \max_{x \in C} \|x\|\]
- Score:
- \[\text{Score} = |C|\]
Min/Max Distance Ratio¶
This benchmark finds a configuration of points that minimizes the squared ratio of the maximum to minimum pairwise distances.
- Fitness function:
- \[\text{Fitness} = \left( \frac{\max_{i<j} d(i,j)}{\min_{i<j} d(i,j)} \right)^2\]
- Variants:
2D space with \(n = 16\) points
3D space with \(n = 14\) points
Spherical Code¶
This benchmark is a maximisation problem where \(n=30\) distinct points are arranged on the surface of a unit sphere.
Objective: maximize the minimum pairwise angle between any two points.
Score:
\[\text{f} = \min_{i \ne j} \theta(i,j)\]where \(\theta(i,j)\) is the angle between points \(i\) and \(j\).
Kernel Tuner¶
Methodology class: Black-box Discrete/Mixed-Integer.
This benchmark evaluates metaheuristic algorithms for hardware kernel tuning across integer and variable-dimensional search spaces with constraints.
- The algorithm is scored based on performance of the kernel using metrices like:
runtime
throughput
custom matrices.
Logistics¶
Methodology class: Black-box Discrete/Mixed-Integer.
Travelling Salesman Problem¶
Given a set of 2-D points \((x, y)\), find the sortest path connecting those points.
Optimisation Direction: Minimisation.
- Fitness:
- \[\text{Minimise} \quad \sum_{i \in V} \sum_{j \in V, j \ne i} d(i,j) \, x_{ij}\]
- Constraints:
Each customer is served exactly once.
Vehicle Routing Problem¶
Given a set of 2-D points with their weights as, \((x, y, w)\) and a count \(n\) of similar vehicles, with capacity \(c\), and a depot point \((x,y, 0)\) find the minimum travel distance.
Optimisation Direction: Minimisation
- Fitness:
- \[\text{Minimise} \quad \sum_{i \in V} \sum_{j \in V, j \ne i} d(i,j) \, x_{ij}\]
- Constraints:
Each customer is only serviced once.
For each vehicle: \(sum_{i \in v_j} w_i \le c\) where \(v_j\) is a vehicle, and \(j \in \{0,\ldots,n\}\).
Matrix Multiplication via Tensor Decomposition¶
Methodology class: Structured Scientific Program/Search.
- Given matrix dimensions:
- \[n \times m \ \cdot \ m \times p \ \longrightarrow \ n \times p\]
define a tensor
\[T \in \mathbb{R}^{(n \cdot m) \times (m \cdot p) \times (p \cdot n)}\]that encodes ordinary matrix multiplication.
- Factorisation:
- A rank-\(r\) CP decomposition of \(T\):
- \[T[i,j,k] = \sum_{\ell=1}^{r} F_1[i,\ell] \cdot F_2[j,\ell] \cdot F_3[k,\ell]\]
yields a matrix multiplication algorithm requiring only \(r\) scalar multiplications.
- Objective:
Find the smallest rank :math:`r` that allows exact reconstruction (zero error).
- Constraints:
- All entries of the factor matrices \(F_1, F_2, F_3\) must lie on a quantisation grid:
grid = 0.5, 1.0, 1.5, ...
Number Theory¶
Methodology class: Structured Scientific Program/Search.
Sums vs Differences (Single-Set Formulation)¶
This benchmark searches for a finite set
\(U \subset \mathbb{Z}^+ \cup \{0\}\)
that maximizes the following quantity:
\[c(U) = 1 + \frac{\log|U-U| - \log|U+U|}{\log(2 \max(U) + 1)}\]Purpose:
\(c(U)\) lower-bounds the exponent \(C_6\) in the inequality
\[|A-B| \gtrsim |A+B|^{C_6}\]A larger \(c(U)\) corresponds to a better lower bound on \(C_6\).
Evaluation:
The evaluator computes \(|U+U|\) and \(|U-U|\) exactly.
Implementation uses FFT convolution/correlation on the indicator function of \(U\) over the domain \([0, \max(U)]\).
Packing¶
Methodology class: Black-box Discrete/Mixed-Integer.
Contains a set of packing problems, ranging from Hexagonal Packing to Circle Packing.
Circle Packing¶
This benchmark selects and places candidate circles inside a circular container to maximize total packed area under non-overlap constraints.
Input parameters:
A circular container \(c = (x, y, r) \in \mathbb{R}^3\)
Candidate radii \(r_i \; \forall i \in \{1, \ldots, n\}\)
Tolerance \(\epsilon > 0\) (default \(10^{-12}\))
Optimisation objective:
\[O = \max \sum_{i=1}^{n} \alpha_i \, \pi r_i^2\]where:
\(\alpha_i \in \{0,1\}\) indicates whether circle \(i\) is included.
Non-overlap constraint:
\[\| (x_i, y_i) - (x_j, y_j) \| \ge r_i + r_j + \epsilon \quad \forall i \ne j \text{ with } \alpha_i = \alpha_j = 1\]Containment constraint:
\[\| (x_i, y_i) - (x, y) \| + r_i \le r + \epsilon \quad \forall i \text{ with } \alpha_i = 1\]
Only circles satisfying both constraints contribute to the objective.
Hexagonal Packing¶
- This benchmark generates \(n\) disjoint regular hexagons parameterised by
- \[(x_i, y_i, \theta_i) \quad \forall i \in \{1, \ldots, n\}\]
where \((x_i, y_i)\) is the center and \(\theta_i\) the orientation.
Disjointness condition:
Any two hexagons must be separated by at least tolerance \(\epsilon\).
- Optimisation objective:
- The evaluator computes the smallest enclosing regular hexagon
\(h\) that contains all generated hexagons.
- Goal:
- \[\text{minimise } \text{Area}(h)\]
Rectangle Packing¶
This benchmark requires the algorithm to generate \(n\) disjoint circles inside a rectangular container of perimeter \(4\), respecting a specified tolerance \(\epsilon\).
- Generated circles:
- \[(x_i, y_i, r_i) \quad \forall i \in \{1, \ldots, n\}\]
- Constraints:
- Non-overlap:
- \[\| (x_i, y_i) - (x_j, y_j) \| \ge r_i + r_j + \epsilon \quad \forall i \ne j\]
- Containment:
- \[(x - x_i)^2 + (y - y_i) = r_i^2 \text{ must lie entirely within the rectangle} \quad \forall i\]
Unit Square Packing¶
This benchmark requires the algorithm to generate \(n\) disjoint circles inside a unit-square, respecting a specified tolerance \(\epsilon\).
- Generated circles:
- \[(x_i, y_i, r_i) \quad \forall i \in \{1, \ldots, n\}\]
- Constraints:
- Non-overlap:
- \[\| (x_i, y_i) - (x_j, y_j) \| \ge r_i + r_j + \epsilon \quad \forall i \ne j\]
- Containment:
- \[(x - x_i)^2 + (y - y_i) = r_i^2 \text{ must lie entirely within the square} \quad \forall i\]