Realizing cubic isogeny primes

AI-assisted experiments

Barinder S. Banwait

Joint with Maarten Derickx

Using AI to find examples · 1st May 2026
Centre de Recherches Mathématiques, Montréal, Canada

What we really want from you is a discussion of how you used the AI to get some nice results — the ups and downs and how things played out. Sort of a mingled hacker talk with math research — I hope that makes sense?

$E/\mathbb{Q}$ elliptic curve.

Barry Mazur, 1992
G.M. Bergman, CC BY-SA 4.0
Theorem (Mazur, 1978)

If $E/\mathbb{Q}$ is an elliptic curve admitting a rational $p$-isogeny, then $$p \in \{2,\,3,\,5,\,7,\,11,\,13,\,17,\,19,\,37,\,43,\,67,\,163\}$$ $$=: \mathrm{IsogPrimeDeg}(\mathbb{Q}).$$

John Cremona
Uni Warwick
Question (Cremona, 2010)

Mazur did this for $\mathbb{Q}$ in '78 — can you do it for any other number field?

Theorem (B., 2021)

Assuming GRH, we have the following.

$$\begin{aligned} \mathrm{IsogPrimeDeg}(\mathbb{Q}(\sqrt{7})) &= \mathrm{IsogPrimeDeg}(\mathbb{Q})\\ \mathrm{IsogPrimeDeg}(\mathbb{Q}(\sqrt{-10})) &= \mathrm{IsogPrimeDeg}(\mathbb{Q})\\ \mathrm{IsogPrimeDeg}(\mathbb{Q}(\sqrt{5})) &= \mathrm{IsogPrimeDeg}(\mathbb{Q}) \cup \left\{23, 47\right\} \end{aligned}$$

Actually this is a corollary of the following.

Theorem (B., 2021)

Let $K$ be a quadratic field which is not imaginary quadratic of class number $1$. Then there is an algorithm which computes a superset of $\mathrm{IsogPrimeDeg}(K)$.

Theorem (B.–Derickx, 2022)

Let $K$ be a number field that does not contain the Hilbert class field of an imaginary quadratic field. Then there is an algorithm that computes a superset of $\mathrm{IsogPrimeDeg}(K)$.

Theorem (B.–Derickx, 2022)

Assuming GRH, we have the following.

$$\begin{aligned} \mathrm{IsogPrimeDeg}(\mathbb{Q}(\zeta_7)^+) &= \mathrm{IsogPrimeDeg}(\mathbb{Q})\\ \mathrm{IsogPrimeDeg}(\mathbb{Q}(\alpha)) &= \mathrm{IsogPrimeDeg}(\mathbb{Q}) \cup \left\{29\right\}\\ \mathrm{IsogPrimeDeg}(\mathbb{Q}(\beta)) &= \mathrm{IsogPrimeDeg}(\mathbb{Q}), \end{aligned}$$

where $\alpha^3 - \alpha^2 - 2\alpha - 20 = 0$ and $\beta^3 - \beta^2 - 3\beta + 1 = 0$.

Question
  1. How can we go from a superset of $\mathrm{IsogPrimeDeg}(K)$ to the set itself?
  2. Can we compute $\mathrm{IsogPrimeDeg}(K)$ for all cubic number fields in the LMFDB?

Why cubic? Because for every cubic field, $\mathrm{IsogPrimeDeg}(K)$ is finite.

Question

Given $K$, is $29$ an isogeny prime for $K$?

i.e. does the modular curve $X_0(29)$ admit a noncuspidal $K$-rational point?

Theorem (B.–Derickx)

Suppose $p = 23$, $29$ or $31$. Let $K$ be a cubic field such that $X_0(p)$ admits a noncuspidal $K$-rational point. Then we have the following.

  1. The discriminant $\Delta_K$ of $K$ is negative.
  2. There is a finite set of explicitly computable hyperelliptic curves over $\mathbb{Q}$ of genera $2$ or $3$ such that the quadratic twist at $\Delta_K$ of at least one of them admits a $\mathbb{Q}$-rational point.

For $p = 29$, there are 5 curves:

$$\begin{aligned} H_1 : y^2 &= -28x^8 - 368x^7 - 208x^6 + 13552x^5 + 14456x^4 \\ &\quad - 371088x^3 - 115984x^2 + 5414928x - 11303036 \\ & \\ H_2 : y^2 &= -28x^8 + 32x^7 - 2224x^6 + 10560x^5 + 46912x^4 \\ &\quad + 40960x^3 - 675072x^2 - 2022400x - 1538048 \\ & \\ H_3 : y^2 &= -4x^6 + 152x^5 - 2480x^4 + 21760x^3 - 106544x^2 \\ &\quad + 269056x - 270080 \\ & \\ H_4 : y^2 &= -108x^6 - 680x^5 + 3756x^4 + 70480x^3 + 110060x^2 \\ &\quad - 1337000x - 7539244 \\ & \\ H_5 : y^2 &= -4x^8 + 56x^7 + 16x^6 - 2304x^5 - 18368x^4 \\ &\quad - 64512x^3 - 112832x^2 - 91392x - 27392 \end{aligned}$$

$H_1, H_2, H_5$ have genus $3$ (degree $8$); $H_3, H_4$ have genus $2$ (degree $6$).
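To make the twisting condition concrete, here is a minimal Python sketch of a naive point search on a quadratic twist. It assumes the usual convention that the twist of $y^2 = f(x)$ by $d$ is $d\,y^2 = f(x)$, uses the coefficients of $H_3$ from the list above, and only looks at affine points of small height. The function names are illustrative and are not taken from the talk's code.

```python
from fractions import Fraction
from math import gcd, isqrt

# Coefficients of H_3 as listed above, highest degree first.
H3 = [-4, 152, -2480, 21760, -106544, 269056, -270080]

def genus(coeffs):
    """Genus of y^2 = f(x) for squarefree f: floor((deg f - 1) / 2)."""
    degree = len(coeffs) - 1
    return (degree - 1) // 2

def is_square(n):
    """True if the integer n is a perfect square."""
    return n >= 0 and isqrt(n) ** 2 == n

def affine_points_on_twist(coeffs, d, bound):
    """x = a/b in lowest terms, |a| <= bound, 1 <= b <= bound, such that
    d*y^2 = f(x) has a rational solution y (d a nonzero integer)."""
    degree = len(coeffs) - 1
    found = []
    for b in range(1, bound + 1):
        for a in range(-bound, bound + 1):
            if gcd(a, b) != 1:
                continue
            # Homogenised value: b^degree * f(a/b), an integer.
            value = sum(c * a ** (degree - i) * b ** i
                        for i, c in enumerate(coeffs))
            # y^2 = value / (d * b^degree) is a rational square exactly when
            # d * value * b^(degree mod 2) is an integer square.
            target = d * value * (b if degree % 2 else 1)
            if is_square(target):
                found.append(Fraction(a, b))
    return found
```

A call such as `affine_points_on_twist(H3, d, 50)` with `d` set to a discriminant $\Delta_K$ searches the corresponding twist. An empty result proves nothing: ruling out points needs the descent and Chabauty-type methods, and points at infinity are not examined here.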

Granville
ANTS torsion paper

This paper implemented many techniques for showing that twists of hyperelliptic modular curves don't admit a $\mathbb{Q}$-point (TwoCoverDescent, MWSieve, IsELS, EllipticCurveChabauty, Chabauty0).

We use this to show that, for many cubic fields $K$, $29$ is provably not an isogeny prime. We add these fields to a growing ground-truth dataset.

Ground-truth dataset

INPUT $K$ (cubic field, $\Delta_K < 0$)
Can Magma show twists of the 5 curves do not have a $\mathbb{Q}$-point?
Yes ↓
Label: 'NO'
No ↓
Does a point search up to height 50 show that $\#X_0(29)(K) > 2$?
Yes ↓
Label: 'YES'
No ↓
Label: 'UNKNOWN'
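The labelling cascade above can be sketched as one small function. The two callables are hypothetical stand-ins: the first for the Magma checks that the twists of the 5 curves have no $\mathbb{Q}$-point, the second for a point count on $X_0(29)$ over $K$ up to a given height. The threshold $2$ reflects the two rational cusps of $X_0(29)$, so any count above it means a noncuspidal point was found.

```python
from typing import Any, Callable

def label_field(
    K: Any,
    twists_have_no_rational_point: Callable[[Any], bool],
    count_points_up_to_height: Callable[[Any, int], int],
    height: int = 50,
) -> str:
    """Return 'NO', 'YES' or 'UNKNOWN' for a cubic field K with negative discriminant."""
    # Step 1: a proof that no twist has a Q-point rules out 29 as an isogeny prime.
    if twists_have_no_rational_point(K):
        return "NO"
    # Step 2: more than the two cusps means an explicit noncuspidal K-point.
    if count_points_up_to_height(K, height) > 2:
        return "YES"
    # Neither a proof nor a witness: the field stays undecided.
    return "UNKNOWN"
```

The order matters: a proven NO is checked first, and a failed point search never produces a NO, only an UNKNOWN.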

Question: Can we train an ML model to detect cubic fields that admit $29$ as an isogeny prime?

Discriminant histogram by verdict

A converse to our theorem gives a way of generating YES examples: plug rational $x$-values into one of the $5$ hyperelliptic curves. However, the discriminants obtained this way are enormous.

Artificially balancing the dataset with these converse-generated YES examples gave the following distribution:

Balanced discriminant histogram

The features we used in the dataset:

class_group, class_number, conductor, disc_abs, index, monogenic, narrow_class_group, narrow_class_number, num_ram, z1, z2, …, z30

(z1–z30 are the first 30 zeta coefficients.)

We trained (1) XGBoost, (2) Random Forest on this balanced dataset, 80/20 train-test split.

Test set (held-out 20%, 2049 rows, balanced):

| Model | Accuracy | ROC-AUC | PR-AUC |
|---|---|---|---|
| XGBoost | 0.9946 | 0.9996 | 0.9996 |
| RandomForest | 0.9932 | 0.9993 | 0.9994 |

Top features (XGBoost):

| Feature | Gain | Permutation Δ PR-AUC |
|---|---|---|
| index | 0.769 | 0.257 |
| z3 | 0.027 | 0.000 |
| z2 | 0.019 | 0.000 |
| z27 | 0.015 | |
| z6 | 0.015 | |
| class_group_rank | 0.014 | |
| z29 | 0.0004 | |
| num_ram | 0.0003 | |

What did the model actually learn?

index carries 77% of the gain, and permuting it drops PR-AUC by $0.257$: it is basically the entire signal. Why?

| | index = 1 | index > 1 |
|---|---|---|
| NO (cascade) | 5,014 (96%) | 199 (4%) |
| YES (converse) | 77 (1.5%) | 4,954 (98.5%) |
| Unknowns | 1,779 (95%) | 92 (5%) |

index is almost a perfect class indicator on the balanced training set, but the unknowns share the NO distribution (overwhelmingly index $= 1$).
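A quick way to see how much of the score is explained by index alone is to evaluate the one-line rule "predict YES exactly when index > 1" on the counts in the table above. This is only arithmetic on the reported counts, not a rerun of the experiment; the dictionary keys are illustrative.

```python
# Counts from the index table above.
counts = {
    "NO": {"index_eq_1": 5014, "index_gt_1": 199},
    "YES": {"index_eq_1": 77, "index_gt_1": 4954},
    "UNKNOWN": {"index_eq_1": 1779, "index_gt_1": 92},
}

def rule_accuracy(counts):
    """Accuracy of 'YES iff index > 1' on the labelled (NO + YES) rows."""
    correct = counts["NO"]["index_eq_1"] + counts["YES"]["index_gt_1"]
    total = sum(counts["NO"].values()) + sum(counts["YES"].values())
    return correct / total

def unknown_yes_rate(counts):
    """Fraction of unknown rows that the same rule would flag as YES."""
    unknown = counts["UNKNOWN"]
    return unknown["index_gt_1"] / sum(unknown.values())

print(f"rule accuracy on labelled rows: {rule_accuracy(counts):.4f}")
print(f"share of unknowns flagged YES:  {unknown_yes_rate(counts):.4f}")
```

The rule alone is right on about 97.3% of the labelled rows (9,968 of 10,244) and would flag only about 4.9% of the unknowns, which is the same pattern the trained models show.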

Predicted on the $1871$ unknowns: only $9$ (XGBoost) / $7$ (RF) flagged YES; median $P(\text{YES}) = 0.0001$.

The model isn't learning the mathematics; it's learning the data provenance — which is a perfect proxy for the label only because we built it that way.

The honest experiment: LMFDB-only

  • Train CSV: master_features.csv, 5,314 rows (5,213 NO + 101 YES), ~52:1 imbalance.
  • 5-fold stratified CV on labelled rows; metrics on out-of-fold predictions.
  • Three models compared: Weighted XGBoost, Focal-loss XGBoost ($\gamma = 2$), Isolation Forest (anomaly detection on NO rows).
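The slides do not say how the weighting and the focal loss were wired into XGBoost. As a rough sketch of the two ideas, assuming the usual definitions: the positive class is up-weighted by the negative-to-positive ratio (the role played by XGBoost's `scale_pos_weight` parameter), and the focal loss is $-(1 - p_t)^{\gamma} \log p_t$, where $p_t$ is the predicted probability of the true class.

```python
import math

N_NO, N_YES = 5213, 101
# Ratio of negatives to positives, roughly 51.6 for this dataset.
pos_weight = N_NO / N_YES

def weighted_log_loss(p, y, pos_weight):
    """Cross-entropy for one row, with YES rows (y = 1) up-weighted."""
    return -(pos_weight * y * math.log(p) + (1 - y) * math.log(1.0 - p))

def focal_loss(p, y, gamma=2.0):
    """Focal loss for one row: confident correct predictions are down-weighted."""
    p_t = p if y == 1 else 1.0 - p
    return -((1.0 - p_t) ** gamma) * math.log(p_t)
```

With `gamma=0` the focal loss reduces to plain cross-entropy. With `gamma=2`, a YES row predicted at $p = 0.9$ contributes only $1\%$ of its cross-entropy, so training concentrates on the hard rows.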

Results

Out-of-fold metrics across 5-fold stratified CV (101 YES across all folds):

| Model | ROC-AUC | PR-AUC | YES in top-50 | Lift |
|---|---|---|---|---|
| Weighted XGBoost | 0.969 | 0.724 | 47 / 101 | ~49× |
| Focal-loss XGBoost | 0.944 | 0.576 | 36 / 101 | ~38× |
| Isolation Forest | 0.723 | 0.097 | 10 / 101 | ~10× |

(Random would put ~$0.95$ YES rows in the top 50.)
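The lift column can be reproduced from the counts alone. The sketch below assumes lift means the number of YES rows in the top 50 divided by the number a random ranking would be expected to place there; the function names are illustrative.

```python
N_TOTAL, N_YES, TOP_K = 5314, 101, 50

def expected_random_hits(k, n_pos, n_total):
    """Expected number of positives among k rows drawn at random."""
    return k * n_pos / n_total

def lift(hits, k, n_pos, n_total):
    """Observed hits in the top k, relative to the random expectation."""
    return hits / expected_random_hits(k, n_pos, n_total)

baseline = expected_random_hits(TOP_K, N_YES, N_TOTAL)  # about 0.95
for model, hits in [("Weighted XGBoost", 47),
                    ("Focal-loss XGBoost", 36),
                    ("Isolation Forest", 10)]:
    print(f"{model}: lift = {lift(hits, TOP_K, N_YES, N_TOTAL):.1f}")
```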

  • The PR-AUC gap is decisive. ROC-AUC is forgiving under heavy imbalance (it averages over the 5,213 negatives); PR-AUC is the honest metric when YES is rare. Weighted XGBoost beats focal by $0.15$ absolute (~25% relative) on PR-AUC, and beats Isolation Forest by $0.63$ — a different planet.
  • Same ordering at every recall@$k$ we measured.
  • At top-200 (< 4% of the dataset), Weighted XGBoost captures 76% of all YES rows.
  • Isolation Forest treats YES rows as anomalies and trails far behind both supervised models, because YES cubics aren't anomalous, just structured.
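To illustrate why ROC-AUC is forgiving at this imbalance, here is a synthetic ranking (not the talk's data) with the same class sizes: 100 negatives ranked first, then all 101 positives, then the remaining 5,113 negatives. PR-AUC is approximated by average precision, which is an assumption about how the metric was computed.

```python
def average_precision(labels_ranked):
    """Mean of precision-at-rank over the positives.
    labels_ranked: 0/1 labels sorted by decreasing score, no ties."""
    hits, total = 0, 0.0
    for rank, y in enumerate(labels_ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits

def roc_auc(labels_ranked):
    """Fraction of (positive, negative) pairs with the positive ranked higher."""
    n_pos = sum(labels_ranked)
    n_neg = len(labels_ranked) - n_pos
    negatives_seen, misordered = 0, 0
    for y in labels_ranked:
        if y == 0:
            negatives_seen += 1
        else:
            # Every negative already seen outranks this positive.
            misordered += negatives_seen
    return 1.0 - misordered / (n_pos * n_neg)

# Synthetic ranking: 100 false alarms on top, then the 101 positives.
ranking = [0] * 100 + [1] * 101 + [0] * 5113
roc = roc_auc(ranking)
ap = average_precision(ranking)
```

Here `roc` equals $5113/5213 \approx 0.98$ while `ap` is only about $0.31$: one hundred false alarms at the top barely move ROC-AUC, because they are a small fraction of the 5,213 negatives, but they dominate precision.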

Top features (Weighted XGBoost)

Feature importance comparison

Top features — permutation importance

XGBoost permutation importance

disc_abs dominates ($\Delta$ PR-AUC $\approx 0.66$), with z29, class_group_max, z19, z3, regulator following.

⚠ Yellow flag: heavy disc_abs reliance — this drove the rest of the experiments.
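The permutation figures quoted above come from shuffling one feature column at a time and measuring how much the metric drops. A minimal pure-Python sketch of that procedure on a toy model follows; the toy data, the accuracy metric and all names are illustrative and are not the talk's pipeline, which reports the drop in PR-AUC.

```python
import random

def accuracy(model, rows, labels):
    """Share of rows on which the model's prediction matches the label."""
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(rows)

def permutation_importance(model, rows, labels, n_features, metric,
                           repeats=5, seed=0):
    """Average drop in the metric when feature j is shuffled across rows."""
    rng = random.Random(seed)
    baseline = metric(model, rows, labels)
    drops = []
    for j in range(n_features):
        total_drop = 0.0
        for _ in range(repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)
            # Rebuild each row with only column j replaced.
            shuffled = [r[:j] + (v,) + r[j + 1:] for r, v in zip(rows, column)]
            total_drop += baseline - metric(model, shuffled, labels)
        drops.append(total_drop / repeats)
    return drops

# Toy data: the label depends on feature 0 only; feature 1 is irrelevant.
rows = [(i, i % 7) for i in range(100)]
labels = [i >= 50 for i in range(100)]
model = lambda r: r[0] >= 50

importances = permutation_importance(model, rows, labels, 2, accuracy)
```

In the toy example the irrelevant feature gets importance exactly $0$, while shuffling the feature the model depends on costs a large share of the accuracy. A single feature with a very large drop, as with disc_abs here, is the pattern that prompts the yellow flag.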

Honest vs. cautionary

| | LMFDB-only (52:1) | Balanced + converse-engineered (1:1) |
|---|---|---|
| Test PR-AUC | 0.72 | 0.9996 |
| Distribution shift? | None | Baked in |
| Transfers to unknowns? | Yes | No (predicts NO for nearly all) |
| What was learned? | Real signal | A label proxy (index) |

The first one is the honest result. The second is the cautionary tale.