How often are there integer solutions for equations of the form *x*^{2} – *dy*^{2} = –1?

Donghyun Lim for Quanta Magazine

In the third century BCE, Archimedes posed a riddle about herding cattle that, he claimed, only a truly wise person could solve. His problem ultimately boiled down to an equation that involves the difference between two squared terms, which can be written as *x*^{2} – *dy*^{2} = 1. Here, *d* is an integer — a positive or negative counting number — and Archimedes was looking for solutions where both *x* and *y* are integers as well.

This class of equations, called the Pell equations, has fascinated mathematicians over the millennia since.

Some centuries after Archimedes, the Indian mathematician Brahmagupta, and later the mathematician Bhāskara II, provided algorithms to find integer solutions to these equations. In the mid-1600s, the French mathematician Pierre de Fermat (who was unaware of that work) rediscovered that in some cases, even when *d* was assigned a relatively small value, the smallest possible integer solutions for *x* and *y *could be massive. When he sent a series of challenge problems to rival mathematicians, they included the equation *x*^{2} – 61*y*^{2} = 1, whose smallest solutions have nine or 10 digits. (As for Archimedes, his riddle essentially asked for integer solutions to the equation *x*^{2} – 4,729,494*y*^{2} = 1. “To print out the smallest solution, it takes 50 pages,” said Peter Koymans, a mathematician at the University of Michigan. “In some sense, it is a gigantic troll by Archimedes.”)

But the solutions to the Pell equations can do much more. For instance, say you want to approximate $latex sqrt{2}$, an irrational number, as a ratio of integers. It turns out that solving the Pell equation *x*^{2} – 2*y*^{2} = 1 can help you do that: $latex sqrt{2}$ (or, more generally, $latex sqrt{d}$) can be approximated well by rewriting the solution as a fraction of the form *x*/*y*.

Perhaps even more intriguing, those solutions also tell you something about particular number systems, which mathematicians call rings. In such a number system, mathematicians might adjoin $latex sqrt{2}$ to the integers. Rings have certain properties, and mathematicians want to understand those properties. The Pell equation, it turns out, can help them do so.

And so “a lot of very famous mathematicians — almost every mathematician in some period of time — actually studied this equation because of how simple it is,” said Mark Shusterman, a mathematician at Harvard University. Those mathematicians included Fermat, Euler, Lagrange and Dirichlet. (John Pell, not so much; the equation was mistakenly named after him.)

Now Koymans and Carlo Pagano, a mathematician at Concordia University in Montreal, have proved a decades-old conjecture related to the Pell equation, one that quantifies how often a certain form of the equation has integer solutions. To do so, they imported ideas from another field — group theory — while simultaneously gaining a better understanding of a key but mysterious object of study in that field. “They used really deep and beautiful ideas,” said Andrew Granville, a mathematician at the University of Montreal. “They really nailed it.”

**Broken Arithmetic**

In the early 1990s, Peter Stevenhagen, a mathematician at Leiden University in the Netherlands, was inspired by some of the connections he saw between the Pell equations and group theory to make a conjecture about how often these equations have integer solutions. But “I didn’t expect it to be proved anytime soon,” he said — or even in his lifetime. Available techniques did not seem strong enough to attack the problem.

His conjecture depends on a particular feature of rings. In the ring of numbers where, for example, $latex sqrt{-5}$ has been added to the integers (mathematicians often work with “imaginary” numbers like $latex sqrt{-5}$), there are two distinct ways to split a number into its prime factors. The number 6, for example, can be written not just as 2 × 3, but also as (1 + $latex sqrt{-5}$) × (1 – $latex sqrt{-5}$). As a result, in this ring, unique prime factorization — a central tenet of arithmetic, one practically taken for granted in the normal integers — breaks down. The extent to which this occurs is encoded in an object associated to that ring, called a class group.

One way that mathematicians try to gain deeper insights into a number system they’re interested in — say, $latex sqrt{2}$ adjoined to the integers — is to compute and study its class group. Yet it’s almost prohibitively difficult to pin down general rules for how class groups behave across all these different number systems.

Peter Koymans in the library of the Mathematical Institute of Leiden University, where he’s a guest for the summer.

Ilaria Prosepe

In the 1980s, the mathematicians Henri Cohen and Hendrik Lenstra put forth a broad set of conjectures about what those rules should look like. These “Cohen-Lenstra heuristics” could tell you much about class groups, which in turn should reveal properties of their underlying number systems.

There was just one problem. While a lot of computations seem to support the Cohen-Lenstra heuristics, they’re still conjectures, not proofs. “As far as theorems go, until very recently we knew almost nothing,” said Alex Bartel, a mathematician at the University of Glasgow.

Intriguingly, a class group’s typical behavior is inextricably intertwined with the behavior of Pell equations. Understanding one problem helps make sense of the other — so much so that Stevenhagen’s conjecture “has also been a test problem for whatever progress has been made on the Cohen-Lenstra heuristics,” Pagano said.

The new work involves the negative Pell equation, where *x*^{2} – *dy*^{2} is set to equal −1 instead of 1. In contrast to the original Pell equation, which always has an infinite number of integer solutions for any *d*, not all values of *d *in the negative Pell equation yield an equation that can be solved. Take *x*^{2} – 3*y*^{2} = −1: No matter how far along the number line you look, you’ll never find a solution, even though *x*^{2} – 3*y*^{2} = 1 has infinitely many solutions.

In fact, there are a lot of values of *d* for which the negative Pell equation can’t be solved: Based on known rules about how certain numbers relate to one another, *d* cannot be a multiple of 3, 7, 11, 15 and so on.

But even when you avoid those values of *d *and consider only the remaining negative Pell equations, it’s still not always possible to find solutions. In that smaller set of possible values of *d*, what proportion actually works?

In 1993, Stevenhagen proposed a formula that gave a precise answer to that question. Of the values for *d* that might work (that is, values that are not multiples of 3, 7, etc.), he predicted that approximately 58% would give rise to negative Pell equations with integer solutions.

Stevenhagen’s guess was motivated in particular by the link between the negative Pell equation and the Cohen-Lenstra heuristics on class groups — a link that Koymans and Pagano exploited when, 30 years later, they finally proved him correct.

**A Better Cannon**

In 2010, Koymans and Pagano were still undergraduate students — not yet familiar with Stevenhagen’s conjecture — when a paper came out that made some of the first progress on the problem in years.

In that work, which was published in the *Annals of Mathematics*, the mathematicians Étienne Fouvry and Jürgen Klüners showed that the proportion of values of *d *that would work for the negative Pell equation fell within a certain range. To do that, they got a handle on the behavior of some elements of the relevant class groups. But they’d need an understanding of many more elements to home in on Stevenhagen’s much more precise estimate of 58%. Unfortunately, those elements remained inscrutable: Novel methods were still needed to make sense of their structure. Further progress seemed impossible.

Then, in 2017, when Koymans and Pagano were both in graduate school together at Leiden University, a paper appeared that changed everything. “When I saw this, I immediately recognized that it was a very, very impressive result,” Koymans said. “It was like, OK, now I have a cannon that I can shoot at this problem and hope that I can make progress.” (At the time, Stevenhagen and Lenstra were also professors at Leiden, which helped spark Koymans and Pagano’s interest in the problem.)

The paper was by a graduate student at Harvard, Alexander Smith (who is now a Clay fellow at Stanford). Koymans and Pagano weren’t alone in hailing the work as a breakthrough. “The ideas were amazing,” said Granville. “Revolutionary.”

Carlo Pagano, a mathematician at Concordia University in Montreal.

Olga Pagano

Smith had been trying to understand properties of solutions to equations called elliptic curves. In doing so, he worked out a specific part of the Cohen-Lenstra heuristics. Not only was it the first major step in cementing those broader conjectures as mathematical fact, but it involved precisely the piece of the class group that Koymans and Pagano needed to understand in their work on Stevenhagen’s conjecture. (This piece included the elements that Fouvry and Klüners had studied in their partial result, but it also went far beyond them.)

However, Koymans and Pagano couldn’t simply use Smith’s methods right away. (If that had been possible, Smith himself would probably have done so.) Smith’s proof was about class groups associated to the right number rings (ones in which $latex sqrt{d}$ gets adjoined to the integers) — but he considered all integer values of *d*. Koymans and Pagano, on the other hand, were only thinking about a tiny subset of those values of *d*. As a result, they needed to assess the average behavior among a much smaller fraction of class groups.

Those class groups essentially constituted 0% of Smith’s class groups — meaning that Smith could throw them away when he was writing his proof. They didn’t contribute to the average behavior that he was studying at all.

And when Koymans and Pagano tried to apply his techniques to just the class groups they cared about, the methods broke down immediately. The pair would need to make significant changes to get them to work. Moreover, they weren’t just characterizing one class group, but rather the discrepancy that might exist between two different class groups (doing so would be a major part of their proof of Stevenhagen’s conjecture) — which would also require some different tools.

So Koymans and Pagano started combing more carefully through Smith’s paper in hopes of pinpointing exactly where things started to go off the rails. It was difficult, painstaking work, not just because the material was so complicated, but because Smith was still refining his preprint at the time, making needed corrections and clarifications. (He posted the new version of his paper online last month.)

For a whole year, Koymans and Pagano learned the proof together, line by line. They met every day, discussing a given section over lunch before spending a few hours at a blackboard, helping each other work through the relevant ideas. If one of them made progress on his own, he texted the other to update him. Shusterman recalls sometimes seeing them working long into the night. In spite of (or perhaps because of) the challenges it entailed, “that was very fun to do together,” Koymans said.

They ultimately identified where they’d need to try a fresh approach. At first, they were only able to make modest improvements. Together with the mathematicians Stephanie Chan and Djordjo Milovic, they figured out how to get a handle on some additional elements in the class group, which allowed them to get better bounds than Fouvry and Klüners had. But significant pieces of the class group’s structure still eluded them.

One major problem they had to tackle — something for which Smith’s method no longer worked in this new context — was ensuring that they were truly analyzing “average” behavior for class groups as the values of *d *got larger and larger. To establish the proper degree of randomness, Koymans and Pagano proved a complicated set of rules, called reciprocity laws. In the end, that allowed them to gain the control they needed over the difference between the two class groups.

That advance, coupled with others, allowed them to finally complete the proof of Stevenhagen’s conjecture earlier this year. “It’s amazing that they were able to solve it completely,” Chan said. “Previously, we had all these issues.”

What they did “surprised me,” Smith said. “Koymans and Pagano have sort of kept my old language and just used it to push further and further in a direction that I barely understand anymore.”

**The Sharpest Tool**

From the time he introduced it five years ago, Smith’s proof of one part of the Cohen-Lenstra heuristics was seen as a way to open doors to a host of other problems, including questions about elliptic curves and other structures of interest. (In their paper, Koymans and Pagano list about a dozen conjectures they hope to use their methods on. Many have nothing to do with the negative Pell equation or even class groups.)

“A lot of objects have structures that are not dissimilar to these sorts of algebraic groups,” Granville said. But many of the same roadblocks that Koymans and Pagano had to confront are also present in these other contexts. The new work on the negative Pell equation has helped dismantle these roadblocks. “Alexander Smith has told us how to build these saws and hammers, but now we have to make them as sharp as possible and as hard-hitting as possible and as adaptable as possible to different situations,” Bartel said. “One of the things this paper does is go a great deal in that direction.”

All of this work, meanwhile, has refined mathematicians’ understanding of just one facet of class groups. The rest of the Cohen-Lenstra conjectures remain out of reach, at least for the moment. But Koymans and Pagano’s paper “is an indication that the techniques we have for attacking problems in Cohen-Lenstra are kind of growing up,” Smith said.

Lenstra himself was similarly optimistic. It is “absolutely spectacular,” he wrote in an email. “It really opens up a new chapter in a branch of number theory that is just as old as number theory itself.”