Erdos 281 solved with ChatGPT 5.2 Pro

94 points | by nl 2 hours ago

32 comments

  • xeeeeeeeeeeenu 24 minutes ago

    > no prior solutions found.

    This is no longer true; a prior solution has just been found[1], so the LLM proof has been moved to Section 2 of Terence Tao's wiki[2].

    [1] - https://www.erdosproblems.com/forum/thread/281#post-3325

    [2] - https://github.com/teorth/erdosproblems/wiki/AI-contribution...

  • doctoboggan 25 minutes ago

    Can anyone give a little more color on the nature of Erdos problems? Are these problems that many mathematicians have spent years tackling with no result? Or do some of the problems evade scrutiny and go unattempted most of the time?

    EDIT: After reading Terence Tao's wiki page, which someone else linked, I found a paragraph that somewhat answers this question:

    > Erdős problems vary widely in difficulty (by several orders of magnitude), with a core of very interesting, but extremely difficult problems at one end of the spectrum, and a "long tail" of under-explored problems at the other, many of which are "low hanging fruit" that are very suitable for being attacked by current AI tools. Unfortunately, it is hard to tell in advance which category a given problem falls into, short of an expert literature review. (However, if an Erdős problem is only stated once in the literature, and there is scant record of any followup work on the problem, this suggests that the problem may be of the second category.)

    from here: https://github.com/teorth/erdosproblems/wiki/AI-contribution...

  • sequin 34 minutes ago

    FWIW, I just gave Deepseek the same prompt and it solved it too (much faster than ChatGPT's 41 minutes). I then gave both proofs to Opus and it confirmed their equivalence.

    The answer is yes. Assume, for the sake of contradiction, that there exists an \(\epsilon > 0\) such that for every \(k\), there exists a choice of congruence classes \(a_1^{(k)}, \dots, a_k^{(k)}\) for which the set of integers not covered by the first \(k\) congruences has density at least \(\epsilon\).

    For each \(k\), let \(F_k\) be the set of all infinite sequences of residues \((a_i)_{i=1}^\infty\) such that the uncovered set from the first \(k\) congruences has density at least \(\epsilon\). Each \(F_k\) is nonempty (by assumption) and closed in the product topology (since it depends only on the first \(k\) coordinates). Moreover, \(F_{k+1} \subseteq F_k\) because adding a congruence can only reduce the uncovered set. By the compactness of the product of finite sets, \(\bigcap_{k \ge 1} F_k\) is nonempty.

    Choose an infinite sequence \((a_i) \in \bigcap_{k \ge 1} F_k\). For this sequence, let \(U_k\) be the set of integers not covered by the first \(k\) congruences, and let \(d_k\) be the density of \(U_k\). Then \(d_k \ge \epsilon\) for all \(k\). Since \(U_{k+1} \subseteq U_k\), the sets \(U_k\) are decreasing and periodic, and their intersection \(U = \bigcap_{k \ge 1} U_k\) has density \(d = \lim_{k \to \infty} d_k \ge \epsilon\). However, by hypothesis, for any choice of residues, the uncovered set has density \(0\), a contradiction.

    Therefore, for every \(\epsilon > 0\), there exists a \(k\) such that for every choice of congruence classes \(a_i\), the density of integers not covered by the first \(k\) congruences is less than \(\epsilon\).

    \boxed{\text{Yes}}
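
    To make the density notion in the proof concrete: the uncovered set U_k is periodic mod lcm(n_1, ..., n_k), so its density d_k is exact and can be computed by counting residues in a single period. Here is a minimal Python sketch (the moduli and residues are my own illustrative choice - the classic 0 mod 2, 0 mod 3, 1 mod 4, 5 mod 6, 7 mod 12 covering system - not anything taken from the problem statement):

        from math import lcm

        def uncovered_density(congruences):
            """Density of integers not covered by any x = a (mod n)."""
            # The uncovered set is periodic mod lcm(n_1, ..., n_k),
            # so counting one full period gives the exact density.
            period = lcm(*(n for _, n in congruences))
            uncovered = sum(
                all(x % n != a % n for a, n in congruences)
                for x in range(period)
            )
            return uncovered / period

        # The densities d_k are non-increasing in k, as the proof uses;
        # for a covering system they reach exactly 0.
        system = [(0, 2), (0, 3), (1, 4), (5, 6), (7, 12)]
        for k in range(1, len(system) + 1):
            print(k, uncovered_density(system[:k]))  # 0.5, 0.333..., 0.166..., 0.083..., 0.0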

      amluto 18 minutes ago

      I find it interesting that, as someone utterly unfamiliar with ergodic theory, Dini's theorem, etc., I find Deepseek's proof somewhat comprehensible, whereas I don't find GPT-5.2's proof comprehensible at all. I suspect I'd need to dig into that terminology anyway if I tried to verify Deepseek's proof, so maybe GPT's is just being more upfront about the underlying theory it relies on?

  • carbocation an hour ago

    The erdosproblems thread itself contains comments from Terence Tao: https://www.erdosproblems.com/forum/thread/281

  • redbluered an hour ago

    Has anyone verified this?

    I've "solved" many math problems with LLMs, with LLMs giving full confidence in subtly or significantly incorrect solutions.

    I'm very curious here. The OpenAI memory orders and the claims about capacity limits restricting access to better models are interesting too.

      bpodgursky an hour ago

      Terence Tao gave it the thumbs up. I don't think you're going to do better than that.

        bparsons 3 minutes ago

        It's already been walked back.

  • pessimist an hour ago

    From Terry Tao's comments in the thread:

    "Very nice! ... actually the thing that impresses me more than the proof method is the avoidance of errors, such as making mistakes with interchanges of limits or quantifiers (which is the main pitfall to avoid here). Previous generations of LLMs would almost certainly have fumbled these delicate issues.

    ...

    I am going ahead and placing this result on the wiki as a Section 1 result (perhaps the most unambiguous instance of such, to date)"

    The pace of change in math is going to be something to watch closely. Many minor theorems will fall. Next major milestone: Can LLMs generate useful abstractions?

      radioactivist 30 minutes ago

      Seems like someone dug something up from the literature on this problem (see the top comment on the erdosproblems.com thread):

      "On following the references, it seems that the result in fact follows (after applying Rogers' theorem) from a 1936 paper of Davenport and Erdos (!), which proves the second result you mention. ... In the meantime, I am moving this problem to Section 2 on the wiki (though the new proof is still rather different from the literature proof)."

  • dernett 35 minutes ago

    This is crazy. It's clear that these models don't have human intelligence, but it's undeniable at this point that they have _some_ form of intelligence.

      brendyn 15 minutes ago

      If LLMs weren't created by us but were something discovered in another species' behaviour, it would 100% be labelled intelligence.

      qudat 33 minutes ago

      My take is that a huge part of human intelligence is pattern matching. We just didn't understand how much multidimensional geometry influenced our matches.

        keeda 13 minutes ago

        Yes, it could be that intelligence is essentially a sophisticated form of recursive, brute force pattern matching.

        I'm beginning to think the Bitter Lesson applies to organic intelligence as well: basic pattern matching can be implemented with very simple mathematical operations like multiply-and-accumulate, so it can scale through massive parallelization of relatively simple building blocks.
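
        As a toy illustration of the multiply-and-accumulate point (the patterns below are made up for the example): scoring an input against stored templates is just dot products, i.e. repeated multiplies and adds, which is exactly the kind of operation that parallelizes across lots of simple hardware.

            # Toy template matching via multiply-accumulate: each score is a
            # dot product, i.e. a loop of one multiply and one add per element.
            patterns = {
                "vertical":   [1, 0, 1, 0],
                "horizontal": [1, 1, 0, 0],
            }

            def best_match(x):
                score = lambda p: sum(a * b for a, b in zip(p, x))  # dot product
                return max(patterns, key=lambda name: score(patterns[name]))

            print(best_match([1, 0, 1, 0]))  # -> "vertical"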

        sdwr 28 minutes ago

        I don't think it's accurate to describe LLMs as pattern matching. Prediction is the mechanism they use to ingest and output information, and they end up with a (relatively) deep model of the world under the hood.

          keeda 6 minutes ago

          Yes, the world-model building is achieved via pattern matching and happens during ingestion and training, but that is also part of the intelligence.

          DrewADesign 12 minutes ago

          Which is even more true for humans.

      altmanaltman 21 minutes ago

      Depends on what you mean by intelligence, human intelligence and human

      ekianjo 24 minutes ago

      It's pattern matching, which is actually what we measure in IQ tests, just saying.

        jadenpeterson 21 minutes ago

        There's some nuance. IQ tests measure pattern matching and, in an underlying way, other facets of intelligence - memory, for example. How well can an LLM 'remember' a thing? Sometimes Claude will perform compaction when its context window reaches 200k tokens, and afterwards it seems a little colder to me, but maybe that's just my imagination. I'm kind of a "power user".

        rurban 17 minutes ago

        I call it matching. Pattern matching has a different meaning.

  • a_tartaruga 37 minutes ago

    Out of curiosity, why has the LLM math-solving community focused on the Erdos problems over other open problems? Are they of a nature where we would expect LLMs to be especially good at solving them?

      krackers 26 minutes ago

      I guess they sit at a difficulty that's not too hard (unlike Millennium Prize problems), are fairly tightly scoped (unlike open-ended research), and carry some gravitas (so it's not some obscure theorem that's only unproven because of its lack of noteworthiness).

  • ashleyn 25 minutes ago

    I guess the first question I have is whether these problems solved by LLMs are just low-hanging fruit that human researchers either didn't get around to or didn't show much interest in - or whether there's some actual substance to the idea that LLMs can independently conduct original research and solve hard problems.

      dyauspitr 24 minutes ago

      There is still value in letting these LLMs loose on the periphery and knocking out all the low-hanging fruit humanity hasn't had the time to get around to. Also, I don't know this, but if it is an Erdos problem, I presume people have tried to solve it at least a little before it made it onto the list.

  • mikert89 39 minutes ago

    I have 15 years of software engineering experience across some top companies. I truly believe that AI will far surpass human beings at coding and, more broadly, at logic work. We are very close.

      anonzzzies 16 minutes ago

      HN will be the last place to admit it; people here seem to be holding out with the vague 'I tried it and it came up with crap', while many of us are shipping software without touching (much) code anymore. I have written code for over 40 years, and this is nothing like no-code or any previous 'replacing programmers' wave; it is clearly different, judging from the people who cannot code with a gun to their heads but are still shipping apps. It does not really matter if anyone believes me or not: I am making more money than ever, with fewer people than ever, delivering more than ever.

      We are very close.

      (by the way; I like writing code and I still do for fun)

      user3939382 a minute ago

      They can only code to specification, which is where even teams of humans get lost. Without much smarter architecture for AI (LLMs as-is are a joke), that needle isn't going to move.

      daxfohl 27 minutes ago

      They already do. What they suck at is common sense. Unfortunately, good software requires both.

        anonzzzies 15 minutes ago

        Most people also suck at common sense, including most programmers, hence most programmers do not write good software to begin with.

        marktl 17 minutes ago

        Or is it fortunate (for a short period, at least)?

  • ares623 an hour ago

    This must be what it feels like to be a CEO and have someone tell you they've solved coding.