I recently read this debate on Stack Exchange.

…it is hard for me to resist sharing one of my favorite examples of improving proofs with modern machinery: the Intermediate Value Theorem. This theorem is so intuitively obvious, but the proof using classical analysis involves taking a supremum of the set of … and then showing using continuity that this supremum satisfies … There are lots of \(\delta\)s and \(\varepsilon\)s and the proof feels uninspiring and technical at best.
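For reference, the classical argument being alluded to (my own sketch in standard notation; the quote above elides the specifics) runs roughly like this, for \(f\) continuous on \([a,b]\) with \(f(a) < y < f(b)\):

```latex
% Sketch of the standard supremum argument (not quoted from the debate).
% Assume f is continuous on [a,b] with f(a) < y < f(b).
\[
  c \;=\; \sup\,\{\, x \in [a,b] : f(x) \le y \,\}.
\]
% The set is nonempty (it contains a) and bounded above by b, so c exists.
% Continuity at c rules out both f(c) < y (then points slightly beyond c
% would also belong to the set) and f(c) > y (then points just below c
% would be excluded), leaving
\[
  f(c) = y .
\]
```

Chasing through those two "rules out" steps is exactly where the \(\delta\)s and \(\varepsilon\)s come in.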

Enter topology. The proof that a continuous function maps connected sets to connected sets is simple and intuitive, as is the notion of a connected set. Once this is established, the Intermediate Value Theorem is essentially just the statement that an interval is a connected set, so its image must be connected, hence again an interval containing every intermediate value. This proof captures, in my opinion, the intuition of the Intermediate Value Theorem in a precise way.

“This is really the same proof in a different language.”

How do you distinguish a different proof from the same proof in different language? My point is that the concepts simplify the language of the proof and put it in an intuitive context.

I would suggest that, at least for this theorem, what separates the messy proof from the slick one (even if they are morally the same) is that the latter maintains a clear separation between independent concepts, while the former rolls everything up into one big argument. In other words, the slick proof uses lemmas.

I think you have really hit the nail on the head. Never underestimate the value of “splitting” theorems by introducing a key definition: here, we go from the IVT to the definition of connectedness and then two results, (i) the general result using the new definition (a continuous image of a connected space is connected) and (ii) the observation that a specific structure satisfies the new definition (connectedness of intervals in \(\mathbb{R}\)). It’s amazing how many results have this form and how useful this is: e.g. Hensel’s Lemma leads to Henselian rings, and so forth.
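The decomposition just described can be written out explicitly (a sketch in standard notation, not taken from the thread):

```latex
% (i) The general lemma, stated with the new definition.
\textbf{Lemma.} If \(f : X \to Y\) is continuous and \(X\) is connected,
then \(f(X)\) is connected.

% (ii) The observation that the specific structure satisfies the definition.
\textbf{Lemma.} Every interval in \(\mathbb{R}\) is connected
(and conversely, every connected subset of \(\mathbb{R}\) is an interval).

% Composing the two recovers the theorem.
\textbf{Corollary (IVT).} If \(f : [a,b] \to \mathbb{R}\) is continuous,
then \(f([a,b])\) is an interval, so \(f\) attains every value between
\(f(a)\) and \(f(b)\).
```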

To a programmer, this entire debate would be rather silly because the “deep idea” at work is nothing more than “don’t do the same work twice”.

Here’s another idea: not a deep one but perhaps useful.

Modularizing proofs could give a benchmark for the “cleanness” of a proof: a proof is “well-done” if logically distinct ideas are presented as separate theorems, much like how logically independent tasks are broken off into separate functions in software.
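As a toy illustration of the analogy (my own sketch, not from the discussion): bisection, the constructive cousin of the IVT, factors naturally into small "lemma-like" functions, each of which can be read and checked on its own. The names here are all hypothetical.

```python
# Toy illustration: bisection as a "constructive IVT", written so that each
# helper plays the role of a small lemma, and find_crossing composes them.

def has_sign_change(f, a, b):
    """Hypothesis check: f(a) and f(b) straddle zero (or one of them is zero)."""
    return f(a) * f(b) <= 0

def midpoint(a, b):
    """The one-line 'lemma' shared by every iteration."""
    return (a + b) / 2

def find_crossing(f, a, b, tol=1e-9):
    """Return x in [a, b] with f(x) close to 0, assuming f is continuous
    and changes sign on [a, b]."""
    if not has_sign_change(f, a, b):
        raise ValueError("f(a) and f(b) must have opposite signs")
    while b - a > tol:
        m = midpoint(a, b)
        # Keep whichever half still satisfies the sign-change hypothesis.
        if has_sign_change(f, a, m):
            b = m
        else:
            a = m
    return midpoint(a, b)

# Example: sqrt(2) as the zero crossing of x^2 - 2 on [0, 2].
root = find_crossing(lambda x: x * x - 2, 0.0, 2.0)
```

The point is not the algorithm but the shape: `has_sign_change` is the hypothesis of the theorem, `find_crossing` is the proof, and the factoring makes each piece independently auditable, just as the connectedness lemmas do for the IVT.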

Yan Zhang once told me that he thought of programming as the *practice* of mathematics, while research math is the “theory” of mathematics. This is probably the best illustration of that idea that I’ve seen so far.