But this is impossible by definition
The title may seem like a contradiction. How can you differentiate something that’s not even continuous?
The usual definition of the derivative of a function $f$ at a point $x$ is $f'(x) = \lim_{h \to 0} \frac{f(x+h) - f(x)}{h}$.
If
Consider the step function, also known as the Heaviside step function. The Heaviside step function,
What’s its derivative? Forget the formal definition of the derivative for a moment, and just consider the step function and the intuitive idea of a derivative as a slope.
Outside of
Look at the step function, left to right. At 0, supposing for a moment there is a derivative, then it can’t be any standard number. It’s not 0 since it’s certainly not flat. It’s not negative since it’s increasing to the right. It’s bigger than 1, 2, 3, 10000000000…
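Since it goes from 0 to 1 in the space of a single point, the derivative would have to be infinitely big because its difference quotient across a gap of width $h$ around 0 is $\frac{1 - 0}{h} = \frac{1}{h}$, which outgrows every standard number as $h$ shrinks.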
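To see the blow-up with ordinary floating-point numbers, here is a minimal sketch. The names `step` and `quotient` are mine, and setting the step to 1 at exactly 0 is just a convention picked for the example.

```python
def step(x: float) -> float:
    """Heaviside step; the value at exactly 0 is a convention (1 here)."""
    return 1.0 if x >= 0 else 0.0


def quotient(h: float) -> float:
    """Difference quotient of the step across a gap of width h centered at 0."""
    return (step(h / 2) - step(-h / 2)) / h


# The rise is always 1, so the quotient is 1/h and grows as the gap shrinks.
for h in (1e-1, 1e-3, 1e-6):
    print(h, quotient(h))
```

No standard number survives as a candidate slope: pick any bound and a small enough gap beats it.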
The usual way mathematicians deal with the Problem is to introduce generalized functions. They are also called distributions, which is confusing since probability distributions are completely different.
The formal definition of a generalized function is: an element of the continuous dual space of a space of smooth functions.
Well, what the fuck does that mean?
Luckily, you don’t have to care. The Problem was that no real number is infinitely big, so let’s just deal with that directly. Instead of introducing some abstract dual space of smooth functions, we can just enlarge the space of numbers. You’re used to this since childhood, ever since you first learned about negative numbers, irrational numbers, and imaginary numbers. This is one more enlargement.
The reason for the Problem is that we are really thinking about a line but identifying it with the real numbers. Identifying a line with the real numbers is just that – an identification – and it’s not even the best one.
We’ll use the hyperreal numbers from the unsexily named field of nonstandard analysis to offer a radically elementary way of thinking about the problem. We will think of the line as the hyperreals rather than the reals. Consider this picture, where
Now rather than a dual space, you may think of a fundamentally nonstandard function. Its domain or range (or both) are nonstandard. And unlike a generalized “function”, it’s a bona fide function, just not over the reals [1]. For the rest of this post, I will call the usual generalized functions “generalized functions” and this concept “nonstandard functions”. This class of nonstandard functions includes the generalized functions, but is bigger. Rather than cutting down the space of functions to a smaller space, we’re enlarging the space of numbers and functions between them.
Take the Dirac delta function. It’s infinitely tall, infinitely skinny, and has area 1 under it. So its nonzero domain is infinitesimal, and its range is infinitely big. We can formalize “infinitely tall, infinitely skinny, and has area 1 under it” very literally, as you’ll see. We will also free it from living only under an integral sign, like Cauchy did when he first defined it in 1827 [2].
What does it mean to be infinitely big?
A number
Here are the hyperintegers
A world in every grain of sand
A number
0 is the only number which is both standard and infinitesimal, and it’s the smallest infinitesimal in absolute value. There is no second smallest infinitesimal, just like how there’s no smallest positive real number.
We say 2 numbers
There is a standard part function, which takes a hyperreal number and gives you the standard real number that it’s infinitely close to. Infinitely big numbers have no standard part, since they’re not infinitely close to any real number. The standard part of an infinitesimal is 0.
What’s wrong with $\infty$?
The problem with
Unlike the usual conception of infinity,
Examples of hyperfinite quantities
Here’s a video of an example since my friend Ben asked. It features some leaves since I was taking a walk.
The Number of Pieces an Integral is Cut Into
You’re probably familiar with the idea that each piece has infinitesimal width, but what about the question of ‘how MANY pieces are there?’. The answer to that is a hypernatural number. Let’s call it
The Number of Sides of a Circle
Consider a circle as a regular polygon with infinitely many sides.
In the plot below, even 100 sides is barely discernible from a true circle.
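A quick numeric sketch of the same point, using the standard formula for the perimeter of a regular polygon inscribed in a circle. The function name `polygon_perimeter` is mine.

```python
import math


def polygon_perimeter(n: int, radius: float = 1.0) -> float:
    """Perimeter of a regular n-gon inscribed in a circle of the given radius."""
    return 2 * n * radius * math.sin(math.pi / n)


circumference = 2 * math.pi
for n in (6, 100, 10_000):
    gap = circumference - polygon_perimeter(n)
    # The gap to the true circumference shrinks as the number of sides grows.
    print(n, polygon_perimeter(n), gap)
```

With a finite number of sides the gap is small but positive. With an infinite hypernatural number of sides, the same formula leaves only an infinitesimal gap.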
The Number of Colors in the Spectrum
In our everyday experience, we perceive colors as a continuous spectrum, seamlessly transitioning from one hue to another. However, when we apply nonstandard analysis, we can think of the color spectrum as being divided into
Imagine splitting the visible spectrum, say from 400 nm (violet) to 700 nm (red), into
Germ of Generality: The Step Function
Now we’ll elaborate our running example: the step function
We can model the step function dynamically or statically.
Dynamically:
But we can do it statically as well. Instead of making a sequence, why not just use ONE number? We can skip to the end of the process and just let
KEY POINT: Our nonstandard logistic function, the point of this whole post, is:
This nonstandard logistic function is appreciably indiscernible from the step function. The difference between them is “one (standard) point thick”.
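Floating point has no infinite numbers, but a finite stand-in shows the trend. This sketch assumes the nonstandard logistic function has the usual logistic form $1/(1+e^{-Nx})$, and the parameter `n` plays the role of the infinite constant (the letter is my placeholder). The branching is only there so `math.exp` never overflows.

```python
import math


def logistic(x: float, n: float) -> float:
    """Logistic curve 1 / (1 + exp(-n*x)), arranged so exp never overflows."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-n * x))
    e = math.exp(n * x)
    return e / (1.0 + e)


def step(x: float) -> float:
    """Step function away from 0 (only sampled at nonzero points below)."""
    return 1.0 if x > 0 else 0.0


samples = (-1.0, -0.01, 0.01, 1.0)
for n in (10.0, 1e3, 1e6):
    worst = max(abs(logistic(x, n) - step(x)) for x in samples)
    # The worst disagreement at these fixed nonzero points shrinks as n grows.
    print(n, worst)
```

At any fixed nonzero standard point, cranking `n` up drives the disagreement toward 0. Exactly at 0 the logistic curve sits at 1/2 no matter how big `n` is.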
The Derivative of the Nonstandard Logistic Function
KEY(ER) POINT: To take the derivative, you just, uh, take the derivative. Treat
If you want a formal definition of this “new” derivative:
The derivative of a function
where
If you did this with the usual definition of generalized functions, you wouldn’t figure out how to compute anything with them until about halfway through math grad school. Or never, since I just asked my friend Elliot Glazer and he said they never got to the actual computation, just the definition. But a motivated AP calculus high school student can do this. Helluva simplification.
DERIVATIVE:
Here’s a graph of the derivative:
Spoiler alert: it’s the Dirac delta. Which makes sense, since the delta is (nearly) 0 everywhere except at the origin, where it’s infinitely big. Which is what we expected from the intuitive analysis.
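For reference, here is the computation written out as a sketch. It assumes the nonstandard logistic function has the usual logistic form with an infinitely big positive constant, written $N$ here (the letter is my placeholder). The derivative is the ordinary chain rule with $N$ treated as a constant, and the area follows from the fundamental theorem of calculus.

```latex
\frac{d}{dx}\,\frac{1}{1+e^{-Nx}}
  = \frac{N e^{-Nx}}{\left(1+e^{-Nx}\right)^{2}},
\qquad
\int_{-\infty}^{\infty} \frac{N e^{-Nx}}{\left(1+e^{-Nx}\right)^{2}}\,dx
  = \left[\frac{1}{1+e^{-Nx}}\right]_{-\infty}^{\infty}
  = 1 - 0 = 1 .
```

So the area under the spike is 1 for any positive $N$, which is the “area 1” part of the delta.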
Case Analysis: Exploring Different Regimes
Let’s analyze how the derivative behaves across different values of
Positive appreciable (Not infinitesimal and not 0)
A number is appreciable if it’s not infinitesimal. Y’know, big enough that you could appreciate (see) it.
Let’s plug in a (standard) positive rational number for
Consider the subexpression
So we can replace
Now our expression is:
Intuitively,
Let’s examine the Taylor series of
Since
This shows that
Then the numerator
The bottom term
Putting it all together, the whole fraction is an infinitesimal number divided by a number that’s nearly 1. AKA it’s infinitesimal. So outside of 0 it looks flat.
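Here is a finite stand-in for that argument. The sketch assumes the derivative has the logistic form $N e^{-Nx}/(1+e^{-Nx})^2$, with `n` standing in for the infinite constant and `logistic_slope` being my name for it. The slope is symmetric in $x$, so the code uses $|x|$ to keep the exponential from overflowing.

```python
import math


def logistic_slope(x: float, n: float) -> float:
    """Slope of 1 / (1 + exp(-n*x)); even in x, so |x| is used to avoid overflow."""
    e = math.exp(-n * abs(x))
    return n * e / (1.0 + e) ** 2


x = 0.5  # a positive appreciable (standard rational) point
for n in (10.0, 100.0, 1000.0):
    # The exponential decay beats the factor of n, so the slope collapses.
    print(n, logistic_slope(x, n))
```

The factor of `n` out front grows, but the exponential shrinks far faster, which is the finite shadow of “infinitesimal divided by nearly 1”.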
Ditto for negative appreciable
Infinitesimal nonzero
This is a bit more complicated, since it depends on the order of
Let’s just take 1 specific value to illustrate. Consider
The original formula is:
Plugging in
So this particular value
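As one more concrete infinitesimal case, here is a worked value. The choice $x = 1/N$ is my own pick for illustration, and the sketch assumes the derivative has the logistic form with infinite constant $N$ (my placeholder letter).

```latex
x = \frac{1}{N}
\quad\Longrightarrow\quad
\frac{N e^{-N \cdot \frac{1}{N}}}{\left(1+e^{-N \cdot \frac{1}{N}}\right)^{2}}
  = \frac{N e^{-1}}{\left(1+e^{-1}\right)^{2}}
  \approx 0.1966\,N .
```

That is still infinitely big, just a smaller multiple of $N$ than the peak. An infinitesimal of a different order, such as $x = 1/\sqrt{N}$, gives $Nx = \sqrt{N}$ in the exponent, which is infinite, so the value there is infinitesimal instead.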
At
Remember that I said that 0 is the smallest infinitesimal? That means that any number (even an infinite one) times 0 is still 0. EXACTLY 0.
Exactly at
Since
This highlights something that is very difficult to even think about in the standard approach: the EXACT height of something infinitely tall. Delta functions are familiar to physicists and engineers, and they’re even familiar with the idea that the domain is infinitesimal. But the height is always treated as if it’s some sort of magic symbol called INFINITY. If you asked them ‘ok but HOW tall is it?’, they’d just say INFINITY. But here it’s not just infinite, but a specific infinite number divided by 4.
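A finite sanity check of both halves of the delta picture: the peak is exactly a quarter of the steepness, and the area stays 1. This is a sketch assuming the derivative has the logistic form, with `n` standing in for the infinite constant. The names `logistic_slope` and `area` are mine.

```python
import math


def logistic_slope(x: float, n: float) -> float:
    """Slope of 1 / (1 + exp(-n*x)); even in x, so |x| is used to avoid overflow."""
    e = math.exp(-n * abs(x))
    return n * e / (1.0 + e) ** 2


def area(n: float, half_width: float = 1.0, pieces: int = 100_000) -> float:
    """Midpoint Riemann sum of the slope over [-half_width, half_width]."""
    dx = 2 * half_width / pieces
    total = sum(logistic_slope(-half_width + (i + 0.5) * dx, n) for i in range(pieces))
    return total * dx


for n in (10.0, 100.0, 1000.0):
    # Peak height is n / 4; the area stays close to 1 as the spike narrows.
    print(n, logistic_slope(0.0, n), area(n))
```

The peak scales with `n` while the area does not budge, so “how tall is it?” has a definite answer for each `n`.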
Let me tell you, people look at you funny if you say something is infinity over 4 tall. In the standard approach,
Higher derivatives
No reason to stop at the first derivative. The logistic function is infinitely differentiable, so we can just keep taking derivatives.
The second derivative is:
This function is sometimes called the Laplacian of the indicator, or the dipole moment of a magnet.
Personally, I find the magnet picture intuitively helpful. A point charge flips from positive to negative in the space of a single point, and the closer you get, the higher the value of the magnetic potential. Get infinitely close to the magnet and the potential is infinitely big.
The graph of this one looks confusing plotted but here it is:
The higher derivatives are called multipole moments but I’ll stop at 2.
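For the record, here is the second derivative written out as a sketch, assuming the logistic form with infinite constant $N$ (my placeholder letter) and using the shorthand $\sigma = 1/(1+e^{-Nx})$, which is also my notation.

```latex
\sigma' = N\,\sigma\,(1-\sigma),
\qquad
\sigma'' = N^{2}\,\sigma\,(1-\sigma)\,(1-2\sigma)
         = \frac{N^{2}\,e^{-Nx}\left(e^{-Nx}-1\right)}{\left(1+e^{-Nx}\right)^{3}} .
```

At $x = 0$ we have $\sigma = \tfrac{1}{2}$, so $\sigma'' = 0$ exactly. The expression is positive just to the left of 0 and negative just to the right, which is the up-then-down pair of spikes in the dipole picture.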
Conclusion
Sometimes, it’s easier to solve a problem by reexamining old assumptions than by introducing heavy machinery. Often.
By using nonstandard analysis and infinite numbers, we’ve found a way to differentiate the Heaviside step function using actual functions rather than distributions. This approach offers a more intuitive understanding of discontinuous functions and their derivatives, bridging the gap between mathematical rigor and intuitive comprehension.
In the realm of nonstandard analysis, infinite numbers aren’t obstacles but stepping stones—bridges that connect the discontinuities of mathematics with the continuity of intuition.
Here is a video I made on this. And a software library. And another software library. Here’s a calculator that works with infinitesimals and infinitely big numbers.
Credit to Euler, Cauchy, Mikhail Katz.
[1] This concept can be extended far beyond the reals, but that’s a topic for another post.
[2] Differential forms are another thing nonstandard analysis can free from the tyranny of life under the yoke of the integral sign. But that’s for another post.
[3] How did I know to use the logistic function? Luck. Before that bums you out too much, keep in mind that determining whether a (standard or not) function is even continuous at a point is undecidable. That’s why no one gave you a general formula to find limits, because there isn’t one. This is just another example of that.
[4] The reason for rationals is that you can make them as close as you want to any real number, and they’re easy to work with.