Algebraic manipulation is always a fun way to make something profound seem like mere trickery.
We’ll derive importance sampling by showing that it reduces to multiplying and dividing by the same thing (the expression is unchanged, since we’ve only multiplied by 1).
We have some random variable \(x\) with PDF \(p\) that we want to take the expectation of. My notation will be sloppy except where it counts.
\[\mathbb{E}_p(x) = \int x \cdot p(x) \, dx\]We now introduce another PDF \(q\), assumed positive wherever \(p\) is, so we never divide by zero. By multiplying and dividing by it, we can get an expectation with respect to \(q\) instead of \(p\).
\[\begin{align*} \mathbb{E}_p(x) &= \int x \cdot p(x) \, dx && \text{Definition of expectation with respect to PDF } p \\ &= \int x \cdot p(x) \cdot \frac{q(x)}{q(x)} \, dx && \text{Multiply and divide by } q(x) \\ &= \int \left( x \cdot \frac{p(x)}{q(x)} \right) \cdot q(x) \, dx && \text{Move one of the } q\text{'s over and notice you have an expectation with respect to } q \\ &= \mathbb{E}_q\!\left( x \cdot \frac{p(x)}{q(x)} \right) && \text{Rejoice} \\ \end{align*}\]Thankfully, remembering the trick makes the result easy to re-derive. I never remember it, and I had to do this derivation twice in the ten minutes it took to write this.
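The derivation also tells you exactly how to compute things: draw samples from \(q\), then average \(x \cdot p(x)/q(x)\). Here's a quick numerical sketch (the specific distributions are just an illustration I picked, not anything from above): we estimate \(\mathbb{E}_p(x)\) for \(p = \mathcal{N}(2, 1)\), which is 2, using samples drawn from \(q = \mathcal{N}(0, 2)\).

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

# Target p = N(2, 1); we want E_p[x] = 2.
# Proposal q = N(0, 2), chosen so its support covers p's.
p = norm(loc=2.0, scale=1.0)
q = norm(loc=0.0, scale=2.0)

n = 100_000
samples = rng.normal(0.0, 2.0, size=n)     # draw from q, not p
weights = p.pdf(samples) / q.pdf(samples)  # importance weights p(x)/q(x)

# E_q[x * p(x)/q(x)] ≈ E_p[x]
estimate = np.mean(samples * weights)
print(estimate)  # close to 2.0
```

Nothing constrains the choice of \(q\) beyond the positivity requirement, but in practice a \(q\) that poorly matches \(p\) makes the weights wildly uneven and the estimate noisy.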