PSA: reward is part of the habit loop too

The usual setup of a habit is:

cue
craving
routine
reward

Among my social circle, rewarding oneself seems to have fallen by the wayside. That's sadly funny for a crowd of people into reinforcement learning: they’re trying to skip the reinforcement and wondering why the habit doesn’t stick¹. My reward is usually reading fiction or playing a video game. For brushing my teeth, it’s the nice shiny feeling at the end, which I make a point of noticing. For going to the gym, it’s the steam room after.

Figure out what you like. Anything can be a reward as long as it feels good. Don’t try to logic yourself into wanting “the right rewards”. Has that worked before? Surrender. Let the soft animal of your body love what it loves.

¹ RL without the reward is just a Markovian environment (states, actions, transitions). It has a syntax (things are happening) but no semantics: without a reward function, there’s no interpretation of whether something is good or bad.
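
To make the footnote concrete, here is a minimal sketch of a reward-free environment. Everything in it is made up for illustration: the states, the two policies, and the `steam_reward` function are my assumptions, not any library's API. Without a reward, the two policies produce different trajectories but there is nothing to rank them by; add a reward and one of them becomes "better".

```python
# Hypothetical toy environment: states, actions, and transitions only.
STATES = ["couch", "gym", "steam_room"]
ACTIONS = ["stay", "go"]

# Deterministic transitions: (state, action) -> next state.
TRANSITIONS = {
    ("couch", "stay"): "couch",
    ("couch", "go"): "gym",
    ("gym", "stay"): "gym",
    ("gym", "go"): "steam_room",
    ("steam_room", "stay"): "steam_room",
    ("steam_room", "go"): "couch",
}


def rollout(policy, start="couch", steps=6):
    """Follow a policy and return a list of (state, action, next_state)."""
    state, trajectory = start, []
    for _ in range(steps):
        action = policy(state)
        next_state = TRANSITIONS[(state, action)]
        trajectory.append((state, action, next_state))
        state = next_state
    return trajectory


def total_return(trajectory, reward=None):
    """Sum rewards over a trajectory, or None if there is no reward function."""
    if reward is None:
        return None  # syntax only: nothing says whether this went well
    return sum(reward(s, a, s2) for s, a, s2 in trajectory)


def lazy(state):
    return "stay"


def active(state):
    return "go"


def steam_reward(state, action, next_state):
    """Assumed reward: +1 each time you arrive in the steam room."""
    return 1.0 if next_state == "steam_room" else 0.0


if __name__ == "__main__":
    for policy in (lazy, active):
        traj = rollout(policy)
        print(policy.__name__, total_return(traj), total_return(traj, steam_reward))
```

The transitions are identical in both calls to `total_return`; only the presence of a reward function turns "things happened" into "that went well".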
