PSA: reward is part of the habit loop too

The usual setup of a habit is

  1. cue
  2. craving
  3. routine
  4. reward

Among my social circle, rewarding oneself seems to have fallen by the wayside. That’s sadly funny for a crowd of people into reinforcement learning, since they’re trying to skip the reinforcement and wondering why the habit doesn’t stick[1]. My reward is usually reading fiction or playing a video game. For brushing my teeth, it’s the nice shiny feeling at the end, which I make a point of noticing. For going to the gym, it’s the steam room after.

Figure out what you like. Anything can be a reward as long as it feels good. Don’t try to logic yourself into wanting “the right rewards”. Has that worked before? Surrender. Let the soft animal of your body love what it loves.

  1. RL without the reward is just a Markovian environment (states, actions, transitions). It has a syntax (things are happening) but no semantics: without a reward function, there’s no interpretation of whether something is good or bad.
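
To make the footnote concrete, here is a minimal Python sketch. Everything in it is a made-up toy for illustration (the state names, the `TRANSITIONS` table, and `steam_room_reward` are all assumptions, not anything from an RL library). The transition table alone lets things happen; only once a reward function is supplied can two trajectories be compared.

```python
from typing import Callable, Dict, List, Tuple

State = str
Action = str
Step = Tuple[State, Action, State]

# Hypothetical toy environment: deterministic transitions, no reward anywhere.
TRANSITIONS: Dict[Tuple[State, Action], State] = {
    ("cue", "do_routine"): "routine_done",
    ("cue", "skip"): "routine_skipped",
}


def rollout(start: State, actions: List[Action]) -> List[Step]:
    """Follow the transition table. This is the 'syntax': things happen."""
    steps: List[Step] = []
    state = start
    for action in actions:
        next_state = TRANSITIONS[(state, action)]
        steps.append((state, action, next_state))
        state = next_state
    return steps


def total_reward(steps: List[Step],
                 reward: Callable[[State, Action, State], float]) -> float:
    """A reward function supplies the 'semantics': a score for a trajectory."""
    return sum(reward(s, a, s2) for s, a, s2 in steps)


def steam_room_reward(state: State, action: Action, next_state: State) -> float:
    """Assumed reward for illustration: finishing the routine feels good."""
    return 1.0 if next_state == "routine_done" else 0.0


def no_reward(state: State, action: Action, next_state: State) -> float:
    """The skipped-reinforcement case: every outcome scores the same."""
    return 0.0


did_it = rollout("cue", ["do_routine"])
skipped = rollout("cue", ["skip"])
```

Under `no_reward`, `did_it` and `skipped` get the same score, so nothing favors doing the routine. Under `steam_room_reward`, `did_it` scores higher, which is the whole point of the reward step.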
