PSA: reward is part of the habit loop too

The usual setup of a habit is a four-step loop:

cue
craving
routine
reward

In my social circle, rewarding oneself seems to have fallen by the wayside. That’s sadly funny for a crowd of people into reinforcement learning: they’re trying to skip the reinforcement and wondering why the habit doesn’t stick¹. My reward is usually reading fiction or playing a video game. For brushing my teeth, it’s the nice shiny feeling at the end, which I make a point of noticing. For going to the gym, it’s the steam room after.

Figure out what you like. Anything can be a reward as long as it feels good. Don’t try to logic yourself into wanting “the right rewards”. Has that worked before? Surrender. Let the soft animal of your body love what it loves.

¹ RL without the reward is just a Markovian environment (states, actions, transitions). It has a syntax (things are happening), but no semantics (without a reward function, there’s no interpretation of whether something is good or bad).
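To make the footnote concrete, here is a minimal Python sketch. Everything in it is made up for illustration: the two states, the two actions, the `TRANSITIONS` table, and the helpers `rollout` and `score` are all hypothetical. Without a reward function the two policies just produce different trajectories, and there is nothing to rank them by. Once a reward is supplied, one of them is better.

```python
# Hypothetical toy environment: states, actions, transitions, and no reward.
TRANSITIONS = {
    ("idle", "brush"): "clean_teeth",
    ("idle", "skip"): "idle",
    ("clean_teeth", "brush"): "clean_teeth",
    ("clean_teeth", "skip"): "idle",
}


def rollout(policy, start="idle", steps=5):
    """Follow a policy (a function from state to action) and record the states visited."""
    state = start
    trajectory = [state]
    for _ in range(steps):
        state = TRANSITIONS[(state, policy(state))]
        trajectory.append(state)
    return trajectory


def score(trajectory, reward=None):
    """Sum the reward over a trajectory, or return None if there is no reward function."""
    if reward is None:
        return None  # syntax only: things happened, but nothing is good or bad
    return sum(reward(state) for state in trajectory)


def always_brush(state):
    return "brush"


def always_skip(state):
    return "skip"


def shiny_feeling(state):
    """Assumed reward for this sketch: 1.0 whenever teeth are clean."""
    return 1.0 if state == "clean_teeth" else 0.0


brush_run = rollout(always_brush)
skip_run = rollout(always_skip)

# No reward: both runs are just sequences of states.
print(score(brush_run), score(skip_run))
# With a reward, the two policies can finally be compared.
print(score(brush_run, shiny_feeling), score(skip_run, shiny_feeling))
```

The reward here is assigned per state for simplicity. A reward on (state, action) pairs would make the same point.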
