Reflections on Agents and RL

Value Functions as the Embodiment of Human Long-Termism

We now have a concise definition of an agent: "AI that can independently perform tasks, make decisions, and execute plans." (See image below—allegedly OpenAI's internal roadmap to AGI, though I couldn't verify the source).
[Image: AI Development Levels]

From my understanding, an agent is, at its core, a generative model wrapped with reinforcement learning: a model that, beyond optimizing for next-token prediction, also incorporates a value function to guide its actions.
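
To make this concrete, here is a minimal sketch of that framing (my own illustration, not taken from any particular agent stack): `lm_propose_actions` and `value_estimate` are hypothetical stand-ins for a pretrained language model and a learned value model. The generative model proposes candidate actions, and the value function picks the one with the highest estimated long-term return.

```python
import random

# Hypothetical stand-ins: in a real agent these would be a pretrained
# language model and a value model learned via reinforcement learning.
def lm_propose_actions(state, n=4):
    """Sample n candidate actions (tool calls, plans, or text continuations)."""
    return [f"candidate_{i}_for_{state}" for i in range(n)]

def value_estimate(state, action):
    """Estimate the long-term return of taking `action` in `state`."""
    return random.random()  # placeholder for a learned Q(s, a)

def agent_step(state):
    """Generate with the model, then let the value function guide the choice."""
    candidates = lm_propose_actions(state)
    return max(candidates, key=lambda a: value_estimate(state, a))

print(agent_step("draft_reply_to_user"))
```

Next-token prediction alone would stop at the proposal step; the value function is what turns generation into decision-making.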

This framework makes sense to me, at least for now, so let's run a quick thought experiment:
Could we categorize human cognitive abilities in a similarly progressive way?

If we temporarily set aside the social roles and significance of different cognitive styles and instead focus solely on intellectual competence, we might get a hierarchy like this:

  • To become an organizer, one must first be an innovator.
  • To be an innovator, one must first function as an agent.
  • To be an agent, one must first be a reasoner.

This idea emerged from a recent personal realization.

Lately, I've been reassessing my decision-making framework, realizing that I never defined a clear value function for myself.

I've been navigating between short-term external feedback loops and scattered reward signals (such as periodic reflections and project-based feedback). But these are, at best, immediate reward functions. A major sticking point is that I haven't explicitly defined what larger system I'm optimizing for. (There are probably other reasons too—I just can't quite analyze them yet. But I suspect that a tendency to people-please has sometimes overridden my pursuit of what I actually want, reinforced by habit and a generally supportive environment.)

This week, I removed certain social media platforms that tend to manipulate self-perception and replaced them with information-dense alternatives. The change was immediately noticeable: my perspective broadened significantly. And, prompted by external pressures, I've started re-evaluating the importance of long-term planning and gradually shaping more structured value functions for myself.

Now, what sets a value function apart from these immediate reward signals, and why does it matter?

Unlike an emergent skill or a reactive mechanism that simply responds to external stimuli or new prompts, a value function provides a top-down, long-term guiding principle—something akin to how the frontal lobe performs expected value calculations to restrict decision-making pathways.
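
As a toy numerical illustration (the rewards and discount factor below are made up): a purely reactive policy scores options by immediate reward alone, while a value-guided policy adds the discounted long-term payoff, which can reverse the decision.

```python
GAMMA = 0.95  # discount factor: how much future payoff counts today

# Made-up numbers: "scroll" feels good now, "study" pays off later.
immediate_reward = {"scroll": 1.0, "study": -0.2}
future_value = {"scroll": 0.0, "study": 10.0}  # assumed value of the resulting state

def q_value(action):
    """One-step Bellman-style estimate: immediate reward + discounted future value."""
    return immediate_reward[action] + GAMMA * future_value[action]

for action in ("scroll", "study"):
    print(action, round(q_value(action), 2))
# Reactive choice (max immediate_reward): "scroll".
# Value-guided choice (max q_value): "study".
```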

Interestingly, one might assume that an objective function should be the true definition of "long-termism." But for humans, the species-wide objective function is likely just survival and reproduction, which is far too coarse to guide any individual's day-to-day decisions. This means that at the individual level, it's actually the value function, the one we actively define for ourselves, that becomes the real embodiment of personal long-termism.