A utility-based agent in artificial intelligence is the answer to a problem that simpler AI agents cannot solve: what do you do when multiple actions all technically achieve your goal, but some achieve it far better than others? Reflex agents react. Goal-based agents reach targets. A utility-based agent in AI does something more sophisticated: it measures how good each possible outcome actually is and picks the best one available given the current situation.
This distinction matters more than it might look. In any real deployment environment, trade-offs are constant. A self-driving car does not just need to reach a destination; it needs to do so safely, quickly, and legally, all at the same time. A medical diagnostic system does not just need to flag a condition; it needs to weigh diagnostic accuracy against treatment risk and patient history. These are not goal-satisfaction problems. They are optimisation problems, and utility-based agents are built to handle exactly that.
Comprehensive Summary
- Utility-Based Agent Definition: A utility-based agent in AI picks actions by scoring every possible outcome and choosing the one with the highest utility value, not just the one that meets a binary goal.
- Utility Function in Artificial Intelligence: The utility function U(s) assigns a numerical value to each state, letting the agent compare outcomes that involve trade-offs rather than clear right-or-wrong choices.
- Architecture of Utility-Based Agents: Every utility-based agent runs on five components: sensors, a world model, a utility function, a reasoning engine, and actuators that carry out the selected action.
- Utility Theory in Artificial Intelligence: Rooted in economics and decision theory, utility theory gives AI agents a principled way to reason under uncertainty by maximising expected utility across probabilistic outcomes.
- Applications of Utility AI Agents: Autonomous vehicles, healthcare diagnostics, financial trading systems, and e-commerce recommendation engines all use utility-based reasoning in production today.
Key Takeaways
- A utility-based agent in AI scores outcomes on a spectrum rather than pass-or-fail, which is what makes it actually useful when real-world trade-offs are involved.
- Getting the utility function wrong (bad weights, missing dimensions, ignored edge cases) produces an agent that optimises confidently for the wrong thing.
- The field is shifting toward agents that learn their utility functions from human feedback, cutting the manual design work and getting closer to what people actually want.
Understanding Utility-Based Agents: Definition and Core Concepts
Before getting into architecture and applications, it helps to be clear about what a utility-based agent actually is and how it sits within the broader landscape of AI agent types. The core idea is deceptively simple: assign a number to every possible state of the world, and always move toward the state with the highest number. The complexity lies in designing that number to accurately reflect what you actually want.
What Is a Utility-Based Agent in AI? Plain-Language Definition
A utility-based agent in AI is an autonomous system that selects actions by evaluating the desirability of every outcome those actions could produce. It does this using a utility function, which maps each possible world state to a numerical score. The agent then picks the action most likely to lead to the highest-scoring state.
Where a goal-based agent asks, “Does this action get me to the goal?” a utility-based agent asks, “How good is this outcome compared to every other outcome I could reach?” That shift from binary to continuous evaluation is what makes utility-based agents better suited to environments where good enough is not the same as best possible.
Primary Components of a Utility-Based Agent: Sensors, Models, and Actuators
Every utility-based agent is built around five components working in sequence. Sensors read the environment and feed raw data into the system. A world model processes that data and builds an internal representation of the current state. The utility function scores possible future states. A reasoning engine selects the action predicted to maximise utility. Actuators then carry out that action in the physical or digital environment.
These components are not optional layers. Remove any one of them, and the agent either cannot perceive its environment, cannot evaluate outcomes, or cannot act. In production systems, the world model and utility function are typically the components that require the most engineering effort to get right.
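As a rough illustration of how the five components fit together, here is a minimal Python sketch. The thermostat scenario, the class name `ThermostatAgent`, the target temperature, and every number in it are hypothetical, chosen only to keep the example small.

```python
from dataclasses import dataclass, field

TARGET_TEMP = 21.0  # hypothetical comfort target in degrees Celsius


@dataclass
class ThermostatAgent:
    """Toy agent: sensor -> world model -> utility -> reasoning -> actuator."""
    believed_temp: float = 18.0               # world model: current state estimate
    log: list = field(default_factory=list)   # stands in for a real actuator

    def sense(self, reading: float) -> None:
        # Sensor + world model update: blend the new reading with the old belief.
        self.believed_temp = 0.5 * self.believed_temp + 0.5 * reading

    def predict(self, action: str) -> float:
        # World model: predicted temperature after each action (assumed effects).
        effect = {"heat": 1.5, "idle": 0.0, "cool": -1.5}[action]
        return self.believed_temp + effect

    def utility(self, temp: float, action: str) -> float:
        # Utility function: closeness to target minus a small energy cost.
        energy_cost = 0.0 if action == "idle" else 0.2
        return -abs(temp - TARGET_TEMP) - energy_cost

    def choose(self) -> str:
        # Reasoning engine: pick the action whose predicted state scores highest.
        return max(("heat", "idle", "cool"),
                   key=lambda a: self.utility(self.predict(a), a))

    def act(self, action: str) -> None:
        # Actuator: in this sketch we only record the command.
        self.log.append(action)


agent = ThermostatAgent()
agent.sense(17.0)        # a cold reading arrives
action = agent.choose()
agent.act(action)
print(action)            # prints "heat"
```

Each method maps to one component, so removing any of them breaks the loop in the way the paragraph above describes.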
How Utility-Based Agents Differ From Reflex and Model-Based Agents
A simple reflex agent maps percepts directly to actions using if-then rules. It has no memory and no model of the world. A model-based agent improves on this by maintaining an internal state that tracks the world over time, but it still chooses actions with condition-action rules. Even a goal-based agent, which plans toward an explicit target, judges actions only by whether they achieve the goal, not by how well they achieve it.
A utility-based agent goes further on both dimensions. It maintains a world model like a model-based agent, but it also evaluates the quality of each reachable state using a utility function. The result is an agent that can handle uncertainty, compare imperfect options, and make principled trade-offs, three things that simpler agent types cannot do reliably.
Utility Theory in Artificial Intelligence: Foundations and Rational Choice
Utility theory did not start in an AI lab. Economists and decision theorists were wrestling with it centuries before anyone built a computer, trying to answer one question: how does a rational agent choose between uncertain outcomes? The answer they landed on was not “follow a rulebook.” It was “assign a value to every possible outcome and pick the one with the highest value.” AI borrowed this framework wholesale because it solves exactly the same problem that agent designers face: how do you get a system to make sensible choices when the environment is messy, and outcomes are never guaranteed?
The Utility Function in Artificial Intelligence: Mathematical Definition U(s)
The utility function in artificial intelligence is written as U(s), where s is any possible state of the world and U(s) is the number that says how good that state is. The higher the number, the more the agent prefers that state. Every decision the agent makes comes down to one rule: take the action that leads to the highest U(s).
Most real problems involve more than one thing the agent cares about, so the utility function expands into a weighted sum across multiple factors:
U(s) = w₁·f₁(s) + w₂·f₂(s) + … + wₙ·fₙ(s)
Each fᵢ(s) measures something concrete about that state. In an autonomous vehicle, f₁ might be safety margin, f₂ the travel time, and f₃ the fuel consumption. Features that represent costs, such as travel time and fuel consumption, take negative weights (or are inverted) so that a higher U(s) always means a better state. The w values are the weights, and they carry all the judgement calls: how much does speed matter relative to safety? How much does fuel efficiency matter relative to both?
Getting those weights right is genuinely hard. In most production deployments of utility function AI, it is where the real engineering argument happens.
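To make the weighted sum concrete, here is a small Python sketch. The feature names, the weights, and the two routes are illustrative assumptions, and the features are assumed to be pre-scaled to the range 0 to 1.

```python
def utility(state: dict, weights: dict) -> float:
    """U(s) = sum of w_i * f_i(s); features are assumed scaled to [0, 1]."""
    return sum(weights[name] * state[name] for name in weights)


# Hypothetical weights: costs (time, fuel) carry negative weights.
weights = {"safety_margin": 0.6, "travel_time": -0.3, "fuel_use": -0.1}

route_a = {"safety_margin": 0.9, "travel_time": 0.7, "fuel_use": 0.5}
route_b = {"safety_margin": 0.6, "travel_time": 0.4, "fuel_use": 0.4}

u_a = utility(route_a, weights)   # 0.54 - 0.21 - 0.05 = 0.28
u_b = utility(route_b, weights)   # 0.36 - 0.12 - 0.04 = 0.20
best = "route_a" if u_a > u_b else "route_b"
```

With these weights the slower but safer route wins. The outcome depends entirely on the weights, which is why choosing them is the hard part.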
Expected Utility and Probabilistic Reasoning in Utility Function AI
Real environments are not deterministic. An action does not always produce the same outcome. Utility function AI handles this through expected utility: rather than evaluating a single outcome, the agent computes a probability-weighted average of utilities across all possible outcomes that an action might produce.
The formula is EU(a) = Σ P(s’|s,a) · U(s’), summed over every state s’ the action could lead to, where P(s’|s,a) is the probability of reaching state s’ by taking action a from state s. The agent selects the action with the highest expected utility. This allows it to reason sensibly under uncertainty, preferring a reliable moderate outcome over a gamble whose high payoff does not compensate for its high probability of failure.
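A short worked example of the expected utility formula, sketched in Python. The two actions and their probabilities and utilities are invented for illustration.

```python
def expected_utility(outcomes):
    """EU(a) = sum over s' of P(s'|s,a) * U(s')."""
    return sum(p * u for p, u in outcomes)


# Each action maps to (probability, utility) pairs; numbers are illustrative.
actions = {
    "reliable": [(0.9, 60.0), (0.1, 40.0)],
    "gamble":   [(0.2, 200.0), (0.8, 0.0)],
}

# Sanity check: each action's outcome probabilities must sum to 1.
for outcomes in actions.values():
    assert abs(sum(p for p, _ in outcomes) - 1.0) < 1e-9

eu = {name: expected_utility(outcomes) for name, outcomes in actions.items()}
best_action = max(eu, key=eu.get)   # "reliable": 58.0 beats 40.0
```

The reliable action wins here only because of the numbers. If the gamble succeeded 40% of the time, its expected utility would be 80 and it would be chosen, so any built-in caution has to come from how the utilities themselves are defined.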
How Utility Theory in Artificial Intelligence Enables Trade-Off Decision Making
The reason utility theory in artificial intelligence is so useful is that it gives the agent a single consistent framework for comparing outcomes that differ across multiple dimensions simultaneously. Without it, an agent faced with two options, one faster but riskier, one slower but safer, has no principled way to choose.
With a utility function, the trade-off becomes computable. The agent converts speed and safety into utility scores, applies the relevant weights, and picks the option with the higher total. This mirrors the decision-theoretic model of rational choice under uncertainty, and it is the theoretical foundation that makes utility-based agents more powerful than rule-following or goal-satisfying alternatives.
How a Utility-Based Agent in AI Works: Architecture and Decision Workflow
The decision process inside a utility-based agent in AI follows a repeating loop: perceive, model, evaluate, act, repeat. Each pass through the loop refines the agent’s understanding of its environment and produces one action. In fast-moving environments, this loop may run hundreds of times per second. In slower domains like financial analysis or treatment planning, a single pass may take several minutes of computation.
Step 1 – Perception: How the Utility-Based Agent Reads Its Environment
Perception is where the agent gathers data from its environment through sensors. In a physical robot, those sensors are cameras, lidar, microphones, and tactile inputs. In a software agent, sensors are data feeds, API responses, database queries, and user inputs.
The raw sensor data is noisy and incomplete. The agent’s first job is to filter, aggregate, and interpret that data into a usable representation of the current state. The quality of this perception layer directly limits how well the agent can evaluate outcomes. A utility function applied to bad state data will still produce bad decisions.
Step 2 – Internal Model Update and Outcome Prediction Using Utility Function AI
Perception gives the agent raw data. What it does next is build a picture of the world from that data and keep that picture current. The internal model is not just a snapshot of right now; it holds a running record of what actions were taken, how the environment responded, and what patterns have emerged over time. That history is what lets the agent make sensible predictions rather than treating every moment as if nothing came before it.
Once the model is updated, the agent runs through its candidate actions one by one and asks: if I do this, what happens next? In a predictable environment, that question has a clean answer, and the agent maps each action to a single predicted state. In an uncertain environment, the same action might lead to several different states with different probabilities, so the agent builds a distribution of likely outcomes instead. Either way, what comes out of this step is a set of predicted future states ready to be scored.
Step 3 – Action Selection: Maximising the Utility Function in Artificial Intelligence
Now the utility function in artificial intelligence does its job. The agent takes every predicted state from the previous step and runs it through U(s’) to get a score. In probabilistic settings, it computes EU(a), the expected utility of each action: the utility of each outcome that action might produce, weighted by its probability. The action with the highest score wins.
Straightforward in principle, but one situation trips up agents that are not carefully designed: two actions score nearly identically. The utility function alone gives no guidance on how to break that tie. In practice, designers add a secondary rule: prefer the cheaper action, prefer the one whose consequences can be undone, or prefer the one with less variance in outcomes. None of this comes automatically from the utility function itself. It has to be built in deliberately, and skipping it produces agents that behave arbitrarily whenever the margin between options is small.
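One way such a tie-breaking rule could look is sketched below. The `epsilon` threshold, the ordering of the secondary rules, and the candidate actions are all assumptions made for the example, not a standard recipe.

```python
def select_action(candidates, epsilon=0.01):
    """Pick the highest expected utility; break near-ties with secondary rules.

    Each candidate is a dict with keys: name, eu, cost, reversible, variance.
    """
    top = max(c["eu"] for c in candidates)
    # Every action within epsilon of the best score counts as tied.
    tied = [c for c in candidates if top - c["eu"] <= epsilon]
    # Secondary rules, in order: cheaper, then reversible, then lower variance.
    tied.sort(key=lambda c: (c["cost"], not c["reversible"], c["variance"]))
    return tied[0]["name"]


candidates = [
    {"name": "reroute", "eu": 0.802, "cost": 3.0, "reversible": True,  "variance": 0.10},
    {"name": "wait",    "eu": 0.800, "cost": 1.0, "reversible": True,  "variance": 0.02},
    {"name": "abort",   "eu": 0.500, "cost": 0.5, "reversible": False, "variance": 0.01},
]
chosen = select_action(candidates)   # "wait": near-tied with "reroute" but cheaper
```

With `epsilon=0.0` the same call returns "reroute", the strict maximiser. The threshold is therefore a design decision in its own right and needs the same scrutiny as the weights.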
How LLMs Integrate With Utility-Based Agents for Complex Goal Reasoning
Large language models are increasingly being used as reasoning layers inside agentic AI systems, and utility-based decision-making is a natural fit for that architecture. In frameworks like LangChain and AutoGen, an LLM can act as the world model, interpreting ambiguous inputs, generating candidate action plans, and predicting likely outcomes in natural language.
A separate utility scoring layer then evaluates those candidate plans against defined objectives and selects the one with the highest expected utility. This hybrid architecture combines the language understanding and generative reasoning of LLMs with the principled optimisation of utility function AI, producing agents capable of tackling complex, multi-step tasks in messy real-world conditions.
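The split between a generative layer and a scoring layer can be sketched without any framework. In the Python below, `stub_plan_generator` is a hypothetical stand-in for an LLM call (it returns hard-coded plans rather than calling a model), and the plan fields `p_success`, `value`, and `cost` are assumed estimates, not part of any LangChain or AutoGen API.

```python
from typing import Callable, Dict, List


def stub_plan_generator(task: str) -> List[Dict]:
    """Stand-in for an LLM call; a real system would parse model output here."""
    return [
        {"plan": "email the customer a refund link",
         "p_success": 0.9, "value": 0.6, "cost": 0.1},
        {"plan": "escalate to a human agent",
         "p_success": 0.99, "value": 0.8, "cost": 0.5},
        {"plan": "ask the customer to resubmit the form",
         "p_success": 0.5, "value": 0.6, "cost": 0.05},
    ]


def score(plan: Dict) -> float:
    # Expected utility of a plan: success probability times value, minus cost.
    return plan["p_success"] * plan["value"] - plan["cost"]


def choose_plan(task: str, generate: Callable[[str], List[Dict]]) -> Dict:
    candidates = generate(task)          # generative layer: propose plans
    return max(candidates, key=score)    # utility layer: pick the best one


best_plan = choose_plan("customer reports a double charge", stub_plan_generator)
```

Keeping `score` outside the generator means the objectives can be audited and tuned without touching prompts, which is the main appeal of the hybrid design.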
Utility-Based Agent vs. Goal-Based Agent: A Side-by-Side Comparison
Goal-based and utility-based agents get grouped together often enough that the distinction starts to feel minor. It is not. Both maintain internal world models and both plan ahead, but the similarity stops there. A goal-based agent is asking a yes-or-no question: Did this action get me to the goal? A utility-based agent is asking something harder: out of everything I could do right now, which option is actually the best? For anyone building systems that face ambiguous inputs, competing objectives, or outcomes that are acceptable but not ideal, that difference determines whether the agent behaves sensibly or just technically succeeds.
When to Use a Utility-Based Agent Over Other AI Agent Types
Pick a utility-based agent in AI when the problem has more than one thing to optimise at the same time, when outcomes sit on a spectrum of better and worse rather than simply working or not working, or when the environment is probabilistic enough that no action reliably produces the same result twice. If any one of those conditions is true, simpler agent types will either fail outright or produce decisions you cannot justify.
| Agent Type | Decision Basis | Handles Trade-offs | Works Under Uncertainty | Best For |
| --- | --- | --- | --- | --- |
| Simple Reflex | Condition-action rules | No | No | Fast, predictable environments |
| Model-Based Reflex | Internal state + rules | No | Partially | Tracking partially observable states |
| Goal-Based | Achieves the goal or not | No | Partially | Clear, binary success conditions |
| Utility-Based | Maximises utility score | Yes | Yes | Complex trade-offs, uncertain outcomes |
| Learning Agent | Adapts from feedback | Yes, over time | Yes | Dynamic environments with changing goals |
Advantages of Utility-Based Agents: Adaptability, Flexibility, and Reliability
The biggest thing a utility-based agent gets right is that it never treats a decision as pass-or-fail when it does not have to. If the best option is off the table, it finds the next best one. It does not freeze, it does not behave randomly, it just moves to whatever scores highest given what is actually available. That alone makes it more dependable than goal-based or reflex alternatives in any environment that changes.
- Handles competing objectives without breaking. Multiple goals get encoded as weighted terms in one function, so the agent always has a single consistent basis for choosing, even when those goals pull in opposite directions.
- Reasons sensibly under uncertainty. Expected utility calculation means the agent weighs both the value of an outcome and the probability of reaching it. It will not chase a high-reward action that has a low chance of working when a moderate-reward action is far more reliable.
- Conservative by design in high-stakes settings. When the utility function penalises bad outcomes heavily, or is shaped to be risk-averse, the agent’s preference for reliable outcomes over risky ones is not a bug. In autonomous vehicles, medical systems, or financial platforms, an agent that avoids variance is exactly what you want.
- Cheaper to maintain when the environment shifts. Change the weights in the utility function and the agent’s priorities update. You do not touch the decision architecture at all, which is a significant advantage in production systems that need to adapt without full redeployment.
Disadvantages and Computational Costs of Utility Function AI Models
The strengths above come with real costs, and anyone deploying a utility function in artificial intelligence in a live system needs to understand them before they become production problems.
- Expensive to compute at scale. Scoring every possible state in a large environment, especially with probabilistic branching across multiple future steps, adds up fast. Real-time systems often hit hard compute limits before the agent has evaluated everything it should.
- Utility function design is genuinely hard. Choosing the right features, assigning weights that reflect actual priorities, and making sure the function does not reward unintended behaviour requires deep domain knowledge and careful testing. Most failures in utility function AI systems trace back to this step.
- Perverse optimisation is a real risk. An agent will maximise whatever you tell it to maximise. If the function is slightly wrong, the agent will find that wrongness and exploit it, behaving perfectly according to its function while doing exactly the opposite of what you wanted.
- Sparse or unreliable data degrades performance. The probabilistic reasoning that makes these agents strong under uncertainty depends on accurate probability estimates. In domains where historical data is thin or the environment shifts faster than the model updates, those estimates go wrong, and so do the decisions.
- Difficult to audit. When a utility-based agent makes a bad call, tracing why requires unpicking the utility scores, the probability estimates, and the model state at the moment of decision. That is not impossible, but it is significantly harder than reviewing a rule-based system where the logic is explicit.
Real-World Applications of Utility-Based Agents in Artificial Intelligence
Utility-based agents have moved well past academic examples. They are embedded in production systems across industries where decisions involve simultaneous trade-offs and uncertain outcomes.
The applications described below are not hypothetical. Each represents a deployed or actively researched use of utility-based reasoning in a real industry context as of 2026.
Utility-Based Agents in Autonomous Vehicles and Smart Home Systems
Autonomous vehicles are one of the clearest examples of a utility-based agent in AI deployment. At every moment, the vehicle’s decision system evaluates thousands of possible actions (steering adjustments, braking intensities, lane positions) and selects the one that maximises a utility function balancing safety margin, speed, fuel consumption, and passenger comfort.
Smart home thermostats like those used in modern HVAC systems apply similar logic at a smaller scale. The agent continuously weighs energy cost, current indoor temperature, outdoor conditions, and predicted occupancy patterns to select heating or cooling actions that maximise a utility function built around comfort and efficiency simultaneously.
Utility Function AI in Healthcare Diagnostics and Treatment Planning
In healthcare, treatment decisions are rarely clean. Two patients with the same diagnosis can have completely different clinical priorities, and a utility function AI system handles that by scoring each intervention across every dimension that matters: survival probability, side effect profile, treatment length, quality of life during and after. The agent does not pick the same answer for everyone. It picks the answer that scores highest, given that specific patient’s situation.
What makes this genuinely useful is that clinicians can adjust the weights. A patient who prioritises quality of life over maximum survival time gets a utility function that reflects that. A patient willing to tolerate aggressive side effects for a better long-term prognosis gets a different one. Two patients with near-identical diagnoses might receive completely different recommendations, one toward chemotherapy, one toward surgery, not because the agent is inconsistent, but because their utility functions correctly capture that they are not the same person facing the same trade-off.
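The effect of patient-specific weights can be shown with a toy calculation. Every score and weight below is made up for illustration and has no clinical meaning. The feature names are assumptions too, and each feature is scaled so that higher is better.

```python
def utility(option, weights):
    return sum(weights[k] * option[k] for k in weights)


# Feature scores in [0, 1], higher is better; all values are invented.
treatments = {
    "chemotherapy": {"survival": 0.80, "quality_of_life": 0.40, "short_duration": 0.30},
    "surgery":      {"survival": 0.65, "quality_of_life": 0.75, "short_duration": 0.80},
}

# Two hypothetical patients with the same options but different priorities.
patient_weights = {
    "prioritises_survival": {"survival": 0.8, "quality_of_life": 0.15, "short_duration": 0.05},
    "prioritises_quality":  {"survival": 0.3, "quality_of_life": 0.5,  "short_duration": 0.2},
}

recommendation = {
    patient: max(treatments, key=lambda t: utility(treatments[t], w))
    for patient, w in patient_weights.items()
}
# prioritises_survival -> chemotherapy (0.715 vs 0.6725)
# prioritises_quality  -> surgery      (0.73 vs 0.50)
```

The treatment data is identical for both patients. Only the weights differ, and that alone flips the recommendation.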
Utility Theory in Artificial Intelligence for Financial Trading and Risk Management
Utility theory in artificial intelligence and financial economics share a long history. Expected utility was a core concept in portfolio theory decades before anyone applied it to AI agents. Algorithmic trading systems were among the first to put it into production, and the fit is obvious: every trade involves a tension between expected return, downside risk, portfolio volatility, and transaction cost, and those factors cannot be optimised independently.
A trading agent using a utility function does not chase maximum return. It chases maximum utility, which means a trade that offers strong expected returns but introduces dangerous concentration risk might score lower than a more modest trade with a cleaner risk profile. Risk management systems use the same logic for position limits and hedging decisions. The utility function encodes where the firm actually sits on the risk-appetite spectrum, so every decision the agent makes reflects both what the firm wants to earn and what it cannot afford to lose.
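A mean-variance style utility is one common way to express this idea; the sketch below assumes that form. The `risk_aversion` parameter and both trades are hypothetical and not drawn from any real trading system.

```python
def trade_utility(expected_return, variance, cost, risk_aversion=3.0):
    """Mean-variance style utility: reward return, penalise risk and cost."""
    return expected_return - risk_aversion * variance - cost


trades = {
    "concentrated": {"expected_return": 0.12, "variance": 0.030, "cost": 0.002},
    "diversified":  {"expected_return": 0.08, "variance": 0.010, "cost": 0.001},
}

scores = {name: trade_utility(**t) for name, t in trades.items()}
best_trade = max(scores, key=scores.get)
# concentrated: 0.12 - 0.09 - 0.002 = 0.028
# diversified:  0.08 - 0.03 - 0.001 = 0.049  -> chosen
```

Lowering `risk_aversion` to 0.5 makes the concentrated trade score higher (0.103 against 0.074). That single parameter is where the risk appetite of the firm gets encoded.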
Utility-Based Agents in E-Commerce Recommendation and Supply Chain Logistics
E-commerce recommendation engines increasingly use utility-based scoring to balance multiple objectives: relevance to the user’s expressed preferences, margin contribution of recommended products, inventory availability, and diversity of recommendations to avoid filter bubbles.
Supply chain logistics systems use utility-based agents to optimise routing, warehousing, and fulfilment decisions under conditions where cost, speed, reliability, and carbon footprint all need to be balanced simultaneously. A pure goal-based system optimising only for delivery speed would produce different and often worse outcomes than a utility-based one that also accounts for cost and sustainability.
Challenges of Applying Utility Theory in Artificial Intelligence at Scale
Utility theory in artificial intelligence is elegant in theory and genuinely difficult in practice. The gap between a well-defined utility function on paper and a well-behaved agent in a live system is where most real engineering challenges live.
Computational Complexity of Scaling Utility Function AI to Real-Time Systems
Evaluating a utility function in artificial intelligence over a large state space is computationally expensive. In a game with a small number of possible states, this is tractable. In real-world environments where the number of possible states is effectively infinite, and the agent must act in milliseconds, brute-force utility maximisation is not feasible.
Practical deployments address this through approximation methods: pruning the search space, using heuristic state representations, or limiting lookahead depth in planning. Each approximation introduces error. The challenge is keeping that error small enough that the agent still selects near-optimal actions rather than merely plausible ones.
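Limiting lookahead depth, and the error it can introduce, can be seen in a tiny depth-limited expectimax search. The state graph, the utilities, and the function names below are invented for this sketch.

```python
# Toy transition model: (state, action) -> list of (probability, next_state).
TRANSITIONS = {
    ("start", "left"):  [(1.0, "A")],
    ("start", "right"): [(1.0, "B")],
    ("A", "go"):        [(1.0, "A_end")],
    ("B", "go"):        [(0.5, "B_good"), (0.5, "B_bad")],
}
UTILITY = {"start": 0, "A": 5, "B": 3, "A_end": 4, "B_good": 20, "B_bad": 0}


def actions(state):
    return [a for (s, a) in TRANSITIONS if s == state]


def value(state, depth):
    """Expected utility of `state` with `depth` further steps of lookahead."""
    acts = actions(state)
    if depth == 0 or not acts:
        return UTILITY[state]      # cutoff or terminal: score the state itself
    return max(q_value(state, a, depth) for a in acts)


def q_value(state, action, depth):
    # Probability-weighted value of the successor states.
    return sum(p * value(s2, depth - 1) for p, s2 in TRANSITIONS[(state, action)])


def best_action(state, depth):
    return max(actions(state), key=lambda a: q_value(state, a, depth))


shallow = best_action("start", depth=1)   # "left": A (5) looks better than B (3)
deep = best_action("start", depth=2)      # "right": expected 10 beats 4
```

One step of lookahead is cheaper but picks the branch that turns out worse. That is exactly the kind of approximation error a real-time system has to budget for.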
Ethical Challenges: Encoding Human Values in Utility-Based Agent Design
Any utility-based agent embeds a set of values in its utility function. Those values determine what the agent treats as good outcomes and what it ignores. When those values are incomplete, biased, or misaligned with what people actually care about, the agent will optimise confidently in the wrong direction.
Bias in training data can corrupt the utility function in subtle ways. An agent trained on historical hiring data might assign high utility to candidate profiles that reflect past discriminatory patterns. A healthcare agent trained on outcomes data from a non-representative patient population might produce suboptimal recommendations for underrepresented groups. Fixing these issues requires careful audit of what the utility function is actually rewarding, which is technically and ethically demanding work.
Handling Uncertainty and Incomplete Information in the Utility Function in Artificial Intelligence
Real environments are partially observable. The agent rarely has access to the complete state of the world. It must make decisions based on incomplete sensor data, outdated models, and uncertain predictions about how the environment will respond to its actions.
The utility function in artificial intelligence handles this through probabilistic reasoning, but that requires accurate probability estimates. In domains where historical data is sparse or where the environment changes faster than the agent can learn, those probability estimates are unreliable. The agent’s decisions may still be optimal given its beliefs about the world, but if those beliefs are wrong, optimal expected utility translates into poor real-world outcomes.
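A minimal sketch of deciding from a belief rather than a known state: a Bayes update over two hidden road conditions, followed by expected utility over that belief. The prior, the sensor model, and the utilities are all illustrative assumptions.

```python
def bayes_update(prior, likelihood, observation):
    """Posterior P(state | observation) from a prior and a sensor model."""
    unnormalised = {s: prior[s] * likelihood[s][observation] for s in prior}
    total = sum(unnormalised.values())
    return {s: v / total for s, v in unnormalised.items()}


prior = {"icy": 0.02, "dry": 0.98}
# Hypothetical sensor model: P(observation | state).
likelihood = {
    "icy": {"slip_warning": 0.7, "no_warning": 0.3},
    "dry": {"slip_warning": 0.1, "no_warning": 0.9},
}
# Utility of each action in each hidden state (illustrative numbers).
utility = {
    "keep_speed": {"icy": -100.0, "dry": 10.0},
    "slow_down":  {"icy": 5.0,    "dry": 6.0},
}


def best_action(belief):
    eu = {a: sum(belief[s] * u[s] for s in belief) for a, u in utility.items()}
    return max(eu, key=eu.get)


before = best_action(prior)                           # "keep_speed"
belief = bayes_update(prior, likelihood, "slip_warning")
after = best_action(belief)                           # "slow_down", P(icy) is now 0.125
```

The decision is only as good as `prior` and `likelihood`. If the sensor model overstates its own reliability, the agent still maximises expected utility, but against the wrong beliefs.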
How to Design and Evaluate a Utility-Based Agent in AI Projects
Designing a utility-based agent in AI is not just a modelling exercise. It is an engineering process that requires iterative testing, domain expertise, and a clear framework for evaluating whether the agent is actually doing what you intended.
Choosing the Right Utility Function in Artificial Intelligence for Your Use Case
The first question to answer is what outcomes the agent should maximise and what constraints it must respect. This sounds straightforward, but getting it wrong is the most common source of agent misbehaviour in production. A utility function that captures most of what matters but omits a key dimension will produce an agent that systematically ignores that dimension in pursuit of everything else.
Practical steps for designing the utility function in artificial intelligence for a new project: list all measurable outcomes that matter, assign initial weights based on domain expertise, test those weights in simulation, and iterate based on where the agent’s choices diverge from what a domain expert would choose. The weights are not a one-time decision; they require ongoing calibration as the deployment environment changes.
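The calibration step, comparing the agent against a domain expert and adjusting weights, could be prototyped as a simple grid search. The scenarios, the expert choices, and the two-feature setup below are hypothetical. A real project would use far more scenarios and hold some back for validation.

```python
def utility(option, weights):
    return sum(w * option[f] for f, w in weights.items())


# Each scenario: two options plus the index the (hypothetical) expert chose.
scenarios = [
    ({"safety": 0.9, "speed": 0.3}, {"safety": 0.5, "speed": 0.8}, 0),
    ({"safety": 0.8, "speed": 0.5}, {"safety": 0.7, "speed": 0.95}, 1),
    ({"safety": 0.6, "speed": 0.6}, {"safety": 0.9, "speed": 0.2}, 1),
]


def agreement(weights):
    """Fraction of scenarios where the agent picks what the expert picked."""
    hits = 0
    for a, b, expert_choice in scenarios:
        agent_choice = 0 if utility(a, weights) >= utility(b, weights) else 1
        hits += int(agent_choice == expert_choice)
    return hits / len(scenarios)


# Candidate weightings: safety weight from 0.0 to 1.0, speed gets the rest.
grid = [i / 10 for i in range(11)]
candidates = [{"safety": ws, "speed": 1 - ws} for ws in grid]
best_weights = max(candidates, key=agreement)   # first weighting with full agreement
```

Here several weightings agree with the expert on every scenario, and `max` simply returns the first one it finds. That ambiguity is a signal to add scenarios that discriminate between the remaining candidates.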
Testing and Benchmarking Utility-Based Agent Performance in Simulation
Never evaluate a utility-based agent only in its production environment. Simulation environments allow systematic testing across edge cases that may occur rarely in the real world but matter enormously when they do. An autonomous vehicle agent needs to perform well in low-probability, high-consequence scenarios, such as icy roads, sudden pedestrian crossings, and sensor failures, that may not appear in normal operation.
Benchmarking should compare the agent’s utility scores against human expert decisions on the same scenarios, against alternative utility function designs, and against simpler baseline agents. The goal is not just to confirm that the agent performs better than random; it is to understand exactly where and why it diverges from expert judgment, and whether those divergences are acceptable.
Tools and Frameworks That Support Utility Function AI Development
Several frameworks are available for utility function AI development in 2026. For LLM-based agentic systems, LangChain provides agent executor components that can be extended with custom utility scoring layers. AutoGen supports multi-agent orchestration where utility functions can govern inter-agent task allocation.
For classical utility-based planning, the PDDL (Planning Domain Definition Language) ecosystem provides standardised tools for defining state spaces and utility objectives. Gymnasium, the maintained successor to OpenAI Gym, remains widely used for simulation environments where utility-based agents can be trained and benchmarked. For probabilistic reasoning components, libraries like pgmpy and PyMC support the Bayesian inference work that underlies expected utility computation.
Future Directions for Utility-Based Agents and Utility Theory in Artificial Intelligence
The core principles of utility theory in artificial intelligence are not changing. What is changing fast is how those principles are being applied at scale, across multiple agents, and in systems that learn their own utility functions rather than having them hand-coded.
Multi-Agent Systems and Cooperative Utility Function AI Architectures
Single-agent utility maximisation is well understood. The frontier in 2026 is multi-agent systems where multiple utility-based agents operate simultaneously, and their utility functions interact, sometimes cooperatively and sometimes competitively.
In cooperative settings, such as a fleet of delivery drones optimising a shared city-wide delivery objective or a network of hospital resource scheduling agents, the challenge is designing collective utility functions that align individual agent incentives with system-wide goals. Naive approaches where each agent maximises its own utility independently can produce globally suboptimal outcomes, a well-documented problem in game theory. Current research is focusing on mechanism design approaches that structure the agents’ utility functions to make individual and collective optimisation align.
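The gap between individual and collective optimisation shows up even in a two-agent toy game. The payoff table below is a standard prisoner's-dilemma shape with made-up numbers, and the action names are hypothetical.

```python
from itertools import product

ACTIONS = ("share", "hoard")
# PAYOFF[(a1, a2)] = (utility for agent 0, utility for agent 1); toy numbers.
PAYOFF = {
    ("share", "share"): (3, 3),
    ("share", "hoard"): (0, 5),
    ("hoard", "share"): (5, 0),
    ("hoard", "hoard"): (1, 1),
}


def best_response(agent, other_action):
    """Action that maximises this agent's own utility given the other's action."""
    def own_utility(action):
        joint = (action, other_action) if agent == 0 else (other_action, action)
        return PAYOFF[joint][agent]
    return max(ACTIONS, key=own_utility)


# A joint action is an equilibrium if each action is a best response to the other.
equilibria = [
    (a1, a2) for a1, a2 in product(ACTIONS, repeat=2)
    if best_response(0, a2) == a1 and best_response(1, a1) == a2
]
# The system-wide optimum maximises the sum of both utilities.
social_optimum = max(PAYOFF, key=lambda joint: sum(PAYOFF[joint]))
```

Independent maximisation settles on `("hoard", "hoard")` with a total utility of 2, while `("share", "share")` would give 6. Mechanism design aims to reshape the individual utilities so that the two coincide.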
Advances in Learning Utility Functions Directly From Human Feedback
Hand-designing utility functions is slow, expensive, and error-prone. The most promising direction in current research is learning utility functions directly from human behaviour and feedback, eliminating the need to specify weights manually.
Reinforcement learning from human feedback (RLHF), already central to LLM alignment, is being extended to broader utility-based agent architectures. Rather than requiring domain experts to specify utility function weights upfront, the agent observes human choices or receives human ratings on its decisions and infers the underlying utility function that best explains those preferences. Inverse reinforcement learning (IRL) is a closely related technique, working backwards from observed expert behaviour to recover the utility function that the expert appears to be maximising.
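As a rough sketch of inferring a utility function from choices, the Python below fits linear weights to pairwise preferences with a Bradley-Terry (logistic) model and plain gradient ascent. This is one simple way to frame preference learning, not a full RLHF or IRL pipeline. The features, comparisons, learning rate, and epoch count are all assumptions.

```python
import math


def fit_weights(preferences, n_features, lr=2.0, epochs=5000):
    """Fit linear utility weights from (preferred, rejected) feature pairs.

    Model: P(a preferred over b) = sigmoid(w . (f(a) - f(b))).
    """
    w = [0.0] * n_features
    for _ in range(epochs):
        grad = [0.0] * n_features
        for preferred, rejected in preferences:
            diff = [p - r for p, r in zip(preferred, rejected)]
            margin = sum(wi * di for wi, di in zip(w, diff))
            p_correct = 1.0 / (1.0 + math.exp(-margin))
            # Gradient of the log-likelihood for this comparison.
            for i, di in enumerate(diff):
                grad[i] += (1.0 - p_correct) * di
        # Gradient ascent step on the mean log-likelihood.
        w = [wi + lr * gi / len(preferences) for wi, gi in zip(w, grad)]
    return w


# Features: (safety, speed). Hypothetical human picks, mostly favouring safety.
preferences = [
    ((0.9, 0.3), (0.5, 0.9)),
    ((0.8, 0.4), (0.6, 0.8)),
    ((0.7, 0.6), (0.4, 0.7)),
    ((0.6, 0.9), (0.6, 0.5)),   # equal safety: the faster option wins
]
weights = fit_weights(preferences, n_features=2)
```

The learned weights put safety well above speed while keeping speed positive, matching the pattern in the comparisons. Because these four comparisons are perfectly consistent, the weights keep growing with more epochs, so only their ratio is meaningful, and a practical system would add regularisation.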
Conclusion
Utility-based agents represent the clearest expression of rational decision-making in AI: pick the action that produces the best outcome, measured consistently across every dimension that matters. That principle has roots in 18th-century economics and remains the most principled framework available for building AI systems that need to navigate genuine trade-offs in uncertain, real-world environments. The computational challenges are real, the ethics challenges are serious, but the underlying approach is sound and production-proven across autonomous vehicles, healthcare, finance, and logistics.
If you want to build a career working with these systems, theoretical understanding alone is not enough. Amquest Education’s Generative AI and Agentic AI programme gives you hands-on experience with the tools, frameworks, and design principles behind systems like these.
FAQs on Utility-Based Agents in Artificial Intelligence
What is a utility-based agent in artificial intelligence?
An AI agent that assigns a numerical score to every possible outcome and picks the action most likely to produce the highest score. Unlike goal-based agents, it measures how good outcomes are, not just whether they qualify as success.
What is the difference between a goal-based agent and a utility-based agent?
A goal-based agent stops when it reaches the goal. A utility-based agent keeps optimising: it distinguishes between a barely acceptable outcome and an excellent one, which matters whenever trade-offs are involved.
What is the utility function in a utility-based agent?
A mathematical function written as U(s) that maps any possible world state s to a number representing how desirable that state is. The agent always acts to maximise this function.
What are real-world examples of utility-based agents?
Autonomous vehicle navigation systems, algorithmic trading platforms, hospital treatment planning tools, and e-commerce recommendation engines all use utility-based reasoning in production.
What are the advantages of utility-based agents over other AI agents?
They handle uncertainty, compare imperfect options, and make principled trade-offs. When the best option is unavailable, they find the next best rather than failing or behaving unpredictably.
What are the disadvantages or limitations of utility-based agents?
Computing utility over large state spaces is expensive and often requires approximation. Designing the utility function correctly is hard, and a poorly specified function produces confidently wrong behaviour.
How does a utility-based agent make decisions step by step?
It reads its environment through sensors, updates its internal world model, predicts outcomes for each candidate action, scores those outcomes using the utility function, and executes the highest-scoring action.
What is the difference between a utility-based agent and a learning agent?
A utility-based agent uses a fixed utility function designed by humans. A learning agent adapts its behaviour over time based on feedback. Modern systems often combine both: a learning component that refines the utility function alongside a utility-based decision layer.
Where does the concept of utility in AI agents come from?
From economics and decision theory. Researchers like Daniel Bernoulli and later von Neumann and Morgenstern formalised expected utility theory in the 18th and 20th centuries. AI adopted the framework because it maps directly onto rational agent design.
Are utility-based agents used in reinforcement learning?
Yes. Reinforcement learning reward functions are closely related to utility functions, and techniques like inverse reinforcement learning work specifically to recover a utility function from observed agent behaviour.
