Algorithms for decision making: excellent free download book from MIT

https://algorithmsbook.com/

MIT Press has made another excellent book available under a Creative Commons license.

I plan to buy it, and I recommend you do the same. This book provides a broad introduction to algorithms for decision making under uncertainty.

The book takes an agent-based approach.

An agent is an entity that acts based on observations of its environment. Agents may be physical entities, like humans or robots, or they may be nonphysical entities, such as decision support systems that are implemented entirely in software.

The interaction between the agent and the environment follows an observe-act cycle or loop.

  • At time t, the agent receives an observation of the environment.
  • Observations are often incomplete or noisy.
  • Based on these inputs, the agent then chooses an action a_t through some decision process.
  • This action, such as sounding an alert, may have a nondeterministic effect on the environment.
  • The book focuses on agents that interact intelligently to achieve their objectives over time.
  • Given the past sequence of observations and knowledge about the environment, the agent must choose an action a_t that best achieves its objectives in the presence of various sources of uncertainty, including:
  1. outcome uncertainty, where the effects of our actions are uncertain,
  2. model uncertainty, where our model of the problem is uncertain,
  3. state uncertainty, where the true state of the environment is uncertain, and
  4. interaction uncertainty, where the behavior of the other agents interacting in the environment is uncertain.

The book is organized around these four sources of uncertainty.

Making decisions in the presence of uncertainty is central to the field of artificial intelligence.
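To make the observe-act loop described above concrete, here is a minimal sketch in Python. It is not taken from the book (whose code is in Julia). The hazard scenario, the function names `observe`, `choose_action`, `step`, and `run`, and all the probabilities are assumptions made up for illustration. The sketch only shows two of the four sources of uncertainty: noisy observations (state uncertainty) and an action whose effect is not guaranteed (outcome uncertainty).

```python
import random


def observe(hazard_present, rng, noise=0.2):
    """Return a noisy reading of the hidden state (state uncertainty)."""
    if rng.random() < noise:
        return not hazard_present  # sensor error flips the reading
    return hazard_present


def choose_action(observation):
    """Hypothetical decision process: sound an alert when a hazard is observed."""
    return "alert" if observation else "wait"


def step(hazard_present, action, rng, clear_prob=0.8, arrival_prob=0.1):
    """Advance the environment; the action's effect is nondeterministic."""
    if hazard_present:
        if action == "alert" and rng.random() < clear_prob:
            return False  # the alert worked and the hazard is cleared
        return True  # the hazard persists
    return rng.random() < arrival_prob  # a new hazard may appear


def run(steps=10, seed=0):
    """Run the observe-act loop and record (t, observation, action, next state)."""
    rng = random.Random(seed)
    hazard_present = False
    history = []
    for t in range(steps):
        observation = observe(hazard_present, rng)  # observe
        action = choose_action(observation)         # decide
        hazard_present = step(hazard_present, action, rng)  # act
        history.append((t, observation, action, hazard_present))
    return history


if __name__ == "__main__":
    for t, observation, action, state in run():
        print(t, observation, action, state)
```

The agent here reacts only to the latest observation. The book's methods go further and choose actions using the whole history of observations and a model of the environment.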

The table of contents is:

Introduction

Decision Making

Applications

Methods

History

Societal Impact

Overview

PROBABILISTIC REASONING

 Representation

Degrees of Belief and Probability

Probability Distributions

Joint Distributions

Conditional Distributions

Bayesian Networks

Conditional Independence

Summary

Exercises

Inference

Inference in Bayesian Networks

Inference in Naive Bayes Models

Sum-Product Variable Elimination

Belief Propagation

Computational Complexity

Direct Sampling

Likelihood Weighted Sampling

Gibbs Sampling

Inference in Gaussian Models

Summary

Exercises

 Parameter Learning

Maximum Likelihood Parameter Learning

Bayesian Parameter Learning

Nonparametric Learning

Learning with Missing Data

Summary

Exercises

 Structure Learning

Bayesian Network Scoring

Directed Graph Search

Markov Equivalence Classes

Partially Directed Graph Search

Summary

Exercises

 

Simple Decisions

Constraints on Rational Preferences

Utility Functions

Utility Elicitation

Maximum Expected Utility Principle

Decision Networks

Value of Information

Irrationality

Summary

Exercises

SEQUENTIAL PROBLEMS

 Exact Solution Methods

Markov Decision Processes

Policy Evaluation

Value Function Policies

Policy Iteration

Value Iteration

Asynchronous Value Iteration

Linear Program Formulation

Linear Systems with Quadratic Reward

Summary

Exercises

Approximate Value Functions

Parametric Representations

Nearest Neighbor

Kernel Smoothing

Linear Interpolation

Simplex Interpolation

Linear Regression

Neural Network Regression

Summary

Exercises

 Online Planning

Receding Horizon Planning

Lookahead with Rollouts

Forward Search

Branch and Bound

Sparse Sampling

Monte Carlo Tree Search

Heuristic Search

Labeled Heuristic Search

Open-Loop Planning

Summary

Exercises

 

 Policy Search

Approximate Policy Evaluation

Local Search

Genetic Algorithms

Cross Entropy Method

Evolution Strategies

Isotropic Evolutionary Strategies

Summary

Exercises

 Policy Gradient Estimation

Finite Difference

Regression Gradient

Likelihood Ratio

Reward-to-Go

Baseline Subtraction

Summary

Exercises

Policy Gradient Optimization

Gradient Ascent Update

Restricted Gradient Update

Natural Gradient Update

Trust Region Update

Clamped Surrogate Objective

Summary

Exercises

 Actor-Critic Methods

Actor-Critic

Generalized Advantage Estimation

Deterministic Policy Gradient

Actor-Critic with Monte Carlo Tree Search

Summary

 

 Policy Validation

Performance Metric Evaluation

Rare Event Simulation

Robustness Analysis

Trade Analysis

Adversarial Analysis

Summary

Exercises

MODEL UNCERTAINTY

 Exploration and Exploitation

Bandit Problems

Bayesian Model Estimation

Undirected Exploration Strategies

Directed Exploration Strategies

Optimal Exploration Strategies

Exploration with Multiple States

Summary

Exercises

 Model-Based Methods

Maximum Likelihood Models

Update Schemes

Exploration

Bayesian Methods

Bayes-adaptive MDPs

Posterior Sampling

Summary

Exercises

Model-Free Methods

Incremental Estimation of the Mean

Q-Learning

Sarsa

Eligibility Traces

Reward Shaping

Action Value Function Approximation

Experience Replay

Summary

Exercises

 

 Imitation Learning

Behavioral Cloning

Dataset Aggregation

Stochastic Mixing Iterative Learning

Maximum Margin Inverse Reinforcement Learning

Maximum Entropy Inverse Reinforcement Learning

Generative Adversarial Imitation Learning

Summary

Exercises

STATE UNCERTAINTY

Beliefs

Belief Initialization

Discrete State Filter

Linear Gaussian Filter

Extended Kalman Filter

Unscented Kalman Filter

Particle Filter

Particle Injection

Summary

Exercises

Exact Belief State Planning

Belief-State Markov Decision Processes

Conditional Plans

Alpha Vectors

Pruning

Value Iteration

Linear Policies

Summary

Exercises

Offline Belief State Planning 

Fully Observable Value Approximation

Fast Informed Bound

Fast Lower Bounds

Point-Based Value Iteration

Randomized Point-Based Value Iteration

Sawtooth Upper Bound

Point Selection

Sawtooth Heuristic Search

Triangulated Value Functions

Summary

Exercises

Online Belief State Planning 

Lookahead with Rollouts

Forward Search

Branch and Bound

Sparse Sampling

Monte Carlo Tree Search

Determinized Sparse Tree Search

Gap Heuristic Search

Summary

Exercises

Controller Abstractions 

Controllers

Policy Iteration

Nonlinear Programming

Gradient Ascent

Summary

Exercises

MULTIAGENT SYSTEMS

Multiagent Reasoning 

Simple Games 

Response Models

Dominant Strategy Equilibrium

Nash Equilibrium

Correlated Equilibrium

Iterated Best Response

Hierarchical Softmax

Fictitious Play

Gradient Ascent

Summary

Exercises

Sequential Problems 

Markov Games

Response Models

Nash Equilibrium

Fictitious Play

Gradient Ascent

Nash Q-Learning

Summary

Exercises

State Uncertainty 

Partially Observable Markov Games

Policy Evaluation

Nash Equilibrium

Dynamic Programming

Summary

Exercises

Collaborative Agents 

Decentralized Partially Observable Markov Decision Processes

Subclasses

Dynamic Programming

Iterated Best Response

Heuristic Search

Nonlinear Programming

Summary

Exercises

APPENDICES

Mathematical Concepts

Measure Spaces

Probability Spaces

Metric Spaces

Normed Vector Spaces

Positive Definiteness

Convexity

Information Content

Entropy

Cross Entropy

Relative Entropy

Gradient Ascent

Taylor Expansion

Monte Carlo Estimation

Importance Sampling

Contraction Mappings

Graphs

 Probability Distributions

 Computational Complexity

Asymptotic Notation

Time Complexity Classes

Space Complexity Classes

Decidability

 

 Neural Representations

Neural Networks

Feedforward Networks

Parameter Regularization

Convolutional Neural Networks

Recurrent Networks

Autoencoder Networks

Adversarial Networks

 Search Algorithms

Search Problems

Search Graphs

Forward Search

Branch and Bound

Dynamic Programming

Heuristic Search

 Problems

Hex World

2048

Cart-Pole

Mountain Car

Simple Regulator

Aircraft Collision Avoidance

Crying Baby

Machine Replacement

Catch

Prisoner's Dilemma

Rock-Paper-Scissors

Traveler's Dilemma

Predator-Prey Hex World

Multi-Caregiver Crying Baby

Collaborative Predator-Prey Hex World

 

 Julia

Types

Functions

Control Flow

Packages

Convenience Functions

Book link

Algorithms for decision making: free download book