Reinforcement Learning

Machine Learning, like several other disciplines, models many phenomena as random variables (RVs): variables that can take on any value in some set, each with some probability.1 These RVs can be pixel intensities in an image, words in a sentence, sensor readings from a robot’s surroundings, or even the exchange of observations and actions as an agent interacts with its environment in time. Information theory is one way to study these RVs. One of the many interesting results in information theory is the conservation of directed information: a law that tells us that as two entities exchange RVs back and forth, the total information they share is neatly partitioned into what flows from one to the other, and what flows back. This post’s focus will be the proof of that conservation law. ...

Reinforcement Learning

On the Conservation of Mutual Information

Position: Deployed Reinforcement Learning should be Continual