The first paper published on this subject was by Paul Werbos in 1977. The approach was later termed Adaptive Critic Designs (ACD). Over the years, ACD gradually gained popularity, and as more and more people with a controls background joined in, the term changed to Adaptive Dynamic Programming (ADP).
There are several synonyms used for ADP, including "adaptive critic designs", "adaptive dynamic programming", "approximate dynamic programming", "asymptotic dynamic programming", "neural dynamic programming", "neuro-dynamic programming", "reinforcement learning", and "relaxed dynamic programming". Dynamic programming was invented by Richard Bellman in the 1950s, but it was soon discovered that it involves too much computation because of the well-known "curse of dimensionality": the computation and storage required grow exponentially with the number of state variables. That is to say, even for a problem of moderate size, the computational complexity of the original dynamic programming approach cannot be handled by most computers (even nowadays), and for many years the approach showed only theoretical value. In other words, we really cannot implement true dynamic programming in practice when the problem size is big! Therefore, no matter what we call it, in ADP we all try to approximate the solutions of dynamic programming. Because of this, a lot of people prefer the term "approximate dynamic programming" to "adaptive dynamic programming". I use "adaptive dynamic programming" since I work on control applications.
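To make the "curse of dimensionality" concrete, here is a minimal sketch of how quickly a tabular value function grows. It assumes the state space is discretized on a uniform grid and that each table entry takes 8 bytes; the function names and the numbers are illustrative assumptions, not taken from any particular ADP implementation.

```python
def tabular_states(points_per_dim: int, num_dims: int) -> int:
    """Number of entries in a value table over a uniform grid.

    Assumption: every state variable is discretized into the same
    number of grid points, so the table has points_per_dim ** num_dims cells.
    """
    return points_per_dim ** num_dims


def table_size_gib(points_per_dim: int, num_dims: int, bytes_per_entry: int = 8) -> float:
    """Memory needed to store one value per grid cell, in GiB.

    Assumption: 8 bytes per entry (one double-precision number).
    """
    return tabular_states(points_per_dim, num_dims) * bytes_per_entry / 2**30


if __name__ == "__main__":
    # Hypothetical setting: 100 grid points per state variable.
    for num_dims in (1, 2, 4, 6, 8, 10):
        entries = tabular_states(100, num_dims)
        size = table_size_gib(100, num_dims)
        print(f"{num_dims:2d} state variables: {entries:.3e} entries, {size:.3e} GiB")
```

With 100 grid points per state variable, a problem with 10 state variables already needs 100^10 = 10^20 table entries, which is why exact dynamic programming gives way to approximation.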
In 2002, a workshop on this subject was held in Mexico; at the time, the subject was called Approximate Dynamic Programming (ADP). It was also the consensus among the workshop participants that ADP should be closely connected with Reinforcement Learning (RL). The US National Science Foundation sponsored the workshop, paying all expenses for everyone who attended. There were 29 researchers from around the globe who came to the workshop. The main product of the workshop was an edited book, Handbook of Learning and Approximate Dynamic Programming, published in 2004. Bernard Widrow from Stanford also spoke at the workshop. Widrow was the one who coined the term "Adaptive Critics" earlier, in the 1970s. More info about the 2002 workshop can be found at http://www.fulton.asu.edu/~nsfadp/
In 2006, NSF sponsored another workshop in Mexico. This time, 42 researchers were invited to attend, including Dimitri Bertsekas. The main objectives of the 2006 workshop included outreach to Mexican students and researchers. See http://www.fulton.asu.edu/~nsfadp/
For many years, special sessions (invited sessions) have been organized at IJCNN/WCCI each year on topics related to ADP. In 2007, IEEE started a symposium on ADP and RL (ADPRL), which is now organized every two years. The first ADPRL symposium was held in Honolulu, Hawaii, in 2007, and the 2009 symposium was held in Nashville, Tennessee. The symposium is part of the IEEE Symposium Series on Computational Intelligence.
In July 2007, a special issue on Neural Networks for Feedback Control was published in IEEE Trans. Neural Networks. A couple of the 20 papers published in it are on ADP and RL. In August 2008, a special issue on Adaptive Dynamic Programming and Reinforcement Learning for Feedback Control was published in IEEE Trans. Systems, Man and Cybernetics-B. The guest editors were Frank Lewis, Derong Liu and George Lendaris. Click here to see its content.
In 2008, a technical committee on ADPRL was formed within the IEEE Computational Intelligence Society. The founding chair of the committee is Derong Liu. Click here for info about the TC.
To the best of my knowledge, the following is the list of books published on ADPRL (please feel free to email me your book info).