本篇主要提供深度强化学习实践Maxim,Lapan东南电子书的pdf版本下载,本电子书下载方式为百度网盘方式,点击以上按钮下单完成后即会通过邮件和网页的方式发货,有问题请联系邮箱ebook666@outlook.com
图书基本信息 | |
图书名称 | 深度强化学习实践 |
作者 | Mam,Lapan |
定价 | 109元 |
出版社 | 东南大学出版社 |
ISBN | 9787564183219 |
出版日期 | 2019-05-01 |
字数 | |
页码 | 523 |
版次 | |
装帧 | 平装 |
开本 | 16开 |
商品重量 |
内容提要 | |
强化学习(RL)的新发展结合深度学习(DL),在训练代理以类似人的方式解决复杂问题方面取得了未有的进步。Google使用算法在的Atari街机游戏中获胜将该领域推至高峰,研究人员也在源源不断地产生新的想法。 《深度强化学习实践(影印版 英文版)》介绍了RL的基础知识,为你提供了编写智能学习代理所需的原理,以承担一系列艰巨的实际任务。让你了解如何在“网格世界”环境中实现Q-learning,教你的代理购买和交易股票,发现自然语言模型如何推动了聊天机器人的火爆。 |
目录 | |
Preface Chapter 1: What is Reinforcement Learning? Learning - supervised, unsupervised, and reinforcement RL formalisms and relations Reward The agent The environment Actions Observations Markov decision processes Markov process Markov reward process Markov decision process Summary Chapter 2: OpenAI Gym The anatomy of the agent Hardware and software requirements OpenAI Gym API Action space Observation space The environment Creation of the environment The CartPole session The random CartPole agent The extra Gym functionality - wrappers and monitors Wrappers Monitor Summary Chapter 3: Deep Learning with PyTorch Tensors Creation of tensors Scalar tensors Tensor operations GPU tensors Gradients Tensors and gradients NN building blocks Custom layers Final glue - loss functions and optimizers Loss functions Optimizers Monitoring with TensorBoard TensorBoard 101 Plotting stuff Example -GAN on Atari images Summary Chapter 4: The Cross-Entropy Method Taxonomy of RL methods Practical cross-entropy Cross-entropy on CartPole Cross-entropy on FrozenLake Theoretical background of the cross-entropy method Summary Chapter 5: Tabular Learning and the Bellman Equation Value, state, and optimality The Bellman equation of optimality Value of action The value iteration method Value iteration in practice Q-learning for FrozenLake Summary Chapter 6: Deep Q-Networks Chapter 7: DQN Extensions Chapter 8: Stocks Trading Using RL Chapter 9: Policy Gradients - An Alternative Chapter 10: The Actor-Critic Method Chapter 11: Asynchronous Advantaqe Actor-Critic Chapter 12: Chatbots Training with RL Chapter 13: Web Navigation Chapter 14: Continuous Action Space Chapter 15: Trust Regions - TRPO, PPO, and ACKTR Chapter 16: Black-Box Optimization in RL Chapter 17: Beyond Model-Free - Imagination Chapter 18: AlphaGo Zero Other Books You May Enjoy Index |
作者介绍 | |
Mam Lapan,is a deep learning enthusiast and independent researcher. His background and 15 years' work expertise as a software developer and a systems architect lays from low-level Linux kernel driver development to performance optimization and design of distributed applications working on thousands of servers. With vast work experiences in big data,Machine Learning, and large parallel distributed HPC and nonHPC systems, he has a talent to explain a gist of plicated things in simple words and vivid examples.His current areas of interest lie in practical applications of Deep Learning, such as Deep Natural Language Processing and Deep Reinforcement Learning. Mam lives in Moscow, Russian Federation, with his family, and he works for an Israeli start-up as a Senior NLP developer. |