# Reinforcement Learning with R

Machine learning algorithms were mainly divided into **three** main categories.

- Supervised learning algorithms
- Classification and regression algorithms
- Unsupervised learning algorithms
- Clustering algorithms
- Reinforcement learning algorithms

We have covered supervised learning and unsupervised learning algorithms couple of times in our blog articles. In this article, you are going to learn about the third category of machine learning algorithms. Which are reinforcement learning algorithms.

Before we drive further let quickly look at the table of contents.

# Table of contents:

- Reinforcement learning real-life example
- Typical reinforcement process
- Reinforcement learning process
- Divide and Rule
- Reinforcement learning implementation in R
- Preimplementation background
- MDP toolbox package
- Using Github reinforcement learning package
- How to change environment
- Complete code
- Conclusion
- Related courses
- Practical Reinforcement learning

# Reinforcement learning real-life example

The modern education system follows a standard pattern of teaching students. The teacher goes over the concepts need to be covered and reinforces them through some example questions. After explaining the topic and the process with a few solved examples, students are expected to solve similar questions from their exercise book themselves.

This mode of learning is also adopted in machine learning algorithms as a separate class known as reinforcement learning. Though it is easy to know and understand how reinforcement works, the concept is hard to implement.

# Typical reinforcement process

In a typical reinforcement process, the machine acts as the ‘student’ trying to learn the concept.

To learn, the machine interacts with a ‘teacher’ to know the classes of specific data points and learns it. This learning is guided by assigning rewards and penalties to correct and incorrect decisions respectively. Along the way, the machine makes mistakes and corrects itself so as to maximize the reward and minimize the penalty.

As it learns through trial and error and continuous interaction, a framework is built by the algorithm. Since it is so human-like, it has used in specific facets in the industry where a predefined training data is not available. Some examples include puzzle navigation and tic-tac-toe games.