FinRL: The Blueprint for Automated Trading Strategies

8 Jun 2024


(1) Xiao-Yang Liu, Hongyang Yang, Columbia University;

(2) Jiechao Gao, University of Virginia;

(3) Christina Dan Wang (Corresponding Author), New York University Shanghai.

3 The Proposed FinRL Framework


In this section, we first present an overview of the FinRL framework and describe its layers. Then, we propose a training-testing-trading pipeline as a standard way to evaluate trading performance.

3.1 Overview of FinRL Framework

The FinRL framework has three layers: the application layer, the agent layer, and the environment layer, as shown in Fig. 2.

• On the application layer, FinRL aims to provide hundreds of demonstrative trading tasks, serving as stepping stones for users to develop their strategies.

• On the agent layer, FinRL supports fine-tuned DRL algorithms from DRL libraries in a plug-and-play manner, following the unified workflow in Fig. 1.

• On the environment layer, FinRL aims to wrap historical data and live trading APIs of hundreds of markets into training environments, following the de facto standard Gym [5].

Upper-layer trading tasks can directly call DRL algorithms in the agent layer and market environments in the environment layer.
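The division of labor among the three layers can be illustrated with a minimal sketch. It is not FinRL code: the names ToyTradingEnv, BuyAndHoldAgent, and run_episode are hypothetical, the environment only mirrors the Gym-style reset()/step() convention using the standard library, and the reward (change in portfolio value, with no transaction costs) is an assumption made for illustration.

```python
from typing import List, Tuple

State = Tuple[float, float, int]  # (current price, cash, shares held)


class ToyTradingEnv:
    """Environment layer (hypothetical): wraps a historical price series
    behind a Gym-style reset()/step() interface."""

    def __init__(self, prices: List[float], initial_cash: float = 100.0):
        if len(prices) < 2:
            raise ValueError("need at least two prices to take one step")
        self.prices = prices
        self.initial_cash = initial_cash

    def reset(self) -> State:
        self.t = 0
        self.cash = self.initial_cash
        self.shares = 0
        return self._state()

    def _state(self) -> State:
        return (self.prices[self.t], self.cash, self.shares)

    def _value(self) -> float:
        return self.cash + self.shares * self.prices[self.t]

    def step(self, action: int):
        """action: +1 buys one share, -1 sells one share, 0 holds."""
        price = self.prices[self.t]
        value_before = self._value()
        if action == 1 and self.cash >= price:
            self.cash -= price
            self.shares += 1
        elif action == -1 and self.shares > 0:
            self.cash += price
            self.shares -= 1
        self.t += 1  # move to the next historical price
        reward = self._value() - value_before  # change in portfolio value
        done = self.t == len(self.prices) - 1
        return self._state(), reward, done, {}


class BuyAndHoldAgent:
    """Agent layer (hypothetical): a fixed rule standing in for a DRL policy."""

    def act(self, state: State) -> int:
        _, _, shares = state
        return 1 if shares == 0 else 0  # buy once, then hold


def run_episode(env: ToyTradingEnv, agent: BuyAndHoldAgent) -> float:
    """Application layer (hypothetical): a trading task that only calls
    the agent's act() and the environment's reset()/step()."""
    state = env.reset()
    total_reward, done = 0.0, False
    while not done:
        state, reward, done, _ = env.step(agent.act(state))
        total_reward += reward
    return total_reward


print(run_episode(ToyTradingEnv([10.0, 11.0, 12.0, 11.0]), BuyAndHoldAgent()))  # prints 1.0
```

Because run_episode touches only act(), reset(), and step(), either the agent or the environment could be swapped for another implementation with the same methods without changing the task code.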

The FinRL framework has the following features:

• Layered architecture: The lower layer provides APIs for the upper layer, ensuring transparency. The agent layer interacts with the environment layer in an exploration-exploitation manner. Each layer can be updated independently, as long as the APIs in Table 2 remain unchanged.

• Modularity and extensibility: Each layer has modules that define self-contained functions. A user can select certain modules to implement her trading task. We reserve interfaces for users to develop new modules, e.g., adding new DRL algorithms.

• Simplicity and applicability: FinRL provides benchmark trading tasks that are reproducible for users, and also enables users to customize trading tasks via simple configurations. In addition, hands-on tutorials are provided in a beginner-friendly fashion.
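One common way to obtain this kind of modularity and configuration-driven customization is a small registry of interchangeable modules. The sketch below is an assumption about how such a mechanism could look, not a description of FinRL's actual interfaces: AGENT_REGISTRY, register_agent, run_task, and both agent classes are hypothetical names.

```python
from typing import Dict, List, Optional

# Hypothetical registry mapping a configuration name to an agent class.
AGENT_REGISTRY: Dict[str, type] = {}


def register_agent(name: str):
    """Class decorator that adds a new agent module under `name`."""
    def decorator(cls):
        AGENT_REGISTRY[name] = cls
        return cls
    return decorator


@register_agent("always_hold")
class AlwaysHold:
    def act(self, price: float) -> int:
        return 0  # never trade


@register_agent("buy_the_dip")
class BuyTheDip:
    def __init__(self) -> None:
        self.last_price: Optional[float] = None

    def act(self, price: float) -> int:
        # Buy (1) when the price fell since the previous observation.
        dipped = self.last_price is not None and price < self.last_price
        self.last_price = price
        return 1 if dipped else 0


def run_task(config: dict) -> List[int]:
    """Build the agent named in `config` and return its action per price."""
    agent = AGENT_REGISTRY[config["agent"]]()
    return [agent.act(p) for p in config["prices"]]


actions = run_task({"agent": "buy_the_dip", "prices": [10.0, 9.0, 9.5, 8.0]})
print(actions)  # prints [0, 1, 0, 1]
```

Under this design, adding a new algorithm means registering one more class, and switching algorithms means changing a single configuration value; run_task itself stays untouched.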

This paper is available on arXiv under the CC BY 4.0 DEED license.