Predict Stock Prices Using RNN: Part 1

2017-07-08 08:00

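AI Summary

This tutorial shows how to build a recurrent neural network (RNN) with Tensorflow to predict stock market prices; Part 1 focuses on predicting the S&P 500 index. The tutorial provides full working code, hosted in a GitHub repository.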

Overview of Existing Tutorials

The Goal

Data Preparation

Train / Test Split

Normalization

Model Construction

Definitions

Define Graph

Start Training Session

Use TensorBoard

Results

This is a tutorial on how to build a recurrent neural network using Tensorflow to predict stock market prices. The full working code is available in github.com/lilianweng/stock-rnn. If you don’t know what a recurrent neural network or an LSTM cell is, feel free to check my previous post.

One thing I would like to emphasize: because my motivation for writing this post is more about demonstrating how to build and train an RNN model in Tensorflow and less about solving the stock prediction problem, I didn’t try hard to improve the prediction outcomes. You are more than welcome to take my code as a reference point and add more stock-prediction-related ideas to improve it. Enjoy!

Overview of Existing Tutorials#

There are many tutorials on the Internet, like:

A noob’s guide to implementing RNN-LSTM using Tensorflow

TensorFlow RNN Tutorial

LSTM by Example using Tensorflow

How to build a Recurrent Neural Network in TensorFlow

RNNs in Tensorflow, a Practical Guide and Undocumented Features

Sequence prediction using recurrent neural networks(LSTM) with TensorFlow

Anyone Can Learn To Code an LSTM-RNN in Python

How to do time series prediction using RNNs, TensorFlow and Cloud ML Engine

Despite all these existing tutorials, I still want to write a new one mainly for three reasons:

Early tutorials no longer work with the current version, as Tensorflow is still under active development and its API changes quickly.

Many tutorials use synthetic data in the examples. Well, I would like to play with the real world data.

Some tutorials assume that you already know something about the Tensorflow API, which makes them a bit difficult to read.

After reading a bunch of examples, I would suggest taking the official example on the Penn Tree Bank (PTB) dataset as your starting point. The PTB example showcases an RNN model in a pretty and modular design pattern, but that modularity might keep you from easily understanding the model structure. Hence, here I will build up the graph in a very straightforward manner.

The Goal#

I will explain how to build an RNN model with LSTM cells to predict the prices of the S&P 500 index. The dataset can be downloaded from Yahoo! Finance ^GSPC. In the following example, I used S&P 500 data from Jan 3, 1950 (the earliest date that Yahoo! Finance can trace back to) to Jun 23, 2017. The dataset provides several price points per day. For simplicity, we will only use the daily close prices for prediction. Meanwhile, I will demonstrate how to use TensorBoard for easy debugging and model tracking.

As a quick recap: a recurrent neural network (RNN) is a type of artificial neural network with a self-loop in its hidden layer(s), which enables the RNN to use the previous state of the hidden neuron(s) to learn the current state given the new input. RNNs are good at processing sequential data. The long short-term memory (LSTM) cell is a specially designed working unit that helps an RNN better memorize long-term context.

For more in-depth information, please read my previous post or this awesome post.
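To make the self-loop concrete, here is a minimal sketch of a vanilla (non-LSTM) recurrent step in plain Python. It is an illustration only, not the model built in this post: the scalar weights `w_x`, `w_h`, `b` and the function names are hypothetical, and a real cell operates on vectors and matrices rather than single numbers.

```python
import math

def rnn_step(x_t, h_prev, w_x, w_h, b):
    """One recurrent step: the new state depends on the input and the previous state."""
    return math.tanh(w_x * x_t + w_h * h_prev + b)

def run_rnn(inputs, w_x=0.5, w_h=0.8, b=0.0, h0=0.0):
    """Feed a sequence through the loop, carrying the hidden state forward."""
    h = h0
    states = []
    for x_t in inputs:
        h = rnn_step(x_t, h, w_x, w_h, b)
        states.append(h)
    return states

states = run_rnn([1.0, 0.0, 0.0])
```

Even though the second and third inputs are zero, the later states stay non-zero, because each state is computed from its predecessor. Setting `w_h=0.0` removes the loop and with it the memory.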

Data Preparation#

The stock prices are a time series of length $N$, defined as $p_0, p_1, \dots, p_{N-1}$, in which $p_i$ is the close price on day $i$, $0 \le i < N$.

# Stack several cells when num_layers > 1; otherwise use a single cell.
cell = tf.contrib.rnn.MultiRNNCell(
    [_create_one_cell() for _ in range(config.num_layers)],
    state_is_tuple=True
) if config.num_layers > 1 else _create_one_cell()

(6) tf.nn.dynamic_rnn constructs a recurrent neural network specified by cell (RNNCell). It returns a pair of (model outputs, state), where the outputs val is of size (batch_size, num_steps, lstm_size) by default. The state refers to the current state of the LSTM cell and is not consumed here.

val, _ = tf.nn.dynamic_rnn(cell, inputs, dtype=tf.float32)

(7) tf.transpose converts the outputs from the dimension (batch_size, num_steps, lstm_size) to (num_steps, batch_size, lstm_size). Then the last output is picked.

# Before transpose, val.get_shape() = (batch_size, num_steps, lstm_size)
# After transpose, val.get_shape() = (num_steps, batch_size, lstm_size)
val = tf.transpose(val, [1, 0, 2])
# last.get_shape() = (batch_size, lstm_size)
last = tf.gather(val, int(val.get_shape()[0]) - 1, name="last_lstm_output")
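The effect of the transpose-then-gather step can be checked on a toy example. The sketch below uses plain nested Python lists as a stand-in for tensors; the helper `transpose_first_two_axes` and the toy values are made up for illustration and are not part of the original code.

```python
def transpose_first_two_axes(val):
    """Turn nested lists shaped (batch_size, num_steps, lstm_size)
    into (num_steps, batch_size, lstm_size)."""
    batch_size, num_steps = len(val), len(val[0])
    return [[val[b][t] for b in range(batch_size)] for t in range(num_steps)]

# Toy "outputs": batch_size=2, num_steps=3, lstm_size=2.
val = [
    [[1, 1], [2, 2], [3, 3]],        # sequence 0
    [[10, 10], [20, 20], [30, 30]],  # sequence 1
]

transposed = transpose_first_two_axes(val)
last = transposed[len(transposed) - 1]  # shape (batch_size, lstm_size)

# Equivalent shortcut: take the final time step of every sequence directly.
last_direct = [sequence[-1] for sequence in val]
```

Both routes select the output at the final time step for every sequence in the batch, which is the only output the prediction layer consumes.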

(8) Define weights and biases between the hidden and output layers.

weight = tf.Variable(tf.truncated_normal([config.lstm_size, config.input_size]))
bias = tf.Variable(tf.constant(0.1, shape=[config.input_size]))
prediction = tf.matmul(last, weight) + bias

(9) We use mean squared error as the loss metric and the RMSProp algorithm (tf.train.RMSPropOptimizer) for gradient descent optimization.
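The shapes involved in this output layer are easy to lose track of, so here is a small sketch of the same `last @ weight + bias` computation on nested Python lists. The helper `dense` and the toy numbers are hypothetical and chosen only for illustration.

```python
def dense(last, weight, bias):
    """Compute last @ weight + bias for nested lists.

    last:   (batch_size, lstm_size)
    weight: (lstm_size, input_size)
    bias:   (input_size,)
    """
    return [
        [
            sum(row[k] * weight[k][j] for k in range(len(weight))) + bias[j]
            for j in range(len(bias))
        ]
        for row in last
    ]

last = [[1.0, 2.0], [3.0, 4.0]]  # batch_size=2, lstm_size=2
weight = [[0.5], [0.25]]         # lstm_size=2, input_size=1
bias = [0.1]

prediction = dense(last, weight, bias)  # shape (batch_size, input_size) = (2, 1)
```

Each row of `last` is mapped from `lstm_size` values down to `input_size` values, so `prediction` has the same shape as the targets it is compared against.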

loss = tf.reduce_mean(tf.square(prediction - targets))
optimizer = tf.train.RMSPropOptimizer(learning_rate)
minimize = optimizer.minimize(loss)
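For intuition, the loss and a single optimizer update can be written out by hand. This is a sketch under stated assumptions: `mean_squared_error` mirrors the `reduce_mean(square(...))` expression above, while `rmsprop_step` is a simplified textbook form of RMSProp without momentum. TensorFlow's optimizer has more options and may differ in detail, and the `decay` and `epsilon` defaults here are assumptions for this sketch.

```python
import math

def mean_squared_error(prediction, targets):
    """Average of squared differences over every element of two nested lists."""
    squared = [
        (p - t) ** 2
        for p_row, t_row in zip(prediction, targets)
        for p, t in zip(p_row, t_row)
    ]
    return sum(squared) / len(squared)

def rmsprop_step(param, grad, mean_square, learning_rate, decay=0.9, epsilon=1e-10):
    """One simplified RMSProp update (no momentum) for a single scalar parameter."""
    # Running average of squared gradients.
    mean_square = decay * mean_square + (1.0 - decay) * grad ** 2
    # Scale the step by the root of that average.
    param = param - learning_rate * grad / math.sqrt(mean_square + epsilon)
    return param, mean_square

loss = mean_squared_error([[1.0, 2.0]], [[1.0, 4.0]])
new_param, new_mean_square = rmsprop_step(
    param=1.0, grad=2.0, mean_square=0.0, learning_rate=0.001
)
```

Dividing by the running root-mean-square of the gradient keeps the step size roughly comparable across parameters whose gradients differ a lot in magnitude.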

Start Training Session#

(1) To start training the graph with real data, we need to start a tf.Session first.

with tf.Session(graph=lstm_graph) as sess:

(2) Initialize the variables as defined.

tf.global_variables_initializer().run()

(0) The learning rates for the training epochs should be precomputed beforehand. The index refers to the epoch index.

learning_rates_to_use = [
    config.init_learning_rate * (
        config.learning_rate_decay ** max(float(i + 1 - config.init_epoch), 0.0)
    ) for i in range(config.max_epoch)]

(3) Each iteration of the loop below completes one training epoch.

for epoch_step in range(config.max_epoch):
    current_lr = learning_rates_to_use[epoch_step]
    # Check https://github.com/lilianweng/stock-rnn/blob/master/data_wrapper.py
    # if you are curious to know what StockDataSet is and how generate_one_epoch()
    # is implemented.
    for batch_X, batch_y in stock_dataset.generate_one_epoch(config.batch_size):
        train_data_feed = {
            inputs: batch_X,
            targets: batch_y,
            learning_rate: current_lr
        }
        train_loss, _ = sess.run([loss, minimize], train_data_feed)
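To see what this schedule produces, the same expression can be evaluated outside the graph. The sketch below plugs in the values listed in the Results section of this post and replaces the `config` object with plain variables.

```python
# Values copied from the configuration listed in the Results section.
init_learning_rate = 0.001
learning_rate_decay = 0.99
init_epoch = 5
max_epoch = 100

learning_rates_to_use = [
    init_learning_rate * (learning_rate_decay ** max(float(i + 1 - init_epoch), 0.0))
    for i in range(max_epoch)
]

# The first init_epoch epochs keep the initial rate; decay starts afterwards.
for epoch in (0, 4, 5, 99):
    print(epoch, learning_rates_to_use[epoch])
```

The exponent stays at zero for the first `init_epoch` epochs, so the rate is held constant there and then shrinks by a factor of `learning_rate_decay` per epoch.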
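If you just want a feel for what a per-epoch batch generator does, here is a minimal stand-alone sketch. It is not the StockDataSet code from the repository: the function signature, the `seed` parameter, and the toy data are all hypothetical.

```python
import random

def generate_one_epoch(inputs, targets, batch_size, seed=None):
    """Yield (batch_X, batch_y) pairs that cover the data once, in shuffled batch order."""
    num_batches = (len(inputs) + batch_size - 1) // batch_size  # ceiling division
    batch_indices = list(range(num_batches))
    random.Random(seed).shuffle(batch_indices)
    for j in batch_indices:
        start, end = j * batch_size, (j + 1) * batch_size
        yield inputs[start:end], targets[start:end]

# Toy data: 10 samples, each input is a short window and each target the next value.
X = [[i, i + 1] for i in range(10)]
y = [[i + 2] for i in range(10)]

batches = list(generate_one_epoch(X, y, batch_size=4, seed=0))
```

Shuffling the order of batches while keeping each batch contiguous preserves the time order inside every input window, since the windows themselves are never reordered internally.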

(4) Don’t forget to save your trained model at the end.

saver = tf.train.Saver()
saver.save(sess, "your_awesome_model_path_and_name", global_step=max_epoch_step)

The complete code is available here.

Use TensorBoard#

Building the graph without visualization is like drawing in the dark: very obscure and error-prone. TensorBoard provides easy visualization of the graph structure and the learning process. Check out this hands-on tutorial; it is only 20 minutes long, but it is very practical and showcases several live demos.

Brief Summary

Use with tf.name_scope("your_awesome_module_name"): (https://www.tensorflow.org/api_docs/python/tf/name_scope) to wrap elements working toward a similar goal together.

Many tf.* methods accept a name= argument. Assigning a customized name can make your life much easier when reading the graph.

Methods like tf.summary.scalar and tf.summary.histogram help track the values of variables in the graph during iterations.

In the training session, define a log file using tf.summary.FileWriter.

with tf.Session(graph=lstm_graph) as sess:
    merged_summary = tf.summary.merge_all()
    writer = tf.summary.FileWriter("location_for_keeping_your_log_files", sess.graph)
    writer.add_graph(sess.graph)

Later, write the training progress and summary results into the file.

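# Fetch the tensor directly (not wrapped in a list) so that a single
# serialized summary is returned, which is what add_summary() expects.
_summary = sess.run(merged_summary, test_data_feed)
writer.add_summary(_summary, global_step=epoch_step)  # epoch_step in range(config.max_epoch)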

The full working code is available in github.com/lilianweng/stock-rnn.

Results#

I used the following configuration in the experiment.

num_layers = 1
keep_prob = 0.8
batch_size = 64
init_learning_rate = 0.001
learning_rate_decay = 0.99
init_epoch = 5
max_epoch = 100
num_steps = 30

(Thanks to Yury for catching a bug that I had in the price normalization. Instead of using the last price of the previous time window, I ended up using the last price in the same window. The following plots have been corrected.)

Overall, predicting stock prices is not an easy task. Especially after normalization, the price trends look very noisy.
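The difference between the two reference prices is easy to see on a toy example. The sketch below assumes that normalization means expressing each price as a relative change against a reference price (`price / reference - 1`). That formula, the helper name `normalize_windows`, and the handling of the first window are assumptions made for illustration, not code taken from the repository.

```python
def normalize_windows(windows):
    """Express each window as relative change against the previous window's last price."""
    normalized = []
    for t, window in enumerate(windows):
        if t == 0:
            # Assumption: no previous window exists, so fall back to the first price.
            reference = window[0]
        else:
            reference = windows[t - 1][-1]
        normalized.append([price / reference - 1.0 for price in window])
    return normalized

windows = [[100.0, 110.0], [121.0, 132.0]]
normalized = normalize_windows(windows)
```

Under the same assumption, the buggy variant (using `window[-1]` as the reference) would force the last entry of every normalized window to be exactly zero, because that price would be divided by itself.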

The example code in this tutorial is available in github.com/lilianweng/stock-rnn:scripts.

Lilian Weng: Lil'Log (RSS)