Dec 25, 2020

A Handicapper's Journey through a Random Forest and a Neural Network

By Leonard Wissner (Len)

By John Tenniel — Through the Looking-Glass, Public Domain, https://commons.wikimedia.org/w/index.php?curid=7592577

I can think of no greater pleasure than spending a weekday afternoon at the Del Mar racetrack. Horse racing is a sport that blends the beauty of nature, the skill of jockeys, and the raw thrill of a heated racing contest. Few bettors come home winners, but all seem satisfied with the overall experience. Picking a winning horse can be a daunting task, even for those who purchase the past performance scorecard published by the Daily Racing Form, available for about $5.00 as you enter the racetrack. After all, choosing a winner with no statistical guidance would be harder than picking a pig in a poke, if you pardon the expression. However, I wish to note at the outset that, despite the considerable statistical work I will delve into in this article, my spouse succeeds in picking winners merely by observing a horse's rear as it is paraded in the paddock prior to the race.

We are fortunate to be living at a time when computing power is cheap, ample data is available in machine-readable format, and powerful machine learning algorithms are being created on a daily basis. In this article, I employ two machine learning models to predict the winner of a horse race. One model is a Random Forest and the other is a neural network built using the Keras API. Both models are compared to a benchmark which I call Human Intelligence: simply the horse that receives the lowest odds prior to a given race. This horse represents the consensus view of bettors as to the horse most likely to win.

In a sense, a horse race represents an efficient market. In an efficient market, where information is freely available, the equilibrium price of assets is set such that no excess profit is available. As you will see from the ensuing simulations, after the 15%-17% adjustment representing the parimutuel fee taken by the track from the winning pool, Human Intelligence is remarkably efficient. Betting favorites will give the handicapper a respectable average winning accuracy of about 35%, but the odds adjust to the point where the handicapper nets a loss of approximately 15%, remarkably close to the fee taken by the racetrack. What is equally surprising is that both of our machine learning models perform slightly better than Human Intelligence. The neural network model wins slightly less often but accrues more profit than Human Intelligence. In that sense, I say it outperforms Human Intelligence, which is the ultimate benchmark when assessing the value of a machine learning model.

Data & Features

The data was obtained from the Brisnet website (https://www.brisnet.com/product/data-files). The Brisnet data service has a Past Performance database and a Results database with 5 years of race history from all major racetracks throughout the USA. Horse racing data can be downloaded in a convenient csv file format. A sample of 16,567 horse entrants that ran in 2018 and 2019 was collected from five major racetracks, and the data was read into a DataFrame I called race_df_features_app. I used Sweetviz (https://pypi.org/project/sweetviz/), a Python exploratory data analysis (EDA) library, to perform feature analysis and visualization of my sample data. The table below was adapted from a Sweetviz exhibit. I incorporated Sweetviz in a Python notebook using the following code snippet.

!pip install sweetviz
import sweetviz
report = sweetviz.analyze(race_df_features_app)
report.show_html()

Distribution of Horse Entrants 2018–2019

Thirty-one major features were isolated from the Brisnet data, and 26 numerical features were used to build the Random Forest and Neural Network models. All of the features I used are available in the Daily Racing Form, which can be purchased as you enter the racetrack. The target variable is the finish position of an individual horse entrant: a FINISH value of 1 is the winner of the race, 2 is place, and 3 is show. Sweetviz identifies the features that correlate best with the FINISH position (1-14), which I have summarized in the graph below.

What stands out most in the above graph is that the ODDS posted just prior to the running of a race correlate most strongly with the FINISH position of an individual horse entrant. The lower the odds, the lower the FINISH position of the horse in the race. This suggests handicappers' judgments about a horse's prospects are best summarized in how they wager at the betting window just prior to the race. The dynamics of market efficiency seem to come into play in a horse racing contest. As in capital markets, the collective judgment of market participants is a difficult benchmark to beat in practice. It is rare indeed to find a money manager that outperforms the market portfolio, and it is also rare to find a handicapper who ends up making more money than by betting on the favorites at the track.

Note the three features RACEODDS(1,2,3), which are the odds on the horse in its prior three outings. Again, we see the importance of the ODDS variable in the determination of a horse's prospects. Another major feature is the number of ENTRANTS, the horses competing in an individual race: the more horses competing, the greater the chance a horse will finish in a more distant position. The BESTYEAR feature is a numerical speed rating assigned to each horse, determined by the horse's speed relative to its competitors over the past year. The negative correlation with FINISH suggests that the faster the horse, the more likely it will finish in an early position. JOCKEY% and TRAINER% also show negative correlations, agreeing with the intuition that jockeys and trainers who win a greater percentage of the time tend to have horses that finish near the front of the pack. TRAINER1 & TRAINER2 are interesting stats, usually shown for each horse entrant at the bottom of the Daily Racing Form. These trainer angle stats describe particular features of the barn where the horse was trained. They are expressed in percentage terms, and their negative correlation suggests the more favorable the percentage, the more likely the horse will finish sooner. FINISH(1,2,3) represents the finish positions of the horse in its last three outings. Note that if the horse finished in a distant place in its last outing, the positive correlation suggests it will probably have the same poor showing in the current race.
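A ranking like the one described above can also be reproduced directly with pandas. A minimal sketch, assuming race_df_features_app holds the 26 numerical features together with the FINISH column:

# Rank each numerical feature by its correlation with the FINISH position.
# Assumes race_df_features_app contains the numeric features and the FINISH target.
import matplotlib.pyplot as plt

numeric_df = race_df_features_app.select_dtypes(include="number")
corr_with_finish = numeric_df.corr()["FINISH"].drop("FINISH").sort_values()

print(corr_with_finish)  # negative values point toward earlier (better) finishes
corr_with_finish.plot(kind="barh", figsize=(10, 8),
                      title="Feature correlation with FINISH")
plt.show()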

Random Forest

Can a machine learning model assist a handicapper in picking winners at the track? Is it possible to build a model that would outperform a human being? I define Human Intelligence as the favorite horse chosen by handicappers prior to the race contest, which can be easily identified as the horse with the lowest odds, or payoff, on winning. With this criterion in mind, I used two different approaches to the problem. Given the low volume of data I had available, I surmised that a Random Forest design might be superior to a neural network approach. Random Forest models employ bootstrapping techniques, which greatly enhance the number of training examples drawn from the data set. Neural networks, which typically must fit a large number of parameters, tend to excel when a large training data set is available, which is not the case in our current example.

After the data was cleaned, I imported sklearn.ensemble's RandomForestClassifier. The Random Forest Classifier is a decision tree classifier that fits an ensemble of trees on the training set of 14,911 horse entrants, each having 26 features, against the target of 14 possible finish positions. After a number of trials on the training data, I settled on a design with 100 tree estimators (n_estimators=100) and a decision tree depth of 5 levels (max_depth=5). Using sklearn.inspection, I imported permutation_importance to examine the important features used in the creation of the decision trees of the Random Forest model. The code snippet below illustrates the code I used to produce the model and a bar chart of feature importances.

import pandas as pd
import matplotlib.pyplot as plt
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

depth = 5  # tree depth settled on after trials on the training data
model = RandomForestClassifier(n_estimators=100, max_depth=depth)
model.fit(X_train, y_train)

model_feature_df = pd.DataFrame(data=model.feature_importances_,
                                index=input_features,
                                columns=['Feature Importance'])
model_feature_df.plot(kind='barh', title="Random Forest Feature Importance",
                      legend=False, figsize=(10, 5))
plt.show()
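The bar chart above uses the impurity-based importances stored in feature_importances_. The permutation_importance function imported above can provide a complementary check on held-out data; a sketch, assuming X_test and y_test hold the test entrants and their finish positions:

# Permutation importance: shuffle one feature at a time on the test set and
# measure how much the model's accuracy drops.
from sklearn.inspection import permutation_importance

perm = permutation_importance(model, X_test, y_test,
                              n_repeats=10, random_state=0, n_jobs=-1)

perm_df = pd.DataFrame({"Permutation Importance": perm.importances_mean},
                       index=input_features).sort_values("Permutation Importance")
perm_df.plot(kind="barh", title="Random Forest Permutation Importance",
             legend=False, figsize=(10, 5))
plt.show()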

The major features turn out to be very similar to the features with the highest correlation shown in the prior graph. Note the high values of the final ODDS, number of Entrants in a race, and the odds on the horse in its prior run. The other features which show up as important are the Trainer2 barn statistic and the number of wins of the jockey in the past year. The test data consists of 1656 horse entrants. For each entrant, the probability of a horse finishing in each of 14 finish positions can be estimated from our trained model. The probability we will focus on in the simulations that follow is the probability assigned for each of the 1656 entrants finishing in the first position, the winner of the race.

y_predict_prob = model.predict_proba(X_test)
Test_df['PROBABILITY_Forest'] = y_predict_prob[:, 0]

The Random Forest Classifier uses an ensemble of 100 trees to form a prediction, and it relies on a bootstrapping technique to artificially enhance the size of the data sample. By importing tree from sklearn, I can use the plot_tree method to display the first tree estimator from the algorithm, which is shown in the graph below.

from sklearn import tree
%matplotlib inline

fn = X_train.columns   # feature names
cn = y_train.name      # target name

fig, axes = plt.subplots(nrows=1, ncols=1, figsize=(3, 4), dpi=800)
tree.plot_tree(model.estimators_[0],
               feature_names=fn,
               # class_names=cn,
               filled=True,
               max_depth=3)
plt.tight_layout()
fig.savefig('model_individualtree.png')
plt.show()

https://stackoverflow.com/questions/40155128/plot-trees-for-a-random-forest-in-python-with-scikit-learn

I begin my journey through the first three levels of the first tree estimator in the Random Forest, leading me to a decision on which horse will win. A favorable path would be choosing a horse whose jockey won more than 21% of the time in the last year, with a trainer stat above 12%, and with odds on its last outing below 10:1. After a depth of three levels, this path has the greatest relative percentage of winners in position 1, designating the winner. It is also interesting to note the similarity of the distribution of values at the root node of the tree compared to the distribution of finish positions in the training data from which the bootstrapped samples are drawn (see https://scikit-learn.org/stable/auto_examples/tree/plot_unveil_tree_structure.html#sphx-glr-auto-examples-tree-plot-unveil-tree-structure-py).

A Random Forest Estimator
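The root-node comparison mentioned above can be checked directly on the fitted forest. A minimal sketch, assuming model is the fitted RandomForestClassifier and y_train holds the training finish positions:

# Compare the class mix at the root node of the first tree (built on a
# bootstrap sample) with the distribution of FINISH positions in the
# training data; the two should look broadly similar.
import pandas as pd

root_value = model.estimators_[0].tree_.value[0].ravel()

comparison = pd.DataFrame({
    "root_node_fraction": root_value / root_value.sum(),
    "train_fraction": y_train.value_counts(normalize=True).sort_index().values,
}, index=model.classes_)
print(comparison)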

Neural Network

I was somewhat surprised at how well the neural network model performed. After all, a total of 3,294 parameters (26×40+40 for the first hidden layer, 40×40+40 for the second, and 40×14+14 for the output layer) needed to be estimated from a sample of only 14,911 horse entrants, each having 26 features. The training data consists of 90% of the total sample taken from the five racetracks used in my study. I first preprocessed the training data using sklearn preprocessing, employing a StandardScaler to standardize the data fed to the neural network. I then employed the Keras API to build a sequential network consisting of a 26-feature input layer (fed the 14,911 training entrants), two hidden Dense layers with 40 relu activations each, and a Dense output layer with a softmax activation to compute the probabilities of each entrant falling into one of 14 finish positions. The code and summary of the model are displayed below.

import numpy as np
from sklearn import preprocessing
from sklearn.preprocessing import LabelBinarizer
from tensorflow.python.keras.models import Model
from tensorflow.python.keras.layers import Input, Dense
from tensorflow.python.keras import regularizers

# standardize the 26 numerical features
mm_scaler = preprocessing.StandardScaler()
X_train_minmax = mm_scaler.fit_transform(X_train)
X_test_minmax = mm_scaler.transform(X_test)

# lb one-hot encodes the 14 finish positions (assumed to be a LabelBinarizer)
lb = LabelBinarizer()
yy_train = y_train.values.reshape(-1, 1)
yy_train = lb.fit_transform(list(yy_train))

num_pixels = X_train.shape[1]   # 26 input features
num_classes = 14                # possible finish positions
batch = X_train.shape[0]

inputs = Input(shape=(num_pixels,))
# a layer instance is callable on a tensor, and returns a tensor
output_1 = Dense(40, activation='relu',
                 kernel_regularizer=regularizers.l2(.0001))(inputs)
output_2 = Dense(40, activation='relu',
                 kernel_regularizer=regularizers.l2(.0001))(output_1)
predictions = Dense(num_classes, activation='softmax',
                    kernel_regularizer=regularizers.l2(.0001))(output_2)

# this creates a model that includes the input layer and three Dense layers
model_net = Model(inputs=[inputs], outputs=[predictions])
model_net.compile(optimizer='adam',
                  loss='categorical_crossentropy',
                  metrics=['accuracy'])
model_net.fit(X_train_minmax, yy_train, epochs=10000, verbose=0, batch_size=batch)

S = model_net.predict_on_batch(np.array(X_test_minmax))
print('S', S[:, 0])
print(model_net.summary())

The model was compiled using the 'adam' optimizer, a 'categorical_crossentropy' loss function, and an 'accuracy' metric. The training accuracy was quite low, since it measures whether the model predicts the exact position a horse will finish out of 14 possible positions. However, as will be seen in the simulations that follow, our interest is the relative probability of winning (the first output position) of each horse entrant in a race against its peer group. It is quite remarkable that using this relative probability, and choosing the horse with the maximum probability of winning, results in an accuracy of over 30%, remarkably close to human accuracy.

Simulations

The purpose of the simulations is to test our two machine learning models against a benchmark of how a human would perform at a racetrack. We look beyond model accuracy and test whether the models make more money than the handicappers at the racetrack on a given day in history. One complication in using a profitability measure is the pari-mutuel takeout percentage the racetrack removes from the winning pool of each race. I focus on the winning pool since the odds of winning are clearly established and known prior to each race contest; the payoffs to place and show are difficult to estimate prior to the race.

Typically, the track pari-mutuel takeout is anywhere from 15-17% of the winning pool. The remainder of the funds in the win pool is then divided amongst the winning bettors in the race. This is a steep hurdle to overcome in any betting contest.
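To make the takeout concrete, the win odds a bettor sees are simply the net pool divided among the money bet on each horse. A small sketch of the pari-mutuel arithmetic, with illustrative dollar amounts and a 16% takeout, ignoring breakage and minimum-payoff rules:

# Pari-mutuel win-pool arithmetic (simplified: no breakage, no minimum payoff).
# The track removes the takeout and the rest of the pool is shared by the
# bettors who backed the winner.

def win_odds(pool_total, bet_on_horse, takeout=0.16):
    """Profit per $1 bet on this horse if it wins."""
    net_pool = pool_total * (1.0 - takeout)
    return net_pool / bet_on_horse - 1.0

# Example: a $100,000 win pool with $40,000 bet on the favorite
print(round(win_odds(100_000, 40_000), 2))   # 1.1, i.e. roughly 1.1-to-1

# Averaged over every dollar wagered, bettors as a group get back (1 - takeout),
# which is why the favorites strategy later ends up near -$0.15 per dollar.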

The models are fitted on training data where the features of each horse entrant are fit to the target of the finish position of the horse in the race. In the simulations, however, it is necessary to group each entrant in the test data set by the TRACK, DATE, and RACE where the horse ran. The grouping of horses in the data frame is accomplished using the code snippet below.

Test_grouped = Test_df.groupby(["TRACK", "DATE", "RACE"])
Test_grouped_df = Test_df.set_index(["TRACK", "DATE", "RACE"])

In a real racing situation, horses are chosen to run within their peer group. A race called a $15,000 claim, for example, is made up of horses that can be claimed (purchased) for about $15,000; in a $50,000 claim a more select peer group runs together. This makes for a more exciting race, as each horse competes against horses of approximately the same skill level. Grouping the horses this way recreates what actually occurred in history on a given race day. The probabilities that a given horse entrant will win a given race are obtained from the model fitted only on the training data, which never sees the test races. The payoff is derived from the data in the Brisnet Result Files, which contain the finish positions and odds on each horse. In a future racing situation, all the feature data is available prior to the race (in the Daily Racing Form), and the odds on a horse winning are input by the handicapper just prior to the running of the race.

Four strategies are tested in the following simulations on randomly selected test data from the years 2018-2019: betting the favorite (Human Intelligence), the Random Forest model's pick, the Neural Network model's pick, and a naive random selection. These four strategies are tested in 3,800 randomly selected races. Let's look at a typical racing contest in the test data.

Belmont Racetrack May 5, 2019 Race 1

The favorite in this race was RagingFire. The human pick would be this horse since it is the favorite with the lowest odds. Unfortunately, RagingFire lost, hence a loss of $1 per dollar invested. The Random Forest model picked Admiral Blue, since it had the highest probability of winning according to that model, and also lost $1 per dollar invested. The random choice model, picking a random number in the range 1-5, chose Sandy Lane and lost as well. The neural network model picked Call Me, the winner of this contest, since it had the highest probability of winning under that model. This resulted in a profit of $2.70 per dollar invested.

The machine learning models were tested on 20 random batches consisting of approximately 190 actual races per batch. A random test batch of races is chosen by first splitting the 16,567 horse entrants into a 90% training set and a 10% test set. The 10% test set is chosen randomly in each of 20 batch runs and the remaining 14,911 horse entrants are used to train the Random Forest and Neural Network Models.

cut = np.random.randint(0, int(.9 * len(race_df_features_app)))    # start of the test slice
tsize = int(.1 * len(race_df_features_app))                        # test set size

X_train = race_df_features_app.iloc[0:cut][input_features].append(
    race_df_features_app.iloc[cut + tsize:][input_features])
y_train = race_df_features_app.iloc[0:cut]["FINISH"].append(
    race_df_features_app.iloc[cut + tsize:]["FINISH"])

The test data is then grouped by RACE, DATE, and TRACK, and using the models trained on the training data, the probability of each horse entrant winning is computed and normalized to sum to 1 within each race. The profit and accuracy are recorded for each of the 20 batches of roughly 190 races and summarized in the table below. The machine learning models are compared to a naïve random selection and a proxy for human selection. The models are therefore tested on a total of 3,800 randomly selected actual races that occurred in 2018 and 2019 at the racetracks in our sample.
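Before looking at the results, here is a sketch of how each race in a batch can be scored, assuming Test_df carries the ODDS, FINISH, and PROBABILITY_Forest columns built earlier (the neural network strategy is handled the same way with its own probability column; the helper names are mine):

# For each race, pick one horse per strategy and score a $1 win bet:
# profit equals the posted odds if the pick wins, otherwise -1.
import numpy as np

def bet_profit(row):
    return row["ODDS"] if row["FINISH"] == 1 else -1.0

profits = {"Human": [], "Forest": [], "Random": []}
wins = {"Human": 0, "Forest": 0, "Random": 0}

for (track, date, race), grp in Test_df.groupby(["TRACK", "DATE", "RACE"]):
    # normalize the model's win probabilities so they sum to 1 within the race
    grp = grp.assign(P=grp["PROBABILITY_Forest"] / grp["PROBABILITY_Forest"].sum())

    picks = {
        "Human": grp.loc[grp["ODDS"].idxmin()],   # favorite = lowest odds
        "Forest": grp.loc[grp["P"].idxmax()],     # highest model win probability
        "Random": grp.sample(1).iloc[0],          # naive random selection
    }
    for name, pick in picks.items():
        profits[name].append(bet_profit(pick))
        wins[name] += int(pick["FINISH"] == 1)

n_races = len(profits["Human"])
for name in profits:
    print(name, "accuracy:", round(wins[name] / n_races, 3),
          "profit per $1:", round(np.mean(profits[name]), 3))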

Strategy Performance Profit Per Dollar Invested

It can then be seen that the -$0.15 average of human performance is about breakeven after adjusting for the 15-17% pari-mutuel takeout. Racetrack betting markets appear to be very efficient. The Random Forest and Neural Network models appear to perform better than human beings at the racetrack. The Neural Network model outperformed Human Intelligence in 11 of the 20 batch runs, had a superior average return of -$0.11 per dollar invested, and is on track to make money once adjusted for the pari-mutuel takeout.

Betting on favorites has an impressive 36% accuracy on average. The accuracy of the machine learning models over the 20 simulated batches is 35% for the Random Forest and 31% for the Neural Network. A 36% accuracy is very impressive in picking horses at the track. Unfortunately, the odds on betting favorites are set very low, especially considering the hurdle of the track pari-mutuel takeout.

Going to the track and choosing a horse at random is a costly venture. You will probably lose about 30% of your wager and achieve an accuracy of only 10-15%; you will be lucky to win one race in your outing. Sometimes long shots will come in and pay handsomely, as seen in the 14.7% maximum profit of the Random strategy in one of the simulated batches.

In any case, handicappers are in store for a great deal of excitement, even if they don't make much, if any, money at the racetrack over the long run. As more data becomes available, I place my bet on machine learning models ultimately beating human beings at the racetrack.
