PART 4 – Predict the carry distance of the golf ball using the relevant features and model

Linear Regression method for prediction

Feature selection using a correlation threshold

To predict golf ball carry distance from our 3-wood data, we can use linear regression, a simple and efficient method. It is a supervised learning algorithm that models the relationship between the golf ball features and carry distance by fitting a straight line (a plane when there are several features) that minimizes the prediction errors.
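As a compact way to write this down, and assuming the fit is ordinary least squares (the usual default for linear regression, although the estimator is not named here), the model and its fitting criterion look like the following. The symbols are notation introduced for this sketch: y_i is the measured carry distance of shot i, x_ij is the value of feature j for that shot, and the beta terms are the coefficients found during training.

```latex
\hat{y}_i = \beta_0 + \sum_{j=1}^{p} \beta_j x_{ij},
\qquad
\hat{\beta} = \arg\min_{\beta} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2
```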

We split the dataset into a training set and a testing set. The goal is to ensure that the model learns the relationship between the features and the output, golf ball carry distance, and then to confirm that it can make accurate predictions on unseen data. The training process finds the coefficients of the linear regression. The plots below show the predicted golf ball carry distance in meters for the testing data, after training.
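The split-then-fit workflow described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the shots are synthetic (generated with arbitrary coefficients, not the article's data), the 80/20 split ratio is an assumption, and the helper names (train_test_split, solve, fit_linear, predict) are hypothetical. The fit uses ordinary least squares through the normal equations; a library such as scikit-learn would normally do this step.

```python
import random

def train_test_split(rows, test_ratio=0.2, seed=0):
    """Shuffle a copy of the rows and split it into (train, test) lists."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_ratio))
    return shuffled[n_test:], shuffled[:n_test]

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_linear(features, targets):
    """Ordinary least squares via the normal equations.

    Returns [intercept, coef_1, coef_2, ...].
    """
    design = [[1.0] + [float(v) for v in f] for f in features]
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(design, targets)) for i in range(k)]
    return solve(xtx, xty)

def predict(coefs, feature_row):
    """Apply the fitted coefficients to one row of feature values."""
    return coefs[0] + sum(c * x for c, x in zip(coefs[1:], feature_row))

# Synthetic shots, NOT the article's data: (ball speed, launch angle, carry).
# The generating coefficients below are arbitrary values for illustration.
rng = random.Random(42)
shots = []
for _ in range(200):
    speed = rng.uniform(120, 140)
    angle = rng.uniform(10, 20)
    carry = -40 + 1.5 * speed + 1.2 * angle + rng.gauss(0, 2)
    shots.append((speed, angle, carry))

train, test = train_test_split(shots, test_ratio=0.2, seed=1)
coefs = fit_linear([(s, a) for s, a, _ in train], [c for _, _, c in train])
test_predictions = [predict(coefs, (s, a)) for s, a, _ in test]
print("intercept and coefficients:", coefs)
```

The model never sees the test rows during fitting, so comparing test_predictions with the measured carry of those rows gives an honest view of how it behaves on unseen shots.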

The model captures the main physical relationships: excessive height, loft, or off-center shots hurt performance, while ball speed remains the key factor for gaining distance.

To select the most relevant features before building the linear regression model, we apply an arbitrary threshold to the Pearson correlation coefficient between each feature and the golf ball carry distance.

Features kept (>= 0.5 correlation):
Launch angle 0.567520
Ball speed 0.734172
Features dropped (< 0.5 correlation):
Dispersion 0.034940
Apex 0.210084
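The filtering step above can be sketched as follows. The correlation values are the ones reported in this section; the function names (pearson, select_features) are hypothetical, and comparing the absolute value against the threshold is an assumption made so that a strong negative correlation would also be kept (it makes no difference for the four positive values listed here).

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def select_features(correlations, threshold=0.5):
    """Split features into (kept, dropped) by absolute correlation with the target."""
    kept = {name: r for name, r in correlations.items() if abs(r) >= threshold}
    dropped = {name: r for name, r in correlations.items() if abs(r) < threshold}
    return kept, dropped

# Correlations with carry distance as reported in this section.
reported = {
    "Launch angle": 0.567520,
    "Ball speed": 0.734172,
    "Dispersion": 0.034940,
    "Apex": 0.210084,
}

kept, dropped = select_features(reported, threshold=0.5)
print("kept:", sorted(kept))
print("dropped:", sorted(dropped))
```

With raw shot data, pearson(feature_values, carry_values) would produce each entry of the dictionary; here the reported values are used directly because the underlying measurements are not shown.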

Check the accuracy of the model

To assess the accuracy of the model, we use two common metrics: the coefficient of determination (R²) and the RMSE (root mean squared error).

The scores are as follows:
R²=0.85
RMSE=2.4 meters on an average distance of 180 meters
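For reference, both metrics can be computed from the actual and predicted carry distances as in the sketch below. The four sample values are illustrative only, not the article's test set, and the function names are hypothetical. The last lines put the reported RMSE in perspective against the reported average distance.

```python
import math

def r_squared(actual, predicted):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

def rmse(actual, predicted):
    """Root mean squared error, in the same unit as the target (meters here)."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Illustrative values only, NOT the article's test data.
actual = [170.0, 178.0, 180.0, 186.0]
predicted = [172.0, 177.0, 181.0, 184.0]
print("R2:", r_squared(actual, predicted))
print("RMSE:", rmse(actual, predicted))

# Reported figures from this section: RMSE of 2.4 m on an average carry of 180 m.
relative_error = 2.4 / 180
print(f"relative error: {relative_error:.1%}")  # about 1.3%
```

An RMSE of roughly 1.3% of the average carry is what supports reading the reported scores as good accuracy.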

Therefore, we can say that the model demonstrates good predictive accuracy. These results suggest that the selected features, especially ball speed and launch conditions, are strong predictors of golf ball carry distance.
The plot below also confirms the good accuracy of the model. It compares predicted and actual carry distances. The two lines almost overlap across the entire dataset, with only small deviations between the testing data and the predictions.

Predict golf ball carry distance – use the linear regression model – play with some fictional golf shots

Now that we have an accurate model, we can play with it. Using my personal golf data, I can predict the distance I can reach with my 3-wood if I manage to maintain a particular ball speed and launch angle.

Here are some example scenarios in which we vary the two features, ball speed and launch angle, to get a good distance when hitting the golf ball:

Scenario        Ball speed  Launch angle  Predicted carry distance (m)
Stock shot      132         16            180
Low bullet      134         12            178
High soft       128         18            181
Toe miss        125         14            170
Perfect strike  136         16            186

The scenario table and point plot reinforce the intuition: changing ball speed and launch angle shifts the predicted carry as expected, while dispersion and apex, which correlate only weakly with carry, were removed by the correlation filter. With a model that scores well on both metrics, we can now experiment with it and, why not, improve our golf swing. To gain ball speed and a better launch angle, it can be worthwhile to work with a professional golf instructor and perhaps also go to the gym, as many professional golfers do.
