Stake sizing, Part 1

In the last post we used Python code to take a look at a classic gambling situation, the coin flip, to make a point about the importance of choosing the highest odds available to bet at. Today, we’ll again use the coin flipping example to investigate another fundamental principle of successful gambling: stake sizing.

Now, imagine we’re one of the lucky punters from the last post who were allowed to bet on a fair coin flip at odds of 2.03. As I stated then, this is pretty much like a license to print money – but how much of your bankroll should you bet on each flip of the coin? Knowing that the coin was indeed fair and you would be getting the best of it, a natural instinct could be to bet as much as you could possibly cough up, steal and borrow in order to maximize your profit. This is a poor strategy though, as we’ll soon come to see.

The reason for this is that even though we have come across a profitable proposition, our edge when betting on a (I’ll emphasize it again: fair) coin flip at 2.03 odds is only 1.5% – meaning that for each 1 unit bet we are expected to net 0.015 units on average. This conclusion should be absolute basics for anyone interested in serious gambling, but to make sure we’re all on the same page I’ll throw some maths at you:

The Expected Value, or EV, of any bet is, simply put, the sum of all possible outcomes multiplied by their respective probabilities – indicating the punter’s average profit or loss on each bet. So with our coin flip, we’ll win a net of 1.03 units 50% of the time and lose 1 unit 50% of the time; our EV is therefore 1.03 * 0.5 + (-1 * 0.5) = 0.015, for a positive edge of 1.5% and an average profit of 0.015 units per bet. For these simple types of bets though, an easier way to calculate EV is to divide the given odds by the true odds and subtract 1: 2.03 / 2.0 - 1 = 0.015.
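As a quick check of the arithmetic above, here is a minimal sketch of both calculations. The function names expected_value and edge_from_odds are made up for this illustration and are not part of the code used later in the post.

```python
def expected_value(odds, win_prob, stake=1.0):
    '''
    Expected net profit of a single bet at decimal odds:
    probability of winning times net win,
    minus probability of losing times the stake
    '''
    net_win = stake * (odds - 1)
    return win_prob * net_win - (1 - win_prob) * stake

def edge_from_odds(odds, true_odds):
    '''
    Shortcut for simple bets: given odds divided by true odds, minus 1
    '''
    return odds / true_odds - 1

# fair coin flip (true odds 2.0) offered at 2.03
ev = expected_value(2.03, 0.5)
edge = edge_from_odds(2.03, 2.0)
print(round(ev, 4), round(edge, 4))
```

Both routes land on the same 0.015 units per 1 unit bet, which is why the shortcut is handy for two-outcome bets like this one.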

An edge of only 1.5% is nothing to scoff at though, empires have been built on less, so we’ll definitely want to bet something – but how much?

Stake sizing comes down largely to personal preferences about risk aversion and tolerance of the variance innately involved in gambling, but with some Python code we can at least have a look at some different strategies before we set out to chase riches and glory flipping coins. Just like in the last post I’ll just give you the code with some comments in it, which will hopefully guide you through what’s happening before I briefly explain it.

Here we go:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

def coin_flips(n=10000,odds=1.97,bankroll=100,stake=1,bankrupt=False):
    '''
    Simulates 10000 coinflips for a single punter, betting at 1.97 odds,
    also calculates net winnings

    NEW: default bankroll and stake set at 100 and 1, respectively
    now also calculates if player went bankrupt or not
    '''

    # create a pandas dataframe for storing coin flip results
    # and calculate net winnings
    df = pd.DataFrame()
    # insert n number of coinflips, 0=loss, 1=win
    df['result'] = np.random.randint(2,size=n)
    # calculate net winnings
    df['net'] = np.where(df['result']==1,stake*odds-stake,-stake)
    # calculate cumulative net winnings
    df['cum_net'] = df['net'].cumsum()

    # calculate total bankroll
    df['bankroll'] = df['cum_net'] + bankroll

    # if bankroll goes below the default stake, punter will stop betting
    # count times bankroll < stake
    df['bankrupt'] = np.where(df['bankroll']<stake,1,0)
    # count cumulative bankruptcies, with column shifted one step down
    df['bankruptcies'] = df['bankrupt'].cumsum().shift(1)
    # the shift leaves the first row as NaN, replace it with 0
    df.fillna(0,inplace=True)
    # drop all flips after first bankruptcy
    if bankrupt:
        df = df[df['bankruptcies']==0]

    return df

First off, we’ll modify our original coin_flips function to take our punter’s bankroll and stake size into consideration, setting the bankruptcy threshold at the point where a default sized bet can no longer be made. By default, our punter will have an endless stream of 100 unit bankrolls, but if we set the parameter bankrupt to True, the function will cut away any coin flips after his first bankruptcy.
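To see how the cut-off works, here is a small sketch that applies the same cumsum-and-shift logic to a fixed sequence of results rather than random flips. The sequence, the starting bankroll of 3 and the name kept are made up for this illustration.

```python
import numpy as np
import pandas as pd

# hypothetical fixed sequence of results (1=win, 0=loss), not simulated
results = [0, 0, 1, 0, 0, 1, 1]
stake, odds, bankroll = 1, 2.03, 3

df = pd.DataFrame({'result': results})
# net winnings per flip and running bankroll
df['net'] = np.where(df['result'] == 1, stake * odds - stake, -stake)
df['bankroll'] = df['net'].cumsum() + bankroll
# flag flips after which a full stake can no longer be covered
df['bankrupt'] = np.where(df['bankroll'] < stake, 1, 0)
# shifting one step down keeps the bankrupting flip itself in the data
df['bankruptcies'] = df['bankrupt'].cumsum().shift(1).fillna(0)
# drop every flip after the first bankruptcy
kept = df[df['bankruptcies'] == 0]
print(kept[['result', 'bankroll', 'bankrupt']])
```

The punter goes bust on the fifth flip, so the two wins that follow are dropped, while the flip that caused the bankruptcy stays in and can be counted afterwards.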

def many_coin_flips(punters=100,n=10000,odds=1.97,bankroll=100,stake=1,color='r',plot=False,bankrupt=False):
    '''
    Simulates 10000 coinflips for 100 different punters,
    all betting at 1.97 odds,
    also calculates and plots net winnings for each punter

    NEW: now also saves punter bankruptcies
    '''

    # collect one row of results per punter
    # (DataFrame.append was removed in pandas 2.0)
    rows = []
    # loop through all punters
    for i in range(punters):
        # simulate coin flips
        df = coin_flips(n,odds,bankroll,stake,bankrupt)
        # calculate net
        net = df['net'].sum()
        # check for bankruptcy: 1 if the punter ever went bankrupt, else 0
        bankruptcy = df['bankrupt'].max()

        # store this punter's results
        rows.append({'odds':odds,
                     'net':net,
                     'bankrupt':bankruptcy})

        if plot:
            # plot the cumulative winnings over time
            df['cum_net'].plot(color=color,alpha=0.1)

    # build the punter dataframe from the collected rows
    punter_df = pd.DataFrame(rows)
    # check if punters ended up in profit
    punter_df['winning'] = np.where(punter_df['net']>0,1,0)

    return punter_df

We also want to modify the many_coin_flips function so that it’ll also take bankroll and stake size into consideration, counting up how many of our punters went bankrupt.

We won’t use the compare_odds function here, instead we’ll write a new one to compare stake sizing – but if we ever want to use it again sometime in the future a few minor changes will be needed here as well:

def compare_odds(punters=100,n=10000,odds=[1.97,2.00,2.03]):
    '''
    Simulates and compares coin flip net winnings
    after 10000 flips for 3 groups of punters,
    betting at odds of 1.97, 2.00 and 2.03, respectively.
    Also plots every punter's net winnings
    '''

    # create figure and ax objects to plot on
    fig, ax = plt.subplots()

    # set y coordinates for annotating text for each group of punters
    ys = [0.25,0.5,0.75]
    # assign colors to each group of punters
    cs = ['r','y','g']

    # loop through the groups of punters, with their respective odds,
    # chosen color and y for annotating text
    for odd, color, y in zip(odds,cs,ys):
        # run coin flip simulation with given odds, plot with chosen color
        df = many_coin_flips(punters,n,odd,color=color,plot=True)
        # calculate how many punters in the group ended up in profit
        winning_punters = df['winning'].mean()
        # set a text to annotate
        win_text = '%.2f: %.0f%%' %(odd,winning_punters * 100)
        # annotate odds and chance of profit for each group of punters
        ax.annotate(win_text,xy=(1.02,y),
                    xycoords='axes fraction', color=color,va='center')

    # set title
    ax.set_title('Chances of ending up in profit after %s coin flips' %n)
    # set x and y axis labels
    ax.set_xlabel('Number of flips')
    ax.set_ylabel('Net profit')
    # add annotation 'legend'
    ax.annotate('odds: chance',xy=(1.02,1.0),
                xycoords=('axes fraction'),fontsize=10,va='center')
    # add horizontal line at breakeven point
    plt.axhline(color='k',alpha=0.5)
    # set y axis range at some nice number
    ax.set_ylim(-450,450)

    # show plot
    plt.show()

Now, with all our previous coin flip functions taking bankroll and stake size into consideration, we can go ahead and evaluate a few stake sizing strategies with a new function:

def compare_stakes(punters=200,n=10000,odds=2.03,stakes=[100,50,25,10,5,2,1,0.5],bankroll=100):
    '''
    Similar to compare_odds, but here we instead want to compare different
    staking sizes for our coin flips betting at 2.03 odds

    Increased number of punters in each group, from 100 to 200

    Also prints out the results
    '''

    # collect one row of results per stake size
    # (DataFrame.append was removed in pandas 2.0)
    rows = []

    # colors to use in plot later, green=1=win, yellow=4=lost, red=2=bankrupt
    # (indices as in the seaborn default palette used at the time,
    # newer versions may order the colors differently)
    colors = [sns.color_palette()[i] for i in (1,4,2)]

    # loop through the groups of punters, with their respective stakes
    for stake in stakes:
        # run coin flip simulation with given stake and starting bankroll
        df = many_coin_flips(punters,n,odds,bankroll=bankroll,stake=stake,bankrupt=True)
        # calculate how many punters in the group ended up in profit
        winning_punters = df['winning'].mean()
        # ...and how many went bankrupt
        bankrupt_punters = df['bankrupt'].mean()
        # lost money but not bankrupt
        lose = 1 - winning_punters - bankrupt_punters

        # store this group's results
        rows.append({'stake':stake,
                     'win':winning_punters,
                     'lose':lose,
                     'bankrupt':bankrupt_punters})

    # pandas df to store results, with stake as index
    results_df = pd.DataFrame(rows,columns=['stake','win','lose','bankrupt'])
    results_df.set_index('stake',inplace=True)

    # plot, letting pandas create the figure and ax objects
    ax = results_df.plot(kind='bar',stacked=True,color=colors,alpha=0.8)
    # fix title, axis labels etc
    ax.set_title('Simulation results: betting %s coin flips at %s odds, starting bankroll %s' %(n,odds,bankroll))
    ax.set_ylabel('%')
    # set legend outside plot
    ax.legend(bbox_to_anchor=(1.2,0.5))

    # add percentage annotation for both win and bankrupt
    for x, w, l, b in zip(np.arange(len(results_df)),results_df['win'],results_df['lose'],results_df['bankrupt']):
        # calculate y coordinates
        win_y = w/2
        lost_y = w + l/2
        bankr_y = w + l + b/2

        # annotate win, lose and bankrupt %, only if >=4%
        if w >= 0.04:
            ax.annotate('%.0f%%' %(w * 100),xy=(x,win_y),va='center',ha='center')
        if l >= 0.04:
            ax.annotate('%.0f%%' %(l * 100),xy=(x,lost_y),va='center',ha='center')
        if b >= 0.04:
            ax.annotate('%.0f%%' %(b * 100),xy=(x,bankr_y),va='center',ha='center')

    plt.show()

By default, our new compare_stakes function creates a number of punter groups, all betting on fair coin flips at 2.03 odds with a starting bankroll of 100 units. For each group and its staking plan, the function records how many punters ended up in profit, how many lost and how many went bankrupt.

As we can see on the plot below, the results differ substantially:

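[Figure 01: stacked bar chart of win, lose and bankrupt shares per stake size, betting 10000 coin flips at 2.03 odds with a starting bankroll of 100]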

Just like last time, I want to remind you that any numbers here are only rough estimates, and increasing the size of each punter group as well as the number of coin flips will get us closer to the true values.

So what can we learn from the above plot? Well, the main lesson is that even if you have a theoretically profitable bet, your edge will count for next to nothing if you are too bold with your staking. Putting your whole bankroll at risk on each flip will see you go bankrupt around 96% of the time, and even if you bet as small as 2 units, you’ll still face a considerable risk of screwing up a lucrative proposition. The truth is that with such a small edge, keeping your bet small as well is the way to go if you want to make it in the long run.

But what if some fool offered us even higher odds, let’s say 2.20? First off, we would have to check if the person was A: mentally stable, and B: rich enough to pay us if (or rather, when) we win, before we go ahead and bet. Here our edge would be 10% (2.2 / 2.0 - 1), nearly 7 times as large as in the 2.03 situation, so we’ll likely be able to bet more – but how much? Well, the functions are written with this in mind, enabling us to play around with different situations and strategies. Specifying the odds parameter of our new function as 2.20, here’s what betting on a fair coin flip at 2.20 odds would look like:

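[Figure 02: stacked bar chart of win, lose and bankrupt shares per stake size, betting 10000 coin flips at 2.20 odds with a starting bankroll of 100]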

As can be seen from the new plot, with a larger edge we can go ahead and raise our stake size considerably, hopefully boosting our winnings as well. So the main take-away from this small exercise is that even if you have an edge, if you want to make it in the long run you’ll have to be careful with your staking to avoid blowing up your bankroll – but also that the larger your edge, the larger you can afford to bet.
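For anyone who wants to check this take-away without the plotting code, here is a minimal NumPy-only sketch that estimates bankruptcy rates for flat stakes. The function name bankruptcy_rate, the fixed seed and the shortened run of 2000 flips are assumptions made for this illustration, so the numbers will not match the plots exactly.

```python
import numpy as np

def bankruptcy_rate(odds, stake, bankroll=100, n=2000, punters=500, seed=0):
    '''
    Share of punters whose bankroll drops below one stake
    at some point during n fair coin flips with flat staking
    '''
    rng = np.random.default_rng(seed)
    # one row per punter, one column per flip, 0=loss, 1=win
    wins = rng.integers(0, 2, size=(punters, n))
    # net result of every flip
    net = np.where(wins == 1, stake * (odds - 1), -stake)
    # running bankroll for every punter
    paths = bankroll + net.cumsum(axis=1)
    # bankrupt if a full stake could not be covered at any point
    return (paths < stake).any(axis=1).mean()

for odds in (2.03, 2.20):
    for stake in (25, 5, 1):
        rate = bankruptcy_rate(odds, stake)
        print('odds %.2f, stake %s: %.0f%% bankrupt' % (odds, stake, rate * 100))
```

Because the same seed is reused, every combination sees the same sequence of flips, which makes the comparison between stakes and odds a little less noisy than fully independent runs would be.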

That’s it for now, but I’ll hopefully be back soon with a Part 2 about stake sizing, looking at a staking plan that actually takes your (perceived) edge into account when calculating the optimal stake size: The Kelly Criterion.


Allsvenskan 2016 – The Endgame

Before I continue with another Allsvenskan 2016 update – the last before the season ends – I have some news regarding the blog.

As some of you may know, I’ve been working part time for StrataBet this season, mostly writing game previews for the Norwegian Tippeligaen. As I’ll soon be taking on a new, full-time job elsewhere, I likely won’t have the time to write as much as I want. Also, with my new job focusing on Allsvenskan and Swedish football in general, I may be reluctant to give away too much information to the general public, so the future of this blog is very uncertain.

I’m hoping to continue writing in some form though, and what I do write will likely be closely linked to StrataBet as they’ve given me access to their great dataset.

Allsvenskan 2016 – The Endgame

Ok, so let’s get on with another update. With only 3 rounds left – the next starts tonight – we can see how much of the drama has gone out of the league table since last time. Malmö have retaken the top spot and thanks to Norrköping’s recent poor form the gap down to the title contenders is now 4 points. Sure, both Norrköping and AIK can still theoretically win the title, but I would be very surprised if Malmö let this slip out of their hands, despite the disappointing defeat to Östersund. They do have some disturbing injury problems though…

[Figure 01]

Göteborg have a chance to break into the top-3 and gain a European spot for next season, but this looks even more unlikely with 7 points up to AIK. At the other end of the table the bottom-3 have looked locked in for a long time. Helsingborg still have a chance to overtake Sundsvall, but again I’d be very surprised if this happens. In mid-table we see how Elfsborg, Kalmar and Hammarby have climbed a few spots at the expense of Örebro, Häcken and Östersund.

[Figure 02]

Counting up shots we see how Djurgården surprisingly is the best defensive side when it comes to denying the opposition chances to shoot. We also see how Gefle continue to be very bad and that Örebro still is the main outlier with A LOT of shots both taken and conceded.

[Figure 03]

Looking at effectiveness up front we see few changes since last time. Elfsborg have been slightly more effective with their shooting though, partly explaining their climb in the table. On the other end of the scale, Helsingborg have had a real problem scoring on their chances lately, with ZERO goals since the last update.

[Figure 04]

Looking at defensive effectiveness we see why Djurgården’s ability to deny the opposition chances hasn’t seen them climb into the upper half of the table: They still concede a lot of goals on the chances they do allow. Only bottom-of-the-table Falkenberg are worse. With Malmö and Norrköping’s effectiveness declining since last time, AIK now stands out as the far superior defensive side.

[Figure 05]

Not much has changed in terms of chance quality either – but what is interesting here is that Djurgården is the best defensive side when it comes to xG as well. So if they concede very few chances, and very little xG – why are they conceding all those goals? My guess is – I don’t have time to look it up – that the few chances they do concede are of higher quality. Djurgården have also had a lot of problems with goalkeepers this season. Of the 4 keepers used so far, only star signing Andreas Isaksson has looked stable enough, but he has picked up an injury and will be out for the remainder of the season.

I don’t know much about evaluating goalkeepers but have been thinking about doing a blog post about it for some time now. Hopefully I’ll get to it in the near future.

[Figure 06]

Looking at Expected Goals Difference, we see how Djurgården’s lack of defensive effectiveness has robbed them of a nice upper half finish. My model currently ranks them as 5th in the league, close to Hammarby in 4th – far above their current 11th place.

We also see how AIK have overtaken Norrköping in 2nd place, and with the reigning champions in poor form and just 3 points above AIK, this is where most of the drama left in the season lies. At the bottom of the table, Helsingborg are actually ranked far better than Sundsvall above them, but the 7 point gap will likely be too much for Henrik Larsson’s men with only 3 games remaining.

[Figure 07]

The model has always liked Malmö and they actually have the chance to secure the title tonight, if Norrköping lose away to Elfsborg while Malmö win away to Falkenberg – a not too unlikely outcome. In the race for 2nd place, AIK now have the upper hand much thanks to Norrköping’s recent poor results. Göteborg seems to have all but locked in the 4th place and the same goes for the bottom 3.

To continue my slight focus on Djurgården in this post, they’re interestingly projected to take about 6 points from their 3 remaining games: Helsingborg away, Häcken at home and Sundsvall away. Given their very disappointing season, and as a cynical Djurgården supporter, I doubt this.


Preview: AIK vs. IFK Göteborg

Monday night AIK will host IFK Göteborg for an extremely important game in the race for the Allsvenskan title. Both teams are close behind Norrköping in the lead and will surely go for the win here to challenge for the title, and I thought it would be a good idea to have a look at some team stats as a preview to this crucial game.

The plot below contains goals, shots, Expected Goals, xG per attempt, goal conversion % and shot on target % – both for and against, normalized per game where necessary. Home and away stats for each team in the league are separated with home in blue and away in red. For each subplot the lower right corner is preferable, with high offensive and low defensive numbers.

[Figure: aik_gbg_03]

Besides SoT%, both AIK and Göteborg appear to be among the best in the league in each stat, which partly explains why they are fighting for the title. What is really striking though, and could be seen as an indicator of team style, is that while AIK’s offensive numbers at home are really good, Göteborg’s strength when playing away is their defence.

[Figures: aik_gbg_01, aik_gbg_02]

This is also evident from each team’s xG maps, where it is clear that AIK’s main strength is their attacking power and ability to produce high volumes of shots with high xG values each game. Göteborg on the other hand rely heavily on their defensive skills to protect their box and limit the opposition’s scoring chances. This clash of styles adds yet another interesting flavor to an already interesting game.

[Figure: aik_gbg_04]

Looking at each team’s top 5 goalscorers it is clear that AIK’s impressive attack relies heavily on Henok Goitom. His 16 goals this season are pretty much in line with his xG of about 15, while Göteborg’s Søren Rieks seems to be overperforming with his 10 goals equalling almost twice his xG numbers. Both teams have sold one of their best offensive players, with Bahoui and Vibe both making a move abroad this summer.

What about a prediction then? While I won’t reveal any percentages for this (or any) game, what I can say is that my model is pretty much in tune with the betting market. AIK is a slight favourite due to their home advantage, but this is really anybody’s game and it will hopefully be highly entertaining.


Predicting the final Allsvenskan table

With the Swedish season soon coming to an end it’s a good time to try out how the Expected Goals model will predict the final table. With only three games left, a top trio consisting of this season’s big surprise Norrköping, just in front of Göteborg and AIK, is competing for the title of Swedish Champion. At the opposite end of the table Åtvidaberg, Halmstad and Falkenberg look pretty stuck, with the two latter teams battling it out for the possible salvation of the 14th place relegation play-off spot.

[Figure: predict_table_01]

Let’s take a look at the remaining schedule for the top three teams:

Norrköping have two tough away games left against Elfsborg and Malmö, who are both locked in a duel for the 4th place which could potentially mean a place in the Europa League qualification. Elfsborg are probably the tougher opponent here, with reigning champions Malmö busy in the Champions League group stage. Between these two away games Norrköping will play at home against Halmstad who are fighting for survival in the bottom of the table.

Göteborg have two tough away games themselves, first off at Djurgården and later a very important game against fellow title contenders AIK. This game will probably decide which of the two will challenge Norrköping for the title in the last round. Göteborg finish the season at home to Kalmar who could possibly play for their survival in this last game.

AIK have the best remaining schedule of the three top teams, with away games at Halmstad and Örebro on either side of the crucial home game against Göteborg. As mentioned, Halmstad are fighting for their existence in Allsvenskan, while Örebro’s recent great form has seen them through to a safe spot in the table.

At this late stage of the season there are a lot of psychological factors in play, with the motivation and spirit of teams and players often being connected to their position in the table. These aspects are very hard to quantify and have not been incorporated in my model. So my prediction of the table relies solely on my Expected Goals model used in a Monte Carlo simulation. I won’t reveal exactly how I simulate games but the subject will probably be touched upon in a later post so I’ll spare you any boring technical details for now.

Each of the remaining 24 individual games has been simulated 10,000 times. For each of these fictional seasons I’ve counted up the points, goals scored and goal differences for every team to come up with a final table for that season. Lastly I’ve combined all these seasons into a table with expected points and probabilities of each team’s possible league positions.
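To give a rough idea of what this kind of season simulation can look like, here is a minimal sketch. It is not the model used for the predictions in this post: it assumes that goals in each game follow independent Poisson distributions, the team names, points and expected goals are all made up, and ties on points are broken at random instead of by goal difference.

```python
import numpy as np

rng = np.random.default_rng(42)
n_sims = 10000

# hypothetical teams, current points and remaining fixtures
teams = ['Team A', 'Team B', 'Team C']
current_points = np.array([60, 58, 57])
# (home, away, expected home goals, expected away goals), made-up values
fixtures = [('Team A', 'Team B', 1.3, 1.1),
            ('Team B', 'Team C', 1.5, 1.0),
            ('Team C', 'Team A', 1.2, 1.4)]
index = {team: i for i, team in enumerate(teams)}

# one row per simulated season, one column per team
points = np.tile(current_points, (n_sims, 1))
for home, away, xg_home, xg_away in fixtures:
    # assumption: independent Poisson goals for each side
    goals_home = rng.poisson(xg_home, n_sims)
    goals_away = rng.poisson(xg_away, n_sims)
    draw = goals_home == goals_away
    # 3 points for a win, 1 for a draw, 0 for a loss
    points[:, index[home]] += np.where(goals_home > goals_away, 3, np.where(draw, 1, 0))
    points[:, index[away]] += np.where(goals_away > goals_home, 3, np.where(draw, 1, 0))

# expected final points per team
expected_points = points.mean(axis=0)
# simplification: ties on points are broken at random, not by goal difference
ranked = points + rng.random(points.shape) * 0.5
winner = ranked.argmax(axis=1)
title_prob = np.bincount(winner, minlength=len(teams)) / n_sims

for team, pts, prob in zip(teams, expected_points, title_prob):
    print('%s: %.1f expected points, %.0f%% title chance' % (team, pts, prob * 100))
```

Extending the sketch to full position probabilities would mean ranking every team in every simulated season and counting how often each team lands in each spot.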

[Figure: predict_table_03]

The model clearly ranks Norrköping as the most likely winner with Göteborg as the main contender, while AIK’s chances of winning the title are only about 18%. The bottom three look rather fixed in their current positions, with Falkenberg having only a 2% chance of overtaking Kalmar in the last safe spot in the table. At mid-table things are still quite open, even though Djurgården’s season is pretty much over with an 89% chance of placing 6th. Malmö seem to have an advantage against Elfsborg in the race for the 4th place, but given their Champions League schedule their chances should probably be less than the model predicts.

I’ll probably be posting updated predictions on my twitter feed after each of the top teams’ remaining games to see how the results change the predictions.
