Stake sizing, Part 1

In the last post we used Python code to take a look at a classic gambling situation, the coin flip, to make a point about the importance of choosing the highest odds available to bet at. Today, we’ll again use the coin flipping example to investigate another fundamental principle of successful gambling: stake sizing.

Now, imagine we’re one of the lucky punters from the last post who were allowed to bet on a fair coin flip at odds of 2.03. As I stated then, this is pretty much like a license to print money – but how much of your bankroll should you bet on each flip of the coin? Knowing that the coin was indeed fair and you would be getting the best of it, a natural instinct could be to bet as much as you could possibly cough up, steal and borrow in order to maximize your profit. This is a poor strategy though, as we’ll soon come to see.

The reason for this is that even if we have come across a profitable proposition, our edge when betting on a (I’ll emphasize it again: fair) coin flip at 2.03 odds is only 1.5% – meaning that for each 1 unit bet we are expected to net 0.015 units on average. This conclusion should be absolute basics for anyone interested in serious gambling, but to make sure we’re all on the same page I’ll throw some maths at you:

The Expected Value, or EV, of any bet is, simply put, the sum of each possible outcome’s net result multiplied by its probability – indicating the punter’s average profit or loss on each bet. So with our coin flip, we’ll win a net of 1.03 units 50% of the time and lose 1 unit 50% of the time; our EV is therefore 1.03 * 0.5 + (-1 * 0.5) = 0.015, for a positive edge of 1.5% and an average profit of 0.015 units per bet. For these simple types of bets though, an easier way to calculate EV is to divide the given odds by the true odds and subtract 1: 2.03 / 2.0 – 1 = 0.015.
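To make the arithmetic concrete, here is a minimal Python sketch of both ways of calculating the edge. The function names expected_value and edge_from_odds are my own, made up for illustration, and aren’t used anywhere else in this post.

```python
def expected_value(odds, win_prob, stake=1.0):
    """EV of a single bet: each net outcome times its probability."""
    net_win = stake * (odds - 1)   # profit when the bet wins
    net_loss = -stake              # loss when the bet loses
    return win_prob * net_win + (1 - win_prob) * net_loss


def edge_from_odds(offered_odds, true_odds):
    """Shortcut for simple bets: offered odds / true odds - 1."""
    return offered_odds / true_odds - 1


# fair coin flip (50% win chance, true odds 2.0) offered at 2.03
print(round(expected_value(2.03, 0.5), 4))   # 0.015
print(round(edge_from_odds(2.03, 2.0), 4))   # 0.015
```

Plugging in Pinnacle’s 1.97 instead gives a negative number, which is just another way of saying sucker bet.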

An edge of only 1.5% is nothing to scoff at though, empires have been built on less, so we’ll definitely want to bet something – but how much?

Stake sizing is largely down to personal preferences about risk aversion and tolerance of the variance innately involved in gambling, but with some Python code we can at least have a look at some different strategies before we set out to chase riches and glory flipping coins. Just like in the last post I’ll just give you the code with some comments in it, which will hopefully guide you through what’s happening before I briefly explain it.

Here we go:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

def coin_flips(n=10000,odds=1.97,bankroll=100,stake=1,bankrupt=False):
    '''
    Simulates 10000 coinflips for a single punter, betting at 1.97 odds,
    also calculates net winnings

    NEW: default bankroll and stake set at 100 and 1, respectively
    now also calculates if player went bankrupt or not
    '''

    # create a pandas dataframe for storing coin flip results
    # and calculate net winnings
    df = pd.DataFrame()
    # insert n number of coinflips, 0=loss, 1=win
    df['result'] = np.random.randint(2,size=n)
    # calculate net winnings
    df['net'] = np.where(df['result']==1,stake*odds-stake,-stake)
    # calculate cumulative net winnings
    df['cum_net'] = df['net'].cumsum()

    # calculate total bankroll
    df['bankroll'] = df['cum_net'] + bankroll

    # if bankroll goes below the default stake, punter will stop betting
    # count times bankroll < stake
    df['bankrupt'] = np.where(df['bankroll']<stake,1,0)
    # count cumulative bankruptcies, with column shifted one step down
    df['bankruptcies'] = df['bankrupt'].cumsum().shift(1)
    # the shift always leaves NaN in the first row, replace with 0
    df.fillna(0,inplace=True)
    # drop all flips after first bankruptcy
    if bankrupt:
        df = df[df['bankruptcies']==0]

    return df

First off, we’ll modify our original coin_flips function to take our punter’s bankroll and stake size into consideration, setting the bankruptcy threshold at the point where a default sized bet can no longer be made. By default, our punter will have an endless stream of 100 unit bankrolls, but if we set the parameter bankrupt to True, the function will cut away any coin flips after his first bankruptcy.
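The cumsum and shift trick that cuts away everything after the first bankruptcy can be a bit hard to follow, so here is a tiny sketch of it on its own. The bankroll numbers are made up purely for illustration, they are not output from the simulation.

```python
import numpy as np
import pandas as pd

stake = 1
# a made-up bankroll history, one value per flip
df = pd.DataFrame({'bankroll': [3.0, 2.0, 1.0, 0.0, 1.03, 0.03]})

# flag every flip where a full stake can no longer be covered
df['bankrupt'] = np.where(df['bankroll'] < stake, 1, 0)
# running count of bankruptcies, shifted one step down so the flip
# where the punter goes bust still has a count of 0
df['bankruptcies'] = df['bankrupt'].cumsum().shift(1).fillna(0)
# keep everything up to and including the first bankruptcy
kept = df[df['bankruptcies'] == 0]

print(kept)
```

Here the first four rows survive: the three flips before the punter goes bust, plus the flip that busts him. That last row is kept on purpose, since it’s the one carrying the bankrupt flag we count later on.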

def many_coin_flips(punters=100,n=10000,odds=1.97,bankroll=100,stake=1,color='r',plot=False,bankrupt=False):
    '''
    Simulates 10000 coinflips for 100 different punters,
    all betting at 1.97 odds,
    also calculates and plots net winnings for each punter

    NEW: now also saves punter bankruptcies
    '''

    # collect one row of results per punter
    rows = []
    # loop through all punters
    for i in np.arange(punters):
        # simulate coin flips
        df = coin_flips(n,odds,bankroll,stake,bankrupt)
        # calculate net
        net = df['net'].sum()
        # check for bankruptcy
        bankruptcy = df['bankrupt'].sum()

        # save this punter's results
        # (DataFrame.append was removed in pandas 2.0)
        rows.append({'odds':odds,
                     'net':net,
                     'bankrupt':bankruptcy})

        if plot:
            # plot the cumulative winnings over time
            df['cum_net'].plot(color=color,alpha=0.1)

    # create pandas dataframe from the collected punter results
    punter_df = pd.DataFrame(rows)
    # check if punters ended up in profit
    punter_df['winning'] = np.where(punter_df['net']>0,1,0)

    return punter_df

We also want to modify the many_coin_flips function so that it takes bankroll and stake size into consideration too, counting up how many of our punters went bankrupt.

We won’t use the compare_odds function here; instead we’ll write a new one to compare stake sizing. But if we ever want to use it again sometime in the future, a few minor changes will be needed here as well:

def compare_odds(punters=100,n=10000,odds=[1.97,2.00,2.03]):
    '''
    Simulates and compares coin flip net winnings
    after 10000 flips for 3 groups of punters,
    betting at odds of 1.97, 2.00 and 2.03, respectively.
    Also plots every punter's net winnings
    '''

    # create figure and ax objects to plot on
    fig, ax = plt.subplots()

    # set y coordinates for annotating text for each group of punters
    ys = [0.25,0.5,0.75]
    # assign colors to each group of punters
    cs = ['r','y','g']

    # loop through the groups of punters, with their respective odds,
    # chosen color and y for annotating text
    for odd, color, y in zip(odds,cs,ys):
        # run coin flip simulation with given odds, plot with chosen color
        df = many_coin_flips(punters,n,odd,color=color,plot=True)
        # calculate how many punters in the group ended up in profit
        winning_punters = df['winning'].mean()
        # set a text to annotate
        win_text = '%.2f: %.0f%%' %(odd,winning_punters * 100)
        # annotate odds and chance of profit for each group of punters
        ax.annotate(win_text,xy=(1.02,y),
                    xycoords='axes fraction', color=color,va='center')

    # set title
    ax.set_title('Chances of ending up in profit after %s coin flips' %n)
    # set x and y axis labels
    ax.set_xlabel('Number of flips')
    ax.set_ylabel('Net profit')
    # add annotation 'legend'
    ax.annotate('odds: chance',xy=(1.02,1.0),
                xycoords=('axes fraction'),fontsize=10,va='center')
    # add horizontal line at breakeven point
    plt.axhline(color='k',alpha=0.5)
    # set y axis range at some nice number
    ax.set_ylim(-450,450)

    # show plot
    plt.show()

Now, with all our previous coin flip functions taking bankroll and stake size into consideration, we can go ahead and evaluate a few stake sizing strategies with a new function:

def compare_stakes(punters=200,n=10000,odds=2.03,stakes=[100,50,25,10,5,2,1,0.5],bankroll=100):
    '''
    Similar to compare_odds, but here we instead want to compare different
    staking sizes for our coin flips betting at 2.03 odds

    Increased number of punters in each group, from 100 to 200

    Also prints out the results
    '''

    # collect one row of results per stake size
    results = []

    # colors to use in plot later, green=1=win, yellow=4=lost, red=2=bankrupt
    colors = [sns.color_palette()[i] for i in (1,4,2)]

    # loop through the groups of punters, with their respective stakes
    for stake in stakes:
        # run coin flip simulation with given stake and starting bankroll
        df = many_coin_flips(punters,n,odds,bankroll=bankroll,stake=stake,bankrupt=True)
        # calculate how many punters in the group ended up in profit
        winning_punters = df['winning'].mean()
        # ...and how many went bankrupt
        bankrupt_punters = df['bankrupt'].mean()
        # lost money but not bankrupt
        lose = 1 - winning_punters - bankrupt_punters

        # save this group's results
        # (DataFrame.append was removed in pandas 2.0)
        results.append({'stake':stake,
                        'win':winning_punters,
                        'lose':lose,
                        'bankrupt':bankrupt_punters})

    # pandas df to store results, with stake as index
    results_df = pd.DataFrame(results,columns=['stake','win','lose','bankrupt'])
    results_df.set_index('stake',inplace=True)

    # create figure and ax objects to plot on
    fig, ax = plt.subplots()
    # plot stacked bars on our ax object
    results_df.plot(kind='bar',stacked=True,color=colors,alpha=0.8,ax=ax)
    # fix title, axis labels etc
    ax.set_title('Simulation results: betting %s coin flips at %s odds, starting bankroll %s' %(n,odds,bankroll))
    ax.set_ylabel('%')
    # set legend outside plot
    ax.legend(bbox_to_anchor=(1.2,0.5))

    # add percentage annotation for both win and bankrupt
    for x, w, l, b in zip(np.arange(len(results_df)),results_df['win'],results_df['lose'],results_df['bankrupt']):
        # calculate y coordinates
        win_y = w/2
        lost_y = w + l/2
        bankr_y = w + l + b/2

        # annotate win, lose and bankrupt %, only if >=4%
        if w >= 0.04:
            ax.annotate('%.0f%%' %(w * 100),xy=(x,win_y),va='center',ha='center')
        if l >= 0.04:
            ax.annotate('%.0f%%' %(l * 100),xy=(x,lost_y),va='center',ha='center')
        if b >= 0.04:
            ax.annotate('%.0f%%' %(b * 100),xy=(x,bankr_y),va='center',ha='center')

    plt.show()

By default, our new compare_stakes function creates a number of punter groups, all betting on fair coin flips at 2.03 odds with a starting bankroll of 100 units. For each group and its staking plan, the function takes note of how many punters ended up in profit, how many lost and how many went bankrupt.

As we can see on the plot below, the results differ substantially:

01

Just like last time, I want to remind you that any numbers here are only rough estimates, and increasing the size of each punter group as well as the number of coin flips will get us closer to the true values.

So what can we learn from the above plot? Well, the main lesson is that even if you have a theoretically profitable bet, your edge will count for next to nothing if you are too bold with your staking. Putting your whole bankroll at risk will see you go bankrupt around 96% of the time, and even if you bet as small as 2 units, you’ll still face a considerable risk of screwing up a lucrative proposition. The truth is that with such a small edge, keeping your bet small as well is the way to go if you want to make it in the long run.
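If you only care about the bankruptcy numbers, the same experiment can be run a lot faster by skipping pandas and simulating all punters in one numpy array. This is a rough sketch under my own assumptions, not a replacement for the functions above: ruin_probability is a made-up helper name, the seed is fixed only to make runs repeatable, and a punter counts as bankrupt as soon as his bankroll drops below one stake, just like in coin_flips.

```python
import numpy as np

def ruin_probability(stake, odds=2.03, bankroll=100, n=10000, punters=200, seed=0):
    """Share of punters whose bankroll drops below one stake within n flips."""
    rng = np.random.default_rng(seed)
    # one row per punter, one column per flip, 0=loss, 1=win
    wins = rng.integers(0, 2, size=(punters, n))
    # net result of every single flip
    net = np.where(wins == 1, stake * (odds - 1), -stake)
    # bankroll after each flip
    running_bankroll = bankroll + net.cumsum(axis=1)
    # bankrupt if a full stake could not be covered at any point
    ruined = (running_bankroll < stake).any(axis=1)
    return ruined.mean()

for stake in (100, 10, 1):
    print(stake, ruin_probability(stake))
```

Just like with the plot, the numbers are only estimates and will move around a bit if you change the seed or the number of punters.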

But what if some fool offered us even higher odds, let’s say 2.20? First off, we would have to check if the person was A: mentally stable, and B: rich enough to pay us if (or rather, when) we win, before we go ahead and bet. Here our edge would be 10% (2.2 / 2.0 – 1), nearly seven times as large as in the 2.03 situation, so we’ll likely be able to bet more – but how much? Well, the functions are written with this in mind, enabling us to play around with different situations and strategies. Specifying the odds parameter of our new function as 2.20, here’s what betting on a fair coin flip at 2.20 odds would look like:

02

As can be seen from the new plot, with a larger edge we can go ahead and raise our stake size considerably, hopefully boosting our winnings as well. So the main take-away from this small exercise is that even if you have an edge, if you want to make it in the long run you’ll have to be careful with your staking to avoid blowing up your bankroll – but also that the larger your edge, the larger you can afford to bet.

That’s it for now, but I’ll hopefully be back soon with a Part 2 about stake sizing, looking at a staking plan that actually takes your (perceived) edge into account when calculating the optimal stake size: The Kelly Criterion.

Flipping coins, and the importance of betting at the highest odds

As I stated in the previous post, this blog will now focus more on gambling, using Python code to investigate whatever comes to my mind around the subject.

Today I’ll have a look at a classic gambling example – the flip of a coin – but before I go ahead and talk you through the code I want to state a few things that I know some of you will be wondering. Though R seems to be the language preferred by most in the football analytics scene, I have chosen Python simply because I feel it is so much more intuitive and easier to learn. RStudio seems to be the tool of choice for the R folks, but I don’t know of any real dominant counterpart for Python. I use Spyder, available through downloading Anaconda, mainly because it’s easy to use and comes with a lot of useful stuff pre-installed. If you’re thinking about testing it out yourself, I would suggest switching the color scheme of the editor to Zenburn for that dark and cool programming look that really makes your code look super important, and running your scripts in the included IPython console.

One final, very important thing: I am not in any way an expert programmer, statistician, mathematician or anything like that. I am simply a gambler looking to use these fields to get an edge. It’s totally OK to simply copy and paste any code I publish here to use yourself and play around with it however you may wish. If you notice any mistakes or if something doesn’t add up, please comment. I’m happy to learn new stuff.

Flipping coins, and the importance of betting at the highest odds

The inspiration for this post came the other day when I noticed that a few hours prior to kick-off in this year’s Super Bowl, the bookmaker Pinnacle offered 1.97 odds on the opening coin flip. A sucker bet, I thought to myself, knowing the true odds of a fair coin to be 2.00. The coin flip is a very popular Super Bowl prop bet though and as it was pointed out to me on Twitter, a few books actually offered the fair odds of 2.00. Choosing the highest odds available is crucial if you want to make money gambling in the long run, so I decided to write up a nice little Python script to visualise my point.

The layout of these blog posts will be that I simply throw a piece of code at you, before explaining it. The comments in the code itself should also help you out, and for those of you who already know Python much will be simple basics, while those who are completely new to coding or Python will hopefully learn a few things.

Here we go:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

def coin_flips(n=10000,odds=1.97):
    '''
    Simulates 10000 coinflips for a single punter,
    betting at 1.97 odds,
    also calculates net winnings
    '''

    # create a pandas dataframe for storing coin flip results
    # and calculate net winnings
    df = pd.DataFrame()
    # insert n number of coinflips, 0=loss, 1=win
    df['result'] = np.random.randint(2,size=n)
    # calculate net winnings
    df['net'] = np.where(df['result']==1,odds-1,-1)
    # calculate cumulative net winnings
    df['cum_net'] = df['net'].cumsum()

    return df

All right, so after importing all the needed modules for this piece, we go ahead and define our first function, coin_flips, which will be used to simulate the coin flips and calculate the net winnings of a single punter. I’ve chosen 10,000 flips and Pinnacle’s odds of 1.97 as our default values here.

Creating a pandas dataframe, we can easily store the result of each coin flip. Now, as we assume that the coin is fair, there’s no need to even consider which side our punter would call each time, instead we can simply go ahead and use numpy to simulate a series of ones and zeros, representing either a win or a loss. Calculating the net result of each flip is also very straightforward as when he wins, our punter will pocket the net end of the offered odds, 0.97, while losing will see his pocket lightened by 1 unit. Calculating the cumulative net winnings is also very easy using pandas’ built-in cumsum function.

For coding reasons, the function is set to return the dataframe so calling it will simply make a lot of numbers pop up, but running the coin_flips()['cum_net'].plot() command in the IPython console will let you simulate a punter’s coin flips, and also plot his cumulative net winnings like this:

01

Every time you run the command another simulation will run with a new, different result. Doing this a couple of times, you’ll likely understand why I described this as a sucker bet. Sure, you can get lucky and win, even a couple of times in a row – but betting with the odds against you, you’ll find it very hard to make a profit long term.

But that single punter flipping coins 10,000 times actually doesn’t say that much, maybe he just got unlucky? To dig deeper we want to know just how likely you are to end up with a profit after 10,000 coin flips. So we write another function, using the previous one to simulate the results of many more punters betting on 10,000 coin flips. How many do you think will end up in profit?

def many_coin_flips(punters=100,n=10000,odds=1.97,color='r'):
    '''
    Simulates 10000 coinflips for 100 different punters,
    all betting at 1.97 odds,
    also calculates and plots net winnings for each punter
    '''

    # collect one row of results per punter
    rows = []
    # loop through all punters
    for i in np.arange(punters):
        # simulate coin flips
        df = coin_flips(n,odds)
        # calculate net
        net = df['net'].sum()
        # save this punter's results
        # (DataFrame.append was removed in pandas 2.0)
        rows.append({'odds':odds,
                     'net':net})

        # plot the cumulative winnings over time
        df['cum_net'].plot(color=color,alpha=0.1)

    # create pandas dataframe from the collected punter results
    punter_df = pd.DataFrame(rows)
    # check if punters ended up in profit
    punter_df['winning'] = np.where(punter_df['net']>0,1,0)

    return punter_df

The slightly more complicated many_coin_flips function uses the earlier coin_flips to loop through a group of punters, 100 by default, and save their results into a new pandas dataframe, punter_df, where we’ll assign a 1 to all punters who ended up in profit while all the losers get a 0. We also plot each punter’s cumulative net winnings with a nice red color to symbolise their (very) likely bankruptcy.

This function also returns a dataframe so running it will again make a lot of numbers pop up in the console, but it also plots out the financial fate of each punter, like this:

02.png

As we can see, there actually are a few of our 100 punters who got lucky enough to end up winning after 10,000 coin flips. But most of them ended up way below the break-even point, losing a lot of money. If this was a real group of punters we can only hope that even if they were stupid enough to set out betting on 10,000 coin flips at these odds, they’d at least at some point realise their mistake and quit.
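Where the red lines end up is no mystery, by the way. As a small sketch (net_after_flips is a name I made up for this example, and it assumes 1 unit stakes on independent fair flips), we can calculate the average net result and its standard deviation straight from the odds:

```python
import math

def net_after_flips(n=10000, odds=1.97, win_prob=0.5):
    """Mean and standard deviation of total net winnings, 1 unit stakes."""
    # average net result of a single flip
    mean_per_flip = win_prob * (odds - 1) - (1 - win_prob)
    # a flip nets either odds-1 or -1, i.e. odds * (0 or 1) - 1,
    # so its standard deviation is odds * sqrt(p * (1 - p))
    sd_per_flip = odds * math.sqrt(win_prob * (1 - win_prob))
    # means add up over n flips, standard deviations grow with sqrt(n)
    return n * mean_per_flip, math.sqrt(n) * sd_per_flip

mean, sd = net_after_flips()
print(round(mean, 1), round(sd, 1))   # -150.0 98.5
```

So at 1.97 the average punter is down 150 units after 10,000 flips, and it takes a run of luck of roughly one and a half standard deviations just to get back to break-even.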

But how about if we change the offered odds? As I mentioned earlier, some books actually put up the fair odds of 2.00. How would 100 punters do after 10,000 coin flips betting at those odds? Well, we’ll have to write a new function for that. Also, just for fun (or to make a point) I’ve included an additional group of 100 punters lucky enough to be allowed to bet on the coin flips at odds of 2.03 – literally a license to print money.

def compare_odds(punters=100,n=10000,odds=[1.97,2.00,2.03]):
    '''
    Simulates and compares coin flip net winnings
    after 10000 flips for 3 groups of punters,
    betting at odds of 1.97, 2.00 and 2.03, respectively.
    Also plots every punter's net winnings
    '''

    # create figure and ax objects to plot on
    fig, ax = plt.subplots()

    # set y coordinates for annotating text for each group of punters
    ys = [0.25,0.5,0.75]
    # assign colors to each group of punters
    cs = ['r','y','g']

    # loop through the groups of punters, with their respective odds,
    # chosen color and y for annotating text
    for odd, color, y in zip(odds,cs,ys):
        # run coin flip simulation with given odds, plot with chosen color
        df = many_coin_flips(punters,n,odd,color)
        # calculate how many punters in the group ended up in profit
        winning_punters = df['winning'].mean()
        # set a text to annotate
        win_text = '%.2f: %.0f%%' %(odd,winning_punters * 100)
        # annotate odds and chance of profit for each group of punters
        ax.annotate(win_text,xy=(1.02,y),
                    xycoords='axes fraction', color=color,va='center')

    # set title
    ax.set_title('Chances of ending up in profit after %s coin flips' %n)
    # set x and y axis labels
    ax.set_xlabel('Number of flips')
    ax.set_ylabel('Net profit')
    # add annotation 'legend'
    ax.annotate('odds: chance',xy=(1.02,1.0),
                xycoords=('axes fraction'),fontsize=10,va='center')
    # add horizontal line at breakeven point
    plt.axhline(color='k',alpha=0.5)
    # set y axis range at some nice number
    ax.set_ylim(-450,450)

    # show plot
    plt.show()

This last function makes use of the two previous ones to simulate the coin flips of our three groups of punters, plotting their total net winnings all on the same ax object, which we later make use of to add a title and some nice labels to the axes. We also add a horizontal line to be able to better compare the punters’ winnings with the break-even point, as well as some text annotation to explain the colors of the three groups.

Now, running the compare_odds() function in the IPython console will hopefully result in something like this:

03

Here we clearly see just how important betting at the highest odds really is. Keep in mind though that the numbers to the right are only rough estimates. As you can see, the yellow group of punters who bet at the fair odds of 2.00 did not win exactly 50% of the time, but close enough. I actually had to re-run the function a few times to get this close. But it’s only natural since we only had 100 punters, a very small number in this context, in each of our groups. The more punters and coin flips we use in our simulations, the closer we’ll come to the real win percentages – but here speed is more important than super accuracy.
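For a simple bet like this one we can actually skip the simulation and calculate the real win percentages directly, since the number of winning flips follows a binomial distribution. The sketch below is my own addition (chance_of_profit is a made-up name) and assumes 1 unit stakes on a fair coin, with profit meaning a net strictly above zero as in the functions above.

```python
import math

def chance_of_profit(n=10000, odds=1.97):
    """Exact probability of a net profit after n fair flips, 1 unit stakes."""
    # profit requires wins * (odds - 1) > losses, i.e. wins > n / odds
    min_wins = math.floor(n / odds) + 1
    ways = 0
    # number of flip sequences with exactly min_wins wins
    coeff = math.comb(n, min_wins)
    for k in range(min_wins, n + 1):
        ways += coeff
        # step from C(n, k) to C(n, k + 1)
        coeff = coeff * (n - k) // (k + 1)
    # every one of the 2**n sequences is equally likely
    return ways / 2 ** n

for odds in (1.97, 2.00, 2.03):
    print(odds, round(chance_of_profit(odds=odds), 3))
```

This should land at roughly 6% for 1.97, just under 50% for 2.00 (a punter who ends up exactly level doesn’t count as a winner) and roughly 93% for 2.03, which is what the simulated groups will hover around.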

So as we clearly see in the above plot, betting on the coin flip at Pinnacle’s 1.97 odds really is a sucker bet, albeit an entertaining one if you were planning to watch the Super Bowl. But if you hope to make a profit from your betting, finding the highest available odds to bet on is crucial, as is shown by the green group of punters who were allowed to bet at odds of 2.03. It’s only a difference of 0.06, but it makes all the difference in the long run. The margins in betting are tiny, but they add up over time.

The lessons learned here can easily be transferred to sports betting in general and football betting in particular, where the Asian Handicap and Over/Under markets focus on odds around even money. The coin flip example is special though as we knew the true odds of the bet beforehand, something you’ll never be able to know betting on football. But as shown in the last plot, by consistently betting at the highest available odds, you at least give yourself a much better chance of ending up in profit.

With a new season approaching, the blog changes course: Gambling, probability and programming!

As you may have noticed, I haven’t written anything in months. There are two reasons for this, one being of course that the Swedish football season I’ve primarily focused on ended in November, but it’s also because I’ve taken on a new job. Working full time for the first time in my life has simply left me with little time to do any writing. (Yes, I did use the word time three times in that short sentence.)

But now, having settled in at the new job I’m anxious to get back to writing again. There’s one thing though: as I now work with compiling odds on Swedish football I wouldn’t feel comfortable publishing football analytics about Allsvenskan, telling you which teams are underrated and who’ll win the league title. And knowing I set the odds, and potentially profit from your mistakes, why would you believe anything I said?

So this blog will take on a slightly new focus: gambling. I originally set up the blog intending to write about this topic as well as football analytics, using maths, statistics, probability and psychology to discuss interesting things related to gambling, but the football part soon took over completely.

As I’ve published my football work on the blog I’ve now and then gotten some questions about programming, so I’ve taken the decision to include Python code whenever applicable. Learning to code has made a huge difference for me both in my gambling and football analytics endeavours, and though the blog won’t turn into a Python tutorial per se, if any of you who are new to programming should learn a new thing or two through my writing, I’d be glad.

I’m still hoping to write the occasional football analytics piece though, and if I do it’ll likely be for StrataBet, using their data as I did when I had a look at headers in Allsvenskan and Norway’s Tippeligaen.

That’s it for now, but I already have a new post in the works, coming up very shortly!

Allsvenskan 2016 summary pt. 2

A week ago I published the first part of – hopefully – three Allsvenskan 2016 summaries, then focusing on team performance. Now it’s time to have a look at individual players, much like I did back in July. Though there now exists detailed Opta data for Allsvenskan, my work on this site has mostly been based on the older, less detailed data sources focused on shots and thus this summary will only look at attacking players.

I’ve again had a look at Goal Contribution (goals+assists) and Expected Goals, dividing all players into three age groups, and also had a closer look at a few interesting players:

01

02.png

The Goal Contribution chart is unsurprisingly headed by Häcken’s John Owoeri who clinched the title as the league’s top scorer with his 4 goals against Falkenberg in the last round of the season. Interestingly, Owoeri only came alive in the second half of the season, scoring 15 of his 17 goals after the summer break.

03

Assist monster Magnus Wolff Eikrem sits in second, with his 0.71 assists per 90 minutes playing a big part in Malmö retaking the title. Of the other top players, Antonsson, Kjartansson and Nyman left the league during the summer transfer window but still impressed enough during the spring to remain in the top 10.

Djurgården’s Michael Olunga sits top among the players aged 20-23. Dubbed ‘The Engineer’ for his ongoing studies, Olunga just like Owoeri needed time to get going, scoring all of his 12 goals during the last 13 games when Mark Dempsey came in to steer Djurgården away from the relegation battle.

04

Comparing Owoeri and Olunga, it’s clear from the shot maps why Owoeri was the superior goalscorer this season. He only shoots slightly more than Olunga, but does so from far better locations closer to goal, with his average xG per shot at 0.16 while Olunga at 0.12 relies more on his finishing skill from longer range. If ‘The Engineer’ can work on his shot selection for next season I really think he can challenge for the top scorer title.

AIK’s Alexander Isak reigns supreme among the youngest players, with his 0.62 G+A90 very impressive for a player who only turned 17 late in the season. He’s quite good at getting into good shot locations as well, with 5 of his 10 goals coming from a sweet spot just in front of goal.

05

There’s been plenty of rumours of an upcoming big transfer during the winter window and looking very much like the real deal, Isak could very well break Zlatan Ibrahimovic’s transfer record from 2001. Here’s a nice radar plot from Ted Knutson showing Isak’s skills:

 

Malmö’s Vidar Kjartansson was the king of xG this season, and the club impressively still managed to secure the title after selling him during the summer transfer window. Kjartansson combined quantity with quality, taking most of his shots from very good locations with an average xG per shot of 0.2.

06

Östersund’s Abdullahi Gero was a bit of a surprise for me, but his shot locations are good with an average xG per shot close to Kjartansson at 0.19. He could very well go on to score more next season if given the chance in Graham Potter’s Östersund side which have done so well xG-wise this season – actually finishing 4th in xG Difference per game!

07

As a Djurgården supporter I’m glad to see 20-year old Tino Kadewere’s development this season. Though his 793 minutes played was less than the 900 needed to be included above, he racked up an impressive 0.79 G+A90 which would see him sit 8th overall, just above Olunga, and top the players aged 20-23 if the cut-off had been 1/4 of the league minutes played instead of 1/3. Focusing more on assists than Olunga, the two could form a dynamic partnership for Djurgården if they get the chance next season.
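For anyone wondering how the per 90 numbers and the minutes cut-off fit together, here’s a small Python sketch. It assumes a 30 game league season of 90 minutes per game, which is what gives the 900 minute limit at 1/3. The helper names are made up for this example, and the total of 7 goals and assists is a made-up illustration value, not a figure from my data.

```python
LEAGUE_MINUTES = 30 * 90   # assumption: 30 league games of 90 minutes

def per_90(total, minutes):
    """Scale a season total to a rate per 90 minutes played."""
    return total / minutes * 90

def qualifies(minutes, divisor):
    """True if the player has played at least 1/divisor of the league minutes."""
    return minutes >= LEAGUE_MINUTES / divisor

print(LEAGUE_MINUTES / 3, LEAGUE_MINUTES / 4)   # 900.0 675.0
print(qualifies(793, 3), qualifies(793, 4))     # False True
# hypothetical total of 7 goals+assists in 793 minutes
print(round(per_90(7, 793), 2))
```

The lower the cut-off, the more players with few minutes and inflated per 90 numbers sneak into the charts, which is why I went with 1/3 above.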

That’s it for now, but if you want to see more shot maps, just give me a shout on twitter. If I find the time, I’ll also write a third summary looking at how my predictions have done over the season and how my model did against the betting markets.

Allsvenskan 2016 summary pt. 1

With the season ending more than a week ago, I finally have enough time to sit down and write a summary. I’ll split it in parts, with the first two looking at team and player performance respectively, and hopefully I’ll get around to writing a third part in which I look at how my predictions have done, and the model’s performance on the betting markets.

These kinds of updates likely won’t return for next season when I’ll be taking on a new job compiling odds on Swedish football. I’m hoping to continue writing in some form though.

But enough about that, let’s get to it:

00

As most expected, Malmö bounced back from last season’s 5th place to reclaim the title from Norrköping. Luckily they did so without the need of the extra win awarded to them after the abandoned game against Göteborg where the home fans threw pyrotechnics at the Malmö players – and Tobias Sana responded with a spear throw.

At the other end of the table, Gefle were finally relegated after several years of clinging to their place in the top flight. Falkenberg’s extremely poor season saw them relegated as well, while Helsingborg will have to face third placed Superettan side Halmstad in a two-leg relegation play-off.

Take a look at Djurgården’s row of results by the way, only one draw!

01

Malmö were the best side in terms of shots taken and conceded as well, while bottom duo Gefle and Falkenberg really struggled together with Sundsvall, whose good start to the season saw them able to avoid the relegation battle despite only picking up two wins after the summer break. Örebro was an outlier throughout the season, usually producing some high-shooting games.

02

Göteborg were the most efficient attacking side during the season, but their low shot volume saw them unable to compete with the real top sides. AIK and Malmö relied on pure shot volume instead, probably a result of their ability to dominate games. Falkenberg on the other hand really struggled with both volume and effectiveness, usually needing nearly 11 shots to score.

03

At the other end of the pitch we see partly why Malmö were the superior side this season, and why AIK finally overtook Norrköping in second place: they both enjoyed some very efficient defending, clearly outperforming their opponents. Falkenberg struggled here as well, conceding a goal about every 5th shot, while Örebro actually did well efficiency-wise despite conceding a lot of shots.

04

Champions Malmö were the best side in terms of both xG and xG conceded, while Falkenberg’s poor defence was a big factor in their relegation. Djurgården, Hammarby and Östersund did better defensively than their final positions in the table might suggest, but were unable to break into the top mostly because of their weaker attacking output.

05

Ranking the teams by Expected Goals Difference did well to explain both ends of the table, getting the top 3 and bottom 2 correct. As mentioned, Sundsvall’s ‘lucky’ results at the start of the season saw them avoid the relegation battle, while Östersund, Hammarby and Djurgården formed an underperforming trio just below the top sides.

06

Simulating every game based on the shots taken and their xG values, we can give the teams ‘Expected Points’. This is very close to the xGD rankings above but we can see some differences, like Falkenberg ‘earning’ almost as many Expected Points as Gefle, which they were nowhere near in reality.
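As a rough sketch of how Expected Points for a single game could be simulated in Python: each shot is treated as an independent chance that goes in with probability equal to its xG value, the game is replayed many times, and the points are averaged. The independence assumption, the function name and the xG values are all assumptions made for this example; the model behind the charts isn’t shown here and may well work differently.

```python
import random


def simulate_game(home_xg, away_xg, n_sims=10000, seed=None):
    """Return (home, away) Expected Points for one game.

    home_xg / away_xg are lists with one xG value per shot.
    Assumes every shot is independent of the others.
    """
    rng = random.Random(seed)
    home_pts = away_pts = 0
    for _ in range(n_sims):
        # A shot becomes a goal when a random draw falls below its xG
        home_goals = sum(rng.random() < xg for xg in home_xg)
        away_goals = sum(rng.random() < xg for xg in away_xg)
        if home_goals > away_goals:
            home_pts += 3
        elif home_goals < away_goals:
            away_pts += 3
        else:
            home_pts += 1
            away_pts += 1
    return home_pts / n_sims, away_pts / n_sims


# Invented shot lists for illustration only
home_shots = [0.30, 0.10, 0.05, 0.45, 0.08]
away_shots = [0.12, 0.07, 0.20]
print(simulate_game(home_shots, away_shots, seed=1))
```

Summing these per-game values over a season would give each team’s Expected Points total.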

Let’s see how the teams actually did compared to their Expected Points:

07

What we see is that typically the winning sides overperform against Expected Points, while the losing sides underperform. This is to be expected as you’ll very rarely (or never) dominate a game enough for your expected points to match the three actual points awarded for a win. The same goes for losing, since you’ll pretty much always ‘earn’ more than zero Expected Points.
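The point that even a dominant side ‘earns’ fewer than three Expected Points can be checked exactly rather than by simulation. The sketch below builds the full goal distribution for each side from its shots (again assuming independent shots, which is an assumption of this example) and weighs the outcomes by points. The shot values are invented.

```python
def goal_distribution(shot_xgs):
    """Probabilities of scoring 0, 1, 2, ... goals from independent shots."""
    dist = [1.0]  # before any shot: 0 goals with certainty
    for p in shot_xgs:
        new = [0.0] * (len(dist) + 1)
        for goals, prob in enumerate(dist):
            new[goals] += prob * (1 - p)   # shot missed
            new[goals + 1] += prob * p     # shot scored
        dist = new
    return dist


def expected_points(xg_for, xg_against):
    """Exact Expected Points: 3 * P(win) + 1 * P(draw)."""
    f = goal_distribution(xg_for)
    a = goal_distribution(xg_against)
    win = sum(pf * pa for i, pf in enumerate(f) for j, pa in enumerate(a) if i > j)
    draw = sum(pf * pa for i, pf in enumerate(f) for j, pa in enumerate(a) if i == j)
    return 3 * win + draw


# Invented example: a very one-sided game
dominant = [0.15] * 15   # 15 decent chances
outplayed = [0.05] * 5   # 5 poor chances
print(expected_points(dominant, outplayed))  # below 3
print(expected_points(outplayed, dominant))  # above 0
```

Only a side whose win is literally certain reaches the full three points, which is why winners tend to sit above their Expected Points and losers below.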

08

There are always exceptions to the rule though, and this season Gefle stands out as having picked up pretty much exactly the points expected from them, which I would say is rare for a losing side, while Falkenberg look to have been very unlucky to pick up so few points.

09

Looking at the Expected Points distributions for the teams, we really see just how ‘unlucky’ Falkenberg have been. As mentioned above, losing sides will very often underperform against Expected Points but Falkenberg really stands out with a 100% chance of picking up at least the 10 points they ended up with, implied by the 10,000 seasons I ran through my simulation.
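A points distribution like this can be sketched in Python by replaying a whole season many times and counting how often a team reaches a given total. The per-game win and draw probabilities below are invented placeholders, and the function names are made up for this example; in practice the probabilities would come from the shot-based game simulations.

```python
import random


def simulate_seasons(game_probs, n_seasons=10000, seed=None):
    """Return a list of simulated season point totals.

    game_probs holds one (p_win, p_draw) pair per game.
    """
    rng = random.Random(seed)
    totals = []
    for _ in range(n_seasons):
        points = 0
        for p_win, p_draw in game_probs:
            r = rng.random()
            if r < p_win:
                points += 3
            elif r < p_win + p_draw:
                points += 1
        totals.append(points)
    return totals


def share_at_least(totals, threshold):
    """Share of simulated seasons ending on at least `threshold` points."""
    return sum(t >= threshold for t in totals) / len(totals)


# Invented probabilities: a struggling side over a 30-game season
season = [(0.20, 0.25)] * 30
totals = simulate_seasons(season, seed=1)
print(share_at_least(totals, 10))
```

A share close to 1.0 for the team’s actual points total means almost every simulated season ended with at least that many points.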

Djurgården

As I’ve done in a few updates, I’ll end this one looking at Djurgården. As a lifelong supporter I’ve become used to the ups and downs but despite that and the underlying good numbers I was still a bit concerned this season.

10

Luckily though, Mark Dempsey, the right man at the right time, stepped in and turned things around much like his former mentor Per-Mathias Høgmo did in 2013. Just like I’ve seen him do in Norway, Dempsey focused on a very direct attack which worked well to improve shot numbers and level out Djurgården’s dropping xGD, while at the same time crucially also getting some real results.

Defence continued to struggle though, and no real tactics beyond ‘get the ball up to the big boys up front’ were clearly visible. That’s a decent game plan to get them out of the hole they’d dug themselves into but not something to build on for the future, so the club’s decision to not give Dempsey a new contract looks reasonable.

That’s it for now, in a few days I’ll be looking closer at individual player performance.


Allsvenskan 2016 – The Endgame

Before I continue with another Allsvenskan 2016 update – the last before the season ends – I have some news regarding the blog.

As some of you may know, I’ve been working part time for StrataBet this season, mostly writing game previews for the Norwegian Tippeligaen. As I’ll soon be taking on a new, full-time job elsewhere, I likely won’t have the time to write as much as I want. Also, with my new job focusing on Allsvenskan and Swedish football in general, I may be reluctant to give away too much information to the general public, so the future of this blog is very uncertain.

I’m hoping to continue writing in some form though, and what I do write will likely be closely linked to StrataBet as they’ve given me access to their great dataset.

Allsvenskan 2016 – The Endgame

Ok, so let’s get on with another update. With only 3 rounds left – the next starts tonight – we can see how much of the drama has gone out of the league table since last time. Malmö have retaken the top spot and thanks to Norrköping’s recent poor form the gap down to the title contenders is now 4 points. Sure, both Norrköping and AIK can still theoretically win the title, but I would be very surprised if Malmö let this slip out of their hands, despite the disappointing defeat to Östersund. They do have some disturbing injury problems though…

01

Göteborg have a chance to break into the top-3 and gain a European spot for next season, but this looks even more unlikely with 7 points up to AIK. At the other end of the table the bottom-3 have looked locked in for a long time. Helsingborg still have a chance to overtake Sundsvall, but again I’d be very surprised if this happens. In mid-table we see how Elfsborg, Kalmar and Hammarby have climbed a few spots at the expense of Örebro, Häcken and Östersund.

02

Counting up shots we see how Djurgården surprisingly is the best defensive side when it comes to denying the opposition chances to shoot. We also see how Gefle continue to be very bad and that Örebro still is the main outlier with A LOT of shots both taken and conceded.

03

Looking at effectiveness up front we see few changes since last time. Elfsborg have been slightly more effective with their shooting though, partly explaining their climb in the table. On the other end of the scale, Helsingborg have had a real problem scoring on their chances lately, with ZERO goals since the last update.

04

Looking at defensive effectiveness we see why Djurgården’s ability to deny the opposition chances hasn’t seen them climb into the upper half of the table: They still concede a lot of goals on the chances they do allow. Only bottom-of-the-table Falkenberg are worse. With Malmö and Norrköping’s effectiveness declining since last time, AIK now stands out as the far superior defensive side.

05

Not much has changed in terms of chance quality either – but what is interesting here is that Djurgården is the best defensive side when it comes to xG as well. So if they concede very few chances, and very little xG – why are they conceding all those goals? My guess is – I don’t have time to look it up – that the few chances they do concede are of higher quality. Djurgården have also had a lot of problems with goalkeepers this season. Having used 4 keepers so far, only star signing Andreas Isaksson has looked stable enough but he has picked up an injury and will be out for the remainder of the season.
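One simple way to check a guess like this would be to look at xG conceded per shot. Here is a small Python sketch with two made-up teams (the names and numbers are placeholders, not real Allsvenskan data):

```python
def xg_per_shot(total_xg, shots):
    """Average chance quality: xG divided by number of shots."""
    return total_xg / shots if shots else 0.0


# Invented numbers: team -> (xG conceded, shots conceded)
conceded = {
    "Team A": (27.0, 250),  # few shots, little xG in total
    "Team B": (30.0, 330),  # more shots, more xG in total
}

quality = {team: xg_per_shot(xg, shots) for team, (xg, shots) in conceded.items()}
for team, value in quality.items():
    print(team, round(value, 3))
```

In this invented case Team A concedes fewer shots and less xG overall, yet each shot they allow is on average the better chance.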

I don’t know much about evaluating goalkeepers but have been thinking about doing a blog post about it for some time now, hopefully I’ll get to it in the near future.

06

Looking at Expected Goals Difference, we see how Djurgården’s lack of defensive effectiveness has robbed them of a nice upper half finish. My model currently ranks them as 5th in the league, close to Hammarby in 4th – far above their current 11th place.

We also see how AIK have overtaken Norrköping in 2nd place, and with the reigning champions in poor form and just 3 points above AIK, this is where most of the drama left in the season lies. At the bottom of the table, Helsingborg are actually ranked far better than Sundsvall above them, but the 7 point gap will likely be too much for Henrik Larsson’s men with only 3 games remaining.

07

The model has always liked Malmö and they actually have the chance to secure the title tonight, if Norrköping lose away to Elfsborg while Malmö win away to Falkenberg – a not too unlikely outcome. In the race for 2nd place, AIK now have the upper hand, largely thanks to Norrköping’s recent poor results. Göteborg seem to have all but locked in 4th place and the same goes for the bottom 3.

To continue my slight focus on Djurgården in this post, they’re interestingly projected to take about 6 points from their 3 remaining games: Helsingborg away, Häcken at home and Sundsvall away. Given their very disappointing season, and as a cynical Djurgården supporter, I doubt this.
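For reference, a projection like this boils down to adding up 3 × P(win) + 1 × P(draw) over the remaining fixtures. A minimal Python sketch follows; the probabilities are invented for illustration and are not the model’s actual numbers.

```python
def projected_points(fixtures):
    """Sum of expected points over fixtures given (p_win, p_draw) pairs."""
    return sum(3 * p_win + p_draw for p_win, p_draw in fixtures.values())


# Invented win/draw probabilities, NOT taken from the model
remaining = {
    "Helsingborg (away)": (0.55, 0.25),
    "Häcken (home)": (0.60, 0.20),
    "Sundsvall (away)": (0.65, 0.15),
}
print(round(projected_points(remaining), 2))  # 6.0
```

Note that getting to about 6 points from 3 games requires win probabilities well above 50% in each of them.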


Allsvenskan round 23 update

It’s been over six weeks since my last Allsvenskan update but now I finally have time to get to it. Six rounds have been played since last time and a lot has happened. Let’s take a look at the league table:

00

Compared to last time, we can immediately see that reigning champions Norrköping have climbed up above Malmö to claim the top spot, which is very impressive given the players who have left the club, and the mid-season managerial change.

At the other end of the table, Djurgården have (luckily for me) picked up pace under new manager Dempsey and moved up from 14th to 11th, while Helsingborg and Sundsvall have struggled – only picking up 2 points each.

Let’s have a closer look at how the teams have performed:

01

Despite giving up the first place in the table to Norrköping, Malmö have distanced themselves from the rest in terms of shot dominance. Not much else has changed, Örebro are still involved in some very open games while Gefle struggle to create chances.

02

Örebro and Elfsborg have moved into the ‘constant threat’ quadrant thanks to some effective scoring, while Hammarby have done the opposite. Kalmar have improved their effectiveness, but at the same time seen a drop in shots taken per game.

03

Here we see how AIK’s and Norrköping’s improvements come mainly from their defensive work; both sides have been better at keeping shots from going in since the last update. Kalmar’s defensive effectiveness has improved as well.

04

05

Expected goals for and against look much like they did last time but AIK’s defensive improvements have seen them close in on the top 2 sides, as they’ve increased their xGD by nearly 0.20 per game.

How about a prediction then?

06

Malmö’s defeat to Djurgården has really opened up the title race, but my model still fancies them. Norrköping have improved though, and we could be in for a very interesting finish to the season. AIK have improved as well, and have seemingly all but locked in a top-3 spot. At the other end of the table Falkenberg have plummeted from around 22 expected points to less than 16, with the model giving them no chance of reaching the relegation play-off spot occupied by Helsingborg.

Djurgården under Mark Dempsey

As mentioned earlier, as a Djurgården supporter I’m very happy with how the form has improved under new manager Dempsey. In the last update I showed the long-term trends leading up to Olsson’s sacking, and now that Dempsey’s been in charge for 7 games we can see how he’s managed to turn things around:

07

While shots conceded actually declined during Olsson’s last season, so did shots taken. What we see under Dempsey’s rule is clear: everything has improved! Djurgården now concede fewer and take more shots but more importantly, both actual goal difference and xG difference have improved, leading to more points and a climb in the league table.

Though it’s a bit of hindsight, through my work with Norwegian football I was optimistic about Dempsey coming in as I knew he would provide the energy needed for a turnaround. Let’s hope Djurgården can continue to pick up points to climb further.

Passing spiders

Another thing I mentioned in the last update was how Opta data is now available for Allsvenskan, and I showed some passing maps heavily inspired by 11tegen11 and David Sumpter. I’ve since played around with the script to create passing map animations, which received a lot of positive feedback on Twitter and have now been dubbed ‘passing spiders’, often a quite fitting name.

I don’t know enough about tactics to determine if these animations hold some analytical value, but they are fun to look at and could possibly be used to provide an interesting narrative of individual games combined with other types of analysis. I got a lot of good advice on improvements to the animations and will implement some of it in the future.

That’s it for now!
