Category Archives: Power BI

Power Query: Week or Weak

So it all started rather innocuously creating a dynamic date dimension using a Power Query function based on posts by Chris Webb and Matt Mason. Then I got into weeks…

  1. Lets take a set of dates covering four weekends across a new year (2015-18-12 to 2016-01-11)
(StartDate, EndDateas table =>
let

StartDate #date(2015,12,18),

EndDate #date(2016,1,12),

DayCount Duration.Days(Duration.From(EndDate – StartDate)),

Source List.Dates(StartDate,DayCount,#duration(1,0,0,0)),

TableFromList Table.FromList(SourceSplitter.SplitByNothing()),

ChangedType Table.TransformColumnTypes(TableFromList,{{“Column1”type
date}}),

RenamedColumns Table.RenameColumns(ChangedType,{{“Column1”“Date”}})
in

RenamedColumns

  1. Add Day Name – Date.DayOfWeekName
  2. Add Day of Week – Date.DayOfWeek
  3. Add Week of Year – Date.WeekOfYear
  4. Add Start of Week – Date.StartOfWeek
  5. Add End of Week – Date.EndOfWeek
  6. Add Week of Month – Date.WeekOfMonth
  7. Add Is In Current Week – Date.IsInCurrentWeek with my local machine date set to January 1 2016

Giving you:

There are a few unanticipated results above:

  1. DayOfWeek shows Monday = 0. This contradicts the msdn documentation which states the firstday argument (which is optional and I didn’t use) defaults to Sunday.

    It is also different to the DAX function WEEKDAY() which returns an integer 1 to 7 (not 0 to 6) with a default of the week actually beginning on Sunday (on the same machine with same locale settings):

    WEEKDAY_DAX = WEEKDAY(‘Invoked Function (2)[Date])

    Similarly it is different from t-SQL DATEFIRST and ISO 8601 which have a base value of 1.

    This is being driven by locale settings in the pbix file, as below.

    File | Options and settings | Options | Current File | Regional Settings | Locale

    2017-01-26_16-17-32

    It looks like the Power Query function is aligned to C# enumeration but default behaviour is actually set by locale.

  2. WeekOfYear isn’t ISO 8610 compliant. This is just silly so you should vote for this Power BI idea.
  3. The following mismatch happens when a week spans January 1:
    Day of Week Week of Year Start of Week End of Week Week of Month Is In Current Week
    Day of Week

    N

    Y

    Y

    N

    Y

    Week of Year

    N

    N

    N

    Y

    N

    Start of Week

    Y

    N

    Y

    N

    Y

    End of Week

    Y

    N

    Y

    N

    Y

    Week of Month

    N

    Y

    N

    N

    N

    Is In Current Week

    Y

    N

    Y

    Y

    N

If you’re interested in dates in t-SQL and SSIS, have a look at my previous post SSIS vs t-sql datetime datatypes

To be fair to Microsoft, the concept of a week is a trivial construct that only loosely aligned to celestial movements that made more sense with it’s origins in the Babylonian Calendar than it does in the Gregorian Calendar.

Rugby World Cup with Power BI

image

Coming from a country where rugby is bigger than religion, ingrained and inescapable, I thought it would be only natural to collect some data, present some visualisations and infer some convenient conclusions.

I did something similar to this for the 2012 Olympic Games (Excel 2013 PowerView Olympics), but that was in the pre-Power BI days.

My family includes Scots (IRB rank #9), Finns (#91), English (#8), Czechs (#34) and Kiwis (#1). So whilst I don’t have to deal with this sort of trans-tasman marital problem watching any form of international sporting event can quickly become a brutal training ground in geo-political diplomacy!

Getting the data

The official RFU site does have some statistics here, but it doesn’t present them in nice tidy html tables for Power BI to suck up. Thankfully ESPN have a heap of Rugby match results data available here. It’s well worth a browse with the advanced filter options including Team, Opposition, Home or away, Host country, Ground, Match date, Trophy, Tournament/tour, Match type, Match result, Match involving players, Match involving captains, Points scored in a match, Points conceded, Points difference, Halftime score, Halftime conceded, Halftime difference, Tries scored in a match, Conversions scored, Penalty goals scored, Drop goals scored, Goals from a mark, View format, Group figures by, Result qualifications, Sort results by, Results per page.

One little gripe with the presentation is that the html tables are split across pages so have to be iterated with their respective url. As the page numbers are a simple parameter in the url, I have used Matt Masson’s Iterating over multiple pages of web data using Power Query.

I also added ?size=200 to get more data rows per page to reduce the web calls from Power BI which were a bit slow.

There are six primary data sets and three merged/unioned data sets. I have separated out the queries into Historical and 2015, so that all the static data didn’t require refreshing during the tournament, then merged them using Table.Combine. The same approach was used for Infringements but because the Red Card and Yellow Card tables were in separate html tables on the RFU site.
The only transformations used beyond column renames etc are:
Changed data type using Table.TransformColumnTypes (Text to Whole Number and Text to Date)

Table granularities:
PlayerResults: one row per player per game
MatchResults: one row per team per game (ie two rows per game).
PlayerInfringements: one row per player per game where player incurred an infringement

MatchResultsHistorical

Returns team results per match with the below fields:

image

Parameters were set as:

Web Parameter Name Web String URL Parameter Name URL string
Trophy “IRB Rugby World Cup” trophy 17
Match date to 31 Dec 2014 spanmax1 31+Dec+2014
View format “Match list” view match
Type “TEAM RECORDS” type team

which gave a paramaterised url of:

http://stats.espnscrum.com/statsguru/rugby/stats/index.html?class=1;page=1;size=200;spanmax1=31+Dec+2014;spanval1=span;template=results;trophy=17;type=team;view=match

PlayerResultsHistorical

Returns player results per match with the below fields:

image

Parameters were set as:

Web Parameter Name Web String URL Parameter Name URL string
Trophy “IRB Rugby World Cup” trophy 17
Match date to 31 Dec 2014 spanmax1 31+Dec+2014
View format “Match list” view match
Type “PLAYER RECORDS” type player

which gave a paramaterised url of:

http://stats.espnscrum.com/statsguru/rugby/stats/index.html?class=1;page=1;size=200;spanmax1=31+Dec+2014;spanval1=span;template=results;trophy=17;type=player;view=match

MatchResults2015

Parameters were set as:

Web Parameter Name Web String URL Parameter Name URL string
Trophy “IRB Rugby World Cup” trophy 17
Match date from 01 Jan 2015 spanmin1 01+Jan+2015
Match date to 31 Dec 2015 spanmax1 31+Dec+2015
View format “Match list” view match
Type “TEAM RECORDS” type team

which gave a paramaterised url of:

http://stats.espnscrum.com/statsguru/rugby/stats/index.html?class=1;page=1;size=200;spanmax1=31+Dec+2015;spanmin1=01+Jan+2015;spanval1=span;template=results;trophy=17;type=team;view=match

PlayerResults2015

Parameters were set as:

Web Parameter Name Web String URL Parameter Name URL string
Trophy “IRB Rugby World Cup” trophy 17
Match date from 01 Jan 2015 spanmin1 01+Jan+2015
Match date to 31 Dec 2015 spanmax1 31+Dec+2015
View format “Match list” view match
Type “PLAYER RECORDS” type player

which gave a paramaterised url of:

http://stats.espnscrum.com/statsguru/rugby/stats/index.html?class=1;page=1;size=200;spanmax1=31+Dec+2015;spanmin1=01+Jan+2015;spanval1=span;template=results;trophy=17;type=player;view=match

Visualising the data

I like bar charts, I don’t like pie charts. The charts are quite simple, but a few points to note:

-Results generally sorted by measure

-Comparison of Average vs SUM aggregation methods as some teams have played in all eight tournaments whilst others have only appeared once.

Tenuous Conclusions

Highest Point scoring team average: New Zealand (47)

Highest Point scoring team total: New Zealand (2248)

Highest Try scoring team average: New Zealand (6)

Highest Try scoring team total: New Zealand (306)

Biggest winning difference average: New Zealand: (33)

Biggest losing difference average: Namibia (-49)

Highest Points scoring Player: Johnny Wilkinson (272)

Highest Try scoring Player: Jonah Lomu (12)

Most Yellow Cards: Tonga

Most Red Cards: Canada

Best Position to Score Tries: Wing

Best Position to Score Points: First five-eighth or as the English call it Fly Half

Most points scored in Tournament: 2835 (2003)

Least points scored in Tournament: 1197 (1991) *four point tries

Most tries scored in Tournament: 332 (2003)

Least tries scored in Tournament: 148 (1991)

Most competitive tournament (min ∑point difference): 1991

Least competitive tournament (min ∑point difference): 2003 *England won this year…

Get a copy

Because of how the sharing currently works in Power BI (only dashboards outside of domain, not reports and datasets) you can vote here to fix it.

I have uploaded the Power BI desktop .pbix file here.

%d bloggers like this: