A brickwork fractal

On my way to work in Yaletown, I walk along some sidewalks with interesting brick patterns. The three above happen to satisfy a nice mathematical property: the bricks are arranged so that no four corners meet at the same place.

bricks.jpeg
Stretcher bond, herringbone, and pinwheel brick layouts
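The four-corner rule is easy to check by machine. Here is a minimal sketch, assuming a layout of rectangular bricks is drawn on a unit grid where each cell holds the label of the brick covering it. The function name, the grid encoding, and the two toy layouts are my own assumptions for illustration.

```python
def four_corner_points(grid):
    """Return the interior grid vertices where four different bricks meet.

    `grid` is a list of rows; each cell holds the label of the (rectangular)
    brick covering that unit square. This encoding is an assumption made
    for the sketch.
    """
    rows, cols = len(grid), len(grid[0])
    points = []
    for r in range(1, rows):
        for c in range(1, cols):
            # The four unit cells touching the vertex at (r, c).
            around = {grid[r - 1][c - 1], grid[r - 1][c],
                      grid[r][c - 1], grid[r][c]}
            if len(around) == 4:
                points.append((r, c))
    return points


# Two rows of 1x2 bricks stacked directly on top of each other.
stack = [
    ["a", "a", "b", "b"],
    ["c", "c", "d", "d"],
]

# The same bricks with the second row shifted by half a brick (stretcher bond).
stretcher = [
    ["a", "a", "b", "b"],
    ["c", "d", "d", "e"],
]

print(four_corner_points(stack))      # [(1, 2)]
print(four_corner_points(stretcher))  # []
```

The stacked layout breaks the rule at one vertex, while the shifted rows only ever produce T-junctions.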

As my relative Oliver Linton has pointed out, paving bricks come in a variety of shapes and sizes, which allows for many more beautiful tilings. For example, adding 2×2 square tiles makes it possible to construct rectangular tilings that fit together to tessellate the plane while preserving the four-corner rule.

boundary_wall.png

This is a very exciting observation — at least, to anybody who likes recursion! We might be able to use the same trick to construct an infinite sequence of increasingly intricate tilings that converge to a self-similar “fractal tiling”. The simplest non-trivial example I could find involves a set of 5×5, 10×10, and two 5×10 rectangular tilings.

tiles.png

Starting with any of these four layouts, we can replace each of the 1×1, 2×2, and 1×2 bricks with a corresponding 5×5, 10×10, or 5×10 rectangular tiling in the correct orientation. (This will produce a few four-corner intersections, but we can fix these by merging adjacent pairs of 1×2 bricks.)

Repeatedly performing this operation gives an infinite sequence of tilings, but can we say they converge to anything? A tiling T can be identified with its outline ∂T (i.e. the set of points on boundaries between two or more tiles). Note that if a point is in ∂T_i, then it will be in every subsequent ∂T_j unless it lies on one of the few boundaries erased when bricks are merged to form T_{i+1}. So we might sensibly define the limiting object of the tiling sequence T_i as the union ⋃_i (∂T_i ∩ ∂T_{i+1}). This self-similar, dense, path-connected set satisfies the topological equivalent of the “four corners rule”, which makes for a pretty interesting list of mathematical properties!

The same strategy could be applied to other sets of (2n+1)×(2n+1), (4n+2)×(4n+2), and (2n+1)×(4n+2) tiles with similar boundaries. What’s the prettiest brickwork fractal you can find?

What’s in a name?

Ever since I first heard about it, I’ve been itching to play around with the name data released by the BC Vital Statistics Agency: a list of all the names that appear on BC’s birth certificates for each of the last hundred years! I was curious to see what I could find out about the popularity and gender distribution of given names, so I decided to dedicate a couple of blog posts to exploring it. If you’re interested in playing with the data yourself, you can find them here and here.

To get a feel for how the data looks, I’ve plotted the historical frequency of a particular name, chosen completely at random and not at all to satisfy my personal curiosity. (Ed. note: it’s “Ross”.)

As you might have guessed even before you looked at it, the graph has two peaks centered around the baby boom and millennial generations. Before we draw any conclusions about the relative popularity of the particular name, we’ll have to compare it to the overall birth rate (cf. xkcd 1138).

Screen Shot 2018-02-07 at 9.13.25 PM.png
Total births recorded in the dataset

Although they start off looking pretty similar, the top curve has a much shorter peak during the echo boom. This suggests that my name has gradually been declining in popularity since the 1960s or so, which jibes with my experience: I know exactly one other Ross in my generation, two from my parents’, and none younger than me.
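The comparison above amounts to dividing each year's count for a name by that year's total births. A minimal sketch, assuming both series are stored as dictionaries keyed by year; the function name and every number below are made up for illustration and are not taken from the BC dataset.

```python
def name_share(name_counts, total_births):
    """Fraction of each year's recorded births that were given the name."""
    return {
        year: name_counts.get(year, 0) / total_births[year]
        for year in sorted(total_births)
        if total_births[year] > 0
    }


# Hypothetical counts, chosen only to show the effect of normalizing.
ross_counts = {1960: 50, 1990: 40}
all_births = {1960: 10_000, 1990: 20_000}

shares = name_share(ross_counts, all_births)
print(shares)
```

In this toy example the raw counts drop only slightly (50 to 40), but the share of births falls by more than half, which is the kind of decline that raw counts alone can hide.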

A few other assorted facts about my name:

  • I was one of twenty-two Rosses born in BC in 1988, which explains why I never had to use a last initial in school.
  • Ross was the 306th most common of the 980 names listed for my birth year, tied with names like Felicia, Gabriel, Martin, and George.

Next, let’s look at how the most popular names have changed over time. I compiled lists of the most common names in each of four decades: the most recent years in the dataset (2005–2014), my own generation (1985–1994), my parents’ generation (1955–1964), and my grandparents’ generation (1925–1934).

Female names
Rank  1925–1934  1955–1964  1985–1994  2005–2014
1     Mary       Susan      Jessica    Olivia
2     Margaret   Karen      Amanda     Emma
3     Dorothy    Sandra     Sarah      Emily
4     Shirley    Linda      Ashley     Ava
5     Patricia   Deborah    Jennifer   Sophia

Male names
Rank  1925–1934  1955–1964  1985–1994    2005–2014
1     John       David      Michael      Ethan
2     William    Robert     Matthew      Liam
3     Robert     Michael    Christopher  Jacob
4     James      John       Ryan         Lucas
5     George     James      Kyle         Benjamin

It’s interesting to note that, while the most popular female names are completely different from generation to generation, the same is not true for male names. John, James, and Robert all appear in the first two lists, while Michael hops from number three in 1955–64 to number one in 1985–94. It’s not until the current generation that we see an entirely new batch of male names in the top five.

Screen Shot 2018-02-07 at 9.13.36 PM.png

That pattern is not the only reason why I kept the top lists separated by assigned gender. As you can see by comparing the above chart with the one below, the top female names account for a smaller share of total births than the top male names. For example, Susan was the most common female name in 1955–64, but would have been in a virtual tie for seventh in a combined list.

Screen Shot 2018-02-07 at 9.13.47 PM.png

I’m not sure whether the top-heaviness of male names is because there have been fewer of them historically for parents to choose from, or whether there are additional cultural factors at work, like the primarily-male “Junior” naming convention. Regardless of the reasons, it seems that the effect is decreasing over time, as names for both assigned genders are becoming more evenly distributed.

Screen Shot 2018-02-07 at 9.13.57 PM.png
Proportion of recorded birth certificates bearing one of the top ten male names or top ten female names (respectively blue and pink, sorry)
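A chart like this one boils down to a single number per year: the share of births that went to the most common names. A minimal sketch, assuming one year's counts are held in a dictionary; the function name and the toy counts are hypothetical.

```python
from collections import Counter


def top_k_share(counts, k=10):
    """Share of all recorded births that went to the k most common names."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    top = Counter(counts).most_common(k)
    return sum(n for _, n in top) / total


# Made-up counts for a single year, not taken from the dataset.
toy_year = {"A": 50, "B": 30, "C": 20}
print(top_k_share(toy_year, k=2))  # 0.8
```

Running this separately on the male and female tables for each year would give two curves to compare.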

So far, although I’ve tried to be careful with my language, this post has largely accepted the dataset’s assumed gender framework. But I’m very curious to learn about names that challenge this framework. In particular, what are the most common gender-nonspecific names?

Before I compile a list, I have to choose how to compare names: for example, between Quinn (24♂ + 47♀ = 71 births) and Dylan (85♂ + 9♀ = 94 births), should Quinn rank higher because it is more balanced between assigned genders? Or should Dylan be higher because it’s more common overall? Let’s strike a balance between parity and popularity by listing names according to how often they were recorded with their secondary gender. (In the above example, Quinn places higher because 24 > 9.)
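The ranking rule above can be sketched in a few lines, using the Quinn and Dylan counts from the example. The function name and the (male, female) tuple layout are my own assumptions.

```python
def secondary_count(male, female):
    """Number of births recorded under the name's less common assigned gender."""
    return min(male, female)


# (male, female) counts quoted in the example above.
names = {"Quinn": (24, 47), "Dylan": (85, 9)}

ranked = sorted(names, key=lambda n: secondary_count(*names[n]), reverse=True)
print(ranked)  # ['Quinn', 'Dylan']
```

Taking the minimum rewards a name only when it is reasonably common under both assigned genders, so a very popular but lopsided name like Dylan cannot outrank a smaller, more balanced one.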

Rank  1925–1934  1955–1964  1985–1994  2005–2014
1     Francis    Kelly      Taylor     Riley
2     n/a        Leslie     Jamie      Taylor
3     n/a        Terry      Jordan     Avery
4     n/a        Kim        Morgan     Quinn
5     n/a        Robin      Devon      Jordan

As it turns out, the ranking method doesn’t make much difference for the first list — only one name appears on both tables for more than one year between 1925 and 1934! Thanks to the lower birthrate during this period, it’s possible that some less common names are missing from the list, but it’s also the case that gender-nonspecific names were just rare back then.

Screen Shot 2018-02-07 at 9.14.07 PM.png
Percent of newborns given a gender-nonspecific name (at least 10 births/year in both datasets)

In my grandparents’ day, Francis was in the top half of female names despite being less common than the alternate spelling Frances; it also ranked 39th of 262 among male names.

Screen Shot 2018-02-07 at 9.14.18 PM.png
Proportion of newborns given Francis as a female name (top) or as a male name (bottom, shaded)

As BC’s population boomed, so did the percentage of babies given names spanning multiple genders. This trend was led by Leslie, a previously male-exclusive name that gained popularity as a female name during World War II.

Screen Shot 2018-02-07 at 9.14.26 PM.png
Proportion of newborns given Leslie as a female name (top) or as a male name (bottom, shaded)

In the 1950s and ’60s, the previously-unknown name Kelly rode a gender-symmetric wave of popularity.

Screen Shot 2018-02-07 at 9.14.34 PM.png
Proportion of newborns given Kelly as a female name (top) or as a male name (bottom, shaded)

Taylor really caught on in my generation: in addition to topping our charts, it was one of the most common names in the ’90s, period.

Screen Shot 2018-02-07 at 9.14.41 PM.png
Proportion of newborns given Taylor as a female name (top) or as a male name (bottom, shaded)

Over the last decade, Riley has earned the top spot among gender-nonspecific names through a slow but consistent accumulation of namesakes.

Screen Shot 2018-02-07 at 9.14.53 PM.png
Proportion of newborns given Riley as a female name (top) or as a male name (bottom, shaded)

When it comes to gender-neutral names, the above lists only tell part of the story. Vital Statistics only records people’s names and assigned genders at birth, not what they choose to go by once they’re old enough to establish their own identities. Because of this, the half-dozen people I know of as “Chris” actually show up in different (heavily gendered) rows in the dataset.

Screen Shot 2018-02-07 at 9.15.16 PM.png

It would be very neat to see an analysis of name data that ranks clusters of related names against each other!

The 140-character spike

I don’t intentionally set out to do so, but I’ve noticed my tweets gravitating towards Twitter’s character limit. Sometimes it’s the result of a too-long idea being meticulously edited down to size; sometimes it’s purely chance. Either way, it’s oddly satisfying to post a tweet with exactly 140 characters.

How often do tweets max out their character limits? What’s the average length of a tweet? To answer these questions, I turned to a set of tweets collected by Chang, Caverlee, and Lee in the fall of 2009. Filtering out retweets, I was left with over four million of them, of which over 2% used their entire 140 characters.

The shape of the character distribution is fascinating. One-word tweets are understandably very rare, but it doesn’t take long for the distribution to reach its first mode at 35 characters. The curve gradually and smoothly trails off to a local minimum around 116 characters, before positively spiking after 135. The average length is a bit more than 68 characters and the median a bit lower at 62. It looks like a lot of tweetable ideas can be expressed in five or ten words, and there are a lot of people valiantly trying to squeeze in something that’s slightly too big for the text box.
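The summary statistics above are straightforward to compute once the tweets are loaded as strings. A minimal sketch, assuming a plain list of tweet texts; the function name and the four toy "tweets" are invented for illustration, and Python's `len` counts Unicode code points, which may not match how Twitter counted characters.

```python
from collections import Counter
from statistics import mean, median

LIMIT = 140  # the character limit at the time the tweets were collected


def length_summary(tweets, limit=LIMIT):
    """Mean, median, and most common length, plus the share of tweets at the limit."""
    lengths = [len(t) for t in tweets]
    histogram = Counter(lengths)
    return {
        "mean": mean(lengths),
        "median": median(lengths),
        "mode": histogram.most_common(1)[0][0],
        "at_limit": histogram[limit] / len(lengths),
    }


# Four made-up "tweets" of lengths 35, 35, 62, and 140.
sample = ["a" * 35, "b" * 35, "c" * 62, "d" * 140]
summary = length_summary(sample)
print(summary)
```

The `histogram` counter is the same object one would plot to see the spike near the limit.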

I’m curious whether the spike at 140 characters has more to do with where the character limit is set or the very fact that there is one at all. Since the initial publication of this post, Twitter has doubled the number of characters it allows in a tweet. How has this affected the distribution of tweet lengths?

Raw data from Chang, Caverlee, and Lee.

Colonialism and the Four-Colour Theorem

The Four-Colour Theorem about planar graphs implies that you only need four colours to properly draw a map, making sure that neighbouring countries are coloured differently. But this application comes with a caveat: the theorem is only guaranteed to work if every country is a single connected region. This is essentially true of modern countries, so current political maps of the Earth only need four colours. But it wasn’t always this way; things were a lot more complicated back in the age of empires.

If we look back to the late 19th century, most of the planet was ruled by only a few countries, each having many satellite colonies scattered across the globe. Each colony was a new opportunity for empires to violate the hypotheses of the Four-Colour Theorem (as well as the basic principles of human decency). In sub-Saharan Africa, for example, the British, French, German, Portuguese, and Belgian empires all bordered one another, a configuration requiring five colours on a map.
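To see why five mutually bordering empires force a fifth colour, we can treat each empire as a node and each shared border as an edge, then search for the smallest number of colours that works. This is a brute-force sketch for tiny graphs only; the function name is my own, and the edge list simply takes the claim above (every pair of the five empires shared a border somewhere) at face value.

```python
from itertools import combinations, product


def min_colours(nodes, edges):
    """Smallest number of colours so that no edge joins two same-coloured nodes.

    Tries every assignment, so it is only practical for a handful of nodes.
    """
    nodes = list(nodes)
    for k in range(1, len(nodes) + 1):
        for assignment in product(range(k), repeat=len(nodes)):
            colour = dict(zip(nodes, assignment))
            if all(colour[a] != colour[b] for a, b in edges):
                return k
    return 0  # no nodes, so no colours needed


empires = ["British", "French", "German", "Portuguese", "Belgian"]

# Assumption from the text: every pair of empires shared a border.
borders = list(combinations(empires, 2))

print(min_colours(empires, borders))  # 5
```

Dropping any one empire from the list brings the answer back down to four, which is the most a map of connected countries can ever need.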

Can you find a historical map that requires more than five colours?