Tim Black's Blog

Volcanic islands? What do you expect?

Posted October 21, 2018

This week, FiveThirtyEight's Riddler Classic is a puzzle that visits a volcanic archipelago. It came from Ricky Jacobson and Ben Holtz. Here's the puzzle:

You live on the volcanic archipelago of Riddleria. The Riddlerian Islands are a 30-minute boat ride off the shores of the nearby mainland. Your archipelago is connected via a network of bridges, forming one unified community. In an effort to conserve resources, the ancient Riddlerians who built this network opted not to build bridges between any two islands that were already connected to the community otherwise. Hence, there is exactly one path from any one island to any other island.

One day, you feel the ground start to rumble — the islands' volcanoes are stirring. You're not sure whether any volcano is going to blow, but you and the rest of the Riddlerians flee the archipelago in rowboats bound for the mainland just to be safe. But as you leave, you look back and wonder what will become of your home.

Each island contains exactly one volcano. You know that if a volcano erupts, the subterranean pressure change will be so great that the volcano will collapse in on itself, causing its island — and any connected bridges — to crumble into the ocean. Remarkably, other islands will be spared unless their own volcanoes erupt. But if enough bridges go down, your once-unified archipelagic community could split into several smaller, disjointed communities.

If there were N islands in the archipelago originally and each volcano erupts independently with probability p, how many disjointed communities can you expect to find when you return? What value of p maximizes this number?

This is a tidy little problem that can be thought of from many different angles. Here, I have four different, bite-size solutions to the first question (I've saved my favorite for last). And, we'll address the second question, too.

All four solutions rely on the amazing fact that expected value is linear, even when the events involved may be dependent.

Solution 1: Bridges then islands

Let's imagine that the destruction happens in a somewhat funny order - first the bridges crumble into the sea, and then the islands sink. It's okay to think about it this way, as long as we get to the same end result.

We start out with one community. Each bridge that collapses splits one community into two pieces, thus increasing the total number of communities by one. An island only crumbles into the sea after all the bridges out of it are gone, so when that island goes away, so do the last vestiges of that community; the number of communities goes down by one. So, the final number of communities is \[ 1 + (\text{# of collapsed bridges}) - (\text{# of sunken islands}). \]

There are \(N - 1\) bridges, each of which collapses if either terminus island collapses (or both do), which happens with probability \(2p - p^2\). There are \(N\) islands, each of which sinks with probability \(p\). So, the expected number of disjointed communities is \[ \DeclareMathOperator*{\EE}{\mathbb{E}} \DeclareMathOperator*{\Pr}{Pr} \begin{align*} \EE[\text{# communities}] &= 1 + \EE[\text{# of collapsed bridges}] - \EE[\text{# of sunken islands}] \\ &= 1 + (N - 1)(2p - p^2) - N p \\ &= 1 - p + (N - 1) p (1 - p). \end{align*} \]

Solution 2: Induction

We'll prove that the expected number of disjointed communities is \(1 - p + (N - 1) p (1 - p)\) by induction on \(N\), the number of islands.

For the base case, when \(N = 1\), there is a community if and only if the one volcano does not erupt, so the expected number of communities is \(1 - p\).

For the inductive step, suppose there are \(N\) islands, and that we have established the formula for \(N - 1\) islands. There is at least one "leaf island," that is, an island that only has one bridge coming off of it. What would happen if that island didn't exist? Then, by the induction hypothesis, the expected number of disjointed communities would be \(1 - p + (N - 2) p (1 - p)\).

When we remember that this leaf island does exist, two things could happen. For one, the addition of this island might not change the number of communities, either because the island sinks into the sea anyway, or because it survives, but is still connected to a community we already counted. Or, we could get an extra community, if the island survives, but the bridge out of it crumbles (because the next volcano over went off). This happens with probability \(p (1 - p)\).

So, the expected number of disjointed communities is, \[ \begin{align*} \EE[\text{# communities}] &= \EE[\text{# of communities without leaf island}] + \Pr[\text{leaf island is its own community}] \\ &= (1 - p + (N - 2) p (1 - p)) + p (1 - p) \\ &= 1 - p + (N - 1) p (1 - p). \end{align*} \]

Solution 3: Functional equation

For \(G\) an arrangement of \(n\) islands with bridges (a graph with \(n\) vertices), let \(f_G(n)\) be the expected number of disjointed communities at the end. If we pick some bridge in the archipelago, this bridge divides the community into two parts, graphs \(G_1\) with \(k\) vertices, and \(G_2\) with \(\ell\) vertices, \(k + \ell = n\).

The number of disjointed communities at the end is equal to the number that form out of \(G_1\) plus the number that form out of \(G_2\), except that this double-counts one community if the bridge connecting the two parts survives. That bridge survives with probability \((1 - p)^2\). So, \[ f_G(k + \ell) = f_{G_1}(k) + f_{G_2}(\ell) - (1 - p)^2. \]

With this formula, we can prove by induction that actually \(f_G(n)\) is determined solely by \(n\) without needing to know \(G\). So we can simply write \(f(n)\) instead of \(f_G(n)\). So, for all positive integers \(k, \ell\), \[ f(k + \ell) = f(k) + f(\ell) - (1 - p)^2. \] This funtional equation would be easy to solve if it weren't for the \(- (1 - p)^2\) term, so we use a trick: let \(g(n) = f(n) - (1 - p)^2\). Then, \[ g(k + \ell) = g(k) + g(\ell). \] The only solution to this is \(g(n) = cn\) for some constant \(c\). We calculate \[ c = g(1) = f(1) - (1 - p)^2 = (1 - p) - (1 - p)^2 = p (1 - p). \]

So, the expected number of communities in the end is \[ \EE[\text{# communities}] = f(N) = g(N) + (1 - p)^2 = N p (1 - p) + (1 - p)^2 = 1 - p + (N - 1) p (1 - p). \]

Solution 4: Rooting the tree

The Riddlerian islands decide to designate a capital. All the bridges are marked with an arrow pointing the way to the capital (since there is a unique path from every island to every other, there's no ambiguity on how to mark each bridge. Exactly one bridge points away from each island, except for the capital island.

When the volcanic activity settles down, the Riddlerians who can return to their islands. After they've settled back in to their own islands and assessed the damage, they all walk toward the capital, following the arrows across bridges. Many Riddlerians get stuck and don't make it all the way to the capital because a bridge is out. They declare the island where they stop to be their new capital.

In this way, each of the new communities gets its own capital. There is only one capital per community - since there was originally exactly one path from each island to the capital, they all get stuck in front of the same bridge. So, to count the number of communities, we just need to count the number of capitals.

The original capital remains a capital if it survives, which happens with probability \(1 - p\). Each other island becomes a capital if it survives (probability \(1 - p\)), but the next island in the path back to the capital (the island across the outward-pointing bridge) does not (independently, probability \(p\)). Thus, a non-capital becomes a capital with probability \(p (1 - p)\).

So, the expected number of communities is the expected number of capitals, which is \[ \EE[\text{# communities}] = 1 - p + (N - 1) p (1 - p). \]

Finding p to maximize the number of communities

The expected number of disjointed communities is \(1 - p + (N - 1) p (1 - p)\). If it weren't for the \(1 - p\) terms in front, this would be maximized at \(p = 1/2\), but that extra \(1 - p\) term gives preference to smaller values of \(p\), so we expect the actual optimal \(p\) to be slightly less then one half. We could maximize the expression over \(p\) using calculus, but this is quadratic in \(p\), so that's overkill. We could complete the square, or use the AM-GM inequality. But let's use what we know about parabolas. We compute, \[ \EE[\text{# communities}] = 1 - p + (N - 1) p (1 - p) = (1 + (N - 1) p)(1 - p) \] which is a quadratic with roots at \(p = 1, -\frac{1}{N - 1}\). Its graph is a downward-facing parabola, so it is maximized at the vertex of the parabola, whose \(p\)-coordinate is the average of the roots. That is, maximum occurs at \(p = \frac{1}{2} - \frac{1}{2 (N - 1)}\), which is just less than \(1/2\), as we anticipated. For this \(p\), the expected number of communities is \[ \frac{N}{4}\left(\frac{1}{2} + \frac{1}{2 (N - 1)}\right) = \frac{N}{4} + \frac{N}{4 (N - 1)} = \frac{N}{4} + \frac{1}{4} + \frac{1}{4 (N - 1)}. \]

last updated 10.21.2018