In [345]:

%load_ext autoreload
%autoreload 2

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload

Riddler Express¶

The dollar is known to be more valuable than the rupee. Now this station says they will give you D dollars for each rupee, where D is a decimal less than 1 that goes to the hundredths place. So D can be 0.99, 0.50 or 0.37, but not values like 0.117 or 1/𝜋. And when exchanging dollars back into rupees, the station uses an exchange rate of R, where R is equal to 1/D rounded to the nearest hundredth. (Yes, that last part is very important.)

For example, suppose D is 0.53. In this case, when you trade in 100 rupees, you’ll receive $53. When trading the $53dollars back, the station uses an exchange rate of 1/0.53, or 1.88679…, which they round up to 1.89. And so returning the $53 gets you 100.17 rupees — a net profit!

What value of D will earn you the greatest profit for your 100 rupees? (Remember, D is a decimal that goes to the hundredths place and is less than 1.)

In [10]:

import numpy as np
import random

In [11]:

possible_D_values = [i*0.01 for i in range(1, 100)]

In [254]:

def arbitrage(d):
    return round(1 / d, 2) * d

In [255]:

assert arbitrage(0.53) == 1.0017, "{} != 100.17".format(arbitrage(0.53))

In [256]:

def find_argmax(f, values):
    max = 1.0
    argmaxes = [(0.53, 1.0017)]
    for v in values:
        curr = f(v)
        if curr > max:
            argmaxes = [(v, curr)]
        elif curr == max:
            argmaxes.append((v, curr))
        else:
            continue
    return argmaxes

In [257]:

find_argmax(arbitrage, possible_D_values)

Out[257]:

[(0.93, 1.0044000000000002)]

In [258]:

1/0.93

Out[258]:

1.075268817204301

This was pretty simple, but how could we analyze this instead of brute forcing it?

In [259]:

returns = [arbitrage(v) for v in possible_D_values]

In [260]:

from IPython.core.pylabtools import figsize
import matplotlib.pyplot as plt

In [261]:

figsize = (8, 4)
fig, ax = plt.subplots(figsize=(12, 4))

plt.style.use('_mpl-gallery')

ax.plot(possible_D_values, [r - 1 for r in returns], drawstyle="steps-mid")

Out[261]:

[<matplotlib.lines.Line2D at 0x311b4d3d0>]

This plot doesn't really show any obvious pattern, other than that the volatility increases as the cents go up. Let's see what elegant solutions the Riddler community come up with. Brute force is a nice way to build some intuition, although it can also breed some laziness where we got stuck on one formulation of the problem. As this is a puzzle I assume there exists a much more reasonable

Classic¶

There’s a parking lot behind Dave’s office building with 10 spaces that are available on a first-come, first-serve basis. Those 10 spaces invariably fill by 8 a.m., and the parking lot quickly empties out at 5 p.m. sharp.
Every day, three of the 10 “early birds” who snagged spots before 8 a.m. leave at random times between 10 a.m. and 3 p.m. and do not return that day. Knowing that some early birds leave during that five-hour window, nine “stragglers” drive by the lot at random times between 10 a.m. and 3 p.m. If there’s an available spot, a straggler immediately parks in the spot and doesn’t leave until 5 p.m. If there’s no open spot, a straggler immediately drives away from the lot and parks somewhere else, and doesn’t return that day.

Suppose you are a straggler arriving at a random time between 10 a.m. and 3 p.m. What is the probability that you will get a spot in the lot?

We can phrase this as having:

$X_i$ ~ Uniform(0, 5) $i \in [3]$
$Y_j$ ~ Uniform(0, 5) $j \in [9]$

Where $X$ models early birds that leave at random times and $Y$ the stragglers that pop by once. If you are one of the stragglers, what is the chance of getting a free spot? The gut feeling is that there is a $\frac{1}{3}$ chance, but intuition normally fails. For example, we need to account for the fact that the early birds all leave after all the stragglers have arrived and other permutations.

Let $A$ be the event that we arrive first after a early bird has left, thereby getting a spot.

Simplify¶

Let's start with a simpler version and build some intuition. Let's say there are only one early-bird that leaves and one straggler. Then the probability becomes:

$$P(A) = P(A | X_1 \leq Y_1) + P(A | X_1 \gt Y_1)$$

which leaves us with $P(A) = P(X_1 \leq Y_1)$ since the second term on the right hand side above is 0, i.e., if the straggler has already been there when the early-bird leaves there is 0 probability. Likewise, the first term is 1, since there are no other stragglers. This reduces to computing $P(A | X_1 \leq Y_1)$, which reduces to computing whether $P(Z > 0)$ where $Z := Y_1 - X_1$.

Digression: how do we compute the distribution of the sum of two uniform random variables?¶

I realized I'm very rusty so let's find out how to compute the distribution of $Z$. I'm sure it will come in handy.

In [63]:

from scipy.stats import uniform

In [76]:

X = uniform.rvs(1, size=10000)
Y = uniform.rvs(1, size=10000)
X_minus_Y = [x - y for x, y in zip(X, Y)]

In [79]:

import seaborn as sns
from IPython.core.pylabtools import figsize

In [81]:

figsize(12, 4)
sns.histplot(X_minus_Y)

Out[81]:

<Axes: ylabel='Count'>

What happens when we take the sum of uniforms?

In [137]:

def generate_sum_of_uniforms(k: int, n=10000):
    samples = [uniform.rvs(size=n) for i in range(k)]
    return sum(samples)

In [138]:

Ks = [2, 3, 5, 10, 25, 50, 100, 250, 500, 1000]
figsize(12, 8)
for i, k in enumerate(Ks):
    sx = plt.subplot(len(Ks) // 2, 2, i + 1)
    samples = generate_sum_of_uniforms(k)
    sns.histplot(samples, ax=sx, label="Sum of %d uniform RVs" % k)
    leg = plt.legend()
    plt.autoscale(tight=True)

This is the central limit theorem in practice.

Back to the simple case of one early bird and one straggler¶

We need to compute $P(A_1)$, which reduces to $P(X_1 \leq Y_1) = P(Z_{11} \leq 0)$. This can be solved analytically:

$$\int_0^1 \int_0^y 1 dx dy = \int_0^1 y dy = \frac{1}{2}$$

This also seems reasonable. For two i.i.d uniform random variables the one should be larger than the other 50% of the time. This also fits with the simulation.

However, does this help us to solve the general case of $m=3$ and $n=9$? Let's continue with our example and build intuiton. What if there are two stragglers or two early birds?

m=1, n=2: One early bird and two stragglers¶

We are still only interested in $P(A_1)$, i.e., the probability that we, straggler number 1, gets a spot. This is now $P(A_1 = 1) = P(X_1 \leq Y_1, Y_1 \leq Y_2) + P(X_1 \leq Y_1, X_2 \leq Y_2)$. In other words, we get the spot if we arrive after the early-bird has left and we arrive before the other straggler or the other straggler arrived before the early bird left. This is already getting kind of ugly, but let's stick with it.

However, one assumption can help us simplify some things. All the random variables are independent, so we only need to deal pairs. And we know that $P(Z_{ij} \leq 0) = P(Z_{ij} \geq 0) = \frac{1}{2}$.

In [249]:

simulate(m=4, n=12, num_scenarios=10000)

Out[249]:

defaultdict(int,
            {'Straggler 7': 3002,
             'Straggler 9': 2940,
             'Straggler 3': 3055,
             'Straggler 10': 3057,
             'Straggler 2': 3055,
             'Straggler 0': 3048,
             'Straggler 11': 2997,
             'Straggler 6': 2996,
             'Straggler 4': 2939,
             'Straggler 1': 3039,
             'Straggler 5': 3031,
             'Straggler 8': 3002})

Simulating the general case¶

In [161]:

t = uniform.rvs(size=(3, 5))

In [162]:

for x in t:
    print(x)

[0.68044007 0.02426911 0.29422743 0.83310429 0.41094505]
[0.24831401 0.4157033  0.03514218 0.24710832 0.77514717]
[0.25337364 0.11128675 0.84635987 0.44159973 0.66717735]

In [163]:

from collections import defaultdict

In [203]:

import pdb

In [224]:

def simulate(m: int, n: int, num_scenarios=1000):
    """Let's simulate one day.

    We generate m + n samples from the same uniform distribution and then
    we say that the first m are the early bird departures while the last n are
    the straggler arrivals. We can then see who gets spots. The probability that
    a particular straggler gets a spot can be counted by simulating many scenarios"""
    scenarios = uniform.rvs(size=(num_scenarios, m + n))

    results = defaultdict(int)
    for scenario in scenarios:
        early_birds = [(departure, "Early Bird {}".format(i)) for i, departure in enumerate(scenario[:m])]
        stragglers = [(arrival, "Straggler {}".format(i)) for i, arrival in enumerate(scenario[m:])]
        events = sorted(early_birds + stragglers)

        # Now we need to run through the scenarios
        scenario_results = run_scenario(events)
        for straggler in scenario_results:
            try:
                # pdb.set_trace()
                # print(straggler)
                results[straggler] += 1
            except TypeError as e:
                raise TypeError(l, e)

        # print(results)

    return results

In [228]:

def run_scenario(events):
    free_spots = 0
    who_got_spots = []
    for t, actor in events:
        if actor.startswith("Straggler"):
            if free_spots > 0:
                # print("{} got a spot".format(actor))
                who_got_spots.append(actor)
                free_spots -= 1
        else:
            free_spots += 1
            # print("Early bird departed")
    assert free_spots >= 0
    return who_got_spots

In [229]:

%pdb on

Automatic pdb calling has been turned ON

In [235]:

results = simulate(3, 9, 10000)

In [236]:

results

Out[236]:

defaultdict(int,
            {'Straggler 1': 3015,
             'Straggler 3': 3015,
             'Straggler 5': 2917,
             'Straggler 2': 2858,
             'Straggler 6': 2931,
             'Straggler 4': 2898,
             'Straggler 7': 2990,
             'Straggler 0': 2876,
             'Straggler 8': 2939})

In [238]:

for k, v in results.items():
    print(k, v / 10000)

Straggler 1 0.3015
Straggler 3 0.3015
Straggler 5 0.2917
Straggler 2 0.2858
Straggler 6 0.2931
Straggler 4 0.2898
Straggler 7 0.299
Straggler 0 0.2876
Straggler 8 0.2939

Some more observations: if I arrive at time $t$, there is $(1-t)^3$ chance that all early-birds depart after me. Other than that there are

Aha! If I arrive at time $t$ and my nearest neighbors are $t_{before}$ and $t_{after}$

Solving via permutations and the stars and bars model¶

Let's say | are early-birds and the numbers are stragglers. Then 43|5|698|127 would represent one possible unfolding of the day. The question can then be reformulated as the question of how often a bar is in front of a number. The bars are equally likely to end up anywhere, also at the end and every permutation is equally likely.

Assume we are straggler number 1. There are 9 possible positions in the sequence of stragglers, all are equally likely (this we should prove, but let's assume it for now). Then three bars are placed somewhere between the permutation. What is the probability that one is just in front of me, as straggler 1?

In [264]:

import itertools

In [273]:

import time
start_time = time.time()
for perm in itertools.permutations(range(12)):
    x = 2
print("It took {} seconds".format(time.time() - start_time))

It took 36.29958391189575 seconds

In [305]:

def count_scenarios(m: int, n: int):
    """We count the number of scenarios where we obtain a spot"""
    early_birds = ["e{}".format(i) for i in range(m)]
    stragglers = ["s{}".format(j) for j in range(n)]
    # Numbers 0..m-1 are early-birds, numbers m..n-1 are stragglers
    # We are participant m
    participants = [i for i in range(m + n)]
    count = 0
    start_time = time.time()
    for perm in itertools.permutations(participants):
        index = perm.index(m)
        free_spots = 0
        for p in perm[:index]:
            if p < m:
                free_spots += 1
            elif free_spots > 0:
                free_spots -= 1
            else:
                continue

        
        if free_spots > 0:
            count += 1

    # print("{} seconds elapsed".format(time.time() - start_time))
    return count  

In [298]:

count = count_scenarios(3, 9)

174.27846384048462 seconds elapsed

In [300]:

count / factorial(3 + 9)

Out[300]:

0.29343434343434344

This in the end is exact simulation (under the presumption that any scenario is equally likely, which we should prove or at least argue for, even if it seems obvious. If we trusted anything obvious we wouldn't need mathematics and probability wouldn't be as interesting as it is..).

This is kind of inefficient. We can also think of all the possible positions. I.e., when we are the first to arrive there won't be any spots. If we are the second, there is a 3/8 chance that we get a spot. If we are the third, then we get a spot of either the one just before us is an early bird leaving and if both were early bird leaving and so on. The law of total probability might be helpful as we go backwards in our positions.

In [318]:

count_scenarios(2, 4)

Out[318]:

In [320]:

factorial(6)

Out[320]:

In [323]:

276/12

Out[323]:

23.0

In [325]:

23/60

Out[325]:

0.38333333333333336

In [332]:

from sympy import *
x, y, z = symbols('x y z')
init_printing(use_unicode=True)

In [338]:

Rational(14, 30)

Out[338]:

$\displaystyle \frac{7}{15}$

In [328]:

x / y

Out[328]:

0.4

In [343]:

for i in range(1, 4):
    print("\n")
    for j in range(1, 9):
        print("({}, {}): {}".format(i, j, Rational(count_scenarios(i, j), factorial(i + j))))


(1, 1): 1/2
(1, 2): 1/3
(1, 3): 1/4
(1, 4): 1/5
(1, 5): 1/6
(1, 6): 1/7
(1, 7): 1/8
(1, 8): 1/9


(2, 1): 2/3
(2, 2): 7/12
(2, 3): 7/15
(2, 4): 23/60
(2, 5): 34/105
(2, 6): 47/168
(2, 7): 31/126
(2, 8): 79/360


(3, 1): 3/4
(3, 2): 7/10
(3, 3): 19/30
(3, 4): 19/35
(3, 5): 131/280
(3, 6): 103/252
(3, 7): 38/105
(3, 8): 107/330

Dynamic Programming¶

Can we compute $P(m, n)$ as a function of $P(m-1, n)$ and $P(m, n-1)$?

In [301]:

def count_positive_outcomes(m, n, position: int):
    total_participants = m + n
    if position == 1:
        return 0
    elif position == 2:
        return m/(m + n - 1)
    elif position == 3:
        pass
        # etc.
    return

  Cell In[301], line 2
    def
        ^
SyntaxError: invalid syntax

In [ ]:

Generalize¶

One can think of any pair of early bird departures and straggler arrivals as coin flips. For each pair there is a 50/50 chance that one is earlier than the other. For a given $m$ and $n$ we have $m + n$ time points. However, do we lose some information by only considering permutations?

In [ ]: