%load_ext autoreload
%autoreload 2
The autoreload extension is already loaded. To reload it, use: %reload_ext autoreload
The dollar is known to be more valuable than the rupee. Now this station says they will give you D dollars for each rupee, where D is a decimal less than 1 that goes to the hundredths place. So D can be 0.99, 0.50 or 0.37, but not values like 0.117 or 1/𝜋. And when exchanging dollars back into rupees, the station uses an exchange rate of R, where R is equal to 1/D rounded to the nearest hundredth. (Yes, that last part is very important.)
For example, suppose D is 0.53. In this case, when you trade in 100 rupees, you’ll receive $53. When trading the $53dollars back, the station uses an exchange rate of 1/0.53, or 1.88679…, which they round up to 1.89. And so returning the $53 gets you 100.17 rupees — a net profit!
What value of D will earn you the greatest profit for your 100 rupees? (Remember, D is a decimal that goes to the hundredths place and is less than 1.)
import numpy as np
import random
possible_D_values = [i*0.01 for i in range(1, 100)]
def arbitrage(d):
return round(1 / d, 2) * d
assert arbitrage(0.53) == 1.0017, "{} != 100.17".format(arbitrage(0.53))
def find_argmax(f, values):
max = 1.0
argmaxes = [(0.53, 1.0017)]
for v in values:
curr = f(v)
if curr > max:
argmaxes = [(v, curr)]
elif curr == max:
argmaxes.append((v, curr))
else:
continue
return argmaxes
find_argmax(arbitrage, possible_D_values)
[(0.93, 1.0044000000000002)]
1/0.93
1.075268817204301
This was pretty simple, but how could we analyze this instead of brute forcing it?
returns = [arbitrage(v) for v in possible_D_values]
from IPython.core.pylabtools import figsize
import matplotlib.pyplot as plt
figsize = (8, 4)
fig, ax = plt.subplots(figsize=(12, 4))
plt.style.use('_mpl-gallery')
ax.plot(possible_D_values, [r - 1 for r in returns], drawstyle="steps-mid")
[<matplotlib.lines.Line2D at 0x311b4d3d0>]
This plot doesn't really show any obvious pattern, other than that the volatility increases as the cents go up. Let's see what elegant solutions the Riddler community come up with. Brute force is a nice way to build some intuition, although it can also breed some laziness where we got stuck on one formulation of the problem. As this is a puzzle I assume there exists a much more reasonable
Every day, three of the 10 “early birds” who snagged spots before 8 a.m. leave at random times between 10 a.m. and 3 p.m. and do not return that day. Knowing that some early birds leave during that five-hour window, nine “stragglers” drive by the lot at random times between 10 a.m. and 3 p.m. If there’s an available spot, a straggler immediately parks in the spot and doesn’t leave until 5 p.m. If there’s no open spot, a straggler immediately drives away from the lot and parks somewhere else, and doesn’t return that day.
Suppose you are a straggler arriving at a random time between 10 a.m. and 3 p.m. What is the probability that you will get a spot in the lot?
We can phrase this as having:
Where $X$ models early birds that leave at random times and $Y$ the stragglers that pop by once. If you are one of the stragglers, what is the chance of getting a free spot? The gut feeling is that there is a $\frac{1}{3}$ chance, but intuition normally fails. For example, we need to account for the fact that the early birds all leave after all the stragglers have arrived and other permutations.
Let $A$ be the event that we arrive first after a early bird has left, thereby getting a spot.
Let's start with a simpler version and build some intuition. Let's say there are only one early-bird that leaves and one straggler. Then the probability becomes:
$$P(A) = P(A | X_1 \leq Y_1) + P(A | X_1 \gt Y_1)$$which leaves us with $P(A) = P(X_1 \leq Y_1)$ since the second term on the right hand side above is 0, i.e., if the straggler has already been there when the early-bird leaves there is 0 probability. Likewise, the first term is 1, since there are no other stragglers. This reduces to computing $P(A | X_1 \leq Y_1)$, which reduces to computing whether $P(Z > 0)$ where $Z := Y_1 - X_1$.
I realized I'm very rusty so let's find out how to compute the distribution of $Z$. I'm sure it will come in handy.
from scipy.stats import uniform
X = uniform.rvs(1, size=10000)
Y = uniform.rvs(1, size=10000)
X_minus_Y = [x - y for x, y in zip(X, Y)]
import seaborn as sns
from IPython.core.pylabtools import figsize
figsize(12, 4)
sns.histplot(X_minus_Y)
<Axes: ylabel='Count'>
What happens when we take the sum of uniforms?
def generate_sum_of_uniforms(k: int, n=10000):
samples = [uniform.rvs(size=n) for i in range(k)]
return sum(samples)
Ks = [2, 3, 5, 10, 25, 50, 100, 250, 500, 1000]
figsize(12, 8)
for i, k in enumerate(Ks):
sx = plt.subplot(len(Ks) // 2, 2, i + 1)
samples = generate_sum_of_uniforms(k)
sns.histplot(samples, ax=sx, label="Sum of %d uniform RVs" % k)
leg = plt.legend()
plt.autoscale(tight=True)
This is the central limit theorem in practice.
We need to compute $P(A_1)$, which reduces to $P(X_1 \leq Y_1) = P(Z_{11} \leq 0)$. This can be solved analytically:
$$\int_0^1 \int_0^y 1 dx dy = \int_0^1 y dy = \frac{1}{2}$$This also seems reasonable. For two i.i.d uniform random variables the one should be larger than the other 50% of the time. This also fits with the simulation.
However, does this help us to solve the general case of $m=3$ and $n=9$? Let's continue with our example and build intuiton. What if there are two stragglers or two early birds?
We are still only interested in $P(A_1)$, i.e., the probability that we, straggler number 1, gets a spot. This is now $P(A_1 = 1) = P(X_1 \leq Y_1, Y_1 \leq Y_2) + P(X_1 \leq Y_1, X_2 \leq Y_2)$. In other words, we get the spot if we arrive after the early-bird has left and we arrive before the other straggler or the other straggler arrived before the early bird left. This is already getting kind of ugly, but let's stick with it.
However, one assumption can help us simplify some things. All the random variables are independent, so we only need to deal pairs. And we know that $P(Z_{ij} \leq 0) = P(Z_{ij} \geq 0) = \frac{1}{2}$.
simulate(m=4, n=12, num_scenarios=10000)
defaultdict(int, {'Straggler 7': 3002, 'Straggler 9': 2940, 'Straggler 3': 3055, 'Straggler 10': 3057, 'Straggler 2': 3055, 'Straggler 0': 3048, 'Straggler 11': 2997, 'Straggler 6': 2996, 'Straggler 4': 2939, 'Straggler 1': 3039, 'Straggler 5': 3031, 'Straggler 8': 3002})
t = uniform.rvs(size=(3, 5))
for x in t:
print(x)
[0.68044007 0.02426911 0.29422743 0.83310429 0.41094505] [0.24831401 0.4157033 0.03514218 0.24710832 0.77514717] [0.25337364 0.11128675 0.84635987 0.44159973 0.66717735]
from collections import defaultdict
import pdb
def simulate(m: int, n: int, num_scenarios=1000):
"""Let's simulate one day.
We generate m + n samples from the same uniform distribution and then
we say that the first m are the early bird departures while the last n are
the straggler arrivals. We can then see who gets spots. The probability that
a particular straggler gets a spot can be counted by simulating many scenarios"""
scenarios = uniform.rvs(size=(num_scenarios, m + n))
results = defaultdict(int)
for scenario in scenarios:
early_birds = [(departure, "Early Bird {}".format(i)) for i, departure in enumerate(scenario[:m])]
stragglers = [(arrival, "Straggler {}".format(i)) for i, arrival in enumerate(scenario[m:])]
events = sorted(early_birds + stragglers)
# Now we need to run through the scenarios
scenario_results = run_scenario(events)
for straggler in scenario_results:
try:
# pdb.set_trace()
# print(straggler)
results[straggler] += 1
except TypeError as e:
raise TypeError(l, e)
# print(results)
return results
def run_scenario(events):
free_spots = 0
who_got_spots = []
for t, actor in events:
if actor.startswith("Straggler"):
if free_spots > 0:
# print("{} got a spot".format(actor))
who_got_spots.append(actor)
free_spots -= 1
else:
free_spots += 1
# print("Early bird departed")
assert free_spots >= 0
return who_got_spots
%pdb on
Automatic pdb calling has been turned ON
results = simulate(3, 9, 10000)
results
defaultdict(int, {'Straggler 1': 3015, 'Straggler 3': 3015, 'Straggler 5': 2917, 'Straggler 2': 2858, 'Straggler 6': 2931, 'Straggler 4': 2898, 'Straggler 7': 2990, 'Straggler 0': 2876, 'Straggler 8': 2939})
for k, v in results.items():
print(k, v / 10000)
Straggler 1 0.3015 Straggler 3 0.3015 Straggler 5 0.2917 Straggler 2 0.2858 Straggler 6 0.2931 Straggler 4 0.2898 Straggler 7 0.299 Straggler 0 0.2876 Straggler 8 0.2939
Some more observations: if I arrive at time $t$, there is $(1-t)^3$ chance that all early-birds depart after me. Other than that there are
Aha! If I arrive at time $t$ and my nearest neighbors are $t_{before}$ and $t_{after}$
Let's say | are early-birds and the numbers are stragglers. Then 43|5|698|127 would represent one possible unfolding of the day. The question can then be reformulated as the question of how often a bar is in front of a number. The bars are equally likely to end up anywhere, also at the end and every permutation is equally likely.
Assume we are straggler number 1. There are 9 possible positions in the sequence of stragglers, all are equally likely (this we should prove, but let's assume it for now). Then three bars are placed somewhere between the permutation. What is the probability that one is just in front of me, as straggler 1?
import itertools
import time
start_time = time.time()
for perm in itertools.permutations(range(12)):
x = 2
print("It took {} seconds".format(time.time() - start_time))
It took 36.29958391189575 seconds
def count_scenarios(m: int, n: int):
"""We count the number of scenarios where we obtain a spot"""
early_birds = ["e{}".format(i) for i in range(m)]
stragglers = ["s{}".format(j) for j in range(n)]
# Numbers 0..m-1 are early-birds, numbers m..n-1 are stragglers
# We are participant m
participants = [i for i in range(m + n)]
count = 0
start_time = time.time()
for perm in itertools.permutations(participants):
index = perm.index(m)
free_spots = 0
for p in perm[:index]:
if p < m:
free_spots += 1
elif free_spots > 0:
free_spots -= 1
else:
continue
if free_spots > 0:
count += 1
# print("{} seconds elapsed".format(time.time() - start_time))
return count
count = count_scenarios(3, 9)
174.27846384048462 seconds elapsed
count / factorial(3 + 9)
0.29343434343434344
This in the end is exact simulation (under the presumption that any scenario is equally likely, which we should prove or at least argue for, even if it seems obvious. If we trusted anything obvious we wouldn't need mathematics and probability wouldn't be as interesting as it is..).
This is kind of inefficient. We can also think of all the possible positions. I.e., when we are the first to arrive there won't be any spots. If we are the second, there is a 3/8 chance that we get a spot. If we are the third, then we get a spot of either the one just before us is an early bird leaving and if both were early bird leaving and so on. The law of total probability might be helpful as we go backwards in our positions.
count_scenarios(2, 4)
276
factorial(6)
720
276/12
23.0
23/60
0.38333333333333336
from sympy import *
x, y, z = symbols('x y z')
init_printing(use_unicode=True)
Rational(14, 30)
x / y
0.4
for i in range(1, 4):
print("\n")
for j in range(1, 9):
print("({}, {}): {}".format(i, j, Rational(count_scenarios(i, j), factorial(i + j))))
(1, 1): 1/2 (1, 2): 1/3 (1, 3): 1/4 (1, 4): 1/5 (1, 5): 1/6 (1, 6): 1/7 (1, 7): 1/8 (1, 8): 1/9 (2, 1): 2/3 (2, 2): 7/12 (2, 3): 7/15 (2, 4): 23/60 (2, 5): 34/105 (2, 6): 47/168 (2, 7): 31/126 (2, 8): 79/360 (3, 1): 3/4 (3, 2): 7/10 (3, 3): 19/30 (3, 4): 19/35 (3, 5): 131/280 (3, 6): 103/252 (3, 7): 38/105 (3, 8): 107/330
Can we compute $P(m, n)$ as a function of $P(m-1, n)$ and $P(m, n-1)$?
def count_positive_outcomes(m, n, position: int):
total_participants = m + n
if position == 1:
return 0
elif position == 2:
return m/(m + n - 1)
elif position == 3:
pass
# etc.
return
Cell In[301], line 2 def ^ SyntaxError: invalid syntax
One can think of any pair of early bird departures and straggler arrivals as coin flips. For each pair there is a 50/50 chance that one is earlier than the other. For a given $m$ and $n$ we have $m + n$ time points. However, do we lose some information by only considering permutations?