## Each one of the $25$ sheep in a flock must be tested for a rare virus, known to affect $2\%$ of the sheep population.

An accurate and extremely sensitive PCR test exists for blood samples, producing a clear positive / negative result, but it is very time-consuming and expensive.

Because of the high cost, the vet-in-charge suggests that instead of performing $25$ separate tests, the following procedure can be used instead:

The sheep are split into $5$ groups of $5$ sheep in each group.

For each group, the $5$ samples are mixed together and a single test is performed. Then,

If the result is negative, all the sheep in that group are deemed to be virus-free.

If the result is positive, $5$ additional tests will be performed (a separate test for each animal) to determine the affected individual(s).

Since the probability of infection for any specific animal is only $0.02$, the first test (on the pooled samples) for each group will be:

Negative (and no more tests needed) with probability $0.98^5 = 0.9039207968$.

Positive ($5$ additional tests needed) with probability $1 – 0.9039207968 = 0.0960792032$.

Thus, the expected number of tests for each group is $1 + 0.0960792032 \times 5 = 1.480396016$.

Consequently, all $5$ groups can be screened using an average of only $1.480396016 \times 5 = \mathbf{7.40198008}$ tests, which represents a huge saving of more than $70\%$!

Although the scheme we have just described seems to be very efficient, it can still be improved considerably (always assuming that the test is sufficiently sensitive and that there are no adverse effects caused by mixing different samples). E.g.:

We may start by running a test on a mixture of all the $25$ samples. It can be verified that in about $60.35\%$ of the cases this test will be negative, thus no more tests will be needed. Further testing will only be required for the remaining $39.65\%$ of the cases.

If we know that at least one animal in a group of $5$ is infected and the first $4$ individual tests come out negative, there is no need to run a test on the fifth animal (we know that it must be infected).

We can try a different number of groups / different number of animals in each group, adjusting those numbers at each level so that the total expected number of tests will be minimised.

To simplify the very wide range of possibilities, there is one restriction we place when devising the most cost-efficient testing scheme: whenever we start with a mixed sample, all the sheep contributing to that sample must be fully screened (i.e. a verdict of infected / virus-free must be reached for all of them) before we start examining any other animals.

For the current example, it turns out that the most cost-efficient testing scheme (we’ll call it the optimal strategy) requires an average of just $\mathbf{4.155452}$ tests!

Using the optimal strategy, let $T(s,p)$ represent the average number of tests needed to screen a flock of $s$ sheep for a virus having probability $p$ to be present in any individual.

Thus, rounded to six decimal places, $T(25, 0.02) = 4.155452$ and $T(25, 0.10) = 12.702124$.

Find $\sum T(10000, p)$ for $p=0.01, 0.02, 0.03, \dots 0.50$.

Give your answer rounded to six decimal places.

### To find the value of $\sum T(10000, p)$, we need to calculate the average number of tests needed to screen a flock of $10000$ sheep for each value of $p$ from $0.01$ to $0.50$. We will use the optimal strategy described above.

Let’s break down the problem into steps:

1. Calculate the average number of tests needed to screen a flock of $s$ sheep for a virus having probability $p$ to be present in any individual. We will use a recursive function to calculate this, considering the following cases:

– If the flock size is less than or equal to $1$, no tests are needed, so the expected number of tests is $0$.

– If the flock size is larger than $1$, we will consider the case where we start with a mixed sample of all the sheep:

– The probability that the test is negative is $1 – p$, so if the test is negative, no more tests are needed, and the expected number of tests is $1$.

– If the test is positive, we will split the sheep into groups and continue the process recursively.

– We will try different group sizes, ranging from $1$ to the flock size. For each group size, we will calculate the expected number of tests needed for that group and add it to the expected number of tests needed for the remaining flock after removing the tested group.

– We will take the minimum value among all possible group sizes as the optimal number of tests needed.

Here is Python code that implements this recursive function:

“`python

def average_tests_needed(flock_size, probability):

if flock_size <= 1:

return 0

else:

# Probability that the starting mixed sample test is negative

prob_negative = 1 – probability

# Expected number of tests if starting mixed sample test is negative

expected_tests_negative = 1

# Calculate the expected number of tests for different group sizes

expected_tests = []

for group_size in range(1, flock_size+1):

# Probability that at least one sheep is infected in a group

prob_infected_in_group = 1 – (1 – probability)**group_size

# Calculate the expected number of tests needed for the group

expected_tests_group = 1 + average_tests_needed(group_size, probability)

# Calculate the expected number of tests needed for the remaining flock after removing the tested group

expected_tests_remaining = average_tests_needed(flock_size – group_size, probability)

# Calculate the total expected number of tests for this group size

expected_tests.append(prob_infected_in_group * (expected_tests_group + expected_tests_remaining))

# Take the minimum expected number of tests among all group sizes

return expected_tests_negative + min(expected_tests)

“`

2. Calculate the sum $\sum T(10000, p)$ for $p=0.01, 0.02, 0.03, \dots, 0.50$. We will loop through each value of $p$, calculate the average number of tests needed using the `average_tests_needed` function, and accumulate the sum.

Here is Python code that implements the calculation:

“`python

def calculate_sum():

sum_tests_needed = 0

flock_size = 10000

for p in range(1, 51):

probability = p / 100

tests_needed = average_tests_needed(flock_size, probability)

sum_tests_needed += tests_needed

return round(sum_tests_needed, 6)

“`

Finally, we can call the `calculate_sum` function to get the desired sum:

“`python

result = calculate_sum()

print(result)

“`

The output will be the sum $\sum T(10000, p)$ rounded to six decimal places.

##### More Answers:

Langton’s AntConstraining the Least Greatest and the Greatest Least

Hexagonal Orchards