## Diamond Princess: A Data Science Solution to the COVID-19 Test Kit Shortage

Statistics rarely gets the chance to solve a pressing real-world problem this quickly and with this much impact. Shortly before publishing this post, we spoke with the Japanese Ministry of Health, which indicated that it planned to test everybody on board. Whether or not the ministry goes ahead with that plan, the point of this article remains valid: a random sample (which could have been taken six days ago) would be an efficient way, in terms of both time and scarce test kits, to reduce uncertainty about the total number of positive COVID-19 cases.

**Situation:** There are 3,600 passengers and crew aboard the *Diamond Princess* cruise ship docked off the coast of Yokohama, Japan. A significant percentage of them are infected with coronavirus, but it is unknown how many. Currently, it would be difficult to quickly test all 3,600 passengers and crew for the coronavirus, especially since there is limited availability of test kits, according to Japanese authorities.

**Solution:** To get a much better estimate of total cases, only a fraction of the passengers need to be tested at first. This could lead to important time savings in crucial decisions facing the Japanese Ministry of Health and the *Diamond Princess* cruise ship.

Although there are 3,600 passengers and crew, you only need to test a couple hundred to meaningfully narrow the range on the total number of cases, and the results of those tests could have immediate and important consequences. A crucial tenet of Applied Information Economics (AIE) is that measurements are generally more useful (or have a higher ROI) when they are connected to a decision. The important decisions in this situation will not hinge on a small difference (e.g. whether there are 200 or 220 additional cases). They will hinge on whether there are 50 or 500 additional cases. That level of uncertainty reduction could be achieved quickly with a small random sample of the whole population.

A crucial point is that the people tested must be selected *randomly*. Thus far, samples have been taken from suspected cases (people with symptoms or contact with known cases), so their results cannot be projected onto everyone else on board. A *random* selection of 200, however, would allow us to apply an inverse beta distribution (the inverse of the beta distribution's cumulative distribution function) to the results and produce a relatively tight 90% confidence interval on total cases in the remaining 3,400 people.
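As a rough illustration of that projection, here is a minimal Python sketch. It assumes a uniform Beta(1, 1) prior and approximates the beta quantiles by simulation using only the standard library. The function names and the example result (20 positives out of 200) are hypothetical, not figures from the ship. It also scales the infection rate directly to the remaining 3,400 people and ignores the extra sampling noise in that group, so treat the output as an approximation.

```python
import random


def beta_interval(hits, sample_size, level=0.90, draws=100_000, seed=0):
    """Approximate interval for an infection rate from a random sample.

    Assumes a uniform Beta(1, 1) prior, so after observing `hits` positives
    in `sample_size` tests the rate follows Beta(hits + 1, misses + 1).
    Quantiles are estimated by sorting Monte Carlo draws.
    """
    rng = random.Random(seed)
    alpha = hits + 1
    beta = sample_size - hits + 1
    samples = sorted(rng.betavariate(alpha, beta) for _ in range(draws))
    tail = (1.0 - level) / 2.0
    lower = samples[int(tail * draws)]
    upper = samples[int((1.0 - tail) * draws) - 1]
    return lower, upper


def project_cases(hits, sample_size, remaining, level=0.90):
    """Scale the rate interval to the number of people not yet tested."""
    lower, upper = beta_interval(hits, sample_size, level)
    return round(lower * remaining), round(upper * remaining)


# Hypothetical result: 20 positives among 200 randomly selected people.
low, high = project_cases(hits=20, sample_size=200, remaining=3400)
print(f"90% interval for cases among the remaining 3,400: {low} to {high}")
```

With a spreadsheet or a statistics library, the same bounds could be read directly from the inverse beta function instead of being simulated.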

**Recommendation 1:** Randomly test 70 staff and 130 passengers, and then use a beta distribution to project the total in the remaining population. The reason to break out staff and passengers is that they are likely to have different proportions of infection. Because the staff on the *Diamond Princess* continues to commingle (i.e. eat and berth together), their rate of infection is likely higher.
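One way to combine the two groups is to give each its own beta distribution and add the simulated case counts, as in the Python sketch below. The crew and passenger headcounts and the numbers of positives are placeholder values chosen only so the example runs; they are not figures from this post or from the ship. As before, a uniform Beta(1, 1) prior is assumed and the rate is scaled directly to the untested people in each group.

```python
import random


def simulate_total_cases(strata, draws=50_000, seed=1):
    """Simulate total cases among untested people, one beta per stratum."""
    rng = random.Random(seed)
    totals = []
    for _ in range(draws):
        total = 0.0
        for s in strata:
            # Uniform prior: rate ~ Beta(hits + 1, misses + 1).
            rate = rng.betavariate(s["hits"] + 1, s["tested"] - s["hits"] + 1)
            total += rate * (s["population"] - s["tested"])
        totals.append(total)
    totals.sort()
    return totals


def interval(sorted_values, level=0.90):
    """Read a central interval off an already sorted list of draws."""
    tail = (1.0 - level) / 2.0
    n = len(sorted_values)
    return sorted_values[int(tail * n)], sorted_values[int((1.0 - tail) * n) - 1]


# Placeholder inputs (hypothetical): headcounts and test results per group.
strata = [
    {"name": "staff", "population": 1000, "tested": 70, "hits": 14},
    {"name": "passengers", "population": 2600, "tested": 130, "hits": 10},
]

totals = simulate_total_cases(strata)
low, high = interval(totals)
print(f"90% interval for untested cases, both groups: {low:.0f} to {high:.0f}")
```

Keeping the groups separate means a high rate among staff is not diluted by a lower rate among passengers, or the reverse.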

**Recommendation 2:** Once the random sample yields a new range for total infections, prepare local hospitals for the case load. Among other things, the ministry will know whether enough isolation units are available and, if not, can make other arrangements (e.g. dedicate one hospital to COVID-19 patients).

**Recommendation 3:** If the infection level of the staff is greater than a threshold (15-20%), then other arrangements should be made for serving passengers. Identifying the healthy staff would become a priority in this case.
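The same sample can be tied directly to this decision by estimating how likely it is that the staff infection rate exceeds the threshold. The sketch below is a hypothetical illustration: it assumes a uniform Beta(1, 1) prior, uses 15% as the threshold, and loops over made-up counts of positives among 70 tested staff.

```python
import random


def prob_rate_exceeds(hits, tested, threshold, draws=100_000, seed=2):
    """Estimate P(infection rate > threshold) under a uniform-prior beta."""
    rng = random.Random(seed)
    alpha = hits + 1
    beta = tested - hits + 1
    over = sum(rng.betavariate(alpha, beta) > threshold for _ in range(draws))
    return over / draws


# Hypothetical outcomes for 70 randomly tested staff members.
for hits in (5, 10, 14, 20):
    p = prob_rate_exceeds(hits, tested=70, threshold=0.15)
    print(f"{hits} positives of 70: P(rate > 15%) is about {p:.2f}")
```

A probability near 0 or 1 supports a clear decision, while a value in between suggests testing more staff before changing how passengers are served.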

At the moment, the 90% confidence interval on the current level of infections is wide (https://hubbardresearch.com/cruise-ship-coronavirus-infected-passengers-in-hundreds/). People on board are worried about whether they are sick, regardless of whether they have symptoms. If the test results show that only a very small percentage of asymptomatic people test positive, that would reassure the asymptomatic people on board. Additionally, the percentage of asymptomatic people who test positive *could have broad global implications* for detection and planning, because it would let us back out the percentage of cases that test positive while asymptomatic.

This is a unique opportunity to understand the disease that could help the effort to contain it worldwide!
