The pandemic of 2020 has been a historic and unprecedented event in many ways, not only as a health crisis but also as an economic one, affecting institutions large and small, public and private.
COVID-19 has put data-driven decision making to the ultimate stress test at the national, state and business level. The test has highlighted not only the “readiness and quality” of the data to enable decision making, but also the “readiness and knowledge” (“data literacy”) of individuals to evaluate the data used in statistics, news reports, and dashboards to determine a course of action. In particular, COVID-19 dashboards have been disseminated and used broadly to visualize data in an easy-to-consume way. But as we have learned in the last six months, not all COVID-19 dashboards are created equal.
The good, the bad and the ugly of COVID dashboards are also the good, the bad and the ugly of the dashboards your own business uses internally to make decisions.
What are the key questions decision makers should be asking when they look at a dashboard, whether provided by government, external institutions, or the IT department?
Where is the Appendix?
Dashboards should publish a detailed appendix that makes the underlying assumptions, numbers, sources and formulas transparent. The appendix should also explain how to interpret the graphs. As an example, the Johns Hopkins University COVID-19 dashboard, used broadly by businesses and government alike, provides a detailed FAQ and instructions online for all to see. In your business, if the dashboard is a “black box” with no such transparency, ask for an appendix. If none can be provided, then question the dashboard’s credibility.
What are the sources of data?
Where did the data displayed in the dashboard and used in the derived statistics come from? Can these sources be trusted? Is there any bias in the collection or aggregation of the data? Trusted data not only comes from credible institutions and systems but has also been scrubbed for data quality issues such as duplicates, invalid values and inconsistencies. Achieving 100% data quality is not usually possible or practical. Therefore, understanding the scrubbing that has been performed, and the data issues that may still persist, determines how much confidence to place in the data-driven decisions and conclusions drawn from the dashboard.
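As a rough illustration of what "scrubbing" can mean in practice, the sketch below counts the three kinds of issues named above in a handful of rows. The rows, the field layout (state, report date, new cases), and the rule that state codes must be uppercase are all made up for this example; a real dashboard would have its own rules.

```python
from collections import Counter

# Hypothetical sample rows: (state, report_date, new_cases)
rows = [
    ("NY", "2020-06-01", 120),
    ("NY", "2020-06-01", 120),   # exact duplicate of the row above
    ("NJ", "2020-06-01", -5),    # invalid value: a negative case count
    ("ny", "2020-06-02", 98),    # inconsistency: lowercase state code
    ("NJ", "2020-06-02", 40),
]

def quality_report(rows):
    """Count simple data quality problems in (state, date, cases) rows."""
    counts = Counter(rows)
    # Every copy beyond the first counts as one duplicate
    duplicates = sum(n - 1 for n in counts.values() if n > 1)
    # Case counts must be non-negative integers (assumed rule)
    invalid = sum(
        1 for _, _, cases in rows
        if not isinstance(cases, int) or cases < 0
    )
    # State codes are assumed to be uppercase; anything else is flagged
    inconsistent = sum(1 for state, _, _ in rows if state != state.upper())
    return {
        "rows": len(rows),
        "duplicates": duplicates,
        "invalid": invalid,
        "inconsistent": inconsistent,
    }

report = quality_report(rows)
print(report)
```

A summary like this, published in the appendix, tells the decision maker which issues were checked for and how many were found.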
What is the timeliness and age of the data?
When the dashboard data was collected, and how old it is, matters most when comparing multiple numbers. In COVID-19 dashboards, for example, comparing two states’ or two countries’ infection counts or percentages is problematic if the data for each was collected at significantly different times or refreshed at different intervals. The appendix should state when the data was collected and how often it is updated.
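A simple guard against comparing stale and fresh numbers is to check the gap between their "as of" dates before putting them side by side. The sketch below assumes hypothetical snapshot dates and an arbitrary tolerance of three days; the right tolerance depends on how quickly the underlying numbers change.

```python
from datetime import date

# Hypothetical "as of" dates for two figures being compared
as_of = {
    "State A": date(2020, 6, 30),
    "State B": date(2020, 6, 16),
}

def comparable(as_of_a, as_of_b, max_gap_days=3):
    """Return (ok, gap_days) for two snapshot dates.

    max_gap_days is an assumed tolerance, not a standard.
    """
    gap = abs((as_of_a - as_of_b).days)
    return gap <= max_gap_days, gap

ok, gap = comparable(as_of["State A"], as_of["State B"])
print(f"comparable={ok}, gap={gap} days")
```

Here the two snapshots are two weeks apart, so the comparison is flagged rather than silently displayed.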
Are Terms & Definitions standardized?
Early in COVID-19 tracking, before the CDC provided guidance to states, the definition of a COVID-19 death was not standardized. As a result, some states included suspected deaths as well as confirmed deaths in their numbers, while others counted only individuals who died in hospital care. Comparing two states that define the same term differently leads to inaccurate conclusions: states that used the broader definition would appear to be in worse shape than states using the narrower one. Even after the CDC provided a definition, many states did not comply until much later. If definitions for common terms are not consistent, then comparisons and conclusions across these items can’t be made reliably. In your business, common definitions are also required for accurate comparisons. For example, the definition of “active customer” often differs across company functions. Understanding these inconsistencies before comparisons are made is crucial.
In statistical formulas, what is the denominator?
Dashboards display statistics such as averages and percentages. The dashboard appendix should include the mathematical formula for how these statistics were calculated. Most are basic division formulas with a numerator (top number) divided by a denominator (bottom number). The denominator is especially important because it can significantly influence the result. For example, tracking the “percentage of COVID-19 deaths” requires dividing the “number of deaths” by a “population number.” Conclusions about the safety of a particular state could differ significantly depending on the population count (denominator) used.
Comparing two states whose calculations use different denominators would also be problematic. If both states calculate the “percentage of deaths” with the “total state population” as the denominator, an accurate state-to-state comparison is possible. But if one state uses the “population tested in the state” as the denominator, then concluding that one state is harder hit than another is potentially inaccurate. The tested population is smaller than the total population, so dividing by it can produce a much larger percentage. Later tracking began using deaths per 100K population, a more credible comparison statistic. Averages (means) can also be misleading when the numbers are spread over a broad range, because a few extreme values pull the average up or down. In those cases the median is often the better statistic.
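The effect of the denominator, and of extreme values on an average, is easy to see with a few lines of arithmetic. The figures below are invented for illustration only and do not describe any real state.

```python
from statistics import mean, median

# Hypothetical figures for one state (not real data)
deaths = 500
total_population = 1_000_000
tested_population = 50_000

def percent(numerator, denominator):
    """Numerator as a percentage of the denominator."""
    return 100 * numerator / denominator

def per_100k(numerator, population):
    """Rate per 100,000 people."""
    return 100_000 * numerator / population

pct_of_total = percent(deaths, total_population)    # about 0.05%
pct_of_tested = percent(deaths, tested_population)  # about 1%
rate_per_100k = per_100k(deaths, total_population)  # about 50 per 100K

print(f"{pct_of_total:.2f}% of total population")
print(f"{pct_of_tested:.2f}% of tested population")
print(f"{rate_per_100k:.1f} deaths per 100K")

# Hypothetical daily counts with one extreme day
daily_counts = [10, 12, 11, 13, 300]
avg = mean(daily_counts)    # pulled upward by the single spike
mid = median(daily_counts)  # stays close to a typical day

print(f"mean={avg:.1f}, median={mid}")
```

The same 500 deaths read as twenty times worse when divided by the tested population instead of the total population, and a single spike lifts the mean far above what a typical day looks like while the median barely moves.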
Why am I still confused?
Dashboards can be excellent digital communication tools to capture many metrics and offer “at a glance” insights in an easy-to-consume way. But this requires mastering dashboard design principles and techniques. Designing an effective dashboard requires skill in using colors, chart types (bar, pie, line, tables, gauges) and visual cues in the right way. If the dashboard is not designed correctly, the audience is left confused, which ultimately leads to misinterpretation. When reviewing your business dashboard, if you find yourself staring too long at the graphs and pictures to make sense of the data, the dashboard design is not intuitive. Ask more questions.
What data is missing that would tell the full story?
Analytic dashboards, like the COVID-19 ones, are used for analysis and decision making. As a decision maker, understand the decisions you want to make and what data will help you make them, then evaluate whether the dashboard includes all the necessary data points. Oftentimes, it’s the data that is not included that would make the decision more accurate. For example, in the COVID-19 case, information not just about the zip code infection rate but about the actual streets and neighborhoods involved would have helped people understand clusters and what areas to avoid. Also, breaking down the infection and death distribution by pre-existing condition would alert those with the highest-risk conditions to take more precautions to protect themselves. The list of pre-existing conditions was very broad and covered a significant percentage of the population, making it less meaningful to those who were most at risk.
How credible is the institution/person/function publishing the dashboard?
Before COVID-19, “fake news” was already in our vocabulary and affecting the information sources we personally used to get our public news. COVID-19 has further demonstrated how much the credibility of data providers matters and the extent to which it changes public opinion and actions. The numbers displayed, the sources used, and the formulas will not matter if the public believes there are personal or political agendas at stake. Agendas and bias also exist in the business world. As a decision maker, evaluate the trustworthiness and credibility of the person or function providing the dashboard. Could bias or an agenda be at play? Alternatively, if you are the data provider, establish yourself as a credible, independent data broker for the firm.
Data-driven businesses use data to inform and enhance processes and decision making. Reviewing dashboards for COVID-19 can help us see the areas where we can do better analysis in our own businesses.