Data Visualisation using COVID-19 Statistics
September 2020
Preface
As the volume of data being generated grows, data-driven research is playing an
increasingly important role alongside traditional hypothesis-driven research.
It is therefore important to be able to tease out trends and patterns from large
volumes of data, often through carefully crafted visualisations. In this guide,
we provide an introduction to the process of extracting, processing and
eventually visualising data to identify meaningful trends. In particular, we
will work with publicly available COVID-19 statistics (e.g. the number of cases
worldwide), using the R programming language to perform data wrangling (with the
dplyr package) and data visualisation (with the ggplot2 package).
Apart from the programming aspects, the guide also discusses other aspects of data visualisation, with emphasis on aesthetics and the proper interpretation of plots. Overall, we hope that the guide will help readers not only generate plots and work more effectively with data, but also appreciate the subtleties of interpreting results from graphical plots.
If you find any errors or wish to offer any feedback, feel free to contact me at john.f.ouyang@gmail.com.
And let’s dive in!