Module: Basic Exploratory Data Analysis
Module Chapter 7 covers data at the most basic level. Once data are obtained, the next step is to conduct basic exploratory data analysis, which consists of summarizing the data graphically and numerically. This allows one to explore and analyze a data set to discover patterns, trends, distributions, anomalies, etc. The intent is to explore and learn from the data, as opposed to assessing statistical hypotheses.
There is no algorithm or method that automatically tells one how the data should be summarized. Therefore, it is important to have a basic understanding of the properties of the different methods of summarization. With practice, one can develop the ability to summarize data effectively in a way that brings insight to one's research question(s).
It is worth noting that there are three common plotting systems in R for graphically summarizing data:
- base R
- lattice
- ggplot2
The lattice system is preferred in this module because it mostly follows R's formula syntax, which makes its suite of functions for summarizing data intuitive and easy to use. That is, the functions in this system follow the general form task(y ~ x, data).
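As a minimal sketch of the general form task(y ~ x, data), the block below uses three lattice functions on R's built-in mtcars data set. The choice of data set, variables, and object names (p_hist, p_box, p_scatter) is an assumption made for illustration and does not come from the module itself.

```r
# Assumption: the lattice package is installed (it ships with standard R
# distributions) and the built-in mtcars data set is used for illustration.
library(lattice)

# One numerical variable: the formula has no left-hand side.
p_hist <- histogram(~ mpg, data = mtcars)

# A numerical response split by a categorical variable: y ~ x.
p_box <- bwplot(mpg ~ factor(cyl), data = mtcars)

# Two numerical variables: y ~ x.
p_scatter <- xyplot(mpg ~ wt, data = mtcars)

# Lattice functions return "trellis" objects; printing one draws the plot.
print(p_box)
```

In each call only the task (histogram, bwplot, xyplot) and the formula change, while the data argument stays the same, which is the consistency the formula interface is meant to offer.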
Module outline
- The chapters in this module have three focuses regarding data:
- Exploring categorical data
- Exploring numerical data
- Exploring categorical data
- Outline of topics:
- Graphical summaries
- Numerical summaries