Two univariate distributions

[Figure: two univariate distributions]

Two univariate distributions

[Figure: two univariate distributions, second view]

Bivariate data: scatterplot

[Figure: scatterplot of bivariate data]

Marginal and Joint

[Figure: marginal and joint distributions]

Contour plot

[Figure: contour plot]

Image plot / heat map

[Figure: image plot / heat map]

Perspective plot

[Figure: perspective plot]
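The contour, image, and perspective views can all be drawn from a single bivariate density estimate. The sketch below is one possible way to do this in base R; the simulated data and the choice of `MASS::kde2d()` are assumptions for illustration, not necessarily what produced the figures on these slides.

```r
library(MASS)  # provides kde2d(); assumed to be installed (it ships with standard R)

# Simulated bivariate data: an illustrative assumption, not the slides' data
set.seed(1)
x <- rnorm(500)
y <- 0.6 * x + rnorm(500, sd = 0.8)

# Kernel density estimate on a 50 x 50 grid; returns a list with x, y, z
dens <- kde2d(x, y, n = 50)

# Three views of the same estimated surface
contour(dens, xlab = "x", ylab = "y")                 # contour plot
image(dens, xlab = "x", ylab = "y")                   # image plot / heat map
persp(dens$x, dens$y, dens$z, theta = 30, phi = 30,   # perspective plot
      xlab = "x", ylab = "y", zlab = "density")
```

All three plots show the same matrix `dens$z`; they differ only in how height is encoded (level curves, color, or a drawn surface).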

Fisher's Irises

Is there a relationship between the sepal length and the sepal width of irises? That is, if you have a measurement of one, can you predict the other?

Fisher's Irises

[Figure: Fisher's irises, sepal length and sepal width]
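One way to start exploring this question is a scatterplot of the two measurements with a fitted line. This is a minimal sketch using the built-in `iris` data; the names `r_all` and `fit`, and the use of a simple linear fit, are illustrative choices rather than the method behind the slide.

```r
# Scatterplot of sepal width against sepal length for all 150 flowers
plot(iris$Sepal.Length, iris$Sepal.Width,
     xlab = "Sepal length (cm)", ylab = "Sepal width (cm)")

# Pearson correlation between the two measurements (ignores species)
r_all <- cor(iris$Sepal.Length, iris$Sepal.Width)

# Least-squares line predicting width from length, added to the plot
fit <- lm(Sepal.Width ~ Sepal.Length, data = iris)
abline(fit)

print(r_all)
print(coef(fit))
```

Because this pools all three species, any relationship it shows may differ from the relationship within a single species.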

Conditioning

Question: what is the relationship between species and sepal width when comparing irises of the same sepal length?

Two step process:

  1. Collect (subset) the flowers that have approximately the same sepal length.

  2. Plot the relationship between sepal width and species within that subset.

Conditioning

  1. Let's consider the flowers with a sepal length around 5.5 cm, here taken as strictly between 5.0 and 6.0 cm.
nrow(iris)
## [1] 150
iris_slice <- iris[iris$Sepal.Length > 5.0 & iris$Sepal.Length < 6.0, ]
nrow(iris_slice)
## [1] 51

Conditioning

  2. Plot (density plots) the relationship between sepal width and species in this slice.

[Figure: density plots of sepal width by species in the slice]
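A per-species density plot for the slice can be sketched in base R as follows. The slice is rebuilt with the same condition as before; the names `widths` and `dens` and the default bandwidth of `density()` are assumptions for illustration, not the code behind the slide.

```r
# Rebuild the slice: sepal length strictly between 5.0 and 6.0 cm
iris_slice <- iris[iris$Sepal.Length > 5.0 & iris$Sepal.Length < 6.0, ]

# One vector of sepal widths per species, then one density estimate each
widths <- split(iris_slice$Sepal.Width, iris_slice$Species)
dens <- lapply(widths, density)

# Empty plotting frame large enough to hold all three curves
xlim <- range(sapply(dens, function(d) range(d$x)))
ylim <- c(0, max(sapply(dens, function(d) max(d$y))))
plot(NULL, xlim = xlim, ylim = ylim,
     xlab = "Sepal width (cm)", ylab = "Density")

# Overlay one curve per species, colored by position in the list
for (i in seq_along(dens)) lines(dens[[i]], col = i)
legend("topright", legend = names(dens), col = seq_along(dens), lty = 1)
```

The species are not equally represented in the slice, so the curves rest on different numbers of flowers.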

Conditioning

  2. Plot (boxplots) the relationship between sepal width and species in this slice.

[Figure: boxplots of sepal width by species in the slice]
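The boxplot version of the same comparison might look like the sketch below, which again rebuilds the slice so that it runs on its own. The name `bp` and the axis labels are illustrative choices.

```r
# Rebuild the slice: sepal length strictly between 5.0 and 6.0 cm
iris_slice <- iris[iris$Sepal.Length > 5.0 & iris$Sepal.Length < 6.0, ]

# Side-by-side boxplots of sepal width, one box per species
bp <- boxplot(Sepal.Width ~ Species, data = iris_slice,
              xlab = "Species", ylab = "Sepal width (cm)")

# boxplot() invisibly returns its summary; bp$n is the group sizes
print(setNames(bp$n, bp$names))
```

Printing the group sizes alongside the plot helps keep in mind how many flowers each box summarizes.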

3D scatterplot

3D density plot

Data in yet higher dimensions

http://www.gapminder.org/world

  1. How many variables/columns/dimensions are displayed?

  2. Is there any structure that is persistent over an entire variable?

  3. How would you characterize the relationship between fertility and life expectancy during the 1950s and 1960s when comparing the industrialized western nations to the rest of the world?

High dimension, high art