Plotting Techniques, STAT 603

This page summarizes the types of plots we have encountered in the pre-term STAT 603 class. Here is a list of plot types: histograms, boxplots, normal quantile plots, bar charts, spine plots, scatterplots, comparison boxplots, mosaic plots, time series plots, control charts. We will bring some order to this collection by showing when a particular plot is useful, what can be learned from it, and how it is generated in JMP.

[In what follows we will use the terms "variable" and "column" interchangeably.]

PLOT TYPES are largely determined by two factors:

How JMP thinks: It doesn't want to hear "scatterplot". It wants you to ask for a bivariate plot ("Fit Y by X"), and if you choose variables that are both continuous, it will indeed give you a scatterplot. If, however, both variables are nominal or ordinal, it will give you a mosaic plot. Similarly, you don't ask for a histogram. You ask for a univariate plot ("Distribution"), and if the variable you choose is continuous, you get a histogram, otherwise a bar chart. In other words, JMP decides for you which bivariate or univariate plot makes sense, based on the modeling types of the variables you select.
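[The rule just described can be written down as a small lookup. The following Python sketch is only an illustration of the logic, not anything JMP runs; the function names are made up for this page, and only the cases stated above are covered. Mixed cases (one continuous variable, one nominal or ordinal) return None here because this paragraph does not describe them.]

```python
from typing import Optional

# Modeling types as JMP names them; nominal and ordinal are treated alike here.
CATEGORICAL = {"nominal", "ordinal"}
MODELING_TYPES = CATEGORICAL | {"continuous"}


def _check(modeling_type: str) -> None:
    """Reject anything that is not one of the three modeling types."""
    if modeling_type not in MODELING_TYPES:
        raise ValueError(f"unknown modeling type: {modeling_type!r}")


def univariate_plot(modeling_type: str) -> str:
    """Plot you get from 'Distribution' for a single column (hypothetical helper)."""
    _check(modeling_type)
    return "histogram" if modeling_type == "continuous" else "bar chart"


def bivariate_plot(y_type: str, x_type: str) -> Optional[str]:
    """Plot you get from 'Fit Y by X' for the two cases described in the text."""
    _check(y_type)
    _check(x_type)
    if y_type == "continuous" and x_type == "continuous":
        return "scatterplot"
    if y_type in CATEGORICAL and x_type in CATEGORICAL:
        return "mosaic plot"
    return None  # mixed cases are not covered by this paragraph
```

[For example, univariate_plot("ordinal") gives "bar chart" and bivariate_plot("continuous", "continuous") gives "scatterplot". The point is that you pick the variables and their modeling types; the plot type follows from them.]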

[Problem: Sometimes you get the wrong type of plot, such as when the modeling type of an integer variable is continuous, but you really want to know how often each integer occurs. You can't tell JMP to hand you a bar chart instead of a histogram, or a mosaic plot instead of a scatterplot. Instead, you have to change the modeling type from continuous to ordinal: Right-click on column name > Column Info > Modeling Type > ordinal. Note that variables with character data cannot be made continuous!]
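[To see why the modeling type matters for an integer variable, here is a small Python sketch, again only an illustration and not JMP code. The data are made up, and the bin width of 2 is an arbitrary assumption; JMP chooses its own histogram bins. The "ordinal" view counts each distinct integer, which is what a bar chart shows; the "continuous" view lumps neighboring integers into intervals, which is what a histogram shows.]

```python
from collections import Counter

# Made-up example data: number of children in eight households.
CHILDREN = [0, 1, 1, 2, 2, 2, 3, 5]


def value_counts(values):
    """Ordinal view: how often each distinct integer occurs (bar chart heights)."""
    return dict(sorted(Counter(values).items()))


def binned_counts(values, bin_width):
    """Continuous view: counts per interval [k*w, (k+1)*w), keyed by left edge."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    print(value_counts(CHILDREN))      # one count per integer value
    print(binned_counts(CHILDREN, 2))  # one count per interval of width 2
```

[With bins of width 2, the values 0 and 1 fall into the same interval, so the histogram view can no longer tell you how many households have exactly one child. The per-value counts can.]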



UNIVARIATE PLOTS

With univariate plots you examine the distribution of one variable at a time, not the associations and dependences between two or more variables.

JMP: Analyze > Distribution; you can now specify as many variables as you please by selecting them as Y-variables. You will get one plot for each variable separately.

Rule: Make univariate plots of ALL variables first thing when you start looking at a new dataset.

Here are the univariate plot types as a function of the modeling types: