Getting Started with R and RStudio: A Beginner’s Guide to Data Visualization

If you are starting your journey in data science, you’ll often hear about R and RStudio. R is a programming language designed for statistical computing and data visualization, while RStudio is a user-friendly IDE (Integrated Development Environment) that makes working with R much easier.

In this guide, we’ll go through:

  • R basics

  • Popular visualization packages

  • Using the inbuilt plot() function

  • Plotting with ggplot2

  • Extending ggplot2 with GGally

And yes—we’ll look at code snippets you can try on your own! 🚀


Setting up R and RStudio

  1. Download R → Install from CRAN (Comprehensive R Archive Network).

  2. Download RStudio → Install RStudio Desktop from Posit (the company formerly named RStudio).

  3. Open RStudio → You’ll see four panes: Console, Script editor, Environment, and Plots.

Now you’re ready to code in R! 🎉
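
Most of the packages used later in this guide (ggplot2, plotly, leaflet, GGally) are installed from CRAN, while lattice normally ships with R. Here is a minimal sketch for checking what is already available; the variable names pkgs, is_installed, and missing_pkgs are illustrative, and the install line is left commented out so nothing is downloaded unless you choose to.

```r
# Packages used later in this guide (lattice usually ships with R already)
pkgs <- c("ggplot2", "plotly", "lattice", "leaflet", "GGally")

# requireNamespace() returns FALSE instead of stopping when a package is absent
is_installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- pkgs[!is_installed]

if (length(missing_pkgs) > 0) {
  message("Not installed yet: ", paste(missing_pkgs, collapse = ", "))
  # Uncomment the next line to install them from CRAN:
  # install.packages(missing_pkgs)
} else {
  message("All packages are available.")
}
```

Run this once in the Console; after installing, each package still needs a library() call in every new session.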


R Basics

Here are some quick basics in R:

# Arithmetic
2 + 3      # 5
10 / 2     # 5

# Variables
x <- 5
y <- 10
x + y      # 15

# Vectors
numbers <- c(1, 2, 3, 4, 5)
mean(numbers)   # Average = 3

R is designed with statistics and visualization in mind, so plotting data is one of its strongest features.
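
The plots in this guide read their data from built-in data frames such as mtcars, so it may help to see how columns and rows are accessed first. This is a small sketch using only base R; the name heavy is just an illustrative variable.

```r
# mtcars is a data frame from the datasets package, which loads with R
dim(mtcars)                            # 32 rows, 11 columns
str(mtcars[, c("mpg", "wt", "cyl")])   # column types at a glance

# `$` extracts one column as a vector
mean(mtcars$mpg)                       # about 20.09

# Logical indexing: keep only cars heavier than 4 (wt is in 1000 lbs)
heavy <- mtcars[mtcars$wt > 4, ]
heavy[, c("mpg", "wt")]
```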


Popular Visualization Packages

R has a rich ecosystem of visualization libraries. Let’s look at the most popular ones:

1. ggplot2 (Layered static plots)

  • Based on the Grammar of Graphics.

  • Highly customizable.

  • Example:

library(ggplot2)

data(mpg)  # mpg ships with ggplot2 (fuel economy data for 234 cars)
ggplot(mpg, aes(x = displ, y = hwy, color = class)) +
  geom_point() +
  theme_minimal() +
  labs(title = "Engine Size vs Highway Mileage")

2. plotly (Interactive plots)

  • Creates interactive charts.

  • Great for dashboards.

library(plotly)

fig <- plot_ly(data = iris, x = ~Sepal.Length, y = ~Petal.Length,
               type = "scatter", mode = "markers", color = ~Species)
fig

3. lattice (Multi-panel plots)

  • Useful for conditioning plots (splitting data by categories).

library(lattice)

# factor(cyl) gives one panel per cylinder count, labeled 4, 6, and 8
xyplot(mpg ~ wt | factor(cyl), data = mtcars,
       main = "MPG vs Weight by Cylinders",
       xlab = "Car Weight", ylab = "Miles per Gallon")

4. leaflet (Maps in R 🌍)

  • Best for interactive maps.

library(leaflet)

leaflet() %>%
  addTiles() %>%
  addMarkers(lng = 77.5946, lat = 12.9716, popup = "Bangalore")

5. Others worth exploring

  • cowplot → Enhances ggplot layout.

  • highcharter → Interactive charts.

  • corrplot → Correlation matrix visualization.


Using the Inbuilt plot() Function

Base R includes plot(), which draws a scatter plot when given two numeric vectors:

plot(mtcars$wt, mtcars$mpg,
     main = "Car Weight vs MPG",
     xlab = "Weight", ylab = "Miles per Gallon",
     col = "blue", pch = 19)

👉 Additional Features:

  • pch → point shape

  • col → colors

  • type → "l" for line, "p" for points, "b" for both

  • abline() → add a straight line to an existing plot, such as a fitted regression line

# Adding a regression line
model <- lm(mpg ~ wt, data = mtcars)
abline(model, col = "red", lwd = 2)
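
To see the type and pch options in action, here is a minimal sketch that assumes the built-in pressure dataset as example data (it is not used elsewhere in this guide). It also shows that abline() can draw plain reference lines, not only regression fits.

```r
# pressure is a built-in data frame: temperature (deg C) and
# vapor pressure of mercury (mm Hg)
plot(pressure$temperature, pressure$pressure,
     type = "b",              # points joined by line segments
     pch = 17,                # filled triangles
     col = "darkgreen",
     main = "Vapor Pressure of Mercury",
     xlab = "Temperature (deg C)", ylab = "Pressure (mm Hg)")

# abline() also draws reference lines: h = horizontal, v = vertical
abline(h = 400, lty = 2, col = "gray40")
```

Changing type to "l" or "p" on the same call is a quick way to compare the three styles.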

Plotting with ggplot2

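ggplot2 is the powerhouse of visualization in R. You build a plot by adding layers with +: the data and aesthetic mappings first, then geoms, themes, and labels.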

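Example: Scatter Plot with a Linear Trend Line

# Color is mapped only inside geom_point(), so geom_smooth() fits
# a single line across all classes instead of one line per class
ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point(aes(color = class), size = 3) +
  geom_smooth(method = "lm", se = FALSE, color = "red") +
  theme_classic() +
  labs(title = "Displacement vs Highway Mileage",
       x = "Engine Displacement (L)", y = "Highway Mileage (mpg)")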

👉 Extras:

  • geom_histogram() for histograms

  • geom_boxplot() for boxplots

  • facet_wrap(~class) for multi-panel plots
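
The three extras above can be sketched as follows, using the same mpg dataset. The object names p_hist, p_box, and p_facet and the binwidth of 2 are illustrative choices, not recommendations.

```r
library(ggplot2)

# Histogram: distribution of highway mileage
p_hist <- ggplot(mpg, aes(x = hwy)) +
  geom_histogram(binwidth = 2, fill = "steelblue", color = "white") +
  labs(title = "Distribution of Highway Mileage", x = "Highway Mileage (mpg)")

# Boxplot: highway mileage compared across vehicle classes
p_box <- ggplot(mpg, aes(x = class, y = hwy)) +
  geom_boxplot() +
  labs(title = "Highway Mileage by Class")

# Facets: one scatter panel per vehicle class
p_facet <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point() +
  facet_wrap(~class)

# print() draws each plot; it is needed inside scripts and functions
print(p_hist)
print(p_box)
print(p_facet)
```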


Going Further: GGally Extension

GGally extends ggplot2 with ready-made composite plots, such as scatterplot matrices (ggpairs()) and correlation plots (ggcorr()).

Example: Pairwise Scatter Plots (Great for EDA 🔍)

library(GGally)

# Pair plot of iris dataset
ggpairs(iris, aes(color = Species))

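This generates a matrix of plots showing the relationship between every pair of variables. By default, numeric pairs get scatter plots below the diagonal and correlation coefficients above it, with density plots on the diagonal. It is a quick way to explore a dataset.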
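
On wider datasets the full matrix gets crowded, so it can help to restrict the plot to selected columns. Here is a minimal sketch using the columns argument of ggpairs(); the names num_cols and pairs_plot are illustrative.

```r
library(ggplot2)
library(GGally)

# Positions of the numeric columns in iris (the four measurements)
num_cols <- which(vapply(iris, is.numeric, logical(1)))

# Plot only the numeric columns, still colored by Species
pairs_plot <- ggpairs(iris, columns = num_cols,
                      mapping = aes(color = Species))
print(pairs_plot)
```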


Wrapping Up

  • R + RStudio = Perfect combo for beginners and professionals in data science.

  • Use base R plots (plot()) for quick visual checks.

  • Switch to ggplot2 for polished, publication-quality visuals.

  • Explore plotly, lattice, leaflet for interactivity, multi-panels, and maps.

  • Extend with GGally for advanced exploratory analysis.

📌 Visualization is a core skill in data science, and R provides some of the best tools available.

👉 Start with small datasets (like iris, mtcars, or mpg) and keep experimenting.