Preface
This work is a translation of a Python tutorial from the following repository: https://github.com/jisukimmmm/NCCR_MWQTA_2024
It was then transformed into an interactive tutorial.
Introduction to R language - exercises & answers
Basic Syntax and Operations:
1. Calculate the area of a triangle:
Write a program to calculate the area of a triangle given its base and height.
The area of a triangle is calculated using the formula: (base * height) / 2
base <- 10
height <- 3
triangle <- base * height / 2
paste("This is the area:", triangle)
2. Speed Conversion
Create a program that converts kilometers per hour to meters per second.
To convert km/h to m/s:
1. Multiply by 1000 to convert km to m
2. Divide by 3600 to convert hours to seconds
kmph <- 100
ms <- kmph * 1000 / 3600
paste("The answer is", ms)
3. String Reversal
Write an R script that takes a string as input and prints its reverse.
Use strsplit() to split the string into characters, rev() to reverse them, and paste() with collapse to join them back.
my_text <- "This is a text"
rev_text <- paste(rev(strsplit(my_text, NULL)[[1]]), collapse = "")
rev_text
Conditional Statements and Loops:
1. Leap Year Check
Create a program that checks whether a given year is a leap year or not.
A year is a leap year if:
- It’s divisible by 4 AND not divisible by 100
- OR it’s divisible by 400
Use the modulo operator %% to check divisibility.
year <- 3000
if ((year %% 4 == 0 && year %% 100 != 0) || (year %% 400 == 0)) {
  "This is a leap year"
} else {
  "This is not a leap year"
}
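The same rule can also be wrapped in a small function and checked against several years at once. This is only a sketch: is_leap_year is a hypothetical helper name and the test years are chosen for illustration. Because it works on a whole vector, it uses the element-wise operators & and | rather than && and ||.

```r
# Hypothetical helper, not part of the original exercise.
# Returns one TRUE/FALSE per element of `year`.
is_leap_year <- function(year) {
  (year %% 4 == 0 & year %% 100 != 0) | (year %% 400 == 0)
}

# Illustrative years: 1900 and 3000 are divisible by 100 but not by 400
years <- c(1900, 2000, 2024, 3000)
leap_flags <- is_leap_year(years)
leap_flags  # FALSE TRUE TRUE FALSE
```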
2. Sum of Multiples
Write an R script to find the sum of all numbers between 1 and 1000 that are divisible by both 3 and 5.
- Create a sequence from 1 to 1000
- Use vector operations with modulo to find numbers divisible by both 3 and 5
- Use sum() to add them up
numbers <- 1:1000
bag <- numbers[numbers %% 3 == 0 & numbers %% 5 == 0]
sum(bag)
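A number divisible by both 3 and 5 is divisible by 15, so the result can be cross-checked by generating the multiples of 15 directly. The variable names below are assumptions made for this sketch.

```r
# Multiples of 15 up to 1000 (the last one is 990)
multiples_of_15 <- seq(15, 1000, by = 15)
total_direct <- sum(multiples_of_15)

# Same result using the filtering approach from the exercise
all_numbers <- 1:1000
total_filtered <- sum(all_numbers[all_numbers %% 3 == 0 & all_numbers %% 5 == 0])

total_direct  # 33165
```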
3. Geometric Progression
Implement a program to print the first 10 terms of the geometric progression series: 2, 6, 18, 54, …
- Create a numeric vector to store the series
- First term is given
- Each subsequent term is previous term multiplied by common ratio
common_ratio <- 3
gp_series <- 2
for (i in 2:10) {
  gp_series[i] <- gp_series[i-1] * common_ratio
}
gp_series
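Since the n-th term of a geometric progression is first_term * ratio^(n - 1), the loop can also be replaced by one vectorized expression. This is an alternative sketch, not the tutorial's solution; the names first_term and gp_closed_form are assumptions.

```r
first_term <- 2
ratio <- 3

# Exponents 0 to 9 give the first 10 terms
gp_closed_form <- first_term * ratio^(0:9)
gp_closed_form
```

The last element is 2 * 3^9 = 39366, which matches the tenth value produced by the loop.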
Lists and List Operations:
1. Largest and Smallest Elements
Create a program to find the largest and smallest elements in a list.
Use R’s built-in functions:
- min() for smallest element
- max() for largest element
number_list <- c(2, 5, 1, 67, 4, 7)
mini <- min(number_list)
maxi <- max(number_list)
paste("Min:", mini, "Max:", maxi)
2. List Intersection
Write an R script to find the intersection of two lists.
Use the intersect() function to find common elements between two vectors.
list1 <- c(1, 2, 3, 4, 5)
list2 <- c(4, 5, 6, 7, 8)
intersection <- intersect(list1, list2)
intersection
3. Program to shuffle a deck of cards (x)
Implement a program to shuffle a deck of cards represented as a list.
Use the sample() function to randomly shuffle elements in a vector. First create a vector with all cards.
deck <- c("A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K")
shuffled_deck <- sample(deck)
print(shuffled_deck)
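The solution above shuffles the 13 ranks only. A possible extension, sketched here, builds a full 52-card deck and fixes the random seed so the shuffle can be reproduced. The suit names and the seed value are assumptions introduced for this example.

```r
ranks <- c("A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K")
suits <- c("Hearts", "Diamonds", "Clubs", "Spades")  # assumed suit names

# Combine every rank with every suit, e.g. "A of Hearts"
full_deck <- paste(rep(ranks, times = length(suits)),
                   "of",
                   rep(suits, each = length(ranks)))

set.seed(42)  # arbitrary seed, only to make the shuffle reproducible
shuffled_full_deck <- sample(full_deck)
head(shuffled_full_deck)
```

Calling sample() with only a vector returns a random permutation, so every card appears exactly once in the shuffled deck.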
Strings and String Operations:
1. Capitalize the first letter of each word (x)
Write an R script to capitalize the first letter of each word in a sentence.
The tools::toTitleCase() function converts a string to title case. It follows English title-case conventions, so it may leave short words such as "a" or "is" in lowercase.
sentence <- "this is a sentence"
capitalized_sentence <- tools::toTitleCase(sentence)
print(capitalized_sentence)
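tools::toTitleCase() applies English title-case rules, so some short words can stay in lowercase. If every word must start with a capital letter, one alternative is a regular-expression replacement. The sketch below assumes words start with the ASCII letters a to z; the variable names are made up for this example.

```r
sentence_demo <- "this is a sentence"

# \\b marks a word boundary; \\U upper-cases the captured letter (needs perl = TRUE)
capitalized_all <- gsub("\\b([a-z])", "\\U\\1", sentence_demo, perl = TRUE)
capitalized_all  # "This Is A Sentence"
```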
2. Most Frequent Character (x)
Create a program to find the most frequent character in a given string.
- Split the string into characters using strsplit()
- Create a frequency table with table()
- Sort in descending order and get the first element
string <- "this is a string"
most_frequent_char <- names(sort(table(strsplit(string, NULL)[[1]]),
                                 decreasing = TRUE))[1]
most_frequent_char
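Two details are worth checking in this exercise: spaces are counted as characters, and several characters can share the highest count. The sketch below drops spaces and returns every character tied for first place. Both choices are assumptions about what "most frequent" should mean here, and the names are illustrative.

```r
text_demo <- "this is a string"

chars <- strsplit(text_demo, NULL)[[1]]
chars <- chars[chars != " "]   # ignore spaces (an assumption for this sketch)

freq <- table(chars)

# Keep every character whose count equals the maximum, so ties are visible
top_chars <- names(freq)[freq == max(freq)]
top_chars
```

For this input both "i" and "s" occur three times, so taking only the first element of a sorted table would hide the tie.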
3. Check if a string contains only digits
Implement a program to check if a given string contains only digits.
- Use the grepl() function with a regular expression pattern
- The pattern ^[0-9]+$ means:
  - ^ start of string
  - [0-9] any digit
  - + one or more occurrences
  - $ end of string
string <- "123456"
is_digits <- grepl("^[0-9]+$", string)
print(paste("Does the string contain only digits?", is_digits))
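To see how each part of the pattern behaves, it can be tried on a few edge cases. The inputs below are invented for illustration. Since grepl() is vectorized, one call tests them all.

```r
inputs <- c("123456", "12a456", "", "12 34", "-42")

only_digits <- grepl("^[0-9]+$", inputs)

# Pair each input with its result for easier reading
data.frame(input = inputs, only_digits = only_digits)
```

Only the first input matches. The empty string fails because + requires at least one digit, and the last two fail because a space and a minus sign are not digits.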
Functions:
1. Perfect Square Check
Create a function to check whether a given number is a perfect square or not.
- Take the square root of the number
- Check if the square root is equal to its floor value
- Return TRUE/FALSE accordingly
is_perfect_square <- function(x) {
  sqrt_x <- sqrt(x)
  return(sqrt_x == floor(sqrt_x))
}
# Test the function
is_perfect_square(16) # Should return TRUE
is_perfect_square(15) # Should return FALSE
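For negative input, sqrt() returns NaN with a warning, so the comparison does not yield FALSE. A more defensive variant is sketched below. It assumes a single whole number as input, and is_perfect_square_safe is a hypothetical name.

```r
# Hypothetical variant: expects a single whole number
is_perfect_square_safe <- function(x) {
  if (x < 0) {
    return(FALSE)            # negative numbers are never perfect squares
  }
  root <- round(sqrt(x))     # nearest whole-number candidate for the root
  root * root == x           # compare by squaring instead of checking the fraction
}

is_perfect_square_safe(16)   # TRUE
is_perfect_square_safe(15)   # FALSE
is_perfect_square_safe(-4)   # FALSE
```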
2. Reverse the elements of a vector
Implement a function to reverse the elements of a list in place.
You can use R’s built-in rev() function to reverse a list or vector. Alternatively, you could write a loop that swaps elements from the beginning and end moving towards the middle.
reverse_list <- function(lst) {
  return(rev(lst))
}
# Test the function
my_list <- c(1, 2, 3, 4, 5)
reversed <- reverse_list(my_list)
print(reversed)
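The loop-based alternative mentioned in the hint could look like the following sketch. reverse_with_loop is a hypothetical name. Because R copies a vector when a function modifies it, the function returns a reversed copy and the caller's vector is left unchanged.

```r
# Hypothetical loop-based version of the reversal
reverse_with_loop <- function(vec) {
  n <- length(vec)
  # Swap position i with its mirror position until the middle is reached
  for (i in seq_len(n %/% 2)) {
    j <- n - i + 1
    tmp <- vec[i]
    vec[i] <- vec[j]
    vec[j] <- tmp
  }
  vec
}

original <- c(1, 2, 3, 4, 5)
reverse_with_loop(original)  # 5 4 3 2 1
original                     # still 1 2 3 4 5
```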
3. Calculate the mean of a list of numbers
Create a function to calculate the mean (average) of a list of numbers.
The mean is calculated by summing all numbers and dividing by the count of numbers. In R, you can use the built-in mean() function or implement it using sum() and length().
calculate_mean <- function(numbers) {
  return(mean(numbers))
}
# Test the function
numbers <- c(1, 2, 3, 4, 5)
avg <- calculate_mean(numbers)
print(paste("The mean is:", avg))
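The manual variant with sum() and length() mentioned in the hint can be sketched as follows. calculate_mean_manual is a hypothetical name, and the sketch assumes a non-empty numeric vector without missing values.

```r
# Hypothetical manual implementation: total divided by the number of elements
calculate_mean_manual <- function(values) {
  sum(values) / length(values)
}

manual_avg <- calculate_mean_manual(c(1, 2, 3, 4, 5))
manual_avg  # 3
```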
File Handling:
1. CSV Data Analysis (x)
Create a program to read a CSV file containing student scores and calculate their average.
- Use readr::read_csv() to read the CSV file
- Access the score column using $
- Calculate mean using mean()
# Method 1
student_scores <- read.csv("data/student_scores.csv")
mean(student_scores$score)
# Method 2
library(readr)
student_scores <- read_csv("data/student_scores.csv")
mean(student_scores$score)
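If data/student_scores.csv is not at hand, the workflow can be tried on a temporary file. The sketch below uses base R only. The score values are invented for illustration, and the column names student_id and score are assumed to match those used in the plotting exercises.

```r
# Made-up example data (not the tutorial's real file)
example_scores <- data.frame(
  student_id = 1:5,
  score = c(70, 85, 90, 65, 80)
)

# Write to a temporary CSV file, then read it back as in Method 1
scores_file <- tempfile(fileext = ".csv")
write.csv(example_scores, scores_file, row.names = FALSE)

student_scores_demo <- read.csv(scores_file)
demo_mean <- mean(student_scores_demo$score)
demo_mean  # 78
```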
2. Find lines containing a specific word in a text file (x)
Write an R script to find and print all lines containing a specific word in a text file.
Use readLines() to read the file content and grep() to search for matching lines. The grep() function with value = TRUE returns the actual matching lines.
find_lines_with_word <- function(file_path, word) {
  lines <- readLines(file_path)
  matching_lines <- grep(word, lines, value = TRUE)
  return(matching_lines)
}
# Example usage
# find_lines_with_word("example.txt", "specific_word")
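By default grep() interprets its first argument as a regular expression, so a search term containing characters such as . or ( may match more than intended. The sketch below passes fixed = TRUE to search for the literal text and runs against a temporary file. The function name and the file content are invented for this example.

```r
# Hypothetical variant that treats `word` as literal text
find_lines_literal <- function(file_path, word) {
  lines <- readLines(file_path)
  grep(word, lines, value = TRUE, fixed = TRUE)
}

# Small temporary file with made-up content
demo_file <- tempfile(fileext = ".txt")
writeLines(c("R is fun", "Python is fun too", "I use R daily"), demo_file)

matches <- find_lines_literal(demo_file, "R")
matches
```

This is still a substring search: searching for "fun" would also match a line containing "funny".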
3. Count words in a text file
Implement a program to count the number of words in a text file.
Break this down into steps:
1. Read the file using readLines()
2. Split the text into words using strsplit() with whitespace as delimiter
3. Count the total words using length()
count_words_in_file <- function(file_path) {
  lines <- readLines(file_path)
  words <- unlist(strsplit(lines, "\\s+"))
  return(length(words))
}
# Example usage
# count_words_in_file("example.txt")
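When a line starts with whitespace, strsplit() returns an empty string as its first piece, and length() would count it as a word. A slightly more careful variant is sketched below. count_words_clean is a hypothetical name and the file content is invented for the example.

```r
# Hypothetical variant that ignores empty pieces
count_words_clean <- function(file_path) {
  lines <- readLines(file_path)
  # trimws() removes leading and trailing whitespace before splitting
  words <- unlist(strsplit(trimws(lines), "\\s+"))
  sum(nzchar(words))  # count only non-empty strings
}

# Temporary file with an indented line and a blank line
words_file <- tempfile(fileext = ".txt")
writeLines(c("  two words", "", "three more words"), words_file)

word_count <- count_words_clean(words_file)
word_count  # 5
```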
Plotting:
1. Histogram
Histogram of Student Scores: Create a histogram showing the distribution of student scores.
- Load ggplot2
- Use geom_histogram()
- Set appropriate binwidth
- Add proper labels
library(ggplot2)
ggplot(student_scores, aes(x = score)) +
  geom_histogram(binwidth = 5) +
  labs(title = "Histogram of Student Scores",
       x = "Score",
       y = "Frequency")
2. Create a Boxplot of Student Scores
Boxplot of Student Scores: Generate a boxplot to visualize the spread and central tendency of student scores.
Use ggplot() with geom_boxplot(). The data should be mapped to the y-axis since we want a vertical boxplot. Don’t forget to add appropriate labels.
ggplot(student_scores, aes(y = score)) +
  geom_boxplot() +
  labs(title = "Boxplot of Student Scores", y = "Score")
3. Create a Scatter Plot of Student Scores
Scatter Plot of Student Scores: Create a scatter plot to explore the relationship between student scores and student IDs.
Use ggplot() with geom_point(). Map student_id to x-axis and score to y-axis. Remember to include appropriate axis labels.
ggplot(student_scores, aes(x = student_id, y = score)) +
  geom_point() +
  labs(title = "Scatter Plot of Student Scores", x = "Student ID", y = "Score")