Analysis of Newspaper headlines

Author

Andrés A.

Published

May 2, 2024

Why this graph?

To analyze the tweets from www.20min.ch, we devised a controversial score. The premise is that a tweet with many replies relative to its likes is more controversial, whereas a tweet with many likes relative to its replies indicates broader agreement on the topic. We therefore used the following formula, which groups retweets with likes and adds 1 to the denominator to avoid division by zero:

\[ \text{Controversial Score} = \frac{\text{replies\_count}}{\text{likes\_count} + \text{retweets\_count} + 1} \]

Therefore:

As we move towards the right, tweets have more likes.
As we move towards the top, tweets become increasingly controversial.
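
To make the formula concrete, here is a minimal sketch in base R. The engagement counts and the helper name `controversial_score` are invented for illustration and do not come from the 20min.ch data.

```r
# Hypothetical engagement counts, invented purely for illustration.
tweets <- data.frame(
  replies_count  = c(40, 5, 0),
  likes_count    = c(9, 99, 19),
  retweets_count = c(0, 0, 0)
)

# Score as defined above; the +1 keeps the denominator non-zero.
# `controversial_score` is a hypothetical helper name.
controversial_score <- function(replies, likes, retweets) {
  replies / (likes + retweets + 1)
}

tweets$score <- with(
  tweets,
  controversial_score(replies_count, likes_count, retweets_count)
)
print(tweets)
```

The first tweet (40 replies, 9 likes) scores 4, the second (5 replies, 99 likes) scores 0.05, and the third scores 0. A score of exactly 0 has no finite position on a logarithmic axis.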

Limitations:

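Tweets with a low number of likes were excluded because their scores could not be calculated reliably: with so few interactions, a single reply shifts the ratio substantially.
Both axes use a logarithmic scale, which could lead to misinterpretation: equal distances on the plot correspond to equal ratios, not equal differences.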
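
Both limitations can be illustrated with a short base R sketch. The cut-off `min_likes` and the toy data are assumptions made for this example; the threshold actually used in the analysis is not stated here.

```r
# Hypothetical cut-off: the threshold actually used is not stated in the text.
min_likes <- 10

# Toy data, invented for illustration only.
tweets <- data.frame(
  likes_count    = c(0, 2, 50, 400),
  replies_count  = c(3, 1, 25, 20),
  retweets_count = c(0, 0, 9, 99)
)
tweets$score <- tweets$replies_count /
  (tweets$likes_count + tweets$retweets_count + 1)

# A tweet with 0 likes and 3 replies already scores 3, which says little
# given how few interactions it has. Keep only tweets above the cut-off.
kept <- subset(tweets, likes_count >= min_likes)
print(kept)

# On a log10 axis, each tenfold increase covers the same distance.
log_steps <- diff(log10(c(10, 100, 1000)))
print(log_steps)
```

With this toy data, only the last two tweets pass the filter. The final line shows that the steps from 10 to 100 and from 100 to 1000 are equally long on a log10 axis.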

Used tools/packages

plotly (ggplotly): The ggplotly() function from the plotly package converts ggplot2 objects into interactive plots.
DT: Facilitates the creation of interactive data tables in R, offering features like sorting and filtering.
crosstalk: Enables communication between interactive HTML widgets, allowing for linked brushing and filtering.
Quarto: An open-source publishing system that integrates code (R, among other languages), text, and output for reproducible reports and presentations.

Visual representation

Code

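library(dplyr)      # provides %>%
library(ggplot2)
library(plotly)     # ggplotly()
library(crosstalk)  # SharedData

# Wrap the data so the plot and the table share one selection state.
# Keys must identify rows uniquely; likes_count can repeat across tweets,
# so the default key (row names) is used instead.
df_shared <- SharedData$new(df)

p <- df_shared %>% 
  ggplot(aes(likes_count, score, color = score, 
             # Tooltip text shown on hover
             text = paste("Score:", round(score, 2), "\n",
                          "Likes:", round(likes_count, 2), "\n",
                          "Replies:", round(replies_count, 2), "\n",
                          tweet_content)
             )
  ) + 
  geom_point() +
  # Green = low score, red = high score, on a log10 color scale.
  # The `transform` argument requires ggplot2 >= 3.5.0 (older versions use `trans`).
  scale_color_gradient(low = "green", high = "red", transform = "log10") +
  scale_x_log10() + 
  scale_y_log10() + 
  labs(x = "Number of likes", 
       y = "Controversial Score")

# Convert to an interactive plot that shows only the custom tooltip
ggplotly(p, tooltip = "text")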

Figure 1: Visual representation of tweets

Tabular representation

Code

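# Interactive table linked to the plot through the same SharedData object
tbl <- df_shared %>% 
  DT::datatable(colnames = c("date", "score", "likes_count", "replies_count", "tweet", "URL"),
                options = list(paging = TRUE, 
                               pageLength = 20,
                               # "t" renders the table only; use "tp" to also show pagination controls
                               dom = "t",
                               scrollX = TRUE),
                rownames = FALSE,
                filter = "top",
                escape = FALSE)  # render HTML in cells (e.g. clickable links) instead of escaping it

tbl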

Table