1 Exploratory Data Analysis

1.1 Adjusted closing prices

Let us plot the adjusted closing prices…


fig_asml <- plot_ly(data = asml, x = ~Date, y = ~ASML.Adjusted, 
  type = 'scatter', mode = 'lines', name = 'ASML Adjusted',
  line = list(color = 'darkblue', width = 1.5))

fig_asml <- layout(
  fig_asml,
  title = "ASML Stock Price Evolution",
  xaxis = list(
    title = "Date",
    rangeslider = list(visible = TRUE),
    rangeselector = list(
      buttons = list(
        list(count=1, label="1m", step="month", stepmode="backward"),
        list(count=6, label="6m", step="month", stepmode="backward"),
        list(count=1, label="YTD", step="year", stepmode="todate"),
        list(count=1, label="1y", step="year", stepmode="backward"),
        list(step="all")
      )
    )
  )
)

fig_asml

ASML Daily Adjusted Closing Prices

The graph suggests that the price series is non-stationary: the mean is not constant (the series trends), and the variance appears to change with the price level. Consequently, standard statistical inference, which assumes stationarity, cannot be applied directly to prices.

1.2 Log returns

To obtain a stationary process, we transform prices into log-returns. Let \(P_t\) be the price at time \(t\). The simple net return is defined as:

\[R_t = \frac{P_t - P_{t-1}}{P_{t-1}}\]

However, in financial econometrics, log-returns (\(r_t\)) are preferred due to their time-additivity property. The log-return is defined as the natural logarithm of the gross return:

\[ r_t = \ln\left(\frac{P_t}{P_{t-1}}\right) = \ln(P_t) - \ln(P_{t-1}) \]
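Time-additivity can be made explicit with a short derivation. Writing \(r_t(k)\) for the \(k\)-period log-return (a notation introduced here for illustration only), the intermediate prices cancel inside the logarithm:

\[ r_t(k) = \ln\left(\frac{P_t}{P_{t-k}}\right) = \sum_{j=0}^{k-1} \ln\left(\frac{P_{t-j}}{P_{t-j-1}}\right) = \sum_{j=0}^{k-1} r_{t-j} \]

Simple returns compound multiplicatively instead, \(1 + R_t(k) = \prod_{j=0}^{k-1} (1 + R_{t-j})\). The two definitions are linked by \(r_t = \ln(1 + R_t)\), so they nearly coincide when \(R_t\) is small.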

We compute the log-returns for ASML in R:


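# Daily log-returns: first difference of log prices (one fewer observation than prices)
log_ret_vec <- diff(log(asml$ASML.Adjusted))

log_returns <- data.frame(
  Date = asml$Date[-1],
  LogReturns = as.numeric(log_ret_vec)
)

# Drop non-finite values together with their dates so both columns stay aligned
log_returns <- log_returns[is.finite(log_returns$LogReturns), ]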

Let’s visualize the log-returns of ASML:


fig_log <- plot_ly(data = log_returns, x = ~Date, y = ~LogReturns, type = 'scatter',
  mode = 'lines', name = 'Log Returns', line = list(color = 'darkred', width = 1))

fig_log <- layout(
  fig_log,
  title = "ASML Log-Returns",
  xaxis = list(title = "Date"),
  yaxis = list(title = "Log Return"),
  shapes = list(
    list(
      type = "line",
      x0 = min(log_returns$Date),
      x1 = max(log_returns$Date),
      y0 = 0, 
      y1 = 0,
      line = list(color = "black", width = 1)
    )
  )
)

fig_log

ASML Daily Log-Returns

Unlike prices, the log-returns oscillate around a constant mean close to zero. This behavior is consistent with a stationary series, which is the condition required for the econometric analysis that follows.
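The visual impression can be backed by a formal unit-root test. The sketch below is an illustration on simulated data, not on the ASML series: it builds a random walk in log-prices and applies the Phillips-Perron test (`PP.test()` from the `stats` package that ships with R) to both the log-prices and their returns. The names `sim_returns` and `sim_log_prices` are assumptions made for this example.

```r
# Simulated data only: a geometric random walk standing in for a price series.
set.seed(42)
n <- 1000
sim_returns <- rnorm(n, mean = 0, sd = 0.02)      # i.i.d. log-returns
sim_log_prices <- log(100) + cumsum(sim_returns)  # log-prices follow a random walk

# Phillips-Perron test (base R, stats package).
# H0: the series has a unit root; H1: the series is stationary.
pp_prices <- PP.test(sim_log_prices)
pp_returns <- PP.test(sim_returns)

data.frame(
  Series = c("Log-prices", "Log-returns"),
  Statistic = round(unname(c(pp_prices$statistic, pp_returns$statistic)), 2),
  P_Value = round(c(pp_prices$p.value, pp_returns$p.value), 3)
)
```

On real data the same calls would take `log(asml$ASML.Adjusted)` and `log_returns$LogReturns` as inputs. A small p-value for the returns rejects the unit-root hypothesis, in line with the reading of the plot.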

1.3 Distribution analysis

We analyze the distribution of log-returns to check for deviations from normality (e.g., fat tails or asymmetry).

1.3.1 Graphical representation

1.3.1.1 Histogram

A histogram of the returns observed over the sample period gives a first indication of the shape of the probability distribution that generated them.


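# Sample moments for the Normal overlay
mu_ret <- mean(log_returns$LogReturns)
sd_ret <- sd(log_returns$LogReturns)

# Fit a standardized Student-t (fGarch) to percentage returns;
# location and scale are divided by 100 below to return to the original scale
t_fit_fGarch <- stdFit(log_returns$LogReturns * 100)
t_params <- t_fit_fGarch$par
t_label <- sprintf("Student-t (df=%.2f)", t_params["nu"])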

fig_dist <- plot_ly(data = log_returns, x = ~LogReturns,
  type = "histogram", name = "Log Returns", histnorm = "probability density",
  marker = list(color = "lightgray", line = list(color = "gray", width = 1)),
  opacity = 0.7)

x_seq <- seq(min(log_returns$LogReturns), max(log_returns$LogReturns), length.out = 500)

y_norm <- dnorm(x_seq, mean = mu_ret, sd = sd_ret)
y_t <- dstd(
  x_seq,
  mean = t_params["mean"]/100,
  sd = t_params["sd"]/100,
  nu = t_params["nu"]
)

fig_dist <- fig_dist %>%
  add_lines(
    x = x_seq,
    y = y_norm,
    name = "Normal Distribution",
    line = list(color = "#E74C3C", width = 2, dash = "dash"),
    inherit = FALSE
  ) %>% 
  
  add_lines(
    x = x_seq,
    y = y_t,
    name = t_label,
    line = list(color = "#2E86C1", width = 2),
    inherit = FALSE
  ) %>% 
  
  layout(
    title = "Distribution of ASML Log-Returns",
    xaxis = list(title = "Log Return"),
    yaxis = list(title = "Density"),
    legend = list(x = 0.8, y = 0.9),
    hovermode = "x unified"
  )

fig_dist

Histogram of ASML Log-Returns

The histogram indicates that the empirical distribution of log-returns is better described by the fitted Student-t density than by the Normal density.
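The division by 100 in the `dstd()` call follows from a change of variables. The Student-t is fitted to percentage returns \(y = 100\,x\), while the histogram is drawn on the original scale \(x\); the source does not state why the fit uses percentages, and numerical stability of the optimizer is a plausible but assumed reason. If \(f_Y\) denotes the fitted density with location \(\mu\), scale \(\sigma\) and \(\nu\) degrees of freedom, then

\[ f_X(x) = 100\, f_Y(100\,x) \]

Because the standardized Student-t is a location-scale family, this is the same density evaluated with location \(\mu/100\) and scale \(\sigma/100\), with \(\nu\) unchanged.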

1.3.1.2 Q-Q plot


vec_ret <- log_returns$LogReturns

qq_vals <- qqnorm(vec_ret, plot.it = FALSE)
qq_data <- data.frame(
  Theoretical = qq_vals$x,
  Sample = qq_vals$y
)

y <- quantile(vec_ret, c(0.25, 0.75), names = FALSE)
x <- qnorm(c(0.25, 0.75))
slope <- diff(y)/diff(x)
int <- y[1L] - slope * x[1L]

fig_qq <- plot_ly(data = qq_data, x = ~Theoretical, y = ~Sample, type = 'scatter',
  mode = 'markers', marker = list(size = 3, color = '#2E86C1', opacity = 0.6),
  name = "Returns")

fig_qq <- fig_qq %>%
  add_lines(
    x = ~Theoretical,
    y = ~Theoretical * slope + int,
    line = list(color = "#E74C3C", width = 2, dash = "dash"),
    name = "Normal Reference",
    inherit = FALSE
  ) %>%
  layout(
    title = "Q-Q Plot: Normal vs Empirical",
    xaxis = list(title = "Theoretical Quantiles (Normal)"),
    yaxis = list(title = "Sample Quantiles (ASML)")
  )

fig_qq

Q-Q Plot of ASML Log-Returns

The Q-Q plot confirms the findings from the histogram. While the central observations (the body of the distribution) align well with the theoretical normal line (in red), the extreme values at both ends deviate significantly, forming an ‘S-shape’. This provides clear visual evidence of fat tails (leptokurtosis), reinforcing the stylized fact that extreme market movements occur more frequently than predicted by a Gaussian model.
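The claim about extreme movements can also be expressed as a number. The sketch below uses simulated Student-t draws (not ASML data) and compares the share of standardized observations beyond three standard deviations with the share a Gaussian would imply. The variable names and the choice of 5 degrees of freedom are assumptions made for illustration.

```r
# Simulated data only: heavy-tailed draws standing in for daily returns.
set.seed(1)
sim_ret <- rt(1e5, df = 5)                 # Student-t with 5 degrees of freedom

# Standardize, then measure how often observations exceed 3 standard deviations.
z <- (sim_ret - mean(sim_ret)) / sd(sim_ret)
tail_empirical <- mean(abs(z) > 3)         # observed share of extreme values
tail_normal <- 2 * pnorm(-3)               # share implied by a Gaussian (about 0.27%)

c(empirical = tail_empirical, normal = tail_normal)
```

Replacing `sim_ret` with `log_returns$LogReturns` would give the corresponding figure for ASML; an empirical share well above the Gaussian one is the numerical counterpart of the S-shape in the Q-Q plot.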

1.3.2 Synthetic indicators


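# Note: kurtosis() is package-dependent. moments::kurtosis() returns raw kurtosis
# (3 for a normal); e1071 and timeDate default to excess kurtosis (0 for a normal).
desc_stats <- data.frame(
  Metric = c("Mean", "Std. Dev.", "Skewness", "Kurtosis"),
  Value = c(
    mean(log_returns$LogReturns),
    sd(log_returns$LogReturns),
    skewness(log_returns$LogReturns),
    kurtosis(log_returns$LogReturns)
  )
)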

datatable(
  desc_stats, 
  options = list(dom = 't', paging = FALSE), 
  rownames = FALSE
  ) %>% formatRound('Value', digits = 5)

The distribution has a mean close to zero, slight negative skewness, and pronounced leptokurtosis (fat tails). These characteristics match the well-known stylized facts of financial log-returns and point to a departure from normality.
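Because the source does not say which package supplies `skewness()` and `kurtosis()`, it can help to see the underlying moment formulas. The sketch below defines hypothetical helpers (not taken from any package) and evaluates them on toy data, showing both the raw and the excess kurtosis conventions.

```r
# Hypothetical helpers (not from any package): moment-based shape statistics.
central_moment <- function(x, k) mean((x - mean(x))^k)

skew_by_hand <- function(x) central_moment(x, 3) / central_moment(x, 2)^1.5
kurt_raw     <- function(x) central_moment(x, 4) / central_moment(x, 2)^2  # 3 under normality
kurt_excess  <- function(x) kurt_raw(x) - 3                                # 0 under normality

x <- c(-2, -1, 0, 1, 2)  # toy data, symmetric around zero
c(skewness = skew_by_hand(x), raw = kurt_raw(x), excess = kurt_excess(x))
```

Comparing `kurt_raw(log_returns$LogReturns)` with the table value would show which convention the reported kurtosis follows. Small differences can remain, since some packages apply a sample-size correction.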

1.3.3 Test of normality

To formally test the normality hypothesis, we employ the Jarque-Bera test, which checks whether the sample skewness and kurtosis jointly match those of a normal distribution.

\[ H_0: \text{The data is normally distributed (Skewness=0, Kurtosis=3)} \]

\[ H_1: \text{The data is not normally distributed} \]
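For reference, with \(n\) observations, sample skewness \(S\) and sample kurtosis \(K\) (symbols introduced here, not used elsewhere in the text), the test statistic in its standard form is

\[ JB = \frac{n}{6}\left(S^2 + \frac{(K-3)^2}{4}\right) \]

Under \(H_0\), \(JB\) is asymptotically \(\chi^2\)-distributed with 2 degrees of freedom, so large values of the statistic lead to rejection.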


jb_test <- jarque.bera.test(log_returns$LogReturns)

# Collect the test output in a one-row table; unname() drops the "X-squared" label
jb_res <- data.frame(
  Test = "Jarque-Bera",
  Statistic = round(unname(jb_test$statistic), 2),
  P_Value = ifelse(jb_test$p.value < 0.001, "< 0.001", round(jb_test$p.value, 4)),
  Result = ifelse(jb_test$p.value < 0.05, "Reject H0", "Fail to Reject H0")
)
datatable(jb_res, options = list(dom = 't'), rownames = FALSE)

Since the p-value is below 0.001, we reject the null hypothesis of normality at any conventional significance level. This confirms that ASML log-returns are not normally distributed.
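To see what the test computes, the statistic can be reproduced by hand as \(n/6\) times \(S^2 + (K-3)^2/4\), with the p-value taken from a chi-squared distribution with 2 degrees of freedom. The function `jb_by_hand()` below is a hypothetical helper written for this illustration, and the toy vector is not ASML data.

```r
# Hypothetical helper: Jarque-Bera statistic from sample moments (divisor n).
jb_by_hand <- function(x) {
  x <- x[is.finite(x)]
  n <- length(x)
  d <- x - mean(x)
  m2 <- mean(d^2)
  S <- mean(d^3) / m2^1.5   # sample skewness
  K <- mean(d^4) / m2^2     # sample raw kurtosis (3 under normality)
  stat <- n / 6 * (S^2 + (K - 3)^2 / 4)
  p_value <- pchisq(stat, df = 2, lower.tail = FALSE)
  list(statistic = stat, p_value = p_value)
}

toy <- c(-2, -1, 0, 1, 2)   # toy data, symmetric around zero
res <- jb_by_hand(toy)
res
```

Calling `jb_by_hand(log_returns$LogReturns)` should give a value close to the one reported by `jarque.bera.test()`, assuming that function uses the same moment definitions.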