Time Series Analysis with Pandas

Time series data is ubiquitous in many fields, such as finance, economics, environmental science, and engineering. Pandas provides powerful tools to handle, analyze, and visualize time series data effectively. This tutorial will cover:

  • Creating and manipulating time series data
  • Resampling and frequency conversion
  • Time-based indexing and slicing
  • Rolling and expanding windows
  • Time series visualization

Creating Time Series Data

We’ll start by creating a sample time series DataFrame to demonstrate these concepts.

Creating a Date Range

  import pandas as pd
import numpy as np

# Creating a date range
date_range = pd.date_range(start='2023-01-01', periods=100, freq='D')
print("Date Range:")
print(date_range)
  

Creating a Time Series DataFrame

  # Creating a DataFrame with the date range as the index
data = np.random.randn(100)
df = pd.DataFrame(data, index=date_range, columns=['Value'])
print("\nTime Series DataFrame:")
print(df.head())
  

Resampling and Frequency Conversion

Resampling involves changing the frequency of your time series observations. This can be useful for down-sampling (reducing the frequency) or up-sampling (increasing the frequency).

Down-sampling

  # Down-sampling to monthly frequency
monthly_df = df.resample('M').mean()
print("\nDown-sampled to Monthly Frequency:")
print(monthly_df)
  

Up-sampling

  # Up-sampling to hourly frequency and filling missing values with the previous value
hourly_df = df.resample('H').ffill()
print("\nUp-sampled to Hourly Frequency:")
print(hourly_df.head())
  

Time-based Indexing and Slicing

Time-based indexing and slicing allow you to access specific subsets of your time series data based on time criteria.

Indexing

  # Accessing data for a specific date
specific_date = df.loc['2023-02-01']
print("\nData for 2023-02-01:")
print(specific_date)
  

Slicing

  # Slicing data for a specific date range
date_range_slice = df.loc['2023-02-01':'2023-02-10']
print("\nData from 2023-02-01 to 2023-02-10:")
print(date_range_slice)
  

Rolling and Expanding Windows

Rolling and expanding windows provide methods to calculate statistics over a sliding window or an expanding window.

Rolling Window

  # Calculating the rolling mean with a window of 7 days
rolling_mean = df.rolling(window=7).mean()
print("\nRolling Mean with a Window of 7 Days:")
print(rolling_mean.head(10))
  

Expanding Window

  # Calculating the expanding mean
expanding_mean = df.expanding().mean()
print("\nExpanding Mean:")
print(expanding_mean.head(10))
  

Time Series Visualization

Visualizing time series data is crucial for understanding trends, patterns, and anomalies.

Plotting Time Series Data

  import matplotlib.pyplot as plt

# Plotting the original time series data
df.plot(title='Original Time Series Data')
plt.show()

# Plotting the rolling mean
rolling_mean.plot(title='Rolling Mean with a Window of 7 Days')
plt.show()

# Plotting the expanding mean
expanding_mean.plot(title='Expanding Mean')
plt.show()

# Plotting the down-sampled data
monthly_df.plot(title='Monthly Mean')
plt.show()
  

This tutorial provides a comprehensive introduction to time series analysis using Pandas, demonstrating how to create, manipulate, and visualize time series data effectively. By mastering these techniques, you can leverage the power of Pandas to perform advanced time series analysis and gain valuable insights from your data.

Learn How To Build AI Projects

Now, if you are interested in upskilling in 2024 with AI development, check out this 6 AI advanced projects with Golang where you will learn about building with AI and getting the best knowledge there is currently. Here’s the link.

Last updated 17 Aug 2024, 12:31 +0200 . history