Resample Function in Pandas

Getting your Trinity Audio player ready...

When you work with time-based data like sales reports, attendance logs, website traffic, or MIS dashboards, one common question always comes up:

“How do I summarize this data by day, week, month, quarter, or year?”

This is exactly where df.resample() in Pandas becomes extremely powerful.

Resample Function in Pandas

The resample() function allows you to group data by time and then apply calculations like sum, mean, count, max, min, etc. It works very similar to Pivot Tables in Excel or GROUP BY Date in SQL — but in a much more flexible way.

In this article, we’ll understand:

  • What df.resample() is
  • How it works step by step
  • Weekly, monthly, quarterly, and yearly resampling
  • The correct usage of MS, QS, QE (since M and Q are being deprecated)
  • Real-life sales examples
  • Common mistakes and best practices

All explanations are in easy, practical language, perfect for MIS, Data Analyst, and Python beginners.


What is df.resample() in Pandas?

df.resample() is used to change the frequency of time-series data.

In simple words:

  • It groups data based on time
  • Then applies an aggregation function like sum(), mean(), count(), etc.

👉 It works only when:

  • Your DataFrame has a DateTime index, OR
  • You explicitly pass a date column using on='Date'

Basic Syntax of resample()

df.resample(rule).aggregation()

Or when the date is a column:

df.resample(rule, on='Date').aggregation()

Understanding Resample Rules (Very Important)

RuleMeaning
DDaily
WWeekly
MSMonth Start
QSQuarter Start
QEQuarter End
YSYear Start
YEYear End

⚠️ Important Note

  • M (Month End) and Q (Quarter End) are being deprecated
  • Always use MS, QS, QE for future-proof code

Sample Dataset (Sales Example)

Assume you have a sales dataset like this:

DateProductQuantitySales
2023-01-01Laptop11200
2023-01-03Mouse250
2023-01-08Keyboard185
2023-01-15Monitor1450
2023-02-02Laptop11200
2023-02-10Mouse375

Read More: Basic Functions in Pandas


Convert Date Column to DateTime (Must Do)

df['Date'] = pd.to_datetime(df['Date'])

Without this step, resample() will not work properly.


Example 1: Weekly Sales Quantity (Most Common Use Case)

Requirement

Show total quantity sold per week.

Code

df.resample("W", on='Date')['Quantity'].sum()

Explanation

  • "W" → Weekly grouping
  • on='Date' → Date column used for grouping
  • ['Quantity'] → Column to aggregate
  • sum() → Total quantity per week

Output (Conceptual)

Week EndingQuantity
2023-01-084
2023-01-152
2023-02-124

👉 This is very useful in weekly MIS reports.


Example 2: Monthly Sales using MS (Month Start)

Why use MS instead of M?

  • M (Month End) is being deprecated
  • MS = Month Start (recommended)

Code

df.resample("MS", on='Date')['Sales'].sum()

Output

MonthTotal Sales
2023-01-011785
2023-02-011275

Example 3: Monthly Average Sales

df.resample("MS", on='Date')['Sales'].mean()

✔ Useful for:

  • Average revenue tracking
  • Performance analysis

Example 4: Quarterly Sales using QE (Quarter End)

Since Q is deprecated, always use QE.

df.resample("QE", on='Date')['Sales'].sum()

When to use QE?

  • Financial reports
  • Quarterly business reviews
  • Management dashboards

Example 5: Yearly Sales Summary

df.resample("YS", on='Date')['Sales'].sum()

✔ Common in:

  • Annual MIS reports
  • Year-on-year analysis

Resample with Multiple Aggregations

You can calculate multiple metrics at once:

df.resample("MS", on='Date').agg({
    'Sales': 'sum',
    'Quantity': 'sum'
})

Resample vs Groupby – What’s the Difference?

Featureresample()groupby()
Time-based grouping✅ Best⚠️ Manual
Handles missing dates✅ Yes❌ No
Time-series friendly
Non-date grouping

👉 Use resample when time is involved.


Common Mistakes to Avoid

❌ Date column not in datetime format
❌ Using deprecated rules like M, Q
❌ Forgetting on='Date'
❌ Using resample for non-time data


Best Practices for Using resample()

✅ Always convert date columns using pd.to_datetime()
✅ Use MS, QS, QE instead of deprecated rules
✅ Combine with .agg() for MIS-style summaries
✅ Keep date column clean (no text, no nulls)


Real-Life MIS Use Cases

  • Weekly ticket count reports
  • Monthly sales performance dashboards
  • Quarterly finance summaries
  • Yearly growth analysis
  • Attendance tracking

Challenges with resample()

  • Requires clean date data
  • Not suitable for categorical grouping
  • Beginners often confuse it with groupby

👉 Once understood, it becomes one of the most powerful Pandas tools.


Conclusion

The df.resample() function is a must-know skill for anyone working with time-based data in Pandas. Whether you are building MIS reports, sales dashboards, or analytical summaries, resampling helps you turn raw daily data into meaningful insights.

By using future-proof rules like MS and QE, you ensure your code remains stable even as Pandas evolves.

If you are serious about Data Analysis, MIS, or Python reporting, mastering resample() will give you a strong edge.

FAQs – Resample Function in Pandas

For time-series data, yes. It is optimized for date-based grouping.

To make time rules more explicit. MS, QS, and QE clearly define start and end periods.

No. Resample is meant for numeric aggregation.

Yes. If your Date column is already set as the DataFrame index, you can use resample() directly without mentioning on='Date'.

Step 1: Convert Date to datetime

df['Date'] = pd.to_datetime(df['Date'])

Step 2: Set Date as index

df = df.set_index('Date')

Step 3: Resample without on='Date'

df.resample("W")['Quantity'].sum()

What’s Next?

In the next post, we’ll learn about the Merge Function in Pandas

Spread the love

Leave a Comment

Your email address will not be published. Required fields are marked *

Translate »
Scroll to Top