Basic Functions in Pandas

Pandas is the foundation of data analysis in Python. Whether you are analyzing MIS reports, cleaning datasets, preparing dashboards, or building machine learning models, you will use Pandas at nearly every step.

But here’s the problem most beginners face:

There are hundreds of functions in Pandas. Which ones are truly important?

In this guide, we’ll explore the basic yet powerful Pandas functions that you will use daily. Everything is explained in a simple, conversational tone that suits both beginners and intermediate learners.


Below is a quick index of the functions we will cover:

  • Data Creation Functions
  • Basic Exploration Functions
  • Selection & Filtering Functions
  • Statistical Functions
  • String/Text Functions
  • Handling Missing Values
  • Sorting
  • Grouping
  • Merging & Joining
  • Apply & Lambda
  • Exporting Data

Each section includes short examples. Real-life use cases, common challenges, and best practices follow at the end.


1. Creating Data in Pandas

Before learning any function, you must know how to create data.

1.1 Creating Series

import pandas as pd

s = pd.Series([10, 20, 30])

1.2 Creating DataFrame

data = {
    "Name": ["Amit", "Priya", "John"],
    "Marks": [85, 92, 78],
    "City": ["Delhi", "Mumbai", "Kolkata"]
}

df = pd.DataFrame(data)


2. Basic Data Exploration Functions

These functions help you understand your dataset quickly.

2.1 head() – Show top rows

df.head()

Useful when you want a quick preview.

2.2 tail() – Show bottom rows

df.tail()

2.3 shape – Row & column count

df.shape

Output Example:

(3, 3)

2.4 info() – Data summary

df.info()

Gives:

  • Column names
  • Data types
  • Null counts

2.5 describe() – Statistics summary

df.describe()

Real-life use:
Check the distribution of marks, salaries, sales numbers, etc.
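
The exploration functions above can be tried together on the small DataFrame from section 1.2. This is a minimal sketch; the variable name summary is only an illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Amit", "Priya", "John"],
    "Marks": [85, 92, 78],
    "City": ["Delhi", "Mumbai", "Kolkata"],
})

print(df.head(2))   # first two rows (head() alone shows up to 5)
print(df.tail(1))   # last row
print(df.shape)     # (3, 3): 3 rows, 3 columns
df.info()           # prints column names, dtypes, and non-null counts

# describe() summarizes numeric columns only by default,
# so here it reports on "Marks" and skips "Name" and "City".
summary = df.describe()
print(summary.loc["mean", "Marks"])  # 85.0
```

Note that shape is an attribute, so it is written without parentheses, while head(), tail(), info(), and describe() are methods.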


3. Selecting & Filtering Data

3.1 Selecting a Column

df["Name"]

3.2 Selecting Multiple Columns

df[["Name", "Marks"]]

3.3 Filtering Rows Using Condition

df[df["Marks"] > 80]

3.4 Using Multiple Conditions

df[(df["Marks"] > 80) & (df["City"] == "Delhi")]

3.5 loc – Select by labels

df.loc[0:1, ["Name", "Marks"]]   # label slice: the end label 1 is included

3.6 iloc – Select by positions

df.iloc[0:2, 0:2]   # position slice: the end position 2 is excluded
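
The difference between loc and iloc is easiest to see side by side. The sketch below reuses the DataFrame from section 1.2; the variable names by_label, by_position, and named are illustrative, and the Name-indexed copy is added only to make the label behaviour visible.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Amit", "Priya", "John"],
    "Marks": [85, 92, 78],
    "City": ["Delhi", "Mumbai", "Kolkata"],
})

# loc slices by label and INCLUDES the end label.
by_label = df.loc[0:1, ["Name", "Marks"]]

# iloc slices by integer position and EXCLUDES the end position.
by_position = df.iloc[0:2, 0:2]

print(by_label.equals(by_position))  # True: both select rows 0-1, Name and Marks

# With a non-numeric index the difference is more obvious.
named = df.set_index("Name")
print(named.loc["Amit":"Priya"])  # two rows: Amit and Priya
print(named.iloc[0:1])            # one row: Amit
```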

4. Statistical Functions

These are used in almost all analytical tasks.

4.1 SUM

df["Marks"].sum()

4.2 COUNT

df["City"].count()   # counts non-missing values only

4.3 AVERAGE/MEAN

df["Marks"].mean()

4.4 MIN & MAX

df["Marks"].min()
df["Marks"].max()

4.5 MEDIAN

df["Marks"].median()
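
Here is a small sketch of these statistical functions with their results. It uses the marks from section 1.2, but one city is set to missing (an assumption made for this example) to show that count() skips missing values.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Amit", "Priya", "John"],
    "Marks": [85, 92, 78],
    "City": ["Delhi", None, "Kolkata"],  # one missing city, added for illustration
})

print(df["Marks"].sum())     # 255
print(df["Marks"].mean())    # 85.0
print(df["Marks"].median())  # 85.0
print(df["Marks"].min(), df["Marks"].max())  # 78 92

# count() ignores missing values; len() counts every row.
print(df["City"].count())  # 2
print(len(df))             # 3

# agg() computes several statistics in a single call.
stats = df["Marks"].agg(["sum", "mean", "min", "max", "median"])
print(stats)
```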

5. String / Text Functions

Useful for cleaning messy datasets.

5.1 Lowercase, Uppercase, Title Case

df["Name"].str.lower()
df["Name"].str.upper()
df["Name"].str.title()

5.2 Trim Extra Spaces

df["Name"].str.strip()

5.3 Replace Text

df["City"].str.replace("Delhi", "New Delhi")

5.4 Length

df["Name"].str.len()

5.5 Extract Substring

df["Name"].str[:3]
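
The string functions can be chained to clean a messy column in one step. The messy values below are made up for this sketch, and the names substring_replace and whole_value_replace are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["  amit ", "PRIYA", "john  "],       # messy sample values
    "City": ["Delhi", "Mumbai", "New Delhi"],
})

# Trim spaces, then convert to title case, and store the result.
df["Name"] = df["Name"].str.strip().str.title()
print(df["Name"].tolist())            # ['Amit', 'Priya', 'John']
print(df["Name"].str.len().tolist())  # [4, 5, 4]
print(df["Name"].str[:3].tolist())    # ['Ami', 'Pri', 'Joh']

# str.replace() works on substrings, so it also changes "New Delhi".
substring_replace = df["City"].str.replace("Delhi", "New Delhi")
print(substring_replace.tolist())     # ['New Delhi', 'Mumbai', 'New New Delhi']

# Series.replace() matches whole values only.
whole_value_replace = df["City"].replace("Delhi", "New Delhi")
print(whole_value_replace.tolist())   # ['New Delhi', 'Mumbai', 'New Delhi']
```

String methods return a new Series, so assign the result back to the column when you want to keep it.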

6. Handling Missing Values

Real datasets often contain missing values.

6.1 Check Missing Values

df.isnull().sum()

6.2 Fill Missing Values

df.fillna(0)   # returns a new DataFrame; assign the result to keep it

6.3 Drop Missing Rows

df.dropna()   # returns a new DataFrame without the rows that have missing values
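
A minimal sketch of the three steps on a DataFrame with gaps. The missing values and the fill choices (column mean for Marks, "Unknown" for City) are assumptions for this example, not a rule.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Amit", "Priya", "John"],
    "Marks": [85, None, 78],            # Priya's marks are missing
    "City": ["Delhi", "Mumbai", None],  # John's city is missing
})

# Count missing values per column.
print(df.isnull().sum())  # Name 0, Marks 1, City 1

# fillna(0) would also put 0 in the City column, so it is often
# better to choose a fill value per column.
filled = df.fillna({"Marks": df["Marks"].mean(), "City": "Unknown"})
print(filled["Marks"].tolist())  # [85.0, 81.5, 78.0]

# dropna() keeps only the rows that are complete.
dropped = df.dropna()
print(len(dropped))  # 1

# Neither call changed the original DataFrame.
print(df["Marks"].isnull().sum())  # 1
```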

7. Sorting Data

7.1 Sort by Column

df.sort_values("Marks")

7.2 Sort in Descending

df.sort_values("Marks", ascending=False)
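
The sketch below shows what sorting does to the row labels and how to sort by more than one column. The names ranked and by_city_then_marks are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Amit", "Priya", "John"],
    "Marks": [85, 92, 78],
    "City": ["Delhi", "Mumbai", "Kolkata"],
})

# sort_values() returns a sorted copy; the original order is untouched.
ranked = df.sort_values("Marks", ascending=False)
print(ranked["Name"].tolist())  # ['Priya', 'Amit', 'John']
print(ranked.index.tolist())    # [1, 0, 2]: row labels move with their rows

# Renumber the rows 0, 1, 2 after sorting if you need a clean index.
ranked = ranked.reset_index(drop=True)

# Sort by City (A to Z), then by Marks (highest first) within each city.
by_city_then_marks = df.sort_values(["City", "Marks"], ascending=[True, False])
print(by_city_then_marks["City"].tolist())  # ['Delhi', 'Kolkata', 'Mumbai']
```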

8. Grouping Data (groupby)

This is extremely powerful in business reporting.

Example Dataset

Name    City     Sales
Amit    Delhi    50000
Priya   Delhi    65000
John    Mumbai   45000

8.1 Group by City

df.groupby("City")["Sales"].sum()

Real-Life Use Cases

  • Total sales per city
  • Tickets resolved per agent
  • Student marks per class
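
The example dataset above can be built and grouped as follows. This is a sketch; the variable names sales, total_by_city, and summary are illustrative.

```python
import pandas as pd

sales = pd.DataFrame({
    "Name": ["Amit", "Priya", "John"],
    "City": ["Delhi", "Delhi", "Mumbai"],
    "Sales": [50000, 65000, 45000],
})

# Total sales per city.
total_by_city = sales.groupby("City")["Sales"].sum()
print(total_by_city["Delhi"])   # 115000
print(total_by_city["Mumbai"])  # 45000

# Several statistics per city in one call.
summary = sales.groupby("City")["Sales"].agg(["sum", "mean", "count"])
print(summary)
```

The result of groupby(...).sum() is a Series indexed by city. Call reset_index() on it if you want the cities back as a regular column.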

9. Merging & Joining DataFrames

9.1 Merge Using Common Column

pd.merge(df1, df2, on="ID")   # inner join by default: keeps only IDs found in both tables

9.2 Left Merge

pd.merge(df1, df2, on="ID", how="left")

Real-Life Use Case

  • Join employee table with salary table
  • Combine order data with customer data
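
To see how the merge type changes the result, here is a sketch with two small tables. The employees and salaries DataFrames and their values are invented for this example.

```python
import pandas as pd

employees = pd.DataFrame({"ID": [1, 2, 3], "Name": ["Amit", "Priya", "John"]})
salaries = pd.DataFrame({"ID": [1, 2, 4], "Salary": [50000, 65000, 45000]})

# Default (inner) merge: only IDs present in BOTH tables survive.
inner = pd.merge(employees, salaries, on="ID")
print(inner["ID"].tolist())  # [1, 2]

# Left merge: every employee is kept; a missing salary becomes NaN.
left = pd.merge(employees, salaries, on="ID", how="left")
print(left["ID"].tolist())   # [1, 2, 3]
print(left)
```

John (ID 3) has no salary record, so he disappears from the inner merge but stays in the left merge with a missing Salary.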

10. Using apply() and lambda

Very useful for custom calculations applied to each value in a column.

Example: Add 10 bonus marks

df["Bonus"] = df["Marks"].apply(lambda x: x + 10)
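
For simple arithmetic like the bonus example, plain column arithmetic gives the same result as apply() and is usually faster. apply() is more helpful when the logic needs branching. The grade() function and its cut-offs below are hypothetical, chosen only for this sketch.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Amit", "Priya", "John"],
    "Marks": [85, 92, 78],
})

# Same result two ways: apply() with a lambda, and column arithmetic.
df["Bonus"] = df["Marks"].apply(lambda x: x + 10)
df["Bonus_vectorized"] = df["Marks"] + 10
print(df["Bonus"].equals(df["Bonus_vectorized"]))  # True


def grade(marks):
    """Hypothetical grading rule used only for this example."""
    if marks >= 90:
        return "A"
    if marks >= 80:
        return "B"
    return "C"


# apply() calls grade() once for each value in the column.
df["Grade"] = df["Marks"].apply(grade)
print(df["Grade"].tolist())  # ['B', 'A', 'C']
```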

11. Exporting Data

To CSV

df.to_csv("output.csv", index=False)

To Excel

df.to_excel("output.xlsx", index=False)   # needs an Excel writer package such as openpyxl installed
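
A quick way to check an export is to read the file back. The sketch below writes the CSV to a temporary folder (an assumption, so the example leaves no files behind) and compares the result with the original.

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({
    "Name": ["Amit", "Priya", "John"],
    "Marks": [85, 92, 78],
    "City": ["Delhi", "Mumbai", "Kolkata"],
})

with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = os.path.join(tmp_dir, "output.csv")

    # index=False leaves the row labels (0, 1, 2) out of the file.
    df.to_csv(csv_path, index=False)

    # Read the file back to confirm what was written.
    round_trip = pd.read_csv(csv_path)

print(round_trip.columns.tolist())  # ['Name', 'Marks', 'City']
print(round_trip)
```

Without index=False, the row labels are written as an extra unnamed column, which then shows up when the file is read again.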

Real-Life Examples: How Pandas Basic Functions Help You

Example 1: MIS Ticket Analysis

  • Use count() to count total tickets
  • Use groupby() to find tickets by status
  • Use merge() to join ticket & user tables
  • Use sort_values() to sort by priority

Example 2: Sales Report

  • Use sum() for total sales
  • Use mean() for average monthly sales
  • Use fillna() to fix missing data
  • Use apply() to calculate commission

Example 3: HR Attendance Sheet

  • Use loc to extract employees from a specific department
  • Use value_counts() to count Present/Absent
  • Use replace() to clean text values
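
The attendance steps can be sketched as below. The column names (Department, Status) and all values are hypothetical, since the source does not show the actual sheet.

```python
import pandas as pd

attendance = pd.DataFrame({
    "Name": ["Amit", "Priya", "John", "Sara"],
    "Department": ["HR", "IT", "HR", "IT"],
    "Status": ["Present", "P", "Absent", "Present"],  # "P" is an inconsistent entry
})

# Clean text values: map the short code to the full word.
attendance["Status"] = attendance["Status"].replace({"P": "Present"})

# Extract employees from one department with loc.
hr_only = attendance.loc[attendance["Department"] == "HR", ["Name", "Status"]]
print(hr_only["Name"].tolist())  # ['Amit', 'John']

# Count Present/Absent.
counts = attendance["Status"].value_counts()
print(counts["Present"], counts["Absent"])  # 3 1
```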

Use Cases: Where Are These Functions Used?

1. Business Analytics

  • Monthly performance analysis
  • Revenue forecasting
  • KPI dashboards

2. Data Cleaning

  • Remove duplicates
  • Fix missing values
  • Standardize text
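
Removing duplicates is not covered in the sections above, so here is a minimal sketch using drop_duplicates(). The repeated row is invented for this example.

```python
import pandas as pd

raw = pd.DataFrame({
    "Name": ["Amit", "Amit", "Priya", "John"],
    "City": ["Delhi", "Delhi", "Mumbai", "Kolkata"],
})

# duplicated() marks rows that repeat an earlier row exactly.
print(raw.duplicated().sum())  # 1

# drop_duplicates() keeps the first copy of each row.
clean = raw.drop_duplicates().reset_index(drop=True)
print(clean["Name"].tolist())  # ['Amit', 'Priya', 'John']
```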

3. Machine Learning

  • Feature engineering
  • Dataset preparation
  • Target/feature separation

Challenges Beginners Face

❌ Getting confused between loc and iloc
❌ Not understanding data types
❌ Using the wrong merge type
❌ Forgetting to handle missing values
❌ Not using groupby efficiently


Best Practices

✔ Always check data with head() and info()
✔ Clean data before analysis
✔ Use astype() to fix data types
✔ Use groupby() instead of loops
✔ Use meaningful column names
✔ Export results for reporting
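
As an illustration of the astype() tip, the sketch below fixes a numeric column that was loaded as text. The sample values are assumptions; pd.to_numeric is shown as an option for columns that contain entries that cannot be converted.

```python
import pandas as pd

raw = pd.DataFrame({
    "Name": ["Amit", "Priya", "John"],
    "Marks": ["85", "92", "78"],  # numbers stored as text
})

# The column is not numeric yet, so sum() and mean() would not behave as expected.
print(pd.api.types.is_numeric_dtype(raw["Marks"]))  # False

# astype(int) converts the text to integers.
raw["Marks"] = raw["Marks"].astype(int)
print(raw["Marks"].sum())  # 255

# astype(int) raises an error on values like "n/a".
# pd.to_numeric with errors="coerce" turns them into NaN instead.
messy = pd.Series(["85", "n/a", "78"])
numbers = pd.to_numeric(messy, errors="coerce")
print(numbers.isna().tolist())  # [False, True, False]
```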


Conclusion

Pandas is not difficult.
Once you master the basic functions for selecting, cleaning, grouping, merging, and exporting data, you can confidently work on real-world datasets.

These basic functions are the building blocks for advanced data analysis and machine learning. Learn them well, practice often, and you will quickly become comfortable with Pandas.

What’s Next?

In the next post, we’ll learn how to analyze data in Pandas.
