Basic Functions in Pandas

Pandas is the foundation of data analysis in Python. Whether you are analyzing MIS reports, cleaning datasets, preparing dashboards, or building machine learning models, you will use Pandas at almost every step.

But here’s the problem most beginners face:

There are hundreds of functions in Pandas. Which ones are truly important?

In this guide, we’ll explore the basic but powerful Pandas functions that you will use daily. Everything is explained in a simple, conversational tone, suited to beginners and intermediate learners.

Below is a quick index of the functions we will cover:

Data Creation Functions
Basic Exploration Functions
Selection & Filtering Functions
Statistical Functions
String/Text Functions
Handling Missing Values
Sorting
Grouping
Merging & Joining
Apply & Lambda
Exporting Data

Each section includes examples, tables, diagrams, and best practices.

1. Creating Data in Pandas

Before learning any function, you must know how to create data.

1.1 Creating Series

import pandas as pd

s = pd.Series([10, 20, 30])

1.2 Creating DataFrame

data = {
    "Name": ["Amit", "Priya", "John"],
    "Marks": [85, 92, 78],
    "City": ["Delhi", "Mumbai", "Kolkata"]
}

df = pd.DataFrame(data)

2. Basic Data Exploration Functions

These functions help you understand your dataset quickly.

2.1 `head()` – Show top rows

df.head()

Useful when you want a quick preview.

2.2 `tail()` – Show bottom rows

df.tail()

2.3 `shape` – Row & column count

df.shape

Output Example:

(3, 3)

2.4 `info()` – Data summary

df.info()

Gives:

Column names
Data types
Non-null counts

2.5 `describe()` – Statistics summary

df.describe()

Real-life use:
Check the distribution of marks, salaries, sales numbers, etc.
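
To see these exploration functions together, here is a minimal sketch that rebuilds the sample DataFrame from section 1.2 and prints a few of the results. The variable name `summary` is just an illustrative choice.

```python
import pandas as pd

# Same sample data as in section 1.2
df = pd.DataFrame({
    "Name": ["Amit", "Priya", "John"],
    "Marks": [85, 92, 78],
    "City": ["Delhi", "Mumbai", "Kolkata"],
})

print(df.head(2))   # first 2 rows; head() shows 5 rows by default
print(df.shape)     # (3, 3) -> 3 rows, 3 columns

# By default describe() summarizes only numeric columns, so only "Marks" appears
summary = df.describe()
print(summary.loc["mean", "Marks"])   # 85.0
```

Notice that `shape` is an attribute, not a function, so it is written without parentheses.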

3. Selecting & Filtering Data

3.1 Selecting a Column

df["Name"]

3.2 Selecting Multiple Columns

df[["Name", "Marks"]]

3.3 Filtering Rows Using Condition

df[df["Marks"] > 80]

3.4 Using Multiple Conditions

df[(df["Marks"] > 80) & (df["City"] == "Delhi")]
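
A short sketch of how conditions combine, using the sample data from section 1.2. Pandas uses `&` for AND, `|` for OR, and `~` for NOT; Python's `and`/`or` keywords do not work element-wise on a Series. Each condition needs its own parentheses because `&` and `|` bind more tightly than `>` or `==`. The variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Amit", "Priya", "John"],
    "Marks": [85, 92, 78],
    "City": ["Delhi", "Mumbai", "Kolkata"],
})

# AND: both conditions must be true
high_in_delhi = df[(df["Marks"] > 80) & (df["City"] == "Delhi")]
print(high_in_delhi["Name"].tolist())      # ['Amit']

# OR: at least one condition must be true
top_or_kolkata = df[(df["Marks"] > 90) | (df["City"] == "Kolkata")]
print(top_or_kolkata["Name"].tolist())     # ['Priya', 'John']

# NOT: invert a condition
not_delhi = df[~(df["City"] == "Delhi")]
print(not_delhi["Name"].tolist())          # ['Priya', 'John']
```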

3.5 `loc` – Select by labels

df.loc[0:1, ["Name", "Marks"]]

3.6 `iloc` – Select by positions

df.iloc[0:2, 0:2]
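
The difference between `loc` and `iloc` is easiest to see with the same slice written both ways. In this small sketch (sample data from section 1.2, illustrative variable names), `loc` treats `0:1` as labels and includes the end label, while `iloc` treats `0:1` as positions and excludes the end, just like normal Python slicing.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Amit", "Priya", "John"],
    "Marks": [85, 92, 78],
    "City": ["Delhi", "Mumbai", "Kolkata"],
})

# loc: label-based, end label is INCLUDED -> rows labelled 0 and 1
by_label = df.loc[0:1, ["Name", "Marks"]]
print(len(by_label))       # 2

# iloc: position-based, end position is EXCLUDED -> row at position 0 only
by_position = df.iloc[0:1, 0:2]
print(len(by_position))    # 1
```

With the default index (0, 1, 2, ...) labels and positions look identical, which is one reason the two are easy to mix up.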

4. Statistical Functions

These are used in almost all analytical tasks.

4.1 SUM

df["Marks"].sum()

4.2 COUNT

df["City"].count()

4.3 AVERAGE/MEAN

df["Marks"].mean()

4.4 MIN & MAX

df["Marks"].min()
df["Marks"].max()

4.5 MEDIAN

df["Marks"].median()
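
Here is what these functions return on the sample data from section 1.2, as a quick worked example. Note that `count()` counts non-missing values only.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Amit", "Priya", "John"],
    "Marks": [85, 92, 78],
    "City": ["Delhi", "Mumbai", "Kolkata"],
})

print(df["Marks"].sum())      # 255
print(df["City"].count())     # 3 (non-missing values only)
print(df["Marks"].mean())     # 85.0
print(df["Marks"].min())      # 78
print(df["Marks"].max())      # 92
print(df["Marks"].median())   # 85.0
```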

5. String / Text Functions

Useful for cleaning messy datasets.

5.1 Lowercase, Uppercase, Title Case

df["Name"].str.lower()
df["Name"].str.upper()
df["Name"].str.title()

5.2 Trim Extra Spaces

df["Name"].str.strip()

5.3 Replace Text

df["City"].str.replace("Delhi", "New Delhi")

5.4 Length

df["Name"].str.len()

5.5 Extract Substring

df["Name"].str[:3]
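
The string functions above can be chained to clean a messy column in one line. This is a small sketch with made-up messy values (the `names` and `cities` Series are hypothetical examples). Keep in mind that `.str` methods return a new Series, so assign the result back if you want to keep it.

```python
import pandas as pd

# Hypothetical messy input: extra spaces and inconsistent casing
names = pd.Series(["  amit ", "PRIYA", "john  "])

# Chain strip() and title() to standardize the text
cleaned = names.str.strip().str.title()
print(cleaned.tolist())          # ['Amit', 'Priya', 'John']
print(cleaned.str.len().tolist())    # [4, 5, 4]
print(cleaned.str[:3].tolist())      # ['Ami', 'Pri', 'Joh']

cities = pd.Series(["Delhi", "Mumbai", "Kolkata"])
# Assign the result back; the original Series is not changed in place
cities = cities.str.replace("Delhi", "New Delhi")
print(cities.tolist())           # ['New Delhi', 'Mumbai', 'Kolkata']
```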

6. Handling Missing Values

Real datasets almost always contain missing values.

6.1 Check Missing Values

df.isnull().sum()

6.2 Fill Missing Values

df.fillna(0)

6.3 Drop Missing Rows

df.dropna()
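
A minimal sketch of the three steps on a small DataFrame with one missing mark (the data is made up for illustration). One thing that trips up beginners: `fillna()` and `dropna()` return a new DataFrame and leave the original unchanged unless you assign the result.

```python
import pandas as pd

# Hypothetical data: Priya's mark is missing
df = pd.DataFrame({
    "Name": ["Amit", "Priya", "John"],
    "Marks": [85, None, 78],
})

# Count missing values per column
missing = df.isnull().sum()
print(missing["Marks"])              # 1

# Fill missing marks with 0 (a dict lets you choose a value per column)
filled = df.fillna({"Marks": 0})
print(filled["Marks"].tolist())      # [85.0, 0.0, 78.0]

# Drop rows that contain any missing value
dropped = df.dropna()
print(len(dropped))                  # 2

# The original DataFrame still has its missing value
print(df["Marks"].isnull().sum())    # 1
```

Whether to fill or drop depends on the data: filling marks with 0 would lower the average, so dropping or filling with the mean may be more appropriate in some reports.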

7. Sorting Data

7.1 Sort by Column

df.sort_values("Marks")

7.2 Sort in Descending

df.sort_values("Marks", ascending=False)
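
As a quick illustration on the sample data from section 1.2, the sketch below sorts by marks in descending order. `sort_values()` returns a new DataFrame that keeps the old row labels; the optional `reset_index(drop=True)` call (an addition here, not required) renumbers the rows from 0.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Amit", "Priya", "John"],
    "Marks": [85, 92, 78],
    "City": ["Delhi", "Mumbai", "Kolkata"],
})

# Highest marks first, then renumber the rows 0, 1, 2
ranked = df.sort_values("Marks", ascending=False).reset_index(drop=True)
print(ranked["Name"].tolist())    # ['Priya', 'Amit', 'John']

# The original DataFrame keeps its original order
print(df["Name"].tolist())        # ['Amit', 'Priya', 'John']
```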

8. Grouping Data (`groupby`)

This is extremely powerful in business reporting.

Example Dataset

Name	City	Sales
Amit	Delhi	50000
Priya	Delhi	65000
John	Mumbai	45000

8.1 Group by City

df.groupby("City")["Sales"].sum()

Real-Life Use Cases

Total sales per city
Tickets resolved per agent
Student marks per class
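
To make the grouping concrete, here is a runnable sketch that builds the example dataset shown above and totals sales per city. The variable name `city_sales` is an illustrative choice.

```python
import pandas as pd

# The example dataset from this section
df = pd.DataFrame({
    "Name": ["Amit", "Priya", "John"],
    "City": ["Delhi", "Delhi", "Mumbai"],
    "Sales": [50000, 65000, 45000],
})

# Split rows by City, then add up Sales within each group
city_sales = df.groupby("City")["Sales"].sum()
print(city_sales["Delhi"])     # 115000
print(city_sales["Mumbai"])    # 45000
```

The result is a Series indexed by city, so you can look up a total by name as shown.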

9. Merging & Joining DataFrames

9.1 Merge Using Common Column

pd.merge(df1, df2, on="ID")

9.2 Left Merge

pd.merge(df1, df2, on="ID", how="left")

Real-Life Use Case

Join employee table with salary table
Combine order data with customer data
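
The snippets above assume two DataFrames, `df1` and `df2`, that share an `ID` column. Here is a small sketch with made-up employee and salary tables (all values are hypothetical) that shows how the default merge differs from a left merge.

```python
import pandas as pd

# Hypothetical employee table
df1 = pd.DataFrame({
    "ID": [1, 2, 3],
    "Name": ["Amit", "Priya", "John"],
})

# Hypothetical salary table; there is no row for ID 3
df2 = pd.DataFrame({
    "ID": [1, 2],
    "Salary": [50000, 65000],
})

# Default merge is an inner join: only IDs present in BOTH tables
inner = pd.merge(df1, df2, on="ID")
print(len(inner))                      # 2

# Left merge keeps every row of df1; missing salaries become NaN
left = pd.merge(df1, df2, on="ID", how="left")
print(len(left))                       # 3
print(left["Salary"].isnull().sum())   # 1
```

If rows seem to disappear after a merge, check whether an inner join silently dropped IDs that exist in only one table.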

10. Using apply() and lambda

Very useful for custom calculations that run on each value of a column.

Example: Add 10 bonus marks

df["Bonus"] = df["Marks"].apply(lambda x: x + 10)
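
For simple arithmetic like this, plain column arithmetic gives the same result without `apply()`. The sketch below compares both on the sample data (the column name `Bonus_vectorized` is made up for the comparison); `apply()` is most helpful when the logic is too custom for built-in operations.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Amit", "Priya", "John"],
    "Marks": [85, 92, 78],
})

# apply() calls the lambda once for each value in the column
df["Bonus"] = df["Marks"].apply(lambda x: x + 10)

# Column arithmetic does the same thing for the whole column at once
df["Bonus_vectorized"] = df["Marks"] + 10

print(df["Bonus"].tolist())                         # [95, 102, 88]
print(df["Bonus"].equals(df["Bonus_vectorized"]))   # True
```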

11. Exporting Data

To CSV

df.to_csv("output.csv", index=False)

To Excel

df.to_excel("output.xlsx", index=False)
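
A quick way to confirm an export worked is to read the file back. This sketch writes the sample data to a CSV in a temporary folder (so it leaves nothing behind) and reloads it with `pd.read_csv()`. It covers only CSV because `to_excel()` also needs an Excel engine such as openpyxl to be installed.

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({
    "Name": ["Amit", "Priya", "John"],
    "Marks": [85, 92, 78],
    "City": ["Delhi", "Mumbai", "Kolkata"],
})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "output.csv")
    # index=False stops the row numbers from being written as an extra column
    df.to_csv(path, index=False)
    loaded = pd.read_csv(path)

print(loaded.shape)             # (3, 3)
print(list(loaded.columns))     # ['Name', 'Marks', 'City']
```

Without `index=False`, the reloaded file would contain an extra unnamed column holding the old row numbers.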

Real-Life Examples: How Pandas Basic Functions Help You

Example 1: MIS Ticket Analysis

Use count() to count total tickets
Use groupby() to find tickets by status
Use merge() to join ticket & user tables
Use sort_values() to sort by priority

Example 2: Sales Report

Use sum() for total sales
Use mean() for average monthly sales
Use fillna() to fix missing data
Use apply() to calculate commission

Example 3: HR Attendance Sheet

Use loc to extract employees from a specific department
Use value_counts() to count Present/Absent
Use replace() to clean text values
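
As a rough sketch of the attendance example, the code below cleans a made-up status column and then counts each value with `value_counts()`. The data and variable names are hypothetical.

```python
import pandas as pd

# Hypothetical attendance entries with inconsistent spacing and casing
status = pd.Series(["Present", "absent", "Present ", "present"])

# Standardize the text first, otherwise each variant is counted separately
clean_status = status.str.strip().str.title()

counts = clean_status.value_counts()
print(counts["Present"])    # 3
print(counts["Absent"])     # 1
```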

Use Cases: Where Are These Functions Used?

1. Business Analytics

Monthly performance analysis
Revenue forecasting
KPI dashboards

2. Data Cleaning

Remove duplicates
Fix missing values
Standardize text

3. Machine Learning

Feature engineering
Dataset preparation
Target/feature separation

Challenges Beginners Face

❌ Getting confused between loc and iloc
❌ Not understanding data types
❌ Using the wrong merge type
❌ Forgetting to handle missing values
❌ Not using groupby efficiently

Best Practices

✔ Always check data with head() and info()
✔ Clean data before analysis
✔ Use astype() to fix data types
✔ Use groupby() instead of loops
✔ Use meaningful column names
✔ Export results for reporting
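
One of these tips in action: numbers loaded from a file sometimes arrive as text, and `astype()` fixes that. This is a minimal sketch with made-up data where the marks are stored as strings; `pd.api.types.is_numeric_dtype()` is used only to show the before and after.

```python
import pandas as pd

# Hypothetical data: marks were read in as text
df = pd.DataFrame({
    "Name": ["Amit", "Priya", "John"],
    "Marks": ["85", "92", "78"],
})

was_numeric = pd.api.types.is_numeric_dtype(df["Marks"])
print(was_numeric)           # False

# Convert the text column to integers
df["Marks"] = df["Marks"].astype(int)

is_numeric = pd.api.types.is_numeric_dtype(df["Marks"])
print(is_numeric)            # True
print(df["Marks"].sum())     # 255
```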

Conclusion

Pandas is not difficult.
Once you master the basic functions—selecting data, cleaning data, grouping, merging, and exporting—you can confidently work on real-world datasets.

These basic functions are the building blocks for advanced data analysis and machine learning. Learn them well, practice often, and you will quickly become comfortable with Pandas.

What’s Next?

In the next post, we’ll learn how to analyze data in Pandas.
