Pandas is the foundation of data analysis in Python. Whether you are analyzing MIS reports, cleaning datasets, preparing dashboards, or building machine learning models, you will use Pandas at almost every step.
But here’s the problem most beginners face:
There are hundreds of functions in Pandas. Which ones are truly important?

In this guide, we’ll explore the basic yet most powerful Pandas functions that you will use daily. Everything is explained in a simple, conversational tone—perfect for beginners and intermediate learners.
Basic Functions in Pandas
Below is a quick index of the functions we will cover:
- Data Creation Functions
- Basic Exploration Functions
- Selection & Filtering Functions
- Statistical Functions
- String/Text Functions
- Handling Missing Values
- Sorting
- Grouping
- Merging & Joining
- Apply & Lambda
- Exporting Data
Each section includes short examples, and the guide ends with real-life use cases, common challenges, and best practices.
1. Creating Data in Pandas
Before learning any function, you must know how to create data.
1.1 Creating Series
import pandas as pd
s = pd.Series([10, 20, 30])
1.2 Creating DataFrame
data = {
"Name": ["Amit", "Priya", "John"],
"Marks": [85, 92, 78],
"City": ["Delhi", "Mumbai", "Kolkata"]
}
df = pd.DataFrame(data)
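To see what these two objects look like, here is a small sketch that rebuilds them and inspects their labels. The Series name "Score" is an illustrative assumption and is not part of the example above.

```python
import pandas as pd

# A Series is one labelled column of values.
# "Score" is a made-up name used only for illustration.
s = pd.Series([10, 20, 30], name="Score")

# A DataFrame is a table: each dictionary key becomes a column.
data = {
    "Name": ["Amit", "Priya", "John"],
    "Marks": [85, 92, 78],
    "City": ["Delhi", "Mumbai", "Kolkata"],
}
df = pd.DataFrame(data)

print(s.index.tolist())     # default row labels: [0, 1, 2]
print(df.columns.tolist())  # ['Name', 'Marks', 'City']
print(df.shape)             # (3, 3)
```

When no index is given, Pandas labels the rows 0, 1, 2, and so on. Those labels matter later when selecting rows.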
2. Basic Data Exploration Functions
These functions help you understand your dataset quickly.
2.1 head() – Show top rows
df.head()
Useful when you want a quick preview.
2.2 tail() – Show bottom rows
df.tail()
2.3 shape – Row & column count
df.shape
Output Example:
(3, 3)
2.4 info() – Data summary
df.info()
Gives:
- Column names
- Data types
- Non-null counts per column
2.5 describe() – Statistics summary
df.describe()
Real-life use:
Check the distribution of marks, salaries, sales numbers, etc.
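As a rough sketch, here is how the exploration functions behave on the three-row sample table from section 1. The variable names (preview, summary, and so on) are only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Amit", "Priya", "John"],
    "Marks": [85, 92, 78],
    "City": ["Delhi", "Mumbai", "Kolkata"],
})

preview = df.head(2)      # first two rows (head() alone returns up to five)
last_row = df.tail(1)     # last row only
n_rows, n_cols = df.shape # shape is an attribute, so no parentheses

# By default describe() summarizes numeric columns only,
# so only "Marks" appears in the result here.
summary = df.describe()

print(n_rows, n_cols)                 # 3 3
print(summary.columns.tolist())       # ['Marks']
print(summary.loc["mean", "Marks"])   # 85.0
```

To include text columns in the summary as well, describe() accepts include="all".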
3. Selecting & Filtering Data
3.1 Selecting a Column
df["Name"]
3.2 Selecting Multiple Columns
df[["Name", "Marks"]]
3.3 Filtering Rows Using Condition
df[df["Marks"] > 80]
3.4 Using Multiple Conditions
df[(df["Marks"] > 80) & (df["City"] == "Delhi")]
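A point that often trips up beginners: each condition needs its own parentheses, and Pandas uses & (and), | (or), and ~ (not) rather than the Python keywords. The following is a minimal sketch on the sample table; the result names are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Amit", "Priya", "John"],
    "Marks": [85, 92, 78],
    "City": ["Delhi", "Mumbai", "Kolkata"],
})

# AND: both conditions must be true
high_in_delhi = df[(df["Marks"] > 80) & (df["City"] == "Delhi")]

# OR: at least one condition must be true
top_or_kolkata = df[(df["Marks"] > 90) | (df["City"] == "Kolkata")]

# NOT: invert a condition
not_delhi = df[~(df["City"] == "Delhi")]

print(high_in_delhi["Name"].tolist())   # ['Amit']
print(top_or_kolkata["Name"].tolist())  # ['Priya', 'John']
print(not_delhi["Name"].tolist())       # ['Priya', 'John']
```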
3.5 loc – Select by labels
df.loc[0:1, ["Name", "Marks"]]
3.6 iloc – Select by positions
df.iloc[0:2, 0:2]
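The difference between loc and iloc is easiest to see side by side. In this sketch, note that a loc slice includes its end label while an iloc slice excludes its end position. Setting "Name" as the index is an assumption made only to show label-based slicing.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Amit", "Priya", "John"],
    "Marks": [85, 92, 78],
    "City": ["Delhi", "Mumbai", "Kolkata"],
})

# loc works with labels: rows labelled 0 through 1 (end label included)
by_label = df.loc[0:1, ["Name", "Marks"]]

# iloc works with positions: rows 0 and 1, columns 0 and 1 (end excluded)
by_position = df.iloc[0:2, 0:2]

# With the default 0, 1, 2 index both give the same two rows.
print(by_label.equals(by_position))  # True

# With text labels the difference becomes visible.
named = df.set_index("Name")
print(named.loc["Amit":"Priya", "Marks"].tolist())  # [85, 92]
print(named.iloc[0:2, 0].tolist())                  # [85, 92]
```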
4. Statistical Functions
These are used in almost all analytical tasks.
4.1 SUM
df["Marks"].sum()
4.2 COUNT (non-missing values)
df["City"].count()
4.3 AVERAGE/MEAN
df["Marks"].mean()
4.4 MIN & MAX
df["Marks"].min()
df["Marks"].max()
4.5 MEDIAN
df["Marks"].median()
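Here is a small sketch that runs these functions on the sample marks and collects several of them in one call with agg(). The cities Series with a missing entry is a hypothetical addition, included to show that count() skips missing values.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Amit", "Priya", "John"],
    "Marks": [85, 92, 78],
})

marks = df["Marks"]
total = marks.sum()  # 85 + 92 + 78 = 255

# agg() runs several statistics at once and returns them as a Series.
summary = marks.agg(["min", "max", "mean", "median"])
print(total)            # 255
print(summary["mean"])  # 85.0

# count() ignores missing values, so this gives 2, not 3.
cities = pd.Series(["Delhi", None, "Kolkata"])
non_missing = cities.count()
print(non_missing)      # 2
```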
5. String / Text Functions
Useful for cleaning messy datasets.
5.1 Lowercase, Uppercase, Title Case
df["Name"].str.lower()
df["Name"].str.upper()
df["Name"].str.title()
5.2 Trim Extra Spaces
df["Name"].str.strip()
5.3 Replace Text
df["City"].str.replace("Delhi", "New Delhi")
5.4 Length
df["Name"].str.len()
5.5 Extract Substring
df["Name"].str[:3]
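String functions return a new Series instead of changing the column in place, so the result has to be assigned back. The sketch below uses deliberately messy, made-up values to show a typical cleaning chain.

```python
import pandas as pd

# Hypothetical messy input: stray spaces and inconsistent case
df = pd.DataFrame({
    "Name": ["  amit ", "PRIYA", "john  "],
    "City": ["Delhi", "Mumbai", "Delhi"],
})

# Chain the operations and assign the result back to the column.
df["Name"] = df["Name"].str.strip().str.title()
df["City"] = df["City"].str.replace("Delhi", "New Delhi")

print(df["Name"].tolist())            # ['Amit', 'Priya', 'John']
print(df["City"].tolist())            # ['New Delhi', 'Mumbai', 'New Delhi']
print(df["Name"].str.len().tolist())  # [4, 5, 4]
print(df["Name"].str[:3].tolist())    # ['Ami', 'Pri', 'Joh']
```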
6. Handling Missing Values
Real datasets almost always contain missing values.
6.1 Check Missing Values
df.isnull().sum()
6.2 Fill Missing Values
df.fillna(0)
6.3 Drop Missing Rows
df.dropna()
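A minimal sketch of all three steps, assuming one student has no marks recorded. Filling with the column mean is just one possible choice for this example, not a general recommendation.

```python
import pandas as pd

# None in a numeric column is stored as NaN (a missing value).
df = pd.DataFrame({
    "Name": ["Amit", "Priya", "John"],
    "Marks": [85, None, 78],
})

missing = df.isnull().sum()  # missing values per column
print(missing["Marks"])      # 1

# Fill only the "Marks" column, here with the mean of the known marks.
filled = df.fillna({"Marks": df["Marks"].mean()})
print(filled["Marks"].tolist())  # [85.0, 81.5, 78.0]

# Drop every row that has at least one missing value.
dropped = df.dropna()
print(len(dropped))              # 2

# fillna() and dropna() return new DataFrames; df itself is unchanged.
print(df["Marks"].isnull().sum())  # 1
```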
7. Sorting Data
7.1 Sort by Column
df.sort_values("Marks")
7.2 Sort in Descending
df.sort_values("Marks", ascending=False)
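Sorting also returns a new DataFrame and keeps the original row labels. This sketch shows that behavior on the sample table and uses reset_index(drop=True) to renumber the rows, which is a common follow-up step.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Amit", "Priya", "John"],
    "Marks": [85, 92, 78],
})

ranked = df.sort_values("Marks", ascending=False)
print(ranked["Name"].tolist())  # ['Priya', 'Amit', 'John']
print(ranked.index.tolist())    # [1, 0, 2]  (original labels are kept)

# Renumber the rows 0, 1, 2 after sorting.
ranked = ranked.reset_index(drop=True)
print(ranked.index.tolist())    # [0, 1, 2]
```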
8. Grouping Data (groupby)
This is extremely powerful in business reporting.
Example Dataset
| Name | City | Sales |
|---|---|---|
| Amit | Delhi | 50000 |
| Priya | Delhi | 65000 |
| John | Mumbai | 45000 |
8.1 Group by City
df.groupby("City")["Sales"].sum()
Real-Life Use Cases
- Total sales per city
- Tickets resolved per agent
- Student marks per class
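The grouping above can be sketched with the example dataset from this section. The second call, which uses agg() to get several statistics per city, is an optional extension rather than part of the original example.

```python
import pandas as pd

sales = pd.DataFrame({
    "Name": ["Amit", "Priya", "John"],
    "City": ["Delhi", "Delhi", "Mumbai"],
    "Sales": [50000, 65000, 45000],
})

# Total sales per city
total = sales.groupby("City")["Sales"].sum()
print(total["Delhi"])   # 115000
print(total["Mumbai"])  # 45000

# Several statistics per city in one step
summary = sales.groupby("City")["Sales"].agg(["sum", "mean", "count"])
print(summary.loc["Delhi", "mean"])   # 57500.0
print(summary.loc["Delhi", "count"])  # 2
```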
9. Merging & Joining DataFrames
9.1 Merge Using Common Column
pd.merge(df1, df2, on="ID")
9.2 Left Merge
pd.merge(df1, df2, on="ID", how="left")
Real-Life Use Case
- Join employee table with salary table
- Combine order data with customer data
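The merge calls above use df1 and df2 without defining them, so here is a hedged sketch with two small made-up tables. The table names, column names, and values are all assumptions chosen to show how the default (inner) merge differs from a left merge.

```python
import pandas as pd

# Hypothetical employee and salary tables sharing an "ID" column
employees = pd.DataFrame({"ID": [1, 2, 3], "Name": ["Amit", "Priya", "John"]})
salaries = pd.DataFrame({"ID": [1, 2, 4], "Salary": [50000, 65000, 45000]})

# Default merge is an inner join: only IDs present in both tables remain.
inner = pd.merge(employees, salaries, on="ID")
print(inner["ID"].tolist())  # [1, 2]

# Left merge keeps every employee; a missing salary becomes NaN.
left = pd.merge(employees, salaries, on="ID", how="left")
print(left["ID"].tolist())              # [1, 2, 3]
print(left["Salary"].isnull().sum())    # 1
```

Comparing the row counts before and after a merge is a quick way to notice when the wrong merge type has silently dropped rows.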
10. Using apply() and lambda
Very useful for custom calculations on a column, applied one value at a time.
Example: Add 10 bonus marks
df["Bonus"] = df["Marks"].apply(lambda x: x + 10)
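For simple arithmetic like the bonus above, plain column arithmetic gives the same result without apply(). Where apply() earns its place is custom logic, as in this sketch. The grade() function and its thresholds are hypothetical and exist only for this example.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Amit", "Priya", "John"],
    "Marks": [85, 92, 78],
})

# apply() with a lambda, as in the example above
df["Bonus"] = df["Marks"].apply(lambda x: x + 10)

# The same calculation written as column arithmetic
df["BonusDirect"] = df["Marks"] + 10
print(df["Bonus"].tolist())  # [95, 102, 88]

# Custom logic: a made-up grading rule applied to each value
def grade(marks):
    if marks >= 90:
        return "A"
    if marks >= 80:
        return "B"
    return "C"

df["Grade"] = df["Marks"].apply(grade)
print(df["Grade"].tolist())  # ['B', 'A', 'C']
```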
11. Exporting Data
To CSV
df.to_csv("output.csv", index=False)
To Excel (needs an Excel writer package such as openpyxl installed)
df.to_excel("output.xlsx", index=False)
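One way to check an export without creating files is to write the CSV to a string and read it back. This is a small sketch using io.StringIO as a stand-in for a file on disk.

```python
import io

import pandas as pd

df = pd.DataFrame({
    "Name": ["Amit", "Priya", "John"],
    "Marks": [85, 92, 78],
    "City": ["Delhi", "Mumbai", "Kolkata"],
})

# Without a file path, to_csv() returns the CSV text as a string.
# index=False leaves out the 0, 1, 2 row labels.
csv_text = df.to_csv(index=False)
print(csv_text.splitlines()[0])  # Name,Marks,City

# Read the text back to confirm nothing was lost.
restored = pd.read_csv(io.StringIO(csv_text))
print(restored.shape)              # (3, 3)
print(restored["Marks"].tolist())  # [85, 92, 78]
```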
Real-Life Examples: How Pandas Basic Functions Help You
Example 1: MIS Ticket Analysis
- Use count() to count total tickets
- Use groupby() to find tickets by status
- Use merge() to join ticket & user tables
- Use sort_values() to sort by priority
Example 2: Sales Report
- Use sum() for total sales
- Use mean() for average monthly sales
- Use fillna() to fix missing data
- Use apply() to calculate commission
Example 3: HR Attendance Sheet
- Use loc to extract employees from a specific department
- Use value_counts() to count Present/Absent
- Use replace() to clean text values
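The attendance example can be sketched as follows. The employee names, departments, and the "P" shorthand are invented for illustration, since no actual attendance sheet is shown here.

```python
import pandas as pd

# Hypothetical attendance sheet with inconsistent status values
attendance = pd.DataFrame({
    "Employee": ["Amit", "Priya", "John", "Sara"],
    "Department": ["HR", "IT", "IT", "HR"],
    "Status": ["P", "Absent", "Present", "P"],
})

# replace(): standardize the shorthand "P" to "Present"
attendance["Status"] = attendance["Status"].replace({"P": "Present"})

# loc: employees from one department, selected columns only
it_staff = attendance.loc[attendance["Department"] == "IT", ["Employee", "Status"]]
print(it_staff["Employee"].tolist())  # ['Priya', 'John']

# value_counts(): how many rows per status
counts = attendance["Status"].value_counts()
print(counts["Present"])  # 3
print(counts["Absent"])   # 1
```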
Use Cases: Where Are These Functions Used?
1. Business Analytics
- Monthly performance analysis
- Revenue forecasting
- KPI dashboards
2. Data Cleaning
- Remove duplicates
- Fix missing values
- Standardize text
3. Machine Learning
- Feature engineering
- Dataset preparation
- Target/feature separation
Challenges Beginners Face
❌ Getting confused between loc and iloc
❌ Not understanding data types
❌ Using wrong merge type
❌ Forgetting to handle missing values
❌ Not using groupby efficiently
Best Practices
✔ Always check data with head() and info()
✔ Clean data before analysis
✔ Use astype() to fix data types
✔ Use groupby() instead of loops
✔ Use meaningful column names
✔ Export results for reporting
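As an illustration of the astype() tip, here is a minimal sketch that assumes the marks arrived as text, which commonly happens with data loaded from files. The raw table is made up for this example.

```python
import pandas as pd

# Hypothetical input where the numbers were stored as text
raw = pd.DataFrame({
    "Name": ["Amit", "Priya", "John"],
    "Marks": ["85", "92", "78"],
})

# Convert the text column to integers so that math works as expected.
raw["Marks"] = raw["Marks"].astype(int)

print(pd.api.types.is_integer_dtype(raw["Marks"]))  # True
print(raw["Marks"].sum())                           # 255
```

Running info() before and after the conversion is a simple way to confirm that the data type actually changed.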
Conclusion
Pandas is not difficult.
Once you master the basic functions—selecting data, cleaning data, grouping, merging, and exporting—you can confidently work on real-world datasets.
These basic functions are the building blocks for advanced data analysis and machine learning. Learn them well, practice often, and you will quickly become comfortable with Pandas.
What’s Next?
In the next post, we'll learn how to analyze data in Pandas.