SOSCONPH 25


Workshop 3
14:00 - 15:30

Analyzing Data with Pandas


Learning Outcomes

Understand the fundamentals of Pandas

Explain what pandas is, its purpose, core concepts and basic usage

Read different data formats

Import data from a variety of sources, including CSV files, Excel workbooks, and SQL databases, into a DataFrame

Clean and manipulate datasets

Handle missing data, filter and select data, sort and aggregate data, and combine datasets

Analyze and explore data

Generate summary statistics and visualize the dataset

Gain hands-on experience

Successfully complete practical exercises to reinforce theoretical knowledge and apply it to real-world scenarios

Modules

1

Introducing Pandas

  • What is Pandas?
  • Using pandas through Jupyter Notebook
  • Core Concepts and Essential Functionality
2

Data Loading and Storage

  • Functions for reading data (read_csv)
  • Data loading function arguments (delimiter, header)
  • Methods for writing data (to_csv)
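
As a rough illustration of the loading and writing functions listed in this module, the sketch below reads a small delimited text with read_csv and writes it back with to_csv. The sample data and the variable names (raw, df, csv_text) are made up for this example and are not part of the workshop materials.

```python
import io

import pandas as pd

# Hypothetical sample data: semicolon-delimited text with a header row.
raw = "name;score\nAna;90\nBen;85\n"

# read_csv accepts a file path or any file-like object.
# delimiter sets the field separator; header=0 uses the first row as column names.
df = pd.read_csv(io.StringIO(raw), delimiter=";", header=0)

# With no path given, to_csv returns the CSV text as a string.
# index=False leaves the row index out of the output.
csv_text = df.to_csv(index=False)
print(csv_text)
```

In practice the first argument to read_csv would usually be a file path, and to_csv would be given a destination path.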
3

Data Cleaning and Preparation

  • Handling missing data (drop, fill)
  • Data transformation (drop duplicates, map, apply)
  • String manipulation
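
The cleaning steps named in this module could look something like the sketch below. The DataFrame contents, the column names, and the city-to-region lookup are invented for illustration, and filling missing temperatures with the column mean is only one of several reasonable choices.

```python
import numpy as np
import pandas as pd

# Hypothetical messy data: stray whitespace, a missing city, missing readings.
df = pd.DataFrame({
    "city": [" manila", "Cebu ", "Cebu ", None],
    "temp_c": [31.0, np.nan, np.nan, 29.0],
})

# Handling missing data: drop rows with no city, then fill missing readings.
clean = df.dropna(subset=["city"]).copy()
clean["temp_c"] = clean["temp_c"].fillna(clean["temp_c"].mean())

# String manipulation: trim whitespace and normalize capitalization.
clean["city"] = clean["city"].str.strip().str.title()

# Data transformation: remove duplicate rows, then derive new columns.
clean = clean.drop_duplicates().reset_index(drop=True)
clean["region"] = clean["city"].map({"Manila": "Luzon", "Cebu": "Visayas"})
clean["temp_f"] = clean["temp_c"].apply(lambda c: c * 9 / 5 + 32)

print(clean)
```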
4

Data Analysis

  • Calculating summary statistics
  • Aggregating data
  • Combining and merging datasets
  • Pivoting
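
The analysis topics above can be sketched in a few lines. The sales and stores tables here are hypothetical, and the calls shown (describe, groupby, merge, pivot) are just one way to cover each topic.

```python
import pandas as pd

# Hypothetical data: monthly unit sales per store, plus a store lookup table.
sales = pd.DataFrame({
    "store": ["A", "A", "B", "B"],
    "month": ["Jan", "Feb", "Jan", "Feb"],
    "units": [10, 15, 7, 12],
})
stores = pd.DataFrame({"store": ["A", "B"], "city": ["Manila", "Cebu"]})

# Summary statistics: count, mean, std, min, quartiles, max.
summary = sales["units"].describe()

# Aggregation: total units per store.
totals = sales.groupby("store")["units"].sum()

# Combining datasets: attach each store's city with a left join.
merged = sales.merge(stores, on="store", how="left")

# Pivoting: one row per store, one column per month.
wide = sales.pivot(index="store", columns="month", values="units")

print(summary, totals, merged, wide, sep="\n\n")
```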
5

Data Visualization

  • Creating visualizations with pandas
  • Matplotlib and seaborn
6

Hands-on Exercise

    Requirements

    Basic knowledge of Python

    Basic familiarity with Jupyter Notebooks is preferred

    Abdul-Rashid Sampaco III

    Lead Engineer,

    Samsung R&D Institute Philippines

    Rashid is a data engineer, academician, and science researcher whose expertise spans a multitude of industries. He finished his bachelor’s degree in Biochemistry and his Master of Science in Chemistry at the University of the Philippines Diliman. Before his current role, he worked as a data scientist, developing ML models for monitoring subsea internet cables. Throughout his career, Rashid has used pandas extensively for data analytics. What makes his perspective unique is that his journey with data began in the academe. As a science researcher in computational biochemistry, he first honed his data wrangling skills by using pandas to navigate the complexity of cheminformatics and molecular simulations. This diverse journey from pure science to applied engineering makes him the perfect guide to show us the practical power of pandas today. Currently, he is a data engineer at Samsung R&D Institute Philippines (SRPH), where he develops and maintains critical data solutions and backend software for Machine Learning Operations (MLOps) projects, enabling teams to leverage data more effectively.
