Exploring pandas melt() function [Practical Examples]

Exploring pandas melt() function [Practical Examples]

Introduction to Pandas melt() function

Pandas melt() function is used to unpivot a DataFrame from wide to long format, optionally leaving identifiers set. A pivot table aggregates the values in a data set. In this tutorial, we’ll learn how to do the opposite: break an aggregated collection of data into an unaggregated one.

This function is useful to massage a DataFrame into a format where one or more columns are identifier variables (id_vars), while all other columns, considered measured variables (value_vars), are “unpivoted” to the row axis, leaving just two non-identifier columns, ‘variable’ and ‘value’.

Syntax:

pandas.melt(frame, id_vars=None, value_vars=None, var_name=None, value_name='value', col_level=None, ignore_index=True)

Parameters

  • id_vars - tuple, list, or array, optional. Column(s) to use as identifier variables.
  • value_vars - tuple, list, or array, optional. Column(s) to unpivot. If not specified, uses all columns that are not set asid_vars.
  • var_name - scalar. Name to use for the ‘variable’ column. If None it usesframe.columns.nameor ‘variable’.
  • value_name - scalar, default ‘value’. Name to use for the ‘value’ column.
  • col_level - int or str, optional. If columns are a MultiIndex then use this level to melt.
  • ignore_index - bool, default True. If True, original index is ignored. If False, the original index is retained. Index labels will be repeated as necessary.

What we will cover?

We will discuss the following:

  • Set one column as identifier variable and one column as value variable
  • Set one column as identifier variable and two columns as value variable

Create pandas DataFrame with example data

DataFrame is a data structure used to store the data in two dimensional format. It is similar to table that stores the data in rows and columns. Rows represents the records/tuples and columns refers to the attributes.

We can create the DataFrame by using**pandas.DataFrame()**method.

Syntax:

pandas.DataFrame(input_data,columns,index)

Parameters:

It will take mainly three parameters

  1. input_data is represents a list of data
  2. columnsrepresent the columns names for the data
  3. indexrepresent the row numbers/values

We can also create a DataFrame using dictionary by skipping columns and indices.

Example:Python Program to create a dataframe for college data from a dictionary.

# import pandas 
import pandas 

# create dataframe with college data
dataframe = pandas.DataFrame({'Name': {0: 'sravan', 1: 'bobby', 2: 'deepak',3:'prasad'},
                'Subjects': {0: 'PHP', 1: 'PHP', 2: 'dbms',3:'java'},
                'Marks': {0: 89, 1: 90, 2: 93,3:57}})

# display dataframe
print(dataframe)

Output:

     Name Subjects  Marks
0  sravan      PHP     89
1   bobby      PHP     90
2  deepak     dbms     93
3  prasad     java     57

Scenario-1 : Set one column as identifier variable and one column as value variable

Here we are defining one variable as identifier and also one variable is value variable.

Example 1:Python program to set Name as identifier variable and Subjects as value variable

# import pandas 
import pandas 

# create dataframe with college data
dataframe = pandas.DataFrame({'Name': {0: 'sravan', 1: 'bobby', 2: 'deepak',3:'prasad'},
                'Subjects': {0: 'PHP', 1: 'PHP', 2: 'dbms',3:'java'},
                'Marks': {0: 89, 1: 90, 2: 93,3:57}})

# here Name is id_vars and Subjects is value_vars
print(pandas.melt(dataframe, id_vars =['Name'], value_vars =['Subjects']))

Output:

     Name  variable value
0  sravan  Subjects   PHP
1   bobby  Subjects   PHP
2  deepak  Subjects  dbms
3  prasad  Subjects  java

Example 2: Python program to set Name as identifier variable and Marks as value variable

# import pandas 
import pandas 

# create dataframe with college data
dataframe = pandas.DataFrame({'Name': {0: 'sravan', 1: 'bobby', 2: 'deepak',3:'prasad'},
                'Subjects': {0: 'PHP', 1: 'PHP', 2: 'dbms',3:'java'},
                'Marks': {0: 89, 1: 90, 2: 93,3:57}})

# here Name is id_vars and Marks is value_vars
print(pandas.melt(dataframe, id_vars =['Name'], value_vars =['Marks']))

Output:

     Name variable  value
0  sravan    Marks     89
1   bobby    Marks     90
2  deepak    Marks     93
3  prasad    Marks     57

Scenario-2 : Set one column as identifier variable and two columns as value variable

Here we are defining one variable as identifier and two variables as value variable.

Example :Python program to set Name as identifier variable and Subjects, Marks as value variable

# import pandas 
import pandas 

# create dataframe with college data
dataframe = pandas.DataFrame({'Name': {0: 'sravan', 1: 'bobby', 2: 'deepak',3:'prasad'},
                'Subjects': {0: 'PHP', 1: 'PHP', 2: 'dbms',3:'java'},
                'Marks': {0: 89, 1: 90, 2: 93,3:57}})

# here Name is id_vars and subjects , Marks is value_vars
print(pandas.melt(dataframe, id_vars =['Name'], value_vars =['Subjects','Marks']))

Output:

     Name  variable value
0  sravan  Subjects   PHP
1   bobby  Subjects   PHP
2  deepak  Subjects  dbms
3  prasad  Subjects  java
4  sravan     Marks    89
5   bobby     Marks    90
6  deepak     Marks    93
7  prasad     Marks    57

Summary

In this article we learned aboutPandas and how it melts a DataFrame with the melt method. (Melting is the process of converting a wide data set to a narrow one.) The method accepts two primary parameters:

  • Theid_vars parameter sets the identifier column, the column for which the wide data set aggregates data.
  • Thevalue_varsparameter accepts the column(s) whose values pandas will melt and store in a new column.

References

Pandas - melt()

Deepak Prasad

Deepak Prasad

R&D Engineer

Founder of GoLinuxCloud with over a decade of expertise in Linux, Python, Go, Laravel, DevOps, Kubernetes, Git, Shell scripting, OpenShift, AWS, Networking, and Security. With extensive experience, he excels across development, DevOps, networking, and security, delivering robust and efficient solutions for diverse projects.