6 ways to add a column to an existing DataFrame in pandas

Different methods to add a column to an existing DataFrame in pandas

In this tutorial we will discuss how to add a column to an existing pandas DataFrame using the following methods:

  • Using [] with None value
  • Using [] with Constant value
  • Using [] with values
  • Using insert() method
  • Using assign() method
  • Using [] with NaN value

Create pandas DataFrame with example data

A DataFrame is a data structure that stores data in a two-dimensional format. It is similar to a table that stores data in rows and columns. Rows represent the records (tuples) and columns represent the attributes.

We can create a DataFrame by using the **pandas.DataFrame()** method.

Syntax:

pandas.DataFrame(input_data, index=index, columns=columns)

Parameters:

It mainly takes three parameters:

  1. input_data is the data to store, for example a list or a dictionary
  2. columns represents the column names for the data
  3. index represents the row labels (numbers or other values)
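As a minimal sketch of the three parameters working together, the snippet below builds a small DataFrame from a list of rows. The variable name rows and the two sample rows are assumptions made for illustration. index and columns are passed as keyword arguments, so their order in the call does not matter.

```python
import pandas

# each inner list is one row (illustrative sample data)
rows = [['foo-23', 'ground-nut oil'],
        ['foo-13', 'almonds']]

# columns names the attributes, index labels the rows
dataframe = pandas.DataFrame(rows,
                             columns=['id', 'name'],
                             index=['item-1', 'item-2'])

# display the dataframe
print(dataframe)
```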

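We can also create a DataFrame from a dictionary. In that case the dictionary keys become the column names, so columns can be omitted. If index is omitted as well, the rows are numbered starting from 0.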

Example: Python program to create a DataFrame of market data from a dictionary of food items, specifying the row labels with index.

# import the module
import pandas

# consider the food data
food_input={'id':['foo-23','foo-13','foo-02','foo-31'],
                  'name':['ground-nut oil','almonds','flour','cereals'],
                  'cost':[567.00,562.56,67.00,76.09],
                  'quantity':[1,2,3,2]}

# pass this food to the dataframe by specifying rows 
dataframe=pandas.DataFrame(food_input,index = ['item-1', 'item-2', 'item-3', 'item-4'])

# display the dataframe
print(dataframe)

Output:

            id            name    cost  quantity
item-1  foo-23  ground-nut oil  567.00         1
item-2  foo-13         almonds  562.56         2
item-3  foo-02           flour   67.00         3
item-4  foo-31         cereals   76.09         2

Method 1 : Using [] with None value

In this method we add a column filled with None values using [].

Syntax:

dataframe['new_column']=None

where,

  1. dataframe is the input dataframe
  2. new_column is the new column name
  3. None is the value assigned to every row of the new column

Example: In this example we add a column named stock and fill it with None values.

# import the module
import pandas

# consider the food data
food_input={'id':['foo-23','foo-13','foo-02','foo-31'],
                  'name':['ground-nut oil','almonds','flour','cereals'],
                  'cost':[567.00,562.56,67.00,76.09],
                  'quantity':[1,2,3,2]}

# pass this food to the dataframe by specifying rows 
dataframe=pandas.DataFrame(food_input,index = ['item-1', 'item-2', 'item-3', 'item-4'])

# add column - empty column
dataframe['stock']=None

# display dataframe
print(dataframe)

Output:

            id            name    cost  quantity stock
item-1  foo-23  ground-nut oil  567.00         1  None
item-2  foo-13         almonds  562.56         2  None
item-3  foo-02           flour   67.00         3  None
item-4  foo-31         cereals   76.09         2  None
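To see how pandas treats such an empty column, the short sketch below inspects it after the assignment. The smaller two-row DataFrame is an assumption made to keep the example brief. The column is typically stored with the generic object dtype, and isna() reports None as a missing value.

```python
import pandas

# a reduced version of the food data (illustrative)
dataframe = pandas.DataFrame({'id': ['foo-23', 'foo-13'],
                              'quantity': [1, 2]})

# add an empty column filled with None
dataframe['stock'] = None

# show the dtype pandas chose for the new column
print(dataframe['stock'].dtype)

# isna() treats None as a missing value
print(dataframe['stock'].isna().all())
```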

Method 2 : Using [] with Constant value

In this method we add a column filled with a single constant value using []. The value is repeated for every row.

Syntax:

dataframe['new_column']=value

where,

  1. dataframe is the input dataframe
  2. new_column is the new column name
  3. value is the constant value assigned to every row of the new column

Example: In this example we add a column named stock and fill it with the value 45.

# import the module
import pandas

# consider the food data
food_input={'id':['foo-23','foo-13','foo-02','foo-31'],
                  'name':['ground-nut oil','almonds','flour','cereals'],
                  'cost':[567.00,562.56,67.00,76.09],
                  'quantity':[1,2,3,2]}

# pass this food to the dataframe by specifying rows 
dataframe=pandas.DataFrame(food_input,index = ['item-1', 'item-2', 'item-3', 'item-4'])

# add column - 45 value
dataframe['stock']=45

# display dataframe
print(dataframe)

Output:

            id            name    cost  quantity stock
item-1  foo-23  ground-nut oil  567.00         1    45
item-2  foo-13         almonds  562.56         2    45
item-3  foo-02           flour   67.00         3    45
item-4  foo-31         cereals   76.09         2    45
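One point worth keeping in mind with [] is that the same syntax both adds and overwrites. The sketch below, using a reduced three-row DataFrame chosen for illustration, adds stock as a new column and then replaces the values of the existing quantity column.

```python
import pandas

# a reduced version of the food data (illustrative)
dataframe = pandas.DataFrame({'id': ['foo-23', 'foo-13', 'foo-02'],
                              'quantity': [1, 2, 3]})

# 'stock' does not exist yet, so a new column is added at the end
dataframe['stock'] = 45

# 'quantity' already exists, so its values are replaced in place
dataframe['quantity'] = 0

# display dataframe
print(dataframe)
```

Checking 'new_column' in dataframe.columns before assigning is a simple way to avoid overwriting data by accident.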

Method 3 : Using [] with values

In this method we add a column and fill it with values from a list using []. The list must contain exactly one value per row.

Syntax:

dataframe['new_column']=[value, ..., value]

where,

  1. dataframe is the input dataframe
  2. new_column is the new column name
  3. value is the value from the list of values assigned to each row in the column

Example: In this example we add a column named stock and pass a list of values.

# import the module
import pandas

# consider the food data
food_input={'id':['foo-23','foo-13','foo-02','foo-31'],
                  'name':['ground-nut oil','almonds','flour','cereals'],
                  'cost':[567.00,562.56,67.00,76.09],
                  'quantity':[1,2,3,2]}

# pass this food to the dataframe by specifying rows 
dataframe=pandas.DataFrame(food_input,index = ['item-1', 'item-2', 'item-3', 'item-4'])

# add column - stock
dataframe['stock']=['yes','no','no','yes']

# display dataframe
print(dataframe)

Output:

            id            name    cost  quantity stock
item-1  foo-23  ground-nut oil  567.00         1   yes
item-2  foo-13         almonds  562.56         2    no
item-3  foo-02           flour   67.00         3    no
item-4  foo-31         cereals   76.09         2   yes
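The list passed to [] has to match the number of rows. The sketch below shows what happens when it does not, and also shows a column computed from existing columns, which always has the right length. The column name total and the variable length_error are assumptions introduced for illustration.

```python
import pandas

# food data without the name column (illustrative)
dataframe = pandas.DataFrame({'id': ['foo-23', 'foo-13', 'foo-02', 'foo-31'],
                              'cost': [567.00, 562.56, 67.00, 76.09],
                              'quantity': [1, 2, 3, 2]})

# a list with too few values is rejected with a ValueError
try:
    dataframe['stock'] = ['yes', 'no']
    length_error = None
except ValueError as error:
    length_error = error
    print('could not add column:', error)

# a column computed from existing columns has one value per row
dataframe['total'] = dataframe['cost'] * dataframe['quantity']

# display dataframe
print(dataframe)
```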

Method 4 : Using insert() method

Here, we use the insert() method to insert a new column at a particular location. insert() modifies the DataFrame in place.

Syntax:

dataframe.insert(location, "new_column", [value, ..., value])

where,

  1. dataframe is the input dataframe
  2. location is an integer giving the position of the new column, counted from 0
  3. new_column is the name of the new column
  4. the last parameter is the list of values to be assigned to the new column

Example: In this example, we add a stock column with values at the last position (location 4).

# import the module
import pandas

# consider the food data
food_input={'id':['foo-23','foo-13','foo-02','foo-31'],
                  'name':['ground-nut oil','almonds','flour','cereals'],
                  'cost':[567.00,562.56,67.00,76.09],
                  'quantity':[1,2,3,2]}

# pass this food to the dataframe by specifying rows 
dataframe=pandas.DataFrame(food_input,index = ['item-1', 'item-2', 'item-3', 'item-4'])

# add column - stock at last position
dataframe.insert(4,"stock", ['yes','no','no','yes'])

# display dataframe
print(dataframe)

Output:

            id            name    cost  quantity stock
item-1  foo-23  ground-nut oil  567.00         1   yes
item-2  foo-13         almonds  562.56         2    no
item-3  foo-02           flour   67.00         3    no
item-4  foo-31         cereals   76.09         2   yes
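The example above places the column at the end, which [] can do as well. The sketch below shows what insert() adds: placing the column between existing ones. It also shows that inserting a column name that already exists raises a ValueError by default. The reduced DataFrame and the variable duplicate_error are assumptions for illustration.

```python
import pandas

# a reduced version of the food data (illustrative)
dataframe = pandas.DataFrame({'id': ['foo-23', 'foo-13'],
                              'name': ['almonds', 'flour'],
                              'quantity': [2, 3]})

# location 1 places the new column between id and name
dataframe.insert(1, 'stock', ['yes', 'no'])
print(list(dataframe.columns))

# inserting a column name that already exists raises ValueError
try:
    dataframe.insert(0, 'stock', ['no', 'no'])
    duplicate_error = None
except ValueError as error:
    duplicate_error = error
    print('insert failed:', error)
```

Passing len(dataframe.columns) as the location appends the column at the end without hard-coding the position.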

Method 5 : Using assign() method

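assign() adds a new column by taking the column name as a keyword and the values as its argument. It returns a new DataFrame instead of modifying the original, so the result has to be stored in a variable.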

Syntax:

dataframe.assign(new_column= [value,.....,value])

where,

  1. dataframe is the input dataframe
  2. new_column is the new column name, passed as a keyword that takes a list of values

Example: In this example, we add a stock column with values.

# import the module
import pandas

# consider the food data
food_input={'id':['foo-23','foo-13','foo-02','foo-31'],
                  'name':['ground-nut oil','almonds','flour','cereals'],
                  'cost':[567.00,562.56,67.00,76.09],
                  'quantity':[1,2,3,2]}

# pass this food to the dataframe by specifying rows 
dataframe=pandas.DataFrame(food_input,index = ['item-1', 'item-2', 'item-3', 'item-4'])

# add column - stock 
dataframe= dataframe.assign(stock= ['yes','no','no','yes'])

# display dataframe
print(dataframe)

Output:

            id            name    cost  quantity stock
item-1  foo-23  ground-nut oil  567.00         1   yes
item-2  foo-13         almonds  562.56         2    no
item-3  foo-02           flour   67.00         3    no
item-4  foo-31         cereals   76.09         2   yes
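Because assign() returns a new DataFrame, the original stays untouched unless the result is assigned back. The sketch below illustrates this and also adds two columns in one call, one of them computed by a function of the DataFrame. The names result, total and frame are assumptions chosen for illustration.

```python
import pandas

# cost and quantity for two items (illustrative)
dataframe = pandas.DataFrame({'cost': [567.00, 67.00],
                              'quantity': [1, 3]})

# add two columns at once; the lambda receives the DataFrame itself
result = dataframe.assign(stock=['yes', 'no'],
                          total=lambda frame: frame['cost'] * frame['quantity'])

# the original DataFrame still has only its two columns
print(list(dataframe.columns))

# the returned DataFrame carries the new columns
print(list(result.columns))
```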

Method 6 : Using [] with NaN value

In this method we add a column filled with NaN values using []. NaN stands for Not a Number. It is provided by the numpy package as numpy.nan, so we have to import the numpy module. The older alias numpy.NaN was removed in NumPy 2.0, so numpy.nan is the spelling to use.

Syntax:

dataframe['new_column']=numpy.nan

where,

  1. dataframe is the input dataframe
  2. new_column is the new column name
  3. numpy.nan is the value assigned to every row of the new column

Example: In this example we are going to add a column named stock and pass NaN values.

# import the module
import pandas
import numpy 

# consider the food data
food_input={'id':['foo-23','foo-13','foo-02','foo-31'],
                  'name':['ground-nut oil','almonds','flour','cereals'],
                  'cost':[567.00,562.56,67.00,76.09],
                  'quantity':[1,2,3,2]}

# pass this food to the dataframe by specifying rows 
dataframe=pandas.DataFrame(food_input,index = ['item-1', 'item-2', 'item-3', 'item-4'])

# add column - stock
dataframe['stock']=numpy.nan

# display dataframe
print(dataframe)

Output:

            id            name    cost  quantity stock
item-1  foo-23  ground-nut oil  567.00         1   NaN
item-2  foo-13         almonds  562.56         2   NaN
item-3  foo-02           flour   67.00         3   NaN
item-4  foo-31         cereals   76.09         2   NaN
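A NaN column behaves like a numeric column with missing values, which makes it convenient to fill later. The sketch below, on a reduced DataFrame chosen for illustration, checks the dtype, counts the missing values and then replaces them with 0 using fillna().

```python
import pandas
import numpy

# a reduced version of the food data (illustrative)
dataframe = pandas.DataFrame({'id': ['foo-23', 'foo-13'],
                              'quantity': [1, 2]})

# add column - stock, filled with NaN
dataframe['stock'] = numpy.nan

# NaN is a float, so the column gets a floating point dtype
print(dataframe['stock'].dtype)

# count the missing values in the new column
print(dataframe['stock'].isna().sum())

# fillna() returns a new Series, so assign it back to the column
dataframe['stock'] = dataframe['stock'].fillna(0)

# display dataframe
print(dataframe)
```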

Summary

In this article, we discussed how to add a new column to an existing DataFrame using [], insert() and assign(), with constant, None, NaN and list values. We have seen that insert() can add the column at any position, while [] appends it at the end and assign() returns a new DataFrame.


Deepak Prasad

R&D Engineer

Founder of GoLinuxCloud with over a decade of expertise in Linux, Python, Go, Laravel, DevOps, Kubernetes, Git, Shell scripting, OpenShift, AWS, Networking, and Security. With extensive experience, he excels across development, DevOps, networking, and security, delivering robust and efficient solutions for diverse projects.