Create an empty Python Pandas DataFrame and add rows

David Y.

The Problem

I would like to create an empty Python Pandas DataFrame and add rows to it one by one. How can I achieve this?

The Solution

While it is possible to add rows to a DataFrame after it has been created, this approach has several downsides compared to the standard practice of creating a new DataFrame from a list or dictionary. It is slower and more memory intensive because each addition builds a new DataFrame and copies the existing data, the datatypes for elements in new rows will not be automatically inferred, and numeric row labels may not behave as desired. Per the Pandas documentation:

"It is not recommended to build DataFrames by adding single rows in a for loop."

If we want to build up data for a DataFrame iteratively, we should work with either dictionaries or lists until we’re ready to create the final DataFrame. For example:

import pandas

# Make a 5x5 list of lists
data = []
for x in range(5):
    data.append([])
    for y in range(5):
        data[x].append(y)

# Create a DataFrame from the list of lists
df = pandas.DataFrame(data)
print(df)
# will print
#    0  1  2  3  4
# 0  0  1  2  3  4
# 1  0  1  2  3  4
# 2  0  1  2  3  4
# 3  0  1  2  3  4
# 4  0  1  2  3  4
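The same idea works with dictionaries, which also gives the columns names. The following is a minimal sketch rather than a definitive recipe: the `id` and `square` keys and the loop that produces them are hypothetical stand-ins for whatever data is being collected.

```python
import pandas

# Collect one dictionary per row; the keys become column names.
# The "id" and "square" fields are made up for illustration.
rows = []
for i in range(3):
    rows.append({"id": i, "square": i * i})

# Build the DataFrame once, after all rows have been collected
df = pandas.DataFrame(rows)
print(df)
```

Because the DataFrame is created in a single step, Pandas can infer a datatype for each column from the complete data.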

In some instances, this may be insufficient for our needs, for example when rows must be added to a DataFrame that already exists. Older versions of Pandas provided a DataFrame.append method, but it was deprecated in Pandas 1.4 and removed in Pandas 2.0 in favor of pandas.concat. We can use pandas.concat to add rows to an existing DataFrame:

import pandas

# Make a 5x5 list of lists
data = []
for x in range(5):
    data.append([])
    for y in range(5):
        data[x].append(y)

# Create a DataFrame from the list of lists
df = pandas.DataFrame(data)

# Create a new row and add it to the DataFrame
new_row = pandas.DataFrame([[0, 1, 2, 3, 4]])
df = pandas.concat([df, new_row])
print(df)
# will print
#    0  1  2  3  4
# 0  0  1  2  3  4
# 1  0  1  2  3  4
# 2  0  1  2  3  4
# 3  0  1  2  3  4
# 4  0  1  2  3  4
# 0  0  1  2  3  4

As we can see, the new row’s label is 0 rather than 5. We can fix this by renaming the row:

# Remove the final row label and add a new label
df.index = df.index[:-1].tolist() + [5]
print(df)
# will print
#    0  1  2  3  4
# 0  0  1  2  3  4
# 1  0  1  2  3  4
# 2  0  1  2  3  4
# 3  0  1  2  3  4
# 4  0  1  2  3  4
# 5  0  1  2  3  4
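If sequential integer labels are all that is needed, relabeling by hand can be avoided. As a sketch under that assumption, pandas.concat accepts an ignore_index=True argument that discards the existing row labels and renumbers the result from 0:

```python
import pandas

# Rebuild the 5x5 DataFrame used in the examples above
df = pandas.DataFrame([[0, 1, 2, 3, 4] for _ in range(5)])

# The new row still carries its own label 0 ...
new_row = pandas.DataFrame([[0, 1, 2, 3, 4]])

# ... but ignore_index=True renumbers all rows in the combined result
df = pandas.concat([df, new_row], ignore_index=True)
print(df.index.tolist())
# prints [0, 1, 2, 3, 4, 5]
```

Note that this replaces every existing label, so it is only suitable when the original row labels carry no meaning.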

