Instructions for creating DataFrames in Python
A DataFrame is a two-dimensional table of data made up of rows and columns. It organises information as tabular records, and several datasets may be stored in the same DataFrame as separate columns. Many arithmetic operations are available, such as adding a selected column or row to another column or row of the DataFrame.
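As a minimal sketch of the arithmetic mentioned above, the snippet below adds one column to another and one row to another. The column names a and b and their values are made up purely for illustration.

```python
import pandas as pd

# Hypothetical data, chosen only for illustration
df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

# Add column "a" to column "b", element by element
col_sum = df["a"] + df["b"]

# Add the first row to the second row, column by column
row_sum = df.iloc[0] + df.iloc[1]

print(col_sum.tolist())
print(row_sum.tolist())
```

Both results are pandas Series: the column sum is aligned on the row index, and the row sum is aligned on the column names.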
DataFrames may be imported from a variety of external sources, including SQL databases, comma-separated value (CSV) files, and Microsoft Excel spreadsheets. They can also be built from in-memory Python objects such as dictionaries and lists.
This guide covers several methods for constructing a DataFrame. Let's go through them one by one.
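To illustrate importing from an external source, here is a small sketch that reads CSV data with pd.read_csv. The CSV text is a made-up sample held in memory through io.StringIO so the example runs without a file on disk; a real file path can be passed in the same position. pandas also offers pd.read_excel and pd.read_sql for the other sources mentioned, which need extra setup and are not shown here.

```python
import io
import pandas as pd

# Hypothetical CSV content; in practice this would usually be a file path
csv_text = "Name,Age\nTom,20\nJoseph,21\n"

# read_csv accepts a path or any file-like object
df_csv = pd.read_csv(io.StringIO(csv_text))

print(df_csv)
```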
The pandas library must first be installed in the Python environment.
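pandas is commonly installed with pip (pip install pandas), although the right command depends on how your environment is managed. A quick way to confirm that it is available is to import it and print its version, as in this short check:

```python
import pandas as pd

# If this import succeeds, pandas is installed in the current environment
print(pd.__version__)
```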
Method – 1: Create an empty Dataframe
The simplest case is an empty DataFrame, which is created by calling the DataFrame constructor with no arguments. Consider the following example:
# importing pandas library
import pandas as pd

# Calling DataFrame constructor
df = pd.DataFrame()

print(df)

Output:
Empty DataFrame
Columns: []
Index: []
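As a small follow-up sketch, the empty and shape attributes confirm that the frame holds no data. The second frame, with the illustrative column names Name and Age, shows that an empty DataFrame can still declare its columns up front.

```python
import pandas as pd

df = pd.DataFrame()

# .empty is True when the frame holds no data
print(df.empty)
print(df.shape)

# An empty frame can also declare its column names in advance
df_cols = pd.DataFrame(columns=["Name", "Age"])
print(list(df_cols.columns))
```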
Method – 2: Create a dataframe using List
A dataframe can be built from a single list or from a list of lists. Consider the following example:
# importing pandas library
import pandas as pd

# string values in the list
lst = ['Java', 'Python', 'C', 'C++',
       'JavaScript', 'Swift', 'Go']

# Calling DataFrame constructor on list
dframe = pd.DataFrame(lst)
print(dframe)

Output:
            0
0        Java
1      Python
2           C
3         C++
4  JavaScript
5       Swift
6          Go
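The example above uses a single list. For the list-of-lists case, a minimal sketch looks like this: each inner list becomes one row, and the columns argument supplies the column names. The names and ages are illustrative values only.

```python
import pandas as pd

# Each inner list becomes one row of the dataframe
rows = [["Tom", 20], ["Joseph", 21], ["Krish", 19]]

# Without `columns`, the columns would be labelled 0 and 1
dframe = pd.DataFrame(rows, columns=["Name", "Age"])

print(dframe)
```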
Method – 3: Create Dataframe from dict of ndarray/lists
To build a dataframe from a dict of ndarrays or lists, all the arrays must have the same length. If no index is passed, the index defaults to range(n), where n is the array length. Consider the following example:
import pandas as pd

# assign data of lists.
data = {'Name': ['Tom', 'Joseph', 'Krish', 'John'], 'Age': [20, 21, 19, 18]}

# Create DataFrame
df = pd.DataFrame(data)

# Print the output.
print(df)

Output:
     Name  Age
0     Tom   20
1  Joseph   21
2   Krish   19
3    John   18
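The same-length requirement and the default index can be checked with a short sketch. It uses NumPy arrays instead of lists (NumPy is an added dependency here, not part of the example above) and shows that arrays of unequal length are rejected with a ValueError.

```python
import numpy as np
import pandas as pd

data = {"Name": np.array(["Tom", "Joseph", "Krish", "John"]),
        "Age": np.array([20, 21, 19, 18])}
df = pd.DataFrame(data)

# With no index passed, pandas builds a default range-style index
print(df.index)

# Arrays of unequal length cannot form a dataframe
try:
    pd.DataFrame({"Name": np.array(["Tom", "Joseph"]), "Age": np.array([20])})
    raised = False
except ValueError:
    raised = True

print(raised)
```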
Method – 4: Create an indexed Dataframe using arrays
Now let's see how to build a dataframe with a custom row index from arrays. Consider the following example:
# DataFrame using arrays.
import pandas as pd

# assign data of lists.
data = {'Name': ['Renault', 'Duster', 'Maruti', 'Honda City'], 'Ratings': [9.0, 8.0, 5.0, 3.0]}

# Creates pandas DataFrame.
df = pd.DataFrame(data, index=['position1', 'position2', 'position3', 'position4'])

# print the data
print(df)

Output:
                 Name  Ratings
position1     Renault      9.0
position2      Duster      8.0
position3      Maruti      5.0
position4  Honda City      3.0

Explanation:
The preceding code defines two columns, holding the car names and their ratings. The list passed through the index argument is used to create the row index.
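One practical benefit of a custom index is that rows can be looked up by label. The sketch below reuses the data from the example and reads a row and a single cell with .loc.

```python
import pandas as pd

data = {"Name": ["Renault", "Duster", "Maruti", "Honda City"],
        "Ratings": [9.0, 8.0, 5.0, 3.0]}
df = pd.DataFrame(data, index=["position1", "position2", "position3", "position4"])

# Look up a whole row by its index label
row = df.loc["position2"]

# Look up a single cell by row label and column name
rating = df.loc["position3", "Ratings"]

print(row["Name"], rating)
```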
Method – 5: Create Dataframe from list of dicts
A list of dictionaries may be passed as input to create a Pandas dataframe. By default, the dictionary keys are used as the column names. Consider the following example:
# the example is to create
# Pandas DataFrame by lists of dicts.
import pandas as pd

# assign values to lists.
data = [{'A': 10, 'B': 20, 'C': 30}, {'x': 100, 'y': 200, 'z': 300}]

# Creates DataFrame.
df = pd.DataFrame(data)

# Print the data
print(df)

Output:
      A     B     C      x      y      z
0  10.0  20.0  30.0    NaN    NaN    NaN
1   NaN   NaN   NaN  100.0  200.0  300.0
Let's look at another example, which builds a pandas dataframe from a list of dictionaries while specifying both the row index and the column names. Example 2:
import pandas as pd

# assigns values to lists.
data = [{'x': 1, 'y': 2}, {'A': 15, 'B': 17, 'C': 19}]

# With two column indices, values same
# as dictionary keys
dframe1 = pd.DataFrame(data, index=['first', 'second'], columns=['x', 'y'])

# With two column indices with
# one index with other name
dframe2 = pd.DataFrame(data, index=['first', 'second'], columns=['x', 'y1'])

# print the first data frame
print(dframe1, "\n")
# Print the second DataFrame.
print(dframe2)

Output:
          x    y
first   1.0  2.0
second  NaN  NaN

          x   y1
first   1.0  NaN
second  NaN  NaN
Next, let's see how to construct a dataframe by passing a list of dictionaries together with a list of row indices. Example 3:
# The example is to create
# Pandas DataFrame by passing lists of
# Dictionaries and row indices.
import pandas as pd

# assign values to lists
data = [{'x': 2, 'z': 3}, {'x': 10, 'y': 20, 'z': 30}]

# Creates pandas DataFrame by passing
# Lists of dictionaries and row index.
dframe = pd.DataFrame(data, index=['first', 'second'])

# Print the dataframe
print(dframe)

Output:
         x   z     y
first    2   3   NaN
second  10  30  20.0
We have now seen three ways of constructing a dataframe from a list of dictionaries.
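The NaN cells in the outputs above mark keys that a dictionary did not supply. As a hedged sketch of one way to handle them, the snippet below counts the missing cells per column with isna() and replaces them with 0 using fillna(). Whether 0 is a sensible replacement depends on the data; it is used here only for illustration.

```python
import pandas as pd

# The second record has a "y" key that the first one lacks
records = [{"x": 2, "z": 3}, {"x": 10, "y": 20, "z": 30}]
dframe = pd.DataFrame(records)

# Count missing cells in each column
missing = dframe.isna().sum()

# Replace missing cells with 0 (an illustrative choice of fill value)
filled = dframe.fillna(0)

print(missing.to_dict())
print(filled)
```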
Method – 6: Create Dataframe using the zip() function
The zip() function merges two lists element by element into tuples, and the resulting list of tuples can be passed to the DataFrame constructor. Consider the following example:
# The example is to create
# pandas dataframe from lists using zip.

import pandas as pd

# List1
Name = ['tom', 'krish', 'arun', 'juli']

# List2
Marks = [95, 63, 54, 47]

# Take the two lists
# and merge them by using zip().
list_tuples = list(zip(Name, Marks))

# Print the list of tuples.
print(list_tuples)

# Converting lists of tuples into
# pandas Dataframe.
dframe = pd.DataFrame(list_tuples, columns=['Name', 'Marks'])

# Print data.
print(dframe)

Output:
[('tom', 95), ('krish', 63), ('arun', 54), ('juli', 47)]
    Name  Marks
0    tom     95
1  krish     63
2   arun     54
3   juli     47
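One detail worth knowing when using zip() this way is that it stops at the shortest input. The sketch below deliberately makes the marks list one element short to show that the extra name is dropped without any error.

```python
import pandas as pd

names = ["tom", "krish", "arun", "juli"]
marks = [95, 63, 54]  # one value short, on purpose

# zip() stops at the shortest input, so "juli" is silently dropped
pairs = list(zip(names, marks))
dframe = pd.DataFrame(pairs, columns=["Name", "Marks"])

print(len(dframe))
```

Checking that the lists have equal lengths before zipping them avoids losing rows this way.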
Method – 7: Create Dataframe from Dicts of series
A dataframe can also be created from a dictionary of Series. The index of the resulting dataframe is the union of the indexes of all the Series passed. Consider the following example:
# Pandas Dataframe from Dicts of series.

import pandas as pd

# Initialize data to Dicts of series.
d = {'Electronics': pd.Series([97, 56, 87, 45], index=['John', 'Abhinay', 'Peter', 'Andrew']),
     'Civil': pd.Series([97, 88, 44, 96], index=['John', 'Abhinay', 'Peter', 'Andrew'])}

# creates Dataframe.
dframe = pd.DataFrame(d)

# print the data.
print(dframe)

Output:
         Electronics  Civil
John              97     97
Abhinay           56     88
Peter             87     44
Andrew            45     96
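In the example above both Series share the same index, so the union is not visible. The following sketch uses two Series with only partly overlapping indexes (the marks are illustrative values) to show that the resulting index contains every label, with NaN where a Series has no value for that label.

```python
import pandas as pd

# "Abhinay" appears in both Series; "John" and "Peter" appear in only one
d = {"Electronics": pd.Series([97, 56], index=["John", "Abhinay"]),
     "Civil": pd.Series([88, 44], index=["Abhinay", "Peter"])}

dframe = pd.DataFrame(d)

# The index holds all three names; gaps are filled with NaN
print(dframe)
```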
This lesson has covered several methods for creating DataFrames.