Everything You Must Know About Python Pandas Data Structure

This blog series is a general introduction to Python pandas data structures. To learn more about Python, we recommend our free course on learning Python by killing zombies. We also have posts on the Python requests library and on downloading a YouTube video with Python.

General Introduction about Python Pandas Data Structure

  1. Pandas is an open-source Python library that provides high-performance data structures and data analysis tools. It is used in fields such as finance, economics, statistics and biotechnology.
  2. In Pandas you can explore three kinds of data structures:
       a. Series (values mutable, size immutable; one-dimensional)
       b. DataFrame (values and size mutable; two-dimensional)
       c. Panel (values and size mutable; three-dimensional)
  3. A Series holds one-dimensional data, such as the even numbers 2, 4, 6, 8, 10, 12, 14, 16, 18, 20. Its size is fixed, so the example above always holds 10 values, but the values themselves can be changed; for instance, 18 can be replaced with 24.
  4. A DataFrame holds two-dimensional, heterogeneous data, and both its size and its values can be changed. Example: data about the students in a classroom, with details such as name, age, weight, height, parents' names and address.
  5. A Panel holds three-dimensional, heterogeneous data. It is hard to represent graphically, but it can be pictured as a container of DataFrames. Both its size and its values are mutable.
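
The two mutability points above can be tried out in a few lines. This is only a minimal sketch: the even numbers come from the Series example above, while the student names and measurements are made-up values for illustration.

```python
import pandas as pd

# The even numbers from the example above, held in a one-dimensional Series.
evens = pd.Series([2, 4, 6, 8, 10, 12, 14, 16, 18, 20])

# Values can be changed in place: replace 18 (position 8) with 24.
evens.iloc[8] = 24

# A DataFrame can grow: start with two columns, then add a third.
# The names and numbers here are hypothetical.
students = pd.DataFrame({"Name": ["Asha", "Ben"], "Age": [14, 15]})
students["Height"] = [150.5, 162.0]

print(evens.tolist())
print(students.shape)
```

The Series still holds 10 values after the assignment, while the DataFrame has gained a column.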

Python Pandas Data Structure: Series

In Pandas, a Series, which is a one-dimensional data structure, can be created using the following constructor:

Python Pandas Series
pandas.Series(data, index, dtype, copy)

The parameters of the Pandas Series constructor are as follows:

  1. data: can take various forms, such as an ndarray, a list, a dict or a constant.
  2. index: the labels for the values. It must have the same length as data. Labels are usually unique, although pandas does not require it.
  3. dtype: the data type of the values.
  4. copy: whether to copy the input data; False by default.
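
As a rough illustration of these parameters, the sketch below builds a Series from three kinds of input. The variable names and values are chosen only for this example.

```python
import numpy as np
import pandas as pd

# data as an ndarray, with explicit index labels and an explicit dtype.
from_array = pd.Series(np.array([1, 2, 3]), index=["x", "y", "z"], dtype="float64")

# data as a dict: the keys become the index labels.
from_dict = pd.Series({"a": 10, "b": 20})

# data as a constant: the value is repeated for every index label.
from_scalar = pd.Series(5, index=["p", "q", "r"])

print(from_array)
print(from_dict)
print(from_scalar)
```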

Pandas Series Example

Create a pandas Series and then print it.

Python Pandas Series
import pandas as pd

# Eleven integer values labelled 'a' to 'k'
s = pd.Series([123, 456, 789, 147, 258, 369, 159, 753, 147, 258, 369],
              index=['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k'])
print(s)
Output
a    123
b    456
c    789
d    147
e    258
f    369
g    159
h    753
i    147
j    258
k    369
dtype: int64

Python Pandas Data Structure: DataFrame

A Pandas DataFrame is a two-dimensional, heterogeneous data structure whose size and values can both be changed. Its columns can hold different data types. You can also perform mathematical operations on it.
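
To make the point about mathematical operations concrete, here is a small sketch. It reuses the names and marks from the DataFrame example later in this post; the 5-mark bonus is a made-up adjustment for illustration.

```python
import pandas as pd

marks = pd.DataFrame({
    "Name": ["Rishabh", "Tanya", "Santa", "Maria", "Jake"],
    "Marks": [95, 98, 59, 89, 58],
})

# Aggregations work on a whole column at once.
average = marks["Marks"].mean()
highest = marks["Marks"].max()

# Arithmetic is applied element by element (hypothetical 5-mark bonus).
marks["WithBonus"] = marks["Marks"] + 5

print(average, highest)
print(marks)
```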

In Pandas, a DataFrame, which is a two-dimensional data structure, can be created using the following constructor:

Python Pandas DataFrame
pandas.DataFrame(data, index, columns, dtype, copy)

The parameters of the Pandas DataFrame constructor are as follows:

  1. data: can take various forms, such as an ndarray, a list, a dict or a constant.
  2. index: the row labels.
  3. columns: the column labels.
  4. dtype: the data type to apply to the columns.
  5. copy: whether to copy the input data; False by default.
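
A minimal sketch of these parameters in use follows. The row labels, column labels and values are arbitrary choices for this example.

```python
import numpy as np
import pandas as pd

# A 3x2 ndarray of integers used as the data.
values = np.array([[1, 2], [3, 4], [5, 6]])

# index sets the row labels, columns sets the column labels,
# and dtype converts every column to float64.
df = pd.DataFrame(values,
                  index=["r1", "r2", "r3"],
                  columns=["A", "B"],
                  dtype="float64")

print(df)
print(df.loc["r2", "B"])  # look up one value by row and column label
```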

Pandas DataFrame Example

Create a DataFrame of the names and marks obtained by 5 students in a recent exam.

Python Pandas DataFrame
import pandas as pd

# Each inner list is one row: [Name, Marks]
d = pd.DataFrame([['Rishabh', 95], ['Tanya', 98], ['Santa', 59], ['Maria', 89], ['Jake', 58]],
                 columns=['Name', 'Marks'])
print(d)
Output
      Name  Marks
0  Rishabh     95
1    Tanya     98
2    Santa     59
3    Maria     89
4     Jake     58

Python Pandas Data Structure: Panel

A Panel holds three-dimensional, heterogeneous data. It is hard to represent graphically, but it can be pictured as a container of DataFrames. Being 3D, it has 3 axes, which are as follows:

  1. items: axis 0
  2. major_axis: axis 1
  3. minor_axis: axis 2

In Pandas, a Panel, which is a three-dimensional data structure, can be created using the following constructor:

Python Pandas Panel
pandas.Panel(data, items, major_axis, minor_axis, dtype, copy)

The parameters of the Pandas Panel constructor are as follows:

  1. data: can take various forms, such as an ndarray, a list, a dict or a constant.
  2. items: labels for axis 0
  3. major_axis: labels for axis 1
  4. minor_axis: labels for axis 2
  5. dtype: the data type to apply to the columns.
  6. copy: whether to copy the input data; False by default.

Pandas Panel Example

Create a Panel from two DataFrames of random data, one 4×4 and one 4×1.

Python Pandas Panel
# Note: pd.Panel only exists in pandas versions older than 0.25.
import pandas as pd
import numpy as np

data = {'Data1': pd.DataFrame(np.random.randn(4, 4)),
        'Data2': pd.DataFrame(np.random.randn(4, 1))}
p = pd.Panel(data)
print(p['Data1'])
print(p['Data2'])
print(p)
Output
          0         1         2         3
0 -1.028867  1.165953 -1.337242  0.289351
1 -0.029509 -0.286931 -0.404673  1.055836
2 -1.091060  0.790604 -0.195462 -0.028688
3  0.150005 -1.097062 -1.396424 -0.521907
          0   1   2   3
0  0.876145 NaN NaN NaN
1  0.789839 NaN NaN NaN
2  0.706054 NaN NaN NaN
3  0.887317 NaN NaN NaN
<class 'pandas.core.panel.Panel'>
Dimensions: 2 (items) x 4 (major_axis) x 4 (minor_axis)
Items axis: Data1 to Data2
Major_axis axis: 0 to 3
Minor_axis axis: 0 to 3

Remember that Panel was deprecated in pandas 0.20 and removed in pandas 0.25, so the Panel example above only runs on older versions. The recommended way of representing 3D data now is to use a MultiIndex on a DataFrame.
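
As one possible way to follow that recommendation, the sketch below rebuilds the Panel example with a MultiIndex. Using pd.concat with a dict of DataFrames is our own choice here, not the only approach, and the seed is fixed only to make the random numbers repeatable.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # fixed seed, chosen arbitrarily
data1 = pd.DataFrame(rng.standard_normal((4, 4)))
data2 = pd.DataFrame(rng.standard_normal((4, 1)))

# Concatenating a dict of DataFrames builds a two-level row index:
# the outer level is the item name, the inner level is the original row label.
stacked = pd.concat({"Data1": data1, "Data2": data2})

print(stacked.loc["Data1"])  # the 4x4 block
print(stacked.loc["Data2"])  # the 4x1 block, padded with NaN in columns 1 to 3
```

Selecting on the outer index level with .loc plays the role that p['Data1'] played with a Panel.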