Python code: Multi-index pandas DataFrame

1 minute read

Published:

Although there several takes on using multi-index DataFrames around the internet, it still takes some time to get to a simple and elegant solution. Therefore, with this post I would like to show what at

Lets start with the definition on multi-index DataFrame. I will address only column-multi-index, since I find them more useful compared to row-multi-index DataFrames.

import pandas as pd

# set multi-index levels
levels = ['level_0', 'level_1']

# prepare multi-index 
multi_index_column = pd.MultiIndex( levels=[[] for level in levels],
                                     codes=[[] for level in levels],
                                     names=levels)
                                     
# set columns to multi-index
df = pd.DataFrame(index=[], columns=multi_index_column)

The following test script:

import pandas as pd
import numpy as np

# set multi-index levels
levels = ['level_0', 'level_1']

# prepare multi-index 
multi_index_column = pd.MultiIndex( levels=[[] for level in levels],
                                     codes=[[] for level in levels],
                                     names=levels)

# set columns to multi-index
df = pd.DataFrame(index=[], columns=multi_index_column)


data_len = 3
# fill in the columns
df[('A', 0)] = pd.Series( np.random.normal(0, 1, data_len) )
df[('B', 0)] = pd.Series( np.random.normal(0, 1, data_len) )
df[('B', 1)] = pd.Series( np.random.normal(0, 1, data_len) )
df[('A', 1)] = pd.Series( np.random.normal(0, 1, data_len) )
df[('B', 2)] = pd.Series( np.random.normal(0, 1, data_len) )

# set axis=0 to DatetimeIndex
start = pd.Timestamp('2021-01-01')
end = pd.Timestamp('2022-01-01')
time_vector = np.linspace(start.value, end.value, data_len)
datetime = pd.to_datetime(time_vector)
df.index = datetime

# show
print(df)

# sort axis=1 index
df = df.sort_index(axis=1)

# show
print(df)

illustrates the re-ordering of the columns:

level_0                     A         B                   A         B
level_1                     0         0         1         1         2
2021-01-01 00:00:00 -0.505519  0.021671 -0.741799  0.399135 -0.086691
2021-07-02 12:00:00 -1.011332  0.946652 -0.237040 -0.485605  0.098137
2022-01-01 00:00:00  1.450697  0.121733 -0.280538 -0.513861  2.013553
level_0                     A                   B
level_1                     0         1         0         1         2
2021-01-01 00:00:00 -0.505519  0.399135  0.021671 -0.741799 -0.086691
2021-07-02 12:00:00 -1.011332 -0.485605  0.946652 -0.237040  0.098137
2022-01-01 00:00:00  1.450697 -0.513861  0.121733 -0.280538  2.013553

The DataFrame is illustrated in the image below. column-multi-index DataFrame