How to barplot a grouped dataframe with three axes?


I have a grouped dataframe with three columns: timestamp, category, and count. I have grouped the dataframe by timestamp and used the timestamp as the index. I would like to plot the counts of each category stacked on top of each other. There are three such dataframes, one per category:

    TIMESTAMP   CATEGORY_1  count
0  2023-03-31      correct     30
1  2023-03-31  not correct     11
2  2023-03-31      no info      2
3  2023-04-30      correct     15
4  2023-04-30  not correct      8

    TIMESTAMP  CATEGORY_2  count
0  2023-03-31        okay     29
1  2023-03-31     no info     17
2  2023-03-31    too high      4

    TIMESTAMP  CATEGORY_3  count
0  2023-03-31        okay      4
1  2023-03-31     no info      2
2  2023-03-31    positive      3
3  2023-03-31    negative      2

When I use

df.pivot_table(index='TIMESTAMP', columns='CATEGORY_1',
               values='count', aggfunc='mean').plot(kind='bar', stacked=True)

it works fine, but this covers only one category (one dataframe).
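
For reference, here is a minimal reproducible sketch of the single-category case. It assumes the first table is typed in by hand as `df1` and that the column names are exactly those shown in the tables above; `wide` is just an illustrative name.

```python
import pandas as pd

# First table from above, typed in by hand (assumed column names)
df1 = pd.DataFrame({
    "TIMESTAMP": ["2023-03-31"] * 3 + ["2023-04-30"] * 2,
    "CATEGORY_1": ["correct", "not correct", "no info", "correct", "not correct"],
    "count": [30, 11, 2, 15, 8],
})

# Reshape to one row per timestamp and one column per category value
wide = df1.pivot_table(index="TIMESTAMP", columns="CATEGORY_1",
                       values="count", aggfunc="mean")

# Each column of the wide frame becomes one segment of the stacked bar
ax = wide.plot(kind="bar", stacked=True)
```

The pivot is what makes stacking possible: pandas draws one bar per row and one stacked segment per column.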

I have three different categories with three different sets of values, which all share the same timestamps. I thought of plotting them in one figure, on three axes side by side, so that I do not have to scroll back and forth between three separate plots.

fig, (ax1, ax2, ax3) = plt.subplots(1, 3)

ax1.bar(x=df.index, y='category1', data=df1, stacked=True) 
ax2.bar(x=df.index, y='category2', data=df2, stacked=True)
ax3.bar(x=df.index, y='category3', data=df3, stacked=True)

It does not work. Unfortunately, I cannot use the seaborn library.

Any advice? I saw some solutions with for-loops, such as the one in "how to create a stacked bar with three dataframe with three columns & three rows", but it was not clear to me which columns the two variables correspond to.


There are 2 best solutions below

Paddy Harrison

Here is an example of plotting stacked bars on a single graph for the data you provided. I am not sure if this is exactly the plot you have in mind, but the code below demonstrates the flow for creating a stacked plot:

import pandas as pd
from matplotlib import pyplot as plt


# cat1.csv, cat2.csv and cat3.csv are read inside the loop below

fig, ax = plt.subplots()

# keep track of the previous colors and legend for figure
colors = {}
labels = {}

for i in range(1, 4):
    # load data
    df = pd.read_csv(f"cat{i}.csv").sort_values("count", ascending=False)
    # filter date for example
    df = df[df["TIMESTAMP"] == "2023-03-31"]
    # previous cumulative bar height for dataset
    prev = 0
    for _, row in df.iterrows():
        val = row['count']
        label = row[f'CATEGORY_{i}']
        color = colors.get(label, None)
        bars = ax.bar(i, val, bottom=prev, width=0.5, color=color)
        if color is None:
            bar = bars[0]
            colors[label] = bar.get_facecolor()
            labels[label] = bar
        prev += val

ax.set_xticks(range(1, 4))
ax.set_xlabel("Category")
ax.set_ylabel("Count")
ax.legend(labels.values(), labels.keys())
plt.show()

[Image: stacked bar plot]

gulniza

In the end, I solved it like this.

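import matplotlib.pyplot as plt


def plot_bar(ax, data, x_label, y_label):
    # data: pivoted frame with timestamps as index and one column per category value
    bottom = None
    for col in data.columns:
        ax.bar(data.index, data[col], label=col, bottom=bottom, width=15)
        if bottom is None:
            bottom = data[col].values
        else:
            # build a new array; an in-place += could overwrite the first column of data
            bottom = bottom + data[col].values

    ax.legend()
    ax.set_xlabel(x_label)
    ax.set_ylabel(y_label)
    ax.set_xticks(ticks=data.index, labels=data.index, rotation=45)
    # ax.set_yticks()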



fig, (ax1, ax2, ax3) = plt.subplots(1, 3, figsize=(25,10), layout='tight')

Finally, I called the function and passed on the arguments.
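
The call itself is not shown above, so here is a minimal sketch of what it could look like. It makes several assumptions: the sample rows are typed in from the question, `TIMESTAMP` is converted to datetime so that `width=15` means 15 days, missing combinations are filled with 0 via `fill_value=0`, and the names `frames` and `pivots` and the axis labels are made up for illustration. `plot_bar` is repeated in condensed form only so that the sketch runs on its own.

```python
import pandas as pd
import matplotlib.pyplot as plt


def plot_bar(ax, data, x_label, y_label):
    """Condensed copy of the function above: one stacked bar per index entry."""
    bottom = None
    for col in data.columns:
        ax.bar(data.index, data[col], label=col, bottom=bottom, width=15)
        heights = data[col].to_numpy()
        # Create a new array each time so the DataFrame is never modified
        bottom = heights if bottom is None else bottom + heights
    ax.legend()
    ax.set_xlabel(x_label)
    ax.set_ylabel(y_label)
    tick_labels = list(data.index.strftime("%Y-%m-%d"))
    ax.set_xticks(ticks=data.index, labels=tick_labels, rotation=45)


# Sample rows copied from the question (hypothetical stand-in for the real data)
frames = {
    "CATEGORY_1": pd.DataFrame({
        "TIMESTAMP": ["2023-03-31"] * 3 + ["2023-04-30"] * 2,
        "CATEGORY_1": ["correct", "not correct", "no info", "correct", "not correct"],
        "count": [30, 11, 2, 15, 8],
    }),
    "CATEGORY_2": pd.DataFrame({
        "TIMESTAMP": ["2023-03-31"] * 3,
        "CATEGORY_2": ["okay", "no info", "too high"],
        "count": [29, 17, 4],
    }),
    "CATEGORY_3": pd.DataFrame({
        "TIMESTAMP": ["2023-03-31"] * 4,
        "CATEGORY_3": ["okay", "no info", "positive", "negative"],
        "count": [4, 2, 3, 2],
    }),
}

fig, axes = plt.subplots(1, 3, figsize=(25, 10), layout="tight")

pivots = {}
for ax, (category, df) in zip(axes, frames.items()):
    df["TIMESTAMP"] = pd.to_datetime(df["TIMESTAMP"])
    # Wide format: timestamps as index, one column per category value
    pivots[category] = df.pivot_table(index="TIMESTAMP", columns=category,
                                      values="count", aggfunc="mean",
                                      fill_value=0)
    plot_bar(ax, pivots[category], "Timestamp", "Count")

plt.show()
```

Filling missing combinations with 0 matters here: a NaN in any column would otherwise propagate into `bottom` and hide every segment stacked above it.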