Note

Some nifty ninjastics you can do with Group By and MatPlotLib.

import pandas as pd
from matplotlib.pylab import plt
import sys

Here is the csv data if you want to follow along:

Date,Symbol,Volume
1/1/2013,A,0
1/2/2013,A,200
1/3/2013,A,1200
1/4/2013,A,1001
1/5/2013,A,1300
1/6/2013,A,1350
3/8/2013,B,500
3/9/2013,B,1150
3/10/2013,B,1180
3/11/2013,B,2000
1/5/2013,C,56600
1/6/2013,C,45000
1/7/2013,C,200
5/20/2013,E,1300
5/21/2013,E,1700
5/22/2013,E,900
5/23/2013,E,2100
5/24/2013,E,8000
5/25/2013,E,12000
5/26/2013,E,1900
5/27/2013,E,1000
5/28/2013,E,1900

print('Python version ' + sys.version)
print('Pandas version ' + pd.__version__)
Python version 3.11.7 | packaged by Anaconda, Inc. | (main, Dec 15 2023, 18:05:47) [MSC v.1916 64 bit (AMD64)]
Pandas version 2.2.1
# let's see what kind of data we are working with
raw = pd.read_csv('Test_9_17_Python.csv')
raw.head()
DateSymbolVolume
01/1/2013A0
11/2/2013A200
21/3/2013A1200
31/4/2013A1001
41/5/2013A1300
df2 = raw.copy()

You are going to have to change the data type of the Date column

df2.dtypes
Date      object
Symbol    object
Volume     int64
dtype: object
df2['Date'] = pd.to_datetime(df2['Date'])
df2.dtypes
Date      datetime64[ns]
Symbol            object
Volume             int64
dtype: object
# generate some fake data
pool = ['boy','girl']
pool = pool*(int(len(df2)/2))
df2['Gender'] = pool
df = df2.copy()
df.head()
DateSymbolVolumeGender
02013-01-01A0boy
12013-01-02A200girl
22013-01-03A1200boy
32013-01-04A1001girl
42013-01-05A1300boy

Group one column and plot

group = df.groupby('Symbol')
for x in group:
    print(type(x))
    print('//////')
<class 'tuple'>
//////
<class 'tuple'>
//////
<class 'tuple'>
//////
<class 'tuple'>
//////
fig, axes = plt.subplots(2,1, figsize=(15,5))
plt.subplots_adjust(hspace=0.5)
 
group.get_group('A').plot(ax=axes[0])
group.get_group('B').plot(ax=axes[1])
 
axes[0].set_title('title')
axes[0].set_xlabel('sdf')
 
axes[1].set_title('title bottom');


def plot(group):
    mask = group['Volume'].apply(lambda x: x>1000)
    mask2 = group['Symbol'] == 'A'
    mask3 = group['Symbol'] == 'B'
    
    return group[mask & (mask2 | mask3)]['Volume'].sum()
 
a = group[['Symbol','Volume']].apply(plot)
a = pd.DataFrame(a)
a = a.rename(columns={0:'Volume'})
a.plot();

for i, g in group:
    g.plot(title=i)

Group two columns and plot

group = df.groupby(['Symbol', 'Gender'])
for i, g in group:
    print(i)
('A', 'boy')
('A', 'girl')
('B', 'boy')
('B', 'girl')
('C', 'boy')
('C', 'girl')
('E', 'boy')
('E', 'girl')
group.get_group(('A', 'boy')).plot()
group.get_group(('A', 'boy')).plot();

fig, axes = plt.subplots(len(group.groups),1, figsize=(5,20))
fig.subplots_adjust(hspace=1.0) ## Create space between plots
ix = 0
 
for i, g in group:
    p = g.plot(ax=axes[ix], title=str(i))
    if ix < len(axes)-1:
        ix = ix + 1
    else: 
        ix = 0