Plotting The Same Column From Various Dataframes In A Panel
I've got data from a simulation which gives me some values stored in a DataFrame (100 rows x 6 columns). For varying starting values I saved my data in a Panel (2 DataFrames x 100
Solution 1:
Consider the following example:
In [77]: import pandas_datareader.data as web
In [78]: p = web.DataReader(['AAPL','GOOGL'], 'yahoo', '2017-01-01')
In [79]: p.axes
Out[79]:
[Index(['Open', 'High', 'Low', 'Close', 'Volume', 'Adj Close'], dtype='object'),
DatetimeIndex(['2017-01-03', '2017-01-04', '2017-01-05', '2017-01-06', '2017-01-09', '2017-01-10', '2017-01-11', '2017-01-12',
'2017-01-13', '2017-01-17', '2017-01-18', '2017-01-19', '2017-01-20', '2017-01-23', '2017-01-24', '2017-01-25',
'2017-01-26', '2017-01-27', '2017-01-30', '2017-01-31', '2017-02-01', '2017-02-02', '2017-02-03', '2017-02-06',
'2017-02-07', '2017-02-08', '2017-02-09', '2017-02-10', '2017-02-13', '2017-02-14', '2017-02-15', '2017-02-16',
'2017-02-17', '2017-02-21', '2017-02-22', '2017-02-23', '2017-02-24', '2017-02-27', '2017-02-28', '2017-03-01',
'2017-03-02', '2017-03-03', '2017-03-06', '2017-03-07', '2017-03-08', '2017-03-09', '2017-03-10', '2017-03-13',
'2017-03-14', '2017-03-15', '2017-03-16', '2017-03-17', '2017-03-20', '2017-03-21', '2017-03-22', '2017-03-23',
'2017-03-24', '2017-03-27', '2017-03-28', '2017-03-29', '2017-03-30', '2017-03-31', '2017-04-03', '2017-04-04',
'2017-04-05', '2017-04-06', '2017-04-07', '2017-04-10', '2017-04-11', '2017-04-12', '2017-04-13', '2017-04-17',
'2017-04-18', '2017-04-19', '2017-04-20', '2017-04-21'],
dtype='datetime64[ns]', name='Date', freq=None),
Index(['AAPL', 'GOOGL'], dtype='object')]
In [80]: p.loc['Adj Close']
Out[80]:
                  AAPL       GOOGL
Date
2017-01-03  115.648597  808.010010
2017-01-04  115.519154  807.770020
2017-01-05  116.106611  813.020020
2017-01-06  117.401002  825.210022
2017-01-09  118.476334  827.179993
2017-01-10  118.595819  826.010010
2017-01-11  119.233055  829.859985
2017-01-12  118.735214  829.530029
2017-01-13  118.526121  830.940002
2017-01-17  119.481976  827.460022
...                ...         ...
2017-04-07  143.339996  842.099976
2017-04-10  143.169998  841.700012
2017-04-11  141.630005  839.880005
2017-04-12  141.800003  841.460022
2017-04-13  141.050003  840.179993
2017-04-17  141.830002  855.130005
2017-04-18  141.199997  853.989990
2017-04-19  140.679993  856.510010
2017-04-20  142.440002  860.080017
2017-04-21  142.270004  858.950012
[76 rows x 2 columns]
Plot it:
In [81]: p.loc['Adj Close'].plot()
Out[81]: <matplotlib.axes._subplots.AxesSubplot at 0xdabfda0>
Examples of different slicing/indexing/selecting for the sample Panel:
In [118]: p
Out[118]:
<class 'pandas.core.panel.Panel'>
Dimensions: 6 (items) x 76 (major_axis) x 2 (minor_axis)
Items axis: Open to Adj Close
Major_axis axis: 2017-01-03 00:00:00 to 2017-04-21 00:00:00
Minor_axis axis: AAPL to GOOGL
By items axis (index):
In [119]: p.loc['Adj Close']
Out[119]:
                  AAPL       GOOGL
Date
2017-01-03  115.648597  808.010010
2017-01-04  115.519154  807.770020
2017-01-05  116.106611  813.020020
2017-01-06  117.401002  825.210022
2017-01-09  118.476334  827.179993
2017-01-10  118.595819  826.010010
2017-01-11  119.233055  829.859985
2017-01-12  118.735214  829.530029
2017-01-13  118.526121  830.940002
2017-01-17  119.481976  827.460022
...                ...         ...
2017-04-07  143.339996  842.099976
2017-04-10  143.169998  841.700012
2017-04-11  141.630005  839.880005
2017-04-12  141.800003  841.460022
2017-04-13  141.050003  840.179993
2017-04-17  141.830002  855.130005
2017-04-18  141.199997  853.989990
2017-04-19  140.679993  856.510010
2017-04-20  142.440002  860.080017
2017-04-21  142.270004  858.950012
[76 rows x 2 columns]
By major axis:
In [120]: p.loc[:, '2017-01-03']
Out[120]:
             Open        High         Low       Close      Volume   Adj Close
AAPL   115.800003  116.330002  114.760002  116.150002  28781900.0  115.648597
GOOGL  800.619995  811.440002  796.890015  808.010010   1959000.0  808.010010
By minor axis:
In [121]: p.loc[:, :, 'GOOGL']
Out[121]:
                  Open        High         Low       Close     Volume   Adj Close
Date
2017-01-03  800.619995  811.440002  796.890015  808.010010  1959000.0  808.010010
2017-01-04  809.890015  813.429993  804.109985  807.770020  1515300.0  807.770020
2017-01-05  807.500000  813.739990  805.919983  813.020020  1340500.0  813.020020
2017-01-06  814.989990  828.960022  811.500000  825.210022  2017100.0  825.210022
2017-01-09  826.369995  830.429993  821.619995  827.179993  1406800.0  827.179993
2017-01-10  827.070007  829.409973  823.140015  826.010010  1194500.0  826.010010
2017-01-11  826.619995  829.900024  821.469971  829.859985  1320200.0  829.859985
2017-01-12  828.380005  830.380005  821.010010  829.530029  1349500.0  829.530029
2017-01-13  831.000000  834.650024  829.520020  830.940002  1288000.0  830.940002
2017-01-17  830.000000  830.179993  823.200012  827.460022  1439700.0  827.460022
...                ...         ...         ...         ...        ...         ...
2017-04-07  845.000000  845.880005  837.299988  842.099976  1110000.0  842.099976
2017-04-10  841.539978  846.739990  840.789978  841.700012  1021200.0  841.700012
2017-04-11  841.700012  844.630005  834.599976  839.880005   971900.0  839.880005
2017-04-12  838.460022  843.719971  837.590027  841.460022  1126100.0  841.460022
2017-04-13  841.039978  843.729980  837.849976  840.179993  1067200.0  840.179993
2017-04-17  841.380005  855.640015  841.030029  855.130005  1044800.0  855.130005
2017-04-18  852.539978  857.390015  851.250000  853.989990   935200.0  853.989990
2017-04-19  857.390015  860.200012  853.530029  856.510010  1077500.0  856.510010
2017-04-20  859.739990  863.929993  857.500000  860.080017  1186900.0  860.080017
2017-04-21  860.619995  862.440002  857.729980  858.950012  1168200.0  858.950012
[76 rows x 6 columns]
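Note for newer pandas: Panel was removed in pandas 0.25, so the slices above only run on older versions. Here is a rough sketch of the same three selections on a DataFrame with two-level columns. The numbers are synthetic stand-ins (not real prices), only two items are used for brevity, and the names frames, wide, by_item, by_date and by_ticker are made up for this illustration:

```python
import numpy as np
import pandas as pd

# Synthetic stand-in data (NOT real prices): 3 dates x 2 tickers per item.
dates = pd.date_range("2017-01-03", periods=3, freq="D", name="Date")
tickers = ["AAPL", "GOOGL"]
frames = {
    "Open": pd.DataFrame(np.arange(6.0).reshape(3, 2), index=dates, columns=tickers),
    "Close": pd.DataFrame(np.arange(6.0).reshape(3, 2) + 10, index=dates, columns=tickers),
}

# Columns become a two-level MultiIndex: (item, ticker).
wide = pd.concat(frames, axis=1)

# Roughly p.loc['Close']: dates x tickers for one item.
by_item = wide["Close"]

# Roughly p.loc[:, '2017-01-03']: tickers x items for one date.
by_date = wide.loc[pd.Timestamp("2017-01-03")].unstack(level=0)

# Roughly p.loc[:, :, 'GOOGL']: dates x items for one ticker.
by_ticker = wide.xs("GOOGL", axis=1, level=1)
```

Any of the three results is an ordinary DataFrame, so calling .plot() on it works the same way as in the Panel examples.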
In your case (depending on your axes) you may want to slice your Panel differently:
Panel.loc[:, :, 'A'].plot()
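If you are on a pandas version without Panel (it was removed in 0.25), one possible way to get the same plot is to keep the DataFrames in a plain dict and pull the column out of each. This is only a sketch: the column names A to F, the keys start_1 and start_2, and the random data are assumptions standing in for the simulation output described in the question (2 DataFrames of 100 rows x 6 columns).

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Hypothetical simulation output: one 100 x 6 DataFrame per starting value.
columns = list("ABCDEF")
runs = {
    "start_1": pd.DataFrame(rng.random((100, 6)), columns=columns),
    "start_2": pd.DataFrame(rng.random((100, 6)), columns=columns),
}

# Collect column 'A' from every run into one DataFrame, one column per run.
same_col = pd.DataFrame({name: df["A"] for name, df in runs.items()})

# One line per run on a shared Axes, labelled by the dict keys.
ax = same_col.plot(title="Column A across runs")
```

Because each run becomes its own column, the legend entries come from the dict keys without any extra work.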
Solution 2:
Here's one approach, using Panel.apply(). The output of apply(plt.plot) is a minor_axis-by-items data frame of Line2D objects. plot() tries to plot an additional dimension that doesn't really make sense for our purposes, but we can use lines.pop() to remove the offending dimension. Hope this helps.
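import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# generate sample data
x = np.arange(20)
y1 = np.random.randint(100, size=20)
y2 = np.random.randint(100, size=20)
data = {'A1': pd.DataFrame({'y': y1, 'x': x}),
        'A2': pd.DataFrame({'y': y2, 'x': x})}
p = pd.Panel(data)  # pd.Panel requires pandas < 0.25

# plot panels; apply() returns a DataFrame of Line2D objects
df = p.apply(plt.plot)
# drop the extra lines from the shared Axes (.iloc replaces the deprecated .ix)
df.iloc[0, 0].axes.lines.pop(2)
df.iloc[0, 0].axes.lines.pop(0)
df.iloc[0, 0].axes.legend(loc="lower right")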
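For comparison, here is a sketch of a similar plot that avoids Panel and the lines.pop() cleanup entirely by looping over the dict of DataFrames. It is not the approach from this answer, just an alternative under the same sample data, and it assumes each DataFrame has the columns x and y as above:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# same kind of sample data as above
x = np.arange(20)
data = {
    "A1": pd.DataFrame({"x": x, "y": np.random.randint(100, size=20)}),
    "A2": pd.DataFrame({"x": x, "y": np.random.randint(100, size=20)}),
}

# draw y against x once per DataFrame, all on one Axes
fig, ax = plt.subplots()
for name, frame in data.items():
    ax.plot(frame["x"], frame["y"], label=name)
ax.legend(loc="lower right")
```

Only the wanted lines are ever drawn this way, so nothing has to be removed afterwards.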