# Advanced Plotting

### Getting Ready

```
# Raw strings (r"...") keep the Windows backslashes and the regex separator literal
dat2 = pd.read_csv(r"C:\Users\duan\Desktop\PythonDataProcessingVisualization\meanByClass.txt", sep=r'\s+')
```

```
dat2
```

<figure><img src="/files/JGBBoQ0DD9maKfz8WhZ3" alt=""><figcaption></figcaption></figure>

Explore a fake gene expression dataset, modified from iris.csv

```
rpkm = pd.read_csv(r"C:\Users\duan\Desktop\PythonDataProcessingVisualization\fakeExpressionDat.csv")
```

```
rpkm
```

<figure><img src="/files/qxALaE5pHZgOMXonU0jR" alt=""><figcaption></figcaption></figure>
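The two files above live on a local disk, so the following is a minimal sketch of stand-in data for readers who want to run the plotting code without them. The column names for `fake_dat2` and the `pathway` column are taken from the code on this page; the gene names, pathway labels, row counts, and all values are assumptions made up for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # fixed seed so the fake numbers are reproducible

# Stand-in for dat2: one row per replicate, one column per group mean.
# Column names come from the plotting code on this page; values are made up.
fake_dat2 = pd.DataFrame({
    "replicate": np.arange(1, 7),
    "WtTypeA": rng.normal(5.0, 0.5, 6),
    "WtTypeB": rng.normal(3.0, 0.5, 6),
    "KOTypeA": rng.normal(1.5, 0.5, 6),
    "KOTypeB": rng.normal(0.5, 0.5, 6),
})

# Stand-in for rpkm: four numeric columns plus a 'pathway' class column.
# Gene names and pathway labels are hypothetical.
pathways = np.repeat(["pathway1", "pathway2", "pathway3"], 10)
centers = np.repeat([1.0, 3.0, 5.0], 10)  # a different mean per pathway
fake_rpkm = pd.DataFrame({
    "gene1": rng.normal(centers, 0.3),
    "gene2": rng.normal(centers * 0.5, 0.3),
    "gene3": rng.normal(centers + 1.0, 0.3),
    "gene4": rng.normal(6.0 - centers, 0.3),
    "pathway": pathways,
})
```

Assigning `dat2 = fake_dat2` and `rpkm = fake_rpkm` should let the remaining cells run, although the resulting figures will not match the ones shown here.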

### Advanced Line Plot

```
plt.figure(); dat2.plot(); plt.legend(loc='best')
```

<figure><img src="/files/hwCKVRkFDpFI0PXYO1kS" alt=""><figcaption></figcaption></figure>

Get rid of the legend

```
dat2.plot(legend=False)
```

<figure><img src="/files/MUrE2UEKcUy0LKfGQODC" alt=""><figcaption></figcaption></figure>

Separate the features

```
dat2.plot(subplots=True, figsize=(6, 6)); plt.legend(loc='best')
```

<figure><img src="/files/SByBdHHjAgg8dlqe5kyF" alt=""><figcaption></figcaption></figure>

Plotting on a Secondary Y-axis

```
plt.figure()
dat2.WtTypeA.plot(color="b")
dat2.WtTypeB.plot(color="turquoise")
dat2.KOTypeA.plot(color="r")
dat2.KOTypeB.plot(color="pink")
dat2.replicate.plot(secondary_y=True, style='g')
```

<figure><img src="/files/IvHEhwk9INIuiaiesU6C" alt=""><figcaption></figcaption></figure>

Plot a subset of columns

```
plt.figure()
dat2.WtTypeA.plot(color="b")
dat2.WtTypeB.plot(color="turquoise")
dat2.KOTypeA.plot(color="r")
dat2.KOTypeB.plot(color="pink")
```

<figure><img src="/files/qdmSLwG9PDwOmIpkAlCb" alt=""><figcaption></figcaption></figure>

Selective Plotting on Secondary Y-axis

```
plt.figure()
dat3=dat2.drop(['replicate'], axis = 1)
ax = dat3.plot(secondary_y=['WtTypeA', 'KOTypeA'])  # column names are case-sensitive
ax.set_ylabel('TypeB scale')
ax.right_ax.set_ylabel('TypeA scale')
```

<figure><img src="/files/An6XVUdpYKWcoIF3aWt6" alt=""><figcaption></figcaption></figure>

Targeting different subplots by passing an ax argument

```
fig, axes = plt.subplots(nrows=2, ncols=2)
dat2['WtTypeA'].plot(ax=axes[0,0]); axes[0,0].set_title('WtTypeA')
dat2['KOTypeA'].plot(ax=axes[0,1]); axes[0,1].set_title('KOTypeA')
dat2['WtTypeB'].plot(ax=axes[1,0]); axes[1,0].set_title('WtTypeB')
dat2['KOTypeB'].plot(ax=axes[1,1]); axes[1,1].set_title('KOTypeB')
```

<figure><img src="/files/AtP5Y5yRCUgEN7EFhgER" alt=""><figcaption></figcaption></figure>
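The four near-identical lines above can also be written as a loop. The sketch below assumes a small made-up DataFrame named `demo` with the same column names as `dat2`; the `Agg` backend line is only there so the example runs without a display and can be dropped in a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; not needed in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Made-up data with the same column names as dat2 (values are illustrative only).
demo = pd.DataFrame(
    np.arange(20, dtype=float).reshape(5, 4),
    columns=["WtTypeA", "KOTypeA", "WtTypeB", "KOTypeB"],
)

fig, axes = plt.subplots(nrows=2, ncols=2)
# axes.flat walks the 2x2 grid row by row, so the columns fill
# left to right, then top to bottom.
for ax, column in zip(axes.flat, demo.columns):
    demo[column].plot(ax=ax)
    ax.set_title(column)
```

The loop keeps each column paired with its own axes, so adding a panel only means changing the grid size and the column list.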

Adjusting spacing between subplots

```
fig, axes = plt.subplots(nrows=2, ncols=2)
dat2['WtTypeA'].plot(ax=axes[0,0]); axes[0,0].set_title('WtTypeA')
dat2['KOTypeA'].plot(ax=axes[0,1]); axes[0,1].set_title('KOTypeA')
dat2['WtTypeB'].plot(ax=axes[1,0]); axes[1,0].set_title('WtTypeB')
dat2['KOTypeB'].plot(ax=axes[1,1]); axes[1,1].set_title('KOTypeB')
plt.subplots_adjust(left=0.1,
                    bottom=0.1,
                    right=0.9,
                    top=0.9,
                    wspace=0.4,
                    hspace=0.4)
```

<figure><img src="/files/X8Dpk7IgmK2CZaptt1Fv" alt=""><figcaption></figcaption></figure>

### Advanced Bar Plots

Looking at one replicate at a time

```
plt.figure();
dat2.iloc[1].plot(kind='bar'); plt.axhline(0, color='k')
```

<figure><img src="/files/Dm4SlxIx6ZV2pQU63Mqk" alt=""><figcaption></figcaption></figure>

Looking at all replicates at the same time

```
plt.figure();
dat2.plot(kind='bar'); plt.axhline(0, color='k')
```

<figure><img src="/files/fFJ1e5fORbjWuRkWfYo3" alt=""><figcaption></figcaption></figure>

```
plt.figure();
dat2.plot(kind='bar', colormap='Greens')
```

<figure><img src="/files/JmoPDVWl8KviQotbLRw2" alt=""><figcaption></figcaption></figure>

Stacked bars

```
dat3.plot(kind='bar', stacked=True);
```

<figure><img src="/files/9VMkGoGHfcB5PKOvjCLF" alt=""><figcaption></figcaption></figure>

### Advanced Histogram

```
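# `dat` is not defined on this page; it is assumed to be a DataFrame
# with a 'genotype' column loaded earlier in the course
plt.figure()
dat.hist(by="genotype", figsize=(6, 4), bins=20)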
```

<figure><img src="/files/jDSkPjaOmcBiNxmSWyLS" alt=""><figcaption></figcaption></figure>

### Scatter Plot

```
from pandas.plotting import scatter_matrix
rpkm = pd.read_csv(r"C:\Users\duan\Desktop\IntroductionToMatplotlib\fakeExpressionDat.csv")
rpkm
```

<figure><img src="/files/qYNO8FB5iJveH9zbagc6" alt=""><figcaption></figcaption></figure>

```
scatter_matrix(rpkm, alpha=0.9, figsize=(6, 6), diagonal='kde')
```

<figure><img src="/files/dUbygXLBpIBtgix8FaOX" alt=""><figcaption></figcaption></figure>

### Parallel Coordinates

Parallel coordinates is a technique for plotting multivariate data. It lets you see clusters in the data and estimate other statistics visually. Each vertical line represents one attribute, and each data point is drawn as a set of connected line segments crossing those lines. Points that tend to cluster appear closer together.

```
from pandas.plotting import parallel_coordinates
plt.figure()
parallel_coordinates(rpkm, 'pathway')
```

<figure><img src="/files/GlzMr9YxmmbUpzlxcYHP" alt=""><figcaption></figcaption></figure>
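To make the description of parallel coordinates concrete, here is a minimal sketch that draws the same kind of plot by hand with plain Matplotlib. It is not how `parallel_coordinates` is implemented; the attribute names, class labels, and values are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; not needed in a notebook
import matplotlib.pyplot as plt
import numpy as np

attributes = ["gene1", "gene2", "gene3"]  # hypothetical attribute names
points = np.array([
    [1.0, 2.0, 1.5],  # class "a"
    [1.1, 2.1, 1.4],  # class "a"
    [3.0, 0.5, 2.5],  # class "b"
])
labels = ["a", "a", "b"]
colors = {"a": "tab:blue", "b": "tab:orange"}

fig, ax = plt.subplots()
x = np.arange(len(attributes))  # one x position per attribute
for row, label in zip(points, labels):
    # One polyline per data point, colored by its class.
    ax.plot(x, row, color=colors[label])
for xi in x:
    # The vertical line that stands for each attribute.
    ax.axvline(xi, color="k", linewidth=0.5)
ax.set_xticks(x)
ax.set_xticklabels(attributes)
```

The two class "a" rows have similar values, so their polylines run almost on top of each other, while the class "b" row follows a visibly different path.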

```
from pandas.plotting import parallel_coordinates
plt.figure()
parallel_coordinates(rpkm, 'pathway',colormap='gist_rainbow')
```

<figure><img src="/files/clnlE3mGxXOQNfzGFR02" alt=""><figcaption></figcaption></figure>

```
from pandas.plotting import parallel_coordinates
plt.figure()
parallel_coordinates(rpkm, 'pathway',colormap='spring')
```

<figure><img src="/files/eBF3hbBIEQN6P7aqIg6i" alt=""><figcaption></figcaption></figure>

```
from pandas.plotting import parallel_coordinates
plt.figure()
parallel_coordinates(rpkm, 'pathway',colormap='autumn')
```

<figure><img src="/files/Boivz1iLBju4qxdtuXbj" alt=""><figcaption></figcaption></figure>

### Andrews Curves

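Andrews curves can be read as a smoothed version of parallel coordinates: each data point is drawn as a finite Fourier series whose coefficients are its attribute values, so similar points produce similar curves.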

```
from pandas.plotting import andrews_curves
```

```
plt.figure()
andrews_curves(rpkm, 'pathway')
plt.show()
```

<figure><img src="/files/FjywLDvjPuqDiCFQXVTZ" alt=""><figcaption></figcaption></figure>
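As a sketch of what each curve represents, the function below evaluates the standard Andrews formula, f(t) = x1/sqrt(2) + x2 sin(t) + x3 cos(t) + x4 sin(2t) + ..., for a single row. The function name `andrews_curve` and the sample values are hypothetical, and this is an illustration of the formula rather than the pandas source code.

```python
import numpy as np

def andrews_curve(row, t):
    """Evaluate the Andrews curve of one observation `row` at angles `t`."""
    row = np.asarray(row, dtype=float)
    t = np.asarray(t, dtype=float)
    # The first attribute contributes a constant term x1 / sqrt(2).
    result = np.full_like(t, row[0] / np.sqrt(2.0))
    for i, value in enumerate(row[1:], start=1):
        harmonic = (i + 1) // 2  # 1, 1, 2, 2, 3, 3, ...
        if i % 2 == 1:
            result += value * np.sin(harmonic * t)  # x2, x4, ... use sine
        else:
            result += value * np.cos(harmonic * t)  # x3, x5, ... use cosine
    return result

t = np.linspace(-np.pi, np.pi, 200)
curve = andrews_curve([5.1, 3.5, 1.4, 0.2], t)  # made-up attribute values
```

Plotting `curve` against `t` for every row, colored by class, gives the same kind of figure as `andrews_curves`.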

A potential issue when plotting a large number of columns is that some series can be difficult to distinguish because the default colors repeat. To remedy this, we can either pass an explicit list of colors, for example by sampling `cm.rainbow`, or use the `colormap=` argument supported by DataFrame plotting, which accepts either a Matplotlib colormap or the name of a colormap registered with Matplotlib.

```
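from matplotlib import cm
import numpy as np

plt.figure()
# Sample three evenly spaced colors from the rainbow colormap
andrews_curves(rpkm, 'pathway', color=[cm.rainbow(i) for i in np.linspace(0, 1, 3)])
plt.show()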
```

<figure><img src="/files/dIYhNaCShS1rgTWx8nMr" alt=""><figcaption></figcaption></figure>

```
plt.figure()
andrews_curves(rpkm, 'pathway',colormap='jet')
plt.show()
```

<figure><img src="/files/4s78AsaSw7mXbWM655c2" alt=""><figcaption></figcaption></figure>

```
plt.figure()
andrews_curves(rpkm, 'pathway',colormap="winter")
plt.show()
```

<figure><img src="/files/8HwojQVhWrumoM6IACxa" alt=""><figcaption></figcaption></figure>

### RadViz

```
from pandas.plotting import radviz
plt.figure()
radviz(rpkm, 'pathway')
plt.show()
```

<figure><img src="/files/1CFiY4mumkdZTHZZgRXn" alt=""><figcaption></figcaption></figure>

```
from pandas.plotting import radviz
plt.figure()
radviz(rpkm, 'pathway',colormap="Set1")
plt.show()
```

<figure><img src="/files/ya1GIqHa5r9UUC3K8d6s" alt=""><figcaption></figcaption></figure>

