# Making DataFrames

### Provide a list or numpy array

When no labels are supplied, pandas numbers both the rows and the columns with a default integer index starting at 0.

```
import pandas as pd
import numpy as np
```

```
z = np.array([[1,2,3,4,5],[6,7,8,9,10]])
z
```

<figure><img src="https://498238201-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FWuHhstIreJ3jFvE4gQ3y%2Fuploads%2FaBydrLZ1OMNg22am4uOr%2Fimage.png?alt=media&#x26;token=12b67e97-9372-4916-90fc-be2aaaeab152" alt=""><figcaption></figcaption></figure>

```
pd.DataFrame(z) # note how this differs from the NumPy array z above: rows and columns now carry labels
```

<figure><img src="https://498238201-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FWuHhstIreJ3jFvE4gQ3y%2Fuploads%2FbVHBBppM3Hzfsfkgfnjb%2Fimage.png?alt=media&#x26;token=66e00d31-f8be-4096-b320-6a6d89abea5a" alt=""><figcaption></figcaption></figure>
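
The default labels can be inspected directly, and they can be replaced when the DataFrame is built. The sketch below is a minimal illustration; the label names (`first`, `second`, `c1` to `c5`) are made up for this example and are not part of the original walkthrough.

```python
import numpy as np
import pandas as pd

z = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])

# Default labels: both axes are numbered from 0
df_default = pd.DataFrame(z)
print(df_default.index.tolist())    # row labels
print(df_default.columns.tolist())  # column labels

# Hypothetical labels, chosen only for this illustration
df_labeled = pd.DataFrame(
    z,
    index=["first", "second"],
    columns=["c1", "c2", "c3", "c4", "c5"],
)
print(df_labeled)
```

The `index` and `columns` arguments must match the shape of the data: two row labels and five column labels here.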

```
my_list = [['a', 'b', 'c'], [10,5,2.5], [3,2,1]]
print(my_list)
df = pd.DataFrame(my_list)
df # each inner list becomes one row, so the letters are data in row 0, not column headers
```

\[\['a', 'b', 'c'], \[10, 5, 2.5], \[3, 2, 1]]

<figure><img src="https://498238201-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FWuHhstIreJ3jFvE4gQ3y%2Fuploads%2FokxOmDCS62RtYtuQzKlL%2Fimage.png?alt=media&#x26;token=f5b09a0a-db14-4710-a2a2-9d90df7b2cf7" alt=""><figcaption></figcaption></figure>

```
df.shape
```

(3, 3)
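
Because every inner list is treated as a row, the letters end up as data and each column mixes strings with numbers. One possible way to use the first inner list as headers instead is sketched below; `df_rows` and `df_named` are names introduced only for this example.

```python
import pandas as pd

my_list = [['a', 'b', 'c'], [10, 5, 2.5], [3, 2, 1]]

# Each inner list is one row, so the letters land in row 0 as data
df_rows = pd.DataFrame(my_list)
print(df_rows.dtypes)  # columns that mix strings and numbers are stored as object

# Use the first inner list as column headers and the rest as data
df_named = pd.DataFrame(my_list[1:], columns=my_list[0])
print(df_named)
print(df_named.dtypes)  # the columns are now numeric
```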

### Provide a dictionary

```
dictionary = {'a':[10,3], 'b':[5,2], 'c':[2.5,1]}
df = pd.DataFrame(dictionary)
df # the dictionary keys become column headers, unlike the list-built DataFrame above
```

<figure><img src="https://498238201-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FWuHhstIreJ3jFvE4gQ3y%2Fuploads%2Fg6DJlRLDeW3UdR3X14Tl%2Fimage.png?alt=media&#x26;token=2a5e3f5e-75a2-4efd-8846-33cef14b4863" alt=""><figcaption></figcaption></figure>

```
df.shape #note the new shape
```

(2, 3)
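
By default each dictionary key becomes a column. If the keys should label rows instead, `pd.DataFrame.from_dict` with `orient='index'` is one option. This is a small sketch; `by_column` and `by_row` are names used only here.

```python
import pandas as pd

dictionary = {'a': [10, 3], 'b': [5, 2], 'c': [2.5, 1]}

# Default: each key becomes a column
by_column = pd.DataFrame(dictionary)

# orient='index': each key becomes a row label instead
by_row = pd.DataFrame.from_dict(dictionary, orient='index')

print(by_column.shape)
print(by_row.shape)
print(by_row.index.tolist())
```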

```
df.columns # returns an Index of the column headers; use list(df.columns) for a plain list
```

Index(\['a', 'b', 'c'], dtype='object')

```
dictionary = {'a':{'row1':3, 'row2':2}, 'b':{'row1':5,'row2':2}, 'c':{'row1':2.5,'row2':1}}
df = pd.DataFrame(dictionary)
df # the keys of the inner dictionaries become the row labels (the index)
```

<figure><img src="https://498238201-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FWuHhstIreJ3jFvE4gQ3y%2Fuploads%2FrgOd6F0VczT1UMQytaiu%2Fimage.png?alt=media&#x26;token=3407d2ce-5acc-4d25-9d9e-7918a6a9d88f" alt=""><figcaption></figcaption></figure>

```
df.index # returns an Index of the row labels; use list(df.index) for a plain list
```

Index(\['row1', 'row2'], dtype='object')

```
dictionary2 = {'a':{'row1':3, 'row2':2}, 'b':{'row3':5,'row4':2}, 'c':{'row5':2.5,'row6':1}}
df2 = pd.DataFrame(dictionary2)
df2 # row labels that differ between columns are combined; cells with no value become NaN
```

<figure><img src="https://498238201-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FWuHhstIreJ3jFvE4gQ3y%2Fuploads%2FqhPNZ39FmYViy4ZS1tjb%2Fimage.png?alt=media&#x26;token=443ad32c-ebaf-43ab-99a7-a51dccade0b0" alt=""><figcaption></figcaption></figure>
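
When the inner dictionaries do not share row labels, pandas keeps every label it sees and fills the gaps with `NaN`. The short check below, a sketch reusing `dictionary2` from above, counts those gaps.

```python
import pandas as pd

dictionary2 = {'a': {'row1': 3, 'row2': 2},
               'b': {'row3': 5, 'row4': 2},
               'c': {'row5': 2.5, 'row6': 1}}
df2 = pd.DataFrame(dictionary2)

print(df2.shape)          # six distinct row labels, three columns
print(df2.isna().sum())   # number of missing cells in each column
print(df2['a'].dropna())  # only the rows that column 'a' actually supplied
```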

### Read a CSV file

```
#pandas has some great methods to read .csv files
pd.read_csv?
```

```
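# a raw string (r"...") keeps the backslashes in a Windows path from being read as escape sequences
ms = pd.read_csv(r"C:\Users\duan\Desktop\PythonDataProcessingVisualization\mass_spec.csv")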
ms
```

<figure><img src="https://498238201-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FWuHhstIreJ3jFvE4gQ3y%2Fuploads%2FCPbSDwRnlEvFJ8v1QxTx%2Fimage.png?alt=media&#x26;token=dc223f1f-585f-41b5-8865-c459bbc87feb" alt=""><figcaption></figcaption></figure>
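
To try `read_csv` without a file on disk, the text can be wrapped in `io.StringIO`. The sketch below uses made-up data: the column names `sample`, `mz`, and `intensity` are assumptions for illustration and do not describe the real `mass_spec.csv`.

```python
import io
import pandas as pd

# Made-up CSV text standing in for a file on disk; not the real mass_spec.csv
csv_text = "sample,mz,intensity\ns1,100.5,2000\ns2,250.1,1500\n"

# read_csv accepts a path or any file-like object
demo = pd.read_csv(io.StringIO(csv_text))
print(demo)

# index_col turns one column into the row labels
demo_indexed = pd.read_csv(io.StringIO(csv_text), index_col="sample")
print(demo_indexed.index.tolist())
```

The first line of the text supplies the column headers, and the rows get the default integer index unless `index_col` is given.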

### Read an Excel file

```
#pandas has a read_excel function to read Excel files
pd.read_excel?
```

```
# read_excel expects an Excel workbook; a plain-text CSV should be read with pd.read_csv instead
excelf = pd.read_excel(r"C:\Users\duan\Desktop\PythonDataProcessingVisualization\excelfile.csv")
excelf
```

<figure><img src="https://498238201-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FWuHhstIreJ3jFvE4gQ3y%2Fuploads%2FzCnYuFErql7GRjJTXjmN%2Fimage.png?alt=media&#x26;token=2e3bcf24-29fd-4b0d-8029-dd64c00dfc0d" alt=""><figcaption></figcaption></figure>
