Editing DataFrames

You can easily add new columns or change the values in existing columns.

Add a new column

Be sure that the column you are adding has the same index as the existing dataframe. pandas aligns values by index label, so any row without a matching label ends up as NaN.
This is most easily accomplished by computing the new column from an existing column and assigning the result.
Example: dataframe['new_column_name'] = dataframe['old_column']*value
Example: dataframe['new_column_name'] = dataframe['old_column1']*dataframe['old_column2']
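
A minimal sketch of both patterns, plus what happens when the indices do not match. The dataframe, column names, and values are made up for illustration and are not part of the mass spec data.

```python
import pandas as pd

# Hypothetical toy data; not the tutorial's mass spec table.
dataframe = pd.DataFrame({"old_column1": [1.0, 2.0, 3.0],
                          "old_column2": [10.0, 20.0, 30.0]})

# New column from an existing column times a scalar.
dataframe["scaled"] = dataframe["old_column1"] * 2

# New column from two existing columns, multiplied row by row.
dataframe["product"] = dataframe["old_column1"] * dataframe["old_column2"]

# A Series with a different index is aligned by label;
# rows with no matching label become NaN.
other = pd.Series([100.0, 200.0], index=[1, 5])
dataframe["misaligned"] = other

print(dataframe)
```

Only the row labelled 1 receives a value in the last column, which is why deriving new columns from existing ones is the safer habit.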

Alter a column

You can select a set of cells using the tools from above and then change their values.
Note that dataframes are MUTABLE! The assignment changes the dataframe in place.
Example: dataframe.loc[row_selection,column_selection] = 7
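
A small sketch of in-place assignment with .loc, assuming a boolean mask for the rows and a column label for the columns. The data and the names rows and backup are hypothetical.

```python
import pandas as pd

# Hypothetical toy data for illustration only.
dataframe = pd.DataFrame({"a": [1, 2, 3, 4], "b": [10, 20, 30, 40]})

# Row selection: a boolean mask. Column selection: a column label.
rows = dataframe["a"] > 2
dataframe.loc[rows, "b"] = 7

# The dataframe itself has changed; no new object was returned.
print(dataframe)

# To keep the original untouched, work on an explicit copy instead.
backup = dataframe.copy()
backup.loc[:, "a"] = 0
print(backup)
```

Because the change happens in place, making a copy first is one way to protect data you may want to come back to.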

Dealing with missing data

The dataframe.dropna() and dataframe.fillna() functions are super helpful for removing or replacing missing values.
Example: only_complete_rows = dataframe.dropna(how='any')
Example: replace_with_0 = dataframe.fillna(value=0.0)
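
The two examples above can be tried on a toy dataframe with a few missing values. This is only a sketch; the columns x and y are made up.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with a few missing values (NaN).
dataframe = pd.DataFrame({"x": [1.0, np.nan, 3.0],
                          "y": [4.0, 5.0, np.nan]})

# Keep only the rows that have no missing value in any column.
only_complete_rows = dataframe.dropna(how="any")

# Replace every missing value with 0.0.
replace_with_0 = dataframe.fillna(value=0.0)

# Both calls return new dataframes; the original still contains its NaNs.
print(only_complete_rows)
print(replace_with_0)
print(dataframe.isna().sum())
```

Note that neither call modifies the original dataframe here, so the result has to be assigned to a name to be kept.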

# what is this line doing?
ms['light +1 charge mass']=ms['light Precursor Mz']*ms['Precursor Charge'] - ((ms['Precursor Charge']-1)*1.0078)
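
One way to check your reading of this line is to run the same expression on a tiny table where the answer can be worked out by hand. The dataframe toy and its values are hypothetical; only the column names are taken from ms.

```python
import pandas as pd

# Hypothetical two-row table using the same column names as ms.
toy = pd.DataFrame({"light Precursor Mz": [500.0, 400.0],
                    "Precursor Charge": [2, 3]})

# Same expression as above, applied to the toy table.
toy["light +1 charge mass"] = (
    toy["light Precursor Mz"] * toy["Precursor Charge"]
    - (toy["Precursor Charge"] - 1) * 1.0078
)

# By hand: 500.0*2 - 1*1.0078 = 998.9922
#          400.0*3 - 2*1.0078 = 1197.9844
print(toy)
```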

ms[['Peptide Modified Sequence', 'light Precursor Mz', 'Precursor Charge', 'light +1 charge mass']]

#think through what this line is doing
ms.loc[ms['light +1 charge mass']>2000,['Peptide Modified Sequence', 'light Precursor Mz', 'Precursor Charge', 'light +1 charge mass']]

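#note: this writes a string into a numeric column; newer pandas versions may warn about or reject this
#unless the column is converted first, e.g. with .astype(object)
ms.loc[ms['light +1 charge mass']>2000,'light +1 charge mass'] = 'way too big!'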

#you can quickly save your work as a .csv using the command .to_csv(path_to_file)
#the r prefix makes this a raw string, so the backslashes in the Windows path are taken literally
ms.to_csv(r"C:\Users\duan\Desktop\PythonDataProcessingVisualization\mass_spec_new.csv")
#look in your directory for a new .csv file!
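
A portable sketch of saving and then checking the file, assuming a temporary folder instead of a fixed Windows path. The toy dataframe, the file name example.csv, and the choice of index=False are illustrative assumptions, not part of the tutorial's data.

```python
import os
import tempfile

import pandas as pd

# Hypothetical toy data standing in for the ms dataframe.
dataframe = pd.DataFrame({"peptide": ["AAK", "GGR"], "mass": [1.5, 2.5]})

# Write to a temporary folder so the example runs on any machine.
out_dir = tempfile.mkdtemp()
path_to_file = os.path.join(out_dir, "example.csv")

# index=False leaves the row labels out of the file.
dataframe.to_csv(path_to_file, index=False)

# Read the file back to confirm what was saved.
reloaded = pd.read_csv(path_to_file)
print(reloaded)
```

Reading the file back with pd.read_csv is a quick way to confirm that the columns and values were written as expected.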
