Functions


Functions are like machines

  • Learning to write your own functions will greatly increase the complexity of the programs that you can write

  • A function is a black box: it takes some input, does something with it, and returns some output

  • Functions hide details away, allowing you to solve problems at a higher level without getting bogged down

Examples

  • A typical function looks like this:

def function_name(function_arguments):
   """optional string describing the function"""
   statements ...
   return result
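
As a concrete instance of the template above, here is a minimal sketch. The name gc_content and the example sequence are illustrative assumptions, not part of the original course material.

```python
def gc_content(dna):
    """Return the fraction of G and C bases in a DNA string."""
    dna = dna.upper()
    if not dna:
        return 0.0  # avoid dividing by zero for an empty sequence
    gc = dna.count("G") + dna.count("C")
    return gc / len(dna)


fraction = gc_content("ATGGAATTCCGT")
print(fraction)
```

The docstring is what help(gc_content) displays, so it is worth writing even for short functions.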

  • A function example: sum

In [1]: numbers=[1,2,3,4,5]

In [2]: sum(numbers)
Out[2]: 15

  • Anatomy of the sum function

1. Initialize the sum to zero
2. Loop over each number, adding it to the sum variable
3. Return the value of the sum

def sum(xs):
   """Given a sequence of numbers, return the sum."""
   s=0
   for x in xs:
      s=s+x
   return s

  • Writing your own function

Open a text editor, type the following and save it as MyMathFunctions.py:
def mysum(numbers):
        """Given a sequence of numbers, return the sum"""
        s=0
        # iterate over the sequence itself; range(numbers) fails on a list
        for x in numbers:
                s=s+x
        return s

def myproduct(numbers):
        """Given a sequence of numbers, return the product"""
        p=1
        for x in numbers:
                p=p*x
        return p
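
One way to check functions like these is to compare them with Python's built-ins. The sketch below repeats the two definitions, iterating directly over the sequence, so that it runs on its own. Note that math.prod requires Python 3.8 or later.

```python
import math


def mysum(numbers):
    """Given a sequence of numbers, return the sum."""
    s = 0
    for x in numbers:
        s = s + x
    return s


def myproduct(numbers):
    """Given a sequence of numbers, return the product."""
    p = 1
    for x in numbers:
        p = p * x
    return p


numbers = [1, 2, 3, 4]
# Compare against the built-in sum and math.prod (Python 3.8+).
assert mysum(numbers) == sum(numbers)
assert myproduct(numbers) == math.prod(numbers)
print(mysum(numbers), myproduct(numbers))
```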

  • Importing a function from a file (module)

    • Once we import the MyMathFunctions module we just wrote, we can use the mysum and myproduct functions just like built-in functions

    • To call a function, give the function name followed by parentheses containing a value for each expected argument

In [1]: import MyMathFunctions

In [2]: numbers=[1,2,3,4]

In [3]: MyMathFunctions.mysum(numbers)
Out[3]: 10

In [4]: MyMathFunctions.myproduct(numbers)
Out[4]: 24
Too many keystrokes? Try the following:

In [5]: import MyMathFunctions as f

In [6]: f.mysum(numbers)
Out[6]: 10
Still too many keystrokes? Try the following:
In [7]: import MyMathFunctions

In [8]: a=MyMathFunctions.mysum

In [9]: a(numbers)
Out[9]: 10
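
Another option is the from ... import ... form, which brings a single name into the current namespace. The sketch below uses the standard math module rather than MyMathFunctions so that it runs without the file on disk; the same syntax would apply to a line such as from MyMathFunctions import mysum.

```python
import math                  # plain import: names are reached as math.sqrt
import math as m             # aliased import: m.sqrt
from math import sqrt        # import one name directly: sqrt

# All three spellings refer to the same function object.
print(math.sqrt(16.0))
print(m.sqrt(16.0))
print(sqrt(16.0))
```

Importing a name directly saves typing, but it can silently replace an existing name in your program, so the aliased form is often the safer shortcut.
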
  • Function arguments

    • We can define functions with more than one argument

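Example: restriction.py
def finder(DNA,enzyme):
        """Print where the enzyme's recognition site starts in DNA (0-based index)"""
        name=['ECOR1','BAMH1','HINDIII']
        site=['GAATTC','GGATCC','AAGCTT']
        # map each enzyme name to its recognition site
        db=dict(zip(name,site))
        # upper-case both inputs so the lookup is case-insensitive
        DNA=DNA.upper()
        enzyme=enzyme.upper()
        # str.find returns -1 when the site is absent
        index=DNA.find(db[enzyme])
        if index>-1:
                print("The restriction site starts at base pair %d\n" %index)
        else:
                print("No such restriction site\n")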

In [1]: import restriction
In [2]: restriction.finder('ATGGAATTCCGT','EcoR1')
The restriction site starts at base pair 3

In [3]: restriction.finder('ATGGAATTCCGT','BamH1')
No such restriction site
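
The finder function above prints its result and raises a KeyError for an enzyme that is not in its dictionary. A variant that returns the position instead may be easier to reuse in other code. The sketch below is an alternative under stated assumptions, not the course's code: the name find_site, the 1-based position, and the None return value are all illustrative choices.

```python
# Recognition sites keyed by upper-case enzyme name (same three as above).
SITES = {"ECOR1": "GAATTC", "BAMH1": "GGATCC", "HINDIII": "AAGCTT"}


def find_site(dna, enzyme):
    """Return the 1-based start of the first site, or None if absent."""
    site = SITES.get(enzyme.upper())
    if site is None:
        raise ValueError("unknown enzyme: " + enzyme)
    index = dna.upper().find(site)  # str.find returns -1 when not found
    return index + 1 if index > -1 else None


print(find_site("ATGGAATTCCGT", "EcoR1"))  # 0-based index 3, so position 4
print(find_site("ATGGAATTCCGT", "BamH1"))  # site absent, so None
```
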
  • Arguments with default values do not need to be supplied when calling a function. If they are supplied, they override the default values

Example: restriction_BamH1_default.py
def finder(DNA,enzyme='BamH1'):
        """Print where the enzyme's site starts in DNA; enzyme defaults to BamH1"""
        name=['ECOR1','BAMH1','HINDIII']
        site=['GAATTC','GGATCC','AAGCTT']
        # map each enzyme name to its recognition site
        db=dict(zip(name,site))
        DNA=DNA.upper()
        enzyme=enzyme.upper()
        # str.find returns -1 when the site is absent
        index=DNA.find(db[enzyme])
        if index>-1:
                print("The restriction site starts at base pair %d" %index)
        else:
                print("No such restriction site\n")

In [1]: import restriction_BamH1_default

In [2]: restriction_BamH1_default.finder('ATGGAATTCCGT')
No such restriction site


In [3]: restriction_BamH1_default.finder('ATGGAATTCCGT','Ecor1')
The restriction site starts at base pair 3
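
Arguments can also be passed by name, which makes calls that rely on defaults easier to read. A minimal sketch, using a hypothetical count_base function that is not part of the course files:

```python
def count_base(dna, base="G"):
    """Count occurrences of one base; base defaults to "G"."""
    return dna.upper().count(base.upper())


seq = "ATGGAATTCCGT"
print(count_base(seq))            # uses the default base "G"
print(count_base(seq, "a"))       # positional argument overrides the default
print(count_base(seq, base="T"))  # keyword argument overrides the default
```
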
  • Some existing Python modules

    • The os module provides a platform-independent way to work with the operating system, for example to make or remove files and directories

    • The csv module provides readers and writers for comma-separated value data

    • The sys module contains objects and functions related to how Python was compiled and how it was called when executed

    • The glob module provides the glob function, which performs file globbing similar to what the Unix shell provides

    • The math module provides common algebraic and trigonometric functions along with several mathematical constants

    • The re module provides access to powerful regular expressions

    • The datetime module provides time and datetime objects, allowing easy comparison of times and dates

    • The time module provides time-related functions, which can be used for simple estimates of how long a command takes

    • The pickle module provides a way to save Python objects to a file that you can unpickle later in a different program

    • The pip tool installs packages from the Python Package Index (PyPI); PyPI itself is a package repository, not a module

    • The numpy module is the de facto standard for numerical computing

    • The pandas module is useful for managing tabular data

    • The matplotlib module is the most frequently used plotting package in Python

    • The seaborn module is a module based on matplotlib. It provides a high-level interface for drawing attractive graphics.
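
Here are a few of the standard-library modules above in action. This is a minimal sketch; the example values (the sequence, the dates, the dictionary, the file path) are made up for illustration. The third-party packages (numpy, pandas, matplotlib, seaborn) are left out so that the snippet runs on a plain Python installation.

```python
import datetime
import math
import os
import pickle
import re
import time

start = time.time()  # wall-clock timestamp, used for timing below

# os: build a path portably from the current working directory
path = os.path.join(os.getcwd(), "results", "sites.csv")

# math: constants and functions
area = math.pi * 2.0 ** 2

# re: find every GAATTC site in a sequence
starts = [m.start() for m in re.finditer("GAATTC", "ATGGAATTCCGTGAATTC")]

# datetime: dates can be compared and subtracted directly
delta = datetime.date(2024, 3, 1) - datetime.date(2024, 1, 1)

# pickle: serialize a Python object to bytes and restore it
restored = pickle.loads(pickle.dumps({"ECOR1": "GAATTC"}))

elapsed = time.time() - start
print(path, area, starts, delta.days, restored, elapsed)
```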