LogoLogo
LogoLogo
  • The Barbara K. Ostrom (1978) Bioinformatics and Computing Facility
  • Computing Resources
    • Active Data Storage
    • Archive Data Storage
    • Luria Cluster
      • FAQs
    • Other Resources
  • Bioinformatics Topics
    • Tools - A Basic Bioinformatics Toolkit
      • Getting more out of Microsoft Excel
      • Bioinformatics Applications of Unix
        • Unix commands applied to bioinformatics
        • Manipulate NGS files using UNIX commands
        • Manipulate alignment files using UNIX commands
      • Alignments and Mappers
      • Relational databases
        • Running Joins on Galaxy
      • Spotfire
    • Tasks - Bioinformatics Methods
      • UCSC Genome Bioinformatics
        • Interacting with the UCSC Genome Browser
        • Obtaining DNA sequence from the UCSC Database
        • Obtaining genomic data from the UCSC database using table browser queries
        • Filtering table browser queries
        • Performing a BLAT search
        • Creating Custom Tracks
        • UCSC Intersection Queries
        • Viewing cross-species alignments
        • Galaxy
          • Intro to Galaxy
          • Galaxy NGS Illumina QC
          • Galaxy NGS Illumina SE Mapping
          • Galaxy SNP Interval Data
        • Editing and annotation gene structures with Argo
      • GeneGO MetaCore
        • GeneGo Introduction
        • Loading Data Into GeneGO
        • Data Management in GeneGO
        • Setting Thresholds and Background Sets
        • Search And Browse Content Tab
        • Workflows and Reports Tab
        • One-click Analysis Tab
        • Building Network for Your Experimental Data
      • Functional Annotation of Gene Lists
      • Multiple Sequence Alignment
        • Clustalw2
      • Phylogenetic analysis
        • Neighbor Joining method in Phylip
      • Microarray data processing with R/Bioconductor
    • Running Jupyter notebooks on luria cluster nodes
  • Data Management
    • Globus
  • Mini Courses
    • Schedule
      • Previous Teaching
    • Introduction to Unix and KI Computational Resources
      • Basic Unix
        • Why Unix?
        • The Unix Tree
        • The Unix Terminal and Shell
        • Anatomy of a Unix Command
        • Basic Unix Commands
        • Output Redirection and Piping
        • Manual Pages
        • Access Rights
        • Unix Text Editors
          • nano
          • vi / vim
          • emacs
        • Shell Scripts
      • Software Installation
        • Module
        • Conda Environment
      • Slurm
    • Introduction to Unix
      • Why Unix?
      • The Unix Filesystem
        • The Unix Tree
        • Network Filesystems
      • The Unix Shell
        • About the Unix Shell
        • Unix Shell Manual Pages
        • Using the Unix Shell
          • Viewing the Unix Tree
          • Traversing the Unix Tree
          • Editing the Unix Tree
          • Searching the Unix Tree
      • Files
        • Viewing File Contents
        • Creating and Editing Files
        • Manipulating Files
        • Symbolic Links
        • File Ownership
          • How Unix File Ownership Works
          • Change File Ownership and Permissions
        • File Transfer (in-progress)
        • File Storage and Compression
      • Getting System Information
      • Writing Scripts
      • Schedule Scripts Using Crontab
    • Advanced Utilization of IGB Computational Resources
      • High Performance Computing Clusters
      • Slurm
        • Checking the Status of Computing Nodes
        • Submitting Jobs / Slurm Scripts
        • Interactive Sessions
      • Package Management
        • The System Package Manager
        • Environment Modules
        • Conda Environments
      • SSH Port Forwarding
        • SSH Port Forwarding Jupyter Notebooks
      • Containerization
        • Docker
          • Docker Installation
          • Running Docker Images
          • Building Docker Images
        • Singularity
          • Differences from Docker
          • Running Images in Singularity
      • Running Nextflow / nf-core Pipelines
    • Python
      • Introduction to Python for Biologists
        • Interactive Python
        • Types
          • Strings
          • Lists
          • Tuples
          • Dictionaries
        • Control Flow
        • Loops
          • For Loops
          • While Loops
        • Control Flows and Loops
        • Storing Programs for Re-use
        • Reading and Writing Files
        • Functions
      • Biopython
        • About Biopython
        • Quick Start
          • Basic Sequence Analyses
          • SeqRecord
          • Sequence IO
          • Exploration of Entrez Databases
        • Example Projects
          • Coronavirus Exploration
          • Translating a eukaryotic FASTA file of CDS entries
        • Further Resources
      • Machine Learning with Python
        • About Machine Learning
        • Hands-On
          • Project Introduction
          • Supervised Approaches
            • The Logistic Regression Model
            • K-Nearest Neighbors
          • Unsupervised Approaches
            • K-Means Clustering
          • Further Resources
      • Data Processing with Python
        • Pandas
          • About Pandas
          • Making DataFrames
          • Inspecting DataFrames
          • Slicing DataFrames
          • Selecting from DataFrames
          • Editing DataFrames
        • Matplotlib
          • About Matplotlib
          • Basic Plotting
          • Advanced Plotting
        • Seaborn
          • About Seaborn
          • Basic Plotting
          • Visualizing Statistics
          • Visualizing Proteomics Data
          • Visualizing RNAseq Data
    • R
      • Intro to R
        • Before We Start
        • Getting to Know R
        • Variables in R
        • Functions in R
        • Data Manipulation
        • Simple Statistics in R
        • Basic Plotting in R
        • Advanced Plotting in R
        • Writing Figures to a File
        • Further Resources
    • Version Control with Git
      • About Version Control
      • Setting up Git
      • Creating a Repository
      • Tracking Changes
        • Exercises
      • Exploring History
        • Exercises
      • Ignoring Things
      • Remotes in Github
      • Collaborating
      • Conflicts
      • Open Science
      • Licensing
      • Citation
      • Hosting
      • Supplemental
Powered by GitBook

MIT Resources

  • https://accessibility.mit.edu

Massachusetts Institute of Technology

On this page

Was this helpful?

Export as PDF
  1. Bioinformatics Topics
  2. Tasks - Bioinformatics Methods
  3. UCSC Genome Bioinformatics

Obtaining genomic data from the UCSC database using table browser queries

PreviousObtaining DNA sequence from the UCSC DatabaseNextFiltering table browser queries

Last updated 1 year ago

Was this helpful?

  • The UCSC table browser allows execution of complex database queries without knowledge of database query languages.

  • Open the and activate the "Dec. 2013 (GRCh38/hg38) human assembly.

  • Enter the position chr4:133149103-133195252 and click submit.

  • Click the "Tools-->Table Browser" link in the upper blue bar to open a Table Browser page.

The Table Browser

  • Note the organism and assembly controls (red box)

  • Note the data type controls (blue box) that are associated with different kinds of data (browser tracks).

  • Note the region selection option

  • Set group to "Genes and Gene Prediction Tracks", track to "GENCODEv24", and table to "knownGene".

  • Restrict region to chr4:133,149,846-133,195,995

  • locate the output format menu and select "BED - browser extensible data" and click the "summary/statistics" button at the bottom of the page.

    • This provides details about your query results and can be very useful for data summaries and as a guide to ensure your query is behaving as desired.

  • Return to the Table Browser and click the "get output" button to view the bed format version of the data.

  • Switch the output format to "GTF" and compare the results

  • Note that all different kinds of data can be accessed and downloaded in ways similar to this although the available output formats may be different.

  • Zoom to the 5' end of PCDH10 (chr4:133,148,848-133,150,064) and experiment with Comparative Genomics --> Cons 20 Mammals track. Note the base level conservation scores from output format "data points".

  • Request output in "custom track" format and view the data with "Full" setting in the browser.

  • Note the high scoring region just upstream of the 5'end that also overlaps with a DNase I hypersensitivity peak and an area conserved in fish and lamprey.

  • Turn on the CpG islands track in the Regulation group and observe the overlap with conserves region and DHS. Open the Encode Regulation track and turn on all the other available tracks by setting them to "pack". In addition to those other features, there is also a valley in the histone marks. Together these data are consistent with this region having a role in the regulation of expression of PCDH10.

The count of genes returned should be 3. Note that only 2 are visible in the browser. This is do to the default "Basic" display of the new known genes set ENCODEv24. For more information see .

Switch the output to "sequence" and select mRNA on the next page, save the resulting output file to your desktop as kg.fa, this file will be used to demonstrate searching. In case of problems, the file is available

here
BLAT
here
UCSC browser