Obtaining genomic data from the UCSC database using table browser queries
The UCSC table browser allows execution of complex database queries without knowledge of database query languages.
Open the UCSC browser and activate the "Dec. 2013 (GRCh38/hg38)" human assembly.
Enter the position chr4:133149103-133195252 and click submit.
Click the "Tools-->Table Browser" link in the upper blue bar to open a Table Browser page.
Note the organism and assembly controls (red box).
Note the data type controls (blue box), which are associated with different kinds of data (browser tracks).
Note the region selection option.
Set group to "Genes and Gene Prediction Tracks", track to "GENCODEv24", and table to "knownGene".
Restrict the region to chr4:133,149,846-133,195,995.
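The region string entered above can also be handled programmatically. The following is a minimal sketch, not part of the Table Browser itself, that parses a position string and converts it to the 0-based, half-open convention used by BED output. The function names `parse_region` and `to_bed_interval` are hypothetical helpers introduced here for illustration; the sketch assumes that positions typed into the browser are 1-based and inclusive.

```python
import re


def parse_region(region):
    """Parse a browser-style position such as 'chr4:133,149,846-133,195,995'.

    Returns (chrom, start, end) with 1-based, inclusive coordinates,
    which is how positions are entered in the browser.
    """
    match = re.fullmatch(r"\s*(\w+):([\d,]+)-([\d,]+)\s*", region)
    if match is None:
        raise ValueError(f"not a chrom:start-end region: {region!r}")
    chrom = match.group(1)
    # Thousands separators are optional in the browser, so strip them.
    start = int(match.group(2).replace(",", ""))
    end = int(match.group(3).replace(",", ""))
    if start < 1 or end < start:
        raise ValueError(f"invalid coordinates in region: {region!r}")
    return chrom, start, end


def to_bed_interval(chrom, start, end):
    """Convert 1-based inclusive coordinates to a 0-based, half-open BED interval."""
    return chrom, start - 1, end


chrom, start, end = parse_region("chr4:133,149,846-133,195,995")
print(chrom, start, end, "length:", end - start + 1)
print(to_bed_interval(chrom, start, end))
```

Keeping the two conventions in separate functions makes it harder to mix them up when comparing browser positions with downloaded BED records.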
Locate the output format menu, select "BED - browser extensible data", and click the "summary/statistics" button at the bottom of the page.
This provides details about your query results and can be very useful for data summaries and as a guide to ensure your query is behaving as desired.
The count of genes returned should be 3. Note that only 2 are visible in the browser. This is due to the default "Basic" display of the new known genes set, GENCODEv24. For more information see here.
Return to the Table Browser and click the "get output" button to view the BED-format version of the data.
Switch the output format to "GTF" and compare the results.
Note that many other kinds of data can be accessed and downloaded in similar ways, although the available output formats may differ.
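When comparing the BED and GTF outputs, one difference to keep in mind is the coordinate convention: BED starts are 0-based with an exclusive end, while GTF coordinates are 1-based and inclusive. The sketch below reads BED text, counts the records, and shows the equivalent GTF-style coordinates. The helper names and the two example records are made up for illustration and are not actual query results.

```python
def parse_bed_line(line):
    """Parse one tab-separated BED line into a dict (first six columns only)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 3:
        raise ValueError(f"BED line needs at least 3 columns: {line!r}")
    record = {"chrom": fields[0], "start": int(fields[1]), "end": int(fields[2])}
    if len(fields) > 3:
        record["name"] = fields[3]
    if len(fields) > 5:
        record["strand"] = fields[5]
    return record


def read_bed(text):
    """Return all BED records in text, skipping header and comment lines."""
    records = []
    for line in text.splitlines():
        if not line.strip() or line.startswith(("track", "browser", "#")):
            continue
        records.append(parse_bed_line(line))
    return records


def bed_to_gtf_coords(record):
    """Convert 0-based half-open BED coordinates to 1-based inclusive ones."""
    return record["start"] + 1, record["end"]


# Made-up placeholder records, NOT real Table Browser output.
EXAMPLE_BED = (
    'track name="example" description="made-up records"\n'
    "chr4\t133149845\t133150000\tplaceholder_1\t0\t+\n"
    "chr4\t133160000\t133170000\tplaceholder_2\t0\t+\n"
)

records = read_bed(EXAMPLE_BED)
print("records:", len(records))
for r in records:
    gtf_start, gtf_end = bed_to_gtf_coords(r)
    print(r["name"], "BED:", r["start"], r["end"], "GTF-style:", gtf_start, gtf_end)
```

The record count printed here plays the same role as the item count on the summary/statistics page: a quick check that the query returned what was expected.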
Zoom to the 5' end of PCDH10 (chr4:133,148,848-133,150,064) and experiment with the Comparative Genomics --> Cons 20 Mammals track. Note the base-level conservation scores from the output format "data points".
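The base-level scores can also be inspected outside the browser once saved to a file. The sketch below assumes the "data points" output is wiggle-style text, with a `variableStep` or `fixedStep` declaration line followed by score lines; check this against the actual download before relying on it. The function name `parse_wiggle` is hypothetical and the example scores are made up.

```python
def parse_wiggle(text):
    """Yield (chrom, position, score) tuples from wiggle-style text.

    Positions are 1-based, as in the wiggle format. Supports both
    variableStep ("position score") and fixedStep ("score") data lines.
    """
    mode = None
    chrom = None
    position = None
    step = 1
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        parts = line.split()
        if parts[0] in ("variableStep", "fixedStep"):
            mode = parts[0]
            params = dict(p.split("=", 1) for p in parts[1:])
            chrom = params["chrom"]
            if mode == "fixedStep":
                position = int(params["start"])
                step = int(params.get("step", 1))
            continue
        if mode == "variableStep":
            yield chrom, int(parts[0]), float(parts[1])
        elif mode == "fixedStep":
            yield chrom, position, float(parts[0])
            position += step
        else:
            raise ValueError("data line found before a step declaration")


# Made-up scores for illustration, NOT real conservation values.
EXAMPLE_WIG = """\
track type=wiggle_0 name="made-up example"
variableStep chrom=chr4 span=1
133148848 0.10
133148849 0.85
133148850 -0.30
"""

points = list(parse_wiggle(EXAMPLE_WIG))
best = max(points, key=lambda p: p[2])
print("data points:", len(points))
print("highest-scoring base:", best)
```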
Request output in "custom track" format and view the data with "Full" setting in the browser.
Note the high-scoring region just upstream of the 5' end that also overlaps with a DNase I hypersensitivity peak and an area conserved in fish and lamprey.
Turn on the CpG islands track in the Regulation group and observe the overlap with the conserved region and the DHS. Open the ENCODE Regulation track and turn on all the other available tracks by setting them to "pack". In addition to those other features, there is also a valley in the histone marks. Together, these data are consistent with this region having a role in the regulation of PCDH10 expression.
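The visual comparison of overlapping tracks can be expressed as an interval intersection. The sketch below is one way to check whether several features, each downloaded as a BED-style interval, share a common stretch of sequence. The helper names are hypothetical and the coordinates are small placeholders, not the real positions of the conserved region, DHS peak, or CpG island.

```python
def overlap(a, b):
    """Return the shared (start, end) of two 0-based half-open intervals, or None."""
    start = max(a[0], b[0])
    end = min(a[1], b[1])
    return (start, end) if start < end else None


def common_overlap(intervals):
    """Return the interval shared by all inputs, or None if there is none."""
    if not intervals:
        return None
    current = intervals[0]
    for interval in intervals[1:]:
        current = overlap(current, interval)
        if current is None:
            return None
    return current


# Placeholder coordinates for illustration only; substitute intervals
# taken from the actual track downloads.
features = {
    "conserved_region": (100, 400),
    "dhs_peak": (250, 600),
    "cpg_island": (300, 500),
}

shared = common_overlap(list(features.values()))
print("shared interval:", shared)
```

An empty result (`None`) would mean the features do not all coincide, which is worth confirming before drawing conclusions about a regulatory role from the browser view alone.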