# Galaxy NGS Illumina SE Mapping

This tutorial shows how to map single-end Illumina reads to a reference genome with Bowtie, filter the alignments on their SAM flags, count the reads per chromosome, convert SAM to BAM, and compute summary statistics with flagstat.

### Single-End Mapping of Illumina Data

**1. Load fastq file and annotate the uploaded data**

* On the Tool Panel, click on Get Data → Upload File.
* This tool lets you upload data from a file, a URL, or a text box.
  * Select *Galaxy\_GM12878\_trimmed.fastq* as input file.
  * Select "fastqsanger" as file format.
  * Select "hg18" as genome.
* Click the "Execute" button.

<figure><img src="https://498238201-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FWuHhstIreJ3jFvE4gQ3y%2Fuploads%2FOasnHVl6dz5SjuyG4mkA%2Fimage.png?alt=media&#x26;token=5b3f895a-a4b4-4d6c-8d7c-c82c32814e9c" alt=""><figcaption></figcaption></figure>
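The "fastqsanger" format stores base qualities as ASCII characters with an offset of 33 (Phred+33). As a minimal sketch of what that encoding means, the Python snippet below parses one FASTQ record and decodes its quality string. The record is made up for illustration and is not taken from the tutorial dataset; the function names are hypothetical helpers, not part of Galaxy.

```python
def parse_fastq_record(lines):
    """Split a 4-line FASTQ record into (name, sequence, quality string)."""
    header, sequence, separator, quality = [line.rstrip("\n") for line in lines]
    if not header.startswith("@") or not separator.startswith("+"):
        raise ValueError("not a valid FASTQ record")
    if len(sequence) != len(quality):
        raise ValueError("sequence and quality lengths differ")
    return header[1:], sequence, quality


def decode_phred33(quality):
    """Convert a Sanger (Phred+33) quality string into integer scores."""
    return [ord(ch) - 33 for ch in quality]


# Hypothetical record, for illustration only.
record = ["@read_1\n", "GATTACA\n", "+\n", "IIII5!#\n"]
name, seq, qual = parse_fastq_record(record)
scores = decode_phred33(qual)
print(name, seq, scores)
```

Choosing the wrong FASTQ variant in Galaxy would shift every score by a fixed amount, which is why the file format is set explicitly at upload time.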

**2. Map to reference genome (hg18) using Bowtie**

* On the Tool Panel, click on NGS Toolbox Beta → NGS Mapping → Map with Bowtie for Illumina.
* This tool runs the Bowtie aligner. The output is a SAM file containing all the read alignments.
  * Select *hg canonical* as reference genome.
  * Leave other settings as default.
* Click the "Execute" button.

<figure><img src="https://498238201-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FWuHhstIreJ3jFvE4gQ3y%2Fuploads%2FTmnsdFqaWwZSHo4gDKUl%2Fimage.png?alt=media&#x26;token=f6727060-6739-4cc5-b293-078808db0743" alt=""><figcaption></figcaption></figure>
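Each alignment line in a SAM file has eleven mandatory tab-separated fields (read name, flag, reference name, position, mapping quality, CIGAR, mate reference, mate position, template length, sequence, and qualities). The later steps rely on columns 2 and 3, so it can help to see how a line breaks down. The following is a minimal sketch; the example line and the `parse_sam_line` helper are hypothetical and not produced by this tutorial's Bowtie run.

```python
from collections import namedtuple

# The eleven mandatory SAM columns, in order.
SamRecord = namedtuple(
    "SamRecord",
    "qname flag rname pos mapq cigar rnext pnext tlen seq qual",
)


def parse_sam_line(line):
    """Parse one SAM alignment line into a SamRecord (optional tags are ignored)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 11:
        raise ValueError("a SAM alignment line needs at least 11 fields")
    qname, flag, rname, pos, mapq, cigar, rnext, pnext, tlen, seq, qual = fields[:11]
    return SamRecord(
        qname, int(flag), rname, int(pos), int(mapq),
        cigar, rnext, int(pnext), int(tlen), seq, qual,
    )


# Hypothetical alignment line, for illustration only.
example = "read_1\t16\tchr19\t1000\t255\t7M\t*\t0\t0\tGATTACA\tIIIIIII"
rec = parse_sam_line(example)
print(rec.qname, rec.flag, rec.rname, rec.pos)
```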

**3. Filter SAM file on bitwise flag values**

* On the Tool Panel, click on NGS Toolbox Beta → NGS SAMtools → Filter SAM on bitwise flag values.
* This tool filters a SAM file on the bitwise flag values of its alignments.
  * Select data 2 as input dataset.
  * Add new flag with type set to "the read is unmapped" and the value set to "No".
  * Add new flag with type set to "read strand" and the value set to "Yes".
* Click the "Execute" button.
* With these parameters, the output contains only the reads that are mapped and lie on the reverse strand.

<figure><img src="https://498238201-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FWuHhstIreJ3jFvE4gQ3y%2Fuploads%2FSYcO8cNkROtwcc7nzyoT%2Fimage.png?alt=media&#x26;token=6adcca39-2edf-4eb0-b164-e0a79a4dfaaa" alt=""><figcaption></figcaption></figure>
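The two flag settings in this step correspond to bits in the SAM flag field: `0x4` marks an unmapped read and `0x10` marks a read aligned to the reverse strand. The sketch below shows the same test in Python. It is an illustration of the bitwise logic only, not the Galaxy tool's implementation; the input lines are hypothetical, and skipping header lines is an assumption of this sketch.

```python
UNMAPPED = 0x4   # SAM flag bit: the read is unmapped
REVERSE = 0x10   # SAM flag bit: the read is on the reverse strand


def is_mapped_reverse(flag):
    """True when the unmapped bit is clear and the reverse-strand bit is set."""
    return not (flag & UNMAPPED) and bool(flag & REVERSE)


def filter_sam_lines(lines):
    """Keep alignment lines whose flag passes is_mapped_reverse."""
    kept = []
    for line in lines:
        if line.startswith("@"):
            continue  # assumption: header lines are skipped in this sketch
        flag = int(line.split("\t")[1])
        if is_mapped_reverse(flag):
            kept.append(line)
    return kept


# Hypothetical alignment lines with flags 0, 16, 4 and 20.
sam_lines = [
    "@HD\tVN:1.0",
    "read_a\t0\tchr1\t100\t255\t7M\t*\t0\t0\tGATTACA\tIIIIIII",
    "read_b\t16\tchr19\t200\t255\t7M\t*\t0\t0\tGATTACA\tIIIIIII",
    "read_c\t4\t*\t0\t0\t*\t*\t0\t0\tGATTACA\tIIIIIII",
    "read_d\t20\t*\t0\t0\t*\t*\t0\t0\tGATTACA\tIIIIIII",
]
kept = filter_sam_lines(sam_lines)
print([line.split("\t")[0] for line in kept])
```

Only `read_b` passes: flag 0 is a forward-strand alignment, while flags 4 and 20 both have the unmapped bit set.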

**4. Find how many reads map to each chromosome**

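* On the Tool Panel, click on Join, Subtract and Group → Group.
* This tool groups the rows of a dataset by the values in one column and applies an aggregate operation to another column.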
  * Select data 2 (bowtie output) as input dataset.
  * Group by column 3 (reference name, i.e. chromosome name)
  * Add new operation: count on column 1 .
* Click the "Execute" button.
* With these parameters, the output lists each reference name (chromosome) together with the number of reads assigned to it.
* Edit the attributes of this dataset by clicking on the pencil icon: rename it to "read distribution by chromosome".

<figure><img src="https://498238201-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FWuHhstIreJ3jFvE4gQ3y%2Fuploads%2F80P9ZG0xNjRqcBSk6E9X%2Fimage.png?alt=media&#x26;token=d1a06dc3-52b7-49e8-8602-ade23b308bd3" alt=""><figcaption></figcaption></figure>
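Grouping on column 3 and counting column 1 amounts to tallying how many alignment lines carry each reference name. A minimal Python sketch of that operation follows; the input lines are hypothetical, and skipping header lines is an assumption of this sketch rather than a statement about the Galaxy tool.

```python
from collections import Counter


def count_reads_per_reference(lines):
    """Count SAM alignment lines per reference name (column 3)."""
    counts = Counter()
    for line in lines:
        if line.startswith("@"):
            continue  # assumption: header lines are not counted
        fields = line.rstrip("\n").split("\t")
        counts[fields[2]] += 1
    return counts


# Hypothetical alignment lines, for illustration only.
sam_lines = [
    "read_a\t0\tchr19\t100\t255\t7M\t*\t0\t0\tGATTACA\tIIIIIII",
    "read_b\t16\tchr19\t200\t255\t7M\t*\t0\t0\tGATTACA\tIIIIIII",
    "read_c\t0\tchr1\t300\t255\t7M\t*\t0\t0\tGATTACA\tIIIIIII",
    "read_d\t4\t*\t0\t0\t*\t*\t0\t0\tGATTACA\tIIIIIII",
]
counts = count_reads_per_reference(sam_lines)
for reference, n in counts.items():
    print(reference, n)
```

Unmapped reads have `*` as their reference name, so they appear as their own group unless they were filtered out beforehand.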

**5. Find the most represented chromosome**

* On the Tool Panel, click on Filter and Sort → Sort data.
* This tool sorts a dataset on a chosen column.
  * Select the column representing the sort key (here, the column holding the read counts).
  * Select "Numerical sort" in "descending order" as options.
* Click the "Execute" button.
* With these parameters, the results show that chr19 is the most represented chromosome.

<figure><img src="https://498238201-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FWuHhstIreJ3jFvE4gQ3y%2Fuploads%2FkIzN7JD9z0lDTUAhQ0bI%2Fimage.png?alt=media&#x26;token=1eb6b263-fcee-416c-882b-c65f85c6c82d" alt=""><figcaption></figcaption></figure>
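The "Numerical sort" option matters because counts compared as text are ordered character by character. The sketch below contrasts the two on a small table of (chromosome, count) rows. The numbers are invented for illustration and are not the tutorial's actual results.

```python
# Hypothetical per-chromosome counts, for illustration only.
rows = [("chr1", 120), ("chr19", 450), ("chrX", 75)]

# Numerical sort, descending: the largest count comes first.
numeric_desc = sorted(rows, key=lambda row: row[1], reverse=True)

# Alphabetical sort on the same column, descending: "75" > "450" > "120"
# because the comparison is made character by character.
text_desc = sorted(rows, key=lambda row: str(row[1]), reverse=True)

print("numeric:", numeric_desc)
print("text:   ", text_desc)
```

With a numerical sort the most represented chromosome sits on the first line of the output; with a text sort the first line can be misleading.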

**6. Convert SAM to BAM**

* On the Tool Panel, click on NGS Toolbox Beta → NGS SAMtools → SAM to BAM converter.
* This tool converts SAM-formatted files into BAM-formatted files.
  * Select the SAM file to convert.
* Click the "Execute" button.

<figure><img src="https://498238201-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FWuHhstIreJ3jFvE4gQ3y%2Fuploads%2F3ccmCIvIWBqk9HuLLkyY%2Fimage.png?alt=media&#x26;token=38f75d09-d5b3-4ce7-b573-ae75ce087432" alt=""><figcaption></figcaption></figure>

**7. Compute general statistics via Flagstat operation**

* On the Tool Panel, click on NGS Toolbox Beta → NGS SAMtools → flagstat.
* This tool provides simple summary statistics for a BAM file.
  * Select data 6 (BAM file) as input.
* Click the "Execute" button.

<figure><img src="https://498238201-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FWuHhstIreJ3jFvE4gQ3y%2Fuploads%2F87wbtEe6NyoiVS0EuYVO%2Fimage.png?alt=media&#x26;token=c0e2c548-2393-41f8-b0db-db9f4f89086f" alt=""><figcaption></figcaption></figure>
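Flagstat derives its summary from the flag field of every alignment. As a rough, simplified illustration (not the samtools implementation, and covering only a few of the categories it reports), the sketch below tallies total, mapped, and duplicate reads from a list of flag values. The flags and the `summarize_flags` helper are hypothetical.

```python
UNMAPPED = 0x4     # SAM flag bit: the read is unmapped
DUPLICATE = 0x400  # SAM flag bit: PCR or optical duplicate


def summarize_flags(flags):
    """Return a small flagstat-like summary for a list of SAM flag values."""
    total = len(flags)
    mapped = sum(1 for flag in flags if not flag & UNMAPPED)
    duplicates = sum(1 for flag in flags if flag & DUPLICATE)
    mapped_pct = 100.0 * mapped / total if total else 0.0
    return {
        "total": total,
        "mapped": mapped,
        "duplicates": duplicates,
        "mapped_pct": mapped_pct,
    }


# Hypothetical flag values, for illustration only.
summary = summarize_flags([0, 16, 4, 1024, 20])
print(summary)
```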
