# Galaxy NGS Illumina SE Mapping

This tutorial shows how to map single-end Illumina reads to a reference genome with Bowtie, then filter, summarize, and convert the resulting alignments.

### Single-End Mapping of Illumina Data

**1. Load fastq file and annotate the uploaded data**

* On the Tool Panel, click on Get Data → Upload File.
* This tool allows you to upload data from a file, a URL or a text box.
  * Select *Galaxy\_GM12878\_trimmed.fastq* as input file.
  * Select "fastqsanger" as file format.
  * Select "hg18" as genome.
* Click the "Execute" button.
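
The "fastqsanger" format selected above stores each base quality as a single character with an ASCII offset of 33 (Phred+33). The following is a minimal sketch, not part of Galaxy, of how such a file can be read; the function names and the example read are made up for illustration.

```python
def phred33_scores(quality_string):
    """Decode a Sanger (Phred+33) quality string into integer scores."""
    return [ord(ch) - 33 for ch in quality_string]


def parse_fastq(lines):
    """Yield (read_id, sequence, scores) from 4-line FASTQ records.

    Assumes well-formed input with exactly four lines per record.
    """
    it = iter(lines)
    for header in it:
        seq = next(it).strip()
        next(it)  # skip the '+' separator line
        qual = next(it).strip()
        yield header.strip()[1:], seq, phred33_scores(qual)


# Hypothetical one-read example, not taken from the tutorial data.
example = ["@read1", "ACGT", "+", "II5!"]
records = list(parse_fastq(example))
```

Here `I` decodes to a quality of 40 and `!` to 0, the two ends of the usual Sanger range.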

<figure><img src="/files/mqDgdVNuNVQdLDkRtPq4" alt=""><figcaption></figcaption></figure>

**2. Map to reference genome (hg18) using Bowtie**

* On the Tool Panel, click on NGS Toolbox Beta → NGS Mapping → Map with Bowtie for Illumina.
* This tool runs the Bowtie aligner. The output is a SAM file containing all the read alignments.
  * Select *hg canonical* as reference genome.
  * Leave other settings as default.
* Click the "Execute" button.

<figure><img src="/files/zrYRKYXSdAlOkKGT1xpZ" alt=""><figcaption></figcaption></figure>

**3. Filter SAM file on bitwise flag values**

* On the Tool Panel, click on NGS Toolbox Beta → NGS SAMtools → Filter SAM on bitwise flag values.
* This tool filters a SAM file according to the bitwise flag values of its reads.
  * Select data 2 as input dataset.
  * Add a new flag with type set to "the read is unmapped" and the value set to "No".
  * Add a new flag with type set to "read strand" and the value set to "Yes".
* Click the "Execute" button.
* With these parameters, the output contains only the reads that are mapped and lie on the reverse strand.
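
The two flag settings above correspond to bits of the FLAG field (column 2) of each SAM record: `0x4` marks an unmapped read and `0x10` marks a read aligned to the reverse strand. The sketch below illustrates the same test outside Galaxy. The function names and example records are made up, and unlike a real tool this sketch simply drops header lines.

```python
UNMAPPED = 0x4   # SAM flag bit: the read is unmapped
REVERSE = 0x10   # SAM flag bit: the read aligns to the reverse strand


def keep_read(flag):
    """True when the read is mapped (0x4 unset) and on the reverse strand (0x10 set)."""
    return (flag & UNMAPPED) == 0 and (flag & REVERSE) != 0


def filter_sam(lines):
    """Yield SAM alignment lines that pass keep_read; header lines are skipped."""
    for line in lines:
        if line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        if keep_read(int(fields[1])):
            yield line


# Hypothetical records for illustration only.
sam_lines = [
    "@SQ\tSN:chr1\tLN:1000",
    "r1\t0\tchr1\t100\t255\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t16\tchr1\t200\t255\t4M\t*\t0\t0\tACGT\tIIII",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
]
kept_names = [line.split("\t")[0] for line in filter_sam(sam_lines)]
```

Only `r2` (flag 16) survives: `r1` is on the forward strand and `r3` is unmapped.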

<figure><img src="/files/bcANNWjg2UyIXaDFKahN" alt=""><figcaption></figcaption></figure>

**4. Find how many reads map to each chromosome**

* On the Tool Panel, click on Join, Subtract and Group → Group.
* This tool groups the input by a chosen column and applies aggregate operations to other columns.
  * Select data 2 (bowtie output) as input dataset.
  * Group by column 3 (reference name, i.e. chromosome name).
  * Add new operation: count on column 1.
* Click the "Execute" button.
* With these parameters, the resulting output lists each chromosome together with the number of reads mapped to it.
* Edit the attributes of this dataset by clicking on the pencil icon: rename it as "read distribution by chromosome".
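
Grouping by column 3 and counting amounts to tallying the reference name of every alignment line. A minimal sketch of that idea follows, assuming tab-separated SAM text as input; the function name and example records are illustrative only. Note that unmapped reads carry `*` in column 3, so they form a group of their own.

```python
from collections import Counter


def count_by_reference(sam_lines):
    """Count alignment lines per reference name (SAM column 3)."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header lines carry no alignments
        fields = line.rstrip("\n").split("\t")
        counts[fields[2]] += 1
    return counts


# Hypothetical records for illustration only.
sam_lines = [
    "@SQ\tSN:chr1\tLN:1000",
    "r1\t0\tchr1\t100\t255\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t16\tchr19\t200\t255\t4M\t*\t0\t0\tACGT\tIIII",
    "r3\t0\tchr19\t300\t255\t4M\t*\t0\t0\tACGT\tIIII",
    "r4\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
]
counts = count_by_reference(sam_lines)
```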

<figure><img src="/files/M5lKlfB77r7L43fSXzW7" alt=""><figcaption></figcaption></figure>

**5. Find the most represented chromosome**

* On the Tool Panel, click on Filter and Sort → Sort data.
* This tool sorts a dataset by the selected column.
  * Select the column holding the read counts as the sort key.
  * Select "Numerical sort" in "descending order" as options.
* Click the "Execute" button.
* With these parameters, the results show that chr19 is the most represented chromosome.
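
The choice of "Numerical sort" matters because counts compared as text are ordered character by character. The sketch below shows a numerical descending sort on a few rows; the rows and their counts are invented for illustration and are not results from the tutorial data.

```python
# Made-up "chromosome<TAB>count" rows standing in for the grouped output.
rows = ["chr1\t120", "chr19\t310", "chr2\t95"]


def sort_by_count(rows):
    """Return (chromosome, count) pairs sorted by count, largest first."""
    parsed = []
    for row in rows:
        chrom, count = row.split("\t")
        parsed.append((chrom, int(count)))  # int() makes the comparison numerical
    return sorted(parsed, key=lambda item: item[1], reverse=True)


ranked = sort_by_count(rows)
top_chromosome = ranked[0][0]
```

Sorting the same counts as text in descending order would place "95" ahead of "310", because "9" compares greater than "3".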

<figure><img src="/files/d2pbWHoec9ZR3FKQswcU" alt=""><figcaption></figcaption></figure>

**6. Convert SAM to BAM**

* On the Tool Panel, click on NGS Toolbox Beta → NGS SAMtools → SAM to BAM converter.
* This tool converts SAM-formatted files into BAM-formatted files.
  * Select the SAM file to convert.
* Click the "Execute" button.

<figure><img src="/files/Fm88d4ctoLPcYwDXqu7A" alt=""><figcaption></figcaption></figure>

**7. Compute general statistics via Flagstat operation**

* On the Tool Panel, click on NGS Toolbox Beta → NGS SAMtools → flagstat.
* This tool provides a simple summary of the alignments in a BAM file.
  * Select data 6 (BAM file) as input.
* Click the "Execute" button.
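
A flagstat-style summary is built by tallying FLAG bits over all records. The following is a simplified sketch of that idea, not the samtools implementation: it covers only three of the categories a real flagstat report contains, and the function name and input flags are made up.

```python
UNMAPPED = 0x4      # SAM flag bit: the read is unmapped
DUPLICATE = 0x400   # SAM flag bit: PCR or optical duplicate


def simple_flag_summary(flags):
    """Tally total, mapped, and duplicate records from integer FLAG values."""
    summary = {"total": 0, "mapped": 0, "duplicates": 0}
    for flag in flags:
        summary["total"] += 1
        if not flag & UNMAPPED:
            summary["mapped"] += 1
        if flag & DUPLICATE:
            summary["duplicates"] += 1
    return summary


# Hypothetical FLAG values for illustration only.
summary = simple_flag_summary([0, 16, 4, 1024, 16])
```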

<figure><img src="/files/3772vbPpLPWCEwIrHdzi" alt=""><figcaption></figcaption></figure>

