Filename extensions | .msf, .pup, .pileup |
---|---|
Developed by | Tony Cox and Zemin Ning |
Type of format | Bioinformatics |
Extended from | Tab separated values |
Website | www |
Pileup format is a text-based format for summarizing the base calls of reads aligned to a reference sequence. This format facilitates visual display of SNP/indel calling and alignment. It was first used by Tony Cox and Zemin Ning at the Wellcome Trust Sanger Institute, and became widely known through its implementation within the SAMtools software suite. [1]
Sequence | Position | Reference Base | Read Count | Read Results | Quality |
---|---|---|---|---|---|
seq1 | 272 | T | 24 | ,.$.....,,.,.,...,,,.,..^+. | <<<+;<<<<<<<<<<<=<;<;7<& |
seq1 | 273 | T | 23 | ,.....,,.,.,...,,,.,..A | <<<;<<<<<<<<<3<=<<<;<<+ |
seq1 | 274 | T | 23 | ,.$....,,.,.,...,,,.,... | 7<7;<;<<<<<<<<<=<;<;<<6 |
seq1 | 275 | A | 23 | ,$....,,.,.,...,,,.,...^l. | <+;9*<<<<<<<<<=<<:;<<<< |
seq1 | 276 | G | 22 | ...T,,.,.,...,,,.,.... | 33;+<<7=7<<7<&<<1;<<6< |
seq1 | 277 | T | 22 | ....,,.,.,.C.,,,.,..G. | +7<;<<<<<<<&<=<<:;<<&< |
seq1 | 278 | G | 23 | ....,,.,.,...,,,.,....^k. | %38*<<;<7<<7<=<<<;<<<<< |
seq1 | 279 | C | 23 | A..T,,.,.,...,,,.,..... | 75&<<<<<<<<<=<<<9<<:<<< |
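Each line consists of 5 (or optionally 6) tab-separated columns:

1. Sequence identifier
2. Position in the sequence (starting from 1)
3. Reference base at that position
4. Number of aligned reads covering that position (read count)
5. Bases at that position from the aligned reads (read results)
6. Phred quality of those bases, encoded in ASCII with an offset of 33 (optional)

Column 5 encodes the read bases as follows:

- . (dot) denotes a base that matches the reference on the forward strand
- , (comma) denotes a base that matches the reference on the reverse strand
- ACGTN (upper case) denotes a base that does not match the reference on the forward strand
- acgtn (lower case) denotes a base that does not match the reference on the reverse strand
- A sequence matching the regular expression \+[0-9]+[ACGTNacgtn]+ denotes an insertion of one or more bases starting from the next position. For example, +2AG means insertion of AG in the forward strand
- A sequence matching the regular expression \-[0-9]+[ACGTNacgtn]+ denotes a deletion of one or more bases starting from the next position. For example, -2ct means deletion of CT in the reverse strand
- ^ (caret) marks the start of a read segment; the ASCII value of the character following it, minus 33, gives the mapping quality of that read
- $ (dollar) marks the end of a read segment
- * (asterisk) is a placeholder for a deleted base in a multiple-base deletion that was announced on a previous line by the -[0-9]+[ACGTNacgtn]+ notation

Column 6 is optional. If present, the ASCII value of each character minus 33 gives the Phred quality of the corresponding base in column 5. This is similar to the quality encoding in the FASTQ format.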
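The column rules above can be illustrated with a minimal Python sketch. It is not taken from SAMtools; the function names `tokenize_bases`, `decode_qualities` and `parse_line` are hypothetical, and the sketch assumes well-formed input. It handles `^` before checking for indels, because the character after a caret is a mapping quality and may itself be `+` or `-` (as in `^+.` in the first table row).

```python
import re

# Matches the sign and length prefix of an indel token such as "+2AG" or "-2ct".
_INDEL_PREFIX = re.compile(r"[+-]([0-9]+)")


def tokenize_bases(bases):
    """Split a column-5 string into (kind, value) tokens (hypothetical helper)."""
    tokens = []
    i = 0
    while i < len(bases):
        ch = bases[i]
        if ch == "^":
            # The character after '^' encodes the mapping quality (ASCII - 33).
            tokens.append(("read_start", ord(bases[i + 1]) - 33))
            i += 2
        elif ch == "$":
            tokens.append(("read_end", None))
            i += 1
        elif ch in "+-":
            m = _INDEL_PREFIX.match(bases, i)
            if m is None:
                raise ValueError(f"malformed indel at offset {i}")
            length = int(m.group(1))
            # Consume exactly `length` bases after the number.
            seq = bases[m.end():m.end() + length]
            tokens.append(("insertion" if ch == "+" else "deletion", seq))
            i = m.end() + length
        else:
            # '.', ',', a mismatching base letter, or the '*' placeholder.
            tokens.append(("base", ch))
            i += 1
    return tokens


def decode_qualities(quals):
    """Convert a column-6 string to Phred scores (ASCII value minus 33)."""
    return [ord(c) - 33 for c in quals]


def parse_line(line):
    """Parse one pileup line with 5 or 6 tab-separated columns."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 5:
        raise ValueError("expected at least 5 tab-separated columns")
    return {
        "sequence": fields[0],
        "position": int(fields[1]),
        "reference": fields[2],
        "read_count": int(fields[3]),
        "tokens": tokenize_bases(fields[4]),
        "qualities": decode_qualities(fields[5]) if len(fields) > 5 else None,
    }


if __name__ == "__main__":
    row = parse_line("seq1\t272\tT\t24\t,.$.....,,.,.,...,,,.,..^+.")
    calls = [value for kind, value in row["tokens"] if kind == "base"]
    print(row["read_count"], len(calls))
```

For a well-formed row, the number of `base` tokens can be compared with the read count in column 4 as a simple consistency check.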
There is no standard file extension for a Pileup file, but .msf (multiple sequence file), .pup[2] and .pileup[3][4] are used.
Original source: https://en.wikipedia.org/wiki/Pileup format.