cgatools::reference::CompactDnaSequence Class Reference

Class to describe the DNA sequence of a chromosome, in a compact manner.

#include <CompactDnaSequence.hpp>

Public Member Functions

 CompactDnaSequence (const std::string &name, bool circular, const void *packedData, const util::Md5Digest &md5, size_t length, const std::vector< AmbiguousRegion > amb)
std::string getSequence (int64_t pos, int64_t length) const
 Return the sequence of IUPAC codes for the chromosome as if by repeatedly calling CompactDnaSequence::getBase().
std::string getUnambiguousSequence (int64_t pos, int64_t length) const
 Return an unambiguous sequence of base calls as if by repeatedly calling CompactDnaSequence::getUnambiguousBase().
void appendSequence (std::string &seq, int64_t pos, int64_t length) const
 Append the sequence of IUPAC codes as if by repeatedly calling CompactDnaSequence::getBase().
void appendUnambiguousSequence (std::string &seq, int64_t pos, int64_t length) const
 Append an unambiguous sequence of base calls as if by repeatedly calling CompactDnaSequence::getUnambiguousBase().
char getBase (int64_t pos) const
 Get the IUPAC code for this chromosome at position pos.
char getUnambiguousBase (int64_t pos) const
 Return an unambiguous base call for this chromosome at position pos (as by util::BaseUtil::disambiguate(char)) that is consistent with the IUPAC code for the chromosome at this position.
size_t extendLeftBy3Mers (size_t pos, size_t count) const
 Return pos, extended to the left until it has passed by count distinct 3-mers of unambiguous reference sequence.
size_t extendRightBy3Mers (size_t pos, size_t count) const
 Return pos, extended to the right until it has passed by count distinct 3-mers of unambiguous reference sequence.
void validate () const
 Verify that the MD5 digests recorded in the crr file metadata match the digests recomputed from the data.
const std::string & getName () const
 Return the name of this chromosome.
bool isCircular () const
 Return whether this chromosome is circular.
const util::Md5Digest & getMd5Digest () const
 Return the md5 digest of the chromosome's sequence.
size_t length () const
 Return the length in bases of the chromosome.
const std::vector< AmbiguousRegion > & getAmbiguousRegions () const
 Return the list of AmbiguousRegion for this chromosome, in order by position.

Detailed Description

Class to describe the DNA sequence of a chromosome, in a compact manner.

Used internally by the CrrFile class.
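This page does not describe the packed layout behind packedData. As an illustration only, the sketch below assumes 2 bits per unambiguous base, four bases per byte, with ambiguity codes tracked separately (as the AmbiguousRegion list in the constructor suggests). The helpers packBases and unpackBase are hypothetical and not part of cgatools; the real encoding may differ.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper, not part of cgatools: pack A/C/G/T into 2 bits each,
// four bases per byte, lowest-order bits first (an assumed layout).
std::vector<std::uint8_t> packBases(const std::string& seq)
{
    std::vector<std::uint8_t> packed((seq.size() + 3) / 4, 0);
    for (std::size_t i = 0; i < seq.size(); ++i) {
        std::uint8_t code = 0;
        switch (seq[i]) {
            case 'A': code = 0; break;
            case 'C': code = 1; break;
            case 'G': code = 2; break;
            case 'T': code = 3; break;
            default:  code = 0; break; // ambiguity codes would be recorded elsewhere
        }
        packed[i / 4] |= static_cast<std::uint8_t>(code << (2 * (i % 4)));
    }
    return packed;
}

// Hypothetical helper: read back the base stored at position pos.
char unpackBase(const std::vector<std::uint8_t>& packed, std::size_t pos)
{
    static const char kBases[] = {'A', 'C', 'G', 'T'};
    return kBases[(packed[pos / 4] >> (2 * (pos % 4))) & 3];
}
```

Under this assumed layout, an 8-base sequence occupies 2 bytes instead of 8, which is the kind of saving the word "compact" points at.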


Member Function Documentation

void cgatools::reference::CompactDnaSequence::appendSequence ( std::string &  seq,
int64_t  pos,
int64_t  length 
) const

Append the sequence of IUPAC codes as if by repeatedly calling CompactDnaSequence::getBase().
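The "as if by repeatedly calling getBase()" contract can be sketched with a stand-in class. ToySequence below is hypothetical and backed by a plain std::string; it mirrors only the documented relationship between getBase(), appendSequence() and getSequence(), not the real implementation.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical stand-in for CompactDnaSequence, backed by a plain string.
class ToySequence
{
public:
    explicit ToySequence(const std::string& bases) : bases_(bases) {}

    // Bounds-checked lookup of a single IUPAC code.
    char getBase(std::int64_t pos) const
    {
        return bases_.at(static_cast<std::size_t>(pos));
    }

    // Same result as calling getBase() for pos, pos+1, ..., pos+length-1
    // and appending each returned code to seq.
    void appendSequence(std::string& seq, std::int64_t pos, std::int64_t length) const
    {
        for (std::int64_t i = 0; i < length; ++i)
            seq.push_back(getBase(pos + i));
    }

    // Convenience form that starts from an empty string.
    std::string getSequence(std::int64_t pos, std::int64_t length) const
    {
        std::string seq;
        appendSequence(seq, pos, length);
        return seq;
    }

private:
    std::string bases_;
};
```

The append form keeps whatever is already in seq, so a caller can build one string from several ranges without intermediate copies.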

void cgatools::reference::CompactDnaSequence::appendUnambiguousSequence ( std::string &  seq,
int64_t  pos,
int64_t  length 
) const

Append an unambiguous sequence of base calls as if by repeatedly calling CompactDnaSequence::getUnambiguousBase().

size_t cgatools::reference::CompactDnaSequence::extendLeftBy3Mers ( size_t  pos,
size_t  count 
) const

Return pos, extended to the left until it has passed by count distinct 3-mers of unambiguous reference sequence.

This function stops at the chromosome end, even for circular chromosomes.

size_t cgatools::reference::CompactDnaSequence::extendRightBy3Mers ( size_t  pos,
size_t  count 
) const

Return pos, extended to the right until it has passed by count distinct 3-mers of unambiguous reference sequence.

This function stops at the chromosome end, even for circular chromosomes.

const std::vector<AmbiguousRegion>& cgatools::reference::CompactDnaSequence::getAmbiguousRegions (  )  const [inline]

Return the list of AmbiguousRegion for this chromosome, in order by position.
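Because the list is ordered by position, a caller can find the region covering a given position with a binary search. The sketch below uses a hypothetical ToyAmbiguousRegion struct; its field names (code, offset, length) are assumptions for illustration and may not match the real AmbiguousRegion. It also assumes regions do not overlap.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for AmbiguousRegion; field names are assumptions.
struct ToyAmbiguousRegion
{
    char code;            // IUPAC code shared by every base in the region
    std::uint32_t offset; // first position covered by the region
    std::uint32_t length; // number of bases covered
};

// Return the IUPAC code covering pos, or '\0' if pos is in no region.
// Requires regions sorted by offset and non-overlapping.
char ambiguousCodeAt(const std::vector<ToyAmbiguousRegion>& regions,
                     std::uint32_t pos)
{
    // Find the first region that starts strictly after pos.
    auto it = std::upper_bound(
        regions.begin(), regions.end(), pos,
        [](std::uint32_t p, const ToyAmbiguousRegion& r) { return p < r.offset; });
    if (it == regions.begin())
        return '\0'; // every region starts after pos
    --it;            // last region starting at or before pos
    return (pos - it->offset < it->length) ? it->code : '\0';
}
```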

char cgatools::reference::CompactDnaSequence::getBase ( int64_t  pos  )  const

Get the IUPAC code for this chromosome at position pos.

For circular chromosomes, pos is allowed to range from -length to 2*length-1.
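One plausible reading of the extended range is that positions one full turn before or after the chromosome wrap around onto it. The sketch below assumes exactly that; normalizeCircularPos and getBaseCircular are hypothetical helpers, and throwing on out-of-range input is also an assumption rather than documented behaviour.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper: map pos in [-length, 2*length-1] onto [0, length-1],
// assuming positions wrap around a circular chromosome.
std::int64_t normalizeCircularPos(std::int64_t pos, std::int64_t length)
{
    if (pos < -length || pos >= 2 * length)
        throw std::out_of_range("pos outside [-length, 2*length-1]");
    if (pos < 0)
        return pos + length; // one turn before the origin
    if (pos >= length)
        return pos - length; // one turn past the end
    return pos;
}

// Hypothetical helper: look up a base in a plain string with wrap-around.
char getBaseCircular(const std::string& bases, std::int64_t pos)
{
    const std::int64_t length = static_cast<std::int64_t>(bases.size());
    return bases[static_cast<std::size_t>(normalizeCircularPos(pos, length))];
}
```

With a 4-base sequence "ACGT", position -1 maps to index 3 and position 4 maps to index 0, so a window that straddles the origin can be read with consecutive positions.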

const util::Md5Digest& cgatools::reference::CompactDnaSequence::getMd5Digest (  )  const [inline]

Return the md5 digest of the chromosome's sequence.

In particular, this is the md5 of the IUPAC codes of the chromosome, converted to upper case.

std::string cgatools::reference::CompactDnaSequence::getSequence ( int64_t  pos,
int64_t  length 
) const

Return the sequence of IUPAC codes for the chromosome as if by repeatedly calling CompactDnaSequence::getBase().

char cgatools::reference::CompactDnaSequence::getUnambiguousBase ( int64_t  pos  )  const

Return an unambiguous base call for this chromosome at position pos (as by util::BaseUtil::disambiguate(char)) that is consistent with the IUPAC code for the chromosome at this position.

For circular chromosomes, pos is allowed to range from -length to 2*length-1.
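To make "consistent with the IUPAC code" concrete, the sketch below maps each IUPAC nucleotide code to one of the bases it permits. The function disambiguateSketch is hypothetical, and its rule (pick the alphabetically first permitted base) is an assumption; this page does not say which base util::BaseUtil::disambiguate(char) actually returns.

```cpp
#include <cctype>
#include <stdexcept>

// Hypothetical sketch: return one base permitted by the given IUPAC code.
// The choice of the alphabetically first permitted base is an assumption.
char disambiguateSketch(char iupac)
{
    switch (std::toupper(static_cast<unsigned char>(iupac))) {
        // Codes whose permitted set includes A: R=AG, W=AT, M=AC,
        // D=AGT, H=ACT, V=ACG, N=ACGT.
        case 'A': case 'R': case 'W': case 'M':
        case 'D': case 'H': case 'V': case 'N':
            return 'A';
        // Codes without A whose set includes C: Y=CT, S=CG, B=CGT.
        case 'C': case 'Y': case 'S': case 'B':
            return 'C';
        // K=GT is the only remaining ambiguity code.
        case 'G': case 'K':
            return 'G';
        case 'T':
            return 'T';
        default:
            throw std::invalid_argument("not an IUPAC nucleotide code");
    }
}
```

Any deterministic rule of this kind satisfies the stated consistency requirement; what matters for callers is that an unambiguous base at a position is always one the IUPAC code allows.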

std::string cgatools::reference::CompactDnaSequence::getUnambiguousSequence ( int64_t  pos,
int64_t  length 
) const

Return an unambiguous sequence of base calls as if by repeatedly calling CompactDnaSequence::getUnambiguousBase().

void cgatools::reference::CompactDnaSequence::validate (  )  const

Verify that the MD5 digests recorded in the crr file metadata match the digests recomputed from the data.


The documentation for this class was generated from the following file:

Generated by doxygen 1.6.2