cgatools::util::baseutil Namespace Reference

DNA nucleotide (base) utility functions.

Functions

bool isValidBase (char base)
 Returns true for A, a, C, c, G, g, T, or t.
bool isCalledSequence (const std::string &sequence)
 Returns true if all the bases in the sequence are valid.
bool isCalledSequence (const std::string &sequence, size_t start, size_t end)
 Returns true if all the bases in the subsequence are valid.
bool isValidIupacCode (char iupacCode)
 Returns true if iupacCode is a valid IUPAC (base ambiguity) code.
uint32_t pack (char base)
 Returns an integer corresponding to the given base, or throws an exception if the base is invalid.
char unpack (uint32_t packedBase)
 Returns the unpacked base call.
char disambiguate (char iupacCode)
 Returns an unambiguous base call that is consistent with the given iupacCode.
bool isConsistent (char lhs, char rhs)
 Returns true if the given IUPAC codes are consistent.
bool isConsistent (const std::string &lhs, const std::string &rhs)
 Returns true if lhs and rhs are consistent.
bool isConsistent (const std::string &lhs, size_t lhsStart, size_t lhsEnd, const std::string &rhs, size_t rhsStart, size_t rhsEnd)
 Returns true if lhs and rhs are consistent for the given range of positions.
char complement (char iupacCode)
 Returns the complement of the given IUPAC code.
std::string reverseComplement (const std::string &sequence)
 Returns the reverse complement of the given sequence of IUPAC codes.

Detailed Description

DNA nucleotide (base) utility functions.


Function Documentation

char cgatools::util::baseutil::complement ( char  iupacCode  ) 

Returns the complement of the given IUPAC code.

For bases, the complements are as follows:

  • A -> T
  • C -> G
  • G -> C
  • T -> A

An ambiguous IUPAC code's complement is compatible with the complements of all the bases the original IUPAC code is compatible with.
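
The complement rule for ambiguous codes can be illustrated with a small standalone sketch. It is not the cgatools implementation: the `sketch` namespace, the table `kIupacCodes`, and the helper `iupacMask` are hypothetical names, and the sketch assumes the standard IUPAC nucleotide codes, accepts lowercase input, and always returns uppercase. Each code is treated as a 4-bit set of allowed bases, so complementing amounts to swapping the A/T bits and the C/G bits.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>

namespace sketch {

// Hypothetical encoding: the index of a code in this table is a 4-bit set of
// the bases it allows (bit 0 = A, bit 1 = C, bit 2 = G, bit 3 = T).
// Index 0 (the empty set) is a placeholder and is never a valid code.
const std::string kIupacCodes = ".ACMGRSVTWYHKDBN";

// Returns the base set for an IUPAC code, or throws if the code is unknown.
unsigned iupacMask(char code) {
    const char upper =
        static_cast<char>(std::toupper(static_cast<unsigned char>(code)));
    const std::size_t pos = kIupacCodes.find(upper);
    if (pos == std::string::npos || pos == 0) {
        throw std::invalid_argument("not an IUPAC code");
    }
    return static_cast<unsigned>(pos);
}

// Complementing swaps A<->T and C<->G, which reverses the order of the
// four bits in the base set.
char complement(char code) {
    const unsigned m = iupacMask(code);
    const unsigned swapped = ((m & 0x1u) << 3) | ((m & 0x2u) << 1) |
                             ((m & 0x4u) >> 1) | ((m & 0x8u) >> 3);
    return kIupacCodes[swapped];
}

} // namespace sketch
```

Under this encoding, R (A or G) maps to Y (C or T), and B (C, G, or T) maps to V (A, C, or G), which matches the rule stated above.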
char cgatools::util::baseutil::disambiguate ( char  iupacCode  ) 

Returns an unambiguous base call that is consistent with the given iupacCode.

bool cgatools::util::baseutil::isConsistent ( const std::string &  lhs,
size_t  lhsStart,
size_t  lhsEnd,
const std::string &  rhs,
size_t  rhsStart,
size_t  rhsEnd 
)

Returns true if lhs and rhs are consistent for the given range of positions.

Here, lhs and rhs are sequences of IUPAC codes, and in addition they may have zero or more '?' characters to indicate an unknown sequence of zero or more bases.

bool cgatools::util::baseutil::isConsistent ( const std::string &  lhs,
const std::string &  rhs 
)

Returns true if lhs and rhs are consistent.

Here, lhs and rhs are sequences of IUPAC codes, and in addition they may have zero or more '?' characters to indicate an unknown sequence of zero or more bases.
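
One way to read the '?' rule is as two-sided wildcard matching: the sequences are consistent if some concrete base sequence fits both patterns. The following is a minimal sketch of that reading using dynamic programming, not the library's actual algorithm. The `sketch` namespace and the helpers `kIupacCodes`, `iupacMask`, and `codesConsistent` are hypothetical, and the sketch assumes the standard IUPAC codes and case-insensitive input.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

namespace sketch {

// Hypothetical encoding: table index = 4-bit base set
// (bit 0 = A, bit 1 = C, bit 2 = G, bit 3 = T); index 0 is a placeholder.
const std::string kIupacCodes = ".ACMGRSVTWYHKDBN";

unsigned iupacMask(char code) {
    const char upper =
        static_cast<char>(std::toupper(static_cast<unsigned char>(code)));
    const std::size_t pos = kIupacCodes.find(upper);
    if (pos == std::string::npos || pos == 0) {
        throw std::invalid_argument("not an IUPAC code");
    }
    return static_cast<unsigned>(pos);
}

// Two codes are consistent if they share at least one base.
bool codesConsistent(char lhs, char rhs) {
    return (iupacMask(lhs) & iupacMask(rhs)) != 0;
}

// ok[i][j] is true if lhs[i..] and rhs[j..] can describe the same bases.
bool isConsistent(const std::string& lhs, const std::string& rhs) {
    const std::size_t m = lhs.size();
    const std::size_t n = rhs.size();
    std::vector<std::vector<char>> ok(m + 1, std::vector<char>(n + 1, 0));
    for (std::size_t i = m + 1; i-- > 0;) {
        for (std::size_t j = n + 1; j-- > 0;) {
            bool r = (i == m && j == n);
            // A '?' may stand for nothing, or absorb one symbol of the other side.
            if (!r && i < m && lhs[i] == '?') {
                r = ok[i + 1][j] || (j < n && ok[i][j + 1]);
            }
            if (!r && j < n && rhs[j] == '?') {
                r = ok[i][j + 1] || (i < m && ok[i + 1][j]);
            }
            // Two ordinary codes must be consistent, then the rest must match.
            if (!r && i < m && j < n && lhs[i] != '?' && rhs[j] != '?') {
                r = codesConsistent(lhs[i], rhs[j]) && ok[i + 1][j + 1];
            }
            ok[i][j] = r ? 1 : 0;
        }
    }
    return ok[0][0] != 0;
}

} // namespace sketch
```

With this reading, "A?T" is consistent with "ACGT" because the '?' can stand for "CG", while "?A" and "?C" are inconsistent because no sequence ends in both A and C.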

bool cgatools::util::baseutil::isConsistent ( char  lhs,
char  rhs 
) [inline]

Returns true if the given IUPAC codes are consistent.

Two IUPAC codes are considered consistent if there is some base A, C, G, or T such that both codes are consistent with that base.
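
The definition above amounts to a set-intersection test. As a rough sketch (not the library's inline implementation), each code can be mapped to the set of bases it allows, and two codes are consistent when the sets overlap. The `sketch` namespace, `kIupacCodes`, and `iupacMask` are hypothetical names, and the sketch assumes the standard IUPAC codes and case-insensitive input.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>

namespace sketch {

// Hypothetical encoding: table index = 4-bit base set
// (bit 0 = A, bit 1 = C, bit 2 = G, bit 3 = T); index 0 is a placeholder.
const std::string kIupacCodes = ".ACMGRSVTWYHKDBN";

// Returns the base set for an IUPAC code, or throws if the code is unknown.
unsigned iupacMask(char code) {
    const char upper =
        static_cast<char>(std::toupper(static_cast<unsigned char>(code)));
    const std::size_t pos = kIupacCodes.find(upper);
    if (pos == std::string::npos || pos == 0) {
        throw std::invalid_argument("not an IUPAC code");
    }
    return static_cast<unsigned>(pos);
}

// Consistent means: at least one of A, C, G, T is allowed by both codes.
bool isConsistent(char lhs, char rhs) {
    return (iupacMask(lhs) & iupacMask(rhs)) != 0;
}

} // namespace sketch
```

For example, R (A or G) is consistent with A, but not with Y (C or T), since the two sets share no base.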

bool cgatools::util::baseutil::isValidBase ( char  base  )  [inline]

Returns true for A, a, C, c, G, g, T, or t.

bool cgatools::util::baseutil::isValidIupacCode ( char  iupacCode  ) 

Returns true if iupacCode is a valid IUPAC (base ambiguity) code.

uint32_t cgatools::util::baseutil::pack ( char  base  ) 

Returns an integer corresponding to the given base, or throws an exception if the base is invalid.

Valid base calls:

  • A or a -> 0
  • C or c -> 1
  • G or g -> 2
  • T or t -> 3
std::string cgatools::util::baseutil::reverseComplement ( const std::string &  sequence  ) 

Returns the reverse complement of the given sequence of IUPAC codes.
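
A reverse complement can be sketched as complementing each code while walking the sequence backwards. The code below is an illustration under assumptions, not the cgatools source: the `sketch` namespace, `kIupacCodes`, and `iupacMask` are hypothetical, the standard IUPAC codes are assumed, and the output is always uppercase.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>

namespace sketch {

// Hypothetical encoding: table index = 4-bit base set
// (bit 0 = A, bit 1 = C, bit 2 = G, bit 3 = T); index 0 is a placeholder.
const std::string kIupacCodes = ".ACMGRSVTWYHKDBN";

unsigned iupacMask(char code) {
    const char upper =
        static_cast<char>(std::toupper(static_cast<unsigned char>(code)));
    const std::size_t pos = kIupacCodes.find(upper);
    if (pos == std::string::npos || pos == 0) {
        throw std::invalid_argument("not an IUPAC code");
    }
    return static_cast<unsigned>(pos);
}

// Swapping A<->T and C<->G reverses the four bits of the base set.
char complement(char code) {
    const unsigned m = iupacMask(code);
    const unsigned swapped = ((m & 0x1u) << 3) | ((m & 0x2u) << 1) |
                             ((m & 0x4u) >> 1) | ((m & 0x8u) >> 3);
    return kIupacCodes[swapped];
}

// Reads the input back to front and writes each code's complement.
std::string reverseComplement(const std::string& sequence) {
    std::string result(sequence.size(), '\0');
    std::transform(sequence.rbegin(), sequence.rend(), result.begin(),
                   complement);
    return result;
}

} // namespace sketch
```

In this sketch "AACG" becomes "CGTT", and a palindromic site such as "ACGT" maps to itself.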

char cgatools::util::baseutil::unpack ( uint32_t  packedBase  ) 

Returns the unpacked base call.

This can be used with baseutil::pack(char), such that baseutil::unpack(baseutil::pack(base)) is equivalent to toupper(base).
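
The round-trip property can be illustrated with a standalone sketch that follows the documented mapping (A=0, C=1, G=2, T=3). It is not the library code: the `sketch` namespace is hypothetical, the exception types are assumptions (the documentation only says that pack throws on an invalid base), and the behavior of unpack for values above 3 is not documented, so throwing there is also an assumption.

```cpp
#include <cstdint>
#include <stdexcept>

namespace sketch {

// Maps a base call to its 2-bit value, ignoring case.
std::uint32_t pack(char base) {
    switch (base) {
    case 'A': case 'a': return 0;
    case 'C': case 'c': return 1;
    case 'G': case 'g': return 2;
    case 'T': case 't': return 3;
    default:
        // Assumed exception type; the real library may use its own.
        throw std::invalid_argument("invalid base call");
    }
}

// Maps a packed value back to an uppercase base call.
char unpack(std::uint32_t packedBase) {
    static const char kBases[] = "ACGT";
    if (packedBase > 3) {
        // Assumed behavior; the documentation does not cover this case.
        throw std::out_of_range("packed base must be in 0..3");
    }
    return kBases[packedBase];
}

} // namespace sketch
```

Because pack ignores case and unpack always yields uppercase, `sketch::unpack(sketch::pack('g'))` gives 'G', mirroring the toupper equivalence described above.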


Generated by  doxygen 1.6.2