Public Member Functions | |
JunctionVcfWriter (boost::shared_ptr< reference::CrrFile > ref, boost::shared_ptr< junctions::JunctionFiles > junctionFiles) | |
void | writeJunctionVcfHeaders (std::ostream &out) const |
Writes VCF headers. | |
std::string | createJunctionVcfId (const JunctionRef &jref, size_t side) const |
Returns VCF id of the junction side. | |
std::string | formatPositionForVcf (reference::Location pos, const std::string &sep) const |
Return genomic position in the VCF format: chromosome name without "chr" and 1-based offset. | |
void | writeJunctionPositionToVcf (const junctions::JunctionSideSection &jss, size_t side, std::ostream &out) const |
void | writeJunctionAltFieldToVcf (const JunctionRef &jref, size_t side, std::ostream &out, bool suppressChrom=false) const |
Writes the contents of the ALT column for a VCF "adjacency" line that corresponds to the given junction side. | |
std::string | convertMobileElementToVcf (const std::string &med) const |
Parses mobile element deletion information in our format and returns a string that contains the same information in a more VCF-like format: comma-separated, no "chr" in the chromosome name, 1-based closed-interval coordinates. | |
void | writeJunctionInfoFieldToVcf (const JunctionRef &jref, size_t side, std::ostream &out) const |
Write semicolon-separated list of all INFO subfields. | |
void | addFilterFlag (std::ostream &out, const std::string &flag, bool &filtered) const |
Filter field helper. | |
void | writeJunctionFilterFieldToVcf (const JunctionRef &jref, std::ostream &out) const |
Writes a semicolon-separated list of all filters that this junction failed, or PASS if it passes all filters. | |
void | writeJunctionComparisonField (const JunctionRef &jref, size_t side, std::ostream &out) const |
Writes most of the junction information, squeezed into a single field. | |
void | writeJunctionToVcf (const JunctionRef &jref, size_t side, const JunctionCompatMapPerFile &compat, std::ostream &out) const |
Writes one side of a VCF "adjacency" to the given stream. | |
Public Attributes | |
std::string | fileFieldSeparator_ |
size_t | filterScoreThreshold_ |
size_t | filterSideLength_ |
Protected Member Functions | |
void | init () |
Initializes internal data such as sample IDs. | |
Protected Attributes | |
boost::shared_ptr < reference::CrrFile > | reference_ |
boost::shared_ptr < junctions::JunctionFiles > | junctionFiles_ |
std::vector< std::string > | sampleIds_ |
void cgatools::junctions::JunctionVcfWriter::addFilterFlag | ( | std::ostream & | out, | |
const std::string & | flag, | |||
bool & | filtered | |||
) | const |
Filter field helper.
std::string cgatools::junctions::JunctionVcfWriter::convertMobileElementToVcf | ( | const std::string & | med | ) | const |
Parses mobile element deletion information in our format and returns a string that contains the same information in a more VCF-like format: comma-separated, no "chr" in the chromosome name, 1-based closed-interval coordinates.
std::string cgatools::junctions::JunctionVcfWriter::createJunctionVcfId | ( | const JunctionRef & | jref, | |
size_t | side | |||
) | const |
Returns VCF id of the junction side.
VCF has a separate line for each side of the "adjacency", and we generate the ID out of the junction file ID by appending "L" or "R" depending on the side.
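The ID scheme described above can be illustrated with a minimal sketch. The helper name `sketchJunctionVcfId` is hypothetical, and it takes the junction file ID as a plain string instead of a `JunctionRef`. It also assumes that side 0 is the left side and that the suffix is appended with no delimiter; neither detail is stated here.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for createJunctionVcfId, not the cgatools code.
// Assumption: side 0 is the left side, any other value is the right side.
// Assumption: the suffix is appended directly, with no delimiter.
std::string sketchJunctionVcfId(const std::string& junctionId, std::size_t side)
{
    return junctionId + (side == 0 ? "L" : "R");
}
```

Under these assumptions, a junction with file ID `123` would yield the two VCF IDs `123L` and `123R`, one per adjacency line.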
std::string cgatools::junctions::JunctionVcfWriter::formatPositionForVcf | ( | reference::Location | pos, | |
const std::string & | sep | |||
) | const |
Returns the genomic position in VCF format: chromosome name without "chr" and 1-based offset.
Separator between the chromosome subfield and the position can be specified; if the separator is empty, the chromosome subfield is not printed at all.
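The formatting rule, including the empty-separator case, might look roughly like the sketch below. The name `sketchFormatPosition` is hypothetical. It takes a chromosome name and an offset directly rather than a `reference::Location`, and it assumes the incoming offset is 0-based, which is not stated here.

```cpp
#include <cstddef>
#include <string>

// Hypothetical sketch of the rule described for formatPositionForVcf.
// Assumption: the input offset is 0-based and the name may start with "chr".
std::string sketchFormatPosition(const std::string& chromName,
                                 std::size_t zeroBasedOffset,
                                 const std::string& sep)
{
    std::string result;
    if (!sep.empty()) {
        // Strip a leading "chr" so that "chr1" becomes "1".
        std::string name = chromName;
        if (name.compare(0, 3, "chr") == 0)
            name.erase(0, 3);
        result = name + sep;
    }
    // An empty separator means the chromosome subfield is omitted entirely.
    result += std::to_string(zeroBasedOffset + 1);
    return result;
}
```

With a tab separator this produces the CHROM and POS columns in one call, while an empty separator gives a bare position.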
void cgatools::junctions::JunctionVcfWriter::writeJunctionAltFieldToVcf | ( | const JunctionRef & | jref, | |
size_t | side, | |||
std::ostream & | out, | |||
bool | suppressChrom = false | |||
) | const |
Writes the contents of the ALT column for a VCF "adjacency" line that corresponds to the given junction side.
This column describes the orientation of the current side, the transition sequence, and the orientation of the other side. It consists of the sequence field (SF), the transition field (TF), and the oriented adjacent position field (OAP). The SF is the base closest to the junction on the current side. The TF sequence is in the strand orientation required to convert the current side of the junction to the reference strand. The OAP is the 1-based position of the other side's base that is closest to the junction, surrounded by either [ or ] characters, depending on whether the adjacent sequence extends rightward or leftward from the specified position. The field order is SF TF OAP if the current side points right, and OAP TF SF otherwise.
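The field ordering can be made concrete with a small sketch. The function `sketchAltField` and its parameters are hypothetical. It takes the SF, the TF, and the already formatted adjacent position as strings, plus two flags for the orientations. Following the VCF breakend notation, it assumes `[` marks an adjacent sequence that extends rightward and `]` one that extends leftward.

```cpp
#include <string>

// Hypothetical sketch of the ALT layout described for
// writeJunctionAltFieldToVcf; not the cgatools implementation.
std::string sketchAltField(const std::string& sf,
                           const std::string& tf,
                           const std::string& otherPos,
                           bool currentSidePointsRight,
                           bool otherExtendsRight)
{
    // Assumption: '[' means the adjacent sequence extends rightward from
    // the given position, ']' means it extends leftward.
    const char bracket = otherExtendsRight ? '[' : ']';
    const std::string oap = bracket + otherPos + bracket;

    // SF TF OAP if the current side points right, OAP TF SF otherwise.
    return currentSidePointsRight ? sf + tf + oap : oap + tf + sf;
}
```

For example, an SF of `G`, an empty TF, and an adjacent position `17:198982` that extends rightward give `G[17:198982[` when the current side points right.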
void cgatools::junctions::JunctionVcfWriter::writeJunctionFilterFieldToVcf | ( | const JunctionRef & | jref, | |
std::ostream & | out | |||
) | const |
Writes a semicolon-separated list of all filters that this junction failed, or PASS if it passes all filters.
The set of filters is the same as the high-confidence file filters, with the exception of not removing baseline cross-chr junctions.
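One plausible way the `addFilterFlag` helper and the PASS fallback fit together is sketched below. The functions `sketchAddFilterFlag` and `sketchFilterField` are hypothetical. So are the filter names `FilterA` and `FilterB`, since the actual filter names and the tests behind them are not listed here.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper modeled on the addFilterFlag signature: writes a
// separator before every flag except the first, and records that at
// least one filter has failed.
void sketchAddFilterFlag(std::ostream& out, const std::string& flag, bool& filtered)
{
    if (filtered)
        out << ';';
    out << flag;
    filtered = true;
}

// Hypothetical FILTER column builder; "FilterA" and "FilterB" are
// placeholder names, not the real cgatools filter names.
std::string sketchFilterField(bool failsA, bool failsB)
{
    std::ostringstream out;
    bool filtered = false;
    if (failsA)
        sketchAddFilterFlag(out, "FilterA", filtered);
    if (failsB)
        sketchAddFilterFlag(out, "FilterB", filtered);
    if (!filtered)
        out << "PASS";  // no filter failed
    return out.str();
}
```

Passing `filtered` by reference lets each check stay a single line while the helper takes care of the separators.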
void cgatools::junctions::JunctionVcfWriter::writeJunctionInfoFieldToVcf | ( | const JunctionRef & | jref, | |
size_t | side, | |||
std::ostream & | out | |||
) | const |
Write semicolon-separated list of all INFO subfields.
Currently we write the type (always BND for "breakend"), ID of the other side of the junction, frequency in the baseline, and xref and deleted mobile element fields if present.
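As a rough sketch of the resulting INFO string: the function `sketchInfoField` is hypothetical. `SVTYPE` and `MATEID` are the standard VCF keys for the breakend type and the mate's ID, and using them here is an assumption. The keys for the baseline frequency, xref, and deleted mobile element subfields are not given here, so they are passed in as generic key/value pairs.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch of the INFO column; not the cgatools implementation.
// Assumption: the type and mate ID use the standard SVTYPE and MATEID keys.
std::string sketchInfoField(
    const std::string& mateId,
    const std::vector<std::pair<std::string, std::string>>& optionalFields)
{
    std::string info = "SVTYPE=BND;MATEID=" + mateId;
    // Optional subfields (e.g. xref) are appended only when present.
    for (const auto& kv : optionalFields)
        info += ";" + kv.first + "=" + kv.second;
    return info;
}
```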
void cgatools::junctions::JunctionVcfWriter::writeJunctionToVcf | ( | const JunctionRef & | jref, | |
size_t | side, | |||
const JunctionCompatMapPerFile & | compat, | |||
std::ostream & | out | |||
) | const |
Writes one side of a VCF "adjacency" to the given stream.
Note that, regardless of the strand of our junction side, the VCF record is always written relative to the primary strand.