fusor.harvester

Harvester methods for output from different fusion callers

class fusor.harvester.ArribaHarvester(fusor, assembly)[source]

Class for harvesting Arriba data

column_rename: ClassVar[dict[str, str]] = {'#gene1': 'gene1', 'reading_frame': 'rf', 'strand1(gene/fusion)': 'strand1', 'strand2(gene/fusion)': 'strand2', 'type': 'event_type'}[source]
coordinate_type: CoordinateType = 'residue'[source]
delimiter: str = '\t'[source]
fusion_caller[source]

alias of Arriba

translator_class[source]

alias of ArribaTranslator
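The class attributes above describe how a caller's output file is read: delimiter splits each line into columns, and column_rename maps caller-specific headers to attribute-friendly names. The following is a minimal standard-library sketch of that idea, not FUSOR's implementation. The helper read_renamed_rows and the sample row are hypothetical; only the mapping itself is copied from the listing above.

```python
import csv
import io

# Mapping copied from ArribaHarvester.column_rename above.
COLUMN_RENAME = {
    "#gene1": "gene1",
    "reading_frame": "rf",
    "strand1(gene/fusion)": "strand1",
    "strand2(gene/fusion)": "strand2",
    "type": "event_type",
}


def read_renamed_rows(handle, delimiter="\t", column_rename=COLUMN_RENAME):
    """Yield one dict per data row with headers renamed (hypothetical helper)."""
    reader = csv.DictReader(handle, delimiter=delimiter)
    for row in reader:
        # Columns absent from the mapping keep their original name.
        yield {column_rename.get(key, key): value for key, value in row.items()}


# Made-up sample content, for illustration only.
sample = io.StringIO("#gene1\tgene2\ttype\nGENE_A\tGENE_B\ttranslocation\n")
rows = list(read_renamed_rows(sample))
```

Passing delimiter="," would cover a comma-separated caller such as JAFFA in the same way.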

class fusor.harvester.CIVICHarvester(fusor, update_cache=False, update_from_remote=True, local_cache_path=civic.LOCAL_CACHE_PATH, include_status=None)[source]

Class for harvesting CIViC Fusion objects

__init__(fusor, update_cache=False, update_from_remote=True, local_cache_path=civic.LOCAL_CACHE_PATH, include_status=None)[source]

Initialize the CIVICHarvester class.

Parameters:
  • fusor (FUSOR) – A FUSOR class instance

  • update_cache (bool) – True if the civicpy cache should be updated (note that this takes several minutes); False to use the local cache.

  • update_from_remote (bool) – If set to True, civicpy.update_cache will first download the remote cache designated by REMOTE_CACHE_URL, store it to LOCAL_CACHE_PATH, and then load the downloaded cache into memory. This parameter defaults to True.

  • local_cache_path (str) – A filepath destination for the retrieved remote cache. This parameter defaults to LOCAL_CACHE_PATH from civicpy.

  • include_status (Optional[list[Literal['accepted', 'submitted', 'rejected']]]) – Which fusion variants to include from the civicpy cache: accepted, submitted, and/or rejected. If include_status is None, it defaults to accepted.
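The include_status default described above can be illustrated with a short sketch. This is an assumption about the behavior as documented, not code from FUSOR or civicpy: resolve_statuses, filter_by_status, and the dictionary-shaped variants are all hypothetical.

```python
from typing import Literal, Optional

Status = Literal["accepted", "submitted", "rejected"]


def resolve_statuses(include_status: Optional[list[Status]] = None) -> list[Status]:
    """Fall back to accepted variants only when no status list is given."""
    return ["accepted"] if include_status is None else list(include_status)


def filter_by_status(variants: list[dict], include_status: Optional[list[Status]] = None) -> list[dict]:
    """Keep only variants whose status is in the resolved list (hypothetical helper)."""
    wanted = set(resolve_statuses(include_status))
    return [variant for variant in variants if variant["status"] in wanted]


# Made-up variants, for illustration only.
variants = [
    {"name": "v1", "status": "accepted"},
    {"name": "v2", "status": "submitted"},
    {"name": "v3", "status": "rejected"},
]
default_names = [v["name"] for v in filter_by_status(variants)]
wider_names = [v["name"] for v in filter_by_status(variants, ["accepted", "submitted"])]
```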

async load_records()[source]

Convert CIViC fusions to CategoricalFusion objects

Return type:

list[CategoricalFusion]

Returns:

A list of CategoricalFusion objects

translator_class[source]

alias of CIVICTranslator

class fusor.harvester.CiceroHarvester(fusor, assembly)[source]

Class for harvesting Cicero data

column_rename: ClassVar[dict[str, str]] = {'chrA': 'chr_5prime', 'chrB': 'chr_3prime', 'coverageA': 'coverage_5prime', 'coverageB': 'coverage_3prime', 'geneA': 'gene_5prime', 'geneB': 'gene_3prime', 'posA': 'pos_5prime', 'posB': 'pos_3prime', 'readsA': 'reads_5prime', 'readsB': 'reads_3prime', 'type': 'event_type'}[source]
coordinate_type: CoordinateType = 'residue'[source]
delimiter: str = '\t'[source]
fusion_caller[source]

alias of Cicero

translator_class[source]

alias of CiceroTranslator

class fusor.harvester.EnFusionHarvester(fusor, assembly)[source]

Class for harvesting EnFusion data

column_rename: ClassVar[dict[str, str]] = {'Break1': 'break_5prime', 'Break2': 'break_3prime', 'Chr1': 'chr_5prime', 'Chr2': 'chr_3prime', 'FusionJunctionSequence': 'fusion_junction_sequence', 'Gene1': 'gene_5prime', 'Gene2': 'gene_3prime'}[source]
coordinate_type: CoordinateType = 'residue'[source]
delimiter: str = '\t'[source]
fusion_caller[source]

alias of EnFusion

translator_class[source]

alias of EnFusionTranslator

class fusor.harvester.FusionCallerHarvester(fusor, assembly)[source]

ABC for fusion caller harvesters

__init__(fusor, assembly)[source]

Initialize FusionCallerHarvester

Parameters:
  • fusor (FUSOR) – A FUSOR object

  • assembly (Assembly) – The assembly that the coordinates are described on

column_rename: ClassVar[dict[str, str]][source]
coordinate_type: CoordinateType[source]
delimiter: str[source]
fusion_caller: type[FusionCaller][source]
async harvest_records(fusion_path)[source]

Read rows of fusion caller output and yield each one as a pair: the raw fusion caller output row and the corresponding FusionCaller object.

Parameters:

fusion_path (Path) – The path to the fusions file

Raises:

ValueError – if the file does not exist at the specified path

Return type:

AsyncGenerator[tuple[dict, FusionCaller], None]

Returns:

An async generator that yields each raw fusion row paired with its FusionCaller object.
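Because the return type is an AsyncGenerator, results are consumed with async for rather than awaited as a list. The sketch below shows that consumption pattern with a stand-in generator. The function fake_harvest_records, the collect helper, and the string used in place of a FusionCaller object are hypothetical and do not reproduce FUSOR's parsing.

```python
import asyncio
import csv
import tempfile
from pathlib import Path
from typing import AsyncGenerator


async def fake_harvest_records(fusion_path: Path) -> AsyncGenerator[tuple[dict, str], None]:
    """Stand-in for harvest_records: yields (raw row, parsed call) pairs."""
    if not fusion_path.exists():
        # Mirrors the documented ValueError for a missing file.
        raise ValueError(f"{fusion_path} does not exist")
    with fusion_path.open(newline="") as handle:
        for row in csv.DictReader(handle, delimiter="\t"):
            # A real harvester would build a FusionCaller object here.
            yield row, f"{row['gene1']}::{row['gene2']}"


async def collect(fusion_path: Path) -> list[tuple[dict, str]]:
    """Drain the async generator into a list."""
    return [pair async for pair in fake_harvest_records(fusion_path)]


# Write a made-up two-line file and harvest it.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "fusions.tsv"
    path.write_text("gene1\tgene2\nGENE_A\tGENE_B\n")
    pairs = asyncio.run(collect(path))
```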

async load_record_table(fusion_path)[source]

Convert rows of fusion caller output to FusionCallerRecord objects.

Each entry in the returned list includes the raw fusion caller output row, as well as the annotated fusion model. Any errors encountered when annotating fusions are captured and returned per fusion.

Parameters:

fusion_path (Path) – The path to the fusions file

Raises:

ValueError – if the file does not exist at the specified path

Return type:

list[FusionCallerRecord]

Returns:

A list of fusion caller records
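The per-fusion error capture described for load_record_table can be sketched as follows. This is an illustration of the pattern only, under the assumption that a failed translation leaves the annotated field empty. Record, build_table, and toy_translate are hypothetical stand-ins, not FUSOR classes or functions.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class Record:
    """Hypothetical stand-in mirroring the three FusionCallerRecord fields."""

    source: dict[str, Any]
    annotated: Optional[str] = None
    annotation_error: Optional[str] = None


def build_table(rows: list[dict[str, Any]], translate: Callable[[dict], str]) -> list[Record]:
    """Translate each row, storing the error message instead of raising."""
    table = []
    for row in rows:
        try:
            table.append(Record(source=row, annotated=translate(row)))
        except Exception as exc:  # captured per row so one bad row does not stop the run
            table.append(Record(source=row, annotation_error=str(exc)))
    return table


def toy_translate(row: dict) -> str:
    """Toy translator that fails when the 3' partner is missing."""
    if not row.get("gene2"):
        raise ValueError("missing 3' partner")
    return f"{row['gene1']}::{row['gene2']}"


table = build_table(
    [{"gene1": "GENE_A", "gene2": "GENE_B"}, {"gene1": "GENE_C", "gene2": ""}],
    toy_translate,
)
```

Each entry keeps its source row, so failed rows can be inspected afterwards by filtering on annotation_error.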

async load_records(fusion_path)[source]

Convert rows of fusion caller output to AssayedFusion and InternalTandemDuplication objects

Parameters:

fusion_path (Path) – The path to the fusions file

Raises:

ValueError – if the file does not exist at the specified path

Return type:

list[AssayedFusion | InternalTandemDuplication]

Returns:

A list of translated fusions, represented as AssayedFusion or InternalTandemDuplication objects

async translate(fusion)[source]

Call the translator for this fusion.

Parameters:

fusion (FusionCaller) – The fusion call entry.

Return type:

AbstractTranscriptStructuralVariant

Returns:

The translated fusion call. Usually an AssayedFusion.

translator_class: type[TypeVar(T, bound= Translator)][source]

class fusor.harvester.FusionCallerRecord(*args, **kwargs)[source]

Records a single entry in a fusion caller data table.

Parameters:
  • source – The untranslated fusion caller row

  • annotated – The annotated fusion data

  • annotation_error – Captures any error that occurred during fusion translation

__init__(*args, **kwargs)[source]
annotated: Optional[AbstractTranscriptStructuralVariant][source]
annotation_error: Optional[str][source]
source: dict[str, Any][source]

class fusor.harvester.FusionCatcherHarvester(fusor, assembly)[source]

Class for harvesting FusionCatcher data

column_rename: ClassVar[dict[str, str]] = {'Fusion_point_for_gene_1(5end_fusion_partner)': 'five_prime_fusion_point', 'Fusion_point_for_gene_2(3end_fusion_partner)': 'three_prime_fusion_point', 'Fusion_sequence': 'fusion_sequence', 'Gene_1_symbol(5end_fusion_partner)': 'five_prime_partner', 'Gene_2_symbol(3end_fusion_partner)': 'three_prime_partner', 'Predicted_effect': 'predicted_effect', 'Spanning_pairs': 'spanning_reads', 'Spanning_unique_reads': 'spanning_unique_reads'}[source]
coordinate_type: CoordinateType = 'residue'[source]
delimiter: str = '\t'[source]
fusion_caller[source]

alias of FusionCatcher

translator_class[source]

alias of FusionCatcherTranslator

class fusor.harvester.GenieHarvester(fusor, assembly)[source]

Class for harvesting Genie data

column_rename: ClassVar[dict[str, str]] = {'Annotation': 'annot', 'Site1_Chromosome': 'site1_chrom', 'Site1_Hugo_Symbol': 'site1_hugo', 'Site1_Position': 'site1_pos', 'Site2_Chromosome': 'site2_chrom', 'Site2_Effect_On_Frame': 'reading_frame', 'Site2_Hugo_Symbol': 'site2_hugo', 'Site2_Position': 'site2_pos'}[source]
coordinate_type: CoordinateType = 'residue'[source]
delimiter: str = '\t'[source]
fusion_caller[source]

alias of Genie

translator_class[source]

alias of GenieTranslator

class fusor.harvester.JAFFAHarvester(fusor, assembly)[source]

Class for harvesting JAFFA data

column_rename: ClassVar[dict[str, str]] = {'fusion genes': 'fusion_genes', 'spanning pairs': 'spanning_pairs', 'spanning reads': 'spanning_reads'}[source]
coordinate_type: CoordinateType = 'residue'[source]
delimiter: str = ','[source]
fusion_caller[source]

alias of JAFFA

translator_class[source]

alias of JAFFATranslator

class fusor.harvester.MOAHarvester(fusor, cache_dir=None, force_refresh=False, use_local=False)[source]

Class for harvesting Molecular Oncology Almanac (MOA) fusion data

__init__(fusor, cache_dir=None, force_refresh=False, use_local=False)[source]

Initialize MOAHarvester class

Parameters:
  • fusor (FUSOR) – A FUSOR object

  • cache_dir (Optional[Path]) – The path at which to store the cached MOA assertions. By default, this is set to None, and the MOA assertions are stored in the FUSOR_DATA_DIR directory.

  • force_refresh (bool) – A boolean indicating if the MOA assertions file should be regenerated. By default, this is set to False.

  • use_local (bool) – A boolean indicating if the latest locally available file should be used. By default, this is set to False.

load_records()[source]

Convert MOA records to CategoricalFusion objects

Return type:

list[CategoricalFusion]

Returns:

A list of CategoricalFusion objects

translator_class: MOATranslator[source]

class fusor.harvester.StarFusionHarvester(fusor, assembly)[source]

Class for harvesting STAR-Fusion data

column_rename: ClassVar[dict[str, str]] = {'JunctionReadCount': 'junction_read_count', 'LeftBreakpoint': 'left_breakpoint', 'LeftGene': 'left_gene', 'RightBreakpoint': 'right_breakpoint', 'RightGene': 'right_gene', 'SpanningFragCount': 'spanning_frag_count'}[source]
coordinate_type: CoordinateType = 'residue'[source]
delimiter: str = '\t'[source]
fusion_caller[source]

alias of STARFusion

translator_class[source]

alias of STARFusionTranslator