fusor.models

Models for FUSOR classes.

class fusor.models.AbstractFusion(**data)[source]

Define AbstractFusion class.

classmethod enforce_abc(values)[source]

Ensure only subclasses can be instantiated.
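The "only subclasses can be instantiated" rule can be illustrated with a minimal sketch. It uses a plain Python class rather than the library's validator, and the names AbstractSketch and ConcreteSketch are hypothetical, as is the choice of ValueError:

```python
class AbstractSketch:
    """Hypothetical base class that refuses direct instantiation."""

    def __init__(self, **data):
        # Reject the base class itself; subclasses pass this check
        if type(self) is AbstractSketch:
            raise ValueError("Cannot instantiate AbstractSketch directly")
        self.data = data


class ConcreteSketch(AbstractSketch):
    """Hypothetical subclass, standing in for a concrete fusion model."""


# Instantiating the subclass succeeds; AbstractSketch() would raise ValueError
fusion = ConcreteSketch(type="AssayedFusion")
```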

classmethod enforce_element_quantities(values)[source]

Ensure the minimum number of elements, and require more than one unique gene.

To validate the unique genes rule, we extract gene IDs from the elements that designate genes, and take the number of total elements. If there is only one unique gene ID, and there are no non-gene-defining elements (such as an unknown partner), then we raise an error.
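A minimal sketch of the rule described above, written against plain dicts rather than the actual models. The function name, the "gene_id" key, and the default minimum of 2 elements are all assumptions made for illustration, not part of the documented API:

```python
def check_element_quantities(structure: list[dict], min_elements: int = 2) -> None:
    """Raise ValueError if the structure breaks the rules sketched here.

    Hypothetical stand-in: a gene-defining element carries its identifier
    under the assumed key "gene_id"; other elements omit that key.
    """
    if len(structure) < min_elements:
        raise ValueError(f"Need at least {min_elements} structural elements")

    # Gene IDs extracted from the elements that designate genes
    gene_ids = {el["gene_id"] for el in structure if "gene_id" in el}
    # Elements that do not define a gene, such as an unknown partner
    non_gene_elements = [el for el in structure if "gene_id" not in el]

    # One unique gene and nothing else in the structure is rejected
    if len(gene_ids) == 1 and not non_gene_elements:
        raise ValueError("Need more than one unique gene")
```

Under these assumptions, a structure of one gene element plus an unknown-partner element passes, while two elements carrying the same gene ID are rejected.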

classmethod structure_ends(values)[source]

Ensure start/end elements are of legal types and have fields required by their position.

type: FusionType[source]
viccNomenclature: Optional[Annotated[str]][source]
class fusor.models.AbstractTranscriptStructuralVariant(**data)[source]

Define AbstractTranscriptStructuralVariant class.

classmethod enforce_abc(values)[source]

Ensure only subclasses can be instantiated.

fivePrimeJunction: Optional[Annotated[str]][source]
readingFramePreserved: Optional[Annotated[bool]][source]
regulatoryElement: Optional[RegulatoryElement][source]
structure: list[BaseStructuralElement][source]
threePrimeJunction: Optional[Annotated[str]][source]
class fusor.models.AdditionalFields(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)[source]

Define possible fields that can be added to Fusion object.

LOCATION_ID = 'location_id'[source]
SEQUENCE_ID = 'sequence_id'[source]
class fusor.models.AnchoredReads(**data)[source]

Define AnchoredReads class.

This class can be used to report the number of reads that span the fusion junction. It is used at the TranscriptSegment level, as it indicates the transcript where the longer segment of the read is found.

reads: int[source]
type: Literal[<FUSORTypes.ANCHORED_READS: 'AnchoredReads'>][source]
class fusor.models.Assay(**data)[source]

Information pertaining to the assay used in identifying the fusion.

assayId: Optional[Annotated[str]][source]
assayName: Optional[Annotated[str]][source]
fusionDetection: Optional[Evidence][source]
methodUri: Optional[Annotated[str]][source]
type: Literal['Assay'][source]
class fusor.models.AssayedFusion(**data)[source]

Assayed gene fusions from biological specimens are directly detected using RNA-based gene fusion assays, or alternatively may be inferred from genomic rearrangements detected by whole genome sequencing or by coarser-scale cytogenomic assays. Example: an EWSR1 fusion inferred from a breakapart FISH assay.

assay: Optional[Assay][source]
causativeEvent: Optional[CausativeEvent][source]
contig: Optional[ContigSequence][source]
readData: Optional[ReadData][source]
structure: list[Annotated[TranscriptSegmentElement | GeneElement | TemplatedSequenceElement | LinkerElement | UnknownGeneElement | ContigSequence | ReadData]][source]
type: Literal[<FUSORTypes.ASSAYED_FUSION: 'AssayedFusion'>][source]
class fusor.models.BaseModelForbidExtra(**data)[source]

Base Pydantic model class with extra values forbidden.
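To show what "extra values forbidden" means in practice, here is a rough standard-library stand-in. It is not the Pydantic base class itself; ForbidExtraSketch, GeneSketch, and the allowed_fields attribute are hypothetical names used only for this illustration:

```python
class ForbidExtraSketch:
    """Hypothetical stand-in that rejects keyword arguments it does not know."""

    allowed_fields: frozenset = frozenset()

    def __init__(self, **data):
        # Any key outside the declared fields counts as an extra value
        extra = set(data) - self.allowed_fields
        if extra:
            raise ValueError(f"Extra fields not permitted: {sorted(extra)}")
        for name, value in data.items():
            setattr(self, name, value)


class GeneSketch(ForbidExtraSketch):
    """Hypothetical subclass that declares a single field."""

    allowed_fields = frozenset({"gene"})


element = GeneSketch(gene="example:1")
```

Passing an undeclared key, for example GeneSketch(gene="example:1", note="x"), raises ValueError in this sketch.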

class fusor.models.BaseStructuralElement(**data)[source]

Define BaseStructuralElement class.

type: StructuralElementType[source]
class fusor.models.BreakpointCoverage(**data)[source]

Define BreakpointCoverage class.

This class models breakpoint coverage, or the number of fragments that are retained near the breakpoint for a fusion partner.

fragmentCoverage: int[source]
type: Literal[<FUSORTypes.BREAKPOINT_COVERAGE: 'BreakpointCoverage'>][source]
class fusor.models.CategoricalFusion(**data)[source]

Categorical gene fusions are generalized concepts representing a class of fusions by their shared attributes, such as retained or lost regulatory elements and/or functional domains, and are typically curated from the biomedical literature for use in genomic knowledgebases.

criticalFunctionalDomains: Optional[list[FunctionalDomain]][source]
extensions: Optional[list[Extension]][source]
structure: list[Annotated[TranscriptSegmentElement | GeneElement | TemplatedSequenceElement | LinkerElement | MultiplePossibleGenesElement]][source]
type: Literal[<FUSORTypes.CATEGORICAL_FUSION: 'CategoricalFusion'>][source]
class fusor.models.CausativeEvent(**data)[source]

Define causative event information for a fusion.

The evaluation of a fusion may be influenced by the underlying mechanism that generated the fusion. Often this will be a DNA rearrangement, but it could also be a read-through or trans-splicing event.

eventDescription: Optional[Annotated[str]][source]
eventType: EventType[source]
type: Literal[<FUSORTypes.CAUSATIVE_EVENT: 'CausativeEvent'>][source]
class fusor.models.ContigSequence(**data)[source]

Define ContigSequence class.

This class models the assembled contig sequence that supports the reported fusion event.

contig: Annotated[str][source]
type: Literal[<FUSORTypes.CONTIG_SEQUENCE: 'ContigSequence'>][source]
class fusor.models.DomainStatus(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)[source]

Define possible statuses of functional domains.

LOST = 'lost'[source]
PRESERVED = 'preserved'[source]
class fusor.models.DuplicationType(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)[source]

Define possible Duplication types.

INTERNAL_TANDEM_DUPLICATION = 'InternalTandemDuplication'[source]
classmethod values()[source]

Provide all possible enum values.

Return type:

set[str]
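A values() helper of this kind can be sketched on a standard Enum as follows. The class name DuplicationTypeSketch is hypothetical, and mixing in str is an assumption about how the real enum is declared:

```python
from enum import Enum


class DuplicationTypeSketch(str, Enum):
    """Hypothetical mirror of the enum above, for illustration only."""

    INTERNAL_TANDEM_DUPLICATION = "InternalTandemDuplication"

    @classmethod
    def values(cls) -> set[str]:
        # Collect the string value of every member into a set
        return {member.value for member in cls}


all_values = DuplicationTypeSketch.values()
```

A set makes membership checks such as `"InternalTandemDuplication" in all_values` straightforward.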

class fusor.models.EventType(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)[source]

Permissible values for describing the underlying causative event driving an assayed fusion.

READ_THROUGH = 'read-through'[source]
REARRANGEMENT = 'rearrangement'[source]
TRANS_SPLICING = 'trans-splicing'[source]
class fusor.models.Evidence(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)[source]

Form of evidence supporting identification of the fusion.

INFERRED = 'inferred'[source]
OBSERVED = 'observed'[source]
class fusor.models.FUSORTypes(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)[source]

Define FUSOR object type values.

ANCHORED_READS = 'AnchoredReads'[source]
ASSAYED_FUSION = 'AssayedFusion'[source]
BREAKPOINT_COVERAGE = 'BreakpointCoverage'[source]
CATEGORICAL_FUSION = 'CategoricalFusion'[source]
CAUSATIVE_EVENT = 'CausativeEvent'[source]
CONTIG_SEQUENCE = 'ContigSequence'[source]
FUNCTIONAL_DOMAIN = 'FunctionalDomain'[source]
GENE_ELEMENT = 'GeneElement'[source]
INTERNAL_TANDEM_DUPLICATION = 'InternalTandemDuplication'[source]
LINKER_SEQUENCE_ELEMENT = 'LinkerSequenceElement'[source]
MULTIPLE_POSSIBLE_GENES_ELEMENT = 'MultiplePossibleGenesElement'[source]
READ_DATA = 'ReadData'[source]
REGULATORY_ELEMENT = 'RegulatoryElement'[source]
SPANNING_READS = 'SpanningReads'[source]
SPLIT_READS = 'SplitReads'[source]
TEMPLATED_SEQUENCE_ELEMENT = 'TemplatedSequenceElement'[source]
TRANSCRIPT_SEGMENT_ELEMENT = 'TranscriptSegmentElement'[source]
UNKNOWN_GENE_ELEMENT = 'UnknownGeneElement'[source]
class fusor.models.FunctionalDomain(**data)[source]

Define FunctionalDomain class.

associatedGene: MappableConcept[source]
id: Optional[Annotated[str]][source]
label: Optional[Annotated[str]][source]
sequenceLocation: Optional[SequenceLocation][source]
status: DomainStatus[source]
type: Literal[<FUSORTypes.FUNCTIONAL_DOMAIN: 'FunctionalDomain'>][source]
class fusor.models.FusionType(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)[source]

Specify possible Fusion types.

ASSAYED_FUSION = 'AssayedFusion'[source]
CATEGORICAL_FUSION = 'CategoricalFusion'[source]
classmethod values()[source]

Provide all possible enum values.

Return type:

set

class fusor.models.GeneElement(**data)[source]

Define Gene Element class.

gene: MappableConcept[source]
type: Literal[<FUSORTypes.GENE_ELEMENT: 'GeneElement'>][source]
class fusor.models.InternalTandemDuplication(**data)[source]

Internal tandem duplications are repeated transcribed elements within a gene as a result of focal duplications. These can be described in both an assayed and categorical context. These events differ from fusions in that the same gene symbol must be used for both event partners, indicating a duplication.

assay: Optional[Assay][source]
causativeEvent: Optional[CausativeEvent][source]
contig: Optional[ContigSequence][source]
criticalFunctionalDomains: Optional[list[FunctionalDomain]][source]
classmethod enforce_itd_element_quantities(values)[source]

Ensure the minimum number of elements for InternalTandemDuplications (ITDs).

To validate the unique genes rule, we extract gene IDs from the elements that designate genes, and take the number of total elements. If there is only one unique gene ID, and there are no non-gene-defining elements (such as an unknown partner), then we raise an error.

readData: Optional[ReadData][source]
structure: list[Annotated[TranscriptSegmentElement | GeneElement | TemplatedSequenceElement | LinkerElement | UnknownGeneElement | MultiplePossibleGenesElement]][source]
type: Literal[<FUSORTypes.INTERNAL_TANDEM_DUPLICATION: 'InternalTandemDuplication'>][source]
class fusor.models.LinkerElement(**data)[source]

Define LinkerElement class (linker sequence).

linkerSequence: LiteralSequenceExpression[source]
type: Literal[<FUSORTypes.LINKER_SEQUENCE_ELEMENT: 'LinkerSequenceElement'>][source]
class fusor.models.MultiplePossibleGenesElement(**data)[source]

Define MultiplePossibleGenesElement class.

This is primarily intended to represent a partner in a categorical fusion, typifying generalizable characteristics of a class of fusions such as retained or lost regulatory elements and/or functional domains, often curated from biomedical literature for use in genomic knowledgebases. For example, EWSR1 rearrangements are often found in Ewing and Ewing-like small round cell sarcomas, regardless of the partner gene. We would associate this assertion with the fusion of EWSR1 with a MultiplePossibleGenesElement.

type: Literal[<FUSORTypes.MULTIPLE_POSSIBLE_GENES_ELEMENT: 'MultiplePossibleGenesElement'>][source]
class fusor.models.ReadData(**data)[source]

Define ReadData class.

This class is used at the AssayedFusion level when a fusion caller reports metadata describing sequencing reads for the fusion event.

spanning: Optional[SpanningReads][source]
split: Optional[SplitReads][source]
type: Literal[<FUSORTypes.READ_DATA: 'ReadData'>][source]
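The nesting of ReadData, SplitReads, and SpanningReads can be pictured with a small dataclass sketch. These are standard-library stand-ins with hypothetical "Sketch" names, not the actual Pydantic models, and the None defaults for the optional fields are an assumption:

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class SplitReadsSketch:
    splitReads: int
    type: str = "SplitReads"


@dataclass
class SpanningReadsSketch:
    spanningReads: int
    type: str = "SpanningReads"


@dataclass
class ReadDataSketch:
    # Both fields are optional, mirroring the Optional[...] annotations above
    split: Optional[SplitReadsSketch] = None
    spanning: Optional[SpanningReadsSketch] = None
    type: str = "ReadData"


read_data = ReadDataSketch(
    split=SplitReadsSketch(splitReads=10),
    spanning=SpanningReadsSketch(spanningReads=4),
)
# asdict() recurses into the nested dataclasses
payload = asdict(read_data)
```

The read counts here are made-up illustrative values.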
class fusor.models.RegulatoryClass(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)[source]

Define possible classes of Regulatory Elements.

Options are the possible values for /regulatory_class value property in the INSDC controlled vocabulary.

ATTENUATOR = 'attenuator'[source]
CAAT_SIGNAL = 'caat_signal'[source]
ENHANCER = 'enhancer'[source]
ENHANCER_BLOCKING_ELEMENT = 'enhancer_blocking_element'[source]
GC_SIGNAL = 'gc_signal'[source]
IMPRINTING_CONTROL_REGION = 'imprinting_control_region'[source]
INSULATOR = 'insulator'[source]
LOCUS_CONTROL_REGION = 'locus_control_region'[source]
MINUS_10_SIGNAL = 'minus_10_signal'[source]
MINUS_35_SIGNAL = 'minus_35_signal'[source]
OTHER = 'other'[source]
POLYA_SIGNAL_SEQUENCE = 'polya_signal_sequence'[source]
PROMOTER = 'promoter'[source]
RESPONSE_ELEMENT = 'response_element'[source]
RIBOSOME_BINDING_SITE = 'ribosome_binding_site'[source]
RIBOSWITCH = 'riboswitch'[source]
SILENCER = 'silencer'[source]
TATA_BOX = 'tata_box'[source]
TERMINATOR = 'terminator'[source]
class fusor.models.RegulatoryElement(**data)[source]

Define RegulatoryElement class.

featureId would ideally be constrained as a CURIE, but ENCODE, our preferred feature ID source, doesn’t currently have a registered CURIE structure for EH_ identifiers. Consequently, we permit any kind of free text.

associatedGene: Optional[MappableConcept][source]
classmethod ensure_min_values(values)[source]

Ensure that one of {featureId, featureLocation}, and/or associatedGene is set.
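One possible reading of this check, sketched on a plain dict: at least one of the three fields must be present. This is an interpretation, not the library's code; the docstring could also be read as requiring exactly one of featureId and featureLocation, and the function name below is hypothetical:

```python
def ensure_min_values_sketch(values: dict) -> dict:
    """Hypothetical stand-in for the validator, operating on a plain dict."""
    has_feature = (
        values.get("featureId") is not None
        or values.get("featureLocation") is not None
    )
    has_gene = values.get("associatedGene") is not None
    # Assumed reading: a feature identifier/location or a gene is enough
    if not (has_feature or has_gene):
        raise ValueError(
            "Set featureId or featureLocation, and/or associatedGene"
        )
    return values
```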

featureId: Optional[str][source]
featureLocation: Optional[SequenceLocation][source]
regulatoryClass: RegulatoryClass[source]
type: Literal[<FUSORTypes.REGULATORY_ELEMENT: 'RegulatoryElement'>][source]
class fusor.models.SpanningReads(**data)[source]

Define SpanningReads class.

This class models the number of pairs of reads that support the reported fusion event.

spanningReads: int[source]
type: Literal[<FUSORTypes.SPANNING_READS: 'SpanningReads'>][source]
class fusor.models.SplitReads(**data)[source]

Define SplitReads class.

This class models the number of reads that cover the junction between the detected partners in the fusion.

splitReads: int[source]
type: Literal[<FUSORTypes.SPLIT_READS: 'SplitReads'>][source]
class fusor.models.StructuralElementType(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)[source]

Define possible structural element type values.

GENE_ELEMENT = 'GeneElement'[source]
LINKER_SEQUENCE_ELEMENT = 'LinkerSequenceElement'[source]
MULTIPLE_POSSIBLE_GENES_ELEMENT = 'MultiplePossibleGenesElement'[source]
TEMPLATED_SEQUENCE_ELEMENT = 'TemplatedSequenceElement'[source]
TRANSCRIPT_SEGMENT_ELEMENT = 'TranscriptSegmentElement'[source]
UNKNOWN_GENE_ELEMENT = 'UnknownGeneElement'[source]
class fusor.models.TemplatedSequenceElement(**data)[source]

Define TemplatedSequenceElement class.

A templated sequence is a contiguous genomic sequence found in the gene product.

region: SequenceLocation[source]
strand: Strand[source]
type: Literal[<FUSORTypes.TEMPLATED_SEQUENCE_ELEMENT: 'TemplatedSequenceElement'>][source]
class fusor.models.TranscriptSegmentElement(**data)[source]

Define TranscriptSegmentElement class.

anchoredReads: Optional[AnchoredReads][source]
classmethod check_exons(values)[source]

Check that at least one of {exonStart, exonEnd} is set. For each one that is set, check that the corresponding elementGenomic field is also set. For each one that is not set, set the corresponding offset to None.
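The exon checks can be sketched on a plain dict of field values. This is an illustrative reading of the docstring rather than the library's implementation; the function name is hypothetical, and raising ValueError is an assumption:

```python
def check_exons_sketch(values: dict) -> dict:
    """Hypothetical stand-in for the validator, operating on a plain dict."""
    values = dict(values)  # work on a copy so the caller's dict is untouched

    if values.get("exonStart") is None and values.get("exonEnd") is None:
        raise ValueError("Must set at least one of exonStart, exonEnd")

    for side in ("Start", "End"):
        if values.get(f"exon{side}") is not None:
            # A set exon requires the matching genomic location
            if values.get(f"elementGenomic{side}") is None:
                raise ValueError(f"Must set elementGenomic{side} with exon{side}")
        else:
            # No exon on this side, so its offset carries no meaning
            values[f"exon{side}Offset"] = None

    return values
```

With only exonStart and elementGenomicStart supplied, the sketch returns the values with exonEndOffset set to None.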

coverage: Optional[BreakpointCoverage][source]
elementGenomicEnd: Optional[SequenceLocation][source]
elementGenomicStart: Optional[SequenceLocation][source]
exonEnd: Optional[Annotated[int]][source]
exonEndOffset: Optional[Annotated[int]][source]
exonStart: Optional[Annotated[int]][source]
exonStartOffset: Optional[Annotated[int]][source]
gene: MappableConcept[source]
strand: Strand[source]
transcript: Annotated[str][source]
type: Literal[<FUSORTypes.TRANSCRIPT_SEGMENT_ELEMENT: 'TranscriptSegmentElement'>][source]
class fusor.models.UnknownGeneElement(**data)[source]

Define UnknownGeneElement class.

This is primarily intended to represent a partner in the result of a fusion partner-agnostic assay, which identifies the absence of an expected gene. For example, a FISH break-apart probe may indicate rearrangement of an MLL gene, but by design, the test cannot provide the identity of the new partner. In this case, we would associate any clinical observations from this patient with the fusion of MLL with an UnknownGeneElement.

type: Literal[<FUSORTypes.UNKNOWN_GENE_ELEMENT: 'UnknownGeneElement'>][source]
fusor.models.save_fusions_cache(variants_list, cache_name, cache_dir=None)[source]

Save a list of translated fusions as a cache.

Parameters:
  • variants_list (list[AssayedFusion | CategoricalFusion | InternalTandemDuplication]) – A list of FUSOR-translated fusions or ITDs

  • cache_name (str) – The name for the resultant cached file

  • cache_dir (Optional[Path]) – The location to store the cached file. If this parameter is not supplied, the file is stored in the FUSOR_DATA_DIR directory.

Return type:

None
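The general pattern of writing a list of objects to a named cache file, with a fallback directory, might look like the sketch below. Everything here is an assumption made for illustration: the reference above does not state the serialization format, so pickle is a guess, and save_cache_sketch and DEFAULT_CACHE_DIR are hypothetical names standing in for save_fusions_cache and FUSOR_DATA_DIR:

```python
import pickle
import tempfile
from pathlib import Path
from typing import Optional

# Hypothetical fallback location, standing in for FUSOR_DATA_DIR
DEFAULT_CACHE_DIR = Path(tempfile.gettempdir()) / "fusor_cache_sketch"


def save_cache_sketch(
    variants_list: list, cache_name: str, cache_dir: Optional[Path] = None
) -> None:
    """Write variants_list to cache_dir / cache_name (pickle is an assumption)."""
    if cache_dir is None:
        cache_dir = DEFAULT_CACHE_DIR
    cache_dir.mkdir(parents=True, exist_ok=True)
    with (cache_dir / cache_name).open("wb") as handle:
        pickle.dump(variants_list, handle)
```

A call such as `save_cache_sketch(fusions, "fusions.pkl", Path("my_cache"))` would then write the list under the given directory, creating it if needed.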