The civic module

CIViCpy is primarily designed to enable exploration of the content of CIViC through Python CivicRecord objects. While these record objects can be initialized independently, the civic module also provides several routines for getting records directly from CIViC. Use of these routines is recommended.

The civic module may be imported from civicpy at the top level:

>>> from civicpy import civic

CIViC records

class civic.CivicRecord(partial=False, **kwargs)

As a base class, CivicRecord defines the characteristics shared by all records in CIViC. This class is not intended to be invoked directly by the end user, but is provided to document the methods and variables shared by its child classes.

__init__(partial=False, **kwargs)

The record object may be initialized by the user, though the practice is discouraged. To do so, values for each of the object attributes (except type) must be specified as keyword arguments, or the partial parameter must be set to True. If partial is set to True, the id keyword argument is still required.

Users are encouraged to use the functions for getting records in lieu of directly initializing record objects.

Parameters:

partial (bool) – Indicates whether the set of object attributes passed is incomplete. If set to True, the id keyword is required.

type

The type of record. This field is set automatically by child classes and should not be changed.

id

The record ID. This is set on initialization using the id keyword argument, and reflects the primary ID for the record as stored in CIViC.

property site_link

Returns a URL string for the record's landing page on the CIViC web application.

update(allow_partial=True, force=False, **kwargs)

Updates the record object from the cache or the server. Keyword arguments may be passed to kwargs, which will update the corresponding attributes of the CivicRecord instance.

Parameters:
  • allow_partial (bool) – Flag to indicate whether the record will be updated according to the contents of CACHE, without requiring all attributes to be assigned.

  • force (bool) – Flag to indicate whether to force an update from the server, even if a full record exists in the cache.

Returns:

True if record is complete after update, else False.
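The following is a minimal sketch of the discouraged-but-supported pattern described above: creating a partial record from an id alone and then calling update() to fill it in. The helper name load_partial_gene and its civic parameter are hypothetical (they are not part of CIViCpy); the parameter only exists so the helper can be exercised with a stand-in module. Whether update() reads from the cache or contacts the server depends on your local setup.

```python
def load_partial_gene(gene_id, civic=None):
    """Create a partial Gene record and try to complete it via update().

    `civic` is a hypothetical injection hook, not a CIViCpy feature; when it
    is omitted, the real civicpy module is imported.
    """
    if civic is None:
        # Requires the civicpy package to be installed.
        from civicpy import civic
    # With partial=True, only the id keyword argument is required.
    gene = civic.Gene(partial=True, id=gene_id)
    # update() is documented to return True if the record is complete
    # after the update, else False.
    is_complete = gene.update(allow_partial=True, force=False)
    return gene, is_complete
```

In most cases the get_*_by_id functions described later in this section are the simpler choice.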

CIViC record types

The primary CIViC records are those found on the CIViC advanced search page, and are fully-formed records.

class civic.Gene(**kwargs)

Bases: CivicRecord

aliases

A list of alternate gene symbols by which this gene is referenced.

description

A curated summary of the clinical significance of this gene.

entrez_id

The Entrez ID associated with this gene.

name

The HGNC Gene Symbol associated with this gene.

sources

A list of CivicAttribute source objects associated with the gene description.

variants

A list of Variant records associated with this gene.

class civic.Variant(**kwargs)

Bases: CivicRecord

allele_registry_id

The ClinGen Allele Registry ID associated with this variant.

clinvar_entries

A list of ClinVar IDs associated with this variant.

coordinates

A CivicAttribute object describing CIViC coordinates.

entrez_id

The Entrez ID of the gene this variant belongs to.

entrez_name

The HGNC Gene Symbol of the gene this variant belongs to.

gene

The Gene this variant belongs to.

gene_id

The CivicRecord.id of the gene this variant belongs to.

hgvs_expressions

Curated HGVS expressions describing this variant.

name

The curated name given to this variant.

molecular_profiles

A list of MolecularProfile objects of all the molecular profiles involving this variant.

variant_aliases
aliases

A curated list of aliases by which this variant is referenced.

variant_groups
groups

A list of variant groups to which this variant belongs.

variant_types
types

A list of CivicAttribute objects describing variant types from the Sequence Ontology.
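As an illustration of how the Gene and Variant attributes above link together, the sketch below walks from a Gene through its variants to the names of their molecular profiles. It relies only on the attributes documented here (Gene.variants, Variant.molecular_profiles, and the name attribute); the function name is hypothetical.

```python
def profile_names_for_gene(gene):
    """Collect distinct molecular profile names reachable from a Gene.

    Names are returned in the order they are first encountered.
    """
    seen = set()
    names = []
    for variant in gene.variants:
        for profile in variant.molecular_profiles:
            if profile.name not in seen:
                seen.add(profile.name)
                names.append(profile.name)
    return names

# Typical use, assuming civicpy is installed and its cache is available:
#   from civicpy import civic
#   gene = civic.get_gene_by_id(some_gene_id)
#   print(profile_names_for_gene(gene))
```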

class civic.MolecularProfile(**kwargs)

Bases: CivicRecord

definition

A curated summary of the clinical significance of this molecular profile.

molecular_profile_score

The CIViC molecular profile score associated with this molecular profile.

name

The human readable name of this molecular profile, including gene and variant names.

class civic.Evidence(**kwargs)

Bases: CivicRecord

assertions

CIViC Assertion records containing this evidence.

description
statement

The Evidence Statement (returned as description by the CIViC API) is a brief summary of the clinical implications of the variant in the context of the specific disease, evidence_type, and significance as curated from the cited literature source.

disease

The cancer or cancer subtype context for the evidence record.

evidence_direction

One of ‘Supports’ or ‘Does Not Support’, indicating whether the evidence statement supports or refutes the significance of an event.

evidence_level

The evidence level describes the robustness of the study supporting the evidence item. Five different evidence levels are supported: “A - Validated association”, “B - Clinical evidence”, “C - Case study”, “D - Preclinical evidence”, and “E - Inferential association”. For more information, please see Understanding Levels.

evidence_type

Category of clinical action/relevance implicated by event. Refer to the additional documentation on evidence types for details on how to enter evidence of each of the six types: Predictive, Prognostic, Predisposing, Diagnostic, Functional, and Oncogenic.

molecular_profile

The MolecularProfile object this evidence item belongs to.

molecular_profile_id

The CivicRecord.id of the molecular profile this evidence item belongs to.

name

A system-generated unique identifier for the evidence record, e.g. EID7.

phenotypes

Zero or more phenotype CivicAttribute objects, linked to corresponding Human Phenotype Ontology (HPO) terms when applicable.

rating

The Evidence Rating is an integer from 1 to 5, indicating the curator’s confidence in the quality of the summarized evidence as a number of stars. For more information about this metric, please see Understanding Evidence Ratings.

significance

A string indicating the type of significance statement being made; values are defined based on the corresponding evidence_type. Please see Understanding Significance for more details on the expected values for this field.

source

A CivicAttribute source object from which this evidence was derived.

status

One of ‘accepted’, ‘rejected’, or ‘submitted’, describing the state of this evidence in the CIViC curation cycle. An evidence item needs to be reviewed by a CIViC editor before being accepted or rejected. Therefore “submitted” evidence might not be accurate or complete.

  • submitted: This evidence has been submitted by a CIViC curator or editor

  • accepted: This evidence has been reviewed and approved by a CIViC editor

  • rejected: This evidence has been reviewed and rejected by a CIViC editor

therapies

Zero or more therapy CivicAttribute objects, linked to corresponding NCIT terms when applicable. Only used with therapeutic response predictive evidence_type.

therapy_interaction_type

One of ‘Combination’, ‘Sequential’, or ‘Substitutes’, this field describes how multiple indicated therapies within a therapeutic response predictive evidence_type are related.
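Because “submitted” evidence might not be accurate or complete, a common first step is to filter evidence by status, evidence_level, and rating. The sketch below does this over any iterable of Evidence-like objects. The function name and default thresholds are illustrative choices, and it assumes that evidence_level begins with its letter code (for example "A" or "A - Validated association").

```python
def filter_evidence(evidence_items, min_rating=3, levels=("A", "B"),
                    status="accepted"):
    """Keep evidence with the given status, level letter, and minimum rating."""
    selected = []
    for item in evidence_items:
        if str(item.status).lower() != status:
            continue
        # Assumption: the level string starts with its letter code.
        if str(item.evidence_level)[:1].upper() not in levels:
            continue
        # Skip items without a rating or below the threshold.
        if item.rating is None or item.rating < min_rating:
            continue
        selected.append(item)
    return selected

# Typical use, assuming civicpy is installed:
#   from civicpy import civic
#   strong = filter_evidence(civic.get_all_evidence())
```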

class civic.Assertion(**kwargs)

Bases: CivicRecord

acmg_codes

Evidence codes used in the assessment of variants under the ACMG/AMP classification guidelines.

amp_level

The clinical interpretation classification by AMP/ASCO/CAP or ACMG/AMP guidelines.

assertion_direction

One of ‘Supports’ or ‘Does Not Support’, indicating whether the evidence statement supports or refutes the significance of an event.

assertion_type

Category of clinical action/relevance implicated by event. Refer to the additional documentation on assertion types for details on how to enter assertions of each of the five types: Predictive, Prognostic, Predisposing, Diagnostic, and Oncogenic.

description

The Assertion Description gives detail including practice guidelines and approved tests for the molecular profile. See curating assertions for more details.

disease

A disease CivicAttribute, linked to a corresponding Disease Ontology term when applicable.

fda_companion_test

A boolean indicating whether or not the assertion has an associated FDA companion test.

fda_regulatory_approval

A boolean indicating whether or not the therapies indicated in the assertion have regulatory approval for use in the treatment of the assertion disease.

molecular_profile

The MolecularProfile object this assertion belongs to.

molecular_profile_id

The CivicRecord.id of the molecular profile this assertion belongs to.

name

A system-generated unique identifier for the assertion, e.g. AID7.

nccn_guideline

A string linking the assertion to the corresponding NCCN Guidelines for treatment of cancer by disease site, if applicable.

nccn_guideline_version

The version associated with the indicated nccn_guideline document.

phenotypes

Zero or more phenotype CivicAttribute objects, linked to corresponding Human Phenotype Ontology (HPO) terms when applicable.

significance

A string indicating the type of significance statement being made; values are defined based on the corresponding evidence_type. Please see Understanding Significance for more details on the expected values for this field.

status

One of ‘accepted’, ‘rejected’, or ‘submitted’, describing the state of this assertion in the CIViC curation cycle. An Assertion needs to be reviewed by a CIViC editor before being accepted or rejected. Therefore “submitted” Assertions might not be accurate or complete.

  • submitted: This assertion has been submitted by a CIViC curator or editor

  • accepted: This assertion has been reviewed and approved by a CIViC editor

  • rejected: This assertion has been reviewed and rejected by a CIViC editor

summary

The Assertion Summary restates the Significance as a brief single sentence statement. It is intended for potential use in clinical reports. The Assertion Summary is designed for rapid communication of the Significance, especially when displayed in a longer list with other molecular profiles.

therapies

Zero or more therapy CivicAttribute objects, linked to corresponding NCIT terms when applicable. Only used with therapeutic response predictive evidence_type.

therapy_interaction_type

One of ‘Combination’, ‘Sequential’, or ‘Substitutes’, this field describes how multiple indicated therapies within a therapeutic response predictive evidence_type assertion are related.

variant_origin

The origin of the variants in this molecular profile, one of ‘Somatic’, ‘Rare Germline’, ‘Common Germline’, ‘Unknown’, ‘N/A’, ‘Germline or Somatic’, or ‘Mixed’.

CIViC attributes

The CivicAttribute class is a special type of CivicRecord that is not indexed, and is used as a base class for additional complex records beyond those mentioned above (e.g. diseases, therapies). CivicAttributes are not cached except as attached objects to non-CivicAttribute CivicRecord objects, and cannot be retrieved independently.

class civic.CivicAttribute(**kwargs)

Getting records

By ID

Records can be obtained by ID through a collection of functions provided in the civic module. Gene objects can be queried by the following methods:

civic.get_gene_by_id(gene_id)
Parameters:

gene_id (int) – A single CIViC gene ID.

Returns:

A Gene object.

civic.get_genes_by_ids(gene_id_list)
Parameters:

gene_id_list (list) – A list of CIViC gene IDs to query against the cache and (as needed) CIViC.

Returns:

A list of Gene objects.

civic.get_all_genes(include_status=['accepted', 'submitted', 'rejected'], allow_cached=True)

Queries CIViC for all genes.

Parameters:
  • include_status (list) – A list of statuses. Only genes and their associated entities matching the given statuses will be returned.

  • allow_cached (bool) – Indicates whether or not object retrieval from CACHE is allowed. If False it will query the CIViC database directly.

Returns:

A list of Gene objects.
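The gene functions above can be combined as in the following sketch, which returns sorted gene symbols either for a given list of IDs or for all genes with the requested statuses. The helper name gene_symbols and its civic parameter are hypothetical (the parameter is only an injection hook for a stand-in module); the calls it makes use the signatures documented above.

```python
def gene_symbols(gene_ids=None, statuses=("accepted",), civic=None):
    """Return sorted gene symbols for the given IDs, or for all genes.

    `civic` is a hypothetical injection hook, not a CIViCpy feature; when it
    is omitted, the real civicpy module is imported.
    """
    if civic is None:
        # Requires the civicpy package to be installed.
        from civicpy import civic
    if gene_ids is None:
        # Restrict the result to genes matching the requested statuses.
        genes = civic.get_all_genes(include_status=list(statuses))
    else:
        genes = civic.get_genes_by_ids(list(gene_ids))
    # Gene.name holds the HGNC gene symbol.
    return sorted(gene.name for gene in genes)
```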

Analogous methods exist for Variant, MolecularProfile, Assertion, and Evidence:

civic.get_variant_by_id(variant_id)
civic.get_variants_by_ids(variant_id_list)
civic.get_all_variants(include_status=['accepted', 'submitted', 'rejected'], allow_cached=True)
civic.get_molecular_profile_by_id(mp_id)
civic.get_molecular_profiles_by_ids(mp_id_list)
civic.get_all_molecular_profiles(include_status=['accepted', 'submitted', 'rejected'], allow_cached=True)
civic.get_assertion_by_id(assertion_id)
civic.get_assertions_by_ids(assertion_id_list=[], get_all=False)
civic.get_all_assertions(include_status=['accepted', 'submitted', 'rejected'], allow_cached=True)
civic.get_evidence_by_id(evidence_id)
civic.get_evidence_by_ids(evidence_id_list)
civic.get_all_evidence(include_status=['accepted', 'submitted', 'rejected'], allow_cached=True)

By Coordinate

Variant records can be searched by GRCh37 coordinates. To query specific genomic coordinates, you will need to construct a CoordinateQuery object, and pass this query to the search_variants_by_coordinates() function. If you wish to query multiple genomic coordinates (e.g. a set of variants observed in a patient tumor), construct a sorted list of CoordinateQuery objects (sorted by chr, start, stop, alt), and pass the list to the bulk_search_variants_by_coordinates() function.

class civic.CoordinateQuery(chr, start, stop, alt=None, ref=None, build='GRCh37', key=None)

A namedtuple with preset fields describing a genomic coordinate, for use with coordinate-based queries of CIViC Variants.

Parameters:
  • chr (str) – A chromosome of value 1-23, X, Y

  • start (int) – The chromosomal start position in base coordinates (1-based)

  • stop (int) – The chromosomal stop position in base coordinates (1-based)

  • alt (str optional) – The alternate nucleotide(s) at the designated coordinates

  • ref (str optional) – The reference nucleotide(s) at the designated coordinates

  • build (NCBI36,GRCh37,GRCh38) – The reference build version of the coordinates

  • key (Any optional) – A user-defined object linked to the coordinate
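Bulk searches expect the queries sorted by chr, start, stop, and alt. The sketch below shows one way to produce that ordering. To stay runnable without civicpy it uses a stand-in namedtuple (Query) with the same field names as CoordinateQuery; the positions are made-up values. It assumes plain string ordering for chr, which is worth confirming against the library before relying on it, and it treats a missing alt as an empty string so that None and str values can be compared.

```python
from collections import namedtuple

# Stand-in with the documented CoordinateQuery field names (hypothetical;
# with civicpy installed, civic.CoordinateQuery would be used instead).
Query = namedtuple("Query", ["chr", "start", "stop", "alt", "ref", "build", "key"])


def sort_key(query):
    """Order by chr, start, stop, alt; a missing alt sorts first."""
    return (str(query.chr), query.start, query.stop, query.alt or "")


def sort_queries(queries):
    """Return a new list of queries in (chr, start, stop, alt) order."""
    return sorted(queries, key=sort_key)


# Made-up coordinates, with key used to carry a user-defined label.
queries = [
    Query("7", 300, 300, "T", "A", "GRCh37", "a"),
    Query("12", 100, 100, None, None, "GRCh37", "b"),
    Query("7", 200, 250, "G", "T", "GRCh37", "c"),
]
sorted_queries = sort_queries(queries)
```

The key field is a convenient place to keep a sample or call identifier so that matches can be traced back to the input.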

civic.search_variants_by_coordinates(coordinate_query, search_mode='any')

Search the cache for variants matching provided coordinates using the corresponding search mode.

Parameters:
  • coordinate_query (CoordinateQuery) – Coordinates to query

  • search_mode (any,query_encompassing,variant_encompassing,exact) –

    any : any overlap between a query and a variant is a match

    query_encompassing : CIViC variant records must fit within the coordinates of the query

    record_encompassing : CIViC variant records must encompass the coordinates of the query

    exact : variants must match coordinates precisely, as well as reference allele(s) and alternate allele(s). Use '*' in the coordinate_query as a wildcard for reference and/or alternate alleles. Using None in the coordinate_query for reference or alternate alleles will only match variants that have no reference or alternate alleles, respectively (e.g. indels)

    search_mode is any by default

Returns:

Returns a list of variant hashes matching the coordinates and search_mode
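To make the four search modes concrete, the sketch below restates them as interval comparisons over plain dictionaries. It illustrates the semantics as described in this section and is not CIViCpy's implementation. Since the mode list above names variant_encompassing while the description uses record_encompassing, the sketch accepts both spellings as an assumption. The function and key names are hypothetical.

```python
def alleles_match(query_allele, variant_allele):
    """'*' is a wildcard; None only matches a variant with no allele."""
    if query_allele == "*":
        return True
    return query_allele == variant_allele


def coordinates_match(query, variant, search_mode="any"):
    """Compare two dicts with chr, start, stop and optional ref/alt keys."""
    if query["chr"] != variant["chr"]:
        return False
    q_start, q_stop = query["start"], query["stop"]
    v_start, v_stop = variant["start"], variant["stop"]
    if search_mode == "any":
        # Any overlap between the two closed intervals.
        return q_start <= v_stop and v_start <= q_stop
    if search_mode == "query_encompassing":
        # The variant must fit within the query.
        return q_start <= v_start and v_stop <= q_stop
    if search_mode in ("variant_encompassing", "record_encompassing"):
        # The variant must encompass the query.
        return v_start <= q_start and q_stop <= v_stop
    if search_mode == "exact":
        return (q_start == v_start and q_stop == v_stop
                and alleles_match(query.get("ref"), variant.get("ref"))
                and alleles_match(query.get("alt"), variant.get("alt")))
    raise ValueError("unknown search_mode: %r" % (search_mode,))
```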

civic.bulk_search_variants_by_coordinates(sorted_queries, search_mode='any')

An iterator to search the cache for variants matching the set of sorted coordinates and yield matches corresponding to the search mode.

Parameters:
  • sorted_queries (list[CoordinateQuery]) – Sorted list of coordinates to query

  • search_mode (any,query_encompassing,variant_encompassing,exact) –

    any : any overlap between a query and a variant is a match

    query_encompassing : CIViC variant records must fit within the coordinates of the query

    record_encompassing : CIViC variant records must encompass the coordinates of the query

    exact : variants must match coordinates precisely, as well as reference allele(s) and alternate allele(s). Use '*' in the coordinate_query as a wildcard for reference and/or alternate alleles. Using None in the coordinate_query for reference or alternate alleles will only match variants that have no reference or alternate alleles, respectively (e.g. indels)

    search_mode is any by default

Returns:

Returns a dictionary of Match lists, keyed by query.

Coordinates can also be used to query Assertion records:

civic.search_assertions_by_coordinates(coordinates, search_mode='any')

By Other Attribute

civic.search_variants_by_allele_registry_id(caid)

Search the cache for variants matching the queried Allele Registry ID (CAID)

Parameters:

caid (String) – Allele Registry ID to query

Returns:

Returns a list of variant hashes matching the Allele Registry ID

civic.search_variants_by_hgvs(hgvs)

Search the cache for variants matching the queried HGVS expression

Parameters:

hgvs (String) – HGVS expression to query

Returns:

Returns a list of variant hashes matching the HGVS expression

civic.search_variants_by_name(name)

Search the cache for variants matching the queried name

Parameters:

name (String) – Variant name to query

Returns:

Returns a list of variant hashes matching the name