.. _install:

Installation
============

Prerequisites
-------------

* Python 3.11+
* A UNIX-like environment (e.g. macOS, WSL, Ubuntu)
* A recent version of PostgreSQL (ideally version 11 or later)
* A modern Java runtime (if using DynamoDB for the Gene Normalizer database)

Library installation
--------------------

Install ``FUSOR`` from PyPI:

.. code-block:: shell

    pip install fusor

Data setup
----------

Universal Transcript Archive (UTA)
++++++++++++++++++++++++++++++++++

The UTA is a dataset of genome-transcript alignment data supplied as a PostgreSQL database. ``FUSOR`` accesses it by way of ``Cool-Seq-Tool``; see the Cool-Seq-Tool UTA docs for some opinionated setup instructions.

At runtime, UTA connection information can be passed to ``FUSOR`` (by way of ``Cool-Seq-Tool``) either as an initialization argument or via the environment variable ``UTA_DB_URL``. By default, it is set to ``postgresql://uta_admin:uta@localhost:5432/uta/uta_20210129b``. See the Cool-Seq-Tool configuration docs for more info.

SeqRepo
+++++++

SeqRepo is a controlled dataset of biological sequences. As with UTA, ``FUSOR`` accesses it via ``Cool-Seq-Tool``, which provides documentation on getting it set up.

At runtime, the location of the SeqRepo instance directory can be defined (by way of ``Cool-Seq-Tool``) either as an initialization argument or via the environment variable ``SEQREPO_ROOT_DIR``. By default, it is expected to be ``/usr/local/share/seqrepo/latest``. See the Cool-Seq-Tool configuration docs for more info.

Gene Normalizer
+++++++++++++++

Finally, ``FUSOR`` uses the Gene Normalizer to ground gene terms. See the Gene Normalizer documentation for setup instructions.

Connection information for the normalizer database can be set using the environment variable ``GENE_NORM_DB_URL``. By default, it connects to port 8000 on the local machine: ``http://localhost:8000``. See the Gene Normalizer docs for more information on connection configuration.

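The three environment variables described above can also be set from Python before ``FUSOR`` is initialized. The following is a minimal sketch, not part of ``FUSOR`` itself: ``configure_data_env`` and ``DEFAULTS`` are hypothetical names, the default values are copied from the text above, and it assumes the variables are read when ``FUSOR`` and ``Cool-Seq-Tool`` are initialized, so it should run first.

```python
import os

# Default values quoted from the sections above; adjust them for your setup.
DEFAULTS = {
    "UTA_DB_URL": "postgresql://uta_admin:uta@localhost:5432/uta/uta_20210129b",
    "SEQREPO_ROOT_DIR": "/usr/local/share/seqrepo/latest",
    "GENE_NORM_DB_URL": "http://localhost:8000",
}


def configure_data_env(overrides=None):
    """Populate the data-source variables (hypothetical helper, not a FUSOR API).

    Values already present in the environment are kept, missing ones are
    filled from DEFAULTS, and explicit overrides always win.
    """
    for name, value in DEFAULTS.items():
        # setdefault leaves a value that is already set untouched
        os.environ.setdefault(name, value)
    # Explicit overrides replace whatever is currently set
    os.environ.update(overrides or {})
    return {name: os.environ[name] for name in DEFAULTS}


if __name__ == "__main__":
    for name, value in configure_data_env().items():
        print(f"{name}={value}")
```

Passing a dictionary such as ``{"SEQREPO_ROOT_DIR": "/data/seqrepo"}`` (an illustrative path) changes only that entry and leaves the other two at their current or default values.
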

Check data availability
+++++++++++++++++++++++

Use the :py:meth:`fusor.tools.check_data_resources` method to verify that all data dependencies are available:

.. code-block:: pycon

    >>> from fusor.tools import check_data_resources
    >>> status = await check_data_resources()
    >>> assert all(status)  # passes if all resources can be acquired successfully
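
The session above uses ``await`` at the top level, which works in an asyncio-aware REPL (for example ``python -m asyncio`` or IPython) but not in a plain script. The sketch below shows one way the same check might be run from a script. ``summarize_status`` and ``main`` are hypothetical helpers, and the only assumption about the return value is what ``all(status)`` above implies: that it is an iterable of booleans, possibly a named tuple.

```python
import asyncio


def summarize_status(status):
    """Return the names (or positions) of the resources that failed the check."""
    values = list(status)
    # Named tuples expose _fields; otherwise fall back to positional indexes
    names = getattr(status, "_fields", None) or [str(i) for i in range(len(values))]
    return [name for name, ok in zip(names, values) if not ok]


async def main():
    # Imported here so the helper above stays usable without FUSOR installed
    from fusor.tools import check_data_resources

    status = await check_data_resources()
    failed = summarize_status(status)
    if failed:
        raise SystemExit(f"Unavailable data resources: {', '.join(failed)}")
    print("All data resources are available")
```

In a script, calling ``asyncio.run(main())`` at the entry point starts the event loop, runs the check, and exits with a non-zero status if any resource is unavailable.
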