Accounts and Species¶
Accounts¶
You need to specify the host name (the domain name where there’s a running MySQL server hosting the Ensembl databases), username and password. Use the HostAccount
class for this
>>> from ensembldb3 import HostAccount
>>> account = HostAccount("mysqlhostname.anu.edu.au", "username", "password")
You can also specify the port number, e.g. HostAccount(..., port=5306)
if it differs from the default 3306.
I find it convenient to specify the MySQL server account details as an environment variable called ENSEMBL_ACCOUNT
by adding the following to my .bashrc
:
export ENSEMBL_ACCOUNT="mysqlhostname.anu.edu.au username password"
In my scripts I then create the HostAccount
instance as
>>> import os
>>> from ensembldb3 import HostAccount
>>> account = HostAccount(*os.environ['ENSEMBL_ACCOUNT'].split())
Note
ensembldb3
defaults to using the Ensembl UK MySQL servers if you don’t specify an account.
Species¶
The Species
class is a top level import that is used to translate between latin names and Ensembl’s database naming scheme. It also serves to allow the user to use just a species common name to reference it’s genome databases. The queries are case-insensitive.
>>> from ensembldb3 import Species
>>> print(Species)
=========================================================================================================
Common name Species name Ensembl Db Prefix Synonymns
---------------------------------------------------------------------------------------------------------
Alpaca Vicugna pacos vicugna_pacos
Amazon molly Poecilia formosa poecilia_formosa
Anole Lizard Anolis carolinensis anolis_carolinensis ...
You can directly extend the list of species, or modify an existing entry, using Species.amend_species
. If you wish to edit the species list on a larger scale or just do it once so all your scripts can rely on that change, you can directly modify the reference species data used by ensembldb3
.
See ensembldb3 exportrc to obtain the species data distributed with
ensembldb3
plus other configuration files and edit thespecies.tsv
fileAdd an environment variable
ENSEMBLDBRC
to your.bashrc
as follows:export ENSEMBLDBRC="~/path/to/ensembldbrc/"
Note
The species.tsv
file has at least 2, and up to 3, fields per line: species name, common name, species name synonym.
Look up a species common name¶
This can be done using the species name. Note that some synonyms are supported.
>>> Species.get_common_name("Felis catus")
'Cat'
>>> Species.get_common_name("Canis familiaris")
'Dog'
>>> Species.get_common_name("Canis lupus familiaris")
'Dog'