Integration with an existing Galaxy
IRIDA can be set up to use an existing Galaxy installation, assuming a few conditions are met:
- Galaxy version >= 16.01 is required, as IRIDA makes use of conda with Galaxy. A method to get newer conda-based tools to work with older Galaxy versions is described in our FAQ; however, this option will not be supported and is not recommended.
- The filesystem is shared between the machines serving IRIDA and Galaxy under the same paths (e.g., /path/to/irida-data on IRIDA is available as /path/to/irida-data on the Galaxy server).
- Galaxy is set up to use PostgreSQL or MySQL/MariaDB as its database. The default installation uses SQLite, but this is insufficient for the complex workflows used by IRIDA.
- Some modifications to the configuration settings (see below) are made to enable IRIDA to communicate with Galaxy.
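The version and database conditions above can be checked mechanically before going further. The following is a minimal sketch, not part of IRIDA or Galaxy: the function names are hypothetical, the version string is assumed to be in plain "YY.MM" form, and the database is assumed to be described by an SQLAlchemy-style connection URL (such as the value of Galaxy's database_connection setting).

```python
from urllib.parse import urlsplit

MIN_GALAXY_VERSION = (16, 1)  # "16.01", the minimum version stated above
SUPPORTED_DATABASES = {"postgresql", "postgres", "mysql"}  # MariaDB uses the mysql dialect


def galaxy_version_ok(version: str) -> bool:
    """Return True if a plain 'YY.MM' Galaxy version string is >= 16.01."""
    major, minor = version.strip().split(".")[:2]
    return (int(major), int(minor)) >= MIN_GALAXY_VERSION


def database_ok(connection_url: str) -> bool:
    """Return True if the connection URL points at PostgreSQL or MySQL/MariaDB."""
    scheme = urlsplit(connection_url).scheme  # e.g. "postgresql+psycopg2"
    dialect = scheme.split("+", 1)[0].lower()  # drop an optional driver suffix
    return dialect in SUPPORTED_DATABASES
```

The shared-filesystem condition still has to be verified by hand on both machines, since it depends on how the storage is mounted.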
The following describes the procedures needed to set up IRIDA with an existing Galaxy installation.
- Dependencies
- Configuration settings
- Setup of a Galaxy user and API key
- Setup IRIDA tools in Galaxy
- Link up Galaxy with IRIDA
Dependencies
Some tools will need to be built from source, and so require the standard Linux build environment to be installed on the Galaxy server. For CentOS this will include the following packages:
yum groupinstall "Development tools"
yum install mercurial zlib-devel ncurses-devel tcsh git db4-devel expat-devel java
Additionally, some tools assume certain dependencies are installed on the machines where the tools are to be run and do not install these dependencies automatically. To handle these cases, you can create a specific conda environment which is loaded each time a tool is run (via the env.sh file). Assuming conda is installed, these dependencies can be installed into a conda environment named galaxy with:
conda create --name galaxy samtools perl-xml-simple perl-time-piece perl-bioperl openjdk gnuplot libjpeg-turbo
To load this conda environment before each tool is run, add the following to the galaxy/env.sh file:
source activate galaxy
Some Galaxy tool installation instructions for IRIDA may require the installation of additional dependencies which can be added to this conda galaxy environment.
Note: the location of galaxy/env.sh is defined by the property environment_setup_file in config/galaxy.ini.
Configuration settings
Settings in config/galaxy.ini
The following is a list of other necessary configuration settings within the file config/galaxy.ini for IRIDA to function with Galaxy.
- Change allow_library_path_paste to allow direct linking of files in Galaxy to the IRIDA file locations (as opposed to making copies). E.g.,
  - Change #allow_library_path_paste = False to allow_library_path_paste = True.
- Set the Galaxy id_secret for encoding database ids. E.g.,
  - Change #id_secret = USING THE DEFAULT IS NOT SECURE! to id_secret = some secure password
  - The command pwgen --secure -N 1 56 may be useful for picking a hard-to-guess key.
  - Note: Once this key is set, please do not change it. This key is used to translate database ids in Galaxy to API ids used by IRIDA to access datasets, histories, and workflows. IRIDA stores some of these API ids internally for debugging and tracking purposes, and changing this value will render any of the API ids stored in IRIDA useless.
- Set up Conda for installing tool dependencies via Galaxy’s automated dependency management system. E.g.,
  - Set conda_prefix = /home/galaxy-irida/miniconda3, or wherever conda is installed for Galaxy.
  - Set conda_ensure_channels = iuc,bioconda,r,defaults,conda-forge.
- Add Galaxy/IRIDA user to admin_users list (see below).
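As a quick sanity check of the settings listed above, something like the following sketch could be run against the contents of config/galaxy.ini. It is only an illustration: the helper names are hypothetical, it assumes the settings live in the [app:main] section of the file, and generate_id_secret is merely a Python stand-in for the pwgen command, not what Galaxy or IRIDA use internally.

```python
import configparser
import secrets

# Settings from the list above that must simply be present and non-empty.
REQUIRED_KEYS = ("id_secret", "conda_prefix", "conda_ensure_channels", "admin_users")


def generate_id_secret(length: int = 56) -> str:
    """Return a random hex string of `length` characters (stand-in for pwgen)."""
    return secrets.token_hex(length // 2 + 1)[:length]


def check_galaxy_ini(ini_text: str, section: str = "app:main") -> list:
    """Return a list of human-readable problems found in galaxy.ini text."""
    # interpolation=None so that '%' characters in values are taken literally.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(ini_text)
    if not parser.has_section(section):
        return [f"missing [{section}] section"]
    settings = parser[section]
    problems = []
    # Lines starting with '#' are comments, so a commented-out setting is unset.
    if settings.get("allow_library_path_paste", "False").strip().lower() != "true":
        problems.append("allow_library_path_paste is not set to True")
    for key in REQUIRED_KEYS:
        if not settings.get(key, "").strip():
            problems.append(f"{key} is not set")
    return problems
```

For example, check_galaxy_ini(open("config/galaxy.ini").read()) returns an empty list when all of the settings above are present. It does not check that the values themselves are sensible, such as whether conda_prefix points at a real conda installation.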
Setup of a Galaxy user and API key
As IRIDA communicates with Galaxy through the API, it is necessary to set up a Galaxy API key which will be used by IRIDA. It is recommended (though not required) to use a separate user for this purpose.
This user must also have Galaxy admin privileges in order to allow IRIDA to link directly to the sequence files when sharing with Galaxy (this avoids creating copies of fastq files each time a workflow is run). Admin privileges can be assigned by adding the user’s email to admin_users in the configuration file config/galaxy.ini.
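One way to confirm that the API key belongs to the intended admin user is to ask Galaxy who the key's owner is. The sketch below is an assumption-laden illustration: it assumes the Galaxy API exposes /api/users/current, accepts the key as a key query parameter, and returns JSON with email and is_admin fields. Please verify these against your Galaxy version; the function names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def current_user_url(galaxy_url: str, api_key: str) -> str:
    """Build the URL for the (assumed) 'current user' API endpoint."""
    query = urlencode({"key": api_key})
    return f"{galaxy_url.rstrip('/')}/api/users/current?{query}"


def is_admin_user(user_info: dict, expected_email: str) -> bool:
    """Return True if the response describes the expected user with admin rights."""
    same_user = user_info.get("email", "").lower() == expected_email.lower()
    return same_user and bool(user_info.get("is_admin"))


def fetch_current_user(galaxy_url: str, api_key: str, timeout: float = 10.0) -> dict:
    """Query Galaxy for the owner of the API key (performs a network request)."""
    with urlopen(current_user_url(galaxy_url, api_key), timeout=timeout) as response:
        return json.load(response)
```

A call such as is_admin_user(fetch_current_user(url, key), "irida@example.org") should return True once the user's email has been added to admin_users and Galaxy has been restarted.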
Setup IRIDA tools in Galaxy
Automated installation of tools
The tool Ephemeris can be used to automate the installation of tools in Galaxy. A list of tools to install is provided with the irida-[version].zip download on the IRIDA releases page. Instructions can be accessed on the Automated tools install page.
The short version is to:
- Install Ephemeris:
  conda install -c bioconda ephemeris
- Install tools:
  shed-tools install --toolsfile tools-list.yml --galaxy [http://url-to-galaxy] --api_key [api key]
Please replace url-to-galaxy and api key with appropriate values for your Galaxy instance.
You may want to monitor the Galaxy log files (e.g., galaxy/*.log) as the installation proceeds. It may take a while to download, build, and install all tools.
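If the tool installation is scripted, the shed-tools invocation shown above can be assembled programmatically so the API key never ends up in a log. This is a minimal sketch using only the flags that appear in the command above; the helper names are hypothetical and the values passed in are placeholders for your own instance.

```python
import shlex


def build_shed_tools_command(toolsfile: str, galaxy_url: str, api_key: str) -> list:
    """Return the shed-tools install command as an argument list."""
    return [
        "shed-tools", "install",
        "--toolsfile", toolsfile,
        "--galaxy", galaxy_url,
        "--api_key", api_key,
    ]


def redacted_command_line(command: list) -> str:
    """Return a shell-quoted command line with the API key masked, for logging."""
    masked = list(command)
    if "--api_key" in masked:
        key_position = masked.index("--api_key") + 1
        if key_position < len(masked):
            masked[key_position] = "***"
    return " ".join(shlex.quote(part) for part in masked)
```

The resulting list can be passed to subprocess.run(command, check=True), which avoids shell quoting problems in URLs and keys.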
Note: Please take a look through the Manual installation of tools instructions to see if there are any additional setup instructions needed (such as environment variables that need to be set).
Manual installation of tools
Alternatively, the necessary tools can be installed manually through the following instructions specific to each pipeline in IRIDA:
- SNVPhyl Whole Genome Phylogeny
- Assembly and Annotation
- Assembly and Annotation Collection
- SISTR Salmonella Typing
- refseq_masher
- MentaLiST MLST
- Bio_Hansel
Link up Galaxy with IRIDA
In order to connect IRIDA to this Galaxy instance, you will need to modify the parameters galaxy.execution.url, galaxy.execution.email, and galaxy.execution.apikey in the file /etc/irida/irida.conf and restart IRIDA. Additional details for configuring IRIDA can be found in the instructions to install and configure the IRIDA web interface.
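Before restarting IRIDA, it can help to confirm that all three parameters are actually set. The sketch below assumes /etc/irida/irida.conf uses simple key=value lines with # or ! comments (Java properties style); the parser is deliberately simplified and the function names are hypothetical.

```python
REQUIRED_GALAXY_KEYS = (
    "galaxy.execution.url",
    "galaxy.execution.email",
    "galaxy.execution.apikey",
)


def parse_properties(text: str) -> dict:
    """Parse simple key=value lines, skipping blank lines and comments."""
    properties = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, separator, value = line.partition("=")
        if separator:  # ignore lines that contain no '='
            properties[key.strip()] = value.strip()
    return properties


def missing_galaxy_settings(text: str) -> list:
    """Return the required Galaxy parameters that are absent or empty."""
    properties = parse_properties(text)
    return [key for key in REQUIRED_GALAXY_KEYS if not properties.get(key)]
```

An empty list from missing_galaxy_settings(open("/etc/irida/irida.conf").read()) means all three parameters have values; it does not confirm that the URL is reachable or that the key is valid.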
Once you have configured IRIDA to connect to Galaxy you can attempt to execute a workflow by adding some data to your cart, selecting Pipelines from the main menu, then selecting a particular pipeline. You will have to have some data uploaded into IRIDA before testing. An example set of data can be found at irida-sample-data.zip. Currently all workflows assume you are using paired-end sequence reads.