User Guide
The user guide for IRIDA describes the major concepts of IRIDA, and demonstrates how to manage data in IRIDA, including launching new analytical workflows.
System Introduction
IRIDA provides a management interface for several different types of data:
- Users,
- User groups,
- Projects,
- Samples,
- Sequence files,
- Sequencing runs,
- Analyses.
If you already understand the different data types (or are impatient!), you can proceed directly to the dashboard section.
Users
A user account is required to use IRIDA. User accounts have the following properties:
- A username,
- An e-mail address,
- A password,
- A full name,
- A phone number.
The username and e-mail address must be unique in IRIDA. The minimum length for a username is 3 characters. The e-mail address must be a properly formatted e-mail address. If you intend to export your samples to Galaxy, we recommend that you use the same e-mail address in IRIDA as you use in Galaxy (the data library owner field is auto-filled with the e-mail address you’ve entered for IRIDA).
First and last names are required to be at least 2 characters long, but are otherwise not validated.
The password that you choose for IRIDA must meet the following requirements:
- Have at least 8 characters (Admin passwords should be at least 11 characters)
- Include at least 1 upper-case character
- Include at least 1 lower-case character
- Include at least 1 number
- Include at least 1 special character (ex: !@#$%^&*()+?/<>=.\\{})
- Passwords should not form words
- Passwords should not include any personal information
- Passwords may not be reused
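The mechanical parts of these requirements (length and character classes) can be illustrated with a short Python sketch. This is not IRIDA's own validation code: the function name `password_problems` is hypothetical, and the set of accepted special characters is assumed to be exactly the example characters listed above. The rules about dictionary words, personal information, and password reuse are not checked here.

```python
# Special characters copied from the example in this guide; whether IRIDA
# accepts exactly this set is an assumption.
SPECIAL_CHARACTERS = set("!@#$%^&*()+?/<>=.\\{}")


def password_problems(password: str, is_admin: bool = False) -> list[str]:
    """Return the length and character-class rules that `password` fails."""
    minimum_length = 11 if is_admin else 8
    problems = []
    if len(password) < minimum_length:
        problems.append(f"must have at least {minimum_length} characters")
    if not any(c.isupper() for c in password):
        problems.append("must include an upper-case character")
    if not any(c.islower() for c in password):
        problems.append("must include a lower-case character")
    if not any(c.isdigit() for c in password):
        problems.append("must include a number")
    if not any(c in SPECIAL_CHARACTERS for c in password):
        problems.append("must include a special character")
    return problems
```

An empty list means the password passes these checks; the same password can pass for a regular account and still be too short for an administrator account.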
Phone numbers must be at least 4 characters long, but are otherwise not validated. For example, the following phone numbers are valid:
- +1-204-789-7029
- (204) 789-7029
- 789-7029
- 7029
- 204-789-7029 ext: 3000
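The minimum-length rules for user account fields described in this section can be summarized in a small Python sketch. The field names and the function `too_short_fields` are hypothetical and chosen only for illustration; uniqueness of the username and e-mail address, and e-mail formatting, are not checked.

```python
# Minimum lengths as stated in this guide; the field names are hypothetical.
MINIMUM_LENGTHS = {
    "username": 3,
    "first_name": 2,
    "last_name": 2,
    "phone_number": 4,
}


def too_short_fields(account: dict[str, str]) -> list[str]:
    """Return the names of fields that are missing or shorter than allowed."""
    return [
        field
        for field, minimum in MINIMUM_LENGTHS.items()
        if len(account.get(field, "")) < minimum
    ]
```

Because phone numbers are only checked for length, every example above (including `7029` and `204-789-7029 ext: 3000`) passes.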
System Roles
User accounts in IRIDA can be assigned a system role that gives them certain privileges. IRIDA currently has 5 types of user account roles:
- User
- Technician
- Manager
- Administrator
- Sequencer
User
User accounts with the user role are permitted to:
- Create new projects,
- Manage the projects created by that user account,
- View data on the projects that they participate in.
Technician
User accounts with the technician role have all of the same permissions as a user account with the user role, and also have the permission to view all sequencing runs on the system.
Manager
User accounts with the manager role have all of the same permissions as a user account with the user role, and also have the permission to create new user accounts (restricted to creating user accounts with the user role).
Administrator
User accounts with the administrator role have all of the same permissions as a user account with the manager role, and may also:
- View data on all projects in IRIDA,
- Manage data in all projects in IRIDA,
- Create new user accounts with any system role.
Sequencer
User accounts with the sequencer role are intended to be used by lab technicians for uploading data to IRIDA from a sequencer, using the uploader tool. User accounts with only the sequencer role are not permitted to log into the web interface and can only interact with IRIDA using external tools. User accounts with the sequencer role are permitted to:
- Create new samples in any project,
- Upload files to any sample in any project.
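The way system roles build on one another can be sketched as sets of permissions in Python. The permission labels and the function `is_permitted` are hypothetical names that paraphrase the privileges listed above; they are not identifiers used by IRIDA.

```python
# Hypothetical permission labels paraphrasing the privileges in this guide.
USER = frozenset({"create_projects", "manage_own_projects", "view_joined_projects"})

# Technician and manager each extend the user role.
TECHNICIAN = USER | {"view_all_sequencing_runs"}
MANAGER = USER | {"create_user_accounts_with_user_role"}

# Administrator extends manager. Sequencing-run visibility for administrators
# is taken from the Sequencing Runs section of this guide.
ADMINISTRATOR = MANAGER | {
    "view_all_projects",
    "manage_all_projects",
    "create_user_accounts_with_any_role",
    "view_all_sequencing_runs",
}

# Sequencer is a separate, upload-only role that does not extend user.
SEQUENCER = frozenset({"create_samples_in_any_project", "upload_files_to_any_sample"})

ROLE_PERMISSIONS = {
    "user": USER,
    "technician": TECHNICIAN,
    "manager": MANAGER,
    "administrator": ADMINISTRATOR,
    "sequencer": SEQUENCER,
}


def is_permitted(role: str, permission: str) -> bool:
    """Return True if the given system role includes the permission."""
    return permission in ROLE_PERMISSIONS[role]
```

For instance, `is_permitted("manager", "view_all_sequencing_runs")` is false in this model, because the manager role extends the user role rather than the technician role.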
User Groups
User groups are collections of user accounts used to assign permissions to Projects. You can assign a user group to several projects and any membership changes to the user group will be reflected in the projects that the group is a member of.
A user group has the following properties:
- A name,
- A description,
- A collection of user accounts with permissions to view or modify group details.
The user group name must be at least 3 characters long and is not otherwise validated.
The user group description can be any length and is not otherwise validated.
Each user account in the user group is assigned a role of either group member or group owner. The group member role is used only for project membership; group members cannot change any properties of a group. A group owner has permission to change the group name and description, and to add or remove members from the group.
Projects
A project is a container for organizing a collection of related samples. A project has the following properties:
- A unique identifier (generated by IRIDA),
- A required name,
- An optional project organism,
- An optional description,
- An optional link to a wiki or external web site for the project,
- A collection of user accounts and user groups with permissions to view or modify project details,
- A collection of samples.
The unique identifier is generated by IRIDA and is required by external tools to interact with a project.
The project name must be at least 5 characters long, and must not contain any of the following characters: ? ( ) [ ] / \ = + < > : ; " , * ^ | &
The project description is not validated and may have an arbitrary length.
The link to an external project wiki or web site must be a valid URL.
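The project name and link rules above can be sketched in Python as follows. The function names are hypothetical, and the URL check is an assumption: this guide only says the link must be a valid URL, so the sketch accepts `http` and `https` links with a host name, which may be stricter or looser than IRIDA's own check.

```python
from urllib.parse import urlparse

# Characters that may not appear in a project name, as listed in this guide.
FORBIDDEN_PROJECT_NAME_CHARACTERS = set('?()[]/\\=+<>:;",*^|&')


def project_name_problems(name: str) -> list[str]:
    """Return the project name rules that `name` fails to meet."""
    problems = []
    if len(name) < 5:
        problems.append("must be at least 5 characters long")
    forbidden = sorted(set(name) & FORBIDDEN_PROJECT_NAME_CHARACTERS)
    if forbidden:
        problems.append("contains forbidden characters: " + " ".join(forbidden))
    return problems


def looks_like_url(link: str) -> bool:
    """Assumed check: an http(s) scheme and a non-empty host name."""
    parts = urlparse(link)
    return parts.scheme in ("http", "https") and bool(parts.netloc)
```

Spaces are not in the forbidden list, so a name such as `Outbreak 2015` passes, while `Salmonella: 2015` fails because of the colon.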
Each project also has an associated collection of user accounts that are permitted to view or modify the project and its samples. Each user account in the collection is assigned a project role of owner or collaborator. A project role of owner permits the user account to view and modify all properties and samples of the project. Users with a project role of collaborator can only view properties and samples of the project.
Samples
A sample is a container for organizing a collection of related files, usually generated from a single isolate. Samples are generally not created by a user account. Instead, samples are automatically created by an external tool when sequencing data is being transferred to IRIDA.
Samples have the following user-modifiable properties:
- A unique identifier (generated by IRIDA),
- A required name,
- An optional description,
- Optional organism details (including organism name, isolate, and strain),
- Optional collection information (including collected by, date of collection, isolation source, and geographic location of collection),
- A collection of sequence files.
The unique identifier is generated by IRIDA and is required by external tools to interact with the sample (along with the unique project identifier).
The sample name must be at least 3 characters long, and must not contain white space characters (tab or space) or any of the following characters: ? ( ) [ ] / \ = + < > : ; " , * ^ | & ' .
The sample name automatically assigned by the uploader tool corresponds to the Sample_Name column of the sample sheet used by the Illumina MiSeq sequencer. The Sample_Name column is filled in by sequencing technicians when preparing the sequencing run. Sample name restrictions are the same as those enforced by CASAVA.
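A sample name can be checked against the restrictions above before a sequencing run is prepared. The Python sketch below is illustrative only: the function name `is_valid_sample_name` is hypothetical, and the trailing `.` in the list of characters is read here as a forbidden character rather than as sentence punctuation, which is an assumption.

```python
# Forbidden characters as listed in this guide, plus space and tab.
# Treating "." as forbidden is an assumption about how to read that list.
FORBIDDEN_SAMPLE_NAME_CHARACTERS = (
    set('?()[]/\\=+<>:;",*^|&') | set("'.") | set(" \t")
)


def is_valid_sample_name(name: str) -> bool:
    """Return True if `name` is long enough and has no forbidden characters."""
    if len(name) < 3:
        return False
    return not (set(name) & FORBIDDEN_SAMPLE_NAME_CHARACTERS)
```

Hyphens and underscores are not in the list, so a name such as `SAMPLE-01_a` is accepted, while `my sample` is rejected because of the space.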
The sample description is not validated and can have arbitrary length.
The organism and sample collection information is not validated.
Sequence Files
A sequence file contains the primary source of information for processes in IRIDA. The sequence files currently stored in IRIDA are uncompressed FASTQ files. Sequence files are generally not created by a user. Instead, a sequence file is created by a tool external to IRIDA when data is transferred to be managed by IRIDA.
Sequence files have the following properties:
- A unique identifier (generated by IRIDA),
- A collection of sequencer-specific additional properties,
- Quality control data generated by FastQC.
The unique identifier is generated by IRIDA and is required by external tools to interact with the file (along with the unique sample and project identifiers).
FastQC is executed by IRIDA automatically when data is uploaded. IRIDA stores the following statistics computed by FastQC:
- The file type (usually “Conventional base calls”),
- The total number of reads in the file,
- The GC content,
- The length of the shortest read in the file,
- The length of the longest read in the file,
- The number of reads filtered by FastQC,
- A chart showing per-sequence quality scores,
- A chart showing per-base quality scores,
- A chart showing duplication levels,
- A collection of over-represented sequences found in reads.
Sequence files can be grouped together as a pair if paired-end sequencing was used to generate them.
Sequencing Runs
A sequencing run contains a collection of sequence files that were produced on the same execution of a sequencer. Only users with the administrator or technician role can see a sequencing run.
A sequencing run has the following properties:
- A unique identifier (generated by IRIDA),
- The date the run was created in IRIDA,
- A collection of files associated with the run,
- The layout type (one of single-end or paired-end).
Sequencing runs cannot be created by a user. Instead, a sequencing run is created by the uploader tool prior to uploading any data from a sequencing run.
Analyses
An analysis contains the results of running a pipeline on a collection of samples in IRIDA. An analysis cannot be created manually by a user. Instead, an analysis is created automatically by IRIDA on pipeline submission and completion.
IRIDA currently has multiple types of analysis that can be created:
- SNVPhyl,
- Assembly and annotation,
- Salmonella in silico typing (SISTR).
Each specific type of analysis may have analysis-specific properties, but an analysis will have at least the following properties:
- A unique identifier (generated by IRIDA),
- The date that the analysis was created in IRIDA (when it finished running),
- A description,
- A collection of analysis-specific properties,
- A collection of named files produced by the pipeline execution,
- A reference to the submission request that was used to launch pipeline execution.
The unique identifier is generated by IRIDA and is required by external tools to interact with the analysis.
The collection of named files produced by the pipeline execution also includes workflow execution provenance that describes how the files were created. The workflow execution provenance includes individual tool names, individual tool versions, and the set of all parameters that were supplied to each tool at execution time.
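One way to picture the provenance attached to an output file is as a list of tool executions, each with a name, a version, and its parameters. The Python sketch below is only a mental model: the class names, field names, and the `describe` function are hypothetical and do not reflect how IRIDA stores or exposes provenance.

```python
from dataclasses import dataclass, field


@dataclass
class ToolExecution:
    """One tool run that contributed to an output file (hypothetical model)."""
    tool_name: str
    tool_version: str
    parameters: dict[str, str] = field(default_factory=dict)


@dataclass
class AnalysisOutputFile:
    """A named output file together with the tool runs that produced it."""
    name: str
    provenance: list[ToolExecution] = field(default_factory=list)


def describe(output: AnalysisOutputFile) -> list[str]:
    """Return one human-readable line per tool execution, in order."""
    lines = []
    for step in output.provenance:
        params = ", ".join(f"{k}={v}" for k, v in sorted(step.parameters.items()))
        lines.append(f"{step.tool_name} {step.tool_version} ({params})")
    return lines
```

Recording the version and the full parameter set for every tool is what makes it possible to tell how a given file was produced.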