Interacting with ENA Database

Data

Data in ENA are organized into 11 domains (or types):

Domain	Description
Assembly	Information describing the construction of reads and sequence contigs into higher order scaffolds and chromosomes
Sequence	Assembled and, optionally, annotated assembled reads
Coding	A virtual domain comprising sequence regions reported by data providers as being protein-coding regions
Non-coding	A virtual domain comprising sequence regions reported by data providers as representing non-protein-coding (RNA) genes
Marker	A virtual domain comprising information relating to phylogenetic, identification and molecular ecology marker data
Analysis	Derived data forms, such as recalibrated aligned reads and metabarcoding identifications
Read	Raw sequencing reads from next generation platforms
Trace	Raw sequencing data from capillary platforms
Taxon	Information relating to the organism that was the source of the sequenced biological sample
Sample	Information relating to the biological sample studied in the sequencing experiment
Study	Information relating to the scope of the sequencing effort; also known as ‘Project’. The primary use of a study is to unite content otherwise dispersed across the ENA domains

In some cases, a domain is further subdivided into data classes. These data classes are the “results” that can be accessed:

Domain	Result	Description
Assembly	assembly	Genome assemblies
Sequence	sequence_release	Nucleotide sequences (Release)
Sequence	sequence_update	Nucleotide sequences (Update)
Sequence	wgs_set	Genome assembly contig sets (WGS)
Sequence	tsa_set	Transcriptome assembly contig sets (TSA)
Coding	coding_release	Protein coding sequences (Release)
Coding	coding_update	Protein coding sequences (Update)
Non-coding	noncoding_release	Non-coding sequences (Release)
Non-coding	noncoding_update	Non-coding sequences (Update)
Analysis	analysis_study	Studies used for nucleotide sequence analyses from reads
Analysis	analysis	Nucleotide sequence analyses from reads
Read	read_experiment	Experiments used for raw reads
Read	read_run	Raw reads
Read	read_study	Studies used for raw reads
Sample	sample	Samples
Taxon	taxon	Taxonomic classification
Environmental	environmental	Environmental samples
Study	study	Studies

This list can be accessed with get_results.
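
To make the domain/result grouping concrete, the table can be written as a plain Python mapping. This is only a sketch: it is hand-copied from the table, does not call get_results or contact ENA, and the names RESULTS_BY_DOMAIN, domain_of and all_results are hypothetical helpers introduced for illustration. The lowercase id "study" is an assumption, chosen to match the other result ids.

```python
# Hand-copied from the Domain/Result table; NOT fetched from ENA.
# The lowercase "study" id is an assumption (the other ids are lowercase).
RESULTS_BY_DOMAIN = {
    "Assembly": ["assembly"],
    "Sequence": ["sequence_release", "sequence_update", "wgs_set", "tsa_set"],
    "Coding": ["coding_release", "coding_update"],
    "Non-coding": ["noncoding_release", "noncoding_update"],
    "Analysis": ["analysis_study", "analysis"],
    "Read": ["read_experiment", "read_run", "read_study"],
    "Sample": ["sample"],
    "Taxon": ["taxon"],
    "Environmental": ["environmental"],
    "Study": ["study"],
}


def domain_of(result):
    """Return the domain a result id is listed under, or None if unknown."""
    for domain, results in RESULTS_BY_DOMAIN.items():
        if result in results:
            return domain
    return None


def all_results():
    """Return every result id from the table, sorted alphabetically."""
    return sorted(r for results in RESULTS_BY_DOMAIN.values() for r in results)


if __name__ == "__main__":
    print(domain_of("read_run"))
    print(len(all_results()), "results listed")
```

A lookup like domain_of can be handy for checking a result id before building a query, but the list returned by get_results should be treated as the reference.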

Each “result” can be searched, and the output can be formatted and sorted by different fields. These fields are accessible via the following commands:

get_filter_fields to obtain the fields usable to build a query or filter (get_filter_types gives more information about the types of these filters)
get_returnable_fields to obtain the fields extractable for a result
get_sortable_fields to obtain the fields usable to sort the outputs
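
The three kinds of fields play different roles in a search: filter fields decide which records match, returnable fields decide which columns come back, and sortable fields decide the order. The sketch below illustrates those roles on a small in-memory list. It does not use ENASearch or contact ENA; the records are made-up placeholders, and the function search and the field names run_accession, tax_id and base_count are assumptions chosen for illustration.

```python
from operator import itemgetter

# Made-up placeholder records for illustration only; NOT real ENA data.
RECORDS = [
    {"run_accession": "RUN_A", "tax_id": 9606, "base_count": 300},
    {"run_accession": "RUN_B", "tax_id": 10090, "base_count": 100},
    {"run_accession": "RUN_C", "tax_id": 9606, "base_count": 200},
]


def search(records, filters, returned_fields, sort_field):
    """Hypothetical helper mimicking filter / return / sort on local data.

    filters: dict of field -> required value (the "filter fields")
    returned_fields: list of fields to keep (the "returnable fields")
    sort_field: field used for ascending ordering (a "sortable field")
    """
    # Keep only the records that satisfy every filter.
    matching = [
        r for r in records
        if all(r.get(field) == value for field, value in filters.items())
    ]
    # Order the matches before trimming columns, so the sort field
    # does not have to be among the returned fields.
    matching = sorted(matching, key=itemgetter(sort_field))
    # Keep only the requested columns.
    return [{f: r[f] for f in returned_fields} for r in matching]


rows = search(
    RECORDS,
    filters={"tax_id": 9606},
    returned_fields=["run_accession", "base_count"],
    sort_field="base_count",
)
for row in rows:
    print(row)
```

In a real query the valid names for each role differ per result, which is why the three commands above each take a result and return its own list of fields.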

The data in ENA can be accessed programmatically with ENASearch: