Welcome to AnnoTEP - local version. This version provides a graphical interface that streamlines your work and research in annotating transposable elements in plants. The performance of the annotation process depends on your machine's specifications, so ensure it remains powered on during execution. For any questions or further information, please use the 'Help' option available in the menu.

Annotation of all Class I and Class II elements.
Data visualisation as graphs and phylogenetic trees.
Email notifications when the annotation starts and ends.

Data input

Email Address

Genome data *

Additional features

Species specification to identify TIR candidates

Steps to be executed

Overwrite

Deactivated

Sensitivity

Enable

Annotation

Deactivated

Evaluate

Deactivated

Force

Deactivated

TIR filter

Deactivated

Annot. type

Deactivated

Threads

* For highly repetitive genomes, users are encouraged to adjust the "Maximum divergence" parameter

Additional input files

(Field not required)

Coding DNA Sequence

Curate library

Exclusion of masked regions

RepeatModeler library

RepeatMasker library

Latest Activities

List of the 10 most recent genomic data annotations

AnnoTEP documentation - Annotation of Transposable Elements in Plants

Welcome to the AnnoTEP Documentation

AnnoTEP is a transposable element annotation framework built on the EDTA pipeline, offering additional functionalities that enhance the analysis of genomic elements in plants. In addition to maintaining the core features of EDTA, AnnoTEP expands its applicability by including the analysis of non-autonomous LTR (Long Terminal Repeat) elements, such as TRIM, LARD, TR-GAG, and BARE2, along with the capability to detect soloLTRs. It also analyses TIR (Terminal Inverted Repeat) elements and performs an in-depth analysis of autonomous and non-autonomous Helitron elements.

AnnoTEP also features a structured graphical interface that facilitates the annotation process of transposable elements in plant genomes. This interface enables the scientific community to conduct analyses more efficiently and with greater visual accessibility, providing essential support for research focused on the understanding and classification of transposable elements in plants.

How to use the platform

The AnnoTEP interface is organised into three main sections: Data Input, Additional Features, and Additional Input Files. Below, the use of each of these areas is detailed to configure the analysis of transposable elements in plant genomes.

Data Input

Email Address: Optionally, enter a valid email address to receive notifications about the status of the analysis. The annotation itself runs locally, so keep the system running during the process.

Genome Data: Upload the input file containing the complete genomic sequence in FASTA format. Use the "Browse" button to select the file from the local system.
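
Before uploading, it can help to confirm that the genome file really is FASTA. The following is a minimal sketch, not part of AnnoTEP: the function name `summarize_fasta` is hypothetical, and it only checks the basic layout (header lines starting with ">" followed by sequence lines).

```python
def summarize_fasta(path):
    """Return (number of records, total sequence length) for a FASTA file.

    Raises ValueError if the file has no header or if sequence data
    appears before the first header line.
    """
    records = 0
    total_length = 0
    with open(path, "r", encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            line = line.strip()
            if not line:
                continue  # blank lines are ignored
            if line.startswith(">"):
                records += 1  # a header line opens a new record
            elif records == 0:
                raise ValueError(
                    f"line {line_number}: sequence data before first header"
                )
            else:
                total_length += len(line)
    if records == 0:
        raise ValueError("no FASTA header found")
    return records, total_length
```

Calling `summarize_fasta("genome.fasta")` (a placeholder file name) returns the record count and total length, which can be compared with the expected number of chromosomes or scaffolds.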

Additional Features

This section provides advanced configuration options for the analysis.

Species Specification to Identify TIR Candidates: Specify the species to optimise the identification of TIR elements. The options include:

  • Others: For species other than rice or maize (default).

  • Rice: For rice genomes.

  • Maize: For maize genomes.

Steps to Be Executed: Select which parts of the pipeline will be executed:

  • All: Runs the entire pipeline (default).

  • Filter: Starts from the raw TE candidates and runs to the end.

  • Final: Starts from the filtered TEs and runs to completion.

  • Anno: Conducts genome-wide annotation after building the TE library.

Overwrite: Decide whether existing output data should be overwritten.

Sensitivity: Choose whether to run RepeatModeler to identify additional elements.

Annotation: Specify whether the genome-wide annotation of TEs should proceed after building the TE library.

Evaluate: Check if the classification of annotated TEs is consistent. The "Annotation" field must be enabled to use this feature.

Force: If no reliable TE candidates are identified, enabling this option allows the script to continue using a backup rice TE library.

TIR filter: Filter TIRs without annotated domains. Enabling this filter can substantially reduce false positives, but may also result in the loss of some true positives (false negatives).

Annot. type: Specify whether to annotate the genome using a RepeatMasker-based library. Enabling this option may negatively affect the filtering step and compromise benchmark results.

Neutral Mutation Rate: Set the neutral mutation rate for calculating the age of intact LTR elements (default: 1.3e-8 bp per year, based on rice). For more information, visit the website and check out the table in the genome section.
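
To illustrate how the mutation rate enters the age estimate, here is a minimal sketch of a commonly used calculation: the divergence between the two LTRs of an element is corrected with the Jukes-Cantor model and divided by twice the mutation rate (T = K / 2μ). Whether AnnoTEP applies exactly this correction is an assumption, and the function name `ltr_insertion_age` is hypothetical.

```python
import math


def ltr_insertion_age(identity, mutation_rate=1.3e-8):
    """Estimate the insertion age (in years) of an intact LTR element.

    identity      -- fraction of identical sites between the two LTRs (0..1)
    mutation_rate -- neutral substitutions per bp per year (rice default)
    """
    d = 1.0 - identity  # raw proportion of differing sites
    if not 0.0 <= d < 0.75:
        raise ValueError("divergence must be in [0, 0.75) for Jukes-Cantor")
    # Jukes-Cantor correction for multiple substitutions at the same site
    k = -0.75 * math.log(1.0 - 4.0 * d / 3.0)
    # Both LTRs accumulate mutations independently, hence the factor 2
    return k / (2.0 * mutation_rate)
```

With the default rate, two LTRs that are 99% identical give an estimate of a little under 0.4 million years. A lower mutation rate yields proportionally older estimates for the same divergence, which is why the rate should match the species being annotated.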

Maximum Divergence: Define the maximum acceptable divergence for TE fragments. For highly repetitive genomes, users are encouraged to adjust the parameter (default: 40).

Threads: Determine the number of threads to be used in running the pipeline (default: 4).

Additional Input Files

This section allows for the addition of optional files to customise the analysis:

Coding DNA Sequence: Select a FASTA file containing the coding sequence (without introns, UTRs, or TEs) of the genome or a close relative. This helps in excluding non-transposable elements.

Curate library: Upload a curated library to maintain consistent TE naming and classification. Only manually validated TEs should be provided. This file is optional.

Exclusion of masked regions: Define regions to be ignored during TE masking. The "Annotation" field must be enabled to use this option.

RepeatModeler library: Upload a classified RepeatModeler library to enhance analysis sensitivity, particularly for LINEs. If not provided, one will be generated automatically.

RepeatMasker library: Provide your own homology-based TE annotation in RepeatMasker .out format. This file will be merged with the structure-based annotation. The "Annotation" field must be enabled.
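
Several options described above depend on the "Annotation" field being enabled. As a rough sketch of those rules, the snippet below checks a settings dictionary for consistency. The dictionary keys and the function name `check_options` are hypothetical and do not correspond to AnnoTEP's internal names; only the allowed values and the dependencies are taken from the descriptions above.

```python
VALID_SPECIES = {"Others", "Rice", "Maize"}
VALID_STEPS = {"All", "Filter", "Final", "Anno"}
# Hypothetical keys for Evaluate, Exclusion of masked regions and
# RepeatMasker library, which all require Annotation to be enabled.
NEEDS_ANNOTATION = ("evaluate", "exclude_regions", "repeatmasker_out")


def check_options(options):
    """Return a list of problems; an empty list means the settings agree."""
    problems = []
    species = options.get("species", "Others")
    if species not in VALID_SPECIES:
        problems.append(f"unknown species option: {species!r}")
    step = options.get("step", "All")
    if step not in VALID_STEPS:
        problems.append(f"unknown step option: {step!r}")
    if not options.get("annotation", False):
        for key in NEEDS_ANNOTATION:
            if options.get(key):
                problems.append(
                    f"'{key}' requires the Annotation option to be enabled"
                )
    return problems
```

For example, `check_options({"evaluate": True})` reports one problem because "Evaluate" was requested without "Annotation".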

Submit and results

After completing all necessary fields, click "Submit" to start the analysis. Ensure that all configurations are correct for efficient and accurate processing.

If you have provided a valid email address, you will receive a notification once the annotation process has started. Upon completion, a second email will be sent, indicating whether the process was successful or if any issues occurred. This final email will include a detailed log file outlining the outcome.

You can also monitor the progress of the annotation by accessing the "Results" tab. This section displays key information such as:

  • The name of the generated output file;

  • Start and end timestamps (when available);

  • The current status of the annotation (e.g., in progress, completed, failed);

  • The last 20 lines of the annotation log.

Note 1: All results and output files will be stored in the Docker volume you specified as the output directory. Ensure this path is correctly mounted to access the generated data.

Note 2: Errors may occasionally occur during the annotation process, so it is important to pay attention to two key stages:

  1. Start of annotation: After uploading a file, ensure that the annotation process has actually begun. Sometimes, the file may not be properly recognised by the system, and in such cases, you will need to re-upload the file and restart the process.

  2. Completion of annotation: Even if the process runs, the annotation may not be successfully finalised in the Results section. Therefore, it is important to check the following:

    • Look out for any prolonged error messages.
    • Confirm whether the message "The generation of charts and reports has been completed" appears — this message indicates that the annotation has been successfully completed.
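
The completion check can also be done on a saved copy of the log. This is a minimal sketch that assumes the log is available as plain text; the function name `annotation_finished` is hypothetical, and the default of 20 lines simply mirrors what the "Results" tab displays.

```python
DONE_MESSAGE = "The generation of charts and reports has been completed"


def annotation_finished(log_text, tail=20):
    """Return True if the completion message is in the last `tail` lines."""
    last_lines = log_text.splitlines()[-tail:]
    return any(DONE_MESSAGE in line for line in last_lines)
```

If the function returns False after the process has stopped, inspect the log for error messages before re-running the annotation.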