AnnoTEP is a platform for annotating Transposable Elements in Plants, based on the famous and widely adopted EDTA program (Ou et al., 2019; PMID: 31843001). By using an input file in FASTA format containing the plant genome on a chromosomal or contig/scaffold scale, it is possible to obtain:

Annotation of all Class I and Class II elements.

Data visualization in graphic format and phylogenetic trees.

Executable via CLI or GUI with GitHub/Docker/Singularity

Pre-computed Annotation Examples: Click here

Obtaining AnnoTEP

AnnoTEP can be obtained in different ways: by downloading the compressed installation package available on this site, by accessing the official repository on GitHub, or by using the containerised (Docker/Singularity) versions available on this website. For detailed installation and configuration instructions, see "Help > Downloading and Configuring AnnoTEP".

GitHub

Access link

Download

Command Line - Docker

docker pull annotep/annotep-cli:v1

Graphic Interface - Docker

docker pull annotep/annotep-gui:v1

Mutation rate table

The table provides suggested values for use during the tool's annotation process, considering different contexts and scenarios. These values were calculated based on statistical analyses and literature reviews, aiming to offer a reliable reference for estimating mutation rates in various organisms or systems.

Use this table as an initial guide, but always validate the values with empirical data and adjust them as needed for your project. For more information, access "Help > General Recommendations for Using Mutation Rates to Calculate LTR Ages" or "General Mutation Rates by Ecological Category" via the side menu.

Family	Species	Mutation rate (mutations per site per year)

* Bromelia - Estimated by analogy with pineapple (9.0 x 10⁻⁹)

* Mimosa pudica - Estimated similar to other Fabaceae (1.0 x 10⁻⁸)

* Banisteriopsis caapi - Estimated by analogy with similar species (1.1 x 10⁻⁸)

* Theobroma grandiflorum - Estimated similar to cacao (1.6 x 10⁻⁸)

* Passiflora edulis - Estimated (1.3 x 10⁻⁸)

* Saccharum officinarum - Estimated (2.5 x 10⁻⁸)

* Psychotria viridis - Estimated, similar to other tropical plants (1.2 x 10⁻⁸)

* General Cacti - Estimated (1.2 x 10⁻⁸), based on comparisons with other drought-resistant species

General Mutation Rates by Ecological Category

Ecological Category	Estimated rate (mutations per site per year)	E-notation
Tropical Plants	1.2 x 10⁻⁸	1.2e-8
Aquatic Plants	8.0 x 10⁻⁹	8.0e-9
Desert Plants	1.0 x 10⁻⁸	1.0e-8
Arctic and Alpine Plants	7.0 x 10⁻⁹	7.0e-9
Temperate Forest Plants	9.0 x 10⁻⁹	9.0e-9

Preprocessed Genomes

AnnoTEP documentation - Annotation of Transposable Elements in Plants

Introduction

Welcome to the AnnoTEP documentation, a specialised tool for the annotation of transposable elements (TEs) in plant genomes. Developed based on the famous and widely adopted EDTA pipeline (Ou et al., 2019; PMID: 31843001), AnnoTEP extends its functionalities by offering additional features that enhance the analysis and interpretation of genomic elements.

AnnoTEP has been designed to meet the demands of researchers working with plant genomes, providing greater accuracy and detail in the identification and classification of TEs. Among its main capabilities, the following stand out:

Detection of non-autonomous LTR-RTs, such as TRIM, LARD, TR_GAG, and BARE-2.
Advanced classification of the Copia and Gypsy superfamilies at the lineage level, following the criteria established by Orozco et al. (2019; PMID: 31390781).
Greater accuracy in the classification of autonomous TIRs.
Annotation of Helitrons, distinguishing between autonomous and non-autonomous elements.
Application of appropriate soft masking for subsequent analyses.
Generation of detailed classification reports.
Creation of graphical outputs, including phylogenetic trees and age analyses.

How to Use the Tool

AnnoTEP is a tool capable of adapting to the needs and skills of the user, catering to both researchers with little technical experience and users specialised in annotations. It offers two distinct interfaces – a Graphical User Interface (GUI) and a Command-Line Interface (CLI).

AnnoTEP GUI (Graphical User Interface)

The AnnoTEP GUI has been developed for researchers who prefer a visual and interactive approach. With customised fields and menus, the GUI simplifies the input and manipulation of genomic data.

GUI Features:

Installation: Available via GitHub, as a Docker image on Docker Hub or through conversion of the Docker image to Singularity;
Execution: The tool runs locally via localhost after installation;
Notification System: Receive email alerts about the status of the annotation process.

AnnoTEP CLI (Command Line Interface)

For experienced users, the CLI offers a more technical approach. Predefined commands and specific parameters allow greater control over the annotation process, following the EDTA pipeline terminology.

CLI Features:

Installation: Available via GitHub, as a Docker image on Docker Hub or through conversion of the Docker image to Singularity;
Execution: Like traditional pipelines, the CLI runs directly in the terminal and displays the entire annotation process in real-time, without a notification system.

Recommendations

The time required to analyse a genome of any size depends largely on the user's hardware. The more resources available, the faster the analysis will be.

System requirements

Software

Hardware

Minimum requirements for both versions, for genomes up to 1 GB

More resources are recommended for larger genomes.

General Recommendations for Using Mutation Rates to Calculate LTR Ages

Understanding LTR Age Calculation:

The age of an LTR retrotransposon can be estimated by comparing the divergence between the 5' and 3' LTR sequences of the same retrotransposon. The assumption here is that these sequences were identical at the time of insertion and have diverged due to mutations over time.
The formula commonly used is: Age = Divergence / (2 x Mutation Rate), where Divergence is the genetic distance between the two LTR sequences.
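
The formula above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the code AnnoTEP or LTR_retriever runs internally; the function name ltr_insertion_age is hypothetical, and the divergence is assumed to be expressed as substitutions per site (for example, 0.026 for 2.6%).

```python
def ltr_insertion_age(divergence: float, mutation_rate: float) -> float:
    """Estimate insertion age in years as divergence / (2 * mutation_rate).

    divergence    -- genetic distance between the 5' and 3' LTRs (substitutions per site)
    mutation_rate -- mutations per site per year (the value passed to -u)
    """
    if divergence < 0:
        raise ValueError("divergence must be non-negative")
    if mutation_rate <= 0:
        raise ValueError("mutation_rate must be positive")
    return divergence / (2 * mutation_rate)


# Example: 2.6% divergence between the two LTRs, with a rate of 1.3e-8
age_years = ltr_insertion_age(0.026, 1.3e-8)
print(f"Estimated age: {age_years / 1e6:.2f} million years")
```

With these example values the estimate is about one million years. Halving the mutation rate doubles the estimated age, which is why the choice of rate matters so much.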

Accurate Divergence Estimation:

Use reliable bioinformatics tools to accurately measure the sequence divergence between the LTRs. Tools like LTR_retriever provide mechanisms to identify LTRs and calculate divergence.
Ensure that the alignment and comparison of LTR sequences are accurately performed to avoid underestimation or overestimation of divergence.

Appropriate Mutation Rate:

Use species-specific mutation rates when available. The rate chosen is critical, as it can significantly affect age estimations.
If species-specific mutation rates are not available, use rates from closely related species or general rates for the plant family as a proxy, acknowledging the potential for error this introduces.
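
The fallback order described above (species-specific rate first, then a broader proxy) could be expressed as a small lookup. This is a minimal sketch under stated assumptions: the function and dictionary names are hypothetical, the category values are copied from the General Mutation Rates by Ecological Category table on this site, and the two species entries are estimated values listed in the mutation rate table notes.

```python
# Rates in mutations per site per year, copied from this site's tables.
ECOLOGICAL_RATES = {
    "tropical": 1.2e-8,
    "aquatic": 8.0e-9,
    "desert": 1.0e-8,
    "arctic_alpine": 7.0e-9,
    "temperate_forest": 9.0e-9,
}

# Illustrative subset of per-species values (both are estimates, not measurements).
SPECIES_RATES = {
    "Passiflora edulis": 1.3e-8,
    "Saccharum officinarum": 2.5e-8,
}


def choose_mutation_rate(species: str, category: str) -> tuple[float, str]:
    """Return (rate, source), preferring a species entry over a category proxy."""
    if species in SPECIES_RATES:
        return SPECIES_RATES[species], "species"
    if category in ECOLOGICAL_RATES:
        # Category rates are speculative; record the source so it can be reported.
        return ECOLOGICAL_RATES[category], "ecological category"
    raise KeyError(f"no rate available for {species!r} or category {category!r}")


rate, source = choose_mutation_rate("Passiflora edulis", "tropical")
print(f"-u {rate:.1e}  (source: {source})")
```

Returning the source alongside the rate makes it easier to state in a methods section which kind of proxy was used.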

Literature Review for Validation:

Review recent literature to validate the mutation rates and the methodologies used for similar studies in the same or related species. This can help confirm that your approach is aligned with current scientific standards.
Especially look for studies that have used LTR_retriever or similar tools in the same species for comparisons.

Consideration of Evolutionary and Environmental Factors:

Remember that mutation rates can be influenced by various factors including environmental stress, life history traits, and population dynamics. These factors might cause the actual mutation rate in certain environments or periods to deviate from the average.

The mutation rate list provided below can be a valuable resource for calculating the ages of LTR retrotransposons. However, this list should be used with caution due to several important considerations:

Species-Specific Variability:

Mutation rates can vary significantly even within a single species due to environmental factors, genetic background, and historical population dynamics. The rates provided are averages and may not capture this intra-species variability.

Generalization Risks:

Using mutation rates from closely related species or generalized rates for an entire plant family can introduce errors. Such rates might not accurately reflect the specific evolutionary pressures and genetic history of the species of interest.

Methodological Differences:

The methods used to estimate these mutation rates might differ, affecting their accuracy. Some rates might be derived from lab observations under controlled conditions, which may not perfectly mimic natural environments.

Evolutionary and Environmental Influences:

Mutation rates are influenced by numerous factors including climate, soil conditions, and exposure to mutagens, which can fluctuate over time and across geographies. This context-dependent nature of mutation rates can lead to underestimations or overestimations of LTR ages.

Technological and Analytical Limitations:

The precision of mutation rate calculations and the subsequent age estimations of LTR retrotransposons rely heavily on the technology and algorithms used in their determination. Advances in sequencing technology or bioinformatics tools may refine these rates, potentially altering previous calculations.

Literature Support:

It is crucial to consult the latest peer-reviewed studies for the most recent and robust mutation rates and to understand the context in which they were measured. Research publications often provide more nuanced insights into the conditions and accuracy of reported mutation rates.

Recommendations

When using this list to calculate LTR ages, clearly state any assumptions made about mutation rates and the potential sources of error in your methods and results. Consider validating your findings with multiple approaches and seek peer feedback or additional data where possible. Always stay updated with the latest research and methodological advances that may impact the interpretation of these rates.

General Mutation Rates by Ecological Category

Tropical Plants:

Tropical plants often have higher rates of growth and reproduction, which could lead to higher mutation rates. However, the rich biodiversity and complex interactions in tropical ecosystems might also promote genetic stability to some extent.

Aquatic Plants:

Aquatic environments provide a relatively stable thermal environment but can expose plants to varying levels of UV radiation and other mutagenic factors depending on water clarity and depth. The estimate assumes a moderate mutation rate reflecting these mixed conditions.

Desert Plants:

Plants in arid or desert environments are exposed to extreme conditions that can increase oxidative stress and potential DNA damage, possibly leading to slightly higher mutation rates.

Arctic and Alpine Plants:

The harsh, cold environments can slow metabolic processes and potentially reduce mutation rates. These plants also have longer life spans and slower growth rates, which might contribute to a lower rate of mutation accumulation.

Temperate Forest Plants:

This rate is based on the assumption that temperate plants experience seasonal variations that might impact their metabolic rates and, consequently, their mutation rates. This is a mid-range estimate considering the moderate environmental stresses.

Notes on General Mutation Rates by Ecological Category Estimations

These estimates are highly speculative and should be used with caution in scientific contexts. They are based on ecological reasoning rather than on direct experimental evidence, which remains the ideal way to determine such rates.
Mutation rates can vary widely even within a single ecological category due to species-specific factors, including life cycle length, reproductive strategy, and exposure to environmental mutagens.

Suggested Use

These general rates can be useful for preliminary models or simulations in ecological genetics and evolutionary studies. They provide a starting point for discussions about how different environments might influence genetic variability in plants. However, for rigorous scientific research, specific studies and data are always recommended.

Preprocessed Genomes

Here, you will find a list of genomes that have been analysed using AnnoTEP. These genomes have been carefully tested and processed, and the results are available for consultation. To access these results, simply click on the plant image, and you will be redirected to a new page.

What will you find in the preprocessed genomes?

Each preprocessed genome contains the following results and analyses:

Annotated data:

This section contains annotated data, including FASTA (.fa) and GFF3 files with detailed classification of transposable elements (TEs), as well as additional files with the graphical outputs and the data used to generate them.

TE classification table and genomic distribution:

A table that categorises elements hierarchically by order, superfamily, and autonomy. It also calculates metrics such as size and the percentage of each identified element in the analysed genome.

RepeatLandscape graphic:

A graph providing an overview of the distribution of TEs in the analysed genome.

LTR Age Graph:

Graphs representing the age of the Gypsy and Copia superfamilies.

Phylogenetic Tree and Density Graph:

Graphs representing the phylogeny of LTR lineages.

Downloading and Configuring AnnoTEP

GitHub

In a terminal, download the repository:

git clone https://github.com/Marcos-Fernando/AnnoTEP.git $HOME/AnnoTEP

Enter the folder:

cd $HOME/AnnoTEP

Installing with library and conda

Installing Miniconda

Download Miniconda
After downloading Miniconda from the link above, run the following command in your terminal:

bash Miniconda3-latest-Linux-x86_64.sh

Once Miniconda is installed, return to the AnnoTEP directory and set up the environment as follows:

cd $HOME/AnnoTEP
conda env create -f environment.yml
conda activate EDTA-new

Still within the AnnoTEP directory, copy the break_fasta.pl script to /usr/local/bin to make it accessible system-wide:

sudo cp Scripts/break_fasta.pl /usr/local/bin

RepeatMasker Fixes for Long Names

During execution, you may encounter the following error: FastaDB::_cleanIndexAndCompact(): Fasta file contains a sequence identifier which is too long ( max id length = 50 ). To fix this issue, follow the steps below:

Edit the RepeatMasker File

Access the RepeatMasker file installed in the Conda environment:

/home/"user"/miniconda3/envs/EDTA-new/bin/RepeatMasker

Locate all occurrences of FastaDB where the following snippet appears:

= FastaDB->new(
maxIDLength => 50
);

Change the value of maxIDLength from 50 to a higher value, for example:

= FastaDB->new(
maxIDLength => 80
);

Edit the ProcessRepeats File

Access the ProcessRepeats file:

/home/"user"/miniconda3/envs/EDTA-new/share/RepeatMasker/ProcessRepeats

Repeat the same procedure to change the value of maxIDLength to 80.
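
If you prefer not to edit the two files by hand, the same substitution can be scripted. The sketch below is an assumption-laden helper rather than part of AnnoTEP: the function names are hypothetical, it only rewrites occurrences written as maxIDLength => 50, and it keeps a .bak copy of each file before changing it.

```python
import re
from pathlib import Path

# Matches "maxIDLength => 50" with flexible spacing; the prefix is kept as group 1.
MAX_ID_PATTERN = re.compile(r"(maxIDLength\s*=>\s*)50\b")


def raise_max_id_length(source: str, new_length: int = 80) -> tuple[str, int]:
    """Return the patched text and the number of replacements made."""
    return MAX_ID_PATTERN.subn(lambda m: f"{m.group(1)}{new_length}", source)


def patch_file(path: Path, new_length: int = 80) -> int:
    """Patch one file in place, writing a backup next to it first."""
    original = path.read_text()
    patched, count = raise_max_id_length(original, new_length)
    if count:
        path.with_name(path.name + ".bak").write_text(original)
        path.write_text(patched)
    return count


# Demonstration on the snippet shown above (no files are touched here).
snippet = "= FastaDB->new(\nmaxIDLength => 50\n);"
patched, replaced = raise_max_id_length(snippet)
print(replaced, "occurrence(s) updated")
```

To apply it, call patch_file() once with the path to RepeatMasker and once with the path to ProcessRepeats in your Conda environment.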

Testing

Download the genome

Arabidopsis thaliana

Download the TAIR10_chr_all.fas.gz file from the TAIR website and extract its contents.

gzip -d TAIR10_chr_all.fas.gz
cat TAIR10_chr_all.fas | cut -f 1 -d" " > At.fasta
rm TAIR10_chr_all.fas
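
The cut command above keeps only the text before the first space on each line, which shortens the long TAIR sequence headers to identifiers such as >Chr1. For readers who want to do the same step in Python, here is a minimal sketch; the function name is hypothetical, and unlike cut it only touches header lines (sequence lines contain no spaces, so the result is the same for a standard FASTA file).

```python
from typing import Iterable, Iterator


def trim_fasta_headers(lines: Iterable[str]) -> Iterator[str]:
    """Yield FASTA lines with each header cut at the first space."""
    for line in lines:
        if line.startswith(">"):
            # Keep only the sequence identifier, e.g. ">Chr1"
            yield line.rstrip("\n").split(" ", 1)[0] + "\n"
        else:
            yield line


# Small in-memory demonstration
example = [">Chr1 CHROMOSOME dumped from ADB\n", "CCCTAAACCCTAAACC\n"]
print("".join(trim_fasta_headers(example)), end="")
```

For a real file, open TAIR10_chr_all.fas for reading and At.fasta for writing, then pass the input handle to trim_fasta_headers() and write out each yielded line.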

From the AnnoTEP directory (where At.fasta is expected to be located), run EDTA on the downloaded genome:

cd $HOME/AnnoTEP
mkdir Athaliana
cd Athaliana
nohup ../EDTA/EDTA.pl --genome ../At.fasta --species others --step all --sensitive 1 --anno 1 --threads 20 -u 7.0e-9 > EDTA.log 2>&1 &

Monitor the progress

tail -f EDTA.log

Adjust the number of threads - Set --threads according to the capacity of your machine or server. For optimal performance, use the maximum available. In the example above, it is set to 20.

Improving TE detection - Enable --sensitive 1 for more accurate TE detection and annotation. This option runs RepeatModeler to identify additional TEs and repeat sequences, and it also provides Superfamily and Lineage-level classifications.

Enhancing genome analysis with mutation rate - For a more refined analysis of TE insertion age, we recommend setting the mutation rate using the -u parameter. Suggested values and detailed explanations can be found in the Genome section.

Generating Graphs

Run the processing script: With the Conda environment still activated, navigate to the folder where the annotated genome was stored (e.g., Athaliana) and run the script below to generate summary data and graphs from the input genome (e.g., At.fasta):

cd Athaliana
bash -u ../Scripts/generate_PLOTs-for-TE-pipe.sh At.fasta

Make sure to replace At.fasta with the name of the input genome file you wish to process, if it is different.

At the end of the analysis, a directory named REPORT will be created. It contains all the outputs, including bubble and bar plots, phylogenetic trees, and summary reports.

Using AnnoTEP with Graphical User Interface

Recommendations

Before proceeding, make sure the Conda environment is properly set up and activated.

Navigate to the graphic-interface folder within the AnnoTEP directory.

cd AnnoTEP/graphic-interface

Configure flaskenv: With the Conda environment still active, you will need to create and configure a .flaskenv file. This file defines essential Flask settings and, optionally, enables email functionality.

You can create the .flaskenv file using the following content:

FLASK_APP = "main.py"
FLASK_DEBUG = True
FLASK_ENV = development

If you plan to use the built-in email system (for the notification system), you must also include the following email configuration:

MAIL_SERVER=server-email
MAIL_PORT=number
MAIL_USE_TLS=True or False
MAIL_USE_SSL=True or False
MAIL_USERNAME=your@email.com
MAIL_PASSWORD=app*password*
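
Before starting the application, it can help to confirm that no mail setting was left out of the file. The sketch below is a simplified check written for this guide, not the way Flask itself loads .flaskenv: the parser and function names are hypothetical, and it only understands plain KEY=value lines like the ones shown above.

```python
REQUIRED_MAIL_KEYS = (
    "MAIL_SERVER",
    "MAIL_PORT",
    "MAIL_USE_TLS",
    "MAIL_USE_SSL",
    "MAIL_USERNAME",
    "MAIL_PASSWORD",
)


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=value lines, ignoring blanks and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip().strip('"')
    return settings


def missing_mail_keys(settings: dict[str, str]) -> list[str]:
    """List the mail settings that are absent or empty."""
    return [key for key in REQUIRED_MAIL_KEYS if not settings.get(key)]


# Demonstration with an incomplete configuration (example values only)
example = 'FLASK_APP = "main.py"\nMAIL_SERVER=smtp.example.com\nMAIL_PORT=587\n'
print("Missing:", missing_mail_keys(parse_env(example)))
```

An empty list means every mail key has a value; it does not verify that the server, port, or password are actually correct.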

Email Server Settings:

App Password for Gmail:

To use Gmail securely, create an app-specific password:

Security Recommendations:

Run the Application: Within the graphic-interface folder, and with the Conda environment activated, start the application by running the following command:

flask run

Access the Platform: Click on the link http://127.0.0.1:5000/, or copy and paste it into your browser to access the platform and start testing it.
Using GUI: The AnnoTEP interface is organised into three main sections: Data Input, Additional Features, and Additional Input Files. Below, the use of each of these areas is detailed to configure the analysis of transposable elements in plant genomes.

Data Input

Email Address: Optionally, enter a valid email address to receive notifications about the status of the analysis. The annotation itself runs locally, so keep the system running during the process.
Genome Data: Upload the input file containing the complete genomic sequence in FASTA format. Use the "Browse" button to select the file from the local system.

Additional Features

This section provides advanced configuration options for the analysis.

Species Specification to Identify TIR Candidates: Specify the species to optimise the identification of TIR elements. The options include:

Steps to Be Executed: Select which parts of the pipeline will be executed:

Overwrite: Decide whether existing output data should be overwritten.
Sensitivity: Control the execution of RepeatModeler to identify additional elements.
Annotation: Specify whether the genome-wide annotation of TEs should proceed after building the TE library.
Evaluate: Check if the classification of annotated TEs is consistent. The "Annotation" field must be enabled to use this feature.
Force: If no reliable TE candidates are identified, enabling this option allows the script to continue using a backup rice TE library.
TIR filter: Filter TIRs without annotated domains. Enabling this filter can substantially reduce false positives, but may also result in the loss of some true positives (false negatives).
Annot. type: Specify whether to annotate the genome using a RepeatMasker-based library. Enabling this option may negatively affect the filtering step and compromise benchmark results.
Neutral Mutation Rate: Set the neutral mutation rate for calculating the age of intact LTR elements (default: 1.3e-8 per bp per year, based on rice). For more information, check out the table in the genome section.
Maximum Divergence: Define the maximum acceptable divergence for TE fragments. For highly repetitive genomes, users are encouraged to adjust the parameter (default: 40).
Threads: Determine the number of threads to be used in running the pipeline (default: 4).

Additional Input Files

This section allows for the addition of optional files to customise the analysis:

Coding DNA Sequence: Select a FASTA file containing the coding sequence (without introns, UTRs, or TEs) of the genome or a close relative. This helps in excluding non-transposable elements.
Curated library: Upload a curated library to maintain consistent TE naming and classification. Only manually validated TEs should be provided. This file is optional.
Exclusion of masked regions: Define regions to be ignored during TE masking. The "Annotation" field must be enabled to use this option.
RepeatModeler library: Upload a classified RepeatModeler library to enhance analysis sensitivity, particularly for LINEs. If not provided, one will be generated automatically.
RepeatMasker library: Provide your own homology-based TE annotation in RepeatMasker .out format. This file will be merged with the structure-based annotation. The "Annotation" field must be enabled.

Submit and results

After completing all necessary fields, click "Submit" to start the analysis. Ensure that all configurations are correct for efficient and accurate processing.

If you have provided a valid email address, you will receive a notification once the annotation process has started. Upon completion, a second email will be sent, indicating whether the process was successful or if any issues occurred. This final email will include a detailed log file outlining the outcome.

You can also monitor the progress of the annotation by accessing the "Results" tab. This section displays key information such as:

The name of the generated output file;
Start and end timestamps (when available);
The current status of the annotation (e.g., in progress, completed, failed);
The last 20 lines of the annotation log.

Note 1: All results and output files will be stored in the Docker volume you specified as the output directory. Ensure this path is correctly mounted to access the generated data.

Note 2: Errors may occasionally occur during the annotation process, so it is important to pay attention to two key stages:

Start of annotation: After uploading a file, ensure that the annotation process has actually begun. Sometimes, the file may not be properly recognised by the system, and in such cases, you will need to re-upload the file and restart the process.
Completion of annotation: Even if the process runs, the annotation may not be successfully finalised in the Results section. Therefore, it is important to check the following:

Results: You can monitor the annotation process directly through the GUI by accessing the Results section in the side menu. This section displays the last 10 executed jobs, automatically reading from the directory where your results are stored. For each annotation, the following information is shown: the name of the directory where the data is or will be saved, the execution time, the status (completed successfully or with an error), and an excerpt from the execution log. This log is extracted directly from the directory and updated in real time, showing the latest lines being processed.

This is a visual option — it does not redirect you to the actual directory.
You may close the browser and return later; the GUI will still display the live updates of the ongoing annotation.

User guide: The GUI also includes a brief usage guide to assist you in case you forget how to use any feature of the tool.

Docker

Docker GUI

Recommendations

If you intend to use the email notification system, please note that your machine must have access to the internet for this feature to function properly.

Download the AnnoTEP image: Open your terminal and run the following command to download the AnnoTEP Docker image:

docker pull annotep/annotep-gui:v1

Run the container: Next, run the container using the command below. Specify a folder on your machine to store the annotation results:

docker run -it -v <path-to-results-folder>:/usr/local/AnnoTEP/graphic-interface/results -dp 0.0.0.0:5000:5000 annotep/annotep-gui:v1

Parameter descriptions:

-v <path-to-results-folder>:/usr/local/AnnoTEP/graphic-interface/results: Creates a volume between your machine and the container to store results. Replace <path-to-results-folder> with the path to a folder on your machine. If the folder doesn't exist, Docker will create it. The path /usr/local/AnnoTEP/graphic-interface/results is the directory inside the container and should not be changed.
-dp 0.0.0.0:5000:5000: Combines -d, which runs the container in the background, with -p, which maps port 5000 of the container to port 5000 of the host.
annotep/annotep-gui:v1: Specifies the Docker image to use.

Access the AnnoTEP Interface: After running the container, open your browser and go to the following address to use the graphical interface: http://127.0.0.1:5000
The GUI guide can be found in the subsection "Using GUI".

Recommendations

Avoid shutting down your machine during the process, as this may interrupt the analysis. Even when using the web interface, processing occurs locally on your machine.

Annotation speed depends on your machine's performance. Ensure your system meets the recommended requirements for optimal results.

Docker CLI

Follow the steps below to download and configure the AnnoTEP CLI. This version is ideal for advanced users who prefer greater control and customization via commands.

Download the AnnoTEP CLI image: To get started, download the AnnoTEP CLI Docker image by running the following command:

docker pull annotep/annotep-cli:v1

Display the User Guide: Use the -h parameter to display a detailed guide on how to use the script:

docker run annotep/annotep-cli:v1 python run_annotep.py -h

This will display a detailed guide with usage options:

Command Flag	Description	Required?

Run the Container: To simplify this step, we recommend creating a folder to store your genomic data in FASTA format. Once created, run the container using the command below as a guide. Ensure you provide the full path to the folder where you want to save the results, as well as the full path to the genomes folder:

docker run -it -v <path-to-results-folder>:/usr/local/AnnoTEP/bash-interface/results -v <absolute-path-to-folder-genomes>:<absolute-path-to-folder-genomes> annotep/annotep-cli:v1 python run_annotep.py --genome <absolute-path-to-folder-genomes>/genome.fa --threads <number>

Parameter descriptions:

-v <path-to-results-folder>:/usr/local/AnnoTEP/bash-interface/results: Creates a volume between your machine and the container to store results. Replace <path-to-results-folder> with the path to a folder on your machine. If the folder doesn't exist, Docker will create it. The path /usr/local/AnnoTEP/bash-interface/results is the directory inside the container and should not be changed.
-v <absolute-path-to-folder-genomes>:<absolute-path-to-folder-genomes>: Mounts the folder containing the genomic files inside the container under the same path, so the files are read in place rather than copied. Replace <absolute-path-to-folder-genomes> with the full path of the folder containing the genomes.
--genome <absolute-path-to-folder-genomes>/genome.fa: Specify the full path to the genome file you want to annotate.
--threads <number>: Define the number of threads to be used.
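
Because both volume paths must be absolute and the genome folder is mounted under the same path inside the container, it is easy to mistype the command. The sketch below assembles it from three inputs. The function name is hypothetical and the example paths are placeholders; the image name, container paths, and flags are the ones documented above.

```python
import shlex
from pathlib import PurePosixPath

CONTAINER_RESULTS = "/usr/local/AnnoTEP/bash-interface/results"


def build_cli_command(results_dir: str, genome_file: str, threads: int) -> list[str]:
    """Assemble the docker run arguments for the AnnoTEP CLI image."""
    results = PurePosixPath(results_dir)
    genome = PurePosixPath(genome_file)
    if not results.is_absolute() or not genome.is_absolute():
        raise ValueError("results_dir and genome_file must be absolute paths")
    if threads < 1:
        raise ValueError("threads must be at least 1")
    genomes_dir = str(genome.parent)
    return [
        "docker", "run", "-it",
        "-v", f"{results}:{CONTAINER_RESULTS}",
        # Same path on both sides, so --genome is valid inside the container
        "-v", f"{genomes_dir}:{genomes_dir}",
        "annotep/annotep-cli:v1",
        "python", "run_annotep.py",
        "--genome", str(genome),
        "--threads", str(threads),
    ]


# Placeholder paths for illustration only
command = build_cli_command("/data/results", "/data/genomes/genome.fa", 8)
print(shlex.join(command))
```

The printed line can be pasted into a terminal, or the list can be passed to subprocess.run() if you want to launch the container from a script.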

Monitor the Annotation Process: Now, just wait for the genome annotation to complete. You can monitor the progress directly in the terminal, where logs will be displayed in real-time.

Resolving Memory Issues in Docker Containers

If Docker containers experience memory issues or unexpected terminations due to intensive resource usage, you can adjust the process limits (--pids-limit) and swap memory (--memory-swap). Example usage:

docker run -it -v <path-to-results-folder>:/usr/local/AnnoTEP/graphic-interface/results -dp 0.0.0.0:5000:5000 --pids-limit <threads x 10000> --memory-swap -1 annotep/annotep-gui:v1

Explanation

--pids-limit <threads x 10000>: Sets the maximum number of processes the container can create. For example, if you use 12 threads, set this value to 120,000. This ensures each thread can create subprocesses without hitting the process limit, maintaining performance under high load.
--memory-swap -1: Disables the swap memory limit, allowing the container to use unlimited virtual memory. This helps avoid errors when physical RAM is insufficient.
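
The rule of thumb above (threads x 10000) is simple to compute, and doing it in code avoids slips such as a missing zero. This is a small sketch with a hypothetical function name; the multiplier and both flags come from the explanation above.

```python
def resource_flags(threads: int, pids_per_thread: int = 10_000) -> list[str]:
    """Return the extra docker run flags suggested for heavy annotations."""
    if threads < 1:
        raise ValueError("threads must be at least 1")
    return [
        "--pids-limit", str(threads * pids_per_thread),
        "--memory-swap", "-1",  # no limit on swap usage
    ]


# Example for a 12-thread run
print(" ".join(resource_flags(12)))
```

For 12 threads this produces --pids-limit 120000 --memory-swap -1, matching the example given above. The flags go before the image name in the docker run command.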

Singularity

You can use AnnoTEP with Singularity by converting the official Docker images. Below are the available methods to obtain and run .sif images.

Obtaining the Singularity Image: There are two ways to obtain the image:

Method 1 – Direct Conversion from Docker Hub: Download and convert the image directly from Docker Hub using:

singularity build <name-image>.sif docker://annotep/annotep-cli:v1
#or
singularity build <name-image>.sif docker://annotep/annotep-gui:v1

Description:

<name-image>: you can name the image anything you like; the extension must be .sif.
docker://: specifies that the image will be pulled from a remote repository (e.g. Docker Hub).

Method 2 – Conversion from a Local Docker Image: This method involves saving the Docker image locally and then converting it:

Save the Docker image to a .tar file:

docker save annotep/annotep-cli:v1 -o annotep_cli1.tar
#or
docker save annotep/annotep-gui:v1 -o annotep_gui1.tar

Convert the .tar file to a Singularity image:

singularity build <name-image>.sif docker-archive://annotep_cli1.tar
#or
singularity build <name-image>.sif docker-archive://annotep_gui1.tar

Description:

-o: specifies the name of the .tar file.
<name-image>: you can name the image anything you like; the extension must be .sif.
docker-archive://: indicates the image will be built from a local .tar archive.

Running the Image: How you run the container depends on the interface you choose:

Singularity GUI

To launch the graphical interface, use:

singularity exec --bind <path-to-results-folder>:/usr/local/AnnoTEP/graphic-interface/results <name-image>.sif bash -c "cd /usr/local/AnnoTEP/graphic-interface && source /usr/local/miniconda3/etc/profile.d/conda.sh && conda activate EDTA-new && python main.py"

After running the container, access the AnnoTEP interface by typing the following address into your web browser: 127.0.0.1:5000

Description:

--bind <path-to-results-folder>:/usr/local/AnnoTEP/graphic-interface/results: maps a directory from your local machine to a directory inside the container.
bash -c "...": executes a sequence of commands within the container.

The GUI guide can be found in the subsection "Using GUI".

Singularity CLI

To run via the command line, use:

singularity exec -B <path-to-results-folder>:/usr/local/AnnoTEP/bash-interface/results -B <absolute-path-to-folder-genomes>:/genomas <name-image>.sif python /usr/local/AnnoTEP/bash-interface/run_annotep.py --genome /genomas/genome.fasta --threads <threads>

Description:

-B: equivalent to --bind, links local directories to container paths.
<path-to-results-folder>:/usr/local/AnnoTEP/bash-interface/results: folder where analysis results will be saved.
<absolute-path-to-folder-genomes>:/genomas: folder containing the input genome files.
python /usr/local/AnnoTEP/bash-interface/run_annotep.py: the main command that starts the analysis.
--genome /genomas/genome.fasta: path to the genome file to be annotated.
