
Annotare submission guidelines for sequencing-based spatial transcriptomics

ArrayExpress accepts sequencing-based spatial transcriptomics datasets. These guidelines describe how to prepare and submit such data using Annotare.

Please note that this page highlights requirements specific to spatial data and is not comprehensive. Please refer to the general submission guidelines for complete guidance on the Annotare/ArrayExpress submission process.

Imaging-based spatial transcriptomics datasets are not accepted in ArrayExpress and should instead be submitted to the BioImage Archive. For more information on submitting imaging-based datasets, and on spatial transcriptomics data more broadly, see the dedicated portal.

Scope and Data Routing

Raw data files are brokered to the European Nucleotide Archive (ENA), while ArrayExpress archives metadata, images, and processed data.

Human Data and Consent

Data derived from human samples that could potentially enable donor identification may only be submitted if appropriate consent for public data release has been obtained.

If raw data from experiments on human material cannot be released publicly, we can accept submissions with processed data only. In these cases, sufficient metadata and protocol descriptions must still be provided to enable interpretation of the dataset. A curator must manually approve the submission to override the missing raw data file error; please contact the Annotare team for assistance.

General Principles

The aim of an ArrayExpress submission is to ensure that datasets can be understood and reproduced independently of any associated publication.

All relevant metadata, protocols, and processed outputs should therefore be included.

Raw sequencing data should be provided as compressed FASTQ or BAM files, named according to prescribed naming conventions.
Raw high-resolution tissue images should be included, but they must be assigned as processed data files.
Include all additional processed outputs described below.

Annotare Template Selection

For spatial transcriptomics submissions, please select the “Single cell sequencing” template in Annotare.

This template allows submission of more than two raw data files per sequencing library and captures the required experimental metadata.

For any library construction categories that are not relevant to your method, leave the field empty or select “Other”.

Recommended Metadata for Spatial Transcriptomics

Annotare will automatically require core metadata based on the selected experimental model. For spatial transcriptomics datasets, we strongly recommend including the additional sample attributes listed below.

Sample-Level Metadata (Required)

Sample ID: Unique identifier for the biological specimen.
Specimen slide ID: Identifier for the slide or cartridge.
Sampling site: Precise anatomical location (preferably using ontology terms).
Storage state: Preservation method (e.g. fresh frozen, FFPE).
Tissue section thickness: Reported in micrometers (µm).
Permeabilization time: Duration of the permeabilization step (optional; provide if relevant).
Position on slide: Location of capture area.
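
As an illustration only, the attributes listed above can be checked before submission with a short script. This is a minimal sketch: the dictionary keys (`sample_id`, `specimen_slide_id`, and so on) and the example values are hypothetical and are not Annotare's actual column headers. Permeabilization time is treated as optional, as stated in the list.

```python
# Hypothetical attribute keys; Annotare's real column names may differ.
EXPECTED_ATTRIBUTES = [
    "sample_id",
    "specimen_slide_id",
    "sampling_site",
    "storage_state",
    "tissue_section_thickness_um",
    "position_on_slide",
]
# Optional per the list above, so never reported as missing.
OPTIONAL_ATTRIBUTES = ["permeabilization_time"]


def missing_attributes(sample):
    """Return the expected attributes that are absent or blank in a sample record."""
    missing = []
    for name in EXPECTED_ATTRIBUTES:
        value = sample.get(name)
        if value is None or str(value).strip() == "":
            missing.append(name)
    return missing


# Made-up example record with a blank position on slide.
example = {
    "sample_id": "S1",
    "specimen_slide_id": "slide-A",
    "sampling_site": "hippocampus",
    "storage_state": "fresh frozen",
    "tissue_section_thickness_um": 10,
    "position_on_slide": "",
}

print(missing_attributes(example))  # prints ['position_on_slide']
```

A check like this only confirms that values are present; it does not validate ontology terms or units.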

Library Information

In the Single-cell library information tab:

Specify the library construction method (e.g. 10x Visium v2).
If “Other” is selected, provide full details in the nucleic acid library construction protocol.

Protocol Requirements

Please provide all required protocols.

Spatial transcriptomics submissions should also include a Dissection protocol describing:

Sample Preparation

Fixation
Sectioning / slicing
Permeabilization

Sample Imaging

Microscopy instrument
Magnification
Microscopy technique
Imaging setup

Data Processing

In the normalization/data transformation protocol, specify the version of downstream analysis software.

File Requirements

Raw Data

(Required unless the raw data are restricted human data)

Compressed FASTQ files
Compressed BAM files

Please refer to detailed instructions for raw sequencing file preparation.

A raw high-resolution tissue image must also be provided, but it has to be assigned as a processed data file.
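
The assignment rule described in this section can be sketched as a small helper for sorting files before upload. This is a hedged illustration, not part of Annotare: the file extensions are assumptions (the guidelines do not list accepted compression or image formats here), and the function name `assign_file_type` is hypothetical.

```python
# Assumed extensions; check the raw file preparation instructions for the real list.
RAW_SEQUENCING_SUFFIXES = (".fastq.gz", ".fq.gz", ".bam")
IMAGE_SUFFIXES = (".tif", ".tiff", ".jpg", ".jpeg", ".png")


def assign_file_type(filename):
    """Suggest whether a file should be assigned as raw or processed data."""
    name = filename.lower()
    if name.endswith(RAW_SEQUENCING_SUFFIXES):
        return "raw"
    if name.endswith(IMAGE_SUFFIXES):
        # Tissue images are raw in origin but are assigned as processed data files.
        return "processed"
    # Anything else needs a manual decision by the submitter.
    return "unrecognised"


# Made-up file names for illustration.
for f in ["lib1_R1.fastq.gz", "lib1.bam", "slide-A_tissue.tif", "lib1_R1.fastq"]:
    print(f, "->", assign_file_type(f))
```

Uncompressed FASTQ files deliberately fall into the "unrecognised" group here, since the guidelines ask for compressed files.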