S2-PepAnalyst presents a dataset curated to facilitate in-depth exploration and analysis of proteomic data, specifically small signalling peptides, across multiple plant species. Drawing on reference proteomes from Arabidopsis, Avocado Hass/Gween, Mango, and Tomato, this dataset offers researchers and practitioners a robust foundation for investigating protein composition and functionality within these diverse organisms.

The inclusion of proteomic data from Arabidopsis thaliana, renowned as a model organism in plant biology, underscores the dataset's scientific rigour and relevance. Additionally, the incorporation of proteomic information from avocado, mango, and tomato caters to the interests of researchers exploring plant-based nutrition, agriculture, and biotechnology.

Furthermore, S2-PepAnalyst extends its scope beyond the primary species by offering an 'Others' proteome feature. This feature lets users delve into proteomic datasets from other organisms, with cut-site probabilities pre-calculated in SignalP enhancing the utility and versatility of the platform.

In essence, the dataset underpinning S2-PepAnalyst encapsulates a wealth of proteomic insights, fostering interdisciplinary research endeavours and driving advancements in plant science, nutrition, and biotechnology.

The datasets used in our application are:

Arabidopsis Proteome (Arabidopsis thaliana)

The Arabidopsis proteome serves as a foundational component within the dataset structure of S2-PepAnalyst. Derived from the comprehensive reference TAIR-Araport11, it encompasses a diverse array of small proteins relevant to Arabidopsis thaliana, a widely studied model organism in plant biology research.

Avocado Hass/Gween Proteome (Persea americana)

The Avocado Hass/Gween proteome enriches the dataset with proteomic information specific to avocado varieties (https://www.avocado.uma.es/). Avocado, known for its nutritional value and culinary versatility, is represented comprehensively through this dataset, offering insights into its unique protein composition.

Mango Proteome (Mangifera indica)

Within the dataset structure of S2-PepAnalyst, the mango proteome provides a nuanced understanding of proteins inherent to the mango fruit (https://mangobase.org/). Mango, celebrated for its tropical sweetness and nutritional benefits, is spotlighted through this dataset, facilitating exploration into its proteomic profile.

Tomato Proteome (Solanum lycopersicum)

The tomato proteome constitutes an integral component of the dataset, offering insights into the protein landscape of Solanum lycopersicum (https://tea.solgenomics.net/). Tomatoes, widely cultivated and consumed globally, are delineated through this dataset, enabling comprehensive analyses pertinent to both agricultural and nutritional contexts.

Other Proteomes

In addition to the aforementioned species, S2-PepAnalyst also accommodates users interested in exploring proteomic data from other organisms (Maize, Strawberry, Pineapple, etc.). This feature includes cut sites pre-calculated with SignalP6, so users can investigate the proteomic profiles of organisms beyond the core dataset.
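
As an illustration of how a pre-calculated cut site might be used downstream, the following minimal Python sketch splits a protein sequence into a putative signal peptide and mature region. It assumes the cut site is stored as an annotation string of the form `CS pos: 22-23. Pr: 0.9787`; that format, the function names, and the 0.5 probability threshold are assumptions made for this example and are not taken from S2-PepAnalyst itself.

```python
import re
from typing import NamedTuple, Optional, Tuple


class CleavageSite(NamedTuple):
    last_signal_residue: int  # 1-based position of the last signal-peptide residue
    probability: float        # probability reported for this cut site


# Assumed annotation format (hypothetical for this sketch): "CS pos: 22-23. Pr: 0.9787"
_CS_PATTERN = re.compile(r"CS pos:\s*(\d+)-(\d+)\.\s*Pr:\s*(\d+(?:\.\d+)?)")


def parse_cleavage_site(annotation: str) -> Optional[CleavageSite]:
    """Extract the cut position and probability, or None if no site is annotated."""
    match = _CS_PATTERN.search(annotation)
    if match is None:
        return None
    return CleavageSite(int(match.group(1)), float(match.group(3)))


def split_at_cleavage_site(
    sequence: str,
    site: Optional[CleavageSite],
    min_probability: float = 0.5,  # illustrative threshold, not a recommended value
) -> Tuple[str, str]:
    """Return (signal_peptide, mature_region).

    If there is no site, or its probability is below the threshold,
    the signal peptide is empty and the full sequence is returned as mature.
    """
    if site is None or site.probability < min_probability:
        return "", sequence
    cut = site.last_signal_residue
    return sequence[:cut], sequence[cut:]


if __name__ == "__main__":
    # Toy sequence and annotation, invented for demonstration only.
    site = parse_cleavage_site("CS pos: 3-4. Pr: 0.9")
    signal, mature = split_at_cleavage_site("MKTAYIAK", site)
    print(signal, mature)
```

The cut is placed after the first position in the annotated pair, so the residues up to and including that position are treated as the signal peptide. Entries without an annotated site, or with a probability below the chosen threshold, are returned unchanged.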

Moreover, a comprehensive data mining analysis was undertaken to chart what is, to the best of our knowledge, the first map of small signalling peptides from Arabidopsis thaliana, categorised by family, as documented in the existing literature. Interested parties are invited to make contact via email for access to this catalogue.