Service Policy

Summary

Describes the architecture and integrated services of the D4Science infrastructure, including Virtual Research Environments, data and research artefact services, computational platforms, APIs, support, and service evolution.

Policy Version: 1.1
Effective Date:

1. Purpose and Scope

The D4Science infrastructure provides an integrated ecosystem of services designed to support the full lifecycle of scientific research, from data collection and analysis to collaboration and dissemination of results.

This Service Policy describes how these services are organized, delivered, and interconnected. D4Science operates as a unified infrastructure rather than a set of isolated platforms: services are designed to work together, so users access data, tools, and applications through one consistent, integrated experience.

The purpose of this policy is to clarify the value and role of the services offered, explain how they are accessed and used, and describe how the infrastructure evolves over time to meet the needs of scientific communities.

2. Architecture of the D4Science Infrastructure

D4Science is built as a distributed and modular infrastructure that supports multiple research communities through a shared set of core services and customizable environments.

At its core, the infrastructure combines global services, which provide identity, resource management, and service discovery, with Virtual Research Environments (VREs), which represent the operational environments where users interact with data, tools, and applications.

This architecture allows D4Science to offer both standardization and flexibility. Services are centrally managed and integrated, while each community can operate within its own tailored environment. From the user’s perspective, the infrastructure appears as a cohesive system, even though services may be deployed across different physical or cloud environments.

3. Global Infrastructure Services

Global services provide the backbone of the D4Science infrastructure, ensuring that all components operate coherently and securely.

The Identity and Access Management system enables secure and interoperable authentication, supporting institutional identities through eduGAIN as well as local accounts where required. This allows users to access services across communities using a consistent identity model.

The Resource Registry acts as the infrastructure’s information system, maintaining a structured representation of services, resources, and environments. This enables services to be discovered, monitored, and orchestrated dynamically.
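As an illustration of what "a structured representation" that supports discovery can look like, the sketch below models a tiny in-memory registry. It is not the D4Science Resource Registry API: the class names, fields, resource names, and endpoints are all hypothetical and chosen only to show the idea of registering resources and looking them up by type and environment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """One registry entry (hypothetical fields, for illustration only)."""
    name: str
    kind: str          # e.g. "analytics" or "storage"
    environment: str   # the VRE or context the resource belongs to
    endpoint: str


class Registry:
    """A minimal in-memory registry keyed by resource name."""

    def __init__(self) -> None:
        self._resources: dict[str, Resource] = {}

    def register(self, resource: Resource) -> None:
        # Re-registering a name replaces the previous entry.
        self._resources[resource.name] = resource

    def discover(self, kind: str, environment: str) -> list[Resource]:
        # Return matching resources in a stable, name-sorted order.
        matches = (
            r for r in self._resources.values()
            if r.kind == kind and r.environment == environment
        )
        return sorted(matches, key=lambda r: r.name)


registry = Registry()
registry.register(Resource("notebooks", "analytics", "demo-vre", "https://example.org/nb"))
registry.register(Resource("files", "storage", "demo-vre", "https://example.org/files"))
registry.register(Resource("files-b", "storage", "other-vre", "https://example.org/files-b"))

found = registry.discover(kind="storage", environment="demo-vre")
print([r.name for r in found])
```

A real information system would also track status and relationships between resources, which is what makes monitoring and orchestration possible; the sketch covers discovery only.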

The Resource Manager provides the capability to create and configure Virtual Research Environments, allowing new communities to be onboarded and existing ones to evolve. Through this mechanism, D4Science can adapt to emerging scientific needs without disrupting existing users.

Together, these services ensure that the infrastructure remains coherent, scalable, and responsive to change.

4. Virtual Research Environments

Virtual Research Environments are the central element of D4Science. They provide a digital workspace where research communities can collaborate, access shared resources, and develop scientific workflows.

Each VRE integrates multiple services into a unified environment, allowing users to interact with data, applications, and computational tools without needing to manage the underlying infrastructure. From the user’s perspective, a VRE is a single, coherent environment, even though its components may be distributed across different systems.

The flexibility of VREs allows them to support a wide range of use cases, from open collaborative platforms to restricted project environments, enabling D4Science to serve diverse scientific domains effectively.

5. Data and Research Artefact Management Services

D4Science provides a comprehensive set of services for managing research artefacts, which represent the core assets of scientific work. These artefacts include not only datasets, but also models, workflows, scripts, notebooks, and applications.

A key feature of the infrastructure is that these artefacts are not managed in isolation. They are fully integrated into the computational and collaborative environment, allowing users to move seamlessly between data management, analysis, and sharing.

The Workspace service plays a central role in this ecosystem. It provides users with a structured environment where data and artefacts can be stored, organized, and shared. Workspace is not a standalone storage system: it is directly accessible from computational environments such as JupyterLab, RStudio, and the Cloud Computing Platform (CCP), so users can work on data without manual transfers. This integration removes the download-and-upload step between storage and analysis.

StorageHub complements Workspace by providing scalable and persistent storage capabilities, ensuring that large datasets and artefacts can be reliably managed and accessed by other services.

Catalogue services enable the publication and discovery of research artefacts. By associating metadata, licensing information, and references to artefacts, the Catalogue supports reuse and dissemination of scientific outputs across communities.

In addition, specialized services such as the Spatial Data Infrastructure support domain-specific data management needs, demonstrating the extensibility of the platform.

6. Collaboration Services

Collaboration is a fundamental aspect of the D4Science infrastructure. The platform provides integrated tools that enable users to communicate, share knowledge, and coordinate activities within their Virtual Research Environments.

These services include social interaction features, messaging systems, and collaborative documentation tools. Rather than being isolated communication tools, they are embedded within the VRE context, allowing collaboration to occur directly alongside data analysis and application usage.

This integration supports more effective scientific collaboration, reducing fragmentation and enabling communities to work in a unified digital space.

7. Computational and Analytical Services

Computational services in D4Science enable users to transform data into knowledge by executing analytical workflows and deploying applications. These services are designed to support both structured workflows and exploratory analysis.

The Cloud Computing Platform (CCP) provides a framework for executing containerised scientific methods and workflows. It allows researchers to package their methods in a reproducible way and execute them within the infrastructure, without needing to manage the underlying execution environment.

Alongside CCP, D4Science provides interactive environments such as JupyterLab and RStudio, which are widely used for data analysis and scientific programming. These environments are fully integrated with the infrastructure, allowing direct access to Workspace and StorageHub resources. This means that users can seamlessly read, process, and write data without switching contexts.
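The read, process, and write cycle described above can be sketched in a few lines. This is a minimal example under stated assumptions: it assumes Workspace content is reachable as an ordinary directory from the interactive environment, and it takes that directory as a parameter because the actual location is not specified in this policy. The file names, column names, and the helper `column_mean` are hypothetical; a temporary directory stands in for the workspace so the example runs anywhere.

```python
import csv
import tempfile
from pathlib import Path


def column_mean(workspace_root: Path, relative_path: str, column: str) -> float:
    """Read a CSV file under the workspace root and average one numeric column."""
    with (workspace_root / relative_path).open(newline="") as handle:
        values = [float(row[column]) for row in csv.DictReader(handle)]
    if not values:
        raise ValueError("the file contains no data rows")
    return sum(values) / len(values)


# A temporary directory stands in for the workspace (assumption).
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Read: a small input file, as if it were already stored in the workspace.
    (root / "samples.csv").write_text("depth,temperature\n10,14.5\n20,13.5\n")
    # Process: compute a summary value.
    mean = column_mean(root, "samples.csv", "temperature")
    # Write: store the result next to the input, with no transfer step.
    (root / "mean_temperature.txt").write_text(f"{mean}\n")
    written = (root / "mean_temperature.txt").read_text()

print(mean)
```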

The infrastructure also supports community-developed applications, which represent a key element of the D4Science value proposition. Communities can develop their own tools, integrate them into the infrastructure, and make them available to other users within a VRE. This enables a model of co-creation, where the infrastructure evolves together with the communities that use it.

Interactive applications such as RShiny apps are also supported and are commonly used to provide user-friendly interfaces to complex analytical workflows. These applications can be deployed within the infrastructure and accessed by users without requiring advanced technical expertise.

Users may deploy their own artefacts within the infrastructure. The platform provides the execution environment, but responsibility for the correctness and compliance of these artefacts remains with their authors.

The legacy DataMiner platform, previously used for distributed analytics, is being progressively replaced by the Cloud Computing Platform. To ensure continuity, DataMiner remains available for existing workflows until June 2026, with full decommissioning planned for October 2026. This transition reflects the adoption of more flexible and modern execution models.

8. Programmatic Access and APIs

APIs are a fundamental component of the D4Science infrastructure, enabling integration with external systems and the automation of scientific workflows.

Through APIs, users and developers can interact with services, access data, and build applications that extend the capabilities of the infrastructure. This is essential for reproducible science and for integrating D4Science into broader digital ecosystems.

API access is secured using OpenID Connect and token-based authorization, ensuring that all operations are performed within the appropriate Virtual Research Environment context.
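As a rough illustration of token-based authorization, the sketch below builds an HTTP request that carries an access token in the standard `Authorization: Bearer` header. The endpoint URL and the token value are placeholders, not real D4Science values, and the function name is hypothetical. How a token is obtained for a given Virtual Research Environment is not covered here; the request is constructed but deliberately not sent.

```python
import urllib.request


def build_authorized_request(url: str, access_token: str) -> urllib.request.Request:
    """Build (but do not send) a GET request carrying a bearer token."""
    if not access_token:
        raise ValueError("an access token is required")
    return urllib.request.Request(
        url,
        headers={
            "Authorization": f"Bearer {access_token}",
            "Accept": "application/json",
        },
        method="GET",
    )


# Hypothetical endpoint and placeholder token, for illustration only.
request = build_authorized_request(
    "https://api.example.org/workspace/items",
    "PLACEHOLDER_TOKEN",
)
print(request.get_method(), request.full_url)
print(request.get_header("Authorization"))
```

Keeping request construction separate from sending makes it easy to check that every call carries a token before any network traffic occurs.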

The infrastructure may apply usage limits to ensure fair access and system stability, and detailed technical documentation is provided through the developer portal.
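Clients that automate workflows can cooperate with usage limits by spacing out retries. The sketch below computes a capped exponential backoff schedule. This policy does not state what the limits are or how they are signalled, so the base delay, the cap, and the function name are illustrative assumptions rather than documented values.

```python
def backoff_delays(
    max_retries: int,
    base_seconds: float = 1.0,
    cap_seconds: float = 60.0,
) -> list[float]:
    """Return the wait before each retry: base * 2**attempt, capped at cap_seconds."""
    if max_retries < 0:
        raise ValueError("max_retries must not be negative")
    return [
        min(cap_seconds, base_seconds * 2 ** attempt)
        for attempt in range(max_retries)
    ]


# Delays a client would sleep between successive attempts (assumed values).
print(backoff_delays(5))
```

In practice a client would sleep for each delay in turn after a rejected request and stop once a request succeeds or the schedule is exhausted.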

9. Infrastructure Hosting

D4Science operates on a hybrid infrastructure model that combines resources from multiple providers.

Core services are hosted on infrastructure operated by CNR and GARR, while additional scalability is provided through cloud resources such as Google Cloud Platform. Despite this distribution, services are designed to appear unified to the user, ensuring a consistent experience regardless of where they are deployed.

This approach provides both reliability and flexibility, allowing the infrastructure to adapt to changing demands and to the needs of different scientific communities.

10. Service Support

D4Science provides dedicated support to ensure that users and communities can effectively use the infrastructure.

Support services include assistance with access, troubleshooting, service requests, and incident handling. All support interactions are managed through a centralized portal, ensuring traceability and consistency.

The support portal is available at https://support.d4science.org.

11. Service Availability

The infrastructure is designed to provide reliable services to the research community. While availability targets typically exceed 99.5%, services are provided on a best-effort basis, reflecting the shared nature of the infrastructure.
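To put the 99.5% figure in concrete terms, the following sketch converts an availability fraction into the downtime it permits over a period. The function name and the choice of a 365-day year and a 30-day month are illustrative assumptions, not definitions taken from this policy.

```python
def allowed_downtime_hours(availability: float, period_hours: float) -> float:
    """Return the hours of downtime permitted by an availability fraction."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be a fraction between 0 and 1")
    return (1.0 - availability) * period_hours


HOURS_PER_YEAR = 365 * 24    # 8760 hours
HOURS_PER_MONTH = 30 * 24    # 720 hours

yearly = allowed_downtime_hours(0.995, HOURS_PER_YEAR)
monthly = allowed_downtime_hours(0.995, HOURS_PER_MONTH)
print(f"{yearly:.1f} hours per year")    # 43.8 hours per year
print(f"{monthly:.1f} hours per month")  # 3.6 hours per month
```

Under these assumptions, a 99.5% target corresponds to roughly 44 hours of downtime per year, or about 3.6 hours per month.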

Temporary interruptions may occur due to maintenance or unforeseen events, but efforts are made to minimize their impact and ensure continuity wherever possible.

12. Evolution of Services

D4Science is a continuously evolving infrastructure. Services are updated, improved, or replaced to reflect technological advances and the needs of the research community.

Changes are introduced carefully, with the goal of minimizing disruption and supporting communities during transitions.

13. Policy Framework

This document forms part of the D4Science Policy Framework.

For support or questions related to this policy, please use the support portal at https://support.d4science.org.