Gateway Terms of Use

Last revised on December 10th, 2019

Disclaimer

This Gateway is operated by D4Science.org. Some of the technologies, services, and procedures powering it, hereafter referred to as the Content of the Gateway, have been produced with the co-funding of the European Commission. The Content of the Gateway is the sole responsibility of the D4Science.org Team and cannot be considered to reflect the views of the European Commission.

Information and content accessible through the Gateway are provided on an "as is" and "as available" basis. The D4Science.org Team makes every effort to ensure, but does not guarantee, the accuracy, completeness or authenticity of the information on this Gateway. The D4Science.org Team reserves the right to alter, limit or discontinue any part of this service at its discretion. Under no circumstances shall the D4Science.org Team be liable for any loss, damage, liability or expense suffered that is claimed to result from the use of information posted on this site or through its web services, including without limitation, any fault, error, omission, interruption or delay.

Hyperlinks to non-D4Science.org websites do not imply any official endorsement of or responsibility on the part of D4Science.org for the opinions, ideas, content or products presented at these locations, or guarantee the validity of the information provided. The sole purpose of links to non-D4Science.org sites is to indicate further information available on related topics. The information is provided on the basis that users accessing this Gateway assume responsibility for assessing its relevance, accuracy and suitability for application.

Copyright

The content and information made available through this Gateway are available under terms described in the metadata accompanying each product (e.g. "license" or "constraints" field). Except where otherwise noted (i.e. in case of primary requirement to comply with the content provider's policy) this is the Creative Commons License.

All derivative products produced and made publicly available through this Gateway are licensed under the Creative Commons License CC BY-SA except where otherwise noted.

Your Agreement to the Terms

YOUR ACCESS OR USE OF THE GATEWAY, A VIRTUAL RESEARCH ENVIRONMENT OR A SERVICE IN ANY WAY SIGNIFIES THAT YOU HAVE READ, UNDERSTAND AND AGREE TO BE BOUND BY THE TERMS.

Our Services are very diverse, so additional terms or requirements apply. Additional terms are available with the relevant Services, and those additional terms become part of your agreement with us if you use those Services.

By accessing or using the Gateway, a Virtual Research Environment or a Service you also declare that you have the legal authority to accept the Terms on behalf of yourself and any party you represent in connection with your use of the Gateway, a Virtual Research Environment or a Service. If you do not agree to the Terms, you are not authorized to use the Gateway, a Virtual Research Environment or a Service.

Please read them carefully.

Gateway

Any Gateway exploiting resources of the D4Science Infrastructure and operated by D4Science.org is the access point to products and services identified by the community requesting it.

Upon registration to the Gateway Service, the user can immediately use the storage space via the Workspace Service; the Social Networking Collaborative Platform, which includes the Email Service to exchange content with other registered users, the Social Service to share and read news and posts with the user's connections, and the Notification Service for user notifications; and the Catalogue Service to browse the datasets, methods, and services published therein. The user can also apply to one or more moderated or public Virtual Research Environments offering one or more additional services.

All the Services made accessible through this Gateway are also accessible through APIs by specifying the secure token generated with the registration and specialised for each Virtual Research Environment the user is a member of. Starting from January 1st, 2020, the APIs can be used only over HTTPS, the secure and encrypted communication protocol; support for the deprecated HTTP protocol is definitively terminated on December 31st, 2019.
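
As an illustration of the token-based, HTTPS-only access described above, the following minimal Python sketch builds an authenticated request and refuses plain HTTP. The endpoint URL, the `gcube-token` header name, and the helper `build_api_request` are assumptions made for illustration only; the actual endpoints and the way the token is passed are defined by the documentation of each Service.

```python
from urllib.parse import urlparse
from urllib.request import Request


def build_api_request(url: str, token: str) -> Request:
    """Return a request carrying the VRE-specific token; reject non-HTTPS URLs."""
    if urlparse(url).scheme.lower() != "https":
        raise ValueError("Only HTTPS is supported; plain HTTP has been discontinued")
    if not token:
        raise ValueError("A secure token is required")
    # Hypothetical header name: check the Service documentation for the real one.
    return Request(url, headers={"gcube-token": token})


# Hypothetical endpoint and placeholder token, for illustration only.
request = build_api_request("https://api.example.org/workspace/items", "YOUR-VRE-TOKEN")
print(request.full_url)
```

Because the token is specialised for each Virtual Research Environment, a client working with several VREs would keep one token per VRE and choose the matching one for each call.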

You own the information, content, and method you provide to the D4Science Infrastructure through this Gateway under this Agreement, and may request its deletion at any time, unless you have shared information, content, or method with others and they have not deleted it, or it was copied or stored by other users. Additionally, you grant D4Science.org a nonexclusive, worldwide, assignable, sublicensable, fully paid up and royalty-free right to copy, prepare derivative works of, improve, distribute, publish, remove, retain, add, process, analyse, and use in any way now known or in the future discovered, any information, content, or method you explicitly share or publish, directly or indirectly to D4Science.org, including, but not limited to, any user generated content, techniques or data to the services without any further consent, notice and/or compensation to you or to any third parties to serve the purpose of providing the service required by the User. Any information you submit to us is at your own risk of loss as noted in the following sections of this Agreement.

By sharing information, content, or method with us, you represent and warrant that you are entitled to share the information, content, or method and that the information, content, or method is accurate, not confidential, and not in violation of any contractual restrictions or other third-party rights.

You must promptly notify the D4Science.org Team of any breach of security related to the Services, including but not limited to unauthorized use of your password or account. To help ensure the security of your password or account, please sign out from your account at the end of each session.

The D4Science.org Team, hereafter referred to as D4Science.org, may terminate your account in accordance with the terms of service if you fail to log in to your account for a period of SIX months. Moreover, if you are found to be in violation of our policies at any time, as determined by D4Science.org in its sole discretion, we may warn you or suspend or terminate your account.

Action

Conditions of use

Posting Content or Method

You can post content or method to any service made accessible through this Gateway only if (a) you created and own the rights to the content or method or you have the owner's express permission to post the content or method; and (b) the content or method does not infringe any other person's or entity's rights (including the copyrights, trademarks, or privacy rights) or violate any applicable laws, these Terms of Use, the Privacy Policy, our Community Best Practices, or any other posted policies. D4Science.org can remove content or method for any infringing reason.

When you use a service to upload content or register a method into a VRE, you retain the irrevocable, exclusive, worldwide right and license to use, reproduce, modify, display, remix, re-distribute, create derivative works, and syndicate your content or method in any medium and through any service. Only where you have specifically shared, or used a specialized service for sharing content, will your content become part of a derivative work to be licensed under a schema selected by the owner.

Sharing Content or Method

D4Science.org offers sharing and publication at different levels: within a VRE, among different VREs and by publication to data repositories that may be accessible to external users.

Each shared content item is associated with a copyright license. A CC BY-SA license is proposed if the dataset is obtained through a mash-up process from different sources (a true derivative product); any other license (not technically supported by the D4Science Infrastructure) applies in the case of an unaltered product shared with other Users/VREs. In both cases the metadata must be filled in (if not automatically compiled) and a preferred citation must be indicated.

You are responsible for any content or method you post to our Services and the consequences of sharing such content or method with others or the general public. This includes, for example, any personal information, any confidential data or any unofficial data. D4Science.org is not responsible for the consequences of sharing or posting any personal or other information on its services.

When you use a Service that allows users to share, transform, readapt, modify, or combine the content or method with other content or methods, you grant D4Science.org and our Users a non-exclusive, royalty free, worldwide right and license to [use, reproduce, modify, display, remix, perform, distribute, redistribute, adapt, promote, create derivative works, and syndicate] your content or method [in any medium and through any form of technology or distribution] and to permit any derivative works to be licensed under these same license terms.

Any user accessing shared content must behave according to the copyright license of the dataset. However, the D4Science Infrastructure does not control or audit access to and/or use of shared content. It therefore cannot impose a shared policy, but can encourage sharing according to best practices.

Sharing content or method within the D4Science Infrastructure (i.e. sharing through one or more VREs) makes them accessible only to authorized users; they cannot be accessed from outside.

Publishing Content

The VRE authorized Users can publish content and make it available to the public, exposing it to general view with no restrictions except for a registration procedure that may be required. Each published set of content is associated with a copyright license. Published content can be accessed by the general public through this Gateway and through dedicated web applications (e.g. web services).

Secondary Use of the Content

Data accessed through this Gateway do not, and should not, include controls over their end use. However, the content owner or authoritative source for the content must retain version control of content accessed.

Once the content has been downloaded from the Gateway, D4Science.org cannot vouch for its quality and timeliness. D4Science.org cannot vouch for any analyses conducted with the content retrieved from the Gateway.

Derivative Work and Data Citation

Derivative work should only be produced by complying with the terms and conditions established in the license of the used content.

The derivative work is owned by the legitimate copyright holder, identified among the contributors co-owning the new product, who is the contact point for any use or re-use/re-distribution request.

Derivative work may contain content retrieved from the D4Science Infrastructure.

Every content item should include in its metadata at least a preferred citation, a reference to the infrastructure source dataset and its generation date, and the date on which the content was accessed or retrieved from the Gateway or any of its services. If the content is a composition of multiple datasets with different citations, the owner of the content should add these citations to the relevant metadata.

If such derivative work is produced through VREs, the D4Science Infrastructure generates default provenance metadata, which matches references to sources used for the derivative work and their respective citations.

You must clearly state that "D4Science.org cannot vouch for the analyses derived from this content after it has been retrieved from the D4Science Infrastructure".

Table 1: Conditions of Use Summary Table

Gateway Service

The Gateway Service enables registered users to access Gateway services and one or more VREs. The Gateway Service employs the Liferay portal as the portlet-hosting platform. Liferay Portal offers a complete platform for building web apps, mobile apps, and web services quickly, using features and frameworks designed for rapid development, good performance, and ease of use. It runs on all major application servers and servlet containers and is JSR 168 and JSR 286 compliant. Portlets use JSR 286 and several other technologies, such as JavaServer Pages for the dynamic generation of HTML/XML documents in response to a client's request, and the most popular front-end frameworks such as Angular, React or Vue.

With the registration procedure, the user must agree to deposit her/his Personal Profile Data and accept these conditions of use:

·       Among the types of Personal Profile Data that the Gateway collects, by itself or through third parties, there are: Cookies, Usage data, first name, last name and email address.

·       The Personal Profile Data may be freely provided by the User, or, in case of Usage Data, collected automatically when using an Application;

·       All Data requested by the Gateway is mandatory and failure to provide this Data may make it impossible for this Gateway to provide its services. In cases where this Gateway specifically states that some Data is not mandatory, Users are free not to communicate this Data without any consequences on the availability or the functioning of the service;

·       Users who are uncertain about which Personal Data is mandatory are welcome to contact the Provider;

·       Any use of Cookies – or of other tracking tools – by this Gateway or by the owners of third-party services used by this Gateway serves the purpose of providing the service required by the User, in addition to any other purposes described in the present document and in the Cookie Policy, if available;

·       Users are responsible for any third-party Personal Data obtained, published or shared through this Gateway and confirm that they have the third party's consent to provide the Data to the Owner;

·       D4Science.org, i.e. the Data Controller, processes the Data of Users in a proper manner and shall take appropriate security measures to prevent unauthorized access, disclosure, modification, or unauthorized destruction of the Data;

·       The Data processing is carried out using computers and/or IT enabled tools, following organizational procedures and modes strictly related to the purposes indicated. In addition to D4Science.org, in some cases, the Data may be accessible to certain types of persons in charge, involved with the operation of the site (administration, legal, system administration) or external parties (such as third-party technical service providers, email carriers, hosting providers, communications agencies) appointed, if necessary, as Data Processors by D4Science.org. The updated list of these parties may be requested from D4Science.org at any time;

·       The Data is processed at the D4Science.org operating offices and in any other places where the parties involved with the processing are located. For further information, please contact D4Science.org;

·       The Data is kept for the time necessary to provide the service requested by the User, or stated by the purposes outlined in this document, and the User can always request that D4Science.org suspend or remove the data;

·       The Data concerning the User is collected to allow D4Science.org to provide its services, as well as for the following purposes: Interaction with external social networks and platforms, Contacting the User and Analytics;

·       The Data concerning the access to the services, a.k.a. access log, is kept for the minimal time necessary to provide the service, time that is set to ONE month. This retention period may be modified by D4Science.org without further notice;

·       The Data concerning the exploitation of the services, a.k.a. accounting data, is kept for the minimal time necessary to provide the service, time that is set to ONE year. After this period, the Data is aggregated and anonymized. This retention period may be modified by D4Science.org without further notice.

Workspace Service

The Shared Workspace is an online environment to support secure and controlled data storage and sharing.

This facility relies on Apache Jackrabbit for storing and managing workspace item representations. Item payloads are stored by relying on a hybrid cloud storage solution that, by means of ad-hoc plugins, exploits various storage solutions suitable for diverse typologies of content, e.g. MongoDB for binary files.

Every workspace item is

·       equipped with an actionable unique identifier that can be used for citation and access purposes;

·       versioned and a new version is automatically produced whenever the item is explicitly changed by a user or any application/service of the VRE on behalf of an authorised user;

·       equipped with rich and extensible business metadata that capture descriptive features as well as lineage features;

·       organised in folders that can be

o   private: content is available only for its owner;

o   shared: content is available for selected users (decided by the owner);

o   VRE folder: content is available to VRE members.

The Workspace is tightly integrated with both the social networking and the catalogue for easing the dissemination of its artefacts.

Data Managers can exploit the Shared Workspace while keeping accounting and traceability. Individual scientists can easily share data, experiments, and derived data products with selected colleagues. Working groups are empowered with an easy-to-use environment where operations are traced and accounted for.

A backup and recovery strategy protects against data loss incidents. These can take a variety of forms but, for a workspace technology, they generally fall into two categories: catastrophic failure and human error. Although catastrophic failure includes natural disasters and any other scenario that permanently destroys all the server nodes powering the Workspace service, most issues are caused by human error. Humans introduce application bugs or accidentally delete data. A bad code release that corrupts some or all of the production data is an unfortunate but common example. In the case of human error, the errors introduced will propagate automatically to the replicas, often within seconds.

The Workspace service provides continuous, online backup for data as a fully managed service. The service streams encrypted and compressed oplog data to a dedicated server so that a continuous backup is activated. By default, the Workspace takes snapshots every 6 hours, and oplog data is retained for at least 24 hours and, in any case, until all the servers of the storage cluster are properly synchronized.
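
The default schedule above can be expressed as simple date arithmetic. The following Python sketch is illustrative only: the function names, the fixed snapshot start time, and the idea of checking a point in time against the retention window are assumptions, not a description of how the Workspace service performs a restore.

```python
from datetime import datetime, timedelta

SNAPSHOT_INTERVAL = timedelta(hours=6)  # default snapshot frequency stated above
OPLOG_RETENTION = timedelta(hours=24)   # minimum oplog retention stated above


def latest_snapshot_at_or_before(target: datetime, first_snapshot: datetime) -> datetime:
    """Return the most recent scheduled snapshot time not later than target."""
    if target < first_snapshot:
        raise ValueError("target precedes the first snapshot")
    intervals = (target - first_snapshot) // SNAPSHOT_INTERVAL
    return first_snapshot + intervals * SNAPSHOT_INTERVAL


def within_oplog_window(target: datetime, now: datetime) -> bool:
    """True if target falls inside the guaranteed minimum oplog retention window."""
    return now - OPLOG_RETENTION <= target <= now


# Hypothetical timestamps, for illustration only.
first = datetime(2020, 1, 1, 0, 0)
now = datetime(2020, 1, 2, 10, 0)
target = datetime(2020, 1, 1, 15, 30)
print(latest_snapshot_at_or_before(target, first))  # 2020-01-01 12:00:00
print(within_oplog_window(target, now))             # True
```

Since oplog data may be retained for longer than 24 hours, the window check is a lower bound on what is available rather than an exact limit.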

By accessing the Workspace service, the user shall be deemed to accept these conditions of use:

·       The user may use the Workspace to store, retrieve, query, serve, and execute content that is owned, licensed or lawfully obtained by the user;

·       The user can post content to the Workspace only if (a) the user created and owns the rights to the content or the user has the owner's express permission to post the content; and (b) the content does not infringe any other person's or entity's rights (including the copyrights, trademarks, or privacy rights) or violate any applicable laws, the D4Science Terms of Use, Privacy Policy, or any other posted policies;

·       The ownership of any intellectual property rights is not in any way transferred to D4Science.org;

·       If the user decides to share data with other Users, the user remains responsible for any misuse of the data by other Users;

·       At any time, the user can decide to unregister her/his data. All requests are immediately accepted and processed by the D4Science Infrastructure;

·       The D4Science Infrastructure can reproduce, modify, and generate derivative works (such as those resulting from translations, adaptations or other changes required by the management capabilities of the D4Science Infrastructure) to serve the purpose of providing the service required by the User;

·       The D4Science Infrastructure will not otherwise move or distribute the user's data for any purpose, except when required to do so by law;

·       D4Science.org is not responsible for the data uploaded and hosted by the D4Science Infrastructure;

·       D4Science.org will not be responsible for any issue regarding intellectual property rights infringement or illegal use of the user's data;

·       The D4Science Infrastructure will make reasonable efforts to ensure that data are persisted. In the event of hardware or software failures caused by the failure of a hard drive or power supply, the D4Science Infrastructure will make reasonable attempts to restore the user's data. No guarantee whatsoever is provided on the success of any user's data recovery;

·       If D4Science.org reasonably believes that any of the Content violates the law or infringes or misappropriates the rights of any third party (Prohibited Content), the user will be notified and may be requested to remove such content from the Workspace or to disable access to it. If the user does not remove or disable access to the Prohibited Content within ONE business day of the notice, the D4Science Infrastructure manager may remove or disable access to the Prohibited Content or suspend access to it;

·       The user can use the data other users share with her/him in any circumstance and for all usage, reproduce the data, modify the data, and make derivative data based upon the original data, communicate to the public, including the right to reproduce or display the data or copies thereof to the public and perform publicly, as the case may be, the data;

·       The user cannot in any circumstance and for any usage redistribute the data other users share with her/him or copies thereof, lend and rent those data or copies thereof, sub-license rights in the data or copies thereof.

Social Networking Collaborative Platform

The Social Networking Collaborative Platform enables a comprehensive and collaborative environment to support sharing and collaboration among users. It resembles a social networking environment with posts, tags, mentions, comments and reactions, yet its integration with the other services makes it a powerful and flexible communication channel for researchers.

Social Networking Service

Communication among Virtual Research Environment users relies on the Social Networking Engine, which uses a Cassandra database for storing social networking data and Elasticsearch for retrieving it. The Engine exposes its facilities through an HTTPS REST interface and comprises two services: (i) the Social Networking Service, which efficiently stores and accesses social networking data (Posts, Comments, Notifications, etc.) in the underlying Cassandra cluster, and (ii) the Social Networking Indexer Service, which builds Elasticsearch indices to perform search operations over the social networking data.

There is no predefined way to structure a discussion; users can start new discussion threads, annotate them with tags for easing the cataloguing and discovery, refer to other threads and material both internally stored and available on the web. Every (re)action performed by a user – be it a new post, a reply to a post, or the rating of a certain post or post reply – is carefully captured, documented, and equipped with an actionable unique identifier that can be used for citation and access purposes.

Individual scientists can easily share ideas, comments, and suggestions and communicate with other members of the community. Trainers can easily share data, algorithms, and technologies and communicate with the classroom. Trainees can easily access data and technologies in a controlled environment where experiments and tests can be performed, traced and documented.

By exploiting the Social Networking Service any user shall be deemed to accept these conditions of use:

·       Content can be shared only with the members of a single Virtual Research Environment;

·       Ideas the user posts and information the user shares may be seen and used by other users, and D4Science.org cannot guarantee that other Users will not use the ideas and information that any user may share. Therefore, if the user has an idea or information that she/he would like to keep confidential and/or does not want others to use, or that is subject to third party rights that may be infringed by sharing it, that user shall not post it to the Gateway;

·       D4Science.org is not responsible for a user's misuse or misappropriation of any content or information any user posts on the Gateway Social Service;

·       D4Science.org is not responsible for the updates/comments that Users can post on the Social Networking Service;

·       If the user removes any updates/comments, the D4Science Infrastructure cannot guarantee that other Users do not hold a copy of the removed content.

Messaging Service

The Social Networking Collaborative Platform implements a Messaging Service to enable the exchange of messages exclusively among registered users of the D4Science Infrastructure.

This facility relies on the Social Networking service and on Apache Cassandra for storing, indexing and managing messages.

Any user of the D4Science Infrastructure Messaging Service is subject to the policies reported below. In addition to any other violations described in the D4Science.org terms of use, the user may not:

·       Generate or facilitate unsolicited commercial messages;

·       Send unsolicited messages to significant numbers of VRE users, whether individuals or entities, with whom the user has no pre-existing relationship;

·       Send, upload, distribute or disseminate or offer to do the same with respect to any unlawful, defamatory, harassing, abusive, fraudulent, infringing, obscene, or otherwise objectionable content;

·       Transmit content that may be harmful to minors;

·       Illegally transmit another's intellectual property or other proprietary information without such owner's or licensor's permission;

·       Promote or encourage illegal activity;

·       Use the Messaging Service in connection with illegal peer-to-peer file sharing.

Notification Service

For the purpose of facilitating the user interaction, the Social Networking Collaborative Platform includes a Notification Service. The Notification Service can be configured from the Gateway to send notifications to the email address associated with the user account.

By exploiting the Notification Service, the user shall be deemed to accept these conditions of use:

·       The user agrees that the Gateway may send communication by e-mail;

·       The user can review the Notification settings to enable or disable email notifications;

·       The user acknowledges and agrees that D4Science.org shall have no liability associated with or arising from failure to do so, including, but not limited to, user failures to receive important notifications.

Catalogue Service

The Publication Platform is a comprehensive environment to support data harmonization and publication. It resembles a catalogue of artefacts with search and browse, yet its openness with respect to the typologies of products published and the metadata to document them, as well as its integration with the other services, make it a flexible environment. Every published item in the catalogue is characterised by (i) a type, which highlights its features and allows an easier search, (ii) an open-ended set of metadata which carefully describe the item, and (iii) optional resource(s) representing the actual payload of the item.

This platform primarily relies on CKAN technology, i.e. an open source software enabling users to build and operate open data catalogues. This core technology has been wrapped and extended by means of the Catalogue Service, a component realizing the business logic of the publishing platform. The Catalogue Service enacts the management of Catalogue Items. A Catalogue Portlet, accessible in each VRE, allows navigating the catalogue content as well as publishing content by exploiting the Publishing Widget. This widget is also embedded into the Workspace portlet, so users can publish folders and/or files directly from there. External services can access the catalogue content and publish new items via the gCube Catalogue RESTful APIs. The Catalogue Service relies on the Workspace for storing the payload of the published items.

Each catalogue item type carefully defines the metadata elements characterising the item typology by specifying the names of the attributes, the possible values, and whether an attribute is a single instance or a repeatable one. In addition to that, each item type contains directives on how to exploit attributes for item organisation purposes, e.g. automatically transforming values into tags or exploiting the values for creating collections or groups of items. Moreover, the following properties apply:

·       every catalogue item is equipped with an actionable, persistent, unique identifier (namely a PURL) that can be used for citation and access purposes;

·       whenever a catalogue item is published, the associated payload(s) is stored in a persistent and unchangeable storage area to guarantee its long-term availability;

·       every catalogue item is equipped with a license carefully characterising the possible (re-)uses;

·       every publication of an item may lead (according to the VRE configuration selected by the VRE Manager) to the automatic production of a post in the social networking area of the VRE to inform its members;

·       every catalogue item is equipped with rich and open metadata, i.e. it is possible to carefully customise the typologies of products and the accompanying metadata to the community needs;

·       catalogue contents (item's metadata and resources) are made available for consumption by clients by the RESTful API as well as by other standard APIs, e.g. DCAT and OAI-PMH.
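
As an example of consuming catalogue contents through one of the standard APIs listed above, the following Python sketch composes OAI-PMH request URLs. The verbs and the `oai_dc` metadata prefix come from the OAI-PMH protocol itself; the base URL and the helper `oai_pmh_url` are hypothetical, since the actual endpoint of a given catalogue is not stated in this document.

```python
from urllib.parse import urlencode

# The six request verbs defined by the OAI-PMH protocol.
OAI_PMH_VERBS = {
    "Identify",
    "ListMetadataFormats",
    "ListSets",
    "ListIdentifiers",
    "ListRecords",
    "GetRecord",
}


def oai_pmh_url(base_url: str, verb: str, **arguments: str) -> str:
    """Compose an OAI-PMH request URL for the given verb and arguments."""
    if verb not in OAI_PMH_VERBS:
        raise ValueError(f"Unknown OAI-PMH verb: {verb}")
    return f"{base_url}?{urlencode({'verb': verb, **arguments})}"


# Hypothetical base URL, for illustration only.
url = oai_pmh_url("https://catalogue.example.org/oai", "ListRecords", metadataPrefix="oai_dc")
print(url)  # https://catalogue.example.org/oai?verb=ListRecords&metadataPrefix=oai_dc
```

A harvester would fetch this URL over HTTPS and parse the XML response; the license attached to each catalogue item still governs any re-use of the harvested metadata and resources.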

Data Managers extend current practices with controlled and standard data publication, with the aim of enlarging access to data products while maintaining full ownership and control over the sharing of results. Individual scientists collaborate with selected colleagues via standard formats and protocols, from the conception, definition, and validation of a result through to its sharing. Working groups normalize the exchange of data on activities, enlarge the audience of their products, and manage users and data for the working group.

The D4Science Infrastructure offers data sharing at different levels: within a Catalogue accessible only in the VRE, accessible in different VREs and publicly accessible via the Gateway service.

By accessing the Catalogue Service, the user shall be deemed to accept these conditions of use:

·       Any user can post content, data and/or publication, to the D4Science Infrastructure only if (a) the user created and owns the rights to the content or the user has the owner's express permission to post the content; and (b) the content does not infringe any other person's or entity's rights (including the copyrights, trademarks, or privacy rights) or violate any applicable laws, the Terms of Use, and the Privacy Policy. D4Science.org can remove content for any infringing reason;

·       Any user is responsible for any content the user publishes and the consequences of publishing such content with others or the general public. This includes, for example, any personal information, any confidential data or any unofficial data. D4Science.org is not responsible for the consequences of sharing or posting any personal or other information on its services;

·       When any user uses a service to publish content into a VRE, any other user of the VRE retains the irrevocable, exclusive, worldwide right and license to use, reproduce, modify, display, remix, re-distribute, create derivative works, and syndicate the content in any medium and through any service. Only where this other user has specifically published, or used a specialized service for publishing data, the data become part of a derivative work to be licensed under a schema selected by the owner;

·       Each published set of data is associated with a copyright license: a CC BY-SA license if the dataset is obtained through a mash-up process from different sources (a true derivative product), or any other license (not technically supported by D4Science.org) in the case of an unaltered product shared with other Users/VREs. In both cases the metadata must be filled in (if not automatically compiled) and a preferred citation must be indicated;

·       When any user uses a Service that provides the capabilities to publish content, the user grants D4Science.org a non-exclusive, royalty free, worldwide right and license to [use, reproduce, modify, display, remix, perform, distribute, redistribute, adapt, promote, create derivative works, and syndicate] the content [in any medium and through any form of technology or distribution] and to permit any derivative works to be licensed under these same license terms;

·       Any user accessing published data must behave according to the copyright license of the dataset. However, D4Science.org cannot control or audit access to and / or audit use of shared data. It can therefore not impose a shared data policy, but can encourage sharing according to best practices;

·       Publishing data within the D4Science Infrastructure, i.e. publishing through one or more VREs, makes them accessible only to authorized users; they cannot be accessed from outside, with the exception of the metadata describing the data, which may become accessible to anonymous users;

·       The VRE authorized Users can publish data and make them available to the public, exposing them to general view with no restrictions except for a registration procedure. Each published set of data is associated with a copyright license;

·       Data accessed through the Gateway do not, and should not, include controls over their end use. However, the data owner or authoritative source for the data must retain version control of datasets accessed;

·       Once the data have been downloaded from the Gateway, D4Science.org cannot vouch for their quality and timeliness;

·       D4Science.org cannot vouch for any analyses conducted with data retrieved from the Gateway;

·       Derivative work should only be produced by complying with the terms and conditions established in the license of the used dataset(s);

·       The derivative work is owned by the legitimate copyright holder identified among the contributors co-owning the new product, who is the contact point for any use or re-use/re-distribution request;

·       Derivative work may contain data retrieved from the D4Science Infrastructure;

·       Every dataset should include in its metadata at least a preferred citation, a reference to the infrastructure source dataset and its generation date, and the date that data were accessed or retrieved from the Gateway or any of its services. If the dataset was a composition of multiple datasets with different citations, the owner of the dataset should have added that to the relevant metadata;

·       If such derivative work is produced through VREs, the D4Science Infrastructure generates a default citation, which matches references to sources used for the derivative work and their respective citations;

·       Any user must clearly state "D4Science.org cannot vouch for the data or analyses derived from these data after the data have been retrieved from the D4Science Infrastructure".
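
The metadata and licensing conditions above lend themselves to a mechanical check before publication. The following is a minimal sketch, assuming a dataset's metadata is held in a plain dictionary; the field names (`preferred_citation`, `source_dataset`, `generation_date`, `access_date`) and the sample values are hypothetical and do not reflect the actual Catalogue schema.

```python
# Hypothetical field names; the actual Catalogue metadata schema may differ.
REQUIRED_FIELDS = (
    "preferred_citation",  # how the dataset should be cited
    "source_dataset",      # reference to the infrastructure source dataset
    "generation_date",     # when the source dataset was generated
    "access_date",         # when the data were accessed or retrieved
)


def missing_metadata(metadata):
    """Return the required fields that are absent or empty."""
    return [field for field in REQUIRED_FIELDS if not metadata.get(field)]


def expected_license(is_mashup, declared_license):
    """Mash-up (derivative) datasets carry CC BY-SA; unaltered ones keep their own license."""
    return "CC BY-SA" if is_mashup else declared_license


# Illustrative record only; all values are placeholders.
record = {
    "preferred_citation": "Example Working Group. Example dataset.",
    "source_dataset": "example-source-id",
    "generation_date": "2019-12-01",
    "access_date": "",  # left empty on purpose
}
print(missing_metadata(record))
print(expected_license(True, "CC BY"))
```

A publisher could run such a check on the user side before submitting; it does not replace the metadata compilation performed by the infrastructure.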

Virtual Research Environments (VREs)

In the D4Science Infrastructure, services and related access to data are available through Virtual Research Environments (VREs). These VREs can take various forms, including interactive web user interfaces, web applications, and pluggable standalone user interfaces.

VREs can be created for a specific period and a specific task, and only authorized users can access the data and services exposed through these VREs.

There are different levels of user authorization available: VRE Manager, VRE Designer, VRE Data Manager and VRE User:

·       VRE Designers select the services/data that are made available in the VRE;

·       VRE Managers have the right to grant access to other users and to assign roles;

·       VRE Data Managers have the right to register/unregister datasets and make them available to all Users of the VRE;

·       VRE Users can upload and manage data within dedicated VRE(s) according to their authorization level.
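
The authorization levels listed above can be pictured as a mapping from roles to permissions. This is only an illustrative sketch: the permission names are invented here, and it assumes that a user may hold several roles at once and that roles do not inherit from one another, neither of which is stated in these terms.

```python
from enum import Enum, auto


class Permission(Enum):
    """Hypothetical permission names derived from the role descriptions."""
    SELECT_SERVICES = auto()    # choose the services/data exposed in the VRE
    GRANT_ACCESS = auto()       # admit users and assign roles
    REGISTER_DATASETS = auto()  # register/unregister datasets for all VRE users
    MANAGE_OWN_DATA = auto()    # upload and manage data within the VRE


ROLE_PERMISSIONS = {
    "VRE Designer": {Permission.SELECT_SERVICES},
    "VRE Manager": {Permission.GRANT_ACCESS},
    "VRE Data Manager": {Permission.REGISTER_DATASETS},
    "VRE User": {Permission.MANAGE_OWN_DATA},
}


def is_allowed(roles, permission):
    """True if at least one of the user's roles grants the permission."""
    return any(permission in ROLE_PERMISSIONS.get(role, set()) for role in roles)


print(is_allowed(["VRE User", "VRE Data Manager"], Permission.REGISTER_DATASETS))
print(is_allowed(["VRE User"], Permission.GRANT_ACCESS))
```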

Users can apply to a Moderated or a Public Virtual Research Environment:

·       Access to the Moderated VREs is restricted. The acceptance of an application issued by a user is solely under the responsibility of the VRE Manager, who has to verify that the user belongs to the community operating the VRE.

·       Access to the Public VRE is regulated by an automatic procedure that grants access to any user who applies for it.

Each VRE may offer access to a Data Analytics Platform to perform data analysis; a Tabular Data Manager Platform to collate, harmonize and manage tabular data; a Spatial Data Infrastructure set of services to publish, discover, access, and manage vector and raster georeferenced datasets; a NLPHub to orchestrate and combine several state-of-the-art text mining services that recognize spatiotemporal events, keywords, and a large set of named entities; and to third-party software such as ShareLaTeX, ShinyProxy, and RStudio.

By exploiting any of the VRE services, the user shall be deemed to accept the conditions of use reported in the following sections.

Data Analytics Platform

The Data Analytics Platform is a comprehensive and collaborative environment to perform statistical data analysis. It resembles a stand-alone analytics platform, e.g. Weka, with a collection of ready-to-use algorithms and methods, yet it relies on a distributed and heterogeneous computing infrastructure enacting the execution of complex tasks.

The Data Analytics Platform is composed of four main software components: the DataMiner Master, the DataMiner Worker, the Algorithm Importer, and the Algorithm Publisher:

  • The DataMiner Master is a web service that orchestrates (distributes, monitors, collects) the processes to the DataMiner Worker(s). Occasionally, the DataMiner Master can execute non-partitionable algorithms locally. The Master is conceived to work in a cluster of replica services operating behind a proxy acting as load balancer. It is offered by a standard web-based protocol, i.e. OGC WPS.
  • The DataMiner Worker is a web service in charge of executing the processes it is assigned to by a Master. The service is conceived to work in a cluster of replica services and is offered by a standard web-based protocol, i.e. OGC WPS.
  • The Algorithm Importer is a Web Application enabling users to inject new algorithms into the platform by using various programming languages such as R, Java, Python, PHP, and more. Newly injected algorithms are private, meaning that only the user can access and execute them. That user can easily either share the code and the algorithm with her/his colleagues or make the new algorithm executable by all users of the VRE.
  • The Algorithm Publisher is a web service in charge of deploying new algorithms on the cluster of DataMiner Workers.
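
Since the Master and Worker services are offered through OGC WPS, a process can in principle be invoked with a standard WPS Execute request. The sketch below builds such a request using the WPS 1.0.0 key-value-pair encoding; the WPS version, the endpoint URL, the process identifier, and the input names are all assumptions made for illustration, and any authorization token a VRE may require is omitted.

```python
from urllib.parse import parse_qs, urlencode, urlparse


def build_wps_execute_url(endpoint, process_id, inputs):
    """Build a WPS 1.0.0 Execute URL using key-value-pair (KVP) encoding."""
    # In the KVP encoding, DataInputs is a ';'-separated list of name=value pairs.
    data_inputs = ";".join(f"{name}={value}" for name, value in inputs.items())
    query = urlencode({
        "service": "WPS",
        "version": "1.0.0",
        "request": "Execute",
        "identifier": process_id,
        "datainputs": data_inputs,
    })
    return f"{endpoint}?{query}"


# Hypothetical endpoint, process identifier, and inputs.
url = build_wps_execute_url(
    "https://example.org/wps/WebProcessingService",
    "org.example.algorithms.MY_ALGORITHM",
    {"InputTable": "table-1", "Threshold": "0.5"},
)
print(url)
print(parse_qs(urlparse(url).query)["datainputs"])
```

`urlencode` percent-encodes the `=` and `;` characters inside the `datainputs` value, so the pair list survives as a single query parameter.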

Non-partitionable algorithms are directly executed on a DataMiner Master instance and possibly use parallel processing on several cores and a large amount of memory. Distributed algorithms use distributed computing with a Map-Reduce approach.
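
To make the Map-Reduce idea concrete, here is a toy sketch in which a "master" function partitions the input, "workers" compute partial results in parallel, and the master combines them. Local threads stand in for DataMiner Workers purely for illustration; the function names are hypothetical and this is not the platform's actual scheduling logic.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def partition(items, n_parts):
    """Split items into n_parts contiguous chunks of near-equal size."""
    size, remainder = divmod(len(items), n_parts)
    chunks, start = [], 0
    for i in range(n_parts):
        end = start + size + (1 if i < remainder else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def map_chunk(chunk):
    """Worker step: partial (sum, count) for one chunk."""
    return (sum(chunk), len(chunk))


def combine(a, b):
    """Reduce step: merge two partial (sum, count) results."""
    return (a[0] + b[0], a[1] + b[1])


def distributed_mean(values, n_workers=4):
    """Master step: partition, dispatch to workers, reduce. Expects non-empty input."""
    chunks = partition(values, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(map_chunk, chunks))
    total, count = reduce(combine, partials, (0, 0))
    return total / count


print(distributed_mean(list(range(1, 11))))
```

The mean is a convenient example because it decomposes into partial sums and counts; an algorithm whose result cannot be assembled from independent partial results is what the text calls non-partitionable.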

Both Master and Worker services are conceived to be deployed and operated by relying on various providers, e.g. Master and Worker instances can be deployed on private or public cloud providers.

The Data Analytics Platform is not a typical Cloud computational engine: in addition to the capability to optimally distribute processes to a cluster of workers, it offers the following features that effectively enable open science:

  • an actionable unique identifier for every process hosted by the platform, which can be used for citation and access purposes;
  • the offering and publication of user-provided processes (e.g. scripts, compiled programs) by an as-a-Service standard-based approach (processes are described and exposed by the OGC WPS);
  • the ability to manage and support processes produced by using several programming languages (e.g. R, Java, Fortran, Python);
  • the automatic production of a detailed provenance record for every analytics task executed by the platform, i.e. the overall input/output data, parameters, and metadata that would allow the task to be reproduced and repeated are stored into the workspace and documented by a PROV-O-based accompanying record;
  • integration with the shared workspace to implement collaborative experimental spaces, e.g. users can easily share datasets, methods, code;
  • extensibility of the platform to quasi-transparently rely on and adapt to a distributed, heterogeneous and elastically provided array of workers to execute the processing tasks.

Software Developers can easily test, validate, and run algorithms developed in a large spectrum of software languages on a distributed infrastructure that appears to them as a single coherent system, effectively hiding the different technologies and security frameworks. IT Managers can offer a scalable (on demand), controlled (by maximum quota of exploitable resources per user/per day), and secure environment in which to execute reproducible data analytics algorithms. Data Managers select the appropriate data analytical models among the wide spectrum of models offered, with the aim of offering an easy-to-use data dashboard. Individual scientists have access to a wide spectrum of available data analytical models and can easily test, validate, and run algorithms developed in a large spectrum of software languages. Working groups can configure and enrich the data dashboard with the aim of supporting the activities of a working group while hiding the complexity of a powerful, controlled, and secure e-infrastructure that can scale according to the evolving needs of the group.

By exploiting the Data Analytics Platform, the user shall be deemed to accept these conditions of use:

·       The user may use the Data Analytics Platform to execute processes that are available in a VRE;

·       The user can register a new method to the Data Analytics Platform only if (a) the user created and owns the rights to the software method or the user has the owner's express permission to post the software method; and (b) the software method does not infringe any other person's or entity's rights (including the copyrights, trademarks, or privacy rights) or violate any applicable laws, the D4Science Terms of Use, Privacy Policy, or any other posted policies;

·       The ownership of any intellectual property rights is not in any way transferred to D4Science.org;

·       In the case the user decides to enable other Users to execute it by making her/his software method executable in the VRE, the user remains responsible for any misuse of the software method by other Users;

·       At any time, the user can unshare the access to her/his software method code. All requests are immediately accepted and operated by D4Science.org;

·       At any time, the user can issue a D4Science Software Maintenance Ticket to unregister and remove access to her/his executable code. All requests are accepted and operated by D4Science.org within 2 working days;

·       The D4Science Infrastructure can reproduce, modify, and generate derivative works (such as those resulting from translations, adaptations or other changes required by the management capabilities of the D4Science Infrastructure) to serve the purpose of providing the service required by the User;

·       The D4Science Infrastructure will not otherwise use, reuse software methods for any purpose, except when required to do so by law;

·       D4Science.org is not responsible for the software methods uploaded and hosted by the D4Science Infrastructure;

·       D4Science.org will not be responsible for any issue concerning intellectual property rights infringement or illegal use of user's software methods;

·       The D4Science Infrastructure will make reasonable efforts to ensure that software methods are persisted. In the rare event of hardware or software failures caused by the failure of a hard drive or power supply, the D4Science Infrastructure will make reasonable attempts to restore any registered software method. No guarantee whatsoever is provided on the success of any user's software method recovery;

·       The D4Science Infrastructure will make reasonable efforts to ensure that the analytics methods integrated into the infrastructure remain accessible and executable for the longest time possible without human intervention from the process owner. Whenever human intervention from the process owner is needed because of reasons including security aspects and technology evolution, a notification to the owner with a request for changes will be issued by the D4Science Infrastructure. If no reply is received from the method owner within ONE business week from the notification of the request for changes, the access to the method will be disabled and a second notification will be sent as a reminder. If no reply to the reminder is received from the method owner within ONE business week from the notification, the method will be discontinued. This operation period may be modified by D4Science.org without further notice;

·       If D4Science.org reasonably believes any software method violates the law, infringes or misappropriates the rights of any third party, the user will be notified that such software method must be removed from the Workspace or access to it disabled. If the user does not remove or disable access to the Prohibited software method within ONE business day of notice, the D4Science infrastructure manager may remove or disable access to the software method or block access to it;

·       The user can use the software method other users share with her/him in any circumstance and for all usage, exploit it in workflows, communicate to the public, including the right to reproduce or display the software method or copies thereof to the public and perform publicly, as the case may be, the software method;

·       The user cannot in any circumstance and for any usage redistribute the software method other users share with her/him or copies thereof, lend and rent that software method or copies thereof, sub-license rights in the software method or copies thereof.
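
The notification timeline in the conditions above (a request for changes, access disabled after one business week without reply, discontinuation after a further business week) can be worked out as simple date arithmetic. This sketch assumes that one business week means five working days from Monday to Friday, that public holidays are ignored, and that the reminder is sent on the day access is disabled; none of these details are specified in the terms.

```python
from datetime import date, timedelta

BUSINESS_WEEK = 5  # assumption: one business week = five working days (Mon-Fri)


def add_business_days(start, days):
    """Return the date reached after `days` working days, skipping weekends."""
    current, remaining = start, days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # Monday=0 ... Friday=4
            remaining -= 1
    return current


def method_lifecycle(notified_on):
    """Dates on which access is disabled and the method discontinued, absent any reply."""
    disabled_on = add_business_days(notified_on, BUSINESS_WEEK)
    discontinued_on = add_business_days(disabled_on, BUSINESS_WEEK)
    return {
        "notified": notified_on,
        "access_disabled": disabled_on,
        "discontinued": discontinued_on,
    }


# Example: a request for changes issued on Monday 6 January 2020.
for step, day in method_lifecycle(date(2020, 1, 6)).items():
    print(step, day.isoformat())
```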

Tabular Data Manager Platform

The Tabular Data Manager Platform provides users with a collaborative toolset to collate, harmonize and manage tabular data (e.g. time series, code lists). The harmonization includes conversion from private formats to international formats such as SDMX and access to an environment where R programming can be used.

The Tabular Data Manager Platform is composed of three main software components:

  • Tabular Data Portlet: human centric web application. Allows management of the Tabular Data Service system on a per-user basis, allowing invocation of tabular data service methods and additional functionality with external apps.
  • Tabular Data Service: main component of the tabular data architecture. It exposes several remote interfaces covering different areas of functionality.
    • Operation orchestrator: The operation orchestrator is a component that receives incoming call requests from the service interface and unwinds them into a sequence of operation calls. The orchestrator may enforce policies and command automatic operations/validations according to its configuration.
    • Operation modules: Operation modules are classes that bring functionality to the tabular data service. This functionality can be reached directly with invocations on the Operation Interface. Operation modules can work directly with data on the Data back-end or leverage the cube manager in order to create/clone tables or modify table metadata.
    • Cube Manager: The cube manager is the lowest level component of the service stack. Its main responsibilities are managing the creation/modification of tables (and their metadata) and acting as a registry for all the created tables, allowing retrieval of table metadata.
  • Data/Metadata back-end: This is where raw table data and metadata are stored and where the service keeps its management data. It relies on a PostgreSQL database, opportunely federated in a cluster to ensure scalability and availability.
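
The relationship between the operation orchestrator and the operation modules described above can be sketched as follows. This is a simplified illustration under stated assumptions: tables are plain lists of dictionaries, a request is a list of named steps, and the class, function, and operation names are all hypothetical rather than the service's real interfaces.

```python
class Orchestrator:
    """Unwinds an incoming request into a sequence of operation-module calls."""

    def __init__(self):
        self._modules = {}

    def register(self, name, module):
        """Register an operation module (a callable taking a table and parameters)."""
        self._modules[name] = module

    def run(self, table, request):
        # Policy check before anything executes: every step must name a known module.
        unknown = [step["op"] for step in request if step["op"] not in self._modules]
        if unknown:
            raise ValueError(f"unknown operations: {unknown}")
        for step in request:
            table = self._modules[step["op"]](table, step.get("params", {}))
        return table


def drop_empty(table, params):
    """Operation module: remove rows whose given column is missing or empty."""
    column = params["column"]
    return [row for row in table if row.get(column) not in (None, "")]


def rename_column(table, params):
    """Operation module: rename one column in every row."""
    old, new = params["old"], params["new"]
    return [{(new if key == old else key): value for key, value in row.items()}
            for row in table]


orchestrator = Orchestrator()
orchestrator.register("drop_empty", drop_empty)
orchestrator.register("rename_column", rename_column)

rows = [{"code": "A", "val": 1}, {"code": "", "val": 2}]
result = orchestrator.run(rows, [
    {"op": "drop_empty", "params": {"column": "code"}},
    {"op": "rename_column", "params": {"old": "val", "new": "value"}},
])
print(result)
```

Validating the whole request before running the first step is one possible reading of "enforce policies"; it avoids leaving a table half-transformed when a later step is invalid.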

IT Managers are provided with a scalable and controlled environment for teams working on data quality control, data harmonization, and data analysis and mining. Data Managers can exploit a scalable and controlled environment where data provenance and traceability are integrated features of the data dashboard. Individual scientists can import, validate, and harmonize data in order to make them suitable for analytical methods accessible through D4Science or locally. Working groups can support large and distributed teams requiring a common trusted environment ensuring traceability of the operations, roll-back, and supervision before publication.

By exploiting the Tabular Data Manager (TDM) Platform, the user shall be deemed to accept these conditions of use:

·       The user may use the TDM to store, retrieve, query, serve, and execute content that is owned, licensed or lawfully obtained by the user;

·       The user can post content to the TDM only if (a) the user created and owns the rights to the content or the user has the owner's express permission to post the content; and (b) the content does not infringe any other person's or entity's rights (including the copyrights, trademarks, or privacy rights) or violate any applicable laws, the D4Science Terms of Use, Privacy Policy, or any other posted policies;

·       The ownership of any intellectual property rights is not in any way transferred to D4Science.org;

·       In the case the user decides to share it with other Users, the user remains responsible for any misuse of the data by other Users;

·       In the case the user decides to publish the content into the SDMX Registry provided by D4Science.org, the user remains responsible for any misuse of the data by other Users;

·       At any time, the user can decide to unregister her/his data. All requests are immediately accepted and operated by the D4Science Infrastructure;

·       The D4Science Infrastructure can reproduce, modify, and generate derivative works (such as those resulting from translations, adaptations or other changes required by the management capabilities of the D4Science Infrastructure) to serve the purpose of providing the service required by the User;

·       The D4Science Infrastructure will not otherwise move or distribute the user's data for any purpose, except when required to do so by law;

·       D4Science.org is not responsible for the data uploaded and hosted by the infrastructure;

·       D4Science.org will not be responsible for any issue concerning intellectual property rights infringement or illegal use of user's data;

·       The D4Science Infrastructure will make reasonable efforts to ensure that data are persisted. In the event of hardware or software failures caused by the failure of a hard drive or power supply, the D4Science Infrastructure will make reasonable attempts to restore the user's data. No guarantee whatsoever is provided on the success of any user's data recovery;

·       If D4Science.org reasonably believes any of the Content violates the law, infringes or misappropriates the rights of any third party, the user will be notified and may request that such content be removed from the Workspace or access to it be disabled. If the user does not remove or disable access to the Prohibited Content within ONE business day of the notice, the D4Science Infrastructure manager may remove or disable access to the Prohibited Content or suspend its access;

·       The user can use the data other users share with her/him in any circumstance and for all usage, reproduce the data, modify the data, and make derivative data based upon the original data, communicate to the public, including the right to reproduce or display the data or copies thereof to the public and perform publicly, as the case may be, the data;

·       The user cannot in any circumstance and for any usage redistribute the data other users share with her/him or copies thereof, lend and rent those data or copies thereof, sub-license rights in the data or copies thereof.

Spatial Data Infrastructure (SDI) Service

The Spatial Data Infrastructure (SDI) services provide users with the capability to store, discover, access, and manage vector and raster georeferenced datasets.

The SDI exploits the following technologies: GeoServer equipped with PostgreSQL and PostGIS, GeoNetwork, Thredds. All the exploited technologies are deployed to ensure fault-tolerance and load-balancing. Moreover, they are designed to scale to the limit established and reported in this SLA. All datasets can be described using the standard ISO 19115, which defines what information should exist in the metadata document, and represented in ISO 19139, which defines how metadata conforming to ISO 19115 have to be stored in XML format.
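
To illustrate the split between ISO 19115 (what information is recorded) and ISO 19139 (how it is written in XML), the sketch below emits the opening elements of an ISO 19139 record using the standard `gmd` and `gco` namespaces. It is deliberately incomplete and would not pass schema validation, since mandatory parts such as the contact, date stamp, and identification information are omitted; the identifier value is a placeholder.

```python
import xml.etree.ElementTree as ET

# Namespaces used by the ISO 19139 XML encoding.
GMD = "http://www.isotc211.org/2005/gmd"
GCO = "http://www.isotc211.org/2005/gco"
ET.register_namespace("gmd", GMD)
ET.register_namespace("gco", GCO)


def char_string(parent, tag, text):
    """Append <gmd:tag><gco:CharacterString>text</gco:CharacterString></gmd:tag>."""
    wrapper = ET.SubElement(parent, f"{{{GMD}}}{tag}")
    ET.SubElement(wrapper, f"{{{GCO}}}CharacterString").text = text
    return wrapper


def minimal_record(identifier, language_code):
    """Build only the first two elements of a gmd:MD_Metadata document."""
    root = ET.Element(f"{{{GMD}}}MD_Metadata")
    char_string(root, "fileIdentifier", identifier)
    char_string(root, "language", language_code)
    return root


record = minimal_record("example-dataset-id", "eng")  # placeholder values
print(ET.tostring(record, encoding="unicode"))
```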

The D4Science Infrastructure ensures that the SDI services are integrated with the other services. In particular, it is worth reporting the following:

·       It is possible to publish content in Thredds by simply uploading that content in special folders provided by the Workspace service.

·       It is possible to generate and/or simply validate the ISO 19139 metadata before publishing them into GeoNetwork. It is also possible to validate compliance with the INSPIRE directive if requested.

·       It is possible to use any viewer compliant with the OGC standards, such as the MapViewer.

·       It is possible to use PgAdmin as a tool to connect to the PostgreSQL database exploited by the GeoServer instances by issuing an explicit personal request that has to be approved by the VRE Manager and enforced by D4Science.org.

Software Developers can exploit powerful OGC-based spatial management services. IT Managers can find a coherent OGC-compliant SDI for data storage, access and processing. Data Managers can exploit a comprehensive overview of spatial info and processes. Individual scientists have on-demand spatial data and processing capabilities at their fingertips. Working groups can develop a comprehensive understanding of the socio-economic state of and pressure on a specific spatial area.

By exploiting the SDI services, the user shall be deemed to accept these conditions of use:

·       Any user can publish metadata and/or data, hereafter generically identified as content, to the SDI only if (a) the user created and owns the rights to the content or the user has the owner's express permission to post the content; and (b) the content does not infringe any other person's or entity's rights (including the copyrights, trademarks, or privacy rights) or violate any applicable laws, the Terms of Use, and the Privacy Policy. D4Science.org can remove content for any infringing reason;

·       Any user is responsible for any content the user publishes and the consequences of publishing such content with others or the general public. This includes, for example, any personal information, any confidential data or any unofficial data. D4Science.org is not responsible for the consequences of publishing any personal or other information on its services;

·       When any user uses a service to publish content through an SDI service, any other user retains the irrevocable, exclusive, worldwide right and license to use, reproduce, modify, display, remix, re-distribute, create derivative works, and syndicate the content in any medium and through any service. Only where this other user has specifically published, or used a specialized service for publishing data, the data become part of a derivative work to be licensed under a schema selected by the owner;

·       Each published set of data is associated with a copyright license: a CC BY-SA license if the dataset is obtained through a mash-up process from different sources (a true derivative product), or any other license (not technically supported by the D4Science Infrastructure) in the case of an unaltered product published with other Users/VREs. In both cases the metadata must be filled in (if not automatically compiled) and a preferred citation must be indicated;

·       When any user uses a Service that provides the capabilities to publish, transform, readapt, modify, or combine user content with other content, the user grants D4Science.org a non-exclusive, royalty free, worldwide right and license to [use, reproduce, modify, display, remix, perform, distribute, redistribute, adapt, promote, create derivative works, and syndicate] the content [in any medium and through any form of technology or distribution] and to permit any derivative works to be licensed under these same license terms;

·       Any user accessing published data must behave according to the copyright license of the dataset. However, the SDI services cannot control or audit access to and / or audit use of published data. They can therefore not impose a data policy, but can encourage the reuse of data according to best practices;

·       Publishing data through the SDI services will enable anonymous user access. An explicit action has to be performed at publication time to make the data accessible only to authorized users;

·       The VRE authorized Users can publish data and make them available to the public, exposing them to general view with no restrictions except for a registration procedure that may be required by a specific configuration activated in a VRE;

·       Published data can be accessed by the general public through the Gateway and through dedicated web applications (e.g. web services);

·       Data accessed through the Gateway do not, and should not, include controls over their end use. However, the data owner or authoritative source for the data must retain version control of datasets accessed;

·       Once the data have been downloaded from the Gateway, D4Science.org cannot vouch for their quality and timeliness;

·       D4Science.org cannot vouch for any analyses conducted with data retrieved from the Gateway;

·       Derivative work should only be produced by complying with the terms and conditions established in the license of the used dataset(s);

·       The derivative work is owned by the legitimate copyright holder identified among the contributors co-owning the new product, who is the contact point for any use or re-use/re-distribution request;

·       Derivative work may contain data retrieved from the D4Science Infrastructure;

·       The metadata of every dataset should include at least a preferred citation, a reference to the infrastructure source dataset and its generation date, and the date that data were accessed or retrieved from the Gateway or any of its services. If the dataset was a composition of multiple datasets with different citations, the owner of the dataset should have added that to the relevant metadata;

·       Any user must clearly state "D4Science.org cannot vouch for the data or analyses derived from these data after the data have been retrieved from the D4Science Infrastructure".

NLPHub

NLPHub is a distributed system that orchestrates and combines several state-of-the-art text mining services that recognize spatiotemporal events, keywords, and a large set of named entities. NLPHub integrates: (i) the Stanford CoreNLP software for English, German, French, and Spanish; (ii) the Italian NLP Tool (Tint) for Italian; (iii) Apache OpenNLP toolkit that includes methods for language detection, tokenization, part-of-speech tagging, morphological parsing, and named entity recognition; (iv) ItaliaNLP that includes methods for part-of-speech tagging, tokenization, morphological parsing, lemmatization, named entity recognition, clustering, words similarity assessment, and sentiment analysis; (v) NewsReader event recognizer; (vi) Keywords NEW that produces tag clouds of words and nouns. For all of them the access is offered as-a-service.

An orchestrator algorithm (AMERGE), implemented as a DataMiner process, receives a user-provided input text, along with the indication of the text language (optionally), and a set of annotations to extract (selected among those supported for that language).

NLPHub is exploitable with anonymous access only via its graphical user interface. By exploiting the user interface, a user may upload a text file and obtain an annotated text. Programmatic exploitation of the NLPHub is possible only by joining a VRE offering it as a service.

By exploiting the NLPHub, the user shall be deemed to accept these conditions of use:

·       The user agrees that the infrastructure may log, audit, and reboot the service;

·       The user acknowledges and agrees that D4Science.org shall have no liability associated with or arising from failure to provide the service, including, but not limited to, failures to emit availability notifications;

·       The ownership of any intellectual property rights generated by exploiting this service is not in any way transferred to D4Science.org;

·       If D4Science.org reasonably believes any of the content generated through NLPHub violates the law, infringes or misappropriates the rights of any third party, the user will be notified and may request that such content be removed from the infrastructure. If the user does not remove or disable access to the prohibited content within ONE business day of the notice, the D4Science Infrastructure manager may remove or disable access to that content or suspend its access.

ShareLaTeX

ShareLaTeX is an online LaTeX editor that allows real-time collaboration and online compiling of projects to PDF format. In comparison to other LaTeX editors, ShareLaTeX is a server-based application, which is accessed through a web browser. On July 20, 2017, ShareLaTeX was acquired by Overleaf. Overleaf plans to continue ShareLaTeX under the brand Overleaf v2, which was in beta testing until the 4th of September 2018. The provision of the ShareLaTeX service is therefore limited to the availability of the open source license, and it will be guaranteed by D4Science.org as long as Overleaf does not change that license.

By exploiting ShareLaTeX, the user shall be deemed to accept these conditions of use:

·       The user agrees that the infrastructure may log, audit, and reboot the service;

·       The user acknowledges and agrees that D4Science.org shall have no liability associated with or arising from failure to provide the service, including, but not limited to, failures to emit availability notifications;

·       The ownership of any intellectual property rights generated by exploiting this service is not in any way transferred to D4Science.org;

·       At any time, the user can decide to unpublish content made available through ShareLaTeX. All requests are immediately accepted and operated by the D4Science Infrastructure;

·       If D4Science.org reasonably believes any of the content made available through ShareLaTeX violates the law, infringes or misappropriates the rights of any third party, the user will be notified and may request that such content be removed from the infrastructure. If the user does not remove or disable access to the prohibited content within ONE business day of the notice, the D4Science Infrastructure manager may remove or disable access to that content or suspend its access.

RShiny Application

The user delivers the Shiny application to D4Science.org as a complete Shiny application on the Docker Registry[1].

The user is responsible for building the standard Docker image as described in the ShinyProxy documentation: writing the Dockerfile, building the Docker image, and providing the configuration for ShinyProxy.

D4Science.org makes the Shiny application accessible on the D4Science infrastructure and ensures its operation and orchestration in a cluster. The cluster is composed of multiple Docker hosts that run in parallel and act as managers and workers.

By using the RShiny Application, the user accepts the following conditions of use:

·       The user agrees that the infrastructure may log, audit, and reboot the Shiny application;

·       The user acknowledges and agrees that D4Science.org shall have no liability associated with or arising from failure to provide the service, including, but not limited to, failures to issue availability notifications;

·       The ownership of any intellectual property rights generated by using this application is not in any way transferred to D4Science.org;

·       At any time, the user can remove any uploaded Shiny application. All requests are immediately accepted and processed by the D4Science Infrastructure;

·       If D4Science.org reasonably believes that any application uploaded through this service violates the law or infringes or misappropriates the rights of any third party, D4Science.org will notify the user and may request that such application be removed from the infrastructure. If the user does not remove or disable access to the prohibited application within ONE business day of the notice, the D4Science Infrastructure manager may remove or disable access to the prohibited application or suspend its access.

RStudio Application

RStudio allows users to perform online statistical analyses with the R software for statistical computing.

The user is responsible for implementing proper R scripts as described in the RStudio documentation.

The D4Science Infrastructure makes the RStudio Application accessible and ensures its operation and orchestration in a cluster. The cluster is composed of multiple hosts, each of which is assigned exclusively to one user for an entire online session. At the end of the session, all the content stored on that host may be removed by the D4Science Infrastructure. All the data, scripts, and other resources that the user needs to persist must be stored in the Workspace, which is accessible through the RStudio Application.

By using the RStudio Application, the user accepts the following conditions of use:

·       The user agrees that the infrastructure may log, audit, and reboot the RStudio application;

·       The user acknowledges and agrees that D4Science.org shall have no liability associated with or arising from failure to provide the service, including, but not limited to, failures to issue availability notifications;

·       The ownership of any intellectual property rights generated by using this application is not in any way transferred to D4Science.org;

·       If D4Science.org reasonably believes that any data and/or script uploaded through this service violates the law or infringes or misappropriates the rights of any third party, D4Science.org will notify the user and may request that such data and/or script be removed from the infrastructure. If the user does not remove or disable access to the prohibited data and/or script within ONE business day of the notice, the D4Science Infrastructure manager may remove or disable access to the prohibited data and/or script or suspend its access.

 [1] https://hub.docker.com