Datasets are the third-level, and therefore optional, resource for organizing stored data.
A Dataset cannot exist on its own: creating one requires a Project or a Collection as its parent.
Datasets should be used to group objects that are closely related to each other into a logical unit and to describe them with additional metadata.
If you don't know how to create a Project, read the previous chapter about the Project API basics.
If you don't know how to create a Collection, read the previous chapter about the Collection API basics, which closely mirror the Project API.
Create Dataset
API example for creating a new Dataset.
Required permissions
This request requires at least APPEND permission on the parent resource in which the Dataset is to be created.
# Native JSON request to create a simple Dataset
curl -d '
  {
    "name": "json-api-dataset",
    "title": "JSON API Dataset",
    "description": "Created with JSON over HTTP.",
    "keyValues": [],
    "relations": [],
    "dataClass": "DATA_CLASS_PUBLIC",
    "projectId": "<project-id>",
    "collectionId": "<collection-id>",
    "metadataLicenseTag": "CC-BY-4.0",
    "defaultDataLicenseTag": "CC-BY-4.0",
    "authors": []
  }' \
     -H 'Authorization: Bearer <AUTH_TOKEN>' \
     -H 'Content-Type: application/json' \
     -X POST 'https://<URL-to-Aruna-instance-API-endpoint>/v2/datasets'
// Create tonic/ArunaAPI request to create a Dataset
let request = CreateDatasetRequest {
    name: "rust-api-dataset".to_string(),
    title: "Rust API Dataset".to_string(),
    description: "Created with the gRPC Rust API client.".to_string(),
    key_values: vec![],
    relations: vec![],
    data_class: DataClass::Public as i32,
    metadata_license_tag: Some("CC-BY-4.0".to_string()),
    default_data_license_tag: Some("CC-BY-4.0".to_string()),
    parent: Some(Parent::ProjectId("<project-id>".to_string())),
    authors: vec![],
};

// Send the request to the Aruna instance gRPC endpoint
let response = dataset_client
    .create_dataset(request)
    .await
    .unwrap()
    .into_inner();

// Do something with the response
println!("{:#?}", response);
# Create tonic/ArunaAPI request to create a new Dataset
request = CreateDatasetRequest(
    name="python-api-dataset",
    title="Python API Dataset",
    description="Created with the gRPC Python API client.",
    key_values=[],
    relations=[],
    data_class=DataClass.DATA_CLASS_PUBLIC,
    project_id="<project-id>",
    collection_id="<collection-id>",
    metadata_license_tag="CC-BY-4.0",
    default_data_license_tag="CC-BY-4.0",
    authors=[]
)

# Send the request to the Aruna instance gRPC endpoint
response = client.dataset_client.CreateDataset(request=request)

# Do something with the response
print(f'{response}')
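The JSON request for creating a Dataset can also be assembled programmatically. The following is a minimal sketch using only the Python standard library; it builds the request without sending it. The helper name `build_create_dataset_request` and the base URL passed to it are hypothetical, and the sketch assumes that exactly one parent ID (`projectId` or `collectionId`) is sent, mirroring the single `parent` field in the Rust example.

```python
import json
import urllib.request


def build_create_dataset_request(api_base, token, name, title, description,
                                 parent_key, parent_id):
    """Build (but do not send) a POST request for <api_base>/v2/datasets.

    Hypothetical helper: parent_key must be "projectId" or "collectionId",
    so that exactly one parent is included (an assumption of this sketch).
    """
    if parent_key not in ("projectId", "collectionId"):
        raise ValueError("parent_key must be 'projectId' or 'collectionId'")

    # Field names follow the JSON example of this section
    payload = {
        "name": name,
        "title": title,
        "description": description,
        "keyValues": [],
        "relations": [],
        "dataClass": "DATA_CLASS_PUBLIC",
        parent_key: parent_id,
        "metadataLicenseTag": "CC-BY-4.0",
        "defaultDataLicenseTag": "CC-BY-4.0",
        "authors": [],
    }

    return urllib.request.Request(
        url=f"{api_base}/v2/datasets",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

The returned object can be passed to `urllib.request.urlopen` once the placeholders are replaced with real values.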
Get Dataset(s)
API examples of how to fetch information for one or multiple existing Dataset(s).
Required permissions
This request requires at least READ permissions on the Dataset or one of its parent resources.
# Native JSON request to fetch information of a Dataset
curl -H 'Authorization: Bearer <AUTH_TOKEN>' \
     -H 'Content-Type: application/json' \
     -X GET 'https://<URL-to-Aruna-instance-API-endpoint>/v2/datasets/{dataset-id}'

# Native JSON request to fetch information of multiple Datasets
curl -H 'Authorization: Bearer <AUTH_TOKEN>' \
     -H 'Content-Type: application/json' \
     -X GET 'https://<URL-to-Aruna-instance-API-endpoint>/v2/datasets?datasetIds=dataset-id-01&datasetIds=dataset-id-02'
// Create tonic/ArunaAPI request to fetch information of a Dataset
let request = GetDatasetRequest {
    dataset_id: "<dataset-id>".to_string(),
};

// Send the request to the Aruna instance gRPC endpoint
let response = dataset_client
    .get_dataset(request)
    .await
    .unwrap()
    .into_inner();

// Do something with the response
println!("{:#?}", response);

// Create tonic/ArunaAPI request to fetch information of multiple Datasets
let request = GetDatasetsRequest {
    dataset_ids: vec![
        "<dataset-id-01>".to_string(),
        "<dataset-id-02>".to_string(),
        "<...>".to_string(),
    ],
};

// Send the request to the Aruna instance gRPC endpoint
let response = dataset_client
    .get_datasets(request)
    .await
    .unwrap()
    .into_inner();

// Do something with the response
println!("{:#?}", response);
# Create tonic/ArunaAPI request to fetch information of a Dataset
request = GetDatasetRequest(
    dataset_id="<dataset-id>"
)

# Send the request to the Aruna instance gRPC endpoint
response = client.dataset_client.GetDataset(request=request)

# Do something with the response
print(f'{response}')

# Create tonic/ArunaAPI request to fetch information of multiple Datasets
request = GetDatasetsRequest(
    dataset_ids=[
        "<dataset-id-01>",
        "<dataset-id-02>",
        "<...>"
    ]
)

# Send the request to the Aruna instance gRPC endpoint
response = client.dataset_client.GetDatasets(request=request)

# Do something with the response
print(f'{response}')
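The JSON request for multiple Datasets repeats the `datasetIds` query parameter once per ID. As a minimal sketch, such a URL can be generated from a list with the Python standard library; the function name `datasets_query_url` and the base URL in the usage note are hypothetical placeholders.

```python
from urllib.parse import urlencode


def datasets_query_url(api_base, dataset_ids):
    """Return the GET URL for fetching several Datasets at once.

    doseq=True expands the list into one 'datasetIds=<id>' pair per entry.
    """
    query = urlencode({"datasetIds": dataset_ids}, doseq=True)
    return f"{api_base}/v2/datasets?{query}"
```

For example, `datasets_query_url("https://aruna.example.org", ["dataset-id-01", "dataset-id-02"])` yields a URL with the same query string as the curl example for multiple Datasets.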
Update Dataset
API examples of how to update individual metadata of an existing Dataset.
Required permissions
Name update needs at least WRITE permissions on the specific Dataset or one of its parent resources
Description update needs at least WRITE permissions on the specific Dataset or one of its parent resources
KeyValue update needs at least WRITE permissions on the specific Dataset or one of its parent resources
Dataclass update needs at least WRITE permissions on the specific Dataset or one of its parent resources
# Native JSON request to update the name of a Dataset
curl -d '
  {
    "name": "updated-json-api-dataset"
  }' \
     -H 'Authorization: Bearer <AUTH_TOKEN>' \
     -H 'Content-Type: application/json' \
     -X PATCH 'https://<URL-to-Aruna-instance-API-endpoint>/v2/datasets/{dataset-id}/name'

# Native JSON request to update the title of a Dataset
curl -d '
  {
    "title": "Updated JSON API Dataset"
  }' \
     -H 'Authorization: Bearer <AUTH_TOKEN>' \
     -H 'Content-Type: application/json' \
     -X PATCH 'https://<URL-to-Aruna-instance-API-endpoint>/v2/datasets/{dataset-id}/title'

# Native JSON request to update the description of a Dataset
curl -d '
  {
    "description": "Updated with JSON over HTTP."
  }' \
     -H 'Authorization: Bearer <AUTH_TOKEN>' \
     -H 'Content-Type: application/json' \
     -X PATCH 'https://<URL-to-Aruna-instance-API-endpoint>/v2/datasets/{dataset-id}/description'

# Native JSON request to update the key-values associated with a Dataset
curl -d '
  {
    "addKeyValues": [],
    "removeKeyValues": []
  }' \
     -H 'Authorization: Bearer <AUTH_TOKEN>' \
     -H 'Content-Type: application/json' \
     -X PATCH 'https://<URL-to-Aruna-instance-API-endpoint>/v2/datasets/{dataset-id}/key_values'
Info
Dataclass can only be relaxed: Confidential > Workspace > Private > Public
# Native JSON request to update the dataclass of a Dataset
curl -d '
  {
    "dataClass": "DATA_CLASS_PUBLIC"
  }' \
     -H 'Authorization: Bearer <AUTH_TOKEN>' \
     -H 'Content-Type: application/json' \
     -X PATCH 'https://<URL-to-Aruna-instance-API-endpoint>/v2/datasets/{dataset-id}/data_class'
# Native JSON request to update the license of a Dataset
curl -d '
  {
    "metadataLicenseTag": "CC0",
    "defaultDataLicenseTag": "CC0"
  }' \
     -H 'Authorization: Bearer <AUTH_TOKEN>' \
     -H 'Content-Type: application/json' \
     -X PATCH 'https://<URL-to-Aruna-instance-API-endpoint>/v2/datasets/{dataset-id}/licenses'
# Native JSON request to add an author to a Dataset
curl -d '
  {
    "addAuthors": [
      {
        "firstName": "John",
        "lastName": "Doe",
        "email": "john.doe@example.com",
        "orcid": "0000-0002-1825-0097",
        "id": "<user-id-if-registered>"
      }
    ],
    "removeAuthors": []
  }' \
     -H 'Authorization: Bearer <AUTH_TOKEN>' \
     -H 'Content-Type: application/json' \
     -X PATCH 'https://<URL-to-Aruna-instance-API-endpoint>/v2/datasets/{dataset-id}/authors'
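The JSON update requests share one pattern: a PATCH against `/v2/datasets/{dataset-id}/<field>` with a small JSON body. A minimal sketch of that pattern with the Python standard library follows; the helper name `build_update_request` is hypothetical, the request is only built and not sent, and the sketch assumes the authors endpoint follows the same path pattern as the other update endpoints.

```python
import json
import urllib.request

# Path suffixes taken from the JSON update examples of this section
UPDATE_FIELDS = {
    "name", "title", "description", "key_values",
    "data_class", "licenses", "authors",
}


def build_update_request(api_base, token, dataset_id, field, body):
    """Build (but do not send) a PATCH request for one Dataset update endpoint."""
    if field not in UPDATE_FIELDS:
        raise ValueError(f"unknown update field: {field}")

    return urllib.request.Request(
        url=f"{api_base}/v2/datasets/{dataset_id}/{field}",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="PATCH",
    )
```

For example, `build_update_request(base, token, "<dataset-id>", "name", {"name": "updated-json-api-dataset"})` corresponds to the name update shown with curl.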
// Create tonic/ArunaAPI request to update the name of a Dataset
let request = UpdateDatasetNameRequest {
    dataset_id: "<dataset-id>".to_string(),
    name: "updated-rust-api-dataset".to_string(),
};

// Send the request to the Aruna instance gRPC endpoint
let response = dataset_client
    .update_dataset_name(request)
    .await
    .unwrap()
    .into_inner();

// Do something with the response
println!("{:#?}", response);

// Create tonic/ArunaAPI request to update the title of a Dataset
let request = UpdateDatasetTitleRequest {
    dataset_id: "<dataset-id>".to_string(),
    title: "Updated Rust API Dataset".to_string(),
};

// Send the request to the Aruna instance gRPC endpoint
let response = dataset_client
    .update_dataset_title(request)
    .await
    .unwrap()
    .into_inner();

// Do something with the response
println!("{:#?}", response);

// Create tonic/ArunaAPI request to update the description of a Dataset
let request = UpdateDatasetDescriptionRequest {
    dataset_id: "<dataset-id>".to_string(),
    description: "Updated with the gRPC Rust API client.".to_string(),
};

// Send the request to the Aruna instance gRPC endpoint
let response = dataset_client
    .update_dataset_description(request)
    .await
    .unwrap()
    .into_inner();

// Do something with the response
println!("{:#?}", response);

// Create tonic/ArunaAPI request to update the key-values associated with a Dataset
let request = UpdateDatasetKeyValuesRequest {
    dataset_id: "<dataset-id>".to_string(),
    add_key_values: vec![],
    remove_key_values: vec![],
};

// Send the request to the Aruna instance gRPC endpoint
let response = dataset_client
    .update_dataset_key_values(request)
    .await
    .unwrap()
    .into_inner();

// Do something with the response
println!("{:#?}", response);
Info
Dataclass can only be relaxed: Confidential > Private > Public
// Create tonic/ArunaAPI request to update the dataclass of a Dataset
let request = UpdateDatasetDataClassRequest {
    dataset_id: "<dataset-id>".to_string(),
    data_class: DataClass::Public as i32,
};

// Send the request to the Aruna instance gRPC endpoint
let response = dataset_client
    .update_dataset_data_class(request)
    .await
    .unwrap()
    .into_inner();

// Do something with the response
println!("{:#?}", response);
// Create tonic/ArunaAPI request to update the licenses of a Dataset
let request = UpdateDatasetLicensesRequest {
    dataset_id: "<dataset-id>".to_string(),
    metadata_license_tag: "CC0".to_string(),
    default_data_license_tag: "CC0".to_string(),
};

// Send the request to the Aruna instance gRPC endpoint
let response = dataset_client
    .update_dataset_licenses(request)
    .await
    .unwrap()
    .into_inner();

// Do something with the response
println!("{:#?}", response);

// Create tonic/ArunaAPI request to add an author to a Dataset
let request = UpdateDatasetAuthorsRequest {
    dataset_id: "<dataset-id>".to_string(),
    add_authors: vec![Author {
        first_name: "John".to_string(),
        last_name: "Doe".to_string(),
        email: "john.doe@example.com".to_string(),
        orcid: "0000-0002-1825-0097".to_string(),
        id: "<user-id-if-registered>".to_string(),
    }],
    remove_authors: vec![],
};

// Send the request to the Aruna instance gRPC endpoint
let response = dataset_client
    .update_dataset_authors(request)
    .await
    .unwrap()
    .into_inner();

// Do something with the response
println!("{:#?}", response);
# Create tonic/ArunaAPI request to update the name of a Dataset
request = UpdateDatasetNameRequest(
    dataset_id="<dataset-id>",
    name="updated-python-api-dataset"
)

# Send the request to the Aruna instance gRPC endpoint
response = client.dataset_client.UpdateDatasetName(request=request)

# Do something with the response
print(f'{response}')
# Create tonic/ArunaAPI request to update the title of a Dataset
request = UpdateDatasetTitleRequest(
    dataset_id="<dataset-id>",
    title="Updated Python API Dataset"
)

# Send the request to the Aruna instance gRPC endpoint
response = client.dataset_client.UpdateDatasetTitle(request=request)

# Do something with the response
print(f'{response}')

# Create tonic/ArunaAPI request to update the description of a Dataset
request = UpdateDatasetDescriptionRequest(
    dataset_id="<dataset-id>",
    description="Updated with the gRPC Python API client"
)

# Send the request to the Aruna instance gRPC endpoint
response = client.dataset_client.UpdateDatasetDescription(request=request)

# Do something with the response
print(f'{response}')

# Create tonic/ArunaAPI request to update the key-values associated with a Dataset
request = UpdateDatasetKeyValuesRequest(
    dataset_id="<dataset-id>",
    add_key_values=[],
    remove_key_values=[]
)

# Send the request to the Aruna instance gRPC endpoint
response = client.dataset_client.UpdateDatasetKeyValues(request=request)

# Do something with the response
print(f'{response}')
Info
Dataclass can only be relaxed: Confidential > Private > Public
# Create tonic/ArunaAPI request to relax the data_class of a Dataset
request = UpdateDatasetDataClassRequest(
    dataset_id="<dataset-id>",
    data_class=DataClass.DATA_CLASS_PUBLIC
)

# Send the request to the Aruna instance gRPC endpoint
response = client.dataset_client.UpdateDatasetDataClass(request=request)

# Do something with the response
print(f'{response}')
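The rule that a data class can only be relaxed can be checked on the client before sending an update. The sketch below is illustrative only: the function name `is_relaxation` is hypothetical, the labels are the plain names used in the note rather than API enum values, and the default order is the three-level one given above. Since one note in this chapter also lists Workspace between Confidential and Private, the order is a parameter and should be verified against the Aruna instance in use.

```python
# Most restrictive first, as listed in the note (assumption of this sketch)
DEFAULT_ORDER = ("Confidential", "Private", "Public")


def is_relaxation(current, target, order=DEFAULT_ORDER):
    """Return True if moving from `current` to `target` loosens the data class.

    A later position in `order` means a less restrictive class; an unchanged
    class is not counted as a relaxation.
    """
    return order.index(target) > order.index(current)
```

With the defaults, `is_relaxation("Confidential", "Public")` is `True` and `is_relaxation("Public", "Private")` is `False`; an unknown label raises `ValueError` from `index`.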
# Create tonic/ArunaAPI request to update the licenses of a Dataset
request = UpdateDatasetLicensesRequest(
    dataset_id="<dataset-id>",
    metadata_license_tag="CC0",
    default_data_license_tag="CC0"
)

# Send the request to the Aruna instance gRPC endpoint
response = client.dataset_client.UpdateDatasetLicenses(request=request)

# Do something with the response
print(f'{response}')
# Create tonic/ArunaAPI request to add an author to a Dataset
request = UpdateDatasetAuthorsRequest(
    dataset_id="<dataset-id>",
    add_authors=[Author(
        first_name="John",
        last_name="Doe",
        email="john.doe@example.com",
        orcid="0000-0002-1825-0097",
        id="<user-id-if-registered>"
    )],
    remove_authors=[]
)

# Send the request to the Aruna instance gRPC endpoint
response = client.dataset_client.UpdateDatasetAuthors(request=request)

# Do something with the response
print(f'{response}')
Snapshot Dataset
API examples of how to snapshot a Dataset, i.e. create an immutable clone of the Dataset and its underlying resources.
Required permissions
This request requires at least ADMIN permissions on the Dataset or one of its parent resources.
# Native JSON request to snapshot a Dataset
curl -H 'Authorization: Bearer <AUTH_TOKEN>' \
     -H 'Content-Type: application/json' \
     -X POST 'https://<URL-to-Aruna-instance-API-endpoint>/v2/datasets/{dataset-id}/snapshot'
// Create tonic/ArunaAPI request to snapshot a Dataset
let request = SnapshotDatasetRequest {
    dataset_id: "<dataset-id>".to_string(),
};

// Send the request to the Aruna instance gRPC endpoint
let response = dataset_client
    .snapshot_dataset_version(request)
    .await
    .unwrap()
    .into_inner();

// Do something with the response
println!("{:#?}", response);
# Create tonic/ArunaAPI request to snapshot a Dataset
request = SnapshotDatasetRequest(
    dataset_id="<dataset-id>"
)

# Send the request to the Aruna instance gRPC endpoint
response = client.dataset_client.SnapshotDatasetVersion(request=request)

# Do something with the response
print(f'{response}')
Delete Dataset
API examples of how to delete a Dataset.
Info
Deletion does not remove the Dataset from the database, but sets the status of the Dataset and the underlying resources to "DELETED".
Required permissions
This request requires at least ADMIN permissions on the Dataset or one of its parent resources.
# Native JSON request to delete a Dataset
curl -H 'Authorization: Bearer <AUTH_TOKEN>' \
     -H 'Content-Type: application/json' \
     -X DELETE 'https://<URL-to-Aruna-instance-API-endpoint>/v2/datasets/{dataset-id}'
// Create tonic/ArunaAPI request to delete a Dataset
let request = DeleteDatasetRequest {
    dataset_id: "<dataset-id>".to_string(),
};

// Send the request to the Aruna instance gRPC endpoint
let response = dataset_client
    .delete_dataset(request)
    .await
    .unwrap()
    .into_inner();

// Do something with the response
println!("{:#?}", response);
# Create tonic/ArunaAPI request to delete a Dataset
request = DeleteDatasetRequest(
    dataset_id="<dataset-id>"
)

# Send the request to the Aruna instance gRPC endpoint
response = client.dataset_client.DeleteDataset(request=request)

# Do something with the response
print(f'{response}')