The New Dataset form allows you to register a dataset in RexCommand and supply the necessary metadata for governance, risk classification, and approval workflows.
Filling out this form ensures the dataset is properly documented and available to be linked to AI systems for training, testing, or deployment.
Below is a description of each field and its purpose.
Dataset Details – Field Reference
Dataset Name (required)
Provide a clear, descriptive name for the dataset. This will be visible in the inventory and throughout the platform.
Owning Team (required)
Specify the internal team or department responsible for the dataset’s management.
Contact Person (required)
Enter the name or email address of the person to contact with questions or issues regarding this dataset.
Purpose of Use (required)
Describe how the dataset will be used (e.g., model training, evaluation, validation).
Lifecycle Stage
Select the dataset's current lifecycle phase, for example “Training” if it is used during model development.
Source Type
Indicate whether the dataset is internal or external in origin.
Data Sensitivity Classification
Label the dataset according to internal data classification standards (e.g., Public, Internal, Confidential).
Approval Status
Set automatically to “Pending” when the dataset is created. The status updates as the dataset moves through review and approval.
Risk Tier
Assign a risk level to the dataset (e.g., Low, Medium, High) based on its sensitivity, source, or regulatory impact.
Watermarking Technique
Select the method, if any, used to watermark the dataset for traceability or tamper detection.
Source Details
Provide a brief explanation of where the dataset came from (e.g., vendor, in-house collection).
Data Collection Method
Describe how the data was collected (e.g., web scraping, surveys, transactional logs).
Geographic Origin
List the country or region where the data was originally collected or generated.
Volume/Size
Estimate the size of the dataset (e.g., 10 GB, 2 million rows).
Retention Policy
State how long the dataset is retained and under what conditions it is archived or deleted.
Access Controls
Describe how access to the dataset is managed (e.g., role-based access, internal-only).
Encryption
Note whether the dataset is encrypted and, if so, which encryption method or protocol is used.
Sharing Restrictions
List any restrictions on how or with whom the dataset can be shared.
Data Dictionary Link
Include a link to a formal data dictionary or schema definition, if available.
Metadata Description
Provide a summary of what metadata is associated with the dataset.
Known Data Quality Issues
List any known issues, limitations, or inconsistencies in the dataset.
Validation Summary
Summarize any validation steps taken to verify the quality or accuracy of the dataset.
Metadata Available
Toggle this on if metadata has been reviewed and is available for governance use.
Datasheet Available
Toggle this on if a datasheet or formal documentation is available to accompany the dataset.
Data Modalities
Select the type(s) of data contained in the dataset (e.g., text, image, audio). You can also add a custom entry.
Sensitive Data Types Present
Indicate whether the dataset contains any sensitive data types (e.g., PII, biometric, financial).
Compliance Tags
Apply relevant compliance or regulatory tags (e.g., GDPR, HIPAA). You may also enter custom tags.
Downstream Dependencies
List any systems, tools, or workflows that rely on this dataset.
Review Frequency
Specify how often the dataset should be reviewed for compliance or quality (e.g., quarterly, annually).
Monitoring In Place
Indicate whether any monitoring or auditing tools are actively reviewing this dataset.
Notes
The fields marked (required) above (Dataset Name, Owning Team, Contact Person, and Purpose of Use) must be completed before a dataset can be created.
All entered data contributes to compliance readiness and helps teams evaluate dataset appropriateness before use in AI systems.
Once created, the dataset will appear in the Dataset Inventory and can be viewed or edited later.
Click Create Dataset to finalize and register the dataset in the system.
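If you draft dataset entries outside the form (for example in a script or spreadsheet export) before entering them, the required-field rule can be checked ahead of time. The sketch below is only an illustration and is not part of RexCommand: the function name `missing_required_fields`, the dictionary layout, and the sample values are all hypothetical, and the keys simply reuse the form labels marked (required) in this reference.

```python
# Hypothetical pre-submission check; this is NOT a RexCommand API.
# The keys reuse the form labels marked "(required)" in the field reference.
REQUIRED_FIELDS = ("Dataset Name", "Owning Team", "Contact Person", "Purpose of Use")


def missing_required_fields(entry):
    """Return the required form fields that are absent or blank in `entry`."""
    missing = []
    for field in REQUIRED_FIELDS:
        value = entry.get(field)
        # Treat a missing key, None, or whitespace-only text as "not filled in".
        if value is None or not str(value).strip():
            missing.append(field)
    return missing


# Illustrative draft entry; all values are made up for this example.
draft = {
    "Dataset Name": "Support tickets 2023",
    "Owning Team": "Customer Analytics",
    "Contact Person": "",        # left blank, so it is reported as missing
    "Risk Tier": "Medium",       # optional field, ignored by the check
}

print(missing_required_fields(draft))
# prints ['Contact Person', 'Purpose of Use']
```

An empty list means every required field has a value, so the draft is ready to be entered in the form. Optional fields are deliberately ignored by this check.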