Dataset Management
This section covers the steps to create, upload, and manage datasets in Focoos using the SDK.
The focoos library supports multiple dataset formats, so it can be used for a range of machine learning tasks.
In this guide, we will show the following steps:
- 🧬 Dataset format
- 📸 Create dataset
- 📤 Upload data
- 📥 Download your own dataset from Focoos
- 🌍 Download dataset from external sources
- 🗑️ Delete data
- 🚮 Delete dataset
1. Dataset format
The focoos library currently supports three dataset layouts. The supported formats and their folder structures are listed below:
- ROBOFLOW_COCO (Detection, Instance Segmentation):
    root/
        train/
            - _annotations.coco.json
            - img_1.jpg
            - img_2.jpg
        valid/
            - _annotations.coco.json
            - img_3.jpg
            - img_4.jpg
- ROBOFLOW_SEG (Semantic Segmentation):
    root/
        train/
            - _classes.csv (comma separated csv)
            - img_1.jpg
            - img_2.jpg
        valid/
            - _classes.csv (comma separated csv)
            - img_3_mask.png
            - img_4_mask.png
- SUPERVISELY (Semantic Segmentation):
    root/
        train/
            meta.json
            img/
            ann/
            mask/
        valid/
            meta.json
            img/
            ann/
            mask/
Note
More dataset formats will be added soon. If you need support for a specific format, reach out via email at support@focoos.ai.
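A quick local check of the folder structure can catch layout mistakes before any data leaves your machine. The sketch below is not part of the focoos library: the function name `find_missing_entries` and the `REQUIRED_ENTRIES` table are hypothetical helpers, and the required entries are simply the ones shown in the folder structures above. It only checks that the expected files and folders exist, not that their contents are valid.

```python
from pathlib import Path

# Entries each split folder is expected to contain, taken from the folder
# structures listed above. The keys are plain strings used for illustration;
# they are not focoos objects.
REQUIRED_ENTRIES = {
    "ROBOFLOW_COCO": ["_annotations.coco.json"],
    "ROBOFLOW_SEG": ["_classes.csv"],
    "SUPERVISELY": ["meta.json", "img", "ann", "mask"],
}
SPLITS = ("train", "valid")


def find_missing_entries(root, layout):
    """Return the relative paths the layout expects but that do not exist."""
    root = Path(root)
    missing = []
    for split in SPLITS:
        for entry in REQUIRED_ENTRIES[layout]:
            if not (root / split / entry).exists():
                missing.append(f"{split}/{entry}")
    return missing
```

Calling `find_missing_entries("./datasets/my_dataset", "ROBOFLOW_COCO")` returns an empty list when both `train/` and `valid/` contain `_annotations.coco.json`; the path here is only an example.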
2. Create dataset
The focoos library lets you create datasets for specific deep learning tasks, such as object detection and semantic segmentation. The available computer vision tasks are defined in the FocoosTask enumeration. Each dataset must follow a specific structure to be compatible with the Focoos platform. Choose one of the supported layouts described in Dataset format.
Use the following code to create a new dataset:
3. Upload data
Once you've created a dataset, you can upload your data as a ZIP archive from your local folder:
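Before the upload, the local dataset folder has to be packed into a ZIP archive. A minimal sketch using only the Python standard library is shown here. The helper name `zip_dataset` is hypothetical, and the sketch assumes the split folders (`train/`, `valid/`) should sit at the top level of the archive, which the text above does not state explicitly.

```python
import shutil
from pathlib import Path


def zip_dataset(dataset_root, archive_path):
    """Pack the contents of dataset_root into a ZIP file and return its path."""
    dataset_root = Path(dataset_root)
    archive_path = Path(archive_path)
    if not dataset_root.is_dir():
        raise NotADirectoryError(f"{dataset_root} is not a directory")
    # make_archive appends ".zip" itself, so pass the path without its suffix.
    base_name = archive_path.with_suffix("")
    # root_dir makes the archive paths relative to the dataset folder, so
    # train/ and valid/ end up at the top level of the archive.
    created = shutil.make_archive(str(base_name), "zip", root_dir=dataset_root)
    return Path(created)
```

For example, `zip_dataset("./datasets/my_dataset", "./datasets/my_dataset.zip")` writes the archive next to the dataset folder. Keep the archive outside the folder being zipped so that it does not end up inside itself.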
After the upload, you can check the dataset preview using:
Alternatively, you can list all available datasets (both personal and shared):
4. Download your own dataset from the Focoos platform
If you have previously uploaded a dataset to the Focoos platform, you can retrieve it with the following steps. First, list all your datasets to identify the dataset reference:
Once you have the dataset reference, use the following code to download the associated data to a predefined local folder:
5. Download dataset from external sources
You can also download datasets from external sources like Dataset-Ninja (Supervisely) and Roboflow Universe, then upload them to the Focoos platform for use in your projects.
Install the required packages first:
    pip install dataset-tools roboflow
    pip install setuptools
- Dataset Ninja:
    import dataset_tools as dtools

    dtools.download(dataset="dacl10k", dst_dir="./datasets/dataset-ninja/")
- Roboflow:
    import os
    from roboflow import Roboflow

    # Read the API key from the environment instead of hard-coding it
    rf = Roboflow(api_key=os.getenv("ROBOFLOW_API_KEY"))
    project = rf.workspace("roboflow-58fyf").project("rock-paper-scissors-sxsw")
    version = project.version(14)
    # Download this dataset version in COCO format
    dataset = version.download("coco")
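Before uploading a COCO export obtained from an external source, it can be useful to check what the annotation file actually contains. The following sketch is a hypothetical helper, not part of focoos or roboflow. It relies only on the standard COCO top-level keys `images`, `annotations`, and `categories`, and the function name `summarize_coco` is an assumption made for illustration.

```python
import json
from collections import Counter
from pathlib import Path


def summarize_coco(annotation_file):
    """Count images, annotations, and annotations per category name."""
    data = json.loads(Path(annotation_file).read_text(encoding="utf-8"))
    # Map category ids to readable names; unknown ids are grouped together.
    id_to_name = {cat["id"]: cat["name"] for cat in data.get("categories", [])}
    per_category = Counter(
        id_to_name.get(ann["category_id"], "unknown")
        for ann in data.get("annotations", [])
    )
    return {
        "images": len(data.get("images", [])),
        "annotations": len(data.get("annotations", [])),
        "per_category": dict(per_category),
    }
```

Running it on each split's `_annotations.coco.json` gives a quick view of class balance and reveals empty or truncated exports before any upload takes place.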
6. Delete data
If you need to remove specific files from an existing dataset without deleting the entire dataset, you can do so by specifying the filename. This is useful when updating or refining your dataset.
Use the following command:
Warning
This permanently removes the specified file from your dataset on the Focoos platform. Double-check the filename before running the command, because deleted data cannot be recovered.
7. Delete dataset
If you want to remove an entire dataset from the Focoos platform, use the following command:
Warning
Deleting a dataset is irreversible. Once deleted, all data associated with the dataset is permanently lost and cannot be recovered.