Main Concepts#
FocoosModel#
The FocoosModel class is the main interface for working with computer vision models in Focoos. It provides high-level methods for training, testing, inference, and model export, and handles preprocessing and postprocessing automatically.
Key Features#
- End-to-End Inference: Automatic preprocessing and postprocessing
- Training Support: Built-in training pipeline with distributed training support
- Model Export: Export to ONNX and TorchScript formats
- Performance Benchmarking: Built-in latency and throughput measurement
- Hub Integration: Seamless integration with Focoos Hub for model sharing
- Multiple Input Formats: Support for PIL Images, NumPy arrays, and PyTorch tensors
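To make the benchmarking feature more concrete, here is a minimal, generic sketch of how latency and throughput are commonly measured. It is not the library's built-in benchmark: the `benchmark` helper, its parameters, and the `dummy_inference` workload are hypothetical names introduced only for illustration.

```python
import time
from statistics import mean
from typing import Callable, Dict


def benchmark(fn: Callable[[], object], warmup: int = 3, iterations: int = 20) -> Dict[str, float]:
    """Time repeated calls to `fn` and report latency and throughput."""
    if iterations < 1:
        raise ValueError("iterations must be >= 1")
    # Warm-up calls are excluded from timing (caches, lazy initialization).
    for _ in range(warmup):
        fn()
    latencies = []
    for _ in range(iterations):
        start = time.perf_counter()
        fn()
        latencies.append(time.perf_counter() - start)
    mean_s = mean(latencies)
    return {
        "mean_latency_ms": mean_s * 1000.0,
        "max_latency_ms": max(latencies) * 1000.0,
        "throughput_per_s": 1.0 / mean_s if mean_s > 0 else float("inf"),
    }


def dummy_inference() -> int:
    # Stand-in workload (assumption); replace with a real model call.
    return sum(i * i for i in range(10_000))


stats = benchmark(dummy_inference, warmup=2, iterations=10)
print(stats)
```

Excluding warm-up iterations matters because the first calls to a model often include one-time costs that would otherwise distort the mean.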
Loading Strategies#
The primary way to load a model is ModelManager.get() (see ModelManager), which returns a FocoosModel. It selects one of the following loading strategies based on its input:
1. From Focoos Hub#
The Focoos Hub is a cloud-based model repository where you can store, share, and collaborate on models. This method downloads models from the hub and caches them locally, using the hub:// protocol.
When to use: Load models shared by other users, access your own cloud-stored models, or work with models that require authentication.
Requirements: Valid API key for private models, internet connection for initial download.
2. From Model Registry#
The Model Registry contains curated, pretrained models that are immediately available without download. These models are optimized, tested, and ready for production use across various computer vision tasks.
When to use: Start with proven, high-quality pretrained models, baseline experiments, or when you need reliable performance without customization.
Requirements: No internet connection needed; models are bundled with the library.
Available Model Categories:
- Object Detection: fai-detr-l-coco, fai-detr-m-coco, fai-detr-l-obj365
- Instance Segmentation: fai-mf-l-coco-ins, fai-mf-m-coco-ins, fai-mf-s-coco-ins
- Semantic Segmentation: fai-mf-l-ade, fai-mf-m-ade, bisenetformer-l-ade, bisenetformer-m-ade, bisenetformer-s-ade
3. From Local Directory#
Load models from your local filesystem, whether they're custom-trained models or models stored in non-standard locations. This method provides maximum flexibility for local development and deployment scenarios.
When to use: Load custom-trained models, work with locally stored models, integrate with existing model storage systems, or work in offline environments.
Requirements: Valid model directory containing model artifacts (weights, configuration, metadata).
4. From ModelInfo Object#
The ModelInfo class represents comprehensive model metadata including architecture specifications, training configuration, class definitions, and performance metrics. This method provides the most programmatic control over model instantiation.
When to use: Programmatically construct models, work with dynamic configurations, integrate with custom model management systems, or when you need fine-grained control over model instantiation.
Requirements: Properly constructed ModelInfo object with valid configuration parameters.
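The four loading strategies described above amount to a dispatch on the type and shape of the input. The sketch below illustrates that idea only; it is not how ModelManager.get() is implemented. `resolve_strategy`, `ModelInfoSketch`, `REGISTRY_NAMES`, and the order of the checks are all assumptions made for this example.

```python
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Union

# Hypothetical stand-in: a small subset of the registry names listed above.
REGISTRY_NAMES = {"fai-detr-l-coco", "fai-mf-l-ade", "bisenetformer-s-ade"}


@dataclass
class ModelInfoSketch:
    """Hypothetical minimal metadata object (not the library's ModelInfo)."""
    name: str


def resolve_strategy(source: Union[str, ModelInfoSketch]) -> str:
    """Return which loading strategy a given input would select."""
    if isinstance(source, ModelInfoSketch):
        return "model_info"
    if source.startswith("hub://"):
        return "hub"
    if source in REGISTRY_NAMES:
        return "registry"
    if Path(source).is_dir():
        return "local"
    raise ValueError(f"Cannot resolve model source: {source!r}")


with tempfile.TemporaryDirectory() as tmp_dir:
    for candidate in ("hub://some-model-ref", "fai-detr-l-coco", tmp_dir,
                      ModelInfoSketch(name="custom")):
        print(resolve_strategy(candidate))
```

Checking the metadata object first and the filesystem last keeps the cheap, unambiguous cases ahead of the one that touches the disk.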
Predict#
Performs end-to-end inference on input images with automatic preprocessing and postprocessing. The model accepts input images in various formats including:
- PIL Image objects (PIL.Image.Image)
- NumPy arrays (numpy.ndarray)
- PyTorch tensors (torch.Tensor)
The input images are automatically preprocessed to the correct size and format required by the model. After inference, the raw model outputs are postprocessed into a standardized FocoosDetections format that provides easy access to:
- Detected object classes and confidence scores
- Bounding box coordinates
- Segmentation masks (for segmentation models)
- Additional model-specific outputs
This provides a simple, unified interface for running inference regardless of the underlying model architecture or task.
Parameters:
- inputs: Input images in any supported format (PIL.Image.Image, numpy.ndarray, torch.Tensor)
- **kwargs: Additional arguments passed to postprocessing
Returns: FocoosDetections containing detection/segmentation results
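As a rough illustration of working with a standardized result format, the sketch below models one detection as a small record and filters a list of them by confidence. `DetectionSketch` and `filter_by_confidence` are hypothetical; the actual fields and layout of FocoosDetections are not shown here and may differ, and the sample values are made up for the example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class DetectionSketch:
    """Hypothetical record; not the library's FocoosDetections."""
    label: str
    confidence: float
    bbox: Tuple[int, int, int, int]  # assumed (x1, y1, x2, y2) in pixels
    mask: Optional[List[List[int]]] = None  # only for segmentation models


def filter_by_confidence(dets: List[DetectionSketch], threshold: float) -> List[DetectionSketch]:
    """Keep detections at or above `threshold`, highest confidence first."""
    kept = [d for d in dets if d.confidence >= threshold]
    return sorted(kept, key=lambda d: d.confidence, reverse=True)


# Made-up sample values, for illustration only.
results = [
    DetectionSketch("person", 0.91, (10, 20, 110, 220)),
    DetectionSketch("dog", 0.42, (50, 60, 150, 160)),
    DetectionSketch("bicycle", 0.77, (5, 5, 80, 90)),
]
for det in filter_by_confidence(results, threshold=0.5):
    print(det.label, det.confidence, det.bbox)
```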
Training#
Trains the model on the provided datasets. The training function accepts:
- args: Training configuration (TrainerArgs) specifying the main hyperparameters, among which:
  - run_name: Name for the training run
  - output_dir: Name for the output folder
  - num_gpus: Number of GPUs to use (must be >= 1)
  - sync_to_hub: For tracking the experiment on the Focoos Hub
  - batch_size, learning_rate, max_iters and other hyperparameters
- data_train: Training dataset (MapDataset)
- data_val: Validation dataset (MapDataset)
- hub: Optional FocoosHUB instance for experiment tracking
The data can be obtained using the AutoDataset helper.
After training completes, the model has updated weights and can be used for inference or export. The output_dir contains the model metadata (model_info.json) and the PyTorch weights (model_final.pth).
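A quick way to confirm that a training run produced the expected files is to check the output folder for both artifacts. The helper below is a small standard-library sketch: `missing_artifacts` is a hypothetical name, and it assumes the files sit directly in the directory you pass (depending on the setup, this may be a run-specific subfolder).

```python
import tempfile
from pathlib import Path
from typing import List

# File names taken from the text above.
EXPECTED_ARTIFACTS = ("model_info.json", "model_final.pth")


def missing_artifacts(output_dir: str) -> List[str]:
    """Return the expected artifact names that are not present in `output_dir`."""
    root = Path(output_dir)
    return [name for name in EXPECTED_ARTIFACTS if not (root / name).is_file()]


# Demonstration on a temporary folder that contains only the metadata file.
with tempfile.TemporaryDirectory() as tmp_dir:
    (Path(tmp_dir) / "model_info.json").write_text("{}")
    print(missing_artifacts(tmp_dir))
```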
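The configuration fields listed above can be pictured as a small validated record. The sketch below is a stand-in, not the library's TrainerArgs: the class name `TrainerArgsSketch`, every default value, and all checks other than `num_gpus >= 1` are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class TrainerArgsSketch:
    """Hypothetical stand-in for a training configuration."""
    run_name: str
    output_dir: str = "./experiments"  # assumed default
    num_gpus: int = 1
    sync_to_hub: bool = False          # assumed default
    batch_size: int = 16               # assumed default
    learning_rate: float = 1e-4        # assumed default
    max_iters: int = 1000              # assumed default

    def __post_init__(self) -> None:
        # Fail early on values that could never produce a valid run.
        if not self.run_name:
            raise ValueError("run_name must not be empty")
        if self.num_gpus < 1:
            raise ValueError("num_gpus must be >= 1")
        if self.batch_size < 1 or self.max_iters < 1:
            raise ValueError("batch_size and max_iters must be positive")
        if self.learning_rate <= 0:
            raise ValueError("learning_rate must be positive")


args = TrainerArgsSketch(run_name="my-first-run", max_iters=500)
print(args)
```

Validating at construction time surfaces a bad setting before any data is loaded or any GPU is allocated.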
An extensive training tutorial is also available.
Model Export#
Exports the model to different runtime formats for optimized inference. The main function arguments are:
- runtime_type: the target runtime, which must be one of the supported values (see RuntimeType)
- out_dir: the destination folder for the exported model
- image_size: the target image size, as an optional integer
The function returns an InferModel instance for the exported model.
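The constraints on the export arguments can be sketched as a small validation step. This is an illustration only: `RuntimeKindSketch` and its members are placeholders rather than the library's RuntimeType values, and `validate_export_args` is a hypothetical helper, not part of the export function.

```python
from enum import Enum
from pathlib import Path
from typing import Optional, Tuple, Union


class RuntimeKindSketch(Enum):
    # Placeholder members (assumption); see RuntimeType for the real options.
    ONNX = "onnx"
    TORCHSCRIPT = "torchscript"


def validate_export_args(
    runtime_type: Union[str, RuntimeKindSketch],
    out_dir: str,
    image_size: Optional[int] = None,
) -> Tuple[RuntimeKindSketch, Path, Optional[int]]:
    """Normalize export arguments, raising ValueError on invalid input."""
    runtime = RuntimeKindSketch(runtime_type)  # ValueError if unsupported
    if image_size is not None and (not isinstance(image_size, int) or image_size <= 0):
        raise ValueError("image_size must be a positive integer or None")
    return runtime, Path(out_dir), image_size


print(validate_export_args("onnx", "exports", image_size=640))
```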
Infer Model#
The InferModel class represents an optimized model for inference, typically created through the export process of a FocoosModel. It provides a streamlined interface focused on fast and efficient inference while maintaining the same input/output format as the original model.
Key Features#
- Optimized Performance: Models are optimized for the target runtime (e.g., TensorRT, ONNX)
- Consistent Interface: Uses the same input/output format as FocoosModel
- Resource Management: Proper cleanup of runtime resources when no longer needed
- Multiple Input Formats: Support for PIL Images, NumPy arrays, and PyTorch tensors
Initialization#
InferModel instances are typically created through the export() method of a FocoosModel, which handles model optimization and conversion. This method lets you specify the target runtime (see the available options in RuntimeType) and the output directory for the exported model. It returns an InferModel instance optimized for fast and efficient inference.
Predict#
Performs end-to-end inference on input images with automatic preprocessing and postprocessing on the selected runtime. The model accepts input images in various formats including:
- PIL Image objects (PIL.Image.Image)
- NumPy arrays (numpy.ndarray)
- PyTorch tensors (torch.Tensor)
The input images are automatically preprocessed to the correct size and format required by the model. After inference, the raw model outputs are postprocessed into a standardized FocoosDetections format that provides easy access to:
- Detected object classes and confidence scores
- Bounding box coordinates
- Segmentation masks (for segmentation models)
- Additional model-specific outputs
This provides a simple, unified interface for running inference regardless of the underlying model architecture or task.
Parameters:
- inputs: Input images in any supported format (PIL.Image.Image, numpy.ndarray, torch.Tensor)
- **kwargs: Additional arguments passed to postprocessing
Returns: FocoosDetections containing detection/segmentation results