Multi-Domain Object Detection with COCO + Objects365 Pretraining

1

MS COCO (Common Objects in Context) · Dataset · 60/100

via “multi-task object instance annotation with polygon and rle-encoded segmentation masks”

330K images with object detection, segmentation, and captions.

Unique: Stores segmentation masks in two encodings within a single dataset: polygons for individual instances and run-length encoding (RLE) for crowd regions, which supports both precise boundary analysis and compact, efficient mask handling. With 2.5M labeled instances across 330K images, it offers far denser instance-level annotation than contemporaneous datasets (PASCAL VOC had ~11K images; ImageNet had ~1.2M images, but mostly with image-level labels)

vs others: Larger and more densely annotated than PASCAL VOC (11K images, ~2.4 objects/image versus roughly 7 in COCO) and more task-diverse than ImageNet (primarily classification); RLE masks are said to load 10-100x faster than polygon-only formats, though no benchmark is cited
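
To make the RLE side of the dual encoding concrete, here is a minimal pure-Python sketch of COCO-style uncompressed RLE: pixels are traversed column by column, and the counts alternate between runs of 0s and runs of 1s, starting with 0s. The function names are illustrative only. Real pipelines would normally use pycocotools, which also handles the compressed string form that this sketch omits.

```python
def rle_encode(mask):
    """Encode a binary mask (list of rows of 0/1) as COCO-style uncompressed RLE."""
    height, width = len(mask), len(mask[0])
    counts = []
    current, run = 0, 0  # runs alternate and always start with a run of zeros
    for col in range(width):  # column-major traversal
        for row in range(height):
            pixel = mask[row][col]
            if pixel == current:
                run += 1
            else:
                counts.append(run)
                current, run = pixel, 1
    counts.append(run)
    return {"size": [height, width], "counts": counts}


def rle_decode(rle):
    """Rebuild the binary mask (list of rows) from an uncompressed RLE dict."""
    height, width = rle["size"]
    flat, value = [], 0
    for run in rle["counts"]:
        flat.extend([value] * run)
        value = 1 - value
    # flat is column-major, so pixel (row, col) lives at col * height + row
    return [[flat[col * height + row] for col in range(width)] for row in range(height)]


def rle_area(rle):
    """Mask area is the sum of the odd-indexed runs (the runs of ones)."""
    return sum(rle["counts"][1::2])
```

Because the area falls out of the run counts directly, operations like area filtering never need to materialize the full mask, which is one reason the run-length form is convenient for large crowd regions.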

2

yolos-small · Model · 46/100

via “coco dataset-aligned class prediction with 80-class taxonomy”

object-detection model. 735,352 downloads.

Unique: The classification head is trained directly on COCO's 80-class taxonomy, so predictions plug into existing COCO-trained detection pipelines and benchmarks without class remapping. Uses a standard softmax classification head over the COCO classes rather than a custom class set.

vs others: Immediately compatible with COCO evaluation metrics and COCO-format detection datasets, unlike custom-trained detectors that require class remapping; weaker than fine-tuned models on domain-specific classes
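
Class remapping comes up even within COCO itself: the 80 detection categories use annotation IDs in the range 1 to 90 with ten IDs unused, while many training codebases expect contiguous indices 0 to 79. The sketch below builds that mapping. The helper name is hypothetical, and whether a given checkpoint emits original IDs or contiguous indices should be confirmed from its own label map rather than assumed.

```python
# IDs in 1..90 that the 80-class COCO detection annotations do not use.
UNUSED_COCO_IDS = {12, 26, 29, 30, 45, 66, 68, 69, 71, 83}


def build_coco_id_maps():
    """Return (coco_id -> contiguous index, contiguous index -> coco_id)."""
    used_ids = [i for i in range(1, 91) if i not in UNUSED_COCO_IDS]
    coco_to_contiguous = {coco_id: idx for idx, coco_id in enumerate(used_ids)}
    contiguous_to_coco = {idx: coco_id for coco_id, idx in coco_to_contiguous.items()}
    return coco_to_contiguous, contiguous_to_coco
```

When writing predictions to a COCO-format results file for evaluation, the inverse map is the one that matters: category IDs in the results must match the IDs used in the ground-truth annotations.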

3

detr-resnet-50 · Model · 45/100

via “fine-tuning on custom datasets with transfer learning”

object-detection model. 239,063 downloads.

Unique: Leverages ImageNet-pretrained ResNet-50 backbone and COCO-pretrained decoder weights to enable efficient fine-tuning on custom datasets with minimal data and compute compared to training from scratch

vs others: Converges faster than training from scratch; is claimed to need fewer annotated examples than anchor-based methods because the transformer models relationships between objects, though no figures are given
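
A minimal sketch of how such fine-tuning is typically set up with the Hugging Face transformers library: build label maps for the custom classes, then load the pretrained checkpoint with a freshly initialized class head. The Hub ID `facebook/detr-resnet-50` and the helper names are assumptions for illustration; the listing itself does not name the publisher. The transformers import is deferred so the label-map helper works on its own.

```python
def build_label_maps(class_names):
    """Map contiguous class indices to names and back, as model configs expect."""
    id2label = dict(enumerate(class_names))
    label2id = {name: idx for idx, name in id2label.items()}
    return id2label, label2id


def load_detector_for_finetuning(class_names, checkpoint="facebook/detr-resnet-50"):
    """Load pretrained weights but re-initialize the class head for new labels.

    The checkpoint name is an assumed Hub ID; substitute the one you actually use.
    """
    # Imported lazily so build_label_maps stays usable without transformers installed.
    from transformers import AutoModelForObjectDetection

    id2label, label2id = build_label_maps(class_names)
    return AutoModelForObjectDetection.from_pretrained(
        checkpoint,
        id2label=id2label,
        label2id=label2id,
        # The pretrained class head has a different output size, so let it be replaced.
        ignore_mismatched_sizes=True,
    )
```

Only the classification head is re-initialized in this setup; the backbone, encoder, and decoder keep their pretrained weights, which is where the reduced data and compute requirements come from.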

4

rtdetr_r18vd_coco_o365 · Model · 43/100

via “multi-dataset transfer learning with coco and objects365 pre-training”

object-detection model. 521,638 downloads.

Unique: Combines COCO (80 classes) and Objects365 (365 classes) in pre-training, aiming for a feature space that balances broad coverage with fine-grained discrimination; many detection models are pre-trained on a single dataset

vs others: Reported to outperform single-dataset pre-trained models (COCO-only YOLOv8, DETR) on diverse object categories and to converge faster during fine-tuning thanks to the richer initialization

5

yolov10s · Model · 42/100

via “coco dataset-aligned class prediction with 80-class taxonomy”

object-detection model. 223,706 downloads.

Unique: Pre-trained on COCO with YOLOv10's improved training recipe (NMS-free training with consistent dual label assignments), and reported to reach higher mAP than prior YOLO versions while keeping the same 80-class output taxonomy.

vs others: Reported to be more accurate on COCO classes than YOLOv8s, attributed to improved training dynamics; simpler class handling than open-vocabulary models (CLIP-based), which need additional inference steps but offer flexibility beyond 80 classes.

6

yolos-tiny · Model · 41/100

via “coco-pretrained multi-class object detection with 80 object categories”

object-detection model. 83,525 downloads.

Unique: Leverages COCO pretraining with transformer architecture, enabling detection of 80 common object classes without custom training while maintaining parameter efficiency through the tiny variant design

vs others: Requires no dataset collection or fine-tuning for COCO classes (YOLOv5 also supports COCO out of the box), though accuracy is typically cited as 2-5 points lower than larger transformer detectors because of the reduced model capacity

7

mask2former-swin-tiny-coco-instance · Model · 41/100

via “coco-pretrained 80-class object recognition with transfer learning”

image-segmentation model. 63,563 downloads.

Unique: Weights trained on COCO instance segmentation task (not just classification), meaning features encode both semantic and spatial information about object boundaries. This differs from ImageNet-pretrained backbones which optimize for classification only; COCO pretraining provides better initialization for segmentation tasks.

vs others: Reported to outperform ImageNet-pretrained backbones by 3-5 mAP on segmentation tasks thanks to instance-aware training (no benchmark is cited); requires more computational resources than lightweight classification models but transfers better to dense prediction tasks.

8

detr-resnet-101 · Model · 41/100

via “coco dataset-pretrained weight initialization”

object-detection model. 63,737 downloads.

Unique: Weights distributed via the Hugging Face Hub in safetensors format (faster to load and safer than pickle) with automatic caching, enabling one-line loading via transformers.AutoModelForObjectDetection without manual weight management

vs others: Easier weight management than downloading checkpoints from GitHub or torchvision (which use pickle), and safer than pickle because a safetensors file holds only a JSON header and raw tensor bytes, so loading it cannot execute arbitrary code
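
The safety argument follows from the file layout itself: a safetensors file is an 8-byte little-endian header length, a JSON header describing each tensor (dtype, shape, byte offsets), and then raw tensor bytes. The hand-rolled sketch below writes and reads that layout using only the standard library. It is an illustration, not a replacement for the `safetensors` package, and it skips the validation, metadata, and alignment handling the real library performs.

```python
import json
import struct


def pack_safetensors(tensors):
    """Pack {name: (dtype_str, shape, raw_bytes)} into the safetensors byte layout."""
    header, payload, offset = {}, b"", 0
    for name, (dtype, shape, raw) in tensors.items():
        header[name] = {
            "dtype": dtype,
            "shape": shape,
            "data_offsets": [offset, offset + len(raw)],  # relative to the data section
        }
        payload += raw
        offset += len(raw)
    header_bytes = json.dumps(header, separators=(",", ":")).encode("utf-8")
    # 8-byte little-endian unsigned length, then the JSON header, then raw data.
    return struct.pack("<Q", len(header_bytes)) + header_bytes + payload


def read_safetensors_header(blob):
    """Return (parsed JSON header, byte index where the tensor data section starts)."""
    (header_len,) = struct.unpack("<Q", blob[:8])
    header = json.loads(blob[8 : 8 + header_len])
    return header, 8 + header_len
```

Reading the file involves nothing more than parsing JSON and slicing bytes, in contrast to unpickling, which can invoke arbitrary callables during load.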

9

rtdetr_r101vd_coco_o365 · Model · 40/100

via “multi-domain object detection with coco+objects365 pretraining”

object-detection model. 121,720 downloads.

Unique: Combines COCO (80 classes, high-quality annotations) with Objects365 (365 classes, broader coverage) in a unified detection framework using class-agnostic bounding box regression, with a single model rather than an ensemble or multi-task approach. The number of categories actually detectable depends on the checkpoint's label set: a checkpoint fine-tuned on COCO after Objects365 pre-training outputs the 80 COCO classes, so the 365+ category figure should be checked against the model's id2label mapping

vs others: Draws on a broader category set than COCO-only models (365 vs 80 classes) and is reported to generalize better than Objects365-only training because of COCO's higher annotation quality, outperforming single-dataset detectors on diverse real-world images

10

rtdetr_r50vd_coco_o365 · Model · 39/100

via “multi-dataset transfer learning with coco and objects365 pre-training”

object-detection model. 80,830 downloads.

Unique: Combines COCO (80 classes, high-quality annotations) and Objects365 (365 classes, broader coverage) pre-training in a single model, enabling transfer learning that balances annotation quality with category diversity, a combination the listing describes as rare among published detection models

vs others: Pre-training covers more object categories than COCO-only models (365 vs 80 classes) while retaining COCO's annotation quality, which reduces fine-tuning data requirements compared to training from scratch on custom datasets

11

oneformer_coco_swin_large · Model · 39/100

via “coco-dataset-pretraining-with-133-class-vocabulary”

image-segmentation model. 54,407 downloads.

Unique: Pre-trained jointly on semantic, instance, and panoptic segmentation tasks using a unified architecture, enabling transfer learning across all three tasks simultaneously. Unlike task-specific pre-training, this approach learns shared representations that benefit all downstream tasks.

vs others: Reports a score of 45.1 on COCO panoptic segmentation with a single model (labeled mIoU in the source, although panoptic results are normally reported as PQ), competitive with specialized panoptic models while remaining usable for semantic and instance tasks without retraining.
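
For context on the metric, panoptic segmentation on COCO is conventionally scored with panoptic quality (PQ): predicted and ground-truth segments are matched when their IoU exceeds 0.5, and PQ is the sum of matched IoUs divided by TP + 0.5·FP + 0.5·FN. The sketch below computes that formula for one class, assuming the matching has already been done; the function name and inputs are illustrative, not taken from any evaluation toolkit.

```python
def panoptic_quality(matched_ious, num_false_positives, num_false_negatives):
    """PQ for one class, given IoUs of matched segment pairs (each IoU > 0.5).

    matched_ious: IoU values of the true-positive matches.
    num_false_positives: predicted segments with no matching ground truth.
    num_false_negatives: ground-truth segments with no matching prediction.
    """
    true_positives = len(matched_ious)
    denominator = true_positives + 0.5 * num_false_positives + 0.5 * num_false_negatives
    if denominator == 0:
        return 0.0  # no segments of this class in either prediction or ground truth
    return sum(matched_ious) / denominator
```

The dataset-level figure is the average of this per-class value over all categories, so it rewards both recognizing segments (the TP/FP/FN counts) and segmenting them well (the IoUs), which a plain mIoU does not capture.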

12

rtdetr_v2_r18vd · Model · 39/100

via “coco-pretrained multi-class object classification and localization”

object-detection model. 106,918 downloads.

Unique: Leverages COCO pretraining with deformable transformer architecture, enabling efficient transfer to custom domains without the computational overhead of training from scratch. Safetensors serialization ensures reproducible, secure weight loading compared to pickle-based .pth files.

vs others: Outperforms lightweight detectors (MobileNet-SSD) on COCO classes due to transformer capacity, while maintaining faster inference than heavier models (ResNet-101 backbone) through deformable attention efficiency.

13

rtdetr_r50vd · Model · 36/100

via “coco-pretrained weight initialization with transfer learning support”

object-detection model. 32,868 downloads.

Unique: Provides safetensors-format checkpoints with full layer compatibility for both out-of-the-box COCO inference and head-replacement fine-tuning; the weights are described as well suited to initializing deformable attention, which is claimed to avoid common gradient flow issues in transformer detection models

vs others: Faster checkpoint loading than pickle-based PyTorch weights (safetensors is memory-mapped) and more flexible than ONNX exports for fine-tuning, while maintaining full reproducibility across platforms

14

mmdet · Benchmark · 30/100

via “model evaluation with coco, lvis, and custom metrics”

OpenMMLab Detection Toolbox and Benchmark

Unique: Integrates COCO and LVIS evaluation as pluggable metric modules that compute AP at multiple IoU thresholds and object sizes, with support for class-wise breakdown and long-tail weighting, enabling standardized benchmarking across different detection datasets

vs others: More comprehensive than standalone pycocotools because it integrates LVIS metrics and custom metric support in a unified framework; more flexible than TensorFlow Object Detection API because metrics are composable and can be easily extended for custom evaluation protocols
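
The quantities behind "AP at multiple IoU thresholds and object sizes" can be sketched in a few lines: box IoU, the ten COCO thresholds from 0.50 to 0.95 in steps of 0.05, and the small/medium/large area buckets split at 32² and 96² pixels. This is a plain-Python illustration of those definitions, not mmdet's or pycocotools' implementation; the function names are hypothetical.

```python
def box_iou(box_a, box_b):
    """IoU of two boxes given as (x_min, y_min, x_max, y_max)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    intersection = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


# COCO averages AP over IoU thresholds 0.50, 0.55, ..., 0.95.
COCO_IOU_THRESHOLDS = [round(0.5 + 0.05 * i, 2) for i in range(10)]


def coco_size_bucket(area):
    """Assign an object to COCO's small / medium / large bucket by pixel area."""
    if area < 32 ** 2:
        return "small"
    if area < 96 ** 2:
        return "medium"
    return "large"


def thresholds_passed(iou):
    """Number of COCO IoU thresholds at which a detection counts as a match."""
    return sum(1 for threshold in COCO_IOU_THRESHOLDS if iou >= threshold)
```

A detection with IoU 0.72 against its ground-truth box, for example, counts as a match at the five thresholds from 0.50 through 0.70 and as a miss at the remaining five, which is why the averaged AP is much stricter than AP at 0.50 alone.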

15

A ConvNet for the 2020s (ConvNeXt) · Product · 18/100

via “coco-object-detection-backbone-integration”

* ⭐ 01/2022: [Patches Are All You Need (ConvMixer)](https://arxiv.org/abs/2201.09792)

Unique: Achieves COCO detection performance that outperforms Swin Transformer while maintaining pure convolutional architecture, demonstrating that modernized ConvNets can compete with transformer-based backbones on detection tasks without attention mechanisms

vs others: Outperforms Swin Transformer on COCO object detection while providing simpler architecture, lower inference latency (unquantified), and better interpretability than attention-based backbones
