Capability
Transformer Encoder-Decoder Object Prediction
3 artifacts provide this capability.
via “transformer encoder-decoder with learned object queries for set prediction”
object-detection model. 228,520 downloads.
Unique: Uses learned object query embeddings (not spatial grids or anchors) that attend to the full feature map via multi-head cross-attention, letting the model allocate detection capacity dynamically based on image content rather than predefined spatial locations.
vs others: More flexible than anchor-based methods (no anchor tuning required) and more interpretable than dense prediction heads; weaker than specialized small-object detectors because of its set-prediction formulation.
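The core mechanism described above — learned query embeddings cross-attending over the whole feature map — can be sketched in a few lines. This is a minimal NumPy illustration, not the artifact's actual implementation: all sizes are toy values, projections are omitted, and the function name `cross_attention` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
d, num_queries, H, W = 32, 8, 4, 4  # toy dimensions (hypothetical)

# Learned object query embeddings: one vector per potential detection,
# not tied to any spatial grid or anchor box.
queries = rng.normal(size=(num_queries, d))

# Encoder output: a flattened H*W feature map of spatial tokens.
memory = rng.normal(size=(H * W, d))

def cross_attention(q, kv, num_heads=4):
    """Each query attends over every spatial location (projection-free sketch)."""
    n, dim = q.shape
    dh = dim // num_heads
    out = np.zeros_like(q)
    for h in range(num_heads):
        qs = q[:, h * dh:(h + 1) * dh]
        ks = kv[:, h * dh:(h + 1) * dh]
        scores = qs @ ks.T / np.sqrt(dh)              # (num_queries, H*W)
        w = np.exp(scores - scores.max(axis=1, keepdims=True))
        w /= w.sum(axis=1, keepdims=True)             # softmax over spatial positions
        out[:, h * dh:(h + 1) * dh] = w @ ks          # content-dependent pooling
    return out

decoded = cross_attention(queries, memory)  # one embedding per query
print(decoded.shape)  # (8, 32)
```

Because the attention weights are computed from image content, each query can focus anywhere in the feature map, which is what removes the need for anchor tuning; each decoded query embedding would then feed a classification and box-regression head to produce one element of the predicted set.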