The Technology Innovation Institute has released Falcon Perception, a 600-million-parameter early-fusion transformer that unifies vision and language processing for open-vocabulary grounding and segmentation. The compact model challenges conventional modular architectures by processing image patches and text tokens in a shared parameter space from the very first layer.
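To make "shared parameter space from the very first layer" concrete, here is a minimal sketch of early fusion in general, not Falcon Perception's actual implementation. Image patches and text tokens are embedded to the same width, concatenated into one sequence, and passed through a single self-attention layer whose weights serve both modalities. All names and sizes (`D_MODEL`, `PATCH_DIM`, `VOCAB_SIZE`, `shared_self_attention`) are hypothetical, the weights are random, and the unmasked single-head attention is an assumption made for brevity.

```python
import math
import random

# Hypothetical sizes, chosen only to keep the illustration small.
D_MODEL = 8      # shared embedding width for both modalities
PATCH_DIM = 12   # e.g. a flattened 2x2 RGB patch
VOCAB_SIZE = 16  # toy text vocabulary


def random_matrix(rows, cols, rng):
    """Return a rows x cols matrix of small random weights."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def linear(vec, weight):
    """Multiply a length-n vector by an n x m weight matrix."""
    return [sum(v * w for v, w in zip(vec, col)) for col in zip(*weight)]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def shared_self_attention(sequence, w_q, w_k, w_v):
    """Single-head self-attention with one weight set for every token."""
    queries = [linear(x, w_q) for x in sequence]
    keys = [linear(x, w_k) for x in sequence]
    values = [linear(x, w_v) for x in sequence]
    scale = math.sqrt(len(w_q[0]))
    outputs, attention = [], []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        attention.append(weights)
        outputs.append([
            sum(w * v[i] for w, v in zip(weights, values))
            for i in range(len(values[0]))
        ])
    return outputs, attention


rng = random.Random(0)
patch_proj = random_matrix(PATCH_DIM, D_MODEL, rng)    # patch -> shared width
token_table = random_matrix(VOCAB_SIZE, D_MODEL, rng)  # token id -> shared width
w_q, w_k, w_v = (random_matrix(D_MODEL, D_MODEL, rng) for _ in range(3))

patches = [[rng.random() for _ in range(PATCH_DIM)] for _ in range(4)]
text_ids = [3, 7, 1]

image_tokens = [linear(p, patch_proj) for p in patches]
text_tokens = [token_table[i] for i in text_ids]

# Early fusion: both modalities form one sequence before the first layer.
fused = image_tokens + text_tokens

# The same attention weights process image and text positions alike.
outputs, attention = shared_self_attention(fused, w_q, w_k, w_v)
```

Each row of `attention` spans all seven positions, so a text token can attend to image patches (and the reverse) in the very first layer. A modular design would instead run a separate vision encoder before the two streams meet.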






