Microsoft Bing Visual Search enables people around the world to find content using photographs as queries. The heart of this capability is Microsoft’s TuringMM visual embedding model that maps images and text into a shared high-dimensional space. Operating on billions of images across the web, performance is critical. This post details efforts to optimize the TuringMM pipeline using NVIDIA…
]]>Real-time cloud-scale applications that involve AI-based computer vision are growing rapidly. The use cases include image understanding, content creation, content moderation, mapping, recommender systems, and video conferencing. However, the compute cost of these workloads is growing too, driven by demand for increased sophistication in the processing. The shift from still images to video is…
]]>