07.14.22
Zebra Technologies announced that a team of Zebra AI researchers in the UK secured second place in one of the world’s most prestigious computer vision challenges. Challenge team rankings were announced in June at the CVPR RetailVision workshop, which is designed to address problems in e-commerce online search.
“When a consumer logs onto an e-commerce website and starts searching for items of clothing, for example, the website will return a range of results, some better matched than others,” said Andrea Mirabile, senior manager, computer vision, Zebra Technologies, who led the Zebra team.
Sponsored by the Alibaba Group and Trax, the CVPR 2022 Challenge: Large-scale Cross-Modal Product Retrieval brought together 650 research teams from across the world to work on a large-scale multimodal retail dataset of around five million image-caption pairs of circa 100,000 products. Each team was tasked with finding the top-K product candidates to match a query such as “blue men's turtleneck sweater.”
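At its core, this kind of cross-modal retrieval ranks product images by their similarity to a text query in a shared embedding space. The sketch below is a simplified illustration, not the winning team's method: the embeddings are made-up toy vectors standing in for the output of a trained cross-modal encoder (e.g. a CLIP-style model), and `top_k_candidates` is a hypothetical helper name.

```python
import numpy as np

# Hypothetical pre-computed embeddings: one per product image. In a real
# system these would come from a trained cross-modal encoder.
product_embeddings = np.array([
    [0.9, 0.1, 0.1],   # product 0
    [0.1, 0.9, 0.1],   # product 1
    [0.8, 0.2, 0.2],   # product 2
    [0.1, 0.1, 0.9],   # product 3
])
# Toy embedding of a text query such as "blue men's turtleneck sweater"
query_embedding = np.array([1.0, 0.0, 0.0])

def top_k_candidates(query, products, k=2):
    """Rank products by cosine similarity to the query; return top-k indices."""
    q = query / np.linalg.norm(query)
    p = products / np.linalg.norm(products, axis=1, keepdims=True)
    scores = p @ q                   # cosine similarity per product
    return np.argsort(-scores)[:k]   # highest-scoring products first

print(top_k_candidates(query_embedding, product_embeddings, k=2))  # → [0 2]
```

With these toy vectors, products 0 and 2 point in nearly the same direction as the query, so they are returned as the top-2 candidates.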
“From the consumer point of view, it’s a less than optimal experience, with time and effort wasted, that could result in either no order, ordering the wrong item, or having to look elsewhere,” added Mirabile. “From the perspective of the retailer, it’s about the problem of how to better use words, images, and search functions that match what consumers are looking for when they shop and being able to make more relevant product recommendations.”
Top-K is a common measure of performance in machine learning and computer vision. In the context of limited and noisy data, such as retail data with a mixed range of image quality and text captions, the use of a loss function (the function that computes the distance between the current output of an algorithm and the expected output) designed for top-K classification can bring significant improvements.
“AI applications cover speech, text, sound, and vision. Our challenge team and the wider Zebra global AI research team is bringing those together, in the same way a human uses their five senses to sense and analyze the world around them to inform decision-making and action,” explained Mirabile.