Client Cases

Getty Images Korea | Building a Smart Natural Language Image Search System Based on Amazon Bedrock

Key Point


Getty Images Korea, the largest content provider in Korea with 190 million items, leveraged Amazon Bedrock Claude and the Titan Multimodal model to implement intuitive, natural language-based image search and achieve a 50% reduction in inference costs.



The Client




1. A Global Hub for Creative Content

Creative images, press and publication photos, videos, and audio files from Korea and around the world are updated daily. The collection spans creative photos and illustrations used in advertising, public relations, web design, publishing, and broadcasting, along with masterpieces and rare works by renowned artists worldwide and pieces by contemporary artists.


2. A Pioneer in Unique User Experience

We offer a differentiated user experience through a variety of services, including semantic-based personalized search support, 24-hour free draft download service, real-time messenger consultation, and image recommendation services, beyond simple Korean translation search.


3. A Reliable Partner to Over 100,000 Customers

We provide advertising content to over 100,000 clients, including advertising agencies, production companies, web agencies, corporations, and government agencies, solidifying our position as a leader in the advertising content market. We offer comprehensive solutions that meet both global standards and local needs!


 

The Challenge


1. From Keywords to Natural Language: A Revolution in Search Needed

Getty Images Korea was at an inflection point, needing to evolve from its traditional keyword-based search to a natural language-based image search service. Demand was surging from designers and content creators who wanted to describe the images they pictured in natural language.


2. The Unfortunate Limitations of the CLIP Model 

Early on, we introduced a search system built on the CLIP model, but it struggled to accurately recognize detailed image characteristics such as shooting techniques, mood, and emotional tone. In particular, it had great difficulty distinguishing subtle elements critical to the creative industry, such as professional photography techniques, fine color differences, and lighting effects.


3. The Overwhelming Scale of 190 Million Data Items

The biggest challenge was building a cost-effective search system while automatically generating accurate metadata for our massive image database of over 190 million items. Processing hundreds of millions of images required massive computing resources and costs, yet we had to address the conflicting requirements of ensuring search quality and speed.


4. The Dilemma of Real-Time Performance and Cost-Efficiency

Satisfying both large-scale data processing and real-time search performance at once seemed nearly impossible: optimizing for one typically comes at the cost of the other.


 

The Solution


1. Everything About Images Expressed in Natural Language

The new system accurately describes image components, photography techniques, mood, and color in natural language, allowing users to find the images they envision far more precisely!


2. A Revolution in Search Quality with Amazon Titan Multimodal

To fundamentally improve search quality, we introduced the Amazon Titan Multimodal Embedding G1 model. This model effectively expresses the relationship between text and images, significantly improving the detailed feature recognition problem that was a limitation of the existing CLIP model.
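A minimal sketch of how a Titan Multimodal Embeddings G1 request could be constructed. The helper below is hypothetical (not Getty Images Korea's actual code); the request format follows the model's published schema, where `inputText`, a base64 `inputImage`, or both map into a single shared vector space, which is what lets a text query match an image.

```python
import json

# Hypothetical helper: builds the request body for the Titan Multimodal
# Embeddings G1 model (amazon.titan-embed-image-v1). Supplying text,
# a base64-encoded image, or both yields vectors in one joint embedding
# space, so text queries and indexed images are directly comparable.
def build_titan_embed_request(text=None, image_base64=None, dim=1024):
    body = {"embeddingConfig": {"outputEmbeddingLength": dim}}
    if text is not None:
        body["inputText"] = text
    if image_base64 is not None:
        body["inputImage"] = image_base64
    return json.dumps(body)

# The serialized body would then be sent to Bedrock (not executed here):
#   bedrock = boto3.client("bedrock-runtime")
#   resp = bedrock.invoke_model(
#       modelId="amazon.titan-embed-image-v1",
#       body=build_titan_embed_request(text="cozy cafe, warm lighting"),
#   )
#   embedding = json.loads(resp["body"].read())["embedding"]
```

Because queries and images share one embedding space, no separate text-to-image matching model is needed at search time.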


3. Meeting the Demanding Needs of Creative Professionals

This system allows for more precise distinction of professional photography techniques, subtle emotional tones, and artistic styles, perfectly meeting the demanding needs of creative professionals!


4. Cost Innovation with Vector Quantization and OpenSearch

By applying vector quantization techniques to a database of 200 million images and leveraging OpenSearch's disk-based mode, we optimized memory usage while cost-effectively implementing a large-scale search system.
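As an illustration of the approach described above, the index mapping below sketches OpenSearch's disk-based vector mode, in which quantized vectors are held in memory and full-precision vectors stay on disk. The index and field names are assumptions for this example, and disk-based mode requires a recent OpenSearch version.

```python
# Hypothetical OpenSearch index definition: "mode": "on_disk" with a
# compression_level applies quantization so that only compressed vectors
# are kept in memory, trading some recall for a large memory saving.
index_body = {
    "settings": {"index": {"knn": True}},
    "mappings": {
        "properties": {
            "image_embedding": {
                "type": "knn_vector",
                "dimension": 1024,           # matches the Titan embedding size
                "space_type": "l2",
                "mode": "on_disk",           # full-precision vectors stay on disk
                "compression_level": "32x",  # aggressive quantization in memory
            }
        }
    },
}

# With the opensearch-py client this mapping would be applied as:
#   client.indices.create(index="getty-images", body=index_body)
```

At hundreds of millions of vectors, the memory footprint of uncompressed embeddings is the dominant cost, which is why quantization plus disk-based storage was the economic lever here.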


5. A Perfect Search Experience with OpenSearch kNN

By building a similarity-based search system utilizing the OpenSearch kNN algorithm, we've significantly improved the accuracy and relevance of user queries. This ensures fast search speeds while accurately identifying user intent and displaying the most relevant images at the top.
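A sketch of the query side, under the assumption that the user's natural language query is first embedded with the same Titan model used at indexing time. The function and field names are illustrative, not the production implementation.

```python
# Hypothetical kNN search body: the query text is embedded into the same
# vector space as the indexed images, then the k nearest image vectors
# are retrieved and returned ranked by similarity.
def build_knn_query(query_embedding, k=10):
    return {
        "size": k,
        "query": {
            "knn": {
                "image_embedding": {
                    "vector": query_embedding,
                    "k": k,
                }
            }
        },
    }

# Usage with the opensearch-py client (not executed here):
#   body = build_knn_query(query_vector, k=20)
#   hits = client.search(index="getty-images", body=body)["hits"]["hits"]
```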


 

The Result


1. Perfect Implementation of Natural Language Search

By introducing a cost-effective image embedding solution based on the Amazon Bedrock Titan Multimodal model, we were able to successfully implement natural language search without the need to develop a complex matching algorithm!


2. Natural Searches, such as "In a Cozy Cafe..."

Users can now search with natural phrases like "A person working on a laptop in a cozy cafe," providing a much more intuitive and accurate search experience than traditional keyword combinations!


3. A 50% Cost Reduction with Batch Inference

By adopting batch inference for processing 200 million images, we achieved groundbreaking results: a significant reduction in processing time and a 50% reduction in inference costs! This laid the economic foundation for continuously updating and improving our massive image database.
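A minimal sketch of how Bedrock batch inference input could be prepared, assuming the Titan Multimodal request format shown earlier. Each line of a JSONL file pairs a record ID with a model input; the file is uploaded to S3 and submitted as an invocation job. The helper and resource names below are illustrative.

```python
import json

# Hypothetical helper: one JSONL record per image for a Bedrock batch
# inference job. Batch jobs process records asynchronously at a lower
# per-inference price than on-demand invocation.
def to_batch_record(image_id, image_base64):
    return json.dumps({
        "recordId": image_id,
        "modelInput": {
            "inputImage": image_base64,
            "embeddingConfig": {"outputEmbeddingLength": 1024},
        },
    })

# Submitting the job (not executed here; bucket and role are assumptions):
#   bedrock = boto3.client("bedrock")
#   bedrock.create_model_invocation_job(
#       jobName="embed-images-batch",
#       modelId="amazon.titan-embed-image-v1",
#       roleArn=role_arn,
#       inputDataConfig={"s3InputDataConfig": {"s3Uri": "s3://bucket/input.jsonl"}},
#       outputDataConfig={"s3OutputDataConfig": {"s3Uri": "s3://bucket/output/"}},
#   )
```

Embeddings only need recomputing when images are added or the model changes, so batch jobs fit the update workload well.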

