apa.at
blog / Friday 09.08.24

G39 Agencies Exchange Platform (AEP) – a milestone in the news industry

News agencies need to work together to keep pace with the fast-moving media world. With the G39 project, we want to redefine the news industry.
Uwe Umstätter / Westend61 / picturedesk.com

In an era where news travels around the world in seconds, it is essential that national news organisations work efficiently and collaboratively to deliver high-quality, up-to-date information. This is exactly where the G39 project comes in: the aim is to develop an innovative platform that pools the news production of eleven agencies and makes it usable through artificial intelligence (AI).

Who is G39 anyway?
G39, originally founded as Hell Commune in 1939, is an alliance of independent national news agencies from the Oslo states and Switzerland. It was formed in response to the manipulation of news transmission by international agencies during the political unrest of the 1930s and the Second World War. After the war, in November 1945, the agencies decided to reactivate the association, give it a more formal structure and name it Group 39 after the year it was founded. The group, which now has eleven members, focuses on presenting a united front towards international agencies and promotes exchange on topics such as technology and copyright.

The challenges of a multilingual content platform

The Agencies Exchange Platform (AEP) faces these complex tasks:

Multilingual processing and translation

The news agencies involved in the project provide content in nine different languages. The new platform must be able to make this content efficiently searchable and to translate it accurately where necessary. This requires the development of advanced AI models that can handle multilingualism.

Data integration and metadata management

The agencies use different formats and standards (e.g. different metadata for tagging articles). To bring these together, the metadata must be unified and managed within the platform, for example by using international standards such as the IPTC Media Topics.
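A minimal sketch of what such a unification step might look like, assuming a simple lookup table from agency-specific tags to IPTC Media Topic codes. The agency names, the local tags and the table itself are hypothetical and only serve as an illustration; the codes shown should be checked against the official IPTC vocabulary.

```python
# Hypothetical lookup table: (agency, local tag) -> IPTC Media Topic code.
# Agency names and local tags are invented for illustration only.
LOCAL_TO_MEDIATOPIC = {
    ("agency_a", "Innenpolitik"): "medtop:11000000",  # politics
    ("agency_a", "Wirtschaft"): "medtop:04000000",    # economy, business and finance
    ("agency_b", "politiek"): "medtop:11000000",      # politics
    ("agency_b", "sport"): "medtop:15000000",         # sport
}


def unify_tags(agency, local_tags):
    """Map an agency's local tags to unified codes.

    Returns a tuple (sorted unified codes, tags without a mapping), so that
    unmapped tags can be reviewed instead of being silently dropped.
    """
    unified = set()
    unmapped = []
    for tag in local_tags:
        code = LOCAL_TO_MEDIATOPIC.get((agency, tag))
        if code is None:
            unmapped.append(tag)
        else:
            unified.add(code)
    return sorted(unified), unmapped


codes, unmapped = unify_tags("agency_a", ["Innenpolitik", "Chronik"])
print(codes, unmapped)
```

Returning the unmapped tags separately is a design choice in this sketch: it keeps gaps in the mapping visible so that they can be filled in over time.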

Real-time search and retrieval

The aim of the platform is to make large volumes of news content searchable in real time and to make relevant information quickly retrievable. This relies on AI-supported semantic search technologies built on so-called embedding models. The search is then no longer based solely on the wording, but also on the specific context of the article.
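The basic idea behind embedding-based search can be sketched in a few lines. The example below assumes that every article and every query has already been turned into a vector by an embedding model; the three-dimensional vectors and the article IDs are made up for illustration (real models produce vectors with hundreds of dimensions). Ranking by cosine similarity is one common choice, not necessarily the one used in the platform.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if one is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query_vec, index, top_k=2):
    """Return the IDs of the top_k articles most similar to the query vector."""
    scored = [(cosine(query_vec, vec), doc_id) for doc_id, vec in index.items()]
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored[:top_k]]


# Toy index: hypothetical article IDs with hand-written vectors.
INDEX = {
    "de-election": [0.9, 0.1, 0.0],
    "sv-election": [0.8, 0.2, 0.1],
    "it-football": [0.0, 0.1, 0.9],
}

print(search([1.0, 0.1, 0.0], INDEX))
```

In this toy index, the German and the Swedish election article end up next to each other because their vectors point in a similar direction, regardless of the language they were written in.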

Scalability and infrastructure

A flexible and scalable infrastructure is necessary for the platform in order to meet the growing requirements of the agencies and the flood of news. This is made possible by a powerful API (application programming interface) as well as various storage and search functions.

The role of the Austria Press Agency (APA) in the G39 project

The APA plays a central role in the G39 project: our focus is on the development and localisation of search models for multilingual semantic search across the entire news content. We are proceeding in three steps:

  • Training: A search model is trained for each agency in its respective language (e.g. Swedish, Dutch, Italian or German).
  • Assessment: As soon as we have trained the models sufficiently, the respective agencies evaluate the results. A standardised questionnaire is used to assess the difference compared with an untrained, multilingual model.
  • Integration: Once the previous steps have been completed and the agencies have extensively tested their models, we create a single, fully trained multilingual model from all eleven search models. The final step is then the integration into the overall infrastructure of the platform.

A particular challenge is minimising linguistic bias in the search results so that the content of all agencies is presented in a balanced way. Otherwise, the results of individual news agencies can be overrepresented, which leads to unbalanced, lower-quality search results. The G39 Agencies Exchange Platform enables efficient, cross-language collaboration and improves access to high-quality news. You can find out how the Austria Press Agency and its implementation partner Cloudflight are achieving this in the next blog post on this project.
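To make the idea of overrepresentation more tangible, here is a rough diagnostic sketch. It assumes that each search hit carries an "agency" field (a hypothetical structure) and flags agencies whose share of a result list is clearly above an equal share. The equal share is a deliberately naive baseline, since agencies publish different volumes of content, and the tolerance factor is an arbitrary assumption.

```python
from collections import Counter


def agency_shares(hits):
    """Return each agency's share of the result list (shares sum to 1)."""
    counts = Counter(hit["agency"] for hit in hits)
    return {agency: n / len(hits) for agency, n in counts.items()}


def overrepresented(hits, n_agencies, tolerance=1.5):
    """List agencies whose share exceeds `tolerance` times an equal share.

    With n_agencies agencies, an equal share is 1 / n_agencies.
    """
    threshold = tolerance / n_agencies
    shares = agency_shares(hits)
    return sorted(agency for agency, share in shares.items() if share > threshold)


# Hypothetical result list: ten hits from three agencies.
hits = [{"agency": "A"}] * 7 + [{"agency": "B"}] * 2 + [{"agency": "C"}]
print(overrepresented(hits, n_agencies=3))
```

A check like this only detects an imbalance. Whether the imbalance is caused by the language of the query or simply by one agency having more relevant articles still has to be judged separately.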

Advantages of semantic search over lexical search
Understanding context

Semantic search captures the meaning behind words and phrases, providing more relevant results that take into account the context of the query.

Natural language queries

Semantic search allows users to phrase queries in natural language, similar to a conversation with a human.

Precision and relevance

By understanding synonyms and thematic connections, semantic search can deliver more precise and thematically relevant results.

User intention

Semantic search is able to interpret the intent behind a query, resulting in an improved user experience.

Cross-language search

Semantic search technologies can be used across languages as they analyse the meaning rather than the specific words.