The results of the 2017 Knowledge Base Population (KBP) evaluation, sponsored by the National Institute of Standards and Technology (NIST), were recently announced. Tencent AI Lab, entering the competition for the first time, won first place in the Entity Discovery and Linking (EDL) track. The KBP competition, which started in 2009, is regarded as the most influential and highest-level international evaluation in the knowledge graph field. The win reflects Tencent's technical strength in knowledge graphs, semantic understanding, and related areas.
Top ten overall trilingual rankings for the Entity Discovery and Linking task (team names anonymized)
A knowledge graph is a technique for structuring human knowledge, generally by describing the properties of entities and establishing the relationships between them. Many natural language processing tasks now rely on knowledge graph support, such as understanding the semantics of a query in a question answering system. Take the query "What is the theme song of the 1986 Journey to the West?" To fully understand this sentence, the first step is to recognize the entity "Journey to the West" and its category. This task is called entity discovery, and here it identifies "Journey to the West" as a TV series. The second step is to resolve the ambiguity of the entity, because the knowledge graph contains every version of Journey to the West. This task is called entity linking: using the context in the query, such as "1986" here, it links the mention to the correct entry, the version of Journey to the West released in 1986.
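The two steps described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy knowledge graph entries, the identifiers, and the simple string-matching and year-matching rules are hypothetical and are not the method used by any competition system.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Entity:
    entity_id: str          # hypothetical identifier
    name: str               # surface name shared by ambiguous entries
    category: str           # e.g. "TV series", "novel"
    year: Optional[int]     # release year, if the toy entry has one


# Hypothetical toy knowledge graph: several entries share one name.
KNOWLEDGE_GRAPH = [
    Entity("jttw_novel", "Journey to the West", "novel", None),
    Entity("jttw_tv_1986", "Journey to the West", "TV series", 1986),
    Entity("jttw_tv_1996", "Journey to the West", "TV series", 1996),
]


def discover_mentions(query: str) -> List[str]:
    """Entity discovery: find known entity names that appear in the query."""
    lowered = query.lower()
    names = {entity.name for entity in KNOWLEDGE_GRAPH}
    return sorted(name for name in names if name.lower() in lowered)


def link_mention(mention: str, query: str) -> Optional[Entity]:
    """Entity linking: pick the candidate whose year matches the query context."""
    candidates = [e for e in KNOWLEDGE_GRAPH if e.name == mention]
    if not candidates:
        return None
    # Four-digit years mentioned in the query serve as the only context clue.
    years = {int(y) for y in re.findall(r"\b(?:1[0-9]{3}|20[0-9]{2})\b", query)}
    # Ties fall back to the first candidate in the graph.
    return max(candidates, key=lambda e: 1 if e.year in years else 0)


query = "What is the main theme of the Journey to the West in 1986?"
for mention in discover_mentions(query):
    entity = link_mention(mention, query)
    print(mention, "->", entity.entity_id, f"({entity.category})")
```

Real systems replace both rules with learned models, but the division of labor is the same: discovery finds and types the mention, and linking chooses among the candidates that share its name.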
Technical architecture on display at a top international competition
Deep learning builds a leading semantic understanding model
Entity discovery and linking is the core task of the KBP evaluation. Systems must identify entities in the target texts and link them to an existing knowledge base. The task is technically demanding and covers three languages: Chinese, English, and Spanish. Tencent took first place in the overall trilingual entity discovery and linking score. On the single-language metrics, it ranked in the top two in Chinese and second in both Spanish and English. The competition has long drawn wide attention from academia and industry. A total of 24 teams took part in this task, including world-class research institutions such as IBM, Carnegie Mellon University, and the University of Illinois at Urbana-Champaign, as well as well-known Chinese companies and universities such as Alibaba, Beijing University of Posts and Telecommunications, and Zhejiang University.
For this competition, Tencent AI Lab added a document understanding model and an association graph model to the current industry-leading EDL architecture. The document understanding model uses a deep learning framework and is trained on large-scale data so that it captures the semantics of a document more accurately. The association graph model represents all the important information in an article together as a single graph structure and solves for the best overall result.
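The article gives no implementation details for the graph model, so the following Python sketch only illustrates the general idea of solving entity links jointly over a graph rather than one mention at a time. The mentions, candidate identifiers, local scores, and coherence weights are all hypothetical, and the brute-force search is a stand-in chosen for readability, not the lab's method.

```python
from itertools import product
from typing import Dict, Tuple

# Hypothetical local scores: how well each candidate fits its mention in isolation.
CANDIDATES: Dict[str, Dict[str, float]] = {
    "Journey to the West": {"jttw_novel": 0.6, "jttw_tv_1986": 0.4},
    "CCTV": {"cctv_broadcaster": 1.0},
}

# Hypothetical graph edges: how strongly two entities are related.
COHERENCE: Dict[frozenset, float] = {
    frozenset({"jttw_tv_1986", "cctv_broadcaster"}): 1.0,
}


def coherence(a: str, b: str) -> float:
    """Edge weight between two entities; unrelated pairs score 0."""
    return COHERENCE.get(frozenset({a, b}), 0.0)


def link_jointly(candidates: Dict[str, Dict[str, float]]) -> Tuple[Dict[str, str], float]:
    """Score every joint assignment and return the best one with its score."""
    mentions = list(candidates)
    best_assignment: Dict[str, str] = {}
    best_score = float("-inf")
    for combo in product(*(candidates[m] for m in mentions)):
        local = sum(candidates[m][e] for m, e in zip(mentions, combo))
        pairwise = sum(
            coherence(combo[i], combo[j])
            for i in range(len(combo))
            for j in range(i + 1, len(combo))
        )
        total = local + pairwise
        if total > best_score:
            best_assignment, best_score = dict(zip(mentions, combo)), total
    return best_assignment, best_score


assignment, score = link_jointly(CANDIDATES)
print(assignment, round(score, 2))
```

In this toy data, scoring "Journey to the West" alone would favor the novel (0.6 versus 0.4), but the edge to the co-occurring "CCTV" entity makes the 1986 TV series the better joint choice. Exhaustive enumeration grows exponentially with the number of mentions, so practical systems rely on approximate inference instead.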
TopBase, a self-built knowledge graph
Extending toward deep natural language understanding in the future
Knowledge graphs are a core technology in natural language processing. Tencent AI Lab has built a knowledge graph called TopBase, which currently covers entities and triples in more than 50 domains such as people, music, film, sports, and poetry. It has been widely applied in Daily Express, WeChat Top Stories, WeChat Search, and Tencent Cloud Xiaowei.
The TopBase knowledge graph
Machine learning, computer vision, speech recognition, and natural language processing are the four basic research directions of Tencent AI Lab. Building on this research and on the needs of Tencent's businesses and partners, the lab develops AI applications in four major areas: content, games, social, and platform.
In the NLP area, alongside knowledge graphs, Tencent AI Lab also focuses on core research in question answering, dialogue, text generation, automatic summarization, and machine translation. It also pursues cross-domain applications with speech recognition and computer vision, such as interpretation and image caption generation. The ultimate goal is to enable machines to better understand and generate text, to improve their understanding, decision-making, and creativity, and eventually to communicate with humans through natural language.