NLP Interchange Format (NIF)

This video was recorded at W3C Workshop: A Local Focus for the Multilingual Web, Limerick 2011.

NIF is an RDF/OWL-based format that allows several NLP tools to be combined and chained in a flexible, lightweight way. The core of NIF is a vocabulary that can represent strings as RDF resources. A special URI design anchors annotations to a specific part of a document: each URI identifies a character sequence, and arbitrary annotations can then be attached to that sequence. Because the URIs are shared, annotations can be exchanged between different NLP tools.

Although NLP tools are abundantly available at all linguistic levels for English, this is often not the case for languages with fewer speakers. A format that supports the integration and interoperability of NLP tools is therefore especially necessary.

With respect to multilinguality, two use cases come to mind:

1. An existing English software system that uses an English NLP tool needs to be ported to another language. The NLP tool for the other language is not compatible with the system because there is no common interface (example: a CMS with keyword extraction).
2. Paragraphs in different kinds of documents can be annotated in RDF with multilingual translations that can potentially remain stable over the lifetime of a document.

In particular, the Context-Hash URI recipe introduced in the talk has properties that compare favorably with other URI naming approaches.
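
The idea of anchoring annotations to a character sequence through a URI can be sketched in Python. This is an illustrative sketch, not the normative NIF recipe: the function names, the `offset_` and `hash_` fragment layouts, the 10-character context window, and the example document URI are all assumptions made for the example.

```python
import hashlib
from urllib.parse import quote


def offset_uri(doc_uri, begin, end):
    """Hypothetical offset-style URI: addresses a substring by character positions."""
    return f"{doc_uri}#offset_{begin}_{end}"


def context_hash_uri(doc_uri, text, begin, end, context_len=10):
    """Hypothetical context-hash-style URI.

    The fragment is derived from the anchored string plus a few characters
    of surrounding context, not from its absolute position in the document.
    """
    left = text[max(0, begin - context_len):begin]
    right = text[end:end + context_len]
    anchor = text[begin:end]
    # Hash the anchored string together with its left and right context.
    digest = hashlib.md5(f"{left}({anchor}){right}".encode("utf-8")).hexdigest()
    # Append a short readable prefix of the anchored string (assumed layout).
    return f"{doc_uri}#hash_{context_len}_{end - begin}_{digest}_{quote(anchor[:20])}"


doc = "http://example.org/doc1"  # assumed document URI
phrase = "NLP tools"

original = "NIF helps to chain NLP tools for the Multilingual Web."
edited = "Note: " + original  # text inserted before the annotated phrase

b1 = original.index(phrase)
b2 = edited.index(phrase)

offset_before = offset_uri(doc, b1, b1 + len(phrase))
offset_after = offset_uri(doc, b2, b2 + len(phrase))
hash_before = context_hash_uri(doc, original, b1, b1 + len(phrase))
hash_after = context_hash_uri(doc, edited, b2, b2 + len(phrase))

print(offset_before)
print(offset_after)
print(hash_before == hash_after)
```

In this sketch the offset-style URI changes as soon as text is inserted earlier in the document, while the hash-style URI stays the same as long as the anchored string and its immediate context are untouched. That is one plausible reading of the stability property mentioned for the second use case.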


