The Never-Ending Language Learning system (NELL) is a semantic machine learning system that, as of 2010, was being developed by a research team at Carnegie Mellon University. It was supported by grants from DARPA, Google, NSF, and CNPq, with portions of the system running on a supercomputing cluster provided by Yahoo!.[1]
NELL was programmed by its developers to be able to identify a basic set of fundamental semantic relationships between a few hundred predefined categories of data, such as cities, companies, emotions and sports teams. Since the beginning of 2010, the Carnegie Mellon research team has been running NELL around the clock, sifting through hundreds of millions of web pages looking for connections between the information it already knows and what it finds through its search process – to make new connections in a manner that is intended to mimic the way humans learn new information.[2] For example, in encountering the word pair "Pikes Peak", NELL would notice that both words are capitalized and deduce from the second word that it was the name of a mountain, and then build on the relationship of words surrounding those two words to deduce other connections.[1]
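The "Pikes Peak" example can be illustrated with a toy sketch in Python. This is not NELL's actual code: the cue table `HEAD_WORD_CATEGORIES` and the function `guess_category` are hypothetical names invented here, and the sketch covers only the two surface cues mentioned above (capitalization and the final word), not the use of surrounding words.

```python
# Hypothetical table mapping a phrase's final word to a category guess.
# These entries are illustrative assumptions, not taken from NELL.
HEAD_WORD_CATEGORIES = {
    "Peak": "mountain",
    "Mountain": "mountain",
    "River": "river",
    "University": "university",
}


def guess_category(phrase):
    """Return a category guess for a capitalized multi-word phrase, or None."""
    words = phrase.split()
    if len(words) < 2:
        return None
    # Cue 1: every word is capitalized, which suggests a proper name.
    if not all(word[0].isupper() for word in words):
        return None
    # Cue 2: the final word hints at the category of the named thing.
    return HEAD_WORD_CATEGORIES.get(words[-1])


print(guess_category("Pikes Peak"))  # prints: mountain
```

A real system would treat such a guess as one weak piece of evidence to be combined with others, rather than as a final answer.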
The goal of NELL and other semantic learning systems, such as IBM's Watson system, is to be able to develop means of answering questions posed by users in natural language with no human intervention in the process.[3] Oren Etzioni of the University of Washington lauded the system's "continuous learning, as if NELL is exercising curiosity on its own, with little human help".[1]
By October 2010, NELL had doubled the number of relationships available in its knowledge base and had learned 440,000 new facts, with an accuracy of 87%.[4][1] Team leader Tom M. Mitchell, chairman of the machine learning department at Carnegie Mellon, described how NELL "self-corrects when it has more information, as it learns more", though it does sometimes arrive at incorrect conclusions. Accumulated errors, such as the deduction that Internet cookies were a kind of baked good, led NELL to deduce from the phrases "I deleted my Internet cookies" and "I deleted my files" that "computer files" also belonged in the baked goods category.[5] Clear errors like these were corrected every few weeks by members of the research team, and the system was then allowed to continue its learning process.[1] By 2018, NELL had "acquired a knowledge base with 120mn diverse, confidence-weighted beliefs (e.g., servedWith(tea,biscuits)), while learning thousands of interrelated functions that continually improve its reading competence over time."[6]
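The way a single mistaken belief can spread, as in the cookies example, can be sketched with a simplified pattern-bootstrapping loop in Python. This is an illustration under stated assumptions, not NELL's algorithm: the corpus `OBSERVATIONS`, the function `bootstrap`, and the rule "trust any pattern seen with a known member" are all invented here, and the sketch omits the confidence weighting that the real system applies.

```python
# Toy corpus of (context pattern, noun phrase) pairs, invented for illustration.
OBSERVATIONS = [
    ("I ate my _", "cookies"),
    ("I ate my _", "muffins"),
    ("I deleted my _", "Internet cookies"),
    ("I deleted my _", "files"),
]


def bootstrap(seed_members, observations, rounds=2):
    """Grow a category from seed members by alternating two steps."""
    members = set(seed_members)
    for _ in range(rounds):
        # Step 1: trust every pattern that co-occurs with a known member.
        patterns = {pattern for pattern, phrase in observations if phrase in members}
        # Step 2: promote every noun phrase seen with a trusted pattern.
        members |= {phrase for pattern, phrase in observations if pattern in patterns}
    return members


# With a correct seed, the category stays within baked goods.
clean = bootstrap({"cookies"}, OBSERVATIONS)
# With the mistaken belief included, the "deleted" pattern becomes trusted
# and drags "files" into the baked goods category.
drifted = bootstrap({"cookies", "Internet cookies"}, OBSERVATIONS)

print("files" in clean)    # prints: False
print("files" in drifted)  # prints: True
```

In this sketch, removing the erroneous seed, much as the research team removed clear errors by hand, is enough to stop the drift.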
As of September 2023, the project's most recently gathered facts dated from February 2019 (according to its Twitter feed)[7] or September 2018 (according to its home page).[8]
In his 2019 book "Human Compatible", Stuart Russell commented that 'Unfortunately NELL has confidence in only 3 percent of its beliefs and relies on human experts to clean out false or meaningless beliefs on a regular basis—such as its beliefs that “Nepal is a country also known as United States” and "value is an agricultural product that is usually cut into basis."'[9] A 2023 paper commented that "While the never-ending part seems like the right approach, NELL still had the drawback that its focus remained much too grounded on object-language descriptions, and relied on web pages as its only source, which significantly influenced the type of grammar, symbolism, slang, etc. analysed."[10]