What is Natural Language Processing (NLP)?

Natural language processing (NLP) is an area of artificial intelligence that helps computers understand human language. Often referred to as the engineering side of computational linguistics, NLP focuses on extracting meaning from unstructured data. NLP includes many different techniques for interpreting human language, ranging from statistical and machine learning methods to rule-based and algorithmic approaches.

NLP has removed many of the barriers between humans and computers, not only enabling them to understand and interact with each other, but also creating new opportunities to augment human intelligence and accomplish tasks that were impossible before. NLP enables real-world applications, including:

  • Automatic summarization: the process of creating a short and coherent version of a longer document. (machinelearningmastery.com)
  • Sentiment analysis: the process of determining the emotional tone behind a series of words, used to gain an understanding of the attitudes, opinions and emotions expressed within an online mention. (brandwatch.com)
  • Named entity recognition (NER): the process of locating and classifying the named entities present in a text. NER classifies entities into pre-defined categories such as the names of persons, organizations, locations, quantities, monetary values, specialized terms, product terminology and expressions of time. (blog.paralleldots.com)
  • Part-of-speech tagging: the process of marking up a word in a text as corresponding to a particular part of speech (noun, verb, adjective, etc.), based on both its definition and its context, i.e., its relationship with adjacent and related words in a phrase, sentence, or paragraph. (en.wikipedia.org)
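
To make the first task in the list concrete, here is a minimal sketch of extractive summarization. It assumes a simple heuristic: sentences whose words occur most often in the document are the most representative. The stop-word list, the function names and the scoring rule are illustrative assumptions, not the method of any particular library, and production summarizers are considerably more sophisticated.

```python
import re
from collections import Counter

# Hypothetical, tiny stop-word list; real systems use much larger ones.
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "to", "and", "in",
              "it", "that", "this", "for", "on", "with"}


def split_sentences(text):
    """Naively split text into sentences on ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def summarize(text, max_sentences=1):
    """Return the highest-scoring sentences, kept in their original order."""
    sentences = split_sentences(text)
    # Document-wide frequency of every non-stop word.
    freq = Counter(w for w in tokenize(text) if w not in STOP_WORDS)

    def score(sentence):
        # Average document frequency of the sentence's non-stop words.
        words = [w for w in tokenize(sentence) if w not in STOP_WORDS]
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:max_sentences])  # restore document order
    return " ".join(sentences[i] for i in chosen)
```

Calling summarize(document, max_sentences=2) on a longer text returns its two highest-scoring sentences in the order they originally appeared.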

Subcategories of NLP include natural language generation (NLG) — a computer’s ability to create communication of its own — and natural language understanding (NLU) — the ability to understand slang, mispronunciations, misspellings, and other variants in language. (cio.com)

Business applications of NLP

NLP has been widely applied across many industries. Some examples include:

  • Enterprise question answering: these tools leverage NLP to enhance customer experience and improve administrative activities by allowing users to ask questions in everyday language about products, services or applications and receive immediate and accurate answers. Many companies are successfully using customer support chatbots to streamline some of the work that would traditionally fall to representatives. Models built with NLP algorithms are the brains of these chatbots. They are trained using text data from past conversations between customer support agents and customers.
  • Optimizing customer satisfaction: manually sifting through product reviews and surveys can be prohibitively time-consuming, but NLP can be used to build data models that generate insights to help optimize customer satisfaction. Sentiment analysis is used to classify text into positive, neutral, or negative categories.
  • Classifying medical records: Researchers at MIT in 2012 were able to attain a 75 percent accuracy rate for deciphering the semantic meaning of specific clinical terms contained in free-text clinical notes, using a statistical probability model to assess surrounding terms and put ambiguous terms into context. (healthitanalytics.com)
  • Ad placement: NLP can help with intelligent advertisement targeting and placement. Media buying is usually the largest item in an organization's advertising budget, so it is important to ensure that each advertisement reaches the right audience. Browsing behavior, social media and emails contain a great deal of embedded information about consumer preferences. NLP can be used to match keywords of interest in these texts in order to target the right consumers. It can also be used for word sense disambiguation, that is, identifying the sense in which a word is used in a sentence.
  • Reputation monitoring: with increased competition in diverse markets, monitoring reputation is essential to avoid being swept away by the tide. With a plethora of information sources about companies, such as social media, blog posts, and reports, it becomes imperative to use these sources to gain insight into a company's reputation and reviews. NLP is well suited to understanding and extracting insights from these sources.
  • Helping hiring managers: NLP can help hiring managers filter resumes. Automated candidate sourcing tools can scan applicants' CVs to extract the required information and pinpoint the candidates who are the right fit for the job. This saves a lot of time and makes the process more efficient.
  • Market intelligence: NLP can help to monitor and track market intelligence. Because markets are influenced by the exchange of information, event extraction can be used to recognize what happens to an entity.
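
As a rough illustration of the positive, neutral, or negative classification mentioned in the customer satisfaction item above, the sketch below counts hits against two small word lists. The word lists and the function name are hypothetical and chosen only for this example. Real sentiment models are usually trained on labeled data rather than hand-written lexicons, and they handle negation and context, which this sketch ignores.

```python
import re

# Hypothetical, hand-picked word lists for illustration only.
POSITIVE = {"good", "great", "excellent", "love", "fast", "helpful"}
NEGATIVE = {"bad", "poor", "terrible", "hate", "slow", "broken"}


def classify_sentiment(review):
    """Label a review positive, negative, or neutral by counting lexicon hits."""
    words = re.findall(r"[a-z']+", review.lower())
    # Each positive word adds one point, each negative word subtracts one.
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


reviews = [
    "Great product, fast delivery and helpful support.",
    "Terrible experience, the app is slow and broken.",
    "The package arrived on Tuesday.",
]
labels = [classify_sentiment(r) for r in reviews]
```

Running a function like this over thousands of reviews gives a quick, coarse view of customer mood. A learned classifier would be the natural next step once labeled examples are available.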

These are only a few examples of the many ways NLP can be used to unlock valuable information from text data. If you want to get started implementing these techniques in your business, the resources below can help you dive deeper into the world of NLP:

Relevant Wikipedia articles

  • Natural language generation (NLG) is the natural language processing task of generating natural language from a machine representation system such as a knowledge base or a logical form. Psycholinguists prefer the term language production when such formal representations are interpreted as models for mental representations.
  • Computational linguistics is an interdisciplinary field concerned with the statistical or rule-based modeling of natural language from a computational perspective, as well as the study of appropriate computational approaches to linguistic questions.
  • Question answering (QA) is a computer science discipline within the fields of information retrieval and natural language processing (NLP), which is concerned with building systems that automatically answer questions posed by humans in a natural language.
  • Textual entailment (TE) in natural language processing is a directional relation between text fragments. The relation holds whenever the truth of one text fragment follows from another text.
  • The bag-of-words model is a simplifying representation used in natural language processing and information retrieval (IR), closely related to the vector space model. In this model, a text (such as a sentence or a document) is represented as the bag (multiset) of its words, disregarding grammar and even word order but keeping multiplicity.
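
The bag-of-words idea is simple enough to sketch in a few lines. The example below uses Python's standard `collections.Counter` as the multiset. The function names, the regular-expression tokenizer and the sample sentences are assumptions made for illustration.

```python
import re
from collections import Counter


def bag_of_words(text):
    """Map each lowercase word in the text to the number of times it occurs."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def to_vector(bag, vocabulary):
    """Turn a bag into a count vector ordered by a fixed vocabulary."""
    return [bag[word] for word in vocabulary]


# Two documents with the same words in a different order.
doc_a = "John likes to watch movies. Mary likes movies too."
doc_b = "Mary likes to watch movies. John likes movies too."

bag_a = bag_of_words(doc_a)
bag_b = bag_of_words(doc_b)

# A shared, sorted vocabulary fixes the position of each word in the vector.
vocabulary = sorted(set(bag_a) | set(bag_b))
vector_a = to_vector(bag_a, vocabulary)
```

Because word order is discarded, the two documents produce identical bags, while the repeated words "likes" and "movies" keep a count of 2. Count vectors like `vector_a` are a common input to text classifiers.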

Open Source NLP Libraries

These libraries provide the algorithmic building blocks of NLP in real-world applications.

  • Apache OpenNLP: a machine learning toolkit that provides tokenizers, sentence segmentation, part-of-speech tagging, named entity extraction, chunking, parsing, and more.
  • Natural Language Toolkit (NLTK): a Python library that provides modules for processing text, classifying, tokenizing, stemming, tagging, parsing, and more.
  • Stanford NLP: a suite of NLP tools that provides part-of-speech tagging, named entity recognition, coreference resolution, sentiment analysis, and more.
  • MALLET: a Java package that provides Latent Dirichlet Allocation, document classification, clustering, topic modeling, information extraction, and more.

NLP Courses

If you are looking to get your feet wet with NLP, these are two popular online courses for beginners:

  • Stanford Natural Language Processing on Coursera: “This course covers a broad range of topics in natural language processing, including word and sentence tokenization, text classification and sentiment analysis, spelling correction, information extraction, parsing, meaning extraction, and question answering. We will also introduce the underlying theory from probability, statistics, and machine learning that are crucial for the field, and cover fundamental algorithms like n-gram language modeling, naive bayes and maxent classifiers, sequence models like Hidden Markov Models, probabilistic dependency and constituent parsing, and vector-space models of meaning.”
  • Udemy’s Introduction to Natural Language Processing: “This course introduces Natural Language Processing through the use of python and the Natural Language Tool Kit. Through a practical approach, you’ll get hands on experience working with and analyzing text. As a student of this course, you’ll get updates for free, which include lecture revisions, new code examples, and new data projects.”

NLP YouTube videos

  • Natural Language Processing with Deep Learning – Stanford University
  • Introduction to Natural Language Processing – Cambridge Data Science Bootcamp
  • Natural Language Generation at Google Research