Measuring corporate reputation relies on understanding how people express views towards companies and topics in speech and writing. The underlying technology is natural language processing, or NLP. The combination of growing data volumes and inexpensive computing power has propelled recent advances in analytics and machine learning, including NLP.
Natural language capabilities for corporate reputation start with identifying entities such as companies, products, or people. Next, reputation systems often measure sentiment in sentences and articles referring to an entity. Basic reputation applications link entities to the sentiment of surrounding words. More advanced solutions also identify topics, and then connect sentiment to topics and topics to entities. For example, a successful drug trial may lift sentiment toward a pharmaceutical company while a lawsuit may depress sentiment for a bank.
Natural language technology has progressed through distinct stages. Initially, linguists proposed representations of language (semantics) based on theories about how humans organize language. Next, early NLP software used complex sets of rules. In the 1980s, the first practical applications emerged using statistical formulas. Many sentiment applications remain at this stage, essentially counting good and bad words around an entity.
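As a rough illustration of the word-counting stage, the sketch below scores an entity by counting positive and negative words within a few tokens of each mention. The word lists, the `window_sentiment` function, and the window size are hypothetical choices for this example, not a description of any particular product.

```python
import re

# Hypothetical, tiny word lists for illustration; real lexicons are far larger.
POSITIVE = {"successful", "strong", "growth", "approved"}
NEGATIVE = {"lawsuit", "fraud", "recall", "losses"}

def window_sentiment(text, entity, window=5):
    """Count positive minus negative words within `window` tokens of each mention.

    Assumes the entity is a single word; multi-word names would need phrase matching.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    target = entity.lower()
    score = 0
    for i, token in enumerate(tokens):
        if token != target:
            continue
        # Look at the tokens on either side of this mention.
        start, end = max(0, i - window), i + window + 1
        for neighbor in tokens[start:end]:
            if neighbor in POSITIVE:
                score += 1
            elif neighbor in NEGATIVE:
                score -= 1
    return score

print(window_sentiment("Regulators filed a lawsuit against Acme over alleged fraud.", "Acme"))
```

An approach like this cannot tell "not successful" from "successful", which is one reason later stages moved toward models that read words in context.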
In the 1990s, technology progressed to machine learning. Techniques like random forests and support vector machines find relationships across large sets of variables. More recently, the term artificial intelligence is often applied to a form of machine learning called deep learning, which is loosely modeled on the human brain. Data passes through many (deep) layers of connected cells, each of which applies a weight to its inputs. This neural network incrementally transforms input values into a resulting category.
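To make the idea of layers and weights concrete, here is a minimal sketch of a two-layer network that turns two input values into a category. The weights are hand-picked for illustration and every name is hypothetical; a real network learns its weights from training data and has far more cells and layers.

```python
import math

def dense(inputs, weights, biases):
    """One layer: each cell computes a weighted sum of its inputs plus a bias."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def relu(values):
    """Pass positive values through and set negative values to zero."""
    return [max(0.0, v) for v in values]

def softmax(values):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical hand-picked weights; training would normally set these.
HIDDEN_W, HIDDEN_B = [[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0]
OUTPUT_W, OUTPUT_B = [[2.0, 0.0], [0.0, 2.0]], [0.0, 0.0]
LABELS = ["positive", "negative"]

def classify(features):
    """features = [positive word count, negative word count] (an assumed encoding)."""
    hidden = relu(dense(features, HIDDEN_W, HIDDEN_B))
    probs = softmax(dense(hidden, OUTPUT_W, OUTPUT_B))
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs

label, probs = classify([3.0, 1.0])
print(label, probs)
```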
alva’s neural networks start with a technique called transfer learning. Rather than starting with a blank set of cells, we use pre-trained neural networks with a map of how words relate to each other. Using Wikipedia as a corpus, or body of text, the system applies the ULMFiT language model to seed the network with a map of a selected language.
Next, we apply supervised machine learning techniques to train the pre-seeded neural network to quantify sentiment appropriately for corporate reputation intelligence. Words tend to hold meaning based on their context. We apply a form of neural network called Long Short-Term Memory, or LSTM, that assesses natural language in context, looking across sentences, paragraphs, and articles.
Combining transfer learning with an advanced form of neural network yields more accurate results and allows models to be trained effectively on a smaller training data set than prior forms of neural networks required.
Overall, machine learning improves on earlier approaches with:
- Accuracy, and finding nuanced relationships in data in real time
- Robustness to measure broad forms of data including incomplete or malformed inputs
- Continual improvement based on a feedback loop
From a business perspective, machine learning supports more confident data-driven decisions. A well-structured model also makes it easier to find relationships between reputation and related data such as customer satisfaction, employee engagement, or brand perception. Finally, machine learning facilitates looking broadly across content types, languages, and global regions.
Machine learning brings clear advantages but should be used selectively. Many steps in quantifying reputation can be performed faster and with less cost and operational risk through filtering and traditional statistical models. For example, consider identifying content about the company Amazon and excluding references to the Amazon River. Parsing the text in conventional software code to detect if “Amazon” is next to “River” is easier and less expensive to implement, operate, and maintain than using a machine learning model.
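A filter of this kind can be a few lines of conventional code. The sketch below is one possible version, assuming a regular expression is sufficient; the `mentions_company` function and the list of excluded terms are hypothetical and would need tuning against real content.

```python
import re

# Match "Amazon" unless it is directly followed by a geographic term.
# The excluded terms are an illustrative assumption, not a complete list.
AMAZON_COMPANY = re.compile(
    r"\bAmazon\b(?!\s+(?:River|Rainforest|Basin)\b)",
    re.IGNORECASE,
)

def mentions_company(text):
    """Return True if the text has at least one non-geographic mention of Amazon."""
    return AMAZON_COMPANY.search(text) is not None

print(mentions_company("Amazon reported quarterly earnings."))
print(mentions_company("The Amazon River flows through Brazil."))
```

A rule like this is cheap to run and easy to audit, though it will miss cases the pattern does not anticipate, such as "the Amazon" used alone to mean the river.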
In practice, mature natural language platforms that solve real world business needs tend to combine filters, statistical models, and machine learning. alva’s platform incorporates many years of practical lessons and incremental improvements to optimally match each step to an efficient implementation.
Machine learning presents at least three challenges compared to prior technology:
- Managing training and test data: machine learning works by training a model to relate inputs to a “correct” answer. However, people rarely completely agree on language meaning and sentiment. Averaging many people’s views of sentiment tends to pull ratings toward neutral, which is ineffective for training a model.
- Complexity: selecting, implementing, and training machine learning algorithms involves more moving parts and decisions than prior statistical methods.
- Operational costs: machine learning models are more computationally intensive, and so demand more expensive computing power.
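The training data challenge can be seen with simple arithmetic. In the sketch below, the ratings are invented for illustration on an assumed scale from -2 (very negative) to +2 (very positive), and `summarize` is a hypothetical helper. When annotators disagree strongly, the average lands at neutral even though no one rated the text as neutral in spirit; the spread reveals the disagreement that the average hides.

```python
from statistics import mean, pstdev

def summarize(ratings):
    """Return (average rating, spread) for one text's annotator ratings."""
    return mean(ratings), pstdev(ratings)

# Hypothetical ratings from five annotators for two different texts.
polarized = [2, -2, 2, -2, 0]   # annotators disagree strongly
agreed = [2, 2, 1, 2, 2]        # annotators largely agree

for name, ratings in [("polarized", polarized), ("agreed", agreed)]:
    average, spread = summarize(ratings)
    print(name, round(average, 2), round(spread, 2))
```

One possible response, under these assumptions, is to track the spread alongside the average and review high-disagreement items before using them as training labels.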
Artificial intelligence must be thoughtfully implemented, but client results show that benefits clearly outweigh costs. Accuracy, a comprehensive view across sources, and integration with other KPIs enable corporate communications and investor relations decisions that achieve concrete impact.