What is text mining?

Your business can gain valuable insights by letting AI analytics loose on your emails and documents

Analytics, the process of deriving information from raw data, is an important practice that businesses have tried to master for as long as such data has been available. In a short space of time, it's evolved from a fairly basic concept to an advanced practice incorporating technologies like machine learning and artificial intelligence (AI).

Text mining is one such evolution. It takes the basic idea of deriving information from data and applies it to vast volumes of documents, letters, emails and other written material. As with conventional analytics, the aim of text mining is to convert raw data into meaningful information which can then be used to support other processes.

Text mining has only really been possible thanks to the advent of AI and more specialist technology like natural language processing (NLP), given that to produce effective results you need to trawl through vast quantities of data at pace. If deployed correctly, text mining has the potential to open new insights for your organisation.

How does text mining work?

Before your organisation can take advantage of text mining, any text-based data needs to be structured – in other words, text mining is a secondary process. For example, data contained in streams of uncategorised documents is considered unstructured.

To give this kind of data structure, businesses often deploy relational databases, where the data is organised based on connections between stored items. Getting there involves a variety of processes, such as text parsing or pattern analysis, before the data is considered structured. Once in this form, the data can be translated into something more visually appealing, such as charts, maps and tables.
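
As a rough illustration of what "parsing" can mean here, the sketch below turns a few free-form messages into structured records and then counts terms across them. The message layout, the sample data and the names `RAW_MESSAGES`, `LINE_PATTERN`, `parse_message` and `structure` are all assumptions invented for this example; real documents are far messier and usually need more than a single regular expression.

```python
import re
from collections import Counter

# Hypothetical raw input, invented for illustration: sender | date | free text.
RAW_MESSAGES = [
    "From: anna@example.com | 2020-07-01 | The delivery was late and the box was damaged.",
    "From: ben@example.com | 2020-07-02 | Great service, delivery arrived early!",
    "From: anna@example.com | 2020-07-05 | Refund received, thank you.",
]

# Assumed line layout; named groups become the fields of each record.
LINE_PATTERN = re.compile(
    r"From:\s*(?P<sender>\S+)\s*\|\s*(?P<date>\d{4}-\d{2}-\d{2})\s*\|\s*(?P<body>.+)"
)


def parse_message(line):
    """Turn one raw line into a structured record, or None if it does not match."""
    match = LINE_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    # Lower-case word tokens make later counting and pattern analysis easier.
    record["tokens"] = re.findall(r"[a-z']+", record["body"].lower())
    return record


def structure(lines):
    """Parse every line, silently skipping those that do not fit the layout."""
    return [r for r in (parse_message(line) for line in lines) if r is not None]


records = structure(RAW_MESSAGES)
term_counts = Counter(token for r in records for token in r["tokens"])
```

Each record is now a dictionary with `sender`, `date`, `body` and `tokens` fields, which maps naturally onto a row in a relational table, while `term_counts` is the kind of summary that could feed a chart.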

Natural language processing is one key method used to structure raw text-based big data. This technology uses data in combination with algorithms to add context to the way machines try to understand human language. It essentially tries to replicate the process by which a human might read text, and often serves to understand and define potentially ambiguous words, like 'bow' for example. It's also embedded in most AI-powered virtual assistants, like Apple's Siri or Amazon's Alexa.
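
To make the 'bow' example concrete, here is a minimal sketch of one way a program might pick a meaning from context: it counts how many cue words for each sense appear in the sentence. This is a simplified overlap heuristic, loosely in the spirit of dictionary-based disambiguation, and not how any particular assistant works. The `SENSES` table and the `disambiguate` function are hypothetical and hand-written for this example.

```python
import re

# Hand-written cue words for each sense of "bow"; purely illustrative.
SENSES = {
    "weapon": {"arrow", "archer", "shoot", "string", "target"},
    "ribbon": {"ribbon", "gift", "tie", "hair", "knot"},
    "gesture": {"audience", "stage", "respect", "bend", "applause"},
    "ship": {"ship", "boat", "stern", "deck", "sail"},
}


def disambiguate(sentence, senses=SENSES):
    """Return the sense whose cue words overlap most with the sentence, or None."""
    context = set(re.findall(r"[a-z]+", sentence.lower()))
    scores = {name: len(context & cues) for name, cues in senses.items()}
    best = max(scores, key=scores.get)
    # With no overlapping cue words at all, decline to guess.
    return best if scores[best] > 0 else None


weapon_sense = disambiguate("The archer drew his bow and let the arrow fly")
ribbon_sense = disambiguate("She tied a bow in the ribbon on the gift")
```

Production NLP systems typically learn these associations from large amounts of text rather than relying on a hand-built table, but the underlying idea of using surrounding words as evidence is the same.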

NLP is deployed as part of this process to churn through reams of documentation in a way that would otherwise be too costly and time-consuming for any human, identifying the most relevant and important nuggets of information, based on any particular request.

One important branch of text mining is sentiment analysis, which involves combing through vast quantities of documentation to summarise how certain groups of people, either customers or employees, feel towards a certain issue. This could be used to learn how customers feel toward a brand, such as using text mining on web forums, or can be used to assess worker morale by subjecting internal emails to analysis.

Relationships, patterns and key facts are isolated and then turned into structured data so that AI can conduct further analysis on the data and identify insights based on what was demanded in the first place.

Benefits of sentiment analysis

Once sorted into a more structured format, the data can then be exposed to algorithms designed to give businesses high-quality insights that would be impractical to glean through human-led analysis.

Sentiment analysis is one key application of text mining that can give businesses a picture of how people think and feel about a company, or a particular aspect of a company. The insights could range from customer attitudes towards a brand to the morale of employees within the organisation.

In the former example, the text absorbed into the text mining process might come from online reviews, social media, customer interactions via email, as well as call centre interactions. These can be turned into data points to identify patterns that point to common threads in the way people perceive a certain brand. The information can then be presented in such a way as to devise strategies to solve negative branding and improve standards and practices.
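
The idea of turning feedback into data points can be sketched with a very small lexicon-based scorer. Everything here is an assumption made for illustration: the word lists, the sample feedback and the names `sentiment_score` and `average_by_source` are invented, and commercial tools generally rely on much larger lexicons or trained models.

```python
import re
from collections import defaultdict

# Tiny illustrative word lists; real lexicons contain thousands of entries.
POSITIVE = {"great", "good", "love", "helpful", "fast", "excellent"}
NEGATIVE = {"bad", "slow", "broken", "late", "rude", "terrible"}

# Hypothetical feedback as (source, text) pairs, invented for this example.
FEEDBACK = [
    ("online_review", "Great product and fast delivery"),
    ("online_review", "Terrible support, slow and rude"),
    ("email", "The app is good but the update was late"),
    ("call_centre", "The agent was helpful"),
]


def sentiment_score(text):
    """Score text from -1.0 (all negative cues) to 1.0 (all positive cues)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    pos = sum(token in POSITIVE for token in tokens)
    neg = sum(token in NEGATIVE for token in tokens)
    total = pos + neg
    # Text with no sentiment-bearing words is treated as neutral.
    return 0.0 if total == 0 else (pos - neg) / total


def average_by_source(items):
    """Average the sentiment score of every piece of feedback per source."""
    grouped = defaultdict(list)
    for source, text in items:
        grouped[source].append(sentiment_score(text))
    return {source: sum(vals) / len(vals) for source, vals in grouped.items()}


averages = average_by_source(FEEDBACK)
```

Grouping by source is one simple way to spot a common thread, for example that complaints cluster in one channel, although a word-counting approach like this will miss sarcasm, negation and context.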

This form of data analytics can also be applied within an organisation to monitor the way that workers interact with each other through workspace applications like Slack or Microsoft Teams, as well as email. This is so that an organisation can determine how employees are feeling towards the leadership, for instance, and use this information to find ways to boost morale or build trust in areas where it may be lacking.

The Enron effect

The infamous Enron scandal of 2001 offers a fascinating case study of how text mining technology could potentially help prevent an enterprise from completely imploding. More than a decade after the energy firm declared bankruptcy, a text-mining company, KeenCorp, managed to acquire a trove of emails covering the period of the scandal and the preceding few years.

These emails, sent by and between the company's top 150 executives, were passed through KeenCorp's text mining system and then fed through an algorithm designed to assess company morale.

The software managed to pinpoint the exact date when things went south: 28 June 1999. This date was significant because it was when Enron's board discussed 'LJM', a proposal that would hide the company's poor financial situation. This proposal would eventually contribute to the firm's downfall.
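
KeenCorp's algorithm is not described here, so the following is only a generic sketch of how a date might be singled out from a series of daily morale scores: it looks for the day whose score falls furthest below the average of the days immediately before it. The scores, the dates and the `sharpest_drop` function are invented for illustration and have nothing to do with the Enron data.

```python
from datetime import date
from statistics import mean

# Hypothetical daily morale scores, invented for this example.
DAILY_SCORES = [
    (date(2020, 3, 2), 0.42),
    (date(2020, 3, 3), 0.45),
    (date(2020, 3, 4), 0.40),
    (date(2020, 3, 5), 0.44),
    (date(2020, 3, 6), 0.05),
    (date(2020, 3, 9), 0.38),
    (date(2020, 3, 10), 0.41),
]


def sharpest_drop(scores, window=3):
    """Find the day that falls furthest below the mean of the previous `window` days.

    Returns (day, drop), or (None, 0.0) if no day falls below its baseline.
    """
    best_day, best_drop = None, 0.0
    for i in range(window, len(scores)):
        baseline = mean(score for _, score in scores[i - window:i])
        drop = baseline - scores[i][1]
        if drop > best_drop:
            best_day, best_drop = scores[i][0], drop
    return best_day, best_drop


drop_day, drop_size = sharpest_drop(DAILY_SCORES)
```

Comparing each day with a short trailing average, rather than with the overall mean, keeps the check sensitive to sudden changes; the window length of three days is an arbitrary choice for this sketch.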

The above example is just an experiment, but it points to how emerging technologies could make big strides in industries that are producing rapidly growing volumes of data but finding it challenging to make the most of them.
