What is text mining?

Your business can gain valuable insights by letting AI analytics loose on your emails and documents

Analytics, the process of deriving information from raw data, is an important practice that businesses have tried to master for as long as such data has been available. In a short space of time, it's evolved from a fairly basic concept to an advanced practice incorporating technologies like machine learning and artificial intelligence (AI).

Text mining is yet another step forward in the wider field of data analytics, taking the idea of deriving useful information from data but applying this to vast volumes of documents, letters, emails, and other written material. The aim, as with conventional data analytics, is to derive meaning from the raw data available and use this to either enhance processes or improve an organisation's services.

One branch of this, sentiment analysis, involves combing through vast quantities of documentation to summarise how certain groups of people, either customers or employees, feel towards a certain issue. This could be used to learn how customers feel towards a brand, for example by applying text mining to web forums, or to assess worker morale by subjecting internal emails to analysis.

It's an emerging technology that's powered by AI, but also incorporates technologies like natural language processing (NLP) to give nuanced results. Ultimately, deploying text mining software in an effective way could lead your organisation to gain new insights on age-old questions that were incredibly difficult to answer before. This is because the work that's involved is massive. The sheer amount of text that employees would have to go through, let alone glean useful information from, rendered it impossible in the past.

How does text mining work?

Before any advanced analytics can be applied to sample text, it must first be turned into a 'structured' form of data. Streams of writing laid out in documents are considered to be unstructured data, which must first be turned into structured data points before any high-quality insights can be gained.

This usually takes the shape of relational databases, in which the data is loaded in such a way that connections between stored items of information can be identified. The data records could contain facts as well as text strings, but are put into this form to allow for simpler future analysis. This process will involve a variety of methods, such as parsing the text and deriving patterns and key information, before storing the results in a structured form. The data can then be presented in a host of more visually appealing ways than a database, like charts, maps and tables.
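As a rough illustration of that structuring step, the sketch below parses two made-up messages and loads them into an in-memory relational database using Python's built-in sqlite3 module. The message layout, the table names (message, term) and the sample text are all assumptions chosen for the example, not a description of any particular text mining product.

```python
import re
import sqlite3

# Hypothetical raw documents; the "From / Date / free text" layout is an assumption.
RAW_DOCS = [
    "From: alice@example.com\nDate: 2020-03-01\nThe new invoice portal is slow and confusing.",
    "From: bob@example.com\nDate: 2020-03-02\nGreat support call today, the portal issue was fixed.",
]

# Pattern that pulls the sender, the date and the remaining free text out of one document.
HEADER = re.compile(
    r"^From: (?P<sender>\S+)\nDate: (?P<date>\d{4}-\d{2}-\d{2})\n(?P<body>.*)$",
    re.DOTALL,
)


def parse(doc):
    """Split one raw document into (sender, date, body)."""
    match = HEADER.match(doc)
    if match is None:
        raise ValueError("unrecognised document layout")
    return match.group("sender"), match.group("date"), match.group("body").strip()


def load(docs):
    """Store parsed documents in two related tables: messages and the words they contain."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE message (id INTEGER PRIMARY KEY, sender TEXT, sent_on TEXT, body TEXT)"
    )
    conn.execute(
        "CREATE TABLE term (message_id INTEGER REFERENCES message(id), word TEXT)"
    )
    for doc in docs:
        sender, sent_on, body = parse(doc)
        cur = conn.execute(
            "INSERT INTO message (sender, sent_on, body) VALUES (?, ?, ?)",
            (sender, sent_on, body),
        )
        # One row per distinct lower-cased word, linked back to its message.
        words = sorted(set(re.findall(r"[a-z]+", body.lower())))
        conn.executemany(
            "INSERT INTO term VALUES (?, ?)", [(cur.lastrowid, w) for w in words]
        )
    conn.commit()
    return conn


conn = load(RAW_DOCS)
# Once the text is structured, a plain SQL join can answer "who mentioned the portal?"
rows = conn.execute(
    "SELECT m.sender FROM message m JOIN term t ON t.message_id = m.id "
    "WHERE t.word = 'portal' ORDER BY m.sender"
).fetchall()
print(rows)  # [('alice@example.com',), ('bob@example.com',)]
```

Real documents rarely follow such a tidy layout, which is where the more advanced parsing and pattern-finding methods come in.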

Natural language processing is one key method used to structure raw text-based big data. This technology uses data in combination with algorithms to add context to the way machines try to understand written and spoken language. It essentially tries to replicate the process by which a human might read text, and often serves to resolve potentially ambiguous words, like 'bow' for example. It's also embedded in most AI-powered virtual assistants, like Apple's Siri or Amazon's Alexa.
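To show how context can settle the meaning of a word like 'bow', here is a toy sketch of a simplified Lesk-style approach: it picks whichever meaning has a definition sharing the most words with the surrounding sentence. The four definitions and the stopword list are hand-written assumptions for this example; a production NLP system would rely on a proper lexical resource and far richer models.

```python
import re

# Hand-written definitions of 'bow' (assumptions for illustration only).
SENSES = {
    "weapon": "a curved weapon used with a string to shoot arrows in archery",
    "ship": "the front part of a ship or boat",
    "gesture": "to bend the head or body forward as a sign of respect or greeting",
    "knot": "a decorative knot tied with loops in a ribbon or shoelace",
}

# Common words that carry no meaning on their own and are ignored when comparing.
STOPWORDS = {"a", "an", "the", "of", "to", "in", "or", "as", "with", "and", "his", "her"}


def tokens(text):
    """Lower-case the text and return its set of meaningful words."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def disambiguate(sentence, senses=SENSES):
    """Return the sense whose definition overlaps most with the sentence.

    Ties (including no overlap at all) fall back to the first sense listed.
    """
    context = tokens(sentence)
    return max(senses, key=lambda name: len(context & tokens(senses[name])))


print(disambiguate("The archer drew his bow and let the arrows fly"))  # weapon
print(disambiguate("Waves crashed over the bow of the ship"))          # ship
```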

NLP is deployed as part of this process to churn through reams of documentation in a way that would otherwise be too costly and time-consuming for any human, identifying the most relevant and important nuggets of information, based on any particular request.

Relationships, patterns and key facts are isolated and then turned into structured data so that AI can conduct further analysis on the data and identify insights based on what was demanded in the first place.

Benefits of sentiment analysis

Once sorted into a more structured format, the data can then be exposed to algorithms designed to give businesses high-quality insights that were impractical to glean through human-led analysis.

Sentiment analysis is one key application of text mining that can give businesses a picture of the thoughts and feelings people hold about a company, or a particular aspect of a company. The insights could range from customer attitudes towards a brand to the morale of employees within the organisation.

In the former example, the text absorbed into the text mining process might come from online reviews, social media, customer interactions via email, as well as call centre interactions. These can be turned into data points to identify patterns that point to common threads in the way people perceive a certain brand. The information can then be used to devise strategies to address negative perceptions of the brand and improve standards and practices.
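A minimal sketch of how such feedback might be turned into data points is shown below. It scores each comment by counting words from two small, hand-picked word lists and then averages the scores for each source. The word lists, the sample comments and the function names are all assumptions for illustration; commercial sentiment analysis tools use much larger vocabularies and trained models.

```python
import re
from collections import defaultdict

# Tiny hand-picked word lists (assumptions for this example only).
POSITIVE = {"great", "love", "helpful", "fast", "reliable"}
NEGATIVE = {"slow", "broken", "rude", "expensive", "confusing"}

# Hypothetical customer feedback, tagged with where it came from.
FEEDBACK = [
    ("online review", "Great product, fast delivery and helpful staff."),
    ("social media", "The app is slow and the checkout is broken."),
    ("email", "Support was helpful but the renewal is expensive."),
    ("social media", "Love the new design!"),
]


def score(text):
    """Count positive words minus negative words in one comment."""
    words = re.findall(r"[a-z]+", text.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def average_by_source(feedback):
    """Group comment scores by source and return the mean score for each."""
    scores = defaultdict(list)
    for source, text in feedback:
        scores[source].append(score(text))
    return {source: sum(vals) / len(vals) for source, vals in scores.items()}


print(average_by_source(FEEDBACK))
# {'online review': 3.0, 'social media': -0.5, 'email': 0.0}
```

Simple word counting of this kind misses sarcasm and negation ("not great"), which is one reason real systems lean on NLP to interpret context.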

This form of data analytics can also be applied within an organisation to monitor the way that workers interact with each other through workspace applications like Slack or Microsoft Teams, as well as email. This is so that an organisation can determine how employees are feeling towards the leadership, for instance, and use this information to find ways to boost morale or build trust in areas where it may be lacking.

The Enron effect

The infamous Enron scandal of 2001 provides a fascinating case study of how text mining technology could potentially help prevent an enterprise from completely imploding. More than a decade after the energy firm declared bankruptcy, a text-mining company, KeenCorp, managed to acquire a trove of emails dating back to the time of the scandal and the preceding few years.

These emails, sent by and between the company's top 150 executives, were passed through KeenCorp's text mining system and then fed through an algorithm designed to assess company morale.

The software managed to pinpoint the exact date where things went south: 28 June 1999. This date was significant because it was the day that Enron's board had discussed 'LJM', a proposal that would hide the company's poor financial situation. This proposal would eventually contribute to the firm's downfall.
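How KeenCorp's system arrived at that date isn't detailed here, but the general idea of spotting a sudden shift in a morale score can be sketched with a simple before-and-after comparison. The example below is a generic illustration, not KeenCorp's method: the daily scores, the dates and the sharpest_drop function are all invented for the purpose.

```python
from datetime import date
from statistics import mean

# Invented daily morale scores (higher is better); not real Enron or KeenCorp data.
DAILY_SCORES = [
    (date(2020, 1, 6), 0.62),
    (date(2020, 1, 7), 0.60),
    (date(2020, 1, 8), 0.64),
    (date(2020, 1, 9), 0.31),
    (date(2020, 1, 10), 0.28),
    (date(2020, 1, 13), 0.30),
]


def sharpest_drop(series, window=2):
    """Find the day on which the average score falls the most.

    For each candidate day, compare the mean of the `window` days before it
    with the mean of the `window` days starting on it.
    Returns (day, size_of_drop), or (None, 0.0) if the score never falls.
    """
    best_day, best_drop = None, 0.0
    for i in range(window, len(series) - window + 1):
        before = mean(score for _, score in series[i - window:i])
        after = mean(score for _, score in series[i:i + window])
        drop = before - after
        if drop > best_drop:
            best_day, best_drop = series[i][0], drop
    return best_day, best_drop


day, drop = sharpest_drop(DAILY_SCORES)
print(day, round(drop, 3))  # 2020-01-09 0.325
```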

The above example is just an experiment, but it points to how emerging technologies could make big strides in industries that are producing rapidly growing volumes of data but finding it challenging to make the most of it.


