Text Sentiment Analytics


Architecture, Data – Pipeline, Analytics: Gus Segura

Text and sentiment data are becoming an increasingly important data set in the corporate enterprise. The metrics and analysis help drive the customer journey.  We use this information daily to create and enhance the customer experience through personalization and by responding to customer feedback.  I build these systems for my clients to improve marketing efficacy, drive new content creation, personalize search responses and understand areas where we need to improve.

It is as important to understand the Data Journey as the Customer Journey.  The action window and the impact both depend on the Data Journey, and on the timeliness and cleanliness of the data. The following are a few ideas and considerations for ingesting text-based data, planning capacity and defining metrics.

Whether you're a small business or a large enterprise, you can benefit from analyzing your customers' responses to your goods and services. Please send mail if you need help or have questions about your text analysis system.


  • Cloud vs On-Premise vs Hybrid
  • Server vs Serverless
  • Staged [ Data Lakes ] vs Pipeline
  • Big Data Hadoop vs Relational [RDBMS] vs CAP Document Stores (MongoDB) [Clustered]
  • Processes [In-Process] (Dynamic) [Pipelining] vs Batch [Yes, Margaret, There is still batch]

How you architect your systems is really up to you. In most cases you have all the power, whether you're already in the cloud or on-premise, and some of the topics above should resonate.  You will hear everyone trying to push you in one direction or another. People often ask me, "Where do we start?" or "Do we need to build in the Cloud first?" Start with a plan.  How big or small your system will be, and where it will run, is ultimately up to you and your needs.

Slow down and think about what you are trying to achieve.  Examples:

We’re building a text analysis platform to improve our CRM relationships

We want to know what people are saying about our products

We want to know how people describe a product when they don’t really know the name or brand

Got it?

Whiteboard with your team. Understand what you're trying to build, how it will be used, how it will be supported and how the new system will bring value to your enterprise.  You should find a stopping point for the current discussion, yet know that architecture will usually be a dynamic, living 'thing' and/or idea.

Side Note: There are many (small, medium and large) enterprises purchasing server hardware and building their own private clouds based on microservices architectures (Docker, Kubernetes).  They are optimizing spend on that hardware while at the same time building capacity to expand into the Cloud as needed.

Your take-away from your initial discussions is a set of metrics that anyone should be able to understand and explain.  Example:

We process 10 billion messages a day from 2.5 million users. We generate 2-15 TB of raw text-based data related to 115K topics in 16 languages. 52% of the text traffic is generated in English, 28% is generated in Spanish, with the remaining languages distributed about evenly…etc. Our data pipeline aggregates, filters and cleans our inbound data set to reduce the inbound data to a manageable 2-150 GB.  As the data moves through the pipeline, it's cleaned and changed from unstructured text to a semi-structured and structured data model…
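The aggregate, filter and clean steps in that narrative can be sketched as a chain of small stages. This is only a minimal Python sketch, not the pipeline from any real system: the function names (clean, keep_relevant, structure, run_pipeline), the record fields (user, lang, text) and the min_chars threshold are all made up for illustration.

```python
import re

def clean(messages):
    """Strip tag-like debris and collapse runs of whitespace."""
    for msg in messages:
        text = re.sub(r"<[^>]+>", " ", msg["text"])
        text = re.sub(r"\s+", " ", text).strip()
        yield {**msg, "text": text}

def keep_relevant(messages, min_chars=3):
    """Drop near-empty messages and exact (case-insensitive) duplicates.

    The in-memory `seen` set is fine for a toy example only; at real
    volumes this step would need a different de-duplication strategy.
    """
    seen = set()
    for msg in messages:
        key = msg["text"].lower()
        if len(key) < min_chars or key in seen:
            continue
        seen.add(key)
        yield msg

def structure(messages):
    """Turn free text into a semi-structured record (hypothetical fields)."""
    for msg in messages:
        yield {
            "user": msg["user"],
            "lang": msg.get("lang", "unknown"),
            "text": msg["text"],
            "word_count": len(msg["text"].split()),
        }

def run_pipeline(raw_messages):
    """Chain the stages so each message is processed as it flows through."""
    return list(structure(keep_relevant(clean(raw_messages))))

if __name__ == "__main__":
    raw = [
        {"user": "u1", "lang": "en", "text": "  Great   <b>product</b> "},
        {"user": "u2", "text": "great product"},   # duplicate after cleaning
        {"user": "u3", "text": "   "},             # empty after cleaning
    ]
    for record in run_pipeline(raw):
        print(record)
```

Each stage is a generator, so messages are cleaned and filtered in process rather than staged in full between steps. Comparing the record count going in with the count coming out gives you a reduction metric for the pipeline itself.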



The metrics are key: "Things you can measure will drive your decisions". Text metrics are like a puzzle. You can start by building a frame (the edge pieces), you can start by looking at the picture, and/or you can just brute-force fitting the pieces together.  I like to spend some time Knolling: "Knolling is the process of arranging related objects in parallel or 90-degree angles as a method of organization".  In Text Analytics, Knolling is putting down or quantifying all the metrics you already have and/or can data munge before you apply machine learning.


We have 800 CRM agents in 5 data centers located in the US, Canada and the Philippines.  They process 80-100 calls per day.  We create 300 characters of text to annotate each call.  Our IVR system allows customers to leave feedback of up to 1000 characters of text. Do the math. Talk with the call center supervisors and understand how the system works.  You're going to finish with some very interesting metrics around call volume, duration of call, actions taken and more, based on the 100-300 characters of text entered by your CSRs.
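Here is one way to "do the math" for that call center, as a rough Python sketch. It rests on two assumptions that the numbers above do not settle: that 80-100 calls per day is per agent, and that every call gets the full 300 characters of annotation. IVR feedback is left out because we don't know how many callers leave any.

```python
# Back-of-the-envelope daily text volume for the call center example.
# Assumptions (not stated in the text): calls are counted per agent,
# and every call is annotated with the full 300 characters.
AGENTS = 800
CALLS_PER_AGENT_PER_DAY = (80, 100)   # (low, high)
CHARS_PER_ANNOTATION = 300

def daily_annotation_chars(agents, calls_per_agent, chars_per_call):
    """Characters of agent-entered annotation text produced per day."""
    return agents * calls_per_agent * chars_per_call

low_calls, high_calls = (AGENTS * c for c in CALLS_PER_AGENT_PER_DAY)
low_chars = daily_annotation_chars(AGENTS, CALLS_PER_AGENT_PER_DAY[0], CHARS_PER_ANNOTATION)
high_chars = daily_annotation_chars(AGENTS, CALLS_PER_AGENT_PER_DAY[1], CHARS_PER_ANNOTATION)

print(f"calls per day: {low_calls:,} to {high_calls:,}")
print(f"annotation characters per day: {low_chars:,} to {high_chars:,}")
```

Under those assumptions that is 64,000 to 80,000 calls and roughly 19 to 24 million characters of annotation text per day, which is small enough to explore on a single machine before you invest in anything bigger.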

That was a simple example with a finite call center.  Try identifying all the known metrics from 2.5 million mobile users with several social media accounts blogging, posting and pinning about your product and/or service, along with reviews on your corporate website.  That is going to be A LOT of text data, and it's not going to be as pretty or as clean as what a CSR would enter at a call center. [Note: that is why we pipeline data: clean and filter in process.]

Usage: Text Metrics – Analytics

How do you drive insights with Text Metrics and Analytics?

What do the data sets look like?

How are they used?

Metrics – Metrics – Metrics …

In web analytics, there are common metrics like:

  • page-views
  • click-through-rate
  • session time

In Text Analytics we have similar metrics:

  • word-count
  • word-frequency
  • unique-stem-words
  • meaning (lol, that’s a hard one) [what do you mean by that?]
  • sentiment (negative, neutral, positive)
  • post-frequency
  • organic (a measure of the probability that the author is a human rather than a bot)
  • subjectivity vs objectivity
  • (There are several others)…
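A few of the metrics in this list can be computed with nothing more than the Python standard library. The sketch below is deliberately naive and is not a recommended implementation: naive_stem is a crude suffix stripper standing in for a real stemmer, and the POSITIVE and NEGATIVE word sets are a made-up toy lexicon standing in for a real sentiment model.

```python
import re
from collections import Counter

# Hypothetical toy lexicon, for illustration only.
POSITIVE = {"great", "amazing", "love", "loved"}
NEGATIVE = {"bad", "poor", "broken"}

def naive_stem(word):
    """Strip one common suffix, keeping at least three characters."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def sentiment_label(words):
    """Label text by counting lexicon hits: positive minus negative."""
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

def text_metrics(text):
    """Return word-count, word-frequency, unique-stem-words and sentiment."""
    words = re.findall(r"[a-z']+", text.lower())
    return {
        "word_count": len(words),
        "word_frequency": Counter(words),
        "unique_stem_words": len({naive_stem(w) for w in words}),
        "sentiment": sentiment_label(words),
    }

if __name__ == "__main__":
    review = "Great product, amazing quality. Loved the value, loving the products!"
    metrics = text_metrics(review)
    print(metrics["word_count"], metrics["unique_stem_words"], metrics["sentiment"])
    print(metrics["word_frequency"].most_common(3))
```

Even numbers this crude give you something to trend per segment and per channel. Meaning, organic and subjectivity are left out on purpose, since those are the metrics that really do need more than counting.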

Example of a completed metrics narrative:

We are seeing an uptick in product reviews from 18-25 year olds in Sporting Goods. There is a statistically significant increase in both frequency and word-count.  The general sentiment is positive in this segment as well.  The number of reviews has increased in the following social media channels (a list of channels). Here is a sample of the stem words most frequently associated with the reviews:  Great, Product, Amazing, Quality, Value.

So, you can see there is a lot to consider when developing a data platform for text analysis. It's not just another data warehouse and/or business intelligence platform.  There are gobs of products trying to get you to invest in their technology.  I say there is a lot you can do on your own: with a little bit of planning and foresight, you can build your own text analytics platform and start driving a better customer experience for your enterprise today.

Contact us if you're ready to implement Text and Sentiment Analytics in your enterprise. We can help you develop and/or implement an architecture. We can help you optimize an existing system. There is much you can learn even before implementing any kind of machine learning and/or deep learning.  It's all about driving value and insight from what is at hand. Thanks for your time. – Gus Segura