Big Data Advanced Analytics



The selection of technologies available today to get work done is awesome, literally. Let’s take a moment to talk about two interesting use cases for Big Data: IOT Device Messaging and Text Sentiment Aggregation. We’re going to examine both use cases in detail and talk about their similarities. Finally, we’ll touch on Data Federation and how using more than one type of data store is a solid strategy as long as you commit to a manageable group. Remember, it’s a commitment you can revisit monthly or quarterly (depending on your release cycle), and it stays in place only as long as it’s healthy and yields productivity.

We committed to MongoDB and Hadoop years ago. It was rocky at the beginning, as we were pulled in several different directions by competing technologies.

Today, our relationship is solid and we’re happy with the features available in both technologies. We’re also very pleased that we can implement MongoDB, Hadoop and others in Amazon AWS, Azure and on-premise, with scale-as-you-go and without vendor lock-in.

If you’re about to start a new project and/or build a new data product, please Contact Us if you need help or have general questions.

IOT Device Messaging

Typically, you will build an architecture that supports some type of messaging, filtering and aggregation. You will hear about technologies like Kafka, Storm, ZooKeeper, log mining, MQTT and many more. But let’s back up and talk briefly about the “Why” with two quick examples:

CustomerA: “I have 100-200 devices connected to a wireless network. Most of the time I just want to monitor and/or issue an inquiry to the devices. I don’t want to be bothered about every little problem. [Yes, I hear this a lot.] I want to know when there is a big problem heading our way. [Predictive analytics.] I’m not sure what I want to store. Do I store everything or just the things that matter, and how do I tell the difference?”

CustomerB: “I have N devices around the globe with different connection types. Some devices offer real-time connectivity; others upload their logs at various times. I have different rules for how I acquire data and have to be sensitive to local laws and data privacy issues. I need edge connectivity and some type of predictive, advanced analytics capability on the edge. I want to store everything whether I need it or not.”

These two examples sit somewhere in the middle of the use-case spectrum. The point is that everyone is going to have a slightly different user story and may require a different approach. We have found that having a solid tool kit, the right partnerships and being prepared is the best way to approach building world-scale solution architectures.
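
To make the difference between the two customers concrete, here is a minimal Python sketch of the "store everything or just the things that matter" decision. The Reading type, the ALERT_LIMITS values and the route function are all hypothetical names and numbers chosen for illustration; real limits would come from the devices and the customer.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    device_id: str
    metric: str
    value: float


# Hypothetical alert limits per metric (assumed values, for illustration only).
ALERT_LIMITS = {"temperature_c": 90.0, "pressure_kpa": 700.0}


def matters(reading: Reading) -> bool:
    """Return True when the reading exceeds the alert limit for its metric."""
    limit = ALERT_LIMITS.get(reading.metric)
    return limit is not None and reading.value > limit


def route(readings, store_everything: bool):
    """Split readings into (stored, alerts).

    store_everything=False models CustomerA: keep only what matters.
    store_everything=True models CustomerB: keep everything, still flag alerts.
    """
    readings = list(readings)
    alerts = [r for r in readings if matters(r)]
    stored = readings if store_everything else list(alerts)
    return stored, alerts


if __name__ == "__main__":
    sample = [
        Reading("d1", "temperature_c", 72.0),
        Reading("d2", "temperature_c", 95.5),
    ]
    stored, alerts = route(sample, store_everything=False)
    print(len(stored), len(alerts))
```

The same filter runs in both cases; only the storage policy changes, which is why one tool kit can serve fairly different user stories.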

Text Sentiment Aggregation

One of our core verticals is Marketing Analytics. In marketing, we find that the user story in Text Sentiment closely resembles the examples noted above. In OOP terms, we would call this a specialization of the IOT device to mobile phones, desktops and tablets. The point is that text can flow into your system from several places. Start with the obvious ones: customer feedback, social media, news articles and blogs. Then there are the less obvious ones: customer complaints, IVR systems and CSR system input. Just a reminder: even a PC or CSR terminal is an IOT-connected device in this context. [IVR: Interactive Voice Response. CSR: Customer Service Representative.]
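
As a rough illustration of text arriving from several sources and being rolled up, here is a small Python sketch. The word lists are a toy lexicon invented for this example, and score and aggregate_by_source are hypothetical names; a production system would use a trained sentiment model rather than word counting.

```python
from collections import defaultdict

# Toy lexicon, purely illustrative (assumed words, not a real sentiment model).
POSITIVE = {"great", "love", "fast", "helpful"}
NEGATIVE = {"broken", "slow", "cancel", "refund"}


def score(text: str) -> int:
    """Count positive words minus negative words in a piece of text."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def aggregate_by_source(messages):
    """Roll (source, text) pairs up into a message count and net sentiment per source."""
    totals = defaultdict(lambda: {"count": 0, "net_sentiment": 0})
    for source, text in messages:
        totals[source]["count"] += 1
        totals[source]["net_sentiment"] += score(text)
    return dict(totals)


if __name__ == "__main__":
    feed = [
        ("social", "Love the new app, great work!"),
        ("csr", "Customer wants a refund, device is broken."),
    ]
    print(aggregate_by_source(feed))
```

Each source (social media, CSR terminal, IVR transcript) plays the same role a device ID plays in the sensor examples.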

The IOT message will be focused on “Text” as opposed to a temperature or pressure reading. The predictive event that we would likely want to identify is a decrease in sales and/or customer churn. With a temperature sensor, we want to identify a component failure; for example, a $120 sensor can help save a $2 million engine by identifying a pattern of over-temperature. The final piece is that someone or something has to take corrective action to prevent or mitigate the failure. [Automated Corrective Action.]
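
One simple way to read "a pattern of over-temperature" is a run of consecutive readings above a limit. The Python sketch below assumes that definition; the function name, the limit and the run length are hypothetical, and a real deployment might use a trained predictive model instead.

```python
def over_temperature_pattern(temps, limit, min_run):
    """Return True if temps contains at least min_run consecutive readings above limit."""
    run = 0
    for t in temps:
        # Extend the current run on a hot reading, reset it otherwise.
        run = run + 1 if t > limit else 0
        if run >= min_run:
            return True
    return False


if __name__ == "__main__":
    readings = [80, 96, 97, 98, 85]
    if over_temperature_pattern(readings, limit=95, min_run=3):
        # Placeholder for the corrective action step (alert, throttle, shut down).
        print("over-temperature pattern detected")
```

Requiring a run rather than a single spike is one way to avoid bothering an operator about every little problem while still catching the big one.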

Hopefully, by now you can start to see that the architectures for data collection could actually be very similar. In several use cases we’ve implemented over the years, they are. It’s interesting to point out that even in the more complex architectures, the core seems to be a fast, scalable and reliable store, MongoDB, that performs filtering, pre-aggregation, counts, volumetrics and data quality checks. Next comes robust, powerful and cost-effective analytics, implemented in another stage with Hadoop.
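
To show what the first stage produces, here is a plain-Python sketch of pre-aggregation with counts and a basic data quality check. In the architecture described above this work would happen in MongoDB; plain Python is used here only so the example runs without a server. The message fields (device_id, ts, value) and the pre_aggregate function are assumptions made for illustration.

```python
from collections import defaultdict


def pre_aggregate(messages, bucket_seconds=60):
    """Roll raw messages up into per-device, per-time-bucket summaries.

    messages: iterable of dicts with "device_id", "ts" (epoch seconds)
    and "value" (a number, or None when the reading is missing).
    """
    buckets = defaultdict(lambda: {"count": 0, "invalid": 0, "sum": 0.0, "max": None})
    for m in messages:
        # Align the timestamp to the start of its bucket.
        bucket_start = int(m["ts"]) // bucket_seconds * bucket_seconds
        b = buckets[(m["device_id"], bucket_start)]
        b["count"] += 1  # volumetrics: every message is counted
        value = m.get("value")
        if value is None:
            b["invalid"] += 1  # data quality: track missing readings
            continue
        b["sum"] += value
        b["max"] = value if b["max"] is None else max(b["max"], value)
    return dict(buckets)


if __name__ == "__main__":
    raw = [
        {"device_id": "d1", "ts": 0, "value": 10.0},
        {"device_id": "d1", "ts": 30, "value": None},
        {"device_id": "d1", "ts": 61, "value": 5.0},
    ]
    for key, summary in pre_aggregate(raw).items():
        print(key, summary)
```

The compact summaries, rather than every raw message, are what a later batch analytics stage would typically read.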

Data Federation

You could “probably” implement everything in one technology. This could “maybe” make your data management costs lower. However, if you embrace Data Federation, you will likely lower your TCO and end up with a right-sized (scalable) architecture.

We found in various studies that going with stateless Hadoop clusters and cycling data to and from MongoDB is a solid choice. Our studies included final data sets stored in relational databases (RDBMS) like Oracle and SQL Server.

We typically implement these final stores in the cloud with Amazon RDS or Azure. A “Data Federated” solution is the new normal for several enterprises, from small startups to large-scale web companies. If you want to learn more about our approach to IOT Messaging and Sentiment Analytics Aggregation, please let me know. Thank you. - Gus Segura

Please Contact Us or subscribe to our blog if you found this interesting or would like more information.

Subscribe : Blueskymetrics Blog
