Deep Text

Deep text is an approach to text analytics that adds depth to our ability to make use of the growing mass of unstructured text the world is drowning in.

"to do a good job of extracting data from text, you have to take into account the linguistic and cognitive elements of text." "text analytics can be used to characterize the content of unstructured text by subject matter (major and minor topics) and by positive and negative sentiment. This is often the most difficult thing to do but also the most valuable if done correctly. Frequently referred to as “auto-categorization,” it can be used to do far more than categorize a document."

Tom's book is a great resource for anyone who works in analytics. Much of analytics today is focused on buzzwords like AI (artificial intelligence) and ML (machine learning), and most everyday applications of them run on numeric, structured data. Take a minute to let that sink in: much of the magic we see in modern shopping apps and intelligent systems is based on numeric data.

Text analytics is made up of several techniques that are used to make meaning out of TEXT. Text is usually referred to as unstructured data. In structured data, like numbers and date fields, the values within a field follow a known format. If you have a data field for dollars, it will be in a format like $100.00, and while the currency may change, the structure of digits with a decimal point is known. In unstructured data, like a comment box, feedback box, or notes field, you know that you are going to get something, but the user can put in words, numbers, a mix of the two, anything. Since the system does not know what is going to be entered, standard mathematical operations can't be applied to it directly.

This is where TEXT ANALYTICS comes in: it uses techniques to understand what is going on in all that unstructured data. The most widely known is NLP (natural language processing). Using NLP helps get text data into a format that can be analyzed. Adding auto-categorization to NLP lets you group the text into categories. In practice, this is used on review sites like Yelp and TripAdvisor. How many reviews deal with the price? How many reviews talk about the food or the service? This book is a great resource to help you understand how the text analytics field is evolving.
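To make the review-site questions a bit more concrete, here is a minimal Python sketch of keyword-based auto-categorization. It is only an illustration, not the method from the book or anything Yelp or TripAdvisor is known to use. The category names, the keyword lists, and the sample reviews are all made up for this example, and a real system would more likely rely on a curated taxonomy or a trained model.

```python
import re
from collections import Counter

# Hypothetical keyword lists, invented for this sketch.
CATEGORY_KEYWORDS = {
    "price": {"price", "prices", "cheap", "expensive", "overpriced", "value"},
    "food": {"food", "meal", "dish", "pizza", "delicious", "tasty"},
    "service": {"service", "waiter", "waitress", "staff", "server"},
}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def categorize(review):
    """Return the set of categories whose keywords appear in the review."""
    tokens = set(tokenize(review))
    return {
        category
        for category, keywords in CATEGORY_KEYWORDS.items()
        if tokens & keywords
    }


def count_categories(reviews):
    """Count how many reviews mention each category at least once."""
    counts = Counter()
    for review in reviews:
        counts.update(categorize(review))
    return counts


# Made-up sample reviews.
reviews = [
    "The pizza was delicious but a bit expensive.",
    "Our waiter was friendly and the service was fast.",
    "Great value for the price!",
]

counts = count_categories(reviews)
for category in CATEGORY_KEYWORDS:
    print(category, counts[category])
```

A review can land in more than one category, which is why the first sample counts toward both food and price. Plain keyword matching like this misses synonyms, misspellings, and context, and that gap is exactly what the deeper linguistic techniques are meant to close.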

Another good resource is KDnuggets.com, which focuses on news in the data science community. Check out this post for more info.

Here is the link to the book on Amazon.
