Identify observations from the dataset that are relevant to our aspect.
Example: Analyze consumer sentiment specifically related to the color of dresses
Color is the aspect
Options
Filter text column to rows that only contain the aspect key word (e.g. color, colors)
regex?
Sometimes with more abstract aspects, such as Experience, Service, or Location, you may need to leverage topic modeling to predict which aspect is most relevant to a text.
Tokenize text into smaller pieces
Some simple options would be to split text by punctuation (e.g. commas) and/or conjunctions
Calculate the Polarity (aka sentiment) relating to the aspect
Apply a pre-trained sentiment classifier model
Models: TextBlob
Polarity range: -1 to 1
Extract the descriptors associated with our aspect
Descriptors help explain the “why” behind the sentiment.
e.g. the customer comment had a positive sentiment about color because they thought it was beautiful. “Beautiful” is the descriptor.
Options
spaCy’s token classification features to automatically analyze the linguistic structure of the sentence and extract what adjectives/adverbs are associated with our noun
Analysis
Polarity histogram related to aspect
Interpretation
Customers typically respond positively to dress colors.
Therefore, color is not a primary contributor to Not Recommended reviews
Descriptors
Interpretation
Customers love products with bright, vivid, and vibrant colors.
Customers tend to complain about a product’s color when it appears darker in person than on online pictures or is too muted or dull
Company should focus on using bright, vibrant colorways and materials