In machine learning, there is one question you can't answer with a prediction.
In machine learning, there is one question you can't answer with a prediction.
5,000 films. No labels. No schema.
Clustering answers the question no one has labeled yet.
I work on finding structure in customer experience.
Hundreds of clients have handed us piles of text — tickets, reviews, survey responses — no labels, asking "what's in here?"
I can't show you client data. Same pipeline — UMAP, HDBSCAN, BERTopic — pointed at 5,000 movies. 47 clusters.
new_young_life_family
What BERTopic shipped me:
new_young_life_family
What Claude labelled it:
Coming-of-age drama
The clustering didn't get smarter.
The labelling layer did.
Clustering isn't a button. It's a pipeline.
Let's walk this.