
Dalila Benachenhou's Blog (2)

Comparison Between Global Vs Local Normalization of Tweets, and Various Distances

In the previous example we used clustering to see whether an apparent pattern exists within Brexit tweets. We found three distinct patterns: the leave, the referendum, and Brexit. This in itself suggests that we may even be able to build a classifier that automatically identifies whether a tweet's writer is for or against an issue, with no human intervention.



Let's get back to the issues related to clustering.  To use the clustering algorithm we had to…


Added by Dalila Benachenhou on October 29, 2016 at 2:30pm — No Comments

Context Matters When Text Mining

Often the most widely followed approach results in failure. The reason usually lies in assuming that one approach works in all cases. This is especially true in text mining. For instance, a common approach to clustering documents is to create a tf-idf matrix for all documents, apply SVD or another dimension reduction algorithm, and then run a clustering algorithm. In most cases this will work; however, as I will present here, there are instances…
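
The common approach described above (tf-idf matrix, then SVD, then clustering) can be sketched roughly as follows. This is only a minimal illustration, not the author's code: it assumes scikit-learn, and the toy corpus, the helper name `cluster_documents`, and the parameter values are all hypothetical.

```python
from sklearn.cluster import KMeans
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import Normalizer

# Hypothetical toy corpus, invented purely for illustration.
docs = [
    "vote leave in the referendum",
    "leave campaign wins the referendum vote",
    "the referendum vote result favours leave",
    "stock markets fall as the pound drops",
    "pound drops and stock markets tumble",
    "markets tumble after the pound falls",
]


def cluster_documents(texts, n_components=2, n_clusters=2, seed=0):
    """Cluster texts with tf-idf, truncated SVD, and k-means (assumed choices)."""
    # Step 1: one tf-idf matrix for all documents (rows = documents).
    tfidf = TfidfVectorizer(stop_words="english").fit_transform(texts)

    # Step 2: reduce dimensionality with truncated SVD, then rescale each
    # document vector to unit length so k-means compares directions.
    reducer = make_pipeline(
        TruncatedSVD(n_components=n_components, random_state=seed),
        Normalizer(copy=False),
    )
    reduced = reducer.fit_transform(tfidf)

    # Step 3: run a clustering algorithm on the reduced vectors.
    kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    return kmeans.fit_predict(reduced)


labels = cluster_documents(docs)
print(labels)
```

On a corpus like this one, where the two topics share no vocabulary, the pipeline separates the groups cleanly. The cluster numbering itself is arbitrary, so only the grouping of documents is meaningful.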

Added by Dalila Benachenhou on October 27, 2016 at 5:30pm — 2 Comments
