
I'm creating a model with a binary target.

My training dataset contains roughly 5% bad records (target = 1). The dataset also has approximately 10% duplicate rows (the extra duplicate rows do not change the 5% bad ratio in the data).
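For concreteness, here is a minimal sketch of how the duplicate rate and the bad rate could be measured, assuming the data sits in a pandas DataFrame with a binary column named target; the file name and column name are placeholders, not from the original post:

```python
import pandas as pd

# Hypothetical file and column names, used only for illustration.
df = pd.read_csv("training_data.csv")

# duplicated() flags rows that repeat an earlier row, i.e. the "extra" copies.
dup_rate = df.duplicated().mean()
# Mean of a 0/1 target is the fraction of bad records.
bad_rate = df["target"].mean()

print(f"duplicate rows: {dup_rate:.1%}")   # ~10% in the scenario above
print(f"bad rate:       {bad_rate:.1%}")   # ~5% in the scenario above
```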

If I don't remove the duplicate rows (which might be the smart way to go), which algorithms/methods are insensitive to the duplicates?
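One way to probe this empirically, rather than relying on theory alone, is to fit the same model on the training fold with and without the duplicates and compare a holdout metric: if the scores barely move, the method is effectively insensitive for this data. A rough sketch, assuming the df from above has numeric features and using logistic regression purely as a stand-in classifier:

```python
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Illustrative feature/label split; 'target' is the assumed label column.
X = df.drop(columns=["target"])
y = df["target"]

X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Rebuild the training fold as one frame so duplicates can be dropped jointly.
train = X_tr.assign(target=y_tr)
deduped = train.drop_duplicates()

for name, frame in [("with duplicates", train), ("deduplicated", deduped)]:
    model = LogisticRegression(max_iter=1000)
    model.fit(frame.drop(columns=["target"]), frame["target"])
    auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
    print(f"{name:16s} test AUC: {auc:.3f}")
```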

Br, Tomas

