I'm creating a model with a binary target. My training dataset contains about 5% bad (target = 1). The dataset also has roughly 10% duplicate rows (the extra duplicate rows do not change the 5% bad ratio in the data). If I don't remove the duplicate rows (though removing them might be the smart way to go), which algorithms/methods are insensitive to the duplicates? Br, Tomas
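For reference, checking how many exact duplicate rows exist (and dropping them if desired) is straightforward in pandas. A minimal sketch on a hypothetical toy frame; the column names `feature` and `target` are made up for illustration:

```python
import pandas as pd

# Toy data: one exact duplicate row, binary target.
df = pd.DataFrame({
    "feature": [1.0, 2.0, 2.0, 3.0, 4.0],
    "target":  [0,   1,   1,   0,   0],
})

n_dupes = df.duplicated().sum()   # count rows that repeat an earlier row
deduped = df.drop_duplicates()    # keep the first copy of each row

print(n_dupes, len(deduped))
```

Comparing the bad ratio before and after (`df["target"].mean()` vs `deduped["target"].mean()`) shows whether the duplicates were skewing the class balance, as asked above.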