
I know that quota sampling in surveys is usually done so that once a quota is reached, the survey is closed to respondents who would fall into that quota.

However, at the company I work for, the survey stays open to everyone until every demographic quota is met, and only then do we start deleting responses until each quota is matched exactly. For example, if we need 500 cases (250 females and 250 males) and we close the survey with 532 responses (273 females and 259 males), we delete 23 female and 9 male responses. It sounds easy, but most studies have 3-4 demographic quotas (e.g. gender, age group, region, settlement type), and it is really difficult and time-consuming to figure out which cases I have to delete to meet all of them.
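To make the arithmetic in the example concrete, here is a minimal Python sketch that counts the surplus in each quota category. The function name `surplus_per_category` and the data layout (one dict per response, one dict of targets per quota variable) are assumptions made for illustration, not something taken from an existing tool.

```python
from collections import Counter

def surplus_per_category(responses, targets):
    """Return collected-minus-target for every quota category.

    responses: list of dicts, one per respondent (hypothetical layout).
    targets:   dict mapping quota variable -> {category: required count}.
    A positive value means cases must be deleted, a negative one means
    the quota is still short.
    """
    result = {}
    for variable, wanted in targets.items():
        counts = Counter(r[variable] for r in responses)
        for category, target in wanted.items():
            result[(variable, category)] = counts[category] - target
    return result

# The gender-only example from the post: 532 responses, 500 needed.
responses = [{"gender": "female"}] * 273 + [{"gender": "male"}] * 259
targets = {"gender": {"female": 250, "male": 250}}

print(surplus_per_category(responses, targets))
# {('gender', 'female'): 23, ('gender', 'male'): 9}
```

With a single quota variable this already answers the question. The difficulty starts when several variables have to hit their targets at the same time.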

Is there a method or software that would automatically calculate which cases should be deleted?

Replies to This Discussion

Hi Gabriel, if you are using Excel, just use an IF or nested IF statement to separate the responses you want to keep from the ones you want to remove.

If you are using SQL, just write a query along these lines:

SELECT (your columns)
FROM (your table)
WHERE (list conditions)

(This may require further study to get more specific results.)

Let me know if I didn't understand your question.

Hi Jon David, thank you for your response.

Well, I'm not sure I understand your answer :). I guess nested IF statements would work well if we knew the exact number of cases needed in each cell. However, we do not use interlocked quotas, and that's where the difficulty arises. With interlocked quotas your solution might work, but in our case I am free to distribute the interviews across cells as long as each quota's totals are met. That sounds easy until I delete so many excess interviews from one quota (e.g. the Eastern region) that I end up short in another (e.g. the Western region).

With more quotas and more complex ones (2 genders, 3 settlement types, 4 age groups, 5 regions) I feel like I'm playing 4D chess, where I only find out whether I did it right at the very end. If it then turns out that I am short of cases in one quota, or still have excess cases that I cannot delete without breaking a quota that is already met, I have to go back many steps or even start from the beginning.

That's why I asked if there was a method that would calculate the optimal distribution.
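One way to frame the question is as a small integer search. Group the responses into cells (one category per quota variable), then look for a number of cases to keep in each cell so that every marginal total equals its target. The sketch below does this with plain backtracking. The function name `choose_keep_counts` and the input layout are assumptions for illustration, and the search is exponential in the number of cells, so it is meant to show the idea on a toy example rather than to handle a full study. For real data an integer programming solver would be the more practical tool.

```python
def choose_keep_counts(cell_sizes, targets):
    """Find how many cases to keep per cell so all marginal quotas match.

    cell_sizes: dict mapping a cell tuple (one category per quota
                variable, in a fixed order) -> number of collected cases.
    targets:    list of dicts, one per quota variable in the same order,
                mapping category -> required count.
    Returns a dict cell -> cases to keep, or None if no exact fit exists.
    Assumes every category that occurs in a cell has a target.
    """
    cells = list(cell_sizes)
    remaining = [dict(t) for t in targets]  # cases still needed per category
    keep = {}

    def search(i):
        if i == len(cells):
            # Valid only if every category is filled exactly.
            return all(v == 0 for t in remaining for v in t.values())
        cell = cells[i]
        # Keep no more than is available or than any category still needs.
        upper = min(cell_sizes[cell],
                    *(remaining[k][c] for k, c in enumerate(cell)))
        for n in range(upper, -1, -1):
            for k, c in enumerate(cell):
                remaining[k][c] -= n
            keep[cell] = n
            if search(i + 1):
                return True
            for k, c in enumerate(cell):  # undo and try a smaller n
                remaining[k][c] += n
        return False

    return keep if search(0) else None

# Toy example: 9 collected cases, 6 needed (3 per gender, 3 per region).
cell_sizes = {("female", "east"): 3, ("female", "west"): 2,
              ("male", "east"): 1, ("male", "west"): 3}
targets = [{"female": 3, "male": 3}, {"east": 3, "west": 3}]

keep = choose_keep_counts(cell_sizes, targets)
to_delete = {cell: cell_sizes[cell] - keep[cell] for cell in cell_sizes}
print(to_delete)
```

Because the search only says how many cases to drop per cell, which individual cases go within a cell is still open, for example a random draw.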

A senior analyst recently suggested weighting each case based on how difficult its quotas are to fill, then deleting the cases with the lowest weights. I haven't had the chance to try it yet, but I'm hoping it will work.
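The weighting idea could be sketched roughly as follows. The details are my own assumptions, since the suggestion was not spelled out: a case's weight is the sum of target/collected over its categories (so cases in heavily overfilled categories get low weights), weights are recomputed after every deletion, and a case is never deleted if any of its categories is already at target. The name `greedy_trim` is hypothetical.

```python
from collections import Counter

def greedy_trim(cases, targets):
    """Greedily delete the lowest-weight case until nothing can be deleted.

    cases:   list of dicts, one per respondent (hypothetical layout).
    targets: dict mapping quota variable -> {category: required count}.
    Returns the kept cases. This is a heuristic: it can stop while some
    quota is still overfilled, so check the result afterwards.
    """
    kept = list(cases)
    while True:
        counts = {v: Counter(c[v] for c in kept) for v in targets}
        # Only cases whose every category is above target may be deleted.
        deletable = [
            i for i, c in enumerate(kept)
            if all(counts[v][c[v]] > targets[v][c[v]] for v in targets)
        ]
        if not deletable:
            return kept

        def weight(i):
            # Low weight = categories with a lot of surplus.
            c = kept[i]
            return sum(targets[v][c[v]] / counts[v][c[v]] for v in targets)

        del kept[min(deletable, key=weight)]

# Toy example: 9 collected cases, 6 needed (3 per gender, 3 per region).
cases = ([{"gender": "female", "region": "east"}] * 3
         + [{"gender": "female", "region": "west"}] * 2
         + [{"gender": "male", "region": "east"}] * 1
         + [{"gender": "male", "region": "west"}] * 3)
targets = {"gender": {"female": 3, "male": 3},
           "region": {"east": 3, "west": 3}}

kept = greedy_trim(cases, targets)
print(len(kept), Counter(c["gender"] for c in kept),
      Counter(c["region"] for c in kept))
```

On this toy data the heuristic ends with every quota met. In general it can get stuck, for instance when the remaining surplus in one variable sits only in cases whose other categories are already at target, which is exactly the "go back many steps" situation described above.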


