Lastly, we found that to get these fairly high-quality numbers, we had to run the same gig with three workers, i.e., have three workers categorize each dress. We took the majority "vote" of the categories and found that this improved our quality significantly (as opposed to having just one worker do each gig). $3 for a good categorization of 100 dresses is great! Takeaways: run each gig three times, pay as little as possible, and be super clear when creating your gig.
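The majority "vote" described in the quote could be sketched roughly like this in Python. This is only an illustration, not the authors' code: the function name `majority_vote`, the dress IDs, and the category labels are all made up, and returning `None` when the three workers all disagree is an assumption, since the quote does not say how ties were handled.

```python
from collections import Counter


def majority_vote(labels):
    """Return the label picked by more than half of the workers.

    Returns None when no label has a strict majority
    (an assumed tie-handling rule, not taken from the article).
    """
    if not labels:
        return None
    label, count = Counter(labels).most_common(1)[0]
    return label if count > len(labels) / 2 else None


# Hypothetical answers: three workers per dress, as in the quote.
votes = {
    "dress_001": ["casual", "casual", "formal"],
    "dress_002": ["formal", "formal", "formal"],
    "dress_003": ["casual", "formal", "party"],
}

# Keep the majority category for each dress; None marks dresses to re-run.
consensus = {dress: majority_vote(labels) for dress, labels in votes.items()}
```

With three workers, any two matching answers settle the category, and a dress where all three disagree can be flagged and sent out again.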
A pretty interesting write-up on using Amazon's Mechanical Turk to have people sort data that is still too expensive or too inaccurate to sort automatically.