It would be really simple to create a few rules that would be 99% accurate in splitting those 2 categories of sites.
I disagree. If it was that easy to sort out, Google would have done it yonks ago. It's not as if the abuse of exact match domains is a new phenomenon.