but I'm still getting a lot of duplicates from e.g. BBC news and Pitpass.com.
It will happen with any feed that updates a story after publishing it, and there are a few different ways a feed can do that. Atom feeds are much more likely to yield duplicates than RSS feeds.
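To illustrate one way this can go wrong, here is a rough sketch in Python. It is not the reader's actual logic. It assumes a hypothetical reader that identifies an article by the entry's id, falling back to its link; the names `article_key` and `merge` and the sample entries are made up for the example. If a feed revises a story and keeps the same id, the stored copy is simply replaced, but if the revision arrives under a new id, it looks like a brand-new article.

```python
def article_key(item):
    """Hypothetical identity key: the entry's id if present, else its link."""
    return item.get("id") or item.get("link")


def merge(store, fetched):
    """Add fetched items to the store; return the keys treated as new."""
    added = []
    for item in fetched:
        key = article_key(item)
        if key not in store:
            added.append(key)
        # An existing key is overwritten, so an update replaces the old copy.
        store[key] = item
    return added


# Made-up entries for illustration only.
first = [{"id": "story-1", "link": "http://example.com/a",
          "title": "Race report"}]
# Same story, revised title, same id: recognised as an update.
same_id = [{"id": "story-1", "link": "http://example.com/a",
            "title": "Race report (updated)"}]
# Same story republished under a new id: looks like a new article.
new_id = [{"id": "story-1-rev2", "link": "http://example.com/a",
           "title": "Race report (updated)"}]

store = {}
merge(store, first)
merge(store, same_id)
merge(store, new_id)
```

Under these assumptions the store ends up holding two copies of what a human reader would call one story.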
My other feeling is that it seemed less likely to download a duplicate if the previous copy was still in the trash.
Whether the previous copy is still in the trash should not affect it.