Text Content Originality
Text content is probably checked for originality relative to other sites.
Low variance from an existing page, or outright duplication, would suggest that the content was transcribed from an original source. No variance at all would obviously mean a direct copy.
While I can only guess at the level of originality required to be seen as unique, I would assume the required difference is far greater than that required outside of the search engines, by copyright law for example, where I think an 8.5% variance is required.
I expect the difference factor would need to be greater than 50% for Google to give it a reasonable uniqueness score, but that’s just a guess.
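To make the idea of a difference factor concrete, here is a minimal Python sketch. It is only an illustration and not how Google measures originality: the function names, the use of three-word shingles and the Jaccard comparison are all my own assumptions, chosen because they are simple to reason about.

```python
import re


def shingles(text, k=3):
    """Return the set of k-word shingles in text (lowercased, punctuation dropped)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < k:
        # Text shorter than k words becomes a single shingle (or none if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def difference_factor(original, rewrite, k=3):
    """Return 1 minus the Jaccard similarity of the two shingle sets.

    0.0 means the two texts share every shingle (a direct copy);
    1.0 means they share none. This is a hypothetical measure, not Google's.
    """
    a, b = shingles(original, k), shingles(rewrite, k)
    if not a and not b:
        return 0.0  # two empty texts are treated as identical
    return 1.0 - len(a & b) / len(a | b)


if __name__ == "__main__":
    source = "the cat sat on the mat"
    rewrite = "the cat sat on a rug"
    print(f"difference factor: {difference_factor(source, rewrite):.0%}")
```

Under this measure, a rewrite would clear my guessed 50% threshold when fewer than half of the combined three-word phrases are shared with the original. A lightly edited copy scores low because most of its phrases survive intact.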
If you need to reproduce some content that already exists online, here’s what I suggest wherever possible: read the original many times over until you fully understand its meaning and intent. Once you are comfortable that you can write about what you read, put the original away and don’t go back to it at all. Just write your own version without checking it against the original.
There are many SEO experts who believe Google gives a penalty to duplicated content.
I believe the issue is that duplicate content is disregarded for ranking rather than penalised. It may not appear in Google’s index at all, or may exist there in name only, i.e. it is indexed but never gains rank or impressions unless surfaced via a site: search operator.
Intentional duplication should be marked with a rel="canonical" link tag in the page head, pointing to the original URL for the content. This will not exempt the page from being seen as duplicate content. It merely signals that the original should be the version that gains rank.
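If you want to check which URL a page declares as its canonical, a small script can read it out of the HTML. The sketch below uses Python's standard html.parser module. The class name, the function name and the example.com URL are hypothetical and only there for illustration.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Records the href of the first <link rel="canonical"> element seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        # rel may hold several space-separated values, and may be missing.
        rel_values = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attr.get("href")


def find_canonical(html):
    """Return the declared canonical URL, or None if the page declares none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


if __name__ == "__main__":
    # Hypothetical duplicate page pointing back at the original article.
    page = (
        "<html><head>"
        '<link rel="canonical" href="https://example.com/original-article">'
        "</head><body>Republished copy</body></html>"
    )
    print(find_canonical(page))
```

A result of None on a page you know to be a deliberate duplicate means the tag is missing and should be added.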