How to Fight Off Content Scraping in WordPress Blogs

Sandeep Kumar Mishra
Flippercode
May 28, 2020

Original content on your WordPress blog is a prime target for content scrapers, and it can end up on many spam sites within a short time. These sites republish stolen content and may even outrank your original site. This is frustrating for a website owner: someone takes your hard-earned content without permission, makes money from it, outranks you, steals your visitors, and can even cause your own site to be mistaken for spam. Content scraping has grown to an alarming level, with large sites easily copying content from the original owners’ sites.

How Content Scraping Occurs

Content scraping is usually performed by scripts that extract content from original web sources and republish it on another site. A person creates an attractive WordPress site, then installs plugins that scrape content from specified blogs and publish it on his or her own site. The motive is usually money: the scraper uses original content from other sites to draw traffic away from those sites to their own. Once the added traffic makes the site popular, the scraper earns money by placing advertisements on it.

How to Identify Your WordPress Site’s Scrapers

Identifying sites that scrape your content is tedious and time-consuming. One way is to search Google for your post titles, which is especially slow when a post covers a popular topic that many other pages also discuss. Another way is to use trackbacks by adding internal links to your blog posts. If a site steals your content, the links to your own posts are copied along with it and you will receive a trackback. If you use Akismet in WordPress, many of these trackbacks show up in the SPAM folder. Remember that this only works if your posts contain internal links. Google Webmaster Tools also reports links that point from scrapers’ sites to yours: look under “Traffic”, where “links to your site” leads to a page that displays the linking sites, scrapers included. Finally, if you have FeedBurner set up for your WordPress blog, open the Analyze tab and look under Feed Stats for “Uncommon Uses”, which contains a list of scrapers’ sites.
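
The title search described above can be partly automated. The following is a minimal Python sketch, not a definitive tool: it only builds exact-phrase Google search URLs for a list of post titles and excludes your own domain, so you still have to open the links and review the results yourself. The function name `build_search_urls`, the sample title, and the domain `example.com` are hypothetical.

```python
from urllib.parse import quote_plus


def build_search_urls(titles, own_domain):
    """Return one exact-phrase Google search URL per post title.

    Each query wraps the title in quotes and adds -site:<own_domain>
    so that results from your own blog are left out.
    """
    urls = []
    for title in titles:
        title = title.strip()
        if not title:
            continue  # skip blank titles
        query = f'"{title}" -site:{own_domain}'
        urls.append("https://www.google.com/search?q=" + quote_plus(query))
    return urls


if __name__ == "__main__":
    # Hypothetical post title and domain, for illustration only.
    for url in build_search_urls(
        ["How to Fight Off Content Scraping in WordPress Blogs"],
        "example.com",
    ):
        print(url)
```

Searching for a distinctive sentence from the post body instead of the title may narrow the results further when the title covers a popular topic.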

The Approach for Dealing with Content Scrapers

The easiest approach is to take no action, since fighting content scraping takes a lot of time. For sites that Google already treats as authorities, this does no harm, but other sites risk being flagged as scraper sites when Google wrongly concludes that their content is the copy. This usually happens during a Panda update. Alternatively, you can contact the scraper and ask them to remove your content from their site. Some may refuse; in that case, file a Digital Millennium Copyright Act (DMCA) takedown notice with their hosting provider. You can also block their IP address. The last approach is to take advantage of scrapers: use internal links so that you gain backlinks from their sites and reach a larger audience, or automatically link keywords to affiliate links.
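
As an illustration of the IP-blocking option, here is a minimal Python sketch of the check that a blocklist performs. It is an example under stated assumptions rather than how WordPress or any particular plugin does it; in practice the block is usually applied in the web server or firewall configuration. The names `BLOCKED_NETWORKS` and `is_blocked` are hypothetical, and the addresses come from ranges reserved for documentation.

```python
import ipaddress

# Hypothetical blocklist: single addresses or CIDR ranges
# (documentation-only ranges, not real scrapers).
BLOCKED_NETWORKS = [
    ipaddress.ip_network("203.0.113.7/32"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_blocked(remote_addr, blocked=BLOCKED_NETWORKS):
    """Return True if remote_addr falls inside any blocked network."""
    try:
        addr = ipaddress.ip_address(remote_addr)
    except ValueError:
        return False  # not a valid IP address; nothing to match
    return any(addr in net for net in blocked)


if __name__ == "__main__":
    for ip in ["203.0.113.7", "198.51.100.42", "192.0.2.1"]:
        print(ip, "blocked" if is_blocked(ip) else "allowed")
```

Scrapers can switch to a different address or host, so a list like this needs regular review to stay useful.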