Knowledge about the overall structure of this graph is important for designing
ranking methods for search engines. To improve the ranking that search engines calculate for
their clients' websites, search engine optimization agencies focus on link structure.
An extreme form of ranking manipulation appears in spam networks,
where pages and websites publishing dubious content try to increase their ratings by
setting a massive number of links to other pages and obtaining backlinks. The WDC Hyperlink
Graph aggregated by pay-level-domain was extracted from the Common Crawl 2012 web
corpus and covers 43 million pay-level-domains, linked by 623 million connections that
are derived from hyperlinks between the pages contained in those
pay-level-domains.
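
The aggregation from page-level hyperlinks to pay-level-domain connections can be illustrated with a minimal Python sketch. It is not the extraction code used for the WDC graph. The names TOY_SUFFIXES, pay_level_domain and aggregate_links are hypothetical, the suffix set is a tiny stand-in for the full Public Suffix List, and the choices to keep one edge per ordered domain pair and to drop links that stay within a single pay-level-domain are assumptions made for illustration.

```python
from urllib.parse import urlsplit

# Toy stand-in for the Public Suffix List (assumption); a real pipeline needs the full list.
TOY_SUFFIXES = {"com", "org", "co.uk"}


def pay_level_domain(url):
    """Return the public suffix plus one label, or None if no toy suffix matches."""
    host = (urlsplit(url).hostname or "").lower()
    labels = host.split(".")
    # Scan from the longest candidate suffix to the shortest, so "co.uk" is tried before "uk".
    for i in range(len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in TOY_SUFFIXES:
            # A host that is only a suffix has no pay-level-domain.
            return ".".join(labels[i - 1:]) if i > 0 else None
    return None


def aggregate_links(page_links):
    """Collapse (source_url, target_url) page links into distinct domain-level edges."""
    edges = set()
    for src, dst in page_links:
        s, d = pay_level_domain(src), pay_level_domain(dst)
        # Assumption: skip unmappable URLs and links inside one pay-level-domain.
        if s and d and s != d:
            edges.add((s, d))
    return edges


page_links = [
    ("http://blog.example.com/post", "http://www.shop.co.uk/item"),
    ("http://news.example.com/a", "http://shop.co.uk/other"),
    ("http://example.com/", "http://blog.example.com/post"),
]
pld_edges = aggregate_links(page_links)
```

In this sketch the first two page links collapse into a single connection from example.com to shop.co.uk, and the third is dropped because both pages belong to the same pay-level-domain.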