Mannheimer Webpanel


The ZEW-FDZ offers a novel panel of semi-structured webpage data at the company level, the Mannheimer Webpanel. It comprises textual webpage data retrieved from a broad range of German company websites. A detailed description of the web scraping methods used to harvest the data, as well as an examination of the dataset (a corpus of German corporate websites), can be found in this discussion paper.

The dataset provides, among others, the following variables:

- ID: unique company identifier.
- dl_rank: a company website usually consists of several individual webpages. dl_rank gives the chronological order in which these webpages were downloaded. The main page of a website has rank 0, the first subpage processed after the main page has rank 1, and so on.
- dl_slot: the domain name of the website.
- title: the title of the company website as given in the website's metadata.
- keywords: the list of keywords of the company website as given in the website's metadata.
- description: the description of the company website as given in the website's metadata.
- text: the text content that was downloaded from the webpage.
- timestamp: the exact time at which the webpage was downloaded.
- url: the URL of the webpage.
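The variable layout above suggests a simple way to work with the data: group rows by ID and order them by dl_rank, so that each company's main page (rank 0) comes first. The following is a minimal sketch using only the Python standard library. It assumes a comma-separated file with a header row that uses the variable names listed above; the sample rows, the timestamp format, and the helper names `load_pages`, `main_page`, and `website_text` are hypothetical and not part of the dataset documentation.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows; the real file layout and delimiter may differ.
SAMPLE = """ID,dl_rank,dl_slot,title,keywords,description,text,timestamp,url
1001,1,example-firm.de,About us,,,We build machines.,2020-01-01 10:00:05,https://example-firm.de/about
1001,0,example-firm.de,Example Firm,machines,Machine builder,Welcome.,2020-01-01 10:00:00,https://example-firm.de
1002,0,other-firm.de,Other Firm,,,Hello.,2020-01-02 09:00:00,https://other-firm.de
"""


def load_pages(handle):
    """Group webpage rows by company ID and sort each group by dl_rank."""
    pages = defaultdict(list)
    for row in csv.DictReader(handle):
        row["dl_rank"] = int(row["dl_rank"])  # rank 0 is the main page
        pages[row["ID"]].append(row)
    for rows in pages.values():
        rows.sort(key=lambda r: r["dl_rank"])
    return dict(pages)


def main_page(rows):
    """Return the row with dl_rank 0, or None if the main page is missing."""
    return next((r for r in rows if r["dl_rank"] == 0), None)


def website_text(rows):
    """Concatenate the text of all pages of one website in download order."""
    return "\n".join(r["text"] for r in rows)


pages_by_company = load_pages(io.StringIO(SAMPLE))
```

To read an actual extract, the `io.StringIO(SAMPLE)` argument could be replaced by an open file handle, for example `open(path, newline="", encoding="utf-8")`, where the path and encoding are again assumptions.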

Identifier
DOI https://doi.org/10.7806/zew.mwp.2021.v1
Related Identifier IsDocumentedBy https://doi.org/10.2139/ssrn.3924887
Related Identifier IsDocumentedBy https://doi.org/10.1371/journal.pone.0249583
Related Identifier IsDocumentedBy https://doi.org/10.1007/s11192-020-03726-9
Related Identifier IsDocumentedBy https://doi.org/10.1371/journal.pone.0249071
Metadata Access https://api.datacite.org/dois/10.7806/zew.mwp.2021.v1
Provenance
Creator Kinne, Jan
Publisher ZEW – Leibniz Centre for European Economic Research
Contributor Dörr, Julian; ZEW – Leibniz Centre for European Economic Research
Publication Year 2020
OpenAccess true
Representation
Language English
Resource Type Dataset
Format CSV
Version 1
Discipline Social Sciences
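
The Metadata Access URL listed in this record points to the DataCite REST API, which returns the record as JSON. The sketch below shows one way to retrieve it and pull out a few fields. It assumes the usual DataCite response shape, with fields such as `titles`, `creators`, and `publicationYear` nested under `data.attributes`; the function names `fetch_record` and `summarize_record` are hypothetical.

```python
import json
from urllib.request import urlopen

# Metadata Access URL taken from this record.
METADATA_URL = "https://api.datacite.org/dois/10.7806/zew.mwp.2021.v1"


def fetch_record(url=METADATA_URL, timeout=30):
    """Download the DataCite record and parse it as JSON (needs network access)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


def summarize_record(payload):
    """Extract a few descriptive fields, tolerating missing keys."""
    attrs = payload.get("data", {}).get("attributes", {})
    titles = attrs.get("titles") or []
    creators = attrs.get("creators") or []
    return {
        "doi": attrs.get("doi"),
        "title": titles[0].get("title") if titles else None,
        "creators": [c.get("name") for c in creators],
        "publication_year": attrs.get("publicationYear"),
    }
```

Calling `summarize_record(fetch_record())` would then return a small dictionary for this dataset, provided the API is reachable and the response follows the assumed structure.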