Datasets
========

This section outlines how various public mailing lists can be scraped from the web and stored to disk for further processing.

Currently, the ``BigBang`` repository does not contain personally identifiable information of any kind. The datasets included in ``BigBang`` pertain to organizational entities and provide :ref:`ancillary data ` useful in preprocessing and analysis of those entities.

As the mailing-list archives are large and time-consuming to scrape from the web, we are working on a GDPR-compliant method to share the datasets with other researchers.

.. toctree::
   :maxdepth: 2

   mailinglists
   drafts
   ancillary
   git