# Gender Share in Casting Actors (DataLiteracy-WS2122)

Team project "Analyzing Gender Share in Casting Actors" as part of the lecture "Data Literacy", Uni-Tübingen.

---

In the context of gender equality, and inspired by the Bechdel test and its possible impact, we aim to examine the gender balance in principal roles in movies by using IMDb data [[1](https://www.imdb.com/interfaces/), [2](https://datasets.imdbws.com/)] on movie casting.

## README

### Repository Cloning - (Do not download archive-balls!)

The dataset is provided as large archives (`dat/*.tsv.gz`) that are tracked with *git lfs (large file storage)*. Downloading this repository as a tar-ball or zip-ball won't include this archive data.

To download the repository, clone it via either HTTPS or SSH:

```bash
git clone https://coreco.samstagskind.de/tobi/Gender-Share-in-Casting-Actors_DL-WS2122_public.git
```

```bash
git clone git@coreco.samstagskind.de:tobi/Gender-Share-in-Casting-Actors_DL-WS2122_public.git
```

### Preprocessing - (Before Executing the Experiments)

Executing the experiments (`exp/*.ipynb`) requires extracting and preprocessing the dataset's large archives (`dat/*.tsv.gz`).

To extract and preprocess these files, please open the [Jupyter notebook `exp/exp-001_Data-Preprocessing-and-Provisioning.ipynb`](https://coreco.samstagskind.de/tobi/Gender-Share-in-Casting-Actors_DL-WS2122_public/src/branch/master/exp/exp-001_Data-Preprocessing-and-Provisioning.ipynb) and execute all cells of this Data-Preprocessing-and-Provisioning notebook once. You may use the button that is revealed by executing the notebook's first code cell.
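Before running the preprocessing notebook, it can help to confirm that the archives in `dat/` are real gzip files and not git lfs pointer stubs (which is what a clone without git lfs typically leaves behind). The following is a minimal sketch and not part of the repository: the function names `classify_archive` and `check_dataset` are hypothetical, and the check relies only on the gzip magic bytes and the first line of a git lfs pointer file.

```python
from pathlib import Path

# First two bytes of every gzip stream.
GZIP_MAGIC = b"\x1f\x8b"
# First line of a git lfs pointer file.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def classify_archive(path):
    """Return 'gzip', 'lfs-pointer', or 'unknown' for the file at `path`."""
    with open(path, "rb") as handle:
        head = handle.read(len(LFS_POINTER_PREFIX))
    if head.startswith(GZIP_MAGIC):
        return "gzip"
    if head.startswith(LFS_POINTER_PREFIX):
        return "lfs-pointer"
    return "unknown"


def check_dataset(data_dir="dat"):
    """Map each `*.tsv.gz` file name in `data_dir` to its classification."""
    archives = sorted(Path(data_dir).glob("*.tsv.gz"))
    return {archive.name: classify_archive(archive) for archive in archives}


if __name__ == "__main__":
    # Run from the repository root (assumed working directory).
    for name, kind in check_dataset().items():
        print(f"{name}: {kind}")
```

If a file is reported as `lfs-pointer`, running `git lfs pull` inside the clone should fetch the actual archive.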
## PROJECT REPORT in pdf format

[`doc/projectsubmission2022/projectsubmission.pdf`](https://coreco.samstagskind.de/tobi/Gender-Share-in-Casting-Actors_DL-WS2122_public/src/branch/master/doc/projectsubmission2022/projectsubmission.pdf)

## REPOSITORY STRUCTURE

```
├── dat // (data-set)
│   ├── title.basics.tsv.gz
│   ├── title.principals.tsv.gz
│   └── title.ratings.tsv.gz
├── doc // (reports)
│   ├── projectregistration2022
│   │   ├── neurips_2021.sty
│   │   ├── neurips_2021.tex
│   │   └── projectregistration.tex
│   └── projectsubmission2022
│       ├── bibliography.bib
│       ├── fig-001_Share-in-principal-cast-of-actresses-in-all-movies-1980-2020.png
│       ├── neurips_2021.sty
│       ├── neurips_2021.tex
│       ├── projectsubmission.pdf
│       └── projectsubmission.tex
└── exp // (experiments)
    ├── exp-001_Data-Preprocessing-and-Provisioning.ipynb
    ├── exp-002_Share-in-principal-cast-of-actresses-in-all-movies-1980-2020.ipynb
    ├── exp-003_T-Test-Hypothesis-Testing.ipynb
    ├── exp-004_Beta-Binomial-Hypothesis-Testing.ipynb
    └── exp-005_Relationship-Rating-and-Share-Actresses-on-principal-cast.ipynb
```
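To give a rough idea of the quantity the experiments revolve around, the sketch below computes the share of actresses among the actor and actress credits of each title in `title.principals.tsv.gz`. It is an illustration under stated assumptions, not the code used in the notebooks: it assumes the IMDb column names `tconst` and `category` with the category values `actor` and `actress`, the function names are hypothetical, the sample rows are made up, and it does not restrict titles to movies or to the years 1980-2020 (that would require joining with `title.basics.tsv.gz`).

```python
import csv
import gzip
import io
from collections import defaultdict


def actress_share_per_title(tsv_lines):
    """Return {tconst: share of actresses among actor/actress credits}."""
    # tconst -> [number of actresses, number of actors and actresses]
    counts = defaultdict(lambda: [0, 0])
    # QUOTE_NONE: treat quote characters in the TSV as ordinary text.
    reader = csv.DictReader(tsv_lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if row["category"] not in ("actor", "actress"):
            continue  # skip directors, writers, and other credits
        counts[row["tconst"]][1] += 1
        if row["category"] == "actress":
            counts[row["tconst"]][0] += 1
    return {title: women / total for title, (women, total) in counts.items()}


def actress_share_from_archive(path="dat/title.principals.tsv.gz"):
    """Apply the computation to a gzip-compressed TSV archive (assumed path)."""
    with gzip.open(path, "rt", encoding="utf-8", newline="") as handle:
        return actress_share_per_title(handle)


# Hypothetical sample rows, for illustration only.
SAMPLE = (
    "tconst\tordering\tnconst\tcategory\n"
    "tt0000001\t1\tnm0000001\tactress\n"
    "tt0000001\t2\tnm0000002\tactor\n"
    "tt0000001\t3\tnm0000003\tdirector\n"
    "tt0000002\t1\tnm0000004\tactor\n"
    "tt0000002\t2\tnm0000005\tactor\n"
    "tt0000002\t3\tnm0000006\tactress\n"
    "tt0000002\t4\tnm0000007\tactor\n"
)

shares = actress_share_per_title(io.StringIO(SAMPLE))
print(shares)  # {'tt0000001': 0.5, 'tt0000002': 0.25}
```

Reading the archive line by line keeps memory use low, which matters because the full principals file is large; for the real analysis, the preprocessed data produced by `exp-001` is the intended starting point.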