openwashdata Data Package Hackathon

This hackathon is organized by Mian Zhong from the Global Health Engineering group at ETH Zurich. Participants will make an openwashdata R data package with the washr package developed by openwashdata.

🎯 Objectives

Deliver publishable openwashdata data packages
Beta-test the washr R package to receive feedback for the package release
Meet openwashdata friends and have fun

🌟 Showcase

Here we showcase the data packages developed during our hackathon. These datasets cover WASH data from Malawi 🇲🇼, Uganda 🇺🇬, Brazil 🇧🇷, Peru 🇵🇪, and Ghana 🇬🇭. A big shoutout to all the participants for their hard work and dedication!

barplot from boreholefuncmwi data package

boreholefuncmwi

Data about the survey on borehole functionality in Karonga district, Malawi.

Go to Dataset

boxplot about the portawaterperu data package

portawaterperu

Data about a preliminary review of the portable water system in Peru.

Go to Dataset

ugabore

Data about borehole repair collected from two districts in central Uganda.

Go to Dataset

waschoolpiracema

Data about water supply, sewage disposal, waste collection and sanitary equipment of the schools in Piracema, Brazil.

Go to Dataset

watercostaccra

Data about the surveys on household water costs, coping mechanisms, and water point estimates in Accra, Ghana.

Go to Dataset

📆 Event Details

Date: Friday, June 21, 9:00 AM - 4:30 PM (CEST)
Location: Zurich, Switzerland

📝 Agenda

| Time | Title | Remark |
|---|---|---|
| 08:30 - 09:00 | Check in & Breakfast | |
| 09:00 - 09:05 | Opening | by Mian Zhong |
| 09:05 - 09:50 | Introduction workshop for washr R package | by Mian Zhong |
| 10:00 - 12:00 | Coding | Package Setup & Data Cleaning |
| 12:00 - 12:40 | Lunch | |
| 12:45 - 14:45 | Coding | Data Cleaning & README writing |
| 14:45 - 15:00 | Break / Stretch | |
| 15:00 - 15:30 | Final Polish | Write Feedback Survey |
| 15:30 - 16:15 | Showcase | |
| 16:25 - 16:30 | Closing | |

🖼️ Slides

View slides in full screen ｜ Download slides as PDF

🧑‍💻 Workflow

Initialize package repository

Open GitHub
- Open your invitation email from GitHub, follow the link, and accept the invitation to contribute.
Open RStudio IDE
- Check whether the R packages devtools and usethis are installed. If not, run install.packages(c("devtools", "usethis")) in the console.
Create a new project:
- File -> New Project -> New Directory -> R Package using devtools
  - Scroll down; this option is usually at the bottom of the list.
- Copy and paste the name of your assigned GitHub repo as the directory name.
- Choose the parent directory in which the project directory will be created.
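
The package check in the RStudio step above can also be scripted. The following is a minimal sketch and not part of washr; the helper name `missing_packages()` is hypothetical. It reports which of the required packages are not yet installed, so that only those are passed to install.packages().

```r
# Hypothetical helper: return the packages from `pkgs` that are not installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

required <- c("devtools", "usethis")
to_install <- missing_packages(required)

if (length(to_install) == 0) {
  message("devtools and usethis are already installed.")
} else if (interactive()) {
  # Installing needs an internet connection and a configured CRAN mirror.
  install.packages(to_install)
}
```

Running it a second time after installation should print the "already installed" message.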

Configure Git version control on your local computer.

In console, run:

usethis::use_git_config(user.name = "Your Name", user.email = "Your GitHub Email")

Add Git version control to the local directory
- In the console, run usethis::use_git()
- Answer yes when asked to commit
- Answer yes when asked to restart RStudio
Connect your local computer with GitHub. Please refer to https://happygitwithr.com/https-pat.html for more details.
- In the console, run
```
usethis::create_github_token()
```
- Click “Generate token”.
- Copy the generated PAT to your clipboard, or leave the browser window open for a little while so you can come back and copy it. You will need the PAT in the next step.
  - Consider storing the PAT securely, e.g., in a password manager. You may store it by following the steps here.
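
One way to store the PAT so that Git and R can reuse it is the gitcreds package, as described in the Happy Git guide linked above. This is a minimal sketch, assuming gitcreds is installed (it is not mentioned in the original instructions); the calls only run in an interactive session such as the RStudio console.

```r
# Assumption: the gitcreds package is installed.
if (interactive() && requireNamespace("gitcreds", quietly = TRUE)) {
  # Prompts for the token; paste the PAT when asked.
  gitcreds::gitcreds_set()
}

# Optional check: reports the Git user name, email, and whether a PAT was found.
if (interactive() && requireNamespace("usethis", quietly = TRUE)) {
  usethis::git_sitrep()
}
```
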
Open the Terminal (the tab next to Console) and run the commands suggested in your assigned GitHub repository. They should look like the following:
```
# Change the url link to be your assigned GitHub repo!

git remote add origin "https://github.com/openwashdata/fssample.git" 

git branch -M main

git push -u origin main
```
- You can also find and copy these commands in your assigned repo.
- If this is your FIRST TIME connecting RStudio with GitHub, you will be asked in the Terminal to log in. Enter your GitHub username and, for the password, paste the PAT generated earlier with usethis::create_github_token() (ghp_xxxxxxxxxxxxxxxxxxxxxx).
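
Only the repository name differs between participants. As a small sketch, the three Terminal commands can be generated for a given repository name; the helper `git_setup_commands()` is hypothetical, and the organization name is taken from the example URL above.

```r
# Hypothetical helper: build the Terminal commands for an assigned repository.
git_setup_commands <- function(repo, org = "openwashdata") {
  url <- sprintf("https://github.com/%s/%s.git", org, repo)
  c(
    sprintf('git remote add origin "%s"', url),
    "git branch -M main",
    "git push -u origin main"
  )
}

# "fssample" is the example repository name from the commands above.
cat(git_setup_commands("fssample"), sep = "\n")
```

Copy the printed lines into the Terminal; the commands themselves are still run by Git, not by R.
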
Install washr R package:
```
library(devtools)
devtools::install_github("openwashdata-dev/washr")
```
You might be prompted to update packages. Choose the option “All” to continue.
Load required libraries:
- library(devtools)
- library(usethis)
- library(washr)

Create Dataset

Data Processing

Add directory for raw data to project
- In Console, execute setup_rawdata()
Move raw data files to the directory data-raw
- Add, commit and push all changes to GitHub
  - Select “Git” tab on the top-right panel
  - (Click “Pull” first for good practice)
  - Tick all files and click “Commit”
  - Enter a commit message, click “Commit”, and then click “Push”
Work on data-raw/data_processing.R to clean raw data and export tidy data.
- You may need to modify or delete some code in data_processing.R
Export the tidy data by executing the whole data_processing.R
- Add, commit and push all changes to GitHub
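
What the cleaning script does depends on the raw data. The following is a minimal sketch of typical tidy-up steps in base R, using a small made-up data frame in place of a real file from data-raw. The column names, the object name `boreholes`, and the cleaning rules are illustrative assumptions, not the contents of the data_processing.R template.

```r
# Made-up raw data standing in for a file read from data-raw/.
raw <- data.frame(
  `Borehole ID` = c("B1", "B2", "B2", "B3"),
  `Install Year` = c("2010", "2015", "2015", "n/a"),
  `Functional?` = c("Yes", "no", "no", "YES"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# 1. Convert column names to snake_case.
clean_names <- function(x) {
  x <- tolower(x)
  x <- gsub("[^a-z0-9]+", "_", x)  # runs of other characters become "_"
  gsub("^_+|_+$", "", x)           # trim leading and trailing underscores
}
names(raw) <- clean_names(names(raw))

# 2. Remove exact duplicate rows.
boreholes <- unique(raw)
rownames(boreholes) <- NULL

# 3. Fix types: years as integers ("n/a" becomes NA), yes/no as logical.
boreholes$install_year <- suppressWarnings(as.integer(boreholes$install_year))
boreholes$functional <- tolower(boreholes$functional) == "yes"

str(boreholes)

# In a package project, the tidy object would then be saved, for example with
# usethis::use_data(boreholes, overwrite = TRUE)
```
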

Dictionary

Once the data are tidy, run setup_dictionary() in the console
- Go to data-raw/dictionary.csv
- Fill in the column “description” in dictionary.csv for each dataset and variable
  - It might be easier to edit the file in spreadsheet software (e.g. Excel)
- Save dictionary.csv
Add, commit and push all changes to GitHub
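
Before committing, it can help to check that no description was left empty. This is a minimal sketch, assuming dictionary.csv has columns named `variable_name` and `description`; the first column name is an assumption about the generated file, so adjust it to what you see in your dictionary.csv. A temporary file with made-up rows stands in for data-raw/dictionary.csv so that the example is self-contained.

```r
# Made-up dictionary standing in for data-raw/dictionary.csv.
dictionary_path <- tempfile(fileext = ".csv")
writeLines(c(
  "variable_name,description",
  "borehole_id,Unique identifier of the borehole",
  "install_year,",
  "functional,Whether the borehole was functional at the time of the survey"
), dictionary_path)

dictionary <- read.csv(dictionary_path, stringsAsFactors = FALSE)

# Variables whose description is missing or blank.
is_blank <- is.na(dictionary$description) | trimws(dictionary$description) == ""
undocumented <- dictionary$variable_name[is_blank]

if (length(undocumented) > 0) {
  message("Missing descriptions: ", paste(undocumented, collapse = ", "))
}
```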

Document Dataset

Roxygen

Initiate and write documentation in R/ folder by executing in console: setup_roxygen()
- Open each documentation file in R/ to write a human-readable title and description about the dataset
Add, commit and push all changes to GitHub
Use devtools to document, check and install the package
- devtools::document()
- devtools::check()
- devtools::install()
  
  If there is any error or warning, please let me know and we can look at it together. You will see a warning about the license, which will be addressed in the next step.
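
For orientation, a dataset documentation file in R/ generally follows the standard roxygen2 layout for datasets. The sketch below uses a made-up dataset name and made-up variables; the file generated by setup_roxygen() may differ in detail.

```r
#' Borehole functionality survey (made-up example title)
#'
#' One or two sentences describing what the data contain, where and when
#' they were collected, and by whom.
#'
#' @format A data frame with 3 rows and 3 variables:
#' \describe{
#'   \item{borehole_id}{Unique identifier of the borehole}
#'   \item{install_year}{Year the borehole was installed}
#'   \item{functional}{Whether the borehole was functional when surveyed}
#' }
"boreholes"
```

The quoted name on the last line tells roxygen2 which dataset the comment block documents.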

DESCRIPTION

Add yourself as the creator and author of the package

use_author(given = "First Name", family = "Last Name", 
           role = c("aut", "cre"), email = "Your email",
           comment = c(ORCID = "XXXX-XXXX-XXXX-XXXX"))

On GitHub, create an issue listing the author information needed for the DESCRIPTION file
- Contributors (name, email, role, ORCID)
  - Include everyone here
- Roles
  - cre = maintainer
  - aut = significant contributions
  - ctb = contributor with smaller contributions
- Add other author(s):
```
use_author(given = "Second Author", family = "Second Author", role = "aut")
```
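
use_author() records each person in the Authors@R field of DESCRIPTION as a call to person() from R's utils package. A small sketch with made-up names shows how the roles listed above are represented. The ORCID iD used here is ORCID's public sample iD and must be replaced with your own.

```r
authors <- c(
  person(
    given = "First", family = "Author",
    role = c("aut", "cre"),                     # author and maintainer
    email = "first.author@example.org",         # made-up address
    comment = c(ORCID = "0000-0002-1825-0097")  # sample iD, replace it
  ),
  person(given = "Second", family = "Author", role = "aut"),
  person(given = "Third", family = "Helper", role = "ctb")
)

# Show each person with their roles.
print(format(authors, include = c("given", "family", "role")))

# Count the maintainers; a package needs exactly one person with role "cre".
n_maintainers <- sum(vapply(authors$role, function(r) "cre" %in% r, logical(1)))
```
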
Open the DESCRIPTION file and write the Title and Description of the package. Then run update_description() in the console to update the other fields. Proofread the DESCRIPTION file to make sure that the fields are correct.
Use devtools to document, check and install the package
- devtools::document()
- devtools::check()
- devtools::install()
  
  If there is any error or warning, please let me know and we can look at it together.
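
Part of the proofreading can be automated by reading DESCRIPTION with base R's read.dcf() and checking that key fields are filled in. This is a minimal sketch: a temporary file with made-up content stands in for the real DESCRIPTION, and the choice of fields to check is an assumption.

```r
# Made-up DESCRIPTION content standing in for the real file.
description_path <- tempfile()
writeLines(c(
  "Package: examplepkg",
  "Title: Borehole Functionality Survey Data",
  "Version: 0.0.1",
  "Description: Made-up description of an example data package.",
  "License: CC BY 4.0"
), description_path)

fields <- c("Title", "Description", "License")
desc <- read.dcf(description_path, fields = fields)

# Fields that are absent (NA) or empty.
values <- desc[1, ]
unfilled <- fields[is.na(values) | trimws(values) == ""]

if (length(unfilled) > 0) {
  message("Fields still to fill in: ", paste(unfilled, collapse = ", "))
}
```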

Communicate Dataset

README

In console, execute setup_readme().
- If time allows, optionally set has_example = TRUE to create an example article for the package.
Open README.Rmd and edit the sections.
- Make at least one plot about the data in the section “Example”
Once you finish writing README.Rmd, run build_readme().
Add, commit and push all changes to GitHub
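
For the plot in the “Example” section, even a simple base R chart is enough. The sketch below uses a made-up data frame named `boreholes`; in README.Rmd you would load your own package with library() and plot its dataset instead.

```r
# Made-up data standing in for the dataset shipped with your package.
boreholes <- data.frame(
  district = c("North", "North", "South", "South", "South"),
  functional = c(TRUE, FALSE, TRUE, TRUE, FALSE)
)

# Count functional boreholes per district.
functional_counts <- tapply(boreholes$functional, boreholes$district, sum)

barplot(
  functional_counts,
  main = "Functional boreholes by district (made-up data)",
  xlab = "District",
  ylab = "Number of functional boreholes"
)
```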

Pkgdown Website

In console, run setup_website() to create an openwashdata style pkgdown website
- When prompted in the console, select “No” so that the existing _pkgdown.yml is not overwritten
Use devtools to document, check and install the package
- devtools::document()
- devtools::check()
- devtools::install()
  
  If there is any error or warning, please let me know and we can look at it together.
Open .gitignore, remove the docs entry, and save the file.
Add, commit and push all changes to GitHub
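
Removing docs from .gitignore lets Git track the docs folder, which is pkgdown's default output directory. The edit can also be scripted. This is a minimal sketch, using a temporary file with made-up entries in place of the real .gitignore:

```r
# Made-up .gitignore content standing in for the real file.
gitignore_path <- tempfile()
writeLines(c(".Rproj.user", ".Rhistory", "docs", "inst/doc"), gitignore_path)

entries <- readLines(gitignore_path)

# Drop only the entry that ignores the docs folder ("docs" or "docs/").
kept <- entries[!trimws(entries) %in% c("docs", "docs/")]
writeLines(kept, gitignore_path)
```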