A reflection on all the things I’m learning in my first weeks of graduate school.
Here are some examples of things I’ve learned so far:
One function we’ve learned that I enjoy is case_when(). When paired with mutate(), it lets us classify data in a newly created column based on values in existing columns. Below is an example with the Palmer Penguins dataset.
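Here is a minimal sketch of that pattern. The 200 mm cutoff and the column name `flipper_size` are assumptions made for illustration, since the thresholds used for the original figure are not stated in this post. It assumes the dplyr and palmerpenguins packages are installed.

```r
library(dplyr)
library(palmerpenguins)

# Assumed cutoff (200 mm) and column name (flipper_size), for illustration only.
# Rows with a missing flipper length match neither condition, so they stay NA.
penguins_sized <- penguins %>%
  mutate(
    flipper_size = case_when(
      flipper_length_mm < 200 ~ "small",
      flipper_length_mm >= 200 ~ "medium"
    )
  )

# Count how many penguins fall into each class
count(penguins_sized, flipper_size)
```

case_when() checks its conditions in order and uses the first one that is TRUE, so the order of the conditions matters whenever they overlap.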
I made this stacked bar graph so you can see that I have a new column that has two classifications for flipper length: medium and small.
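A stacked bar graph along these lines could be drawn with ggplot2. This is only a sketch: putting species on the x-axis, the 200 mm cutoff, and the name `flipper_size` are all assumptions rather than the settings behind the original figure.

```r
library(dplyr)
library(ggplot2)
library(palmerpenguins)

# Rebuild the classified column (assumed 200 mm cutoff), dropping rows
# with a missing flipper length so every bar segment has a label.
penguins_sized <- penguins %>%
  filter(!is.na(flipper_length_mm)) %>%
  mutate(
    flipper_size = case_when(
      flipper_length_mm < 200 ~ "small",
      flipper_length_mm >= 200 ~ "medium"
    )
  )

# geom_bar() counts rows and stacks the fill groups by default
p <- ggplot(penguins_sized, aes(x = species, fill = flipper_size)) +
  geom_bar() +
  labs(x = "Species", y = "Number of penguins", fill = "Flipper size")

p
```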
A concept we’ve learned in EDS 221 is tidy data. As an organizational fiend, I have found this to be one of my favorite concepts! I have worked on many, many untidy spreadsheets over the years. In tidy data, data is organized in a predictable way! Most importantly, in tidy data:
1.) Each variable is a column.
2.) Each observation is a row.
3.) Each cell contains one value.
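To make the three rules concrete, here is a small sketch that reshapes an untidy table into a tidy one with tidyr's pivot_longer(). The table, its values, and its column names are made up for illustration.

```r
library(tidyr)

# Hypothetical untidy table: the "year" variable is hidden in the column
# names, so each row holds two observations.
untidy <- data.frame(
  site = c("A", "B"),
  `2020` = c(12, 7),
  `2021` = c(15, 9),
  check.names = FALSE
)

# Move the year columns into rows: one row per site-year observation
tidy <- pivot_longer(
  untidy,
  cols = c(`2020`, `2021`),
  names_to = "year",
  values_to = "count"
)

tidy
```

After reshaping, each variable (site, year, count) has its own column, and each site-year observation has its own row.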
One thing I have learned about doing data science is that reproducible workflows are essential, and they are achieved in various ways! The code (saved and pushed to a repo) is more important than the output (the HTML you knit!). Create a repo, then stage, commit, push, and pull early and often! Try not to copy and paste code. Be aware of the order in which you call functions. Do not alter your original data set. There are many more practices, but as you can see, reproducible workflows are essential.
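Two of these habits, avoiding copy and paste and leaving the original data untouched, can be sketched in a few lines of R. The helper name `mm_to_cm` and the stand-in values are hypothetical, chosen only for illustration.

```r
# Hypothetical helper: write the conversion once instead of copying and pasting it
mm_to_cm <- function(x_mm) {
  x_mm / 10
}

# Stand-in for a raw data set (made-up values); in practice this would be read from a file
raw <- data.frame(flipper_length_mm = c(181, 195, 210))

# Work on a copy so the original data set is never altered
processed <- raw
processed$flipper_length_cm <- mm_to_cm(processed$flipper_length_mm)

processed
```

Because `raw` is never modified, the whole analysis can be rerun from the top and give the same result.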
For attribution, please cite this work as
Leonard (2021, Aug. 16). Scout Leonard (she/her): EDS 221: Things I'm Learning!. Retrieved from https://scoutcleonard.github.io/posts/2021-08-16-eds-221-things-im-learning/
BibTeX citation
@misc{leonard2021eds,
  author = {Leonard, Scout},
  title = {Scout Leonard (she/her): EDS 221: Things I'm Learning!},
  url = {https://scoutcleonard.github.io/posts/2021-08-16-eds-221-things-im-learning/},
  year = {2021}
}