Data never arrive in the condition you need for effective data analysis. They have to be reshaped, rearranged, and reformatted before they can be visualized or fed into a machine learning algorithm. This course addresses the problem of wrangling your data so that you can bring them under control and analyze them effectively. The key goal in data wrangling is transforming non-tidy data into tidy data.
This course is part of the Tidyverse Skills for Data Science in R Specialization
About this course
What you will learn
Apply Tidyverse functions to transform non-tidy data to tidy data
Conduct basic exploratory data analysis
Conduct analyses of text data
Offered by
Johns Hopkins University
The mission of The Johns Hopkins University is to educate its students and cultivate their capacity for life-long learning, to foster independent and original research, and to bring the benefits of discovery to the world.
Syllabus - What you will learn in this course
Wrangling Data in the Tidyverse
Data never arrive in the condition you need for effective data analysis. They have to be reshaped, rearranged, and reformatted before they can be visualized or fed into a machine learning algorithm. This module addresses the problem of wrangling your data so that you can bring them under control and analyze them effectively. The key goal in data wrangling is transforming non-tidy data into tidy data.
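As a rough illustration of what moving from non-tidy to tidy data can look like, here is a minimal sketch using `pivot_longer()` from the tidyr package. The table, its column names, and its values are invented for this example and do not come from the course materials.

```r
library(tidyr)
library(tibble)

# Hypothetical non-tidy table: the year variable is spread across column names
wide <- tibble(
  country = c("A", "B"),
  `2019`  = c(10, 20),
  `2020`  = c(11, 22)
)

# Reshape so that each row holds one country-year observation
tidy <- pivot_longer(
  wide,
  cols      = c(`2019`, `2020`),
  names_to  = "year",   # former column names become values of `year`
  values_to = "cases"   # former cell values become values of `cases`
)

print(tidy)
```

In the reshaped table every variable has its own column and every observation its own row, which is the form most Tidyverse functions expect.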
Working With Factors, Dates, and Times
In R, categorical data are handled as factors. By definition, categorical data are limited in that they have a set number of possible values they can take. For example, there are 12 months in a calendar year. In a month variable, each observation is limited to taking one of these twelve values. Thus, with a limited number of possible values, month is a categorical variable. Categorical data, which will be referred to as factors for the rest of this lesson, are regularly found in data. Learning how to work with this type of variable effectively will be incredibly helpful.
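The month example can be sketched in a few lines of R. This is only an illustration: the observed values are made up, and the use of `fct_count()` from the forcats package is one possible way to inspect a factor, not necessarily the approach taken in the lesson.

```r
library(forcats)

# Hypothetical observations of a month variable
months_observed <- c("Mar", "Jan", "Dec", "Jan")

# Declare the twelve allowed values, in calendar order, via `levels`
# (month.abb is a constant built into R: "Jan", "Feb", ..., "Dec")
month_fct <- factor(months_observed, levels = month.abb)

# The factor knows all twelve possible values, observed or not
levels(month_fct)

# Sorting follows the level order (calendar), not the alphabet
sorted_months <- sort(month_fct)

# Count observations per level, including months that never occur
month_counts <- fct_count(month_fct)
print(month_counts)
```

Setting the levels explicitly is what makes the variable behave as a fixed set of categories rather than as free text.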
Working With Strings and Text and Functional Programming
Working with text data is increasingly common in data science projects. Text manipulation is often needed to clean up messy datasets and to create numerical measurements out of text input. In addition, the text itself is often the data, and this module covers tools to extract information from it.
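A small sketch of both uses, cleaning messy text and deriving a numeric measurement from it, using the stringr package. The input strings and the variable names are hypothetical and chosen only for illustration.

```r
library(stringr)

# Hypothetical messy text input
notes <- c("  Weight: 72kg ", "weight: 65 kg", "WEIGHT:80KG")

# Clean up: trim surrounding whitespace and normalise case
clean <- str_to_lower(str_trim(notes))

# Create a numerical measurement: take the first run of digits in each string
weight_kg <- as.numeric(str_extract(clean, "\\d+"))

# Extract information: does each entry mention the unit "kg"?
has_unit <- str_detect(clean, "kg")

print(weight_kg)
print(has_unit)
```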
Exploratory Data Analysis
The goal of an exploratory analysis is to examine, or explore, the data and find relationships that weren’t previously known. Exploratory analyses explore how different measures might be related to each other but do not confirm that relationship as causal, i.e., one variable causing another. You’ve probably heard the phrase “Correlation does not imply causation,” and exploratory analyses lie at the root of this saying. Just because you observe a relationship between two variables during exploratory analysis, it does not mean that one necessarily causes the other.
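To make the distinction concrete, here is a minimal exploratory sketch using dplyr and the `mtcars` dataset that ships with R. The dataset is a stand-in chosen for this example and is not part of the course materials.

```r
library(dplyr)

# Summarise fuel economy and weight by number of cylinders
by_cyl <- mtcars %>%
  group_by(cyl) %>%
  summarise(
    n        = n(),
    mean_mpg = mean(mpg),
    mean_wt  = mean(wt)
  )
print(by_cyl)

# Pearson correlation between car weight and fuel economy
r <- cor(mtcars$wt, mtcars$mpg)
print(r)
```

The correlation quantifies how closely the two variables move together in this sample. It does not, on its own, show that one causes the other; other variables, such as engine size, may be related to both.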
Case Studies
Now we will demonstrate how to import data using our case study examples. When working through the steps of the case studies, you can use either RStudio on your own computer or Coursera lab spaces provided for each case study.
Project: Wrangling data in the Tidyverse
In this project, you will practice data exploration and data wrangling with the tidyverse using consumer complaint data from the Consumer Financial Protection Bureau (CFPB).
Reviews
- 5 stars: 73.07%
- 4 stars: 15.38%
- 3 stars: 7.69%
- 2 stars: 3.84%
Top reviews from WRANGLING DATA IN THE TIDYVERSE
Excellent course! I've learned so many useful R techniques/codes!
Great course to get yourself acquainted with data wrangling in Tidyverse.
About the Tidyverse Skills for Data Science in R Specialization
This Specialization is intended for data scientists with some familiarity with the R programming language who are seeking to do data science using the Tidyverse family of packages. Through 5 courses, you will cover importing, wrangling, visualizing, and modeling data using the powerful Tidyverse framework. The Tidyverse packages provide a simple but powerful approach to data science which scales from the most basic analyses to massive data deployments. This course covers the entire life cycle of a data science project and presents specific tidy tools for each stage.