Learner Reviews & Feedback for Getting and Cleaning Data by Johns Hopkins University

4.5 stars (8,047 ratings)

About the Course

Before you can work with data you have to get some. This course will cover the basic ways that data can be obtained. The course will cover obtaining data from the web, from APIs, from databases and from colleagues in various formats. It will also cover the basics of data cleaning and how to make data “tidy”. Tidy data dramatically speed downstream data analysis tasks. The course will also cover the components of a complete data set including raw data, processing instructions, codebooks, and processed data. The course will cover the basics needed for collecting, cleaning, and sharing data....

Top reviews

HS

May 2, 2020

This course provides an introduction of some important concepts and tools on a very important aspect of data science: cleaning and organizing data before any analysis. A must for any data scientist.

DH

Feb 1, 2016

Easy, mostly instructive course. The assignments and quizzes are quite good, and illustrate the lessons very well.

See the videos for general presentation, but use the energy on the exercises.

826 - 850 of 1,306 Reviews for Getting and Cleaning Data

By Rouholamin R

Jan 14, 2019

I've passed two courses of this specialization before this one. First of all, I think it was a little bit harder and filled with more content. For me, anything the professors say, I start to R&D about it and learn. But in this course there was a lot of stuff I learnt, and unfortunately I have already started to forget regex patterns and so on.

I liked the project on so many levels, except that the main dataset wasn't well documented. After finding out what the data set is about I did the project, and I think it helped me take back my confidence.

By Amr E

Jul 15, 2020

I very much liked this course. It is challenging and takes you one step deeper into data science. The final project is a real-world project that you may face in your professional career. It is well organized in many aspects. However, what I didn't like in this course is the following:

1- Many of the used functions are deprecated as of 2020 and haven't been updated

2- Some lectures took longer than they should (Reading HDF5 lecture for example)

3- Unlike the first two courses, the discussion forums are not as rich (especially in week 3 & 4)

By Karin R

May 24, 2020

This course is wonderful for those who are already equipped with coding experience. For the rest, it's extremely difficult, and I found myself wishing that there were better resources available for those who aren't already there. I have taken each course in succession and have purchased books to help guide me—and also have a very patient brother with advanced computer engineering expertise who had to answer all of my questions. I absolutely love the package in R that allows you to do tutorials. All of my stars are for that.

By Jason J

Jan 29, 2016

Thank you to the professors who made this course possible and especially to Dr. Peng who was willing to spend some time with us face to face via video. I found the course very challenging but at the same time I did learn quite a bit about R. Working through the course assignments and the final project was the best part. My rating is 4 stars because the course lectures are not engaging. The lecture style is basically just reading the slides to us and they don't take much time to explain what is going on.

By Rick H

Mar 8, 2016

It is a great course so far, with a lot of applicable topics covered. However, I feel that some of the questions are structured poorly to achieve their goals.

It is fine to have difficult questions where students are expected to do a lot of extra research, but it should be done in a way that the students know what they are getting into ahead of time, and not include questions or code that does not run without explaining what is supposed to be done.

It's just not a good teaching technique for learning.

By Jason B

Jun 4, 2016

Good course, though I have to say that the final project was a bit confusing, and I am not sure that the people who did the final project really understood the course and how to create a tidy dataset, as the ones that I looked at did not meet all the principles of tidy data that were outlined. What concerns me is that they all had similar issues, and are all doing peer review of each other - this means that there is no one that can make sure that their answers are really tidy...

By Dzmitry B

Nov 7, 2020

The principles of tidy data are well delivered and, overall, the course structure is great. Many great packages were covered (maybe some are a little outdated). Personally I felt that data.table and dplyr/plyr shouldn't be covered to such depth, just mentioned. The main reason is that they are constantly updated and often enough deprecate functions/parameters. I believe learning R dialects should be an individual's choice and is not required for data processing in R.

By Alex B

Nov 3, 2016

This is an interesting, helpful class. It was challenging, and exposed me to a very wide variety of topics outside R for data analysis, including databases, XML, and APIs for getting data. I would have found swirl-type exercises for those topics helpful, as the additional practice really reinforced the lecture material and homework/quiz problems. I also would have found some worked examples or discussion of the homework problems after they were submitted helpful.

By Matthew D

Jun 1, 2020

Pretty good, I liked the instructor. He explained things better than other instructors and didn't just read off the slides. Some of the quizzes were a bit off in my opinion. As far as I could tell with some of my code and the consistent answers I got, I think the quiz is not up to date with the data. The data is outside the course and is updated fairly frequently, one as early as December 2019 and I think the quiz answers were not updated accordingly.

By Tamir L

Jul 25, 2016

This course could be a little difficult for people with no programming experience whatsoever, even if they took the previous R programming course in this series. Examples are often a little too laconic and not all of the material is as practically useful as the best of it.

However, Jeff Leek teaches some excellent data tidying, cleaning and extraction techniques with modern tools and libraries, which I find very useful in my everyday work with data.

By Lee Y L R

Jun 13, 2017

This is a tough but important course. I learnt how to get the data from the web sources other than reading files of various formats, manipulate and group the data, and how to prepare a tidy data set for future analysis. There is ample practice to do each of the above. While the discussion forum is a great platform to address our queries, it would be good if there is greater clarity on some of the tools employed especially those in Week 2.

By Beñat G

Aug 11, 2016

I liked the initial approach of this course, as well as the resources given. However, I feel that the difficulty of the exercises wasn't really linked to the concepts: many things were not explained in the lectures, and I had to find the complementary information on various sites. So, rather than evaluating the understanding of the concepts given in the lectures, the skill to look for further information was assessed.

By Varun B

Jan 2, 2018

The course is really great. It starts off a bit slow initially but then really picks up pace with new concepts and R packages as you move to weeks 3-4. It really helps you strengthen the basics of what exactly tidy and clean data is before you move on to more advanced concepts. The course material needs updating though: some links did not work, and the presentations which are downloadable are not selection friendly.

By Steven Y

Feb 5, 2021

Great course. Just one suggestion. Thousands of students take this course. They have different internet environments, and the videos were recorded several years ago. It is possible that some of them are not able to download a file from a URL. It would be better if the course could provide the files directly, so that students who fail to download them can still continue to practice other skills.

By Stefan H

Apr 9, 2019

Pretty good examples, good guidance. However, again, it would be more helpful to start learning from a PROBLEM statement first, then move to an EXAMPLE of how to solve it, and then explain how the new information helps you with this in THEORY. It makes learning so much easier, and I don't understand teachers who don't follow this human problem-solving approach for better understanding and learning.

By Rok B

May 15, 2019

The course has valuable content, but there is not enough emphasis on how to create a tidy data set. You kind of learn what a tidy data set is (although the definition is vague), but you would need to see examples of messy data sets and how to convert them to tidy data sets. There is one exercise in swirl called tidyr that addresses that, but it would be nice to also have videos on this topic.

By Ingrid M V

Dec 23, 2020

Compared with other courses in the same series I observed several problems:

1. The explanation was not good; I had so many doubts that I had to clarify in other forums. The APIs lecture was too easy compared to what is required to solve the quiz. The dplyr section taught by Professor Roger Peng was the best explained.

2. Links don't work.

3. The questions are not answered by the teachers.

By Juha R

May 18, 2018

I like the specialization quite a bit as it contains real world data and difficult enough exercises. This particular course is maybe not as good as the other courses I have taken (1,2,5) as the instructions lack a bit of clarity sometimes. However, the peer reviewed assignments are quite tricky and an excellent opportunity for learning. Took me some serious work to get this course done.

By Abdul S

Apr 2, 2020

The first thing about the course is that the learning objective was clearer. And the content tied back to it, while also leaving room for self research and study. The project instructions could be a bit clearer, but perhaps the availability of the discussion forum allows this to foster curiosity and community interaction. Overall, it was a worthwhile course.

By Lalit O

Jan 17, 2018

All Coursera data science courses have been designed very carefully. I found this course very beneficial as it explains the concepts and also tests the knowledge of the learner through tests.

In this course I learnt the basics of fetching data from different sources like APIs, text files, web pages, etc. I also learnt to clean data using various techniques.

By Sam M

Jun 3, 2018

Excellent course! Very useful videos, quizzes, and assignments. Provided the hands-on experience I was looking for. Improvement needs to be done to provide more technical information for doing the quizzes and assignments. Many critical details are being left out and students end up spending way too much time in digging them up via Forum, Google, etc.

By Dev P

Dec 2, 2019

Good introduction to getting and cleaning data and very useful learning about the principles of tidy data.

Jeff Leek isn't as good a tutor as Roger Peng and it was a bit frustrating following along at times as no hyperlinks are available for the data. The lessons are just recycled content from Jeff's lectures.

The course project was a good challenge!

By Adetunji O

May 4, 2017

Really great course material. I spent way too much time on the exams and projects, because I believe not enough information was given (had to spend a lot of time searching through discussion forums, Stack Overflow, help files, etc., and while that is useful experience, it was a lot more time commitment than expected from the course description).

By chayan s

Apr 25, 2016

Honestly, I wanted to give a complete 5 rating to this course, because the content of the lectures is well explained. But there is one feature I didn't like at all: Coursera has made it mandatory for users to purchase the course in order to submit the quizzes/assignments. Except for that, the course is awesome.

By Dylan B

Dec 11, 2017

Good course, better structured than courses 1 and 2 of this programme. However, there are still a few frustrating moments when the lecturers all of a sudden use language/jargon that cannot be understood by a beginner with little background in computer science (like me). Final coursework is ambitious, but answers can be found on the internet.