Learner Reviews and Feedback for Getting and Cleaning Data by Johns Hopkins University

7,870 ratings
1,290 reviews

About the Course

Before you can work with data you have to get some. This course will cover the basic ways that data can be obtained. The course will cover obtaining data from the web, from APIs, from databases and from colleagues in various formats. It will also cover the basics of data cleaning and how to make data “tidy”. Tidy data dramatically speed downstream data analysis tasks. The course will also cover the components of a complete data set including raw data, processing instructions, codebooks, and processed data. The course will cover the basics needed for collecting, cleaning, and sharing data....


May 2, 2020

This course provides an introduction to some important concepts and tools for a very important aspect of data science: cleaning and organizing data before any analysis. A must for any data scientist.

Oct 25, 2016

This course is really challenging and a must for anyone who wants to be a data scientist or work with any sort of data. It teaches you how to make a very palatable data set from messy data.

1026 - 1050 of 1,253 Reviews for Getting and Cleaning Data

by Harsha G

Mar 3, 2016

very helpful

by Giorgos A

May 26, 2019

good course

by M C

Jan 22, 2021

Pretty OK

by Greg B G

Oct 3, 2016

very good

by Khobindra N C

May 18, 2016


by Luiz C

Sep 16, 2017


by Nithya M

Jun 23, 2017


by Shreya S

Mar 20, 2017


by Veena M

Feb 21, 2016


by Chuang M

Feb 6, 2016


by Marcos A

May 12, 2019


by Nithin k G

Nov 16, 2018


by Ashish S

Feb 6, 2017


by shipra g

Aug 8, 2016


by Vivek R

Feb 7, 2016


by Mrunal 1

May 8, 2021


by Dimitri d

Feb 23, 2017


by Sergio R

Nov 27, 2016


by sugyoo

Oct 22, 2016


by Borja C

May 7, 2016


by Miguel C

Mar 25, 2020

I really enjoyed and learned a lot in this course.

I feel a lot more comfortable looking for and reading data. I learned how to clean data and get it ready for further analysis. I think the course project was particularly good for fully understanding the process of tidying data and all the aspects it involves, such as writing a code book and a README file to accompany it.

Furthermore, I believe I further developed my R programming skills, by learning how to code new things or things I already knew but in a more efficient way, by using new packages and techniques.

Moreover, I found Professor Jeffrey Leek quite engaging, very easy to understand and I had complete confidence in his knowledge on this subject.

However, I believe the course is slightly outdated. I was often disheartened and frustrated by not being able to replicate what was being done in the lecture videos. For example, many links no longer worked, and some information was simply no longer correct. I found the discussion forums and many mentors' responses to be very helpful. I think this can easily be fixed by writing up errata or updating the lecture videos.

by Tomasz J

Dec 11, 2017

The course teaches you some principles of tidy data and data cleaning, but it's very messy.

There is no systematic approach to the plyr and dplyr libraries. The teachers pick some functions from one library and some from the other, without any clear principle. It looks as if Prof. Leek and Prof. Peng are presenting their favorite functions without consulting each other. They do it the right way, but it is very confusing. On the other hand, loading data is approached very encyclopedically.

Assignments not only check what was taught in the videos but also sometimes require new skills and searching Stack Overflow etc. (e.g. code to read Fortran files). This is not how you construct good courses.

Additionally, some instructions for the final assignment are provided in the submission part! (Expected names of the files should be provided in the assignment description, not on the submission page.)

Prof. Peng and Prof. Leek are very skilled and know their job, but they don't know how to teach efficiently. Nonetheless, if you are motivated to learn, this course may be very helpful.

by Rahul G

Aug 3, 2021

The course is great and essential to becoming a good data scientist. It provided me with valuable lessons in managing and processing raw data.

A major issue is that the course has not been updated in a while and has a lot of missing or expired data sources. You will, however, be able to find and download the data if you do a bit of searching online. All the data used in the course is still available online; you just need to find it.

The course will also require you to do a bit of research and exploration on your own to use some of the methods specified in the lectures. Not all the details are handed to you in the video lectures. I have seen other people complain about this, but my personal view is that you will need the skill to research and adapt in a practical scenario outside the course anyway, and this is the best place to learn that skill as well.

It would have been great for the course to be updated and made easier for beginners, but alas. It is good to have the basics of R, using libraries, and a bit of data analysis under your belt before jumping into this course.

by Guillermo A G

Oct 23, 2017

This course was quite challenging in comparison with the first two. I felt that the material provided by the instructors was not enough to tackle the quizzes and assignments, so it's necessary to spend a lot of time researching on your own in other sources. I struggled with the Course Project Assignment because I didn't understand exactly what I was supposed to do. Fortunately, the forum threads were really helpful. Nevertheless, the course's intention is very valuable, and if you are patient and go all the way through it you will improve your data science skills and learn very useful techniques and habits, especially if you're a beginner. But I strongly suggest that the instructors make the course contents more explicit and helpful.

by John Y

Aug 4, 2017

Great class for an important piece of data analytics and data science. One issue I've noticed with R compared to Anaconda/Python is that a lot of the libraries required for the class aren't explicitly mentioned. That's fine if you're experienced with these environments and comfortable reading error messages. It's a minor annoyance to me when I run a script and realize I don't have a library installed.

I'd imagine, though, that it's extremely frustrating for beginners who may have written perfectly good code but haven't figured out that they simply need to install certain packages to answer quiz or homework questions. Perhaps having a full library or package list in Course 1 of this series would be helpful.