Learner Reviews & Feedback for Getting and Cleaning Data by Johns Hopkins University

7,929 ratings
1,311 reviews

About the Course

Before you can work with data you have to get some. This course will cover the basic ways that data can be obtained. The course will cover obtaining data from the web, from APIs, from databases and from colleagues in various formats. It will also cover the basics of data cleaning and how to make data “tidy”. Tidy data dramatically speed downstream data analysis tasks. The course will also cover the components of a complete data set including raw data, processing instructions, codebooks, and processed data. The course will cover the basics needed for collecting, cleaning, and sharing data....



May 2, 2020

This course provides an introduction to some important concepts and tools for a very important aspect of data science: cleaning and organizing data before any analysis. A must for any data scientist.


Oct 25, 2016

This course is really challenging and a must for anyone who wants to be a data scientist or work with any sort of data. It teaches you how to make a very palatable data set from messy data.

801 - 825 of 1,274 Reviews for Getting and Cleaning Data

by Anil G

Oct 23, 2017


by Jihee Y

Nov 12, 2016


by Angelica S

Jan 2, 2017

The last course in this specialization had me second-guessing whether I could do this (based on how difficult the quizzes and course project were). I spent tons of time researching and looking at the forums and programming documentation. Going into this course, I was surprised by how much I knew when it came to the quizzes; the time I spent on all those resources decreased, and I actually knew where to begin! The course project was still VERY difficult for this beginner specialization compared to the information given in the lectures (I spent over 16 hours on it and still didn't meet ALL the criteria). However, it does challenge you to think and reflect on your knowledge. You probably won't understand 100% of what is in the course (especially the project), but you will LEARN, especially if you are totally NEW to programming like me! My knowledge has increased, and that is what I was hoping for. If you have those same hopes and do not mind not getting 100% on all quizzes and projects, this course is for you. Do be mindful that, in my view, this course requires more than 4-6 hours per week, especially for beginners. I spent more than 24 hours on week 4 alone.

by Marcelo A M

May 2, 2020

Although the course is very good and I recommend it to anyone who wants to pursue a career as a Data Scientist, it has some completely solvable problems. For example, some of the classes show the use of problematic packages that are completely dispensable in 99% of cases. There is also no guidance for Linux systems, which seems to me a sad omission for a course on an open-source tool. That said, the course is very good, and it will teach you how to deal with messy data and different data formats. I would not recommend it to someone who has no previous experience in R.

by Luc R

Dec 18, 2015

I was pleased to be selected as a beta tester, but it turned out to be a little boring for a course I had already passed. The slides are unchanged, so my concentration drops after a while, even with some goodwill. It seems there are new swirl lessons, but they don't seem to be available yet at beta-test time; bad luck. I like the new overall presentation (provided by Coursera?), where we can pick transcripts without leaving the normal course flow. I tend to click the "thumbs up" icon for each lecture, because I think the general organization of this specific course is quite good, and the contents are consistent and not uselessly overlapping (unlike some other courses of the track such as "Reproducible Research"). Also, the theme seems to fit the track's standard 4-week slot perfectly. I have been busy these days, so I haven't fully reviewed the course yet: I'll try to review as many lectures as possible until the end of the countdown. I'll try to do better on my next beta test, if any. Anyway, thanks a lot for this very interesting track (I plan to go up to the capstone project, certified).

by Miguell M

Jun 30, 2018

This course was pretty useful for learning the various ways to acquire, clean, and manipulate data, which I think is an awesome real-world skill. The course project in week 4 was a good way to exercise some of these skills, but I have some qualms about the delivery of the course project, primarily the instructions. The course project involves getting and cleaning a dataset, but the instructions are rather vague in some key areas, which I believe could lead to a great variety in submissions. I'm not sure if the vagueness in the instructions was intentional (perhaps to mimic real-world scenarios?), but it certainly led to a lot of confusion in the interpretation of the instructions, a sentiment reflected in the discussion forums. That being said, the course was useful!

by Haonan J

Feb 4, 2018

The content about getting data is too difficult for me, as I'm a student who just completed the R Programming course. It's hard for me to learn how to get data from APIs, websites and Excel in only one week. So I don't recommend this course for a beginner like me.

However, the content on cleaning data is great. The dplyr package is more convenient than what I learned in the last course, and the mentor is still great.

All in all, this is a nice course and it helped me a lot. Thanks a lot to the mentor. Maybe some day later, when I have a better foundation in programming, I will review the knowledge and skills in this course again.

by Christian B

Nov 4, 2016

The course content is important. I found the final assignment quite hard. I struggled a lot with R on it. Interestingly enough, when looking at the solutions during the peer reviews, others seem to have found much easier solutions than I had. I am not sure why. I got the same result, but my code looks far more complicated. Also, the description of the final assignment was a bit unclear. For example, were we supposed to rename the features or not? Were we supposed to calculate the mean per activity, per subject, or per activity-subject combination? Were we supposed to select the mean() only or also the FreqMean()? And so on.

by Rouholamin R

Jan 14, 2019

I had passed two courses of this specialization before this one. First of all, I think it was a little bit harder and filled with more content. For me, whatever the professors mention, I start researching it and learn. But in this course there was a lot of stuff I learnt, and unfortunately I have already started to forget regex patterns and so on.

I liked the project on so many levels, except that the main dataset wasn't well documented. After finding out what the data set is about, I did the project, and I think it helped me regain my confidence.

by Amr E

Jul 15, 2020

I very much liked this course. It is challenging and takes you one step deeper into data science. The final project is a real-world project of the kind you may face in your professional career. It is well organized in many respects. However, what I didn't like in this course is the following:

1- Many of the used functions are deprecated as of 2020 and haven't been updated

2- Some lectures took longer than they should (the Reading HDF5 lecture, for example)

3- Unlike the first two courses, the discussion forums are not as rich (especially in weeks 3 & 4)

by Karin R

May 24, 2020

This course is wonderful for those who are already equipped with coding experience. For the rest, it's extremely difficult, and I found myself wishing that there were better resources available for those who aren't already there. I have taken each course in succession and have purchased books to help guide me—and also have a very patient brother with advanced computer engineering expertise who had to answer all of my questions. I absolutely love the package in R that allows you to do tutorials. All of my stars are for that.

by Jason J

Jan 29, 2016

Thank you to the professors who made this course possible and especially to Dr. Peng who was willing to spend some time with us face to face via video. I found the course very challenging but at the same time I did learn quite a bit about R. Working through the course assignments and the final project was the best part. My rating is 4 stars because the course lectures are not engaging. The lecture style is basically just reading the slides to us and they don't take much time to explain what is going on.

by Rick H

Mar 8, 2016

It is a great course so far, with a lot of applicable topics covered. However, I feel that some of the questions are structured poorly to achieve their goals.

It is fine to have difficult questions where students are expected to do a lot of extra research, but it should be done in a way that lets the students know what they are getting into ahead of time, and it should not include questions or code that does not run without explaining what is supposed to be done.

It's just not a good teaching technique for learning.

by Jason B

Jun 4, 2016

Good course, though I have to say that the final project was a bit confusing, and I am not sure that the people who did the final project really understood the course and how to create a tidy dataset, as the ones that I looked at did not meet all the principles of tidy data that were outlined. What concerns me is that they all had similar issues, and are all doing peer review of each other - this means that there is no one that can make sure that their answers are really tidy...

by Dzmitry B

Nov 7, 2020

The principles of tidy data are well delivered and, overall, the course structure is great. Many great packages were covered (maybe some are a little outdated). Personally, I felt that data.table and dplyr/plyr shouldn't be covered in such depth, just mentioned. The main reason is that they are constantly updated and often enough deprecate functions/parameters. I believe learning R dialects should be an individual's choice and is not required for data processing in R.

by Alex B

Nov 3, 2016

This is an interesting, helpful class. It was challenging, and exposed me to a very wide variety of topics outside R for data analysis, including databases, XML, and APIs for getting data. I would have found swirl-type exercises for those topics helpful, as the additional practice really reinforced the lecture material and homework/quiz problems. I also would have found some worked examples or discussion of the homework problems after they were submitted helpful.

by Matthew D

Jun 1, 2020

Pretty good, I liked the instructor. He explained things better than other instructors and didn't just read off the slides. Some of the quizzes were a bit off in my opinion. As far as I could tell with some of my code and the consistent answers I got, I think the quiz is not up to date with the data. The data is outside the course and is updated fairly frequently, one as early as December 2019 and I think the quiz answers were not updated accordingly.

by Tamir L

Jul 25, 2016

This course could be a little difficult for people with no programming experience whatsoever, even if they took the previous R programming course in this series. Examples are often a little too laconic and not all of the material is as practically useful as the best of it.

However, Jeff Leek teaches some excellent data tidying, cleaning and extraction techniques with modern tools and libraries that I find very useful in my everyday work with data.

by Lee Y L R

Jun 13, 2017

This is a tough but important course. I learnt how to get the data from the web sources other than reading files of various formats, manipulate and group the data, and how to prepare a tidy data set for future analysis. There is ample practice to do each of the above. While the discussion forum is a great platform to address our queries, it would be good if there is greater clarity on some of the tools employed especially those in Week 2.

by Beñat G

Aug 11, 2016

I liked the initial approach of this course, as well as the resources given. However, I feel that the difficulty of the exercises wasn't really linked to the concepts: many things were not explained in the lectures, and I had to find the complementary information on various sites. So, rather than evaluating the understanding of the concepts given in the lectures, the course assessed the skill of looking for further information.

by Varun B

Jan 2, 2018

The course is really great. It starts off a bit slow but then really picks up pace with new concepts and R packages as you move to weeks 3-4. It really helps you strengthen the basics of what exactly tidy and clean data is before you move on to more advanced concepts. The course material needs updating though: some links did not work, and the downloadable presentations are not selection friendly.

by Steven Y

Feb 5, 2021

Great course. Just one suggestion. Thousands of students take this course. They have different internet environments, and the videos were recorded several years ago. It is possible that some of them are not able to download a file from a URL. It would be better if the course could provide the files directly, so that students who fail to download them can still continue to practice other skills.

by Stefan H

Apr 9, 2019

Pretty good examples, good guidance. However, again, it would be more helpful to start from a PROBLEM statement, move to an EXAMPLE of how to solve it, and then explain how the new information helps in THEORY. It makes learning so much easier, and I don't understand teachers who don't follow this human problem-solving approach for better understanding and learning.

by Rok B

May 15, 2019

The course has valuable content, but there is not enough emphasis on how to create a tidy data set. You kind of learn what a tidy data set is (although the definition is vague), but you would need to see examples of messy data sets and how to convert them to tidy data sets. There is one exercise in swirl called tidyr that addresses that, but it would be nice to also have videos on this topic.

by Ingrid M V

Dec 23, 2020

Compared with other courses in the same series I observed several problems:

1. The explanation was not good; I had so many doubts that I had to clarify them in other forums. The APIs lecture was too easy compared to what was required to solve the quiz. The dplyr section taught by Professor Roger Peng was the best explained.

2. Links don't work.

3. The questions are not answered by the teachers.