Mining Massive Datasets course review

This is the first course I took after my summer holidays (well, I tried, because I eventually quit). It covers mining massive datasets and data management. The course environment has been great, and I also joined a WhatsApp group with all kinds of people from around the world sharing questions, answers, and links to courses, papers, and blogs, and helping other colleagues.

Starting with Mining Massive Datasets

The first week of the course is really interesting, with two cool topics: distributed file systems, and how to calculate PageRank, the Google algorithm used to rank web pages. In my opinion, the first part of the videos is really interesting if you are going to deal with large datasets, or even big data. The second part of the videos is very theoretical, and you learn how an apparently “easy” algorithm is actually quite complex, with graphs and linear algebra used to solve the problem.

To solve the questions, you don’t need any specific programming language; even Excel can help you solve them. A lot of students decided to use R, or Python with Pandas and NumPy.
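To give an idea of what the PageRank exercises look like, here is a minimal sketch in Python with NumPy. It is not taken from the course material: the function name `pagerank`, the three-page example graph, the teleportation factor `beta=0.85`, and the way dead ends are handled are all assumptions chosen for illustration.

```python
import numpy as np


def pagerank(adjacency, beta=0.85, tol=1e-10, max_iter=100):
    """Power iteration with teleportation on a small directed graph.

    adjacency[i][j] = 1 means page i links to page j.
    beta is an illustrative teleportation (damping) factor.
    """
    A = np.asarray(adjacency, dtype=float)
    n = A.shape[0]
    out_degree = A.sum(axis=1)

    # Build a column-stochastic matrix: M[j, i] is the probability of
    # moving from page i to page j. Assumption: a page with no outgoing
    # links is treated as linking to every page with equal probability.
    row_probs = np.where(
        out_degree[:, None] > 0,
        A / np.maximum(out_degree[:, None], 1.0),
        1.0 / n,
    )
    M = row_probs.T

    # Start from the uniform distribution and iterate until the rank
    # vector stops changing (or max_iter is reached).
    r = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        r_next = beta * (M @ r) + (1.0 - beta) / n
        if np.abs(r_next - r).sum() < tol:
            return r_next
        r = r_next
    return r


# Hypothetical three-page web: 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 0
links = [
    [0, 1, 1],
    [0, 0, 1],
    [1, 0, 0],
]
ranks = pagerank(links)
print(ranks)
```

In this toy graph, page 2 receives links from both other pages, so it ends up with the highest rank. The same calculation can be done by hand or in a spreadsheet for graphs this small, which is why no particular language is required.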

By the way, if you are interested in the course, here you can download the book, which contains all the chapters covered in the course!

Second week and … I quit

The truth is that after reading the content of the second week, I understood that this course is not for beginners. It’s for advanced users, because the algorithms you use are easy neither to understand nor to apply, and the mathematical foundation is hard. So, before starting this course, refresh your algebra. Because I don’t need this knowledge in the short term, and I won’t apply any of the course material in my projects, I decided to abandon the course.

In general, the course is very interesting if you are working with #BigData and have large datasets to work with; otherwise, it makes no sense to follow it. Another aspect is that the exercises are not solved with large datasets, because the course is more theoretical than practical (although you need some skills if you want to solve the exercises!!).

The good part of the course is that it explains how to use complex algorithms to carry out different tasks, so if you are working in the field covered by the course, it is perfect for you!

I watched more videos from the following weeks, but the complexity increases, and other students even said that the time they dedicated to the course was more than they had expected. I am sorry that I stopped following the course but, I repeat, this course is not for beginners, which is my case!

I guess that if I work some day on something related to the course, I will put more effort into it!

And in the meantime, there are more courses to follow!!

Have a nice day and happy coding!

PS: a new edition of the course is starting on 31st January 2016!
