Practical Data Analysis (Second Edition)

Chapter 2. Preprocessing Data

Building real-world data analytics solutions requires accurate data. In this chapter, we discuss how to collect, clean, normalize, and transform raw data into a standard format such as Comma-Separated Values (CSV) or JavaScript Object Notation (JSON), using OpenRefine, a tool for processing messy data.

We will cover the following topics:

  • Data sources
  • Data scrubbing
  • Data reduction methods
  • Data formats
  • Getting started with OpenRefine