Python for Data Scientists and Economists


Objective

Python has recently become a compelling alternative for scientific work. Because it is a general-purpose language, the entire data pipeline, from reading source data to visualization, can be handled without switching languages. This course introduces a range of tools that make the data "speak" so that results can be obtained quickly.

Outline

Part 1: Handling Data

* Introduction:

                Review of Python basics

                Overview of the Python ecosystem for data science

                Introduction to good practices

                Overview of the principles of data science

* Handling structured data:

                Basic principles with numpy

                Manipulate databases with pandas and SQL

                Introduction to spatial data (geopandas)

* Handling less traditional data:

                Retrieve data by webscraping and APIs

                Manipulate text data

Part 2: Visualization

* Overview of the basic plotting packages:

                matplotlib, seaborn

* Cartography:

                static maps

                dynamic maps (HTML)

Part 3: Modeling

* General models:

                Regression

                PCA (principal component analysis)

                Machine Learning with sklearn

* Natural Language Processing

* Going further with Machine Learning models

References