[tm351-notebooks.git] / notebooks / 03. Bending Data to Your Will / 03.4 Bending Data - Putting the Pieces Together.ipynb
1 {
2 "metadata": {
3 "name": "",
4 "signature": "sha256:edc308311c2dc2196d8bec2330e46855b6b98160d8dc287397494e98ac3e81d6"
5 },
6 "nbformat": 3,
7 "nbformat_minor": 0,
8 "worksheets": [
9 {
10 "cells": [
11 {
12 "cell_type": "markdown",
13 "metadata": {},
14 "source": [
15 "<div style='color:purple'>CT Note: do we need a synthesis activity here? If so, a structured one or an open one?</div>"
16 ]
17 },
18 {
19 "cell_type": "markdown",
20 "metadata": {},
21 "source": [
22 "In this final synthesis activity, you will have an opportunity to put into practice some of the things you have learned during this week. Try not to spend too long on this activity - no more than an hour.\n",
23 "\n",
24 "If you find yourself getting frustrated because you can't work out how to tidy your data, post a question to the course forums."
25 ]
26 },
27 {
28 "cell_type": "heading",
29 "level": 2,
30 "metadata": {},
31 "source": [
32 "Cleaning Your Own Dataset"
33 ]
34 },
35 {
36 "cell_type": "markdown",
37 "metadata": {},
38 "source": [
39 "For this activity, you will need to find your own dataset, for example by looking on data.gov.uk, your local council's website, or another source of data files that you have discovered yourself."
40 ]
41 },
42 {
43 "cell_type": "markdown",
44 "metadata": {},
45 "source": [
46 "Download the dataset you have chosen and load it into a pandas DataFrame. If necessary, do some preliminary parsing and tidying of it using OpenRefine first."
47 ]
48 },
49 {
50 "cell_type": "markdown",
51 "metadata": {},
52 "source": [
53 "Explore your dataset using a range of pandas methods so that you get a feel for what it contains. For example, look at the data types assigned to each column, or at the range of values (or the unique values) contained within a column.\n",
54 "\n",
55 "If there are any elements that need tidying up, make a note of what they are and try to clean them.\n",
56 "\n",
57 "If your dataset contains dates or datetimes, see if you can parse them into date or datetime objects.\n",
58 "\n",
59 "If your dataset is public and non-proprietary, consider sharing your notebook on OpenDesignStudio so that other students can learn from what you have done and discuss with you the approach you took to cleaning it."
60 ]
61 },
62 {
63 "cell_type": "heading",
64 "level": 2,
65 "metadata": {},
66 "source": [
67 "What Next?"
68 ]
69 },
70 {
71 "cell_type": "markdown",
72 "metadata": {},
73 "source": [
74 "That completes the practical activities for this session. Return to the course materials now."
75 ]
76 }
77 ],
78 "metadata": {}
79 }
80 ]
81 }
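
The activity in this notebook (load, explore, tidy, parse dates) could be sketched roughly as follows. This is only an illustration under stated assumptions: the inline sample data and the column names (`council`, `service`, `spend`, `date`) are hypothetical stand-ins for whatever dataset you choose, and your own data will almost certainly need different cleaning steps.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for a downloaded file.
# With a real file you would pass its path to pd.read_csv instead.
raw_csv = io.StringIO(
    'council,service,spend,date\n'
    'Milton Keynes,Libraries,"1,200",01/04/2014\n'
    ' milton keynes ,Parks,850,15/04/2014\n'
    'Bedford,Libraries,unknown,30/04/2014\n'
)

# Read every column as text first, so that nothing is coerced before we look at it.
df = pd.read_csv(raw_csv, dtype=str)

# Explore: column types and the distinct values in a column.
print(df.dtypes)
print(df["council"].unique())

# Tidy: strip stray whitespace and normalise capitalisation.
df["council"] = df["council"].str.strip().str.title()

# Tidy: drop thousands separators, then convert to numbers.
# Values that cannot be parsed (here "unknown") become NaN.
df["spend"] = pd.to_numeric(
    df["spend"].str.replace(",", "", regex=False), errors="coerce"
)

# Parse day-first date strings into datetime values.
df["date"] = pd.to_datetime(df["date"], format="%d/%m/%Y", errors="coerce")

print(df.dtypes)
print(df)
```

Reading the columns as text and converting them explicitly is one possible design choice; it makes each cleaning decision visible, at the cost of a few extra lines compared with letting pandas infer the types.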