1 {
2 "metadata": {
3 "name": "",
4 "signature": "sha256:ed9f987045fb59cd00c996657ab30f8a3118d71b80f83d6b948e93841c988104"
5 },
6 "nbformat": 3,
7 "nbformat_minor": 0,
8 "worksheets": [
9 {
10 "cells": [
11 {
12 "cell_type": "heading",
13 "level": 1,
14 "metadata": {},
15 "source": [
16 "Data File Formats - Other"
17 ]
18 },
19 {
20 "cell_type": "markdown",
21 "metadata": {},
22 "source": [
23 "In this notebook, you will learn how to work with a variety of other file formats. Details for some file formats are left deliberately sparse. If you find yourself spending a lot of time working with such file formats, feel free to add additional notes to this notebook, or create a new notebook to record the recipes you find useful."
24 ]
25 },
26 {
27 "cell_type": "heading",
28 "level": 2,
29 "metadata": {},
30 "source": [
31 "Spreadsheet Files (Excel XLS and XLSX Files)"
32 ]
33 },
34 {
35 "cell_type": "markdown",
36 "metadata": {},
37 "source": [
38 "Although spreadsheet files are one of the most widely used file formats for sharing data, we have relegated them to this notebook because we want you to get into the habit of using other file formats to publish and request data yourself!"
39 ]
40 },
41 {
42 "cell_type": "markdown",
43 "metadata": {},
44 "source": [
45 "Because Excel is one of the most widely used spreadsheet applications, its default file formats are the ones you are most likely to encounter. Excel spreadsheet files can be recognised by the file extensions *.xls* and *.xlsx*."
46 ]
47 },
48 {
49 "cell_type": "markdown",
50 "metadata": {},
51 "source": [
52 "You can load a sheet from a spreadsheet file into a *pandas* dataframe using the `pd.read_excel()` function."
53 ]
54 },
55 {
56 "cell_type": "code",
57 "collapsed": false,
58 "input": [
59 "#The xlrd library allows us to read data from Excel spreadsheet files (note that xlrd 2.0 and later supports only the older .xls format)\n",
60 "import xlrd\n",
61 "#The following spreadsheet is taken from a HEFCE briefing on \"Data about demand and supply in higher education subjects\"\n",
62 "##http://www.hefce.ac.uk/whatwedo/crosscutting/sivs/data/ [retrieved 22/07/14]\n",
63 "workbook = xlrd.open_workbook('data/subjects_analysis_undergraduates.xls')\n",
64 "#It also allows us to preview the sheet names\n",
65 "print(workbook.sheet_names())"
66 ],
67 "language": "python",
68 "metadata": {},
69 "outputs": [
70 {
71 "output_type": "stream",
72 "stream": "stdout",
73 "text": [
74 "['Information', 'Table 2.1', 'Table 2.1.2', 'Table 2.1.3', 'Table 2.1.4', 'Table 2.2', 'Table 2.2.1', 'Table 2.2.2', 'Table 2.2.2a', 'Table 2.2.3', 'Table 2.2.5', 'Table 2.3', 'Table 2.3.1', 'Table 2.3.2', 'Table 2.3.2a', 'Table 2.3.5', 'Table 2.4', 'Table 2.4.1', 'Table 2.4.2', 'Table 2.4.2a', 'Table 2.4.5', 'Table 2.5', 'Table 2.5.1', 'Table 2.5.2', 'Table 2.5.2a', 'Table 2.5.5', 'Table 2.6', 'Table 2.6.1', 'Table 2.6.2', 'Table 2.6.2a', 'Table 2.6.5', 'Table 2.7', 'Table 2.7.1', 'Table 2.7.2', 'Table 2.7.2a', 'Table 2.7.5', 'Table 2.8', 'Table 2.8.1', 'Table 2.8.2', 'Table 2.8.2a', 'Table 2.8.5', 'Table 2.9', 'Table 2.9.1', 'Table 2.9.2', 'Table 2.9.2a', 'Table 2.9.5', 'Table 2.10', 'Table 2.10.1', 'Table 2.10.2', 'Table 2.10.2a', 'Table 2.10.5', 'Table 2.11', 'Table 2.11.1', 'Table 2.11.2', 'Table 2.11.2a', 'Table 2.11.5']\n"
75 ]
76 }
77 ],
78 "prompt_number": 100
79 },
80 {
81 "cell_type": "code",
82 "collapsed": false,
83 "input": [
84 "#We can try to import a sheet directly into pandas using its read_excel() function (pandas 0.21 and later name the parameter sheet_name; older versions used sheetname).\n",
85 "import pandas as pd\n",
86 "pd.read_excel('data/subjects_analysis_undergraduates.xls', sheet_name='Table 2.1.2')[:10]"
87 ],
88 "language": "python",
89 "metadata": {},
90 "outputs": [
91 {
92 "html": [
93 "<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
94 "<table border=\"1\" class=\"dataframe\">\n",
95 " <thead>\n",
96 " <tr style=\"text-align: right;\">\n",
97 " <th></th>\n",
98 " <th>Return to Information and links to tables</th>\n",
99 " <th>Unnamed: 1</th>\n",
100 " <th>Unnamed: 2</th>\n",
101 " <th>Unnamed: 3</th>\n",
102 " <th>Unnamed: 4</th>\n",
103 " <th>Unnamed: 5</th>\n",
104 " <th>Unnamed: 6</th>\n",
105 " <th>Unnamed: 7</th>\n",
106 " <th>Unnamed: 8</th>\n",
107 " <th>Unnamed: 9</th>\n",
108 " <th>Unnamed: 10</th>\n",
109 " <th>Unnamed: 11</th>\n",
110 " <th>Unnamed: 12</th>\n",
111 " <th>Unnamed: 13</th>\n",
112 " <th>Unnamed: 14</th>\n",
113 " </tr>\n",
114 " </thead>\n",
115 " <tbody>\n",
116 " <tr>\n",
117 " <th>0</th>\n",
118 " <td>NaN</td>\n",
119 " <td> NaN</td>\n",
120 " <td> NaN</td>\n",
121 " <td> NaN</td>\n",
122 " <td> NaN</td>\n",
123 " <td> NaN</td>\n",
124 " <td> NaN</td>\n",
125 " <td> NaN</td>\n",
126 " <td> NaN</td>\n",
127 " <td> NaN</td>\n",
128 " <td> NaN</td>\n",
129 " <td> NaN</td>\n",
130 " <td> NaN</td>\n",
131 " <td> NaN</td>\n",
132 " <td> NaN</td>\n",
133 " </tr>\n",
134 " <tr>\n",
135 " <th>1</th>\n",
136 " <td>NaN</td>\n",
137 " <td> Table 2.1.2 Numbers of A level entries in STEM...</td>\n",
138 " <td> NaN</td>\n",
139 " <td> NaN</td>\n",
140 " <td> NaN</td>\n",
141 " <td> NaN</td>\n",
142 " <td> NaN</td>\n",
143 " <td> NaN</td>\n",
144 " <td> NaN</td>\n",
145 " <td> NaN</td>\n",
146 " <td> NaN</td>\n",
147 " <td> NaN</td>\n",
148 " <td> NaN</td>\n",
149 " <td> NaN</td>\n",
150 " <td> NaN</td>\n",
151 " </tr>\n",
152 " <tr>\n",
153 " <th>2</th>\n",
154 " <td>NaN</td>\n",
155 " <td> NaN</td>\n",
156 " <td> NaN</td>\n",
157 " <td> NaN</td>\n",
158 " <td> NaN</td>\n",
159 " <td> NaN</td>\n",
160 " <td> NaN</td>\n",
161 " <td> NaN</td>\n",
162 " <td> NaN</td>\n",
163 " <td> NaN</td>\n",
164 " <td> NaN</td>\n",
165 " <td> NaN</td>\n",
166 " <td> NaN</td>\n",
167 " <td> NaN</td>\n",
168 " <td> NaN</td>\n",
169 " </tr>\n",
170 " <tr>\n",
171 " <th>3</th>\n",
172 " <td>NaN</td>\n",
173 " <td> A level subject area</td>\n",
174 " <td> 2002-03</td>\n",
175 " <td> 2003-04</td>\n",
176 " <td> 2004-05</td>\n",
177 " <td> 2005-06</td>\n",
178 " <td> 2006-07</td>\n",
179 " <td> 2007-08</td>\n",
180 " <td> 2008-09</td>\n",
181 " <td> 2009-10</td>\n",
182 " <td> 2010-11</td>\n",
183 " <td> 2011-12</td>\n",
184 " <td> % change 2002-03 to 2011-12 (2003-04 to 2011-1...</td>\n",
185 " <td> % change 2009-10 to 2011-12</td>\n",
186 " <td> % change 2010-11 to 2011-12</td>\n",
187 " </tr>\n",
188 " <tr>\n",
189 " <th>4</th>\n",
190 " <td>NaN</td>\n",
191 " <td> Biology</td>\n",
192 " <td> 46192</td>\n",
193 " <td> 46320</td>\n",
194 " <td> 47925</td>\n",
195 " <td> 48813</td>\n",
196 " <td> 48659</td>\n",
197 " <td> 50148</td>\n",
198 " <td> 49526</td>\n",
199 " <td> 51932</td>\n",
200 " <td> 55667</td>\n",
201 " <td> 56720</td>\n",
202 " <td> 0.2279183</td>\n",
203 " <td> 0.09219749</td>\n",
204 " <td> 0.01891605</td>\n",
205 " </tr>\n",
206 " <tr>\n",
207 " <th>5</th>\n",
208 " <td>NaN</td>\n",
209 " <td> Chemistry</td>\n",
210 " <td> 32544</td>\n",
211 " <td> 33492</td>\n",
212 " <td> 34935</td>\n",
213 " <td> 36123</td>\n",
214 " <td> 36458</td>\n",
215 " <td> 37711</td>\n",
216 " <td> 38465</td>\n",
217 " <td> 40063</td>\n",
218 " <td> 43935</td>\n",
219 " <td> 45087</td>\n",
220 " <td> 0.3854167</td>\n",
221 " <td> 0.1254025</td>\n",
222 " <td> 0.02622055</td>\n",
223 " </tr>\n",
224 " <tr>\n",
225 " <th>6</th>\n",
226 " <td>NaN</td>\n",
227 " <td> ICT, Computer studies</td>\n",
228 " <td> 25236</td>\n",
229 " <td> 21871</td>\n",
230 " <td> 19209</td>\n",
231 " <td> 17572</td>\n",
232 " <td> 15908</td>\n",
233 " <td> 14282</td>\n",
234 " <td> 13375</td>\n",
235 " <td> 12921</td>\n",
236 " <td> 12563</td>\n",
237 " <td> 11822</td>\n",
238 " <td> -0.5315422</td>\n",
239 " <td> -0.08505534</td>\n",
240 " <td> -0.05898273</td>\n",
241 " </tr>\n",
242 " <tr>\n",
243 " <th>7</th>\n",
244 " <td>NaN</td>\n",
245 " <td> Design and technology</td>\n",
246 " <td> 16128</td>\n",
247 " <td> 16124</td>\n",
248 " <td> 16766</td>\n",
249 " <td> 17346</td>\n",
250 " <td> 15702</td>\n",
251 " <td> 15718</td>\n",
252 " <td> 15640</td>\n",
253 " <td> 16519</td>\n",
254 " <td> 16301</td>\n",
255 " <td> 15234</td>\n",
256 " <td> -0.05543155</td>\n",
257 " <td> -0.07778921</td>\n",
258 " <td> -0.06545611</td>\n",
259 " </tr>\n",
260 " <tr>\n",
261 " <th>8</th>\n",
262 " <td>NaN</td>\n",
263 " <td> Mathematics</td>\n",
264 " <td> 51061</td>\n",
265 " <td> 47997</td>\n",
266 " <td> 48058</td>\n",
267 " <td> 51168</td>\n",
268 " <td> 54833</td>\n",
269 " <td> 59105</td>\n",
270 " <td> 66552</td>\n",
271 " <td> 70654</td>\n",
272 " <td> 76528</td>\n",
273 " <td> 78951</td>\n",
274 " <td> 0.5462094</td>\n",
275 " <td> 0.1174314</td>\n",
276 " <td> 0.03166161</td>\n",
277 " </tr>\n",
278 " <tr>\n",
279 " <th>9</th>\n",
280 " <td>NaN</td>\n",
281 " <td> Further mathematics</td>\n",
282 " <td> NaN</td>\n",
283 " <td> 5443</td>\n",
284 " <td> 5627</td>\n",
285 " <td> 6950</td>\n",
286 " <td> 7551</td>\n",
287 " <td> 8743</td>\n",
288 " <td> 10073</td>\n",
289 " <td> 11312</td>\n",
290 " <td> 11805</td>\n",
291 " <td> 12688</td>\n",
292 " <td> n/a</td>\n",
293 " <td> 0.1216407</td>\n",
294 " <td> 0.07479881</td>\n",
295 " </tr>\n",
296 " </tbody>\n",
297 "</table>\n",
298 "</div>"
299 ],
300 "metadata": {},
301 "output_type": "pyout",
302 "prompt_number": 89,
303 "text": [
304 " Return to Information and links to tables \\\n",
305 "0 NaN \n",
306 "1 NaN \n",
307 "2 NaN \n",
308 "3 NaN \n",
309 "4 NaN \n",
310 "5 NaN \n",
311 "6 NaN \n",
312 "7 NaN \n",
313 "8 NaN \n",
314 "9 NaN \n",
315 "\n",
316 " Unnamed: 1 Unnamed: 2 Unnamed: 3 \\\n",
317 "0 NaN NaN NaN \n",
318 "1 Table 2.1.2 Numbers of A level entries in STEM... NaN NaN \n",
319 "2 NaN NaN NaN \n",
320 "3 A level subject area 2002-03 2003-04 \n",
321 "4 Biology 46192 46320 \n",
322 "5 Chemistry 32544 33492 \n",
323 "6 ICT, Computer studies 25236 21871 \n",
324 "7 Design and technology 16128 16124 \n",
325 "8 Mathematics 51061 47997 \n",
326 "9 Further mathematics NaN 5443 \n",
327 "\n",
328 " Unnamed: 4 Unnamed: 5 Unnamed: 6 Unnamed: 7 Unnamed: 8 Unnamed: 9 \\\n",
329 "0 NaN NaN NaN NaN NaN NaN \n",
330 "1 NaN NaN NaN NaN NaN NaN \n",
331 "2 NaN NaN NaN NaN NaN NaN \n",
332 "3 2004-05 2005-06 2006-07 2007-08 2008-09 2009-10 \n",
333 "4 47925 48813 48659 50148 49526 51932 \n",
334 "5 34935 36123 36458 37711 38465 40063 \n",
335 "6 19209 17572 15908 14282 13375 12921 \n",
336 "7 16766 17346 15702 15718 15640 16519 \n",
337 "8 48058 51168 54833 59105 66552 70654 \n",
338 "9 5627 6950 7551 8743 10073 11312 \n",
339 "\n",
340 " Unnamed: 10 Unnamed: 11 Unnamed: 12 \\\n",
341 "0 NaN NaN NaN \n",
342 "1 NaN NaN NaN \n",
343 "2 NaN NaN NaN \n",
344 "3 2010-11 2011-12 % change 2002-03 to 2011-12 (2003-04 to 2011-1... \n",
345 "4 55667 56720 0.2279183 \n",
346 "5 43935 45087 0.3854167 \n",
347 "6 12563 11822 -0.5315422 \n",
348 "7 16301 15234 -0.05543155 \n",
349 "8 76528 78951 0.5462094 \n",
350 "9 11805 12688 n/a \n",
351 "\n",
352 " Unnamed: 13 Unnamed: 14 \n",
353 "0 NaN NaN \n",
354 "1 NaN NaN \n",
355 "2 NaN NaN \n",
356 "3 % change 2009-10 to 2011-12 % change 2010-11 to 2011-12 \n",
357 "4 0.09219749 0.01891605 \n",
358 "5 0.1254025 0.02622055 \n",
359 "6 -0.08505534 -0.05898273 \n",
360 "7 -0.07778921 -0.06545611 \n",
361 "8 0.1174314 0.03166161 \n",
362 "9 0.1216407 0.07479881 "
363 ]
364 }
365 ],
366 "prompt_number": 89
367 },
368 {
369 "cell_type": "markdown",
370 "metadata": {},
371 "source": [
372 "By inspecting this data, or by opening the spreadsheet using a spreadsheet application or OpenRefine, we notice that the first few rows are metadata or blank rows. We can skip a certain number of lines at the top of the sheet using the `skiprows` parameter, or we can specify the header row explicitly, in which case the rows preceding it are ignored. We can also define which columns we wish to import.\n",
373 "\n",
374 "(For more information, see the documentation: http://pandas.pydata.org/pandas-docs/dev/generated/pandas.io.excel.read_excel.html )"
375 ]
376 },
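{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a rough sketch of how `skiprows` and `header` relate, the following cell reads a small made-up table from an in-memory CSV string using `pd.read_csv()`. This is a stand-in for a spreadsheet, on the assumption that the two parameters behave the same way in `pd.read_excel()`; the table text and variable names are invented for the illustration, with the figures borrowed from the sheet previewed above."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import io\n",
"import pandas as pd\n",
"\n",
"#A made-up table with two metadata lines above the real header line\n",
"raw = '\\n'.join(['Example table title,,',\n",
"                 'Example note about the table,,',\n",
"                 'subject,2010-11,2011-12',\n",
"                 'Biology,55667,56720',\n",
"                 'Chemistry,43935,45087'])\n",
"\n",
"#Option 1: discard the first two lines, so the next line becomes the header\n",
"skipped = pd.read_csv(io.StringIO(raw), skiprows=2)\n",
"\n",
"#Option 2: say which line (counting from 0) holds the header; the lines before it are ignored\n",
"headed = pd.read_csv(io.StringIO(raw), header=2)\n",
"\n",
"#Both routes should give the same dataframe\n",
"print(skipped.equals(headed))\n",
"headed"
],
"language": "python",
"metadata": {},
"outputs": []
},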
377 {
378 "cell_type": "code",
379 "collapsed": false,
380 "input": [
381 "pd.read_excel('data/subjects_analysis_undergraduates.xls', sheet_name='Table 2.1.2', header=4, usecols=\"B:N\")[:5]"
382 ],
383 "language": "python",
384 "metadata": {},
385 "outputs": [
386 {
387 "html": [
388 "<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
389 "<table border=\"1\" class=\"dataframe\">\n",
390 " <thead>\n",
391 " <tr style=\"text-align: right;\">\n",
392 " <th></th>\n",
393 " <th>A level subject area</th>\n",
394 " <th>2002-03</th>\n",
395 " <th>2003-04</th>\n",
396 " <th>2004-05</th>\n",
397 " <th>2005-06</th>\n",
398 " <th>2006-07</th>\n",
399 " <th>2007-08</th>\n",
400 " <th>2008-09</th>\n",
401 " <th>2009-10</th>\n",
402 " <th>2010-11</th>\n",
403 " <th>2011-12</th>\n",
404 " <th>% change 2002-03 to 2011-12 (2003-04 to 2011-12 for further mathematics)</th>\n",
405 " <th>% change 2009-10 to 2011-12</th>\n",
406 " </tr>\n",
407 " </thead>\n",
408 " <tbody>\n",
409 " <tr>\n",
410 " <th>0</th>\n",
411 " <td> Biology</td>\n",
412 " <td> 46192</td>\n",
413 " <td> 46320</td>\n",
414 " <td> 47925</td>\n",
415 " <td> 48813</td>\n",
416 " <td> 48659</td>\n",
417 " <td> 50148</td>\n",
418 " <td> 49526</td>\n",
419 " <td> 51932</td>\n",
420 " <td> 55667</td>\n",
421 " <td> 56720</td>\n",
422 " <td> 0.2279183</td>\n",
423 " <td> 0.092197</td>\n",
424 " </tr>\n",
425 " <tr>\n",
426 " <th>1</th>\n",
427 " <td> Chemistry</td>\n",
428 " <td> 32544</td>\n",
429 " <td> 33492</td>\n",
430 " <td> 34935</td>\n",
431 " <td> 36123</td>\n",
432 " <td> 36458</td>\n",
433 " <td> 37711</td>\n",
434 " <td> 38465</td>\n",
435 " <td> 40063</td>\n",
436 " <td> 43935</td>\n",
437 " <td> 45087</td>\n",
438 " <td> 0.3854167</td>\n",
439 " <td> 0.125402</td>\n",
440 " </tr>\n",
441 " <tr>\n",
442 " <th>2</th>\n",
443 " <td> ICT, Computer studies</td>\n",
444 " <td> 25236</td>\n",
445 " <td> 21871</td>\n",
446 " <td> 19209</td>\n",
447 " <td> 17572</td>\n",
448 " <td> 15908</td>\n",
449 " <td> 14282</td>\n",
450 " <td> 13375</td>\n",
451 " <td> 12921</td>\n",
452 " <td> 12563</td>\n",
453 " <td> 11822</td>\n",
454 " <td> -0.5315422</td>\n",
455 " <td>-0.085055</td>\n",
456 " </tr>\n",
457 " <tr>\n",
458 " <th>3</th>\n",
459 " <td> Design and technology</td>\n",
460 " <td> 16128</td>\n",
461 " <td> 16124</td>\n",
462 " <td> 16766</td>\n",
463 " <td> 17346</td>\n",
464 " <td> 15702</td>\n",
465 " <td> 15718</td>\n",
466 " <td> 15640</td>\n",
467 " <td> 16519</td>\n",
468 " <td> 16301</td>\n",
469 " <td> 15234</td>\n",
470 " <td>-0.05543155</td>\n",
471 " <td>-0.077789</td>\n",
472 " </tr>\n",
473 " <tr>\n",
474 " <th>4</th>\n",
475 " <td> Mathematics</td>\n",
476 " <td> 51061</td>\n",
477 " <td> 47997</td>\n",
478 " <td> 48058</td>\n",
479 " <td> 51168</td>\n",
480 " <td> 54833</td>\n",
481 " <td> 59105</td>\n",
482 " <td> 66552</td>\n",
483 " <td> 70654</td>\n",
484 " <td> 76528</td>\n",
485 " <td> 78951</td>\n",
486 " <td> 0.5462094</td>\n",
487 " <td> 0.117431</td>\n",
488 " </tr>\n",
489 " </tbody>\n",
490 "</table>\n",
491 "</div>"
492 ],
493 "metadata": {},
494 "output_type": "pyout",
495 "prompt_number": 96,
496 "text": [
497 " A level subject area 2002-03 2003-04 2004-05 2005-06 2006-07 \\\n",
498 "0 Biology 46192 46320 47925 48813 48659 \n",
499 "1 Chemistry 32544 33492 34935 36123 36458 \n",
500 "2 ICT, Computer studies 25236 21871 19209 17572 15908 \n",
501 "3 Design and technology 16128 16124 16766 17346 15702 \n",
502 "4 Mathematics 51061 47997 48058 51168 54833 \n",
503 "\n",
504 " 2007-08 2008-09 2009-10 2010-11 2011-12 \\\n",
505 "0 50148 49526 51932 55667 56720 \n",
506 "1 37711 38465 40063 43935 45087 \n",
507 "2 14282 13375 12921 12563 11822 \n",
508 "3 15718 15640 16519 16301 15234 \n",
509 "4 59105 66552 70654 76528 78951 \n",
510 "\n",
511 " % change 2002-03 to 2011-12 (2003-04 to 2011-12 for further mathematics) \\\n",
512 "0 0.2279183 \n",
513 "1 0.3854167 \n",
514 "2 -0.5315422 \n",
515 "3 -0.05543155 \n",
516 "4 0.5462094 \n",
517 "\n",
518 " % change 2009-10 to 2011-12 \n",
519 "0 0.092197 \n",
520 "1 0.125402 \n",
521 "2 -0.085055 \n",
522 "3 -0.077789 \n",
523 "4 0.117431 "
524 ]
525 }
526 ],
527 "prompt_number": 96
528 },
529 {
530 "cell_type": "code",
531 "collapsed": false,
532 "input": [
533 "#Having inspected the sheet preview, we can use xlrd to read the table title from the cell that contains it\n",
534 "#Note that row and column indices are integers, counting from 0.\n",
535 "sheet=workbook.sheet_by_name('Table 2.1.2')\n",
536 "sheet.cell_value(rowx=2, colx=1)"
537 ],
538 "language": "python",
539 "metadata": {},
540 "outputs": [
541 {
542 "metadata": {},
543 "output_type": "pyout",
544 "prompt_number": 108,
545 "text": [
546 "'Table 2.1.2 Numbers of A level entries in STEM by subject area, 2002-03 to 2011-12'"
547 ]
548 }
549 ],
550 "prompt_number": 108
551 },
552 {
553 "cell_type": "heading",
554 "level": 2,
555 "metadata": {},
556 "source": [
557 "XML Files"
558 ]
559 },
560 {
561 "cell_type": "markdown",
562 "metadata": {},
563 "source": [
564 "Importing XML data into a *pandas* dataframe is a little trickier than importing JSON. Versions of *pandas* before 1.3 have no built-in function for importing XML (`pd.read_xml()` was added in version 1.3).\n",
565 "\n",
566 "Without it, you need to load in a file, parse it using a third-party parser such as `lxml`, and then handle the mapping to the dataframe yourself.\n",
567 "\n",
568 "Alternatively, use OpenRefine to parse the elements of the XML document that you are interested in and then save the data out again as a tabular CSV document."
569 ]
570 },
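{
"cell_type": "markdown",
"metadata": {},
"source": [
"The parse-then-map approach described above can be sketched as follows. This is a minimal illustration rather than a general recipe: it uses the standard library's `xml.etree.ElementTree` parser in place of `lxml` (whose `lxml.etree` module offers a largely compatible interface), and the XML document and its element names are invented for the example."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import xml.etree.ElementTree as ET\n",
"import pandas as pd\n",
"\n",
"#A made-up XML document; the element names are assumptions for this sketch\n",
"xml_doc = '''<carparks>\n",
"<carpark><name>Example Road</name><spaces>40</spaces></carpark>\n",
"<carpark><name>Sample Street</name><spaces>25</spaces></carpark>\n",
"</carparks>'''\n",
"\n",
"root = ET.fromstring(xml_doc)\n",
"\n",
"#Map each <carpark> element to a dict, with one key per dataframe column\n",
"rows = [{'name': el.findtext('name'), 'spaces': int(el.findtext('spaces'))}\n",
"        for el in root.findall('carpark')]\n",
"\n",
"xml_df = pd.DataFrame(rows, columns=['name', 'spaces'])\n",
"xml_df"
],
"language": "python",
"metadata": {},
"outputs": []
},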
571 {
572 "cell_type": "markdown",
573 "metadata": {},
574 "source": [
575 "We will try to limit our use of XML-based datasets in this course, preferring instead CSV formats for tabular data and JSON for more elaborately structured datasets. You will, however, work with a particular style of XML later in the course when you look at Linked Data and the semantic web."
576 ]
577 },
578 {
579 "cell_type": "markdown",
580 "metadata": {},
581 "source": [
582 "One thing worth bearing in mind is that popular XML formats often have dedicated Python libraries that make it easier to read and write files in that format. For example, the KML format that is used to transport geographical data (points, lines, boundaries) can be parsed using the `fastkml` library."
583 ]
584 },
585 {
586 "cell_type": "heading",
587 "level": 3,
588 "metadata": {},
589 "source": [
590 "Working with KML Files"
591 ]
592 },
593 {
594 "cell_type": "code",
595 "collapsed": false,
596 "input": [
597 "!pip3 install git+https://github.com/cleder/fastkml.git"
598 ],
599 "language": "python",
600 "metadata": {},
601 "outputs": [
602 {
603 "output_type": "stream",
604 "stream": "stdout",
605 "text": [
606 "Downloading/unpacking git+https://github.com/cleder/fastkml.git\r\n"
607 ]
608 },
609 {
610 "output_type": "stream",
611 "stream": "stdout",
612 "text": [
613 " Cloning https://github.com/cleder/fastkml.git to /tmp/pip-0u9gbjq8-build\r\n"
614 ]
615 },
616 {
617 "output_type": "stream",
618 "stream": "stdout",
619 "text": [
620 " Running setup.py (path:/tmp/pip-0u9gbjq8-build/setup.py) egg_info for package from git+https://github.com/cleder/fastkml.git\r\n"
621 ]
622 },
623 {
624 "output_type": "stream",
625 "stream": "stdout",
626 "text": [
627 " \r\n"
628 ]
629 },
630 {
631 "output_type": "stream",
632 "stream": "stdout",
633 "text": [
634 " warning: no previously-included files matching '*.pyo' found under directory '*.pyc'\r\n",
635 " warning: no previously-included files found matching 'fastkml/.*'\r\n"
636 ]
637 },
638 {
639 "output_type": "stream",
640 "stream": "stdout",
641 "text": [
642 "Downloading/unpacking pygeoif (from fastkml==0.7)\r\n"
643 ]
644 },
645 {
646 "output_type": "stream",
647 "stream": "stdout",
648 "text": [
649 " Downloading pygeoif-0.4.1.tar.gz\r\n",
650 " Running setup.py (path:/tmp/pip_build_root/pygeoif/setup.py) egg_info for package pygeoif\r\n"
651 ]
652 },
653 {
654 "output_type": "stream",
655 "stream": "stdout",
656 "text": [
657 " \r\n"
658 ]
659 },
660 {
661 "output_type": "stream",
662 "stream": "stdout",
663 "text": [
664 " warning: no previously-included files matching '*.pyo' found under directory '*.pyc'\r\n",
665 " warning: no previously-included files found matching 'pygeoif/.*'\r\n"
666 ]
667 },
668 {
669 "output_type": "stream",
670 "stream": "stdout",
671 "text": [
672 "Requirement already satisfied (use --upgrade to upgrade): python-dateutil in /usr/local/lib/python3.4/dist-packages (from fastkml==0.7)\r\n",
673 "Requirement already satisfied (use --upgrade to upgrade): six in /usr/local/lib/python3.4/dist-packages (from python-dateutil->fastkml==0.7)\r\n",
674 "Installing collected packages: pygeoif, fastkml\r\n",
675 " Running setup.py install for pygeoif\r\n"
676 ]
677 },
678 {
679 "output_type": "stream",
680 "stream": "stdout",
681 "text": [
682 " \r\n"
683 ]
684 },
685 {
686 "output_type": "stream",
687 "stream": "stdout",
688 "text": [
689 " warning: no previously-included files matching '*.pyo' found under directory '*.pyc'\r\n",
690 " warning: no previously-included files found matching 'pygeoif/.*'\r\n"
691 ]
692 },
693 {
694 "output_type": "stream",
695 "stream": "stdout",
696 "text": [
697 " Running setup.py install for fastkml\r\n"
698 ]
699 },
700 {
701 "output_type": "stream",
702 "stream": "stdout",
703 "text": [
704 " \r\n",
705 " warning: no previously-included files matching '*.pyo' found under directory '*.pyc'\r\n",
706 " warning: no previously-included files found matching 'fastkml/.*'\r\n"
707 ]
708 },
709 {
710 "output_type": "stream",
711 "stream": "stdout",
712 "text": [
713 "Successfully installed pygeoif fastkml\r\n",
714 "Cleaning up..."
715 ]
716 },
717 {
718 "output_type": "stream",
719 "stream": "stdout",
720 "text": [
721 "\r\n"
722 ]
723 }
724 ],
725 "prompt_number": 2
726 },
727 {
728 "cell_type": "code",
729 "collapsed": false,
730 "input": [
731 "#We can load in data from a KML file and then render it onto a map quite easily.\n",
732 "#For example, in the data directory is a file that contains a list of car park locations on the Isle of Wight\n",
733 "!ls data"
734 ],
735 "language": "python",
736 "metadata": {},
737 "outputs": [
738 {
739 "output_type": "stream",
740 "stream": "stdout",
741 "text": [
742 "CarParks.kml iwCouncilSpending tmp.json tmpfile.csv\r\n"
743 ]
744 }
745 ],
746 "prompt_number": 4
747 },
748 {
749 "cell_type": "code",
750 "collapsed": false,
751 "input": [
752 "from fastkml import kml\n",
753 "k = kml.KML()\n",
754 "\n",
755 "#We need to open the file as a bytestream and let the lxml parser used by the fastkml package identify the encoding itself\n",
756 "doc = open(\"data/CarParks.kml\",'rb').read()\n",
757 "k.from_string(doc)\n",
758 "\n",
759 "#The alternative is to open the file with a UTF-8 encoding to get a Unicode string, then throw away the first line\n",
760 "#that now incorrectly declares the encoding to be UTF-8\n",
761 "#!head -n 3 data/CarParks.kml\n",
762 "#doc = open(\"data/CarParks.kml\",encoding='utf-8')\n",
763 "#lines = '\\n'.join(doc.readlines()[1:])\n",
764 "#k.from_string(lines)\n",
765 "\n",
766 "\n",
767 "#Parse out the locations of the carpark placemarks from the file\n",
768 "#via http://ocefpaf.github.io/python4oceanographers/blog/2014/05/05/folium/\n",
769 "locations = dict()\n",
770 "for feature in k.features():\n",
771 " for placemark in feature.features():\n",
772 " locations.update({placemark.name: (placemark.geometry.y, placemark.geometry.x, )})"
773 ],
774 "language": "python",
775 "metadata": {},
776 "outputs": [],
777 "prompt_number": 1
778 },
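{
"cell_type": "markdown",
"metadata": {},
"source": [
"The `locations` dictionary built above maps each placemark name to a (latitude, longitude) pair. As a small sketch of how such a dictionary might be summarised, for example to find a point on which to centre a map, the following cell averages the coordinates. It uses a three-item stand-in dictionary (`sample_locations`, a name introduced here) holding the coordinates of three of the car parks, rather than the full set of placemarks."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"#A three-item stand-in for the locations dictionary: name -> (latitude, longitude)\n",
"sample_locations = {\n",
"    'The Duver, St Helens': (50.701221, -1.09955),\n",
"    'Vernon Meadow, Shanklin': (50.626411, -1.178317),\n",
"    'River Road, Yarmouth': (50.703762, -1.500149),\n",
"}\n",
"\n",
"#Unzip the (latitude, longitude) pairs into two tuples, then average each one\n",
"lats, lons = zip(*sample_locations.values())\n",
"centre = [sum(lats) / len(lats), sum(lons) / len(lons)]\n",
"print(centre)"
],
"language": "python",
"metadata": {},
"outputs": []
},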
779 {
780 "cell_type": "code",
781 "collapsed": false,
782 "input": [
783 "#Let's quickly plot the markers to show how the parser has pulled out the placemark information\n",
784 "import folium\n",
785 "folium.initialize_notebook()\n",
786 "\n",
787 "#If we know the latitude and longitude at the centre of the map we want to display, we can set it directly\n",
788 "carparks = folium.Map(location=[50.68, -1.2667], zoom_start=11)\n",
789 "\n",
790 "#Alternatively, we can calculate it as the mean latitude and longitude of the points we wish to plot\n",
791 "#latSum=lonSum =0\n",
792 "#for name, location in locations.items():\n",
793 "# latSum+=location[0]\n",
794 "# lonSum+=location[1]\n",
795 "#carparks = folium.Map(location=[latSum/len(locations.items()), lonSum/len(locations.items())], zoom_start=11)\n",
796 "\n",
797 "for name, location in locations.items():\n",
798 " carparks.circle_marker(location=location, popup=name,radius=20,line_color='blue',fill_color='blue',fill_opacity=0.2)\n",
799 "\n",
800 "carparks"
801 ],
802 "language": "python",
803 "metadata": {},
804 "outputs": [
805 {
806 "html": [
807 "<link rel=\"stylesheet\" href=\"http://cdn.leafletjs.com/leaflet-0.7.2/leaflet.css\" />\n",
808 "<style>\n",
809 " .leaflet-popup-content {\n",
810 " color: black !important;\n",
811 " }\n",
812 "\n",
813 " .leaflet-control-zoom-in {\n",
814 " text-decoration: none !important;\n",
815 " }\n",
816 "\n",
817 " .leaflet-control-zoom-out {\n",
818 " text-decoration: none !important;\n",
819 " }\n",
820 "</style>"
821 ],
822 "metadata": {},
823 "output_type": "display_data",
824 "text": [
825 "<IPython.core.display.HTML at 0x7f3e613fb9b0>"
826 ]
827 },
828 {
829 "html": [
830 "<script>\n",
831 "\n",
832 " var folium_event = new CustomEvent(\n",
833 " \"folium_libs_loaded\",\n",
834 " {bubbles: true, cancelable: true}\n",
835 " );\n",
836 "\n",
837 " var load_folium_charts = function(){\n",
838 " window.dispatchEvent(folium_event);\n",
839 " };\n",
840 "\n",
841 " var load_folium_libs = function(){\n",
842 " console.log('Loading all Folium libraries...')\n",
843 " $.getScript(\"http://cdn.leafletjs.com/leaflet-0.7.2/leaflet.js\", function(){\n",
844 " $.getScript('https://wrobstory.github.io/leaflet-dvf/leaflet-dvf.markers.min.js', function(){\n",
845 " if (window['vg'] === undefined){\n",
846 " $.getScript(\"http://trifacta.github.com/vega/vega.js\", function(){\n",
847 " load_folium_charts();\n",
848 " });\n",
849 " } else {\n",
850 " load_folium_charts();\n",
851 " }\n",
852 " });\n",
853 " });\n",
854 " };\n",
855 "\n",
856 " if(typeof define === \"function\" && define.amd){\n",
857 " var load_paths = {\n",
858 " paths: {\n",
859 " topojson:'http://d3js.org/topojson.v1.min',\n",
860 " queue: 'http://d3js.org/queue.v1.min',\n",
861 " d3: 'http://d3js.org/d3.v3.min'\n",
862 " }\n",
863 " };\n",
864 " var libs = ['d3', 'queue', 'topojson'];\n",
865 " for (var i=0; i < libs.length; i++){\n",
866 " lib = libs[i]\n",
867 " if (window[lib] !== undefined){\n",
868 " delete load_paths.paths[lib]\n",
869 " };\n",
870 " };\n",
871 " if (Object.keys(load_paths.paths).length != 0){\n",
872 " require.config(load_paths);\n",
873 " require([\"queue\"], function(queue){\n",
874 " window.queue = queue;\n",
875 " });\n",
876 " require([\"d3\"], function(d3){\n",
877 " console.log('Loading from require.js...')\n",
878 " window.d3 = d3;\n",
879 " require([\"topojson\"], function(topojson){\n",
880 " window.topojson = topojson;\n",
881 " load_folium_libs();\n",
882 " });\n",
883 " });\n",
884 " } else {\n",
885 " load_folium_libs();\n",
886 " }\n",
887 "\n",
888 " }else{\n",
889 " console.log('Require.js not found!');\n",
890 " throw \"Require.js not found!\"\n",
891 " };\n",
892 "\n",
893 "</script>"
894 ],
895 "metadata": {},
896 "output_type": "display_data",
897 "text": [
898 "<IPython.core.display.HTML at 0x7f3e613fb9b0>"
899 ]
900 },
901 {
902 "html": [
903 "\n",
904 "<div id=\"folium_2b26d090e7094c7ea05a1e6a0a59f5e3\" style=\"width: 960px; height: 500px\"></div>\n",
905 "\n",
906 "<script>\n",
907 "\n",
908 " var make_plot = function(){\n",
909 " if (typeof L === 'undefined'){\n",
910 " window.addEventListener('leaflet_libs_loaded', make_plot)\n",
911 " return;\n",
912 " }\n",
913 "\n",
914 " var render_plot = (function(){\n",
915 "\n",
916 " \n",
917 "\n",
918 " var map = L.map('folium_2b26d090e7094c7ea05a1e6a0a59f5e3').setView([50.68, -1.2667], 11);\n",
919 "\n",
920 " L.tileLayer('http://{s}.tile.openstreetmap.org/{z}/{x}/{y}.png', {\n",
921 " maxZoom: 18,\n",
922 " attribution: 'Map data (c) <a href=\"http://openstreetmap.org\">OpenStreetMap</a> contributors'\n",
923 " }).addTo(map);\n",
924 "\n",
925 " \n",
926 " var circle_1 = L.circle([50.701221, -1.09955], 20, {\n",
927 " color: 'blue',\n",
928 " fillColor: 'blue',\n",
929 " fillOpacity: 0.2\n",
930 " });\n",
931 " circle_1.bindPopup(\"The Duver, St Helens\");\n",
932 " map.addLayer(circle_1)\n",
933 " \n",
934 " var circle_2 = L.circle([50.626411, -1.178317], 20, {\n",
935 " color: 'blue',\n",
936 " fillColor: 'blue',\n",
937 " fillOpacity: 0.2\n",
938 " });\n",
939 " circle_2.bindPopup(\"Vernon Meadow, Shanklin\");\n",
940 " map.addLayer(circle_2)\n",
941 " \n",
942 " var circle_3 = L.circle([50.703762, -1.500149], 20, {\n",
943 " color: 'blue',\n",
944 " fillColor: 'blue',\n",
945 " fillOpacity: 0.2\n",
946 " });\n",
947 " circle_3.bindPopup(\"River Road, Yarmouth\");\n",
948 " map.addLayer(circle_3)\n",
949 " \n",
950 " var circle_4 = L.circle([50.661434, -1.136071], 20, {\n",
951 " color: 'blue',\n",
952 " fillColor: 'blue',\n",
953 " fillOpacity: 0.2\n",
954 " });\n",
955 " circle_4.bindPopup(\"Yaverland, Sandown\");\n",
956 " map.addLayer(circle_4)\n",
957 " \n",
958 " var circle_5 = L.circle([50.697956, -1.297119], 20, {\n",
959 " color: 'blue',\n",
960 " fillColor: 'blue',\n",
961 " fillOpacity: 0.2\n",
962 " });\n",
963 " circle_5.bindPopup(\"Chapel Street, Newport\");\n",
964 " map.addLayer(circle_5)\n",
965 " \n",
966 " var circle_6 = L.circle([50.649334, -1.168966], 20, {\n",
967 " color: 'blue',\n",
968 " fillColor: 'blue',\n",
969 " fillOpacity: 0.2\n",
970 " });\n",
971 " circle_6.bindPopup(\"New Road, Lake\");\n",
972 " map.addLayer(circle_6)\n",
973 " \n",
974 " var circle_7 = L.circle([50.761272, -1.287557], 20, {\n",
975 " color: 'blue',\n",
976 " fillColor: 'blue',\n",
977 " fillOpacity: 0.2\n",
978 " });\n",
979 " circle_7.bindPopup(\"Maresfield Road, East Cowes\");\n",
980 " map.addLayer(circle_7)\n",
981 " \n",
982 " var circle_8 = L.circle([50.681995, -1.523602], 20, {\n",
983 " color: 'blue',\n",
984 " fillColor: 'blue',\n",
985 " fillOpacity: 0.2\n",
986 " });\n",
987 " circle_8.bindPopup(\"Moa place, Freshwater\");\n",
988 " map.addLayer(circle_8)\n",
989 " \n",
990 " var circle_9 = L.circle([50.655479, -1.154149], 20, {\n",
991 " color: 'blue',\n",
992 " fillColor: 'blue',\n",
993 " fillOpacity: 0.2\n",
994 " });\n",
995 " circle_9.bindPopup(\"Station Avenue, Sandown\");\n",
996 " map.addLayer(circle_9)\n",
997 " \n",
998 " var circle_10 = L.circle([50.633636, -1.16969], 20, {\n",
999 " color: 'blue',\n",
1000 " fillColor: 'blue',\n",
1001 " fillOpacity: 0.2\n",
1002 " });\n",
1003 " circle_10.bindPopup(\"Hope Road, Shanklin\");\n",
1004 " map.addLayer(circle_10)\n",
1005 " \n",
1006 " var circle_11 = L.circle([50.729691, -1.165924], 20, {\n",
1007 " color: 'blue',\n",
1008 " fillColor: 'blue',\n",
1009 " fillOpacity: 0.2\n",
1010 " });\n",
1011 " circle_11.bindPopup(\"Lind Place, Ryde\");\n",
1012 " map.addLayer(circle_11)\n",
1013 " \n",
1014 " var circle_12 = L.circle([50.700375, -1.297578], 20, {\n",
1015 " color: 'blue',\n",
1016 " fillColor: 'blue',\n",
1017 " fillOpacity: 0.2\n",
1018 " });\n",
1019 " circle_12.bindPopup(\"Lugley Street, Newport\");\n",
1020 " map.addLayer(circle_12)\n",
1021 " \n",
1022 " var circle_13 = L.circle([50.728966, -1.165173], 20, {\n",
1023 " color: 'blue',\n",
1024 " fillColor: 'blue',\n",
1025 " fillOpacity: 0.2\n",
1026 " });\n",
1027 " circle_13.bindPopup(\"Garfield Road, Ryde\");\n",
1028 " map.addLayer(circle_13)\n",
1029 " \n",
1030 " var circle_14 = L.circle([50.688755, -1.537925], 20, {\n",
1031 " color: 'blue',\n",
1032 " fillColor: 'blue',\n",
1033 " fillOpacity: 0.2\n",
1034 " });\n",
1035 " circle_14.bindPopup(\"Colwell Bay, Colwell\");\n",
1036 " map.addLayer(circle_14)\n",
1037 " \n",
1038 " var circle_15 = L.circle([50.760742, -1.298468], 20, {\n",
1039 " color: 'blue',\n",
1040 " fillColor: 'blue',\n",
1041 " fillOpacity: 0.2\n",
1042 " });\n",
1043 " circle_15.bindPopup(\"Cross Street, Cowes\");\n",
1044 " map.addLayer(circle_15)\n",
1045 " \n",
1046 " var circle_16 = L.circle([50.670567, -1.510754], 20, {\n",
1047 " color: 'blue',\n",
1048 " fillColor: 'blue',\n",
1049 " fillOpacity: 0.2\n",
1050 " });\n",
1051 " circle_16.bindPopup(\"Freshwater Bay, Freshwater\");\n",
1052 " map.addLayer(circle_16)\n",
1053 " \n",
1054 " var circle_17 = L.circle([50.701633, -1.291242], 20, {\n",
1055 " color: 'blue',\n",
1056 " fillColor: 'blue',\n",
1057 " fillOpacity: 0.2\n",
1058 " });\n",
1059 " circle_17.bindPopup(\"County Hall Complex, Newport (Weekends Only)\");\n",
1060 " map.addLayer(circle_17)\n",
1061 " \n",
1062 " var circle_18 = L.circle([50.683849, -1.528172], 20, {\n",
1063 " color: 'blue',\n",
1064 " fillColor: 'blue',\n",
1065 " fillOpacity: 0.2\n",
1066 " });\n",
1067 " circle_18.bindPopup(\"Avenue Road, Freshwater\");\n",
1068 " map.addLayer(circle_18)\n",
1069 " \n",
1070 " var circle_19 = L.circle([50.702011, -1.293088], 20, {\n",
1071 " color: 'blue',\n",
1072 " fillColor: 'blue',\n",
1073 " fillOpacity: 0.2\n",
1074 " });\n",
1075 " circle_19.bindPopup(\"Little London, Newport\");\n",
1076 " map.addLayer(circle_19)\n",
1077 " \n",
1078 " var circle_20 = L.circle([50.595181, -1.202488], 20, {\n",
1079 " color: 'blue',\n",
1080 " fillColor: 'blue',\n",
1081 " fillOpacity: 0.2\n",
1082 " });\n",
1083 " circle_20.bindPopup(\"Market Street, Ventnor\");\n",
1084 " map.addLayer(circle_20)\n",
1085 " \n",
1086 " var circle_21 = L.circle([50.701595, -1.292873], 20, {\n",
1087 " color: 'blue',\n",
1088 " fillColor: 'blue',\n",
1089 " fillOpacity: 0.2\n",
1090 " });\n",
1091 " circle_21.bindPopup(\"Sea Street, Newport\");\n",
1092 " map.addLayer(circle_21)\n",
1093 " \n",
1094 " var circle_22 = L.circle([50.631695, -1.171428], 20, {\n",
1095 " color: 'blue',\n",
1096 " fillColor: 'blue',\n",
1097 " fillOpacity: 0.2\n",
1098 " });\n",
1099 " circle_22.bindPopup(\"Esplanade Gardens, Shanklin\");\n",
1100 " map.addLayer(circle_22)\n",
1101 " \n",
1102 " var circle_23 = L.circle([50.629387, -1.172769], 20, {\n",
1103 " color: 'blue',\n",
1104 " fillColor: 'blue',\n",
1105 " fillOpacity: 0.2\n",
1106 " });\n",
1107 " circle_23.bindPopup(\"Spa, Shanklin\");\n",
1108 " map.addLayer(circle_23)\n",
1109 " \n",
1110 " var circle_24 = L.circle([50.65786, -1.148173], 20, {\n",
1111 " color: 'blue',\n",
1112 " fillColor: 'blue',\n",
1113 " fillOpacity: 0.2\n",
1114 " });\n",
1115 " circle_24.bindPopup(\"Fort Street, Sandown\");\n",
1116 " map.addLayer(circle_24)\n",
1117 " \n",
1118 " var circle_25 = L.circle([50.722351, -1.118063], 20, {\n",
1119 " color: 'blue',\n",
1120 " fillColor: 'blue',\n",
1121 " fillOpacity: 0.2\n",
1122 " });\n",
1123 " circle_25.bindPopup(\"The Duver, Seaview\");\n",
1124 " map.addLayer(circle_25)\n",
1125 " \n",
1126 " var circle_26 = L.circle([50.597069, -1.187725], 20, {\n",
1127 " color: 'blue',\n",
1128 " fillColor: 'blue',\n",
1129 " fillOpacity: 0.2\n",
1130 " });\n",
1131 " circle_26.bindPopup(\"Shore Road, Bonchurch\");\n",
1132 " map.addLayer(circle_26)\n",
1133 " \n",
1134 " var circle_27 = L.circle([50.639824, -1.170124], 20, {\n",
1135 " color: 'blue',\n",
1136 " fillColor: 'blue',\n",
1137 " fillOpacity: 0.2\n",
1138 " });\n",
1139 " circle_27.bindPopup(\"Winchester House, Shanklin\");\n",
1140 " map.addLayer(circle_27)\n",
1141 " \n",
1142 " var circle_28 = L.circle([50.726505, -1.163929], 20, {\n",
1143 " color: 'blue',\n",
1144 " fillColor: 'blue',\n",
1145 " fillOpacity: 0.2\n",
1146 " });\n",
1147 " circle_28.bindPopup(\"Green Street, Ryde\");\n",
1148 " map.addLayer(circle_28)\n",
1149 " \n",
1150 " var circle_29 = L.circle([50.595085, -1.203883], 20, {\n",
1151 " color: 'blue',\n",
1152 " fillColor: 'blue',\n",
1153 " fillOpacity: 0.2\n",
1154 " });\n",
1155 " circle_29.bindPopup(\"Pound Lane, Ventnor\");\n",
1156 " map.addLayer(circle_29)\n",
1157 " \n",
1158 " var circle_30 = L.circle([50.633522, -1.175741], 20, {\n",
1159 " color: 'blue',\n",
1160 " fillColor: 'blue',\n",
1161 " fillOpacity: 0.2\n",
1162 " });\n",
1163 " circle_30.bindPopup(\"Atherley Road, Shanklin\");\n",
1164 " map.addLayer(circle_30)\n",
1165 " \n",
1166 " var circle_31 = L.circle([50.688473, -1.072294], 20, {\n",
1167 " color: 'blue',\n",
1168 " fillColor: 'blue',\n",
1169 " fillOpacity: 0.2\n",
1170 " });\n",
1171 " circle_31.bindPopup(\"Lane End, Bembridge\");\n",
1172 " map.addLayer(circle_31)\n",
1173 " \n",
1174 " var circle_32 = L.circle([50.698425, -1.29327], 20, {\n",
1175 " color: 'blue',\n",
1176 " fillColor: 'blue',\n",
1177 " fillOpacity: 0.2\n",
1178 " });\n",
1179 " circle_32.bindPopup(\"Church Litten, Newport \");\n",
1180 " map.addLayer(circle_32)\n",
1181 " \n",
1182 " var circle_33 = L.circle([50.732952, -1.156654], 20, {\n",
1183 " color: 'blue',\n",
1184 " fillColor: 'blue',\n",
1185 " fillOpacity: 0.2\n",
1186 " });\n",
1187 " circle_33.bindPopup(\"Quay Road, Ryde\");\n",
1188 " map.addLayer(circle_33)\n",
1189 " \n",
1190 " var circle_34 = L.circle([50.654915, -1.15526], 20, {\n",
1191 " color: 'blue',\n",
1192 " fillColor: 'blue',\n",
1193 " fillOpacity: 0.2\n",
1194 " });\n",
1195 " circle_34.bindPopup(\"St Johns Road, Sandown\");\n",
1196 " map.addLayer(circle_34)\n",
1197 " \n",
1198 " var circle_35 = L.circle([50.595749, -1.203443], 20, {\n",
1199 " color: 'blue',\n",
1200 " fillColor: 'blue',\n",
1201 " fillOpacity: 0.2\n",
1202 " });\n",
1203 " circle_35.bindPopup(\"Central (High Street), Ventnor\");\n",
1204 " map.addLayer(circle_35)\n",
1205 " \n",
1206 " var circle_36 = L.circle([50.70657, -1.288984], 20, {\n",
1207 " color: 'blue',\n",
1208 " fillColor: 'blue',\n",
1209 " fillOpacity: 0.2\n",
1210 " });\n",
1211 " circle_36.bindPopup(\"Seaclose Recreation Ground, Newport\");\n",
1212 " map.addLayer(circle_36)\n",
1213 " \n",
1214 " var circle_37 = L.circle([50.699642, -1.288447], 20, {\n",
1215 " color: 'blue',\n",
1216 " fillColor: 'blue',\n",
1217 " fillOpacity: 0.2\n",
1218 " });\n",
1219 " circle_37.bindPopup(\"Coppins Bridge, Newport\");\n",
1220 " map.addLayer(circle_37)\n",
1221 " \n",
1222 " var circle_38 = L.circle([50.595016, -1.20767], 20, {\n",
1223 " color: 'blue',\n",
1224 " fillColor: 'blue',\n",
1225 " fillOpacity: 0.2\n",
1226 " });\n",
1227 " circle_38.bindPopup(\"The Grove, Ventnor\");\n",
1228 " map.addLayer(circle_38)\n",
1229 " \n",
1230 " var circle_39 = L.circle([50.630047, -1.178836], 20, {\n",
1231 " color: 'blue',\n",
1232 " fillColor: 'blue',\n",
1233 " fillOpacity: 0.2\n",
1234 " });\n",
1235 " circle_39.bindPopup(\"Landguard Road, Shanklin\");\n",
1236 " map.addLayer(circle_39)\n",
1237 " \n",
1238 " var circle_40 = L.circle([50.696064, -1.290963], 20, {\n",
1239 " color: 'blue',\n",
1240 " fillColor: 'blue',\n",
1241 " fillOpacity: 0.2\n",
1242 " });\n",
1243 " circle_40.bindPopup(\"Medina Avenue, Newport\");\n",
1244 " map.addLayer(circle_40)\n",
1245 " \n",
1246 " var circle_41 = L.circle([50.758881, -1.295019], 20, {\n",
1247 " color: 'blue',\n",
1248 " fillColor: 'blue',\n",
1249 " fillOpacity: 0.2\n",
1250 " });\n",
1251 " circle_41.bindPopup(\"Brunswick Road, Cowes\");\n",
1252 " map.addLayer(circle_41)\n",
1253 " \n",
1254 " var circle_42 = L.circle([50.594139, -1.201404], 20, {\n",
1255 " color: 'blue',\n",
1256 " fillColor: 'blue',\n",
1257 " fillOpacity: 0.2\n",
1258 " });\n",
1259 " circle_42.bindPopup(\"Dudley Road, Ventnor\");\n",
1260 " map.addLayer(circle_42)\n",
1261 " \n",
1262 " var circle_43 = L.circle([50.72876, -1.164186], 20, {\n",
1263 " color: 'blue',\n",
1264 " fillColor: 'blue',\n",
1265 " fillOpacity: 0.2\n",
1266 " });\n",
1267 " circle_43.bindPopup(\"Victoria Street, Ryde\");\n",
1268 " map.addLayer(circle_43)\n",
1269 " \n",
1270 " var circle_44 = L.circle([50.593258, -1.204343], 20, {\n",
1271 " color: 'blue',\n",
1272 " fillColor: 'blue',\n",
1273 " fillOpacity: 0.2\n",
1274 " });\n",
1275 " circle_44.bindPopup(\"Eastern Esplanade, Ventnor\");\n",
1276 " map.addLayer(circle_44)\n",
1277 " \n",
1278 " var circle_45 = L.circle([50.698383, -1.297661], 20, {\n",
1279 " color: 'blue',\n",
1280 " fillColor: 'blue',\n",
1281 " fillOpacity: 0.2\n",
1282 " });\n",
1283 " circle_45.bindPopup(\"New Street, Newport\");\n",
1284 " map.addLayer(circle_45)\n",
1285 " \n",
1286 " var circle_46 = L.circle([50.628448, -1.180687], 20, {\n",
1287 " color: 'blue',\n",
1288 " fillColor: 'blue',\n",
1289 " fillOpacity: 0.2\n",
1290 " });\n",
1291 " circle_46.bindPopup(\"Orchardleigh Road, Shanklin\");\n",
1292 " map.addLayer(circle_46)\n",
1293 " \n",
1294 " var circle_47 = L.circle([50.732193, -1.162893], 20, {\n",
1295 " color: 'blue',\n",
1296 " fillColor: 'blue',\n",
1297 " fillOpacity: 0.2\n",
1298 " });\n",
1299 " circle_47.bindPopup(\"St Thomas Street Upper/Lower, Ryde\");\n",
1300 " map.addLayer(circle_47)\n",
1301 " \n",
1302 " var circle_48 = L.circle([50.703346, -1.289982], 20, {\n",
1303 " color: 'blue',\n",
1304 " fillColor: 'blue',\n",
1305 " fillOpacity: 0.2\n",
1306 " });\n",
1307 " circle_48.bindPopup(\"Newport Harbour, Newport\");\n",
1308 " map.addLayer(circle_48)\n",
1309 " \n",
1310 " var circle_49 = L.circle([50.592678, -1.211554], 20, {\n",
1311 " color: 'blue',\n",
1312 " fillColor: 'blue',\n",
1313 " fillOpacity: 0.2\n",
1314 " });\n",
1315 " circle_49.bindPopup(\"La Falaise, Ventnor\");\n",
1316 " map.addLayer(circle_49)\n",
1317 " \n",
1318 " var circle_50 = L.circle([50.766243, -1.307802], 20, {\n",
1319 " color: 'blue',\n",
1320 " fillColor: 'blue',\n",
1321 " fillOpacity: 0.2\n",
1322 " });\n",
1323 " circle_50.bindPopup(\"Mornington Road, Cowes\");\n",
1324 " map.addLayer(circle_50)\n",
1325 " \n",
1326 "\n",
1327 " \n",
1328 "\n",
1329 " \n",
1330 "\n",
1331 " \n",
1332 "\n",
1333 " })();\n",
1334 "\n",
1335 " };\n",
1336 "\n",
1337 " make_plot();\n",
1338 "\n",
1339 "</script>"
1340 ],
1341 "metadata": {},
1342 "output_type": "pyout",
1343 "prompt_number": 2,
1344 "text": [
1345 "<folium.folium.Map at 0x7f3e613fba58>"
1346 ]
1347 }
1348 ],
1349 "prompt_number": 2
1350 },
1351 {
1352 "cell_type": "code",
1353 "collapsed": false,
1354 "input": [
1355 "!pip3 uninstall -y folium\n",
1356 "!pip3 install git+https://github.com/tbicr/folium.git@fixed#folium\n",
1357 "#!pip3 install git+https://github.com/birdage/folium.git@clustered_markers#egg=folium --upgrade"
1358 ],
1359 "language": "python",
1360 "metadata": {},
1361 "outputs": [
1362 {
1363 "output_type": "stream",
1364 "stream": "stdout",
1365 "text": [
1366 "Uninstalling folium:\r\n",
1367 " Successfully uninstalled folium\r\n"
1368 ]
1369 },
1370 {
1371 "output_type": "stream",
1372 "stream": "stdout",
1373 "text": [
1374 "Downloading/unpacking git+https://github.com/tbicr/folium.git@fixed\r\n",
1375 " Cloning https://github.com/tbicr/folium.git (to fixed) to /tmp/pip-jjtckviy-build\r\n"
1376 ]
1377 },
1378 {
1379 "output_type": "stream",
1380 "stream": "stdout",
1381 "text": [
1382 " Running setup.py (path:/tmp/pip-jjtckviy-build/setup.py) egg_info for package from git+https://github.com/tbicr/folium.git@fixed\r\n"
1383 ]
1384 },
1385 {
1386 "output_type": "stream",
1387 "stream": "stdout",
1388 "text": [
1389 " \r\n"
1390 ]
1391 },
1392 {
1393 "output_type": "stream",
1394 "stream": "stdout",
1395 "text": [
1396 " warning: no files found matching '*.css' under directory 'folium'\r\n"
1397 ]
1398 },
1399 {
1400 "output_type": "stream",
1401 "stream": "stdout",
1402 "text": [
1403 "Installing collected packages: folium\r\n",
1404 " Running setup.py install for folium\r\n"
1405 ]
1406 },
1407 {
1408 "output_type": "stream",
1409 "stream": "stdout",
1410 "text": [
1411 " \r\n",
1412 " warning: no files found matching '*.css' under directory 'folium'\r\n"
1413 ]
1414 },
1415 {
1416 "output_type": "stream",
1417 "stream": "stdout",
1418 "text": [
1419 "Successfully installed folium\r\n",
1420 "Cleaning up...\r\n"
1421 ]
1422 }
1423 ],
1424 "prompt_number": 3
1425 },
1426 {
1427 "cell_type": "code",
1428 "collapsed": false,
1429 "input": [
1430 "from IPython.display import HTML\n",
1431 "from folium import Map\n",
1432 "\n",
1433 "def inline_map(map):\n",
1434 "    \"\"\"\n",
1435 "    Embeds the HTML source of the map directly into the IPython notebook.\n",
1436 "    \n",
1437 "    This method will not work if the map depends on any files (json data). Also this uses\n",
1438 "    the HTML5 srcdoc attribute, which may not be supported in all browsers.\n",
1439 "    \"\"\"\n",
1440 "    map._build_map()\n",
1441 "    return HTML('<iframe srcdoc=\"{srcdoc}\" style=\"width: 100%; height: 510px; border: none\"></iframe>'.format(srcdoc=map.HTML.replace('\"', '&quot;')))\n",
1442 "\n",
1443 "def embed_map(map, path=\"m213map.html\"):\n",
1444 "    \"\"\"\n",
1445 "    Embeds a linked iframe to the map into the IPython notebook.\n",
1446 "    \n",
1447 "    Note: this method will not capture the source of the map into the notebook.\n",
1448 "    This method should work for all maps (as long as they use relative urls).\n",
1449 "    \"\"\"\n",
1450 "    map.create_map(path=path)\n",
1451 "    return HTML('<iframe src=\"files/{path}\" style=\"width: 100%; height: 510px; border: none\"></iframe>'.format(path=path))\n",
1452 "\n",
1453 "map = folium.Map(location=[50.68, -1.2667], zoom_start=11)\n",
1454 "\n",
1455 "#Alternatively, we can calculate it as the mean latitude and longitude of the points we wish to plot\n",
1456 "#latSum=lonSum =0\n",
1457 "#for name, location in locations.items():\n",
1458 "# latSum+=location[0]\n",
1459 "# lonSum+=location[1]\n",
1460 "#carparks = folium.Map(location=[latSum/len(locations.items()), lonSum/len(locations.items())], zoom_start=11)\n",
1461 "\n",
1462 "for name, location in locations.items():\n",
1463 "    map.simple_marker(location=location, popup=name, clustered_marker=True)\n",
1464 "\n",
1465 "inline_map(map)"
1466 ],
1467 "language": "python",
1468 "metadata": {},
1469 "outputs": [
1470 {
1471 "ename": "TypeError",
1472 "evalue": "simple_marker() got an unexpected keyword argument 'clustered_marker'",
1473 "output_type": "pyerr",
1474 "traceback": [
1475 "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m\n\u001b[1;31mTypeError\u001b[0m Traceback (most recent call last)",
1476 "\u001b[1;32m<ipython-input-3-99c4b2f3c344>\u001b[0m in \u001b[0;36m<module>\u001b[1;34m()\u001b[0m\n\u001b[0;32m 32\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 33\u001b[0m \u001b[1;32mfor\u001b[0m \u001b[0mname\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mlocation\u001b[0m \u001b[1;32min\u001b[0m \u001b[0mlocations\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mitems\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m---> 34\u001b[1;33m \u001b[0mmap\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0msimple_marker\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mlocation\u001b[0m\u001b[1;33m=\u001b[0m\u001b[0mlocation\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mpopup\u001b[0m\u001b[1;33m=\u001b[0m\u001b[0mname\u001b[0m\u001b[1;33m,\u001b[0m\u001b[0mclustered_marker\u001b[0m\u001b[1;33m=\u001b[0m\u001b[1;32mTrue\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 35\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 36\u001b[0m \u001b[0minline_map\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mmap\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
1477 "\u001b[1;32m/usr/local/lib/python3.4/dist-packages/folium/folium.py\u001b[0m in \u001b[0;36mwrapper\u001b[1;34m(self, *args, **kwargs)\u001b[0m\n\u001b[0;32m 44\u001b[0m \u001b[1;32mdef\u001b[0m \u001b[0mwrapper\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mself\u001b[0m\u001b[1;33m,\u001b[0m \u001b[1;33m*\u001b[0m\u001b[0margs\u001b[0m\u001b[1;33m,\u001b[0m \u001b[1;33m**\u001b[0m\u001b[0mkwargs\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 45\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mmark_cnt\u001b[0m\u001b[1;33m[\u001b[0m\u001b[0mtype\u001b[0m\u001b[1;33m]\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mmark_cnt\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mget\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mtype\u001b[0m\u001b[1;33m,\u001b[0m \u001b[1;36m0\u001b[0m\u001b[1;33m)\u001b[0m \u001b[1;33m+\u001b[0m \u001b[1;36m1\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m---> 46\u001b[1;33m \u001b[0mfunc_result\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mfunc\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mself\u001b[0m\u001b[1;33m,\u001b[0m \u001b[1;33m*\u001b[0m\u001b[0margs\u001b[0m\u001b[1;33m,\u001b[0m \u001b[1;33m**\u001b[0m\u001b[0mkwargs\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 47\u001b[0m \u001b[1;32mreturn\u001b[0m \u001b[0mfunc_result\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 48\u001b[0m \u001b[1;32mreturn\u001b[0m \u001b[0mwrapper\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
1478 "\u001b[1;31mTypeError\u001b[0m: simple_marker() got an unexpected keyword argument 'clustered_marker'"
1479 ]
1480 }
1481 ],
1482 "prompt_number": 3
1483 },
1484 {
1485 "cell_type": "code",
1486 "collapsed": false,
1487 "input": [
1488 "from folium.folium import Map"
1489 ],
1490 "language": "python",
1491 "metadata": {},
1492 "outputs": [],
1493 "prompt_number": 5
1494 },
1495 {
1496 "cell_type": "code",
1497 "collapsed": false,
1498 "input": [
1499 "map = folium.Map(width=500,height=500,location=[44, -73], zoom_start=4)\n",
1500 "\n",
1501 "map.simple_marker([40.67, -73.94], popup='Add <b>popup</b> text here.',marker_color='green',marker_icon='ok-sign',clustered_marker=True)\n",
1502 "map.simple_marker([44.67, -73.94], popup='Add <b>popup</b> text here.',marker_color='red',marker_icon='remove-sign',clustered_marker=True)\n",
1503 "map.simple_marker([44.67, -71.94], popup='Add <b>popup</b> text here.',clustered_marker=True)\n",
1504 "\n",
1505 "map.circle_marker([44, -71], popup='', fill_color='#ff0000', radius=5000, line_color='#ff0000')\n",
1506 "\n",
1507 "points1 = [40,-71]\n",
1508 "points2 = [42,-73]\n",
1509 "map.line([points1, points2])\n",
1510 "\n",
1511 "inline_map(map)"
1512 ],
1513 "language": "python",
1514 "metadata": {},
1515 "outputs": [
1516 {
1517 "ename": "NameError",
1518 "evalue": "name 'folium' is not defined",
1519 "output_type": "pyerr",
1520 "traceback": [
1521 "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m\n\u001b[1;31mNameError\u001b[0m Traceback (most recent call last)",
1522 "\u001b[1;32m<ipython-input-5-13b40973bddd>\u001b[0m in \u001b[0;36m<module>\u001b[1;34m()\u001b[0m\n\u001b[1;32m----> 1\u001b[1;33m \u001b[0mmap\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mfolium\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mMap\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mwidth\u001b[0m\u001b[1;33m=\u001b[0m\u001b[1;36m500\u001b[0m\u001b[1;33m,\u001b[0m\u001b[0mheight\u001b[0m\u001b[1;33m=\u001b[0m\u001b[1;36m500\u001b[0m\u001b[1;33m,\u001b[0m\u001b[0mlocation\u001b[0m\u001b[1;33m=\u001b[0m\u001b[1;33m[\u001b[0m\u001b[1;36m44\u001b[0m\u001b[1;33m,\u001b[0m \u001b[1;33m-\u001b[0m\u001b[1;36m73\u001b[0m\u001b[1;33m]\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mzoom_start\u001b[0m\u001b[1;33m=\u001b[0m\u001b[1;36m4\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 2\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 3\u001b[0m \u001b[0mmap\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0msimple_marker\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m[\u001b[0m\u001b[1;36m40.67\u001b[0m\u001b[1;33m,\u001b[0m \u001b[1;33m-\u001b[0m\u001b[1;36m73.94\u001b[0m\u001b[1;33m]\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mpopup\u001b[0m\u001b[1;33m=\u001b[0m\u001b[1;34m'Add <b>popup</b> text here.'\u001b[0m\u001b[1;33m,\u001b[0m\u001b[0mmarker_color\u001b[0m\u001b[1;33m=\u001b[0m\u001b[1;34m'green'\u001b[0m\u001b[1;33m,\u001b[0m\u001b[0mmarker_icon\u001b[0m\u001b[1;33m=\u001b[0m\u001b[1;34m'ok-sign'\u001b[0m\u001b[1;33m,\u001b[0m\u001b[0mclustered_marker\u001b[0m\u001b[1;33m=\u001b[0m\u001b[1;32mTrue\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 4\u001b[0m \u001b[0mmap\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0msimple_marker\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m[\u001b[0m\u001b[1;36m44.67\u001b[0m\u001b[1;33m,\u001b[0m \u001b[1;33m-\u001b[0m\u001b[1;36m73.94\u001b[0m\u001b[1;33m]\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mpopup\u001b[0m\u001b[1;33m=\u001b[0m\u001b[1;34m'Add <b>popup</b> text 
here.'\u001b[0m\u001b[1;33m,\u001b[0m\u001b[0mmarker_color\u001b[0m\u001b[1;33m=\u001b[0m\u001b[1;34m'red'\u001b[0m\u001b[1;33m,\u001b[0m\u001b[0mmarker_icon\u001b[0m\u001b[1;33m=\u001b[0m\u001b[1;34m'remove-sign'\u001b[0m\u001b[1;33m,\u001b[0m\u001b[0mclustered_marker\u001b[0m\u001b[1;33m=\u001b[0m\u001b[1;32mTrue\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 5\u001b[0m \u001b[0mmap\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0msimple_marker\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m[\u001b[0m\u001b[1;36m44.67\u001b[0m\u001b[1;33m,\u001b[0m \u001b[1;33m-\u001b[0m\u001b[1;36m71.94\u001b[0m\u001b[1;33m]\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mpopup\u001b[0m\u001b[1;33m=\u001b[0m\u001b[1;34m'Add <b>popup</b> text here.'\u001b[0m\u001b[1;33m,\u001b[0m\u001b[0mclustered_marker\u001b[0m\u001b[1;33m=\u001b[0m\u001b[1;32mTrue\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
1523 "\u001b[1;31mNameError\u001b[0m: name 'folium' is not defined"
1524 ]
1525 }
1526 ],
1527 "prompt_number": 5
1528 },
1529 {
1530 "cell_type": "code",
1531 "collapsed": false,
1532 "input": [
1533 "#To make things easier, we might combine the above code elements to create a simple function to plot placemarks on a map from a KML file...\n",
1534 "#We could then start to refine the function to support more callable parameters\n",
1535 "from fastkml import kml\n",
1536 "import folium\n",
1537 "folium.initialize_notebook()\n",
1538 "def previewPlaceMarksFromKML(filename):\n",
1539 "    ''' Quick function to plot placemarks from KML file '''\n",
1540 "    k = kml.KML()\n",
1541 "    with open(filename, 'rb') as f:\n",
1542 "        k.from_string(f.read())\n",
1543 "\n",
1544 "    locations = dict()\n",
1545 "    for feature in k.features():\n",
1546 "        for placemark in feature.features():\n",
1547 "            locations.update({placemark.name: (placemark.geometry.y, placemark.geometry.x, )})\n",
1548 "    \n",
1549 "    latSum=lonSum =0\n",
1550 "    for name, location in locations.items():\n",
1551 "        latSum+=location[0]\n",
1552 "        lonSum+=location[1]\n",
1553 "    map = folium.Map(location=[latSum/len(locations.items()), lonSum/len(locations.items())], zoom_start=11)\n",
1554 "\n",
1555 "    for name, location in locations.items():\n",
1556 "        map.circle_marker(location=location, popup=name,radius=20,line_color='blue',fill_color='blue',fill_opacity=0.2)\n",
1557 "\n",
1558 "    return map\n",
1559 "\n",
1560 "previewPlaceMarksFromKML(\"data/CarParks.kml\")"
1561 ],
1562 "language": "python",
1563 "metadata": {},
1564 "outputs": [
1565 {
1566 "html": [
1567 "<link rel=\"stylesheet\" href=\"http://cdn.leafletjs.com/leaflet-0.7.2/leaflet.css\" />\n",
1568 "<style>\n",
1569 " .leaflet-popup-content {\n",
1570 " color: black !important;\n",
1571 " }\n",
1572 "\n",
1573 " .leaflet-control-zoom-in {\n",
1574 " text-decoration: none !important;\n",
1575 " }\n",
1576 "\n",
1577 " .leaflet-control-zoom-out {\n",
1578 " text-decoration: none !important;\n",
1579 " }\n",
1580 "</style>"
1581 ],
1582 "metadata": {},
1583 "output_type": "display_data",
1584 "text": [
1585 "<IPython.core.display.HTML at 0x7f2dcaf53c18>"
1586 ]
1587 },
1588 {
1589 "html": [
1590 "<script>\n",
1591 "\n",
1592 " var folium_event = new CustomEvent(\n",
1593 " \"folium_libs_loaded\",\n",
1594 " {bubbles: true, cancelable: true}\n",
1595 " );\n",
1596 "\n",
1597 " var load_folium_charts = function(){\n",
1598 " window.dispatchEvent(folium_event);\n",
1599 " };\n",
1600 "\n",
1601 " var load_folium_libs = function(){\n",
1602 " console.log('Loading all Folium libraries...')\n",
1603 " $.getScript(\"http://cdn.leafletjs.com/leaflet-0.7.2/leaflet.js\", function(){\n",
1604 " $.getScript('https://wrobstory.github.io/leaflet-dvf/leaflet-dvf.markers.min.js', function(){\n",
1605 " if (window['vg'] === undefined){\n",
1606 " $.getScript(\"http://trifacta.github.com/vega/vega.js\", function(){\n",
1607 " load_folium_charts();\n",
1608 " });\n",
1609 " } else {\n",
1610 " load_folium_charts();\n",
1611 " }\n",
1612 " });\n",
1613 " });\n",
1614 " };\n",
1615 "\n",
1616 " if(typeof define === \"function\" && define.amd){\n",
1617 " var load_paths = {\n",
1618 " paths: {\n",
1619 " topojson:'http://d3js.org/topojson.v1.min',\n",
1620 " queue: 'http://d3js.org/queue.v1.min',\n",
1621 " d3: 'http://d3js.org/d3.v3.min'\n",
1622 " }\n",
1623 " };\n",
1624 " var libs = ['d3', 'queue', 'topojson'];\n",
1625 " for (var i=0; i < libs.length; i++){\n",
1626 " lib = libs[i]\n",
1627 " if (window[lib] !== undefined){\n",
1628 " delete load_paths.paths[lib]\n",
1629 " };\n",
1630 " };\n",
1631 " if (Object.keys(load_paths.paths).length != 0){\n",
1632 " require.config(load_paths);\n",
1633 " require([\"queue\"], function(queue){\n",
1634 " window.queue = queue;\n",
1635 " });\n",
1636 " require([\"d3\"], function(d3){\n",
1637 " console.log('Loading from require.js...')\n",
1638 " window.d3 = d3;\n",
1639 " require([\"topojson\"], function(topojson){\n",
1640 " window.topojson = topojson;\n",
1641 " load_folium_libs();\n",
1642 " });\n",
1643 " });\n",
1644 " } else {\n",
1645 " load_folium_libs();\n",
1646 " }\n",
1647 "\n",
1648 " }else{\n",
1649 " console.log('Require.js not found!');\n",
1650 " throw \"Require.js not found!\"\n",
1651 " };\n",
1652 "\n",
1653 "</script>"
1654 ],
1655 "metadata": {},
1656 "output_type": "display_data",
1657 "text": [
1658 "<IPython.core.display.HTML at 0x7f2dcaf6b048>"
1659 ]
1660 }
1661 ],
1662 "prompt_number": 97
1663 },
1664 {
1665 "cell_type": "code",
1666 "collapsed": false,
1667 "input": [
1668 "import folium"
1669 ],
1670 "language": "python",
1671 "metadata": {},
1672 "outputs": [],
1673 "prompt_number": 6
1674 },
1675 {
1676 "cell_type": "heading",
1677 "level": 2,
1678 "metadata": {},
1679 "source": [
1680 "YAML"
1681 ]
1682 },
1683 {
1684 "cell_type": "markdown",
1685 "metadata": {},
1686 "source": [
1687 "*pandas* does not support YAML imports directly, but libraries such as PyYAML can load a YAML file and convert it to a Python dict, which can then be transformed into a *pandas* DataFrame.\n",
1688 "\n",
1689 "As with XML, we will tend *not* to focus on the use of YAML, preferring instead JSON and CSV representations."
1690 ]
1691 },
1692 {
1693 "cell_type": "code",
1694 "collapsed": false,
1695 "input": [
1696 "import yaml\n",
1697 "document = \"\"\"\n",
1698 "image:\n",
1699 "    width: 800\n",
1700 "    height: 600\n",
1701 "    title: View from 15th Floor\n",
1702 "    thumbnail:\n",
1703 "        url: http://www.example.com/image/481989943\n",
1704 "        height: 125\n",
1705 "        width: 100\n",
1706 "    animated : false\n",
1707 "    IDs:\n",
1708 "        - 116\n",
1709 "        - 943\n",
1710 "        - 234\n",
1711 "        - 38793\n",
1712 "\"\"\"\n",
1713 "parsedYAML=yaml.safe_load(document)\n",
1714 "parsedYAML"
1715 ],
1716 "language": "python",
1717 "metadata": {},
1718 "outputs": []
1719 },
1720 {
1721 "cell_type": "code",
1722 "collapsed": false,
1723 "input": [
1724 "#We can also cast a dict to YAML\n",
1725 "print(yaml.dump(parsedYAML))"
1726 ],
1727 "language": "python",
1728 "metadata": {},
1729 "outputs": []
1730 },
1731 {
1732 "cell_type": "heading",
1733 "level": 2,
1734 "metadata": {},
1735 "source": [
1736 "Binary Data Formats - HDF5"
1737 ]
1738 },
1739 {
1740 "cell_type": "markdown",
1741 "metadata": {},
1742 "source": [
1743 "<span style=\"color:purple\">Or should this be in the file I/O section? In fact - do we need it at all?</span>"
1744 ]
1745 },
1746 {
1747 "cell_type": "markdown",
1748 "metadata": {},
1749 "source": [
1750 "TO DO - ref http://www.pytables.org/moin; maybe link to http://www.pytables.org/docs/manual-2.2.1/ch03.html ?"
1751 ]
1752 },
1753 {
1754 "cell_type": "code",
1755 "collapsed": false,
1756 "input": [
1757 "import tables"
1758 ],
1759 "language": "python",
1760 "metadata": {},
1761 "outputs": []
1762 },
1763 {
1764 "cell_type": "heading",
1765 "level": 2,
1766 "metadata": {},
1767 "source": [
1768 "What Next?"
1769 ]
1770 },
1771 {
1772 "cell_type": "markdown",
1773 "metadata": {},
1774 "source": [
1775 "Having learned how to get data into *pandas*, it's now time to see why that might be useful.\n",
1776 "\n",
1777 "If you are working through this notebook as part of an inline exercise, return to the course materials now.\n",
1778 "\n",
1779 "If you are working through this set of notebooks as a whole, now it's time for [Mini-Data Investigation, The First](02.3%20Mini%20Data%20Investigation%2C%20The%20First.ipynb)."
1780 ]
1781 }
1782 ],
1783 "metadata": {}
1784 }
1785 ]
1786 }