[ou-summer-of-code-2017.git] / 01-ticket-prices / ticket-pricing-solution.ipynb
1 {
2 "cells": [
3 {
4 "cell_type": "markdown",
5 "metadata": {},
6 "source": [
7 "# Ticket pricing\n",
8 "\n",
9 "You've been shopping around for a holiday package deal and it's time to choose which deal to go with. The file [01-holidays.txt](01-holidays.txt) contains a summary of your investigations. \n",
10 "\n",
11 "It's a simple text file, with one possible holiday package per line.\n",
12 "\n",
13 "Each line has four fields, separated by spaces. They are:\n",
14 "* The deal ID, from the price comparison website you found it on.\n",
15 "* The holiday price, in whole pounds.\n",
16 "* The location of the holiday, always a single word.\n",
17 "* The number of nights you'd be staying. \n",
18 "\n",
19 "For example, the data file might look like this:\n",
20 "\n",
21 "```\n",
22 "db61bb90 769 Morgantown 3\n",
23 "202c898b5f 1284 Morgantown 21\n",
24 "def36ffcd 1514 Giessenmestia 21\n",
25 "389018bd0707 1052 Estacada 21\n",
26 "a487c4270 782 Geoje-Si 14\n",
27 "6caf2584a55 724 Stonington-Island 14\n",
28 "199608abc5 1209 Nordkapp 21\n",
29 "```"
30 ]
31 },
32 {
33 "cell_type": "markdown",
34 "metadata": {},
35 "source": [
36 "## Part 1\n",
37 "You have a budget of £1200. How many of the holidays can you afford?\n",
38 "\n",
39 "Given the example data above, you could afford four of the holidays: the trips to Estacada, Geoje-Si and Stonington-Island, and the three-night trip to Morgantown. \n",
40 "\n",
41 "The 21-night trip to Morgantown and the trips to Giessenmestia and Nordkapp are all too expensive."
42 ]
43 },
44 {
45 "cell_type": "markdown",
46 "metadata": {},
47 "source": [
48 "### Solution"
49 ]
50 },
51 {
52 "cell_type": "code",
53 "execution_count": 12,
54 "metadata": {},
55 "outputs": [
56 {
57 "data": {
58 "text/plain": [
59 "[['dda7d369', '1546', 'Uzupis', '21'],\n",
60 " ['68022753', '1239', 'Mamula', '21'],\n",
61 " ['b261dbd1cef', '996', 'Holmegaard', '21']]"
62 ]
63 },
64 "execution_count": 12,
65 "metadata": {},
66 "output_type": "execute_result"
67 }
68 ],
69 "source": [
70 "holidays = []\n",
71 "with open('01-holidays.txt') as f:\n",
72 "    for hol_line in f.readlines():\n",
73 "        holidays.append(hol_line.split())\n",
74 " \n",
75 "holidays[:3]"
76 ]
77 },
78 {
79 "cell_type": "code",
80 "execution_count": 13,
81 "metadata": {},
82 "outputs": [
83 {
84 "data": {
85 "text/plain": [
86 "59"
87 ]
88 },
89 "execution_count": 13,
90 "metadata": {},
91 "output_type": "execute_result"
92 }
93 ],
94 "source": [
95 "affordable_holidays = []\n",
96 "for h in holidays:\n",
97 "    if int(h[1]) <= 1200:\n",
98 "        affordable_holidays.append(h)\n",
99 "\n",
100 "len(affordable_holidays)"
101 ]
102 },
103 {
104 "cell_type": "code",
105 "execution_count": 14,
106 "metadata": {},
107 "outputs": [
108 {
109 "data": {
110 "text/plain": [
111 "124"
112 ]
113 },
114 "execution_count": 14,
115 "metadata": {},
116 "output_type": "execute_result"
117 }
118 ],
119 "source": [
120 "len(holidays)"
121 ]
122 },
123 {
124 "cell_type": "markdown",
125 "metadata": {},
126 "source": [
127 "### Smart-alec one-line solution"
128 ]
129 },
130 {
131 "cell_type": "code",
132 "execution_count": 15,
133 "metadata": {},
134 "outputs": [
135 {
136 "data": {
137 "text/plain": [
138 "59"
139 ]
140 },
141 "execution_count": 15,
142 "metadata": {},
143 "output_type": "execute_result"
144 }
145 ],
146 "source": [
147 "sum(1 for h in open('01-holidays.txt').readlines() if int(h.split()[1]) <= 1200)"
148 ]
149 },
150 {
151 "cell_type": "markdown",
152 "metadata": {},
153 "source": [
154 "## Part 2\n",
155 "You don't just want _a_ holiday. You want the _best_ holiday. What is the deal ID of the affordable holiday that gives you the best value?\n",
156 "\n",
157 "The \"value\" of a holiday is its duration in nights per pound. Because some destinations are better than others, you'll want to scale the value for some locations. For instance, a night in Timbuktu is worth three times as much as a night in Bletchley.\n",
158 "\n",
159 "Assume all holidays have a relative value of 1, apart from these destinations.\n",
160 "\n",
161 "| Destination | Score |\n",
162 "|-------------|-------|\n",
163 "| Almaty | 2.0 |\n",
164 "| Brorfelde | 0.9 |\n",
165 "| Estacada | 0.4 |\n",
166 "| Jayuya | 0.6 |\n",
167 "| Karlukovo | 2.2 |\n",
168 "| Morgantown | 2.9 |\n",
169 "| Nordkapp | 1.5 |\n",
170 "| Nullarbor | 2.2 |\n",
171 "| Puente-Laguna-Garzonkuala-Penyu | 0.4 |\n",
172 "| Uzupis | 0.9 |\n",
173 "\n",
174 "### Example\n",
175 "\n",
176 "Given the holiday list above, the holiday to Geoje-Si (with the standard weighting of 1) has a value of $\\frac{14}{782} = 0.0179$ nights per pound. \n",
177 "\n",
178 "The trip to Estacada looks promising, at $\\frac{21}{1052} = 0.0200$ nights per pound. Unfortunately, the weighting for Estacada is low, so the adjusted value is $0.4 \\times \\frac{21}{1052} = 0.00798$ nights per pound.\n",
179 "\n",
180 "The best value holiday is the 21-night trip to Morgantown, with a value of $2.9 \\times \\frac{21}{1284} = 0.0474$ nights per pound. Unfortunately, it's unaffordable. \n",
181 "\n",
182 "The best value affordable holiday is the trip to Stonington-Island, with $\\frac{14}{724} = 0.0193$ nights per pound."
183 ]
184 },
185 {
186 "cell_type": "code",
187 "execution_count": 16,
188 "metadata": {
189 "collapsed": true
190 },
191 "outputs": [],
192 "source": [
193 "destination_values = {'Almaty': 2.0, 'Brorfelde': 0.9, 'Estacada': 0.4, 'Jayuya': 0.6, 'Karlukovo': 2.2,\n",
194 "                      'Morgantown': 2.9, 'Nordkapp': 1.5, 'Nullarbor': 2.2,\n",
195 "                      'Puente-Laguna-Garzonkuala-Penyu': 0.4, 'Uzupis': 0.9}"
196 ]
197 },
198 {
199 "cell_type": "code",
200 "execution_count": 17,
201 "metadata": {
202 "collapsed": true
203 },
204 "outputs": [],
205 "source": [
206 "def value_of_destination(name):\n",
207 "    if name in destination_values:\n",
208 "        return destination_values[name]\n",
209 "    else:\n",
210 "        return 1"
211 ]
212 },
213 {
214 "cell_type": "code",
215 "execution_count": 18,
216 "metadata": {
217 "collapsed": true
218 },
219 "outputs": [],
220 "source": [
221 "def value_of_holiday(holiday):\n",
222 "    hid, cost, destination, duration = tuple(holiday)\n",
223 "    value = value_of_destination(destination) * float(duration) / int(cost)\n",
224 "    return value"
225 ]
226 },
227 {
228 "cell_type": "code",
229 "execution_count": 19,
230 "metadata": {},
231 "outputs": [
232 {
233 "data": {
234 "text/plain": [
235 "'ee064e1e2ea'"
236 ]
237 },
238 "execution_count": 19,
239 "metadata": {},
240 "output_type": "execute_result"
241 }
242 ],
243 "source": [
244 "best_holiday = ''\n",
245 "best_value = 0\n",
246 "\n",
247 "for h in affordable_holidays:\n",
248 "    if value_of_holiday(h) > best_value:\n",
249 "        best_value = value_of_holiday(h)\n",
250 "        best_holiday = h[0]\n",
251 " \n",
252 "best_holiday"
253 ]
254 },
255 {
256 "cell_type": "markdown",
257 "metadata": {},
258 "source": [
259 "### Smart-alec solution"
260 ]
261 },
262 {
263 "cell_type": "code",
264 "execution_count": 20,
265 "metadata": {},
266 "outputs": [
267 {
268 "data": {
269 "text/plain": [
270 "'ee064e1e2ea'"
271 ]
272 },
273 "execution_count": 20,
274 "metadata": {},
275 "output_type": "execute_result"
276 }
277 ],
278 "source": [
279 "# Right answer\n",
280 "max(affordable_holidays, key=value_of_holiday)[0]"
281 ]
282 },
283 {
284 "cell_type": "code",
285 "execution_count": 21,
286 "metadata": {},
287 "outputs": [
288 {
289 "data": {
290 "text/plain": [
291 "'c86e2e5826'"
292 ]
293 },
294 "execution_count": 21,
295 "metadata": {},
296 "output_type": "execute_result"
297 }
298 ],
299 "source": [
300 "# Answer if you don't filter by affordability\n",
301 "max(holidays, key=value_of_holiday)[0]"
302 ]
303 },
304 {
305 "cell_type": "code",
306 "execution_count": 23,
307 "metadata": {},
308 "outputs": [
309 {
310 "data": {
311 "text/plain": [
312 "'f60e203aaaf9'"
313 ]
314 },
315 "execution_count": 23,
316 "metadata": {},
317 "output_type": "execute_result"
318 }
319 ],
320 "source": [
321 "# Answer if you don't scale by perceived value\n",
322 "max(affordable_holidays, key=lambda h: float(h[3]) / float(h[1]))[0]"
323 ]
324 },
325 {
326 "cell_type": "code",
327 "execution_count": 22,
328 "metadata": {},
329 "outputs": [
330 {
331 "data": {
332 "text/plain": [
333 "'f60e203aaaf9'"
334 ]
335 },
336 "execution_count": 22,
337 "metadata": {},
338 "output_type": "execute_result"
339 }
340 ],
341 "source": [
342 "# Answer if you don't scale by perceived value, AND don't filter by affordability\n",
343 "max(holidays, key=lambda h: float(h[3]) / float(h[1]))[0]"
344 ]
345 }
346 ],
347 "metadata": {
348 "kernelspec": {
349 "display_name": "Python 3",
350 "language": "python",
351 "name": "python3"
352 },
353 "language_info": {
354 "codemirror_mode": {
355 "name": "ipython",
356 "version": 3
357 },
358 "file_extension": ".py",
359 "mimetype": "text/x-python",
360 "name": "python",
361 "nbconvert_exporter": "python",
362 "pygments_lexer": "ipython3",
363 "version": "3.5.2+"
364 }
365 },
366 "nbformat": 4,
367 "nbformat_minor": 2
368 }