From: Neil Smith Date: Wed, 10 Oct 2018 17:28:32 +0000 (+0100) Subject: Added some comments X-Git-Url: https://git.njae.me.uk/?p=summerofcode2018soln.git;a=commitdiff_plain;h=cfb7c81e1ac2a46536473a4a50d482fef8341734 Added some comments --- diff --git a/src/task9/task9.ipynb b/src/task9/task9.ipynb index f8b2d9e..556ce5e 100644 --- a/src/task9/task9.ipynb +++ b/src/task9/task9.ipynb @@ -11,6 +11,22 @@ "from functools import lru_cache" ] }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Needed for the recursive functions, as the `lru_cache` requires all the function's arguments be hashable. " + ] + }, + { + "cell_type": "code", + "execution_count": 32, + "metadata": {}, + "outputs": [], + "source": [ + "Element = collections.namedtuple('Element', ['weight', 'value'])" + ] + }, { "cell_type": "code", "execution_count": 2, @@ -25,12 +41,12 @@ ] }, { - "cell_type": "code", - "execution_count": 32, + "cell_type": "markdown", "metadata": {}, - "outputs": [], "source": [ - "Element = collections.namedtuple('Element', ['weight', 'value'])" + "# Part 1: as many bags as possible\n", + "\n", + "A dynamic programming solution. See the code in part 2 (below) for an explanation: it makes more sense when we're counting the value of the bags separately from the number of bags." ] }, { @@ -60,6 +76,13 @@ " return count_table[len(elements), weight_limit], count_table, back_refs" ] }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "A recursive solution. Note the use of `lru_cache`, part of the standard `functools` Python library. The `lru_cache` memoises the function it's applied to, making these recursive algorithms feasible on larger instances." + ] + }, { "cell_type": "code", "execution_count": 37, @@ -73,8 +96,6 @@ " else:\n", " this_element = list(elements)[0]\n", " other_elements = elements.difference(frozenset([this_element]))\n", - "# this_element = elements[0]\n", - "# other_elements = elements[1:]\n", " if this_element.weight > weight_limit:\n", " return recursive_count(other_elements, weight_limit)\n", " else:\n", @@ -86,6 +107,43 @@ " return without_this" ] }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Finally, a greedy version that just packs the lightest bags in first, continuing while there's space for another." + ] + }, + { + "cell_type": "code", + "execution_count": 10, + "metadata": {}, + "outputs": [], + "source": [ + "def greedy_fill(elements, weight_limit):\n", + " return len(list(itertools.takewhile(lambda s: s < weight_limit, itertools.accumulate(sorted(e['weight'] for e in elements)))))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Part 2: getting the most value\n", + "First, the dynamic programming solution.\n", + "\n", + "The table is indexed as `(number of bags, weight used)` and contains the solution (i.e. value of the chosen bags) for the problem when using the first few bags and a smaller weight limit. When considering adding this bag, look at the solutions for whether to use this bag or not.\n", + "\n", + "Let's say we're looking at including bag _i_ and a total weight limit of _wl_. \n", + "* If we're not using this bag, the best solution is given by the best solution for the _i_-1 other bags with the same weight limit _wl_. \n", + "* If we _are_ using this bag, the best solution is the value of this bag plus the best solution for the other _i_-1 bags packed into (_wl_ - this bag's weight). We need the smaller weight limit to ensure there's enough space to fit this bag.\n", + "\n", + "We can fill the table row by row, looking up the values in the row above. \n", + "\n", + "The final solution is read off from the table, in the cell that represents all the bags and the full weight limit.\n", + "\n", + "The backpointers are there to easily trace back which decision was made when filling in each cell. That allows you to reconstruct the full solution." + ] + }, { "cell_type": "code", "execution_count": 5, @@ -115,84 +173,30 @@ ] }, { - "cell_type": "code", - "execution_count": 19, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "frozenset({1, 2, 3})" - ] - }, - "execution_count": 19, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "fs = frozenset([1, 2, 3])\n", - "fs" - ] - }, - { - "cell_type": "code", - "execution_count": 21, + "cell_type": "markdown", "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "1" - ] - }, - "execution_count": 21, - "metadata": {}, - "output_type": "execute_result" - } - ], "source": [ - "list(fs)[0]" + "A recursive definition, directly applying the ideas above. Without memoisation, this just won't terminate on even trivial-sized problems. " ] }, { "cell_type": "code", - "execution_count": 23, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "frozenset({2, 3})" - ] - }, - "execution_count": 23, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "fs.difference(frozenset([1]))" - ] - }, - { - "cell_type": "code", - "execution_count": 47, + "execution_count": 58, "metadata": {}, "outputs": [], "source": [ "@lru_cache(maxsize=None)\n", - "def recursive_valuefs(elements, weight_limit):\n", + "def recursive_value(elements, weight_limit):\n", " if len(elements) == 0:\n", " return frozenset()\n", " else:\n", - " this_element = list(elements)[0]\n", + " this_element = next(iter(elements)) # pick an arbitrary element\n", " other_elements = elements.difference(frozenset([this_element]))\n", " if this_element.weight > weight_limit:\n", - " return recursive_valuefs(other_elements, weight_limit)\n", + " return recursive_value(other_elements, weight_limit)\n", " else:\n", - " with_this = recursive_valuefs(other_elements, weight_limit - this_element.weight)\n", - " without_this = recursive_valuefs(other_elements, weight_limit)\n", + " with_this = recursive_value(other_elements, weight_limit - this_element.weight)\n", + " without_this = recursive_value(other_elements, weight_limit)\n", " items_with_this = with_this.union(frozenset([this_element]))\n", " if sum(e.value for e in items_with_this) > sum(e.value for e in without_this):\n", " return items_with_this\n", @@ -200,6 +204,51 @@ " return without_this" ] }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "A couple of greedy attempts. The first tries to pack the items by weight, always putting in the lightest items first.\n", + "\n", + "The second tries to be a bit more clever and sorts the items by pennies-per-kilo, so the highest-value bags are placed first." + ] + }, + { + "cell_type": "code", + "execution_count": 12, + "metadata": {}, + "outputs": [], + "source": [ + "def greedy_value_w(elements, weight_limit):\n", + " return list(itertools.takewhile(lambda es: es['weight'] < weight_limit,\n", + " itertools.accumulate(\n", + " sorted((e for e in elements), key=lambda e: e['weight']),\n", + " lambda es, e: {'weight': es['weight'] + e['weight'], 'value': es['value'] + e['value']}))\n", + " )[-1]['value']" + ] + }, + { + "cell_type": "code", + "execution_count": 11, + "metadata": {}, + "outputs": [], + "source": [ + "def greedy_value_vpw(elements, weight_limit):\n", + " return list(itertools.takewhile(lambda es: es['weight'] < weight_limit,\n", + " itertools.accumulate(\n", + " sorted((e for e in elements), key=lambda e: e['value'] / e['weight'], reverse=True),\n", + " lambda es, e: {'weight': es['weight'] + e['weight'], 'value': es['value'] + e['value']}))\n", + " )[-1]['value']" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Utilities\n", + "A couple of functions for displaying the results. Only try these on the small example in the question, as otherwise there won't be enough screen area to display the table!" + ] + }, { "cell_type": "code", "execution_count": 7, @@ -248,41 +297,13 @@ ] }, { - "cell_type": "code", - "execution_count": 10, - "metadata": {}, - "outputs": [], - "source": [ - "def greedy_fill(elements, weight_limit):\n", - " return len(list(itertools.takewhile(lambda s: s < weight_limit, itertools.accumulate(sorted(e['weight'] for e in elements)))))" - ] - }, - { - "cell_type": "code", - "execution_count": 11, - "metadata": {}, - "outputs": [], - "source": [ - "def greedy_value_vpw(elements, weight_limit):\n", - " return list(itertools.takewhile(lambda es: es['weight'] < weight_limit,\n", - " itertools.accumulate(\n", - " sorted((e for e in elements), key=lambda e: e['value'] / e['weight'], reverse=True),\n", - " lambda es, e: {'weight': es['weight'] + e['weight'], 'value': es['value'] + e['value']}))\n", - " )[-1]['value']" - ] - }, - { - "cell_type": "code", - "execution_count": 12, + "cell_type": "markdown", "metadata": {}, - "outputs": [], "source": [ - "def greedy_value_w(elements, weight_limit):\n", - " return list(itertools.takewhile(lambda es: es['weight'] < weight_limit,\n", - " itertools.accumulate(\n", - " sorted((e for e in elements), key=lambda e: e['weight']),\n", - " lambda es, e: {'weight': es['weight'] + e['weight'], 'value': es['value'] + e['value']}))\n", - " )[-1]['value']" + "# Solving the problems\n", + "Finally, how to solve the problem!\n", + "\n", + "Load the elements, convert them into something hashable for the recursive solutions." ] }, { @@ -307,6 +328,15 @@ " )" ] }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Solving part 1\n", + "\n", + "All the approaches find the optimal solution." + ] + }, { "cell_type": "code", "execution_count": 14, @@ -368,6 +398,15 @@ "len(recursive_count(hashable_elements, weight_limit))" ] }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Solving part 2\n", + "\n", + "The dynamic programming version find the optimal solution." + ] + }, { "cell_type": "code", "execution_count": 16, @@ -389,6 +428,13 @@ "value" ] }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The greedy versions don't find optimal solutions, but the smarter algorithm gets pretty close." + ] + }, { "cell_type": "code", "execution_count": 17, @@ -429,9 +475,16 @@ "greedy_value_vpw(elements, weight_limit)" ] }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The recursive solution also finds the optimal solution,but only thanks to memoisation." + ] + }, { "cell_type": "code", - "execution_count": 48, + "execution_count": 60, "metadata": {}, "outputs": [ { @@ -453,18 +506,18 @@ " Element(weight=434, value=195)})" ] }, - "execution_count": 48, + "execution_count": 60, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "recursive_valuefs(hashable_elements, weight_limit)" + "recursive_value(hashable_elements, weight_limit)" ] }, { "cell_type": "code", - "execution_count": 49, + "execution_count": 61, "metadata": {}, "outputs": [ { @@ -473,33 +526,13 @@ "2383" ] }, - "execution_count": 49, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "sum(e.value for e in recursive_valuefs(hashable_elements, weight_limit))" - ] - }, - { - "cell_type": "code", - "execution_count": 52, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "305061" - ] - }, - "execution_count": 52, + "execution_count": 61, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "(len(elements) + 1) * 5001" + "sum(e.value for e in recursive_value(hashable_elements, weight_limit))" ] }, {