Click here to monitor SSC
  • Av rating:
  • Total votes: 19
  • Total comments: 9
Joe Celko

Bin Packing Problems: The SQL

22 March 2012

The 'bin packing' problem isn't just a fascination for computer scientists, but comes up in a whole range of real-world applications.  It isn't that easy to come up with a practical, set oriented solution in SQL that gives a near-optimal result.

The bin packing problem is an NP-complete problem. It is a great way to make computer science students do some work and it is also useful in the real world. The basic problem statement is that you are given a set of (n) items. The items all have different volumes. We then have a supply of bins or boxes of the same size. The goal is to put the items into the bins and to minimizes the number of bins used.

As a real world application for this algorithm, look at television advertisements of arbitrary durations. Given time slots of, say, five minutes how do we pack the the most commercials into each time slot and maximize our daily profits. Think of free public service announcements or promo spots as unfilled bins.

Another example is scheduling tasks with known execution times on a set of identical machines. Minimize the number of machines needed for the whole project. This is actually the simplest form of a more complex problem. The task usually have an order of execution. The wire has to be on the wheel before the lug nuts, etc. But is another article for later.

To make the math easier, we assume that the items are sized in integer values from 1 to (k) where (k <= bin size). And to keep thing simple, the bins are size (v). I happen to like (v =12) because 12 has a lot of divisors and because I can use egg cartons and colored plastic eggs when I teach this problem.

It is obvious that we can not have an item of size zero. In spite of what people believe about suitcases, you cannot cram an oversized load into a bin. It is also obvious that if an item is the size of the bin then you have filled that bin and need to get to the next one. We know that the number of bins is between ceiling(sum(item_size)/ bin size) and (k), the number of items.

If the items are all of size one, then we can simply fill bins and use that lower limit of ceiling(sum(item_size) / bin_size). There are sophisticated algorithms for particular item sets, too.

But in the general case, we have to fall back on heuristics. Frankly, the good ones are done with procedural code and not SQL. But there are some quick heuristics that work very well in the real world.

The First Fit Algorithm provides ie fast, but often gives a non-optimal solution. It is just what you chink and the way that people pack a suitcase. This assumes that the items are in a queue and you have enough bins. Grab an item and put it into the first bin in which it will fit.

The First Fit Algorithm requires θ(n log n) execution time, where (n) is the number of items to be packed. This is the same as a non-stable sorting algorithm. You can boost this algorithm by first sorting the list of items in decreasing order (First-Fit Decreasing Algorithm).

This does not guarantee an optimal solution, and for longer queues, it may increase the running time because of the extra sorting. It is known, however, that there always exists at least one ordering of the items queue that produces an optimal solution. You just cannot find it easily. In 1973, Jeffrey Ullman (a very important name in computer science) proved that this algorithm can differ from an optimal packing by as much at 70%.That same year, S. S. Johnson showed that this strategy is never suboptimal by more than 22%, and furthermore that no efficient bin-packing algorithm can be guaranteed to do better than 22%.

Another version is if this algorithm is the First-Fit Increasing Algorithm, which is the same queue sorted from smallest to largest item size. It has all the same problems and limits.

The Best Fit Algorithm needs some smarts. We you get the next item in the queue, you look over the partially filled bins for one which will be completely full if the item is added to it. This sounds like it ought to be optimal. Sorry, but it can fail, too. As an exercise, try writing a query to pick subsets of (n) items that total to the bin target bin size. If you cannot fill a bin, insert itnto the gin that will have the smallest space left; the best fit.

The (n = 1) case is trivial and it is one configuration that you have to have in the final solution. The cases for (n > 1) have to be done with self-joins.

The Worst Fit Algorithm has an unfortunate name. We look at the available partially filled bins and put the item into the bin with the most empty space.

There is a website with animations of of various algorithms from the University of Arizona's computer science department. It is a demonstration of the ICON programming language. It is as high level language that started off as a string manipulation language to replace SNOBOL and then it evolved.

SQL and Bins

Getting a model for bin packing in SQL is not that hard, but the constraints are harder because they have to be at different level of aggregation. :et me jump into my DDL and then explain it.

CREATE TABLE Bins

(item_nbr SMALLINT DEFAULT 0 NOT NULL PRIMARY KEY

 CHECK(item_nbr > 0),

 item_size SMALLINT DEFAULT 0 NOT NUlL

  CHECK(item_size > 0),

 bin_nbr SMALLINT DEFAULT 0 NOT NULL

 CHECK(bin_nbr >= 0));

The size and the item numbers are constants which we load the set of items. Bin zero is essentially loose items on the floor, waiting to get packed. A VIEW of the bins will be handy, so let's do that:

CREATE VIEW Bins_Loads (bin_nbr, load_qty)

AS

SELECT bin_nbr, SUM (item_size)

  FROM Bins

 GROUP BY bin_nbr;

Let's set the bin size at 9. That means we need a constraint to enforce the condition that no bin is over-filled.

CREATE VIEW Bins2 (item_nbr, item_size, bin_nbr)

AS

SELECT item_nbr, item_size, bin_nbr

  FROM Bins

 WHERE 9

       >= ALL (SELECT SUM (item_size)

                 FROM Bins

                WHERE bin_nbr > 0

                GROUP BY bin_nbr)

WITH CHECK OPTION ;

The WITH CHECK OPTION is not well-known but it has been a part of SQL since the SQL-92 Standards. The updatable view is protected from any update that would change a row in violation of the WHERE clause. Try to do this insertion:

 INSERT INTO Bins2 (item_nbr, item_size, bin_nbr)

VALUES (99, 4, 1);  

leads to this error message:

  Msg 550, Level 16, State 1, Line 2

The attempted insert or update failed because the target view either specifies WITH CHECK OPTION or spans a view that specifies WITH CHECK OPTION and one or more rows resulting from the operation did not qualify under the CHECK OPTION constraint.

The statement has been terminated.

However, this is just fine:

INSERT INTO Bins2 (item_nbr, item_size, bin_nbr)

VALUES (99, 1, 1);

This might be easier to see with actual data, Start with this queue {6, 6, 5, 5, 5, 4, 4, 4, 2, 2, 2, 2, 3, 3, 7, 7, 5, 5, 8, 8, 4, 4, 5}   which gives us 110 units to go into bins of size of 9 units. If we did a first fit algorithm, the table would look like this:

INSERT INTO Bins(item_nbr, item_size, bin_nbr)

VALUES (1, 6, 1), (2, 6, 2), (3, 5, 3),

(4, 5, 4), (5, 5, 5), (6, 4, 5),

(7, 4, 6), (8, 4, 6), (9, 4, 7),

(10, 2, 7), (11, 2, 7), (12, 2, 8),

(13, 2, 8), (14, 3, 8), (15, 3, 9),

(16, 7, 10), (17, 7, 11), (18, 5, 12),

(19, 5, 13),(20, 8, 14),(21, 8, 15),

(22, 4, 16), (23, 4, 16), (24, 5, 17);

Doing the math, the theoretical lower bound is 13 boxes.  This gave us 17 boxes which looks like this:

Box nbr

Packing

1

6

2

6

3

5

4

5

5

9

6

8

7

8

8

7

9

3

10

7

11

7

12

5

13

5

14

8

15

8

16

8

17

5

Can we actually hit that lower bin count? Maybe, but the only way to find out is to try all possible packing arrangements in the general case.

One other estimation we can make is to try to try all combinations of (k) items and get a heretical upper bound.  The basic approach is a self join and hope that you do not a lot of small items.

SELECT B1.item_nbr AS item_nbr_1, B2.item_nbr AS item_nbr_2,

      (B1.item_size + B2.item_size) AS packing

  FROM Bins AS B1, Bins AS B2

 WHERE B1.item_nbr < B2.item_nbr

  AND (B1.item_size + B2.item_size) <= 9;

 

I wonder if I can pack three items in a b, so we try that:

SELECT B1.item_nbr AS item_nbr_1,

       B2.item_nbr AS item_nbr_2,

       B3.item_nbr AS item_nbr_3,

      (B1.item_size + B2.item_size + B3.item_size) AS packing

  FROM Bins AS B1, Bins AS B2, Bins AS B3

 WHERE B1.item_nbr < B2.item_nbr

   AND B2.item_nbr < B3.item_nbr

   AND (B1.item_size + B2.item_size + B3.item_size) <= 9;

 Let's try packing all combinations of four items. It still works. Push this to four items per bin with an extension of the pattern we have been using, we get this query.

SELECT B1.item_nbr AS item_nbr_1,

       B2.item_nbr AS item_nbr_2,

       B3.item_nbr AS item_nbr_3,

       B4.item_nbr AS item_nbr_4,

      (B1.item_size + B2.item_size + B3.item_size+ B4.item_size) AS packing

  FROM Bins AS B1, Bins AS B2, Bins AS B3, Bins AS B4

 WHERE B1.item_nbr < B2.item_nbr

   AND B2.item_nbr < B3.item_nbr

   AND B3.item_nbr < B4.item_nbr

   AND (B1.item_size + B2.item_size

        B3.item_size + B4.item_size) <= 9;

Again, we can extend the query to five items with this:

SELECT B1.item_nbr AS item_nbr_1,

       B2.item_nbr AS item_nbr_2,

       B3.item_nbr AS item_nbr_3,

       B4.item_nbr AS item_nbr_4,

       B4.item_nbr AS item_nbr_5,

      (B1.item_size + B2.item_size + B3.item_size

        + B4.item_size + B5.item_size) AS packing

  FROM Bins AS B1, Bins AS B2, Bins AS B3,

       Bins AS B4, Bins AS B5

 WHERE B1.item_nbr < B2.item_nbr

   AND B2.item_nbr < B3.item_nbr

   AND B3.item_nbr < B4.item_nbr

   AND B4.item_nbr < B5.item_nbr

   AND (B1.item_size + B2.item_size + B3.item_size

        + B4.item_size + B5.item_size) <= 9;

but now we get a failures at five items in a bin. Obviously, it would be better to start with a high value of (k) and keep reducing the value until you get a result back.

As it works out, with this data the Decreasing First Fit algorithm gives an optimal result. That queue would look like this: {8, 8, 7, 7, 6, 6, 5, 5, 5, 5, 5, 5, 4, 4, 4, 4, 4, 4, 3, 3, 2, 2, 2, 2}. The results are like this:

Bin nbr

Packing

1

8

2

8

3

9

4

9

5

9

6

9

7

9

8

9

9

9

10

9

11

9

12

9

13

4

Do not be fooled by this!  If we has a bin of size 10 and five items of size 6, the theoretical lower bound would be three bins using ((5 * 6)/10). But it requires five bins no matter which algorithm you pick.

Without generating all possible permutations, see if you can come up with a practical, set oriented solution that gives a near-optimal result.

References:

  • Albers, S. and Mitzenmacher, M. "Average-Case Analysis of First Fit and Random Fit Bin Packing." Random Structures Alg. 16, 240-259, 2000.
  • Coffman, E. G. Jr.; Garey, M. R.; and Johnson, D. S. "Approximation Algorithms for Bin-Packing--An Updated Survey." In Algorithm Design for Computer System Design. Vienna: Springer-Verlag, pp. 49-106, 1984.
  • Garey, M. R.; Graham, R. L.; and Ullman, J. D. "An Analysis of Some Packing Algorithms." In Combinatorial Algorithms. New York: Algorithmics Press, pp. 39-47, 1973.
  • Graham, R. L. "Bounds on Performance of Scheduling Algorithms." In Computer and Job-Shop Scheduling Theory (Ed. E. G. Coffman Jr.). New York: Wiley, pp. 165-227, 1976.
  • Johnson, D. S. "Approximation Algorithms for Combinatorial Problems." In J. Comput. System Sci. 9, 256-278, 1974.
  • Johnson, D. S. "Approximation Algorithms for Combinatorial Problems." In Fifth Annual ACM Symposium on Theory of Computing (Austin, Tex., 1973). New York: Assoc. Comput. Mach., pp. 38-49, 1973.
  • Hoffman, P. The Man Who Loved Only Numbers: The Story of Paul Erdős and the Search for Mathematical Truth. New York: Hyperion, 1998.
  • Lewis, R.; 2009. "A General-Purpose Hill-Climbing Method for Order Independent Minimum Grouping Problems: A Case Study in Graph Colouring and Bin Packing"; Computers
Joe Celko

Author profile:

Joe Celko is one of the most widely read of all writers about SQL, and was the winner of the DBMS Magazine Reader's Choice Award four consecutive years. He is an independent consultant living in Austin, TX. He has taught SQL in the US, UK, the Nordic countries, South America and Africa.
He served 10 years on ANSI/ISO SQL Standards Committee and contributed to the SQL-89 and SQL-92 Standards.
He has written over 800 columns in the computer trade and academic press, mostly dealing with data and databases. He is the author of eight books on SQL for Morgan-Kaufmann, including the best selling SQL FOR SMARTIES.
Joe is a well-known figure on Newsgroups and Forums, and he is famous for his his dry wit. He is also interested in Science Fiction.

Search for other articles by Joe Celko

Rate this article:   Avg rating: from a total of 19 votes.


Poor

OK

Good

Great

Must read
Have Your Say
Do you have an opinion on this article? Then add your comment below:
You must be logged in to post to this forum

Click here to log in.


Subject: Bin packing
Posted by: Hugo Kornelis (view profile)
Posted on: Monday, April 2, 2012 at 5:02 AM
Message: "see if you can come up with a practical, set oriented solution that gives a near-optimal result."

I've done so (and published it) a long time ago already. See http://sqlblog.com/blogs/hugo_kornelis/archive/tags/Bin+Packing/default.aspx

Subject: Editing
Posted by: Ross Presser (view profile)
Posted on: Monday, April 2, 2012 at 5:32 AM
Message: Mr. Celko, I appreciate the technical smarts of this article, but what the heck happened to the writing? There are a few dozen typos before I even get to the first bit of SQL code. The page you linked to has VISUALIZATIONS of various algorithms, not ANIMATIONS -- you are not shown the process of packing, just the result. And the typos continue right through to the end ... "now we get a failures"??? Please, let someone proofread before you submit an article.

Subject: typo, or religious mathematics?
Posted by: Joseph (view profile)
Posted on: Monday, April 2, 2012 at 11:18 AM
Message: I'm curious what a "heretical upper bound" would be.

Is it a bound that causes the Inquisitor to pay you a visit? Has it been banned by The Church?

Or, sadly, is it just a typo for "theoretical"?

Subject: BIF
Posted by: aikimark (not signed in)
Posted on: Monday, April 2, 2012 at 4:58 PM
Message: Granted, my approach (Big-In-First) would have been more iterative than set, it should have a T-SQL implementation.

Similar to your Bin table, create a table that has a capacity (9) column for the bin. Initialize the table with 23 rows. For big sets of data, define an index on the Capacity column.

With a cursor (there, I said it), iterate through a descending sorted list of items, assigning the item to the first bin with capacity. A second (update) query is invoked to decrement the available space for that bin.

Delete any bin table rows that still have their initial capacity (9). This produces the following configuration of bin assignments.

------ Items ---------- Sum Capacity
8 8 1
8 8 1
7 2 9 0
7 2 9 0
6 3 9 0
6 3 9 0
5 4 9 0
5 4 9 0
5 4 9 0
5 4 9 0
5 4 9 0
5 2 7 2
2 2 7

Subject: sorry about the formatting
Posted by: aikimark (not signed in)
Posted on: Monday, April 2, 2012 at 5:01 PM
Message: The tab-delimited data looks so much nicer, trust me.

Subject: sorry about the formatting
Posted by: aikimark (not signed in)
Posted on: Monday, April 2, 2012 at 5:17 PM
Message: The tab-delimited data looks so much nicer, trust me.

Subject: Take the next step
Posted by: Bryant (view profile)
Posted on: Friday, April 6, 2012 at 5:50 AM
Message: Here is an extension to the problem. Packing bins is only 1 part of the issue. My assumption is that you are packing them fore some reason other than esthetics...or theoretical problem-solving.

Now that you have everything neatly packed away in the fewest number of bins, make them fit into a standard container, like a semi-trailer or a shipping container. You need to maximize the cubage used because anything less represents wasted fuel and lost revenue. That is a practical application.

Subject: Knapsack Problem and free book download
Posted by: Anonymous (not signed in)
Posted on: Friday, April 6, 2012 at 11:28 AM
Message: The Knapsack Problem is related to Bin Packing and it has been covered in other articles. A hueristic for Bin packing is to do it as a series of knapsack problems. Pack a knapsack, remove those items from the inventory and pack a second knapsack. Repeat until the inventory is empty.

Ond toeh best books on this topic is "Knapsack Problems: Algorithms and Computer Implementation" by Martello and Toth (John Wiley and Sons, 1990) is avaialble as a pdf on line at http://www.or.deis.unibo.it/kp/Chapter1.pdf.

The bad news is that the code is in unstructured FORTRAN.

Subject: Yet Another Step
Posted by: Robert young (view profile)
Posted on: Saturday, April 7, 2012 at 9:33 AM
Message: Bins
Ship Containers

then

54 footer loaded for customer drops. This is bin packing meets the traveling salesman. The semi has to be: weight balanced, in delivery order, and maximally filled. People earn a nice living writing such subsystems.

 
Simple-Talk Database Delivery

DLM
Patterns & Practices Library

Visit our patterns and practices library to learn more about database lifecycle management.

Find out how to automate the process of building, testing and deploying your database changes to reduce risk and make rapid releases possible.

Get started

Phil Factor
Microsoft and Database Lifecycle Management (DLM): The DacPac

The Data-Tier Application Package (DacPac), together with the Data-Tier Application Framework (DacFx), provides an... Read more...

 View the blog

Top Rated

Working with SQL Server data in Power BI Desktop
 What's the best way of providing self-service business intelligence (BI) to data that is held in... Read more...

Microsoft and Database Lifecycle Management (DLM): The DacPac
 The Data-Tier Application Package (DacPac), together with the Data-Tier Application Framework (DacFx),... Read more...

A Start with Automating Database Configuration Management
 For a number of reasons, it pays to have the up-to-date source of all the databases and servers that... Read more...

Archiving Hierarchical, Deleted Transactions Using XML
 When you delete a business transaction from the database, there are times when you might want to keep a... Read more...

Rollback and Recovery Troubleshooting; Challenges and Strategies
 What happens if your database deployment goes awry? Do you restore from a backup or snapshot and lose... Read more...

Most Viewed

Beginning SQL Server 2005 Reporting Services Part 1
 Steve Joubert begins an in-depth tour of SQL Server 2005 Reporting Services with a step-by-step guide... Read more...

Ten Common Database Design Mistakes
 If database design is done right, then the development, deployment and subsequent performance in... Read more...

Temporary Tables in SQL Server
 Temporary tables are used by every DB developer, but they're not likely to be too adventurous with... Read more...

SQL Server Index Basics
 Given the fundamental importance of indexes in databases, it always comes as a surprise how often the... Read more...

Concatenating Row Values in Transact-SQL
 It is an interesting problem in Transact SQL, for which there are a number of solutions and... Read more...

Why Join

Over 400,000 Microsoft professionals subscribe to the Simple-Talk technical journal. Join today, it's fast, simple, free and secure.