The SQL of Parts Explosions

30 May 2013
by Joe Celko

Parts explosions present a classic IT problem. How can one calculate such things as weight or cost  of assemblies in SQL? Joe shows how it can be done using nested sets, with not an IDENTITY or GUID in sight.

Parts Explosions with Complications

Parts explosions or Bill of Materials (BOM) data bases hold the description of how to build something, usually a manufactured product. I am not sure if “explosion” or “bomb” is a more politically correct term in the age of terrorism. I grew up with parts explosion and it sounds macho so I will stick with that. These databases describe how to make something by telling us two things: (1) what quantity of each parts we need and (2) how they fit together in a hierarchy of assemblies that make the final assembly.

Parts explosions are a classic IT problem and they have a history with RDBMS. This was one of the reasons people thought that SQL would never be used for “real databases”, but it might have a place with DSS (Decision Support Systems, an old name for BI). All the commercial databases in the early days were based on traversing trees or networks. Even after SQL became the choice for data bases, programmers were still locked into the old mindset, so you will see unnecessary  IDENTITY, GUIDs, and other exposed physical locators used to build fake pointer chains.

Parts is not Parts

Years ago there was a television commercial where a customer at the competition’s fried chicken store is served a bag of compressed chicken nuggets and asks “what is this made from?” to a slack-jawed, dull-witted boy beside the counter. The clerk  replies “Chicken parts” and the customer then ask which parts. The reply, “Parts is parts”, is  the tag-line for the commercial. This was some years before the “pink slime” debate, making it something of a prophecy.

The parts in a Parts Explosion come in flavors. A part_name can be

  1.  Final Assembly -There is only one of those and this is the whole point of the parts explosion

  2.  Atomic Parts - These are things that cannot be decomposed, they are often things like screws, nuts, bolts that are bought in bulk from an outside supplier. However an atomic part_name can be made in-house, such as an empty motherboard. Or they can be complicated units that we treat as atomic; we do not generally make a battery or an electric motor when we can buy them in bulk cheaper from a battery or motor manufacturer. When it breaks, we replace the whole unit without attempting repairs.

  3.  Sub-Assembly - These are the distinct decomposable units in the parts explosion. They can be made of further sub-assemblies until we get down to atomic parts. The same sub-assembly can appear in many places. The wheels on your automobile all use the same size and type of tire. However, an Indianapolis race car has different tires for the inside and outside wheels.

Versions of a Part

By now everyone has some experience with a product recall. I got a note from Electrolux that the battery pack in my electric broom was defective and that if I would mail it back in a box they provided, I would get a new version of the old unit.

Version inventory control is a good problem in itself. This is a combination of temporal and equivalence classes. Software is a familiar example. We have to support multiple versions of software at various levels until the older release is finally retired.

I have to back up a bit for a math lesson. An equivalence class is a partitioning of a set into subsets that are in some way equivalent under a function. A simple example is breaking natural numbers into odd and even with a mod function. I can give each class name or identify it with the minimum member and a rule. In the case of even numbers, we can use the definition {n: n N:  mod(n, 2) = 0}.

The versions of a part_name form classes whose members can be interchanged in an assembly. Here is a skeleton table for the lifetime of a series of product versions

CREATE TABLE Product_Lifetimes
(product_name CHAR(10) NOT NULL,
 version_nbr SMALLINT NOT NULL
  CHECK (version_nbr > 0),
 PRIMARY KEY (product_name, version_nbr),
 production_date DATE NOT NULL,
 retirement_date DATE,
  CHECK (production_date < retirement_date)
);

This follows the usual convention that a duration or lifetime is modeled as a half-open interval pair where the ending temporal value is NULL-able. The NULL means that event is still in process, still alive or whatever. Let's load the skeleton with dummy data.

INSERT INTO Product_Lifetimes
VALUES
('ABC', 8, '2000-01-01', '2012-03-31'),
('ABC', 9, '2010-01-01', '2012-12-31'),
('ABC', 10, '2012-01-01', NULL);

Version 9 was two year stop-gap between 8 and the current version 10 of product ABC. Finding what product versions are available for sale any given date is the usual query:

SELECT @sales_date, product_name, version_nbr
  FROM Product_Lifetimes
 WHERE @sale_date BETWEEN production_date AND retirement_date;

The most recent and oldest supported version on any date is two simple extra columns. Let's do it for the whole calendar:

CREATE VIEW Supported_Products
SELECT C.cal_date, P.product_name, P.version_nbr,
       MAX(P.version_nbr) OVER (PARTITION BY product_name)
       AS current_version_nbr
       MIN(P.version_nbr) OVER (PARTITION BY product_name)
       AS oldest_version_nbr
  FROM Product_Lifetimes AS P, Calendar AS C
 
   WHERE C.cal_date BETWEEN P.production_date AND P.retirement_date;

If I have been good about upgrades, I can look this view and see what I have to do. I can also use this to report sales in a roll up over time. if your company is less consumer friendly,  you can also use this to clear out old inventory.

I Like Bling

Besides versions of the same part_name , there can be options that are upgrades and not replacements. Automobiles are a good example. Do you want plain wheels or the fancy spinners? The basic radio or the full stereo system? 

The fancy stereo is a good example of the configuration management problem. You do not just unscrew the basic radio and plug in the Super Deluxe stereo. You need to cut different shaped holes for the speakers. You need speakers in the doors. That means the carpet is different. The wiring harness is different. This can be a major piece of work!  In fact, it is a good question as to whether we have a new product rather than just a base product with options.

Sometimes the “new product” is more marketing than configuration management. Painting a car “Mary Kay Pink” is not a major manufacturing problem but it has been a marketing success for Mary Kay Cosmetics. I am going to postpone that discussion and get to the next part_name of this article.

At this point, we have a pile of parts and now we need a blueprint.

The Blueprint

Let's consider a sample database that shows a parts explosion for a Frammis in a nested sets representation. A Frammis is the imaginary device that holds those Widgets that MBA students are always marketing in their textbooks. Each assembly in the hierarchy can appear multiple times; look at that bag of screws that came with your Ikea furniture. Each assembly has a physical weight  that we can compute from the weight of its parts and sub-assemblies times the quantity of those parts and assemblies. I suppose we could have something with a weight of zero, such as a warranty which is a legal abstraction or a packing slip that is too light to count.

 Here is a skeleton table

CREATE TABLE Frammis

(part_name CHAR(2) PRIMARY KEY,

 part_qty INTEGER NOT NULL

      CONSTRAINT positive_part_qty CHECK (part_qty > 0),

 part_wgt INTEGER DEFAULT 0 NOT NULL,

 CONSTRAINT non_negative_part_wgt

     CHECK (part_wgt >= 0),

 lft INTEGER NOT NULL UNIQUE

     CONSTRAINT valid_lft CHECK (lft > 0),

 rgt INTEGER NOT NULL UNIQUE

     CONSTRAINT valid_rgt CHECK (rgt > 1),

 CONSTRAINT valid_range_pair CHECK (lft < rgt));

Notice that at this point in the process, only the leaf nodes have a weight; we are going to fill in the zero weights shortly. We initially load it with this data:

Frammis

Part

name

Part qty

Part

wgt

 lft

 rgt

'A'

1

0

1

28

'B'

1

0

2

5

'C'

2

0

6

19

'D'

2

0

20

27

'E'

2

12

3

4

'F'

5

0

7

16

'G'

2

6

17

18

'H'

3

0

21

26

'I'

4

8

8

9

'J'

1

0

10

15

'K'

5

3

22

23

'L'

1

4

24

25

'M'

2

7

11

12

'N'

3

2

13

14

and here is the insert statement so you can play along

INSERT INTO frammis (Part_name,Part_qty,Part_wgt,lft,rgt)
VALUES ('A',1,0,1,28),('B',1,0,2,5),('C',2,0,6,19),('D',2,0,20,27),
       ('E',2,12,3,4),('F',5,0,7,16),('G',2,6,17,18),('H',3,0,21,26),
       ('I',4,8,8,9),('J',1,0,10,15),('K',5,3,22,23),('L',1,4,24,25),
       ('M',2,7,11,12),('N',3,2,13,14)

The leaf nodes are the atomic parts, the root node is the final assembly, and the nodes in between are sub-assemblies. Each part_name or assembly has a unique name (in this case one or two letters), a weight, and the quantity of this unit that is required to make the next unit above it. For example, a bicycle has two wheels, one seat and so forth.

The Frammis table is a convenient fiction to keep examples simple. In a real schema for a parts explosion, there should be other tables. One such table would be an Assemblies table to describe the structural relationship of each assembly. Another would be an Inventory table, tables for suppliers, for estimated assembly times and so forth.

Likewise, we might stop making our own motors and start buying them from a supplier. The motor assembly would still be in the tree and it would still be referred to by an assembly code, but its subordinates would disappear. In effect, the "blueprint" for the assemblies is shown in the nesting of the nodes of the Assemblies table with quantities added.

The iterative procedure for calculating the weight of any part_name is fairly straight forward. If the part has no sub-assemblies just use its own weight. For each of its sub-assemblies, if they have no sub-assemblies, then their contribution is their weight times their quantity. If they do have sub-assemblies their contribution is the total of the quantity times the weight of all the sub-assemblies.

CREATE PROCEDURE WgtCalc()

BEGIN

UPDATE Frammis -- clear out the weights

   SET part_wgt = 0

 WHERE lft < (rgt - 1);

WHILE EXISTS (SELECT * FROM Frammis WHERE part_wgt = 0)

UPDATE Frammis

   SET part_wgt

       = CASE -- all the sub-assemblies have a weight computed

         WHEN 0 < ALL (SELECT C.part_wgt

                         FROM Frammis AS C

                              LEFT OUTER JOIN

                              Frammis AS B

                              ON B.lft

                                = (SELECT MAX(S.lft)

                                     FROM Frammis AS S

                                    WHERE C.lft > S.lft

                                     AND C.lft < S.rgt)

                       WHERE B.part_name = Frammis.part_name )

       THEN (SELECT COALESCE (SUM(C.part_wgt * C.part_qty),

                                  Frammis.part_wgt)

               FROM Frammis AS C

                    LEFT OUTER JOIN

                    Frammis AS B

                    ON B.lft

                       = (SELECT MAX(S.lft)

                            FROM Frammis AS S

                           WHERE C.lft > S.lft

                             AND C.lft < S.rgt)

            WHERE B.part_name = Frammis.part_name )

       ELSE Frammis.part_wgt END;

END;

 I suggest that you step thru this loop and watch the computations move up to the final assembly. If you draw a picture, it is even better.

Frammis (with accumulated weight)

Part name

Part qty

Part wgt

 lft

 rgt

'A'

1

682

1

28

'B'

1

24

2

5

'C'

2

272

6

19

'D'

2

57

20

27

'E'

2

12

3

4

'F'

5

52

7

16

'G'

2

6

17

18

'H'

3

19

21

26

'I'

4

8

8

9

'J'

1

20

10

15

'K'

5

3

22

23

'L'

1

4

24

25

'M'

2

7

11

12

'N'

3

2

13

14

The weight of an assembly will be calculated as the total weight of all its sub-assemblies. Look at the M and N leaf nodes; the table says that we need two M units weighing 7 kilograms each, plus three N units weighing 2 kilograms each, to make one J Assembly. Therefore a J assembly weighs ((2 * 7) + (3 * 2)) = 20 kilograms. This process is iterated from the leaf nodes up the tree, one level at a time until the total weight appears in the root node.

This loop works bottom up. This means it will iterate once for each level of the Parts Explosion tree.

We could also create a recursive function that takes part_name as an input and returns the weight of that part or assembly. I will leave that as an exercise to the reader. I did a version that was slower than the loop, so see if you can do it better.

In a real Parts Explosion, we would also have columns for the cost of parts which also cascades upward like weight.  Assembly time is another factor; I can do some assemblies in parallel, so the time to build a sub-assembly Is the maximum assembly time of the sub-assemblies.

 


© Simple-Talk.com