The original SQL-86 standards introduced a bit of “standard-speak”: a word that has continued to be used ever since. This word is ‘effectively’. ‘Effectively’ is used to describe the final effect of a statement. We do not specify the implementation. We specify the results. You would probably assume that this is how all standards for languages would be done, but the standards for both FORTRAN and COBOL originally specified contiguous storage and other such implementation details. As far as I know, we were the first one to get away from that model with the SQL Standards.
Views are virtual tables, defined by a
SELECT statement, and it has to effectively behave as if the result of that statement was an actual physical base table. This meant that, anywhere in the grammar that allows a table, you can use the view. Views do not have space allocated for their data until they are invoked, so they are cheap in terms of resources.
Unfortunately, most SQL products show views and base tables separately in their object browsers as if they were fundamentally different. I suspect that the reason is that the text of the query has to be saved, for reasons will get into in a minute, and so it fits into a different part of the information schema. It is probably just easier to avoid rolling base tables and views together for displaying in a tool.
The Standard SQL syntax for the
VIEW definition is
CREATE VIEW <table name> [(<view column list>)] AS <query expression> [WITH [<levels clause>] CHECK OPTION] <levels clause> ::= CASCADED | LOCAL
<view column list> is optional; when it is not given, the
VIEW will inherit the column names from the query. The number of column names in the
<view column list> has to be the same as the degree of the query expression. If any two columns in the query have the same column name, you must have a
<view column list> to resolve the ambiguity. The same column name cannot be specified more than once in the
<view column list>.
<levels clause> option in the
WITH CHECK OPTION did not exist in SQL-89 and has no effect on queries, but only on
INSERT INTO and
FROM statements. We will get to the details of this underused feature. It has to do with nested views and how they are “unrolled” by a database event.
There are two common ways of implementing views in SQL products. The first way is to save the text, copy it inline like a derived table or CTE, and pass the expanded query to the optimizer. If you are an old assembly language or C programmer, think of an in-line text macro.
The advantage of in-line expansion is that it is simple, and lets the optimizer work with the whole statement. But this means that the same code will be executed over and over, even when the data is constant, by each query that uses this view.
This is actually a pretty common situation. Imagine that you have views that do monthly summaries, so that each time you referenced the “Sales_Summary” view, it is shared by many end-of-the-month reports. You are doing a pretty complicated computation over and over again. What was fine for the single query is suddenly less beneficial for the application as a whole.
The second method, which follows the Standards exactly, is to execute the view’s query text, and materialize it as a temporary table. The advantage of this is that if several sessions are sharing the same view, then they can get to this materialized table and share it. But temporary tables are not always cached quite the same as base tables.
IBM calls this approach a ‘materialized query table’ (MQT) and has more syntax to control how the system stores and refreshes the data. The view can be refreshed explicitly by the user or automatically by the system. But SQL Server users have to do a little more manual programming to get the same results.
Indexing and Views
SQL Server allows you to add indexes to views.This indexing is done for performance purposes. To make an indexed view, you need to create a unique clustered index on that view. This is effectively a key. I hope you notice the word ‘effectively’. This also requires that the view must be materialized. You could construct something similar by using temporary tables to store the data, but a temporary table is static. The indexed view will need to automatically maintain the view definition when it is accessed.
After you create the unique clustered index, you can then create multiple nonclustered indexes on the view. You can check whether you can do this by using the ObjectProperty function’s ‘IsIndexed’ argument. If you then drop the indexes, the view will still behave like a standard view.
Updateable and Read-Only views
Unlike base tables, views are either updateable or read-only, but not both.
DELETE operations are allowed on updateable views and base tables, subject to any other constraints.
DELETE are not allowed on read-only views, but you can change their base tables, and get the new data with the next invocation of the view.
VIEW is one that can associate each of its rows with exactly one row in an underlying base table. When the view is changed, the changes must pass through the view to that underlying base table unambiguously. Updateable views in Standard SQL are defined only for queries that meet these criteria
- They are built on only one table,
- No G
- No aggregate functions
- No calculated columns
- Any columns excluded from the view must be
NULL-able or have a
DEFAULTin the base table, so that a whole row can be constructed for insertion.
By implication, the view must also contain a key of the table. In short, the database engine must be absolutely sure that each row in the view maps back to one and only one row in the base table. Some updating is handled by the
CASCADE option in the referential integrity constraints on the base tables, not by the
The definition of updatability in Standard SQL is actually pretty limited, but very safe. The database system could look at information it has in the referential integrity constraints to widen the set of allowed updateable views. The SQL standard definition of an updateable view is actually a subset of the possible updateable views, and a very small subset at that. The major advantage of this definition is that it is based on syntax and not semantics. For example, these views are logically identical:
CREATE VIEW Foo1 -- updateable, has a key! AS SELECT * FROM Foobar WHERE x IN (1,2); CREATE VIEW Foo2 -- not updateable! AS SELECT * FROM Foobar WHERE x = 1 UNION ALL SELECT * FROM Foobar WHERE x = 2;
Foo1 is updateable and
Foo2 is not. While I know of no formal proof, I suspect that determining if a complex query resolves to an updateable query for allowed sets of data values possible in the table is an NP-complete problem.
Without going into details, here is a list of types of queries that can yield updateable views, as taken from “
VIEW Update Is Practical”, (Goodman 1990):
- Projection from a single table (Standard SQL)
- Restriction/projection from a single table (Standard SQL)
- Set Difference views
- One-To-One Joins
- One-To-One Outer Joins
- One-To-Many Joins
- One-To-Many Outer Joins
- Many-To-Many Joins
- Translated and Coded columns
This does not mean that all views of these types are updateable, but some of them can be with conditions.
Single-Table Projection and Restriction
In practice, many views are projections or restrictions on a single base table. This is a common method for obtaining security control by removing rows or columns that a particular group of users is not allowed to see. These views are usually implemented as in-line macro expansion, since the optimizer can easily fold their code into the final query plan.
One common use for a view is to provide summary data across a row. For example, given a table with measurements in metric units, we can construct a view that hides the calculations to convert them into US Customary units or English (Imperial) units.
It is important to be sure that you have no problems with
NULL values when constructing any calculated column. For example, given a Personnel table with columns for both salary and commission, you might construct this view:
CREATE VIEW Payroll (emp_nbr, paycheck_amt) AS SELECT emp_nbr, (salary + COALESCE(commission), 0.00) FROM Personnel;
Office workers do not get commissions, so the value of their commission column will be
NULL, so we use the
COALESCE() function to change the
NULLs to zeros.
If the values used in the computation come from one table, life is then easy for updating the view. But the computation requires values from two or more tables in the view, then I need to have an inverse function that also distributes the values out to the proper base tables. This is not possible in all cases.
Another common use of a view is to translate codes into text or other codes by doing table lookups. This is a special case of a joined view based on a foreign key relationship between two tables. For example, an order table might use a part number that we wish to display with a part name on an order entry screen. This is done with a
JOIN between the order table and the inventory table, thus:
CREATE VIEW Screen (part_nbr, part_name, ...) AS SELECT Orders.part_nbr, Inventory.part_name, ... FROM Inventory, Orders WHERE Inventory.part_nbr = Orders.part_nbr;
Sometimes the original code is kept and sometimes it is dropped from the view. As a general rule, it is a better idea to keep both values even though they are redundant. The redundancy can be used as a check for users, as well as a hook for nested joins in either of the codes.
The idea of
JOIN views to translate codes can be expanded to show more than just one translated column. The result is often a “star” query with one table in the center, joined by
FOREIGN KEY constraints to many other tables to produce a result that is more readable than the original central table.
Missing values are a problem. If there is no translation for a given code, no row appears in the view, or if an outer join was used, a null will appear. The programmer should establish a referential integrity constraint to
CASCADE changes between the tables to prevent loss of data.
A grouped view is based on a query with a
GROUP BY clause. Since each of the groups may have more than one row in the base from which it was built, these are necessarily read-only views. Such views usually have one or more aggregate functions and they are used for reporting purposes. They are also handy for working around weaknesses in SQL. Consider a view that shows the largest sale in each state. The query is straightforward:
CREATE VIEW Big_Sales (state_code, sales_amt_max) AS SELECT state_code, MAX(sales_amt) FROM Sales GROUP BY state_code;
SQL does not require that the grouping column(s) appear in the select clause, but it is a good idea in this case.
Until recently, a view based on a
UNION ALL operation was read-only because there is no way to map a change in the view to just one row in only one of the base tables. The
UNION operator will remove duplicate rows from the results. Both the
UNION ALL operators hide which table the rows came from. Such views must use a
<view column list>, because the columns in a
ALL] have no names of their own. In theory, a
UNION of two disjoint tables, neither of which has duplicate rows in itself should be updateable.
Some of the major databases, such as Oracle and DB2, support inserts, updates and delete from such views in some circumstances. For example, DB2 attempts to insert any given row into each of the tables underlying the
UNION ALL view. It then counts how many tables accepted the row. It has to process the entire view, one table at a time and collect the results.
- If exactly one table accepts the row, the insert is accepted.
- If no table accepts the row, a “no target” error is raised.
- If more than one table accepts the row, then an “ambiguous target” error is raised.
The overhead can be awful. Since SQL is set-oriented, a set might have all three conditions, so the method of resolving this is complicated.
A point that is often missed, even by experienced sql programmers, is that a view can be built on other views. The only restrictions are that:
- Circular references within the query expressions of the views are illegal
- A view must ultimately be built on base tables
One problem with nested views is that different updateable views can reference the same base table at the same time. If these views then appear in another view, it becomes hard to determine what has happened when the highest-level view is changed. As a very simple example, consider a table with two keys:
CREATE TABLE Canada (english_id INTEGER NOT NULL UNIQUE, french_id INTEGER NOT NULL UNIQUE, eng_word CHAR(30), fren_word CHAR(30)); INSERT INTO Canada VALUES (1, 2, 'muffins', 'croissants'), (2, 1, 'bait', 'escargots'); CREATE VIEW English_Words AS SELECT english_id, eng_word FROM Canada WHERE eng_word IS NOT NULL; CREATE VIEW French_Words AS SELECT french_id, fren_word FROM Canada WHERE french_id_word IS NOT NULL);
We have now tried the escargots and decided that we wish to change our opinion of them:
UPDATE English_Words SET eng_word = 'appetizer' WHERE english_id = 2;
Our French user has just tried haggis and decided to insert a new row for his experience:
UPDATE French_Words SET fren_word = 'Le swill' WHERE french_id = 3;
The row that is created is
(NULL, 3, NULL, 'Le swill'), since there is no way for view
French_Words to get to the view
English_Words columns. Likewise, the English view user can construct a row to record his translation,
(3, NULL, 'Haggis', NULL). But neither of them can consolidate the two rows into a meaningful piece of data.
To delete a row is also to destroy data; the French-speaker who drops ‘croissants’ from the table also drops ‘muffins’ from English_Words.
WITH CHECK OPTION Clause
WITH CHECK OPTION is specified, the viewed table has to be updateable. This is actually a fast way to check how your particular SQL implementation handles updateable views. Try to create a version of the view in question using the
WITH CHECK OPTION and see if your product will allow you to create it.
WITH CHECK OPTION is part of the SQL-89 standard, which was extended in Standard SQL by adding an optional
CASCADED is implicit if an explicit
LEVEL clause is not given. Consider a view defined as…
CREATE VIEW V1 AS SELECT * FROM Foobar WHERE col1 = 'A';
… and now
UPDATE it with …
UPDATE V1 SET col1 = 'B';
UPDATE will take place without any trouble, but the rows that were previously seen now disappear when we use V1 again. They no longer meet the
WHERE clause condition! Likewise, an
INSERT INTO statement with values
(col1 = 'b') would insert just fine, but its rows would never be seen again in this view. Views created this way will always have all the rows that meet the criteria and that can be handy. For example, you can set up a view of rows with a status code of ‘to be done’: Work on them, and change a status code to ‘finished’, and they will disappear from your view. The important point is that the
WHERE clause condition was checked only at the time when the view was invoked.
WITH CHECK OPTION makes the system check the
WHERE clause condition upon
UPDATE. If the new or changed row fails the test, the change is rejected and the view remains the same. Thus, the previous
UPDATE statement would get an error message and you could not change certain columns in certain ways. For example, consider a view of salaries under $30,000 defined with a
CHECK OPTION to prevent anyone from giving a raise above that ceiling.
WITH CHECK OPTION clause does not work like a
CREATE TABLE Foobar (col_a INTEGER NOT NULL PRIMARY KEY); CREATE VIEW TestView (col_a) AS SELECT col_a FROM Foobar WHERE col_a > 0 WITH CHECK OPTION; INSERT INTO TestView VALUES (NULL); -- This fails! CREATE TABLE Foobar_2 (col_a INTEGER CHECK (col_a > 0)); INSERT INTO Foobar_2(col_a) VALUES (NULL); -- This succeeds!
WITH CHECK OPTION must be
TRUE while the
CHECK constraint can be either
UNKNOWN. Once more, you need to watch out for
Standard SQL has introduced an optional
<levels clause>, which can be either
LOCAL. If no
<levels clause> is given, a
<levels clause> of
CASCADED is implicit. The idea of a
CASCADED check is that the system checks all the underlying levels that built views well as the
WHERE clause condition in the view itself. If anything causes a row to disappear from the view, the
UPDATE is rejected. The idea of a
WITH LOCAL check option is that only the local
WHERE clause is checked. Consider two views built on each other from the salary table:
CREATE VIEW Lowpay AS SELECT * FROM Personnel WHERE salary <= 250.00; CREATE VIEW Mediumpay AS SELECT * FROM Lowpay WHERE salary >= 100.00;
If neither view has a
WITH CHECK OPTION, the effect of updating
Mediumpay by increasing every salary by $1,000 will be passed without any check to
Lowpay will pass the changes to the underlying Personnel table. The next time
Mediumpay is used,
Lowpay will be rebuilt in its own right and
Mediumpay rebuilt from it, and all the employees will disappear from
Mediumpay has a
CASCADED CHECK OPTION on it, the
UPDATE will fail.
Mediumpay has no problem with such a large salary, but it would cause a row in
Lowpay to disappear, so
Mediumpay will reject it. However, if only
Mediumpay has a
CHECK OPTION on it, the
UPDATE will succeed.
Mediumpay has no problem with such a large salary, so it passes the change along to
Lowpay, in turn, passes the change to the Personnel table and the
UPDATE occurs. If both views have a
CHECK OPTION, the effect is a set of conditions, all of which have to be met. The Personnel table can accept
INSERTs only where the salary is between $100 and $250.
This can become very complex. Consider an example from an ANSI X3H2 paper by Nelson Mattos of IBM. Let us build a five-layer set of views, using xx and yy as place holders for
LOCAL, on a base table T1 with columns c1, c2, c3, c4, and c5, all set to a value of 10, thus:
CREATE VIEW V1 AS SELECT * FROM T1 WHERE (c1 > 5); CREATE VIEW V2 AS SELECT * FROM V1 WHERE (c2 > 5) WITH xx CHECK OPTION; CREATE VIEW V3 AS SELECT * FROM V2 WHERE (c3 > 5); CREATE VIEW V4 AS SELECT * FROM V3 WHERE (c4 > 5) WITH yy CHECK OPTION; CREATE VIEW V5 AS SELECT * FROM V4 WHERE (c5 > 5);
When we set each one of the columns to zero, we get different results, which can be shown in this chart, where ‘S’ means success and ‘F’ means failure:
To understand the chart, look at the last line. If xx =
CASCADE and yy =
LOCAL, updating column c1 to zero via V5 will fail, whereas updating c5 will succeed. Remember that a successful
UPDATE means the row(s) disappear from V5.
Follow the action for UPDATE V5 SET c1 = 0; VIEW V5 has no with check options, so the changed rows are immediately sent to V4 without any testing. VIEW V4 does have a
WITH LOCAL CHECK OPTION, but column c1 is not involved, so V4 passes the rows to V3.
VIEW V3 has no with check options, so the changed rows are immediately sent to V2.
VIEW V2 does have a
WITH CASCADED CHECK OPTION, so V2 passes the rows to V1 and awaits results.
VIEW V1 is built on the original base table and has the condition c1 > 5, which is violated by this
VIEW V1 then rejects the
UPDATE to the base table, so the rows remain in V5 when it is rebuilt. Now the action for
UPDATE V5 SET c3 = 0;
VIEW V5 has no with check options, so the changed rows are immediately sent to V4, as before.
VIEW V4 does have a
WITH LOCAL CHECK OPTION, but column c3 is not involved, so V4 passes the rows to V3 without awaiting the results.
VIEW V3 is involved with column c3 and has no with check options, so the rows can be changed and passed down to V2 and V1, where they
UPDATE the base table. The rows are not seen again when V5 is invoked, because they will fail to get past
VIEW V3. The real problem comes with
UPDATE statements that change more than one column at a time. For example, …
UPDATE V5 SET c1 = 0, c2 = 0, c3 = 0, c4 = 0, c5 = 0;
…will fail for all possible combinations of levels clauses in the example schema. I strongly advise againstthis kind of coding. I can be very powerful, but too hard to maintain and understand.
WITH CHECK OPTION as CHECK() clause
You can use a view that has a
WITH CHECK OPTION to enforce subquery constraints. These are constraints that use a schema or table level condition instead of just the row level constraints in
CHECK() clauses In full ANSI SQL you can use the
CREATE ASSERTION statement to do this, but SQL Server does not have that feature yet. Let’s fake it!
For example, consider a hotel registry that needs to have a rule that you cannot add a guest to a room that another is or will be occupying. Instead of writing the constraint directly, like this:
CREATE TABLE Hotel (room_nbr INTEGER NOT NULL, arrival_date DATE NOT NULL, departure_date DATE NOT NULL, guest_name CHAR(30) NOT NULL, CONSTRAINT schedule_right CHECK (H1.arrival_date <= H1.departure_date), CONSTRAINT no_double_booking CHECK (NOT EXISTS (SELECT * FROM Hotel AS H1, Hotel AS H2 WHERE H1.room_nbr = H2.room_nbr AND H2.arrival_date < H1.arrival_date AND H1.arrival_date < H2.departure_date)));
schedule_right constraint is fine, since it has no subquery, but many products will choke on the
no_double_booking constraint. Leaving the
no_double_booking constraint off the table, we can construct a view on all the rows and columns of the Hotel base table and add a
WHERE clause which will be enforced by the
WITH CHECK OPTION.
CREATE VIEW Hotel_Register (room_nbr, arrival_date, departure_date, guest_name) AS SELECT H1.room_nbr, H1.arrival_date, H1.departure_date, H1.guest_name FROM Hotel AS H1 WHERE NOT EXISTS (SELECT * FROM Hotel AS H2 WHERE H1.room_nbr = H2.room_nbr AND H2.arrival_date < H1.arrival_date AND H1.arrival_date < H2.departure_date) AND H1.arrival_date <= H1.departure_date WITH CHECK OPTION;
INSERT INTO Hotel_Register VALUES (1, '2016-01-01', '2016-01-03', 'Ron Coe'); COMMIT; INSERT INTO Hotel_Register VALUES (1, '2016-01-03', '2016-01-05', 'John Doe');
will give a WITH CHECK OPTION clause violation on the second INSERT INTO statement, as we wanted.
Views, like tables, can be dropped from the schema. The Standard SQL syntax for the statement is:
DROP VIEW <table name> <drop behavior> <drop behavior> ::= [CASCADE | RESTRICT]
Vendors had different
DROP behaviors in their SQL implementation. When you dropped a view, the engine usually removed the appropriate row from the schema tables. You found out about dependencies when you tried to use views built on other views that no longer existed, or whose name was re-used. This is not a good design, but it I what we had.
Likewise, dropping a base table could cause the same problem when the view was accessed. You did not even get a warning! The cascade option will find all other views that use the dropped view and remove them also. If
RESTRICT is specified, the view cannot be dropped if there is anything that is dependent on it.
The best way to approach views is to think of how a user wants to see the database and then give him a set of views that make it look as if the database had been designed just for his applications.There are two parts of this design principle. First, use views to hide complexity and secondly, to enforce consistency. One way to figure out what views you should have is to inspect the existing queries for repeated subqueries, CTEs, derived tables or expressions. These are good candidates for views.
Complexity and consistency are related. It is very hard to consistently reproduce a complex criteria. For example, it. We have an elaborate business rule about who can do what, we do not want to keep writing it over and over and over. It would be very much nicer to have a view of “employees who have level XYZ certification” that we can use, without this having to be computed over and over and over from a dozen tables.
One of the major uses of views is for security. The DBA can choose to hide certain columns from certain classes of users through a combination of security authorizations (that is the DCL or Data Control Language) and views.
However, while views can be used to mask columns, apply obscuring functions, or even change data, remember that the DCL (Data Control Language) also has a place in security. My personal opinion is that we do not spend enough time teaching DCL, but that is another topic.