10 May 2012

SQL VIEW Basics

SQL Views are essential for the database developer. However, it is common to see them misued, or neglected. Joe Celko tackles an introduction to the subject, but there is something about the topic that makes it likely that even the experienced developer will find out something new from reading it.

A VIEW is a virtual table, defined by a query, that does not exist until it is invoked by name in an SQL statement. This may sound simple enough, but  some developers have difficulties with the  concept.  Because of this, they tend to leave out VIEWs because they do not appreciate their value. It is easy to write a query and not bother to put it into a VIEW, because there is no performance boost. ‘If I am going to save code,’ they reason, ‘I want it in a stored procedure that can take a parameter list instead.’

In fact, a VIEW definition can be copied as in-line text, just like a CTE. But with a good optimizer, the SQL engine can decide that enough sessions are using the same VIEW and materialize it as a shared table. The CTE, in contrast, is strictly local to the statement in which it is declared.

VIEWs are often named incorrectly. A VIEW is a table, so it is named just like any other table. The name tells us what set of things it represents in the data model. The most common offender is the “Volkswagen” coder who prefixes or suffixes the VIEW name with “vw_” or “VIEW_” in violation of ISO-11179 rules. We do not mix data and meta data in a name. This is as silly as prefixing every noun in a novel with “n_” so the reader will know that  the word is a noun in English grammar.

The ANSI Standard SQL syntax for the VIEW definition is

the WITH CHECK OPTION is a little known and less used feature that has been around since the SQL-86 Standards. The <levels clause> option in the WITH CHECK OPTION did not exist in Standards before SQL-92. It is not implemented in T-SQL dialect, so I will skip it. If you see it, just remember T-SQL defaults to CASCADED behavior.

A VIEW has no physical existence in the database until it is invoked. You cannot put constraints on a VIEW for that reason. The name of the VIEW must be unique within the entire database schema, like a base table name. The VIEW definition cannot reference itself, since it does not exist yet. Nor can the definition reference only other VIEWs; the nesting of VIEWs must eventually resolve to underlying base tables. This only makes sense; if no base tables were involved, what would you be VIEWing?

You can either build a column name list in the VIEW header or inherit the column names from the SELECT statement. Building this list is usually just one quick “cut & paste” and well worth it. This is why we do not ever use “SELECT *” in a VIEW definition in production code. When the columns of a base tables change, the definition of the “star” will also change. If you are lucky, you will get an error when the VIEW has too many or too few columns when it is invoked. If you are not so lucky, the VIEW will run and give you unexpected results. If you are unlucky, the VIEW will run and give you wrong answers that you use.

Every few months, someone will post to a SQL forum asking how to use a parameter in a VIEW. They would never ask how to use a parameter in a base table. The sight of a SELECT statement instead of a list of column declarations throws their mindset in the wrong direction.

Mullins Heuristics for VIEWS

Programmers have rules and standards for creating base tables. The data element names should follow the ISO-11179 rules. We have to have a key. We can have all kinds of constraints. We can have Declarative Referential Integrity actions among other base tables. But how do you design a VIEW?

Craig Mullins, a DB2 expert and author, gave the following rule to ensure that VIEWs are created in a responsible and useful manner. Simply stated, the VIEW creation strategy should be goal-oriented. VIEWs should be created only when they achieve a specific, reasonable goal. Each VIEW should have a specific application or business requirement that it fulfills before it is created. That requirement should be documented somewhere, preferably in a data dictionary.

Although this rule seems obvious, VIEWs are implemented at some shops without much thought as to how they will be used. This can cause the number of VIEWs that must be supported and maintained to increase until so many VIEWs exist that it is impossible to categorize their uses. Nobody wants to take a chance and drop a VIEW, so they just write a new one. Whenever a base table used by a VIEW definition is changed, then all those VIEWs have to be re-compiled and checked. Since VIEWs can be built on top of VIEWs, this can be tricky.

Unlike other virtual tables, a VIEW is defined in the schema information tables and its definition (not its content!) is persisted. This implies some privileges are needed to use, create, alter and drop VIEWs. The first question is do you need to have privileges on the base tables that build a VIEW? Yes, but not full privileges. The minimal privileges would be to use the base tables, so you can build the VIEW. But that does not mean that the user needs to be able to directly query or modify the base tables.

The ideal design should give each user a set of VIEWs that make it look as if the schema was designed for just his or her use, without regard to the rest of the enterprise.

This is most often done for security and privacy. The payroll clerk can see the salaries of other personnel and change them. But he cannot give himself a pay raise and try to get out of town before the police find out. He can see the minimum, maximum and average salary in each department, but not who is making which salary.

The Data Control Language (DCL) is the third sub-language in SQL after DDL and DML. This is where the DBA can GRANT, REVOKE or DENY all kinds of schema object privileges. We spend almost no time on it in training classes, and failure to do it right can destroy your enterprise. As a generalization, the DBA ought to start with a list of roles users can play in the enterprise and create a script for the privileges each role needs. A new user can then be assigned a role and you do not have to repeat the script over and over.

Do not grant VIEW creation privileges to everyone. The “nearly the same” VIEWs are a special problem. One user might have read the spec “Employees must be over 21 years of age to serve alcohol” to mean strictly over 21 as of today or can they pour a drink on their 21-st birthday? If VIEW creation had been left to just one data modeler, only one of these VIEWs would exist and it would have the correct business rule.

Tricky Queries and Computations

Not all programers are equal, so you can make sure that the VIEWs preserve the best work in your shop. The other advantage is that if someone finds a better query for the current state of the database, you keep the VIEW header, drop the SELECT statement in the body, replace it and then re-compile your code. The programmer needs no knowledge of how the VIEW works. This technique becomes more useful as the SQL becomes more complex.

In T-SQL, we used to write complicated code to get sequence numbers and pure dates without time. This code was often hidden in VIEWs. The numbering can now be done with ROW_NUMBER() and we have a DATE data type since SQL Server 2008. In many cases, procedures and functions that used loops and fancy string manipulations can be replaced with VIEWs.

Updatable and Read-Only VIEWs

Unlike base tables, VIEWs are either updatable or read-only, but not both. INSERT, UPDATE, and DELETE operations are allowed on updatable VIEWs and base tables, subject to other constraints. INSERT, UPDATE, and DELETE are not allowed on read-only VIEWs, but you can change their base tables, as you would expect.

An updatable VIEW is one that can have each of its rows associated with exactly one row in an underlying base table. When the VIEW is changed, the changes pass through the VIEW to that underlying base table unambiguously. Updatable VIEWs in Standard SQL are defined only for queries that meet these criteria

  1. They are built on only one table
  2. No GROUP BY clause
  3. No HAVING clause
  4. No aggregate functions
  5. No calculated columns
  6. No UNION, INTERSECT or EXCEPT
  7. No SELECT DISTINCT clause
  8. Any columns excluded from the VIEW must be NULL-able or have a DEFAULT clause in the base table, so that a whole row can be constructed for insertion.

By implication, the VIEW must also contain a key of the table. In short, we are absolutely sure that each row in the VIEW maps back to one and only one row in the base table. The major advantage of this limited definition is that it is based on syntax and not semantics. For example, these VIEWs are logically identical:

But Foo1 is updateable and Foo2 is not. While I know of no formal proof, I suspect that determining if a complex query resolves to an updatable query for allowed sets of data values possible in the table is an NP-complete problem.

The INSTEAD OF trigger was the ANSI Standards Committee letting the Data Modeler decide on how to resolve the VIEW updating problem. These triggers are added to a VIEW and are executed on base tables that make up the VIEW. The user never sees them fire and work their magic.

 As an example, consider a VIEW that builds the total compensation for each employee by joining the personnel, employee stock holdings, bonuses and salary_amt tables in one VIEW. An INSTEAD OF trigger can update the total compensation using a hidden formula and complex business rules that the user never sees.

The use of INSTEAD OF triggers gives the user the effect of a single table, but there can still be surprises. Think about three tables; A, B and C. Table C is disjoint from the other two. Tables A and B overlap. So I can always insert into C and may or may not be able to insert into A and B if I hit overlapping rows.

Going back to my Y2K consulting days, I ran into a version of such a partition by calendar periods. Their Table C was set up on Fiscal quarters and got leap year wrong because one of the fiscal quarters ended on the last day of February.

Nested VIEWs

A point that is often missed, even by experienced SQL programmers, is that a VIEW can be built on other VIEWs. The only restrictions are that circular references within the query expressions of the VIEWs are illegal and that a VIEW must ultimately be built on base tables. One problem with nested VIEWs is that different updatable VIEWs can reference the same base table at the same time. If these VIEWs then appear in another VIEW, it becomes hard to determine what has happened when the highest-level VIEW is changed. As an example, consider a table with two keys:

The table declaration is a bit strange. It allows an English-only or French-only word to appear in the table. But the CHECK() constraint requires that a word must fall into one or both type codes.

We have now tried the escargots and decided that we wish to change our opinion of them:

Our French user has just tried Haggis and decided to insert a new row for his experience:

The row that is created is (NULL, 3, NULL, 'tripoux'), since there is no way for the VIEW FrenchWords to get to the VIEW EnglishWords columns. Likewise, the English VIEW user can construct a row to insert his translation, (3, NULL, 'Haggis', NULL), but neither of them can consolidate the two rows into a meaningful piece of data.

To delete a row is also to destroy data; the French-speaker who drops ‘croissants’ from the table also drops ‘muffins’ from VIEW EnglishWords.

WITH CHECK OPTION Clause

If WITH CHECK OPTION is specified, the VIEWed table has to be updatable. This is actually a fast way to check how your particular SQL implementation handles updatable VIEWs. Try to create a version of the VIEW in question using the WITH CHECK OPTION and see if your product will allow you to create it. The WITH CHECK OPTION was part of the SQL-89 Standard, but nobody seems to know about it! Consider this skeleton:

and now UPDATE it with

The UPDATE will take place without any trouble, but the rows that were previously seen now disappear when we use V1 again. They no longer meet the WHERE clause condition! Likewise, an INSERT INTO statement with VALUES (col1 = 'B') would insert just fine, but its rows would never be seen again in this VIEW. This might be the desired behavior. For example, you can set up a VIEW of rows in a jobs table with a status code of ‘to be done’, work on them, and change a status code to ‘finished’, and the rows will disappear from your VIEW. The important point is that the WHERE clause condition was checked only at the time when the VIEW was invoked.

The WITH CHECK OPTION makes the system check the WHERE clause condition upon INSERT and UPDATE. If the new or changed row fails the test, the change is rejected and the VIEW remains the same. The WITH CHECK OPTION clause does not work like a CHECK constraint.

The WITH CHECK OPTION must be TRUE while the CHECK constraint can be either TRUE or UNKNOWN. This is an example of the differences in DDL and DML in SQL. Once more, you need to watch out for NULLs.

T-SQL checks all the underlying levels that built the VIEW, as well as the WHERE clause condition in the VIEW itself. If anything causes a row to disappear from the VIEW, the UPDATE is rejected. Consider two VIEWs built on each other from the Personnel table:

If neither VIEW has a WITH CHECK OPTION, the effect of updating Medium_Paid_Personnel by increasing every salary_amt by $1,000 will be passed without any check to Low_Paid_Personnel . Low_Paid_Personnel will pass the changes to the underlying Personnel table. The next time Medium_Paid_Personnel is used, Low_Paid_Personnel will be rebuilt in its own right and Medium_Paid_Personnel rebuilt from it, and all the employees will disappear from Medium_Paid_Personnel .

If only Medium_Paid_Personnel has a WITH CHECK OPTION on it, the UPDATE will fail. Medium_Paid_Personnel has no problem with such a large salary_amt, but it would cause a row in Low_Paid_Personnel to disappear, so Medium_Paid_Personnel will reject it. However, if only Medium_Paid_Personnel has a WITH LOCAL CHECK OPTION on it, the UPDATE will succeed. Medium_Paid_Personnel has no problem with such a large salary_amt, so it passes the change along to Low_Paid_Personnel . Low_Paid_Personnel , in turn, passes the change to the Personnel table and the UPDATE occurs. If both VIEWs have a WITH CHECK OPTION, the effect is a set of conditions, all of which have to be met. The Personnel table can accept UPDATEs or INSERTs only where the salary_amt is between $100 and $250.

WITH CHECK OPTION as Constraints

Lothar Flatz, an instructor for Oracle Software Switzerland made the observation that while Oracle cannot put subqueries into CHECK()() constraints and triggers would not be possible because of the mutating table problem, you can use a VIEW that has a WITH CHECK OPTION to enforce subquery constraints.

For example, consider a hotel registry that needs to have a rule that you cannot add a guest to a room that another is or will be occupying. Instead of writing the constraint directly, like this:

The valid_stay_dates constraint is fine, since it has no subquery, but will choke on the no_overlaps constraint. Leaving the no_overlaps constraint off the table, we can construct a VIEW on all the rows and columns of the Hotel base table and add a WHERE clause which will be enforced by the WITH CHECK OPTION.

For example,

will give a WITH CHECK OPTION clause violation on the second INSERT INTO statement, as we wanted.

Dropping VIEWs

VIEWs, like tables, can be dropped from the schema. The T-SQL syntax for the statement is:

The use of the <table name list> is dialect and it gives you a shorthand for repeating drop statements. The drop behavior depends on your vendor. The usual way of storing VIEWs was in a schema information table is to keep the VIEW name, the text of the VIEW, but dependencies. When you drop a VIEW, the engine usually removes the appropriate row from the schema information tables. You find out about dependencies when you try to use something that wants the dropped VIEWs. Dropping a base table could cause the same problem when the VIEW was accessed. But the primary key/foreign key dependencies among base tables will prevent dropping some base tables.

Table Expression VIEWs

An old usage for VIEWs was to do the work of CTEs when there were no CTEs. The programmers created VIEWs, and then used them. Of course they wasted space, caused disk reads and were only used in one statement. It might be worth looking at old code for VIEWs that are not shared.

Today the reverse is true. Programmers create the same CTE code over and over in different queries and give it local names for each appearance. Those local names are seldom the same and are often “place markers” like “X” or “CTE_1” and give no hint as to what the table expression means in the data model.

It can be hard to factor out common table expressions across multiple queries. One query uses an infixed JOIN operators and another uses a <span class=”mono”>FROM</span> list, the predicates are equivalent but written slightly different and so forth.

I recommend that you sit down and think of useful VIEWs, write them and then see if you can find places where they would make the code easier to read and maintain. As an example, our hotel application will probably need to find vacant rooms by calendar date, compute an occupancy ratio by calendar date and other basic facts.

Another bad use is the one VIEW per Base Table myth that was poplar with DB2 programmers years ago. The reasoning behind this myth was the applaudable desire to insulate application programs from database changes. All programs were to be written against VIEWs instead of base tables. When a change is made to the base table, the programs would not need to be modified because they access a VIEW, not the base table.

This does not work in the long run. All you do is accumulate weird orphaned VIEWs. Consider the simplest type of database change – adding a column to a table. If you do not add the column to the VIEW, no programs can access that column unless another VIEW is created that contains the new column. But if you create a new VIEW every time you add a new column it will not take long for your schema to be swamped with VIEWs. Even more troublesome is the question of which VIEW should be used by which program. Similar arguments can be made for any structural change to the tables.

Keep up to date with Simple-Talk

For more articles like this delivered fortnightly, sign up to the Simple-Talk newsletter

This post has been viewed 130423 times – thanks for reading.

Tags: , , ,

  • Rate
    [Total: 147    Average: 3.8/5]
  • Share

Joe Celko is one of the most widely read of all writers about SQL, and was the winner of the DBMS Magazine Reader's Choice Award four consecutive years. He is an independent consultant living in Austin, TX. He has taught SQL in the US, UK, the Nordic countries, South America and Africa.
He served 10 years on ANSI/ISO SQL Standards Committee and contributed to the SQL-89 and SQL-92 Standards.
He has written over 800 columns in the computer trade and academic press, mostly dealing with data and databases. He is the author of eight books on SQL for Morgan-Kaufmann, including the best selling SQL FOR SMARTIES.
Joe is a well-known figure on Newsgroups and Forums, and he is famous for his his dry wit. He is also interested in Science Fiction.

View all articles by Joe Celko

  • Bruno

    auto materialization
    First, I’ll remember to point the metadata iso standard to people using annoying prefixes. Thanks for the link.

    Second, which DBMS auto materializes views? I know SQL Server doesn’t.

  • Tony Rogerson

    some points specific to Microsoft SQL Server
    This statement is incorrect “A VIEW has no physical existence in the database until it is invoked”.

    The only thing that is materialised is the view definition, that is stored in system tables are meta data. On referencing a view in a query that definition is expanded into the main body of the query, a quick check on the query plan clearly shows that. The only time it is materialised if when you index the view, it is then no longer a view but a persisted table linked to the underlying base table.

    Nesting views should be avoided because if a query needs to be debugged because of logic or performance problems its incredibly difficult to completely understand what is going on because the logic is hidden (the advantage of doing nesting), only by reading the underlying query plan can it be resolved – something not all developers are capable of.

    The WITH CHECK option is next to useless and doesn’t protect the underlying base table, it only works if the insert/update is done through the view itself, they cannot be relied upon to give good RI, constraints are a better choice [on the base table].

    Adding a view per table using the schema_binding option is good practice because it prevents modification of the table definition thus preventing anybody accidently dropping the table or dropping columns on it.

    Views have really had their day in SQL Server, a far better table expression exists that gives the view behaviour but also allows parameterisation, something severely missing in Views – that is inline Table Valued Functions.

  • pdanes

    Prefix
    I am one of those who you disparagingly call ‘Volkswagen coders’, and I resent the insult. I DO put vw in front of my view names, just like I put sp in front of stored procedures, fn in front of functions and so on. A view is NOT a table – it can contain JOINs, filters, transformation logic and a dozen other things that a pure table does not. The prefix also tells me where in SSMS to look for the definition – without that I would have to scan both table and view definitions to find the source. You have written some interesting and useful articles (including this one, actually), but you dropped the ball on this point.

  • Dave Poole

    Hungarian Notation
    I got bitten badly by the prefixing of objects to identify their type.

    I’m not sure if prefixing stored procs with sp_ still has the same performance hit that it did in SQL2000. In the old days prefixing stored procs with sp_ would cause SQL Server to look in the master database first before trying your local database.

    One of the projects I’ve worked on was to refactor a database where certain tables had become generic to the organisation rather than specific to a line of business. As we had a generic database that was replicated out to line-of-business specific servers this resulted in the table being moved to the company master DB but due to the complexities of the application code we could not eliminate the original from the line-of-business specific DB. The solution was to create a view that would reference the company master DB but retain the same name as the original table. So now we have a view called tbl_city!

    We had a similar situation where a DB was refactored so views became tables. So now we have tables with vw_ as a prefix.

    Yes, its a pain to have to hunt for SQL objects when you don’t know the type of object it is but you can get around this in SSMS.

    1. From the “Tools” menu selection “Options”
    2. Under the “Environment” category choose “Keyboard”.
    3. Choose a suitable shortcut and enter “SELECT * FROM INFORMATION_SCHEMA.TABLES ORDER BY TABLE_SCHEMA,TABLE_NAME”

    When there is a new query window you simply use your shortcut et voila. Job Done.

  • Rick

    Ignorant sheep
    There is absolutely nothing wrong with prefixing a view with vw. The organizational pros outweigh any possible con. When someone bashes on that notation, I realize they are an ignorant sheep.

    While SSMS is getting better organized, it still has ways to go.

  • Chai

    Prefixes
    Well-applied prefixes are very helpful for staying organized and efficient. Your analogy to English language is spurious, as code is not spoken language. (Although Esperanto does identify nouns by ending them with the letter “o”. Yes, it’s a dorky synthetic language, but more than a few people thought it was a good idea.) Context may identify nouns in English, but context is not always available to identify a view as such.

  • Tony Rogerson

    the problem with prefixes
    In pure theory a table has specific characteristics as defined by Codds rules; SQL allows you to break those. In an idea world we should not use nor need to use prefixes because the object name (table or table expression) should reflect what is being modelled.

    HOWEVER!

    In the real world it is extremely handy to visually see the type of object, for instance vw_, tvf_, usp_, fn_ etc. to aid in coding, debugging and performance optimising code.

    Remember – Celko does not code! Hence, his complete bias on this issue.

    T

  • pdanes

    Hungarian Notation
    It’s always possible to make a mess, no matter what system or convention is in use. I don’t know about the sp_, although I have read about it. I use just sp, not sp_.

    I have a distinct advantage over some, in that I am a one-man show. I do everything here, from OS and SQL Server installation and configuration to squabbling with users over the background color on a particular form. I can name tables, views, procedures and such anything I want and nobody else even knows or cares. It’s handy, that I don’t have to consult with anybody, but the downside is that I can’t consult with anybody when I get stuck. I rely on newsgroups, forums and personal experimentation for pretty much everything.

    The tip about the keyboard shortcut is handy, thanks, I have already thought of several other uses for it. However, in this case, it still involves several extra steps to locate what I need, so I’ll stick with my naming convention, unless someone shows me a VERY convincing reason why it’s a bad idea.

  • Robert young

    Rose is a rose, is a rose
    — A view is NOT a table – it can contain JOINs, filters, transformation logic and a dozen other things that a pure table does not.

    And, Codd’s by definition (and SQL), the result of a relational operation is, by definition, a table. Not a base table, but a derived table. This fact is what gives the RM much of its power. So far as Codd, the RM, and even SQL is concerned, ‘table’ isn’t the implementation of stored data; implementors are free to store any way they like. We have column-store databases as a result. The Codd, the RM, and much of SQL as implemented, do treat data correctly. The sequential [un|de]normalized treatment of data is what drives people to bifurcate. Joe’s point, speaking for him to a degree, is that if we’re to get the most out of SQL RDBMSs, we need to exploit their strengths, rather than their weaknesses.

    Strength is normalized data. Weakness is emulating sequential batch operations. VIEWs aid the former, and are irrelevant to the latter.

  • Preef Ickes

    The Real Question is…
    What do Volkswagen coders call ISO registry number reciters?

  • Rowland

    Why not prefix with crp_?
    Similarly, I used to wonder why some people name views with vw. One day I had to track down the causes of bad performance in a query.

    After some poking about I finally discovered the developer chose to join to a really nasty view –which was named just like all the tables in that database and contained the same quality code as the issue I was troubleshooting. Knowing it was a big, honkin’ ugly view would have shortened my quest by at least 15 minutes. This is where a prefix would really shine!

    So I have a suggestion: I’d like people to identify their really crappy, iffy, might-not-run code with crp_. That could be a real time saver! Views, Procedures and Functions all of it! Just think of looking down a list of hundereds of procedures when all the crp_ stuff aligns neatly!

    Your cooperation is truly appreciated!

  • Eric Russell

    Views that references no base tables …
    “..[the nesting of VIEWs must eventually resolve to underlying base tables]. This only makes sense; if no base tables were involved, what would you be VIEWing?..”
    “..A point that is often missed, even by experienced SQL programmers, is that a VIEW can be built on other VIEWs. The only restrictions are that circular references within the query expressions of the VIEWs are illegal and that [a VIEW must ultimately be built on base tables]..”

    Perhaps I’m missing the point, but it actually is possible to create a view that references no base table. For example:

    create view foo_letters
    as
    select ‘A’ as letter union all
    select ‘B’ as letter union all
    select ‘C’ as letter;
    go

    create view foo_numbers
    as
    select 1 as number union all
    select 2 as number union all
    select 3 as number;
    go

    create view foo_crossjoin
    as
    select letter, number
    from foo_letters
    cross join foo_numbers
    go

  • Syd Operahouse

    vw Coders…
    I am one of those who you call ‘Volkswagen coders’. Don’t be so insulting and rude.

    What you call ‘ISO-11179 rules’ are not rules, they are standards and as such they are for the observance of fools and the guidance of wise men. It seems to me that you have called yourself foolish and all of us wise!

    The original Hungarian Naming convention on which the ‘VW Coders’ base their naming conventions has a significance, it helps in so many ways; classification, optimisation and the general readability of code. The ‘vw’ prefix also tells me where to look. If we follow your ‘rules’ then all tables and views should be named with random alpha strings, after all according to you it doesn’t matter what they are called.

    Secondly, you ought to check what you say, a view is not a table as it can contain multiple joins and filters as well as a host of other stuff.

  • Tony Rogerson

    Result of a relational operation a table?
    It’s only a table if the tuples and attributes are unique; the result of a view does not guarentee that.

    In RM (Codd) a table needs to have unique tuples and attributes – there is no such restriction in SQL.

    This is valid syntax in SQL…

    SELECT x, x, x, x
    FROM y

    That is not valid in the RM because the attributes are duplicated.

    Don’t even go down the route, SQL very very badly implements the RM, we are better off using Tutorial D.

  • timothyawiseman@gmail.com

    Sheep? And Indexes
    @Rick I frequently prefix views with vw and I too find that the pros outweigh the cons. However that does not make those that disagree and are against the use of such prefixes “ignorant sheep”. This is especially true when they provide a valid reason (violates ISO standards), and especially when that person happens to be a recognized expert like Joe Celko.

    On a separate note, I thought this was an excellent article. While I appreciate that this is labeled as a “Basics” article, I would have appreciated slightly more discussion on the role of indexed/materialized views.

  • Anonymous

    self-reference
    It is certainly POSSIBLE in SQL Server to create a self-referencing view, but it is never possible to actually use it.

  • Anonymous

    WITH CHECK
    Instead of WITH CHECK, a view CAN be used to protect the base table, by writing it to return multiple rows for some illegal condition, then adding a unique index to the view. An insert to the base table which would create an illegal row, would attempt to insert multiple rows into the unique index, and thus will be prevented.

  • John Marsing

    VW Programmer
    I guess I’m a “VW” or “vw_” programmer, that’s what I like.

  • Peter Adam

    Hungarian notation
    Hungarian notation originally was designed for differentiating between same datatypes with different meaning – like between tables and derived tables (views).

    Hungarian notation was hijacked to be meaningless sign of the base datatype – see ISO standard 11179 Data element name like this:

    WorkOrder_Number
    Requirements_Text
    Requesting_Employee_Number
    Approving_Employee_Number

    here: http://en.wikipedia.org/wiki/Data_element_name

  • Robert young

    Are We There Yet?
    The argument for prefixes (Hungarian, generally) is that views, in particular, can be performance hogs and developers should be forewarned. On spinning rust, with a flatfile schema, that could be. But as we move rapidly, one hopes, toward multi-core/SSD machines for our normalized databases, the distinction at a performance level diminishes. Codd intent was that all operations yielded relations (more, or less, tables in SQL), and, again I’ll put words in Joe’s mouth, we should use a syntax which treats all such as equal citizens. It doesn’t matter, from the point of view of logic, whether the thing in question is a base table or a view. And that’s a good thing.

    Now, if only the engine vendors would uphold Codd’s Rule 6: All views that are theoretically updatable must be updatable by the system.

  • Ignorant sheep

    Fine, as long as They don’t represent object names
    /**
    I prefer to use VW_ because I want differentiate views from tables, so when I can expect where to and save a trip in my organizer. For my nesting view, I may want to name my view vw_vw_ to show it actually is a nesting view so now I can expect where to look further. Sometimes, I may end up with a view name like this VW_VW_VW_ArticleCommentSummary.
    **/

    SELECT Col_Person
    , Col_Topic
    , Col_Article_Name
    , Col_Comment
    FROM tblComment
    WHERE Col_Person = ‘Ignorant Sheep’
    AND Col_Topic = ‘prefix to database objects are so awesome’
    AND Col_Comment = ‘In the real world it is extremely handy to visually see the type of object, for instance vw_, tvf_, usp_, fn_ etc. to aid in coding, debugging and performance optimising code.’

    /** REALLY? Let me show you a technique which an Ignorant sheep and most Data Modeler may know but you
    **/

    SELECT PersonName
    , TopicName
    , ArticleName
    , CommentDetail
    FROM Comment
    WHERE PersonName = ‘Another Ignorant Sheep’
    AND TopicName = ‘Please show someone who can’t a way to do it’
    AND CommentDetail = ‘Ahh! Using a view with a good Object/Attribute name, I can still perform my troubleshooting without a sweat while lowering my code maintenance nightmare. I now understand what relational modeling is about.’

    — Above view is derived from the base table below:
    SELECT sPRSN
    , sTOPC
    , lARTC_NAME
    , lCMMT
    FROM tCMMT
    WHERE sPRSN = ‘The Same Ignorant Sheep’
    AND sTOPC = ‘Change me, if you know me and have access in me.’
    AND lCMMT = ‘This technique is aka ”Separating physical from logical layer”. Programmers who love to name their objects with a lot more information should learn why coupling the object names with types are bad for future maintenance and enhancement.’

  • Celko

    Glad everyone is talking!!
    >> Bruno: which DBMS auto materializes views? I know SQL Server doesn’t. <<

    Oracle, DB2, Ingres; I would have to check on Teradata. The gimmick is that the shared views cannot be updated while they are accessed by the sessions, so you need to know the concurrency model, and other things that vendor specific. It is VERY useful for reporting; you put summary data in views, materialize them and start running reports.

    >> The WITH CHECK option is next to useless and doesn’t protect the underlying base table, it only works if the insert/update is done through the view itself, they cannot be relied upon to give good RI, constraints are a better choice [on the base table]. <<

    When we get the CREATE ASSERTION statement I will agree. But right now this is the only way to do some constraints. Anything that involves a self-reference or another table is not possible with only column and table constraints.

    >> Adding a view per table using the schema_binding option is good practice because it prevents modification of the table definition thus preventing anybody accidentally dropping the table or dropping columns on it. <<

    That was an old DB2 idiom. They do not do that any more for reasons that Craig Mullins has has from a few decades in practice. The schema fills with orphan views and synonyms, docemetn is a problem, etc.

    >> Views have really had their day in SQL Server, a far better table expression exists that gives the view behavior but also allows parametrization, something severely missing in Views – that is inline Table Valued Functions.<<

    Proprietary code written in a non-declarative programming idiom? No thanks.

  • Celko

    Part II
    ========
    >> I am one of those who you disparagingly call ‘Volkswagen coders’, and I resent the insult. I DO put vw in front of my view names, just like I put sp in front of stored procedures, fn in front of functions and so on. <<

    I just did this fight at SQL Rally in Dallas! Do you know where prefixing comes from? Most programmers do not. A few guess 1960’s BASIC ($ for strings, etc) if they are old. It is older than that. Early software did not have symbol tables in compilers and interpreters, so the names had to pass meta-data. FORTRAN I used the letters I to N for integers; everything else was floating point (the only two data types). Tap based operating systems identified hardware by prefixes; the use of “<alpha><colon>” is taken from that ancestry.

    For the last few decades, we have had better software and no need for embedded meta-data. Instead, we could become more abstract. The basis for the ISO-11179 and the NCITS L8 Metadata Standards can be summarized as “name a thing for what it is by its nature -–A qua A” .

    Do you prefix every column name with its data type? How about having context dependent COBOL style “id” or worse columns? Do you affix “computed” to the computed columns? How much of the symbol table do load into the data element names?

    A function qua function takes inputs and returns a result; therefore, it is named “verb>_<object>” with the subject being the module.

    You are conceptually not able to make higher level abstractions.

    >> A view is NOT a table – it can contain JOINs, filters, transformation logic and a dozen other things that a pure table does not. <<

    Therefore sin(π) is not number? Is it a date? A squid? In math and RDBMS, we have orthogonality. An expression which returns an element in a set (domain) is just another way of writing that element. Does it bother you that 0.25, ¼ and the division of on by four are the same?

    >> The prefix also tells me where in SSMS to look for the definition – without that I would have to scan both table and view definitions to find the source. <<

    Do not confuse good conceptual models with weaknesses in tools or how you use them. Read the next posting.

    >> In the old days prefixing stored procs with sp_ would cause SQL Server to look in the master database first before trying your local database. <<

    Oh, thanks! I forgot to include that problem in my last comment.

    >> One of the projects I’ve worked on was to refactor a database where certain tables had become generic to the organization rather than specific to a line of business… So now we have a view called tbl_city! <<

    LOL! “Table City” sounds like a big box specialty furniture store 🙂

    >> Yes, its a pain to have to hunt for SQL objects when you don’t know the type of object it is but you can get around this in SSMS .. <<

    Thank you.

    >> Well-applied prefixes are very helpful for staying organized and efficient. <<

    That is why we use “[<role>_]<attribute>_<property>” in the ISO model. This lets me say “boss_emp_id” and “subordinate_emp_id” in a query with self-joins.

    >> (Although Esperanto does identify nouns by ending them with the letter “o”. Yes, it’s a dorky synthetic language, but more than a few people thought it was a good idea.) <<

    The estimate is 5-6 million people are readers and/speakers. The National Esperanto conference is in Dallas this year (http://www.esperanto-usa.org/node/2317 ) and I want to take a week to go to a class. We are way cooler than Klingon!

    >> Context may identify nouns in English, but context is not always available to identify a view as such. <<

    Actually, markers identify nouns in English; “the floobsnacker”, “a floobsnacker”, “some floobsnacker”, “<adjective> floobsnacker”,
    Back on topic: I should not care until I need to get down to that level.

    >> In pure theory .. HOWEVER! In the real world it is extremely handy to visually see the type of object, for instance vw_, tvf_, usp_, fn_ etc. to aid in coding, debugging and performance optimizing code. Remember – Celko does not code! Hence, his complete bias on this issue. <<

    Actually, I code a lot; every day for the last 35 years as exercise; this is my morning crossword puzzle. I used to write a lot of lines of code but now I do MUCH more real work with less physical text in my programs. My goal is to get everything into one statement so the optimizer can do magic.

    “You know you’ve achieved perfection in design, not when you have nothing more to add, but when you have nothing more to take away.” — Antoine-Marie-Roger de Saint-Exupery (1900 – 1944).

    =======

    >> It’s always possible to make a mess, no matter what system or convention is in use. I don’t know about the sp_, although I have read about it. I use just sp, not sp_. <<

    NO! the “sp_” was a primitive one pass compiler flag but that does make “sp” better! Tell me what this module of code does by its very nature ?

    >> I have a distinct advantage over some, in that I am a one-man show. I do everything here .. I’ll stick with my naming convention, unless someone shows me a VERY convincing reason why it’s a bad idea. <<

    There is an old cartoon of a widow with children at an open grave in the rain with a figure under an umbrella at the headstone ask “I know this is a bad time, but did he say anything about source before he died?”

    Be professional.

    >> So I have a suggestion: I’d like people to identify their really crappy, iffy, might-not-run code with crp_. That could be a real time saver! Views, Procedures and Functions all of it! Just think of looking down a list of hundreds of procedures when all the crp_ stuff aligns neatly! <<

    YES!~YES!~YES! ”

    >> Perhaps I’m missing the point, but it actually is possible to create a view that references no base table. <<

    You get points for this one; I use the table constant construction a lot and did not mention. Here is skeleton without dial

    CREATE VIEW Foobar (..)
    AS
    SELECT X.*
    FROM (VALUES (..), .. , (..)) AS X;

    >> I am one of those who you call ‘Volkswagen coders’. Don’t be so insulting and rude. <<

    Sorry but that is the slang inside the SQL/RDBMS community.

    >> The original Hungarian Naming convention on which the ‘VW Coders’ base their naming conventions has a significance, it helps in so many ways; classification, optimization and the general readability of code <<

    ..].In languages with WEAK data typing. This language Model is out of favor because it is a nightmare to implement and worse to compute. SQL is a strong typed language. “one fact, one place, one time, ONE WAY”

  • AlexK

    Please clarify
    Hi Joe,

    Can you clarify the following: “Unlike base tables, VIEWs are either updatable or read-only, but not both”. How can a base table be both updatable and read-only?

  • AlexK

    Please clarify
    Hi Joe,

    Can you clarify the following: “Unlike base tables, VIEWs are either updatable or read-only, but not both”. How can a base table be both updatable and read-only?

  • Robert young

    Apple of My Eye
    — My goal is to get everything into one statement so the optimizer can do magic.

    Sounds like an APL refugee!

  • Anonymous

    volkswagen
    Thanks for the input but I’ll continue to prefix views in my environment with “vw” because that is what it is. I don’t get your point at all. I don’t want my devopers thinking that a “view” composed of multiple tables and subqueries can act the same as a single table.

  • Paul

    Naming
    As a code developer & part time DBA I totally agree with Joe. I HATE names that start with str or int or dtm. Tell me what it is, NOT how it is stored. If you need to make a copy (like I often need to make string copies of dates, for an API) postfix the type:
    DateEntered
    DateEnteredString = DateEntered.ToString(“dd-MMM-yyyy”)
    That way it sorts correctly if you look at object lists.

  • archie

    re:vw_ and n-

    >>This is as silly as prefixing every noun in a novel with “n_” so the reader will know that the word is a noun in English grammar.

    I wouldn’t use the English language if I wanted a good example of a consistent syntax that follows standards 🙂

    If we prefixed nouns with n- and verbs with v- and so on , we’d have a very quick solution to the old classic artificial intelligence problem of how to get a machine – or a non-native English speaker who doesn’t have the required background knowledge – to tell the difference between:

    “time flies like an arrow”
    and
    “fruit flies like a banana”

    [n-time][v-flies] like [n-“an arrow”]
    [n-“fruit flies”] [v-like] [n-“a banana”]

    In this case, the extra ( and “non-standard” ) text gives some help to the intended reader, while native speakers can ignore it.

    Similarly, if we name our views “vw_”, we are giving some extra information to someone looking at the code who is not familiar with the database objects : they know immediately that this is a view and therefore may be a complex object, drawing data from multiple tables, non-updateable, etc. And of course the database engine ( the “native speaker” doesn’t care what the View is called.)

  • Paul

    Naming
    As a code developer & part time DBA I totally agree with Joe. I HATE names that start with str or int or dtm. Tell me what it is, NOT how it is stored. If you need to make a copy (like I often need to make string copies of dates, for an API) postfix the type:
    DateEntered
    DateEnteredString = DateEntered.ToString(“dd-MMM-yyyy”)
    That way it sorts correctly if you look at object lists.

  • Anonymous

    VW
    Volkswagen make solid, reliable cars.

    I expect the same could be said of their databases.

  • Paul

    Naming
    As a code developer & part time DBA I totally agree with Joe. I HATE names that start with str or int or dtm. Tell me what it is, NOT how it is stored. If you need to make a copy (like I often need to make string copies of dates, for an API) postfix the type:
    DateEntered
    DateEnteredString = DateEntered.ToString(“dd-MMM-yyyy”)
    That way it sorts correctly if you look at object lists.

  • Anonymous

    Naming
    @Paul
    “Tell me what it is..”

    Surely that it IS a string or a number is non-trivial part of what it IS. In fact to not tell you that is to not tell you all that it IS.

    It undoubtedly makes it a lot easier to find bugs if the database developer took the time to be considerate and put a couple of little letters in each name to tell you what it is.

    For example, spot the mistake (apart from it not being parameterized):

    INSERT INTO Person_Staff (NameFirst, NameLast, Birth, EnteredBy) VALUES (‘Anon’,NULL,’1980-01-01′,’Me’)

    Now spot the mistakes:

    INSERT INTO spPerson_Staff (strNameFirst, strNameLast, dtBirth, intEnteredBy) VALUES (‘Anon’,NULL,’1980-01-01′,’Me’)

    The latter is blatantly superior.

    By all means put the object type somewhere else in the name (a suffix might well be preferable). But to leave it out completely is just laziness that leads to easy to make and hard to spot bugs.

    Why would anyone “HATE” (in CAPS no less) names with prefixes? Such venom suggests to me it’s irrational.

    And Paul, have you not noticed the code you posted is prefixed with the data type:

    DateEntered

    And a four character prefix no less. Surely your argument would be the field to be called

    Entered

  • dba-one

    Prefix
    Another who adds a prefix to views, procedure, functions and nearly all objects other than tables. For views I use UV_ but I see the point in not using view_ as ‘view’ is a reserved word.

  • Phil Factor

    Volkswagen coder
    Shouldn’t it be a ‘Skoda Coda’?

  • rdbuk

    Prefix
    @ Phil Factor

    ….Brilliant 😀

  • rdbuk

    SQL VIEW Basics
    @ Joe

    …Brilliant, thanks 😀

  • ToddO

    Affixing database object names by class
    I like this article but really enjoy the spirited debate afterwards. This time I think Mr. Celko has it right.

    In my opinion, where you fall on in any debate depends on how you look at things. Specifically for the "affixes in database object names" debate, I think it depends on how you answer this question:
    "Is the SQL language primarily a tool for information management (IM) or application development (AD)?" If you answered AD, then I suspect you like the affixes because of familiarity and convenience. If you think IM is a more proper answer, then you probably dislike affixes like "vw_" because they obfuscate and lengthen names without helping manage information better.

    To me, IM wins over AD because the former is an ongoing function while the latter mostly occurs once per project. As a consulting IM professional, I am often asked to set naming conventions for my clients. I advocate no affixes, but strict adherence to terminal class words. It is all to better empower the consumers of the data.

  • firman

    make roles access database
    can u help me about user roles database, I want to make some user just view my table database which the column have set only isnull.and the user can select,update the table.

    my database is SQL server enterprise 2000

    thanks, I hope u can help soon..

  • Holmes

    Volkswagen coder
    Little late to the discussion, but I hope I get an answer…

    "Do not confuse good conceptual models with weaknesses in tools or how you use them."

    OK. So maybe I’m weak in my tool use. But using object prefixes has, I believe, helped me to be better organized saved me time and, therefore, money in searching for the objects.

    You can turn me to your way of thinking, Joe, if you explain to me what my violation of ISO-11179 rules is costing me.

    Thanks.
    Jim

  • Red

    great ! but !
    I m not agree with you about "vw" prfix because it helps to undestand easy while writting complex stored procedures or complex codes in sql server. However standarts are everything in sql server 🙂

  • SwapnilKothari

    SQL Views and Tables real time
    Hi Joe/All,

    I have come across a scenario where we have a table which stores around 1M records and there are 3-4 views which are created using multiple tables but all use this base table with 1M records and other tables joined for different purpose.

    The common table with 1M records is a real time and inserts happen to it directly from the web application.
    On the otherhand, we are would like to materialize the views created but we are stuck up with the dependency on the table being real time. If we materialize the views into tables then they won’t be refreshed real time along with the base table. If we don’t materialize them then we are facing performance issues as each view results into 2-3M records.

    Any suggestion are welcomed!

    Cheers
    Swapnil