|
|
Do Alpacas Dream of farming DBAs?
Last post 09-10-2007, 5:13 AM by Anonymous. 36 replies.
-
08-22-2007, 9:13 PM |
-
08-22-2007, 11:14 PM |
-
08-22-2007, 11:54 PM |
-
08-23-2007, 12:15 AM |
-
David
-
-
-
Joined on 10-20-2006
-
-
-
-
|
Think "out of the box", literally. Get one 500GB External Hard Drive, SATA if possible. Very affordable. Plug it into the firewire port, or USB 2.0. Space problem? Nada. Move Tempdb. Move data, create additional data files, do what every you want. Truncate, delete, BCP. Another idea, get a 4GB or the biggest memory stick you can get, get 2 of these. Plug them in. Now, question is, can you create additional data files for Tempdb on these memory sticks? Will be very fast!! I've just tested this on my laptop and it works! :-). 4 USB ports x 4GB = 16GB. I've just read that there are up to 32GB memory sticks. Wow. So think out of the box people. Cheers.
|
|
-
08-23-2007, 1:57 AM |
-
GilaMonster
-
-
-
Joined on 07-16-2007
-
Deepest, darkest Africa
-
-
-
|
Having recently been through an exercise of deleting a large number of unwanted rows from a very large table, I can feel the pain.
Here's what I would do.
Attach an external drive/USB drive to the server, or map a network drive if that isn't possible.
BCP out the rows to the external/network that are still wanted, as identifed by the following query
SELECT * FROM tblOrphans WHERE FieldX IN ( SELECT FieldX FROM tblParent1 UNION ALL SELECT FieldX FROM tblParent2)
Then truncate the table tblOrphans.
That's got rid of the unwanted rows, but the db is still far too big.
Run a shrink on the DB to bring it down to a reasonable size, allowing space for the rows that were exported to be replaced, and for some work space for things like reindexing.
Rebuild all the indexes that were scrambled by the shrink
BCP in the rows that were exported in step 1.
add in the foriegn keys that were 'missed'
Go and have a beer or two.
Oh, while you're at it, change that second nonclustered index on tblOrphans to a non-unique index on fieldY and put X as an included column.
David, I don't know where you work currently, but I know of a DBA opportunity in JHB if you're interested. Send me a PM if you want to know more
|
|
-
08-23-2007, 7:30 AM |
-
08-23-2007, 7:51 AM |
-
08-23-2007, 9:02 AM |
|
|
Proposed new delete statement
-- remove useless index DROP INDEX dbo.tblOrphans.idx2;
-- proposed new delete DELETE FROM dbo.tblOrphans FROM dbo.tblOrphans a LEFT JOIN dbo.tblParent1 b ON (a.FieldX = b.FieldX) LEFT JOIN dbo.tblParent2 c ON (a.FieldX = c.FieldX) WHERE b.FieldX IS NULL OR c.FieldX IS NULL
|
|
-
08-23-2007, 9:04 AM |
-
Tore
-
-
-
Joined on 08-23-2007
-
Columbia, SC
-
-
-
|
Don't forget the parents...
Many good suggestions here. We could definitely drop at least idx2 on tblOrphans. That should help create some breathing room. We could also use the 50 Gb on C: for a new file or database to hold some temporary results.
Below, SPACEMADEAVAILABLE can refer to either freed up space on PRIMARY (or other) or new file on C:. Consider it a virtual FileGroup.
Then I would do the following (remember, there is no relevant index on the parents):
CREATE TABLE newOrphans (FieldX INT NOT NULL, FieldY VARCHAR(750) NULL) ON SPACEMADEAVAILABLE
CREATE UNIQUE NONCLUSTERED INDEX idx1 ON dbo.tblOrphans (FieldX ASC, FieldY ASC) WITH (PAD_INDEX = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF) ON PRIMARY [or on SPACEMADEAVAILABLE]
INSERT INTO newOrphans (FieldX) SELECT FieldX FROM tblParent1
INSERT INTO newOrphans (FieldX) SELECT FieldX FROM tblParent2
NOTE: If there is overlap between tblParent1.FieldX and tblParent2.FieldX, do a single insert with a UNION on the select instead.
Then we need to get the FieldY values. We are still dealing with 14 Mill rows, although small, so BCP comes in handy. BCP export to a file on C: using the following SELECT:
SELECT New.FieldX, Old.FieldY FROM newOrphans as New INNER JOIN tblOrphans as Old ON Old.FieldX = New.FieldX
Finally, truncate tblOrphans, shrink database, and BCP import from the file produced. And rebuild the indexes on tblOrphans.
Do your cleanup: Drop newOrphans, add back the second index on tblOrphans (if needed), and get rid of the temporary SPACEMADEAVAILABLE unless it is part of PRIMARY.
Also - this may be a good time to add some referential integrity... :->
|
|
-
08-23-2007, 9:09 AM |
-
08-24-2007, 9:15 AM |
-
CathyP
-
-
-
Joined on 08-24-2007
-
-
-
-
|
Hmmmm - how about........
The table is called dbo.tblOrphans.
I'd investigate whether I could just truncate it on the basis that a table of orphaned records are just that......
|
|
-
08-27-2007, 10:21 AM |
-
Jonas
-
-
-
Joined on 11-28-2006
-
-
-
-
|
Vow.. child table referencing two parents on the same key:) forget standard referential integrity.. if you enforce it ugrrh… all the possible errors.. forget cascading deletes. I wonder what was a logic behind this kind of design: was it a. both parents have same key but share different attribute(column) set or b. parent1 doesn’t have to have matching key in parent1 and the rest of columns differ on some predefined business rules or c. there is no logic. Anyway, it is amazing, that somebody can come up with this stuff. In this situation, in my opinion, there is nothing fancy you can do about, just regular dba routines. I would do the following, most of it is already mentioned above, but this is my routine in these kind of situations (hardware solutions out of question as I understand):
1.Backup the db, enable Simple recovery
2.Drop both indexes (if heavily fragmented, if not drop after step 3)
3.Export ‘active/wanted’ data to the file on C drive using bcp variable length. 14 mil records or about 3.4GB yeah that will take few hours
4.Truncate the table tblOrphans
5.For this step I would need more details.
a.Db could be shrunk to release the space but running it on 270Gb size could take a very long time, particularly if we need to release over 100GB. We might be able to speed up if indices are heavily defragged and could be dropped and recreated after shrink is done.
b.It was mentioned by vendor that db should be about 50GB size. How much truth is in it? Can we repeat same purge procedures for other tables and get actual db size under 100GB? If yes maybe we could release enough space on C from log files and make actual export of data to new db on C (w/o indices), once done script all sql objects and drop original db, detach new db and copy to D drive, attach and run indices along with other appropriate scripts e.g. permissions, etc In my experience solution B is usually faster for dbs under 100GB vs. shrinking from at least twice the size needed, but it requires a lot of attention.
6.Import the data back from the file using either BULK INSERT
7.Build same indexes dropped in step2 (they are not redundant as some suggested)
8.Enable required recovery model and issue Full backup
9.As for the future, there are few options either setup delete triggers (as suggested by David B.) either schedule a job the either runs appropriate delete statements or better executes procedure that has the same statements. Note: if this is SQL 2005 server, I would also disable indexes before the deletes and enable back afterwards to speed up things. Also, if possible would turn of logging before deletes OR if above takes way too long, would look into partitioning the table
|
|
-
09-07-2007, 12:59 AM |
|
|
How do you eat an elephant? One bite at a time :-)
Had an issue with defragmenting / reindexing a 120GB database on a production server, since the normal maintenance plan tries to be really clever and do everything in parallel in one BIG transaction, with backup policies backing up on the hour every hour between 6am & 6pm (meaning the log was not cleared until 6am) :-(
In concept we set a small table to track which table had been reindexed when, then set up a nightly job to reindex tables starting with the oldest and continue for a set period of time.
I'd approach this in a similar manner - decide on an acceptable batch size (i.e. # of rows - base it on avg bytes per row + fudge factor, figure out size in MB and check your available disk space) to process each night, note the start time, then SET ROWCOUNT X and DELETE in a loop until allotted time period has passed. Set this up in a nightly job and let it go...may take a few days to come back down, but it will. You can issue checkpoints / truncate log / set recovery to simple if you're allowed to do that.
BTW if you planned on 50GB and 270GB is too much, I'd say you had a problem with disk space to begin with...Windows likes 20% free disk space...unless you install your DB on a raw partition.
Hope this helps someone.
|
|
-
09-07-2007, 7:02 AM |
|
|
Simple. Bitch-slap the vendor, shutdown the app and go on with applications that really matter.
|
|
-
09-07-2007, 8:35 AM |
|
|
Pictures are of llamas, not alpacas
This is only one problem with your illustrations in the article, the pictures are of llamas, not alpacas. Alpacas have more wool on their forehead than llamas. Please check this link to see the differences amongst Camelidea family of fauna. http://en.wikipedia.org/wiki/Camelid.
Llamas are of the Genus species Lama glama, while alpacas are from the Genus species Vicugna pacos. There is a difference.
|
|
Page 2 of 3 (37 items)
2
|
|