|
|
Do Alpacas Dream of farming DBAs?
Last post 09-10-2007, 5:13 AM by Anonymous. 36 replies.
-
08-10-2007, 8:53 AM |
|
|
Do Alpacas Dream of farming DBAs?
|
-
08-16-2007, 10:40 AM |
-
imassi
-
-
-
Joined on 05-11-2007
-
-
-
-
|
TRUNCATE TABLE dbo.tblOrphans
Then blame it on the Alpaca that you let loose in the server room. You could always blame the script provided by the vendor. The business obviously isn't impressed with the application if they are replacing it, so you could probably get away with it. The data will be gone when that happens anyway. I wouldn't do something like this with important data, but if it's so unimportant it's being truncated after a month as it is.
I suppose the solution that would fulfill the requirements would involve partioning the table, moving the partition onto its own file and moving the file onto the drive with the free space. But that seems like more work than the vendor put into their solution, so the hell with them.
|
|
-
08-16-2007, 5:31 PM |
-
KatanaCS
-
-
-
Joined on 10-17-2006
-
San Diego, California
-
-
-
|
1st question - Version of SQL Server & DB
What version of SQL Server is this on? And what is the compatibility mode for the specified database?
|
|
-
08-16-2007, 8:15 PM |
-
08-17-2007, 2:28 AM |
-
WBrewer
-
-
-
Joined on 06-30-2006
-
London UK
-
-
-
|
If Rodney wins a goodie bag, I want it if he doesn't. The pens really work.
My favourite solution to this would be a quiet one. I'd put the recovery mode for the database into 'simple', and shrink the log. I'd get enough working space by moving stuff around. Then I would run a routine on the agent that took a chunk out of the two tables every minute. (How much of a chunk would have to be established by experiment) I reckon that there are sufficient indexes on the table to allow this. This chunk would have to be more than the rate of insertions per minute. (I've popped in the figure 2000 as a guess as to what would work without causing problems)
DELETE FROM tblOrphans WHERE fieldx IN ( SELECT TOP 2000 fieldx FROM tblOrphans LEFT OUTER JOIN tblParent1 ON tblOrphans.FieldX = tblParent1.FieldX LEFT OUTER JOIN tblParent2 ON tblOrphans.FieldX = tblParent2.FieldX WHERE tblParent2.fieldx IS NULL AND tblParent1.fieldx IS NULL )
The average DBA will want to rush in, having changed in a telephone booth, and do something totally radical to gasps of admiration from the onlookers, but when I came across a similar problem I figured out that, whereas an increasing table size meant misery, a decreasing table size meant happiess.
Alternatively, one could nip down to the local computer store, grab a nice big hard disk and copy the existing one onto it using one of these clever drive copying applications. (I won't mention names as I'm damned if I'll give them a plug)
A slightly more radical solution would be to stop the database, copy the mdf off onto a nice big server with plenty of room to breathe. Restart it. Attach the file on the new server and work your wicked way on it without fear of locking, logfile growth etc etc. Once you have the database short-back-and-sides and fully tested, then return it to the errant server.
|
|
-
08-17-2007, 9:34 AM |
-
Adam Machanic
-
-
-
Joined on 06-23-2006
-
Boston, MA
-
-
-
|
Another possible solution...
Throw the database in single user mode, to turn off the app (middle of the night, or whatever) Create a new filegroup with a file on the C: drive--plenty of room there Create a new table with the same columns as tblOrphans, on the new filegroup Insert all of the qualifying rows: INSERT NewTable SELECT * FROM tblOrphans WHERE FieldX IN (SELECT FieldX FROM tblParent1) AND FieldX IN (SELECT FieldX FROM tblParent2) Drop tblOrphans. Move the data back to the primary filegroup and the table tblOrphans, using SELECT INTO. Put the database back into multiuser mode and move on with life.
Looking for an advanced SQL Server 2005 book? Expert SQL Server 2005 Development
|
|
-
08-17-2007, 10:02 AM |
-
Adam Machanic
-
-
-
Joined on 06-23-2006
-
Boston, MA
-
-
-
|
I screwed up the "qualifying rows" query in the last post. Try this instead: INSERT NewTable SELECT * FROM tblOrphans WHERE FieldX IN (SELECT FieldX FROM tblParent1) OR FieldX IN (SELECT FieldX FROM tblParent2)
Looking for an advanced SQL Server 2005 book? Expert SQL Server 2005 Development
|
|
-
08-17-2007, 1:31 PM |
-
david.buckingham
-
-
-
Joined on 06-22-2007
-
Nashville TN
-
-
-
|
-- Temporarily drop this space-mongering index
DROP INDEX dbo.tblOrphans.idx1
GO
-- Temporarily drop this space-mongering index
DROP INDEX dbo.tblOrphans.idx2
GO
-- Space savings from dropped indexes allow us to
-- save off the "wanted" records into a new table
SELECT o.FieldX, o.FieldY
INTO dbo.tblOrphans_new
FROM dbo.tblOrphans o
INNER JOIN ( SELECT FieldX
FROM dbo.tblParent1
UNION
SELECT FieldX
FROM dbo.tblParent2
) t
ON t.FieldX = o.FieldX
GO
-- Kill the old bloated table
DROP TABLE dbo.tblOrphans
GO
-- Rename the new to the old
EXEC sp_rename 'dbo.tblOrphans_new', 'tblOrphans', 'OBJECT'
GO
-- Create index on the parent to ease the joins below
CREATE NONCLUSTERED INDEX IX_tblParent1_FieldX
ON dbo.tblParent1 ( FieldX )
WITH ( PAD_INDEX = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF,
IGNORE_DUP_KEY = OFF, ONLINE = OFF ) ON [PRIMARY]
GO
-- Create index on the parent to ease the joins below
CREATE NONCLUSTERED INDEX IX_tblParent2_FieldX
ON dbo.tblParent2 ( FieldX )
WITH ( PAD_INDEX = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF,
IGNORE_DUP_KEY = OFF, ONLINE = OFF ) ON [PRIMARY]
GO
SET QUOTED_IDENTIFIER ON
SET ANSI_NULLS ON
GO
-- Cleanup future orphans as appropriate
CREATE TRIGGER dbo.TR_AD_tblParent1 ON dbo.tblParent1
AFTER DELETE
AS
DELETE o
FROM dbo.tblOrphans o
INNER JOIN ( SELECT d.FieldX
FROM deleted d LEFT JOIN dbo.tblParent2 t
ON t.FieldX = d.FieldX
WHERE t.FieldX IS NULL
) l
ON l.FieldX = o.FieldX
GO
-- Cleanup future orphans as appropriate
CREATE TRIGGER dbo.TR_AD_tblParent2 ON dbo.tblParent2
AFTER DELETE
AS
DELETE o
FROM dbo.tblOrphans o
INNER JOIN ( SELECT d.FieldX
FROM deleted d
LEFT JOIN dbo.tblParent1 t
ON t.FieldX = d.FieldX
WHERE t.FieldX IS NULL
) l
ON l.FieldX = o.FieldX
GO
-- Replace dropped index now that space is not an issue
CREATE UNIQUE NONCLUSTERED INDEX idx1
ON dbo.tblOrphans ( FieldX ASC, FieldY ASC )
WITH ( PAD_INDEX = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF,
IGNORE_DUP_KEY = OFF, ONLINE = OFF ) ON [PRIMARY]
GO
-- Replace dropped index now that space is not an issue
CREATE UNIQUE NONCLUSTERED INDEX idx2
ON dbo.tblOrphans ( FieldY ASC, FieldX ASC )
WITH ( PAD_INDEX = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF,
IGNORE_DUP_KEY = OFF, ONLINE = OFF ) ON [PRIMARY]
GO
|
|
-
-
08-19-2007, 2:49 PM |
-
08-19-2007, 5:45 PM |
-
FrankAu
-
-
-
Joined on 08-19-2007
-
Sydney
-
-
-
|
This seems likely to moi ...
I actually prepared this on ASE 12.5.3 hence the use of ROWCOUNT ... basically the number of rows smacked each iteration controls the impact on your log. I would look at using something like 10,000 although you would have to do some arithmetic to make sure the log demands can be met.
DECLARE @rc INT SET ROWCOUNT 1 SELECT @rc = 1 WHILE @rc > 0 BEGIN DELETE #tblOrphan FROM (#tblOrphan O LEFT OUTER JOIN #tblParent1 P1 ON O.FieldX = P1.FieldX ) LEFT OUTER JOIN #tblParent2 P2 ON O.FieldX = P2.FieldX WHERE COALESCE(P1.FieldY, P2.FieldY) IS NULL SELECT @rc = @@rowcount END SET ROWCOUNT 0 GO
This should be OK to run while the system remains online although if transactions were being processed I would prolly be quite conservative on my ROWCOUNT setting to minimise lock contentions etc!
Cheers Frank.
|
|
-
08-22-2007, 1:24 PM |
|
|
Read through all the solutions and many are forgetting that there is limited space and there is no mention if a new server is available. So based on all the solutions dleteing in small chunks should work but will require quite some effort and time. The other would be BCP out the data as suggested. Either way its a big problem as you have space crunch. Would like to see a solution that is way different and efficient then the regular DBA minds would thing of and do post the solution you will use to accomplish this issue. Good Luck!!!
|
|
-
08-22-2007, 2:45 PM |
|
|
Figuring on an average data size, the remaining 14 million rows would consume about 3.5GB. Dropping the indexes from the tables and adding them back later should free up plenty of space like David recommended. Moving truncating and replacing would be my best course recommendation and could be done in short order without the need to shuffle data around to different disks.
|
|
-
08-22-2007, 3:01 PM |
|
|
1) Backup database holding tblOrphan to C: drive. 2) Create temporary database on C: drive recovery model simple. 3) Script out tblOrphan table. 4) Create tblOrphan table in new database on C: drive. 5) Insert data to be saved into new tblOrphan table (anticipate 14mil records) at the rate of 100,000 per insert (or more if tempdb can handle it). 6) Truncate tblOrphan table. 7) Backup log file (for database with tblOrphan) to C: drive. 8) Shrink the database (with tblOrphan table) to remove all empty space (minus amount of space needed to put 14mil records back in) + 10% (or whatever your growth rate is). 9) Import tblOrphan data from tblOrphan table on C: drive (anticipate 14mil rows) 100,000 rows at a time (more if possible). 10) Delete temporary database on C: drive. 11) Create nightly job to run tblOrphan delete statement.
|
|
-
08-22-2007, 6:25 PM |
-
Lewis Moten
-
-
-
Joined on 10-06-2006
-
-
-
-
|
It looks like you need to get rid of at least 8 million records a day to permit 2 additional GB of data.
I'm concerned about you using an INT for your primary key and using millions of records. I would consider changing it to a big int or even a unique identifier.
For the future, I would recommend looking into creating a partitioned view and then redefine it when it's time to delete a table. Just add a column to partion on that is filled with a unique number each day (number of days since jan 1, 1970) so that after 30 days, you can drop them.
To let the database operate on cleaning itself up on its own, I reccomend creating a job to remove data in chunks to prevent minimal performance cost to the end-users.
Turn on auto-shrink
Do a Full Backup.
Create a job to run the following at least every 10 minutes:
SET ROWCOUNT 60000 Delete from tblOrphans where FieldX in( ( SELECT TOP 60000 T1.FieldX FROM tblOrphans T1 WITH(NOLOCK) left outer join tblParent1 T2 WITH(NOLOCK) on T1.FieldX = T2.FieldX left outer join tblParent2 T3 WITH(NOLOCK) on T2.FieldX is null and T1.FieldX = T3.FieldX where isnull(T2.FieldX, T3.FieldX) is null ))
add an additional step to the job to backup the transaction log for both the current and tempdb databases.
You can also add the following to help get your schema situated and orphaned data removed in the future:
ALTER TABLE dbo.tblOrphans ADD CONSTRAINT PK_tblOrphans PRIMARY KEY CLUSTERED ( FieldX ) WITH( STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY] go ALTER TABLE dbo.tblParent1 ADD CONSTRAINT PK_tblParent1 PRIMARY KEY CLUSTERED ( FieldX ) WITH( STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY] GO ALTER TABLE dbo.tblParent2 ADD CONSTRAINT PK_tblParent2 PRIMARY KEY CLUSTERED ( FieldX ) WITH( STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO ALTER TABLE dbo.tblParent1 WITH NOCHECK ADD CONSTRAINT FK_tblParent1_tblOrphans FOREIGN KEY ( FieldX ) REFERENCES dbo.tblOrphans ( FieldX ) ON UPDATE NO ACTION ON DELETE NO ACTION GO ALTER TABLE dbo.tblParent2 WITH NOCHECK ADD CONSTRAINT FK_tblParent2_tblOrphans FOREIGN KEY ( FieldX ) REFERENCES dbo.tblOrphans ( FieldX ) ON UPDATE NO ACTION ON DELETE NO ACTION GO CREATE TRIGGER dbo.tblParent1Delete ON dbo.tblParent1 AFTER DELETE AS BEGIN SET NOCOUNT ON; delete tblOrphans where exists (select top 1 0 from deleted with(nolock) where tblOrphans.FieldX = deleted.FieldX) and not exists(select top 1 0 from tblParent2 WITH(NOLOCK) where tblOrphans.FieldX = tblParent2.FieldX)
END GO CREATE TRIGGER dbo.tblParent2Delete ON dbo.tblParent2 AFTER DELETE AS BEGIN SET NOCOUNT ON; delete tblOrphans where exists (select top 1 0 from deleted with(nolock) where tblOrphans.FieldX = deleted.FieldX) and not exists(select top 1 0 from tblParent1 WITH(NOLOCK) where tblOrphans.FieldX = tblParent1.FieldX)
END go
|
|
Page 1 of 3 (37 items)
1
|
|