Monday, March 26, 2012

Many-To-Many Self-Join

I hope somebody can help me with a rather complicated problem I am having involving triggers and foreign keys.

I am working on a database that manages projects in terms of Tasks and the other Tasks (I call them "Pretasks") that must be completed before the Task in question can be completed. Any Task can be a Pretask for one or more other Tasks, and any Task can have any number of Pretasks, which calls for a many-to-many self-join on my tasks table, tblTask. I handle this with a linking table, tblTaskChain, which contains two fields, CurrentTask and Pretask, each of which contains a TaskID referencing a record in tblTask.

I hit my first snag trying to ensure referential integrity in this setup. I opened tblTaskChain in Enterprise Manager's Design Table screen and added a foreign key relationship between tblTaskChain.CurrentTask and tblTask.TaskID, checking the Cascade Delete checkbox to avoid the creation of orphan records when records are deleted from tblTask. No problem. However, when I attempted to do the same with tblTaskChain.Pretask, I got the following error message:

'tblTask' table saved successfully
'tblTaskChain' table
- Unable to create relationship 'FK_tblTaskChain_tblTask'.
ODBC error: [Microsoft][ODBC SQL Server Driver][SQL Server]Introducing FOREIGN KEY constraint 'FK_tblTaskChain_tblTask' on table 'tblTaskChain' may cause cycles or multiple cascade paths. Specify ON DELETE NO ACTION or ON UPDATE NO ACTION, or modify other FOREIGN KEY constraints.
[Microsoft][ODBC SQL Server Driver][SQL Server]Could not create constraint. See previous errors.

I found that I could create the FK relationship with cascading deletes on either one of the two fields in tblTaskChain, but not on both. I could do both only if I unchecked the Cascading Delete checkbox on one of them.

Upon consideration, I realized that I needed more than SQL Server's cascading deletes could deliver, anyhow. When a task is deleted, I need all its pretasks and all its pretasks' pretasks to be deleted as well, UNLESS the task to be deleted is also on the pretask chain to some other task. To this end, I deleted the two foreign key constraints and instead added the following triggers to tblTask and tblTaskChain.

CREATE TRIGGER utrTask_del ON [dbo].[tblTask]
FOR DELETE
AS

/**************************************************************************************
2/21/2006
Delete all records from tblTaskChain where CurrentTask or Pretask
equals to TaskID that was deleted

**************************************************************************************/

delete tblTaskChain
where CurrentTask in (
select TaskID
from deleted
)
or Pretask in (
select TaskID
from deleted
)

/*************************************************************************************/

CREATE TRIGGER utrTaskChain_del ON [dbo].[tblTaskChain]
FOR DELETE
AS

/**************************************************************************************
2/21/2006
Delete pretask from tblTask only if it is not a pretask for any additional task

**************************************************************************************/

-- Table variable to hold list of pretasks
declare @.OtherCurrent table(
PretaskID int
)

-- List deleted TaskID's that are also pretasks for other CurrentTasks
-- besides the one in the deleted record
insert @.OtherCurrent(PretaskID)
select distinct C.Pretask
from tblTaskChain C
inner join deleted d
on C.Pretask = d.Pretask
where C.CurrentTask <> d.CurrentTask

-- delete all Tasks that were among those deleted
-- but not among those that are pretasks for additional CurrentTasks
delete tblTask
where TaskID in (
select Pretask
from deleted
)
and TaskID not in (
select PretaskID
from @.OtherCurrent
)

/*************************************************************************************/

I tested this by attempting to delete a task from tblTask which I knew had exactly one pretask which, in turn, had no pretasks. The record was not deleted, and I received the following error message.

Server: Msg 217, Level 16, State 1, Procedure utrTaskChain_del, Line 28
Maximum stored procedure, function, trigger, or view nesting level exceeded (limit 32).

Similarly, when I attempted to delete a record from tblTaskChain, I got:

Server: Msg 217, Level 16, State 1, Procedure utrTask_del, Line 12
Maximum stored procedure, function, trigger, or view nesting level exceeded (limit 32).

And that's where things stand. I didn't mean to write a novel here. This has been a long message, but I hope I have made the problem clear. Someone must have faced something similar; can you help?

Many thanks.

Sheldon Penner

is it possible for you to change the database design? I would suggest the following structure

TaskID PreTaskID MasterTaskID

1 1 1

2 1 1

3 2 1

In this case you can identify all records where MasterTaskID is 1.

Does this help?

|||Alternatively, you could store pretasks in a different table and just join with it|||

Thank you for your suggestion, Brokenrulz, but no, it doesn't help.

If I understand you correctly, MasterTaskID would represent the "root" task of which all the others are branches. This might be useful if I wanted to delete an entire tree of tasks, but that is rarely the case.

In my database, each "root task" (i.e., each task that is not a pretask for any other task) may have any number of pretasks and each of those pretasks may also have any number of pretasks, etc. If any task in the tree is deleted, I need the pretask chain leading up to that task only to be deleted as well. Tasks that lead to the root by some other chain must not be deleted, nor tasks that are also part of a chain leading to some other root. I need to be able to cut any branch off the tree, leaving all other branches and the root intact.

|||Some sample data in table form would help the understanding.|||

Here is some sample data:

tblTask
TaskID TaskName
-- -
1 Build house
2 Buy lot
3 Get lumber
4 Hire contractor
5 Pay gov't fees
6 Contact broker
7 Check World Wide Web for brokers and contractors
8 Check newspapers for Broker ads
9 Check phone book for brokers and contractors
10 Research gov't fees
11 Ask broker for info re gov't fees
12 Search web for info re gov't fees
15 Have contractor obtain lumber
17 Ask friends for personal referrals
18 Get bids from interested contractors
19 Check newspapers for lots for sale
86 Write program to evaluate bids
93 Research bid evalaution techniques

tblTaskChain
CurrentTask PreTask
-- --
1 2
1 3
1 4
1 5
1 86
2 6
2 8
2 19
3 15
4 9
4 17
4 18
5 10
5 11
5 12
6 7
6 8
6 9
10 11
10 12
11 6
15 4
18 7
18 9
18 17
86 93

|||

if the database is sql 2005, you can use recursive queries ot get your require data The sample below gives all Pretask that are for CurrentTask ID 10.

WITH CTE(CurrentTask,PreTask)

AS

(

SELECT T1.CurrentTask,T1.PreTask

FROM tblTaskChain T1

WHERE T1.CurrentTask=10

UNION ALL

SELECT T2.CurrentTask,T2.PreTask

FROM tblTaskChain T2

INNER JOIN CTE

ON

T2.CurrentTask=CTE.PreTask

)

SELECT CurrentTask,PreTask FROM CTE

|||If I understand this right, it is a hard problem, but let me try to understand.

Suppose this is your task chain:

CurrentTask PreTask
-- --
1 2
2 3
3 4
4 5
5 6

If someone tries to delete task #3, you should delete all tasks, right?

Now suppose this is your task chain (same as above with two additional entries)

CurrentTask PreTask
-- --
1 2
2 3
3 4
4 5
5 6
7 6
1 7

Again, suppose someone tries to delete task #3. What should you delete? You might start by deleting its only pretask (#4), and then delete #5 (because it is the only pretask for #4, which you just deleted). Then you go to delete #6, because it is the only pretask for #5. But you can't delete task #6, because it is a pretask for something other than #5 (#7).

How far do you roll this back? Can you delete #3 at all? If the last row (1, 7) were missing, would you still be unable to delete #7?

Now suppose this is your task chain:

CurrentTask PreTask
-- --
1 2
2 3
3 4
4 5
5 6
7 6
4 8
8 7

Now is it ok to delete 3, 4, 5, 6, 7, and 8?

My best guess at a rule is this. If task X is to be deleted, identify all tasks that depend on X through a direct chain of tasks, and call this set D(X) (include X in this set). Now consider X together with all all its pretasks, prepretasks, etc., and call this set P(X). In other words, D(X) is the directly dependent set of tasks, and P(X) is the prerequisite set of tasks.

Now check to see if P(D(X)) = P(X) union D(X). If so, mark the dependents of X for deletion. Next, check to see if D(P(X)) = P(X) union D(X). If so, mark the prerequisites of X for deletion. Then delete any marked tasks.

If this is right, I don't see how to handle this without some effort. I have some ideas, but none of them is easy. Hopefully you have some business rules that restrict the generality of the problem and make it easier to handle.

Steve Kass
Drew University
|||

When a task is deleted, I need all tasks on the task chain leading to it to be deleted, unless a particular task is also part of the task chain for a different task, in which case it stays.

The principle is: If we are not going to do A after all, there's no point in keeping the tasks that must be completed only so that we can do A.

When I initially submitted my question, I was hoping for some comment on the two triggers I included in their entirety. When I read them, they appear to be correct, but for reasons I have not yet been able to figure out, they cause the errors I described.

|||

Unfortunately, I am using SQL Server 2000, not 2005.

I was hoping somebody would comment on the two triggers I included in their entirety in my first posting. They appear to address the problem exactly, but they cause the errors I described in that posting.

|||

Sheldon,

The error message "Maximum stored procedure, function, trigger, or view nesting level exceeded (limit 32)." means that you've blown the recursion stack. You can't have a procedure call itself more than 32 times before SQL Server stops the procedure.

I think that the problem here is that your delete triggers always runs a delete against the other table, which then always runs a delete against the table that initiated the trigger. Here's a sort of "pseudocode" example that I think might make clear what's going on.

/*This is the outside statement that starts the ball rolling.*/
delete tblTaskChain where <some criteria>

/* The delete triggers the following. */
utrTaskChain_del:
<fill OtherCurrent table>
delete tblTask <some criteria>

/* That delete sets off the trigger on the other table. */
utrTask_del:
delete tblTaskChain where <some criteria>

/* Which sets off the trigger on the first table. */
utrTaskChain_del:
<fill OtherCurrent table>
delete tblTask <some criteria>

/* Even though the delete didn't actually remove anything from the table this time, the delete was still called, so the trigger gets set off. */
utrTask_del:
delete tblTaskChain where <some criteria>

/* And as you can see, it doesn't stop. */

I don't see where the recursion will ever stop. So what happens is that the code goes back and forth 32 times and then on the 33rd SQL Server says that this has gone over the limit and throws the error message you saw.

A solution is to check to see whether or not you need to delete before you actually call it. That way when you get to the end of a branch to delete you can be sure that it will terminate.

If you want more information about recursion in SQL Server, I found this site to be useful: http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnsqlpro03/html/sp03i8.asp

Nick Stipanovich
SQL Server Build Team

|||

Thank you, Nick.

Do understand correctly that the Delete event gets fired even if no records were actually deleted?

Suppose I preceded the delete statements in the two triggers with:

declare @.cnt int
set @.cnt = count(*) from deleted
if @.cnt > 0 BEGIN
...

Would that do the job? (I'll know before you reply to this because I'm gonna try it right now!)

Thanks again.

Sheldon

No comments:

Post a Comment