Microsoft has exposed the scriptdom API to parse and generate TSQL. I'm new to it and still playing with it. I want to know how to get the cross databases references from queries like this one.
UPDATE t3
SET description = 'abc'
FROM database1.dbo.table1 t1
INNER JOIN database2.dbo.table2 t2
ON (t1.id = t2.t1_id)
LEFT OUTER JOIN database3.dbo.table3 t3
ON (t3.id = t2.t3_id)
INNER JOIN database2.dbo.table4 t4
ON (t4.id = t2.t4_id)
What I want is a list of the references:
database1.dbo.table1.id = database2.dbo.table2.t1_id
database3.dbo.table3.id = database2.dbo.table2.t3_id
database2.dbo.table4.id = database2.dbo.table2.t4_id
However, for the last entry database2.dbo.table4.id = database2.dbo.table2.t4_id, both of the columns from the 2 ends are from the same database database2, this is not what I want. So my final required result is:
database1.dbo.table1.id = database2.dbo.table2.t1_id
database3.dbo.table3.id = database2.dbo.table2.t3_id
Is is possible to be implemented with scriptdom?
A robust implementation is not easy. For the limited problem as posed in this question, the solution is relatively simple -- stress "relatively". I assume the following:
The solution strategy looks like this: we first visit the
TSqlFragmentto make a list of all table aliases, then visit it again to get all equijoins, expanding aliases along the way. Using that list, we determine the list of equijoins that do not refer to the same database. In code:Output:
AliasResolutionVisitoris a simple thing:We simply go through all the named table references in the query and, if they have an alias, add this to a dictionary. Note that this will fail miserably if subqueries are introduced, because this visitor has no notion of scope (and indeed, adding scope to a visitor is much harder because the
TSqlFragmentoffers no way to annotate the parse tree or even walk it from a node).The
EqualityJoinVisitoris more interesting:This hunts for
QualifiedJoininstances and, if we find them, we in turn examine the search condition to find all occurrences of equality comparisons. Note that this does work with nested search conditions: inBar JOIN Foo ON Bar.Quux = Foo.Quux AND Bar.Baz = Foo.Baz, we will find both expressions.How do we find them? Using another small visitor:
Nothing complicated here. It wouldn't be hard to fold this code into the other visitor, but I think this is clearer.
That's it, except for some helper code which I'll present without comment:
As I mentioned before, this code is quite brittle. It assumes queries have a particular form, and it could fail (quite badly, by giving misleading results) if they don't. A major open challenge would be to extend it so it can handle scopes and unqualified references correctly, as well as the other weirdness a T-SQL script can feature, but I think it's a useful starting point nevertheless.