We have electronic forms that filers fill out online and we store the data in an SQL Server. We want to provide a search feature that allows us to search inside each electronic filing for matching keywords. We don’t need to know what word matched or where in the form it matches, we just need a ranked list of forms that match our keywords. We think SQL Full-Text Searching would be our best option because we are already using SQL server 2016. We just started with implementing a solution but would like some guidance since this is new territory for us.
Here is an example of how our tables are structured.
Filing is our top-level table for all electronic forms. We have sub tables that are all related through the FilingId. The Form Six Published Filings table has child tables to store information like Assets. The Form One Published Filings table has child tables to store information like Liabilities.
CREATE SCHEMA [Forms]
GO
CREATE SCHEMA [Form6]
GO
CREATE SCHEMA [Form1]
GO
CREATE TABLE [Forms].[Filing](
[FilingId] INT NOT NULL IDENTITY(1,1)
CONSTRAINT [PK_Forms_Filing_FilingId] PRIMARY KEY CLUSTERED,
[FilerUserId] [int] NOT NULL,
[FormYear] [int] NOT NULL,
[FormTypeId] [int] NOT NULL,
[FilingStatusId] [int] NOT NULL,
[FilerSignatureId] INT NULL,
[SubmissionDate] DATETIME2(0) NULL,
[IsScannedForm] BIT NOT NULL
CONSTRAINT [DF_Forms_Filing_IsScannedForm] DEFAULT(0)
)
GO
CREATE TABLE [Form6].[FormSixPublishedFilings](
[FormSixPublishedFilingId] INT NOT NULL IDENTITY(1,1)
CONSTRAINT [PK_Form6_FormSixPublishedFilings_FormSixPublishedFilingId] PRIMARY KEY CLUSTERED,
[FilingId] INT NOT NULL
CONSTRAINT [FK_Form6_FormSixPublishedFilings_Filings] FOREIGN KEY ([FilingId]) REFERENCES [Forms].[Filing] ([FilingId]),
[LastDateOfEmployment] DATE NULL,
[NetWorthDate] DATE NULL,
[NetWorth] MONEY NULL
)
GO
CREATE TABLE [Form6].[FormSixPublishedAssets](
[FormSixPublishedAssetId] INT NOT NULL IDENTITY(1,1)
CONSTRAINT [PK_Form6_FormSixPublishedAssets_FormSixPublishedAssetId] PRIMARY KEY CLUSTERED,
[FormSixPublishedFilingId] INT NOT NULL
CONSTRAINT [FK_Form6_FormSixPublishedAssets_FormSixPublishedFilings] FOREIGN KEY ([FormSixPublishedFilingId]) REFERENCES [Form6].[FormSixPublishedFilings] ([FormSixPublishedFilingId]),
[Name] VARCHAR(8000) NOT NULL,
[Amount] MONEY NOT NULL
)
GO
CREATE TABLE [Form1].[FormOnePublishedFilings]
(
[FormOnePublishedFilingId] INT NOT NULL IDENTITY(1,1)
CONSTRAINT [PK_Form1_FormOnePublishedFilings_FormOnePublishedFilingId] PRIMARY KEY CLUSTERED,
[FilingId] INT NOT NULL,
CONSTRAINT [FK_Form1_FormOnePublishedFilings_Filing] FOREIGN KEY ([FilingId]) REFERENCES [Forms].[Filing] ([FilingId]),
[HasServedAsAgent] BIT NULL,
[LastDateOfEmployment] DATE NULL,
[AmendmentReason] VARCHAR(1024) NULL,
)
GO
CREATE TABLE [Form1].[FormOnePublishedLiabilities]
(
[FormOnePublishedLiabilityId] INT NOT NULL IDENTITY(1,1)
CONSTRAINT [PK_Form1_FormOnePublishedLiabilities_FormOnePublishedLiabilityId] PRIMARY KEY CLUSTERED,
[FormOnePublishedFilingId] INT NOT NULL,
CONSTRAINT [FK_Form1_FormOnePublishedLiabilities_FormOnePublishedFilings] FOREIGN KEY ([FormOnePublishedFilingId]) REFERENCES [Form1].[FormOnePublishedFilings] ([FormOnePublishedFilingId]),
[NameOfCreditor] VARCHAR(8000) NOT NULL,
[AddressOfCreditor] VARCHAR(8000) NOT NULL
)
GO
In order to be able to search through all the forms, I think we need to create a view that just has two columns. One for the FilingId and the other column would be an XML data type which would be an XML representation of all the data in each electronic filing. This XML column is what we will be using to set up our full-text index. I think we will be using the FreeTextTable search because we would like to have the results ranked and also the search terms will be entered by end-users.
create view ViewForFullTextSearching with schemabinding as
select f.FilingId,
(select
filing.FilingId
,filing.FormYear
,filing.FormTypeId
,filing.FilingStatusId
,filing.FilerSignatureId
,filing.SubmissionDate
,filing.IsScannedForm
,form6Filing.LastDateOfEmployment 'Form6LastDateOfEmployment'
,form6Filing.NetWorthDate
,form6Filing.NetWorth
,form6Asset.Name
,form6Asset.Amount
,form1Filing.HasServedAsAgent
,form1Filing.LastDateOfEmployment 'Form1LastDateOfEmployment'
,form1Filing.AmendmentReason
,form1Liability.NameOfCreditor
,form1Liability.AddressOfCreditor
from Forms.Filing filing
left join Form6.FormSixPublishedFilings form6Filing on filing.FilingId = form6Filing.FilingId
left join Form6.FormSixPublishedAssets form6Asset on form6Filing.FormSixPublishedFilingId = form6Asset.FormSixPublishedFilingId
left join Form1.FormOnePublishedFilings form1Filing on filing.FilingId = form1Filing.FilingId
left join Form1.FormOnePublishedLiabilities form1Liability on form1Liability.FormOnePublishedFilingId = form1Filing.FormOnePublishedFilingId
where filing.FilingId = f.FilingId
for xml auto, type
) as 'Filing'
from Forms.Filing f
GO
create unique clustered index [IX_ViewForFullTextSearching_FilingId] ON [Forms].[ViewForFullTextSearching] ([FilingId])
GO
The above SQL does not actually work because I get this error.
Cannot create an index on view "EthicsFdms.Forms.ViewForFullTextSearching" because it contains one or more subqueries. Consider changing the view to use only joins instead of subqueries. Alternatively, consider not indexing this view.
So, I’m a bit lost on how to create a view with XML to search over if I’m not allowed to create a materialized view that has subqueries.
This view results look like this:
Next we setup our Full Text Catalog and Index on this view:
CREATE FULLTEXT CATALOG [FtcFilings];
GO
CREATE FULLTEXT INDEX ON [Forms].[ViewForFullTextSearching] ([Filing] language 1033) key index [IX_ViewForFullTextSearching_FilingId] on [FtcFilings];
GO
Then I was hoping we could search the filings like so:
select ftt.*
from [Forms].[Filing] filing
inner join freetextable(Forms.ViewForFullTextSearching, Filing, 'APPLE') as ftt on filing.FilingId = ftt.[KEY]
order by rank desc
Right now my challenges are, is it possible to create a materialized view like this? Seems like I can’t because materialized views can’t have subqueries. I’m not sure how to build the XML field w/out subqueries.
If I’m not able to create a materialized view then how else can I create a full-text index that can search electronic Forms?



You cannot create an indexed view (which is a synchronous materialized view in SQL Server) only if there is a mathematical surjection and all scalar computation is deterministic and precise. By the way OUTER JOIN, SUBQUERIES and set operators (UNION, EXCEPT, INTERSECT) cannot be used...
The best ways to design your systeme is to do it in the reverse way...
Let me know if you want more assistance to do so...