I need to programmatically retrieve all of the distinct values in Active Directory for a given field (for example, the country field "co", but I will have others). I don't care how many times a given value occurs or who/what it is associated with; I just need the list of distinct countries currently in there.
I have this, which will produce the list, but for obvious reasons it is unacceptably slow: it queries every matching record in AD and adds new countries to the list as it discovers them.
Public Function GetAllCountries() As List(Of String)
    Dim searcher As DirectorySearcher = ActiveDirectory.Forest.GetCurrentForest.FindGlobalCatalog.GetDirectorySearcher
    Dim searchResults As SearchResultCollection
    Dim properties As ResultPropertyCollection
    Dim countryList As New List(Of String)
    Dim country As String
    Try
        With searcher
            .Filter = "(&(objectClass=user)(co=*))"
            .PageSize = 1000
            .SizeLimit = 0
            .SearchScope = SearchScope.Subtree
            .CacheResults = False
            .PropertiesToLoad.Add("co")
            searchResults = .FindAll
        End With
        For Each result As SearchResult In searchResults
            properties = result.Properties
            country = GetPropertyValue(properties("co")).SingleItem
            If Not countryList.Contains(country) Then countryList.Add(country)
        Next
    Catch ex As Exception
    End Try
    Return countryList
End Function
FYI GetPropertyValue is just a custom function I put together to pull the string value from the Property object...
Is there any way of using DirectoryServices (or a suitable alternative?) to query distinct values for a specific AD field in a more efficient manner?
Unfortunately, there is no equivalent to SQL's DISTINCT in LDAP. And I see you're already doing what you can to make the search as fast as possible (asking for only accounts that have a value in the "co" attribute, and using PropertiesToLoad so only the one attribute is returned).
The only improvement I think you could make is to use a HashSet instead of a List. It saves using List.Contains() over and over, which is an O(n) operation, whereas it's O(1) with HashSet. The improvement may not be noticeable if the list never gets that big, but it's worth a try, especially if this code needs to be run multiple times.
For example:
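A minimal sketch of the declaration, meant to replace the Dim countryList line in the question's function. The capacity of 300 is a hypothetical figure, not something from the question; pick a number at or above the count of distinct values you expect. This also assumes a runtime where the HashSet(Of T)(Int32) constructor exists (.NET Framework 4.7.2 or later, or .NET Core 2.0 or later).

```vbnet
' Needs: Imports System.Collections.Generic
' Assumption: 300 is a hypothetical upper bound on the number of distinct countries.
Dim countryList As New HashSet(Of String)(300)
```

Since the function is declared As List(Of String), you would either change the return type to HashSet(Of String) or copy the set at the end with Return New List(Of String)(countryList).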
Then:
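Roughly, the loop would look like this. It is a fragment that slots into the question's function and reuses its names (searchResults, country, and the custom GetPropertyValue helper), and it assumes countryList is now a HashSet(Of String).

```vbnet
For Each result As SearchResult In searchResults
    ' GetPropertyValue is the question's own helper, used unchanged.
    country = GetPropertyValue(result.Properties("co")).SingleItem
    ' Add returns False and leaves the set unchanged if the value is already present,
    ' so no Contains check is needed.
    countryList.Add(country)
Next
```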
Notice that I set the size of the HashSet when declaring it. This saves it from resizing as elements are added.
You also don't need to check if it's already in the HashSet, since Add() just won't add it if it's already there.