Over the past months I have talked a lot about Document Management with SharePoint 2013. In my last session in Rosenheim an interesting question came up. The scope of the standard Document ID provider that ships with SharePoint is ‘Site Collection’. In other words: to activate the Document ID you need to activate a Site Collection feature. What happens if you have two different Site Collections that each use the standard Document ID provider and you move a document (with a valid Document ID) from one Site Collection to the other?
I was curious – so I set up a test environment. I created a new Web Application and added two new Site Collections named SiteCol1 and SiteCol2. I used the Team Site template for both. For both Site Collections I activated the Document ID feature and defined a Document-ID prefix for each Site Collection.
Next I created a custom Send-to connection to move a document from Site Collection 1 to Site Collection 2. This custom Send-to connection looked like this.
Now I added a new document to a library in Site Collection 1 and had a look at its Document-ID:
Now I moved this document to Site Collection 2 by using the custom Send-to connection I just created:
It took just a few seconds to move the document to its new location.
And this is the moved document at its new location:
As you can see, the Document-ID was preserved, although the document was moved from one Site Collection to another. In other words: the moved document kept its Document-ID even though it now lives in the scope of ‘another’ Document-ID provider.
Time for another test. When a document has been added to a library it can immediately be retrieved by the Document-ID WebPart. The Document-ID WebPart can retrieve a new document by its Document-ID even if there hasn’t been a crawl (incremental or full).
After I entered the preserved Document-ID in the WebPart that I had placed on the default page of Site Collection 2 – guess what happened …
When I clicked on the little blue arrow … this is what happened:
The moved document can’t be found by its Document-ID, although it exists in the document library. Why?
To understand this we need to look at how a Document-ID provider works. The first thing to notice is the error message above: ‘No documents with the ID SITECOL1-1-3 were found in this site collection.’ The WebPart only looks for documents in the current site collection, as this is the scope of the Document ID provider. I think this error message means that the WebPart takes the Document-ID and first checks whether it is a valid Document-ID for the current site collection. In my example the Document-ID is not valid, because it was created by another Document-ID provider. Because the Document-ID is considered invalid, the provider concludes that there can’t be a document with that Document-ID.
Let’s have a look at how to create a custom Document-ID provider. I found an older article on this topic on Tobias Zimmergren’s blog. If you look at the methods that need to be implemented, there is one called GetDocumentUrlsById(). This method takes two parameters, a site and a Document-ID as a string, and returns a string array of URLs if at least one document could be found. In other words: the Document-ID provider not only issues Document-IDs but is also responsible for retrieving documents by their Document-IDs. That’s the reason why the WebPart can find documents by a Document-ID although the document has not yet been crawled.
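To make the two responsibilities of a provider more tangible, here is a minimal sketch in Python. It is not the SharePoint API: the class name, the method names, the dictionary that stands in for a site collection and the document URL are all hypothetical, and the ID format (prefix, list ID, item ID) is only an assumption based on the example ID SITECOL1-1-3.

```python
from typing import Dict, List


class PrefixDocumentIdProvider:
    """Toy provider (hypothetical): issues IDs and resolves them without a search index."""

    def __init__(self, prefix: str) -> None:
        self.prefix = prefix

    def generate_document_id(self, list_id: int, item_id: int) -> str:
        # Assumed format: <prefix>-<list id>-<item id>, as in SITECOL1-1-3.
        return f"{self.prefix}-{list_id}-{item_id}"

    def get_document_urls_by_id(self, site: Dict[str, str], document_id: str) -> List[str]:
        # 'site' is a plain dict mapping Document-ID -> URL; it stands in for
        # the content of one site collection. No crawl is involved here.
        url = site.get(document_id)
        return [url] if url is not None else []


provider = PrefixDocumentIdProvider("SITECOL1")
site_col1: Dict[str, str] = {}

# Add a new document and look it up again immediately.
doc_id = provider.generate_document_id(1, 3)
site_col1[doc_id] = "/sites/SiteCol1/Shared Documents/report.docx"  # hypothetical URL
urls = provider.get_document_urls_by_id(site_col1, doc_id)
print(doc_id, urls)
```

The point of the sketch is only that the lookup reads the site content directly, which is why a freshly added document can be found before any crawl has run.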
But that’s only half the truth, so let’s have a look at the custom Document-ID provider again. There is an interesting property called DoCustomSearchBeforeDefaultSearch. If this property is set to true, the Document-ID provider first tries to find the document on its own. If the document is not found this way, the default search is used. OK, time to start an incremental crawl in my test environment…
After the incremental crawl had finished I tried the WebPart again, and now the moved document was found by its ‘invalid’ Document-ID. The reason why the standard Document-ID provider couldn’t find the moved document is now clear: the Document-ID was invalid and because of that the provider could not find the document on its own. So the provider fell back to the default search, but was again unable to find the moved document because the document had not been crawled yet. After I started the incremental crawl manually, the provider was able to find the document using the default search.
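The behaviour I observed can be modelled with a few lines of Python. This is only a sketch of my interpretation, not SharePoint code: all function names, the prefix SITECOL2 and the document URL are hypothetical, and the prefix check in provider_lookup reflects my assumption about how the standard provider treats a foreign Document-ID.

```python
from typing import Dict, List

SITE_PREFIX = "SITECOL2"  # hypothetical prefix of the target site collection


def provider_lookup(library: Dict[str, str], document_id: str) -> List[str]:
    """The provider's own lookup; assumed to reject IDs with a foreign prefix."""
    if not document_id.startswith(SITE_PREFIX + "-"):
        return []
    url = library.get(document_id)
    return [url] if url else []


def default_search(index: Dict[str, str], document_id: str) -> List[str]:
    """Default search; it only knows documents that have been crawled."""
    url = index.get(document_id)
    return [url] if url else []


def find_document(library: Dict[str, str], index: Dict[str, str],
                  document_id: str, custom_first: bool = True) -> List[str]:
    """Try both lookups in the order selected by custom_first."""
    steps = [lambda: provider_lookup(library, document_id),
             lambda: default_search(index, document_id)]
    if not custom_first:
        steps.reverse()
    for step in steps:
        urls = step()
        if urls:
            return urls
    return []


def crawl(library: Dict[str, str], index: Dict[str, str]) -> None:
    """Stand-in for an incremental crawl: copy library content into the index."""
    index.update(library)


# The moved document keeps its old ID but now lives in Site Collection 2.
library = {"SITECOL1-1-3": "/sites/SiteCol2/Shared Documents/moved.docx"}
index: Dict[str, str] = {}

before = find_document(library, index, "SITECOL1-1-3")  # both lookups fail
crawl(library, index)
after = find_document(library, index, "SITECOL1-1-3")   # default search succeeds
print(before, after)
```

Before the crawl both steps return nothing; after the crawl the provider's own lookup still fails, but the default search finds the document.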
In other words: if you need to move documents from one Site Collection to another Site Collection and need the Document-IDs to be preserved, you can do this the way I showed in this post, but keep in mind that the document now has an ‘invalid’ Document-ID and cannot be found by the Document-ID WebPart directly. The same happens if you click on the Document-ID of the moved document. You need to rely on the default search – and start at least an incremental crawl before the moved document can be found by its Document-ID.
Oliver Wirkus was a speaker at ESPC 2013. Check out Oliver’s blog for more insightful posts!