SHAREPOINT - MANAGING THE PRESERVATION HOLD LIBRARY (1)
Audience: Solution Designer, IT Operations, Information Manager
Author: Jonathan Stuckey
A Practical Guide for IT Professionals
This is the first of a two-part article, and a follow-up to managing the Recycle Bin. It provides practical, tested options for administering the Preservation Hold Library (PHL). The PHL is a key structure created on a site that has a Retention policy applied or active Retention labels.
This part covers options for reporting on the Preservation Hold Library on a SharePoint site (or sites). The next article uses that reporting to update labelled items and to trim items when undertaking storage recovery.
Why is managing PreservationHoldLibrary important?
The PHL is a hidden system storage space SharePoint has for managing documents and list items that have retention applied. Hence you don't see it in the normal contents UI or reporting.

Microsoft's SharePoint admin and 365 reporting provide a top-level view of total site storage or the end-user visible site usage only. Additionally, tools like ShareGate do not report on PHL and therefore cannot provide simple clean-up options.
This means production tenancies that have been running for multiple years will start to see capacity issues without any reporting to identify the root cause, or any way to address the issue(s), other than paying Microsoft for additional storage.
Constantly increasing storage (and associated costs) because you don't understand what is consuming it goes against all good operational practice (and my Yorkshire upbringing).
Understanding SharePoint Reporting
The PreservationHoldLibrary is a crucial system structure for compliance services, and as such deserves a lot of caution and preparation before you interfere with it directly on a site. Unfortunately, until recently Microsoft has not offered useful tools in this area.
IMPORTANT: Interacting with PHL requires a good understanding of SharePoint infrastructure and Purview Retention management.
Identifying specific sites with storage issues
A good admin needs to find out where capacity is going. The first resort is a macro look in the SharePoint Admin Centre: open the Active sites listing and re-sort it by size (descending). That gives you a crude pointer to the worst offenders.
Then you are either using:
a toolset like ShareGate, which has great GUI-driven configurable reports that can tell us file display size vs. actual storage size, and where the difference is big,
Unfortunately, ShareGate's custom reports won't save you if your problem is compounded by the PreservationHoldLibrary, because the tool cannot access it.
navigating each site individually to access the 'Classic' management UI 'Storage Metrics' page and frankenstein-ing a cut-n-paste of details into a spreadsheet,
The Storage Metrics view is difficult to use: it is unwieldy and doesn't export information (it's cut-n-paste, line by line), so it is not usable at scale!
or using PowerShell to extract the details once you know which sites are the worst offenders.
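As a rough illustration of the PowerShell route, the sketch below ranks site objects by storage used, largest first. It assumes each object exposes Url and StorageUsageCurrent properties (which I understand Get-PnPTenantSite reports in MB); the function name Select-TopStorageSite and the sample sites are hypothetical stand-ins so the sketch runs without a tenant connection.

```powershell
# Hypothetical helper: rank site objects by storage used, largest first.
# Assumes each object has Url and StorageUsageCurrent (MB) properties.
function Select-TopStorageSite {
    param(
        [Parameter(Mandatory, ValueFromPipeline)]
        [object]$Site,
        [int]$Top = 10
    )
    begin   { $sites = [System.Collections.Generic.List[object]]::new() }
    process { $sites.Add($Site) }
    end {
        $sites |
            Sort-Object -Property StorageUsageCurrent -Descending |
            Select-Object -First $Top -Property Url,
                @{ Name = 'StorageGB'; Expression = { [math]::Round($_.StorageUsageCurrent / 1024, 2) } }
    }
}

# Stand-in data (not real sites) so the sketch runs offline.
$sampleSites = @(
    [pscustomobject]@{ Url = 'https://organisation.sharepoint.com/sites/finance';  StorageUsageCurrent = 51200 }
    [pscustomobject]@{ Url = 'https://organisation.sharepoint.com/sites/projects'; StorageUsageCurrent = 204800 }
    [pscustomobject]@{ Url = 'https://organisation.sharepoint.com/sites/hr';       StorageUsageCurrent = 1024 }
)
$topSites = $sampleSites | Select-TopStorageSite -Top 2
$topSites | Format-Table -AutoSize
```

Against a live tenant, and assuming a session already connected to the tenant admin site, the input would instead be Get-PnPTenantSite | Select-TopStorageSite -Top 10.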
Why look at PHL libraries if you have a storage issue?
Well you have multiple things going on:
A legacy default of 500 (major) versions, introduced as a lazy-man's way of dealing with co-authoring versions, was set on sites in 2018/19. Prior to that there were no limits unless specifically configured after site creation;
Microsoft changed the model for documents managed under a Retention policy or with labels, looping managed files via the PHL and back to the Recycle Bin when the retention period expires. This means you lose track of the file (and the storage consumed) in the middle of life-cycle activities, and it puts the onus on you to ensure files are processed through retention, otherwise there's no capacity management;
The PHL behaviour was modified about 18 months ago. It moved from holding each version in a file's history as an individual unique file to working like the standard library version history, i.e. each version of that file 'copied' to the PHL is now just a version in the history of one file.
The net result is you can be blissfully unaware of what's going on in a site, yet still be subject to storage management warnings.
How do these actually impact storage consumption?
Well here's a real-world example:
Billy has his working file: 'Budgeting.xlsx', which is 1.8MB in size. It is kept in his Planning library. It has 287 versions.
In the old-world, the /PreservationHoldLibrary has 286 copies of Budgeting.xlsx, each with a different suffix (GUID) to distinguish them. Each file is 1.8MB (ish).
Total storage used is 514.8MB in the PHL, and about the same again under the Planning library. Budgeting.xlsx in the library still only says it's 1.8MB, but with its version history it is using roughly 515MB of storage. Plus each copy in the PHL is 1.8MB.
Since Microsoft's change to Retention storage, instead of a unique file per version of Budgeting.xlsx in the PHL (286 files), there is 1 file with an intact version history listing 286 versions.
So, in the new world Billy still has 1 file called Budgeting.xlsx with 287 versions in Planning, and now retention holds 1 file in /PreservationHoldLibrary with 286 versions (minus current).
Each file displays its size as 1.8MB, but in both cases the actual file storage footprint is ~514MB.
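The arithmetic behind Billy's example can be sketched in a few lines. This is an estimate only: it assumes every version is stored at the full 1.8MB, which real version histories will not match exactly, and the function name Get-VersionFootprintMB is hypothetical.

```powershell
# Hypothetical helper: estimate storage if every version is stored at full size.
function Get-VersionFootprintMB {
    param(
        [Parameter(Mandatory)][double]$FileSizeMB,
        [Parameter(Mandatory)][int]$VersionCount
    )
    [math]::Round($FileSizeMB * $VersionCount, 1)
}

$libraryMB = Get-VersionFootprintMB -FileSizeMB 1.8 -VersionCount 287   # Planning library
$phlMB     = Get-VersionFootprintMB -FileSizeMB 1.8 -VersionCount 286   # PreservationHoldLibrary
$totalMB   = [math]::Round($libraryMB + $phlMB, 1)

"Library: $libraryMB MB, PHL: $phlMB MB, combined: $totalMB MB"
```

The point of the sum is that the site carries both footprints at once, so a file that displays as 1.8MB can account for around a gigabyte between the library and its PHL.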
Do you want to know how bad this can get?
I have a client with 1 site under Retention taking 47% of total tenancy storage. Two libraries showing a handful of files with displayed sizes between 81MB and 200+MB.
Just one Excel file:
displayed size (current) 81MB with 492 versions
and a total (consumed) storage of 28.97GB
Cleaning up trailing versions on files on this site recovered ~5% of total tenancy storage. Addressing the trailing versions of files retained in the site's PHL recovered a further 21%.
Why did Microsoft change things? [Exposition]
Well, it is costly to keep updating and checking all these different versions without the tied history of the original. Simplifying the approach to keep one copy of the original with its version history makes life easier for Microsoft when tracking and reconciling items that need workflows and disposal.
Microsoft doesn't like high costs for background services, so it changed things to make its own life easier. Consolidating how versions are managed into one model simplifies internal processes, i.e. a file in the PHL can be viewed and managed exactly the same as a file anywhere else in SharePoint.
This does not, of course, address rampant storage consumption from versioning. Microsoft likes you to pay for storage, hence they charge you upwards of $0.20 (NZD) per Gigabyte in SharePoint.
Navigating to a Preservation Hold Library
The PHL is accessible to the SharePoint Administrator or Site Collection Administrator roles in SharePoint. It is important to understand that PHLs are just like any other library, i.e. they have the same setup, the ability to use views, and (some) accessible settings.
You append the reference /PreservationHoldLibrary to the end of your site URL, et voila!

In fact, if you have a problem with access to the main view, you can always get to the main library page by adding /Forms/Allitems.aspx to the URL, and then select Library Settings from the top-right to access the settings.

The difference between the PHL and ordinary libraries is that Microsoft has locked the content types down and injected some additional property fields that Purview uses for managing the expiry, review and eventual disposal of items from your site.
Challenges
Your first challenge is reporting on this library, as volumes increase across multiple sites and SharePoint modern UI reporting does not show this. Accessing classic Storage Manager (appending /_layouts/15/storman.aspx to site URL) is per site and clunky to navigate.
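If you are visiting several sites, the URL suffixes quoted in this article can be assembled by a small helper. A minimal sketch: the function name Get-PhlUrl and the finance site are hypothetical, and the suffixes are simply the ones given in the text.

```powershell
# Hypothetical helper: build the PHL-related URLs for a site
# from the suffixes quoted in this article.
function Get-PhlUrl {
    param([Parameter(Mandatory)][string]$SiteUrl)

    # Drop any trailing slash so the joined URLs don't contain '//'.
    $base = $SiteUrl.TrimEnd('/')
    [pscustomobject]@{
        Library        = "$base/PreservationHoldLibrary"
        AllItems       = "$base/PreservationHoldLibrary/Forms/Allitems.aspx"
        StorageMetrics = "$base/_layouts/15/storman.aspx"
    }
}

$urls = Get-PhlUrl -SiteUrl 'https://organisation.sharepoint.com/sites/finance/'
$urls | Format-List
```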

INFORMATION: it is important to know which libraries are (legacy) System libraries (usually hidden in UI) - so you don't waste time or cause other issues by changing them.
The second challenge is managing trailing versions. Files held in the PHL can have an extensive number of trailing versions, which means they can consume (and lock) massive volumes of storage in each PHL, none of which is visible in the modern reporting pages.
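To make the gap between displayed size and consumed storage concrete, the sketch below adds a file's current size to the sizes of its trailing versions. It assumes version objects expose a Size property in bytes; the function name Measure-FileVersionStorage and the sample versions are hypothetical.

```powershell
# Hypothetical helper: total storage for one file = current size + trailing versions.
# Assumes each version object has a Size property in bytes.
function Measure-FileVersionStorage {
    param(
        [Parameter(Mandatory)][string]$Name,
        [Parameter(Mandatory)][long]$CurrentSizeBytes,
        [object[]]$Versions = @()
    )
    $trailingBytes = 0L
    foreach ($version in $Versions) { $trailingBytes += [long]$version.Size }

    [pscustomobject]@{
        Name            = $Name
        VersionCount    = $Versions.Count
        DisplayedSizeMB = [math]::Round($CurrentSizeBytes / 1MB, 2)
        TotalSizeMB     = [math]::Round(($CurrentSizeBytes + $trailingBytes) / 1MB, 2)
    }
}

# Stand-in data: three trailing versions of 1MB each.
$sampleVersions = 1..3 | ForEach-Object { [pscustomobject]@{ VersionLabel = "$_.0"; Size = 1MB } }
$summary = Measure-FileVersionStorage -Name 'Budgeting.xlsx' -CurrentSizeBytes 1MB -Versions $sampleVersions
$summary | Format-List
```

On a connected session, my understanding is that Get-PnPFileVersion -Url with the file's server-relative path returns the trailing versions with a Size property, which could be passed in as -Versions; treat that as an assumption to verify against your module version.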
Reporting on the PHL using PowerShell
There are no native tools in M365 SP UI which provide accurate, comprehensive multi-site reporting or management for SharePoint sites. Good 3rd party tools don't exist, unless you are willing to pay a lot of money to add another platform vendor who integrates (or replaces) functionality native to the platform.
As with other gaps in the platform management you need to use PowerShell. This requires a degree of skill and experience to apply.
To employ PowerShell, the preference is to baseline on the PowerShell 7 command shell and the PnP.PowerShell module, for forward support and compatibility.
Support for PowerShell 5.x requires importing the Microsoft.Online.SharePoint.PowerShell module.
If you are planning on automation using Azure Runbooks, test on the PowerShell 7.2 environment with the PnP.PowerShell 3.1 modules.
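Before going further it can help to confirm what the current session actually has. A minimal sketch, assuming only that PowerShell 7 or later is the baseline described above; the function name Test-PowerShellBaseline is hypothetical, and the exact minimum version a given PnP.PowerShell release needs should be checked in its release notes.

```powershell
# Hypothetical helper: does a PowerShell major version meet the baseline?
function Test-PowerShellBaseline {
    param(
        [int]$Major = $PSVersionTable.PSVersion.Major,
        [int]$MinimumMajor = 7
    )
    $Major -ge $MinimumMajor
}

if (-not (Test-PowerShellBaseline)) {
    Write-Warning "PowerShell $($PSVersionTable.PSVersion) detected; the baseline here is PowerShell 7 or later."
}

# Show the newest installed copy of the module, if any. An empty result means it
# still needs installing, e.g. Install-Module -Name PnP.PowerShell -Scope CurrentUser
$installed = Get-Module -ListAvailable -Name PnP.PowerShell |
    Sort-Object -Property Version -Descending |
    Select-Object -First 1 -Property Name, Version

if ($installed) { $installed } else { Write-Warning 'PnP.PowerShell was not found on this machine.' }
```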
Access and permissions
You will need administrative privileges on the site (or sites) to operate the script and generate the reports.
Minimum for running scripts is delegated role: SharePoint Administrator
Module registration
With the latest securing of the platform, the PowerShell module(s) should be registered in Entra Enterprise Application registration.

To develop and run the PowerShell you will need the Application (client) ID from the App registration, which would look something like: 0a123456-78ab-9012-ab34-567a8901b234
Getting Started: Reporting
Firstly we'll need to generate usable, repeatable reporting:
Launch a PowerShell v7 command prompt.
Import and load the PnP.PowerShell module.
Ensure you have the appropriate login credentials for authentication.
Set your PnP module app registration Application (client) ID variable for running.
Replace my examples with details relevant to your tenancy and site.
# Set PowerShell variables
$AppId = "0a123456-78ab-9012-ab34-567a8901b234"
$siteUrl = "https://organisation.sharepoint.com/sites/<site-name>"
List the items in the site PreservationHoldLibrary
Set environment variables for tenancy, site and library name.
Connect to the site you want to report on.
# Additional variables required
$LibraryName = "PreservationHoldLibrary"
$BatchSize = 1000
# Connect your session to the specific site
Connect-PnPOnline -ClientId $AppId -Url $siteUrl -Interactive
Run the report-generation command.
Get-PnPListItem -List $LibraryName -PageSize $BatchSize
...now, assuming you have a lot of content in the PHL, that's going to scroll forever. So we can just dump this to a report file in the local folder.
Get-PnPListItem -List $LibraryName -PageSize $BatchSize -ErrorAction Stop | Export-Csv -Path "$LibraryName-Report.csv" -NoTypeInformation
This will give you a complete content listing, including item name, deleted date, and original location in the site.
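If the CSV comes out with object properties rather than one column per field, one option is to flatten each item before exporting. A minimal sketch, assuming each item exposes a FieldValues dictionary keyed by internal field name; the function name ConvertTo-PhlReportRow and the sample item are hypothetical, and the default field list covers only standard library fields. The PHL-specific columns have their own internal names, which you would need to look up in Library Settings and add yourself.

```powershell
# Hypothetical helper: turn one list item into a flat row for Export-Csv.
# Assumes the item exposes a FieldValues dictionary keyed by internal field name.
function ConvertTo-PhlReportRow {
    param(
        [Parameter(Mandatory, ValueFromPipeline)]
        [object]$Item,
        [string[]]$FieldName = @('FileLeafRef', 'FileRef', 'File_x0020_Size', 'Modified')
    )
    process {
        $row = [ordered]@{ Id = $Item.Id }
        foreach ($name in $FieldName) {
            # Fields missing from an item become empty cells rather than errors.
            $row[$name] = if ($Item.FieldValues.ContainsKey($name)) { $Item.FieldValues[$name] } else { $null }
        }
        [pscustomobject]$row
    }
}

# Stand-in item so the sketch runs without a connection.
$sampleItem = [pscustomobject]@{
    Id          = 1
    FieldValues = @{
        FileLeafRef     = 'Budgeting.xlsx'
        FileRef         = '/sites/finance/PreservationHoldLibrary/Budgeting.xlsx'
        File_x0020_Size = 1887436
    }
}
$rows = $sampleItem | ConvertTo-PhlReportRow
$rows | Format-List
```

In a connected session the same function would sit between the two commands above: Get-PnPListItem -List $LibraryName -PageSize $BatchSize | ConvertTo-PhlReportRow | Export-Csv -Path "$LibraryName-Report.csv" -NoTypeInformation.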
Creating a Repeatable Report from Automated Triggers
This is a basic architectural pattern. However, I don't see many IT or partner support teams using it. You'll need to convert and import a version of the PowerShell as an Azure script to be executed by a Runbook.
Basic Pattern Ingredients
An IT person who understands the basics of:
Entra (Azure) and Enterprise application registration,
Resource groups, Automation accounts, and
uploading PowerShell scripts.
Details of your Azure | Entra subscription.
A Resource Group.
An Automation account (service account name that can run).
Check the automation account has the correct SharePoint privileges to run.
Import Shared Resources > Modules for:
PnP.PowerShell (version 3.1.0)
Any other module scopes required.
Create a Runbook, set for the appropriate PowerShell version.
M365 SP site details
SharePoint site and library location (URL) for uploading the exported report data.
A login or credentials to access, upload, and edit content on the report site.
List structure should include attributes for:
Target site name,
Target site URL,
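The ingredients above could come together in a Runbook roughly as follows. This is a sketch under stated assumptions, not a finished script: it assumes the Automation account uses a managed identity that has been granted access to both sites (the article's pattern may equally use stored credentials), the function names and the 'Shared Documents/PHL Reports' folder are hypothetical, and only the target site URL is written alongside each row.

```powershell
# Hypothetical helper: build a dated report file name from a site URL.
function Get-PhlReportFileName {
    param(
        [Parameter(Mandatory)][string]$SiteUrl,
        [datetime]$Date = (Get-Date)
    )
    $siteName = ($SiteUrl.TrimEnd('/') -split '/')[-1]
    '{0}-PreservationHoldLibrary-{1:yyyyMMdd}.csv' -f $siteName, $Date
}

# Hypothetical Runbook body, wrapped in a function so nothing runs until it is called.
# Assumes a managed identity with access to both the target and the report site.
function Invoke-PhlReportRunbook {
    param(
        [Parameter(Mandatory)][string]$TargetSiteUrl,            # site whose PHL is reported on
        [Parameter(Mandatory)][string]$ReportSiteUrl,            # site that stores the reports
        [string]$ReportFolder = 'Shared Documents/PHL Reports'   # assumed folder name
    )
    $target  = Connect-PnPOnline -Url $TargetSiteUrl -ManagedIdentity -ReturnConnection
    $csvPath = Join-Path ([System.IO.Path]::GetTempPath()) (Get-PhlReportFileName -SiteUrl $TargetSiteUrl)

    # Export a small, flat set of columns for each PHL item.
    Get-PnPListItem -List 'PreservationHoldLibrary' -PageSize 1000 -Connection $target |
        ForEach-Object {
            [pscustomobject]@{
                Id      = $_.Id
                Name    = $_.FieldValues['FileLeafRef']
                Path    = $_.FieldValues['FileRef']
                SiteUrl = $TargetSiteUrl
            }
        } |
        Export-Csv -Path $csvPath -NoTypeInformation

    # Upload the report to the reporting site.
    $report = Connect-PnPOnline -Url $ReportSiteUrl -ManagedIdentity -ReturnConnection
    Add-PnPFile -Path $csvPath -Folder $ReportFolder -Connection $report | Out-Null
}
```

Keeping the file-naming logic separate from the connection logic means the naming can be checked locally, while the connected part is only exercised inside the Automation account.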
Next
In part 2 of the article I'll cover how to approach adjusting labels, cleaning up versions, and reapplying labels for ongoing Retention with PowerShell.
Resources
For background on Purview Retention and Disposal see https://learn.microsoft.com/en-us/purview/retention
For all things M365, see the Microsoft 365 & Power Platform Community; for the command reference from PnP on GitHub, see PnP PowerShell Cmdlets.
Disclaimer
All content was created by the author, based on released information from Microsoft and Community after step-by-step testing and verification before committing to this article.
Generative AI has been used in the creation of the headline image. No other Generative AI was used for content creation.
Any errors or issues with the content in this article are entirely the author's responsibility.
About the author: Jonathan Stuckey




