
SHAREPOINT - DOCUMENT VERSION HISTORY

  • Writer: Jonathan Stuckey
  • 7 hours ago
  • 4 min read

Audience: Information Management Advisor, Operational Support, Solution Designers


Organisations relying on SharePoint and OneDrive often encounter persistent storage limitations, and understanding the underlying causes is essential for effective capacity management. This article examines where storage capacity in SharePoint (and OneDrive) goes missing, focusing on hidden drivers like document versioning bloat and the challenges posed by highly active or large files.



[Image: monkey in glasses holding a "Down the Rabbit Hole" book in a library aisle]
Why is there never enough room here?

Root cause of lost storage capacity in SharePoint

Loss of storage capacity in SharePoint and OneDrive is increasingly related to document versioning bloat. But that's not the root of the issue; it's just a symptom of other underlying problems. More often it's one of the following causes:


  1. Highly active documents in heavily trafficked site(s), where version settings are uncapped or excessive, and:

    a. files are frequently accessed and edited, over-writing data (budget reporting, project tracking, corporate comms...), otherwise known as 'transactional files'

    b. business-critical documents have multiple parties accessing and managing content (annual reports, audit documents, investigations, personnel files, contracts...), i.e. long-running (surviving) critical documents

 

[Image: the monkey again, reading "Down the Rabbit Hole" while sitting on a turtle atop a stack of books]
It's just versions... all the way down
  2. Or, file types with inherently large file sizes being retained and versioned (see the sketch after this list):

    a. database files copied or imported in migration (.db, .mdb, .accdb...)

    b. media or audio files, increasingly an issue with high-res (4K+) footage, meeting recordings, post-edited images etc.

    c. complex image or drawing files (.cad, .dwg, .dxf, .stl, .psd etc.)

    d. personal archives (.pst, .zip, .rar etc.) retained from migrations or closed projects
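
For illustration, here is a minimal sketch (not the author's tooling) of how such files could be found with Microsoft Graph. It assumes an app registration with Files.Read.All, a pre-acquired bearer token, and the target library's drive ID; the 100 MB threshold and the extension list are arbitrary placeholders, not recommendations.

```python
# Sketch: walk a SharePoint document library via Microsoft Graph and flag
# large or storage-heavy file types. TOKEN and DRIVE_ID are placeholders.
import requests

GRAPH = "https://graph.microsoft.com/v1.0"
TOKEN = "<access-token>"   # assumption: obtained via MSAL or similar
DRIVE_ID = "<drive-id>"    # assumption: the library's drive ID
HEADERS = {"Authorization": f"Bearer {TOKEN}"}

RISKY = {".mdb", ".accdb", ".mp4", ".psd", ".dwg", ".pst", ".zip", ".rar"}
THRESHOLD = 100 * 1024 * 1024  # flag anything over ~100 MB (arbitrary)

def walk(item_id="root"):
    """Recursively yield every file beneath the given folder."""
    url = f"{GRAPH}/drives/{DRIVE_ID}/items/{item_id}/children"
    while url:
        page = requests.get(url, headers=HEADERS).json()
        for item in page.get("value", []):
            if "folder" in item:
                yield from walk(item["id"])
            else:
                yield item
        url = page.get("@odata.nextLink")  # follow server-side paging

for f in walk():
    name = f.get("name", "")
    ext = "." + name.rsplit(".", 1)[-1].lower() if "." in name else ""
    if f.get("size", 0) > THRESHOLD or ext in RISKY:
        print(f'{f.get("size", 0):>12,}  {name}')
```

Run against a suspect library, this gives a quick shortlist of candidates for archiving or compression.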

There are also other contributing causes with a cumulative effect, consuming your storage in ways you may not be aware of, for example:


  1. unowned and unused legacy sites left behind after project close or use-case expiry,

  2. bulk disposal of transient items left pending when a site should have been closed or archived,

  3. stalled disposal of items in a site's Preservation Hold Library, which never releases the space,

  4. failed folder/file clean-up after release or removal of a retention policy on a site,

    ...
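
Before chasing any one of these causes, it helps to quantify the drain. As a rough sketch, assuming Microsoft Graph application permissions and a placeholder site ID, the quota facet on a site's default drive shows how much of its allocation is consumed:

```python
# Sketch: read a site's storage consumption from the Microsoft Graph
# drive "quota" facet. TOKEN and SITE_ID are placeholder assumptions.
import requests

GRAPH = "https://graph.microsoft.com/v1.0"
TOKEN = "<access-token>"
SITE_ID = "<site-id>"  # e.g. resolved via GET /sites/{hostname}:/sites/{path}
HEADERS = {"Authorization": f"Bearer {TOKEN}"}

drive = requests.get(f"{GRAPH}/sites/{SITE_ID}/drive", headers=HEADERS).json()
quota = drive.get("quota", {})
used, total = quota.get("used", 0), quota.get("total", 0)
print(f"{drive.get('name')}: {used / 2**30:.1f} GiB used of "
      f"{total / 2**30:.1f} GiB ({100 * used / max(total, 1):.0f}%)")
```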


It is worth noting that legacy default settings for document versioning are a technical root cause of the storage consumption issue, but they are not the operational cause. The operational impact actually stems from:


  • gaps in comprehension (knowledge or training),

  • systems failure (technical, automated), and

  • operational or process failure (people).


    [Infographic: three sections, Comprehension Gaps (orange), Systems Failure (yellow), Operational Failure (green)]
    areas and failings that exacerbate capacity drain

How did these operational issues eventuate?

Ironically, today's issues are the result of a combination of time and situationally appropriate decisions, inherited and never revisited. Specifically:


  • Operational choices – information lifecycle choices on versioning that support classic check-out/check-in and formal review, where versions are cleared down after publication approval. Great for slow, serial editing with formal release and long-term management, but these settings don't work for dynamic data changes and transactional documents.


  • System defaults and rate-of-change – legacy decisions embedded in templates used for new sites, long-running sites left unmanaged, and application-linked actions that operate without oversight. These models were appropriate when we worked on file servers, or with file-plans in an EDRMS, where growth volumes were low and documents changed infrequently. But they are rarely reconsidered against modern content creation and editorial practices, and so prevail.


  • Stagnant user knowledge – users now work with modern tools and dynamic editing options, yet nothing in the legacy of old training courses, historic guidance, word-of-mouth from colleagues and hidden settings alerts them to the need to manage versions on a document. This is an institutional failing in the support of staff, rebounding as uncapped storage growth with no operational revision.


What can we do about them?

Getting out of capacity drain requires a practical recovery plan, concrete actions, and a strategy for long-term management.


Start with the actions that release space fastest, but plan on sustained effort: if storage demand has already outrun control settings, recovering capacity will require both immediate remediation and short- to medium-term operating change:


  1. Remediate version sprawl first. Reset document management defaults, reduce unnecessary version retention on high-change content, clear transitory copies, and use retention labelling to automate disposal where content has reached end of life. (A version-trimming sketch follows this list.)


  2. Move low-value, low-use content out of prime storage. Archive older sites and files into a lower-cost tier, prioritising legacy content that must be retained but no longer needs day-to-day access.


  3. Treat large and rich-media files as a separate problem. Compress where possible, tighten lifecycle controls, improve metadata, and assess whether specialist digital asset management is needed for media-heavy content.


  4. Manage capacity as an ongoing service, not a clean-up exercise. Put reporting, alerting, trend analysis, and business forecasting in place so growth is visible early and funding, behaviour change, or service redesign can be planned deliberately.


  5. Integrate site lifecycle into operations. Use reporting to sort sites as System, App-linked (Engage, Calling...), Solution-grouped (intranet, report sites), Projects, or Accidental (Planner, Outlook). For sites with no activity in 18 months, decide whether to review, re-assign ownership, archive, or delete. (See the usage-report sketch below.)


  6. Get ahead of AI-generated content now. Classify it, define storage rules for it, and monitor growth of generated content separately before it becomes the next unmanaged source of volume and duplication, eroding the usefulness of your agents and AI.
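
To make step 1 concrete, here is a hedged sketch of trimming surplus versions on a single document through Microsoft Graph (the token, drive and item IDs are placeholders). Graph lists versions newest-first, so the slice below spares the most recent ones; version deletion is irreversible, so treat this strictly as an illustration and trial it on disposable content first.

```python
# Sketch: delete all but the newest KEEP versions of one document via
# Microsoft Graph. TOKEN, DRIVE_ID and ITEM_ID are placeholders.
import requests

GRAPH = "https://graph.microsoft.com/v1.0"
TOKEN = "<access-token>"
DRIVE_ID = "<drive-id>"
ITEM_ID = "<item-id>"
HEADERS = {"Authorization": f"Bearer {TOKEN}"}
KEEP = 10  # retain only the newest 10 versions (arbitrary cap)

url = f"{GRAPH}/drives/{DRIVE_ID}/items/{ITEM_ID}/versions"
versions = requests.get(url, headers=HEADERS).json().get("value", [])
# Versions arrive newest-first, so everything past KEEP is surplus
# (and the live version, listed first, is never touched).
for v in versions[KEEP:]:
    resp = requests.delete(f"{url}/{v['id']}", headers=HEADERS)
    print(f"deleted version {v['id']}: HTTP {resp.status_code}")
```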


Basically, chunk it down and make decisions by groups. Then tell the business users the pending action. The impact will be some noise, but also a growing realisation that blaming IT and the platform for their hoarding of unnecessary clutter won't fix the issues or avoid the costs. Sometimes you just need to clean up.
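
And for the reporting in steps 4 and 5, a minimal sketch using the Graph SharePoint site usage report to flag sites with no recent activity. The token is a placeholder needing Reports.Read.All; the column names follow the current report schema and may change, and the 18-month cut-off is computed from the 'Last Activity Date' column:

```python
# Sketch: flag SharePoint sites with no recent activity, using the Graph
# usage report (returned as CSV). TOKEN is a placeholder.
import csv
import datetime
import io

import requests

GRAPH = "https://graph.microsoft.com/v1.0"
TOKEN = "<access-token>"  # assumption: app permission Reports.Read.All
HEADERS = {"Authorization": f"Bearer {TOKEN}"}
CUTOFF = datetime.date.today() - datetime.timedelta(days=548)  # ~18 months

# Graph answers with a redirect to a CSV download; requests follows it.
resp = requests.get(
    f"{GRAPH}/reports/getSharePointSiteUsageDetail(period='D180')",
    headers=HEADERS,
)
for row in csv.DictReader(io.StringIO(resp.text.lstrip("\ufeff"))):
    last = row.get("Last Activity Date") or ""
    if not last or datetime.date.fromisoformat(last) < CUTOFF:
        print(row.get("Site Id"), row.get("Storage Used (Byte)"),
              last or "no recorded activity")
```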


Clearing out old stuff keeps costs down, and AI working well!

Just remember, once you've got this far you've only made it through the first step. There's still work to be done to keep things working well.


In the next article I will flesh out the operational practices for better management of the platform.




Disclaimer

Generative AI has been used for the creation of the article images and for QA review prior to publishing. No AI was used in the creation of the article copy. All content was created by the author, based on released information from Microsoft. Any errors or issues with the content of this article are entirely the author's responsibility.


About the author: Jonathan Stuckey

