Introduction
Sometimes, you need to download a file from a document library. What if you need to create a PowerShell script that runs without a signed-in user and downloads all files from all SharePoint sites?
It looks complicated, but it is not.
Steps should be:
1. Register a new Entra ID application
2. Add and grant the Sites.Read.All application permission for the Microsoft Graph
3. List all SharePoint sites
4. List all document libraries of each site
5. List all folders and files of each document library
6. Download all files
7. List all subsites of each site
8. Repeat steps 4-7 for each subsite
The Microsoft Graph API exposes several endpoints that can help you.
List all sites
The endpoint GET v1.0/sites/getAllSites retrieves all SharePoint sites across all geographies. The endpoint doesn't return subsites and works only with application permissions.
Use the Get-MgAllSite cmdlet to list all sites.
List all subsites
To list subsites defined for a site, use the endpoint GET v1.0/sites/{siteId}/sites or the Get-MgSubSite cmdlet.
List all document libraries
The Site resource has a relationship to a collection of Drive resources. The Drive resource is the top-level container for a SharePoint document library.
The endpoint GET v1.0/sites/{siteId}/drives retrieves the collection of document libraries under the site. The Get-MgSiteDrive cmdlet provides the same functionality.
List all folders and files
To list all folders and files inside the document library, call the endpoint GET v1.0/drives/{driveId}/items/{itemId}/children. The endpoint returns a collection of driveItem resources, each of which represents a file, folder, or other item stored in a drive.
Start with the root folder by passing root as the itemId ⇒ GET v1.0/drives/{driveId}/items/root/children.
Then call the endpoint recursively for each folder.
The Get-MgDriveItemChild cmdlet provides the same functionality.
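The recursion can be tried out without calling the service. The following is a minimal sketch that walks an in-memory tree shaped like the children responses. The sample data and the Get-SampleChild and Get-SampleFilePath functions are hypothetical stand-ins; in a real script the lookup would be replaced by a call to the children endpoint.

```powershell
# Hypothetical sample data: item ID -> child items, mimicking the shape of the
# children responses (a 'folder' facet marks folders, a 'file' facet marks files).
$sampleChildren = @{
    'root' = @(
        @{ id = '1'; name = 'Reports'; folder = @{ childCount = 2 } },
        @{ id = '2'; name = 'readme.txt'; file = @{} }
    )
    '1'    = @(
        @{ id = '3'; name = '2024.xlsx'; file = @{} },
        @{ id = '4'; name = 'Archive'; folder = @{ childCount = 0 } }
    )
}

function Get-SampleChild {
    param ([string]$ItemId)
    # Stand-in for GET v1.0/drives/{driveId}/items/{itemId}/children
    if ($sampleChildren.ContainsKey($ItemId)) { return $sampleChildren[$ItemId] }
    return @()
}

function Get-SampleFilePath {
    param (
        [string]$ItemId = 'root',
        [string]$ParentPath = ''
    )
    foreach ($item in (Get-SampleChild -ItemId $ItemId)) {
        $path = if ($ParentPath) { "$ParentPath/$($item.name)" } else { $item.name }
        if ($item.ContainsKey('folder')) {
            # Folder: recurse only when it reports children
            if ($item.folder.childCount -gt 0) {
                Get-SampleFilePath -ItemId $item.id -ParentPath $path
            }
        }
        else {
            # File: emit its path relative to the library root
            $path
        }
    }
}

Get-SampleFilePath
# → Reports/2024.xlsx
# → readme.txt
```

The empty Archive folder produces no output because the sketch skips folders whose child count is zero.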
Download a file
To download a file, call the endpoint GET v1.0/drives/{driveId}/items/{itemId}/content. The endpoint returns the file content.
The Get-MgDriveItemContent cmdlet saves the content to the file specified by the OutFile parameter.
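For readers who prefer raw requests over the cmdlets, the endpoints described so far can be assembled as plain URLs. This is only a sketch: the Get-GraphUri helper, its parameter names, and the Kind values are hypothetical, and the IDs are placeholders.

```powershell
function Get-GraphUri {
    param (
        [Parameter(Mandatory)]
        [ValidateSet('AllSites', 'SubSites', 'Drives', 'Children', 'Content')]
        [string]$Kind,
        [string]$SiteId,
        [string]$DriveId,
        # 'root' addresses the root folder of a drive
        [string]$ItemId = 'root'
    )
    $base = 'https://graph.microsoft.com/v1.0'
    switch ($Kind) {
        'AllSites' { "$base/sites/getAllSites" }
        'SubSites' { "$base/sites/$SiteId/sites" }
        'Drives'   { "$base/sites/$SiteId/drives" }
        'Children' { "$base/drives/$DriveId/items/$ItemId/children" }
        'Content'  { "$base/drives/$DriveId/items/$ItemId/content" }
    }
}

Get-GraphUri -Kind Children -DriveId '<drive_id>'
# → https://graph.microsoft.com/v1.0/drives/<drive_id>/items/root/children
```

The resulting URL can be passed to Invoke-MgGraphRequest with -Method GET. For the content endpoint, the -OutputFilePath parameter of that cmdlet writes the response to disk.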
Required permissions
The only permission you need is the application permission Sites.Read.All. This permission allows the app to read all sites, all documents and download the documents.
Download files from sites
The function DownloadFiles reads all child items of a drive item. If a child item is a file, the file is downloaded. If it is a folder, DownloadFiles is called recursively to download all files from all folders.
function DownloadFiles {
    param (
        [string]$DriveId,
        [string]$DriveItemId,
        [string]$DownloadPath
    )
    try {
        # -All follows paging so that every child item is returned
        $driveItems = Get-MgDriveItemChild -DriveId $DriveId -DriveItemId $DriveItemId -All -ErrorAction Stop
        foreach ($driveItem in $driveItems) {
            # Only folders carry a child count
            if (-not [string]::IsNullOrEmpty($driveItem.Folder.ChildCount)) {
                $subfolderPath = Join-Path -Path $DownloadPath -ChildPath "$($driveItem.Name)"
                Write-Host "Creating folder $($driveItem.Name)"
                New-Item -Path $subfolderPath -ItemType Directory -Force | Out-Null
                if ($driveItem.Folder.ChildCount -gt 0) {
                    DownloadFiles -DriveId $DriveId -DriveItemId $driveItem.Id -DownloadPath $subfolderPath
                }
            }
            else {
                $filePath = Join-Path -Path $DownloadPath -ChildPath "$($driveItem.Name)"
                Write-Host "Downloading file $($driveItem.Name)"
                try {
                    Get-MgDriveItemContent -DriveId $DriveId -DriveItemId $driveItem.Id -OutFile $filePath -ErrorAction Stop
                }
                catch {
                    # A failed file is reported, and the loop continues with the next item
                    Write-Error "Failed to download the file $($driveItem.Id) from drive $($DriveId): $($_.Exception.Message)"
                }
            }
        }
    }
    catch {
        Write-Error "Failed to read items inside the folder $DriveItemId from drive $($DriveId): $($_.Exception.Message)"
    }
}
The function ReadDocumentLibrariesAndSubsites reads all document libraries and subsites of a site. For each document library, the DownloadFiles function is called to download all files.
function ReadDocumentLibrariesAndSubsites {
    param (
        [string]$SiteId,
        [string]$SiteName,
        [string]$SiteWebUrl
    )
    Write-Host "$($SiteName): $($SiteWebUrl)"
    # The site ID has the form "hostname,siteCollectionId,webId"
    $siteIdParts = $SiteId.Split(',')
    $siteFolder = "$($siteIdParts[1])_$($siteIdParts[2])".Replace("-", "")
    $folderName = Join-Path -Path $localDownloadPath -ChildPath $siteFolder
    # Create folder for a site
    New-Item -Path $folderName -ItemType Directory -Force | Out-Null
    try {
        # -All follows paging so that no document library is missed
        $drives = Get-MgSiteDrive -SiteId $SiteId -All -ErrorAction Stop
        foreach ($drive in $drives) {
            $subfolderPath = Join-Path -Path $folderName -ChildPath "$($drive.Name)"
            New-Item -Path $subfolderPath -ItemType Directory -Force | Out-Null
            DownloadFiles -DriveId $drive.Id -DriveItemId 'root' -DownloadPath $subfolderPath
            Write-Host "Finished processing site '$($SiteName)' and library '$($drive.Name)'"
        }
    }
    catch {
        Write-Error "Failed to retrieve document libraries of site $($SiteId): $($_.Exception.Message)"
    }
    try {
        $subsites = Get-MgSubSite -SiteId $SiteId -All -ErrorAction Stop
        foreach ($subsite in $subsites) {
            ReadDocumentLibrariesAndSubsites -SiteId $subsite.Id -SiteName $subsite.Name -SiteWebUrl $subsite.WebUrl
        }
    }
    catch {
        Write-Error "Failed to retrieve subsites of site $($SiteId): $($_.Exception.Message)"
    }
}
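The local folder for each site is named after the composite site ID, which Microsoft Graph returns in the form hostname,siteCollectionId,webId. Below is a minimal sketch of that step in isolation. Get-SiteFolderName is a hypothetical helper, not part of the script, and the ID is made up.

```powershell
function Get-SiteFolderName {
    param (
        [Parameter(Mandatory)]
        [string]$SiteId
    )
    # A Graph site ID has three comma-separated parts:
    # hostname, site collection GUID, web (site) GUID.
    $parts = $SiteId.Split(',')
    if ($parts.Count -ne 3) {
        throw "Unexpected site ID format: $SiteId"
    }
    # Join both GUIDs and drop the dashes, as the function above does
    return "$($parts[1])_$($parts[2])".Replace('-', '')
}

# Made-up ID, for illustration only
Get-SiteFolderName -SiteId 'contoso.sharepoint.com,11111111-2222-3333-4444-555555555555,aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee'
# → 11111111222233334444555555555555_aaaaaaaabbbbccccddddeeeeeeeeeeee
```

Using the two GUIDs rather than the site name keeps folder names unique even when two sites share a display name.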
In the main code, connect to the Microsoft Graph API, retrieve all sites across all geographies, and call the ReadDocumentLibrariesAndSubsites function for each site.
Write-Host "Backup started"
# Load the modules that provide the site and drive cmdlets
Import-Module Microsoft.Graph.Sites
Import-Module Microsoft.Graph.Files
Connect-MgGraph -TenantId $tenantId -ClientSecretCredential $clientSecretCredential
try {
    # -All follows paging; -ErrorAction Stop routes failures to the catch block
    $sites = Get-MgAllSite -All -ErrorAction Stop
    foreach ($site in $sites) {
        ReadDocumentLibrariesAndSubsites -SiteId $site.Id -SiteName $site.Name -SiteWebUrl $site.WebUrl
    }
    Write-Host "Backup done"
}
catch {
    Write-Error "Backup failed"
    Write-Error $_.Exception.Message
}
The whole script:
# Define variables
$appId = '<client_id>'
$tenantId = '<tenant_id>'
$clientSecret = '<client_secret>'
$clientSecretCredential = New-Object -TypeName System.Management.Automation.PSCredential -ArgumentList $appId, (ConvertTo-SecureString -String $clientSecret -AsPlainText -Force)
$localDownloadPath = "C:\Downloads\SharePointFiles"

function DownloadFiles {
    param (
        [string]$DriveId,
        [string]$DriveItemId,
        [string]$DownloadPath
    )
    try {
        # -All follows paging so that every child item is returned
        $driveItems = Get-MgDriveItemChild -DriveId $DriveId -DriveItemId $DriveItemId -All -ErrorAction Stop
        foreach ($driveItem in $driveItems) {
            # Only folders carry a child count
            if (-not [string]::IsNullOrEmpty($driveItem.Folder.ChildCount)) {
                $subfolderPath = Join-Path -Path $DownloadPath -ChildPath "$($driveItem.Name)"
                Write-Host "Creating folder $($driveItem.Name)"
                New-Item -Path $subfolderPath -ItemType Directory -Force | Out-Null
                if ($driveItem.Folder.ChildCount -gt 0) {
                    DownloadFiles -DriveId $DriveId -DriveItemId $driveItem.Id -DownloadPath $subfolderPath
                }
            }
            else {
                $filePath = Join-Path -Path $DownloadPath -ChildPath "$($driveItem.Name)"
                Write-Host "Downloading file $($driveItem.Name)"
                try {
                    Get-MgDriveItemContent -DriveId $DriveId -DriveItemId $driveItem.Id -OutFile $filePath -ErrorAction Stop
                }
                catch {
                    # A failed file is reported, and the loop continues with the next item
                    Write-Error "Failed to download the file $($driveItem.Id) from drive $($DriveId): $($_.Exception.Message)"
                }
            }
        }
    }
    catch {
        Write-Error "Failed to read items inside the folder $DriveItemId from drive $($DriveId): $($_.Exception.Message)"
    }
}

function ReadDocumentLibrariesAndSubsites {
    param (
        [string]$SiteId,
        [string]$SiteName,
        [string]$SiteWebUrl
    )
    Write-Host "$($SiteName): $($SiteWebUrl)"
    # The site ID has the form "hostname,siteCollectionId,webId"
    $siteIdParts = $SiteId.Split(',')
    $siteFolder = "$($siteIdParts[1])_$($siteIdParts[2])".Replace("-", "")
    $folderName = Join-Path -Path $localDownloadPath -ChildPath $siteFolder
    # Create folder for a site
    New-Item -Path $folderName -ItemType Directory -Force | Out-Null
    try {
        # -All follows paging so that no document library is missed
        $drives = Get-MgSiteDrive -SiteId $SiteId -All -ErrorAction Stop
        foreach ($drive in $drives) {
            $subfolderPath = Join-Path -Path $folderName -ChildPath "$($drive.Name)"
            New-Item -Path $subfolderPath -ItemType Directory -Force | Out-Null
            DownloadFiles -DriveId $drive.Id -DriveItemId 'root' -DownloadPath $subfolderPath
            Write-Host "Finished processing site '$($SiteName)' and library '$($drive.Name)'"
        }
    }
    catch {
        Write-Error "Failed to retrieve document libraries of site $($SiteId): $($_.Exception.Message)"
    }
    try {
        $subsites = Get-MgSubSite -SiteId $SiteId -All -ErrorAction Stop
        foreach ($subsite in $subsites) {
            ReadDocumentLibrariesAndSubsites -SiteId $subsite.Id -SiteName $subsite.Name -SiteWebUrl $subsite.WebUrl
        }
    }
    catch {
        Write-Error "Failed to retrieve subsites of site $($SiteId): $($_.Exception.Message)"
    }
}

Write-Host "Backup started"
# Load the modules that provide the site and drive cmdlets
Import-Module Microsoft.Graph.Sites
Import-Module Microsoft.Graph.Files
Connect-MgGraph -TenantId $tenantId -ClientSecretCredential $clientSecretCredential
try {
    # -All follows paging; -ErrorAction Stop routes failures to the catch block
    $sites = Get-MgAllSite -All -ErrorAction Stop
    foreach ($site in $sites) {
        ReadDocumentLibrariesAndSubsites -SiteId $site.Id -SiteName $site.Name -SiteWebUrl $site.WebUrl
    }
    Write-Host "Backup done"
}
catch {
    Write-Error "Backup failed"
    Write-Error $_.Exception.Message
}
Performance
Of course, performance plays an important role. The script downloads files one by one, so if you have dozens of SharePoint sites with a lot of documents, a backup can take tens of minutes. For hundreds of sites, it can take hours.
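One way to shorten repeated runs is to skip files that have not changed since the previous backup. The following is a sketch under assumptions, not part of the script above: Test-DownloadNeeded is a hypothetical helper, and it assumes that the local file's last write time is the time it was downloaded and that the driveItem's last modified timestamp is available.

```powershell
function Test-DownloadNeeded {
    param (
        [Parameter(Mandatory)]
        [string]$FilePath,
        [Parameter(Mandatory)]
        [datetime]$RemoteLastModified
    )
    # Missing locally: always download
    if (-not (Test-Path -LiteralPath $FilePath -PathType Leaf)) {
        return $true
    }
    # Present: download only if the remote copy changed after the local copy was written
    $localWriteTime = (Get-Item -LiteralPath $FilePath).LastWriteTimeUtc
    return $localWriteTime -lt $RemoteLastModified.ToUniversalTime()
}
```

Inside DownloadFiles, the call to Get-MgDriveItemContent could then be wrapped in a check such as `if (Test-DownloadNeeded -FilePath $filePath -RemoteLastModified $driveItem.LastModifiedDateTime) { ... }`. Only timestamps are compared here. Comparing file sizes as well would be an additional assumption about how the service reports them.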
Conclusion
In this article, I described how to easily browse all SharePoint document libraries and download all files.