How to download all files from all SharePoint sites with Microsoft Graph PowerShell SDK

Introduction

Sometimes, you need to download a file from a document library. But what if you need a PowerShell script that runs without a signed-in user and downloads all files from all SharePoint sites?

It looks complicated, but it is not.

The steps are:

  1. Register a new Entra ID application
  2. Add the Sites.Read.All application permission for Microsoft Graph and grant admin consent
  3. List all SharePoint sites
  4. List all document libraries of each site
  5. List all folders and files of each document library
  6. Download all files
  7. List all subsites of each site
  8. Repeat steps 4-7 for each subsite

The Microsoft Graph API exposes several endpoints that can help you with these steps.

List all sites

The endpoint GET v1.0/sites/getAllSites retrieves all SharePoint sites across all geographies. The endpoint doesn't return subsites and works only with application permissions.

Use the Get-MgAllSite cmdlet to list all sites.
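
As a minimal sketch, assuming the Microsoft.Graph.Sites module is installed and Connect-MgGraph has already been called with application permissions, listing the sites could look like this. The helper name Get-SiteOverview is hypothetical and not part of the SDK; the -All switch asks the SDK to follow result pages.

```powershell
# Hypothetical helper (not part of the SDK): lists every site with a few key properties.
# Assumes an app-only Connect-MgGraph session already exists.
function Get-SiteOverview {
    # -All makes the SDK follow paging links until every site is returned
    Get-MgAllSite -All |
        Select-Object -Property Id, DisplayName, WebUrl
}
```

Calling Get-SiteOverview | Format-Table DisplayName, WebUrl would then print one row per site.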

List all subsites

To list subsites defined for a site, use the endpoint GET v1.0/sites/{siteId}/sites or the Get-MgSubSite cmdlet.
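
Because subsites can have their own subsites, the call has to be repeated for each level. The following is a rough sketch of that recursion, assuming an existing Connect-MgGraph session; Get-SubsiteTree is a hypothetical helper name.

```powershell
# Hypothetical helper: returns all subsites below a site, depth-first.
# Assumes an app-only Connect-MgGraph session already exists.
function Get-SubsiteTree {
    param (
        [Parameter(Mandatory)] [string]$SiteId
    )

    foreach ($subsite in (Get-MgSubSite -SiteId $SiteId -All)) {
        $subsite                              # emit the direct subsite
        Get-SubsiteTree -SiteId $subsite.Id   # then everything below it
    }
}
```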

List all document libraries

The site resource has a relationship to a collection of drive resources. A drive resource is the top-level container for a SharePoint document library.

The endpoint GET v1.0/sites/{siteId}/drives retrieves the collection of document libraries under the site. The Get-MgSiteDrive cmdlet provides the same functionality.
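
A short sketch of that call, assuming a connected session; Get-SiteLibrary is a hypothetical wrapper that keeps only the properties that are useful for a backup.

```powershell
# Hypothetical helper: lists the document libraries (drives) of one site.
# Assumes an app-only Connect-MgGraph session already exists.
function Get-SiteLibrary {
    param (
        [Parameter(Mandatory)] [string]$SiteId
    )

    # -All follows paging in case the site has many libraries
    Get-MgSiteDrive -SiteId $SiteId -All |
        Select-Object -Property Id, Name, DriveType, WebUrl
}
```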

List all folders and files

To list all folders and files inside the document library, call the endpoint GET v1.0/drives/{driveId}/items/{itemId}/children. The endpoint returns a collection of driveItem resources, each of which represents a file, folder, or other item stored in a drive.

Start with the root folder by passing root as the itemId: GET v1.0/drives/{driveId}/items/root/children. Then call the endpoint recursively for each folder.

The Get-MgDriveItemChild cmdlet provides the same functionality.
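
To illustrate the recursion without downloading anything, here is a sketch of a dry run that only lists files. Get-DriveFileList and its RelativePath parameter are hypothetical; the folder test relies on the assumption that only folders carry a Folder.ChildCount value.

```powershell
# Hypothetical helper: walks a folder tree and emits one object per file, without downloading.
# Assumes an app-only Connect-MgGraph session already exists.
function Get-DriveFileList {
    param (
        [Parameter(Mandatory)] [string]$DriveId,
        [string]$DriveItemId = 'root',
        [string]$RelativePath = ''
    )

    foreach ($item in (Get-MgDriveItemChild -DriveId $DriveId -DriveItemId $DriveItemId -All)) {
        $itemPath = if ($RelativePath) { "$RelativePath/$($item.Name)" } else { $item.Name }

        if ($null -ne $item.Folder.ChildCount) {
            # Assumption: a ChildCount value means the item is a folder, so recurse into it
            Get-DriveFileList -DriveId $DriveId -DriveItemId $item.Id -RelativePath $itemPath
        }
        else {
            [pscustomobject]@{ Path = $itemPath; Size = $item.Size; Id = $item.Id }
        }
    }
}
```

Piping the result to Measure-Object -Property Size -Sum gives a rough idea of how much data a full download would transfer.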

Download a file

To download a file, call the endpoint GET v1.0/drives/{driveId}/items/{itemId}/content. The endpoint returns the file content.

The Get-MgDriveItemContent cmdlet saves the content to the file specified by the OutFile parameter.
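
A minimal sketch of a single download, assuming a connected session. Save-DriveFile and its parameters are hypothetical names chosen for illustration.

```powershell
# Hypothetical helper: downloads one drive item into a target folder and returns the local file.
# Assumes an app-only Connect-MgGraph session already exists.
function Save-DriveFile {
    param (
        [Parameter(Mandatory)] [string]$DriveId,
        [Parameter(Mandatory)] [string]$DriveItemId,
        [Parameter(Mandatory)] [string]$FileName,
        [Parameter(Mandatory)] [string]$TargetFolder
    )

    # Make sure the target folder exists; -Force keeps this step repeatable
    New-Item -Path $TargetFolder -ItemType Directory -Force | Out-Null

    $filePath = Join-Path -Path $TargetFolder -ChildPath $FileName
    Get-MgDriveItemContent -DriveId $DriveId -DriveItemId $DriveItemId -OutFile $filePath -ErrorAction Stop

    Get-Item -LiteralPath $filePath
}
```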

Required permissions

The only permission you need is the application permission Sites.Read.All. This permission allows the app to read all sites, all documents and download the documents.
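
Before starting a long run, it can help to confirm that the session really is app-only and carries the permission. The sketch below is one possible check; Test-GraphAppPermission is a hypothetical helper, and it assumes that Get-MgContext reports the granted application permissions in its Scopes property.

```powershell
# Hypothetical helper: checks the current Graph session for an application permission.
# Assumption: Get-MgContext lists granted application permissions in Scopes.
function Test-GraphAppPermission {
    param (
        [string]$Permission = 'Sites.Read.All'
    )

    $context = Get-MgContext
    if ($null -eq $context) {
        return $false   # not connected at all
    }

    ($context.AuthType -eq 'AppOnly') -and ($context.Scopes -contains $Permission)
}
```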

Download files from sites

The function DownloadFiles reads all child items of a drive item. If a child item is a file, the file is downloaded. If it is a folder, DownloadFiles is called recursively to download all files from all folders.

function DownloadFiles {
    param (
        [string]$DriveId,
        [string]$DriveItemId,
        [string]$DownloadPath
    )

    try {
        $driveItems = Get-MgDriveItemChild -DriveId $DriveId -DriveItemId $DriveItemId -All -ErrorAction Stop

        foreach ($driveItem in $driveItems) {
            # A non-empty Folder.ChildCount marks the item as a folder
            if (-not [string]::IsNullOrEmpty($driveItem.Folder.ChildCount)) {
                $subfolderPath = Join-Path -Path $DownloadPath -ChildPath "$($driveItem.Name)"
                Write-Host "Creating folder $($driveItem.Name)"
                New-Item -Path $subfolderPath -ItemType Directory -Force | Out-Null

                if ($driveItem.Folder.ChildCount -gt 0) {
                    DownloadFiles -DriveId $DriveId -DriveItemId $driveItem.Id -DownloadPath $subfolderPath
                }
            }
            else {
                $filePath = Join-Path -Path $DownloadPath -ChildPath "$($driveItem.Name)"
                Write-Host "Downloading file $($driveItem.Name)"
                try {
                    Get-MgDriveItemContent -DriveId $DriveId -DriveItemId $driveItem.Id -OutFile $filePath -ErrorAction Stop
                }
                catch {
                    Write-Error "Failed to download the file $($driveItem.Id) from drive $DriveId"
                }
            }
        }
    }
    catch {
        Write-Error "Failed to read items inside the folder $DriveItemId from drive $DriveId"
    }
}

The function ReadDocumentLibrariesAndSubsites will read all document libraries and subsites of a site. For each document library, the DownloadFiles function is called to download all files.

function ReadDocumentLibrariesAndSubsites {
    param (
        [string]$SiteId,
        [string]$SiteName,
        [string]$SiteWebUrl
    )

    Write-Host "$($SiteName): $($SiteWebUrl)"
    $siteIdParts = $SiteId.Split(',')
    $folderName = Join-Path -Path $localDownloadPath -ChildPath "$($siteIdParts[1])_$($siteIdParts[2])".Replace("-","")
    # Create folder for a site
    New-Item -Path $folderName -ItemType Directory -Force | Out-Null

    try {
        $drives = Get-MgSiteDrive -SiteId $SiteId -All -ErrorAction Stop
        foreach ($drive in $drives) {
            $subfolderPath = Join-Path -Path $folderName -ChildPath "$($drive.Name)"
            New-Item -Path $subfolderPath -ItemType Directory -Force | Out-Null
            DownloadFiles -DriveId $drive.Id -DriveItemId 'root' -DownloadPath $subfolderPath
            Write-Host "All files downloaded from site '$($SiteName)' and library '$($drive.Name)'"
        }
    }
    catch {
        Write-Error "Failed to retrieve document libraries of site $SiteId"
    }

    try {
        $subsites = Get-MgSubSite -SiteId $SiteId -All -ErrorAction Stop
        foreach($subsite in $subsites) {
            ReadDocumentLibrariesAndSubsites -SiteId $subsite.Id -SiteName $subsite.Name -SiteWebUrl $subsite.WebUrl
        }
    }
    catch {
        Write-Error "Failed to retrieve subsites of site $SiteId"
    }
}
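
The folder name for a site is derived from its ID. To make that step easier to follow, here is the same logic isolated in a small sketch; Get-SiteFolderName is a hypothetical helper and the ID below is made up for illustration.

```powershell
# Hypothetical helper that isolates the folder-naming logic used above.
function Get-SiteFolderName {
    param (
        [Parameter(Mandatory)] [string]$SiteId
    )

    # A Graph site ID has three comma-separated parts:
    # hostname, site collection ID, site (web) ID
    $parts = $SiteId.Split(',')

    # Join the two GUIDs and strip the hyphens
    ("{0}_{1}" -f $parts[1], $parts[2]).Replace('-', '')
}

# Made-up site ID, for illustration only
Get-SiteFolderName -SiteId 'contoso.sharepoint.com,11111111-2222-3333-4444-555555555555,aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee'
# → 11111111222233334444555555555555_aaaaaaaabbbbccccddddeeeeeeeeeeee
```

Using the two GUIDs rather than the site name keeps folder names unique even when several sites share a display name.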

In the main code, connect to the Microsoft Graph API, retrieve all sites across all geographies, and call the ReadDocumentLibrariesAndSubsites function for each site.

Write-Host "Backup started"
Connect-MgGraph -TenantId $tenantId -ClientSecretCredential $clientSecretCredential

Import-Module Microsoft.Graph.Sites

try {
    $sites = Get-MgAllSite -All

    foreach ($site in $sites) {
        ReadDocumentLibrariesAndSubsites -SiteId $site.Id -SiteName $site.Name -SiteWebUrl $site.WebUrl
    }

    Write-Host "Backup done"
}
catch {
    Write-Error "Backup failed"
    Write-Error $_.Exception.Message
}

The whole script:

# Define variables
$appId = '<client_id>'
$tenantId = '<tenant_id>'
$clientSecret = '<client_secret>'
$clientSecretCredential = New-Object -TypeName System.Management.Automation.PSCredential -ArgumentList $appId, (ConvertTo-SecureString -String $clientSecret -AsPlainText -Force)
$localDownloadPath = "C:\Downloads\SharePointFiles"

function DownloadFiles {
    param (
        [string]$DriveId,
        [string]$DriveItemId,
        [string]$DownloadPath
    )

    try {
        $driveItems = Get-MgDriveItemChild -DriveId $DriveId -DriveItemId $DriveItemId -All -ErrorAction Stop

        foreach ($driveItem in $driveItems) {
            # A non-empty Folder.ChildCount marks the item as a folder
            if (-not [string]::IsNullOrEmpty($driveItem.Folder.ChildCount)) {
                $subfolderPath = Join-Path -Path $DownloadPath -ChildPath "$($driveItem.Name)"
                Write-Host "Creating folder $($driveItem.Name)"
                New-Item -Path $subfolderPath -ItemType Directory -Force | Out-Null

                if ($driveItem.Folder.ChildCount -gt 0) {
                    DownloadFiles -DriveId $DriveId -DriveItemId $driveItem.Id -DownloadPath $subfolderPath
                }
            }
            else {
                $filePath = Join-Path -Path $DownloadPath -ChildPath "$($driveItem.Name)"
                Write-Host "Downloading file $($driveItem.Name)"
                try {
                    Get-MgDriveItemContent -DriveId $DriveId -DriveItemId $driveItem.Id -OutFile $filePath -ErrorAction Stop
                }
                catch {
                    Write-Error "Failed to download the file $($driveItem.Id) from drive $DriveId"
                }
            }
        }
    }
    catch {
        Write-Error "Failed to read items inside the folder $DriveItemId from drive $DriveId"
    }
}

function ReadDocumentLibrariesAndSubsites {
    param (
        [string]$SiteId,
        [string]$SiteName,
        [string]$SiteWebUrl
    )

    Write-Host "$($SiteName): $($SiteWebUrl)"
    $siteIdParts = $SiteId.Split(',')
    $folderName = Join-Path -Path $localDownloadPath -ChildPath "$($siteIdParts[1])_$($siteIdParts[2])".Replace("-","")
    # Create folder for a site
    New-Item -Path $folderName -ItemType Directory -Force | Out-Null

    try {
        $drives = Get-MgSiteDrive -SiteId $SiteId -All -ErrorAction Stop
        foreach ($drive in $drives) {
            $subfolderPath = Join-Path -Path $folderName -ChildPath "$($drive.Name)"
            New-Item -Path $subfolderPath -ItemType Directory -Force | Out-Null
            DownloadFiles -DriveId $drive.Id -DriveItemId 'root' -DownloadPath $subfolderPath
            Write-Host "All files downloaded from site '$($SiteName)' and library '$($drive.Name)'"
        }
    }
    catch {
        Write-Error "Failed to retrieve document libraries of site $SiteId"
    }

    try {
        $subsites = Get-MgSubSite -SiteId $SiteId -All -ErrorAction Stop
        foreach($subsite in $subsites) {
            ReadDocumentLibrariesAndSubsites -SiteId $subsite.Id -SiteName $subsite.Name -SiteWebUrl $subsite.WebUrl
        }
    }
    catch {
        Write-Error "Failed to retrieve subsites of site $SiteId"
    }
}

Write-Host "Backup started"
Connect-MgGraph -TenantId $tenantId -ClientSecretCredential $clientSecretCredential

Import-Module Microsoft.Graph.Sites

try {
    $sites = Get-MgAllSite -All

    foreach ($site in $sites) {
        ReadDocumentLibrariesAndSubsites -SiteId $site.Id -SiteName $site.Name -SiteWebUrl $site.WebUrl
    }

    Write-Host "Backup done"
}
catch {
    Write-Error "Backup failed"
    Write-Error $_.Exception.Message
}

Performance

Performance matters for a script like this. It downloads files one at a time, so if you have dozens of SharePoint sites with a lot of documents, a backup can take tens of minutes. For hundreds of sites, it can take hours.
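
One possible way to shorten repeated runs is to skip files that have not changed since the last download. The script in this article does not do this; the sketch below is only an illustration. Test-DownloadNeeded is a hypothetical helper, and it assumes that the driveItem's LastModifiedDateTime value carries correct time zone information.

```powershell
# Hypothetical helper: decides whether a file has to be downloaded (again).
function Test-DownloadNeeded {
    param (
        [Parameter(Mandatory)] [string]$FilePath,
        [Parameter(Mandatory)] [datetime]$RemoteLastModified
    )

    if (-not (Test-Path -LiteralPath $FilePath -PathType Leaf)) {
        return $true    # nothing on disk yet
    }

    $localWriteTimeUtc = (Get-Item -LiteralPath $FilePath).LastWriteTimeUtc

    # Download again only if the remote copy changed after the local file was written
    return $RemoteLastModified.ToUniversalTime() -gt $localWriteTimeUtc
}
```

Inside DownloadFiles, the call to Get-MgDriveItemContent could then be wrapped in if (Test-DownloadNeeded -FilePath $filePath -RemoteLastModified $driveItem.LastModifiedDateTime) { ... }.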

Conclusion

In this article, I described how to easily browse all SharePoint document libraries and download all files.
