Find Invalid Links in Excel That Cannot Be Broken
If an Excel file contains links to other workbooks, Excel will offer to update the data every time you open the file:
If the linked workbook doesn't exist, it will provide an option to edit the invalid link:
If the link was added by mistake, you might want to remove it. There is an option to do that in the dialog that opens - Break Link:
Unfortunately, the command silently fails in certain scenarios, i.e. the link remains present after the command is executed. Supposedly, this can happen when the link is used in named ranges or in data validation. To efficiently check the names, Name Manager can be used. However, the only way to check data validations is to open every one of them and check the source specified:
Although, in my case I knew which cells had data validation defined, there where too many of them to manually find the problematic one by checking them one by one. Then it occurred to me that the Excel workbook format is actually compressed XML which can be searched through with any descent text editor.
I made a copy of the .xlsx
file and changed its extension to .zip
. I then decompressed its contents into a folder and used Visual Studio Code's Find in Files functionality to find the worksheet(s) containing the link to the missing file (the filename can be found in Excel's Edit Links dialog). It is enough to only search the XML files.
In my workbook, the only match was found in xl\worksheets\sheet15.xml
:
<x14:dataValidation type="list" allowBlank="1" showInputMessage="1">
<x14:formula1>
<xm:f>'C:\Users\Damir\Documents\[MissingFile.xlsx]MySheet'!#REF!</xm:f>
</x14:formula1>
<xm:sqref>E13:E37</xm:sqref>
</x14:dataValidation>
Now, I needed to match this file to a worksheet in Excel. I first checked xl\workbook.xml.rels
and found the following matching entry:
<Relationship Id="rId15"
Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/worksheet"
Target="worksheets/sheet15.xml"/>
Target
attribute contains my filename of my match. I need the value of Id
attribute to find the actual name of the worksheet in Excel. In xl\workbook.xml
there is an entry with it:
<sheet name="ActualName" sheetId="42" r:id="rId15"/>
I found it based on r:id
attribute value. The name
attribute contains the worksheet name as displayed in Excel. Now I could finally reopen the workbook in Excel, navigate to ActualName worksheet, and open the Data Validation dialog for the E13:E37
range of cells (as specified in the snippet from xl\worksheets\sheet15.xml
above). After I changed the Source there, the Break Link command in Edit Links dialog succeeded as well - no more nagging dialogs every time I open the file.