XLRD - how to read multiple excel files into SPSS

classic Classic list List threaded Threaded
2 messages Options
Reply | Threaded
Open this post in threaded view
Report Content as Inappropriate

XLRD - how to read multiple excel files into SPSS

Tal Spalter
I installed SPSS version 24 (with Python essentials) on my laptop. I need to read multiple excel files into one SPSS dataset and I know there is a custom dialog box that can help one do that through Python. I installed Python version 3.4.
I used SPSS tutorials to install two modules: XLWT, XLRD (excel read and write):
And when I run the test syntax or use the custom dialog box named: ”read and merge excel files”, I get an error telling me that those modules (XLWT, XLRD) do not exist (scroll to the end of error please):
begin program.
This syntax was pasted from an SPSS custom dialog found at www.spss-tutorials.com/read-and-merge-multiple-sheet-excel-workbooks/.
Version: 0.1.
def xlsToSpss(rdir,varnames=True,sheets='all'):
    import xlrd,spss,os
    fils=[fil for fil in os.listdir(rdir) if fil.endswith(".xls")]
    firstSheet = True #First sheet that's actually read => only get varnames once
    for cnt,fil in enumerate(fils):
        #print fil
        if sheets != 'all':
            if "," in sheets: #multiple comma separated sheet indices (starting from 1!)
                sheetIndices = [int(i) - 1 for i in sheets.split(',')]
            else: #single sheet index
                sheetIndices = [int(sheets) - 1]
        else: #all sheets in each workbook
            sheetIndices = range(wb.nsheets)
        for sheetIndex in sheetIndices:
            ws = wb.sheets()[sheetIndex]
            if firstSheet: #specify variable names just once
                firstSheet = False
                if varnames: #read variable names from sheet
                    vNames = ["source_file"]+["source_sheet"]+ws.row_values(0)
                else: #standard variable names
                    vNames = ["source_file"]+["source_sheet"]+["column_%d"%(i + 1) for i in range(ws.ncols)]
                #print vNames
            fRow = 1 if varnames else 0 #if varnames, first row doesn't hold data values
            for row in range(fRow,ws.nrows):
                allData.append([fil]+[ws.name]+[val for val in ws.row_values(row)])
    mxLens=[0]*len(vNames) #find required lengths for SPSS string variables, zero for numeric
    for line in allData:
        for cnt in range(len(line)):
            if isinstance(line[cnt],basestring) and len(line[cnt])>mxLens[cnt]:
    with spss.DataStep(): #start sending data to SPSS
        nds = spss.Dataset('*')
        for var in zip(vNames,mxLens):
            nds.varlist.append(var[0],var[1]) #var[1] is zero (numeric) or length (string)
        for line in allData:
            nds.cases.append([None if val=='' else val for val in line])
xlsToSpss(r'C:\Users\SpalterTa2\Desktop\SPSS\Read and merge excel files_test',varnames=True,sheets='all')
end program.
Traceback (most recent call last):
  File "<string>", line 44, in <module>
  File "<string>", line 7, in xlsToSpss
ImportError: No module named xlrd

Thank you in advance for any suggestion on how to solve this,

To manage your subscription to SPSSX-L, send a message to
[hidden email] (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
For a list of commands to manage subscriptions, send the command
Reply | Threaded
Open this post in threaded view
Report Content as Inappropriate

Re: XLRD - how to read multiple excel files into SPSS

Andy W
This is a time when the error message is pretty straight forward - did you install the xlrd module for python? The error message suggests you did not.

Ruben gives a nice tutorial on how to install the xlrd module for python if you follow his links, https://www.spss-tutorials.com/xlrd-python/.
Andy W