Scientific Applications

Further Use of Excel

This experiment provides a further grounding in use of advanced facilities in Excel for scientific data analysis. In this experiment you will also see that there are many alternative methods for performing operations in Excel. The choice of approach partly relates to personal preference, but also to whether a series of operations needs to be automated or run manually. For repetitive tasks it is often desirable to program a spreadsheet.

Your answer will consist mainly of a number of printouts of spreadsheets and macros. When you present your answers indicate clearly which printout refers to which stage in the exercises.

Find and open the spreadsheet EXPT2. It consists of a dataset organised into two columns, and a small macro.

Save the spreadsheet under another name and perform all calculations on this new spreadsheet.

A typical spreadsheet consists of several worksheets and macros.

  1. Create a new worksheet using the "Insert" menu.

  2. Name this worksheet "Results" by double clicking on the word "Sheet2" at the bottom of the spreadsheet.

  3. Return to "Sheet1". Select the data, copy and paste it into the top left hand corner of "Results". In this way you do not alter the original sheet, and can always return to the data if you have made a mistake.

  4. In worksheet "Results" select the cells A1 to A30 and name them (using the "Insert" menu) "X". Name the cells B1 to B30 "Y". An aim of this experiment is to fit a line between "X" and "Y" of the form y = bx. We will be looking at more features of Excel, than in the previous experiment.

  5. Change the number format in column A so that there is always one decimal place after the decimal point, and in column B to give two decimal places, using the "Cells" option in the "Format" menu.

Print out this worksheet.

  1. There are several ways of doing regression. One is by using the standard analysis toolkit, as in the previous experiment. The other is by using matrix operations. Spreadsheets are very suitable for matrices. There are several simple matrix operations, one of which involves a transpose. Select the region of the spreadsheet F1 to AI1 which is 1 row and 30 columns wide, and place the transpose of "X" in this as follows. When the region is selected, type "=TRANSPOSE(X)" in the input bar. When you have completed this end this by pressing "SHIFT" "CONTROL" "ENTER" keys simultaneously. This combination of keys indicates a matrix operation. Name this new vector "XT".

  2. In cell A32 enter the sum of squares of the X vector using the "SUMSQ" function.

  3. The regression constant can be obtained by multiplying the transpose of "X" by "Y" and dividing by the sum of squares of X. A function of the form MMULT(XT,Y)/A32 should give the correct answer. Put this in the destination cell A34 (note : think about how to do this according to the instructions above, and remember that this is a matrix operation).

What is the regression constant?

  1. In the previous experiment, you used the Data Analysis tools to calculate the regression constant. Transfer the data to a third, new, worksheet, and use the Data Analysis regression tool to determine this constant. If correctly performed, you should obtain identical answers to "8" above.

Print out the worksheet showing both methods of calculation.

  1. In cells C1 to C30 calculate the predicted values of Y, using the result of regression obtained in A34 (remember to employ the "$" notation where appropriate).

  2. Calculate the residuals between B1 to B30 and C1 to C30 in cells D1 to D30.

  3. If the regression analysis has worked, the average residual should be close to 0. Calculate this using the AVERAGE function and place in D32. Approximately as many residuals should be above 0 as below 0. The number of residuals in each category can be calculated using the COUNTIF function (note you may have to discover how to use this with the HELP function). Calculate the percent proportion of residuals greater and less than 0 in cells D33 and D34.

What are these numbers?

  1. Macros are small programs in Visual Basic that allow more flexibility in the use of spreadsheets. Copy "Sheet1" worksheet into a new worksheet "New" (use "Copy sheet" in the "Edit" menu). If you make a mistake, you can return to the original worksheet. Remember to save the overall spreadsheet regularly.

  2. Name the columns A1 to A30 "XA" and B1 to B30 "XB" in sheet "New". These new names are so as not to confuse with the names in the sheet "Results", as names are global over all worksheets.

  3. Instead of laying out the intermediate steps of regression, matrices and functions can be combined into one statement. The result of regression can be obtained by selecting A32, and, typing (remembering to terminate the command by "SHIFT" "CONTROL" "ENTER") =MMULT(TRANSPOSE(XA),XB)/SUMSQ(XA)

  4. A simple macro is available called "REGRESS". Find this program which you will be asked to edit. Run the program as it currently stands.

Print out the result of running this macro.

  1. In order to understand the macro, you must know some statements in Visual Basic
    • Dim is used to set up variables and arrays respectively
    • Loops are set up using the For and Next statements
    • The Cells statement is important in Visual Basic
    • Each cell is referenced by its row and column; note that the row number is the first address and the column the second address.
    • Both text and numbers can be read from and into cells in the spreadsheet.
    • A Rem statement is used to insert comments.
    Insert "Rem" statements into the code of the macro.

Print out the commented macro and describe how it works.

  1. Edit the macro to perform the following functions (you are advised to make a copy of the sheet "New" in case you make a mistake).
    • Start with sheet "New" and a blank column C, with the data initially in cells A1 to B30, and the slope, calculated as above, in cell A32. You are not required to write a macro to calculate the slope, although this is not too difficult.
    • After reading in the data from the worksheet, put the 30 x values into cells A2 to A31 and the corresponding y and predicted values into cells B2 to B31 and C2 to C31 respectively, calculating the predicted values as the macro is run. Note : the result of this will be that the new cell A2 corresponds to the old cell A1 and so on. Think carefully about how you structure your loops; there are several ways of doing this.
    • In cell A1 place the column heading "x" (Hint : use the statement Cells(1,1)="x"), in cell B1 the column heading "y" and in cell C1 the column heading "predicted". The result will be to shift the data by one row, and title the columns.
    • Replace cell A32 by the word "slope", and put the slope, originally in cell A32, (stored as "z") into cell B32.
    • Calculate the residuals between y and predicted in the macro and place in cells D2 to D31. In cell D1 place the word "residuals".
    • Calculate the mean residual from the macro and place in cell B33. Place the word "mean residual" in cell A33.

Insert comments to your new macro, and print the new commented macro out.

  1. Run the macro (you may need to do this several times to test it out; keep the final correct version of the worksheet).

  2. Note that column A may now be too narrow to hold the word "mean residual". Some improvements to the layout of the spreadsheet are possible. Change the width of columns A to D as appropriate (ask a demonstrator if necessary). Remove the gridlines of the spreadsheet in the "Options" submenu of the "Tools" menu. Change the font to "Times Roman". Change the fonts of the words "slope" and "mean residual" to bold. Highlight the background of cells B32 and B33 in yellow. (Note all these operations could be performed automatically using a macro, but you are not asked to do this; interested students could investigate these possibilities - most information is available using the HELP system).

Print out the final worksheet.