excel Counting and Summing Techniques

A common task in Excel is conditional counting or summing. This section contains a number of
formula examples that deal with counting various items on a worksheet, based on single or multiple criteria. You can adapt these formulas to your own needs.
Excel 2007 introduced two new counting and summing functions that aren’t available
in previous versions (COUNTIFS and SUMIFS). Therefore, I present two versions of
some formulas: an Excel 2007 and later version and an array formula that works with
all recent versions of Excel.
Figure 3-8 shows a simple worksheet to demonstrate the formulas that follow. The following
range names are defined:
Figure 3-8: This worksheet demonstrates some useful formulas for counting and summing.
h Month: A2:A10
h Region: B2:B10
h Sales: C2:C10
Counting formula examples
Table 3-3 contains formulas that demonstrate a variety of counting techniques.
Table 3-3: Counting Formula Examples
Formula Description
=COUNTIF(Region,”North”) Counts the number of rows in which Region = “North”
=COUNTIF(Sales,300) Counts the number of rows in which Sales = 300
=COUNTIF(Sales,”>300”) Counts the number of rows in which Sales > 300
=COUNTIF(Sales,”<>100”) Counts the number of rows in which Sales <> 100
=COUNTIF(Region,”?????”) Counts the number of rows in which Region contains
five letters
=COUNTIF(Region,”*h*”) Counts the number of rows in which Region contains the
letter H (not case-sensitive)
=COUNTIFS(Month,”Jan”,Sales,”>200”) Counts the number of rows in which Month = “Jan” and
Sales > 200 (Excel 2007 and later)
{=SUM((Month=”Jan”)*(Sales>200))} An array formula that counts the number of rows in
which Month = “Jan” and Sales > 200
=COUNTIFS(Month,”Jan”,Region,”North”) Counts the number of rows in which Month = “Jan” and
Region = “North” (Excel 2007 and later)
{=SUM((Month=”Jan”)*(Region=”North”))} An array formula that counts the number of rows in
which Month = “Jan” and Region = “North”
=COUNTIFS(Month,”Jan”,Region,”North”)+
COUNTIFS(Month,”Jan”,Region,”South”)
Counts the number of rows in which Month = “Jan” and
Region = “North” or “South” (Excel 2007 and later)
{=SUM((Month=”Jan”)*((Region=”North”)+
(Region=”South”)))}
An array formula that counts the number of rows in
which Month = “Jan” and Region = “North” or “South”
=COUNTIFS(Sales,”>=300”,Sales,”<=400”) Counts the number of rows in which Sales is between 300 and 400 (Excel 2007 and later) {=SUM((Sales>=300)*(Sales<=400))} An array formula that counts the number of rows in which Sales is between 300 and 400 Summing formula examples Table 3-4 shows a number of formula examples that demonstrate a variety of summing techniques. Table 3-4: Summing Formula Examples Formula Description =SUMIF(Sales,”>200”) Sum of all Sales over 200
=SUMIF(Month,”Jan”,Sales) Sum of Sales in which Month = “Jan”
=SUMIF(Month,”Jan”,Sales)+SUMIF(Month,”Feb”,Sales) Sum of Sales in which Month =”Jan” or “Feb”

Chapter 3: Formula Tricks and Techniques 71
Formula Description
{=SUM((Month=”Jan”)*(Region=”North”)*Sales)} Sum of Sales in which Month=”Jan” and
Region=”North”
=SUMIFS(Sales,Month,”Jan”,Region,”North”) Sum of Sales in which Month=”Jan” and
Region=”North” (Excel 2007 and later)
{=SUM((Month=”Jan”)*(Region=”North”)*Sales)} An array formula that returns the sum of Sales
in which Month=”Jan” and Region=”North”
=SUMIFS(Sales,Month,”Jan”,Region,”<>North”) Sum of Sales in which Month=”Jan” and Region
<> “North” (Excel 2007 and later)
{=SUM((Month=”Jan”)*(Region<>”North”)*Sales)} An array formula that returns the sum of Sales
in which Month=”Jan” and Region <> “North”
=SUMIFS(Sales,Month,”Jan”,Sales,”>=200”) Sum of Sales in which Month=”Jan” and
Sales>=200 (Excel 2007 and later)
{=SUM((Month=”Jan”)*(Sales>=200)*(Sales))} An array formula that returns the sum of Sales
in which Month=”Jan” and Sales>=200
=SUMIFS(Sales,Sales,”>=300”,Sales,”<=400”) Sum of Sales between 300 and 400 (Excel 2007 and later) {=SUM((Sales>=300)*(Sales<=400)*(Sales))} An array formula that returns the sum of Sales between 300 and 400 Other counting tools Other ways to count or sum cells that meet certain criteria are: h Filtering (using a table) h Advanced filtering h The DCOUNT and DSUM functions h Pivot tables For more information, consult the Help system. Working with Dates and Times Excel uses a serial number system to store dates. The earliest date that Excel can understand is January 1, 1900. This date has a serial number of 1. January 2, 1900, has a serial number of 2, and so on. Most of the time, you don’t have to be concerned with Excel’s serial number date system. You simply enter a date in a familiar date format, and Excel takes care of the details behind the scenes. For example, if you need to enter August 15, 2010, you can simply enter the date by typing August 15, 2010 (or use any of a number of different date formats). Excel interprets your entry and stores the value 40405, which is the serial number for that date. 72 Part I: Some Essential Background In this chapter, I assume the U.S. date system. If your computer uses a different date system, you’ll need to adjust accordingly. For example, you might need to enter 15 August, 2010. Entering dates and times When working with times, you simply enter the time into a cell in a recognized format. Excel’s system for representing dates as individual values is extended to include decimals that represent portions or fractions of days. In other words, Excel perceives all time with the same system whether that time is a particular day, a certain hour, or a specific second. For example, the date serial number for August 15, 2010, is 40405. Noon (halfway through the day) is represented internally as 40405.5. Again, you normally don’t have to be concerned with these fractional serial numbers. Because dates and times are stored as serial numbers, it stands to reason that you can add and subtract dates and times. For example, you can enter a formula to calculate the number of days between two dates. If cells A1 and A2 both contain dates, the following formula returns the number of intervening days: =A2-A1 When performing calculations with time, things get a bit trickier. When you enter a time without an associated date, the date is assumed to be January 0, 1900 (date serial number 0). This is not a problem — unless your calculation produces a negative time value. When this happens, Excel displays an error (displayed as #########). The solution? Switch to the 1904 date system. Display the Excel Options dialog box, click the Advanced tab, and then enable the Use 1904 Date System check box. Be aware that switching to the 1904 date system can cause problems with dates already entered in your file or dates in workbooks that are linked to your file. In some cases, you may need to use time values to represent duration, rather than a point in time. For example, you may need to sum the number of hours worked in a week. When you add time values, you can’t display more than 24 hours. For each 24-hour period, Excel simply adds another day to the total. The solution is to change the number formatting to use square brackets around the hour part of the format. The following number format, for example, displays more than 24 hours: [hh]:mm Chapter 3: Formula Tricks and Techniques 73 Using pre-1900 dates The world, of course, didn’t begin on January 1, 1900. People who work with historical information when using Excel often need to work with dates before January 1, 1900. Unfortunately, the only way to work with pre-1900 dates is to enter the date into a cell as text. For example, you can enter the following into a cell, and Excel won’t complain: July 4, 1776 You can’t, however, perform any manipulation on dates that are actually text. For example, you can’t change its formatting, you can’t determine which day of the week this date occurred on, and you can’t calculate the date that occurs seven days later. VBA, however, supports a much wider range of dates. I created a number of VBA worksheet functions that allow you to work with pre-1900 dates. Figure 3-9 shows a demonstration of these functions used in a worksheet. It’s also an excellent example of how VBA can extend the features in Excel. Figure 3-9: The Extended Date Functions add-in lets you work with pre-1900 dates. See Chapter 10 for more information about the Extended Date functions. 74 Part I: Some Essential Background Creating Megaformulas Often, a formula requires intermediate formulas to produce a desired result. In other words, a formula may depend on other formulas, which in turn depend on other formulas. After you get all these formulas working correctly, you can often eliminate the intermediate formulas and use what I refer to as a single megaformula instead. The advantages? You use fewer cells (less clutter), the file size is smaller, and recalculation may even be a bit faster. The main disadvantage is that the formula may be impossible to decipher or modify. Here’s an example: Imagine a worksheet that has a column with thousands of people’s names. And suppose that you’ve been asked to remove all the middle names and middle initials from the names — but not all the names have a middle name or initial. Editing the cells manually would take hours, and even Excel’s Data➜Data Tools➜Text To Columns command isn’t much help. So you opt for a formulabased solution. Although this task isn’t difficult, it normally involves several intermediate formulas. Figure 3-10 shows the results of the more conventional solution, which requires six intermediate formulas shown in Table 3-5. The names are in column A; the end result goes in column H. Columns B through G hold the intermediate formulas. Figure 3-10: Removing the middle names and initials requires intermediate formulas. Table 3-5: Intermediate Formulas Written In Row 2 in Figure 3-10 Column Intermediate Formula What It Does B =TRIM(A2) Removes excess spaces. C =FIND(“ “,B2,1) Locates the first space. D =FIND(“ “,B2,C2+1) Locates the second space. Returns #VALUE! if there is no second space. E =IF(ISERROR(D2),C2,D2) Uses the first space if no second space exists. F =LEFT(B2,C2) Extracts the first name. G =RIGHT(B2,LEN(B2)-E2) Extracts the last name. H =F2&G2 Concatenates the two names. Chapter 3: Formula Tricks and Techniques 75 You can eliminate the intermediate formulas by creating a megaformula. You do so by creating all the intermediate formulas and then going back into the final result formula and replacing each cell reference with a copy of the formula in the cell referred to (without the equal sign). Fortunately, you can use the Clipboard to copy and paste. Keep repeating this process until cell H2 contains nothing but references to cell A2. You end up with the following megaformula in one cell: =LEFT(TRIM(A2),FIND (“ “,TRIM(A2),1))&RIGHT(TRIM(A2),LEN(TRIM(A2))- IF(ISERROR(FIND(“ “,TRIM(A2),FIND(“ “,TRIM(A2),1)+1)), FIND(“ “,TRIM(A2),1),FIND(“ “,TRIM(A2),FIND (“ “,TRIM(A2),1)+1))) When you’re satisfied that the megaformula is working, you can delete the columns that hold the intermediate formulas because they’re no longer used. The megaformula performs exactly the same tasks as all the intermediate formulas — although it’s virtually impossible for anyone to figure out, even the author. If you decide to use megaformulas, make sure that the intermediate formulas are performing correctly before you start building a megaformula. Even better, keep a single copy of the intermediate formulas somewhere in case you discover an error or need to make a change. Another way to approach this problem is to create a custom worksheet function in VBA. Then you could replace the megaformula with a simple formula, such as =NOMIDDLE(A1) In fact, I wrote such a function to compare it with intermediate formulas and megaformulas. The listing follows. Function NOMIDDLE(n) As String Dim FirstName As String, LastName As String n = Application.WorksheetFunction.Trim(n) FirstName = Left(n, InStr(1, n, “ “)) LastName = Right(n, Len(n) - InStrRev(n, “ “)) NOMIDDLE = FirstName & LastName End Function A workbook that contains the intermediate formulas, the megaformula, and the NOMIDDLE VBA function is available on the companion CD-ROM. The workbook is named megaformula.xlsm. 76 Part I: Some Essential Background Because a megaformula is so complex, you may think that using one slows down recalculation. Actually, that’s not the case. As a test, I created a workbook that used the megaformula 175,000 times. Then I created another workbook that used six intermediate formulas to compute the 175,000 results. I compared the results in terms of calculation time and file size; see Table 3-6. Table 3-6: Comparing Intermediate Formulas and Megaformula Method Recalculation Time (Seconds) File Size Intermediate formulas 5.8 12.60MB Megaformula 3.9 2.95MB The actual results will vary significantly, depending on system speed, amount of memory installed, and the actual formula. The VBA function was much slower — I abandoned the timed test after five minutes. This is fairly typical of VBA functions; they are always slower than built-in Excel functions. 77 4 Understanding Excel Files In This Chapter ● Starting Exce