1. INDEX-MATCH Formula
1.1 Data Extraction
Data extraction is one of the most common data manipulation processes in the spreadsheet. The extractions can range from a simple extraction of data to conditional extraction. Values within a range or an array are mapped using their row and column coordinates. Therefore, the main problem encountered in data extraction is how to define the row and column coordinates. Once we know the coordinates, we can easily access the value of that associated with the row and column coordinates.
Advanced spreadsheet users typically use the combination of INDEX function and MATCH function to extract the data instead of using the VLOOKUP/HLOOKUP function. The combination of INDEX and MATCH is commonly known as the INDEX-MATCH formula. It’s one of the most useful formulas in spreadsheets, especially in Excel, due to its flexibility and efficiency. Let’s break down the two functions to understand how they work together.
1.2 INDEX Function
The INDEX function returns a value or cell reference based on a given row and column coordinates within a data set. The latter function is not explicitly defined by Google Sheets, but in Excel, the INDEX function is known to also return a cell reference. It turns out, we can also do the same in Google Sheets.
1.2.1 INDEX Syntax
Below is the syntax for the INDEX function:

There are three inputs in the INDEX function of Google Sheets: reference, row and column. The reference is the range of cells or array of values. The row and column are the relative row and column coordinates of a value within the reference. Both inputs are optional, which means that we can omit them from the function. If we do this, they will take their default values, which are both 0. Let’s take a look at a basic example (Fig. 1.1) to help us build an intuition on how the formula works.
The goal of the task below is to extract the amount for the code YY-1003. First, let’s identify the relative row and column coordinates of the given code. The reference is A2:C6. This means that the first entry is A2. Therefore, relative to A2, A2 is row 1, column 1 (1,1). A3 is row 2, column 1 (2,1) and so on. The relative row and column coordinates for the amount of YY-1003 is (1,3). We input these coordinates as the row and column inputs of the INDEX function, and it will locate the cell or value using the given coordinates.

Figure 1.1 Basic example of INDEX Process
The scenario above showcased the simple application of the INDEX function to extract a value from a data set. We’ll explore more advanced applications in the succeeding sections, but we’ll discuss first the rules and properties of its inputs.
1.2.2 INDEX: Breaking Down the Formula

Figure 1.2 INDEX Function
In this section, we’ll talk about the rules and properties for each of the inputs of the INDEX function, how to write them, how the INDEX function processes these inputs and how the inputs affect the output. We’ll build an intuition to make it easier for us to visualize and create an INDEX formula for our problems.
1.2.2.1 Reference
Here are the important rules that you need to consider for the Reference:
1.2.2.1.1 Syntax and Location of the Reference
In Google Sheets, the reference can be written or coded in three ways depending on the scenario:
- Range of Cells:
Typically, our data set is stored in a sheet, and we want to be able to use this set of data in our formulas. In order to do that, we write the references of the cells where the data set is located. In the formula above, the reference A2:C6 means that the data set is stored from column A, row 2 to column C, row 6. The reference defines the address of the cell, and the formula uses the reference to access the value within the referenced cell.
The main advantage of referencing the cells is that the formula becomes dynamic. We can change the values within the referenced cells, and the formula will update automatically in response to the changes.
We can write the reference in the formula or highlight the cell. If we perform the latter, the spreadsheet automatically writes the references for us (Fig. 1.3), so we don’t have to write the references manually.

Figure 1.3 Referencing cells by highlighting them
Also, it’s important to know that we can reference the cells from any sheet. This means that the cells can be:
- in the same sheet with the function
- in another sheet within the same workbook
- Or in another sheet in another workbook (using ImportRange function)
- Hard-Coded Array:
We can directly write the list of values within the formula. For now, we can think of an array as a list of values that are not located in any sheet. To write them directly, we need to write the list inside the curly brackets {} (see the formula below).
=INDEX({1;2;3;4;5},3)
There are times where you don’t want to add any values within your sheets, and this method will let you do that. You don’t have to write the values in the sheet and reference them in the formula, especially if the values are constants and the size of the list is small. You can write the values directly like the formula above.
This method is for a smaller list of values that are constant. It’s inconvenient to use this in larger sizes of data or scenarios where we want your formula to be dynamic.
- Arrays from other Functions
Sometimes we don’t want to use our raw data set directly. There are times where we want to extract some information from the raw data and use that extracted information in the formula.
The task in the example below is to return the number part of the code: YY-1003 (Fig. 1.4). The RIGHT function is used to extract the numbers for each of the cells in A2:A6. Result of the RIGHT function will be an array containing the numbers. The array is then fed in the INDEX function as the reference. The RIGHT function normally doesn’t accept a range of cells or an array as input. In order for it to evaluate the range of cells, we must use the ArrayFormula function.

Figure 1.4 RIGHT function nested inside the INDEX function
An array or a range of cells for the values will not work in most of the functions. We need to use the ArrayFormula function in order for those functions to evaluate the arrays or ranges of cells. We can write the ArrayFormula function and write a function inside it, or once you’re done writing the formula, you can just press CTRL+SHIFT+ENTER, and Google Sheets will enclose the whole formula with the ArrayFormula function.
1.2.2.1.2 Dimension and Size of Reference
In spreadsheets, the dimension is related to rows and the columns. If our data is written in one column but multiple rows or one row but multiple columns, the dimension of our data is one-dimensional (1D). If our data set consists of multiple rows and columns like a table, then the dimension is two dimensional (2D).
The rows and columns serve as coordinates of the cells within the spreadsheet. For 1D data, we only need to specify one coordinate (row or column) to know where the value is located. For 2D, we need to define the row and column to locate the value.
The size of the data is related to the number of rows and columns, which also translate to the number of entries.
The reference can be 1D or 2D:
- One-Dimensional (1D):
- Horizontal List:
- Cells (Fig. 1.5)

Figure 1.5 1D Reference: one row, multiple columns
- Hard-coded Arrays: (values in a horizontal list are separated by commas)
=INDEX({“YY-1003″,”YY-1004″,”ZZ-1001”},1)
The array here is the same as the range B1:D1 in Fig. 1.5.
- Vertical List:
- Cells (Fig.1.6)

Figure 1.6 1D Reference: one column, multiple rows
- Hard-coded Arrays: (values in a vertical list are separated by semicolons)
=INDEX({“YY-1003″;”YY-1004″;”ZZ-1001”},1)
- Two-Dimensional (2D):
- Cells: (Fig. 1.7)

Figure 1.7 2D Reference: Multiple Rows and Columns
- Hard-Coded Arrays: (the array below is a 3 (row) by 2 (column) array)
=INDEX({“YY-1003″,2000;”YY-1004″,1500;”ZZ-1001”,1000},1,2)
1.2.2.1.3 Data Type of Reference
The data type can be anything and can vary within the data set itself.
1.2.2.2 Row and Column
The row and column are numbers that define the coordinates of a value within the reference, and both have the same rules and properties.
1.2.2.2.1 Syntax and Location of the Row and Column
- Cell Reference
The coordinates can be stored in cells and can be referenced in the function:
=INDEX(A2:B4,C1,D1)
- Constant Value
The coordinates can be hard-coded in the formula as constants:
=INDEX(A2:B4,1,2)
- Value from other Functions
The coordinates can be results from other functions:
=INDEX(B2:B4,MATCH(“YY-1003”,A2:A4,0))
1.2.2.2.2 Dimension and Size of Row and Column
The row and column coordinates can only accept single values. The INDEX function can’t evaluate 1D or 2D lists for these inputs even if we use the ArrayFormula function. It can’t perform array operations.
1.2.2.2.3 Data Type of Row and Column
The data type for both inputs are integers, which range from 0 up to any positive integer.
1.2.3 INDEX: The Process
We already had a glimpse of how it works from section 1.2.1, and in this section we’ll look at more detailed examples of how the INDEX function processes its inputs
1.2.3.1 Row and Column Default Values
The coordinates range from 1 up to any positive integer. The indexing of the row and column always starts at 1, but the default values for both inputs are 0. What happens when the row and column are zero. Let’s look at the examples below.
The row in the example below is omitted (Fig. 1.8), and therefore, it will take its default value, which is 0. If row is 0, then the INDEX function will extract all of the cells within the given column. Since the given column is 1, the INDEX function will extract the whole Code Column.

Figure 1.8 Row is 0
If the column is omitted or 0, then the INDEX function will extract the whole row (Fig. 1.9). If both are 0, then the INDEX function will extract the whole data set (Fig. 1.10).

Figure 1.9 Column is 0

Figure 1.10 Both Row and Column are 0
1.2.3.2 Reference: ID Lists
If the reference is a 1D list (horizontal or vertical), we don’t need to input the column. The INDEX function automatically knows that the data set is 1D. Therefore, even if the reference is a horizontal list, the row input will be treated as a column coordinate (Fig. 1.11).
The reference (B1:D1) in the example below is a horizontal list. If we want to extract the value of the 2nd entry, we need to input row =1 and column =2:
=INDEX(B1:D1,1,2)
But we can shorten this formula by omitting the column input and writing it in the row input instead as shown in Fig. 1.11. Note that in a 1D list, we only need one coordinate in order to define the location of a value. The INDEX function follows this rule. If the reference is a 1D list, the row input serves as a coordinate input for any value within the 1D list. Therefore, It can be a row or a column.

Figure 1.11 Reference: 1D Horizontal List
1.2.3.3 Row and Column: Out of Bounds
If one or both of the coordinates are out of bounds, the INDEX function will return the #NUM! Error (Fig. 1.12). The column in the formula below is 4 and is greater than the column size of the reference, which is 3. The resulting coordinates (1,4) will be out of bounds. Therefore, the INDEX function will return an error.

Figure 1.12 Row and Column are Out of Bounds
1.2.3.4 Row and Column: Array and Range of Cells
The INDEX function can’t perform array operations. Therefore, the row and column coordinates can’t be an array or a range of cells. If we input arrays or ranges of cells for both inputs, the INDEX function will only evaluate the first entries for each of the inputs (Fig. 1.13).
One implication of this property is that we can’t perform selected multiple extraction just like the example below. The INDEX function will not extract (1,2) and (2,3). It will only extract the first row and column pair, which is (1,2).

Figure 1.13 Row and Column are 1D Vertical Arrays
1.2.3.5 Returning a Cell Reference
We can use the INDEX function to return a cell reference, but we need to use it in a way where the resulting formula implies a reference. Let’s look at the example below.
The task in the example below is to extract the table to a new location (from A1:E4 to A6:E9) (Fig. 1.14). The problem is that the column size of the data set is dynamic. It increases with the month. Therefore, we need a dynamic formula that can adjust with the data set.
One way to do this is to write a dynamic cell reference using the INDEX function. If the data set is static, we can write:
=A1:E4
But the formula above won’t work because the data set is dynamic. To solve this, we can replace E4 with the INDEX function as shown below. The reference input will be the row 4 of the spreadsheet (i.e., 4:4). We can treat it as a horizontal list from A4 up to the last cell in row 4. For the coordinate, we use the COUNTA function to return the number of cells in row 4 that contains a value. The resulting count will also be the current column size. In the example, the COUNTA function will return 5. This result will be fed into the INDEX function. The 5th entry in the reference 4:4 is E4. The resulting reference will be A1:E4, and the formula will return the values from this reference. If the data set is updated, the formula will adjust based on the changes in column size.
This is the simplest way to return a reference using the INDEX function. The A1:INDEX() in the formula implies a reference, and therefore, the INDEX function will return a reference instead of a value.

Figure 1.14 Returning a Cell Reference
1.2.3.6 Formatting
Cell formatting is not copied to the output except for value formatting. Notice the formatting of values in the Amount Column in the reference (B2:B4) and the output data (E2:E4) (Fig. 1.15). The value formatting is carried over to the output data. (The cell formatting in the example below is applied manually.)

Figure 1.15 Value Formatting
1.2.4 INDEX Output
1.2.4.1 Dimension and Size of the Output
We’ve seen in section 1.2.3 that the dimension, size and arrangement of the output of the INDEX function will be dependent on row and column coordinates. If one or both of the coordinates are 0, then the result is an array. Otherwise, the output will always be a single value. For array results, just make sure that the destination cells of the outputs are vacant cells. If there are values that are occupying the cells where the results should be displayed, then an error will be returned instead of the results (Fig. 1.16).

Figure 1.16 The INDEX function returns a #REF error due to D3 blocking the results
1.2.4.2 Data Type of the Output
The data type of the output will be the same as with the data type of the reference.
1.3 MATCH Function
The MATCH function is used to return a position of a value within a range of cells or an array given the value. In short, it returns a coordinate of a value within a given array.
1.3.1 MATCH Syntax
Below is the syntax for the MATCH function:

There are three inputs in the MATCH function: search_key, range and search_type. The search_key is the lookup value or the value that we want to search. The range is the list where we want to search the lookup value. The search_type is related to the type of matching or searching. It’s an optional input with a default value of 1. Therefore, we can omit the search_type from the formula, and it will take its default value. Let’s take a look at a basic example (Fig. 1.17) to help us build an intuition on how the formula works.
The task in the example below is to determine the location of the given code in C2 from the list in the Code Column (A2:A6). The value in C2 (i.e., YY-1004) will be the search_key, the range is the list (A2:A6) and the search_type is 0, which means that the type of match is an exact match. The exact match follows the linear search method. It starts with the first entry in the list and checks if the value is equal to the search_key. If the result is true, then the coordinate of the value is returned. Otherwise, it will move on to the next. The linear search stops once the condition is satisfied (i.e., value = search_key) or all of the entries are evaluated. Therefore, it will always return the coordinate of the first match. If there’s no match, the MATCH function will return the #N/A Error. In the case below, A3 = YY-1004, which is the 2nd row in the given range.

Figure 1.17 Basic Example of MATCH Process
There are three inputs for the search_type: -1, 0 and 1. If we input 0, the search method is a linear method, where the objective is to find an exact match within the data set. The other values, which are -1 and 1, use the binary search method. This method is more complicated than linear search. We’ll expound on this later in section 1.3.2. It’s important to note here that if the search_type is -1, the assumption is that the data set is sorted in descending order while if the search_type is 1, then the assumption is that the data set is sorted in ascending order. This may not be clear now, but we’ll make sense of this in the succeeding sections.
1.3.2 MATCH: Breaking Down the Formula

Figure 1.18 MATCH Function
In this section, we’ll talk about the rules and properties for each of the inputs of the MATCH function, how to write them, how the MATCH function processes these inputs and how the inputs affect the output. We’ll build an intuition to make it easier for us to visualize and create an MATCH formula for our problems.
1.3.2.1 Search_Key
Here are the important rules that you need to consider for the Search_Key:
1.3.2.1.1 Syntax and Location of the Search_Key
In Google Sheets, the search_key can be written or coded in four different ways depending on the scenario:
- Reference to a cell:
=MATCH(C2,A2:A6,0)
We can write the reference in the formula or highlight the cell. If we perform the latter, the spreadsheet automatically writes the references for us (Fig. 1.19), so we don’t have to write the references manually.

Figure 1.19 Referencing cells by highlighting them
Also, it’s important to know that we can refer to the cells from any sheet. This means that the cells can be:
- in the same sheet with the function
- in another sheet within the same workbook
- Or in another sheet in another workbook (using ImportRange function)
- A Constant value:
We can also write the search_key directly into the formula as a constant:
=MATCH(“YY-1004”,A2:A6,0)
- Value generated from another function or expression:
The formula below extracts a code from C2 using the RIGHT function, and uses the result as the search_key for the MATCH function.
=MATCH(RIGHT(C2,4),A2:A6,0)
- Arrays and Range of Cells
The first three methods define the basic property of the search_key in terms of its size. Normally, the search_key is a single value input, and therefore, it can’t evaluate an array or a range of cells. To evaluate an array or a range of cells, we need a special method.
Range of Cells:
=ArrayFormula(MATCH(C2:C3,A2:A6,0))
Hard-Coded Arrays:
=ArrayFormula(MATCH({“XX-1001″;”YY-1001”},A2:A6,0))
Arrays from another function:
=ArrayFormula(MATCH(RIGHT(C2:C3,4),A2:AC6,0))
An array or a range of cells for the search_key will not work in an ordinary MATCH formula. We need to use the ArrayFormula function in order for this to work correctly. We can write the ArrayFormula function and put the MATCH function inside, or once you’re done writing the MATCH formula, you can just press CTRL+SHIFT+ENTER, and Google Sheets will enclosed the MATCH with the ArrayFormula function.
1.3.2.1.2 Dimension and Size of the Search_Key
The search_key can be: (See section 1.2.2.1.2 for definition of dimension and size)
- Single Value:
=MATCH(C2,A2:A6,0)
- 1D Horizontal List: Search_Keys are in D2:E2 (Fig. 1.20)

Figure 1.20 Search_Key: Horizontal List
Equivalent Hard-Coded Array: (values in a horizontal list are separated by commas)
=ArrayFormula(MATCH({“YY-1004″,”ZZ-1001”},A2:A6,0))
- 1D Vertical List: Serch_Keys are in C2:C3 (Fig. 1.21)

Figure 1.21 Search_Key: Vertical List
Equivalent Hard-Coded Array: (values in a vertical list are separated by semicolon)
=ArrayFormula(MATCH({“YY-1004″;”ZZ-1001”},A2:A6,0))
- 2D List: Search_Keys are in C1:D2 (Fig. 1.22)

Figure 1.22 Search_Key: 2D List
Equivalent Hard-Coded Array: (the array below is a 2 (row) by 2 (column) array)
=ArrayFormula(MATCH({“YY-1003″,”YY-1004″;”ZZ-1001″,”ZZ-1002”},A2:A6,0))
1.3.2.1.3 Data Type of the Search_Key
The search_key can be any data type. This means that we can have lookup values that are numbers, dates, texts, booleans (i.e., TRUE and FALSE) and etc, but make sure that the data type of the search_key is the same as the data type of the range.
1.3.2.2 Range
Here are the important rules that you need to consider for the Range:
1.3.2.2.1 Syntax and Location of the Range
In Google Sheets, the range can be:
- Cell Reference (Range of Cells)
- Arrays (Hard-Coded and from other Functions)
1.3.2.2.2 Dimension and Size of the Range
The range can only be a 1D list. If we input a 2D list, it will only evaluate the first column.
1.3.2.2.3 Data Type of Range
The data type can be anything and can vary within the data set itself, but it’s important that the data type of the range is uniform. The matching process is strict when it comes to data type. Therefore, the search_key and the range must have the same data type.
1.3.2.3 Search_Type
The search_type is an optional input with a default value of 1. It’s an integer input, which can only be -1, 0 and 1. As mentioned before, 0 is for exact match, which uses the linear search method, while -1 and 1 are the approximate match of the MATCH function. The latter two use the binary search method, where the -1 assumes that the data set is sorted in descending order while 1 assumes that the data set is sorted in ascending order.
This input is not complicated. We can write it as a cell reference or hard-code it as a constant. In terms of dimension, size and arrangement, we can make a 1D or 2D list of the search_type values (Fig. 1.23), but it’s not a common usage.

Figure 1.23 Search_Type as 1D Horizontal List
1.3.3 MATCH: The Process
In this section, we’ll delve into the linear search vs. binary search of the MATCH function. We’ll also tackle how the MATCH function evaluates array inputs. It’s important to note here that the MATCH function can only handle one search_key and range pair at a time. If the scenario requires more than 1 pair, then we need to apply a few tricks in order to do this (see chapter 3).
1.3.3.1 Exact Match
The exact match uses the linear search to find the value. Let’s look at the example below. The task in the example below is to determine the position of the given code in C2 from Column A (Fig. 1.24). We set the MATCH function to exact match (i.e., search_type = 0).
The linear search starts with the first cell. It then checks if the search_key is equal to that cell. If it’s true, then the coordinate of that cell is returned. Otherwise, it will move on to the next. This will go on until it finds a match or all of the cells in the list are exhausted. If the latter happens and there’s no match, then the MATCH function will return the #N/A Error. In the case below, ZZ-1002 is in the 4th row of the range.
In summary, the linear search method will loop through all of the values until the first match is found or all of the values are evaluated.

Figure 1.24 Exact Match: Linear Match
1.3.3.2 Approximate Match: Binary Search
There are two types of binary searches that are implemented in the MATCH function. The main difference is the assumption in the sorting of the range. Let’s tackle each of these methods.
- Search_type = 1
This is the default value of the search_type. The main assumption is that the range is sorted in an ascending order. This assumption is one of the important bases for the process flow of the binary search method. Let’s look at an example.
The task in the example below is to determine the row coordinate of the given letter in C2 from the list in A2:A8 (Fig. 1.25). The row size of the range is 7, and the range is also sorted in an ascending order.

Figure 1.25 MATCH: Search_Type = 1
The binary search starts at the midpoint. Therefore, the first step is to determine the midpoint of the range. The midpoint is a location within the given range. It is the coordinate of the entry or cell that is located at the center of the range. The relative row coordinates are labeled at the right of the range below (Fig. 1.26). To determine the midpoint, we use this formula:
MIDPOINT=(lowest + highest)/2
We take the sum of lowest and highest coordinates and divide it by two. In this scenario, the midpoint is (1+7)/2 = 4, which is A5. If the result is a decimal number, it is rounded up (e.g., (1+6)/2 = 3.5 = 4).

Figure 1.26 Binary Search: Search_Type = 1
Next, the value at the midpoint is matched with the search_key. The objective is to determine whether the search_key is less than (<) or greater than or equal to (≥) the midpoint value. The search_key is g, and it is greater than the midpoint value (i.e., g > d), which is d. This step here identifies which parts of the range are discarded. If the search_key is less than the midpoint value, then values from the midpoint up to the last row are excluded or discarded, and the search will move on to the upper half. However, if the search_key is greater than or equal to the midpoint value, the values from the top row up to the cell before the midpoint are excluded.
If we analyze the aforementioned steps above, the discarding or exclusion of the upper half or bottom half depends on the assumption that the range is sorted in an ascending order. If the search_key is greater than or equal to the midpoint value and the range is sorted in an ascending order, then any values from the top row to the cell or entry before the midpoint are less than the search_key. Therefore, the algorithm doesn’t need to check the upper-half. We can use this same logic for the case: search_key < midpoint value. Another implication is that the midpoint can be the largest value in the range that is less than the search_key.
Let’s apply the steps to the scenario. Since the search_key is greater than the midpoint value (i.e., g > d), row 1 to row 3 are excluded. The remaining bottom-half (row 4 to 7) is used for the next step (Fig. 1.26, 2nd Image from Left).
The next step is to repeat all of the prior steps but using the remaining half of the range:
- The midpoint is calculated: (4+7)/2 = 5.5 = 6.
- The search_key is compared to the midpoint value: g > f.
- Since the search_key is greater than the midpoint value, the upper-half is excluded.
- The process repeats again.
Now, the important question is how the process terminates. The process stops until there’s only one entry, but there’s a final criterion that needs to be satisfied. The value of the final cell or entry must be less than or equal to the search_key (i.e., ≤Search_Key). This means that if there’s no exact match, the MATCH function will return the largest value in the range that is less than the value of the search_key assuming that the range is sorted (ascending). That’s why it’s called an approximate match. If the final value is greater than the search_key, the MATCH function will return the #N/A error.
- Search_type = -1
This setting follows the same binary search method as discussed above, but they differ in assumption in terms of sorting. This method assumes that the range is sorted in descending order, and this affects the flow of the process, especially what part of the range will be excluded. Let’s look at the same example as above, but instead of ascending order, will sort the range in descending order. Let’s look at the example below.
The same list is still in A2:A8, but in descending order (Fig. 1.27). The search_key is still “g.” Here’s the process (Fig. 1.28):
- Calculate MIDPOINT: (ex: Midpoint = (1+7)/2 = 4)
- Determine Search_Key (=, < or >) Midpoint Value: (ex: g > d)
- If Search_Key ≥ Midpoint Value, then exclude the bottom-half of the range and continue the search on the upper-half (including the midpoint). Otherwise, if Search_Key < Midpoint Value, then exclude the upper-half of the range and continue the search on the bottom-half (including midpoint).
- Repeat 1 to 3 until there’s only one entry or cell.
- Check if the last entry or cell is greater than or equal to the Search_Key. If true, return the value of the last entry or cell. Otherwise, return #N/A Error.

Figure 1.27 MATCH: Search_Type = -1

Figure 1.28 Binary Search: Search_Type = -1
The sorting in descending order changed the flow of the binary search method. If the search_key is greater than or equal to the midpoint value, then all of the values from the bottom up to the entry or cell before the midpoint are all less than the search_key assuming that the sorting is in descending order. Therefore, it’s not worth searching the bottom-half. If the search_key is less than the midpoint, then all of the values from the first up to the entry or cell before the midpoint are greater than the search_key.
Some of the parts of the algorithms that are explained here are results of experiments performed in the spreadsheet. It may not be the exact way of how Google Sheets processes the binary search method in the background, but the important thing here is that we are able to extract enough information that helps us deduct and formulate the rules that can help us construct correct formulas.
Overall, the binary search halves the search space until one cell or entry remains. This method can also shorten the search time.
1.3.3.3 MATCH: Search_Key Array or Range of Cells
The scenario in this section requires the use of the ArrayFormula function in order for the MATCH function to work. You can either write the MATCH function inside the ArrayFormula or press CTRL+SHIFT+ENTER after you write the MATCH function, which automatically encloses the MATCH function with the ArrayFormula. So how do we know if we need to use the ArrayFormula?
We’ve discussed in the previous sections the basic methods of writing the inputs. These methods are related to the necessary properties of the inputs like the dimensions, locations, data types and so on. We loosely define arrays as lists of values that are not located in the sheets. This means the lists are not physically stored in the cells in any sheet. This definition is deliberate so that we can distinguish it from a range of cells. If we use an array or a range of cells as an input to functions that normally don’t accept arrays or ranges of cells as inputs, then we need to use the ArrayFormula function in order to evaluate the inputs, but not all inputs can work this way. Some inputs are strict when it comes to its properties. Even if you use the ArrayFormula function, it will not work and can return an error.
There are scenarios where we need to match a list to list, and this can be done through the MATCH function. We can input an array list as the search_key to the MATCH function. Let’s look at a simple example.
The task in the example below is to determine the coordinates of the given codes in C2:C3 from the list in A2:A6 (Fig. 1.29). There are two search_keys: YY-1004 and ZZ-1004. We can solve this problem by constructing two MATCH formulas in D2 and D3, but we can also solve this using one array formula as shown in the figure below. How does this work?
The MATCH function will perform a search for each of the given search_keys. If there are two search_keys, then it will also perform two searches and produce two results. The arrangement of the results is dependent on the arrangement of the array. The search_keys in Fig. 1.29 are in a vertical list. Therefore, the results are also a vertical list (D2:D3).

Figure 1.29 Search_Key is a Range of Cells
1.3.3.4 Is MATCH Case-Sensitive
The MATCH function is not case-sensitive. You can perform conditions with different combinations of upper and lower cases, and the MATCH function will be able to identify the same characters (Fig. 1.30).

Figure 1.30 MATCH: Case-Insensitive
1.3.3.5 Does MATCH Supports Wildcards
The MATCH function can have wildcards in its search_key, and it can perform partial matches. The task in the scenario below is to extract the amount of an item with a “1001” in its code in the Code Column (Fig. 1.31). Wildcards consists of asterisk (*), question mark (?) and tilde (~). These are symbols that have functions when used in formulas. The first two are used to perform partial matches like the scenario in Fig. 1.31. The tilde is used to perform conditions that contain the asterisk (*) and question mark (?) as characters and not as wildcards.

Figure 1.31 MATCH with Partial Match
1.3.4 MATCH Output
1.3.4.1 Dimension and Size of the Output
We’ve seen in section 1.3.3.3 that the dimension, size and arrangement of the output of the MATCH function will be dependent on search_key. For array results, just make sure that the destination cells of the outputs are vacant cells. If there are values that are occupying the cells where the results should be displayed, then an error will be returned instead of the results (Fig. 1.32).

Figure 1.32 The MATCH function returns a #REF error due to D3 blocking the results
1.3.4.2 Data Type of the Output
The data type of the output will be positive integers that correspond to the coordinates of the search_key within the range.
- INDEX-MATCH
2.1 Data Extraction using INDEX-MATCH
The INDEX function is not a flexible function on its own since we need to know the row and column coordinates before we can use it. Most of the data extraction tasks in spreadsheets are conditional extractions, which means that we need to apply some sort of criteria to extract a particular data. Another thing is that the set of criteria is dynamic, which means that the criteria can change. Therefore, the row and column coordinates are always unknown and dependent on these criteria.
On the other hand, the output of the MATCH function is not useful in most scenarios. In general, the problems require extraction of values rather than the coordinates.
In summary, INDEX and MATCH functions have little applications when used as standalone functions, but when combined, we have one of the most powerful data extraction formulas in spreadsheets.
2.2 INDEX-MATCH: Breaking Down the Formula
Below is the syntax for the INDEX-MATCH:
=INDEX(reference,MATCH())
or
=INDEX(reference,MATCH(),MATCH())
The formulas above are the basic structures of the INDEX-MATCH. The MATCH function is nested inside the INDEX function, and it replaced the row and column inputs. The MATCH function implements the search criteria while the INDEX function returns the valuable information. The MATCH function also provides the flexibility that the INDEX function doesn’t have. It’s certainly a match made in heaven. Let’s look at a basic example to understand how it works.
The task in the example below is to extract the corresponding amount for the given code (i.e., YY-1004) in D2. The MATCH function is evaluated first. It searches for the code from A2:A6. If it finds an exact match, it will return the coordinate of the given code. This coordinate is also the coordinate of the corresponding amount. The resulting coordinate from the MATCH function will be used as the row input for the INDEX function. In the case below, the MATCH function returns 2. The reference in the INDEX function is B2:B6. Therefore, the INDEX function will return the 2nd cell in B2:B6.
It’s important to note here that the size of the reference in the INDEX function and the range in the MATCH function must be the same. The dimension and arrangement can be different but the size must always be the same in order for the formula to work correctly.

Figure 2.1 INDEX-MATCH: Basic Example
2.3 INDEX-MATCH vs. VLOOKUP
The INDEX-MATCH formula and the VLOOKUP function are both data extraction formulas. Both do the same thing and have similar processes, and that’s why the comparison between the two is one of the most popular debates in Excel and Google Sheets forums and blogs. It’s widely known that the INDEX-MATCH function is faster and more flexible than the VLOOKUP function, and that’s why it’s the go-to formula for most advanced users. The VLOOKUP function is simpler and easier to use than the INDEX-MATCH formula, and that’s why it’s preferred mostly by beginners. Whatever the case, both functions are effective in executing their objective, which is to extract data. Let’s look at the main differences between the two.
- No. of Functions vs. Dimension and Size of the Array or Range Input
Our intuition will tell us that a single function is faster than formulas that consist of two or more functions. Most of the time this is true, but for cases where we need to evaluate an array or a range of cells, the size and dimension of the list will also have a significant effect on the speed of the formula.
In the INDEX-MATCH formula, the MATCH function contains the lookup range, where the search_key is searched, and this range is also a 1D range. Because of this, we can minimize the dimension of the reference in the INDEX function, and input only the 1D list, from which the value of interest is extracted (Fig. 2.2). Overall, we can minimize the evaluated list to two 1D lists, and this can speed up the formula.

Figure 2.2 INDEX-MATCH vs. VLOOKUP: Range Dimension and Size
On the other hand, the lookup range and the extraction column in the VLOOKUP function must be in one array or range. Therefore, the range must always be a 2D range. If the column size between the lookup range, which is always the first column, and the extraction column is large, it can slow down the speed of the formula. In Google Sheets, this weakness of the VLOOKUP function is alleviated by creating an array using the curly brackets (Fig. 2.3). With this method, we can easily shorten the size of the range just like the INDEX-MATCH formula.

Figure 2.3 VLOOKUP with Curly Brackets {}
The case for dimension and size of the range will be more significant for scenarios where there’s only one lookup dimension (i.e., row input of INDEX). If the extraction column varies, this means that we also need to perform a lookup for the column headers. Therefore, we need to add another dimension (i.e., column input of INDEX) to the INDEX-MATCH formula. For this to work, we need to add another MATCH function to both formulas replacing the index in the VLOOKUP function and the column input in the INDEX function. Both INDEX-MATCH formula and VLOOKUP function now have the same settings.
- Flexibility of the Range and Lookup Range
The flexibility that we are talking about here in this section is the positioning of the lookup range and the extraction range. In the VLOOKUP function, the lookup column is always the first column in the range. The implication of this is that the extraction column will always be at the right of the lookup column. This is the reason why the VLOOKUP function can only do left lookup and right extraction. If the position of the two switches, then the VLOOKUP function won’t be applicable. This is the main advantage of the INDEX-MATCH formula. Due to the separation of the lookup column and the extraction column in the INDEX-MATCH formula, we can easily perform right or left lookup and extraction.
Although the VLOOKUP function seems to be at a disadvantage here, there’s actually a way to switch the position of the lookup column and extraction column, and that is by constructing an array using the curly brackets {}. We can reorganize and switch the position of the columns using the curly brackets. Therefore, in the end, both formulas can be the same.
- Multiple Search_Key and Lookup Range Pairs
It seems that the two formulas are tied in the first two differences that are aforementioned above, but this third main difference will leave the normal VLOOKUP formula in the dust. This is arguably the main reason why advanced users prefer the INDEX-MATCH formula. If we compare the normal VLOOKUP function to the INDEX-MATCH formula, the latter can incorporate multisets of criteria (search_key and lookup range pairs) because of the MATCH function (see chapter 3), but as mentioned above, this would be the case for a normal VLOOKUP formula. Again, there’s a trick that we can use to alleviate this weakness. If we use the curly brackets to rearrange the range input, we can also do the same with the VLOOKUP function.
Overall, It seems there’s not much difference between the two. Both can do the same and can have similar settings. As mentioned before, it’s widely known that the INDEX-MATCH formula is faster. This may be true for Excel but not for Google Sheets. In Excel, the curly brackets are not as powerful (i.e., can’t join range of cells into arrays) as the curly brackets in Google Sheets, and the range in the VLOOKUP function in Excel is not as flexible (i.e., can’t accept an array and can only accept a range of cells) as the VLOOKUP in Google Sheets. Therefore, if we re-analyze the differences above excluding the curly brackets, we can clearly see that the INDEX-MATCH formula is the more efficient formula, but this is not the case for Google Sheets. Due to the function of curly brackets, both formulas are now on even ground. Therefore, in the end, it’s a matter of preference, but if you’re a user that works in both Google Sheets and Excel, it’s better to use the INDEX-MATCH formula in the long run. There are other data extraction tasks that are complicated, which can only be solved by combining the INDEX function with other functions. If we familiarize ourselves with the most basic variation, which is the INDEX-MATCH, it would be easier to incorporate other functions.
- Applications of INDEX-MATCH
3.1 How to do INDEX-MATCH with Dates
All of the data types in Google Sheets have their own set of formatting. The formatting of values is just a way of displaying the values such as percentages, decimals and so on, but Google Sheets doesn’t necessarily use the exact values that are displayed in the cells. Therefore, we need to take into consideration the data type that we are working with for the search_key, especially the dates.
Evaluating dates in Spreadsheets is a bit complicated. Dates are treated as numerical values. Although we see this common date format (i.e., dd/mm/YYYY) displayed in the cells, the spreadsheet does not use this in its evaluation. Instead, it operates using date serials. The dates have equivalent date serials, which are the number of days from a referenced date (i.e., 12/31/1899 = 1). If you highlight the Dates and select Automatic as the Format, you’ll see that the dates will be converted to their corresponding date serials. These are the numbers that will be evaluated by the functions. Now, let’s look at the scenario below.
The task in the scenario below is to extract the amount for the date 1/12/2021 (Fig. 3.1). The search_key for this scenario is a constant. As discussed before, the data type of the search_key and the first column of the range must be the same. The search_key in the formula is written as text, but the data type in the Dates Column are dates. That’s why the INDEX-MATCH formula returned the #N/A error. Although the dates are shown in the format dd/mm/YYYY, the values that are evaluated are their equivalent date serials as aforementioned above.

Figure 3.1 INDEX-MATCH with Dates
It’s easier to reference a date from a cell because when we write dates in cells, Google Sheets can recognize a lot of date formats such as Jan-01-2021, January 1, 2021 or 1/1/2021 as dates. Therefore, we don’t need to think about the date serials when referencing the cells, but if the dates will be hard-coded as constants in the formula, then we need to convert the dates into their date serials. Below are some of the ways to convert the hard-coded dates to their corresponding date serials:
Using DATEVALUE function:
=INDEX(C2:C9,MATCH(DATEVALUE(“1/12/2021”),B2:B9,0))
Using DATE function:
=INDEX(C2:C9,MATCH(DATE(2021,1,12),B2:B9,0))
3.2 How to do INDEX-MATCH with Multiple Criteria
In most scenarios, there are multiple criteria for extraction, and as previously discussed, the MATCH function can only handle one pair of search_key and range. If there are more, then we can use two methods to evaluate such criteria. Let’s look at an example.
The task in the example below is to extract the amount from Column C based on two criteria: subs (E2) and date (F2) (Fig. 3.2). The main problem here is how to incorporate the two criteria. We’ll discuss two methods on doing this task. The first one is by using the ampersand (&) operator and the other method is by using the array version of the AND function.

Figure 3.2 INDEX-MATCH: Multiple Criteria
The & operator concatenates (or stitches) two values together into one. If we stitch the search_keys into one search_key and the first lookup column (Subs) to the second lookup column (Dates), then we can follow the rules of the MATCH function and still perform a multiple criteria extraction. It’s important to note here that the order of concatenation for both search_key and range must be the same. Let’s discuss how it works.
The MATCH function is evaluated first in the formula. The two search_keys are stitched by the & operator as shown in Column D (Fig. 3.3). Column E shows the stitched Column A and Column B. Notice the concatenated values. The dates are converted to their equivalent date serials and are stitched to their corresponding subs.
We are using the linear search method (i.e., search_type = 0) in this scenario. It starts with the first entry and checks the condition if search_key is equal to the entry. If TRUE (Column F), then the coordinate of that entry is returned and the matching process stops. If the result of the condition is FALSE, then the matching moves on to the next entry. The coordinate returned by the MATCH function will be used in the INDEX function to extract the amount.

Figure 3.3 MATCH: Multiple Criteria with & Operator
For the Array AND method (Fig. 3.4), we’ll utilize boolean array operations. The two criteria (i.e., Sub 2 and 1/12/2021) must be both satisfied, and therefore, it’s the same as using the AND function. The AND function returns true if all of the input conditions are true. If one or more conditions returns false, then the AND function will return false. The problem with the AND function is that it’s an aggregation function. It takes in all of its inputs and generates only one output. In the given scenario, what we need is to perform an AND function for each of the entries (Subs and Dates). In order to do this, we’ll use a boolean array operation.
Let’s start with the range in the MATCH function in E7 (Fig. 3.2):
=INDEX(C2:C9,MATCH(1,(A2:A9=E2)*(B2:B9=F2),0))
The first condition (A2:A9 = E2) checks the cells of the Subs Column if they are equal to the search_key (i.e., Sub 2). If they are equal, the expression will return TRUE (Column D). Otherwise, it will return FALSE. Next is the condition for the Dates (B2:B9 = F2). It’s the same aforementioned logic. If a cell in Column B satisfied the condition, then it will return TRUE (Column E). Otherwise, it will return FALSE. Both expressions will return an array containing boolean values as shown in the figure below. If we multiply both arrays, the spreadsheet will convert the TRUE and FALSE to their corresponding numerical values, which are 1 and 0, respectively. Therefore, the result of the array multiplication will be 1 and 0 (Column F). If both are TRUE, then it will return 1 (i.e., TRUE*TRUE = 1*1 = 1). If one or more criteria are false, then it will return 0 (e.g., FALSE*TRUE = 0*1 =0). This array operation is equivalent to performing the AND function on each of the entries. In summary, 1 means that both criteria are satisfied. Therefore, we’ll use 1 as the search_key.

Figure 3.4 MATCH: Multiple Criteria with Array AND
3.3 How to do INDEX-MATCH with Varying Extraction Column (2-Way Lookup)
The task in the example below is to extract the amount for a given sub and month (Fig. 3.5). We’ll use two MATCH functions (where search_type is Exact Match or Linear Search), which will return the row and column coordinates. The row coordinate is dictated by the Sub Lookup in A7 while the column coordinate is dependent on the Month Lookup in B6. Both search_keys can vary. Therefore, the reference in the INDEX function must be from B2:E4, which is a 2D range of cells, because the amount can be from January up to April.
The first MATCH function searches for Sub 2 in A2:A4. Once an exact match is found, it will return the row coordinate of Sub 2 within the reference. The second MATCH function searches for the month name in the month headers (B1:E1). If it finds the exact match in the month headers, it will return the column coordinate. The combination of the two MATCH functions will be the coordinates for the amount that we are interested in.

Figure 3.5 INDEX-MATCH: 2-Way Lookup
3.4 How to do INDEX-MATCH across Multiple Sheets
Let’s imagine that the monthly data from Fig. 3.5 are separated into their own sheets (Fig. 3.6). We’ll use the same search_keys, which are located in the Master Sheet. The task is to extract the amount for the given subs and month. Here’s one solution:
=ArrayFormula(INDEX({January!B2:B4;February!B2:B4;March!B2:B4},MATCH(A2&B1,{January!A2:A4&”January”;February!A2:A4&”February”;March!A2:A4&”March”},0)))

Figure 3.6 INDEX-MATCH: Multiple Sheets
Let’s dissect the formula. The main problem is how to search across the sheets. We deal with this by using the curly brackets {} to create an array from the range of cells from the different sheets. The extraction columns are all in Column B of the sheets. Therefore, for the reference input of the INDEX function, we’ll combine the extraction columns into one single column using the curly brackets (Fig. 3.7, Column A). The semicolon will add the lists into one column. Therefore, make sure that the dimensions are the same and the column size (no. of columns) are also the same.

Figure 3.7 INDEX-MATCH: Reference and Range from Multiple Sheets
We’ll also do the same thing for the range input of the MATCH function. Also, notice the search_key and range pair (Fig. 3.7, Column B), It’s the same method from the first solution of section 3.2. We use the & operator to combine the two search_keys and the two ranges. The trick here is to stitch the sheet name to the values in the range because the sheet names are also the month names. If for example the other search_key is another column within the data sets, then we just need to follow the method in section 3.2 and use the curly brackets to consolidate all of the concatenated values into one array.
Fig. 3.7 showed the array results of the curly brackets. We consolidated all of the data sets from different sheets into one array for the reference and range of the INDEX function and MATCH function, respectively. We also concatenated the two search_keys of the MATCH function into one search_key. It’s important to note here that the order of concatenation for both search_key and range must be the same.
Once the reference and range are consolidated, the MATCH function will perform the matching process using the consolidated range, and will search the concatenated search_key. If it finds a match, it will return the coordinate that will be used by the INDEX function to extract the amount from the consolidated reference.
Another solution is to use the INDIRECT function to make the reference sheet dynamic (Fig. 3.8). Instead of consolidating the data, we need to determine the sheet reference first before performing the lookup.
The INDIRECT function takes in a valid cell reference in text format and converts the text reference into a cell reference. This means that:
Reference: (INDEX)
INDIRECT(B1&”!B2:B4″)=January!B2:B4
Range: (MATCH)
INDIRECT(B1&”!A2:A4″)=January!A2:A4
Sheet references start with the name of the sheet followed by the exclamation point, which separates the sheet reference from the cell reference. The name of the sheet is in B1 in the Master Sheet. Therefore, if we change the value in B1, the INDIRECT functions will also be updated.

Figure 3.8 INDEX-MATCH: Dynamic Sheet Reference
3.5 How to do INDEX-MATCH with a Case-Sensitive Lookup
If the lookup or search_key is case-sensitive, we need to tweak the MATCH function a little bit. One of the ways to perform a case-sensitive lookup is to use the EXACT function, which compares two text inputs and returns TRUE if they are equal. The EXACT function is case-sensitive, and therefore, we can use it to perform the case-sensitive lookup. Let’s look at the example below.
The task in the example below is to return the amount of the first Sub 2 that occurred in the list (Fig. 3.9). The first occurrence is in A3, but the sub here starts with a lowercase. If the formula is a case-sensitive lookup, A3 will be skipped, and the first occurrence will be at A7. Let’s look at the formula.

Figure 3.9 INDEX-MATCH: Case-Sensitive Lookup
The EXACT function normally compares the first text input to the 2nd input, and returns TRUE if both are the same. In this case, the 2nd input is now a list A2:A9. Therefore, it will check each of the cells in A2:A9 if they are equal to E2. Since the 2nd text input is an array, the output will also be an array with the same dimension, size and arrangement. The values of the array will consist of TRUE and FALSE values. This boolean array will be our range input for the MATCH function. We’ll set the search_key as TRUE, since TRUE means that the cell value is equal to the lookup value. The rest of the process is still the same. The MATCH function will search for TRUE. If it finds a match, it will return the coordinate, which will be used by the INDEX function.
3.6 How to do INDEX-MATCH with Inputs in Another Sheet
We can have inputs of the INDEX-MATCH formula from another sheet. The process of constructing the formula is still the same. We just need to reference the inputs from another sheet.
3.8.1 INDEX-MATCH with Inputs in Another Sheet (same Workbook)
The task in the example below is to extract the amount for the given code in A2 of the Index-Match Sheet (Fig. 3.10). The data set is located on a different sheet, which is the Data Sheet (Fig. 3.11).
If the inputs are located in a sheet that is different from the destination sheet of the formula, we need to reference the sheet together with the cell reference. The easiest way to do this is to highlight the inputs from their source sheets while we are within the function or formula (Fig. 3.11). This will automatically write the complete reference for us. In the case below, the reference for the INDEX function and the range for the MATCH function are both in the Data Sheet while the search_key and the formula are in the Index-Match Sheet.

Figure 3.10 Search_Key and Formula are in a different sheet from the Data Set

Figure 3.11 Highlighting the Cells from a different Sheet
3.8.2 INDEX-MATCH with Inputs from Another Sheet (in another Workbook)
If the inputs are located from another workbook, then we need to use the IMPORTRANGE function in order to pull the data from another workbook. The IMPORTRANGE requires two inputs: the URL of the workbook and the sheet and cell references. Both inputs for the IMPORTRANGE must be in text format. Therefore, we need to enclose the URL and reference with quotation marks. IMPORTRANGE requires permission to pull the data from another workbook. Once access is granted, it will automatically pull the data.
Below is an example on how to apply the IMPORTRANGE to INDEX-MATCH:=INDEX(IMPORTRANGE("https://docs.google.com/spreadsheets/Example","Data!B2:B6"),MATCH(A2,IMPORTRANGE("https://docs.google.com/spreadsheets/Example","Data!A2:A6"),0))