How to Use the VLOOKUP function in Google Sheets

1. VLOOKUP Function

Figure 1.1 Data Extraction

1.1 Data Extraction

Data analysis and interpretation require an organized and a filtered data set, and most of the time, these processes involve extraction of data. Data extraction in spreadsheets is a simple process of defining the location of a cell or range of cells to extract the data. The row and column positions define the coordinates of a cell within the spreadsheet. Therefore, most of the Google Sheet functions that are in this category mainly revolve around the process of determining the row and column coordinates. Let’s take a look at an example.

The task in the scenario in Fig. 1.1 is to extract the amount in column C corresponding to the given code in E2, which is YY-1001. We can use the VLOOKUP function to automate this extraction process. These types of functions may look complicated, but always remember, the keyword here is “coordinates,” and you’ll realize that the main inputs of these types of functions are related to defining the coordinates of the data that you want to extract. 

1.2 What does the VLOOKUP do in Google Sheets?

The VLOOKUP function basically extracts a cell value from a table based on a condition. What do we mean by condition here? The condition is related to defining the row coordinate of the cell that we want to extract. The main process of the function requires three things: the row coordinate, column coordinate and the table. The row coordinate is indirectly determined by matching a lookup value to the cells in the first column of the table. This matching makes up the condition on what cell to extract from the table. The column coordinate defines the column in the table where the value will be extracted and the table is the source data. 

In summary, we want to extract a value from a specific column in a table based on a lookup value that is matched in the first column of the table. Take note of the two things here: the column where the lookup value is matched and the column where we want to extract a value. 

1.3 VLOOKUP Syntax

Below is the syntax for VLOOKUP:

There are four inputs in the VLOOKUP function: search_key, range, index and is_sorted. The search_key is the lookup value that is matched or searched in the first column of the table. The range is our table of data. The index is the column coordinate of the column in the table where we want to extract the data. The is_sorted is related to the type of matching: exact match and approximate match. Let’s take a look back at Fig. 1.1 to help us build an intuition on how the formula works. 

Going back to the scenario in Fig. 1.1, we are trying to extract the amount of a given code in E2. The search_key is YY-1001, and each of the cells in the Code Column will be checked against it. We can treat the search_key like a criterion (= YY-1001) that must be satisfied. Therefore, for A2, the condition will be A2 = YY-1001. If this is true, then the row coordinate of A2 will be used as the row coordinate to extract the cell of interest. Otherwise, the process moves on to the next cell, which is A3 (A3 = YY-1001), and so on. This will go through all of the cells until it finds the first match or the first cell that meets the condition. Take note the keywords “first match” because the VLOOKUP function only extracts a value based on the first match. 

Basically, the first input, the search_key, indirectly defines the row coordinates by matching it with the cells from the first column (Fig. 1.2), and takenote the keywords “first column.” This is one of the quirks of the VLOOKUP function. It can only perform this matching process on the first column of the table. So if your data set is located in A2:C6, then you can only perform the match in column A. If it’s B2:C6, then you can only perform the match in column B. Also, notice that we can only perform the match in a column, which basically is a vertical list, and not across the rows. Hence, the V (i.e., vertical) in the VLOOKUP function. 

One of the implications of performing the match in the first column means that the range must always start with the column where we want to perform the match. Another implication is that the column of interest, which is represented by the index, must be to the right of the column where we want to perform the match. Looking at Fig. 1.2, we are trying to extract a value from the Amount Column using the values in the Code Column, and therefore, the index is the 3rd column relative to the Code Column. 

The last input, the is_sorted, is related to the matching process for the search_key. It’s a boolean input, which means that it can only accept TRUE and FALSE values. It’s also an optional input. This means that we can omit this in the formula and it will take its default value, which is TRUE. Now, what do we mean by the type of matching process? Basically, there are two types: an exact match and approximate match. The exact match means that the condition will be an equality (e.g., A2 = search_key). The searh_key is searched one by one from top to bottom. This type of search is called a linear search. 

The searching process in approximate match is more complicated. It uses the binary search method, which starts at the midpoint (center) of the data set and checks if the midpoint value is less than, greater than or equal to the lookup value. The data set is divided through the midpoint, and the search will remove one of the halves (or doesn’t include in the search process) and move on to the other half of the data set depending on the result of the checking. This will be made clear in section 2.2, but it’s important to note here that the binary search in the VLOOKUP function works correctly in a data set that is sorted in an ascending order. Hence, the sort keyword in the input’s name.

Figure 1.2 Basic example of VLOOKUP Process

In essence, to use the VLOOKUP function, we must know what column to extract the data. We don’t know what the exact row is, but we can determine it using a lookup value.

  1. Breaking Down the Formula

Figure 2.1 VLOOKUP Function

In this section, we’ll talk about the rules and criteria for each VLOOKUP input, how to write them, how VLOOKUP processes these inputs and how do the inputs affect the output. We’ll build an intuition to make it easier for us to visualize and create a VLOOKUP formula for your problems.

2.1 VLOOKUP Input/Arguments

2.1.1 Search_Key 

Here are the important rules that you need to consider for the search_key:

2.1.1.1 Syntax and Location of the Search_Key

In Google Sheets, the search_key can be written or coded in three different ways depending on the scenario:

  1. Reference to a cell: 

=VLOOKUP(E2,C2:D6,2)

Typically, our data set is stored in a sheet, and we want to be able to use this set of data in our formulas. In order to do that, we write the references of the cells where the data set is located. In the formula above, the reference E2 means that the value is in column E, row 2. The reference defines the address of the cell, and the formula uses the reference to access the value within the referenced cell. 

We can write the reference in the formula or highlight the cell:

Figure 2.2 Referencing cells by highlighting them

If we perform the latter, the spreadsheet automatically writes the references for us (Fig. 2.2), so we don’t have to write the references manually. Also, it’s important to know that we can refer to the cells from any sheet. This means that the cells can be:

  • in the same sheet with the function
  • in  another sheet within the same workbook
  • Or in another sheet in another workbook (using ImportRange function)

The main advantage of referencing the cells is that the formula becomes dynamic. We can change the values within the referenced cells, and the formula will update automatically in response to the changes. 

  1. A Constant value:

We can also write the search_key directly into the formula as a constant:

=VLOOKUP(“YY-1001”,A2:C6,3,FALSE)

There are times where you don’t want to add any values within your sheets, and this method will let you do that. You don’t have to write the values in the sheet and refer to them in the formula, especially if the values are constants and the size of the list is small. You can write the values directly like the formula above.

  1. Value generated from another function or expression:

Sometimes we don’t want to use our raw data set directly. There are times where we want to extract some information from the raw data and use that extracted information in the formula. We can also do this with the search_key. The formula below extracts a code from E2 using the RIGHT function, and uses the result as the search_key for the VLOOKUP function. 

=VLOOKUP(RIGHT(E2,7),A2:C6,3,FALSE)

  1. Arrays and Range of Cells

The first three methods define the basic property of the search_key in terms of its size. Normally, the search_key is a single value input, and therefore, it can’t evaluate an array or a range of cells. To evaluate an array or a range of cells, we need a special method. 

Range of Cells:

=ArrayFormula(VLOOKUP(E2:E3,A2:C6,3,FALSE))

Hard-Coded Arrays:

=ArrayFormula(VLOOKUP({“XX-1001″;”YY-1001”},A2:C6,3,FALSE))

Arrays from another function:

=ArrayFormula(VLOOKUP(RIGHT(E2:E3,7),A2:C6,3,FALSE))

For now, we can think of an array as a list of values that are not located in any sheet. To write them directly, we need to write the list inside the curly brackets {} (see hard-coded formula).

An array or a range of cells for the search_key will not work in an ordinary VLOOKUP formula. We need to use the ArrayFormula function in order for this to work correctly. We can write the ArrayFormula function and put the VLOOKUP inside, or once you’re done writing the VLOOKUP formula, you can just press CTRL+SHIFT+ENTER, and Google Sheets will enclosed the VLOOKUP with the ArrayFormula function. 

2.1.1.2 Dimension and Size of Search_Key

In spreadsheets, the dimension is related to rows and the columns. If our data is written in one column but multiple rows or one row but multiple columns, the dimension of our data is one-dimensional (1D). If our data is in a table with multiple rows and columns, then the dimension is two dimensional (2D). 

The rows and columns serve as coordinates of the cells within the spreadsheet. For 1D data, we only need to specify one coordinate (row or column) to know where the value is located. For 2D, we need to define the row and column to locate the value. 

The size of the data is related to the number or rows and columns. The search_key can be a single cell or a range of cells (or a single value or an array).

The search_key of the VLOOKUP can be 1D or 2D:

  1. One-Dimensional (1D):
    1. Horizontal List: 
  • Cells (Fig. 2.3)

Figure 2.3 1D Search_Key: one row, multiple columns

  • Hard-coded Arrays: (values in a horizontal list are separated by commas)

=ArrayFormula(VLOOKUP({“YY-1001″,”XX-1001″,”XX-1002”},A2:C6,3,FALSE))

The array here is the same as the range F1:H1 in Fig. 2.3. Also, notice the ArrayFormula function. 

  1. Vertical List: 
  • Cells (Fig. 2.4)

Figure 2.4 1D Search_Key: one column, multiple rows

  • Hard-coded Arrays: (values in a vertical list are separated by semicolons)

=ArrayFormula(VLOOKUP({“YY-1001″;”XX-1001″;”XX-1002”},A2:C6,3,FALSE))

  1. Two-Dimensional (2D):
  • Cells: (Fig. 2.5)

Figure 2.5 2D Search_Key: Multiple Rows and Columns

  • Hard-Coded Arrays: (the array below is a 2 (row) by 2 (column) array)

=ArrayFormula(VLOOKUP({“YY-1001″,”XX-1001″;”XX-1002″,”YY-1002”},A2:C6,3,FALSE))

2.1.1.3 Data Type of Search_Key

The search_key can be any data type. This means that we can have lookup values that are numbers, dates, texts, boolean (i.e., TRUE and FALSE) and etc. 

2.1.2 Range

Based on the extraction process of the VLOOKUP function, the range must be in a certain arrangement, where the main headers are the column headers. As discussed in the previous sections, the first column must be the column where the matching process is performed and the column of interest is to the right of the first column. Note that the range and table are loosely used here as interchangeable terms.

2.1.2.1 Syntax and Location of the Range

  1. Range of cells: 

=VLOOKUP(E2,A2:C6,3,FALSE)

Just like the search_key, the range can be referenced from cells:

  • in the same sheet with the function
  • in  another sheet within the same workbook
  • Or in another sheet in another workbook (using ImportRange function)
  1. Hard-coded array of values:

We can also encode the range directly in the formula, but this method is inconvenient:

=VLOOKUP(1,{2,”b”;1,”a”;3,”c”},2,0)

The arrangement of the range in the formula above is equivalent to the range of cells in the figure below (Fig. 2.6).

Figure 2.6 The equivalent arrangement of the array in the formula above if it is stored in a range of cells

  1. Array from another Function

The range can certainly be an array that is generated from another function like the example below (Fig. 2.7), but there’s a problem. Most of the time our data consists of different data types and formats. If we use a function in a range of cells, the function will apply to all of the cells, and the resulting array will be used as the range input. Therefore, the result in F2 shows a truncated text because the RIGHT function is applied to the Amount column as well. In the end, this may not be useful for common data extraction tasks.  

Figure 2.7 The RIGHT function in the formula is applied to all of the cells in A2:C6

2.1.2.2 Dimension and Size of the Range

The range can be a single value or 1D list, but the VLOOKUP function is not made for these types of range inputs. It’s used for 2D data (tables). Therefore, we always use a 2D reference for the range. 

2.1.2.3 Data Type of the Range

The range can consist of any data type, but make sure that the data type of the first column  is the same as the data type of the search_key. If the search_key is a text, then the first column must also be a text and so on. 

2.1.3 Index

It’s a prerequisite that we must know the column where we want to extract the data within the range. We count the index of the column of interest starting from the first column (the column where we want to perform the match).

All of the syntax rules and properties of the search_key also apply to the index except for the data type. The index can only be a positive integer (e.g., 2, 3), which can also be written in text format (e.g., “2”, “3”).

2.1.4 Is_Sort

The is_sort is an optional input with a default value of TRUE. Therefore, we can omit this input from the formula, and it will take its default value. The TRUE denotes an approximate match while the FALSE denotes an exact match. Just like the index, all of the syntax rules and properties of the search_key also apply to the Is_Sort except for the data type. The is_sort is a boolean input. Boolean values can be converted to 1 (TRUE) and 0 (FALSE). Therefore, we can substitute 1 and 0 as the is_sort values. 

If the approximate match is set, then make sure that the data is sorted in an ascending order in order for the binary search to perform correctly. 

2.2 The VLOOKUP Process

2.2.1 How does the VLOOKUP work?

We already had a glimpse of how it works from section 1.3, and in this section we’ll look at more detailed examples of how the VLOOKUP processes its inputs.

2.2.1.1 VLOOKUP (Basic)

  1. Exact Match

Let’s look at the same scenario from Fig. 1.1. Each of the cells in the range or table has a row and column coordinates. You might confuse these values with the row and column headers of the Google Sheets, but it’s important to note that the row and column coordinates are relative to the range itself. For this example, the first cell is A2 with row and column coordinates of (1,1). C3 has row and column coordinates of (2,3). 

Figure 2.8 Row and Column Coordinates in the range

Now, let’s look at the process of extraction for an exact match. To find the row coordinate, the search_key is matched to each of the cells in the first column, which in this case, is the Code Column. It’s more intuitive to think of this matching process as testing of an equality condition (See column E, Fig. 2.9). It starts with the first row and moves on to the next row until the condition is satisfied, which means a match has been found. This process is called a linear search. The row coordinate of the first match is used together with the index to fully define the coordinates of the cell that will be extracted. 

For this scenario, we can see that A5 is the first and only cell in the Code Column that satisfies the condition. Its row coordinate is 4 relative to the first row, which is A2. The index is 3, and therefore, the cell that will be extracted is the cell with the coordinates (4,3), which is C5. 

Figure 2.9 Extraction Process of the VLOOKUP function (Exact Match)

In the case where a match is not found, then the function will return the #N/A error. If the the function returns the #N/A even though there’s a match in the first column, the most probable reason is that the data type of the first column and the search_key are different.

  1. Approximate Match

The approximate match uses a binary search algorithm for it’s search process. Let’s break down the algorithm, and understand how it works using a simple example below. The data set (A2:B8) consists of numbers from 1 to 7 and their corresponding letters (Fig. 2.10). The row size of the range is 7, and the range is also sorted in an ascending order. The search_key is in D2 and the VLOOKUP function is written in E2. 

Figure 2.10 Approximate Match Example

The binary search starts at the midpoint. Therefore, the first step is to determine the midpoint of the range. The midpoint is a location within the given range. It is the row coordinate of the entry or cell that is located at the center of the range. The relative row coordinates are labeled at the right of the range below (Fig. 2.11). To determine the midpoint, we use this formula:

MIDPOINT=(lowest + highest)/2

We take the sum of lowest and highest row coordinates and divide it by two. In this scenario, the midpoint is (1+7)/2 = 4, which is A5. If the result is a decimal number, it is rounded up (e.g., (1+6)/2 = 3.5 = 4). 

Figure 2.11 Approximate Match: Binary Search

Next, the value at the midpoint is matched with the search_key. The objective is to determine whether the search_key is less than (<) or greater than or equal to (≥) the midpoint value. The search_key is 1, and it is less than the midpoint value (i.e., 1<4), which is 4. This step here identifies which parts of the range are discarded. If the search_key is less than the midpoint value, then values from the midpoint up to the last row are excluded or discarded, and the search will move on to the upper half. However, if the search_key is greater than or equal to the midpoint value, the values from the top row up to the cell before the midpoint are excluded. 

If we analyze the aforementioned steps above, the discarding or exclusion of the upper half or bottom half depends on the assumption that the range is sorted in an ascending order. If the search_key is greater than or equal to the midpoint value and the range is sorted in an ascending order, then any values from the top row to the cell or entry before the midpoint are less than the search_key. Therefore, the algorithm doesn’t need to check the upper-half. We can use this same logic for the case: search_key < midpoint value. Another implication is that the midpoint can be the largest value in the range that is less than the search_key. 

Let’s apply the steps to the scenario. Since the search_key is less than the midpoint value (i.e., 1<4), row 4 to row 7 are excluded. The remaining upper-half (row 1 to 3) is used for the next step (Fig. 2.11, Middle Image). 

The next step is to repeat all of the prior steps but using the remaining half of the range:

  1. The midpoint is calculated: (1+3)/2 = 2 (Fig. 2.11, Middle Image). 
  2. The search_key is compared to the midpoint value: 1 < 2.
  3. Since the search_key is less than the midpoint value, the bottom-half is excluded.
  4. The process repeats again.

Now, the important question here is how the process terminates. The process stops until there’s only one entry, but there’s a final criterion that needs to be satisfied. The value of the final cell or entry must be less than or equal to the search_key (i.e., ≤Search_Key). This means that if there’s no exact match, the VLOOKUP function will return the largest value in the range that is less than the value of the search_key assuming that the range is sorted (ascending). That’s why it’s called approximate match. If the final value is greater than the search_key, the VLOOKUP function will return the #N/A error. 

Overall, the binary search halves the search space until one cell or entry remains. This method can also shorten the search time. 

2.2.1.2 VLOOKUP (Range of Cells and Arrays)

The scenarios here require the use of the ArrayFormula function in order for the VLOOKUP to work. You can either write the VLOOKUP inside the ArrayFormula or press CTRL+SHIFT+ENTER after you write the VLOOKUP formula, which automatically encloses the VLOOKUP with the ArrayFormula. So how do we know if we need to use the ArrayFormula?

We’ve discussed in the previous sections the basic methods of writing the inputs. These methods are related to the necessary properties of the inputs like the dimensions, locations,  data types and so on. We loosely define arrays as lists of values that are not located in  the sheets. This means the lists are not physically stored  in the cells in any sheet. This definition is deliberate so that we can distinguish it from a range of cells. If we use an array or a range of cells as an input to functions that normally don’t accept arrays or ranges of cells as inputs, then we need to use the ArrayFormula function in order to evaluate the inputs, but not all inputs can work this way. Some inputs are strict when it comes to its properties. Even if you use the ArrayFormula function, it will not work and will only return an error. 

  1. 1D Search_Key

Let’s look back to the scenario from Fig. 2.4. The task is to extract the amounts for three search_keys located in E2:E4 (Fig. 2.12). We can create a VLOOKUP formula for each of the search_keys or create a single array VLOOKUP formula that will do the same task. It’s a bit obvious now that the VLOOKUP function will perform a separate matching process for each of the search_keys (A9:C13), and therefore, will return three values corresponding to the search_keys. The dimension, size and arrangement of the search_key dictates the dimension, size and arrangement of the extracted values as seen in Fig. 2.12.

Figure 2.12 Processing a1D (vertical list) Search_Key

  1. 1D Index

The task for this scenario is to extract the description and amount for the code XX-1002. The scenario in Fig. 2.4 extracts data from one column only. Now the task in this case is to extract values from two different columns. There’s only one search_key, and therefore, it will only perform one matching process. Once the row coordinate is determined, it will be paired  to each of the indices resulting to two coordinates: one from column B and one from column C. Just like the search_key, the dimension, size and arrangement of the index affects the same properties for the output. 

Figure 2.13 Processing a1D (horizontal  list) Index

  1. 1D Is_Sort

We can also perform simultaneous matching. If we list the boolean values as an array, we can return two values: one for exact match and one for approximate match. The whole process is still the same, but now, the output is dependent on the is_sort. 

  1. 1D Combinations

What if two of the discussed inputs are 1D lists or all of them are 1D lists? How does the VLOOKUP function process these combinations of array inputs? Let’s look at the scenario below. 

The task is to extract the description and the amount of the codes given in A9:A10 (Fig. 2.14). In this case, we have two search_keys and two indices. Now, the problem is how do we write the two search_keys and two indices? Do we write them as vertical lists or horizontal lists or the other one is a vertical list while the other one is a horizontal list? 

The answers to these questions basically boils down to how do we want to arrange or organize our output in a manner where we can understand the relationships between the three columns (i.e., code, description and amount). One type of arrangement is given in Fig. 2.14, A8:C10, where the table has the same arrangement with the source data. The codes are arranged in a vertical list, and for each of the codes, we need to extract their corresponding description and amount. If we think about this carefully, it means that for each code, we need to extract a horizontal list of data containing the description and amount. Therefore, it’s clear now that the search_keys will be arranged in a vertical list while the indices are in a horizontal list, but let’s look at the output in Fig. 2.14 if it agrees with our intuition. 

Figure 2.14 1D (vertical list) Search_Key vs. 1D (horizontal list) Index

Based on the result in Fig. 2.14, the VLOOKUP function evaluates A9 first. It performs a matching process with A9, and once it finds a match, it will extract the cell using index = 2 (Description) and the row coordinate of the matched cell. Next, it extracts the cell from the Amount Column using the index = 3, and then, proceeds to the next search_key, which is
A10, and repeats the process. Remember that the dimension, size and arrangement of the output is dependent on the same properties of the inputs (i.e., search_key, index and is_sort). Therefore, it will also follow this rule for two 1D inputs. Now, what if we set both inputs as vertical lists? How does the VLOOKUP function evaluate these inputs? Let’s look at Fig. 2.15.

Figure 2.15 1D (vertical list) Search_Key vs. 1D (vertical list) Index

We can see here that the VLOOKUP starts at A9, and then, it extracts the cell from the Description Column. Next, it proceeds to A10 and extracts the cell from the Amount Column. The results are seen in B9:B10, which has a different arrangement than the result in Fig. 2.14, but it shows that the VLOOKUP evaluates an array or a range of cells by row first, and then, by column. The intuition is based on the layout of the spreadsheet. Using this intuition, let’s build visualizations and rules that will aid us in constructing the formulas. 

Let’s start with the scenario from Fig. 2.15. Let’s imagine the inputs as cells in spreadsheets, and arrange them like their arrangement in the formula, which is a vertical list (Fig. 2.16). Next, let’s combine the inputs into one matrix by pairing them. The pairing is a one-to-one correspondence, which means that row 1 of search_key pairs with row 1 of Index and so on. Therefore, the resulting matrix will be a vertical list as shown in Fig. 2.16. 

Figure 2.16 Matrix of Inputs: 1D Vertical List vs. 1D Vertical List

The dimension, size and arrangement of the matrix of inputs dictates the same properties of the output. This agrees with the results from Fig. 2.15 and Fig. 2.16. Now, let’s look back at the scenario from Fig. 2.14. 

Let’s visualize the inputs (Fig. 2.17). We can see there’s a problem with pairing the inputs. We can’t completely pair them because the other one is a vertical list and the other one is a horizontal list. To be able to  pair them in a one-to-one correspondence, we’ll duplicate the list as seen in Fig. 2.17. This means that in order for us to pair them, the inputs must have the same dimensions, sizes and arrangements. Take note that this is just a visualization in order for us to understand how the function evaluates these inputs, but it does not represent the actual process. We are just looking at the essence of how it works using an analogy. 

Since we have the same dimension, size and arrangement, we can now have a one-to-one correspondence between the inputs. The resulting matrix will dictate the process of the VLOOKUP function as well as its output. 

Figure 2.17 Matrix of Inputs: 1D Vertical List vs. 1D Horizontal List

The above concepts are a way for us to visualize how the VLOOKUP function will process the inputs (i.e., range of cells and array). We can think of these as rules that we need to remember when we are constructing a VLOOKUP formula. These rules will also work for 1D search_key and 1D is_sort and 1D index and 1D is_sort, but what if both three inputs are 1D? The same rules still apply. Just remember the keywords: “one-to-one correspondence.” Therefore, we just need to pair the row and column coordinates for each of the inputs such as the (1, 1) of search_key, the (1, 1) of index and (1, 1) of the is_sort and so on. If they have different dimensions, sizes or arrangements,  just duplicate the lists to make sure that we can pair them in a one-to-one correspondence. 

  1. 2D Inputs

For 2D inputs, the same rules that we formulated in 4.) still apply. Let’s look at the scenario in Fig. 2.18. The search_key is a 2D range of cells (E1:F2) while the index is a vertical list. The results are shown in E4:F5. Now, let’s try to explain the process using the same rules that we’ve used for 1D inputs. 

First, let’s visualize the inputs and their arrangements in a spreadsheet (Fig. 2.19). Next, we duplicate the lists if the dimensions, sizes and arrangements are not the same. In this case, we duplicate the list of indices. Now, we can pair the inputs, and the resulting matrix of inputs will dictate the process and the output. As we can see, the same rules apply. Therefore, no matter what the scenario is, just follow the rules, and you’ll be able to predict the dimension, size and arrangement of the results. 

Figure 2.18 2D Search_Key vs. 1D Index

Figure 2.19 Matrix of Inputs: 2D vs. 1D

2.2.2 Is VLOOKUP Case-Sensitive?

The VLOOKUP function is not case-sensitive. You can perform conditions with different combinations of upper and lower cases, and the VLOOKUP function will be able to identify the same characters (Fig. 2.20).

Figure 2.20 The Search_Key: Lowercase Letters vs. Range: Uppercase Letters 

2.2.3 Does VLOOKUP support Wildcards?

The VLOOKUP function can have wildcards in its search_key, and  can perform partial matches. The task in the scenario below is to extract the amount of an item with a “YY” in its code in the Code Column (Fig. 2.21). Wildcards consists of asterisk (*), question mark (?) and tilde (~). These are symbols that have functions when used in formulas. The first two are used to perform partial matches like the scenario in Fig. 2.21. The tilde is used to perform conditions that contain the asterisk (*) and question mark (?) as characters and not as wildcards. 

Figure 2.21 VLOOKUP with Partial Match

2.3 VLOOKUP Output

2.3.1 Dimension and Size of the Output

As discussed in the previous sections, the dimension, size and arrangement of the results are dictated by the same properties of the inputs. For array results, just make sure that the destination cells of the outputs are vacant cells. If there are values that are occupying the cells where the results should be displayed, then an error will be returned instead of the results (Fig. 2.22). 

Figure 2.22 The VLOOKUP function returns a #REF error due to E2 blocking the results

2.3.2 Data Type of the Output

The output of the VLOOKUP function can be any data type. If it can’t find a match or a cell that satisfies the condition, then it will return a #N/A error. 

  1. Applications of VLOOKUP

In this section, we’ll tackle how to formulate your own VLOOKUP for different scenarios. Some complicated tasks will require some tricks, but basically it all boils down to the rules of the VLOOKUP function. We just need to follow the rules, and we’ll be able to create the VLOOKUP formula that we want.

3.1 How to formulate a VLOOKUP 

  1. First we need to identify the inputs for our VLOOKUP:
  1. Search_Key: It’s better to start determining the location of the values for the search_key. Is it located inside the sheet or is it an array? Is it in another sheet? Do we need to do some steps first before using the values?
  1. Range: Next, we need to make sure that the first column of the range must be the column where we want to perform the matching process. Most of the time the range is located in another sheet. Is it dynamic? This means that we need to consider the increase of data that will occur in the future. 
  1. Index: Remember that the index must be to the right of the matching column. Therefore, we need to check the arrangement of the columns. If the matching column is to the right of the index, we can’t apply the VLOOKUP function. 
  1. Preprocessing of Raw Data

Step 1 helps us identify if our data are suited as inputs for the VLOOKUP function.  Most of the time, we don’t have the control on the arrangement and values of our raw data. With this step, we can carefully follow the rules for each of the inputs and see if we need to take some preprocessing steps before we can apply the inputs to the function. 

  1. Setting up the VLOOKUP Formula:

Step 1 and Step 2 will give us an idea how to arrange the inputs of the functions and the format of the resulting output. It’s always better to visualize the inputs when constructing the formula. This will help us predict the dimension, size and arrangement of the results. Just remember that if the output is an array, we need to make sure that there are vacant cells for them to fill up. Otherwise, an error will be returned instead of the output. 

3.2 How to do VLOOKUP with Different Data Types (Exact Match)

All of the data types in Google Sheets have their own set of formatting. The formatting of values is just a way of displaying the values such as percentages, decimals and so on, but Google Sheets doesn’t necessarily use the exact values that are displayed in the cells. Therefore, we need to take into consideration the data type that we are working with for the search_key. Let’s look at different examples. 

  1. How to do VLOOKUP with Numbers (Whole Numbers)

Let’s look back at the example from Fig. 2.10, but we’ll perform an exact match instead of an approximate match (Fig. 3.1). We need to make sure first the data type of the search_key and the first column of the range are the same data types. If not, then the VLOOKUP function won’t be able to find a match and will return the #N/A error. 

Figure 3.1 VLOOKUP with Numbers

Steps:

  1.  In a cell where you want to write the formula (e.g.,B9), write =VLOOKUP(

As you write the formula, Google Sheets will display the formulas that match what you’re typing (Fig. 3.2). You can then select from the options that are displayed by pressing the TAB key,  or just complete the formula by writing it manually. 

Figure 3.2 Writing the VLOOKUP function

  1. Next, after the open parenthesis (, write or highlight the cells that will be used as the search_key input. In this case, highlight A9. The spreadsheet will automatically write the reference for you (Fig. 3.3). 

Figure 3.3 Highlighting the cells when you’re constructing the formula will automatically write the reference of the highlighted cells. 

  1. Write a comma to separate the first input, and then, write or highlight the cells for the range. We don’t include the headers (Fig. 3.4). 

Figure 3.4 Highlighting the Range

  1. Write the index number. We followed this with the is_sort. Since this is an exact match, we’ll write FALSE. If we omit this from the formula, it will take its default value, which is TRUE. We can write the close parenthesis or just hit ENTER, which will automatically complete the formula for us. 
  1. How to do VLOOKUP with Numbers (Decimal Numbers)

Let’s look at a similar data set, but instead we have decimal numbers in the first column. We set the search_key equals 5.57, but the result in B9 shows #N/A, which means that the VLOOKUP function did not find a match in the Code Column. We see the 5.57 in the column, but when we select A2, we can see that its value is actually 5.5678 (Fig. 3.5). The formatting only shows two decimal places. Therefore, for the VLOOKUP function to work, we need to change the search_key to 5.5679. In conclusion, if you’re working with decimal numbers and percentages, you need to take into consideration the formatting. 

Figure 3.5 VLOOKUP with Decimal Numbers

  1. How to do VLOOKUP with Dates 

Dates are treated as numerical values. Although we see this common date format (i.e., dd/mm/YYYY) displayed in the cells, the spreadsheet does not use this in its evaluation. Instead, it operates using date serials. The dates have equivalent date serials, which are the number of days from a referenced date (i.e., 12/31/1899 = 1). If you highlight the Dates and select Automatic as the Format, you’ll see that the dates will be converted to their corresponding date serials. These are the numbers that will be evaluated by the functions. Now, let’s look at the scenario below. 

The task in the scenario below is to extract the amount for the date 4/1/2021 (Fig. 3.6). The search_key for this scenario is a constant. As discussed before, the data type of the search_key and the first column of the range must be the same. The search_key in the formula in A9 is written as text, but the dates in Column are written as dates. That’s why the VLOOKUP function returned the #N/A error. Although the dates are shown in the format dd/mm/YYYY, the values that are evaluated are their equivalent date serials as aforementioned above. It’s easier to reference a date from a cell because when we write dates in cells, Google Sheets can recognize a lot of date formats such as Jan-01-2021, January 1, 2021 or 1/1/2021 as dates. Therefore, we don’t need to think about the date serials when referencing the cells, but if the dates will be hard-coded as constants in the formula, then we need to convert the dates into their date serials. Below are some of the ways to convert the hard-coded dates to their corresponding date serials:

Using DATEVALUE function:

=VLOOKUP(DATEVALUE(“4/1/2021”),A2:C6,3,false)

Using DATE function:

=VLOOKUP(DATE(2021,4,1),A2:C6,3,false)

Figure 3.6 VLOOKUP with Dates

3.3 How to do VLOOKUP with Approximate Match

The common way to use approximate match is when you need to extract values based on an increasing range of values. Let’s look at an example below.

We have a table that contains the % bonus that an employee can acquire based on his/her sales (Fig. 3.7). Let’s try to formulate a way to automatically determine the bonus for a given sales. One way to do this is by using the VLOOKUP function with approximate match. Remember that the approximate match uses the binary search method, where the assumption is that the range is sorted in an ascending order.. Keeping this in mind, we’ll restructure the table in a way where we can utilize the advantages of the approximate match. 

Figure 3.7 % Bonus based on Sales

To restructure the table, we take the lower limits of the sales ranges, sort (ascending) them in a list and pair them with their corresponding bonuses (Fig. 3.8). Let’s say we want to know the bonus for 15,000 (A9). The search_key is A9, the index is 2 and we omit is_sort because we want it to take its default value, which is TRUE (approximate match). The VLOOKUP function will start at the midpoint, which is (1+5)/2 = 3. The midpoint is the 3rd entry in the range, which is A4. The search_key is greater than the value in A4 (i.e.,15,000>10,001), and therefore, A2:A3 are excluded in the search space. The remaining cells are A4:A6. Repeat the process: 

  1. Midpoint = (3+5)/2 = 4, which is A5.
  2. 15,000<20,001. Therefore, the remaining cell is A4.
  3. Check: A4 ≤ 15,000, which is TRUE.
  4. The row coordinate of A4 is used together with the index = 2 forming the coordinates (4, 2) = B5. Therefore, the VLOOKUP function will return 10%. 

Figure 3.8 Restructured Table for Approximate Match

3.4 How to do VLOOKUP with a Case-Sensitive Lookup

We already know that the VLOOKUP function is not case-sensitive. If we want to perform a case-sensitive matching process, we need to do some extra steps. The most common method is to use a helper column together with the EXACT function. Even within this method, there’s a lot of ways to go about it. Let’s look at one way to do this. 

Figure 3.9 Case-sensitive VLOOKUP

For this case, the helper column will serve as the first column of the range (Fig. 3.9). It will contain the row coordinates of each of the items in the range. On the other hand, the EXACT function takes in two strings and checks if the two strings are the same. It returns TRUE if both strings are the same. Otherwise, it returns FALSE if the two strings are different. The advantage of the EXACT function is that it’s case-sensitive. Therefore, using the EXACT function, we can factor in the uppercase and lowercase letters. The EXACT function will check F2 against each of the cells in the Code Column, and then, it will return an array of TRUE and FALSE values. Now, we need to connect this array to the helper column, and we’ll use the MATCH function. The MATCH function performs a matching process. It matches the lookup value to a list and returns the coordinate of the first match. Its process is the same as the matching process of the VLOOKUP. It takes in three inputs: the lookup value,  1D list, and the type of match (e.g., 0 = exact match). 

Let’s try to look at the combination of the helper column, EXACT function and MATCH function. First, the EXACT function tests all of the cells within column B if they are equal to the given code. This will return an array of TRUE and FALSE values. Next, we’ll use MATCH to find the TRUE from the array. It will return the row coordinate of the first match. For this scenario, it will return 5. The value returned by the MATCH function will serve as the search_key, and the VLOOKUP function will perform a match in the helper column (Column A) using this search_key. It’s important to note that the EXACT function doesn’t normally accept an array or a range of cells as an input. Therefore, we need to use the ArrayFormula function in order to evaluate an array or a range of cells. 

3.5 How to do VLOOKUP with Wildcards

Wildcards are mainly used for partial matches with texts. It’s important to emphasize here that it only works if your data are in text format. There are three types of wildcards: asterisk (*), question mark (?) and tilde (~). Each one of these functions differently:

  1. Asterisk (*)

The asterisk (*) can represent any character or a text with more than one character. Let’s look at its applications for different scenarios. 

In Fig. 3.10, the Code Column contains IDs with a letter code and a number code. The task is to extract the amount of a code that has 1001. In this case, it doesn’t matter what’s the letter code in the ID as long as the number code is 1001. This is a typical example of a partial match. We are not matching the whole text but only a part of it. 

Partial matches consist of two parts: the keywords and the trivial texts. The keywords are the words or texts that we want to match against a text value. The trivial texts are the texts that are not necessary for the match.

For the first scenario here, the pattern or format of the IDs are consistent. If this is the case, then we can select which texts will be trivial. Since the keyword is 1001, any texts before this number will be trivial texts. We can then use the asterisk(*) to represent and replace the trivial texts. 

In the formula shown in the figure, we can see the search_key as “*1001” where the asterisk (*) represents any text before 1001. Basically, this is saying that it doesn’t matter what the texts are before 1001, as long as the last four characters are equivalent to 1001. The VLOOKUP function will extract the amount of the first match, which is A2. 

Figure 3.10 VLOOKUP with Partial Match using Asterisk (*) that replaces texts before the keyword

If the partial match is based on the letter code, then the texts after the letter codes will be the trivial texts. Therefore, we can replace them using the asterisk (*) (Fig. 3.11).

Figure 3.11 VLOOKUP with Partial Match using Asterisk (*) that replaces texts after the keyword

The search_keys at Fig. 3.10 and 3.11 will only work if the format of the texts in the range are consistent. In the case where the keyword can be positioned anywhere within the text, we need to replace the texts before and after the keyword with asterisks (*). In Fig. 3.12, 1001 are located at different positions within the texts.

Figure 3.12 VLOOKUP with Partial Match using Asterisks (*) that replaces texts before and after the keyword

  1. Question Mark (?)

Question mark (?) can only represent one character. The usage is the same as with the asterisk, but we need to make sure that the format or pattern of the texts in the range must be consistent.

If we apply this wildcard in the scenario in Fig. 3.10, the formula would be:

Figure 3.13 VLOOKUP with Partial Match using Question Mark (?)

The question marks (?) are used here to replace the letter code in the ID. Therefore, it’s important to know first the format or pattern of texts within the range. 

We can also combine the question mark (?) and asterisk (*) in the search_key (Fig. 3.14). If you look at the formula, the asterisk replaces the trivial texts before the keyword while the question mark represents three trivial characters before the keyword 1. 

Figure 3.14 VLOOKUP with Partial Match using a combination of Asterisk (*) and Question Marks (?)

  1. Tilde (~)

The wildcards work by including them inside the quotation marks. If we write asterisk (*) or question mark (?) inside the quotation marks, they will not be treated as the character themselves but as how they function as wildcards. Therefore, if the condition literally includes * or ?, we can’t write the characters normally. We need to use tilde (~) together with the wildcard characters (e.g., “~*” or “~?”) in order to perform a literal match that contains * or ?.

3.6 How to do VLOOKUP with Left Lookup

As discussed previously,  the VLOOKUP function can only perform the matching process on the first column of the range and extract a cell to the right of the first column. Basically, it’s a left to right process, but what if the match column is not the first column and the column of interest is to the left of the match column? Obviously, we can rearrange the table to fit the requirements of the VLOOKUP function, but if the raw data set is dynamic and we don’t have control in its arrangement, then we need to find a way to automate the process without rearranging the data. 

Luckily, we can rearrange the range in the VLOOKUP function using the curly brackets {}. We normally use curly brackets to write arrays in formulas. The curly brackets combine and arrange the constants inside it into an array depending on how we set the arrangement (using commas for columns and semicolons for rows). In Google Sheets, the curly brackets can go beyond constants and can use ranges of cells. This makes it more powerful because we can use this to combine ranges of cells into one list. Therefore, we can combine two or more vertical lists or horizontal lists or rearrange tables. Let’s apply this to the scenario below.

The task is to extract the amount for the code: YY-1001 (Fig. 3.15), but as we can see, the match column is the last column of the range and the Amount Column is the 1st. This is not the right arrangement for our VLOOKUP formula. We can rearrange the range using the curly brackets to move the Code Column as the first column and the Amount Column to the right of the Code Column. In this case, we moved the Amount Column as the last column. We can also move it to the 2nd, but for the sake of showing an example, we deliberately moved the Amount Column to the 3rd.

Inside the curly brackets C2, B2 and A2 becomes the first row of the new range, where C2 is the first column, B2 is the 2nd and A2 is the 3rd. The next row contains C3, B3 and A3 and so on. 

Figure 3.15 VLOOKUP with Left Lookup

3.7 How to do VLOOKUP with Multiple Search_Keys

Most of the time, we want to extract values based on multiple search_keys. The task below is to extract the corresponding amount for a given code based on a date (Fig. 3.16). In this case, we have to extract the amount based on two search_keys: XX-1002 and 2/1/2021. It’s important to note that this is an AND scenario, and therefore, we need to satisfy both conditions. It’s not a separate extraction. Therefore, we need to perform a simultaneous matching process on different columns, and both conditions must be TRUE at the same time.  A normal VLOOKUP function can’t perform this type of task. Therefore, we need to do some extra steps to aid the VLOOKUP function. 

3.7.1 VLOOKUP with Multiple Search_Keys: Helper Column

In order to do this, we’ll use a helper column with the & operator. The & operator joins two or more texts together. We’ll use this to stitch the values from the Code Column and Date Column in the Helper Column. Next, in the VLOOKUP function, we’ll also stitch the search_keys using the & operator. The new search_key will be matched with the values in the Helper Column. This method can also be applied to more than two search_keys. 

Figure 3.16 Multiple Search_Keys with Helper Column

3.7.2 VLOOKUP with Multiple Search_Keys: Curly Brackets

The method in 3.7.1 is a common solution for Excel users, but for Google Sheets, we have a more elegant solution, which is using the curly brackets {}. We’ve discussed already the power of curly brackets in section 3.6. Now, we’ll use this to perform AND conditions in the matching process. 

We will still use the & operator to stitch the search_keys and their corresponding columns, but instead of using a helper column, we’ll rearrange the range using the curly brackets (Fig. 3.17). The & operator can’t normally stitch arrays or ranges of cells. Therefore, we’ll use the ArrayFormula function so that it can evaluate arrays or ranges of cells. 

Figure 3.17 Multiple Search_Keys with Curly Brackets {}

3.8 How to do VLOOKUP with Varying Index (2 way Lookup)

The task in the scenario below is to extract the grade based on the term and the course (Fig. 3.18). In this case, the index will vary based on the input course in B8. We’ll use the MATCH function to determine the index (column coordinate). The MATCH function will match the given course in B8 to each of the column headers. Once it finds a match, it will return the coordinate of the matched cell. Take note that the MATCH function can only perform the matching process on a 1D list. The returned coordinate of the MATCH function will serve as the index for the VLOOKUP function. 

Figure 3.18 VLOOKUP with Horizontal Lookup

3.9 How to do VLOOKUP with Inputs in Another Sheet

We can have inputs of VLOOKUP from another sheet. The process of constructing the formula is still the same. We just need to reference the inputs from another sheet. 

3.9.1 VLOOKUP with Inputs in Another Sheet (same Workbook)

Most of the time, the range is from another sheet. The range in the scenario below is located in the Range Sheet (Fig. 3.19) while the other inputs and the formula will be written in the VLOOKUP Sheet (Fig. 3.20). If this is the case, then we can follow the steps below on how to refer to cells that are on a different sheet. 

Figure 3.19 The range is located in a different sheet (Range Sheet) 

Here are the steps:

  1. Write = VLOOKUP(
  1. Highlight (or write) the search_key (A2), and then the range:

We can’t simply write A2:C6 because the range is now from a different sheet. We need to reference the cells together with the reference sheet. The fastest way to do this is to highlight the cells. While the cursor is inside the VLOOKUP function, we go to the Range Sheet tab. Then, we highlight the cells A2:C6, the spreadsheet automatically writes the cell references including its sheet reference (Fig. 3.21). The sheet reference always starts with the sheet name followed by an exclamation mark (!). Then, this is followed by the cell reference. 

  1. Go back to function and complete the inputs. Then, press ENTER. 

Figure 3.20 VLOOKUP with String1 in a Different Sheet in the same Workbook

Figure 3.21 Highlighting the Cells from a different Sheet

3.9.2 VLOOKUP with Inputs in Another Sheet (in another Workbook)

If the inputs are located from another workbook, then we need to use the IMPORTRANGE function in order to pull the data from another workbook. The IMPORTRANGE requires two inputs: the URL of the workbook and the sheet and cell references. Both inputs for the IMPORTRANGE must be in text format. Therefore, we need to enclose the URL and reference with quotation marks. IMPORTRANGE requires permission to pull the data from another workbook. Once access is granted, it will automatically pull the data. 

Below is an example on how to apply the IMPORTRANGE to VLOOKUP:

=VLOOKUP(A2,IMPORTRANGE(“https://docs.google.com/spreadsheets/Example”,”Range Sheet!A2:C6″),3,FALSE)