5 incredibly useful spreadsheet formulas for journalists

5 incredibly useful spreadsheet formulas for journalists

As a jack of all trades in my newsroom, there are several times each year when I’m stuck staring at a complex spreadsheet. I know there’s a story inside of it, but the raw data is too messy to be presented to the public (sometimes, I think, data is that way on purpose).

There wasn’t any formal training for those moments in my J-school experience and it took me years of doing things the hard way to realize that preparing data for public consumption didn’t have to be so time consuming.

Below are five functions that I find most useful, and links to better instructions for each one.

This list is a draft for a training presentation I have volunteered to give at an upcoming event.

CONCAT (or CONCATENATE)

Often, data needs to be cleaned up for display to the end user or to make it compatible with a visualization, such as mapping. Imagine a government database where addresses might be split across several fields for street, apartment, town, state and zip code, but you need that compressed into a single field for each entry.

Concatenation joins designated fields together to create a new field and it can also insert other content, such as spaces or punctuation. This example from Microsoft shows the syntax:

=CONCAT(“The”,” “,”sun”,” “,”will”,” “,”come”,” “,”up”,” “,”tomorrow.”) will return The sun will come up tomorrow.

Notice they string together the words and spaces separately in this example, with each coming between the pairs of quotation marks. Any of the individual words could easily be replaced with a cell number (A1) to create a formula that can be reproduced through a large database.

VLOOKUP (or XLOOKUP)

I’ve found this tool useful for making a simplified spreadsheet from a larger one.

For example, if you used CONCAT to put an address together as in the example above, you may want to create a new tab in your spreadsheet that hides all the extra columns. This could be useful if you need to feed a clean sheet into your visualization.

Here’s the syntax (Usually I want to use the exact match function, which is the default in the XLOOKUP variation):

=VLOOKUP(lookup value, range containing the lookup value, the column number in the range containing the return value, Approximate match (TRUE) or Exact match (FALSE))

COUNTIF

Maybe your data shows past, present and future cases, but you need to know how many of each are present. This is an easy way to do that, and if you use a second tab to host the counting, you can make the results dynamic even as data is added to the primary sheet.

=COUNTIF(Where do you want to look?, What do you want to look for?)

IF

Imagine a scenario where your database categorizes content with text labels — like past or present and open and closed — but your visualization tool requires numerical equivalents.

Sure, you could do a Find & Replace, but that destroys the original data. There’s always danger of making an error when you destroy the original.

It’s much safer, in my opinion, to add a new column, row or sheet and use this function to dynamically return the numbers you assign. Here’s how it works:

=IF(A1=”Open”,1,2) says IF(A1 = Yes, then return a 1, otherwise return a 2).

IMPORTRANGE (Google Sheets)

In some of the previous examples I’ve mentioned the importance of preserving original data. For some visualization programs, it may also be preferable to connect only the very simplest spreadsheets to help limit load times.

If you’re using Google Sheets, this function lets you connect a new sheet dynamically to another. The key is knowing the URL of the primary sheet and the range of fields you want to import.

IMPORTRANGE(spreadsheet_url, range_string)

Closing tip

If you’re not sure, search for it! Microsoft and Google both have extensive libraries of definitions and examples for all of the spreadsheet functions. A few minutes of searching may save you hours of work.