by Mike Brown, Reprinted with permission of LendEDU

While less-refined nationwide data suggests there is no correlation between income and the chances of catching the coronavirus, an analysis of more precise data from New York City indicates otherwise.

Despite all that we now know about the novel coronavirus, there is still so much to learn about this deadly disease.

For example, is a person’s income correlated with his or her chances of contracting COVID-19?

At first glance, it appeared that there was no such connection. For 1,808 of the most populated counties in the United States, LendEDU compared each one’s median household income to its number of reported coronavirus cases and deaths according to USAFacts as of April 30, 2020.

The correlation was minimal, if there was any at all.

But when we looked at more refined ZIP Code data for the epicenter of the U.S. outbreak, New York City, we saw a relatively strong correlation that suggests the lower one’s income, the higher his or her chances of contracting the coronavirus.

Nationwide Data

To see if there was a correlation nationwide, we took the 1,808 U.S. counties that have a population of 20,000 or more and overlaid each one’s median household income (HH Income) with both its number of coronavirus cases and deaths as of April 30, 2020.

The coronavirus data was provided by USAFacts, which compiles daily data from state health departments. Population figures for each county were also used to calculate the coronavirus infection rate (cases/pop.) and overall mortality rate (deaths/pop.), in addition to the infected mortality rate (deaths/cases).

Interestingly, in most cases there was a very faint correlation between higher income and greater chances of contracting or dying from the coronavirus.

This could be the case because cities, where incomes are the highest, have borne the brunt of the deadly pandemic due to overcrowding and mass transportation. It is also easier to get tested for the virus when living near a city than when living in an isolated rural area.

Conversely, incomes are generally lower and poverty is more prevalent in rural areas, where less-dense populations are not conducive to spreading the coronavirus. Additionally, a lack of testing in rural America makes it harder to paint an accurate picture.

But is this really what is happening, or is a county-by-county analysis not granular enough to show how socioeconomic status is correlated with the chances of contracting the coronavirus?

When we looked at more refined data for New York City, the epicenter of the U.S. outbreak, a notably strong correlation appeared.

New York City Data

The NYC Department of Health has been releasing coronavirus testing data daily on GitHub, broken down by ZIP Code, which allowed us to compare it to median household income data by NYC ZIP Code.

This more localized analysis revealed a stronger correlation that suggests a lower income heightens the chances of contracting COVID-19.

Using exact ZIP Code figures for both household income and coronavirus numbers, rather than broader countywide figures, depicted the situation in each NYC neighborhood more accurately, and the correlation between income and the chances of catching the coronavirus was much stronger as a result.

New York City residents in less affluent neighborhoods were more likely to receive testing and test positive for the coronavirus.

But the strongest correlation existed between having a lower median household income and having a higher percentage of positive COVID-19 test results relative to the total number of tests.

One reason this connection could exist is that those who are less financially stable might be going outside and coming into contact with people more often because they must continue to work to stay afloat. This could include, for example, delivery drivers, Uber and taxi drivers, maintenance workers, and public works employees.

This cohort might also still be relying heavily on public transportation, like the subway, during the pandemic. Conversely, more white-collar workers with higher salaries might still be in a comfortable financial situation, in addition to being able to work from home with a mere laptop.

It will certainly be interesting to run a similar ZIP Code analysis for other cities when the data becomes available to see if this trend is prevalent throughout the U.S.

Methodology

To see if there was a correlation nationwide, we took the 1,808 U.S. counties, parishes, census areas, or independent cities that have a population of 20,000 or more according to the most recent population estimates from the U.S. Census Bureau and GreatData. LendEDU licenses the data from GreatData, which draws on the most recent U.S. Census Bureau update and also calculates its own projections based on historical trends to provide the most up-to-date figures. Most counties had multiple ZIP Codes, in which case we summed the ZIP Code population projections to get the entire county population.

For each county, we also found the weighted average of the median household incomes that exist within that county according to the ZIP Codes that lie within it. In a given county, a more populated ZIP Code’s median household income was given more weight when calculating the weighted average of the median household incomes for that county. The median household income data was also provided by GreatData on a ZIP Code basis.
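The aggregation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the tuple layout, and the sample numbers are hypothetical and do not come from the GreatData files or from the analysis itself.

```python
from collections import defaultdict


def county_income_profile(zip_rows, min_population=20_000):
    """Roll ZIP-level rows up to county population and weighted income.

    zip_rows: iterable of (county, zip_population, zip_median_income).
    Returns {county: (population, weighted_income)} for counties whose
    summed population is at least min_population.
    """
    population = defaultdict(int)
    income_times_pop = defaultdict(float)

    for county, zip_pop, zip_income in zip_rows:
        # County population is the sum of its ZIP Code populations.
        population[county] += zip_pop
        # A more populated ZIP Code contributes more to the average.
        income_times_pop[county] += zip_pop * zip_income

    return {
        county: (pop, income_times_pop[county] / pop)
        for county, pop in population.items()
        if pop >= min_population
    }


# Made-up illustrative rows, not figures from the analysis.
rows = [
    ("County A", 30_000, 50_000),
    ("County A", 10_000, 90_000),
    ("County B", 5_000, 40_000),
]
profile = county_income_profile(rows)
```

In this made-up example, County A's larger ZIP Code pulls the weighted income toward its own median, and County B drops out because it falls below the 20,000 population threshold.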

The resulting household income figure was then overlaid with both the number of coronavirus cases and deaths in that same county. The coronavirus data was provided by USAFacts, which compiles data from state health departments, and was up to date as of April 30, 2020, when the data was pulled from the site.

The population figure for each county was also used to calculate the coronavirus infection rate (cases/population) and overall mortality rate (deaths/population), in addition to the infected mortality rate (deaths/cases). These three data points were also overlaid with the median household income data.
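As a rough sketch, the three rates can be computed per county as follows. The function name and the sample inputs are hypothetical, and the handling of counties with zero reported cases is an assumption, since the article does not say how those were treated.

```python
def coronavirus_rates(cases, deaths, population):
    """Return (infection rate, overall mortality rate, infected mortality rate)."""
    infection_rate = cases / population           # cases/population
    overall_mortality_rate = deaths / population  # deaths/population
    # Assumption: a county with no reported cases has no defined
    # infected mortality rate, so None is returned for it.
    infected_mortality_rate = deaths / cases if cases else None
    return infection_rate, overall_mortality_rate, infected_mortality_rate


# Made-up illustrative inputs, not figures from USAFacts.
rates = coronavirus_rates(cases=200, deaths=10, population=40_000)
```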

To find the correlation, we formatted a scatter plot in Excel.
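The article does not say which statistic was read off the scatter plot. One common choice is the Pearson correlation coefficient r; the R² that Excel shows for a linear trendline is the square of r. The sketch below assumes that choice, and the function name and sample values are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")

    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)

    # Sums of cross-deviations and squared deviations from the means.
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)

    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")

    return covariance / math.sqrt(spread_x * spread_y)


# Made-up illustrative values: income rises while the positive share falls.
incomes = [40_000, 60_000, 80_000, 100_000]
positive_share = [0.50, 0.40, 0.30, 0.20]
r = pearson_r(incomes, positive_share)
r_squared = r ** 2
```

A value of r near -1 would indicate that higher incomes go with lower values of the second variable, near +1 the opposite, and near 0 little or no linear relationship.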

To see if there was a correlation in New York City, much of the same process was used, except that we used the most up-to-date coronavirus figures for New York City ZIP Codes as provided daily by the NYC Department of Health on GitHub. The NYC coronavirus data that was used in this report was downloaded from GitHub on April 30, 2020.

Because this data was on a ZIP Code level, we were able to use the median household income figures by NYC ZIP Code as provided by GreatData. Each data point (positive results, total tests, and positive results/total tests) was overlaid with the corresponding median household income figure by ZIP Code.
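A minimal sketch of that pairing step is shown below. The dictionary layouts, the placeholder ZIP labels, and the numbers are all hypothetical, and skipping ZIP Codes that lack an income figure or have zero tests is an assumption rather than something stated in the methodology.

```python
def pair_income_with_positive_share(tests_by_zip, income_by_zip):
    """Match each ZIP Code's positive-test share with its median income.

    tests_by_zip: {zip_code: (positive_results, total_tests)}
    income_by_zip: {zip_code: median_household_income}
    Returns a list of (zip_code, income, positive_share) tuples.
    """
    pairs = []
    for zip_code, (positive, total) in sorted(tests_by_zip.items()):
        income = income_by_zip.get(zip_code)
        # Assumption: ZIP Codes without an income figure or without
        # any tests cannot be plotted, so they are skipped.
        if income is None or total == 0:
            continue
        pairs.append((zip_code, income, positive / total))
    return pairs


# Placeholder labels and made-up numbers, not NYC Department of Health data.
tests = {"ZIP-A": (50, 100), "ZIP-B": (30, 120), "ZIP-C": (10, 40)}
income = {"ZIP-A": 45_000, "ZIP-B": 95_000}
pairs = pair_income_with_positive_share(tests, income)
```

Each resulting tuple corresponds to one point on a scatter plot of income against the share of positive tests.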

To see Mike Brown’s original blog, please click here.

To see a raw data file, please email me at brown@lendedu.com.

 


In his role at LendEDU, Mike Brown uses data, usually from surveys and publicly available resources, to identify emerging personal finance trends and tell unique stories. Mike’s work, featured in major outlets like The Wall Street Journal and The Washington Post, provides consumers with a personal finance measuring stick and can help them make informed finance decisions.