On social media, people are eager to share messages like the one shown below, which are supposed to show that blacks target whites with their crimes, whereas whites are relatively nice to blacks. In this post, I want to look at where the data comes from and how to correctly interpret and visualize the numbers.

Even if we assume that the data reported in the plot is correct, the conclusion is misleading and not supported by this data. In short, the reasoning is as follows: the US population has more white people than black people, and therefore whites will more often be the victim of a crime than blacks. Therefore, one cannot directly compare the ‘black on white’ and the ‘white on black’ bars to conclude that whites are disproportionately targeted by blacks when they commit a crime.

A first simple example

Assume I am walking on Michigan Avenue in Chicago with many shoppers around me. For simplicity, we assume there are \(100\) shoppers in my direct vicinity and they can be divided into two groups. There are 60 people belonging to group \(A\) and the remaining 40 belong to group \(B\).

I am in a bad mood and planning to pick someone from the street to give him/her a knuckle sandwich. If I pick my victim randomly from these 100 people around me, then there is a \(60\%\) probability that my victim will be of group \(A\) and a \(40\% \) probability that the victim will be of group \(B\). If I were to repeat this bad behavior 100 times, I would have, on average, 60 victims of group \(A\) and 40 of group \(B\). However, this would not mean that I am targeting group \(A\), or that I am more violent towards it.
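
The following is a minimal simulation sketch of this thought experiment, not part of the original data analysis. The function name `simulate_random_picks`, the number of repetitions, and the fixed seed are assumptions made purely for illustration. It picks victims at random from a crowd in which 60% belong to group A and reports the share of victims per group.

```python
import random


def simulate_random_picks(n_picks, share_a=0.6, seed=0):
    """Pick victims at random from a crowd in which a fraction share_a is group A."""
    rng = random.Random(seed)  # fixed seed (an assumption) so the run is reproducible
    counts = {"A": 0, "B": 0}
    for _ in range(n_picks):
        # Each pick is independent of the previous ones and ignores the group.
        group = "A" if rng.random() < share_a else "B"
        counts[group] += 1
    return counts


counts = simulate_random_picks(100_000)
total = sum(counts.values())
for group, n in counts.items():
    print(f"group {group}: {n / total:.1%} of victims")
```

The shares come out close to 60% and 40% even though the picking rule never looks at the group, which is exactly the point of the example.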

Another example

Assume there is a group of 100 offenders. Each of them will pick a victim out of the group of 100 shoppers on Michigan Avenue. The group of offenders has 50 people of group \(A \) and 50 of group \(B \). As such, group \(B\) is over-represented in the group of offenders relative to the group of victims (i.e. the shoppers). However, we assume that each offender picks his/her victim completely at random. Offenders belonging to a certain group are not targeting the other group. Moreover, a shopper on Michigan Avenue can, unfortunately, be the victim in multiple incidents.

Since we assume independence between the offenders and the victims, it is straightforward to see that the probability that the offender will be of group \(B\) and the victim will be of group \(A \) is \((0.5)\times(0.6) = 0.3 \). Similarly, the probability that the victim is of group \(B\) and the offender is of group \(A \) is equal to \((0.5)\times(0.4) = 0.2 \). Therefore, out of the 100 incidents, we will have on average 20 incidents labeled ‘\(A\) on \(B\)’ and 30 incidents labeled ‘\(B\) on \(A\)’. However, this does not mean that group \(A\) is a target for group \(B\): the difference arises only because there are more potential victims in group \(A\).
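
The arithmetic above can be written out as a short sketch. The names `offender_share`, `victim_share`, and `expected_counts` are hypothetical and chosen only for this illustration. Under the independence assumption, the expected number of incidents for each offender/victim pair is the product of the two shares times the number of incidents.

```python
offender_share = {"A": 0.5, "B": 0.5}  # composition of the 100 offenders
victim_share = {"A": 0.6, "B": 0.4}    # composition of the 100 shoppers
n_incidents = 100


def expected_counts(offender_share, victim_share, n_incidents):
    """Expected 'offender on victim' incident counts when victims are picked at random."""
    return {
        (o, v): n_incidents * p_o * p_v
        for o, p_o in offender_share.items()
        for v, p_v in victim_share.items()
    }


counts = expected_counts(offender_share, victim_share, n_incidents)
for (o, v), n in counts.items():
    print(f"{o} on {v}: {n:.0f} incidents")
```

The ‘B on A’ count exceeds the ‘A on B’ count although no offender in this model prefers one group over the other.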

The data

The data used in the plot on interracial violence comes from the U.S. Department of Justice report ‘Criminal Victimization, 2018’. See: https://www.bjs.gov/content/pub/pdf/cv18.pdf. We start by using part of the data in Table 12 on page 12 of this report. The table below shows, in the column Population, the distribution of the US population. Assume you are walking on a crowded street in Chicago on Memorial Day and you bump into a person. The probability that the person you bumped into is black is not the same as the probability that this person is white. Indeed, if this street is representative of the US population, then there will be more whites, so the probability of bumping into someone who is white should also be larger. We get from the table that there is roughly a 62.3% probability that the person you just bumped into is white, whereas there is only a 12% probability that this person is black.

| Race | Population | Victim | Offender |
|---|---|---|---|
| White | 62.3% | 66.5% | 50.2% |
| Black | 12.0% | 10.8% | 21.7% |
| Hispanic | 17.2% | 13.9% | 14.4% |
| Asian | 6.3% | 4.2% | 2.5% |
| Other | 2.4% | 4.7% | 9.0% |

Parts of Table 12 in Criminal Victimization, 2018, Bureau of Justice Statistics, U.S. Department of Justice.

Assume I commit a violent crime in Chicago on Memorial Day. If I pick my victim randomly from a crowded street in Chicago, the probability that my victim is white is 62.3%, whereas the probability that my victim is black is 12%. However, crime data suggests that victims are not randomly sampled from the population. You can see in the Victim column of the table that the distribution of the victims differs from the population distribution.

Before any details of a crime are known, the race of the victim is uncertain, so we can treat it as a random variable. We use the variable \(V \) to denote the race of the victim. Then we have, for example, that:

\[\mathbb{P}\left[V=\text{White} \right]= 66.5\%\]

and

\[\mathbb{P}\left[V=\text{Black} \right] = 10.8\%.\]

In this notation, we have that \(\mathbb{P} \) stands for probability and inside the brackets we have the event that the victim \(V\) has a given race.

Guessing with and without information

The other table that is used is Table 14 on page 13 of the Criminal Victimization, 2018 report.

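| Victim race | White offender | Black offender | Hispanic offender | Asian offender | Other offender |
|---|---|---|---|---|---|
| White | 62.1% | 15.3% | 10.2% | 2.2% | 8.1% |
| Black | 10.6% | 70.3% | 7.9% | <0.1% | 9.3% |
| Hispanic | 28.2% | 15.3% | 45.4% | 0.6% | 7.4% |
| Asian | 24.1% | 27.5% | 7.0% | 24.1% | 14.4% |

Table 14 in Criminal Victimization, 2018, Bureau of Justice Statistics: Percent of violent incidents, by victim and offender race or ethnicity, 2018.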

Assume you are told that a violent incident happened and you have to guess/estimate the race of the offender. You have no information about the specific incident. Instead of just guessing, you can use the available national data, i.e. Tables 12 and 14, to make a more scientific guess. For example, you can use the Offender column of Table 12. In that case, your best guess is that the offender is white, since this guess has roughly a 50% probability of being right.

Assume some extra information is revealed to you: the race of the victim is black. In this situation you can use Table 14 and see that you had better change your guess. Given (or in mathematical terms ‘conditional on the event’) that the victim is black, it is much more likely that the offender is black than that the offender is white. Indeed, Table 14 gives the likelihood for the race of the offender (which we denote by the variable \( O\)) given that the race of the victim (which we denote by \(V \)) is known. For example, the table states that:

\[\mathbb{P}\left[O=\text{White}|\ V=\text{Black} \right] = 10.6\%, \]

and similarly we have that

\[\mathbb{P}\left[O=\text{Black}|\ V=\text{Black} \right] = 70.3\%. \]

So by changing the guess from white to black, we will have a correct guess for the race of the offender in 70.3% of the situations.
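
The two guesses can be reproduced with a small sketch. The dictionaries below copy the Offender column of Table 12 and the ‘Black’ victim row of Table 14; the helper `best_guess` is a hypothetical name used only here, and storing the ‘<0.1%’ entry as 0.1 is a simplifying assumption.

```python
# Offender column of Table 12 (no information about the victim), in percent.
offender_overall = {
    "White": 50.2, "Black": 21.7, "Hispanic": 14.4, "Asian": 2.5, "Other": 9.0,
}

# Row 'Black' of Table 14: race of the offender given a black victim, in percent.
# The '<0.1%' entry for Asian is stored as 0.1, an upper bound (assumption).
offender_given_black_victim = {
    "White": 10.6, "Black": 70.3, "Hispanic": 7.9, "Asian": 0.1, "Other": 9.3,
}


def best_guess(distribution):
    """Return the most likely category together with its probability."""
    race = max(distribution, key=distribution.get)
    return race, distribution[race]


print("no information:", best_guess(offender_overall))
print("victim is black:", best_guess(offender_given_black_victim))
```

The best guess switches from white to black once the race of the victim is known, and the probability of guessing correctly rises from about 50% to about 70%.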

Who is targeting who?

Assume a violent crime happened and you need to guess the race of the victim. We know already from Table 12 that, without any information about the offender, the probability that the victim is white is 66.5%. Therefore, if you want to make a scientific guess for the race of the victim, your best choice is to bet on white. It will give you the right answer in 66.5% of the situations.

The question we want to answer is the following. Assume someone reveals new information to you about the race of the offender. You now have to guess the race of the victim, but you can use the fact that the offender is black. There are two questions:

  1. Will you change your bet?
  2. If you do not change, will the likelihood of a correct guess increase or decrease?

If we want to answer these questions, we first need to look at the following conditional probability:

\[\mathbb{P}\left[V=\text{White}|\ O=\text{Black} \right].\]

Indeed, if we already know that the offender is black, how likely is it that he will pick a white person as a victim? Using basic probability theory (Bayes' rule, more precisely), we can write:

\[\mathbb{P}\left[V=\text{White}|\ O=\text{Black} \right] = \frac{\mathbb{P}\left[O=\text{Black}|\ V=\text{White} \right] \mathbb{P}\left[ V= \text{White}\right]}{\mathbb{P}\left[ O= \text{Black}\right]}.\]

In this way, we change the condition from the offender to the victim. The conditional probability \( \mathbb{P}\left[O=\text{Black}|\ V=\text{White} \right]\) can be found in Table 14. The probabilities \( \mathbb{P}\left[ V= \text{White}\right]\) and \( \mathbb{P}\left[ O= \text{Black}\right]\) can be found in Table 12. Combining all these probabilities gives the following conditional probability for a black offender:

\[\mathbb{P}\left[V=\text{White}|\ O=\text{Black} \right] = \frac{(0.153)\times(0.665)}{0.217}\approx 46.9\%.\]
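
As a quick check of this arithmetic, here is a minimal sketch. The function `bayes_posterior` and the variable names are hypothetical; the three inputs are the values quoted from Tables 12 and 14.

```python
def bayes_posterior(likelihood, prior, evidence):
    """Bayes' rule: P[V | O] = P[O | V] * P[V] / P[O]."""
    return likelihood * prior / evidence


p_o_black_given_v_white = 0.153  # Table 14, row White, column Black
p_v_white = 0.665                # Table 12, Victim column
p_o_black = 0.217                # Table 12, Offender column

p_v_white_given_o_black = bayes_posterior(
    p_o_black_given_v_white, p_v_white, p_o_black
)
print(f"P[V=White | O=Black] = {p_v_white_given_o_black:.1%}")  # 46.9%
```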

Going back to our two questions, we can formulate the following answer: ‘if we have to guess the race of the victim, when we know that the offender is black, then our best choice is to bet on white.’ Indeed, there are 4 other categories for the victim and, using the same reasoning, you can determine the probability that a black offender picks each of these other categories. You will then find that the probability is not larger than 46.9% for any of the other categories. Therefore, your best bet is to predict that the victim is white. However, this does not mean that blacks target whites. Indeed, if we do not know the race of the offender, our best bet would also be white. Therefore, we also look at the ratio of the two likelihoods:

\[\frac{\mathbb{P}\left[V=\text{White}|\ O=\text{Black} \right]}{\mathbb{P}\left[V=\text{White} \right]}=\frac{46.9}{66.5}\approx 0.705.\]

This ratio of roughly 70% states that, although your best bet is still white for the race of the victim when you know that the offender is black, the likelihood of ending up with the right choice is roughly 30% lower than when nothing is known about the offender.

The distribution of the victim

If we replace white by black, Asian, Hispanic and other, we can determine the so-called conditional distribution of the race of the victim, given a black offender. This distribution is shown in the plot below with the blue bars. Each blue bar denotes the likelihood that the victim is of a certain race, if we already know that the offender is black. Similarly, we can derive the conditional distribution given a white offender, shown in orange. Finally, the grey bars represent the unconditional probabilities, that is, the likelihood that the victim has a certain race if no information is revealed about the offender.

The conditional distribution for the victim, given the race of the offender.

This graph shows that the distribution of the victim given a black offender is different from the unconditional distribution (grey bars). If you know that the offender is black, the probability that the victim is white is decreased and the probability that the victim is black is increased compared to the unconditional distribution. For a white offender, the opposite is true. The probability that the victim is white is increased when we go from no information about the offender to the situation where the offender is known to be white. In that situation, we also see that the probability of a black victim is decreased. Therefore, we can conclude that neither white nor black offenders are targeting the other race! Based on these numbers, there is no proof of disproportionate interracial violence between black and white people.
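
The comparison behind this conclusion can be sketched in code. This is an illustration under stated assumptions, not the code used for the plot: the names `p_victim`, `p_offender`, `p_offender_given_victim`, and `victim_distribution` are hypothetical, and because Table 14 as reproduced above has no row for victims in the ‘Other’ category, that category is left out, so the conditional probabilities printed here do not sum to 100%.

```python
# Table 12: unconditional shares of victims and offenders.
p_victim = {"White": 0.665, "Black": 0.108, "Hispanic": 0.139, "Asian": 0.042}
p_offender = {"White": 0.502, "Black": 0.217}

# Table 14: P[O = offender race | V = victim race], only the two columns used here.
p_offender_given_victim = {
    "White":    {"White": 0.621, "Black": 0.153},
    "Black":    {"White": 0.106, "Black": 0.703},
    "Hispanic": {"White": 0.282, "Black": 0.153},
    "Asian":    {"White": 0.241, "Black": 0.275},
}


def victim_distribution(offender):
    """P[V = v | O = offender] for every listed victim race v, via Bayes' rule."""
    return {
        v: p_offender_given_victim[v][offender] * p_victim[v] / p_offender[offender]
        for v in p_victim
    }


for offender in ("Black", "White"):
    for v, p in victim_distribution(offender).items():
        ratio = p / p_victim[v]  # above 1: more likely than without information
        print(f"O={offender:5s} V={v:8s} P={p:6.1%} ratio={ratio:.2f}")
```

A ratio below 1 means that knowing the race of the offender makes that victim category less likely than it is without any information. For both black and white offenders, the ratio for a victim of the other race is below 1.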

7 Comments

  1. Great article. The problem with the BJS victimization stats is that the sample is too low (151,055 household interviews), but in my opinion you get a realistic overview of the crime rate when you correlate them with the FBI crime stats, which come from police reports. Not even the FBI stats are perfect, though, since not all agencies provide ethnicity data. Maybe you can do a similar article on https://ucr.fbi.gov/crime-in-the-u.s/2018/crime-in-the-u.s.-2018/tables/expanded-homicide-data-table-3.xls

  2. As I’m sure other people have pointed out, and I’m sure this will be censored, but there’s a fallacy in your article.

    You say “because there’s more whites, therefore whites will more likely be victims of crimes in any given instance.” Yes, true.

    However, you’re taking into account the NUMBER OF VICTIMS- but you FAIL to take into account, the number of CRIMINALS.

    If there’s much less black people (and therefore less blacks to be victims of crimes…) then by extension you must also concede there’s less blacks to COMMIT crimes. So there shouldn’t be a lot of black-on-white crime, since there’s so few blacks to do those crimes. And yet, there’s a ton of black on white crimes even though blacks have very small population size.

    Also, following that logic- If there’s more white people to be victims of crimes… then following that logic, there should also be much more crimes COMMITTED by whites in general, much more than by blacks.

    And even if there’s less blacks, the fact that so many more whites are committing crimes in the first place, should mean that the interracial crime rates between races should be about equal, IF all groups commit crimes equally.

    FOR EXAMPLE:

    let’s simplify it and say there’s 100 people- 90 white, and 10 black. Let’s say all of these people commit exactly 1 violent crime. And let’s say that their victims are purely random.

    According to math and probability, 10 blacks should be victims of violent crimes in general and 90 whites in general. And, 9 whites and 9 blacks, would on average be a victim of *interracial* violent crime, in this probability example. As there’s 90 whites, but only 10 blacks. Each whites person has a 10% chance of committing a crime against a black person… 10% of 90 = 9. Similarly blacks have a 90% chance of doing it against whites, as there’s so many whites. However, since there’s only 10 blacks to begin with, 90% of 10 blacks = 9. So the interracial violent crimes should be equal, even though population amounts aren’t the same. Cuz again we aren’t counting same-race violent crimes here, only interracial crimes. And interracial violent crimes require calculations of both the amount of potential victims, AND the amount of potential perpetrators, along with the likelihood of any crime being an interracial one.

    Because of the fact, whites are victimized so much more than blacks in terms of black/white interracial crime, it shows that *per individual person*, blacks are much more likely to commit crimes against whites than whites are against blacks.

    1. Where do you get “a ton” of Black on white crimes??

      The Black on white RATE is damn near identical to the white on Black rate: approx 3 out of 1,000

    2. I was thinking the same thing. But! That table doesn’t have white on white crimes etc. So that is why it seems, at least, that whites are more likely to be victims.

  3. Does this assume an evenly mixed population? Because that would be a big flaw. Racial groups cluster at both micro (near social circle) and macro (neighbourhood, town) levels.
