

{"id":760,"date":"2018-10-30T19:51:54","date_gmt":"2018-10-30T23:51:54","guid":{"rendered":"https:\/\/sites.temple.edu\/geophysics\/?p=760"},"modified":"2018-10-30T19:57:43","modified_gmt":"2018-10-30T23:57:43","slug":"benfords-law-fraud-evil-plots-and-geophysics","status":"publish","type":"post","link":"https:\/\/sites.temple.edu\/geophysics\/2018\/10\/30\/benfords-law-fraud-evil-plots-and-geophysics\/","title":{"rendered":"Benford\u2019s Law, Fraud, Evil Plots and Geophysics"},"content":{"rendered":"<p>As mentioned in an <a href=\"https:\/\/sites.temple.edu\/geophysics\/2017\/01\/19\/evil-plots\/\">earlier post<\/a>, I\u2019m teaching a new <a href=\"https:\/\/gened.temple.edu\/\">general education<\/a> class entitled \u201cEvil Plots\u201d about how graphs, maps and other forms of visual data communication can be used to persuade or mislead. \u00a0The idea is to teach students to be informed consumers of data. My hope is that along the way they will learn the foundations of proper data visualization and some tools they can apply in their own studies, whatever their major. Evil Plots is not just about graphical trickery. Sometimes the problem is not with the graph but with the data. Detecting fraudulent data is its own field, but one surprisingly effective and easy-to-implement technique is to test whether the data comply with Benford\u2019s Law.<\/p>\n<p>In brief, Benford discovered that if you have data that span many orders of magnitude (populations of cities in the U.S., the lengths of streams and rivers, even financial data), the numbers are roughly six times more likely to start with a 1 than with a 9. 
In fact, the frequency distribution looks like this (<a href=\"https:\/\/en.wikipedia.org\/wiki\/Benford%27s_law\">from Wikipedia<\/a>):<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/upload.wikimedia.org\/wikipedia\/commons\/thumb\/4\/46\/Rozklad_benforda.svg\/512px-Rozklad_benforda.svg.png\" alt=\"A sequence of decreasing blue bars against a light gray grid background\" \/><\/p>\n<p><span style=\"font-weight: 400\">This seems counterintuitive, but it is actually a consequence of the numbers spanning such a wide range of values. \u00a0The law fails for quantities that have a limited range, such as people\u2019s height in meters, where I challenge you to find a value starting with 9. There is a nice video introduction to Benford\u2019s Law <\/span><a href=\"https:\/\/www.youtube.com\/watch?v=XXjlR2OK1kM\"><span style=\"font-weight: 400\">here<\/span><\/a><span style=\"font-weight: 400\">, and here is a short video on how Benford\u2019s Law can be used to <\/span><a href=\"https:\/\/www.businessinsider.com\/benfords-law-to-detect-financial-fraud-2014-12\"><span style=\"font-weight: 400\">detect fraud<\/span><\/a><span style=\"font-weight: 400\">. It seems that when people make up numbers to fudge their expense accounts or cook their books, few think to fabricate in accordance with Benford. One hopes I\u2019m not simply educating my students to be better cheats!<\/span><\/p>\n<p><span style=\"font-weight: 400\">Here is the simple case I had my students run by hand. The data are actually from a court case of fraudulent expenditures. 
Yes, Benford\u2019s Law is admissible in court as evidence of fraud!<\/span><\/p>\n<table>\n<tbody>\n<tr>\n<td><b>Checks<\/b><\/td>\n<td><\/td>\n<td><span style=\"font-weight: 400\">First Digit<\/span><\/td>\n<td><span style=\"font-weight: 400\">Count<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">$1,927.48<\/span><\/td>\n<td><\/td>\n<td><span style=\"font-weight: 400\">1<\/span><\/td>\n<td><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">$27,902.31<\/span><\/td>\n<td><\/td>\n<td><span style=\"font-weight: 400\">2<\/span><\/td>\n<td><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">$86,241.90<\/span><\/td>\n<td><\/td>\n<td><span style=\"font-weight: 400\">3<\/span><\/td>\n<td><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">$72,117.46<\/span><\/td>\n<td><\/td>\n<td><span style=\"font-weight: 400\">4<\/span><\/td>\n<td><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">$81,321.75<\/span><\/td>\n<td><\/td>\n<td><span style=\"font-weight: 400\">5<\/span><\/td>\n<td><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">$97,473.96<\/span><\/td>\n<td><\/td>\n<td><span style=\"font-weight: 400\">6<\/span><\/td>\n<td><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">$93,249.11<\/span><\/td>\n<td><\/td>\n<td><span style=\"font-weight: 400\">7<\/span><\/td>\n<td><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">$89,658.16<\/span><\/td>\n<td><\/td>\n<td><span style=\"font-weight: 400\">8<\/span><\/td>\n<td><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">$87,776.89<\/span><\/td>\n<td><\/td>\n<td><span style=\"font-weight: 400\">9<\/span><\/td>\n<td><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">$92,105.83<\/span><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">$79,949.16<\/span><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 
400\">$87,602.93<\/span><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">$96,879.27<\/span><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">$91,806.47<\/span><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">$84,991.67<\/span><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">$90,831.83<\/span><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">$93,766.67<\/span><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">$88,336.72<\/span><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">$94,639.49<\/span><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">$83,709.26<\/span><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">$96,412.21<\/span><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">$88,432.86<\/span><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">$71,552.16<\/span><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p><span style=\"font-weight: 400\">Just count how many of the numbers start with 1, 2, etc., then plot the distribution. Bam! Benford\u2019s got you!<\/span><\/p>\n<p><b>Benford and Geophysics<\/b><\/p>\n<p><span style=\"font-weight: 400\">Out of curiosity, I decided to run Benford\u2019s Law against some of my own geophysical data, specifically, apparent resistivity measurements from a sounding at a site on Temple\u2019s Ambler Campus. 
The details aren\u2019t important for Benford, but what is important is that apparent resistivity measurements typically follow a log-normal distribution, with orders of magnitude between the most and least resistive parts of a survey line. To do the calculation, I took advantage of a free package written in Python (<\/span><a href=\"https:\/\/github.com\/milcent\/benford_py\"><span style=\"font-weight: 400\">https:\/\/github.com\/milcent\/benford_py<\/span><\/a><span style=\"font-weight: 400\">). Isn\u2019t open source wonderful? The site also includes a photo of Benford and information on the history of his discovery (and he wasn&#8217;t the first!) along with math details.<\/span><\/p>\n<p><span style=\"font-weight: 400\">So I loaded my data into a Jupyter notebook and checked it against Benford. Here is the result:<\/span><\/p>\n<p><span style=\"font-weight: 400\"><a href=\"https:\/\/sites.temple.edu\/geophysics\/files\/2018\/10\/Screen-Shot-2018-10-30-at-7.31.50-PM.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-large wp-image-762\" src=\"https:\/\/sites.temple.edu\/geophysics\/files\/2018\/10\/Screen-Shot-2018-10-30-at-7.31.50-PM-1024x822.png\" alt=\"\" width=\"640\" height=\"514\" srcset=\"https:\/\/sites.temple.edu\/geophysics\/files\/2018\/10\/Screen-Shot-2018-10-30-at-7.31.50-PM-1024x822.png 1024w, https:\/\/sites.temple.edu\/geophysics\/files\/2018\/10\/Screen-Shot-2018-10-30-at-7.31.50-PM-300x241.png 300w, https:\/\/sites.temple.edu\/geophysics\/files\/2018\/10\/Screen-Shot-2018-10-30-at-7.31.50-PM-768x616.png 768w, https:\/\/sites.temple.edu\/geophysics\/files\/2018\/10\/Screen-Shot-2018-10-30-at-7.31.50-PM.png 1558w\" sizes=\"auto, (max-width: 640px) 100vw, 640px\" \/><\/a><\/span><span style=\"font-weight: 400\">Not a bad fit!<\/span><\/p>\n<p><span style=\"font-weight: 400\">This is only a cursory exploration, but I encourage you to test your own data sets against Benford\u2019s Law. 
Send me an email if you discover something interesting!<\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>As mentioned in an earlier post, I\u2019m teaching a new general education class entitled \u201cEvil Plots\u201d about how graphs, maps and other forms of visual data communication can be used to persuade or mislead. \u00a0The idea is to teach students &hellip; <a href=\"https:\/\/sites.temple.edu\/geophysics\/2018\/10\/30\/benfords-law-fraud-evil-plots-and-geophysics\/\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":606,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-760","post","type-post","status-publish","format-standard","hentry","category-uncategorized"],"_links":{"self":[{"href":"https:\/\/sites.temple.edu\/geophysics\/wp-json\/wp\/v2\/posts\/760","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/sites.temple.edu\/geophysics\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/sites.temple.edu\/geophysics\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/sites.temple.edu\/geophysics\/wp-json\/wp\/v2\/users\/606"}],"replies":[{"embeddable":true,"href":"https:\/\/sites.temple.edu\/geophysics\/wp-json\/wp\/v2\/comments?post=760"}],"version-history":[{"count":0,"href":"https:\/\/sites.temple.edu\/geophysics\/wp-json\/wp\/v2\/posts\/760\/revisions"}],"wp:attachment":[{"href":"https:\/\/sites.temple.edu\/geophysics\/wp-json\/wp\/v2\/media?parent=760"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/sites.temple.edu\/geophysics\/wp-json\/wp\/v2\/categories?post=760"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/sites.temple.edu\/geophysics\/wp-json\/wp\/v2\/tags?post=760"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}