
# Presenting statistics in social media 2012

Published on: Mar 4, 2016

#### Transcripts - Presenting statistics in social media 2012

• 1. Or You Can Lie with Statistics, but It’s a Lot Easier with Words. Paul Ricci, MS, PhD(c), @CSIwoDB
• 2. Everything is Numbers. Statistics are used to estimate and describe patterns in nature that aren’t easy to see with the naked eye. Sports: Earned Run Average, Slugging Percentage, QB Rating, Goals Against Average. Economics: Gross Domestic Product, Unemployment, Inflation. Medicine: Heart Rate, % Body Fat, T-Cell Counts. Education: IQ Scores, SAT Scores, Dropout Rates. As long as the statistic comes from a verifiable source of data, it’s hard to lie using it.
• 3. Ominous Quote. Joseph Stalin: “One death is a tragedy. A million deaths are a statistic.” Translation: you need to supplement statistical information with more personal information.
• 4. Types of Statistics. Measures of Central Tendency (aka Averages). Continuous: the value can be any number. Mean (sum of all data values divided by the number of data points); Median (midpoint of the data when ranked from highest to lowest); Mode (most frequently occurring data value). Discrete: the value can only take certain values, e.g. 0 or 1, true or false. Proportion: the number of data points taking a certain value for a given variable divided by the total number of data points.
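The averages on this slide can be sketched in a few lines of Python using only the standard library. The data values below are made up purely for illustration (they are not from the talk), and the variable names are hypothetical.

```python
import statistics

# Hypothetical sample (made-up values): comments received on seven blog posts.
comments = [2, 3, 3, 5, 7, 10, 40]

mean = statistics.mean(comments)      # sum of all values / number of values
median = statistics.median(comments)  # middle value once the data are ranked
mode = statistics.mode(comments)      # most frequently occurring value

# Hypothetical discrete (0/1) variable: did each of eight respondents vote?
voted = [1, 0, 1, 1, 0, 1, 1, 0]
proportion = sum(voted) / len(voted)  # share of respondents coded as 1

print("mean:", mean)
print("median:", median)
print("mode:", mode)
print("proportion:", proportion)
```

In this made-up sample the single large value (40) pulls the mean well above the median, which is one reason to report more than one kind of average.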
• 5. Types of Statistics (cont.) Measures of Spread. Range: highest data value minus lowest data value. Variance: average squared deviation from the mean. Standard Deviation: square root of the variance. Probability: used to measure the chance of events; also used to make a statement about the relationship between a sample and the population it is taken from, e.g. margin of error.
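A minimal Python sketch of the measures of spread and a margin of error follows. The data and the poll numbers are made up for illustration. The variance here divides by the number of data points, matching the slide’s “average squared deviation” wording (many packages instead divide by n - 1 for a sample). The margin-of-error formula is an assumption not stated in the talk: the common normal approximation for a proportion from a simple random sample at roughly 95% confidence.

```python
import math
import statistics

# Hypothetical data (made-up values).
data = [2, 4, 4, 4, 5, 5, 7, 9]

data_range = max(data) - min(data)     # highest value minus lowest value
variance = statistics.pvariance(data)  # average squared deviation from the mean
std_dev = statistics.pstdev(data)      # square root of the variance

# Margin of error for a poll proportion (assumed normal approximation):
#   z * sqrt(p * (1 - p) / n), with z = 1.96 for roughly 95% confidence.
p = 0.5    # hypothetical share of respondents choosing an option
n = 1000   # hypothetical number of respondents
margin_of_error = 1.96 * math.sqrt(p * (1 - p) / n)

print("range:", data_range)
print("variance:", variance)
print("standard deviation:", std_dev)
print("margin of error:", round(margin_of_error, 3))
```

With these assumed inputs the margin of error comes out near three percentage points, the figure often quoted for polls of about 1,000 people.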
• 6. But a Summary Statistic Can Never Tell You the Whole Story. [Figures: Graph with States; Graph without States]
• 7. Graph Types. Bar Graph: good visually but not good for trends over time. Line Graph: better for showing trends over time. [Figures: example bar graph and line graph of Product A, Product B, and Product C over time 1 to time 4]
• 8. Graph Types (cont.) This is the first pie chart created by Florence Nightingale to show the number of British soldiers in the Crimean War who died due to infection rather than combat injuries.
• 9. Graph Types (cont.) Mapping using Geographical Information Systems (GIS) is a good way to represent data by region. In this graph I showed which areas of the city had the highest number of crimes, by census tract, in 2005.
• 10. Posting Graphs on the Web. Line, bar, pie, and other graphs can be created using Microsoft Excel, SPSS, SAS, ArcGIS, R, and other packages. If the package allows you to save the graph as a .jpg, .gif, or .png file, you can easily add it to your blog. Microsoft Excel requires a Visual Basic command to save graphs as image files.
• 11. Statistical Packages. Microsoft Excel: most readily available, but not really built for anything beyond basic statistical analysis. OK for making basic graphs. SPSS: better for more advanced analysis and graphics, but less accessible due to cost. User friendly. R: free software package that can be downloaded from the web. Can do many types of analyses, BUT it is syntax driven. Can save graphics as image files using syntax.
• 12. Cutting Edge Graphics. The Gapminder Institute provides great free interactive graphics, which can be seen in the documentary The Joy of Stats. URL: www.gapminder.org Joy of Stats Clip: http://csiwodeadbodies.blogspot.com/2010/12/income-and-life-expectancy-what-does-it.html The website Fractracker uses advanced graphics and mapping techniques to monitor the impacts of Marcellus Shale drilling in Pennsylvania and New York. URL: http://www.fractracker.org/
• 13. Poor Statistical Reasoning Example. The blog The Audacious Epigone posted an analysis of the IQs of a sample of McCain and Obama voters, which can be seen at http://anepigone.blogspot.com/2011/05/iq-wars-mccains-voters-win.html
• 14. Some Good Statistical Blogs. FiveThirtyEight: Nate Silver’s blog, which forecasts elections, the Oscars, and sporting events. http://fivethirtyeight.blogs.nytimes.com/ Data Visualisation: has more examples of cutting edge graphics. http://www.datavis.ca/ The Incidental Economist: good analysis of health care data. http://theincidentaleconomist.com/wordpress/ CSI without Dead Bodies: my own website. http://csiwodeadbodies.blogspot.com
• 15. Sources of Data on the Web. Many websites, such as the Census Bureau’s, provide data for download with which to do your own analysis. Example: Small Area Health Insurance Estimates (SAHIE) makes state and county level estimates for the whole US from 2005-2007 (2008 and 2009 estimates are forthcoming). http://www.census.gov/did/www/sahie/index.html Other sites provide data that can be copied and pasted into a data file. Example: CNN makes its poll reports available as PDFs but not the raw data.
• 16. Summary. When analyzing data, leave no stone unturned, or if that is impossible, turn over as many as possible and acknowledge that you couldn’t turn all of them over. When interpreting an analysis, ask yourself whether the authors have turned over the important stones and/or accounted for the ones that they couldn’t turn over.