<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-5849016128097481347</id><updated>2012-02-02T19:57:26.929-08:00</updated><category term='psychology'/><category term='extreme nerdery'/><category term='babies'/><category term='apologies to JJ'/><category term='Duke stats'/><category term='vicious animals'/><category term='princesses'/><category term='celebrities'/><category term='magic'/><category term='basic stats'/><category term='Brazil'/><category term='losers'/><category term='spooky'/><category term='death'/><category term='I told you so'/><category term='weird'/><category term='math-fun'/><category term='money'/><title type='text'>KL Divergence</title><subtitle type='html'></subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://kldivergence.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5849016128097481347/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://kldivergence.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><author><name>KL</name><uri>http://www.blogger.com/profile/02640977217346337188</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='23' height='32' src='http://4.bp.blogspot.com/_cJxplqpwZsg/Szz_pZBQ0gI/AAAAAAAABCM/Z-Y8aICWiBY/S220/bloggy.jpg'/></author><generator version='7.00' 
uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>20</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-5849016128097481347.post-3515480201942872513</id><published>2011-12-05T14:34:00.000-08:00</published><updated>2011-12-05T14:34:14.732-08:00</updated><title type='text'>I'm baaaa-aaaack.</title><content type='html'>You probably didn't realize I was gone. That's ok. Just pretend like you missed me.&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Anyway, I'm fresh off of a quarter-life crisis induced year of international wanderings. In case you are wondering (again, just pretend), after my postdoc in Brazil, I stayed for a while and did whatever you do on the beach (absolutely nothing) for a few months. Then off to Germany for some climbing, followed by Colombia and Ecuador, and then up to California. Back down to Peru to do the Inca Trail with some buddies, and then a roadtrip around Europe in a rented car. (Sentence fragments aren't bad if you're blogging. 
Promise.)&amp;nbsp;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;So, in the interest of undoing some of the brain atrophy I've experienced over the last year, expect to see a new post every once in a while.&lt;br /&gt;&lt;br /&gt;And in other news, I am moving my blog to my vanity site, &lt;a href="http://www.kristianlum.com/KLdivergence"&gt;www.kristianlum.com/KLdivergence&lt;/a&gt;.&amp;nbsp;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5849016128097481347-3515480201942872513?l=kldivergence.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kldivergence.blogspot.com/feeds/3515480201942872513/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kldivergence.blogspot.com/2011/12/im-baaaa-aaaack.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5849016128097481347/posts/default/3515480201942872513'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5849016128097481347/posts/default/3515480201942872513'/><link rel='alternate' type='text/html' href='http://kldivergence.blogspot.com/2011/12/im-baaaa-aaaack.html' title='I&apos;m baaaa-aaaack.'/><author><name>KL</name><uri>http://www.blogger.com/profile/02640977217346337188</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='23' height='32' src='http://4.bp.blogspot.com/_cJxplqpwZsg/Szz_pZBQ0gI/AAAAAAAABCM/Z-Y8aICWiBY/S220/bloggy.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5849016128097481347.post-1114007066786982970</id><published>2011-04-03T18:00:00.000-07:00</published><updated>2011-04-03T18:04:16.379-07:00</updated><title type='text'>Brasil ranks 31st out of 44 in English proficiency</title><content type='html'>A few months ago, I did a 
&lt;a href="http://kldivergence.blogspot.com/2010/10/if-your-first-language-is-klingon-you.html"&gt;post&lt;/a&gt; about my guess that someone whose first language is widely spoken would be less likely to speak English than someone whose first language is relatively obscure. It looks like I've been outdone.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.ef.com/"&gt;English First&lt;/a&gt;&amp;nbsp;has done a study that assesses the English proficiency of adults in various countries. From this, they have put together an English proficiency index and made some pretty nifty maps and plots. &lt;br /&gt;&lt;br /&gt;The English First folks also investigated the same phenomenon that I did in my post. Clearly they have a much bigger budget (greater than $0) for doing these sorts of things, and they didn't just cull their data from Wikipedia, so I tend to go with what they say. Good thing their results support my own-- again, that people whose first language is shared by many are less likely to speak English. However, the relationship they found was "weak." 
See below.&lt;br /&gt;&lt;br /&gt;&lt;div style="height: auto; margin: 20px auto; width: 756px;"&gt;&lt;object align="middle" classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" codebase="http://fpdownload.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=8,0,0,0" height="300" id="ammap" width="600"&gt;&lt;param name="allowScriptAccess" value="sameDomain"&gt;&lt;param name="movie" value="http://www.ef.com/EPI/media/native/amxy/amxy.swf?&amp;path=http://www.ef.com&amp;data_file=http://www.ef.com/EPI/media/native/amxy/amxy_data.xml&amp;settings_file=http://www.ef.com/EPI/media/native/amxy/amxy_settings.xml"&gt;&lt;param name="quality" value="high"&gt;&lt;param name="bgcolor" value="#ffffff"&gt;&lt;embed src="http://www.ef.com/EPI/media/native/amxy/amxy.swf?&amp;path=http://www.ef.com&amp;data_file=http://www.ef.com/EPI/media/native/amxy/amxy_data.xml&amp;settings_file=http://www.ef.com/EPI/media/native/amxy/amxy_settings.xml" quality="high" bgcolor="#ffffff" width="756" height="460" name="0" align="middle" allowScriptAccess="sameDomain" type="application/x-shockwave-flash" pluginspage="http://www.macromedia.com/go/getflashplayer"&gt;&lt;/embed&gt;&lt;/object&gt;&lt;a href="http://www.ef.com/" id="logo"&gt;&lt;img alt="Study abroad with EF" border="0" height="54" src="http://media.ef.com/sitecore/__~/media/efcom/universal/ef-logos/logo.png?h=54&amp;amp;w=59" width="59" /&gt;&lt;/a&gt;&lt;strong&gt;&lt;a href="http://www.ef.com/epi/"&gt;&lt;img alt="EF EPI" border="0" height="30" src="http://media.ef.com/sitecore/__~/media/efcom/epi/shared-content/program-name/epi_logo.png?h=30&amp;amp;w=100" width="100" /&gt;&lt;/a&gt;&lt;/strong&gt;&lt;/div&gt;&lt;br /&gt;If you're upset by the fact that the relationship here appears to be in the opposite direction of that which I found earlier, don't be. I was looking at the &lt;i&gt;negative&lt;/i&gt;&amp;nbsp;log of the number of native speakers. 
Why I transformed the data like that, I don't actually remember, but rest assured that this is showing roughly the same thing. Of course, this isn't exactly the same thing, the most obvious reason being that they are looking at "English proficiency", whereas I was looking at the "percent of English speakers."&lt;br /&gt;&lt;br /&gt;They also compare English proficiency to various other variables they believe should be related, such as &amp;nbsp;the value of exports per capita, the average number of years of schooling, and gross national income per capita. All of these had a stronger relationship to the English proficiency than the native speakers variable.&lt;br /&gt;&lt;br /&gt;One last mildly interesting nugget of information, which was mentioned in &lt;a href="http://oglobo.globo.com/educacao/mat/2011/04/01/brasil-ocupa-31-posicao-em-habilidade-de-ingles-entre-adultos-em-ranking-mundial-924137203.asp"&gt;the Brazilian article&lt;/a&gt;&amp;nbsp;that pointed me to the English First study and website, is that all of the BRIC countries fall right in line. China, India, Brazil, and Russia took the 29th, 30th, 31st, and 32nd spots respectively. 
The article also pointed out that, although Brazil did not do so well worldwide in this ranking, at least it beat Venezuela and Chile!&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5849016128097481347-1114007066786982970?l=kldivergence.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kldivergence.blogspot.com/feeds/1114007066786982970/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kldivergence.blogspot.com/2011/04/brasil-ranks-31st-out-of-44-in-english.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5849016128097481347/posts/default/1114007066786982970'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5849016128097481347/posts/default/1114007066786982970'/><link rel='alternate' type='text/html' href='http://kldivergence.blogspot.com/2011/04/brasil-ranks-31st-out-of-44-in-english.html' title='Brasil ranks 31st out of 44 in English proficiency'/><author><name>KL</name><uri>http://www.blogger.com/profile/02640977217346337188</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='23' height='32' 
src='http://4.bp.blogspot.com/_cJxplqpwZsg/Szz_pZBQ0gI/AAAAAAAABCM/Z-Y8aICWiBY/S220/bloggy.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5849016128097481347.post-3586376482796428942</id><published>2011-03-27T09:44:00.000-07:00</published><updated>2011-03-27T09:44:35.841-07:00</updated><title type='text'>The Anne Hathaway Effect</title><content type='html'>&lt;span class="Apple-style-span" style="font-family: inherit;"&gt;I recently stumbled upon &lt;/span&gt;&lt;a href="http://www.huffingtonpost.com/dan-mirvish/the-hathaway-effect-how-a_b_830041.html"&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;this article in the Huffington Post&lt;/span&gt;&lt;/a&gt;&lt;span class="Apple-style-span" style="font-family: inherit;"&gt;&amp;nbsp;which claims that every time Anne Hathaway gets a lot of Internet attention (for releasing a movie, hosting the Oscars, or what have you), the stock price for Berkshire Hathaway shoots up. The author, Dan Mirvish, justifies the plausibility of this by saying that "&lt;/span&gt;&lt;span class="Apple-style-span" style="font-family: inherit; line-height: 20px;"&gt;My guess is that all those automated, robotic trading programming are picking up the same chatter on the internet about "Hathaway" as the IMDb's StarMeter, and they're applying it to the stock market."&amp;nbsp;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Georgia, Century, Times, serif; line-height: 20px;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="line-height: 20px;"&gt;&lt;span class="Apple-style-span" style="font-family: inherit;"&gt;The data they use to support the claim is that&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;span class="Apple-style-span" style="color: purple; font-family: Georgia, 'Times New Roman', serif;"&gt;&lt;span class="Apple-style-span" style="line-height: 20px;"&gt;Oct. 
3, 2008 -&amp;nbsp;&lt;/span&gt;&lt;span class="Apple-style-span" style="line-height: 20px;"&gt;&lt;em style="border-bottom-style: none; border-color: initial; border-left-style: none; border-right-style: none; border-top-style: none; border-width: initial; font-style: italic !important; list-style-image: initial; list-style-position: initial; list-style-type: none; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;"&gt;Rachel Getting Married&lt;/em&gt;&lt;/span&gt;&lt;span class="Apple-style-span" style="line-height: 20px;"&gt;&amp;nbsp;opens: BRK.A up .44%&lt;/span&gt;&lt;br /&gt;Jan. 5, 2009 -&amp;nbsp;&lt;em style="border-bottom-style: none; border-color: initial; border-left-style: none; border-right-style: none; border-top-style: none; border-width: initial; font-style: italic !important; list-style-image: initial; list-style-position: initial; list-style-type: none; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;"&gt;Bride Wars&lt;/em&gt;&amp;nbsp;opens: BRK.A up 2.61%&lt;br /&gt;Feb. 
8, 2010 -&amp;nbsp;&lt;em style="border-bottom-style: none; border-color: initial; border-left-style: none; border-right-style: none; border-top-style: none; border-width: initial; font-style: italic !important; list-style-image: initial; list-style-position: initial; list-style-type: none; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;"&gt;Valentine's Day&lt;/em&gt;&amp;nbsp;opens: BRK.A up 1.01%&lt;br /&gt;March 5, 2010 -&amp;nbsp;&lt;em style="border-bottom-style: none; border-color: initial; border-left-style: none; border-right-style: none; border-top-style: none; border-width: initial; font-style: italic !important; list-style-image: initial; list-style-position: initial; list-style-type: none; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;"&gt;Alice in Wonderland&lt;/em&gt;&amp;nbsp;opens: BRK.A up .74%&lt;br /&gt;Nov. 24, 2010 -&amp;nbsp;&lt;em style="border-bottom-style: none; border-color: initial; border-left-style: none; border-right-style: none; border-top-style: none; border-width: initial; font-style: italic !important; list-style-image: initial; list-style-position: initial; list-style-type: none; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;"&gt;Love and Other Drugs&lt;/em&gt;&amp;nbsp;opens: BRK.A up 1.62%&lt;br /&gt;Nov. 
29, 2010 - Anne announced as co-host of the Oscars: BRK.A up .25%&lt;/span&gt;&lt;/blockquote&gt;&lt;span class="Apple-style-span" style="font-family: Georgia, Century, Times, serif; line-height: 20px;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="line-height: 20px;"&gt;&lt;span class="Apple-style-span" style="font-family: inherit;"&gt;I think the first commenter put it well when s/he said&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;span class="Apple-style-span" style="line-height: 20px;"&gt;&lt;span class="Apple-style-span" style="color: purple; font-family: Georgia, 'Times New Roman', serif;"&gt;"First!"&lt;/span&gt;&lt;/span&gt;&lt;/blockquote&gt;&lt;span class="Apple-style-span" style="font-family: inherit;"&gt;Nah, just kidding.&lt;/span&gt; &lt;a href="http://www.huffingtonpost.com/social/MandatedLoginsAreAsinine/the-hathaway-effect-how-a_b_830041_81268996.html"&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;Here's what they really said&lt;/span&gt;&lt;/a&gt;:&lt;br /&gt;&lt;blockquote&gt;&lt;span class="Apple-style-span" style="color: purple; font-family: Georgia, 'Times New Roman', serif;"&gt;&lt;span class="Apple-style-span" style="line-height: 16px;"&gt;This is junk statistics if I've ever seen it. There may be something to the automated trading idea, but these data are proof of nothing. How about the hundreds of other times Ms. Hathaway was in the news and the stock didn't rise so dramatically? How volatile is this stock normally? 
Are these percentage increases anything out of the ordinary?&lt;/span&gt;&lt;span class="Apple-style-span" style="line-height: 16px;"&gt;&lt;br style="border-bottom-style: none; border-color: initial; border-left-style: none; border-right-style: none; border-top-style: none; border-width: initial; list-style-image: initial; list-style-position: initial; list-style-type: none; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;" /&gt;&lt;/span&gt;&lt;span class="Apple-style-span" style="line-height: 16px;"&gt;Exasperated, I decided to d a quick test. I downloaded the BRK.A data from Jan. 1, 2008 to Mar. 18, 2011 from YAHOO Finance and did a trivial analysis of it in Matlab. Just looking at the difference between open and close prices, the stock was up 0.25% or more 308 times over this period. The stock was up 2.61% or more 47 times over this period. Those two percentages are the lowest and highest in Mr. 
Mirvish's "data."&lt;/span&gt;&lt;span class="Apple-style-span" style="line-height: 16px;"&gt;&lt;br style="border-bottom-style: none; border-color: initial; border-left-style: none; border-right-style: none; border-top-style: none; border-width: initial; list-style-image: initial; list-style-position: initial; list-style-type: none; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;" /&gt;&lt;/span&gt;&lt;span class="Apple-style-span" style="line-height: 16px;"&gt;As a scientist and math lover I've disappointed to see this story making the rounds with so little skepticism. 
It's a statement for the level of understanding of statistics and probability by the general public.&lt;/span&gt;&lt;/span&gt;&lt;/blockquote&gt;&lt;span class="Apple-style-span" style="font-family: inherit;"&gt;Looks like I'm not the only mathbuster out there.&amp;nbsp;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: inherit;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: inherit;"&gt;My first complaint about this (and backing up commenter number 1) is that, as someone who does not follow stocks at all, I have no idea if a .74% increase in BRK.A is anything notable. &amp;nbsp;Having downloaded the stock prices since 2008 from Google Finance, I can tell you that it isn't. &amp;nbsp;When &lt;i&gt;Rachel Getting Married&lt;/i&gt; opened, the .44% increase was in the 68th percentile of changes in price... including negative changes. It was only in the 32nd percentile of positive changes. Even the biggest change of 2.61% is only in the 92nd percentile overall. Certainly not a tail event. 
&amp;nbsp;Getting to the point, it's not like every time Anne Hathaway gets naked with Jake Gyllenhaal, the stockholders all go out and buy themselves a brand new G6. It's a pretty normal fluctuation.&amp;nbsp;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: inherit;"&gt;Over the period from 2008 to yesterday, the stock increased about 47% of the time. Since we are apparently completely disregarding the magnitude of the change, the probability of getting all positive changes when randomly selecting 6 dates out of the 828 trading days is quite small (about 1%, since 0.47^6 is roughly 0.011). But what would be the chances of looking at, say, 10 different dates and finding that 6 or more of them are positive? If we ignore the issue of replacement (which shouldn't be horribly important since there are 828 trading days and we are only sampling 10) and treat the number of positive days as binomial with n = 10 and p = 0.47, the probability of getting exactly 6 is about 18%, and the probability of getting 6 or more is about 31%.&amp;nbsp;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: inherit;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: inherit;"&gt;Given that the hypothesis is that the stock price is getting this little upward nudge because of Internet chatter, I checked out Google Trends to find other likely dates that the stock should increase under this hypothesis. 
Luckily, Google even shows you what the major news stories are on some of the major peaks, so it is easy to figure out the date.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td style="text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/-E08JhY1AzZs/TY9dq2g_HtI/AAAAAAAABZE/ykImcM-FE9g/s1600/AnnTrend1.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"&gt;&lt;img border="0" height="178" src="http://3.bp.blogspot.com/-E08JhY1AzZs/TY9dq2g_HtI/AAAAAAAABZE/ykImcM-FE9g/s400/AnnTrend1.png" width="400" /&gt;&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="tr-caption" style="text-align: center;"&gt;Google Trends for Anne Hathaway&lt;br /&gt;The top line is search volume and the bottom is news volume. They pick out many of the same spikes.&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;div&gt;&lt;br /&gt;Two big peaks we see on here that haven't already been accounted for in the original post are B,&amp;nbsp;&lt;span class="Apple-style-span" style="font-family: arial, sans-serif; line-height: 18px;"&gt;&lt;a href="http://moviesblog.mtv.com/2009/02/23/anne-hathaway-proclaims-love-for-family-guy-aqua-teen-fulfills-nerd-vision-of-idealized-woman/" style="color: #0000cc;"&gt;Anne Hathaway Proclaims Love For ‘Family Guy,’ ‘Aqua Teen,’ Fulfills Nerd Vision Of Idealized Woman&lt;/a&gt;, &lt;/span&gt;&lt;span class="Apple-style-span" style="line-height: 18px;"&gt;&lt;span class="Apple-style-span" style="font-family: inherit;"&gt;on February 23, 2009 and C&lt;/span&gt;&lt;/span&gt;&lt;span class="Apple-style-span" style="font-family: arial, sans-serif; line-height: 18px;"&gt;,&amp;nbsp;&lt;/span&gt;&lt;span class="Apple-style-span" style="font-family: arial, sans-serif; line-height: 18px;"&gt;&lt;a 
href="http://www.expressindia.com/latest-news/Anne-Hathaway-spends-spare-time-studying-physics/574082/" style="color: #0000cc;"&gt;Anne Hathaway spends spare time studying physics&lt;/a&gt;, &lt;/span&gt;&lt;span class="Apple-style-span" style="line-height: 18px;"&gt;&lt;span class="Apple-style-span" style="font-family: inherit;"&gt;on February 2, 2010. On these two dates, BRK.A saw a 1.82% and .11% &lt;i&gt;decrease&lt;/i&gt;&amp;nbsp;respectively. &amp;nbsp;Further, when on June 20, 2008 the Los Angeles Times posted a story called&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;span class="Apple-style-span" style="font-family: arial, sans-serif; line-height: 18px;"&gt;&lt;a href="http://www.latimes.com/entertainment/news/movies/la-alba-hathaway-2008-pg,0,214131.photogallery" style="color: #0000cc;"&gt;Anne Hathaway versus Jessica Alba&lt;/a&gt;&amp;nbsp;&lt;/span&gt;&lt;span class="Apple-style-span" style="line-height: 18px;"&gt;&lt;span class="Apple-style-span" style="font-family: inherit;"&gt;&amp;nbsp;resulting in the very visible spike in 2008&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;span class="Apple-style-span" style="line-height: 18px;"&gt;(I guess everyone likes a good ladyfight)&lt;/span&gt;&lt;span class="Apple-style-span" style="line-height: 18px;"&gt;, BRK.A experienced a -.79% change. On the opening day of &lt;i&gt;Get Smart, &lt;/i&gt;June 20, 2008, BRK.A fell .79%, and if we go back just a little bit further to December 9, 2005, the day that &lt;i&gt;Brokeback Mountain&lt;/i&gt;&amp;nbsp;had its major opening in the US, BRK.A dropped .07%. 
In fact, the sample correlation between Anne Hathaway's Internet search traffic and the price of BRK.A for 2008 to yesterday was just .01-- basically uncorrelated.**&amp;nbsp;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="line-height: 18px;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="line-height: 18px;"&gt;Given all of this, I'm really hoping that Dan Mirvish didn't run out and buy up a bunch of BRK.A hoping that his post would force the price up a bit. :)&amp;nbsp;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="line-height: 18px;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="line-height: 18px;"&gt;**This, of course, does not rule out the possibility that the fancy trading algorithms only act based on spikes in search volume, not normal activity, but just sayin'...&amp;nbsp;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: arial, sans-serif; font-size: x-small;"&gt;&lt;span class="Apple-style-span" style="line-height: 18px;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5849016128097481347-3586376482796428942?l=kldivergence.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kldivergence.blogspot.com/feeds/3586376482796428942/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kldivergence.blogspot.com/2011/03/anne-hathaway-effect.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5849016128097481347/posts/default/3586376482796428942'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5849016128097481347/posts/default/3586376482796428942'/><link rel='alternate' type='text/html' 
href='http://kldivergence.blogspot.com/2011/03/anne-hathaway-effect.html' title='The Anne Hathaway Effect'/><author><name>KL</name><uri>http://www.blogger.com/profile/02640977217346337188</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='23' height='32' src='http://4.bp.blogspot.com/_cJxplqpwZsg/Szz_pZBQ0gI/AAAAAAAABCM/Z-Y8aICWiBY/S220/bloggy.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/-E08JhY1AzZs/TY9dq2g_HtI/AAAAAAAABZE/ykImcM-FE9g/s72-c/AnnTrend1.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5849016128097481347.post-50398948704425689</id><published>2011-03-16T13:32:00.000-07:00</published><updated>2011-03-21T09:53:30.415-07:00</updated><title type='text'>Text me where the buildings are, and I'll tell you where the building damage is.</title><content type='html'>Back in October 2010, Patrick Meier posted an article called &lt;a href="http://irevolution.net/2010/10/13/crowdsourced-prediction/"&gt;How Crowdsourced Data Can Predict Crisis Impact: Findings from Empirical Study on Haiti &lt;/a&gt;on his blog, &lt;a href="http://irevolution.net/"&gt;iRevolution&lt;/a&gt;. It might be worth your time to go skim that really quickly if you want to get the biggest bang for your buck as you continue reading this... go ahead, I'll wait.&lt;br /&gt;&lt;br /&gt;If you did your homework, you already know that in his blog post, he recaps some pretty interesting results from a &amp;nbsp;team at the European Commission's Joint Research Center (JRC). The researchers who did this study were very awesome and sent me the original paper along with some hints as to how they did their analysis. If you want the paper, which appears in Conference Proceedings from the 2nd International Workshop on Validation of Geo-Information Products for Crisis Management, you'll have to track down the proceedings. 
Alternatively, you can watch the &lt;a href="http://www.youtube.com/watch?v=vyrTgXlerYM"&gt;presentation video&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="border-collapse: collapse;"&gt;&lt;span class="Apple-style-span" style="font-family: inherit;"&gt;Meier wrote that the JRC team used the SMS reports mapped on the Ushahidi-Haiti platform "to show that this crowdsourced data can help predict the spatial distribution of structural damage in Port-au-Prince&lt;/span&gt;&lt;span class="Apple-style-span" style="font-family: arial, sans-serif; font-size: x-small;"&gt;".&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&amp;nbsp;The SMS messages they used were collected starting just four days after the disaster and were sent by Haitians with their "location and urgent needs." Through the magic of spatial statistics, these researchers show that they are able to predict the locations of building damage using the SMS data. They point out that in the event of an emergency such as the Port-au-Prince earthquake, this sort of prediction would be very useful because it is cheap and real-time. You don't need a small army of &amp;nbsp;"some 600 experts from 23 different countries" and the World Bank to assess detailed satellite imagery to pinpoint the damaged buildings. All you'd really need is a much smaller sample of damaged buildings with which to correlate the SMS data, and voila! As you get more SMS data, you would be able to predict where more building damage is (read: people needing help are).&lt;br /&gt;&lt;br /&gt;Let's start by taking a look at some of the figures from the paper that support this claim. &amp;nbsp;Figure 1 in this blog (Figures 4 and 5 in the paper) shows a derivative of Ripley's K-function, the L function, which essentially determines whether same-type events (top row) or different-type events (bottom row) can be said to cluster together at various distances. 
Remember that this paper's main idea is to show that &amp;nbsp;building damage is clustered near SMS messages. One type of event is an SMS message, and the other type is a highly damaged building, as judged by the previously mentioned "experts". The data are the locations of each of these types of events across a 9km x 9km square that comprises the city of Port-au-Prince. The horizontal axis, across which this L function is calculated, represents the distance between the locations of events. The green lines are 80% confidence intervals. In a nutshell, if the black line (the calculated L statistic) falls above the green line at any point, then we are to think that within this radius around any given event, events of the same type (top row) or different type (bottom row) are more likely to occur than they would be by chance. So, for example, if we look in the bottom right plot of Figure 1, we find that for radii between about 1000m and 3000m from any SMS message, we are likely to find a higher-than-average number of damaged buildings. Hence the usefulness of the SMS messages in this situation.&lt;br /&gt;&lt;br /&gt;&lt;table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td style="text-align: center;"&gt;&lt;a href="https://lh4.googleusercontent.com/-tHvGLSw5ILo/TYD5xoYdGYI/AAAAAAAABYA/vKBC47wYOq0/s1600/80PercentCIFromPaperPNG.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"&gt;&lt;img border="0" height="365" src="https://lh4.googleusercontent.com/-tHvGLSw5ILo/TYD5xoYdGYI/AAAAAAAABYA/vKBC47wYOq0/s400/80PercentCIFromPaperPNG.png" width="400" /&gt;&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="tr-caption" style="text-align: center;"&gt;Figure 1: L statistic from original paper&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;br /&gt;But, let's think about this for a second. 
Does it really make sense that this would be the case for a radius of 2km but not 500m? That is, would it really make sense to believe that people are texting for help 2km away from major building damage but not right near the site? Sure, I guess I could buy that. I suppose it could be the case that people very close to the damaged buildings are either dead or incapacitated and thus unable to send SMS messages. I wouldn't expect this to be the case up to a kilometer away from the most damaged buildings, but I'll go with it for now. Secondly, how useful is it to know that there are likely to be damaged buildings within a 2km radius of any text? If we assume that we don't already have a good idea of where buildings are without the text messages, my high school geometry tells me that this 2km radius implies an area of about 12 and a half square kilometers in which we blindly search to find the expected extra building damage. Even subtracting off that inner radius, where there is not likely to be extra damage, we're still left with almost 10 square kilometers. Again, I'll go with it. Maybe the information from all of the text messages combined gives more practically useful information.&lt;br /&gt;&lt;br /&gt;The most convincing graphic from this paper (labeled as Figure 7 from their paper, and Figure 2 in my blog) is that which shows the observed density of building damage next to the predicted building damage density given SMS messages. &amp;nbsp;Yep, I agree that this passes the eyeball test. 
It does look like SMS messages are doing a pretty good job of sniffing out building damage.&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td style="text-align: center;"&gt;&lt;a href="https://lh4.googleusercontent.com/-MVVw1vXN5Vk/TYD7bLlHjhI/AAAAAAAABYE/-MyxVaFCXbA/s1600/FromPaperDensitiesPNG.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"&gt;&lt;img border="0" height="316" src="https://lh4.googleusercontent.com/-MVVw1vXN5Vk/TYD7bLlHjhI/AAAAAAAABYE/-MyxVaFCXbA/s640/FromPaperDensitiesPNG.png" width="640" /&gt;&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="tr-caption" style="text-align: center;"&gt;Figure 2: Predicted and observed building damage density from original paper.&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;div&gt;Alright, now let's take a closer look. I also got a hold of the larger data sources used in this analysis. Because the paper does not list the exact boundaries they used to define Port-au-Prince in their data set, I tried to recreate their data set based on the number of events they reported to have included in the analysis and guessing what the boundaries of their plots were by finding landmarks on a map. After many hours of trying to find a subset of these larger datasets to match SMS and building damage data sets used in the above analysis perfectly, I emerged with something that is hopefully sufficiently similar. &amp;nbsp;First, because I will be doing some statistics and thus no one will trust me (&lt;a href="http://en.wikipedia.org/wiki/Lies,_damned_lies,_and_statistics"&gt;thanks a lot, Mark Twain&lt;/a&gt;), I reproduce the above plots using my datasets. 
Although it looks like I cut off a little bit of space over on the right when trying to match their dataset, for all intents and purposes, I think I've got the same thing. They've got 1645 SMS messages, and I've got 1651. They use 33,800 damaged building locations, while I use 33,153. Although the plots that I have reproduced (Figures 3 and 4) are not *exactly* the same as those presented in the paper (above), I think they are similar enough to conclude I am doing the same thing they are given that the datasets are slightly different and some of these plots require some tuning parameters. I'm satisfied.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td style="text-align: center;"&gt;&lt;br /&gt;&lt;a href="https://lh5.googleusercontent.com/-emTetdgr8ok/TYD8-pHa2MI/AAAAAAAABYI/LaYuclw1yrE/s1600/80PercentCIDamagedPNG.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"&gt;&lt;img border="0" height="400" src="https://lh5.googleusercontent.com/-emTetdgr8ok/TYD8-pHa2MI/AAAAAAAABYI/LaYuclw1yrE/s400/80PercentCIDamagedPNG.png" width="400" /&gt;&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="tr-caption" style="text-align: center;"&gt;Figure 3: My reproduction of the L statistic plots that appear in the original paper using my dataset.&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;br /&gt;&lt;br /&gt;&lt;table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td style="text-align: center;"&gt;&lt;a href="https://lh5.googleusercontent.com/-e5rd9kIgKH4/TYEB3EHbuhI/AAAAAAAABYM/am9gpasVeIs/s1600/BuildingDamReproduce.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"&gt;&lt;img border="0" height="296" 
src="https://lh5.googleusercontent.com/-e5rd9kIgKH4/TYEB3EHbuhI/AAAAAAAABYM/am9gpasVeIs/s640/BuildingDamReproduce.png" width="640" /&gt;&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="tr-caption" style="text-align: center;"&gt;Figure 4: (left) Fitted conditional density of building damage given SMS messages. (right) Observed density of building damage. Both of these plots were produced from my datasets and are intended as reproductions of the plots in the original paper.&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;br /&gt;My first main question upon reading this paper was whether these text messages were specifically picking out damaged buildings or whether they were simply finding areas of high building density. After all, people send the text messages and people do tend to be in areas with lots of buildings. I re-ran the same analysis with a random sample of 1000 buildings. This is as opposed to the previous plots which were run with a random sample of 1000 &lt;i&gt;damaged&lt;/i&gt; buildings. Proceeding with their 80% confidence interval convention, &amp;nbsp;I find very similar results. For radii of about 1.5-3km, SMS message locations correlate with building locations, not just damaged building locations. Further, according to the infallible eyeball test, it seems that the SMS data is doing a good job of finding all of these buildings. 
(Figures 5 and 6)&lt;br /&gt;&lt;br /&gt;&lt;table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td style="text-align: center;"&gt;&lt;a href="https://lh6.googleusercontent.com/-xeQFhIdiP6k/TYEF6VU3qII/AAAAAAAABYU/iM4QP2deHnY/s1600/badger1.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"&gt;&lt;img border="0" height="393" src="https://lh6.googleusercontent.com/-xeQFhIdiP6k/TYEF6VU3qII/AAAAAAAABYU/iM4QP2deHnY/s400/badger1.png" width="400" /&gt;&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="tr-caption" style="text-align: center;"&gt;Figure 5: L statistics for SMS messages and a random sample of all buildings.&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td style="text-align: center;"&gt;&lt;a href="https://lh3.googleusercontent.com/-MYdc7PVXEkA/TYEGN3dOcfI/AAAAAAAABYY/5O6pJONikg8/s1600/BuildingDensityReproduce.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"&gt;&lt;img border="0" height="264" src="https://lh3.googleusercontent.com/-MYdc7PVXEkA/TYEGN3dOcfI/AAAAAAAABYY/5O6pJONikg8/s640/BuildingDensityReproduce.png" width="640" /&gt;&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="tr-caption" style="text-align: center;"&gt;Figure 6: (left) Fitted conditional density of buildings given SMS messages. (right) Observed density of all buildings.&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&lt;/span&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;br /&gt;&lt;br /&gt;So, what's going on here? My initial reaction was "Blimey! These text messages are just picking out buildings, not &lt;i&gt;damaged&lt;/i&gt; buildings! 
&amp;nbsp;Damaged buildings can only occur where there is a building, and because text messages correlate with buildings themselves, the correlation between text messages and damaged buildings is merely an artifact!" &amp;nbsp;After some quiet introspection, &amp;nbsp;I realized that I may have jumped the gun. &amp;nbsp;Because we only used the trusty eyeball test, we haven't looked at whether text messages do a better job of picking out damaged buildings specifically than they do of picking out any building at all.&lt;br /&gt;&lt;br /&gt;For my next trick, I run a Poisson regression. Following the original paper, I bin the data into a 30 by 30 grid, counting up the number of total buildings, damaged buildings, and SMS messages sent in each grid square. A quick diagnostic plot of the total counts versus damaged counts indicates that there is a pretty good linear relationship between the two-- &amp;nbsp;the number of damaged buildings in any square is approximately a constant times the total number of buildings in that square. Although I am hoping with all of my might that my PhD advisor does not read this and find out that I did not use a formal (Bayesian!) spatial model to handle this clearly spatial data, I simply ran a few Poisson regressions to see if the SMS data really is adding anything beyond what we already know from the building counts. In my experience, incorporating a spatial model in the regression would only serve to reduce the significance of the covariates anyway. &amp;nbsp;I fit the log-linear model&lt;br /&gt;&lt;br /&gt;Damaged Buildings ~ Poisson( exp{ b0 + b1 * SMS + log(Total Buildings + 1) } ). (Model 1)&lt;br /&gt;&lt;br /&gt;This model includes the log of one plus the total number of buildings as an &lt;a href="http://en.wikipedia.org/wiki/Poisson_regression#.22Exposure.22_and_offset"&gt;offset&lt;/a&gt;. Adding one simply serves to eliminate the computational problem of taking the log of zero. 
&amp;nbsp;As discussed in the Wikipedia article linked to offset, this is often used to control for a baseline, in this case the total number of buildings in a square. The results of this regression are&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;Call:&lt;/span&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;glm(formula = damcounts ~ offset(log(allcounts + 1)) + textcounts,&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;family = poisson(link = "log"))&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;Deviance Residuals:&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;Min &amp;nbsp; &amp;nbsp; &amp;nbsp; 1Q &amp;nbsp; Median &amp;nbsp; &amp;nbsp; &amp;nbsp; 3Q &amp;nbsp; &amp;nbsp; &amp;nbsp;Max &lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;-16.002 &amp;nbsp; -3.074 &amp;nbsp; -0.646 &amp;nbsp; &amp;nbsp;1.324 &amp;nbsp; 21.507 &lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;Coefficients:&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Estimate Std. 
Error &amp;nbsp;z value Pr(&amp;gt;|z|) &amp;nbsp; &lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;(Intercept) -1.5669207 &amp;nbsp;0.0061123 -256.353 &amp;nbsp; &amp;lt;2e-16 ***&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;textcounts &amp;nbsp;-0.0024470 &amp;nbsp;0.0009817 &amp;nbsp; -2.493 &amp;nbsp; 0.0127 * &lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;---&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;Signif. codes: &amp;nbsp;0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;(Dispersion parameter for poisson family taken to be 1)&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;Null deviance: 23336 &amp;nbsp;on 899 &amp;nbsp;degrees of freedom&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;Residual deviance: 23329 &amp;nbsp;on 898 &amp;nbsp;degrees of freedom&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;AIC: 26256&lt;/span&gt;&lt;/blockquote&gt;&lt;/blockquote&gt;&lt;div&gt;&lt;div&gt;For those of us not used to reading R output, look at the row labeled "textcounts": the first number is the estimated coefficient, and the number to the far right is its p-value. While the coefficient on the number of text messages is significant, its sign is the opposite of what we'd expect! Holding the number of buildings fixed, more text messages in a grid square predict fewer damaged buildings! 
Could it be that before sending text messages, the people sending them moved away from the damaged buildings for safety reasons?&amp;nbsp;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Next, I suspect that areas of high building density have a higher percentage of damaged buildings than areas of low building density. Imagine that in a dense area, one building falling could cause damage in others, whereas in a less dense area, this would be less likely to happen. To attempt to control for this, I ran another regression in which I include an additional covariate that is just the total number of buildings in the square. That is,&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Damaged Buildings ~ Poisson( exp{ b0 + b1 * SMS + b2 * Total Buildings + log(Total Buildings + 1) } ) &amp;nbsp;(Model 2).&amp;nbsp;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The results from Model 2 show that the number of text messages is not significant at the magical 5% significance level (i.e., with 95% confidence).&amp;nbsp;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;blockquote&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;Call:&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;glm(formula = damcounts ~ allcounts + offset(log(allcounts +&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;1)) + textcounts, family = poisson(link = "log"))&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;Deviance Residuals:&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; Min &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;1Q &amp;nbsp; &amp;nbsp;Median &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;3Q 
&amp;nbsp; &amp;nbsp; &amp;nbsp; Max &lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;-16.2803 &amp;nbsp; -2.6896 &amp;nbsp; -0.5842 &amp;nbsp; &amp;nbsp;1.3627 &amp;nbsp; 19.3989 &lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;Coefficients:&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Estimate Std. Error z value Pr(&amp;gt;|z|) &amp;nbsp; &lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;(Intercept) -1.768e+00 &amp;nbsp;1.159e-02 -152.58 &amp;nbsp; &amp;lt;2e-16 ***&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;allcounts &amp;nbsp; &amp;nbsp;3.794e-04 &amp;nbsp;1.792e-05 &amp;nbsp; 21.18 &amp;nbsp; &amp;lt;2e-16 ***&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;textcounts &amp;nbsp;-1.851e-03 &amp;nbsp;1.006e-03 &amp;nbsp; -1.84 &amp;nbsp; 0.0657 . &lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;---&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;Signif. 
codes: &amp;nbsp;0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1&amp;nbsp;&lt;/span&gt;&lt;/blockquote&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;(Dispersion parameter for poisson family taken to be 1)&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;Null deviance: 23336 &amp;nbsp;on 899 &amp;nbsp;degrees of freedom&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;Residual deviance: 22889 &amp;nbsp;on 897 &amp;nbsp;degrees of freedom&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;AIC: 25817&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&amp;nbsp;Lastly, and I won't show the output this time, if we ignore the offset completely and regress the number of damaged buildings on the total number of buildings, the square root of the total number of buildings, and the number of text messages, we find that the coefficient on the number of text messages has a p-value of .41-- far from significant... even at the 80% level. The rationale for this was simply that some exploratory data analysis suggested that the square root of the total number of buildings might be a good predictor of the number of damaged buildings. 
From a geometric point of view, if the streets within a square are themselves arranged in a grid, this would be approximately the average number of buildings per street in that square and could maybe proxy for density.&amp;nbsp;&lt;/div&gt;&lt;div&gt;&amp;nbsp;&amp;nbsp;&lt;/div&gt;&lt;div&gt;For the non-statisticians in the crowd, what this means is that given just the number of &amp;nbsp;buildings in a square, the number of text messages sent from within that square &amp;nbsp;is not an important factor in determining the number of damaged buildings! So, although text messages may be useful in identifying locations with buildings, if you already know where the buildings are, the text messages are not particularly useful (in this particular case) for figuring out how many of those buildings are damaged. Assuming that a crisis response team could more quickly access maps of building density than even the SMS data, ignoring the SMS data could lead to an even faster and cheaper response in this case.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;At this point, if you are paying careful attention, you may think that I've missed the point. We did already show that for small radii, text messages are not correlated with building damage. The approximate 0.15km radii within each box are certainly under the threshold for which we wouldn't expect to see any relationship between text messages and building damage under the original analysis. We already knew that, but I think this is a more formal way of making the point that building locations may be enough to find damaged buildings. &amp;nbsp;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="border-collapse: collapse;"&gt;&lt;span class="Apple-style-span" style="font-family: inherit;"&gt;To conclude, one of the main advantages presented in the blog post was how much time and money using SMS messages to find damaged buildings could save. 
Crowdsourced data may have its uses, but for finding damaged buildings for the case in Haiti, I’d like to propose an even cheaper alternative: a few statisticians, a map, and some coffee.&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="border-collapse: collapse;"&gt;&lt;span class="Apple-style-span" style="font-family: inherit;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="border-collapse: collapse;"&gt;&lt;span class="Apple-style-span" style="font-family: inherit;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="border-collapse: collapse;"&gt;&lt;span class="Apple-style-span" style="font-family: inherit;"&gt;**** &lt;/span&gt;&lt;/span&gt;&lt;span class="Apple-style-span" style="border-collapse: collapse;"&gt;&lt;a href="http://www.unitar.org/unosat/haiti-earthquake-2010-remote-sensing-based-building-damage-assessment-data"&gt;Data&lt;/a&gt;&amp;nbsp;obtained from&amp;nbsp;&lt;/span&gt;&lt;span class="Apple-style-span" style="border-collapse: collapse;"&gt;&lt;span class="Apple-style-span" style="font-family: inherit;"&gt;UNITAR/UNOSAT.&lt;/span&gt;&lt;/span&gt;&lt;span class="Apple-style-span" style="border-collapse: collapse; color: #1f497d; font-family: arial, sans-serif; font-size: 15px;"&gt;&amp;nbsp;&lt;/span&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5849016128097481347-50398948704425689?l=kldivergence.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kldivergence.blogspot.com/feeds/50398948704425689/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kldivergence.blogspot.com/2011/03/text-me-where-buildings-are-and-ill.html#comment-form' title='0 Comments'/><link rel='edit' 
type='application/atom+xml' href='http://www.blogger.com/feeds/5849016128097481347/posts/default/50398948704425689'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5849016128097481347/posts/default/50398948704425689'/><link rel='alternate' type='text/html' href='http://kldivergence.blogspot.com/2011/03/text-me-where-buildings-are-and-ill.html' title='Text me where the buildings are, and I&apos;ll tell you where the building damage is.'/><author><name>KL</name><uri>http://www.blogger.com/profile/02640977217346337188</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='23' height='32' src='http://4.bp.blogspot.com/_cJxplqpwZsg/Szz_pZBQ0gI/AAAAAAAABCM/Z-Y8aICWiBY/S220/bloggy.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='https://lh4.googleusercontent.com/-tHvGLSw5ILo/TYD5xoYdGYI/AAAAAAAABYA/vKBC47wYOq0/s72-c/80PercentCIFromPaperPNG.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5849016128097481347.post-206727287805617271</id><published>2010-11-12T06:24:00.000-08:00</published><updated>2010-11-12T06:56:12.484-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='spooky'/><category scheme='http://www.blogger.com/atom/ns#' term='psychology'/><category scheme='http://www.blogger.com/atom/ns#' term='weird'/><title type='text'>Stop the presses! Psychic Phenomena are Real!!!!</title><content type='html'>Now, this might be the coolest thing ever! Some &lt;a href="http://www.dbem.ws/"&gt;researchers&lt;/a&gt; claim that they have conducted experiments that show that psychic phenomena (pre-cognition, i.e. telling the future!!!) exist. 
Here's &lt;a href="http://www.newscientist.com/article/dn19712-is-this-evidence-that-we-can-see-the-future.html"&gt;the article that alerted me to this&lt;/a&gt;&amp;nbsp;(which was sent to me by one extra-special Craigory Craig, who I won't link to because he's a professional now or something), and here's a &lt;a href="http://www.dbem.ws/FeelingFuture.pdf"&gt;pre-print of the paper&lt;/a&gt;. &lt;br /&gt;&lt;br /&gt;To begin, this is by far my favorite sentence from the paper:&lt;br /&gt;&lt;blockquote&gt;After responding to two individual-difference items (discussed below), the participant had a 3-min relaxation period during which the screen displayed a &lt;i&gt;&lt;span style="color: magenta;"&gt;slowly moving Hubble photograph of the starry sky while peaceful new-age music played through stereo speakers&lt;/span&gt;&lt;/i&gt;.&lt;/blockquote&gt;&lt;br /&gt;Why am I not surprised that this was the set-up researchers in this field would choose? I must be psychic.&lt;br /&gt;&lt;br /&gt;In the above patchouli-scented experiment, they present the participants with two curtains to choose between, one of which had a picture behind it while the other had nothing-- sort of like the Let's Make a Deal / Monty Hall game except instead of a car, you are rewarded with a picture of people doing it, and instead of a goat, you just get a blank screen. No, seriously, some of the pictures that were behind the curtain were "erotic pictures" (i.e. people doing it). The awesome thing here (if you have the sense of humor of a 13-year-old boy, much like I do) is that people were able to guess with statistically better than 50% accuracy which curtain the picture was behind... as long as it was an erotic picture. My first thought is that this sort of psychic power explains why I miraculously turned up at my dorm room pretty much every time my freshman year roommate wanted it to herself. The force is strong with this one. 
&lt;br /&gt;&lt;br /&gt;In another section of the paper, they talk about retroactive priming.&amp;nbsp; Each person was asked to indicate whether a picture was pleasant or unpleasant. In the retroactive experiment, a word was then flashed on the screen that was either congruous or incongruous with "pleasant" or "unpleasant". In the plain vanilla version, the&amp;nbsp; priming word was flashed first. In these experiments, we'd apparently expect to see that it takes a person longer to select "pleasant" or "unpleasant" if the prime was incongruous with what they were trying to choose, and I guess this has been shown in forward priming experiments. Between pictures, a photograph from the Hubble telescope again made an appearance... because apparently photographs from the Hubble telescope&amp;nbsp;are to psi-sense as sorbet is to tongues. &lt;br /&gt;&lt;br /&gt;So, here's what I'm thinking: &lt;br /&gt;&lt;br /&gt;Why are people &lt;i&gt;only&lt;/i&gt; able to have pre-cognitive powers related to erotic images? Is this what the researchers set out to prove in the first place? If not, it seems that one could partition the pictures into categories such that one of the categories proved statistically significant. I actually don't think they were being dishonest in that way, though. Just sayin'.&lt;br /&gt;&lt;br /&gt;Certainly there have been other priming experiments done in the past in which a series of primes and pictures were presented without the delicious raspberry Hubble telescope in between. &lt;span style="background-color: white; color: purple;"&gt;If retroactive priming is real, could they&amp;nbsp;not re-analyze those old studies to see if the retroactive priming effect was present when it was not the explicit purpose of the study?&lt;/span&gt;&amp;nbsp;It would be awesome if it were, as evidence of this would have just been sitting around waiting to be discovered. 
&lt;br /&gt;&lt;br /&gt;If it's not, I am actually not so quick to take that as evidence that these sorts of psychic abilities aren't real. &lt;span style="color: purple;"&gt;Could that not be evidence that people have psychic abilities that lean in the direction of pleasing the experimenter by confirming the hypothesis of the study, even if the hypothesis was unknown to the participant?&lt;/span&gt; I mean, shit, if they were psychic enough to know what the word was before they saw it, they ought to be psychic enough to know what the experimenter was trying to get at. And, how crazy would that be??? That would certainly call into question all designed experiments in psychology, as effects could also then be attributed to the participants' inclination to confirm the hypothesis, even if the hypothesis was not disclosed. &lt;br /&gt;&lt;br /&gt;In any case, this is not a math-busters style post. I'll leave the replication of this study to the ghost-busters / psychologists. Until then, I'll be eagerly waiting to see&amp;nbsp;if this ends up getting busted...&lt;br /&gt;&lt;br /&gt;&amp;nbsp;So, what do you think? Do psychic phenomena exist? 
If you don't believe this, how much evidence would you need to overcome your prior?&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5849016128097481347-206727287805617271?l=kldivergence.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kldivergence.blogspot.com/feeds/206727287805617271/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kldivergence.blogspot.com/2010/11/stop-presses-psychic-phenomenon-are.html#comment-form' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5849016128097481347/posts/default/206727287805617271'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5849016128097481347/posts/default/206727287805617271'/><link rel='alternate' type='text/html' href='http://kldivergence.blogspot.com/2010/11/stop-presses-psychic-phenomenon-are.html' title='Stop the presses! Psychic Phenomena are Real!!!!'/><author><name>KL</name><uri>http://www.blogger.com/profile/02640977217346337188</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='23' height='32' src='http://4.bp.blogspot.com/_cJxplqpwZsg/Szz_pZBQ0gI/AAAAAAAABCM/Z-Y8aICWiBY/S220/bloggy.jpg'/></author><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5849016128097481347.post-317750868822302838</id><published>2010-11-09T03:56:00.000-08:00</published><updated>2010-11-09T03:56:40.219-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='weird'/><category scheme='http://www.blogger.com/atom/ns#' term='Brazil'/><title type='text'>Daylight Savings Time!</title><content type='html'>The only way I can ever remember which direction Daylight Savings Time changes the time is with the saying "spring forward, fall back." 
The fact that the direction of the changes is dictated by the season (i.e. how early the sun rises and sets) should have made it obvious what would happen with the time in the southern hemisphere relative to the northern hemisphere. In fact, I never stopped to think about this until... yesterday.&lt;br /&gt;&lt;br /&gt;When I arrived in Brazil on October 6, I was one hour ahead of the US's east coast. One day, I woke up,&amp;nbsp; my cell phone time had sprung forward, and I was magically two hours ahead of the east coast. On Sunday, the east coast fell back, and I am now three hours ahead.&lt;br /&gt;&lt;br /&gt;This is not earth-shattering news. It's just kind of weird. I'm guessing that this has never occurred to most people who have not switched hemispheres or do not work with people in the opposite hemisphere.&lt;br /&gt;&lt;br /&gt;So, now you know.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5849016128097481347-317750868822302838?l=kldivergence.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kldivergence.blogspot.com/feeds/317750868822302838/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kldivergence.blogspot.com/2010/11/daylight-savings-time.html#comment-form' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5849016128097481347/posts/default/317750868822302838'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5849016128097481347/posts/default/317750868822302838'/><link rel='alternate' type='text/html' href='http://kldivergence.blogspot.com/2010/11/daylight-savings-time.html' title='Daylight Savings Time!'/><author><name>KL</name><uri>http://www.blogger.com/profile/02640977217346337188</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='23' 
height='32' src='http://4.bp.blogspot.com/_cJxplqpwZsg/Szz_pZBQ0gI/AAAAAAAABCM/Z-Y8aICWiBY/S220/bloggy.jpg'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5849016128097481347.post-6373618274819218606</id><published>2010-11-08T06:18:00.000-08:00</published><updated>2010-11-10T10:25:26.353-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='death'/><category scheme='http://www.blogger.com/atom/ns#' term='basic stats'/><category scheme='http://www.blogger.com/atom/ns#' term='losers'/><title type='text'>Joint Probability of Being Mauled by a Bear and Struck by Lightning</title><content type='html'>This is an oldie but a goodie. A while ago, Ms. Sarah Bailey posted this article on my &lt;a href="http://www.facebook.com/kristian.lum"&gt;Facebook wall &lt;/a&gt;about a &lt;a href="http://www.newsobserver.com/2010/06/22/545800/man-hit-by-lightning-mauled-by.html"&gt;guy who got struck by lightning and mauled by a bear.&lt;/a&gt; They go on to say that the closest estimate of the probability of both of these things happening is zero. Agreed... for any random person.&lt;br /&gt;&lt;br /&gt;Every person, of course, does not have the same probability of being hit by lightning and being mauled by a bear. Take Donald Trump, for example. While Zeus probably hates him for being the most pompous shit ever, thus making him about 1,000 times more likely to be hit by lightning than the normal person, I'd hazard a guess that he is rarely if ever within 100 miles of an un-caged bear.&lt;br /&gt;&lt;br /&gt;On the other hand, look at Rick Oliver. According to the article, "he  tends to piddle about his farm, checking on his chickens, working on  his tractors and, as he was in the wee hours of June 3, fixing up his  Chevy Malibu." It was while piddling that, upon hearing a mysterious noise off in the distance, he went alone to investigate. I'd say that sort of behavior makes you pretty darn likely to be mauled by a bear. 
It might also make you pretty darn likely to get struck by lightning if that same tendency to investigate noises outside also applies to thunder.&amp;nbsp; &lt;a href="http://www.newsobserver.com/2010/06/22/545800/man-hit-by-lightning-mauled-by.html#ixzz14h5Pt0vM" style="color: #003399;"&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Two points here: (1) these events are not independent. They are probably conditionally independent given a number of factors, such as rural-dwelling, gender==male, a love of Kenny Chesney, etc.&amp;nbsp; (2) If you meet several of those conditions (i.e. if you're the sort of person who goes looking for bears/lightning), as rare as occurrences of bear maulings and lightning strikes are in the overall population, I'd say you're fairly likely to be attacked by both.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5849016128097481347-6373618274819218606?l=kldivergence.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kldivergence.blogspot.com/feeds/6373618274819218606/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kldivergence.blogspot.com/2010/11/joint-probability-of-being-mauled-by.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5849016128097481347/posts/default/6373618274819218606'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5849016128097481347/posts/default/6373618274819218606'/><link rel='alternate' type='text/html' href='http://kldivergence.blogspot.com/2010/11/joint-probability-of-being-mauled-by.html' title='Joint Probability of Being Mauled by a Bear and Struck by Lightning'/><author><name>KL</name><uri>http://www.blogger.com/profile/02640977217346337188</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='23' 
height='32' src='http://4.bp.blogspot.com/_cJxplqpwZsg/Szz_pZBQ0gI/AAAAAAAABCM/Z-Y8aICWiBY/S220/bloggy.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5849016128097481347.post-8929285643008697315</id><published>2010-11-05T07:08:00.000-07:00</published><updated>2010-11-10T10:26:07.249-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='vicious animals'/><category scheme='http://www.blogger.com/atom/ns#' term='basic stats'/><category scheme='http://www.blogger.com/atom/ns#' term='magic'/><title type='text'>Harm Caused by Animals</title><content type='html'>&lt;div class="separator" style="clear: both; text-align: left;"&gt;Possibly due partially to my most recent post re: personal alcohol expenditures, several people have sent me this few days old link,&amp;nbsp;&lt;a href="http://www.economist.com/blogs/dailychart/2010/11/drugs_cause_most_harm"&gt;Harm Caused by Drugs&lt;/a&gt;, from The Economist. They show a plot of the relative harm caused by various drugs, both to society and to the individual. Alcohol ranks first. I guess I'm effed.&amp;nbsp;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;While I guess it's cool, what I keep pointing out is that as far as I can tell, what they are plotting is not data on { mortality / crime / loss of dignity / accidental pregnancy / increased probability of jumping naked on a trampoline } that can be attributed to use of the drug. They are plotting some "drug-harm" experts' opinions on how much harm each drug causes. I'm certainly not saying that these people's opinions aren't valid, but how can the experts even assign a number to this? I actually looked at the summary of the study, and they are not giving rankings; they are coming up with these numbers based on weighting several different sub-categories of personal/societal harm. 
What is one unit of harm? How do you come up with the weights? Are harm to self and harm to society additive, as this plot suggests?&amp;nbsp;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;Also, the way this is phrased makes it seem as though this is a score of the intrinsic potential harm caused by the drug. I have a hard time believing that alcohol is fundamentally more harmful than, let's say, crack cocaine. I think what got alcohol its primo number one ranking is the fact that it's so common. &amp;nbsp;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;In this same spirit, I thought I would plot the harm caused by various animals according to an expert on the subject: Napoleon Dynamite. Each animal is ranked based on the harm it can cause to people due to natural fierceness and supernatural magic skills. Each of these is of course composed of several subcategories, which were weighted according to their importance in determining overall &amp;nbsp;potential harm. 
&amp;nbsp;&amp;nbsp;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_cJxplqpwZsg/TNQK5XHySfI/AAAAAAAABV0/_tyCFJGNK60/s1600/image003.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="435" src="http://2.bp.blogspot.com/_cJxplqpwZsg/TNQK5XHySfI/AAAAAAAABV0/_tyCFJGNK60/s640/image003.png" width="640" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;On a related note, WTF, California??&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5849016128097481347-8929285643008697315?l=kldivergence.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kldivergence.blogspot.com/feeds/8929285643008697315/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kldivergence.blogspot.com/2010/11/harm-caused-by-animals.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5849016128097481347/posts/default/8929285643008697315'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5849016128097481347/posts/default/8929285643008697315'/><link rel='alternate' type='text/html' href='http://kldivergence.blogspot.com/2010/11/harm-caused-by-animals.html' title='Harm Caused by Animals'/><author><name>KL</name><uri>http://www.blogger.com/profile/02640977217346337188</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='23' height='32' src='http://4.bp.blogspot.com/_cJxplqpwZsg/Szz_pZBQ0gI/AAAAAAAABCM/Z-Y8aICWiBY/S220/bloggy.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' 
url='http://2.bp.blogspot.com/_cJxplqpwZsg/TNQK5XHySfI/AAAAAAAABV0/_tyCFJGNK60/s72-c/image003.png' height='72' width='72'/><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5849016128097481347.post-8955973095980258873</id><published>2010-11-04T18:48:00.000-07:00</published><updated>2010-11-10T10:26:49.350-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='apologies to JJ'/><category scheme='http://www.blogger.com/atom/ns#' term='money'/><category scheme='http://www.blogger.com/atom/ns#' term='Brazil'/><title type='text'>Well, shit...</title><content type='html'>&lt;div class="separator" style="clear: both; text-align: left;"&gt;I like to pretend that I'm good with numbers. Maybe not so much...&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_cJxplqpwZsg/TNNiIA2YUvI/AAAAAAAABVw/U9TvKbzOlno/s1600/Screen+shot+2010-11-04+at+11.38.17+PM.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="290" src="http://2.bp.blogspot.com/_cJxplqpwZsg/TNNiIA2YUvI/AAAAAAAABVw/U9TvKbzOlno/s400/Screen+shot+2010-11-04+at+11.38.17+PM.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5849016128097481347-8955973095980258873?l=kldivergence.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kldivergence.blogspot.com/feeds/8955973095980258873/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kldivergence.blogspot.com/2010/11/well-shit.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' 
href='http://www.blogger.com/feeds/5849016128097481347/posts/default/8955973095980258873'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5849016128097481347/posts/default/8955973095980258873'/><link rel='alternate' type='text/html' href='http://kldivergence.blogspot.com/2010/11/well-shit.html' title='Well, shit...'/><author><name>KL</name><uri>http://www.blogger.com/profile/02640977217346337188</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='23' height='32' src='http://4.bp.blogspot.com/_cJxplqpwZsg/Szz_pZBQ0gI/AAAAAAAABCM/Z-Y8aICWiBY/S220/bloggy.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_cJxplqpwZsg/TNNiIA2YUvI/AAAAAAAABVw/U9TvKbzOlno/s72-c/Screen+shot+2010-11-04+at+11.38.17+PM.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5849016128097481347.post-1257955149061889171</id><published>2010-10-17T12:15:00.000-07:00</published><updated>2010-10-17T12:15:49.319-07:00</updated><title type='text'>This conclusion was pulled straight out of this guy's ass...</title><content type='html'>I blame people like this butthead for the fact that whenever I say that I am a statistician, people&amp;nbsp; ask if that means I can make the data say whatever I want. That's why I usually say that I am an astronaut-- fewer questions and way more street cred. Sure, you can make the data say whatever you want if you are (1) delusional like the author of &lt;a href="http://www.slate.com/id/2269951/"&gt;this article&lt;/a&gt; or (2) dishonest.&lt;br /&gt;&lt;br /&gt;A week or so ago, my friend, Sarah, sent me &lt;a href="http://www.slate.com/id/2269951/"&gt;this article&lt;/a&gt; about a survey on sexual behavior in America with the advice to "read all the way through because their conclusion is somewhat amusing." 
Reproduced below is the best part:&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;Here's my guess. Look carefully at &lt;a href="http://onlinelibrary.wiley.com/doi/10.1111/j.1743-6109.2010.02020.x/abstract" target="_blank"&gt;Table 4, Pages 355-6&lt;/a&gt;.  Only 6 percent of women who had anal sex in their last encounter did so  in isolation. Eighty-six percent also had vaginal sex. Seventy-two  percent also received oral sex. Thirty-one percent also had partnered  masturbation. And the more sex acts a woman engaged in during the  encounter, the more likely she was to report orgasm. These other  activities are what gave the women their orgasms. The anal sex just came  along for the ride.&lt;/blockquote&gt;&lt;br /&gt;&lt;blockquote&gt;So why did the inclusion of anal sex bump the  orgasm figure up to 94 percent? It didn't. The causality runs the other  way. Women who were getting what they wanted were more likely to  indulge their partners' wishes. It wasn't the anal sex that caused the  orgasms. It was the orgasms that caused the anal sex.&lt;/blockquote&gt;&lt;blockquote&gt;&lt;/blockquote&gt;It would probably be good to mention that the relevant stats about anal sex were based on 31 people, and further sub-grouping obviously results in even smaller groups.&lt;br /&gt;&lt;br /&gt;So, in conclusion, this guy is a complete (anal) douche both for his conclusions regarding the direction of causation here and for providing another example of data being manipulated to say whatever you want. &lt;br /&gt;&lt;br /&gt;&lt;br /&gt;p.s. 
Because I'm turning my homework in late, this post comes after some alternate explanations for the data were posted&lt;a href="http://www.slate.com/id/2270622/"&gt; here&lt;/a&gt;.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5849016128097481347-1257955149061889171?l=kldivergence.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kldivergence.blogspot.com/feeds/1257955149061889171/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kldivergence.blogspot.com/2010/10/this-conclusion-was-pulled-straight-out.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5849016128097481347/posts/default/1257955149061889171'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5849016128097481347/posts/default/1257955149061889171'/><link rel='alternate' type='text/html' href='http://kldivergence.blogspot.com/2010/10/this-conclusion-was-pulled-straight-out.html' title='This conclusion was pulled straight out of this guy&apos;s ass...'/><author><name>KL</name><uri>http://www.blogger.com/profile/02640977217346337188</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='23' height='32' src='http://4.bp.blogspot.com/_cJxplqpwZsg/Szz_pZBQ0gI/AAAAAAAABCM/Z-Y8aICWiBY/S220/bloggy.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5849016128097481347.post-4908325926542957596</id><published>2010-10-16T15:34:00.000-07:00</published><updated>2010-10-16T15:44:08.010-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='apologies to JJ'/><category scheme='http://www.blogger.com/atom/ns#' term='losers'/><category scheme='http://www.blogger.com/atom/ns#' term='I told you so'/><category scheme='http://www.blogger.com/atom/ns#' 
term='Brazil'/><title type='text'>If your first language is Klingon, you probably also speak English.</title><content type='html'>I've always heard that the best way to learn a new language is by total immersion, so I didn't bother learning too much Portuguese before I moved to Rio de Janeiro about 10 days ago. Aside from having a dissertation to write before I left&amp;nbsp; (which I figured deserved the bulk of my effort... and sadly stole &lt;s&gt; some&amp;nbsp; &lt;/s&gt;all of my bloggy time from me) I figured that just showing up in Brazil with one Portuguese course under my belt ought to be enough to get me up to speed pretty quickly. I imagined myself arriving in an exotic paradise, armed with a three year old's level of knowledge of the native language (and wit and charm galore), and  smoothly transitioning into a carioca without being bailed out by anyone speaking English. Ever. I would of course also have an adorably irresistible accent. &lt;br /&gt;&lt;br /&gt;This is only one of many fantasies I had about my life in Brazil that has not come to fruition... one of the other notable ones involves the inverse relationship between my desire to see any given Brazilian guy in his tiny little man-bikini-bottom and the probability that he will actually wear said swimming apparatus. Whenever I try to actually forge ahead with the Portuguese on a task like asking directions, which I can totally handle &lt;i&gt;without help, thank you&lt;/i&gt;, the person I'm asking smiles amusedly at me and answers in English. However, when it comes to navigating Brazil's soul-crushingly burdensome bureaucracy or trying to set up an account with the Internet company, no one can help me. (Seriously, can someone help me get Internet in my apartment?)&lt;br /&gt;&lt;br /&gt;Anyway, when my mom was here, she was stunned by how few people speak English. 
She commented that it is not like Holland, where it seems like just about everyone speaks English.&amp;nbsp; Having spent more than the 24 hour act-like-a-mature-adult-limit with my mom, I of course regressed to 14 year old me. "Duh, mom, of course they don't. _sigh_ Tons of people speak Portuguese, and hardly anyone speaks Dutch. If the Dutch didn't learn English, the only people they would be able to communicate with would be... the Dutch... and what good is that?" She didn't buy it, so I was &lt;b&gt;forced&lt;/b&gt; to make some plots. &lt;br /&gt;&lt;br /&gt;My point, I guess, revolved around the fact that it is not very practical to only be able to communicate with a very small community. So, if the community of people with whom you can communicate is large already, you'd be less likely to learn another language. (Go with this for a second, and assume that the chosen language would be English.) If you share your first language with relatively few people, you'd be more likely to learn English. &lt;br /&gt;&lt;br /&gt;&lt;a href="http://2.bp.blogspot.com/_cJxplqpwZsg/TLog3OiKl6I/AAAAAAAABVg/IpGKKUXnwIM/s1600/LinearLanguagePlot.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="400" src="http://2.bp.blogspot.com/_cJxplqpwZsg/TLog3OiKl6I/AAAAAAAABVg/IpGKKUXnwIM/s400/LinearLanguagePlot.png" width="400" /&gt;&lt;/a&gt;So, I snagged some &lt;a href="http://en.wikipedia.org/wiki/List_of_countries_by_English-speaking_population"&gt;data&lt;/a&gt; &lt;a href="http://en.wikipedia.org/wiki/List_of_languages_by_number_of_native_speakers"&gt;from&lt;/a&gt; &lt;a href="http://en.wikipedia.org/wiki/List_of_official_languages_by_countries"&gt;Wikipedia&lt;/a&gt; (&amp;lt;3 you, W!), and I compared the proportion of people in each country who speak English to the total estimated number of people world-wide who speak each country's official language. 
For reasons of laziness and ignorance about which languages are most used in every country, if more than one language was listed, I took the first. I also removed the countries that had English listed as an official language. The result of forcing several by-country lists&amp;nbsp; into one table and keeping only those countries that had all of the necessary data available was a table of 23 countries. &lt;br /&gt;&lt;br /&gt;So, to be fair, having seen this I actually want to backpedal a little bit. While there does seem to be a trend*, it looks like a spatial model or just taking continents (or even the wealth of each country) into account might explain some of this-- notice that Europe is mostly above the line and, darn you, Latin America,&amp;nbsp; is mostly below the line.&lt;br /&gt;&lt;br /&gt;Point being, if you want to go on vacation in a place where you won't have many communication barriers, go to Iceland.** :)&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;*Yes, statistician friends,&amp;nbsp; I do realize that fitting a line to  data that only goes between 0 and 1 is not the best thing anyone has  ever done... I have a super budget version of a logistic&amp;nbsp; regression fit  to this also if this offends your statistical sensibilities too much.&lt;br /&gt;&lt;br /&gt;** Not one of the countries in the plot. 
I'm just guessing.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5849016128097481347-4908325926542957596?l=kldivergence.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kldivergence.blogspot.com/feeds/4908325926542957596/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kldivergence.blogspot.com/2010/10/if-your-first-language-is-klingon-you.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5849016128097481347/posts/default/4908325926542957596'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5849016128097481347/posts/default/4908325926542957596'/><link rel='alternate' type='text/html' href='http://kldivergence.blogspot.com/2010/10/if-your-first-language-is-klingon-you.html' title='If your first language is Klingon, you probably also speak English.'/><author><name>KL</name><uri>http://www.blogger.com/profile/02640977217346337188</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='23' height='32' src='http://4.bp.blogspot.com/_cJxplqpwZsg/Szz_pZBQ0gI/AAAAAAAABCM/Z-Y8aICWiBY/S220/bloggy.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_cJxplqpwZsg/TLog3OiKl6I/AAAAAAAABVg/IpGKKUXnwIM/s72-c/LinearLanguagePlot.png' height='72' width='72'/><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5849016128097481347.post-1912639658800922416</id><published>2010-10-12T15:09:00.000-07:00</published><updated>2010-10-12T15:09:32.956-07:00</updated><title type='text'>Infinite loop Skypey screen shot</title><content type='html'>&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a 
href="http://1.bp.blogspot.com/_cJxplqpwZsg/TLTb33tmzUI/AAAAAAAABVc/aMiZq8-Nbj8/s1600/Screen+shot+2010-10-12+at+7.01.14+PM.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="400" src="http://1.bp.blogspot.com/_cJxplqpwZsg/TLTb33tmzUI/AAAAAAAABVc/aMiZq8-Nbj8/s640/Screen+shot+2010-10-12+at+7.01.14+PM.png" width="640" /&gt;&lt;/a&gt;&lt;/div&gt;Me looking at JJ's screen... who's looking at my screen... while looking at his screen... while looking at my screen...&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5849016128097481347-1912639658800922416?l=kldivergence.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kldivergence.blogspot.com/feeds/1912639658800922416/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kldivergence.blogspot.com/2010/10/infinite-loop-skypey-screen-shot.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5849016128097481347/posts/default/1912639658800922416'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5849016128097481347/posts/default/1912639658800922416'/><link rel='alternate' type='text/html' href='http://kldivergence.blogspot.com/2010/10/infinite-loop-skypey-screen-shot.html' title='Infinite loop Skypey screen shot'/><author><name>KL</name><uri>http://www.blogger.com/profile/02640977217346337188</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='23' height='32' src='http://4.bp.blogspot.com/_cJxplqpwZsg/Szz_pZBQ0gI/AAAAAAAABCM/Z-Y8aICWiBY/S220/bloggy.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_cJxplqpwZsg/TLTb33tmzUI/AAAAAAAABVc/aMiZq8-Nbj8/s72-c/Screen+shot+2010-10-12+at+7.01.14+PM.png' height='72' 
width='72'/><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5849016128097481347.post-5014653727079240404</id><published>2010-04-20T12:04:00.000-07:00</published><updated>2010-04-20T12:40:43.208-07:00</updated><title type='text'>Princely programming hotness</title><content type='html'>Although I'd promised myself radio silence on here until I get some &lt;i&gt;real&lt;/i&gt; work done, I'm going to have to make an exception for a super short post. Prince_i has posted a &lt;a href="http://www.kenvanharen.com/love/math"&gt;web app&lt;/a&gt; so that people (you!) can put in your own parameters and get a probability that you find someone better than your current prince/princess. His is also simulation based, but it is not quite the same model that I used in my &lt;a href="http://kldivergence.blogspot.com/2010/04/princess-story.html"&gt;first post&lt;/a&gt; (where I admittedly did totally yoink some of his ideas).  You should probably go check it out. &lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Seriously, is there anything on earth sexier than a man who does math &lt;i&gt;and&lt;/i&gt; programs!? Oh baby, oh baby, I love those greek letters and linux jokes! (I wish I were kidding...)  
&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5849016128097481347-5014653727079240404?l=kldivergence.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kldivergence.blogspot.com/feeds/5014653727079240404/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kldivergence.blogspot.com/2010/04/princely-programming-hotness.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5849016128097481347/posts/default/5014653727079240404'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5849016128097481347/posts/default/5014653727079240404'/><link rel='alternate' type='text/html' href='http://kldivergence.blogspot.com/2010/04/princely-programming-hotness.html' title='Princely programming hotness'/><author><name>KL</name><uri>http://www.blogger.com/profile/02640977217346337188</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='23' height='32' src='http://4.bp.blogspot.com/_cJxplqpwZsg/Szz_pZBQ0gI/AAAAAAAABCM/Z-Y8aICWiBY/S220/bloggy.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5849016128097481347.post-4824349220124549340</id><published>2010-04-19T13:29:00.000-07:00</published><updated>2010-04-19T14:59:26.372-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='math-fun'/><category scheme='http://www.blogger.com/atom/ns#' term='basic stats'/><category scheme='http://www.blogger.com/atom/ns#' term='babies'/><category scheme='http://www.blogger.com/atom/ns#' term='extreme nerdery'/><title type='text'>The Chinese birth calendar is total bunk</title><content type='html'>The other day someone (thanks, Dick!) 
posted my &lt;a href="http://kldivergence.blogspot.com/2010/04/do-celebrities-die-in-threes.html"&gt;blog about celebrity deaths&lt;/a&gt; to hacker news, and someone else (thanks, hoelle!) actually &lt;a href="http://news.ycombinator.com/item?id=1275979"&gt;commented on it&lt;/a&gt;! Hooray!! Pretty much made my... day / month / year. So, in an effort to encourage such interaction and engagement with my little bloggy, I’m going to respond directly and promptly to the suggestion made in the comments at Hacker News at the expense of any progress on my dissertation today. Really hoping my advisor isn’t on to this...&lt;br /&gt;&lt;!--l. 36--&gt;&lt;p class="indent"&gt;  Hoelle said:&lt;br /&gt;&lt;!--l. 38--&gt;&lt;/p&gt;&lt;p class="indent"&gt;  &lt;/p&gt;&lt;blockquote&gt;&lt;span class="Apple-style-span"  style="color:#663366;"&gt;I wonder if that will convince my wife. Probably not. Her stats superstitions drive me crazy. Ever heard of the Chinese birth calendar? For example: http://www.webwomb.com/chinesechart.htm. 90%+ accuracy should be an easy claim to bust. Unfortunately for me it’s been right for our kids 2 out of 2 times. Why are stats always so hard to sell over anecdotal experience?&lt;/span&gt;&lt;/blockquote&gt;&lt;br /&gt;&lt;!--l. 42--&gt;&lt;p&gt;&lt;/p&gt;&lt;p class="indent"&gt;  OK, great. This seems easy enough to test. What follows is my first episode of&lt;br /&gt;MathBusters.&lt;br /&gt;&lt;!--l. 44--&gt;&lt;/p&gt;&lt;p class="indent"&gt;  (btw, Jamie and Adam, if you are out there, can I please pretty please be the MythBuster’s statistician? You can even use me as Buster II if you want, as long as I get to do math-fun while being blown to smithereens.)&lt;br /&gt;&lt;!--l. 46--&gt;&lt;/p&gt;&lt;p class="indent"&gt;  I downloaded data from the &lt;a href="http://www.cdc.gov/nchs/data_access/Vitalstatsonline.htm#Births"&gt;website for the Centers for Disease Control and Prevention&lt;/a&gt;. 
Specifically, I used the data set on births from 2006 in the US territories because it was both recent and smallish. I wrote some quick pythony-goodness to clean that up so I could move it directly over to R, my one true love. &lt;/p&gt;&lt;p class="indent"&gt;I only consider births in which all of the necessary fields (sex of baby, date of mother’s last menstrual cycle, age of mother at time of birth) are complete, which leaves me with a sample size of 50,079 birth records to play with. Fun!&lt;/p&gt;&lt;p class="indent"&gt;  Ready for the results? The Chinese birth calendar was correct with 49.70% accuracy on this dataset. With this many observations, the only point of a hypothesis test will be to have one more darn example of a hypothesis test for proportions on the internet. I say the more fun statistics floating around the better, so...&lt;br /&gt;&lt;/p&gt;&lt;p class="indent"&gt;  Let’s start by being lenient and test, in the classical way, whether the Chinese birth calendar is up to anything other than complete random chance. We'll even give it credit if it can do a good job at predicting the opposite! At least then we'd know that it is useful for something.&lt;/p&gt;&lt;p class="indent"&gt;That is, the null hypothesis is that the probability of the Chinese birth calendar being correct is &lt;span class="cmmi-12"&gt;p&lt;/span&gt;&lt;sub&gt;&lt;span class="cmr-8"&gt;0&lt;/span&gt;&lt;/sub&gt; = &lt;span class="cmmi-12"&gt;.&lt;/span&gt;5. Relying upon asymptotic normality (I’d say that 50,079 is pretty darn close to infinity), the fact that I still remember this stuff after four years of grad school, and &lt;a href="http://en.wikipedia.org/wiki/Hypothesis_test"&gt;wikipedia&lt;/a&gt; (it does not lie!), we have a &lt;span class="cmmi-12"&gt;z &lt;/span&gt;statistic of -1.32, which falls in about the 9th percentile of the standard normal distribution, implying a two-sided p-value of 0.187. 
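&lt;/p&gt;&lt;p class="indent"&gt;For anyone who wants to check that arithmetic, here is a minimal Python sketch of the one-sample z test for a proportion. The function name is made up for illustration, and the counts are an assumption: they are backed out of the beta posterior quoted further down (its two parameters minus one each for the uniform prior), so they sum to slightly fewer than the 50,079 records mentioned above. The output should land close to the z statistic and p-value in the text, though not necessarily on the last digit.&lt;/p&gt;&lt;pre&gt;
```python
from math import sqrt
from statistics import NormalDist


def proportion_z_test(successes, n, p0=0.5):
    """Return (z, two-sided p-value) for H0: p = p0, using the normal approximation."""
    p_hat = successes / n
    standard_error = sqrt(p0 * (1 - p0) / n)  # standard error under the null
    z = (p_hat - p0) / standard_error
    p_two_sided = 2 * NormalDist().cdf(-abs(z))
    return z, p_two_sided


# Assumed counts, inferred from the posterior beta(24891, 25187);
# they are NOT read from the cleaned CDC file itself.
correct, wrong = 24890, 25186
z, p_value = proportion_z_test(correct, correct + wrong)
print(round(z, 2), round(p_value, 3))
```
&lt;/pre&gt;&lt;p class="indent"&gt;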
To use normal stats lingo, we have to fail to reject the hypothesis that the Chinese birth calendar is nothing more than a complete load of baloney. My poor Chinese granny probably just rolled over in her grave. Maybe that wasn't normal stats lingo. Oops.&lt;br /&gt;&lt;/p&gt;&lt;p class="indent"&gt;  Again, for the sake of more fun stats floating around somewhere, what about testing the hypothesis that &lt;span class="cmmi-12"&gt;p &lt;/span&gt;&lt;span class="cmsy-10x-x-120"&gt;≥ &lt;/span&gt;&lt;span class="cmmi-12"&gt;.&lt;/span&gt;9, as is claimed on the website? Well, in the classical hypothesis testing framework, I think that would either require integration or a likelihood ratio test, to which I am morally opposed. So, as a shout-out to my Bayesian homies, I’ll just slap a conjugate prior on &lt;span class="cmmi-12"&gt;p &lt;/span&gt;(a beta(1,1) = uniform). This results in a posterior distribution for &lt;span class="cmmi-12"&gt;p&lt;/span&gt;, p | data ~ beta(24891, 25187), which implies that the posterior probability that &lt;span class="cmmi-12"&gt;p &lt;/span&gt;&lt;span class="cmsy-10x-x-120"&gt;≥ &lt;/span&gt;&lt;span class="cmmi-12"&gt;.&lt;/span&gt;90 is about nil. Yep, zero. No fucking chance does that predict births with greater than 90% accuracy. 
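&lt;/p&gt;&lt;p class="indent"&gt;To put a rough number on that, here is a small Python sketch that summarizes the beta(24891, 25187) posterior and estimates the chance of p being at least .9 by brute-force sampling. The seed and the number of draws are arbitrary choices, and sampling is just one convenient way to look at the tail (an exact answer would use the incomplete beta function). The point is that .9 sits roughly 180 posterior standard deviations above the posterior mean.&lt;/p&gt;&lt;pre&gt;
```python
import random
from math import sqrt

# Posterior from the text: beta(a, b) after a uniform beta(1, 1) prior.
a, b = 24891, 25187

post_mean = a / (a + b)
post_sd = sqrt(a * b / ((a + b) ** 2 * (a + b + 1)))
# How many posterior standard deviations 0.9 sits above the posterior mean.
sds_away = (0.9 - post_mean) / post_sd

# Monte Carlo estimate of P(p >= 0.9 | data); seed and draw count are arbitrary.
rng = random.Random(0)
draws = [rng.betavariate(a, b) for _ in range(20_000)]
share_above = sum(d >= 0.9 for d in draws) / len(draws)

print(round(post_mean, 4), round(post_sd, 4), round(sds_away), share_above)
```
&lt;/pre&gt;&lt;p class="indent"&gt;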
So, there we go, I’m going to go ahead and call this one busted, Jamie.&lt;/p&gt;&lt;p class="indent"&gt;(Drats, there I go dreaming again...)&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5849016128097481347-4824349220124549340?l=kldivergence.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kldivergence.blogspot.com/feeds/4824349220124549340/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kldivergence.blogspot.com/2010/04/other-day-someone-thanks-dick-posted-my.html#comment-form' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5849016128097481347/posts/default/4824349220124549340'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5849016128097481347/posts/default/4824349220124549340'/><link rel='alternate' type='text/html' href='http://kldivergence.blogspot.com/2010/04/other-day-someone-thanks-dick-posted-my.html' title='The Chinese birth calendar is total bunk'/><author><name>KL</name><uri>http://www.blogger.com/profile/02640977217346337188</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='23' height='32' src='http://4.bp.blogspot.com/_cJxplqpwZsg/Szz_pZBQ0gI/AAAAAAAABCM/Z-Y8aICWiBY/S220/bloggy.jpg'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5849016128097481347.post-7380041046259889103</id><published>2010-04-18T14:45:00.000-07:00</published><updated>2010-04-18T16:41:15.934-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='math-fun'/><category scheme='http://www.blogger.com/atom/ns#' term='death'/><category scheme='http://www.blogger.com/atom/ns#' term='celebrities'/><title type='text'>Do celebrities die in threes?</title><content type='html'>&lt;a onblur="try 
{parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_cJxplqpwZsg/S8uYleSxuPI/AAAAAAAABSU/Rr2mHgkp-Xo/s1600/RealAndSimulatedDeaths.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px; height: 400px;" src="http://4.bp.blogspot.com/_cJxplqpwZsg/S8uYleSxuPI/AAAAAAAABSU/Rr2mHgkp-Xo/s400/RealAndSimulatedDeaths.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5461626742671259890" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;div style="text-align: center;"&gt;&lt;span class="Apple-style-span"  style="color:#0000EE;"&gt;&lt;u&gt;&lt;br /&gt;&lt;/u&gt;&lt;/span&gt;&lt;/div&gt;&lt;br /&gt;&lt;div style="text-align: center;"&gt;&lt;span class="Apple-style-span"  style="color:#551A8B;"&gt;&lt;u&gt;&lt;br /&gt;&lt;/u&gt;&lt;/span&gt;&lt;/div&gt;&lt;div style="text-align: left;"&gt;Check out the purple line on the left in the plot above, which shows the arrangement throughout the year of the dates of death of some of the celebrities who died in 2009. It kind of looks like the deaths are bunched together. Then look to the right. Those are randomly generated death dates, which, because human brains like to see patterns, also look somewhat clustered.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;It seems like every time two celebrities die, there is speculation about who will be the third, as though two celebrity deaths necessarily mean a third is on its way. Although it would be nice to wait to post this until this old superstition gets dug up again when two celebrities die close together in time, it will certainly happen again and probably soon.&lt;br /&gt;&lt;div style="text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: left;"&gt;An ideal time to have posted this would have been in June of last year when Michael Jackson and Farrah Fawcett died on the same day, the 25th, and the internets were a-twitter with talk of this old wives' tale. 
Depending on how you count it, this supposed death troika was rounded out by Ed McMahon (the 23rd) or Billy Mays (the 28th). Although lots of other people have posted about the invalidity of this superstition, I have yet to see any plots depicting the statistical insignificance of this event. And, as I learned in 2nd grade when I failed to actually show that I had indeed mentally carried that 1, the policy is no work, no credit. So, here we go for full points, please...&lt;/div&gt;&lt;div style="text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: left;"&gt;In order to do any sort of testing, we have to define what it means to "die in threes." Seriously, what does that mean? It isn't enough that they die in clusters of any size, in which case, I would probably be talking about self-exciting processes... yes, I did just throw that in so I could say "self-exciting processes." No, the superstition is specifically that they die in threes.&lt;/div&gt;&lt;div&gt;&lt;div style="text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;What I propose as a definition of this is that for any three deaths to count as a triple, the time from the first death to the last death in this set must be less than or equal to the time elapsed from the last event prior to the triple, and it must also be less than or equal to the time until the next death succeeding the triple.  The three deaths have to be separated in time from the other deaths. &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;For example, let's consider the {Ed, Michael, Farrah} candidate triple, in which case the time from the first (Ed) to the last (Michael and Farrah.. I'm only counting this down to a resolution of one day) is two days. In order for this to count as a triple, no celebrities could have died within one day of either end of this triple-- there must not have been any celebrity deaths on the 22nd or the 26th. 
In order for the {Michael, Farrah, Billy} candidate triple to be a triple by this definition, no other celebrities would have died from the 23rd until the 30th.&lt;/div&gt;&lt;/div&gt;&lt;div style="text-align: center;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;One other piece that needs defining is who counts as a celebrity. I used &lt;a href="http://www.horroryearbook.com/community/2snapstv/2009_celeb_death_list-t3183.0.html"&gt;this&lt;/a&gt; website (and I really do apologize for the lovely anus ad at the top of that), so that I could not be accused of cherry-picking my list of celebrities. You could still make that claim because I removed a few people I did not consider celebrities: children and criminals, for example. I just didn't feel right including a child. And yes, I hand transcribed all of the celebrity deaths in 2009-- that is truly a labor of love.&lt;/div&gt;&lt;div style="text-align: center;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I calculated that there were 28 triples by my above definition in this data set of the 157 celebrity deaths of 2009. 
I then randomly generated 10,000 sets of 157 death dates, where the dates are randomly selected (with replacement, of course) over the course of the entire year, and I calculated the number of triples in each of these completely random data sets.&lt;/div&gt;&lt;div style="text-align: center;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div style="text-align: center;"&gt;&lt;img src="http://2.bp.blogspot.com/_cJxplqpwZsg/S8uOZpfeiAI/AAAAAAAABR0/gYlpz7HvnWE/s320/RandomHist157.png" style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 320px;" border="0" alt="" id="BLOGGER_PHOTO_ID_5461615544402610178" /&gt;&lt;/div&gt;&lt;div style="text-align: center;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;This histogram of the number of triples from each of the randomly generated death dates shows (1) a remarkably normal shape and (2) that 28 triples is a totally reasonable number to have seen if celebrities die at random days in the year. The number of triples last year (the pink line) falls in about the 70th percentile of what we would expect under completely random death dates-- far from anything anyone would consider statistical significance. &lt;/div&gt;&lt;div style="text-align: center;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;One might argue that many of the people on that list are not celebrities. I actually don't know who most of them are. So, I re-ran this simulation, using only the deaths I had heard of. (This should, sadly, sync up pretty nicely with the list of deaths reported on perezhilton, as that is one of my few sources of "news".) A similar histogram to that shown above appears below for the analogous simulation with 23 deaths. Again, nothing spectacularly exciting is going on. We would expect to see this number of clusters under complete randomness. 
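&lt;br /&gt;&lt;br /&gt;If you want to poke at this yourself, here is a rough Python sketch of the counting rule and the simulation described above. It is not the code behind the plots. The function names are invented, every window of three consecutive deaths is checked on its own (so triples may overlap), and a missing neighbor at the start or end of the year is treated as infinitely far away. Those are all assumptions, so the counts need not match the 28 or the percentile quoted above.&lt;pre&gt;
```python
import random


def count_triples(days):
    """Count windows of three consecutive deaths whose first-to-last span is
    no longer than the gap before the window or the gap after it."""
    t = sorted(days)
    count = 0
    for i in range(len(t) - 2):
        span = t[i + 2] - t[i]
        # Assumption: no earlier/later death means an infinite gap on that side.
        gap_before = t[i] - t[i - 1] if i > 0 else float("inf")
        gap_after = t[i + 3] - t[i + 2] if len(t) > i + 3 else float("inf")
        if gap_before >= span and gap_after >= span:
            count += 1
    return count


def simulate_triple_counts(n_deaths, n_sims, days_in_year=365, seed=0):
    """Triple counts for n_sims years of uniformly random death dates."""
    rng = random.Random(seed)
    return [
        count_triples([rng.randint(1, days_in_year) for _ in range(n_deaths)])
        for _ in range(n_sims)
    ]


def percentile_of(observed, sims):
    """Percent of simulated counts that are no larger than the observed one."""
    return 100 * sum(observed >= s for s in sims) / len(sims)


# Toy June example: a made-up death on the 20th, then Ed (23rd),
# Michael and Farrah (25th), and Billy (28th).
june = [20, 23, 25, 25, 28]
print(count_triples(june))  # 1: only {Ed, Michael, Farrah} qualifies

sims = simulate_triple_counts(157, 1000)
print(percentile_of(28, sims))
```
&lt;/pre&gt;In the toy example, {Michael, Farrah, Billy} fails because Ed's death on the 23rd falls inside the window that would have to stay empty.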
&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;img src="http://4.bp.blogspot.com/_cJxplqpwZsg/S8uN9n_E3JI/AAAAAAAABRs/NWmfqhjVsj0/s320/RandomHist23.png" style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 320px;" border="0" alt="" id="BLOGGER_PHOTO_ID_5461615062961937554" /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;So, there you have it. You can be the judge, but as far as I'm concerned, I'm convinced. There isn't significant evidence that celebrities die in threes. &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5849016128097481347-7380041046259889103?l=kldivergence.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kldivergence.blogspot.com/feeds/7380041046259889103/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kldivergence.blogspot.com/2010/04/do-celebrities-die-in-threes.html#comment-form' title='6 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5849016128097481347/posts/default/7380041046259889103'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5849016128097481347/posts/default/7380041046259889103'/><link rel='alternate' type='text/html' href='http://kldivergence.blogspot.com/2010/04/do-celebrities-die-in-threes.html' title='Do celebrities die in threes?'/><author><name>KL</name><uri>http://www.blogger.com/profile/02640977217346337188</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='23' height='32' src='http://4.bp.blogspot.com/_cJxplqpwZsg/Szz_pZBQ0gI/AAAAAAAABCM/Z-Y8aICWiBY/S220/bloggy.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' 
url='http://4.bp.blogspot.com/_cJxplqpwZsg/S8uYleSxuPI/AAAAAAAABSU/Rr2mHgkp-Xo/s72-c/RealAndSimulatedDeaths.png' height='72' width='72'/><thr:total>6</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5849016128097481347.post-409155374111071589</id><published>2010-04-13T17:25:00.000-07:00</published><updated>2010-04-13T21:44:30.337-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='math-fun'/><title type='text'>Someday I hope I can be this badass</title><content type='html'>While looking for an intuitive explanation of  why check loss is the appropriate loss function for quantile regression, I happened upon &lt;a href="http://www.informaworld.com/smpp/content~content=a914064341&amp;amp;db=all"&gt;this gem&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;For those of you without easy access to academic journals, I reproduce for you a few of the choicest excerpts.&lt;br /&gt;&lt;br /&gt;The abstract:&lt;br /&gt;&lt;br /&gt;&lt;div&gt;"This article discusses (1) our research to provide a framework for almost all of statistical methods for simple data, (2) need to plan the future of the “Science of Statistics” in order to compete for leadership in the practice of the “Statistics of Science”, (3) grand unifying ideas of the Science of Statistics, (4) an elegant rigorous proof when quantile function minimizes check loss function which is the basis of quantile regression, and (5) exact and approximate confidence quantiles (confidence interval endpoint functions) for parameters p and logodds(p) given a sample of a 0-1 variable."&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Another nugget of grandiosity:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;"This article reports progress in my ambitious 1,000 – chapter research program, whose goal is to provide a framework for statistical methods for simple data, and integrate:&lt;div&gt;(1) frequentist and Bayesian methods; (2) nonparametric and parametric methods; (3) continuous and discrete data 
analysis; (4) functional and algorithmic (numerical analysis based) data analysis."&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;And my favorite part, defining a function to be called "pain":&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;"We propose “pain” as a name of a penalty function whose minimization is equivalent to the minimization of an objective function." &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt; Sounds pretty painful to me! Unfortunately, he did not follow up by defining any variables as "ass", which would have been a truly bold move. &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;So, there you have it. This dude is such a badass the editors let him get on his soapbox for the first several pages about the future of statistics and some grand unifying goals. I can only hope that once I reach the point where I've proven myself enough that I don't have to give a crap what anybody thinks, I choose to exercise this right by publishing math papers with rambling prefaces and creative function names. Unfortunately, I suspect I'll instead be the crazy lady on the corner yelling shockingly foul insults at two generations from now's version of the hipster and farting at totally inappropriate times. 
&lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5849016128097481347-409155374111071589?l=kldivergence.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kldivergence.blogspot.com/feeds/409155374111071589/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kldivergence.blogspot.com/2010/04/someday-i-hope-i-can-be-this-badass.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5849016128097481347/posts/default/409155374111071589'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5849016128097481347/posts/default/409155374111071589'/><link rel='alternate' type='text/html' href='http://kldivergence.blogspot.com/2010/04/someday-i-hope-i-can-be-this-badass.html' title='Someday I hope I can be this badass'/><author><name>KL</name><uri>http://www.blogger.com/profile/02640977217346337188</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='23' height='32' src='http://4.bp.blogspot.com/_cJxplqpwZsg/Szz_pZBQ0gI/AAAAAAAABCM/Z-Y8aICWiBY/S220/bloggy.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5849016128097481347.post-5237576153205943462</id><published>2010-04-11T15:38:00.000-07:00</published><updated>2010-04-11T16:32:24.544-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Duke stats'/><category scheme='http://www.blogger.com/atom/ns#' term='losers'/><title type='text'>Seasonal Insults</title><content type='html'>&lt;div style="text-align: left;"&gt;Some people from &lt;a href="http://www.r-inla.org/"&gt; this project&lt;/a&gt; came to Duke this week, touting the greatness of their new method for approximate Bayesian inference and their sweet new R package. 
Though they didn't actually tell us &lt;i&gt;how &lt;/i&gt;it works (apparently it is based upon some sort of advanced computational wizardry), I was excited to try out this new toy. If you browse the website, a lot of the examples are either time series (time serial?) or spatial. Since I deal with spatial data all the effing time, I decided to find some time series data and give INLA a whirl.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;So, I present to you the schmuck dataset! &lt;/div&gt;&lt;div style="text-align: center;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;img src="http://1.bp.blogspot.com/_cJxplqpwZsg/S8JRbmRyXjI/AAAAAAAABQE/bIvnTR-f5EA/s400/schmuck.png" style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px; height: 179px;" border="0" alt="" id="BLOGGER_PHOTO_ID_5459015232899931698" /&gt;&lt;div&gt;This is the relative search volume of the word "schmuck" weekly from some time in 2004 until yesterday. You could get it yourself off of Google Trends (thanks, Google! ), or you could just get a version &lt;a href="http://stat.duke.edu/~kcl12/Bloggy%20Stuff.html"&gt;here,&lt;/a&gt; from which I've already removed all of the extraneous junk. &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Look at this beauty!! Something is definitely happening consistently during the 2nd week of December that makes people reaaaally want to search for the word schmuck.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Back to the statistics... using INLA, I tried fitting a latent AR(1) term to this even though that is clearly not the right model. No dice. I tried adding a seasonal component. Still no. It keeps shooting me an error message without much of an explanation. Something about a singular matrix. What matrix, I don't know. So, that's it in a nutshell. Although I was super pumped to take INLA for a test drive, this sort of knocked the wind out of my sails. 
This is not to say it doesn't work, just that I can't get it to work on my new favorite data set.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;So, I leave you with one more seasonal insult, losers. Below you will find the relative search volume for the term "loser." &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;img src="http://3.bp.blogspot.com/_cJxplqpwZsg/S8JXFTo1HDI/AAAAAAAABQM/o1mBOotkZF8/s400/loser.png" style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px; height: 179px;" border="0" alt="" id="BLOGGER_PHOTO_ID_5459021447008951346" /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;What is it about the cold months that makes people really interested in schmucks and losers? 10 units of pride to anyone who can give me a plausible explanation! &lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5849016128097481347-5237576153205943462?l=kldivergence.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kldivergence.blogspot.com/feeds/5237576153205943462/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kldivergence.blogspot.com/2010/04/seasonal-insults.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5849016128097481347/posts/default/5237576153205943462'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5849016128097481347/posts/default/5237576153205943462'/><link rel='alternate' type='text/html' href='http://kldivergence.blogspot.com/2010/04/seasonal-insults.html' title='Seasonal Insults'/><author><name>KL</name><uri>http://www.blogger.com/profile/02640977217346337188</uri><email>noreply@blogger.com</email><gd:image 
rel='http://schemas.google.com/g/2005#thumbnail' width='23' height='32' src='http://4.bp.blogspot.com/_cJxplqpwZsg/Szz_pZBQ0gI/AAAAAAAABCM/Z-Y8aICWiBY/S220/bloggy.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_cJxplqpwZsg/S8JRbmRyXjI/AAAAAAAABQE/bIvnTR-f5EA/s72-c/schmuck.png' height='72' width='72'/><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5849016128097481347.post-8773600037420074480</id><published>2010-04-07T23:06:00.000-07:00</published><updated>2010-04-08T13:48:59.214-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='math-fun'/><category scheme='http://www.blogger.com/atom/ns#' term='princesses'/><title type='text'>A princess story  part II</title><content type='html'>&lt;div style="text-align: center;"&gt;&lt;br /&gt;&lt;/div&gt;As it's been pointed out to me, girlfriend's got some issues. I'm making the poor princess a little bit too needy, jumping from one prince right on to the next. She needs time to really focus on herself, time to figure out what SHE likes, time to just enjoy being single.  She's no sitting around waiting for Prince Charming, delicate flower of a Snow White.... no, she's got the sass of Princess Jasmine.&lt;div&gt;&lt;br /&gt;In terms of modeling assumptions, the slight tweak required to accommodate our dear princess's independent spirit (or desire to never have to utter the words "I don't know who my babydaddy is."-- thanks, L, for that.) is the addition of  a component that determines the wait time between relationships. In the current incarnation of this simulation, the princes arrive back to back, and she is constantly collecting data. Exhausting!&lt;/div&gt;&lt;div&gt;&lt;br /&gt;Alternatively, let's consider a model in which there is a period of latency between princes. 
At the end of each prince's tenure, let's now suppose that the princess waits some exponentially distributed amount of time whose mean is independent of her mean relationship duration.&lt;br /&gt;&lt;br /&gt;How does this change the best strategy?&lt;/div&gt;&lt;div&gt;&lt;br /&gt;As expected, she should auto-reject fewer princes on average if she's going to wait a long time between them. Makes sense. Keep in mind that this is under the assumption that she is not collecting data on anyone during the waiting period.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Here's another plot of the best strategy for the number of princes to automatically ditch (instead of do or marry?) for my settings in this experiment, but under conditions where the princess isn't quite so needy. The x-axis shows the average number of days between relationships, and the hot pink region shows strategies that, this time, came within 90% of the best one.&lt;/div&gt;&lt;img src="http://2.bp.blogspot.com/_cJxplqpwZsg/S748kUHIktI/AAAAAAAABP8/mrP1-spu1lo/s400/LatencyPlot.png" style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px; height: 400px;" border="0" alt="" id="BLOGGER_PHOTO_ID_5457866392991208146" /&gt;&lt;div&gt;I didn't allow for simultaneous data collection on multiple princes (a negative wait time?), but that could be an interesting extension. If you stay tuned, I've got an even more interesting tweak in the works that I'm pretty sure is going to imply that prince_i is still training data. 
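&lt;br /&gt;&lt;br /&gt;To see why longer waits should push the auto-reject number down, here is a toy Python sketch of the latency idea. It is not the simulation behind the plot. Purely as illustrative assumptions, relationship lengths are exponential too (mean 180 days), the game lasts ten years, and the only thing counted is how many princes she gets through. Fewer princes in the pool means fewer she can afford to burn as training data.&lt;pre&gt;
```python
import random


def princes_met(horizon_days, mean_relationship, mean_wait, rng):
    """Count relationships started before the time horizon runs out."""
    elapsed, met = 0.0, 0
    while horizon_days > elapsed:
        met += 1
        elapsed += rng.expovariate(1 / mean_relationship)  # days with this prince
        if mean_wait > 0:
            elapsed += rng.expovariate(1 / mean_wait)  # single days before the next one
    return met


def average_princes(mean_wait, n_sims=2000, seed=0):
    """Average count over n_sims runs; horizon and relationship mean are hypothetical."""
    rng = random.Random(seed)
    total = sum(princes_met(3650, 180, mean_wait, rng) for _ in range(n_sims))
    return total / n_sims


for wait in (0, 30, 180, 365):
    print(wait, round(average_princes(wait), 1))
```
&lt;/pre&gt;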
&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;But first, Bayesian spatial quantile regression awaits...&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5849016128097481347-8773600037420074480?l=kldivergence.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kldivergence.blogspot.com/feeds/8773600037420074480/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kldivergence.blogspot.com/2010/04/princess-story-part-ii.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5849016128097481347/posts/default/8773600037420074480'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5849016128097481347/posts/default/8773600037420074480'/><link rel='alternate' type='text/html' href='http://kldivergence.blogspot.com/2010/04/princess-story-part-ii.html' title='A princess story  part II'/><author><name>KL</name><uri>http://www.blogger.com/profile/02640977217346337188</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='23' height='32' src='http://4.bp.blogspot.com/_cJxplqpwZsg/Szz_pZBQ0gI/AAAAAAAABCM/Z-Y8aICWiBY/S220/bloggy.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_cJxplqpwZsg/S748kUHIktI/AAAAAAAABP8/mrP1-spu1lo/s72-c/LatencyPlot.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5849016128097481347.post-4132214621101428245</id><published>2010-04-06T21:33:00.000-07:00</published><updated>2010-04-07T22:02:26.699-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='math-fun'/><category scheme='http://www.blogger.com/atom/ns#' term='princesses'/><title type='text'>A princess story</title><content type='html'>&lt;div 
style="text-align: center;"&gt;&lt;br /&gt;&lt;/div&gt;Once upon a time, a princess was sequentially presented with N suitors. She had the option to reject each one as he came, but she didn't know the quality of the successive suitors. Once a suitor was rejected, she could not change her mind and return to one she had previously scorned. Thank goodness these rules weren't in effect for Princess Jasmine, or else there would have been no happily ever after for poor Prince Ali.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Wikipedia, as usual, has a very thorough discussion of this problem (listed as the secretary problem, if you want to look it up). However, as I fancy myself a princess, and this has to do with me (of course), I will retain the princess language.&lt;br /&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;So, here's the deal. It would be extremely beneficial to both me and one man(space) friend to know if we have sampled the pool enough (hehe) to be out of the training data stage. So, we are going to use the princess game to decide what to do when he leaves for the summer and I leave for...ever. But, because we are both huge nerds who would both know the answer under the traditional assumptions (and what fun is that?!), let's make this a little more exciting...&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The princess game assumes the suitors arrive in a random order / are randomly selected / the king gets to pick who you date. I'm pretty sure that's not how it works-- if it had been up to my dad, my pool would have been a random selection of "nice Asian boys from Berkeley." OK, so let's move forward to a time/place when women get to pick their own boyfriends, and assume that we pick who we date based upon each candidate boyfriend's attributes and how important those attributes are to us at the time. (We can see the whole pool's attributes, but we only know how useful each person was to us after we date them.)&lt;br /&gt;&lt;br /&gt;But ahhh! 
In high school, back when this whole game started, Lance Bass was pretty much my ideal man. (Don't judge! My bff had already called JT.) We all know &lt;a href="http://www.people.com/people/article/0,,1219142,00.html"&gt;how well that would have worked out for me&lt;/a&gt; if I'd gotten my wish. Thank goodness I've gotten to update my understanding of which man-attributes I like since then. Turns out shy and effeminate isn't quite as appealing to me as I once thought... Strange.&lt;br /&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The original framing of the problem assumes that the princess is able to determine the actual quality of the suitors as they come.  I want to allow for learning over time about which qualities the princess finds appealing. She'll be picking her next boyfriend based upon her current beliefs about which qualities she likes. (For fellow nerds, she'll rate the remaining princes in the pool based on their posterior expected value to her given all of the princes she's already sampled.)&lt;br /&gt;&lt;br /&gt;Lastly, let's ground ourselves in reality.  As much as I'd like to keep playing until I find the perfect person, there is only finite time in which to play this game. 
While long-term relationships use up a lot of your allotted game time, they also allow you to learn a lot more about the attributes you appreciate in a person.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;div&gt;&lt;div&gt;More formally,&lt;/div&gt;&lt;img src="http://4.bp.blogspot.com/_cJxplqpwZsg/S7weFGa5xnI/AAAAAAAABP0/sBdW5sEoVn4/s400/Screen+shot+2010-04-07+at+1.52.46+AM.png" style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer; width: 400px; height: 96px;" alt="" id="BLOGGER_PHOTO_ID_5457269921437828722" border="0" /&gt;&lt;div style="text-align: center;"&gt;&lt;img src="http://2.bp.blogspot.com/_cJxplqpwZsg/S7wdQw8sDtI/AAAAAAAABPk/DvRpptzbK-U/s400/Screen+shot+2010-04-07+at+1.49.04+AM.png" style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer; width: 400px; height: 94px;" alt="" id="BLOGGER_PHOTO_ID_5457269022320758482" border="0" /&gt;&lt;/div&gt;&lt;img src="http://4.bp.blogspot.com/_cJxplqpwZsg/S7wdRfrg2wI/AAAAAAAABPs/UsvVG2tg4zc/s400/Screen+shot+2010-04-07+at+1.49.22+AM.png" style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer; width: 400px; height: 48px;" alt="" id="BLOGGER_PHOTO_ID_5457269034865187586" border="0" /&gt;&lt;div&gt;Soooooo, because dissertations don't write themselves, I'm just going to run a simulation rather than do what it seems like everyone else has done (yes! gah! I did a mini lit review..) and derive things.&lt;br /&gt;&lt;br /&gt;Feel free to skip to the bottom now; you won't hurt my feelings. But, for the brave...&lt;br /&gt;&lt;br /&gt;I will include parameters that dictate:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;The noise with which the princess observes her utility for a prince after knowing him for only one day.&lt;/li&gt;&lt;li&gt;The average duration of a relationship. 
(This will be modeled with the exponential distribution, though really I think a mixture distribution with a point mass at 1 day would be pretty appropriate for most of my friends. Not me, of course. I'm a lady.)&lt;/li&gt;&lt;li&gt;The number of attributes that go into a princess's weighting of how much she likes a guy. According to the partner in crime in this project, there should only be two attributes... jerk! jk&lt;/li&gt;&lt;li&gt;Your time limit for picking a mate. I'm setting this to 12 years.&lt;/li&gt;&lt;li&gt;How sure you are about your initial guess at the importance of different attributes, and how far away this is from the truth. &lt;/li&gt;&lt;li&gt;What percentile of awesome the guy you end up with has to be in to make you happy. If you get someone who is top 10 out of 100, is that good enough? (I'm setting this to be top 5%. No soulmates here!! Why 5%? Ask whoever made it the magic number in hypothesis testing, I don't know.)&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;&lt;img src="http://4.bp.blogspot.com/_cJxplqpwZsg/S7wMDzOWdeI/AAAAAAAABO8/UE945sVjSdc/s320/Length" style="margin: 0px 0px 10px 10px; text-align: center; float: right; cursor: pointer; width: 320px; height: 320px;" alt="" id="BLOGGER_PHOTO_ID_5457250107895739874" border="0" /&gt;But, before we get to my decision re: the manfriend, let's look at how one's strategy should change over different values of one of the parameters just to get some intuition about how this simulation is working...&lt;div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;On the right there you see the best strategy for the number of boyfriends you sample and automatically reject before starting to try to find the best one. The blue lines are strategies that came within 80% of the best one. This ranges over the average duration (in days) of the relationship down on the x-axis. 
The moral of the story: people with lots of short relationships should wait longer to start looking seriously than those with a few long-term ones. However, the difference between the best strategies isn't that big (15ish versus 5ish). That being said, ignore the actual numbers... This all depends on the other parameters of the simulation, which I haven't told you for this example.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;img src="http://3.bp.blogspot.com/_cJxplqpwZsg/S7wOgs5yoYI/AAAAAAAABPE/93D8NMlFebM/s320/FinalHist" style="margin: 0pt 10px 10px 0pt; float: left; cursor: pointer; width: 320px; height: 320px;" alt="" id="BLOGGER_PHOTO_ID_5457252803438354818" border="0" /&gt;&lt;div&gt;And now, ta-daaaa!! The results! To get my final sample of best strategies, I averaged over my beliefs about what all of the parameters of this simulation would be for me. I'll post this code to &lt;a href="http://www.stat.duke.edu/%7Ekcl12/Kristian%20Lum.html"&gt;my website&lt;/a&gt;, so if you're interested in how I did this averaging, you can look for yourself. The histogram here shows the optimal strategy over 1000 simulations. It looks like my best bet on average is to wait about 10 princes and then take the best one that comes after that. &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Uh oh... there have already been ten... better start running, you know who you are... ;)&lt;/div&gt;&lt;div&gt;&lt;div style="text-align: center;"&gt; &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Disclaimer: this doesn't take into account the fact that the good ones might get taken. And it assumes that you can't go back to an old one. 
Neither of those is quite right, but this was about as much work as I am willing to put in on a procrastination project!&lt;/div&gt;&lt;div style="text-align: center;"&gt;&lt;br /&gt;&lt;br /&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5849016128097481347-4132214621101428245?l=kldivergence.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kldivergence.blogspot.com/feeds/4132214621101428245/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kldivergence.blogspot.com/2010/04/princess-story.html#comment-form' title='9 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5849016128097481347/posts/default/4132214621101428245'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5849016128097481347/posts/default/4132214621101428245'/><link rel='alternate' type='text/html' href='http://kldivergence.blogspot.com/2010/04/princess-story.html' title='A princess story'/><author><name>KL</name><uri>http://www.blogger.com/profile/02640977217346337188</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='23' height='32' 
src='http://4.bp.blogspot.com/_cJxplqpwZsg/Szz_pZBQ0gI/AAAAAAAABCM/Z-Y8aICWiBY/S220/bloggy.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_cJxplqpwZsg/S7weFGa5xnI/AAAAAAAABP0/sBdW5sEoVn4/s72-c/Screen+shot+2010-04-07+at+1.52.46+AM.png' height='72' width='72'/><thr:total>9</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5849016128097481347.post-2587594505975359746</id><published>2009-12-31T10:26:00.000-08:00</published><updated>2009-12-31T11:02:42.164-08:00</updated><title type='text'>Inaugural Blog</title><content type='html'>Hello, world! &lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;OK, glad I got that out of the way-- getting those first few words down is always the hardest part, or so they say. This is the second time this week that I have confirmed that little dictum, having just begun writing my dissertation.  (Obviously the impetus for &lt;i&gt;finally&lt;/i&gt; beginning a blog--one more procrastination tool in the wings! Or what I'm hoping, one more place to "organize my thoughts"--read, take a break doing something slightly more productive than reading celebrity gossip-- while my poor little computer whirs away on my oh-so-slow R code, and there's not much I can do besides... read journal articles, get some exercise, actually &lt;i&gt;write&lt;/i&gt; this damn thing.... oh, wait). &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;More to the point, I'm hoping that this will be a place for me to document my figuring out what I want to do when I grow up (wishful thinking that this will actually happen?), share some of my adventures (in human rights statistics and otherwise), and give anyone who's interested a peek into my over-analytical, data/statistics-obsessed brain. 
&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5849016128097481347-2587594505975359746?l=kldivergence.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://kldivergence.blogspot.com/feeds/2587594505975359746/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://kldivergence.blogspot.com/2009/12/inaugural-blog.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5849016128097481347/posts/default/2587594505975359746'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5849016128097481347/posts/default/2587594505975359746'/><link rel='alternate' type='text/html' href='http://kldivergence.blogspot.com/2009/12/inaugural-blog.html' title='Inaugural Blog'/><author><name>KL</name><uri>http://www.blogger.com/profile/02640977217346337188</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='23' height='32' src='http://4.bp.blogspot.com/_cJxplqpwZsg/Szz_pZBQ0gI/AAAAAAAABCM/Z-Y8aICWiBY/S220/bloggy.jpg'/></author><thr:total>0</thr:total></entry></feed>
