Does the Trump Brand Carry a Premium?

Trump Brand Index

In keeping with tradition, here’s the headline. Then we’re going to explain it before you get to the good stuff. Deal with it.

The Trump Organization derives revenue from licensing the Trump name to place on buildings. Based on analyzing more than 33,000 transactions, the brand results in the properties selling for 18.2% less when compared to similar properties in the same ZIP code. The downward trend has been underway since 2014, prior to Donald Trump’s election.

That’s more of a mouthful than normal. And if you want to cheat, you can skip to the new Factba.se Trump Brand Index, which will track this constantly based on real estate transactions.

And it’s late. So let’s get explainin’…

It’s Been a While Since You Blogged. What Gives?

Building a business gets in the way of doing deep research. The whole “need to eat and send kids to college” thing. It’s Sunday at about 2 a.m. while I’m writing this. And get off my lawn.

This Has Nothing to Do with The Above.

Right, back on target.

We’ve read some articles that reference the perceived change in the value of the Trump brand. There was a piece in The Washington Post earlier this month discussing merchandising. The Wall Street Journal has been taking a look. Even Realtor.com wrote about it. The deepest consistent dive we’ve seen is from CityRealty, which has been trending condos compared to Trump-branded condos in the New York area. If you haven’t seen their Trump Report, it’s well thought out, and they clearly have a better production budget than we do.

But what we’ve seen, with CityRealty as the exception, are qualitative stories. That is, there hasn’t been a comprehensive look, across all Trump-branded properties, to see if he is, in fact, profiting from the Presidency.

This is important for two reasons:

  • Trump has not placed his assets in a blind trust. He still maintains control of his properties and his companies, even if he has elected not to exercise that control, vs. a blind trust, where he could not exercise that control even if he wanted to. This can lead to issues affecting diplomacy, among other challenges.
  • The Trump Organization has built a significant, but non-quantifiable, portion of its business in licensing the Trump name. Most new construction involving the Trump brand in the past decade is via licensing the Trump name and/or handling management.

Therefore, the value of the Trump brand isn’t something abstract. The brand is the product. The Trump Organization, which is currently ranked 40th on Crain’s list of the largest private companies in New York, generates revenue from licensing the brand in the hospitality and real estate verticals. The expectation is the brand brings a premium.

However, measuring that performance, and its relative value within a privately held company, is difficult, even with the public financial disclosures he must file. These disclosures are not a P&L. They are ranges of values and are not required to indicate a profit or a loss. And there’s no public record of membership in Trump clubs, nor occupancy in Trump-branded hotels.

Great, So It’s Impossible…

Not quite. But pretty damn hard. But at least it was a starting point.

What we can measure is the resale value of units in Trump-branded buildings. This information is fragmented but public. We can also measure the performance of sales of condominiums in the same ZIP code, and further measure the performance of similar condos, that is, condos in the same price band, or quartile, as Trump-branded condos.
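
Here is a minimal sketch of the price-band idea above, assuming each sale is reduced to a single price and the bands are cut at the ZIP code's 25th, 50th and 75th percentiles. The function names and sample prices are hypothetical, not pulled from our dataset, and the choice of the "inclusive" percentile method is an assumption.

```python
from statistics import mean, quantiles

def quartile_cutoffs(prices):
    """Return the 25th, 50th and 75th percentile cut points for one ZIP code."""
    return quantiles(prices, n=4, method="inclusive")

def quartile_of(price, cutoffs):
    """Return 1-4: the price band a given price falls into."""
    return 1 + sum(price > c for c in cutoffs)

def building_quartile(building_prices, zip_prices):
    """Place a building by the band its average sale price lands in."""
    return quartile_of(mean(building_prices), quartile_cutoffs(zip_prices))

# Hypothetical sale prices for every condo sold in one ZIP code.
zip_prices = [200_000, 350_000, 500_000, 650_000,
              800_000, 1_200_000, 2_000_000, 3_500_000]
cutoffs = quartile_cutoffs(zip_prices)

print(building_quartile([850_000, 950_000], zip_prices))
```

A building that lands in band 3 would then be compared only against other band-3 sales in that ZIP code.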

So we had a direction, but we didn’t quite have an idea of what we were in for.

First, we had to figure out what was in and what was out. Trump hotels and clubs were clearly out of the evaluation. Trump Bay Street in Jersey City is a rental complex and was similarly removed. The handful of villas at Trump National in California were too small a sample, as were transactions at Trump Park Residences in Westchester.

We made a decision to similarly exclude four buildings on Riverside Drive in Manhattan (120, 200, 220 and 240 respectively) that are currently involved in a lawsuit to remove the Trump brand. It would be difficult to gauge the impact of the Trump brand when it may or may not be removed.

That left 20 buildings branded Trump in New York (9), Florida (7), New Jersey (1), Illinois (1), Hawaii (1) and Connecticut (1) with a total of 5,483 condo units between them all. This range spans 9 different counties and 14 different ZIP codes. So we had our sample. The rest is cake, right?

An Open Data Rant

So, real estate transactions are public records. They should be easy to get at. And I should be sleeping. Enough fantasy.

Fat chance. Eight of the buildings were easy, thanks to the Open Data Connecticut site and, in particular, the completely and thoroughly awesome New York City Department of Finance, which has everything up to date in handy spreadsheets with easy historical data. And, small (but large) miracles: standardized data fields and columns across data back 15 years. If you see Dr. Jacques Jiha, give him a smooch and tell him he’s our hero.

Broward and Palm Beach County had antiquated but eminently crawlable sites. New York State clearly didn’t read the Dr. Jiha playbook and was a train wreck in Westchester, so thank you to Gannett for pulling it together. We’re still not sure why Monmouth County hosts Hudson County’s records for Jersey City, but it’s Jersey. We’ll call it even. Chicago had some private sites to piece together. Honolulu was pure Zillow (and thank you to Zillow for making it easy to fill in recent sales. We love you and your awesome data).

But we’re taking a whole paragraph out to yell at Miami-Dade. They have this nice comparison tool. But they went out of their way to make it next to impossible to pull down any data. Literally anti-bot tech the likes of which Google would be proud of. Their Microsoft server pages were tinfoil-hat-level paranoid. And they limit the data to 2015.

Folks: it’s public data. Call Dr. Jiha and learn a few things. Run a dump of your Microsoft database. You can even script it.

From there, we had to fill in square footage data for about 50,000 units. We won’t bore you with that slog, though it was a combination of our awesome researchers, and teaching Margaret, our AI, how to slog through Google results to fish it out. (Hint: Google address + # + unit + city + sq ft. Pull first results page. Teach bot to find square footage and read addresses. Two solid matches on reserved list of real estate sites = victory. Repeat tens of thousands of times).
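
The “two solid matches” rule from the hint above can be sketched roughly like this. It assumes the search and page parsing have already produced (site, square footage) pairs for one unit. The allowlist entries are placeholders, and the function name is hypothetical; this is not Margaret’s actual code.

```python
from collections import defaultdict

# Hypothetical allowlist standing in for the reserved list of real estate sites.
TRUSTED_SITES = {"site-a.example", "site-b.example", "site-c.example"}

def confirmed_square_footage(candidates, min_matches=2):
    """candidates: iterable of (site, square_feet) pairs found for one unit.

    Returns the square footage reported by at least `min_matches` distinct
    trusted sites, or None if nothing (or more than one value) qualifies.
    """
    sites_by_value = defaultdict(set)
    for site, sqft in candidates:
        if site in TRUSTED_SITES:
            sites_by_value[sqft].add(site)
    confirmed = [value for value, sites in sites_by_value.items()
                 if len(sites) >= min_matches]
    return confirmed[0] if len(confirmed) == 1 else None

found = [("site-a.example", 1200), ("site-b.example", 1200), ("other.example", 900)]
print(confirmed_square_footage(found))
```

Returning None on conflicting values is a design choice in this sketch: an ambiguous unit is better handed to a human researcher than guessed.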

The square footage is important. You can’t really compare sales of different sizes, high floor vs. low floor. But you can use a common metric used in the real estate industry: price per square foot. It smooths out these swings between $200,000 and $2,000,000 condos by breaking down the value into comparable figures.
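
As a quick illustration of why that works, here is a tiny sketch. The two sale prices echo the range mentioned above, but the square footages are made up for the example.

```python
def price_per_sqft(sale_price, square_feet):
    """Normalize a sale so units of different sizes can be compared."""
    if square_feet <= 0:
        raise ValueError("square_feet must be positive")
    return sale_price / square_feet

# Hypothetical units: very different prices, same price per square foot.
small_condo = price_per_sqft(200_000, 400)
large_condo = price_per_sqft(2_000_000, 4_000)
print(small_condo, large_condo)
```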

Almost There. Promise.

We invested probably 200 person-hours in this. Bear with us a bit longer.

So we had the data. We had to filter out bad data, non arm’s-length transactions (zero or below market sales, usually to family members), foreclosures, and obvious outliers, like a 600,000 square foot apartment in Midtown, or miracle 1,000 sq ft apartments in Murray Hill selling for $150,000. I don’t think so.
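
A rough sketch of that kind of filter is below. The thresholds (`min_ppsf`, `max_square_feet`) and the `Sale` record are illustrative assumptions, not the cutoffs we actually used. A real below-market test would be relative to the ZIP code rather than a fixed number.

```python
from dataclasses import dataclass

@dataclass
class Sale:
    price: float            # recorded sale price in dollars
    square_feet: float      # unit size
    foreclosure: bool = False

def is_usable(sale, min_ppsf=200.0, max_square_feet=20_000):
    """Return True if the sale looks like a normal arm's-length transaction."""
    if sale.foreclosure:
        return False
    if sale.price <= 0 or sale.square_feet <= 0:
        return False        # zero-dollar transfers or missing data
    if sale.square_feet > max_square_feet:
        return False        # e.g. the "600,000 square foot apartment"
    return sale.price / sale.square_feet >= min_ppsf

sales = [
    Sale(1_500_000, 1_000),                 # ordinary sale, kept
    Sale(0, 900),                           # zero-dollar transfer
    Sale(150_000, 1_000),                   # implausibly cheap
    Sale(2_000_000, 600_000),               # impossible size
    Sale(900_000, 800, foreclosure=True),   # foreclosure
]
clean = [s for s in sales if is_usable(s)]
print(len(clean))
```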

That ended up with a bit more than 33,000 real estate transactions since 2010.

From there, we sliced four ways:

[Figure: Quartiles. © MathPlanet]
  • Data Sample. We compared sales of condos in the same ZIP code to the sale of Trump-branded condos. We also ran the same comparison by quartile.
    • To do this, we established which quartile each Trump building was in. In Jersey City and Honolulu, it landed in the 3rd quartile, meaning on average, sales of units fell in the 50-75% range. So these are close to the upper tier, but not in the upper tier based on sale price in that ZIP code.
    • Similarly, Trump Parc and Trump Parc East on Central Park and Trump Palace on East 69th Street landed in the 50-75% range. That may sound wrong, but remember Central Park South shares a ZIP with Billionaire’s Row, where Michael Dell paid $9,200 per square foot recently. Apparently Central Park South is a step below that.
      Billionaires. They’re just like us. Except they’re not.
    • The remainder were all in the 4th quartile… at the top of the chart.
  • Time. We ran data by month for the full sample and by quartile. There was not enough data to accurately subslice geography by month. We sliced New York State and City by quarter. Every other geo we sliced by year.
  • Geography. See above.
  • Index. This is what we’ll use going forward. The index selects a start point (January 2010), where the ratio of the average price per square foot of Trump-branded condos to that of non-Trump condos is established. That starting ratio sets the index at 100, similar to how Zillow, or the U.S. Government, indexes markets.
    From then on, the ratio is tracked against the start point. Numbers above 100 are positive. Below 100, negative.
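
One way to read the index description above, as a minimal sketch: take the ratio of Trump to non-Trump price per square foot in each period and rescale it so the start point equals 100. The function name and the sample figures are hypothetical, and the real index may differ in how the averages are built.

```python
def brand_index(trump_ppsf, market_ppsf, base_period):
    """Index the Trump/market price-per-square-foot ratio to 100 at base_period.

    trump_ppsf, market_ppsf: dicts mapping a period label to the average
    price per square foot for Trump-branded and non-Trump condos.
    """
    base_ratio = trump_ppsf[base_period] / market_ppsf[base_period]
    return {
        period: 100 * (trump_ppsf[period] / market_ppsf[period]) / base_ratio
        for period in trump_ppsf
        if period in market_ppsf
    }

# Hypothetical averages, in dollars per square foot.
trump = {"2010-01": 1200.0, "2014-01": 1260.0, "2018-01": 1080.0}
market = {"2010-01": 1000.0, "2014-01": 1050.0, "2018-01": 1200.0}

index = brand_index(trump, market, "2010-01")
for period, value in index.items():
    print(period, round(value, 1))
```

In this made-up example, the brand keeps pace with the market through 2014 (index stays at 100) and then falls behind, even though Trump prices rose in dollar terms for part of that stretch. That is the point of indexing against the market rather than looking at raw prices.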

We also cross-checked our percentile bands, means and medians against median sale prices and trends provided by Redfin, Streeteasy, and Zillow, and further checked our trendlines against Zillow’s ZHVI index for condos. The lines all checked out. Please, double check our math below, just in case (see downloads).

Enough Already. Get To it.

Okay, okay. You got the headline, so let’s dive in.

First, high-end condo sales are cool across the country. That means, in general, expensive units are taking longer to sell, or selling for less.

The index, however, takes that into account. The question is: are Trump-branded units moving higher, lower, or with the market?

So far, Florida is the shining spot for the Trump brand. On a year-over-year basis, Trump buildings are holding their own against the market. The Trump brand was selling for a 19.8% discount in Florida in 2010. As of 2018, it sells for a 2.1% premium. Note in Florida, we don’t have data for the 1,446 units in Sunny Isles Beach (right near where the Rascal House used to be) until 2015, so that may skew the data. Trump Hollywood and Trump Plaza of the Palm Beaches are definitely trending lower than their respective ZIP codes and quartiles, whereas Sunny Isles Beach is not.

Writer’s note: Sunny Isles Beach will always be North Miami Beach to me (#winstontowers, shuffleboard, the smell of onions frying, רעדן אויף ייִדיש).

Other properties individually show an upward trend or are stable against the index. Trump Palace and Trump Parc East in New York City and Trump Plaza Residences have all increased. Trump Waikiki, on low volume, is up significantly. And Trump Plaza in New Rochelle and Trump World Tower appear stable.

In New York City, the Trump brand is trending below the market at a faster pace. In 2010, the Trump brand carried a 4.5% discount on comparable units. In the first quarter of 2018, it carries a 16.6% discount when compared to similar units. This is particularly acute in Trump Tower, where the brand appears to result in a 39.8% discount against its cohort in 2017. There have been no sales recorded in 2018 by New York City to measure against.

The Trump Brand

Overall, the Trump brand is trending down in the United States, as measured by 1,845 transactions in Trump-branded properties since 2010, compared against 31,618 transactions overall in the same ZIP codes. Rather than commanding a premium, paying to license the Trump name for real estate, as of Q1 2018, results in an 18.2% discount when compared to properties in its quartile.

If we compare against all transactions, including lower-end units not associated with the Trump brand, it has gone from commanding a 59.4% premium at the beginning of 2010 to a 17.2% premium today.

What is clear from the data is this trend pre-dates by more than a year Trump’s announcement to run for President on June 16, 2015. The trend has continued, and in New York City, in particular, accelerated, since his election. No matter if we sliced the data by quartile or with all ZIPs, or sliced by geography, the same downward trend appears at least a year prior to his candidacy.

It’s important to note what the above means. The downward trend in the Trump brand began well before his entry into presidential politics in the 2016 cycle. There is not an inflection point in his candidacy and rise to the Oval Office. We can’t say if his election helped, hurt, slowed the decline or accelerated it, other than indicators in New York. The brand was in decline before he began his run.

We wanted to pursue this further outside the United States, but that data is difficult to come by in Panama, the Philippines, India, and even Canada.

To answer the initial question of this thought exercise: the Trump brand is not being served by the Trump presidency in the United States, in the sense that it has not resulted in an uptick in the value of Trump-branded units. There is no sign, outside of Sunny Isles Beach properties, that his being president is increasing the value of the Trump name. The opposite, in fact, is proving to be true. But the opposite predates his run.

One Interesting Note

A researcher pointed out this particular detail. The Associated Press lists Trump as winning 2,626 of 3,141 counties in the 2016 election. The Trump brand is on 20 condominium properties in nine counties in six states.

Though Trump won 83.6% of the counties in the 2016 election, not a single county with a Trump-branded condominium property voted for him. The closest was in Palm Beach County, which he lost 58-41.

                     Trump                 Clinton
County       State   Votes      %          Votes        %
Fairfield    CT      160,077    38.00%     243,852      57.89%
Broward      FL      260,951    31.37%     553,320      66.51%
Miami-Dade   FL      333,999    34.00%     624,146      65.80%
Palm Beach   FL      272,402    41.13%     374,673      57.82%
Honolulu     HI      90,326     31.61%     175,696      61.48%
Cook         IL      440,213    21.40%     1,528,582    71.40%
Hudson       NJ      49,043     21.90%     163,917      73.20%
New York     NY      64,930     9.71%      579,013      86.56%
Westchester  NY      131,238    31.20%     272,926      64.88%

It has nothing to do with anything, but was an interesting datapoint. His brand is marketed and sold, as far as condos, entirely in counties that did not vote for him.

One Last Point

As we always note, we could be wrong. We’re open to different analyses and opinions, but we expect data in the response. The world has enough opinions, so we try to stick to… facts. Facts that can be independently verified and proven.

To that end, CityRealty’s analysis has had some strong responses from The Trump Organization. In particular, Eric Trump responded in The New York Times last month as follows, direct from The Times:

“Data can be manipulated to tell any story you want,” said Eric Trump, the president’s son and an executive vice president of the organization. “The fact remains, our buildings sell for the highest prices per square foot of any properties in the world,” he said. “That is undeniable.”

So, we’d like to say the following:

One: Our data supports CityRealty’s analysis and expands that analysis out to properties in Westchester, Jersey City, Stamford, Chicago, Florida, and Hawaii. Other than previously noted movement in Florida, the trend is down and has been since 2014.

Two: This.

Three: So we’re all on the same page, we’re open-sourcing all the data used for this analysis, and encourage others (We’re looking at you, Miami-Dade!) to do the same. You’ll find download links on the Factba.se Trump Brand Index page, and at the bottom of this page.

That’s all the raw data, line-by-line, every transaction, including data we excluded for incomplete/incorrect data (in case someone wants to tackle that!). So, that’s 48,000+ line items of data, plus all the resulting output we’re using in this post and in our Brand Index.

As a side note, each line item includes a link to the source of the data. So please, don’t take our word for it. Trust, but verify.

Before making a claim, please share the data. Use ours, or provide your own to be verified.

That’s all we got folks. Nothing to see here.

Research for this analysis was done by Cheryl Ley, Cori Lovas, Maite Dizon and Paula Boyland. A thank you to Chad Smolinski and Matt Koll for their guidance on statistics. Note: only I am to blame for the ridiculous writing.

“Stable Genius” – Let’s Go to the Data

So, as always. First the headline, then you need to eat your vegetables to get the details.

The headline:

By any metric to measure vocabulary, using more than a half dozen tests with different methodologies, Donald Trump has the most basic, most simplistically constructed, least diverse vocabulary of any President in the last 90 years. This is by a statistically significant margin in each case.

Okay, the headline’s out of the way. On to the vegetables, so you understand why we checked this, and the methodology.

(And with our apologies for the simplistic charts. The Google Sheets plug-in is quick and dirty… but the data’s all there for you at the bottom.)

[Chart: Presidential Vocabulary Grade Level]

Why Are You Blogging on a Sunday Night?

Well, the Golden Globes are on. Also…

I usually try to unplug over the weekend. And by unplug, I mean “catch up on everything I was supposed to do during the week but didn’t because who the hell can get work done during office hours.” You know, by relaxing and stuff.

So the emails that started coming in Saturday morning around 8 a.m. kind of interfered with that plan. I ignored them for all of 20 seconds before seeing what the heck was going on. In general, when something is going on, the emails tend to clump together. The phone wasn’t going to stop vibrating by force of will alone.

Turned out, it was a number of folks asking if I’d seen the “genius” tweet, and if Factba.se had ever run an intelligence test.

Now, when someone emails me at an ungodly hour (and prior to 11 am on a weekend more than qualifies, given my normal bedtime is defined as “Thursday”) to ask about a tweet, I put the darker thoughts out of my mind and do my best not to get upset.

But I was awake. May as well spoil it. The tweet in question (a three-parter, which is more unusual of late since the character limit was upped):

[Three embedded tweets]

…spanning 11 minutes. (Sorry about that last one… one of my favorite Road Runners).

The quote that seemed to stick out in everyone’s mind was the last one: “I think that would qualify as not smart, but genius….and a very stable genius at that!”

Okay, I was awake.

Apparently, the intellectual exercise would be to parse the word “genius” and see whether it could be proven or disproven.

Into the Den of Snopes

Measuring intelligence is normally done through a simple method with no agreed-upon standard: an IQ test, a loosely-defined standardized test, variations of which have been in use for more than a century. The most common one in modern use is the Wechsler Adult Intelligence Scale (WAIS) v4, in use since 2008.

However, there is no peer-reviewed method to look at writing / speeches / etc to assess intelligence. The closest is a 2006 study, which used a historiometric method.

Suffice it to say, that method is fine, but it takes a doctorate and an expert. We don’t have presidential scholars at Factba.se. We’re a bunch of data schmoos. Also, this particular study was ripped off and faked enough in the past 15 years that it has multiple Snopes pages (here, here, and here) and it rates its own Wikipedia page. Again, the study is fine. Making stuff up around it isn’t.

Supercalifragilisticvocabularydocious

However, the ability to measure the complexity of vocabulary, the diversity and its comprehension level is something we do all the time here in the Fact Cave, courtesy of Margaret, our platform’s AI. In fact, it’s done every time we add a word into the platform, automagically. The most common metric, the Flesch-Kincaid Grade Level, was actually developed for the military in the 1970s as a way to check that training materials were appropriate and could be understood by its personnel. It is used as a measurement in legislation to ensure documents such as insurance policies can be understood.

There are a number of competing algorithms. They use different approaches, but all try to do one of two things:

  • Grade Level. Establish the grade level at which the text could be understood
  • Reading Ease. Essentially the same thing, but with a normalized statistical score vs. a U.S.-centric grade level.
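
For the curious, here is a minimal sketch of the two Flesch measures. The constants are the standard published ones for the Flesch-Kincaid Grade Level and Flesch Reading Ease. The sentence splitter and the vowel-group syllable counter are crude assumptions for illustration and are not how Margaret does it, so scores will drift from those of production tools.

```python
import re

def count_syllables(word):
    """Rough heuristic: count runs of consecutive vowels (minimum of one)."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def text_stats(text):
    """Return (sentence count, word count, syllable count) for a text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return len(sentences), len(words), syllables

def flesch_kincaid_grade(text):
    """U.S. grade level at which the text could be understood."""
    sentences, words, syllables = text_stats(text)
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59

def flesch_reading_ease(text):
    """Normalized score: higher means easier to read."""
    sentences, words, syllables = text_stats(text)
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)

simple = "The cat sat. The dog ran."
dense = "Constitutional interpretation necessitates deliberation."
print(flesch_kincaid_grade(simple), flesch_kincaid_grade(dense))
```

Both formulas use the same two inputs, words per sentence and syllables per word. Short sentences built from short words push the grade level down and the reading ease up.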

At Factba.se, Margaret runs every single bit of text automatically through the following algorithms:

… and about a dozen others, including difficult word count, etc. We’re also testing the Lexile Framework.

As a side benefit, recreationally, we built a database of interviews, speeches and press conferences for previous presidents, leaning heavily on what’s available publicly from presidential libraries, and the wonderful collections at the University of California, Santa Barbara’s American Presidency Project. One of the reasons we did this is to provide a point of contrast. Looking at a single datapoint can tell you everything and nothing. A nice cohort comparison… that’s better.

Importantly, as we’ve blogged earlier, we like to focus on a person’s own words if possible, not speechwriters. The UCSB archive in particular gave us a rich trove of Presidential press conferences back to Herbert Hoover in 1929. So we could look at just what a president said. Unscripted (or as close an approximation as is possible for a president).

Okay. We had the algorithms. We had the text. On to…

Methodology

As mentioned previously, we narrowed our samples from Hoover forward to just press conferences, presidential debates and interviews. Of course, within those, we only use words spoken by the President, nothing else.

This left us with a deep sample for each, but spread out. We ran the analysis two ways:

  • Complete. Whatever we have, we have. On the low end, it’s 44,705 words for Gerald Ford, up to 1,124,164 words for Bill Clinton. Trump clocked in second at 915,801 words.
  • Equal Sample. We then ran the same test on 30,000 words, plus or minus 1% (actual range was 30,003 – 30,253 words), where we looked only within the person’s presidency (no pre-election debates) and started from Inauguration Day forward, adding sentences until we hit 30,000, then stopped and analyzed those.
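
The equal-sample step can be sketched as follows, assuming the president’s sentences are already in chronological order starting at Inauguration Day. The function name is hypothetical and the whitespace word count is a simplification.

```python
def equal_sample(sentences, target_words=30_000):
    """Add whole sentences in order until the word count reaches the target.

    Because sentences are never cut in half, the total can overshoot the
    target slightly, which is why the actual samples ran a bit over 30,000.
    """
    sample, total = [], 0
    for sentence in sentences:
        if total >= target_words:
            break
        sample.append(sentence)
        total += len(sentence.split())
    return sample, total

# Tiny stand-in corpus with a tiny target, just to show the behavior.
corpus = ["one two three", "four five", "six seven eight nine", "ten"]
sample, total = equal_sample(corpus, target_words=6)
print(len(sample), total)
```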

In addition, we’ve been testing the Lexile framework. It’s a free test so we’re limited to 1,000 words. But we took the first 1,000 words (in full sentence format) from the equal sample and tested those.

It’s important to note: for the two presidents where social media existed, this was not included. This was strictly utilizing the responses given by a president in an interview, during a press conference, or in a political debate.

The Result

It statistically made no difference which way we analyzed it, or which method. It affected some scores and some of the ranks, but not the position of Donald Trump on that list. In each case, he ranked last of the past 15 presidents.

By every metric and methodology tested, Donald Trump’s vocabulary and grammatical structure are significantly simpler, and less diverse, than those of any President since Herbert Hoover, when measuring “off-script” words, that is, words far less likely to have been written in advance for the speaker.

Significant is not editorializing. The gap between Trump and the next closest president (in most indices, Harry Truman, known historically for a folksy, simple pattern of speech), is larger than any other gap using Flesch-Kincaid. Statistically speaking, there is a significant gap.
[Chart: Presidential Vocabulary Grade Level]

This gap appears both when using the complete corpus available to us for all presidents, and the more limited 30,000 word set to use an equal data set for each. In either data set, Donald Trump consistently clocks in at the bottom of the list. Depending on the scale used, it’s between a 3rd and 7th grade reading level.

Using the same scale used by the Department of Defense (Flesch-Kincaid), the grade level on the equal sample is 4.6. That’s between a fourth and fifth grade level.

The next closest is Truman at 5.9, followed by Bush 41 at 6.7. The top three: Herbert Hoover (11.3), Jimmy Carter (10.7) and Barack Obama (9.7).
[Chart: Presidential Vocabulary Word Complexity]

In terms of word diversity and structure, Trump averages 1.33 syllables per word, while all others average 1.42 – 1.57 syllables. In terms of variety of vocabulary, in the 30,000-word sample, Trump was at the bottom, with 2,605 unique words in that sample while all others averaged 3,068 – 3,869. The exception: Bill Clinton, who clocked in at 2,752 unique words in our sample.
[Chart: Presidential Vocabulary Word Diversity]
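
The two figures above, syllables per word and unique words, can be approximated with a few lines. This is a sketch under stated assumptions: words are lowercased runs of letters and apostrophes, and syllables are counted with a crude vowel-group heuristic, so real tooling will give somewhat different numbers.

```python
import re

def words_in(text):
    """Lowercased word tokens: runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())

def unique_word_count(text):
    """Number of distinct words used, a simple measure of vocabulary variety."""
    return len(set(words_in(text)))

def syllables_per_word(text):
    """Average syllables per word, counting vowel groups as syllables."""
    words = words_in(text)
    syllables = sum(max(1, len(re.findall(r"[aeiouy]+", w))) for w in words)
    return syllables / len(words)

snippet = "The deal is a great deal. A great, great deal."
print(unique_word_count(snippet), syllables_per_word(snippet))
```

Unique word counts only make sense on samples of the same size, which is why the comparison uses the 30,000-word equal sample rather than the complete corpus.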

So What?

That’s a fair question. So what? Vocabulary is not a proxy for intelligence. In IQ tests, vocabulary is a component, but only a component. However, it is used as a proxy for a number of things:

  • Doctors use it to measure symptoms of degenerative brain diseases (note: as blogged previously, we see no downward trend over 40 years in Trump’s vocabulary. For unscripted, it’s very consistent).
  • Psychologists use vocabulary as a measure of intellectual curiosity and a person’s reading ability.

But also, it should be pointed out:

  • Politicians strive to get a clear, concise message in front of the public. That includes keeping it short and simple.

Other than Donald Trump, all presidents in this cohort were either career politicians, or in the case of Eisenhower, a very public figure and military leader for decades before running for president (historians argue whether a general at Eisenhower’s level would already be considered a politician before running for office, due to the need to navigate very political waters at that level).

Back to so what? In answer to those who emailed the equivalent of “is the president a stable genius”, the answer is “we don’t know.” Short of IQ tests, there’s no way to know for sure.

But what we can say is, compared to the 14 presidents who preceded him, by every measure, his use of words when off script is significantly less diverse, and simpler, than that of all presidents who preceded him back to Herbert Hoover.

As always, feel free to dispute the analysis, but come prepared with data. We don’t need more opinions. But more analysis with supporting data is always welcome.

Here’s the data. Have fun!

[Note: Hmm… thought the plug-in would download all the tabs, not just one. Oh well. This is the Google Sheets link.]
[gdoc key=”1HvS5jQxqrbh4u5ynv1TXUNoPesIVHQ4zq4SvPunK5Nc” datatables_page_length=”15″]

The Howard Stern-Donald Trump Interviews

The Stern Thing

[Update: 9/27/17: Audio has been removed per DMCA notice from SiriusXM. Think it should be public? Feel free to let @SternShow and @SiriusXM know.]

[Update: 9/30/17: TrumpOnStern.com kindly pointed out we missed two Stern shows. Donald Trump appeared on November 9, 1995 for 22 minutes and January 20, 1994 for at least 8 minutes (the audio is not complete). The post below reflects the data without those two shows included.]

Be careful what you wish for. It could screw up your month.

So… the Howard Stern / Donald Trump interviews. It’s been a bit of an obsession of ours. But not for the reasons you might think.

There have been some articles written before the election about Howard Stern, primarily by Andrew Kaczynski and Nate McDermott at BuzzFeed and later at CNN, Virginia Heffernan at Politico, David Fahrenthold at The Washington Post and others, including Mother Jones and The Atlantic.

These all quoted excerpts from these interviews. By our count, we found about 20 minutes of audio total covering about a dozen interviews.

If you’ve listened to Howard Stern before, you know you can find something salacious without a great deal of effort, and the interviews with Donald Trump were no exception.

However, the stories (with the exception of Heffernan’s excellent piece) didn’t address what we thought were two key points.

  1. Howard Stern is an excellent interviewer. Guests can spend two hours or longer speaking with Stern. His staff preps him well. The interviews are impeccably researched and move from making out with girls to port security in Dubai effortlessly. Howard Stern gets people to speak about things that, in any other context, they would never discuss.
  2. Based on our research, no one has spent more time interviewing Donald Trump publicly than Howard Stern, in terms of the length of the interviews, their number, and the span of time they cover.

We wanted that record for our database. It’s a gaping hole.

But therein lies the problem. Howard Stern has done, conservatively, more than 8,000 shows since the 1980s, and that number is probably low. Based on the normal length, that’s at least 30,000 hours of audio and likely a minimum of 50,000,000 words. And there’s no definitive record. If Stern has the list, it hasn’t been shared.

We’ve found snippets and pieces before. But, per our mission, we want to ensure that anything in our database is the full transcript, versus an excerpt. As such, we were interested in the full record of conversations between Donald Trump and Howard Stern from the 1990s forward. To make sure we had it all, we wanted the whole show to check.

Our research indicated he was on the show dozens of times, but not the details, exact dates, etc. We reached out to people who operate fan sites, particularly marksfriggin.com, and on the Internet, particularly via Reddit. Stern fans are known for collecting recordings of old shows, so we were hoping to find the full recordings.

We were insulted in ways both creative and thorough, but kept trying. In short, we struck out. By the spring, we had shifted our focus to building out the features on the site.

And Suddenly…

Out of the blue, early in the morning September 5th, about 3 1/2 months after we had moved on, we received an email with a Dropbox link from an anonymous Yahoo account. We looked and to our surprise, it was several dozen MP3s with the entire show, end-to-end, which allowed us to verify we were capturing the entire interview. We copied the MP3s and quickly emailed back to ask a couple of clarifying questions. We were not-so-politely told to leave them alone.

Between the files and extensive research on marksfriggin.com and other sites, we were able to verify 35 unique interviews, beginning May 8, 1993 on Howard Stern’s E! interview show, through August 25, 2015. There were other MP3s, but they contained Stern talking about Trump, or a time when Trump was supposed to dial in but couldn’t, or in one case, a re-run. We filtered those out.

So we got to work, transcribing, proofreading, cross-checking. This is harder than it sounds. Our transcription robots are good. But the show is fast paced (235 words per minute by our measure) and filled with crosstalk, music, background sound effects and noise. It mixes clean audio with phone audio. It’s the greatest hits list of “things that mess with algorithms.”

Combine that with a mixed bag of recording methods, and our robot was none too happy with us. So it involved a lot more manual work than we like.

The transcripts are complete but we’ll be working them towards perfection for some time. But they’re just about there, and married to the audio, and run through our usual battery of audio, text and voice analysis. (And please, when in doubt, listen to the audio).

But after investing more than two weeks, there’s just too much to do. We’ll keep tweaking in our spare time, but there’s only so many hours of the day before you start writing your blog posts at 3:45 am. Just sayin’.

Yeah Yeah Yeah. Whatcha Got?

Donald Trump’s time on Howard Stern totals 15 hours, 8 minutes and 52 seconds, with 104,357 words spoken by Donald Trump. This is 21% longer than his first book, “The Art of the Deal” (86,575 words). Hell, it’s almost half as long as the Frost / Nixon Interviews.
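
For anyone who wants to check our math, here is a minimal sketch of the arithmetic behind those numbers. The inputs are the figures quoted above; the variable names are ours, and the last figure (Trump's words spread over total airtime) is an extra we added for illustration, not a measured speaking rate.

```python
# Figures quoted in the post.
interview_seconds = 15 * 3600 + 8 * 60 + 52   # 15 hours, 8 minutes, 52 seconds
trump_words = 104_357
art_of_the_deal_words = 86_575

# How much longer Trump's Stern word count is than the book, in percent.
pct_longer = (trump_words / art_of_the_deal_words - 1) * 100

# Trump's words divided by total airtime. Airtime includes Stern and
# everyone else talking, so this is NOT Trump's rate of speech.
trump_words_per_airtime_minute = trump_words / (interview_seconds / 60)

print(f"{pct_longer:.0f}% longer than the book")
print(f"{trump_words_per_airtime_minute:.0f} Trump words per minute of airtime")
```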

Based on our records, this is far more time than Trump has spent in interviews with any other journalist or media personality, including Morning Joe, Sean Hannity, Bill O’Reilly, Chris Matthews, Larry King, Don Imus… any of them. This is in terms of the number of interviews, the length, the time period.

Trump has spent far more time, over a far longer period of time, speaking in greater depth with Howard Stern than any other interviewer. No one has spent more time interviewing Donald Trump in a public setting than Howard Stern, and in particular spanning more than two decades. Having these interviews in our database provides a crucial perspective.

We stopped counting after more than 500 unique questions and answers. Yes, lots of questions about sex, positions, his views on women, and things you don’t find in any other interviews (AIDS, Chlamydia, group sex, groping in public… our robot keyworded a lot of new things… we chose not to teach our AI some things. It leads to scary things). But also, lots on North Korea, Iraq, infrastructure and taxation. The Port of Dubai security was a real question. And it was answered.

Some of the stories Trump told repeat themselves across multiple years. He discusses a great deal about his personal life. And most of the interviews had a specific hook: boxing matches Trump was promoting, new books, The Apprentice and, toward the end of the series, a great deal more about politics.

We also had to develop a custom taxonomy and classification. A good many of the questions and answers are, in Stern’s style, leading. For example, an oft-quoted excerpt from a 42-minute interview had the following segment:

Donald Trump: My daughter is beautiful, Ivanka. She…
Howard Stern: By the way, your daughter.
Donald Trump: She’s beautiful.
Howard Stern: Can I say this? A piece of ass.
Donald Trump: Yeah.

He didn’t say his daughter was “a piece of ass.” However, he did not argue the point.

This follows a pattern throughout the interviews of Stern making a statement as a question and Trump either confirming or denying the statement without repeating it. Trump first explicitly stated he wouldn’t answer a question on September 23, 2004, his 20th interview with Stern. As the interviews evolved closer to 2015, the rate of objections increased.

The interviews begin on May 8, 1993, before Tiffany and Barron were born, when Eric was 9, Ivanka was 12 and Don Jr. 16. He had just divorced his first wife, Ivana, and was dating Marla Maples. The last interview was on August 25, 2015, two months after he announced his 2016 presidential run. He and Melania had been married a decade, his children were married and he had starred in two famous television shows.

So Is This Everything?

We are almost sure we have them all. Daily records of Stern’s show prior to 1997 are difficult to find. Is it possible we missed one? Absolutely. But we’re pretty sure we’ve got them all. If we’re wrong, we’d love to know the dates and get to work transcribing.

Also, please check the audio. We think we did a good job tagging who is speaking. But when in doubt, hit play. And if we’re wrong, let us know so we can fix it.

You can find all the transcripts here:

Howard Stern – Donald Trump Interviews

They’re also in the general search, of course. The audio files can be found on SoundCloud, or you can download them all here.

9/27/17 – Audio has been removed per DMCA notice from SiriusXM. Think it should be public? Feel free to let @SternShow and @SiriusXM know.

 

What Makes Donald Trump Uncomfortable? A Statistical Analysis

So let’s get the headline out of the way. Donald Trump is not at all comfortable discussing God. That’s based on nearly three hours of video covering 424 distinct segments spanning more than 200 events.

That’s why you probably clicked here. Now, you get a data science explainer before you get the data. We’re so bait-and-switch.

As part of a set of new features we’re deploying (see our Emotion Subtitles), we generated a huge amount of data from our new approach to Voice Stress Analysis. Each second of audio and video gets individually analyzed, as well as 10-second segments, sequential segments, and the entire speech, interview or press conference.

This compilation opened up an interesting opportunity for analysis. Since our data is extensively tagged and structured, we could document, statistically, exactly what makes him relax, and what makes him tense. So we thought: cool.

A Word about Voice Stress Analysis

You’ll read a lot about voice stress analysis. So let’s address one thing here: it’s not a lie detector test. This is hotly debated, and we prefer to stick with the known. It has not been proven definitively that increases in voice stress indicate lies. If a person believes a lie, they will be relaxed. If a person steps on a tack, stress will increase even if telling the truth.

What this does definitely detect is a level of comfort, stress and/or anxiety. The higher the frequency (due to muscles contracting, including muscles in the neck that affect the voice box, thus the frequency), the greater the indication of stress. By measuring patterns when this occurs, we can identify statements and topics where a person is not comfortable with what they are saying. Coupled with identification of the underlying feelings and measuring factors such as word choice and rate of speech, among several dozen others (we gather 115 datapoints per word), it’s a powerful way to uncover how someone feels about what they are saying.

It doesn’t tell you WHY they’re stressed or anxious. They just are. When used in an individual conversation, you don’t have context. The person can just be having a bad day. Or a great day.

That’s why the next part is important: we have hundreds of hours of Trump documented, transcribed and keyworded. A bad day is possible. 200 bad days on the same topic? Unlikely. In fact, we did a basic statistical model and found the odds of having “a bad day” on 200 or more unique days, exactly when a particular topic was being discussed, were… some big number. Excel showed one of those 1e12 things and we just moved on.
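
To give a feel for why that number gets silly, here is a back-of-the-envelope sketch. It is not the model we ran. It assumes (our assumptions, purely for illustration) that a “bad day” is a coin flip, that days are independent of each other and of the topic, and it asks how likely it is that all 200 topic days land on bad days. The function name and the 0.5 are hypothetical.

```python
import math

def log10_odds_all_bad(p_bad_day: float, n_days: int) -> float:
    """log10 of the '1 in N' odds that n_days independent days are all bad."""
    # P(all bad) = p ** n, so N = 1 / p ** n and log10(N) = -n * log10(p).
    return -n_days * math.log10(p_bad_day)

# Hypothetical inputs: a 50/50 chance of a bad day, 200 days on one topic.
exponent = log10_odds_all_bad(0.5, 200)
print(f"roughly 1 in 10^{exponent:.0f}")
```

Working in logarithms keeps the result readable and avoids floating-point underflow if you plug in a smaller probability or more days.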

Back to Why You’re Here.

So we ran the data. The methodology is important, which we’ll explain in detail:

  • Eliminate Bias. To remove bias, we selected only topics that Trump has discussed publicly 200 or more times, according to our database. Every one of those topics / subjects was checked and is reflected below.
  • Find Midpoint. For each interview, speech, event, and so on, an individual middle (median) point was established for just Trump’s voice. So if he was having a relaxed day, we measured when topics moved the stress above or below that midpoint. If he was having a bad day, same thing.
  • Phrase subjectivity. For phrases, we freely admit this was subjective. We checked our database for frequently used phrases and it found thousands. It’s a literal beast, so “I am going” appears in the list of three-word phrases. We punted and googled “Trump catch phrases” and selected about a dozen. We made a subjective choice to add “Make America Great Again” into the mix, as well as “Thanks”, “Thank You”, “God Bless You” and “God Bless America” into our checks, based on the findings in our topical analysis.
  • “You’re Fired.” We eliminated “You’re fired” since most of the references were short, pre-recorded clips from the television show vs. a real-world situation.
  • Short Segments. We eliminated any segment less than four seconds long, as that can add anomalous spikes, and we want the phrase or topic in context.
  • Sample size. This got us to 170.23 hours of video, spanning 30,899 unique segments (1- to 3-word sentences are a unit in our database based on size), from 1980 through this week, covering 1,634,208 words.
  • <nerd>This then fed into our algorithm, which is an Adaptive Empirical Mode Decomposition (AEMD) process, to check for deviations outside of 8-12Hz. This is widely recognized as the normal frequency range to monitor. When it goes above 12Hz, it’s considered stress…</nerd>
  • A reminder… but again, we use the midpoint from a particular event, to account for the fact that being President probably is stress in and of itself.
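
For the code-inclined, the midpoint steps above can be sketched roughly as follows. This is a simplified illustration, not our production algorithm: the records, the field layout and the way the score is averaged are all assumptions made for the example, and it skips the AEMD frequency analysis entirely by starting from a made-up per-segment stress value.

```python
from collections import defaultdict
from statistics import mean, median

MIN_SECONDS = 4  # segments shorter than four seconds are dropped

# Hypothetical records: (event_id, topic, duration_seconds, stress_value)
segments = [
    ("rally-1", "God", 12, 11.8),
    ("rally-1", "Iraq", 20, 9.1),
    ("rally-1", "Jobs", 15, 10.0),
    ("rally-1", "God", 2, 14.0),       # too short, filtered out
    ("interview-7", "God", 9, 10.9),
    ("interview-7", "Iraq", 30, 8.7),
    ("interview-7", "Jobs", 11, 9.8),
]

def topic_deviation(records):
    """Average distance of each topic from its own event's median stress."""
    kept = [r for r in records if r[2] >= MIN_SECONDS]

    # Midpoint (median) of the stress values within each event.
    by_event = defaultdict(list)
    for event, _topic, _duration, stress in kept:
        by_event[event].append(stress)
    midpoint = {event: median(values) for event, values in by_event.items()}

    # Deviation of every segment from the midpoint of the event it belongs to.
    deviations = defaultdict(list)
    for event, topic, _duration, stress in kept:
        deviations[topic].append(stress - midpoint[event])

    return {topic: mean(values) for topic, values in deviations.items()}

scores = topic_deviation(segments)
for topic, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{topic:5s} {score:+.2f}")
```

Measuring against the per-event median, rather than one global baseline, is what keeps a generally tense or relaxed day from dragging every topic in that event up or down with it.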

One note: you will see topics on the table below with fewer than 200 citations. Our check of topics included print interviews, his writings and tweets, indicating it is a topic he frequently discusses, but may be represented fewer than 200 times in the audio and video.

And from that data…

Back to the Lede

Trump is clearly, statistically, uncomfortable expressing gratitude. When he thanks people, based on 67 unique segments where thanking someone was the topic, and another 105 phrase references to thanking someone, he is consistently at an elevated stress level, indicating anxiety.

Similarly, when discussing God as a topic (424 unique segments), he is also uncomfortable, with his voice indicating stress and anxiety well above the midpoint established contextually in the conversation. Note this is specific to discussing God, vs any particular religion or religion itself.

Rounding out the top list of uncomfortable topics and phrases:

  • “Make America Great Again” (32 segments)
  • “Build the Wall” / “Build That Wall” (153 segments)
  • The White House (as an institution – 323 segments)
  • Veterans (402 segments)
  • Law Enforcement (194 segments)
  • The Wall (as a topic – 790 segments)

Okay, but what puts him at ease? On what topics is he comfortable?

The top of the list is what our system classified as “inner cities” but in looking at specific references, it’s discussions of urban planning, cities and infrastructure. He’s well below the stress midpoint when on this topic (145 references). A good number of these references were in interviews pre-dating his Presidency as well.

The Middle East is strongly represented on the list of topics where he is comfortable: Iraq (420 references), Iran (406 references), Syria (281 references) and the Middle East in general (305 references) are all points where he is clearly relaxed and not anxious when discussing.

Rounding out this list of topics where he is comfortable:

  • War (248 references)
  • The New York Times (89 references)
  • Terrorism (366 references)
  • “A lot of money” (275 references)
  • “Many many” (126 references)

So was there anything else surprising?

Personally, for me, there were a few things, but the world doesn’t need another opinion right now, so take a look at the data below and decide for yourself. If you disagree with anything in the methodology, let us know. But be warned: we make available all our data on request, and will continue to do so. If you disagree with the points above, we’re happy to send you the algo and all the underlying data for you to verify the results for yourself, or to run through a different process. The world not needing another opinion doesn’t just apply to me :-). We’re all about data and verifiable facts at Factba.se, so you’re welcome to think we’re wrong, but be ready for us to challenge you to prove we’re wrong.

Chart

Table

Topic / Phrase | Deviation Score | # of Segments | Length of Segments [HH:MM:SS]
Phrase: “God Bless You”, “God Bless America” | 1.4990 | 159 | 01:03:30
Phrase: “Thanks”, “Thank you” | 1.3732 | 105 | 00:37:33
Phrase: “Make America Great Again” | 1.1865 | 32 | 00:03:45
Thanks / Thanking Someone | 0.9275 | 67 | 00:30:36
God | 0.7604 | 424 | 02:54:00
Phrase: “Build the Wall” / “Build that Wall” | 0.6085 | 153 | 00:41:48
The White House | 0.6006 | 323 | 02:09:06
Veterans | 0.5056 | 402 | 02:17:28
Law Enforcement | 0.4312 | 194 | 01:31:37
The Wall | 0.3307 | 790 | 02:57:00
Education | 0.2488 | 371 | 01:40:01
Illegal Immigration | 0.2254 | 109 | 00:46:39
North Carolina | 0.2074 | 228 | 01:17:23
Phrase: “Believe Me” | 0.2056 | 834 | 04:38:59
Phrase: “Sad”, “So Sad” | 0.1921 | 456 | 02:52:36
Obamacare | 0.1822 | 800 | 04:36:54
Senate | 0.1538 | 182 | 01:07:52
United States | 0.1529 | 3128 | 21:07:33
Congress | 0.1500 | 297 | 02:15:39
Records | 0.1378 | 181 | 01:03:20
Washington | 0.1240 | 420 | 02:41:43
Iowa | 0.1189 | 324 | 01:19:37
Phrase: “Winning” | 0.1117 | 444 | 01:43:23
Israel | 0.1080 | 186 | 01:24:48
Donald Trump | 0.0908 | 573 | 02:20:56
Polls | 0.0792 | 388 | 01:36:50
ISIS | 0.0757 | 727 | 03:51:00
North Korea | 0.0650 | 122 | 00:49:56
American People | 0.0545 | 275 | 02:12:24
Campaign | 0.0401 | 525 | 03:06:03
Health Care | 0.0353 | 220 | 01:27:06
Florida | 0.0251 | 412 | 01:45:09
Mexico | 0.0114 | 1102 | 04:21:13
Special Interests | 0.0086 | 183 | 01:28:37
Security | -0.0107 | 313 | 02:14:26
Russia | -0.0290 | 272 | 01:33:35
Politicians | -0.0354 | 651 | 02:55:25
Democrats | -0.0367 | 392 | 02:20:09
New Hampshire | -0.0513 | 308 | 01:14:53
Phrase: “Tremendous” | -0.0525 | 1099 | 06:09:35
Trump Administration | -0.0567 | 692 | 05:02:17
Law | -0.0577 | 211 | 01:18:52
China | -0.0674 | 1359 | 05:21:54
Phrase: “Huge” | -0.0723 | 112 | 00:34:09
New York | -0.0747 | 440 | 02:00:39
Media | -0.0759 | 373 | 02:18:02
Drugs | -0.0771 | 385 | 01:59:19
Hillary Clinton | -0.0971 | 2690 | 15:17:47
NAFTA | -0.1140 | 320 | 01:51:19
Republicans | -0.1184 | 454 | 02:12:58
Numbers | -0.1231 | 483 | 02:18:55
Trade | -0.1232 | 693 | 02:56:48
Ohio | -0.1351 | 332 | 01:50:52
Border | -0.1359 | 1094 | 05:55:57
Jobs | -0.1366 | 2158 | 12:08:53
Barack Obama | -0.1440 | 1231 | 07:09:13
Japan | -0.1742 | 435 | 01:25:26
Future | -0.1871 | 299 | 02:20:56
Phrase: “Many Many” | -0.1990 | 126 | 00:16:37
Phrase: “A Lot of Money” | -0.2124 | 275 | 01:16:13
Terrorism | -0.2229 | 366 | 02:44:03
Middle East | -0.2260 | 305 | 02:02:45
Syria | -0.2381 | 281 | 01:47:01
The New York Times | -0.2714 | 89 | 00:36:41
War | -0.2890 | 248 | 01:33:08
Iran | -0.2923 | 406 | 02:12:54
Iraq | -0.4850 | 420 | 02:09:44
Infrastructure / City Planning | -0.6432 | 145 | 01:09:01

Is Trump Going Senile? (Beta)

No, you didn’t catch us opinion-ating.

We’re getting ready to debut a new daily feature that will try to use data to validate assertions, or to uncover insights that would be impossible without the data collection at Factba.se.

It’s not ready to go just yet, and it’s not as pretty as we want it. But the data’s more important than the prettiness.

To that end, please see our first try at this, and let us know what you think. This was based on an article in May in Stat that asserted Trump was potentially going through cognitive decline. The term “senile” was latched on to by the press (and thus this infographic), but it was not used in the article, just the comments. Much of the data cited was due to speaking style and vocabulary.

Well, said we. We have a definitive record spanning 37 years. Let’s take a look.

We focused on two areas: the Flesch-Kincaid Reading Level, which basically scores the complexity of the words and sentences used, and rate of speech. It’s worth noting that while the average American speaks at 135-160 words per minute, the average New Yorker is close to 200 words per minute.
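
If you want to try the two measures on a transcript of your own, here is a minimal sketch. The grade formula is the standard published Flesch-Kincaid Grade Level; everything else is an assumption for illustration: the syllable counter is a crude vowel-group heuristic (real tools use pronunciation dictionaries), the sample sentence and timing are made up, and this is not the code behind our charts.

```python
import re

def count_syllables(word: str) -> int:
    """Rough heuristic: count groups of consecutive vowels, minimum one."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

def flesch_kincaid_grade(text: str) -> float:
    """0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59

def words_per_minute(text: str, seconds: float) -> float:
    """Rate of speech: word count divided by speaking time in minutes."""
    words = re.findall(r"[A-Za-z']+", text)
    return len(words) / (seconds / 60)

# Hypothetical sample and timing, purely for illustration.
sample = "We are going to win. We are going to win so much."
print(round(flesch_kincaid_grade(sample), 2))
print(round(words_per_minute(sample, 4.0)))
```

Both functions expect non-empty text; an empty transcript would divide by zero, so filter those out first.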

We are not offering opinions as to why, but what we can say definitively:

  • Overall, his rate of speech has dropped consistently;
  • This has coincided with a decrease in his unscripted public statements (e.g. interviews) and an increase in his scripted public appearances (speeches, remarks);
  • There is a statistically significant difference in his rate of speech when looking at the type of appearance. In interviews and debates, he speaks much faster. In remarks and speeches, much slower (almost half speed);
  • The complexity of words used in speeches is almost double the grade level of those used in testimony and debates, which are less likely to be scripted.

Statistically, the press conference is a bit of an outlier, though Jennifer (yes, Jennifer) pointed out that he often begins these with a prepared statement, which is included in the vocabulary and the rate of speech, which may skew the results.

That said, let us know what you think.

[Correction 7/1/17: Note the original infographic incorrectly compressed the X axis in terms of years. This does not change or alter the data, but does affect the two timeline charts in the appearance of the data. It has been updated.]

Sigh… Thanks for the Weekend

Life’s little pleasures on a Father’s Day weekend:

  • Cleaning part of the house… just… right.
  • Watching a Pixar flick with your kids.
  • Dunkin’ Donuts without any guilt
  • Getting a 98-page PDF, in tabular format, dropped on your lap at 5:30 pm on a Friday with a year’s worth of financial data. (PDF Here)

Well, I guess we signed up for this.

We worked through this at the beginning of the year. We luckily had three things going for us:

  1. Semi-consistent numbering on the OGE 278e financial forms
  2. Two previous years of clean data against which to compare the new one
  3. A few handy PDF extraction tools that, while far from perfect, are pretty good at pulling the data out in non-crappy format.

So, that said, still about 10 hours. But the bright side of being hands on is… you learn a lot. For example:

  • Ownership. Basically, anything previously with an owner of “Donald J. Trump” is now shifted to one of the following:
    • DJT Holdings LLC
    • DJT Holdings Managing Member LLC
    • DTTM Operations LLC
    • DTTM Operations Managing Member LLC
    • … or the Donald J Trump Revocable Trust
  • It’s worth noting that the four LLCs mentioned above are all owned by the Donald J Trump Revocable Trust
  • As part of moving around assets, a checking and savings account in excess of $50,000,000 was opened at Capital One on April 12, 2017 for the Donald J. Trump Revocable Trust
  • A bookkeeping thing. The companies listed in his resignation letter from January 19 and the list of resignations in the OGE 278e don’t match or line up neatly. Someone should poke around just to make sure the i’s are dotted and the t’s crossed.

Everything is integrated into our Assets page (https://factba.se/topic/assets).

In addition, we put everything in two Spreadsheets, because nobody should deal with PDFs. We feel strongly about that (with apologies to Adobe).

  • The OGE 278e Financial Disclosure from June 14, 2017 is completely converted to a spreadsheet here: https://goo.gl/4jL9Bo
    It’s embedded below, but save yourself the headache and go straight to the sheet.
  • We put his income, liabilities and portfolio side by side for 2015, 2016, and 2017 in a spreadsheet here: https://goo.gl/MdbrhC

Everything above is released under a Creative Commons Attribution 3.0 license. Put it to good use and crunch away.

https://docs.google.com/spreadsheets/d/1ESZwVWN2yUjkeGe__u0AK3TwSOc2KojhFH9CAFoSzZo/pubhtml?widget=true&headers=false

Feed Me (Transcripts), Seymour…

If there’s one thing about statistical models that’s generally true: they need to be fed.

For about six months now, I’ve been living most waking moments in the words of Donald Trump. I love algorithms, but I check them. And check them again. And again. It’s not even borderline compulsive. We blew past borderline around January. It is compulsive.

Probably the single biggest challenge I face in shaping the models: access to raw materials. We check, and check again, every word. Yes, he was on Oprah in 1988, but we need more than 3:11… we need the whole show for context.

We are constantly updating our backlog of material, with volunteers generously sending in links (I’m looking at you CJ in particular), text, videos that in turn need to be checked and then fed into Margaret, our pseudo-AI that ravenously consumes every word spoken, analyzing the audio, video and text to build her model. This in turn analyzes tweets, transcribes better, and does lots of other cool things.

The single best source of this information is interviews. As opposed to speeches, they are generally unscripted. As opposed to tweets, you get more than 19.6 words at a time (1 year moving average, 3,203 tweets, 62,871 words). Sometime, I’ll have enough time to do a separate post explaining how different the models view speeches vs. interviews… it’s almost two different people in the output.

However, as Chris Cillizza at CNN pointed out in a recent tweet, these are often unshared, even after the news cycle. Some organizations publish transcripts simultaneously. Most publish just excerpts, noting they’ve been edited. Some share audio and video, but with cuts and jumps. Others… nothing.

I’m not naming names, but given that the messages coming from The White House can at times appear to contradict each other, this raw material is crucial, both for the historical record, and for building a base of research that others can analyze.

Also, the full, unedited interview can remove potential questions as to whether comments are in context. Personally, I think that is in nearly every case a ridiculous argument, but the argument can’t be made if there’s no edits.

So I’m making both a public plea, and an offer: please, in the name of all that is good in the world, once you’ve run your stories and pieces, please publish and share the raw materials. Pull any off-the-record comments, but otherwise, share the raw audio, video and text.

Since everyone has a few things to do nowadays, here’s what we’ll offer for any interview with the President, if time or resources constrain a full transcript or sharing raw video and/or audio.

  1. Factba.se will happily, and freely, transcribe in full any video or audio provided, both via Margaret, and with a human editor to verify.
  2. Factba.se will provide, via a spreadsheet or any other medium, ALL metadata developed. This is the stuff that is behind the scenes (not for long) on our site. If audio, you’ll get back second-by-second audio analysis of voice stress and emotion, which is keyed to Trump (the sotto voce whisper). If video, it will include facial expressions, smile / frown, gestures and other analysis (clothing identification, colors, the two-handed punctuation I myself have as a third-generation bridge-and-tunnel child, etc). It is even learning to pick up when he flushes (complexion change). It will be a lot. But it will be everything.
  3. We will provide the full keyword and entity extraction, by three-sentence pair, section and overall, both for the entire interview, and specifically on just when Trump is speaking.
  4. We will provide the full range of analysis. Grade-level models, sentiment, emotion… all of it.
  5. We will respect any and all embargoes given. We are not meant to be a news organization. If you’d like us to hold until a day, two days, three days after the stories run before integrating and sharing the information, fine. You’re the boss. It’s your interview. You get it back first and control the story.
  6. If a human is in the mix editing, figure two hours per hour of video/audio for transcript. If you don’t mind raw from Margaret (she’s close to 95% dead on now), 90 seconds per hour. We just need a little notice to plan our day to be ready for it if you want a quick turnaround.
  7. We will, of course, link out to your pieces from the text.
  8. If there are any other requests… fine. Our interest is the record, and sharing the resulting analysis.

We’re not looking to create a hippie commune. We are looking, however, to unleash the data that is contained in your excellent work, in a way that does not conflict with your job.

Also, on the off chance Margaret becomes sentient again, you’ll be in her good graces.

— Bill Frischling