archive_trump.py, the Fascist Tweet-Archiving Script

Overview

The archive_trump.py script runs on one of my spare laptops, constantly listening for new tweets on certain Twitter accounts and, when new tweets occur, using the Internet Archive to create an offsite copy of those tweets. It started out just watching Donald Trump's Twitter accounts, but as of 22 February 2017, it also watches the Twitter accounts of Mike Pence.

The full list of accounts currently archived by my use of this script is: @POTUS, @realDonaldTrump, @VP, @mike_pence, and @GovPenceIN.

My intent is to get a neutral third party to create publicly accessible backups of the tweets before they can be deleted, because a neutral third-party archive is a more credible source than a screenshot that I produced on my own computer and totally swear I didn't alter. (It's also easier to produce automatically.)

Where can I see the tweets archived by your script?

The script produces an index for each account it tracks as it runs. These indices are in the .csv format, which is easily importable into any spreadsheet program; they are hosted both on Dropbox and on the project's GitHub page. (The Dropbox-hosted copies should usually be updated automatically within a minute or so; the GitHub-hosted copies are easier to read from the web, but are only updated when I do so manually, which usually means when I am updating the GitHub data anyway or when I happen to remember to do it.) You can also search through the Internet Archive-hosted tweets using the Internet Archive's interface.

If you are unhappy with the display options, it's probably wisest to download the current .csv from Dropbox and search through it using your favorite spreadsheet program. If you want to have the ability to search directly from this webpage, contact me and make an offer to finance hosting such a service, and we'll talk. (I'm a poor grad student and can't afford to run a search-through-Trump's-tweets engine at this site.)
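
If a spreadsheet program feels like more than you need, the same kind of search can be done in a few lines of Python. What follows is only a sketch, not part of the script itself: the column names and the sample rows are made up for illustration (the real index files may use different headers), so the function simply checks every field of every row rather than relying on any particular column.

```python
import csv
import io


def search_index(csv_file, term):
    """Return the data rows in which any field contains `term`.

    The match is case-insensitive and looks at every column, so it does
    not depend on what the index's headers are actually called.
    """
    needle = term.lower()
    reader = csv.reader(csv_file)
    next(reader, None)  # skip the header row, if there is one
    return [
        row for row in reader
        if any(needle in field.lower() for field in row)
    ]


# A tiny made-up index standing in for a downloaded .csv file.
# The column names and contents here are hypothetical.
sample = io.StringIO(
    "tweet_id,text,archive_url\n"
    "1,Hello world,https://example.org/archived/1\n"
    "2,Another tweet,https://example.org/archived/2\n"
)

matches = search_index(sample, "hello")
for row in matches:
    print(row)
```

To run this against a real download, open the file with something like open("index.csv", newline="", encoding="utf-8") (the file name is a placeholder) and pass the resulting file object in place of the sample.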

Account            Dropbox index        GitHub index        Search
                   (most up-to-date)    (easier to read)    (via Internet Archive)
@POTUS             here                 here                here
@realDonaldTrump   here                 here                here
@VP                here                 here                here
@mike_pence        here                 here                here
@GovPenceIN        here                 here                here

Why would you do this in the first place?

Because words matter, especially the words spoken by elected officials; they have wide-ranging effects even after their material presence has evaporated into the ether. Donnie's profound contempt for facts and his repeated insistence on inventing them are both troubling, and I suspect that there's a connection with the surprisingly frequent deletion of his own posts on Twitter.

When The Donald deletes a tweet, that doesn't mean that the deleted words have had no effect; they still influence the thoughts and behavior of (at least some of) his supporters. All it really means is that the effect is harder to trace back to the suddenly absent cause. My thought is that producing an archive that's accessible to the public at an external source helps to reinforce, in a small way, the underlying discursive structures upon which a functioning democracy depends.

What do you think it means that he deletes his tweets?

I think that depends entirely on which tweet we're talking about.

You will note that I have not claimed that tweets should never be deleted, nor that the removal of any particular post necessarily means anything that I'm qualified to talk about. (You will also note that I have sometimes deleted my own tweets, usually—but not always—to correct a typographical error.) But I think that preserving an archive of what our current president says is very important, and it's relatively easy to do.

Should I myself run a copy of this script?

Maybe! As for me, I'm just running the script on a spare laptop in my apartment, and that's not a perfect setup: my electricity or Internet service could go out, or the laptop could overheat, or its old hardware might be running the script too slowly to catch a tweet before it's erased, or any number of other things could go wrong. Having several people (though certainly more than a dozen or so would be overkill) all running this script, or taking similar actions, would provide a level of redundancy that makes it much more likely that every tweet gets captured at least once.

On the other hand, if way too many people decide to volunteer in this way, that would unnecessarily burden the infrastructures of both the Internet Archive and Twitter for little to no practical benefit. So my proposal is this: if you plan to run another copy of this script, let me know (hit me up on Twitter), and I'll keep an up-to-date count (and/or tally) here.

To the best of my knowledge, there are currently no other people running this script remotely.

Given all of that, you can download the script on GitHub, if you'd like.

Does this script ensure there is a complete archive of all of Trump's (and Pence's) tweets?

No. There are at least two groups of tweets that the script is not archiving:

  1. Very old tweets. The Twitter API only allows access to someone's last 3200 or so tweets, so the first tweet this script archived was probably produced by Donnie somewhere around March 2016. I have not made any attempt to go back and algorithmically save older Trump tweets, in part because that would require a totally different methodology, and quite possibly substantial manual intervention. (There are older Trump tweets archived on the Internet Archive, but they were not saved by me.)
  2. Tweets that both appear and disappear while the script is not running. Normally, the script runs constantly, but if the power goes off in my apartment, or if the script crashes, and Donnie tweets and then deletes that tweet before the script runs again, then the tweet won't get archived. Similarly, it is possible that a tweet could get noticed but still disappear before it can be archived.

There are at least two other groups of potential problems that might, in theory, keep a tweet from being archived:

  1. The Twitter API could, in theory, not report that a tweet was posted, or could do so with a large enough delay to allow Donnie or his goony henchmen to delete it before it could be archived.
  2. It is possible, in theory, that Creepy Don or his servile brownshirt brigade might interfere with one or more of the services required for this to work.

I don't currently have cause to believe that anything has been missed for any of the reasons above except for very old tweets ... but then, if something had been, how would I know? (This is part of the reason why the redundancy of several other people running the script would be a good thing.)

Were all of the Trump/Pence tweets I can see on the Internet Archive saved by your script?

No. Anyone can save a web page to the Internet Archive at any time, and I am certainly not the only person who has decided to have the Internet Archive save copies of (some of) Trump's tweets. (Though, to the best of my knowledge, I am the first to think that doing so systematically is a good idea.)

How does it work?

Head on over to the GitHub project for more info!
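
For a rough idea of the general approach before you head over there, here is a minimal sketch. It is not the script's actual code. It assumes the Wayback Machine's "Save Page Now" address form (https://web.archive.org/save/ followed by the address of the page to capture) and the usual twitter.com/<account>/status/<id> form for a tweet's address; the function names and the tweet ID are made up for illustration.

```python
WAYBACK_SAVE_PREFIX = "https://web.archive.org/save/"


def tweet_url(screen_name, tweet_id):
    """Build the public web address of a single tweet."""
    return f"https://twitter.com/{screen_name}/status/{tweet_id}"


def wayback_save_url(page_url):
    """Build the address that asks the Wayback Machine to capture a page."""
    return WAYBACK_SAVE_PREFIX + page_url


def archive_requests(screen_name, tweet_ids):
    """Return one save address per newly noticed tweet ID."""
    return [wayback_save_url(tweet_url(screen_name, tid)) for tid in tweet_ids]


# Hypothetical tweet ID, used only to show the shape of the result.
for address in archive_requests("realDonaldTrump", [123456789]):
    print(address)
```

Fetching each of those addresses is what asks the Internet Archive to make its copy; the sketch stops short of making the request so that it can be run without touching the network.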