The results of the FBI's investigation into Hillary Clinton's emails found on Anthony Weiner's laptop might have impacted the course of history - but most Americans casting ballots do not understand how such an examination occurs. "How is it possible," many ask, "that the FBI analyzed 600,000 emails in just a few days?"
While I am not privy to the FBI's internal workings, and the folks actually involved with examining the relevant materials cannot discuss the process with the public, I thought I would share information on how a cyber-forensic examination of this nature likely occurs. This article is not meant to document the forensic process, nor to provide any sort of comprehensive overview, but, rather, to explain some of the important steps that were likely used by the FBI, and to convey how analysis of Anthony Weiner's computer could have been completed so quickly. In fact, based on what I describe below, the actual technical examination probably could have been completed a lot faster had it not been for rules and regulations to which the FBI is subject.
So, here are some steps that likely transpired:
1. Obtain proper warrants and/or permission
No law enforcement agency wants to make evidence inadmissible by obtaining it illegally.
2. Ensure "chain of custody"
A "chain of custody" is established to ensure that nobody can tamper with evidence, and that no outside party can reasonably claim that evidence might have been modified. The FBI already has such a process in place, and that process was, no doubt, used to ensure that nobody could claim that something that the FBI later found on Anthony Weiner's computer was not there when the FBI obtained the machine. Documentation of everything done during the process of examining the laptop is also important for the same reason.
3. Image the laptop
For multiple reasons, competent investigators never work on the actual device in question - so the FBI likely made copies - that is, full, forensically sound images - of Anthony Weiner's computer's hard disk or SSD. Note that the image is normally of the entire device's storage - investigators do not attempt to isolate materials belonging to different users from one another at this point. So, any emails that were on the device and belonged to Huma Abedin, aide to Hillary Clinton and estranged wife of Anthony Weiner, would have been part of the FBI's image.
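To illustrate what "forensically sound" can mean in practice: one common way to show that an image is a faithful copy is to compute a cryptographic hash of the image and compare it with a hash of the original storage recorded at acquisition time. The sketch below is only an illustration of that idea, not a description of the FBI's tooling; the function names `sha256_of_file` and `image_matches_source`, and the choice of SHA-256, are assumptions made for this example.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks.

    Chunked reading matters because a disk image is usually far too
    large to load into memory at once.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps calling read() until it returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def image_matches_source(image_path, source_hash):
    """Compare the image's hash with the hash recorded for the source.

    `source_hash` is assumed (for this sketch) to be a hex string that
    was written down when the original device was acquired.
    """
    return sha256_of_file(image_path) == source_hash.lower()
```

If the two hashes match, the working copy is, for practical purposes, bit-for-bit identical to what was acquired, which also supports the chain-of-custody record described in step 2.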
4. Initially examine the image (often with an automated tool)
Investigators look to see what type of materials are in the image. In the case of Anthony Weiner's computer, investigators were likely looking for materials related to his sexting - were there relevant photos or videos on the device? Pertinent emails? Logs of visits to websites? Etc. There seems to be no evidence to suggest that the FBI was actively looking for anything related to the Clinton case - rather that agents examining Weiner's computer stumbled upon materials that they thought might be relevant to the Clinton case, notified those involved, and escalated the matter until it reached the FBI Director.
5. Run an automated scanning tool, or series of tools, to perform multiple functions
Investigators use automated tools to search the device for specific keywords and for things related to the keywords (while this can be done with regular expressions or various language analysis engines, FBI investigators likely use one or more tools that automate this process). In the case of Weiner's computer, the FBI might have looked for emails mentioning Hillary Clinton, or containing her name or email address in the To: and From: fields, as well as various keywords and terms related to classified information. Investigators can also remove materials based on keywords in email From: fields - any emails sent from Netflix, Amazon, or eBay, for example, were not relevant to the Clinton investigation, and may have been excluded by the FBI from the start. Automated tools also remove duplicates - there is no sense wasting time reviewing the same email, photo, or video that is present multiple times within the device image. Duplicates are often identified using signatures or hash-representations of files and emails - making comparison and removal of duplicates much faster than many people might expect. Also, in the case of Clinton's emails, since the FBI had already reviewed many emails taken from Clinton's mail server, they certainly performed duplicate-removal analysis against all of those emails.
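To make the speed of this step more concrete, here is a minimal sketch of the three operations just described: excluding irrelevant senders, removing duplicates by hash (including duplicates of emails already reviewed), and keeping only keyword matches. It is an illustration under stated assumptions, not the FBI's actual method: emails are represented as plain dictionaries, and the names `fingerprint` and `triage`, the choice of fields that define "the same email," and the use of SHA-256 are all hypothetical choices for this example.

```python
import hashlib
import re

FIELDS = ("from", "to", "subject", "body")


def fingerprint(email):
    """Hash the fields that, for this sketch, define 'the same email'."""
    canonical = "\n".join(email.get(field, "").strip().lower() for field in FIELDS)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def triage(emails, keywords, excluded_senders, known_hashes=()):
    """Return the emails that still need manual review.

    emails           -- iterable of dicts with 'from', 'to', 'subject', 'body'
    keywords         -- terms of interest, matched case-insensitively
    excluded_senders -- sender suffixes to drop outright, e.g. 'netflix.com'
    known_hashes     -- fingerprints of emails that were already reviewed
    """
    pattern = re.compile("|".join(re.escape(k) for k in keywords), re.IGNORECASE)
    seen = set(known_hashes)
    for_review = []
    for email in emails:
        sender = email.get("from", "").lower()
        # Step 1: drop mail from senders that cannot be relevant.
        if any(sender.endswith(suffix) for suffix in excluded_senders):
            continue
        # Step 2: drop duplicates, including copies of already-reviewed mail.
        digest = fingerprint(email)
        if digest in seen:
            continue
        seen.add(digest)
        # Step 3: keep only emails that mention a keyword somewhere.
        text = " ".join(email.get(field, "") for field in FIELDS)
        if pattern.search(text):
            for_review.append(email)
    return for_review
```

Each email is hashed and checked against a set only once, so the work grows roughly in proportion to the number of emails. That is why a collection of several hundred thousand messages can be reduced to a much smaller review set quickly on ordinary hardware.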
6. Perform a manual review
The resulting set of non-duplicate items that match, or are similar to, the relevant keywords is reviewed manually. In the case of Weiner's computer, anything relevant to the Clinton case that was discovered during the manual review (if anything of that nature was actually found) was likely escalated to several parties at the FBI to determine if it had any impact on the Clinton case.
Ironically, while people wonder how the FBI technically could examine the laptop in only a few days, the legal and procedural matters surrounding the review might have caused the FBI to take longer to examine it than it would have taken someone in the private sector. Considering how many excludable-from-the-start emails likely existed, how many duplicates likely existed of emails to and from Clinton, and how fast computers can perform the relevant analysis, a private firm beginning a similar task at the same time as the FBI might have completed its entire analysis before the FBI was able to even obtain its warrant.