parse

bigbang.parse.clean_from(m_from)

Return a person’s name extracted from the ‘From’ field of an email, based on heuristics.

bigbang.parse.clean_mid(mid)
bigbang.parse.clean_name(name)

Clean just the name portion of the result returned by email.utils.parseaddr.

Returns None if the name portion contains nothing name-like; otherwise returns the cleaned name.
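As a rough illustration of the input this function works on, the sketch below uses the standard library's email.utils.parseaddr to split a ‘From’ value into its name and address portions. The helper sketch_clean_name is a hypothetical stand-in written for this example, not bigbang's implementation; its rule (a name needs at least one alphabetic character) is an assumption.

```python
from email.utils import parseaddr


def sketch_clean_name(name):
    """Hypothetical stand-in for illustration; not bigbang's code."""
    # Drop surrounding whitespace and stray quote characters.
    cleaned = name.strip().strip("\"'")
    # Assumption: with no alphabetic character left, nothing is name-like.
    if not any(ch.isalpha() for ch in cleaned):
        return None
    return cleaned


# parseaddr returns a (name, address) pair.
name, address = parseaddr('"Jane Doe" <jane@example.org>')
print(sketch_clean_name(name), address)

# A header with no display name yields an empty name portion.
empty_name, _ = parseaddr("<jane@example.org>")
print(sketch_clean_name(empty_name))
```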

bigbang.parse.get_date(message)
bigbang.parse.get_refs(refs)
bigbang.parse.get_text(msg)

Get text from a message.
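The documentation does not say how the text is located, so the following is only a sketch of one common approach with the standard library: walk the message parts and collect those typed text/plain. The function name sketch_get_text and the choice to ignore other content types are assumptions made for this example.

```python
from email.message import EmailMessage


def sketch_get_text(msg):
    """Hypothetical sketch: join the text/plain parts of a message."""
    parts = []
    # walk() yields the message itself and, for multipart mail, each subpart.
    for part in msg.walk():
        if part.get_content_type() == "text/plain":
            parts.append(part.get_content())
    return "\n".join(parts)


msg = EmailMessage()
msg["From"] = "Jane Doe <jane@example.org>"
msg["Subject"] = "Example"
msg.set_content("Hello list,\nthis is the body.")

text = sketch_get_text(msg)
print(text)
```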

bigbang.parse.guess_first_name(cleaned_from)

Attempt to extract a person’s first name from the cleaned version of their name (from a ‘From’ field). This may or may not be the given name. Returns None if the heuristic doesn’t recognize a separable first name.
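The exact heuristic is not documented here. To make the idea concrete, this sketch shows one possible rule set: take the token after a comma in "Last, First" forms, otherwise take the first of at least two whitespace-separated tokens. The function sketch_guess_first_name and both rules are assumptions, and bigbang's behaviour may differ.

```python
def sketch_guess_first_name(cleaned_from):
    """Hypothetical heuristic for illustration; not bigbang's code."""
    if "," in cleaned_from:
        # "Doe, Jane" style: the first name follows the comma.
        _, _, rest = cleaned_from.partition(",")
        tokens = rest.split()
    else:
        tokens = cleaned_from.split()
        # A single token (for example a username) has no separable first name.
        if len(tokens) < 2:
            return None
    return tokens[0] if tokens else None


print(sketch_guess_first_name("Jane Doe"))
print(sketch_guess_first_name("Doe, Jane"))
print(sketch_guess_first_name("jdoe"))
```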

bigbang.parse.normalize_email_address(address)

Takes a valid email address and returns a normalized one, for matching purposes.
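Which normalization steps are applied is not stated. A minimal sketch, assuming that trimming whitespace and lowercasing are enough for matching purposes, might look like the following. The name sketch_normalize_email_address is hypothetical, and lowercasing the local part is a pragmatic assumption rather than something mail standards guarantee.

```python
def sketch_normalize_email_address(address):
    """Hypothetical sketch: trim and lowercase an address for matching."""
    return address.strip().lower()


a = sketch_normalize_email_address("  Jane.Doe@Example.ORG ")
b = sketch_normalize_email_address("jane.doe@example.org")
# Both spellings now compare equal.
print(a == b)
```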

bigbang.parse.split_references(refs)
bigbang.parse.tokenize_name(clean_name)

Create a tokenized version of a name, good for comparison and sorting for entity resolution.

Takes a Unicode name that should already be cleaned of most punctuation and spurious characters.
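To show why a tokenized form helps with comparison, here is a sketch under stated assumptions: lowercase the name, split it on whitespace, strip leftover commas and periods, and sort the tokens so that word order no longer matters. The function sketch_tokenize_name and these specific steps are illustrative guesses, not bigbang's implementation.

```python
def sketch_tokenize_name(clean_name):
    """Hypothetical sketch: order-independent, lowercase token string."""
    # Lowercase, split on whitespace, and strip leftover commas and periods.
    tokens = [t.strip(".,") for t in clean_name.lower().split()]
    # Drop empty tokens and sort so "Jane Doe" and "Doe, Jane" agree.
    tokens = sorted(t for t in tokens if t)
    return " ".join(tokens)


print(sketch_tokenize_name("Jane Doe"))
print(sketch_tokenize_name("Doe, Jane"))
```

With a form like this, two spellings of the same person can be grouped by comparing the tokenized strings directly.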