This data is available in the openintro package as the data frame email.
These data represent incoming emails to David Diez’s Gmail account during the first three months of 2012. All personally identifiable information has been removed.
A data frame with 3921 observations on the following 21 variables.
| Variable | Description |
|---|---|
| spam | Indicator for whether the email was spam. |
| to_multiple | Indicator for whether the email was addressed to more than one recipient. |
| from | Whether the message was listed as from anyone (this is usually set by default for regular outgoing email). |
| cc | Indicator for whether anyone was CCed. |
| sent_email | Indicator for whether the sender had been sent an email in the last 30 days. |
| time | Time at which email was sent. |
| image | The number of images attached. |
| attach | The number of attached files. |
| dollar | The number of times a dollar sign or the word "dollar" appeared in the email. |
| winner | Indicates whether "winner" appeared in the email. |
| inherit | The number of times "inherit" (or an extension, such as "inheritance") appeared in the email. |
| viagra | The number of times "viagra" appeared in the email. |
| password | The number of times "password" appeared in the email. |
| num_char | The number of characters in the email, in thousands. |
| line_breaks | The number of line breaks in the email (does not count text wrapping). |
| format | Indicates whether the email was written using HTML (e.g. may have included bolding or active links). |
| re_subj | Whether the subject started with "Re:", "RE:", "re:", or "rE:". |
| exclaim_subj | Whether there was an exclamation point in the subject. |
| urgent_subj | Whether the word "urgent" was in the email subject. |
| exclaim_mess | The number of exclamation points in the email message. |
| number | Factor variable saying whether there was no number, a small number (under 1 million), or a big number. |
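
As a minimal sketch of how the data frame might be loaded and inspected, assuming the openintro package is already installed (for example via `install.packages("openintro")`), something like the following should work. The cross-tabulation at the end is only an illustration chosen here, not part of the original help page, and the storage types reported by `str()` may differ between package versions.

```r
# Assumption: the openintro package is installed locally.
library(openintro)

# Load the data frame described in the table above.
data(email)

# Check the dimensions against the documented 3921 observations
# and 21 variables, then list each variable with its type.
dim(email)
str(email)

# Illustrative only: cross-tabulate the spam indicator against
# the three-level number variable.
table(email$spam, email$number)
```

If `dim(email)` reports something other than the documented size, the installed package version may ship a revised copy of the data.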
This information was copied from ?email on March 20, 2016.