The adult dataset contains columns describing a person's relationship status, hours worked per week, workclass and so on, together with salary, recorded as whether it exceeds 50K a year. It offers many possible protected attributes, such as sex, race and age. Some columns contain the level "unknown"; these values are not removed, and whether to remove them is left to the user, since they might carry some information.

data(adult)
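Because the "unknown" levels are left in place, it can help to count them before deciding whether to drop them. The following is a minimal sketch on a small invented data frame rather than on the real data; the column values and the exact spelling of the label ("unknown") are assumptions, so check levels() on the actual columns first.

```r
# Toy stand-in for a few `adult` columns; the values are invented for illustration.
toy <- data.frame(
  salary     = factor(c("<=50K", ">50K", "<=50K", ">50K", "<=50K")),
  workclass  = factor(c("Private", "unknown", "Private", "State-gov", "Private")),
  occupation = factor(c("Sales", "unknown", "unknown", "Tech-support", "Sales"))
)

# Label assumed to mark missing values; verify it with levels() on the real data.
unknown_label <- "unknown"

# Count the rows carrying the unknown label in each factor column.
factor_cols <- names(toy)[vapply(toy, is.factor, logical(1))]
unknown_counts <- vapply(
  toy[factor_cols],
  function(col) sum(col == unknown_label),
  integer(1)
)
print(unknown_counts)

# Flag rows where any factor column equals the unknown label.
has_unknown <- Reduce(`|`, lapply(toy[factor_cols], function(col) col == unknown_label))

# Keep the remaining rows and drop the factor levels that are now empty.
toy_clean <- droplevels(toy[!has_unknown, , drop = FALSE])
```

Dropping rows is only one option. Keeping "unknown" as its own level preserves the rows and lets a model treat missingness as a signal.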

Format

A data frame with 32561 rows and 15 variables:

salary

factor, <=50K/>50K, whether a person's salary exceeds 50K a year

age

integer, age of person

workclass

factor, field of work

fnlwgt

numeric

education

factor, completed education degree

education_num

numeric, education level converted from the education factor to a number; higher values mean a higher level of education

marital_status

factor

occupation

factor, occupation of this person

relationship

factor, relationship information

race

factor, ethnicity of a person

sex

factor, gender of a person

capital_gain

numeric

capital_loss

numeric

hours_per_week

numeric, number of hours this person works per week

native_country

factor, country in which this person was born
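As a rough illustration of how the salary column can be compared across a protected attribute such as sex, the sketch below computes the share of ">50K" rows per group. It uses a small invented data frame, not the real data, so the numbers carry no meaning beyond the example.

```r
# Toy stand-in with the same column names and types as in the format above;
# the rows are invented for illustration.
toy <- data.frame(
  salary = factor(c("<=50K", ">50K", ">50K", ">50K", "<=50K", "<=50K"),
                  levels = c("<=50K", ">50K")),
  sex    = factor(c("Female", "Male", "Female", "Male", "Male", "Female")),
  age    = c(25L, 41L, 37L, 52L, 29L, 60L)
)

# Share of rows with salary ">50K" within each level of the protected attribute.
high_rate <- tapply(toy$salary == ">50K", toy$sex, mean)
print(high_rate)

# Raw counts per group and salary level, useful for spotting very small groups.
counts <- table(toy$sex, toy$salary)
print(counts)
```

The same two calls should work on any factor column in place of sex, for example race.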

Source

Data from the UCI Machine Learning Repository: https://archive.ics.uci.edu/ml/datasets/adult