California vanity license plate applications with reasons for rejection

Original link: https://github.com/veltman/ca-license-plates

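This dataset contains 23,463 personalized license plate applications submitted to the California Department of Motor Vehicles (DMV) in 2015-2016, all of which were flagged for review. It includes the requested plate combination, the review reason code, the applicant's explanation of the plate's meaning, comments from DMV reviewers, and the approval status (Y/N). The review criteria are 7(B), which covers substituting letters for numbers or numbers for letters, and 7(D), which covers configurations with offensive, vulgar, or misleading connotations. In plate combinations, the characters # (hand), $ (heart), + (plus), and & (star) stand in for the actual symbols. The data was extracted from inconsistently formatted Excel files, so it may contain errors. Records that appeared to contain too much personal information were redacted where found. The dataset offers insight into the kinds of language and references the DMV considers unsuitable for personalized plates.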

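A Hacker News discussion examined this data on California plate applications from 2015-2016 that were flagged for review, which was parsed from the DMV's Excel files. Those files are not consistent in structure, so the parsed data may contain errors. Commenters shared their own rejections, such as "SIXSPD" being denied for an alleged abbreviation violation, and debated the DMV's reasoning. The discussion also noted references to Urban Dictionary while questioning its reliability. Some felt the DMV was oversensitive, pointing to rejections such as "GAYMER", "AIR$OFT", and even "PZA TACO" (because of a slang meaning of "taco"). The use of numbers also drew scrutiny, particularly "88" (interpreted as a coded reference to Hitler). The discussion centered on how to balance free speech against preventing offensive plates. Some argued for stricter rules to avoid harm, while others felt the DMV had too much power and was acting as an arbiter of taste. The discussion also raised the question of whether subjectivity should be removed from plate approval, with the counterargument that human judgment is needed to recognize potentially offensive expressions.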

Original text

Warning: this dataset contains vulgar and offensive language (quite a lot of it).

applications.csv is a CSV of 23,463 personalized license plate applications the California DMV received from 2015-2016.

These do NOT represent all applications received by the DMV during that timeframe, only applications that were flagged for additional review by the Review Committee. The file includes the following columns:

  • plate: the personalized license plate combination requested.
  • review_reason_code: Reason code for the application being reviewed (see below for codes).
  • customer_meaning: Meaning of the plate provided by the applicant.
  • reviewer_comments: Comments from DMV reviewers.
  • status: Y means the plate was approved, N means it was denied.
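
The column layout above can be read with Python's standard csv module. The following is a minimal sketch, assuming the file has a header row whose names match the columns listed above; tally_statuses is a hypothetical helper name, and the sample rows are invented placeholders, not records from the dataset.

```python
import csv
import io
from collections import Counter


def tally_statuses(csv_file):
    """Count the values found in the status column of an open CSV file."""
    counts = Counter()
    for row in csv.DictReader(csv_file):
        # Some records have no status, so treat None/blank as "(missing)".
        status = (row.get("status") or "").strip().upper()
        counts[status or "(missing)"] += 1
    return counts


# Invented placeholder rows for illustration only (not taken from the dataset).
SAMPLE = """plate,review_reason_code,customer_meaning,reviewer_comments,status
EXAMPL1,,made-up meaning,made-up comment,N
EXAMPL2,,another made-up meaning,,Y
EXAMPL3,,,No micro,
"""

sample_counts = tally_statuses(io.StringIO(SAMPLE))
print(sample_counts)

# With the real file, the call would look like this (assuming UTF-8 encoding):
# with open("applications.csv", newline="", encoding="utf-8") as f:
#     print(tally_statuses(f))
```

Counting raw values first, rather than assuming only Y and N occur, makes missing or unexpected statuses visible before any further analysis.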

This data was parsed from a set of 458 Excel workbooks that the DMV prepared for someone else's public records request. I received the files as a consolation prize in response to my own related records request, which I was told would cost $2,000 to fulfill otherwise.

  • In plate combinations, a # character indicates a hand symbol, a $ character indicates the heart symbol, a + character indicates the plus symbol, and a & character indicates the star symbol.
  • Some records are missing reason codes, customer meanings, reviewer comments, and/or statuses.
  • A reviewer comment of "No micro" indicates that a paper application was submitted but was unavailable in the DMV's imaging system.
  • In a few cases the status is some other character or word besides Y or N, possibly a typo.
  • I tried to redact any records I found that seemed to include too much personal information about the applicant (about 50 in total).
  • Because the data is parsed from Excel workbooks that are not 100% consistent in structure, there may be some errors.
  • Review reason codes are described as follows:
7(B) When a desired configuration is not available, a letter shall not be substituted for a number, nor shall a number be substituted for a letter, to create another configuration of a similar appearance.
7(D) The department shall refuse any configuration that may carry connotations offensive to good taste and decency, or which would be misleading, based on criteria which includes, but is not limited to, the following:
    1. The configuration has a sexual connotation or is a term of lust or depravity.
    2. The configuration is a vulgar term; a term of contempt, prejudice, or hostility; an insulting or degrading term; a racially degrading term; or an ethnically degrading term.
    3. The configuration is a swear word or term considered profane, obscene, or repulsive.
    4. The configuration has a negative connotation to a specific group.
    5. The configuration misrepresents a law enforcement entity.
    6. The configuration has been deleted from regular series license plates.
    7. The configuration is a foreign or slang word or term, or is a phonetic spelling or mirror image of a word or term falling into the categories described in subdivisions 1. through 6. above.
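
The notes above on symbol characters, missing values, and the "No micro" comment suggest a few small cleanup steps. This is a rough sketch rather than the author's own processing: the helper names (describe_plate, normalize_status, is_no_micro) are hypothetical, and treating any status other than Y or N as unknown is an assumption about how to handle the likely typos.

```python
# Placeholder characters used in plate combinations, per the notes above.
SYMBOLS = {"#": "hand", "$": "heart", "+": "plus", "&": "star"}


def describe_plate(plate):
    """Spell out symbol placeholders, leaving other characters unchanged."""
    return "".join(f"[{SYMBOLS[ch]}]" if ch in SYMBOLS else ch for ch in plate)


def normalize_status(raw):
    """Return True for approved (Y), False for denied (N), else None.

    Assumption: blank or unexpected values are treated as unknown
    instead of being guessed at.
    """
    value = (raw or "").strip().upper()
    if value == "Y":
        return True
    if value == "N":
        return False
    return None


def is_no_micro(comment):
    """True if the reviewer comment mentions "No micro" (case-insensitive)."""
    return "no micro" in (comment or "").lower()


print(describe_plate("I$CA"))  # the plate value here is an invented example
print(normalize_status(" y "), normalize_status(""), normalize_status("YES"))
print(is_no_micro("No micro"))
```

Returning None for anything other than Y or N keeps ambiguous records out of approval-rate calculations without discarding the rows themselves.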