[OC] Characters ranked by the percentage of English dictionary words they appear in
       

Thank you for your Original Content, /u/sataky!
Here is some important information about this post:

Remember that all visualizations on r/DataIsBeautiful should be viewed with a healthy dose of skepticism. If you see a potential issue or oversight in the visualization, please post a constructive comment below. Post approval does not signify that this visualization has been verified or its sources checked.



Someone show this to the dude who keeps tryna trade me q in Scrabble. Except for that time I chickened out of playing "queef" I never needed that shit enough to trade for


DATA: Wolfram Language (WL)

WordList[]: a WL function that returns a dictionary of about 40K frequently used English words. https://reference.wolfram.com/language/ref/WordList.html

TOOLS:

Wolfram Language (WL) https://reference.wolfram.com/language

Tweet-a-Program Twitter bot

NOTES:

  • I used the Twitter bot from the Tweet-a-Program project. You can tweet WL code at it, and it replies with the result of executing that code.
  • The code in a tweet is limited to the standard 280 characters minus the handle @wolframtap.
  • The Tweet-a-Program bot returns the result together with the code itself. Because the code is so short, it is easy to read what it does and to follow the data processing and visualization steps.
  • The percentages do not add up to 100% because each word contains several different characters and therefore counts toward each of them.
  • The characters also include two symbols: the dash - and the apostrophe '.
  • I suspect these percentages are almost independent of dictionary size. Even with larger dictionaries that include more of the less frequent words, this plot probably would not change much. There are about 170K words in the English language. If you think otherwise, let me know in the comments.
  • These percentages differ from the well-known letter frequencies of written language: https://en.wikipedia.org/wiki/Letter_frequency
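
For anyone who wants to try the idea without WL, here is a rough Python sketch of the two measures contrasted in the notes: the share of words that contain a character, and the classic letter frequency. This is not the code that was tweeted at the bot. The function names and the five-word sample list are made up for illustration; a real run would need an actual word list in place of `sample`.

```python
from collections import Counter

def char_word_share(words):
    """Percentage of words that contain each character at least once."""
    words = [w.lower() for w in words]
    if not words:
        return {}
    counts = Counter()
    for w in words:
        counts.update(set(w))  # set(): count a character once per word
    return {c: 100 * n / len(words) for c, n in counts.items()}

def char_frequency(words):
    """Classic letter frequency: share of all characters in the list."""
    counts = Counter("".join(w.lower() for w in words))
    total = sum(counts.values())
    return {c: 100 * n / total for c, n in counts.items()} if total else {}

# Hypothetical toy sample, NOT the 40K-word WordList[] data.
sample = ["queen", "jazz", "apple", "don't", "well-known"]
share = char_word_share(sample)
freq = char_frequency(sample)

for c in sorted(share, key=share.get, reverse=True):
    print(f"{c!r}: in {share[c]:.0f}% of words, {freq[c]:.1f}% of characters")
```

The first measure sums to more than 100% because a word counts toward every distinct character it contains, while the second always sums to 100%. The dash and the apostrophe show up as ordinary characters, as in the plot.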

For about 40 seconds, I was like "Characters in WHAT?" thinking it was about a play or story ... smh, it's early.


J being the lowest was surprising
