The Development of the ThinQu Keyboard Layout: Factors that Influence Typing Effort

The ThinQu Keyboard Layout is an optimized keyboard layout I recently developed for standard keyboards. It is designed to maximize both typing speed and comfort.

It turns out to be extremely costly and difficult to build a model of typing comfort or effort. Existing studies usually build a regression model with several variables to measure typing effort and compare a few layouts. The regression coefficients are either estimated from real data or purely speculative. The structures of these models are flawed because regression models are inflexible and there is too little data to estimate the coefficients accurately. Real human data on typing speed and effort across multiple layouts is very costly to obtain because newer layouts have small user populations and a high learning cost. In addition, typing effort is hard to measure, and time is not a good proxy for effort: time spent waiting for other fingers to finish is more relaxing than time spent reaching for the target key.

Several variables are known to correlate with typing effort, and I will go through the complexity of each one, although it is hard to speak quantitatively without empirical data. All frequency data come from the Norvig study.

Key location and effort

Workman’s layout nicely assigns each key an effort score. The main inaccuracy is that the N key (of the QWERTY keyboard) should be rated a 2, by symmetry with the V key. Second, the effort score for the low ring-finger keys should be closer to 3.5 than 4. Note that, because the right-hand base keys differ, ThinQu has an extra middle column that would be rated a 5 or 6.

Workman’s finger strain mapping. A higher score means more effort.

From the diagram, we can see a strong interaction between row and finger. Missing this interaction is the major drawback of carpalx’s model, and implicitly of Colemak and many other layouts.
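To make the scoring idea concrete, here is a minimal sketch of frequency-weighted effort scoring. The effort values and letter frequencies below are illustrative placeholders, not Workman’s actual scores or Norvig’s actual frequencies:

```python
# Sketch: score a layout by summing letter frequency x key effort.
# All numbers here are made-up placeholders for illustration only.

def layout_cost(layout, letter_freq, key_effort):
    """Expected effort per keystroke: sum of freq(letter) * effort(its key)."""
    return sum(letter_freq.get(letter, 0.0) * key_effort[key]
               for key, letter in layout.items())

# Toy key positions: two home-row keys, one top-row key, one low ring-finger key.
key_effort = {"home1": 1.0, "home2": 1.0, "top1": 2.5, "low_ring": 3.5}
letter_freq = {"e": 0.12, "t": 0.09, "a": 0.08, "z": 0.001}

layout_a = {"home1": "e", "home2": "t", "top1": "a", "low_ring": "z"}
layout_b = {"home1": "z", "home2": "a", "top1": "t", "low_ring": "e"}

# The layout that puts frequent letters on low-effort keys scores lower.
assert layout_cost(layout_a, letter_freq, key_effort) < \
       layout_cost(layout_b, letter_freq, key_effort)
```

This per-key scoring is exactly where the row-finger interaction matters: the effort table must assign scores to (row, finger) pairs rather than treating the two factors independently.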

To place the most frequent letters in the best locations and reduce finger movement, we consult the letter frequency chart, adjusted for the multi-letter keys th, in, qu, and tion:

[Letter frequency chart, adjusted for the multi-letter keys]

Note that h loses more than half of its usage to the th key. The in key takes away 28% of the combined load on i and n, relieving the pinkies.

Continue reading “The Development of the ThinQu Keyboard Layout: Factors that Influence Typing Effort”

Playing Codenames: When Frequentist Statistics Becomes Optimal

The Game

Codenames is a popular card game designed by Vlaada Chvátil and published in 2015. Each turn, the clue giver of one of the two competing teams announces a one-word clue and a number indicating how many words on the table relate to the clue, and the guessers on that team try to guess as many of their team’s words as possible. About 8 of the 25 words on the table belong to each team. The rest of the words are neutral, except that guessing the black assassin word loses the game. Only the clue givers know which words belong to which team. The team that has all of its words guessed first wins the game.

The following pictures show a game in progress, where all the guessed words are covered by the team’s color (beige is neutral). The color key at the bottom is shown only to the clue givers.


The Bayes formula

The guessers guess the words in sequence, up to one more than the number specified by the clue giver. For each guess, the guessers need to find the word that has the highest probability of belonging to their team. For each candidate word, this probability is calculated as follows, using the Bayes formula:

P(word ∈ own team | clue) = P(clue | word ∈ own team) · P(word ∈ own team) / P(clue)
Since each word is randomly assigned to each team, the prior P(word ∈ own team) is the same for every word. Additionally, the denominator P(clue) is fixed for any given clue. Therefore we can simplify the equation to:

P(word ∈ own team | clue) ∝ P(clue | word ∈ own team)

That is, given a clue, the probability that a word belongs to the guessers’ team is proportional to the probability that this clue would be used given that this word needs to be guessed. The left side is what we need, but it is extremely hard to calculate directly given the strategies of the game. The right side is more straightforward. Often, the guessers have to consider the few target words for the turn as a whole set and plug the set into the formula.
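As a sketch of this guessing rule: if the guessers can estimate the likelihood P(clue | word is a target) for each unrevealed word, the next guess reduces to an argmax over those likelihoods. The clue and the likelihood numbers below are hypothetical illustrations, not data from a real game:

```python
# Sketch: the posterior that a word belongs to our team is proportional
# to P(clue | word is a target), so we guess the word maximizing it.
# The likelihoods below are made-up illustrative numbers.

def best_guess(likelihoods):
    """Pick the unrevealed word maximizing P(clue | word is a target)."""
    return max(likelihoods, key=likelihoods.get)

# Hypothetical P(clue = "ocean" | word is a target) for four board words.
p_clue_given_word = {"wave": 0.30, "bank": 0.05, "shark": 0.25, "piano": 0.01}

assert best_guess(p_clue_given_word) == "wave"
```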

Continue reading “Playing Codenames: When Frequentist Statistics Becomes Optimal”

The Evolution of Risk Preferences

In this post, I explore evolutionarily adaptive risk preferences under various conditions.

Fair wager model

To understand the long-term effects of risk aversion vs. risk neutrality, we consider an evolutionary player who starts with 10 apples and repeatedly takes a wager that wins one apple half of the time and loses one apple half of the time. The wager is considered “fair” since its expected payoff is 0. In a population, every player is presented with the same wagers, but the outcomes of the wagers are independent, or idiosyncratic. A player stops playing when he loses all his apples and dies. The wager repeats 10,000 times in a short time, and we record the number of apples owned at the end, which is proportional to the population size, or fitness. Three thousand simulations are run, and the distribution of the final number of apples is plotted in figure 1.1.
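The setup above can be sketched as a quick Monte Carlo simulation (an illustration of my own, not the original code behind the figure):

```python
import random

# Monte Carlo sketch of the fair-wager model: a player starts with 10
# apples, wins or loses one apple per wager with equal probability,
# dies at 0 apples, and plays up to 10,000 wagers.

def simulate_player(start=10, wagers=10_000, rng=random):
    apples = start
    for _ in range(wagers):
        if apples == 0:          # lost all apples: the player dies
            break
        apples += 1 if rng.random() < 0.5 else -1
    return apples

random.seed(0)
results = [simulate_player() for _ in range(3000)]

# The wager is fair, so the mean stays near the initial 10 apples,
# but most players go extinct and a few end up with many apples.
extinct = sum(1 for r in results if r == 0) / len(results)
print(f"mean apples: {sum(results) / len(results):.1f}, extinct: {extinct:.0%}")
```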

Figure 1.1

As plotted in figure 1.1, in the long run the player almost surely dies. The distribution is severely skewed to the right. The single-period expected payoff and the long-run expected payoff are both 0 apples, because starvation does not change the expected future payoff: whether wagering or dead, the expected payoff remains 0.

Can we say evolution disfavors this kind of wager because the extinction rate is high? Without additional assumptions, the answer is no. If individuals in a population either always accept or always decline this wager, then after many generations more than 99% of the lineages will descend from the risk-averse individuals who decline the wager, which makes risk aversion seem like the dominant strategy. However, if we count the number of surviving individuals, the two risk preferences are equally successful, with equal total population sizes, since the expected payoff of both preferences is 0. We will use this method to define fitness in the rest of the article, and we can therefore assume that what evolution maximizes is the long-run expected number of offspring produced. Assuming every offspring is the same, an organism selected by maximizing long-run fitness should maximize the number of his children (and kin) in his lifetime.

Continue reading “The Evolution of Risk Preferences”

9 Chrome Extensions for Surfing the Web Statistically

There is a lot of data on the web that can help us surf the internet more efficiently, and the extensions listed below help us take advantage of it. Some extensions collect data from users and summarize it, some analyze user-generated content, and others record the history of web pages. SEO-oriented extensions are excluded from this list.

1. Alexa Traffic Rank


This extension shows the current website’s global traffic ranking as well as its ranking in the country that generates the most traffic for it. This is the quickest way to learn about a website’s popularity and credibility. Moreover, it shows websites similar to the current one with remarkable accuracy, which enables a graph-based traversal of the internet. The Wayback Machine link lets the user view old versions of the current website’s main page. To view old versions of any web page, there is a dedicated extension listed below at #9.

Upon clicking the main link, it shows the traffic distribution over time and by country, subdomain, gender, education, and browsing location. Traffic data is collected mainly through the Alexa Toolbar and this extension.

Number of users: 550,000

Alternatives: SimilarWeb, which has better graphics but less reliable traffic rankings.

Continue reading “9 Chrome Extensions for Surfing the Web Statistically”