The Development of the ThinQu Keyboard Layout: Factors that Influence Typing Effort

The ThinQu Keyboard Layout is an optimized keyboard layout I recently developed for standard keyboards. It is designed to maximize both typing speed and comfort.

It turns out to be extremely costly and difficult to build a model of typing comfort or effort. Existing studies usually build a regression model with several variables to measure typing effort and compare a few layouts. The regression coefficients are either estimated from real data or purely speculative. These models are structurally flawed because regression models are inflexible and there is too little data to estimate the coefficients accurately. Real human data on typing speed and effort across multiple layouts is very costly to obtain, because newer layouts have few users and a high learning cost. In addition, typing effort is very hard to measure, and time is not a good proxy for it: time spent waiting for other fingers to finish is more relaxing than time spent reaching for the target key.

A few variables are known to correlate with typing effort. I will go through the complexity of each one, although it is hard to speak quantitatively without empirical data. All frequency data come from the Norvig study.

Key location and effort

Workman’s layout nicely gives each key an effort score. The main inaccuracy is that the N key (of the QWERTY keyboard) should be rated a 2, by symmetry with the V key. Secondly, the effort score for the low ring finger keys should be closer to 3.5 than to 4. Note that because the right-hand base keys differ, there is an extra middle column in ThinQu, which would be rated a 5 or 6.

From the diagram, we can see a strong interaction between row and finger. Missing this interaction is the major drawback of carpalx’s model and, implicitly, of Colemak and many other layouts.

To place the most frequent letters in the best locations to reduce finger movement, we consult the letter frequency chart, which is adjusted for the multi-letter keys th, in, qu, and tion:

Note that h loses more than half of its usage due to the th key. The in key takes away 28% of the load on i and n, alleviating the pinkies.
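As a rough illustration of how such an adjustment could be computed, the sketch below splits text into key presses, treating th, in, qu, and tion as single presses, and then counts relative frequencies. The function names and the longest-match-first rule are assumptions made for this example, not the method actually used to produce the chart.

```python
from collections import Counter

# Multi-letter outputs named in the text. Trying the longest one first is an
# assumption of this sketch. Note that "tion" is typed as Shift+in, so it
# shares a physical key with "in" but is counted separately here.
MULTI_KEYS = ("tion", "th", "in", "qu")


def to_key_presses(text):
    """Split text into key presses, preferring the multi-letter keys."""
    presses = []
    text = text.lower()
    i = 0
    while i < len(text):
        for key in MULTI_KEYS:
            if text.startswith(key, i):
                presses.append(key)
                i += len(key)
                break
        else:
            # No multi-letter key matched: keep single letters, skip the rest.
            if text[i].isalpha():
                presses.append(text[i])
            i += 1
    return presses


def adjusted_frequencies(text):
    """Return the share of key presses that falls on each key."""
    counts = Counter(to_key_presses(text))
    total = sum(counts.values())
    return {key: n / total for key, n in counts.items()}
```

Running `adjusted_frequencies` on a large English corpus would show h, i, and n losing part of their load to the multi-letter keys, which is the effect described above.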

Introducing ThinQu, the Fully Optimized Keyboard Layout

There have been many attempts to improve the QWERTY keyboard layout. Dvorak maximizes hand alternation; Colemak and many others emphasize ease of switching from QWERTY; Workman improves on these by considering the effort of lateral movements and the interaction between finger and row; the Q*MLW* layouts provide the optimal solution to a quantitative effort model but fail to consider the factors raised by Workman. I generally agree with Workman’s finger strain mapping.

To design the most ergonomic keyboard layout, all of the advantages of the existing layouts are considered. The ThinQu layout has the following features in common with its predecessors:

• As much as 60.6% of letter typing is done with the eight base keys (lower than Colemak’s 65.0%, because of the two-letter keys). 80.1% of letter typing is on the 14 keys rated 1-2 by Workman. There is a strong correlation between letter frequency and Workman’s rating.
• Hand alternation is strongly favored. Within each hand, hand rolling (consecutive letters on the same row using adjacent fingers) is favored.
• Same-finger movement is strongly penalized, which provides motivation for a more even frequency distribution across fingers.
• Lateral movement of the index and little finger is strongly penalized. The usage of the middle columns excluding numbers is 8.9%, compared to Workman’s 7.0% and Dvorak’s 14.8%.
• Hand utilization is more balanced, with only a slight consideration for right-handedness. 49.7% of letter typing is done by the right hand. Counting punctuation marks and Shift, but not numbers or special keys like Enter, 51.4% of the typing is done by the right hand.
• Because typing effort and speed take priority, ease of learning for a QWERTY user is minimally considered. No letters, four punctuation marks, and all numbers are in the same place as in QWERTY. Three letters stay on the same finger as in QWERTY. Users who want to put less effort into learning the ThinQu layout should consider the transitional ThinQu layout.

• The most frequent and the third most frequent bigrams, th and in, have their own keys. The in key also lessens the burden on the pinkies, since the i and n keys are pressed less often. A third two-letter key, qu, replaces the q key, since u follows q 99.1% of the time. Hence the name ThinQu. Also, Shift+in does not produce In, but rather the most common 4-gram, tion. Note that the bigram keys still output lower-case letters when CapsLock is on.
• The base keys are shifted to the right by one space so that the right pinky is closer to Enter, Shift, and Backspace. The shift also moves most of the pinky’s punctuation duties to the new middle column for the index fingers. The right index finger now rests on the K key of QWERTY. I recommend that you physically swap the J and K keys so your index finger can still feel the bump. For those using a split keyboard, the non-shift version of ThinQu is recommended.
• All modifier keys in Windows and third party applications are still in the QWERTY layout. You can keep using Ctrl+C and Ctrl+V keys in their usual location.
• Utilization frequency for each finger is related to that finger’s strength but with finger movement in mind – less moved fingers can bear a higher frequency.
• The locations of punctuation marks are optimized by considering their frequencies. Square and curly brackets are available through Alt Gr.
• In addition to brackets, letters with diacritics and other less often used symbols can be entered by pressing Alt Gr (right Alt or Alt+Ctrl). The layout is identical to the English international keyboard in Windows. Square and curly brackets, along with an added – (en dash) symbol, are placed in the home row.

• There is a programming version that makes symbols more accessible by some rearrangement.
• Users can modify the layout to their preference by editing the .klc file. Programmers can rearrange the symbols in the programming version to suit their programming language. This article lists the frequency distribution of symbols for each programming language.

The ThinQu layout makes the following assumptions:

• The ThinQu layout is only suitable for touch typists.
• It was designed for keys laid out in a QWERTY style (i.e. columns are staggered) although the relative loss for an ortholinear keyboard to adopt ThinQu is very small.
• The language used is always English with punctuation marks, occasional internet slang and abbreviated spelling, and limited account/password/captcha entry.

The Game

Codenames is a popular card game designed by Vlaada Chvátil and published in 2015. Each turn, the clue giver of one of the two competing teams announces a one-word clue and a number indicating how many words on the table are related to that clue, and the guesser(s) then try to guess as many of their own team’s words as possible. About 8 of the 25 words on the table belong to each team. The rest of the words are neutral, but guessing the black assassin loses the game. Only the clue givers know which words belong to which team. The team whose words have all been guessed wins the game.

The following pictures show a game in progress where all the guessed words are covered by the team’s color (beige is neutral). The color key at the bottom is only shown to the clue givers.

The Bayes formula

The guessers guess the words in sequence, up to one more than the number specified by the clue giver. For each guess, the guessers need to find the word that has the highest probability of belonging to their team. For each candidate word, this probability is calculated as follows, using the Bayes formula.
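Written out with the quantities named in the next paragraph, Bayes’ formula for a candidate word reads as follows (a reconstruction from the surrounding text, using its notation):

$$P(Guessed \, word \in Own \, team's \, words \mid Clue) = \frac{P(Clue \mid Guessed \, word \in Own \, team's \, words) \, P(Guessed \, word \in Own \, team's \, words)}{P(Clue)}$$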

Since each word is randomly assigned to each team, the quantity $P(Guessed \, word \in Own \, team's \, words)$ is the same for every word. Additionally, the denominator $P(Clue)$ is fixed for any given clue. Therefore we can simplify the equation as:
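In the same notation, the simplified relation can be written as (again a reconstruction, based on the description that follows):

$$P(Guessed \, word \in Own \, team's \, words \mid Clue) \propto P(Clue \mid Guessed \, word \in Own \, team's \, words)$$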

That is, given a clue, the probability that a word belongs to the guessers’ team is proportional to the probability that this clue would be used given that this word needs to be guessed. The left side is what we need, but it is extremely hard to calculate given the strategies of the game. The right side is more straightforward. Often, the guessers have to consider the few target words for this turn as a whole set and plug that set into the formula.
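As a small sketch of how guessers might apply this proportionality, the code below ranks candidate words by an estimated probability of the clue given each word. Because the prior and $P(Clue)$ are the same for every word, ranking by this likelihood is equivalent to ranking by the posterior. The function name and the example numbers are hypothetical; in practice the likelihoods are subjective judgments by the guessers.

```python
def rank_guesses(clue_likelihoods, clue_number):
    """Order candidate words by P(clue | word is ours), highest first.

    clue_likelihoods maps each unguessed word to the guessers' estimate of
    how likely the clue giver would be to use this clue if that word were a
    target. Up to clue_number + 1 guesses are allowed, as in the rules above.
    """
    ranked = sorted(clue_likelihoods, key=clue_likelihoods.get, reverse=True)
    return ranked[: clue_number + 1]


# Hypothetical estimates for the clue "fruit, 2".
estimates = {"apple": 0.30, "pie": 0.20, "river": 0.05, "tree": 0.15}
print(rank_guesses(estimates, 2))
```

This treats each word independently; considering the target words as a set, as noted above, would require estimating the likelihood of the clue for each candidate set instead.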

How AKB48 Can Revive by Balancing Exploitation and Exploration

AKB48 Group is the world’s largest girl idol group, based in Tokyo, Japan. It contains six smaller constituent groups, called sister groups, with over 300 members in total. Naturally, popularity differs from member to member, and it should follow a relatively symmetric unimodal distribution. Since there is no way for every member to get equal and adequate opportunities to perform, a dozen or so prominent members are selected to receive half of the attention. As the histogram below shows, however, the popularity distribution (measured by the number of followers on Showroom) is heavily right-skewed due to at least two effects: (1) the distribution of media exposure is extremely right-skewed and (2) the number of years since joining the group is right-skewed. We know that media exposure can shape member popularity by contrasting this popularity distribution with that of a more egalitarian group, Keyakizaka46, whose popularity and media exposure distributions are both very symmetric and concentrated. Keyakizaka46 is able to achieve relative equality of media exposure thanks to its small size of 21 members (not counting under group members).

When Should We Replace \$5 Bills With \$3 Bills?

Almost all currencies in the world have denominations that start with 1, 5, or 0 and the main difference among them is whether \$2, \$20, \$200, etc. bills are used. But prevalence does not prove efficiency. In this post, I will compare the efficiency of different bill denomination systems.

An efficient currency minimizes the cost of transactions. To simplify, we assume that the cost of a transaction is proportional to the time spent on it. This cost can be split into time spent by the customer (while the cashier waits) and time spent by the cashier (while the customer waits). Each component further divides into the following categories, denoted by a, b, and c:

• Fixed transaction cost (a): time spent on taking out the wallet, opening the cash register, thinking about which bills to use (while doing nothing else), and handing over and receiving the money. Avoid double counting if these jobs overlap in time or if the other party is not waiting but doing some other necessary work, such as printing the receipt.
• Fixed cost for using bills in each denomination (b): time spent on moving the hand to reach a specific slot in the cash register; time spent on finding the place for a specific denomination in the wallet and putting all bills of this denomination in the other hand or on the counter before moving on to the next denomination.
• Cost for counting a bill (c).

We acknowledge that cost a can be substantially lower if no change is required from the cashier, which saves the time of passing the change back to the customer. However, most transactions involve taxes and multiple items, which lead to the use of coins (in the US). Most of the time the customer does not have the exact amount in coins or does not want to pay with coins. Secondly, for the same reason, charges are usually not psychologically convenient numbers (e.g. \$5, \$20). Nice numbers can make buying decisions easier, but they don’t show up in the payment. Therefore we can assume that change is always involved, so a is constant and can be dropped entirely from this analysis.
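With a dropped, the remaining cost of handing over an amount depends only on b and c. The sketch below shows one way this could be computed for a given denomination system. It assumes whole-dollar amounts, a denomination set that includes 1, and a largest-bill-first (greedy) choice of bills, which is not always the cheapest combination. The function names and the default values of b and c are placeholders chosen for illustration.

```python
def greedy_bills(amount, denominations):
    """Split a whole-dollar amount into bills, largest denomination first.

    Returns a dict mapping denomination -> number of bills used.
    Assumes the denominations include 1 so that any amount can be paid.
    """
    bills = {}
    for d in sorted(denominations, reverse=True):
        count, amount = divmod(amount, d)
        if count:
            bills[d] = count
    return bills


def handling_cost(amount, denominations, b=1.0, c=0.5):
    """Cost = b per denomination used + c per bill counted (a is dropped)."""
    bills = greedy_bills(amount, denominations)
    return b * len(bills) + c * sum(bills.values())


# Compare a system with a 5 bill to one with a 3 bill for an 8 dollar payment.
print(handling_cost(8, [1, 5, 10, 20]))  # one 5 and three 1s
print(handling_cost(8, [1, 3, 10, 20]))  # two 3s and two 1s
```

Averaging `handling_cost` over a distribution of payment amounts would give one number per denomination system, which is the kind of comparison this post is after.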

How to Recover Deleted Coursera and Berkeley Courses from Archive.org

In June 2016, Coursera.org deleted 472 open online courses as it migrated from an old system to a new system; from March to August 2017, UC Berkeley made all 350+ courses on Webcast.berkeley private. Fortunately, most of these courses have been archived by the Archive Team. This post provides the instructions for downloading and opening the archived courses.

Recovering UC Berkeley courses

Recovering Berkeley courses is easy. Go to this link:

http://www.archiveteam.org/index.php?title=UC_Berkeley_Course_Captures#Status

and find the course you want to watch, then click on the link to archive.org in the right column. You can watch the videos online or download them. You can find archived course descriptions on this page or search in Berkeley’s academic guide.

The Evolution of Risk Preferences

In this post, I explore the evolutionarily adaptive risk preferences under various conditions.

Fair wager model

To understand the long-term effects of risk aversion vs. risk neutrality, we consider an evolutionary player who starts with 10 apples and repeatedly makes a wager that wins one apple half of the time and loses one apple the other half. The wager is considered “fair” since the expected payoff is 0. In a population, every player is presented with the same wagers, but the outcomes of the wagers are independent, or idiosyncratic, across players. The player stops playing when he loses all his apples and dies. The wager repeats 10000 times in a short time, and we record the number of apples owned at the end, which is proportional to the population size, or fitness. Three thousand simulations are run and the distribution of the number of apples is plotted in figure 1.1.
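A minimal sketch of this simulation, using the parameters stated above (10 starting apples, 10000 wagers, 3000 players), might look like the following. The function names and the fixed random seed are choices made for this example and are not taken from the original analysis.

```python
import random


def play_fair_wagers(start_apples, rounds, rng):
    """Return the apples left after up to `rounds` fair wagers.

    Each wager wins or loses one apple with equal probability. A player
    who reaches 0 apples is dead and stops playing.
    """
    apples = start_apples
    for _ in range(rounds):
        if apples == 0:
            break
        apples += 1 if rng.random() < 0.5 else -1
    return apples


def simulate_population(n_players=3000, start_apples=10, rounds=10000, seed=0):
    """Simulate independent players and return their final apple counts."""
    rng = random.Random(seed)
    return [play_fair_wagers(start_apples, rounds, rng) for _ in range(n_players)]
```

Calling `simulate_population()` and plotting a histogram of the result gives a picture of the distribution; the share of zeros in the list is the fraction of players who died.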

As figure 1.1 shows, in the long run the player almost surely dies. The distribution is severely right-skewed. The single-period expected payoff and the long-run expected payoff are both 0 apples: starvation does not change the expected future payoff, because whether the player is wagering or dead, the expected payoff remains 0. Can we say evolution disfavors this kind of wager since the extinction rate is high? Without additional assumptions, the answer is no. If individuals in a population either always accept or always decline this wager, then after many generations more than 99% of the lineages will come from the risk-averse individuals who decline it, which makes risk aversion seem like the dominant strategy. However, if we count the number of surviving individuals instead, the two risk preferences are equally successful: their total population sizes are equal, since the expected payoff of both risk preferences is 0. We will use this method to define fitness in the rest of the article, and therefore we can assume that what evolution maximizes is the long-run expected number of offspring produced. Assuming every offspring is the same, an organism selected by maximizing long-run fitness should maximize the number of his children (and kin) in his lifetime.