China tops list of best programming nations

A new study has given insight into the world’s developers and how they score in various aspects of software development. According to HackerRank’s data, which covers over 1.5 million users, China and Russia have the most talented developers. Developers are scored on a combination of skill and speed. Chinese programmers outscore all other countries in mathematics, functional programming, and data structures challenges, while Russians dominate in algorithms, the most popular and most competitive arena. Although the United States and India provide the majority of competitors on HackerRank, they only manage to rank 28th and 31st respectively.

The folks at HackerRank looked at each country’s average score (50 countries in total) across the following 15 domains: Algorithms, Java, Data Structures, C++, Tutorials, Mathematics, Python, SQL, Shell, Artificial Intelligence, Functional Programming, Databases, Ruby, Distributed Systems, and Security. They then standardized the scores for each domain (by subtracting the mean from each score and then dividing by the standard deviation; also known as a z-score) before finding the average. They then converted these z-scores into a 1-100 scale for easy interpretation.
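To make the scoring steps concrete, here is a minimal sketch in Python of the procedure described above: standardize each domain, average the z-scores per country, then rescale. It is not HackerRank's code. The country names and raw scores are made up, the function names are mine, and two details are assumptions because the description does not specify them: the population standard deviation is used, and the 1-100 conversion is done with a simple min-max rescale.

```python
from statistics import mean, pstdev


def z_scores(scores):
    """Standardize one domain: subtract the mean, divide by the standard deviation.

    Assumption: population standard deviation (the source does not say which).
    """
    values = list(scores.values())
    mu = mean(values)
    sigma = pstdev(values)
    return {country: (s - mu) / sigma for country, s in scores.items()}


def overall_ranking(domain_scores):
    """domain_scores maps domain -> {country -> raw average score}."""
    per_domain = {d: z_scores(s) for d, s in domain_scores.items()}
    countries = list(next(iter(domain_scores.values())))
    # Average each country's z-scores across all domains.
    avg_z = {c: mean(per_domain[d][c] for d in per_domain) for c in countries}
    # Assumption: min-max rescale so the lowest country gets 1 and the highest 100.
    lo, hi = min(avg_z.values()), max(avg_z.values())
    return {c: 1 + 99 * (z - lo) / (hi - lo) for c, z in avg_z.items()}


# Hypothetical data: three invented countries, two domains.
toy = {
    "Algorithms": {"A": 90, "B": 60, "C": 30},
    "Python": {"A": 70, "B": 80, "C": 30},
}
for country, score in sorted(overall_ranking(toy).items(), key=lambda kv: -kv[1]):
    print(country, round(score, 1))
```

Standardizing first means a domain with a wide spread of raw scores does not swamp the others when the average is taken.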

Below is the top 20:

  1. China – 100
  2. Russia – 99.9
  3. Poland – 98.0
  4. Switzerland – 97.9
  5. Hungary – 93.9
  6. Japan – 92.1
  7. Taiwan – 91.2
  8. France – 91.2
  9. Czech Republic – 90.7
  10. Italy – 90.2
  11. Ukraine – 88.7
  12. Bulgaria – 88.2
  13. Singapore – 87.1
  14. Germany – 84.3
  15. Finland – 84.3
  16. Belgium – 84.1
  17. Hong Kong – 83.6
  18. Spain – 83.4
  19. Australia – 83.2
  20. Romania – 81.9

Do you notice anything that most of these countries have in common? Almost all of them are non-English-speaking countries! Astounding. Singapore (13th place) has four official languages (English, Malay, Mandarin and Tamil), but English is the language of instruction in all public schools. Hong Kong (17th place) has English and Chinese as official languages, and English is used to varying degrees in education. Australia (19th place) is the highest-ranked country where English is the only national language.

The big shock here is that the US is ranked 28th with a score of 78.0, and India ranks 31st with a score of 76. Also, the UK is 29th with a score of 77.7. That’s right. According to HackerRank, the homeland of Don Knuth, Stanford, MIT, etc. is 28th in the world in terms of software development. By the way, the US and India also provide the largest numbers of developers on HackerRank – and they don’t feature in the top 5 of any of the 15 domains tested.

At this point I should give a nod to Ireland, placing 5th in Artificial Intelligence, and 32nd overall with a score of 75.9.

China topped the list in three domains (mathematics, functional programming, and data structures) and was in the top 5 in seven domains. It is also the second-least likely country to choose Java, behind only Taiwan, and has the largest proportion of developers focusing on Python.

I find the following results particularly astonishing:

  1. The US is ranked 28th
  2. India is ranked 31st
  3. All but three of the top 20 are non-English speaking countries

This is related to my interest in non-native English speakers learning computer science, which will be the topic of an upcoming post, and is the topic of one of Mark Guzdial’s recent posts.

See HackerRank’s blog post here for more.

HackerRank is a California-based company that ranks programmers based on their coding skills and helps connect those programmers to employers.

A great resource for non-native English speakers studying computing

I have been teaching this semester in Beijing. The language of instruction is English, but most of my students are not fluent – improving English is part of the program here. Two of my modules are CS1 and Computer Organization. Early in the semester, in both courses, I encouraged students to look up a few terms in the Free Online Dictionary of Computing (foldoc.org). Little did I know then that I would end up referring to FOLDOC almost every week.

Started in 1985 by Denis Howe, FOLDOC is an online, searchable, encyclopedic dictionary, currently containing nearly 15,000 definitions. It also includes cross-references and pointers to related resources elsewhere on the Internet, as well as bibliographical references to paper publications.

What I really like about FOLDOC is its simplicity, and that the definitions are pointedly context-based, specifically describing what words mean in the context of computing. I never really thought about it until recently, but in computing we use many words in ways that can be quite far from their ‘normal’ meanings. Take for instance the word load. Computing people happily abuse this word, using it often and with several meanings. The Merriam-Webster Dictionary has these ‘simple definitions’ for load:

1. something that is lifted and carried

2. an amount that can be carried at one time : an amount that fills something (such as a truck)

3. the weight that is carried or supported by something

None of the other ‘full definitions’ mention anything like those that FOLDOC gives:

load

1. To copy data (often program code to be run) into memory, possibly parsing it somehow in the process. E.g. “WordPerfect can’t load this RTF file – are you sure it didn’t get corrupted in the download?” Opposite of save.

2. The degree to which a computer, network, or other resource is used, sometimes expressed as a percentage of the maximum available. E.g. “What kind of CPU load does that program give?”, “The network’s constantly running at 100% load”. Sometimes used, by extension, to mean “to increase the level of use of a resource”. E.g. “Loading a spreadsheet really loads the CPU”. See also: load balancing.

3. To install a piece of software onto a system. E.g. “The computer guy is gonna come load Excel on my laptop for me”. This usage is widely considered to be incorrect.

FOLDOC is pretty comprehensive too. Writing this post I hit ‘random’ on the site, and it brought me to the definition of CACM:

Communications of the ACM

(publication) A monthly publication by the Association for Computing Machinery sent to all members. CACM is an influential publication that keeps computer science professionals up to date on developments. Each issue includes articles, case studies, practitioner oriented pieces, regular columns, commentary, departments, the ACM Forum, technical correspondence and advertisements.

http://acm.org/cacm/.

Then I googled CACM. The CACM we know and love is the 5th hit, and unless you know what ACM stands for, the first page of results isn’t much help if you are trying to find out what CACM means or stands for (in a computing context). I wish that someone had given me such a brief synopsis of CACM when I was starting out.

Other good entries for ‘normal’ English words whose computing definitions are not easily found on the net are iteration and volatile:

iteration

(programming)   Repetition of a sequence of instructions. A fundamental part of many algorithms. Iteration is characterised by a set of initial conditions, an iterative step and a termination condition.

A well known example of iteration in mathematics is Newton-Raphson iteration. Iteration in programs is expressed using a loop, e.g. in C:

	new_x = n/2;
	do
	{
	  x = new_x;
	  new_x = 0.5 * (x + n/x);
	} while (abs(new_x-x) > epsilon);

Iteration can be expressed in functional languages using recursion:

	solve x n = if abs(new_x-x) > epsilon
		    then solve new_x n
		    else new_x
		    where new_x = 0.5 * (x + n/x)
        solve n/2 n

volatile

1.   (programming)   volatile variable.

2.   (storage)   See non-volatile storage.
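For students who want to run the iteration entry's example, here is a sketch of the same Newton-Raphson square-root computation in Python, in both loop and recursive form. It is not part of the FOLDOC entry. The function names and the default tolerance are my own choices, and both functions assume n is positive.

```python
def sqrt_loop(n, epsilon=1e-9):
    """Loop form, mirroring the entry's do/while structure."""
    new_x = n / 2                      # initial condition
    while True:
        x = new_x
        new_x = 0.5 * (x + n / x)      # iterative step
        if abs(new_x - x) <= epsilon:  # termination condition
            return new_x


def sqrt_recursive(x, n, epsilon=1e-9):
    """Recursive form: each call performs one iterative step."""
    new_x = 0.5 * (x + n / x)
    if abs(new_x - x) > epsilon:
        return sqrt_recursive(new_x, n, epsilon)
    return new_x


n = 2.0
print(sqrt_loop(n))
print(sqrt_recursive(n / 2, n))  # start from the same initial guess, n/2
```

The three ingredients the definition names (initial conditions, an iterative step, a termination condition) are marked in the comments. One caution for anyone compiling the C snippet as quoted: in C, abs takes an int, so a version working on double values would use fabs from <math.h>.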

A few more clicks on random brought me to this, proof that those behind FOLDOC also have a great sense of humor:

elephant

Large, grey, four-legged mammal.

Update August 3, 2016 – Merriam-Webster has a learner’s dictionary, which could be a valuable resource for those learning English.