How We Judge the Top Programming Languages

The methodology behind our rankings

Our interactive ranking of the most popular programming languages was first created by data journalist Nick Diakopoulos in 2013. The current version is maintained by IEEE Spectrum senior editor Stephen Cass with development support from Preeti Kulkarni and Michael Novakovic. As no one can look over the shoulders of every programmer, we have chosen metrics that we believe are reasonable proxies of popularity. By combining metrics to synthesize a single ranking, we hope to even out statistical fluctuations. Changing the weights given to different metrics as they’re combined lets us emphasize different aspects, such as what’s popular with employers in our Jobs ranking. Data is gathered through a combination of manual collection and APIs and combined using an R script.
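To illustrate how weighting can change a combined ranking, here is a minimal Python sketch. It is not the R script mentioned above, whose details are not given here. The normalization (dividing each metric by its largest value), the function name `combine_metrics`, the metric names, the weights, and the placeholder languages and counts are all assumptions made up for illustration.

```python
from typing import Dict, List, Tuple


def normalize(metric: Dict[str, float]) -> Dict[str, float]:
    """Scale one metric so its top language scores 1.0 (assumed scheme)."""
    top = max(metric.values())
    if top == 0:
        return {lang: 0.0 for lang in metric}
    return {lang: value / top for lang, value in metric.items()}


def combine_metrics(
    metrics: Dict[str, Dict[str, float]],
    weights: Dict[str, float],
) -> List[Tuple[str, float]]:
    """Return (language, score) pairs sorted from highest to lowest score.

    Each score is the weighted average of the language's normalized metrics.
    A language missing from a metric counts as 0 for that metric.
    """
    total_weight = sum(weights.get(name, 0.0) for name in metrics)
    if total_weight <= 0:
        raise ValueError("at least one metric needs a positive weight")

    languages = set()
    for raw in metrics.values():
        languages.update(raw)

    scores = {lang: 0.0 for lang in languages}
    for name, raw in metrics.items():
        scaled = normalize(raw)
        weight = weights.get(name, 0.0)
        for lang in languages:
            scores[lang] += weight * scaled.get(lang, 0.0)

    # Sort by descending score, breaking ties alphabetically.
    return sorted(
        ((lang, score / total_weight) for lang, score in scores.items()),
        key=lambda pair: (-pair[1], pair[0]),
    )


if __name__ == "__main__":
    # Invented counts for three placeholder languages, not real data.
    example = {
        "search_hits": {"LangA": 1000, "LangB": 500, "LangC": 250},
        "job_listings": {"LangA": 20, "LangB": 80, "LangC": 40},
    }
    equal = {"search_hits": 1.0, "job_listings": 1.0}
    jobs_heavy = {"search_hits": 0.2, "job_listings": 1.0}  # hypothetical weights
    print(combine_metrics(example, equal))       # order: LangB, LangA, LangC
    print(combine_metrics(example, jobs_heavy))  # order: LangB, LangC, LangA
```

With equal weights the search-heavy LangA places second, but once job listings dominate the weighting it falls behind LangC. That is the kind of shift in emphasis that different weightings are meant to produce.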

We originally started with a list of over 300 programming languages gathered from GitHub. We looked at the volume of results found on Google when we searched for each one using the template “X programming” where “X” is the name of the language. We then filtered out languages that had a very low number of search results, and followed that by going through the remaining entries by hand to narrow them down to the most interesting. Since then, each year we review the list as new languages find their footing and others slip into obscurity.

Our final set of 57 languages includes names familiar to most computer users, such as Java, stalwarts like Cobol and Fortran, and languages that thrive in niches, like Haskell. The Processing language was dropped from our rankings this year because its name is a common word even within programming, which makes it hard to tell when “processing” refers specifically to the language (unlike, say, Python, which is a common word generally but nearly always refers to the language within a programming context). Before we scrubbed it from the list, Processing’s score, and thus its ranking, seemed artificially high for a niche language. We hope to attack this problem in next year’s rankings.

We gauged the popularity of languages using the following sources for a total of nine metrics.

Google Search

We measured the number of hits for each language by searching Google for the template “X programming.” This number indicates the volume of online information resources about each programming language. We took the measurement in August 2022, so it represents a snapshot of the Web at that particular moment in time. This data was gathered manually.

Twitter

We measured the number of hits on Twitter for the template “X programming” for the 7.5 months from January 2022 to mid-August 2022 using the Twitter Search API. This number indicates the amount of chatter on social media for the language and reflects the sharing of online resources like news articles or books, as well as physical social activities such as hackathons.

Stack Overflow

Stack Overflow is a popular site where programmers can ask questions about coding. We measured the number of questions posted that mention each language for the 12 months ending August 2022. Each question is tagged with the languages under discussion, and these tags are used to tabulate our measurements using the Stack Exchange API.

Reddit

Reddit is a news and information site where users post links and comments. On Reddit, we measured the number of posts mentioning each of the languages from September 2021 through August 2022, using the template “X programming” across any subreddit on the site. We collected data using the Reddit API.

IEEE Xplore Digital Library

IEEE maintains a digital library with over 3.6 million conference and journal articles covering a wide array of scientific and engineering disciplines. We measured the number of articles that mention each of the languages in the template “X programming” for the years 2021 and 2022. This metric captures the prevalence of the different programming languages as used and referenced in scholarship. We collected data using the IEEE Xplore API.

IEEE Jobs Site

We measured the demand for different programming languages in job postings on the IEEE Jobs Site. The IEEE Jobs Site has a large number of non-U.S. listings. Because some of the languages we track could be ambiguous in plain text (such as D, Go, J, Ada, and R), we searched for job listings with those words in the job description and then manually examined the listings. When the number of listings returned was greater than 500, 200 of the listings were examined as a sample, and the result was used to estimate the total number of matching jobs. The search was conducted in August 2022.

CareerBuilder

We measured the demand for different programming languages on the CareerBuilder job site. CareerBuilder listings were those offered within the United States. Because there is no publicly available API, we manually searched for listings that included each language. Some of the languages we track could be ambiguous in plain text (such as Go, J, Ada, and R), so we manually inspected each listing to remove false positives (for example, listings looking for experience with the Americans With Disabilities Act rather than the Ada programming language). When more than 200 results were returned, 200 of the listings were examined as a sample, and the result was used to estimate the total number of matching jobs. The search was conducted in August 2022.
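The sampling step can be sketched as follows. The exact calculation is not spelled out above, so this Python sketch assumes simple proportional scaling: the share of genuine matches found in the inspected sample is applied to the full number of results. The function name `estimate_matching_listings` and the counts in the usage lines are hypothetical.

```python
from typing import List


def estimate_matching_listings(total_results: int, inspected: List[bool]) -> int:
    """Estimate how many of `total_results` listings truly match a language.

    `inspected` holds one flag per manually examined listing: True if the
    listing really refers to the language, False if it is a false positive.
    Assumes the inspected listings are representative of all the results.
    """
    if len(inspected) > total_results:
        raise ValueError("cannot inspect more listings than were returned")
    if not inspected:
        return 0

    genuine = sum(inspected)  # True counts as 1, False as 0
    if len(inspected) == total_results:
        # Every listing was examined, so the count is exact.
        return genuine

    # Only a sample was examined: scale the sample's match rate up.
    return round(total_results * genuine / len(inspected))


if __name__ == "__main__":
    # Hypothetical search: 1,000 results, 200 inspected, 150 of them genuine.
    sample = [True] * 150 + [False] * 50
    print(estimate_matching_listings(1000, sample))  # 750
```

An estimate like this is only as good as the sample. If the first 200 results skew toward better matches than the rest, proportional scaling would overstate the total.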

GitHub

GitHub is a public repository for many volunteer-driven open-source software projects, and so indicates what languages coders choose to work in when they have a personal choice. We looked at two metrics from GitHub: repositories that have been “starred” by users, which reflects long-term interests, and the number of pull requests, which indicates current activity. We used data gathered by GitHut 2.0, which measures the top 50 languages used by number of repositories tagged with that language and draws from GitHub’s public API. The data covers the first quarter of 2022.
