Feature #6: Most Common Token

Implement the "Most Common Token" feature for our "Language Compiler" project.

Description

For this feature of the language compiler, the program statements are given to us as a single string. We want to determine the variable or function that is referred to most often in the program. In this process, we ignore the language keywords, since a keyword may well occur more frequently than any variable or function. The keywords are given to us as an array of strings.

Let’s say you are given the following program as input:

int main() {
int value = getValue();
int sum = value + getRandom();
int subs = value - getRandom();
return 0;
}

The list of keywords given to you is ["int", "main", "return"]. Ignoring the keywords, value appears three times, getRandom twice, and every other token once, so in this example your function should return "value". Note that your function should also ignore syntax such as parentheses, operators, and semicolons.

Solution

We can solve this problem by normalizing the code string and processing it step by step. The complete algorithm is as follows:

  • First, we replace all syntax characters, including parentheses, operators, and semicolons, with spaces. After this step, the string contains only alphanumeric tokens separated by whitespace (see the sketch after this list).

  • Then, we split the normalized code into tokens on whitespace.

  • We create a HashMap, count, that has tokens as keys and their occurrence counts as values.

  • We iterate through the tokens and, for every token that is not a keyword, increment its entry in count.

  • In the end, we scan count to find the token with the highest frequency and return it.
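
To make the first two steps concrete, here is a minimal sketch (separate from the solution below) that normalizes a single statement from the example program and prints the resulting tokens; the names line, normalized, and tokens are illustrative only:

fn main() {
    let line = "int sum = value + getRandom();";

    // Step 1: replace every non-alphanumeric character with a space
    let normalized: String = line
        .chars()
        .map(|c| if c.is_alphanumeric() { c } else { ' ' })
        .collect();

    // Step 2: split on whitespace to obtain the tokens
    let tokens: Vec<&str> = normalized.split_whitespace().collect();

    // Prints: ["int", "sum", "value", "getRandom"]
    println!("{:?}", tokens);
}

With these steps in place, the complete solution is as follows: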

use std::collections::HashMap;
use std::collections::HashSet;

fn most_common_token(code: String, keywords: Vec<String>) -> String {
    // Replace the syntax with spaces: every character that is not
    // alphanumeric becomes a space, and letters are lowercased, so
    // normalized_code contains only alphanumeric tokens
    let normalized_code: String = code
        .chars()
        .map(|c| {
            if c.is_alphanumeric() {
                c.to_ascii_lowercase()
            } else {
                ' '
            }
        })
        .collect();

    // Split on whitespace to obtain the tokens
    let tokens = normalized_code.split_whitespace();

    // count maps each token to its number of occurrences;
    // banned_words holds the keywords for fast lookup
    let mut count: HashMap<String, i32> = HashMap::new();
    let banned_words: HashSet<String> = keywords.into_iter().collect();

    // Count the occurrences of each token, excluding the keywords
    for token in tokens {
        if !banned_words.contains(token) {
            *count.entry(token.to_string()).or_insert(0) += 1;
        }
    }

    // Return the token with the highest frequency
    let key_with_max_value = count.iter().max_by_key(|entry| entry.1).unwrap();
    key_with_max_value.0.to_string()
}

fn main() {
    // Driver code
    let code = String::from(
        "int main() {
            int value = getValue();
            int sum = value + getRandom();
            int subs = value - getRandom();
            return 0;
        }",
    );
    let keywords: Vec<String> = vec!["int", "main", "return"]
        .into_iter()
        .map(String::from)
        .collect();
    println!("{}", most_common_token(code, keywords));
}
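
As a quick sanity check, the example from the problem statement can also be wrapped in a unit test appended to the same file; the test name is arbitrary, and the assertion simply restates the expected answer from the description:

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn example_program_returns_value() {
        let code = String::from(
            "int main() {
                int value = getValue();
                int sum = value + getRandom();
                int subs = value - getRandom();
                return 0;
            }",
        );
        let keywords: Vec<String> = vec!["int", "main", "return"]
            .into_iter()
            .map(String::from)
            .collect();
        assert_eq!(most_common_token(code, keywords), "value");
    }
}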

Complexity measures

Time complexity Space complexity
...